Db Reader

Until it is included in WordNet-Similarity, the dbreader measure has to be downloaded and installed manually.

Features

Using dbreader to read pre-stored similarity values from a cdb database greatly improves disambiguation speed. This is because the relatedness (for all senses) of every word pair that occurs within the disambiguation window only has to be calculated once from WordNet (using WordNet-Similarity-lesk, for example) when the database is filled. Moreover, if the collected list of words is sorted alphabetically, even fewer lookups are necessary, because the SuperGloss cache of the lesk measure is then used to best effect. This set-up was tested by disambiguating the entire ICE-corpus in about an hour (this includes building the list and sensing it for a window of 5).

The WordCombinationFinder tool can create the sorted word-pair list. The ListSenser tool can sense this list and store the result in a cdb database. Finally, sentences can be disambiguated using the SentenceSenser tool.

Download

[the dbreader measure]
[a small test database]
[a small script for testing it with the example-database]

Requirements

DbReader requires the [CDB_File perl module], which is available from [CPAN], just like [WordNet Similarity]. In addition, the [cdb database by D.J. Bernstein] or an equivalent should be installed (available under Debian as the freecdb package).

Installation

Once all requirements have been met, dbreader can be installed by simply copying it into the Perl library directory where WordNet-Similarity is installed, in the subdirectory where all other WordNet-Similarity measures reside. On my computer that directory is:

/usr/local/share/perl/5.8.4/WordNet/Similarity
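The precomputation idea described above (collect every word pair within the disambiguation window, sort the list alphabetically, score each pair only once) can be sketched roughly as follows. This is only an illustration in Python, not the code of WordCombinationFinder or ListSenser: the function names, the reading of "window" as a run of consecutive tokens, the tuple keys, and the plain dict standing in for the cdb database are all assumptions made for the example.

```python
from itertools import combinations


def window_pairs(sentences, window=5):
    """Return the alphabetically sorted, de-duplicated list of word pairs.

    Assumption: "window" means a run of `window` consecutive tokens
    within one sentence; the real tools may define it differently.
    """
    pairs = set()
    for tokens in sentences:
        for start in range(len(tokens)):
            span = tokens[start:start + window]
            for a, b in combinations(span, 2):
                if a != b:
                    # Store each pair in a canonical (sorted) order so that
                    # (cat, the) and (the, cat) count as the same pair.
                    pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)


def build_table(pairs, relatedness):
    """Score every pair exactly once.

    A plain dict stands in for the cdb database (assumption);
    `relatedness` is any callable taking two words.
    """
    return {pair: relatedness(*pair) for pair in pairs}
```

With a table like this in place, disambiguation only needs dictionary lookups; the expensive relatedness measure is called once per distinct pair, however often that pair occurs in the corpus.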
