Db Reader


Until it is included in WordNet-Similarity, dbreader has to be downloaded and installed manually.

Features

Using dbreader to read pre-stored similarity values from a cdb database greatly improves disambiguation speed. This is because the relatedness (for all senses of the words) has to be calculated from WordNet only once (using WordNet-Similarity-lesk, for example), when the database is filled with all word pairs that occur within the disambiguation window. Moreover, if the collected list of words is sorted alphabetically, even fewer lookups are necessary, because the SuperGloss cache of the lesk measure is then used most effectively.
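The compute-once idea can be illustrated with a minimal sketch. It is not the dbreader code: a plain Python dict stands in for the cdb database, the names build_store and lookup and the "#"-joined key format are hypothetical, and scores are stored per word pair rather than per sense pair to keep the example short.

```python
def build_store(pairs, relatedness):
    """Compute the score of each word pair once and keep it under a canonical key.

    `relatedness` is any callable taking two words; in the real set-up this
    role is played by a WordNet-Similarity measure such as lesk.
    """
    store = {}
    for a, b in pairs:
        # Sorting the two words makes (a, b) and (b, a) share one entry.
        key = "#".join(sorted((a, b)))
        if key not in store:
            store[key] = relatedness(a, b)
    return store


def lookup(store, a, b):
    """Return the pre-stored score for a pair, or None if it was never stored."""
    return store.get("#".join(sorted((a, b))))
```

During disambiguation only lookup is called, so the expensive measure is never invoked again for a pair that is already stored.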

This set-up was tested by disambiguating the entire ICE-corpus, which took about an hour (including building the list and sensing it for a window of 5).

The WordCombinationFinder tool can create the sorted word-pair list. This list can then be sensed (with the result stored in a cdb database) by the ListSenser tool. Finally, sentences can be disambiguated using the SentenceSenser tool.
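The first step of this pipeline, building a sorted list of word pairs, might look roughly like the sketch below. This is not the WordCombinationFinder implementation: the function name window_pairs is hypothetical, tokenisation is a plain whitespace split, and it is an assumption that a window of n means two words are paired when they lie within n consecutive tokens of one sentence.

```python
def window_pairs(sentences, window=5):
    """Collect the unique unordered word pairs that co-occur within `window` tokens."""
    pairs = set()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, left in enumerate(tokens):
            # Pair the token with the tokens that follow it inside the same window.
            for right in tokens[i + 1 : i + window]:
                if left != right:
                    pairs.add(tuple(sorted((left, right))))
    # Alphabetical order, as recommended above for the lesk cache.
    return sorted(pairs)
```

For example, window_pairs(["the cat sat"], window=2) pairs only adjacent words and returns [("cat", "sat"), ("cat", "the")].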

Download

[the dbreader measure]

[a small test database]

[a small script for testing it with the example-database]

Requirements

DbReader requires the [CDB_File perl module] which is available from [CPAN] just like [WordNet Similarity].

In addition, the [cdb database by D.J. Bernstein] or an equivalent should be installed (available under Debian as the freecdb package).

Installation

Once all requirements have been met, dbreader can be installed by simply copying it into the Perl library directory where WordNet-Similarity is installed, into the subdirectory where all the other WordNet-Similarity measures reside.

On my computer, it was installed in the following directory:
/usr/local/share/perl/5.8.4/WordNet/Similarity
This is an old version for archival purposes, see www.LogiLogi.org for the current version.
Document last modified Tue, 13 Dec 2005 09:38:11
All content is available under the GNU Free Documentation License. The LogiLogi-system is under the GPL