List Senser


Purpose

Finds the sense relation between sets of words from a list. It is especially useful for speeding things up when the word sets are sorted by either the right or the left word and these words occur more than once, because the cache function of the (modified) Lesk measure can then be used. Precalculating the similarities is even more useful if the data is needed more than once, as it was in my Semantic Gravity research (not to mention combinations that occur more than once within a large corpus). Overall it can reduce the required processing time by more than 50%. It requires the lesk module of WordNet::Similarity.
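The effect of sorting on a cache can be illustrated with a small sketch. This is not ListSenser's code and not the Lesk measure: word_profile and toy_similarity are hypothetical stand-ins, and the cache is deliberately limited to one entry so that the difference between sorted and unsorted input is visible.

```python
from functools import lru_cache

# Counts how often the expensive per-word step actually runs.
lookups = {"count": 0}


@lru_cache(maxsize=1)  # deliberately tiny: only the most recent word is kept
def word_profile(word):
    """Hypothetical stand-in for expensive per-word work."""
    lookups["count"] += 1
    return frozenset(word)  # toy profile: the set of letters in the word


def toy_similarity(left, right):
    """Toy overlap score (shared letters); NOT the Lesk measure."""
    return len(word_profile(left) & frozenset(right))


def count_lookups(pairs):
    """Score every pair and report how many uncached lookups were needed."""
    word_profile.cache_clear()
    lookups["count"] = 0
    for left, right in pairs:
        toy_similarity(left, right)
    return lookups["count"]


pairs = [
    ("bank", "river"),
    ("tree", "leaf"),
    ("bank", "money"),
    ("tree", "root"),
    ("bank", "loan"),
]

unsorted_lookups = count_lookups(pairs)        # left word changes on every pair
sorted_lookups = count_lookups(sorted(pairs))  # equal left words are adjacent
print(unsorted_lookups, sorted_lookups)
```

With the input sorted by the left word, each distinct left word is looked up only once, while the unsorted list triggers a fresh lookup for every pair. The real saving depends on how the cache of the similarity module behaves.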

The config file is configlistsenser.pl.

Synopsis

./listsenser.pl
[-c <corpus>] [-f <fromsubdir>] | [-fd <fromdir>] [-tc <tocorpus>] [-t <tosubdir>] | [-td <targetdir>] [-ff <combinedwordsfile>] [-tf <senselistfile>] [-e <totempdir>] | [-ed <tempdir>] [-ef <tempfile>] [-es <existingsensefile> (not implemented!)] [-dr] [-? = -h = -help = --help] [-v [<verboselvl>]]

-c
the corpus in which the combinedwordsfile is stored
-f
subdir below corpus where the combinedwordsfile can be found (defaults to combilist if none is specified and if it's not changed in the configfile)
-fd
full path to the directory that contains the combinedwordsfile
-tc
target corpus. If none is set, the source corpus is used
-t
the subdir, relative to the corpus dir, in which the list of sensed word combinations should be stored. Note that this option can only be used if the corpus is given with the -c or -tc option (not if the full path is given with the -td option)
-td
full path to the directory where the list of sensed word combinations should be stored
-ff
the name of the source word combinations file
-tf
the name of the sensed (target) word combinations file
-e
the temporary subdir used for storing the temporary file needed for the db-creation
-ed
full path to the temp-dir
-ef
the temp-file (defaults to tempdb.tmp)
-es
the existing sense file to use as an alternative source (not implemented)
-dr
dry-run. Nothing is written or deleted, only reading and reporting is done
-v
the level of verbosity (default: 2; available levels: 0, 1, 2, 3)
-?
(and equivalents) prints help: the purpose and the synopsis

NOTE:
If no from- and to-dirs are given, the defaults in the config file are used.
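How the directory options might combine can be sketched as follows. This is only an illustration of the option descriptions above, not the logic of listsenser.pl itself: CORPUS_ROOT, DEFAULT_FROM_DIR and the file name combined.txt are hypothetical, and the assumption that -fd takes precedence over -c and -f is a guess based on the synopsis listing them as alternatives.

```python
from pathlib import PurePosixPath

# Hypothetical values; the real defaults live in configlistsenser.pl.
CORPUS_ROOT = "/data/corpora"               # assumption: where corpora are kept
DEFAULT_FROM_DIR = "/data/corpora/default"  # assumption: config-file fallback
DEFAULT_FROM_SUBDIR = "combilist"           # documented default for -f


def resolve_source(corpus=None, fromsubdir=None, fromdir=None,
                   filename="combined.txt"):
    """Return the path of the combined words file for the given options."""
    if fromdir is not None:
        # -fd: a full path is used as given (assumed to win over -c / -f)
        base = PurePosixPath(fromdir)
    elif corpus is not None:
        # -c with optional -f: subdir below the corpus, defaulting to combilist
        subdir = fromsubdir or DEFAULT_FROM_SUBDIR
        base = PurePosixPath(CORPUS_ROOT) / corpus / subdir
    else:
        # Nothing given: fall back to the config-file default
        base = PurePosixPath(DEFAULT_FROM_DIR)
    return str(base / filename)


def resolve_target_corpus(corpus=None, tocorpus=None):
    """-tc falls back to the source corpus when it is not set."""
    return tocorpus if tocorpus is not None else corpus


print(resolve_source(corpus="news"))
print(resolve_source(corpus="news", fromsubdir="pairs"))
print(resolve_source(fromdir="/tmp/in"))
print(resolve_target_corpus(corpus="news"))
```

The target side (-tc, -t, -td, -tf) and the temp side (-e, -ed, -ef) would presumably follow the same pattern.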

You can download (or look at the sources of) ListSenser [here]. To run it you will also need [the config file] and the [fiauimenre library]. You can also get the entire tool package (containing the newest version of all fiauimenre tools and the library) [in one download].
Part of the LogiLogi Network: The LogiLogi Foundation - LogiLogi.org - OgOg.org
This is an old version for archival purposes, see www.LogiLogi.org for the current version.
Document last modified Tue, 13 Dec 2005 06:33:40
All content is available under the GNU Free Documentation License. The LogiLogi-system is under the GPL