Semantic Gravitor


Purpose

Finds the semantic gravity for different Part of Speech (POS) tags within a given window. That is, for every word the relatedness value is calculated between that word and each other word within the window around it. These values are then summed per POS-tag. If a row of words is given instead of a tagrow with POS-tags, a sum per word (type, as in type versus token) is made.
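The summing described above can be illustrated with a minimal Python sketch. It is not the tool's Perl implementation. It assumes that the window is split evenly on both sides of each word (which would explain why the window size must be even) and that each sum is credited to the tag of the centre word; neither detail is confirmed by this page. The names semantic_gravity and toy_relatedness are hypothetical, and the toy relatedness measure only stands in for a real one.

```python
from collections import defaultdict

def semantic_gravity(tokens, tags, relatedness, window_size=4):
    """Sum relatedness per tag (assumed: the tag of the centre word)."""
    if window_size % 2 != 0:
        raise ValueError("window size must be even")
    half = window_size // 2  # assumed: half of the window on each side
    gravity = defaultdict(float)
    for i, word in enumerate(tokens):
        start = max(0, i - half)
        end = min(len(tokens), i + half + 1)
        for j in range(start, end):
            if j != i:  # a word is not compared with itself
                gravity[tags[i]] += relatedness(word, tokens[j])
    return dict(gravity)

def toy_relatedness(a, b):
    """Hypothetical stand-in measure: 1.0 if both words share a first letter."""
    return 1.0 if a[0] == b[0] else 0.0

tokens = ["cat", "chases", "mouse", "cheese"]
tags = ["NN", "VB", "NN", "NN"]
result = semantic_gravity(tokens, tags, toy_relatedness, window_size=2)
print(result)
```

Passing the words themselves as the tags list gives a sum per word type instead of per POS-tag, which mirrors the alternative mentioned above.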

The config file is configsemanticgravitor.pl.

Synopsis

./semanticgravitor.pl
[-c <corpus>] [-f <fromsubdir>] | [-fd <fromdir>] [-a <tagrowsubdir>] | [-ad <tagrowdir>] [-e <sensedtagrowsubdir>] | [-ed <sensedtagrowdir>] [-tc <tocorpus>] [-t <tosubdir>] | [-td <targetdir>] [-tf <tofile>] [-w <windowsize>] [-ce <ceiling>] [-lo] [-ao [-aw]] [-ps <partsubdir>] [-p <part>] | [-pd <partdir>] [-pf <divprtfile> [...]] [-dr] [-? = -h = -help = --help] [-v [<verboselvl>]]

-c
the corpus whose wordrow files are to be used
-f
subdir below corpus where the wordrow files can be found (defaults to reduced if none is specified and if it's not changed in the configfile)
-fd
full path to the wordrows dir
-a
the tagrowsubdir. Defaults to tagrow. This is the dir containing the POS-tagrows for which the semantic gravity properties should be calculated
-ad
full path to the tagrow dir
-e
the sensedtagrowsubdir. Defaults to sensetagrow. This dir contains the wordnet-sense-tags (they look like #v#1, #n#12, etc...)
-ed
full path to the sensedtagrow dir
-tc
target-corpus. If none is set the source corpus is used
-t
the subdir relative to the corpusdir in which the semantic gravity list should be stored. Note that this option is only possible if the corpus is given with the -c or -tc option (not the full path with the -td option)
-td
full path to the place where the semantic gravity list should be stored
-tf
the file in which the list of semantic gravity data per POS-tag should be stored
-w
the window-size for disambiguating (same as in wordcombinationfinder.pl, must be even)
-ce
the ceiling to use for relatedness (to prevent skews)
-lo
enable the logarithmic scale (the ceiling, if enabled, is applied before this)
-ao
all relatedness-values are set to one. Useful for finding the distribution of terms
-aw
all words, not just those for which senses exist. Should only be called with -ao
-ps
subdir in which the divisions are
-p
the name of a division to use as a part. The file names in all the divprt files in the division-dir are combined and used as the list of files to process, unless you specify one or more divprt files explicitly using the -pf option
-pd
full path to the division-dir to use as a part. This option causes the -p option to be ignored
-pf
one or more divprt files to use as the part instead of all files in the division
-dr
dry-run. Nothing is written or deleted, only reading and reporting is done
-v
the level of verbosity, default verboselevel = 2, available levels: 0,1,2,3
-?
(and equivalents) prints help: the purpose and the synopsis
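How the -ce, -lo and -ao options might act on a single relatedness value can be sketched in Python as follows. This is only an illustration under stated assumptions: the page says the ceiling is applied before the logarithmic scale, but it does not say which logarithm is used or how -ao combines with the other two options. The sketch assumes log(1 + x), so that a value of zero stays zero, and assumes that -ao overrides everything else. The function name adjust_relatedness is hypothetical.

```python
import math

def adjust_relatedness(value, ceiling=None, log_scale=False, all_one=False):
    """Apply the -ao, -ce and -lo style adjustments to one relatedness value."""
    if all_one:
        # -ao: every relatedness value counts as one (assumed to override the rest)
        return 1.0
    if ceiling is not None:
        # -ce: cap the value first, to prevent skews
        value = min(value, ceiling)
    if log_scale:
        # -lo: assumed form log(1 + x); the real tool may use a different log
        value = math.log1p(value)
    return value

print(adjust_relatedness(50.0, ceiling=10.0))                  # capped value
print(adjust_relatedness(50.0, ceiling=10.0, log_scale=True))  # capped, then log
print(adjust_relatedness(50.0, all_one=True))                  # always one
```

Capping before taking the logarithm means that a single extreme value cannot dominate a sum, even after it has been compressed by the log scale.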

NOTE:
If no from-dirs and to-dirs are given, the defaults in the config file are used.

You can download (or look at the sources of) SemanticGravitor [here]. To run it you will also need [the config file] and the [fiauimenre library]. You can also get the entire tool-package (containing the newest version of all fiauimenre tools and the library) [in one download]
This is an old version for archival purposes, see www.LogiLogi.org for the current version.
Document last modified Thu, 26 Apr 2007 14:15:03
All content is available under the GNU Free Documentation License. The LogiLogi-system is under the GPL