Corpus Examiner


Purpose

Examines the corpus. It looks for certain elements (which elements exactly depends on the examine-library used and the actionlist specified) and reports occurrences to a file. It is intended to be run before the stripping is done (or even before a stripping-lib has been made for the corpus) to investigate the properties of the corpus.

Use -ll to list the examine-libs

The configfile is configcorpusexaminer.pl

The following examine-libs are available:

Synopsis

./corpusexaminer.pl
[-c <corpus>] [-f <fromsubdir>] | [-fd <fromdir>] [-tc <tocorpus>] [-t <tosubdir>] | [-td <targetdir>] [-l <examinelibshortname>] | [-lf <examinelibfullname>] [-ll] [-la <extraexaminelibargs>] [-a <actionlist>] [-ps <partsubdir>] [-p <part>] | [-pd <partdir>] [-pf <divprtfile> [...]] [-dr] [-? = -h = -help = --help] [-v [<verboselvl>]]

-c
the corpus to work on
-f
the subdir in which the files to work on can be found
-fd
full path to the stuff to work on (conflicts with -c and -f)
-tc
target-corpus. If none is set the source corpus is used
-t
the subdir in which the examineresults should be stored
-td
full path to the place where the examineresults should be stored (conflicts with -t)
-l
the short name of the examine-library to use (conflicts with -lf)
-lf
the examine-library to use (conflicts with -l)
-ll
list the installed examine-libraries (exits the program immediately)
-la
extra args to hand over to the examine-lib
-a
a detailed specification of the actions to perform, in this case what to examine and what not, in the form of a string in which each token stands for a string/thing to look for. Have a look at the description of the examine-lib you use for info on which actionlist-settings it supports
-ps
subdir in which the divisions are
-p
the name of a division to use as a part. The file-names in all the divprt-files in the division-dir are added and used as the list of files to use, unless you specify one or more explicitly using the -pf option
-pd
full path to the division-dir to use as a part. This option causes the -p option to be ignored
-pf
one or more divprt files to use as the part instead of all files in the division
-dr
dry-run. Nothing is written or deleted, only reading and reporting is done
-v
the level of verbosity, default verboselevel = 2, available levels: 0,1,2,3
-?
(and equivalents) prints help: the purpose and the synopsis
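
The conflicts mentioned in the option descriptions above can be summarised in a small sketch. This is not taken from corpusexaminer.pl itself (which is a Perl script); it is a hypothetical Python helper that only encodes the rules stated in the text: -fd conflicts with -c and -f, -td conflicts with -t, -l conflicts with -lf, and -pd causes -p to be ignored. The function names are made up for illustration.

```python
# Hypothetical sketch: encodes only the conflicts stated in the option
# descriptions; it is not the tool's actual option handling.
CONFLICTS = [
    ("-fd", "-c"),   # -fd conflicts with -c
    ("-fd", "-f"),   # -fd conflicts with -f
    ("-td", "-t"),   # -td conflicts with -t
    ("-l", "-lf"),   # -l conflicts with -lf
]


def find_conflicts(given):
    """Return the documented conflicting pairs that both occur in `given`."""
    given = set(given)
    return [(a, b) for a, b in CONFLICTS if a in given and b in given]


def effective_options(given):
    """Drop -p when -pd is present, since -pd causes -p to be ignored."""
    given = set(given)
    if "-pd" in given:
        given.discard("-p")
    return given


if __name__ == "__main__":
    # Example option sets; the combinations are illustrative only.
    print(find_conflicts(["-fd", "-c", "-l"]))
    print(sorted(effective_options(["-p", "-pd", "-dr"])))
```

Checking a planned invocation against such a table before running the tool may save a failed run, but the exact behaviour on conflicting options is determined by the script, not by this sketch.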

NOTE:
If no lists or levels are given the defaults in the config file are used

WARNING:
Options that are not given on the command line can still take effect if they are specified in the config file
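
The note and warning above amount to a fallback rule: a value given on the command line is used, and anything left out is taken from the config file. A minimal sketch of that rule follows, assuming that command-line values take precedence (the text implies this but does not state it outright). The real config file is configcorpusexaminer.pl, a Perl file; the dictionaries, key names and the corpus name below are hypothetical stand-ins.

```python
# Hypothetical sketch of the fallback rule; key names and values are
# illustrative and do not come from configcorpusexaminer.pl.
def resolve_settings(cli_options, config_defaults):
    """Use command-line values where given, else fall back to the config."""
    settings = dict(config_defaults)
    # None marks an option that was not given on the command line.
    settings.update({k: v for k, v in cli_options.items() if v is not None})
    return settings


if __name__ == "__main__":
    config_defaults = {"corpus": "examplecorpus", "verboselvl": 2}
    cli_options = {"corpus": None, "verboselvl": 3}
    print(resolve_settings(cli_options, config_defaults))
```

This is why the warning matters: a dry run with -dr and a high -v level is a cheap way to see which settings actually end up in effect.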

You can download (or look at the sources of) CorpusExaminer [here]. To run it you will also need [the config file] and the [fiauimenre library]. You can also get the entire tool-package (containing the newest version of all fiauimenre tools and the library) [in one download].
Part of the LogiLogi Network: The LogiLogi Foundation - LogiLogi.org - OgOg.org
This is an old version for archival purposes, see www.LogiLogi.org for the current version.
Document last modified Fri, 29 Jul 2005 07:44:45
All content is available under the GNU Free Documentation License. The LogiLogi-system is under the GPL