Corpus Meta Collector
- Purpose
Collects metadata from a corpus. The metadata should be contained in the corpus files themselves, and a metacollector-lib should be available for the corpus.
Use -ll to list the installed metacollector-libs.
The config file is configcorpusmetacollector.pl.
The following metacollector-libs are available:
- Synopsis
./corpusmetacollector.pl [-c <corpus>] [-f <fromsubdir>] | [-fd <fromdir>] [-tf <tofile>] [-tc <tocorpus>] [-t <tosubdir>] | [-td <targetdir>] [-l <collectorlibshortname>] | [-lf <colllibfullname>] [-ll] [-la <extracollectorlibargs>] [-ps <partsubdir>] [-p <part>] | [-pd <partdir>] [-pf <divprtfile> [...]] [-dr] [-? = -h = -help = --help] [-v [<verboselvl>]]
- -c
- the corpus to work on
- -f
- the subdir below the corpus-dir that holds the corpus files from which the metadata is to be collected (defaults to raw if none is specified)
- -fd
- the full path to the files to work on (conflicts with the -c and -f options)
- -tf
- the name of the metadatafile in which the metadata should be stored (defaults to meta.txt if none is specified)
- -tc
- target-corpus. If none is set the source corpus is used
- -t
- the subdir in which the metadata should be stored (defaults to meta if none is specified)
- -td
- full path to the place where the metadata should be stored
- -l
- the short name of the meta-collector-library to use (conflicts with -lf)
- -lf
- the meta-collector-library to use (conflicts with -l)
- -ll
- list the installed meta-collector-libraries (exits the program immediately)
- -la
- extra args to hand over to the meta-collector-lib
- -ps
- subdir in which the divisions are
- -p
- the name of a division to use as a part. The file names in all the divprt files in the division-dir are combined and used as the list of files to process, unless you specify one or more explicitly using the -pf option
- -pd
- full path to the division-dir to use as a part. This option causes the -p option to be ignored
- -pf
- one or more divprt files to use as the part instead of all files in the division
- -dr
- dry-run. Nothing is written or deleted, only reading and reporting is done
- -v
- the level of verbosity (default: 2; available levels: 0, 1, 2, 3)
- -?
- (and equivalents) prints help: the purpose and the synopsis
NOTE:
- If no from-dir and to-dir are given, the defaults in the config file are used
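How the source and target options combine can be sketched roughly as follows. This is an illustrative Python sketch, not the tool's actual (Perl) code. The function resolve_paths, its parameter names, the corpus_root argument and the <corpus_root>/<corpus>/<subdir> layout are all assumptions made for illustration, as are the rule that -td takes precedence over -tc and -t and the two default_* parameters that stand in for the config-file defaults. Only the default names (raw, meta, meta.txt), the fallback to the source corpus, and the conflict of -fd with -c and -f come from the option descriptions above.

```python
import posixpath


def resolve_paths(corpus_root, corpus=None, from_subdir=None, from_dir=None,
                  to_corpus=None, to_subdir=None, to_dir=None, to_file=None,
                  default_from_dir=None, default_to_dir=None):
    """Return (source_dir, metadata_file) for a set of options.

    Hypothetical helper: parameter names mirror -c, -f, -fd, -tc, -t, -td
    and -tf; default_from_dir / default_to_dir stand in for the values
    that would come from the config file.
    """
    # Documented: -fd conflicts with -c and -f.
    if from_dir is not None and (corpus is not None or from_subdir is not None):
        raise ValueError("-fd conflicts with -c and -f")

    # Source: explicit full path, else corpus + subdir (default "raw"),
    # else the config-file default.
    if from_dir is not None:
        source_dir = from_dir
    elif corpus is not None:
        source_dir = posixpath.join(corpus_root, corpus, from_subdir or "raw")
    else:
        source_dir = default_from_dir

    # Target: if no target corpus is set, the source corpus is used.
    # Assumption: an explicit -td wins over -tc and -t.
    target_corpus = to_corpus or corpus
    if to_dir is not None:
        target_dir = to_dir
    elif target_corpus is not None:
        target_dir = posixpath.join(corpus_root, target_corpus, to_subdir or "meta")
    else:
        target_dir = default_to_dir

    if source_dir is None or target_dir is None:
        raise ValueError("no source or target directory could be determined")

    # The metadata file name defaults to meta.txt.
    return source_dir, posixpath.join(target_dir, to_file or "meta.txt")


if __name__ == "__main__":
    # "news" and "/data/corpora" are made-up example values.
    print(resolve_paths("/data/corpora", corpus="news"))
```

With only a corpus given, the sketch reads from the raw subdir of that corpus and writes to meta/meta.txt below the same corpus, which matches the documented defaults.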
You can download (or look at the sources of) CorpusMetaCollector [here]. To run it you will also need [the config file] and the [fiauimenre library]. You can also get the entire tool-package (containing the newest version of all fiauimenre tools and the library) [in one download].