Corpus ST ice Stripper


Short name: ice

The CorpusSTiceStripper library strips markup from the ICE (International Corpus of English) corpus. The available actionlist options are 'ifrupan'.

Each character in it stands for a yes/no action:

  'i'  strip <ICE.*?> tags that denote sentences
  'f'  remove fractuals from tags
  'r'  rewrite tags (necessary if you want to use the tags in the rest of the fiauimenre toolset)
  'u'  remove <unclear*> tags
  'p'  remove the <>'s around punctuation
  'a'  remove accents
  'n'  remove anything that is not punctuation or a-zA-Z0-9

If no actionlist is given, all actions are performed by default.
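
The sketch below shows one way an actionlist string could be turned into yes/no flags and applied to a line of text. It is an illustration only, not the library's actual code or interface: the names ALL_ACTIONS, parse_actionlist and strip_text are hypothetical, the regular expressions are guesses at what the descriptions above mean, and the 'f', 'r' and 'n' actions are left out because this page does not describe them precisely enough to reproduce.

```python
import re
import unicodedata

# Option letters as listed on this page; everything else here is assumed.
ALL_ACTIONS = "ifrupan"


def parse_actionlist(actionlist=None):
    """Return a dict mapping each option letter to True or False.

    None (no actionlist given) switches every action on, matching the
    documented default. Treating an empty string as "no actions" is an
    assumption of this sketch.
    """
    if actionlist is None:
        actionlist = ALL_ACTIONS
    unknown = set(actionlist) - set(ALL_ACTIONS)
    if unknown:
        raise ValueError("unknown action(s): " + "".join(sorted(unknown)))
    return {letter: letter in actionlist for letter in ALL_ACTIONS}


def strip_text(text, actionlist=None):
    """Apply the 'i', 'u', 'p' and 'a' actions (the others are not sketched)."""
    actions = parse_actionlist(actionlist)
    if actions["i"]:
        # Pattern taken from the description of 'i'.
        text = re.sub(r"<ICE.*?>", "", text)
    if actions["u"]:
        # Assumed reading of "<unclear*>": opening or closing unclear tags.
        text = re.sub(r"</?unclear[^>]*>", "", text)
    if actions["p"]:
        # Assumed reading of 'p': <,> becomes , and so on.
        text = re.sub(r"<([.,;:!?])>", r"\1", text)
    if actions["a"]:
        # Decompose accented letters, then drop the combining marks.
        decomposed = unicodedata.normalize("NFD", text)
        text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return text


if __name__ == "__main__":
    print(strip_text("<ICE-GB:S1A-001 #1>Caf\u00e9 <,> ok", "ipa"))  # prints "Cafe , ok"
```

Keeping the flags in a dictionary keyed by option letter means each action is a simple on/off test, which mirrors the yes/no meaning of the letters in the actionlist.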
Part of the LogiLogi Network: The LogiLogi Foundation - LogiLogi.org - OgOg.org
This is an old version for archival purposes, see www.LogiLogi.org for the current version.
Document last modified Fri, 29 Jul 2005 06:58:14
All content is available under the GNU Free Documentation License. The LogiLogi-system is under the GPL