A sample configuration file follows. It is only a sample, and probably won't work if you use it as is. You can start using FreeLing with the default configuration files which, after installation, are located in /usr/local/share/FreeLing/config. Note that the prefix /usr/local may differ if you specified an alternative location when installing FreeLing. If you installed from a binary .deb package, they will be in /usr/share/FreeLing/config.
You can use those files as a starting point to customize one configuration file to suit your needs.
Note that file paths in the sample configuration file contain $FREELINGSHARE, which is supposed to be an environment variable. If this variable is not defined, the analyzer will abort, complaining about not finding the files.
If you use the analyze script, it will define the variable for you as /usr/local/share/FreeLing (or the right installation path), unless you define it to point somewhere else.
You can also adjust your configuration files to use normal paths for the files (either relative or absolute) instead of using variables.
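As a rough illustration of that last option, the following Python sketch rewrites configuration text so that every occurrence of $FREELINGSHARE becomes a literal directory. The helper name `expand_share` is hypothetical and not part of FreeLing; the directory is taken from the caller or, failing that, from the environment.

```python
import os

def expand_share(config_text, share_dir=None):
    """Return config_text with $FREELINGSHARE replaced by a literal path.

    Hypothetical helper for illustration only; not part of FreeLing.
    """
    if share_dir is None:
        # Fall back to the environment variable the sample file relies on.
        share_dir = os.environ.get("FREELINGSHARE")
    if not share_dir:
        raise ValueError("FREELINGSHARE is not defined and no directory was given")
    # Drop a trailing slash so the result does not contain '//'.
    return config_text.replace("$FREELINGSHARE", share_dir.rstrip("/"))

if __name__ == "__main__":
    sample = "TokenizerFile=$FREELINGSHARE/es/tokenizer.dat"
    print(expand_share(sample, "/usr/share/FreeLing"))
```

The rewritten text can then be saved as a customized configuration file that no longer depends on the environment.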
##
#### default configuration file for Spanish analyzer
##

TraceLevel=3
TraceModule=0x0000

## Options to control the applied modules. The input may be partially
## processed, or not a full analysis may me wanted. The specific
## formats are a choice of the main program using the library, as well
## as the responsability of calling only the required modules.
## Valid input formats are: plain, token, splitted, morfo, tagged, sense.
## Valid output formats are: : plain, token, splitted, morfo, tagged,
## shallow, parsed, dep.
InputFormat=plain
OutputFormat=tagged

# consider each newline as a sentence end
AlwaysFlush=no

#### Tokenizer options
TokenizerFile=$FREELINGSHARE/es/tokenizer.dat

#### Splitter options
SplitterFile=$FREELINGSHARE/es/splitter.dat

#### Morfo options
AffixAnalysis=yes
MultiwordsDetection=yes
NumbersDetection=yes
PunctuationDetection=yes
DatesDetection=yes
QuantitiesDetection=yes
DictionarySearch=yes
ProbabilityAssignment=yes
OrthographicCorrection=no
DecimalPoint=,
ThousandPoint=.
LocutionsFile=$FREELINGSHARE/es/locucions.dat
QuantitiesFile=$FREELINGSHARE/es/quantities.dat
AffixFile=$FREELINGSHARE/es/afixos.dat
ProbabilityFile=$FREELINGSHARE/es/probabilitats.dat
DictionaryFile=$FREELINGSHARE/es/dicc.src
PunctuationFile=$FREELINGSHARE/common/punct.dat
ProbabilityThreshold=0.001

# NER options
NERecognition=yes
NPDataFile=$FREELINGSHARE/es/np.dat
## comment line above and uncomment that below, if you want
## a better NE recognizer (higer accuracy, lower speed)
#NPDataFile=$FREELINGSHARE/es/ner/ner-ab.dat

#Spelling Corrector config file
CorrectorFile=$FREELINGSHARE/es/corrector/corrector.dat

## Phonetic encoding of words.
Phonetics=no
PhoneticsFile=$FREELINGSHARE/es/phonetics.dat

## NEC options
NEClassification=no
NECFile=$FREELINGSHARE/es/nec/nec-svm.dat

## Sense annotation options (none,all,mfs,ukb)
SenseAnnotation=none
SenseConfigFile=$FREELINGSHARE/es/senses.dat
UKBConfigFile=$FREELINGSHARE/es/ukb.dat

#### Tagger options
Tagger=hmm
TaggerHMMFile=$FREELINGSHARE/es/tagger.dat
TaggerRelaxFile=$FREELINGSHARE/es/constr_gram.dat
TaggerRelaxMaxIter=500
TaggerRelaxScaleFactor=670.0
TaggerRelaxEpsilon=0.001
TaggerRetokenize=yes
TaggerForceSelect=tagger

#### Parser options
GrammarFile=$FREELINGSHARE/es/grammar-dep.dat

#### Dependence Parser options
DepTxalaFile=$FREELINGSHARE/es/dep/dependences.dat

#### Coreference Solver options
CoreferenceResolution=no
CorefFile=$FREELINGSHARE/es/coref/coref.dat
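Before running the analyzer with a customized file, it can help to check which of the referenced data files actually exist. The Python sketch below is one way to do that, under two assumptions that are not guaranteed by FreeLing: the file is treated as simple Key=value lines with # comments (a simplification of whatever parser the analyzer really uses), and every option whose name ends in "File" is taken to hold a path, as is the case in the sample above. The function names `parse_config` and `missing_files` are hypothetical.

```python
import os

def parse_config(text):
    """Parse 'Key=value' lines, skipping blank lines and '#' comments.

    Simplified reader for illustration; not FreeLing's own parser.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            options[key.strip()] = value.strip()
    return options

def missing_files(options):
    """Return the names of '...File' options whose path does not exist.

    Environment variables such as $FREELINGSHARE are expanded first;
    an undefined variable is left as is, so its path is reported missing.
    """
    missing = []
    for key, value in options.items():
        if key.endswith("File"):
            path = os.path.expandvars(value)
            if not os.path.isfile(path):
                missing.append(key)
    return missing

if __name__ == "__main__":
    sample = "## Tokenizer options\nTokenizerFile=$FREELINGSHARE/es/tokenizer.dat\n"
    print(missing_files(parse_config(sample)))
```

An empty result suggests that every referenced file was found; any names listed point at paths to fix or at an undefined $FREELINGSHARE.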
Lluís Padró 2013-09-09