HMM-Tagger Parameter File

This file contains the statistical data for the Hidden Markov Model (initial probabilities, transition probabilities, lexical probabilities, etc.), plus some additional data used to smooth missing values.

The file may be generated by your own means, or from a tagged corpus using the script src/utilities/train-tagger/bin/TRAIN.sh provided in the FreeLing package.
See src/utilities/train-tagger/README for details.

The file has eight sections: <TagsetFile>, <Tag>, <Bigram>, <Trigram>, <Initial>, <Word>, <Smoothing>, and <Forbidden>. Each section is closed by its corresponding closing tag: </Tag>, </Bigram>, </Trigram>, etc.
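
As an illustration of this layout, the following is a minimal sketch of how such a file could be split into its sections. It assumes that each opening and closing tag sits alone on its own line, which is not stated above; the function name split_sections is hypothetical and is not part of FreeLing. The format of the lines inside each section is not interpreted here.

```python
import re


def split_sections(text):
    """Map each section name to the list of non-empty lines it contains.

    Assumption: opening tags (<Name>) and closing tags (</Name>) each
    occupy a line of their own. Section contents are returned as raw
    strings and are not parsed any further.
    """
    sections = {}
    current = None  # name of the section currently open, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        opening = re.fullmatch(r"<(\w+)>", line)
        closing = re.fullmatch(r"</(\w+)>", line)
        if current is None and opening:
            # Start collecting lines for a new section.
            current = opening.group(1)
            sections[current] = []
        elif closing and closing.group(1) == current:
            # The matching closing tag ends the current section.
            current = None
        elif current is not None:
            sections[current].append(line)
    return sections
```

Called on the text of a parameter file, this would return a dictionary with one entry per section found, for example sections["Bigram"] for the bigram data.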

The tag (unigram), bigram, and trigram probabilities are used by the tagger in linear interpolation smoothing to compute the state transition probabilities (the $\alpha_{ij}$ parameters of the HMM).
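
The general idea of linear interpolation can be sketched as follows: the transition probability is a weighted sum of the unigram, bigram, and trigram estimates, so an unseen trigram still receives a non-zero value. This is only a generic sketch of the technique, not FreeLing's actual code. The function name, the dictionary layout, and the coefficient names c1, c2, c3 are assumptions made for the example, and the tagger may combine the stored values differently.

```python
def interpolated_transition(t1, t2, t3, unigram, bigram, trigram, c1, c2, c3):
    """Linear interpolation of unigram, bigram and trigram estimates.

    Assumptions (illustrative only):
      unigram maps t3 to P(t3),
      bigram maps (t2, t3) to P(t3 | t2),
      trigram maps (t1, t2, t3) to P(t3 | t1, t2),
      c1 + c2 + c3 == 1 are the interpolation coefficients.
    Missing entries count as probability 0.
    """
    p_uni = unigram.get(t3, 0.0)
    p_bi = bigram.get((t2, t3), 0.0)
    p_tri = trigram.get((t1, t2, t3), 0.0)
    return c1 * p_uni + c2 * p_bi + c3 * p_tri
```

With these assumptions, a tag sequence whose trigram was never observed falls back on the bigram and unigram terms, weighted by c2 and c1.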

Lluís Padró 2013-09-09