Multiword Recognition Module

This module aggregates input tokens into a single word object when they match an entry in a given list of multiwords.

The API for this class is:

class locutions: public automat {
  public:
    /// Constructor, receives the name of the file
    ///  containing the multiwords to recognize.
    locutions(const std::string &);

    /// Detect multiwords starting at given sentence position
    bool matching(sentence &, sentence::iterator &);

    /// analyze given sentence.
    void analyze(sentence &) const;
    /// analyze given sentences.
    void analyze(std::list<sentence> &) const;
    /// return analyzed copy of given sentence
    sentence analyze(const sentence &) const;
    /// return analyzed copy of given sentences
    std::list<sentence> analyze(const std::list<sentence> &) const;
};

Class automat implements a generic finite-state automaton (FSA). The locutions class is derived from it and implements an FSA that recognizes the word patterns listed in the file given to the constructor.
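To illustrate the idea of recognizing multiwords with an automaton, here is a minimal, self-contained sketch. It is not the FreeLing implementation: MultiwordTrie and merge_multiwords are hypothetical names, tokens are plain strings rather than word objects, and joining the merged tokens with an underscore is an assumption made for the example. The trie acts as a deterministic automaton whose transitions are labelled with tokens and whose accepting states mark the end of a listed multiword.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical illustration type, NOT FreeLing's automat/locutions classes.
// Each state maps a token to the next state; accepting states end a multiword.
class MultiwordTrie {
 public:
  MultiwordTrie() : states_(1) {}  // state 0 is the initial state

  // Register one multiword, given as its sequence of tokens.
  void add(const std::vector<std::string> &words) {
    std::size_t state = 0;
    for (const std::string &w : words) {
      auto it = states_[state].next.find(w);
      if (it == states_[state].next.end()) {
        states_.push_back(State());
        it = states_[state].next.emplace(w, states_.size() - 1).first;
      }
      state = it->second;
    }
    states_[state].accepting = true;
  }

  // Length (in tokens) of the longest multiword starting at tokens[pos],
  // or 0 if no listed multiword starts there.
  std::size_t longest_match(const std::vector<std::string> &tokens,
                            std::size_t pos) const {
    std::size_t state = 0, best = 0;
    for (std::size_t i = pos; i < tokens.size(); ++i) {
      auto it = states_[state].next.find(tokens[i]);
      if (it == states_[state].next.end()) break;  // no transition: stop
      state = it->second;
      if (states_[state].accepting) best = i - pos + 1;
    }
    return best;
  }

 private:
  struct State {
    std::map<std::string, std::size_t> next;
    bool accepting = false;
  };
  std::vector<State> states_;
};

// Scan left to right; wherever a multiword starts, emit a single joined
// token (assumed separator: '_'), otherwise copy the token unchanged.
std::vector<std::string> merge_multiwords(
    const MultiwordTrie &trie, const std::vector<std::string> &tokens) {
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < tokens.size()) {
    std::size_t len = trie.longest_match(tokens, i);
    if (len == 0) len = 1;  // no multiword here: keep the single token
    std::string merged = tokens[i];
    for (std::size_t k = 1; k < len; ++k) merged += "_" + tokens[i + k];
    out.push_back(merged);
    i += len;
  }
  return out;
}
```

In this sketch, after registering the token sequence "in spite of" with add, calling merge_multiwords on the tokens "he left in spite of it" yields four tokens, the third being "in_spite_of". Preferring the longest match at each position is a design choice of the sketch; the source does not state how the module resolves overlapping candidates.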



Lluís Padró 2013-09-09