Adding custom feature functions

Custom feature functions can be defined and then called from the .rgf file by enclosing the call in curly brackets (e.g. {quoted(0)}). Each call in the .rgf file must take exactly one integer parameter, which indicates a word position relative to the target word.

The code that computes a custom feature function must be provided by the caller. A map of type std::map<std::wstring,const feature_function*> must be passed to the constructor, associating each custom function name, as used in the rule file, with a pointer to a feature_function object.

Custom feature functions must be classes derived from class feature_function:

  class feature_function {
    public:
      /// Add to the list the feature name(s) for the word at the given position
      virtual void extract (const sentence &, int, std::list<std::wstring> &) const =0;
      /// Destructor
      virtual ~feature_function() {}
  };

They must implement a method extract that receives the sentence, the position of the target word, and a list of strings to which the resulting feature name (or names, if more than one is generated) is appended.

For instance, the example below generates the feature name in_quotes when the target word is surrounded by words with the Fe PoS tag (which is assigned to any quote symbol by the punctuation module).

  class fquoted : public feature_function {
    public:
      void extract (const sentence &sent, int i, std::list<std::wstring> &res) const {
        if ( (i>0 and sent[i-1].get_tag()==L"Fe") and
             (i<(int)sent.size()-1 and sent[i+1].get_tag()==L"Fe") )
          res.push_back(L"in_quotes");
      }
  };
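Since extract receives a list, a single function may add several feature names in one call. The following is a minimal, self-contained sketch of that case. The word and sentence types are reduced stand-ins written only so that the sketch compiles on its own (they are not the library's classes), and fcontext_tags and names_at are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <vector>

// Stand-ins for the library's word and sentence classes (assumption:
// reduced to the members used here, so the sketch compiles on its own).
class word {
  public:
    word(const std::wstring &form, const std::wstring &tag) : form_(form), tag_(tag) {}
    const std::wstring &get_form() const { return form_; }
    const std::wstring &get_tag() const { return tag_; }
  private:
    std::wstring form_, tag_;
};
typedef std::vector<word> sentence;

// Base class, as declared earlier in this section.
class feature_function {
  public:
    virtual void extract(const sentence &, int, std::list<std::wstring> &) const = 0;
    virtual ~feature_function() {}
};

// Hypothetical function: adds the PoS tags of the neighbouring words,
// so one call may append zero, one or two feature names.
class fcontext_tags : public feature_function {
  public:
    void extract(const sentence &sent, int i, std::list<std::wstring> &res) const override {
      const int n = static_cast<int>(sent.size());
      if (i < 0 || i >= n) return;  // position outside the sentence: add nothing
      if (i > 0)     res.push_back(L"prev_tag=" + sent[i-1].get_tag());
      if (i < n - 1) res.push_back(L"next_tag=" + sent[i+1].get_tag());
    }
};

// Hypothetical helper: collect the names one function produces for position i.
std::list<std::wstring> names_at(const feature_function &f, const sentence &s, int i) {
  assert(i >= 0 && i < static_cast<int>(s.size()));  // sketch-only sanity check
  std::list<std::wstring> res;
  f.extract(s, i, res);
  return res;
}
```

For a three-word sentence tagged DT, NN, VBZ, names_at applied to position 1 yields prev_tag=DT and next_tag=VBZ, while position 0 yields only next_tag=NN. Such a class is registered under a rule-file name in the same way as fquoted.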

We can associate this function with the function name quoted by adding the pair to a map:

  std::map<std::wstring,const feature_function*> myfunctions;
  // The map stores only the pointer: the caller keeps ownership of the object
  myfunctions.insert(std::make_pair(L"quoted", new fquoted()));

If we now create a fex object passing this map to the constructor, the created instance will call fquoted::extract with the appropriate parameters whenever a rule in the .rgf file contains a call such as {quoted(0)}.
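The lookup from rule-file name to object can be pictured with the self-contained sketch below. It is not the library's implementation: word and sentence are reduced stand-ins, call_custom and function_map are hypothetical names, and passing target + offset as the position is an assumption made for illustration. It only shows why the name in the rule file must match the key in the map exactly.

```cpp
#include <cassert>
#include <list>
#include <map>
#include <string>
#include <vector>

// Stand-ins for the library's word and sentence classes (assumption:
// reduced to the members used here, so the sketch compiles on its own).
class word {
  public:
    word(const std::wstring &form, const std::wstring &tag) : form_(form), tag_(tag) {}
    const std::wstring &get_form() const { return form_; }
    const std::wstring &get_tag() const { return tag_; }
  private:
    std::wstring form_, tag_;
};
typedef std::vector<word> sentence;

// Base class and example function, as given in this section.
class feature_function {
  public:
    virtual void extract(const sentence &, int, std::list<std::wstring> &) const = 0;
    virtual ~feature_function() {}
};

class fquoted : public feature_function {
  public:
    void extract(const sentence &sent, int i, std::list<std::wstring> &res) const override {
      if ((i > 0 && sent[i-1].get_tag() == L"Fe") &&
          (i < (int)sent.size() - 1 && sent[i+1].get_tag() == L"Fe"))
        res.push_back(L"in_quotes");
    }
};

typedef std::map<std::wstring, const feature_function*> function_map;

// Hypothetical stand-in for the dispatch triggered by {name(offset)} in a rule:
// find the object registered under 'name' and let it add its feature names.
std::list<std::wstring> call_custom(const function_map &funcs, const std::wstring &name,
                                    const sentence &sent, int target, int offset) {
  std::list<std::wstring> res;
  function_map::const_iterator it = funcs.find(name);
  if (it == funcs.end()) return res;   // unregistered name: nothing added (sketch behaviour)
  assert(it->second != nullptr);       // a registered entry must point to an object
  it->second->extract(sent, target + offset, res);  // assumption: offset applied here
  return res;
}
```

With a three-word sentence whose first and last words carry the Fe tag, and an fquoted object registered under the key quoted, call_custom(funcs, L"quoted", sent, 1, 0) returns a list containing in_quotes. A misspelled name such as quote is not found in the map and, in this sketch, returns an empty list.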

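Note that there are three naming levels for custom feature functions: the name of the feature added to the result list (in_quotes in the example), the function name used in the .rgf file and as key in the map (quoted), and the name of the C++ class that implements the function (fquoted).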

Lluís Padró 2013-09-09