FreeLing  3.1
freeling::dictionary Class Reference

The class dictionary implements dictionary search and suffix analysis for word forms.

#include <dictionary.h>

Public Member Functions

 dictionary (const std::wstring &, const std::wstring &, bool, const std::wstring &, bool invDic=false, bool retok=true)
 Constructor.
 ~dictionary ()
 Destructor.
void add_analysis (const std::wstring &, const analysis &)
 add analysis to dictionary entry (create entry if not there)
void remove_entry (const std::wstring &)
 remove entry from dictionary
void search_form (const std::wstring &, std::list< analysis > &) const
 Get dictionary entry for a given form, add to given list.
bool annotate_word (word &, std::list< word > &, bool override=false) const
 Fills the analysis list of a word, checking for suffixes and contractions.
void annotate_word (word &) const
 Fills the analysis list of a word, checking for suffixes and contractions.
std::list< std::wstring > get_forms (const std::wstring &, const std::wstring &) const
 Get possible forms for a lemma+pos.
void analyze (sentence &) const
 analyze given sentence

Private Member Functions

bool check_contracted (const std::wstring &, std::wstring, std::wstring, std::list< word > &) const
 check whether the word is a contraction, and if so, fill the list with the contracted words
std::list< std::wstring > tag_combinations (std::list< std::wstring >::const_iterator, std::list< std::wstring >::const_iterator) const
 Generate valid tag combinations for an ambiguous contraction.
bool parse_dict_entry (const std::wstring &, std::list< std::pair< std::wstring, std::list< std::wstring > > > &) const
 parse data string into a map lemma->list of tags
std::wstring compact_data (const std::list< std::pair< std::wstring, std::list< std::wstring > > > &) const
 compact data in format lemma1 pos1a|pos1b|pos1c lemma2 pos2a|pos2b to save memory
bool less (const std::wstring &, const std::wstring &, const std::map< std::wstring, std::wstring > &) const
 compare two strings (lemmas or PoS) using given list of preferences
void sort_list (std::list< std::wstring > &, const std::map< std::wstring, std::wstring > &) const
 sort given list using given preferences

Private Attributes

bool AffixAnalysis
 configuration options
bool InverseDict
bool RetokenizeContractions
affixes suf
 suffix analyzer
database morfodb
 key-value file or hash
database inverdb
std::map< std::wstring, std::wstring > lemma_prefs
std::map< std::wstring, std::wstring > pos_prefs

Detailed Description

The class dictionary implements dictionary search and suffix analysis for word forms.


Constructor & Destructor Documentation

freeling::dictionary::dictionary ( const std::wstring &  Lang,
const std::wstring &  dicFile,
bool  activateAff,
const std::wstring &  sufFile,
bool  invDic = false,
bool  retok = true 
)

Constructor.

freeling::dictionary::~dictionary ( )

Destructor.

Destroy dictionary module, close database.


Member Function Documentation

void freeling::dictionary::add_analysis ( const std::wstring &  form,
const analysis &  newan 
)

add analysis to dictionary entry (create entry if not there)

References freeling::analysis::get_lemma(), freeling::analysis::get_tag(), freeling::LEMMA_DIVIDER, list2wstring, freeling::TAG_DIVIDER, and wstring2list.

void freeling::dictionary::analyze ( sentence &  se) const [virtual]

analyze given sentence

Dictionary search and affix analysis for all words in a sentence, using given options.

Implements freeling::processor.

References int2wstring, freeling::sentence::rebuild_word_index(), TRACE, and TRACE_SENTENCE.

Referenced by freeling::maco::analyze().

bool freeling::dictionary::annotate_word ( word &  ,
std::list< word > &  ,
bool  override = false 
) const

Fills the analysis list of a word, checking for suffixes and contractions.

Searches the form in the dictionary. Returns true iff the form is a contraction, and returns the contraction components in the given list.

void freeling::dictionary::annotate_word ( word &  ) const

Fills the analysis list of a word, checking for suffixes and contractions.

Searches the form in the dictionary and adds the analyses found to the given word. Never retokenizes contractions, nor returns a component list. It is just a convenience equivalent to "annotate_word(w,dummy,true)".

bool freeling::dictionary::check_contracted ( const std::wstring &  ,
std::wstring  ,
std::wstring  ,
std::list< word > &   
) const [private]

check whether the word is a contraction, and if so, fill the list with the contracted words

Check whether the given word is a contraction; if so, obtain its component words and store them in the given list (lw).

References freeling::word::add_analysis(), ERROR_CRASH, freeling::word::get_n_analysis(), list2wstring, TRACE, and wstring2list.

wstring freeling::dictionary::compact_data ( const std::list< std::pair< std::wstring, std::list< std::wstring > > > &  ) const [private]

compact data in format lemma1 pos1a|pos1b|pos1c lemma2 pos2a|pos2b to save memory

References freeling::LEMMA_DIVIDER, list2wstring, and freeling::TAG_DIVIDER.
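
The compact format described above can be illustrated with a minimal standalone sketch. It is not the FreeLing implementation: the function name compact_entry and the two divider values are assumptions made for illustration, and the real LEMMA_DIVIDER and TAG_DIVIDER constants may hold different strings.

```cpp
#include <list>
#include <string>
#include <utility>

// Hypothetical divider values; the real LEMMA_DIVIDER and TAG_DIVIDER
// constants are defined by FreeLing and may differ.
const std::wstring kLemmaDivider = L" ";
const std::wstring kTagDivider = L"|";

using Entry = std::list<std::pair<std::wstring, std::list<std::wstring>>>;

// Join every (lemma, tags) pair into "lemma1 tagA|tagB lemma2 tagC".
std::wstring compact_entry(const Entry &entry) {
  std::wstring out;
  for (const auto &item : entry) {
    if (!out.empty()) out += kLemmaDivider;  // separate consecutive lemmas
    out += item.first + kLemmaDivider;       // lemma, then its tag group
    bool first = true;
    for (const auto &tag : item.second) {
      if (!first) out += kTagDivider;
      out += tag;
      first = false;
    }
  }
  return out;
}
```

Storing one string per form, rather than a nested list, keeps a single value per key in the underlying key-value store.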

list< wstring > freeling::dictionary::get_forms ( const std::wstring &  ,
const std::wstring &   
) const

Get possible forms for a lemma+pos.

References WARNING, and wstring2list.

bool freeling::dictionary::less ( const std::wstring &  ,
const std::wstring &  ,
const std::map< std::wstring, std::wstring > &   
) const [private]

compare two strings (lemmas or PoS) using given list of preferences

bool freeling::dictionary::parse_dict_entry ( const std::wstring &  ,
std::list< std::pair< std::wstring, std::list< std::wstring > > > &   
) const [private]

parse data string into a map lemma->list of tags

References wstring2list.
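
As a rough illustration of the parsing step, the sketch below reads a string in the "lemma1 pos1a|pos1b lemma2 pos2a" layout back into lemma and tag-list pairs. The names parse_entry and split are hypothetical, and the whitespace and '|' separators are assumptions; the actual method relies on wstring2list and the FreeLing divider constants.

```cpp
#include <list>
#include <sstream>
#include <string>
#include <utility>

using Entry = std::list<std::pair<std::wstring, std::list<std::wstring>>>;

// Split `text` on the single character `sep`, keeping empty pieces.
std::list<std::wstring> split(const std::wstring &text, wchar_t sep) {
  std::list<std::wstring> parts;
  std::wstring::size_type start = 0, pos;
  while ((pos = text.find(sep, start)) != std::wstring::npos) {
    parts.push_back(text.substr(start, pos - start));
    start = pos + 1;
  }
  parts.push_back(text.substr(start));
  return parts;
}

// Parse "lemma1 tagA|tagB lemma2 tagC" into (lemma, tags) pairs.
// Returns false when a lemma has no tag group after it.
bool parse_entry(const std::wstring &data, Entry &out) {
  std::wistringstream in(data);
  std::wstring lemma, tags;
  while (in >> lemma) {
    if (!(in >> tags)) return false;  // dangling lemma: malformed entry
    out.emplace_back(lemma, split(tags, L'|'));
  }
  return true;
}
```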

void freeling::dictionary::remove_entry ( const std::wstring &  form)

remove entry from dictionary

References list2wstring, and wstring2list.

void freeling::dictionary::search_form ( const std::wstring &  s,
std::list< analysis > &  la 
) const

Get dictionary entry for a given form, add to given list.

Search the form in the dictionary, according to the given options, and add the analyses found to the given list.

References freeling::analysis::init(), int2wstring, freeling::LEMMA_DIVIDER, list2wstring, freeling::TAG_DIVIDER, TRACE, and wstring2list.

Referenced by freeling::affixes::CheckRetokenizable(), and freeling::affixes::SearchRootsList().

void freeling::dictionary::sort_list ( std::list< std::wstring > &  ,
const std::map< std::wstring, std::wstring > &   
) const [private]

sort given list using given preferences

bubble sort given list (of lemmas or tags) using given preferences
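
A bubble sort over a std::list driven by a preference map might look like the sketch below. The meaning of the map is an assumption made only for this example: an entry a -> b is read as "a goes before b". The names goes_before and bubble_sort are hypothetical and do not reproduce the FreeLing source.

```cpp
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <string>
#include <utility>

using Prefs = std::map<std::wstring, std::wstring>;

// Hypothetical rule: `a` goes before `b` only if prefs maps a -> b.
bool goes_before(const std::wstring &a, const std::wstring &b,
                 const Prefs &prefs) {
  auto it = prefs.find(a);
  return it != prefs.end() && it->second == b;
}

// Bubble sort: swap adjacent items that are out of order, for at most
// size-1 passes, stopping early when a pass makes no swap.
void bubble_sort(std::list<std::wstring> &items, const Prefs &prefs) {
  for (std::size_t pass = 0; pass + 1 < items.size(); ++pass) {
    bool swapped = false;
    for (auto cur = items.begin(), nxt = std::next(cur);
         nxt != items.end(); ++cur, ++nxt) {
      if (goes_before(*nxt, *cur, prefs)) {
        std::swap(*cur, *nxt);  // swap values; iterators stay valid
        swapped = true;
      }
    }
    if (!swapped) break;
  }
}
```

Bounding the number of passes guarantees termination even if the preferences are contradictory, and items with no stated preference keep their relative order.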

list< wstring > freeling::dictionary::tag_combinations ( std::list< std::wstring >::const_iterator  ,
std::list< std::wstring >::const_iterator   
) const [private]

Generate valid tag combinations for an ambiguous contraction.

References wstring2list.
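
One way to picture the generation of tag combinations is as a Cartesian product over the alternatives at each position of the contraction. The sketch below assumes, purely for illustration, that alternatives for one component are separated by '/' and that a combination joins one choice per component with '+'. The names combine and split_on are hypothetical, and the real method may also filter out invalid combinations.

```cpp
#include <iterator>
#include <list>
#include <string>

using Iter = std::list<std::wstring>::const_iterator;

// Split on a single character (hypothetical stand-in for wstring2list).
std::list<std::wstring> split_on(const std::wstring &text, wchar_t sep) {
  std::list<std::wstring> parts;
  std::wstring::size_type start = 0, pos;
  while ((pos = text.find(sep, start)) != std::wstring::npos) {
    parts.push_back(text.substr(start, pos - start));
    start = pos + 1;
  }
  parts.push_back(text.substr(start));
  return parts;
}

// Build every combination that picks one alternative per position
// in the range [first, last).
std::list<std::wstring> combine(Iter first, Iter last) {
  std::list<std::wstring> result;
  if (first == last) return result;
  const std::list<std::wstring> heads = split_on(*first, L'/');
  const std::list<std::wstring> tails = combine(std::next(first), last);
  for (const auto &head : heads) {
    if (tails.empty()) {
      result.push_back(head);  // last position: nothing to append
    } else {
      for (const auto &tail : tails) result.push_back(head + L"+" + tail);
    }
  }
  return result;
}
```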


Member Data Documentation

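bool freeling::dictionary::AffixAnalysis [private]

configuration options

std::map<std::wstring,std::wstring> freeling::dictionary::lemma_prefs [private]

database freeling::dictionary::morfodb [private]

key-value file or hash

std::map<std::wstring,std::wstring> freeling::dictionary::pos_prefs [private]

affixes freeling::dictionary::suf [private]

suffix analyzer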


The documentation for this class was generated from the following files: