FreeLing
3.1
Class alternatives suggests words that are orthographically/phonetically similar to the input word.
#include <alternatives.h>
Public Member Functions | |
alternatives (const std::wstring &) | |
Constructor. | |
~alternatives () | |
Destructor. | |
void | get_similar_words (const std::wstring &, std::list< std::pair< std::wstring, int > > &) const |
direct access to results of underlying automata | |
void | analyze (sentence &) const |
spell check each word in sentence | |
Private Member Functions | |
void | filter_candidate (const std::wstring &, const std::wstring &, int distance, std::map< std::wstring, int > &) const |
filter given candidate and decide if it is a valid alternative. | |
void | filter_alternatives (const std::list< std::pair< std::wstring, int > > &, word &) const |
adds the new words that are possible correct spellings of the original word to the word analysis data | |
Private Attributes | |
foma_FSM * | sed |
FSM for orthographic edit distance. | |
std::multimap< std::wstring, std::wstring > | orthography |
remembers which word(s) each phonetic form came from (only for phonetic distances) | |
phonetics * | ph |
The class that translates a word into phonetic sounds. | |
int | DistanceThreshold |
Maximum distance to consider an entry as an alternative. | |
int | MaxSizeDiff |
Maximum length difference to consider a word as a possible correction. | |
freeling::regexp | CheckKnownTags |
tags of known words to be checked | |
bool | CheckUnknown |
whether unknown words should be checked | |
int | DistanceType |
Static Private Attributes | |
static const int | ORTHOGRAPHIC = 1 |
type of distance used | |
static const int | PHONETIC = 2 |
Class alternatives suggests words that are orthographically/phonetically similar to the input word.
Results may be used for spell checking.
freeling::alternatives::alternatives | ( | const std::wstring & | altsFile | ) |
Constructor.
Create an alternatives module, loading dictionary and options.
Create phonetic transcriptor
References freeling::util::absolute(), freeling::config_file::add_section(), CheckKnownTags, CheckUnknown, freeling::config_file::close(), DistanceThreshold, DistanceType, ERROR_CRASH, freeling::config_file::get_content_line(), freeling::config_file::get_section(), freeling::phonetics::get_sound(), freeling::util::lowercase(), MaxSizeDiff, freeling::util::new_tempfile_name(), freeling::config_file::open(), freeling::util::open_utf8_file(), ORTHOGRAPHIC, orthography, ph, PHONETIC, sed, freeling::foma_FSM::set_cutoff_threshold(), TRACE, WARNING, wstring2int, and wstring2string.
void freeling::alternatives::analyze | ( | sentence & | se | ) | const [virtual] |
spell check each word in sentence
Navigates the sentence adding alternative words (possible correct spelling data)
Implements freeling::processor.
References CheckKnownTags, CheckUnknown, DistanceType, filter_alternatives(), freeling::foma_FSM::get_similar_words(), freeling::phonetics::get_sound(), int2wstring, ORTHOGRAPHIC, ph, PHONETIC, freeling::regexp::search(), sed, and TRACE.
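The references above suggest that analyze() decides per word whether to look for alternatives, based on CheckUnknown and CheckKnownTags. The following is a minimal, self-contained sketch of such a decision, not the FreeLing implementation. It assumes a word is "unknown" when it has no tags, and it uses std::wregex in place of freeling::regexp; the function name should_check_sketch and its parameters are hypothetical.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: decide whether a word should be spell checked.
// `tags` holds the PoS tags of the word's analyses; an empty vector is
// taken here to mean "unknown word" (an assumption of this sketch).
bool should_check_sketch(const std::vector<std::wstring> &tags,
                         bool check_unknown,
                         const std::wregex &check_known_tags) {
  // Unknown words are checked only if the option is enabled.
  if (tags.empty()) return check_unknown;
  // Known words are checked if any of their tags matches the pattern.
  for (const auto &t : tags)
    if (std::regex_search(t, check_known_tags)) return true;
  return false;
}
```

With a pattern such as `^NC`, a known word tagged as a common noun would be checked regardless of the unknown-word option, while a known verb would be skipped.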
void freeling::alternatives::filter_alternatives | ( | const std::list< std::pair< std::wstring, int > > & | , |
word & | |||
) | const [private] |
adds the new words that are possible correct spellings of the original word to the word analysis data
adds the new words that are valid alternatives.
References freeling::word::alternatives_begin(), freeling::word::alternatives_end(), freeling::word::clear_alternatives(), DistanceThreshold, DistanceType, filter_candidate(), freeling::word::get_alternatives(), freeling::word::get_lc_form(), ORTHOGRAPHIC, orthography, PHONETIC, and TRACE.
Referenced by analyze().
void freeling::alternatives::filter_candidate | ( | const std::wstring & | , |
const std::wstring & | , | ||
int | distance, | ||
std::map< std::wstring, int > & | |||
) | const [private] |
filter given candidate and decide if it is a valid alternative.
References int2wstring, MaxSizeDiff, and TRACE.
Referenced by filter_alternatives().
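This page only states that filter_candidate() uses MaxSizeDiff and fills a map from word to distance. One plausible reading, sketched below under that assumption, is that a candidate is rejected when its length differs too much from the original word, and that the smallest distance seen for each accepted candidate is kept. The name filter_candidate_sketch and the explicit max_size_diff parameter are hypothetical; the real method reads MaxSizeDiff from the class.

```cpp
#include <cassert>
#include <cstdlib>
#include <map>
#include <string>

// Hypothetical stand-in for the private filter: accept `candidate` only
// if its length is within `max_size_diff` of `original`, and remember
// the lowest distance seen for each accepted candidate.
void filter_candidate_sketch(const std::wstring &original,
                             const std::wstring &candidate,
                             int distance,
                             int max_size_diff,
                             std::map<std::wstring, int> &accepted) {
  long diff = static_cast<long>(original.size()) -
              static_cast<long>(candidate.size());
  if (std::labs(diff) > max_size_diff) return;  // too different in length

  auto it = accepted.find(candidate);
  if (it == accepted.end())
    accepted.emplace(candidate, distance);      // first time seen
  else if (distance < it->second)
    it->second = distance;                      // keep the best distance
}
```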
void freeling::alternatives::get_similar_words | ( | const std::wstring & | , |
std::list< std::pair< std::wstring, int > > & | |||
) | const |
direct access to results of underlying automata
Provide direct access to the results of the underlying automata, in case the caller only wants the list of strings.
References DistanceType, freeling::foma_FSM::get_similar_words(), freeling::phonetics::get_sound(), MaxSizeDiff, ORTHOGRAPHIC, orthography, ph, PHONETIC, sed, and TRACE.
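To illustrate the shape of the result list (pairs of word and distance), here is a self-contained sketch that replaces the foma automaton with a brute-force scan and a plain Levenshtein distance. It is a simplification for illustration only: FreeLing delegates the search to foma_FSM, and the names edit_distance, similar_words_sketch, dict, and max_dist are assumptions of this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <utility>
#include <vector>

// Dynamic-programming Levenshtein distance with unit costs.
int edit_distance(const std::wstring &a, const std::wstring &b) {
  std::vector<int> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = static_cast<int>(j);
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = static_cast<int>(i);
    for (std::size_t j = 1; j <= b.size(); ++j) {
      int subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Hypothetical stand-in for the automaton lookup: collect every entry of
// `dict` whose distance to `query` is at most `max_dist`.
void similar_words_sketch(const std::wstring &query,
                          const std::vector<std::wstring> &dict,
                          int max_dist,
                          std::list<std::pair<std::wstring, int>> &out) {
  out.clear();
  for (const auto &w : dict) {
    int d = edit_distance(query, w);
    if (d <= max_dist) out.emplace_back(w, d);
  }
}
```

A caller that only needs the strings can iterate over the list and ignore the second member of each pair.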
freeling::regexp freeling::alternatives::CheckKnownTags [private] |
tags of known words to be checked
Referenced by alternatives(), and analyze().
bool freeling::alternatives::CheckUnknown [private] |
whether unknown words should be checked
Referenced by alternatives(), and analyze().
int freeling::alternatives::DistanceThreshold [private] |
Maximum distance to consider an entry as an alternative.
Referenced by alternatives(), and filter_alternatives().
int freeling::alternatives::DistanceType [private] |
Referenced by alternatives(), analyze(), filter_alternatives(), and get_similar_words().
int freeling::alternatives::MaxSizeDiff [private] |
Maximum length difference to consider a word as a possible correction.
Referenced by alternatives(), filter_candidate(), and get_similar_words().
const int freeling::alternatives::ORTHOGRAPHIC = 1 [static, private] |
type of distance used
Referenced by alternatives(), analyze(), filter_alternatives(), and get_similar_words().
std::multimap<std::wstring,std::wstring> freeling::alternatives::orthography [private] |
remembers which word(s) each phonetic form came from (only for phonetic distances)
Referenced by alternatives(), filter_alternatives(), and get_similar_words().
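A multimap fits this role because several spellings can share one phonetic form. The sketch below shows the idea with a toy encoder; toy_sound is a made-up simplification and is unrelated to the rules used by freeling::phonetics, and build_orthography and spellings_for are hypothetical helper names.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy phonetic encoder (an assumption of this sketch): drops 'h' and
// merges a few letters that often sound alike.
std::wstring toy_sound(const std::wstring &w) {
  std::wstring s;
  for (wchar_t c : w) {
    if (c == L'h') continue;
    if (c == L'k' || c == L'q') c = L'c';
    else if (c == L'z') c = L's';
    s.push_back(c);
  }
  return s;
}

// Map each phonetic form to every dictionary word that produced it.
std::multimap<std::wstring, std::wstring>
build_orthography(const std::vector<std::wstring> &words) {
  std::multimap<std::wstring, std::wstring> m;
  for (const auto &w : words) m.emplace(toy_sound(w), w);
  return m;
}

// Recover all spellings that share the given phonetic form.
std::vector<std::wstring>
spellings_for(const std::multimap<std::wstring, std::wstring> &m,
              const std::wstring &sound) {
  std::vector<std::wstring> out;
  auto range = m.equal_range(sound);
  for (auto it = range.first; it != range.second; ++it)
    out.push_back(it->second);
  return out;
}
```

When the distance is computed over phonetic forms, a lookup of this kind is what lets the matched sound be translated back into the written words to suggest.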
phonetics* freeling::alternatives::ph [private] |
The class that translates a word into phonetic sounds.
Referenced by alternatives(), analyze(), get_similar_words(), and ~alternatives().
const int freeling::alternatives::PHONETIC = 2 [static, private] |
Referenced by alternatives(), analyze(), filter_alternatives(), and get_similar_words().
foma_FSM* freeling::alternatives::sed [private] |
FSM for orthographic edit distance.
Referenced by alternatives(), analyze(), get_similar_words(), and ~alternatives().