FreeLing
3.1
Class "lang_ident" checks a text against all known languages and sorts the results by probability.
#include <lang_ident.h>
Public Member Functions
lang_ident ()
Build an empty language identifier.
lang_ident (const std::wstring &)
Build a language identifier, reading options from the given file.
void add_language (const std::wstring &)
Load the given language from the given model file and add it to the existing languages.
void train_language (const std::wstring &, const std::wstring &, const std::wstring &)
Train a model for a language, store it in modelFile, and add it to the list of known languages.
std::wstring identify_language (const std::wstring &, const std::set< std::wstring > &ls=std::set< std::wstring >()) const
Classify the input text and return the code of the best language (or "none").
void rank_languages (std::vector< std::pair< double, std::wstring > > &, const std::wstring &, const std::set< std::wstring > &ls=std::set< std::wstring >()) const
Fill a vector with sorted probabilities for each language.
Private Member Functions
void language_probabilities (std::vector< std::pair< double, std::wstring > > &, const std::wstring &, const std::set< std::wstring > &) const
Fill a vector with unsorted probabilities for each language.
Private Attributes
std::map< std::wstring, idioma > idiomes
List of known languages.
std::set< std::wstring > all_known_languages
double Threshold
Threshold likelihood to consider a text as belonging to a language.
double ScaleFactor
Scale factor used to correct the likelihood of each language.
Class "lang_ident" checks a text against all known languages and sorts the results by probability.
It creates an instance of "idioma" for each known language, and checks input text against all existing instances.
freeling::lang_ident::lang_ident ()
Build an empty language identifier.
freeling::lang_ident::lang_ident (const std::wstring &)
Build a language identifier, reading options from the given file.
void freeling::lang_ident::add_language (const std::wstring &)
Load the given language from the given model file and add it to the existing languages.
std::wstring freeling::lang_ident::identify_language (const std::wstring &, const std::set< std::wstring > & ls = std::set< std::wstring >()) const
Classify the input text and return the code of the best language (or "none").
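The "best language or none" behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use FreeLing and is not the library's implementation: `best_or_none` is a hypothetical helper, and it assumes that a higher score is better and that the Threshold attribute acts as a lower bound on the best score.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper, not part of FreeLing. Returns the code of the
// highest-scoring language, or L"none" when the list is empty or the best
// score is below `threshold` (assumed meaning of the Threshold attribute).
std::wstring best_or_none(const std::vector<std::pair<double, std::wstring>> &scores,
                          double threshold) {
  if (scores.empty()) return L"none";

  // Compare on the score only, so the input does not need to be sorted.
  auto best = std::max_element(
      scores.begin(), scores.end(),
      [](const std::pair<double, std::wstring> &a,
         const std::pair<double, std::wstring> &b) { return a.first < b.first; });

  return best->first >= threshold ? best->second : std::wstring(L"none");
}
```

The input has the same element type as the vector filled by rank_languages and language_probabilities, so the sketch works on either sorted or unsorted scores.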
void freeling::lang_ident::language_probabilities (std::vector< std::pair< double, std::wstring > > &, const std::wstring &, const std::set< std::wstring > &) const [private]
Fill a vector with unsorted probabilities for each language.
void freeling::lang_ident::rank_languages (std::vector< std::pair< double, std::wstring > > &, const std::wstring &, const std::set< std::wstring > & ls = std::set< std::wstring >()) const
Fill a vector with sorted probabilities for each language.
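As a rough illustration of what a sorted vector of (probability, language code) pairs looks like, the following self-contained sketch filters and sorts such a vector. It does not call FreeLing: `rank_sketch` and `scored_language` are hypothetical names, and two points are assumptions rather than documented behaviour, namely that an empty `ls` set means "consider every language" and that the result is ordered from most to least probable.

```cpp
#include <algorithm>
#include <functional>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Same element type as the vector filled by rank_languages:
// (probability, language code).
using scored_language = std::pair<double, std::wstring>;

// Hypothetical helper, not part of FreeLing. Keeps only the languages listed
// in `ls` (an empty set is assumed to mean "no restriction"), then sorts the
// remaining entries from most to least probable.
std::vector<scored_language> rank_sketch(
    std::vector<scored_language> scores,
    const std::set<std::wstring> &ls = std::set<std::wstring>()) {
  if (!ls.empty()) {
    scores.erase(std::remove_if(scores.begin(), scores.end(),
                                [&ls](const scored_language &s) {
                                  return ls.count(s.second) == 0;
                                }),
                 scores.end());
  }
  // std::pair compares by probability first, so std::greater yields a
  // descending order; ties are broken by language code.
  std::sort(scores.begin(), scores.end(), std::greater<scored_language>());
  return scores;
}
```

Putting the probability first in the pair is what makes a plain `std::sort` sufficient, which may be why the interface uses `std::pair< double, std::wstring >` rather than the reverse order.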
void freeling::lang_ident::train_language (const std::wstring &, const std::wstring &, const std::wstring &)
Train a model for a language, store it in modelFile, and add it to the list of known languages.
std::set<std::wstring> freeling::lang_ident::all_known_languages [private]
std::map<std::wstring,idioma> freeling::lang_ident::idiomes [private]
List of known languages.
double freeling::lang_ident::ScaleFactor [private]
Scale factor used to correct the likelihood of each language.
double freeling::lang_ident::Threshold [private]
Threshold likelihood to consider a text as belonging to a language.