FreeLing  3.1
freeling::lang_ident Class Reference

Class "lang_ident" checks a text against all known languages and sorts the results by probability.

#include <lang_ident.h>

Public Member Functions

 lang_ident ()
 Build an empty language identifier.
 lang_ident (const std::wstring &)
 Build a language identifier, reading options from the given file.
void add_language (const std::wstring &)
 Load the given language from the given model file and add it to the existing languages.
void train_language (const std::wstring &, const std::wstring &, const std::wstring &)
 Train a model for a language, store it in modelFile, and add it to the list of known languages.
std::wstring identify_language (const std::wstring &, const std::set< std::wstring > &ls=std::set< std::wstring >()) const
 Classify the input text and return the code of the best language (or "none").
void rank_languages (std::vector< std::pair< double, std::wstring > > &, const std::wstring &, const std::set< std::wstring > &ls=std::set< std::wstring >()) const
 Fill a vector with sorted probabilities for each language.

Private Member Functions

void language_probabilities (std::vector< std::pair< double, std::wstring > > &, const std::wstring &, const std::set< std::wstring > &) const
 Fill a vector with unsorted probabilities for each language.

Private Attributes

std::map< std::wstring, idioma > idiomes
 List of known languages.
std::set< std::wstring > all_known_languages
double Threshold
 Threshold likelihood to consider a text as belonging to a language.
double ScaleFactor
 ScaleFactor to correct likelihood of each language.

Detailed Description

Class "lang_ident" checks a text against all known languages and sorts the results by probability.

It creates an instance of "idioma" for each known language, and checks input text against all existing instances.
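The one-instance-per-language arrangement described above can be sketched as follows. This is only an illustration under stated assumptions: `toy_model` and `score_all` are hypothetical names, and the character-overlap score is a stand-in that is not FreeLing's actual "idioma" model.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for "idioma": scores a text by the fraction of its
// characters found in a per-language alphabet. NOT FreeLing's real model.
struct toy_model {
  std::wstring alphabet;
  double score(const std::wstring &text) const {
    if (text.empty()) return 0.0;
    std::size_t hits = 0;
    for (wchar_t c : text)
      if (alphabet.find(c) != std::wstring::npos) ++hits;
    return static_cast<double>(hits) / static_cast<double>(text.size());
  }
};

// One model instance per language code; the text is checked against each one.
std::vector<std::pair<double, std::wstring>>
score_all(const std::map<std::wstring, toy_model> &models,
          const std::wstring &text) {
  std::vector<std::pair<double, std::wstring>> result;
  for (const auto &entry : models)
    result.emplace_back(entry.second.score(text), entry.first);
  // Every known language contributes exactly one (score, code) pair.
  assert(result.size() == models.size());
  return result;  // ordered by language code (map order), not by score
}
```

The result uses the same `(double, code)` pair layout as the vectors filled by the class, so it could be sorted or filtered afterwards.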


Constructor & Destructor Documentation

freeling::lang_ident::lang_ident ( )

Build an empty language identifier.

freeling::lang_ident::lang_ident ( const std::wstring &  )

Build a language identifier, reading options from the given file.


Member Function Documentation

void freeling::lang_ident::add_language ( const std::wstring &  )

Load the given language from the given model file and add it to the existing languages.

std::wstring freeling::lang_ident::identify_language ( const std::wstring &  ,
const std::set< std::wstring > &  ls = std::set< std::wstring >() 
) const

Classify the input text and return the code of the best language (or "none").
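How a "best language or none" decision might look is sketched below. The helper `pick_best` is hypothetical, and two assumptions are made that this page does not confirm: the candidates arrive sorted best-first, and a candidate is accepted only when its score exceeds the threshold.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

using candidate = std::pair<double, std::wstring>;

// Hypothetical helper, not part of FreeLing. Assumes `ranked` is sorted by
// descending score and that acceptance means "score strictly above threshold".
std::wstring pick_best(const std::vector<candidate> &ranked, double threshold) {
  // Precondition check: scores must be non-increasing.
  assert(std::is_sorted(ranked.begin(), ranked.end(),
                        [](const candidate &a, const candidate &b) {
                          return a.first > b.first;
                        }));
  if (ranked.empty() || ranked.front().first <= threshold) return L"none";
  return ranked.front().second;
}
```

Returning the literal code "none" rather than an empty string mirrors the documented return value, so callers can compare against a single sentinel.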

void freeling::lang_ident::language_probabilities ( std::vector< std::pair< double, std::wstring > > &  ,
const std::wstring &  ,
const std::set< std::wstring > &   
) const [private]

Fill a vector with unsorted probabilities for each language.

void freeling::lang_ident::rank_languages ( std::vector< std::pair< double, std::wstring > > &  ,
const std::wstring &  ,
const std::set< std::wstring > &  ls = std::set< std::wstring >() 
) const

Fill a vector with sorted probabilities for each language.
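A minimal sketch of ranking a vector of `(probability, code)` pairs follows. `rank_scores` is a hypothetical helper, not the library's implementation. It assumes that higher scores are better and that an empty `ls` set means "consider every language", which the default argument suggests but this page does not state.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

using scored = std::pair<double, std::wstring>;

// Hypothetical helper: optionally restrict to the codes in `ls` (an empty set
// is ASSUMED to mean no restriction), then order by descending score.
void rank_scores(std::vector<scored> &v,
                 const std::set<std::wstring> &ls = std::set<std::wstring>()) {
  if (!ls.empty()) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [&ls](const scored &s) {
                             return ls.count(s.second) == 0;
                           }),
            v.end());
  }
  const auto by_score_desc = [](const scored &a, const scored &b) {
    return a.first > b.first;
  };
  // stable_sort keeps the incoming order of languages with equal scores.
  std::stable_sort(v.begin(), v.end(), by_score_desc);
  assert(std::is_sorted(v.begin(), v.end(), by_score_desc));
}
```

After the call, the first element (if any) is the top candidate, which is the shape a best-language lookup would need.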

void freeling::lang_ident::train_language ( const std::wstring &  ,
const std::wstring &  ,
const std::wstring &   
)

Train a model for a language, store it in modelFile, and add it to the list of known languages.


Member Data Documentation

std::set<std::wstring> freeling::lang_ident::all_known_languages [private]

std::map<std::wstring,idioma> freeling::lang_ident::idiomes [private]

List of known languages.

double freeling::lang_ident::ScaleFactor [private]

ScaleFactor to correct likelihood of each language.

double freeling::lang_ident::Threshold [private]

Threshold likelihood to consider a text as belonging to a language.


The documentation for this class was generated from the following file: