public class OptimaizeLangDetector extends LanguageDetector
Fields inherited from class LanguageDetector: mixedLanguages, shortText
Constructor and Description |
---|
OptimaizeLangDetector() |
Modifier and Type | Method | Description |
---|---|---|
void | addText(char[] cbuf, int off, int len) | Add statistics about this text for the current document. |
List<LanguageResult> | detectAll() | Detect languages based on previously submitted text (via addText calls). |
boolean | hasEnoughText() | Tell the caller whether more text is required for the current document before the language can be reliably detected. |
boolean | hasModel(String language) | Provide information about whether a model exists for a specific language. |
LanguageDetector | loadModels() | Load (or re-load) all available language models. |
LanguageDetector | loadModels(Set<String> languages) | Load (or re-load) the models specified in languages. |
void | reset() | Reset statistics about the current document being processed. |
LanguageDetector | setPriors(Map<String,Float> languageProbabilities) | Set the a-priori probabilities for these languages. |
Methods inherited from class LanguageDetector: addText, detect, detect, detectAll, getDefaultLanguageDetector, getLanguageDetectors, getLanguageDetectors, isMixedLanguages, isShortText, setMixedLanguages, setShortText
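public LanguageDetector loadModels() throws IOException
Description copied from class: LanguageDetector
Load (or re-load) all available language models.
Specified by: loadModels in class LanguageDetector
Throws: IOException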
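public LanguageDetector loadModels(Set<String> languages) throws IOException
Description copied from class: LanguageDetector
Load (or re-load) the models specified in languages.
Specified by: loadModels in class LanguageDetector
Parameters:
languages - list of target languages.
Throws: IOException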
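
A minimal sketch of preparing the languages argument for loadModels(Set<String>). It assumes the set uses the same two-letter ISO 639-1 codes documented for hasModel, which this page does not state for loadModels. LanguageSetSketch and targetLanguages are hypothetical names, and only JDK classes are used.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

final class LanguageSetSketch {
    // Two-letter ISO 639 language codes known to the running JDK.
    private static final Set<String> ISO_639_1 =
            new HashSet<>(Arrays.asList(Locale.getISOLanguages()));

    /** Keeps only two-letter codes known to the JDK, lower-cased, in input order. */
    static Set<String> targetLanguages(String... candidates) {
        Set<String> result = new LinkedHashSet<>();
        for (String candidate : candidates) {
            String code = candidate.trim().toLowerCase(Locale.ROOT);
            if (ISO_639_1.contains(code)) {
                result.add(code);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "eng" (three letters) and "xx" (unassigned) are filtered out.
        System.out.println(targetLanguages("en", "FR", "eng", "xx"));
    }
}
```

The resulting set could then be passed to loadModels(Set<String>). Whether the detector itself rejects unknown codes is not described here, so filtering up front is a precaution rather than a requirement.
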
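public boolean hasModel(String language)
Description copied from class: LanguageDetector
Provide information about whether a model exists for a specific language.
Specified by: hasModel in class LanguageDetector
Parameters:
language - ISO 639-1 name for language
public LanguageDetector setPriors(Map<String,Float> languageProbabilities) throws IOException
Description copied from class: LanguageDetector
Set the a-priori probabilities for these languages.
Specified by: setPriors in class LanguageDetector
Parameters:
languageProbabilities - Map from language to probability
Throws: IOException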
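
A hedged sketch of building the Map<String,Float> that setPriors accepts. It scales raw weights so the values sum to 1. This page does not say whether the detector requires normalized values, so treat the normalization as an assumption. PriorsSketch and normalize are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PriorsSketch {
    /** Scales non-negative weights so they sum to 1 and converts them to Float values. */
    static Map<String, Float> normalize(Map<String, Double> weights) {
        double total = 0.0;
        for (double w : weights.values()) {
            if (w < 0) {
                throw new IllegalArgumentException("negative weight");
            }
            total += w;
        }
        if (total <= 0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }
        Map<String, Float> priors = new LinkedHashMap<>();
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            priors.put(e.getKey(), (float) (e.getValue() / total));
        }
        return priors;
    }

    public static void main(String[] args) {
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("en", 3.0); // e.g. three English documents seen for every French one
        weights.put("fr", 1.0);
        System.out.println(normalize(weights)); // {en=0.75, fr=0.25}
    }
}
```
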
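public void reset()
Description copied from class: LanguageDetector
Reset statistics about the current document being processed.
Specified by: reset in class LanguageDetector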
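public void addText(char[] cbuf, int off, int len)
Description copied from class: LanguageDetector
Add statistics about this text for the current document.
Specified by: addText in class LanguageDetector
Parameters:
cbuf - Character buffer
off - Offset into cbuf to first character in the run of text
len - Number of characters in the run of text.
public List<LanguageResult> detectAll()
Description copied from class: LanguageDetector
Detect languages based on previously submitted text (via addText calls).
Specified by: detectAll in class LanguageDetector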
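
The cbuf, off, and len parameters of addText follow the usual java.io buffer convention, so text can be submitted in chunks as it is read. The sketch below illustrates that convention with a hypothetical TextSink interface that mirrors the addText signature. It does not call the detector itself, and AddTextSketch and feed are illustrative names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

final class AddTextSketch {
    /** Hypothetical stand-in with the same shape as addText(char[], int, int). */
    interface TextSink {
        void addText(char[] cbuf, int off, int len);
    }

    /** Reads the reader in fixed-size chunks and forwards only the chars actually read. */
    static void feed(Reader reader, TextSink sink, int chunkSize) {
        char[] buf = new char[chunkSize];
        try {
            int n;
            while ((n = reader.read(buf, 0, buf.length)) != -1) {
                // The buffer may be only partly filled, so pass n rather than buf.length.
                sink.addText(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        StringBuilder seen = new StringBuilder();
        feed(new StringReader("Bonjour tout le monde"),
                (cbuf, off, len) -> seen.append(cbuf, off, len), 8);
        System.out.println(seen); // Bonjour tout le monde
    }
}
```

With a real detector, the lambda would presumably be replaced by a reference to its addText method, followed by a single detectAll call once all text has been submitted.
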
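public boolean hasEnoughText()
Description copied from class: LanguageDetector
Tell the caller whether more text is required for the current document before the language can be reliably detected.
Specified by: hasEnoughText in class LanguageDetector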
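
One way a caller might use hasEnoughText is to stop submitting text once the detector reports that it has enough. The sketch below shows that calling pattern against a hypothetical TextDetector interface and a toy CountingDetector. The character-count threshold is invented for illustration, because this page does not describe the criterion the real detector uses.

```java
final class EnoughTextSketch {
    /** Hypothetical stand-in exposing the two documented method shapes. */
    interface TextDetector {
        void addText(char[] cbuf, int off, int len);
        boolean hasEnoughText();
    }

    /** Toy implementation: "enough" means at least `threshold` characters were submitted. */
    static final class CountingDetector implements TextDetector {
        private final int threshold;
        private int seen;

        CountingDetector(int threshold) {
            this.threshold = threshold;
        }

        @Override
        public void addText(char[] cbuf, int off, int len) {
            seen += len;
        }

        @Override
        public boolean hasEnoughText() {
            return seen >= threshold;
        }
    }

    /** Submits text chunk by chunk until the detector has enough; returns the chars submitted. */
    static int feedUntilEnough(TextDetector detector, String text, int chunkSize) {
        char[] cbuf = text.toCharArray();
        int off = 0;
        while (off < cbuf.length && !detector.hasEnoughText()) {
            int len = Math.min(chunkSize, cbuf.length - off);
            detector.addText(cbuf, off, len);
            off += len;
        }
        return off;
    }

    public static void main(String[] args) {
        CountingDetector detector = new CountingDetector(10);
        int submitted = feedUntilEnough(detector, "The quick brown fox jumps over the lazy dog", 4);
        System.out.println(submitted); // 12
    }
}
```

Stopping early avoids processing a large document in full when a prefix is already sufficient. Calling reset() before the next document would clear the accumulated statistics.
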
Copyright © 2007–2017 The Apache Software Foundation. All rights reserved.