Package | Description |
---|---|
org.apache.tika.eval.langid | |
org.apache.tika.eval.textstats | |
Modifier and Type | Class and Description |
---|---|
class | LanguageIDWrapper: The most efficient way to call this in a multithreaded environment is to call LanguageIDWrapper.loadBuiltInModels() before instantiating the … |
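
The preload-then-instantiate advice above can be illustrated with a minimal, self-contained sketch. It does not use Tika itself: `SketchLanguageId`, its toy model map, `LOAD_COUNT`, and `PreloadSketch` are hypothetical stand-ins invented for this example, and only the method name `loadBuiltInModels()` is taken from the listing. The assumption being illustrated is that the models are shared, read-only state, so loading them once up front means worker threads that construct their own instances do not each pay the loading cost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a language identifier with shared models.
// This is NOT Tika's LanguageIDWrapper implementation.
class SketchLanguageId {
    // Counts how many times the (pretend) expensive load actually ran.
    static final AtomicInteger LOAD_COUNT = new AtomicInteger();
    private static volatile Map<String, String> models;

    // Loads the shared models once; later calls return immediately.
    static synchronized void loadBuiltInModels() {
        if (models == null) {
            LOAD_COUNT.incrementAndGet();
            Map<String, String> m = new HashMap<>();
            m.put("the", "eng"); // toy data, assumption for the sketch
            m.put("der", "deu");
            models = m;          // published via the volatile field
        }
    }

    SketchLanguageId() {
        // Fallback for callers that did not preload: the first
        // constructor call would then block on the synchronized load.
        loadBuiltInModels();
    }

    String detect(String token) {
        return models.getOrDefault(token, "unknown");
    }
}

class PreloadSketch {
    static List<String> detectAll(List<String> tokens, int threads) throws Exception {
        // Load the shared models before any worker creates an instance.
        SketchLanguageId.loadBuiltInModels();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String token : tokens) {
                // Each task builds its own instance; the models are already loaded.
                futures.add(pool.submit(() -> new SketchLanguageId().detect(token)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get());
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(detectAll(Arrays.asList("the", "der", "xyz"), 3));
        System.out.println("model loads: " + SketchLanguageId.LOAD_COUNT.get());
    }
}
```

With the real class, the equivalent step would presumably be a single call to `LanguageIDWrapper.loadBuiltInModels()` during application startup, before any worker threads are started; check the class's own Javadoc for the exact contract.
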
Modifier and Type | Interface and Description |
---|---|
interface | LanguageAwareTokenCountStats<T>: Interface for calculators that require language probabilities and token stats |
interface | StringStatsCalculator<T>: Interface for calculators that require a string |
interface | TokenCountStatsCalculator<T>: Interface for calculators that require token stats |
Modifier and Type | Class and Description |
---|---|
class | BasicTokenCountStatsCalculator |
class | CommonTokens |
class | CommonTokensBhattacharyya |
class | CommonTokensCosine |
class | CommonTokensHellinger |
class | CommonTokensKLDivergence |
class | CommonTokensKLDNormed |
class | ContentLengthCalculator |
class | TokenEntropy |
class | TokenLengths |
class | TopNTokens |
class | UnicodeBlockCounter |
Constructor and Description |
---|
CompositeTextStatsCalculator(List<TextStatsCalculator> calculators) |
CompositeTextStatsCalculator(List<TextStatsCalculator> calculators, org.apache.lucene.analysis.Analyzer analyzer, LanguageIDWrapper languageIDWrapper) |
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.