| Class | Description |
|---|---|
| AlphaIdeographFilterFactory | Factory for a filter that only lets through tokens with characters that are `isAlphabetic` or `isIdeographic`. |
| AnalyzerManager | |
| CJKBigramAwareLengthFilterFactory | Creates a very narrowly focused TokenFilter that limits tokens based on length _unless_ they've been identified as `<DOUBLE>` or `<SINGLE>` by the CJKBigramFilter. |
| CommonTokenCountManager | |
| CommonTokenResult | |
| ContrastStatistics | |
| LangModel | |
| TokenContraster | Computes some corpus contrast statistics. |
| TokenCounter | Deprecated: use CompositeTextStatsCalculator with TokenEntropy, TokenLengths and TopNTokens. |
| TokenCountPriorityQueue | |
| TokenCounts | |
| TokenIntPair | |
| TokenStatistics | |
| URLEmailNormalizingFilterFactory | Factory for a filter that normalizes URLs and emails to `__url__` and `__email__` respectively. |
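
The character test named in the AlphaIdeographFilterFactory description can be illustrated with the standard `Character.isAlphabetic` and `Character.isIdeographic` methods. The following is a minimal sketch, not the filter's actual implementation: the class `AlphaIdeographSketch` and its `keep` method are hypothetical names, and it assumes a token is kept when at least one of its code points passes either test, which may differ from the real filter.

```java
public class AlphaIdeographSketch {

    // Hypothetical helper: returns true when the token contains at least one
    // code point that is alphabetic or ideographic (assumed "any match" rule).
    public static boolean keep(String token) {
        return token.codePoints()
                .anyMatch(cp -> Character.isAlphabetic(cp) || Character.isIdeographic(cp));
    }

    public static void main(String[] args) {
        // Print the decision for a few sample tokens.
        for (String t : new String[] {"hello", "1234", "漢字", "--"}) {
            System.out.println(t + " -> " + keep(t));
        }
    }
}
```

Iterating over code points rather than `char` values keeps the check correct for supplementary characters, such as ideographs outside the Basic Multilingual Plane.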

Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.