Class | Description |
---|---|
AlphaIdeographFilterFactory | Factory for a filter that only lets through tokens with characters that are "isAlphabetic" or "isIdeographic". |
AnalyzerManager | |
CJKBigramAwareLengthFilterFactory | Creates a very narrowly focused TokenFilter that limits tokens based on length _unless_ they've been identified as <DOUBLE> or <SINGLE> by the CJKBigramFilter. |
CommonTokenCountManager | |
CommonTokenResult | |
ContrastStatistics | |
LangModel | |
TokenContraster |
Computes some corpus contrast statistics.
|
TokenCounter | Deprecated. Use CompositeTextStatsCalculator with TokenEntropy, TokenLengths, and TopNTokens. |
TokenCountPriorityQueue | |
TokenCounts | |
TokenIntPair | |
TokenStatistics | |
URLEmailNormalizingFilterFactory | Factory for a filter that normalizes URLs and emails to __url__ and __email__ respectively. |
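
The AlphaIdeographFilterFactory entry above refers to the "isAlphabetic" and "isIdeographic" character tests. As a rough illustration only, the following sketch applies the standard `java.lang.Character` methods of those names to whole tokens. The class `AlphaIdeographSketch` and the method `hasAlphaOrIdeograph` are hypothetical names, not part of this package, and the sketch assumes that a token passes when at least one of its code points satisfies either test; the actual filter may apply a different rule.

```java
public class AlphaIdeographSketch {

    // Hypothetical helper, not part of the documented API.
    // Assumption: a token is kept if at least one of its code points
    // is alphabetic or ideographic according to java.lang.Character.
    public static boolean hasAlphaOrIdeograph(String token) {
        return token.codePoints()
                .anyMatch(cp -> Character.isAlphabetic(cp) || Character.isIdeographic(cp));
    }

    public static void main(String[] args) {
        // "\u6771\u4EAC" is a two-character CJK ideographic token.
        String[] tokens = {"tika", "2022", "\u6771\u4EAC", "--"};
        for (String token : tokens) {
            System.out.println(token + " -> " + hasAlphaOrIdeograph(token));
        }
    }
}
```

Iterating over code points rather than `char` values keeps supplementary-plane ideographs, which occupy two `char` values each, from being tested one surrogate at a time.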
Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.