Package org.apache.tika.eval.tokens
Class Summary

Class
    Description

AlphaIdeographFilterFactory
    Factory for filter that only allows tokens with characters that "isAlphabetic" or "isIdeographic" through.

AnalyzerManager

CJKBigramAwareLengthFilterFactory
    Creates a very narrowly focused TokenFilter that limits tokens based on length _unless_ they've been identified as <DOUBLE> or <SINGLE> by the CJKBigramFilter.

CommonTokenCountManager

CommonTokenResult

ContrastStatistics

LangModel

TokenContraster
    Computes some corpus contrast statistics.

TokenCounter
    Deprecated.

TokenCountPriorityQueue

TokenCounts

TokenIntPair

TokenStatistics

URLEmailNormalizingFilterFactory
    Factory for filter that normalizes urls and emails to __url__ and __email__ respectively.
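
The two filter descriptions above (alphabetic/ideographic filtering and URL/email normalization) can be illustrated with a small standalone sketch. This is not the Tika implementation and it does not use the Lucene TokenFilter API. The class name TokenFilterSketch, its method names, the regular expressions, the "at least one qualifying character" rule, and the order of the two steps are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: a hypothetical stand-in for the behavior described
// in the class summary, not the code in org.apache.tika.eval.tokens.
class TokenFilterSketch {

    // Assumption: a token passes if at least one of its code points is
    // alphabetic or ideographic. The summary does not say whether the real
    // filter requires one such character or all of them.
    static boolean hasAlphaOrIdeograph(String token) {
        return token.codePoints()
                .anyMatch(cp -> Character.isAlphabetic(cp) || Character.isIdeographic(cp));
    }

    // Assumption: deliberately crude patterns for URLs and emails. Only the
    // replacement strings __url__ and __email__ come from the summary.
    static String normalize(String token) {
        if (token.matches("(?i)https?://\\S+")) {
            return "__url__";
        }
        if (token.matches("[^@\\s]+@[^@\\s]+\\.[^@\\s]+")) {
            return "__email__";
        }
        return token;
    }

    // Assumption: normalization runs before the alphabetic/ideographic check.
    static List<String> process(List<String> tokens) {
        List<String> kept = new ArrayList<>();
        for (String token : tokens) {
            String normalized = normalize(token);
            if (hasAlphaOrIdeograph(normalized)) {
                kept.add(normalized);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> tokens = List.of(
                "tika", "1234", "東京", "http://tika.apache.org", "dev@tika.apache.org", "--");
        System.out.println(process(tokens));
    }
}
```

In this sketch, purely numeric or punctuation-only tokens such as "1234" and "--" are dropped, while URLs and email addresses collapse to a single placeholder token each, so they no longer inflate the count of unique tokens.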