| Interface | Description |
|---|---|
| BytesRefCalculator<T> | Interface for calculators that require a string |
| BytesRefCalculator.BytesRefCalcInstance<T> | |
| LanguageAwareTokenCountStats<T> | Interface for calculators that require language probabilities and token stats |
| StringStatsCalculator<T> | Interface for calculators that require a string |
| TextStatsCalculator | Base text stats interface |
| TokenCountStatsCalculator<T> | Interface for calculators that require token stats |

| Class | Description |
|---|---|
| BasicTokenCountStatsCalculator | |
| CommonTokens | |
| CommonTokensBhattacharyya | |
| CommonTokensCosine | |
| CommonTokensHellinger | |
| CommonTokensKLDivergence | |
| CommonTokensKLDNormed | |
| CompositeTextStatsCalculator | |
| ContentLengthCalculator | |
| TextProfileSignature | Copied nearly directly from Apache Nutch: https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/crawl/TextProfileSignature.java |
| TextSha256Signature | Calculates the base32 encoded SHA-256 checksum on the analyzed text |
| TokenCountPriorityQueue | |
| TokenEntropy | |
| TokenLengths | |
| TopNTokens | |
| UnicodeBlockCounter | |

Copyright © 2007–2021 The Apache Software Foundation. All rights reserved.