Package org.apache.tika.eval.core.textstats

Interface Summary

Interface                                     Description
BytesRefCalculator<T>                         Interface for calculators that require a string
BytesRefCalculator.BytesRefCalcInstance<T>
LanguageAwareTokenCountStats<T>               Interface for calculators that require language probabilities and token stats
StringStatsCalculator<T>                      Interface for calculators that require a string
TextStatsCalculator                           Base text stats interface
TokenCountStatsCalculator<T>                  Interface for calculators that require token stats

Class Summary

Class                             Description
BasicTokenCountStatsCalculator
CommonTokens
CommonTokensBhattacharyya
CommonTokensCosine
CommonTokensHellinger
CommonTokensKLDivergence
CommonTokensKLDNormed
CompositeTextStatsCalculator
ContentLengthCalculator
TextProfileSignature              Copied nearly directly from Apache Nutch: https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/crawl/TextProfileSignature.java
TextSha256Signature               Calculates the base32 encoded SHA-256 checksum on the analyzed text
TokenCountPriorityQueue
TokenEntropy
TokenLengths
TopNTokens
UnicodeBlockCounter
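
To illustrate the TextSha256Signature description above, here is a minimal sketch of a base32-encoded SHA-256 checksum over a string. It is not Tika's implementation: the class Base32Sha256Sketch and its methods are hypothetical names, the encoder is a hand-rolled RFC 4648 base32 with '=' padding, and encoding the input as UTF-8 is an assumption. How Tika analyzes the text before hashing is not stated on this page and is not modeled here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration class; not part of org.apache.tika.eval.core.textstats.
class Base32Sha256Sketch {

    // RFC 4648 base32 alphabet.
    private static final char[] ALPHABET =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".toCharArray();

    // Encodes bytes as padded RFC 4648 base32.
    static String base32(byte[] data) {
        StringBuilder sb = new StringBuilder();
        int buffer = 0; // holds bits not yet emitted (only the low bits matter)
        int bits = 0;   // number of pending bits in buffer
        for (byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 5) {
                sb.append(ALPHABET[(buffer >> (bits - 5)) & 0x1F]);
                bits -= 5;
            }
        }
        if (bits > 0) {
            // Left-align the remaining bits in a final 5-bit group.
            sb.append(ALPHABET[(buffer << (5 - bits)) & 0x1F]);
        }
        while (sb.length() % 8 != 0) {
            sb.append('=');
        }
        return sb.toString();
    }

    // Hashes the text (assumed UTF-8) with SHA-256 and base32-encodes the digest.
    static String signature(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            return base32(digest.digest(text.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(signature("the quick brown fox"));
    }
}
```

Because a SHA-256 digest is always 32 bytes, the padded base32 string produced by this sketch is always 56 characters long, which makes it convenient as a fixed-width key for comparing extracted text across runs.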