Package org.apache.tika.example
Class TextStatsFromTikaEval
java.lang.Object
org.apache.tika.example.TextStatsFromTikaEval
These examples create a new CompositeTextStatsCalculator for each call. This is extremely inefficient because the language ID model and the common-words lists have to be loaded on every call.
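The cost described above is usually avoided by building the expensive object once and reusing it. The following is a minimal, library-free sketch of that pattern. `StandInCalculator`, `Lazy`, and `stats` are hypothetical names introduced only for illustration; they are not part of tika-eval, and the sketch assumes the real calculator is safe to share between calls, which should be checked against the tika-eval documentation.

```java
import java.util.function.Supplier;

// Hypothetical sketch: build an expensive object once, then reuse it.
final class ReusableCalculatorSketch {

    // Counts how often the expensive constructor runs (demonstration only).
    static int loads = 0;

    // Generic lazy holder: the factory is invoked on first use only.
    static final class Lazy<T> {
        private final Supplier<T> factory;
        private T value;

        Lazy(Supplier<T> factory) {
            this.factory = factory;
        }

        synchronized T get() {
            if (value == null) {
                value = factory.get();
            }
            return value;
        }
    }

    // Stand-in for a calculator whose construction is costly
    // (for example, because it loads models and word lists).
    static final class StandInCalculator {
        StandInCalculator() {
            loads++; // the "expensive" step happens here
        }

        int countChars(String txt) {
            return txt.length();
        }
    }

    // One shared instance for every call to stats().
    private static final Lazy<StandInCalculator> CALC =
            new Lazy<>(StandInCalculator::new);

    static int stats(String txt) {
        return CALC.get().countChars(txt);
    }

    public static void main(String[] args) {
        stats("first call");
        stats("second call");
        System.out.println("constructor runs: " + loads);
    }
}
```

With this arrangement the constructor runs on the first call to `stats` and later calls reuse the same instance.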
Constructor Details
TextStatsFromTikaEval
public TextStatsFromTikaEval()
Method Details
getOOV
Use the default language ID models and the default common-token lists in tika-eval to calculate the out-of-vocabulary percentage for a given string.
Parameters:
txt - the string to analyze
Returns:
the out-of-vocabulary percentage for txt
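To make the idea of an out-of-vocabulary measure concrete, here is a minimal, library-free sketch. It is not tika-eval's implementation: it skips language identification entirely, takes a single caller-supplied word set, and uses a simple letter-based tokenizer. `OovSketch`, `oov`, and the `commonWords` parameter are hypothetical names, and the result is expressed as a fraction between 0 and 1, which is an assumption of this sketch.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch of an out-of-vocabulary (OOV) calculation.
final class OovSketch {

    // Returns the fraction of alphabetic tokens in txt that are NOT
    // found in commonWords. Returns 0.0 when txt has no alphabetic
    // tokens (an arbitrary choice made for this sketch).
    static double oov(String txt, Set<String> commonWords) {
        int total = 0;
        int common = 0;
        // Lowercase, then split on any run of non-letter characters.
        for (String token : txt.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            total++;
            if (commonWords.contains(token)) {
                common++;
            }
        }
        if (total == 0) {
            return 0.0;
        }
        return 1.0 - (double) common / total;
    }
}
```

A low value suggests that most tokens are ordinary words from the supplied list, while a high value can indicate garbled or mis-extracted text.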