Package org.apache.tika.example
Class TextStatsFromTikaEval
java.lang.Object
    org.apache.tika.example.TextStatsFromTikaEval

public class TextStatsFromTikaEval extends Object
These examples create a new CompositeTextStatsCalculator for each call. This is extremely inefficient because the language id model and the common words lists have to be loaded on every call.
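
The inefficiency noted above is usually avoided by building the expensive object once and reusing it. The following is a minimal, self-contained sketch of that pattern in plain Java. It does not use the tika-eval API: `CachedScorer` and its `expensiveFactory` parameter are hypothetical names introduced only for illustration, standing in for a calculator whose construction loads models and word lists.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/**
 * Hypothetical wrapper (not part of tika-eval): builds an expensive
 * scorer once and reuses it for every call.
 */
final class CachedScorer {
    private final Function<String, Double> scorer;

    CachedScorer(Supplier<Function<String, Double>> expensiveFactory) {
        // The factory runs exactly once, here, rather than once per score() call.
        this.scorer = expensiveFactory.get();
    }

    double score(String txt) {
        // Every call reuses the instance created in the constructor.
        return scorer.apply(txt);
    }
}
```

Applied to this class, the same idea would presumably mean creating the CompositeTextStatsCalculator once, keeping it in a field, and calling it from each method instead of constructing it per call.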
Constructor Summary
Constructor                Description
TextStatsFromTikaEval()

Method Summary
Modifier and Type    Method                Description
double               getOOV(String txt)    Use the default language id models and the default common tokens lists in tika-eval to calculate the out-of-vocabulary percentage for a given string.

Method Detail
getOOV
public double getOOV(String txt)
Use the default language id models and the default common tokens lists in tika-eval to calculate the out-of-vocabulary percentage for a given string.
Parameters:
txt - the string to analyze
Returns:
the out-of-vocabulary percentage for txt
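
To make the idea of an out-of-vocabulary score concrete, here is a minimal sketch in plain Java. It is not the tika-eval implementation: it skips language identification entirely, and `OovSketch`, its `commonWords` set, and the letter-based tokenization are assumptions made for illustration. The sketch returns the fraction (0.0 to 1.0) of alphabetic tokens that are missing from the supplied word list.

```java
import java.util.Locale;
import java.util.Set;

/** Hypothetical illustration of an out-of-vocabulary score (not tika-eval code). */
final class OovSketch {
    private final Set<String> commonWords;

    OovSketch(Set<String> commonWords) {
        // Assumed to be a lowercase list of common words for one language.
        this.commonWords = commonWords;
    }

    /** Fraction of alphabetic tokens in txt that are absent from commonWords. */
    double oov(String txt) {
        int total = 0;
        int common = 0;
        // Split on any run of characters that are not Unicode letters.
        for (String token : txt.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (token.isEmpty()) {
                continue; // split() can yield an empty leading token
            }
            total++;
            if (commonWords.contains(token)) {
                common++;
            }
        }
        // Text with no alphabetic tokens is scored 0.0 by convention in this sketch.
        return total == 0 ? 0.0 : 1.0 - (double) common / total;
    }
}
```

For example, with a word list containing only "the" and "cat", the input "the cat zzz" has three tokens, two of them common, so the sketch scores it at one third. A high score on extracted text often suggests garbled output or a mismatched language.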