Class EvalCharsetDetectors
java.lang.Object
org.apache.tika.ml.chardetect.tools.EvalCharsetDetectors
Compares MojibusterEncodingDetector against ICU4J and juniversalchardet.
Supports:
--lengths 20,50,100,200,full - per-probe-length accuracy sweep
--confusion - top-confusion report for the ML-All detector
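To illustrate what a top-confusion report involves, here is a minimal sketch that tallies (expected, predicted) charset pairs and lists the most frequent mismatches. The class ConfusionTally and its methods are hypothetical names introduced for this example and are not part of Tika; the case-insensitive comparison of charset labels is also an assumption, and the tool's actual report may be computed differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tika: counts misclassified charset pairs.
class ConfusionTally {
    // Key is "expected -> predicted", value is how often that mismatch occurred.
    private final Map<String, Integer> counts = new HashMap<>();

    // Records one detection result; correct predictions are ignored.
    // Assumption: charset labels are compared case-insensitively.
    void record(String expected, String predicted) {
        if (!expected.equalsIgnoreCase(predicted)) {
            counts.merge(expected + " -> " + predicted, 1, Integer::sum);
        }
    }

    // Returns up to n mismatches, most frequent first, ties broken by key.
    List<Map.Entry<String, Integer>> top(int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries.subList(0, Math.min(n, entries.size()));
    }

    public static void main(String[] args) {
        ConfusionTally tally = new ConfusionTally();
        // Illustrative labels only; not real evaluation results.
        tally.record("windows-1252", "ISO-8859-1");
        tally.record("windows-1252", "ISO-8859-1");
        tally.record("Shift_JIS", "EUC-JP");
        tally.record("UTF-8", "UTF-8");
        for (Map.Entry<String, Integer> e : tally.top(10)) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
    }
}
```

Keying on the ordered pair keeps "A mistaken for B" separate from "B mistaken for A", which is usually what a confusion report needs.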
Usage:
java EvalCharsetDetectors \
    [--model /path/to/chardetect.bin] \
    --data /path/to/test-dir \
    [--lengths 20,50,100,200,full] \
    [--confusion]
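As an illustration of the probe-length sweep, the sketch below parses a --lengths value such as the one shown in the usage and truncates a sample to each probe length before it would be handed to a detector. ProbeLengthSweep, parseLengths, probe, and the FULL sentinel are hypothetical names for this example, not the tool's implementation; treating "full" as "use the whole sample" is an assumption based on the option's wording.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Tika: illustrates a per-probe-length sweep.
class ProbeLengthSweep {
    // Sentinel for "full" (assumed to mean the entire sample).
    static final int FULL = -1;

    // Parses a comma-separated spec such as "20,50,100,200,full".
    static List<Integer> parseLengths(String spec) {
        List<Integer> lengths = new ArrayList<>();
        for (String token : spec.split(",")) {
            String t = token.trim();
            lengths.add(t.equalsIgnoreCase("full") ? FULL : Integer.parseInt(t));
        }
        return lengths;
    }

    // Returns the first `length` bytes, or the whole sample when the
    // length is FULL or exceeds the sample size.
    static byte[] probe(byte[] sample, int length) {
        if (length == FULL || length >= sample.length) {
            return sample;
        }
        return Arrays.copyOf(sample, length);
    }

    public static void main(String[] args) {
        byte[] sample = new byte[120]; // stand-in for one test file's bytes
        for (int length : parseLengths("20,50,100,200,full")) {
            String label = (length == FULL) ? "full" : String.valueOf(length);
            System.out.println(label + " -> " + probe(sample, length).length + " bytes");
        }
    }
}
```

Note that cutting at a fixed byte offset can split a multi-byte character, so short probes are a harder test for any detector than the full sample.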
Constructor Details
EvalCharsetDetectors
public EvalCharsetDetectors()
Method Details
main
Throws:
Exception