Package org.apache.tika.ml.chardetect.tools
package org.apache.tika.ml.chardetect.tools
-
ClassesClassDescriptionMicro-benchmark comparing charset detector throughput.Generates charset-detection training, devtest, and test data from MADLAD-400 and Cantonese Wikipedia sentence files.Compares
MojibusterEncodingDetectoragainst ICU4J and juniversalchardet.Naive-Bayes byte-bigram charset classifier trainer.