Package org.apache.tika.ml.junkdetect.tools
Classes

Description
Builds per-script positive training data for the junk detector from MADLAD-400 and Wikipedia sentence files.
Ablation evaluation for the junk detector.
Trains the junk detector model from per-script corpus files produced by BuildJunkTrainingData.