Package org.apache.tika.detect
Media type detection.
Interface Summary

Interface         Description
Detector          Content type detector.
EncodingDetector  Character encoding detector.

Class Summary

Class                         Description
AutoDetectReader              An input stream reader that automatically detects the character encoding to be used for converting bytes to characters.
CompositeDetector             Content type detector that combines multiple different detection mechanisms.
CompositeEncodingDetector
DefaultDetector               A composite detector based on all the Detector implementations available through the service provider mechanism.
DefaultEncodingDetector       A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism.
DefaultProbDetector           A version of DefaultDetector for probabilistic mime detectors, which use statistical techniques to blend the results of differing underlying detectors when attempting to detect the type of a given file.
EmptyDetector                 Dummy detector that returns application/octet-stream for all documents.
MagicDetector                 Content type detection based on magic bytes.
NameDetector                  Content type detection based on the resource name.
NNExampleModelDetector
NNTrainedModel
NNTrainedModelBuilder
NonDetectingEncodingDetector  Always returns the charset passed in via the initializer.
OverrideDetector              Use this to force a content type detection via the TikaCoreProperties.CONTENT_TYPE_OVERRIDE key in the metadata object.
TextDetector                  Content type detection of plain text documents.
TextStatistics                Utility class for computing a histogram of the bytes seen in a stream.
TrainedModel
TrainedModelDetector
TypeDetector                  Content type detection based on a content type hint.
XmlRootExtractor              Utility class that uses a SAXParser to determine the namespace URI and local name of the root element of an XML file.
ZeroSizeFileDetector          Detector to identify zero length files as application/x-zerovalue.
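
Example (illustrative sketch only). The summary above describes detection by magic bytes, detection by resource name, and a composite that combines several mechanisms. The following self-contained Java sketch shows the general idea using only the JDK. It does not use or reproduce the Tika classes: `DetectionSketch`, `SimpleDetector`, `MAGIC`, `NAME` and `DEFAULT` are hypothetical names, the two magic patterns and two file extensions are arbitrary samples, and the "first result other than application/octet-stream wins" rule is a simplifying assumption that may differ from how CompositeDetector actually resolves conflicting results.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class DetectionSketch {
    // Fallback type, mirroring what the summary says EmptyDetector returns.
    static final String OCTET_STREAM = "application/octet-stream";

    /** Hypothetical stand-in for a detector: (leading bytes, resource name) -> type. */
    interface SimpleDetector {
        String detect(byte[] prefix, String name);
    }

    /** Returns true if data begins with the given magic byte sequence. */
    static boolean startsWith(byte[] data, byte[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }

    // Magic-byte detection: compare the start of the content against known patterns.
    static final SimpleDetector MAGIC = (prefix, name) -> {
        if (startsWith(prefix, "%PDF-".getBytes(StandardCharsets.US_ASCII))) {
            return "application/pdf";
        }
        if (startsWith(prefix, new byte[] {(byte) 0x89, 'P', 'N', 'G'})) {
            return "image/png";
        }
        return OCTET_STREAM;
    };

    // Name-based detection: look only at the resource name, never at the bytes.
    static final SimpleDetector NAME = (prefix, name) -> {
        if (name == null) {
            return OCTET_STREAM;
        }
        String lower = name.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".pdf")) {
            return "application/pdf";
        }
        if (lower.endsWith(".txt")) {
            return "text/plain";
        }
        return OCTET_STREAM;
    };

    // Composite order used in this sketch: content evidence first, then the name.
    static final List<SimpleDetector> DEFAULT = Arrays.asList(MAGIC, NAME);

    /** Asks each detector in order and returns the first non-fallback answer. */
    static String detect(List<SimpleDetector> detectors, byte[] prefix, String name) {
        for (SimpleDetector detector : detectors) {
            String type = detector.detect(prefix, name);
            if (!OCTET_STREAM.equals(type)) {
                return type;
            }
        }
        return OCTET_STREAM;
    }

    public static void main(String[] args) {
        byte[] pdf = "%PDF-1.7".getBytes(StandardCharsets.US_ASCII);
        byte[] text = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(detect(DEFAULT, pdf, "report.txt"));
        System.out.println(detect(DEFAULT, text, "notes.txt"));
        System.out.println(detect(DEFAULT, new byte[0], null));
    }
}
```

In this sketch the magic-byte check runs before the name check, so content that starts with `%PDF-` is reported as application/pdf even when the resource name ends in `.txt`. Input that matches neither mechanism falls back to application/octet-stream.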