Class DefaultEncodingDetector

java.lang.Object
org.apache.tika.detect.CompositeEncodingDetector
org.apache.tika.detect.DefaultEncodingDetector
All Implemented Interfaces:
Serializable, SelfConfiguring, EncodingDetector

public class DefaultEncodingDetector extends CompositeEncodingDetector
A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism.

The default chain (Tika 3.x style) runs three detectors in order, with the first non-empty result winning:

  1. org.apache.tika.parser.html.HtmlEncodingDetector
  2. org.apache.tika.parser.txt.UniversalEncodingDetector
  3. org.apache.tika.parser.txt.Icu4jEncodingDetector

Any other EncodingDetector discovered via SPI (for example, a user-supplied detector) runs after these three default detectors, which preserves backward compatibility for callers who register their own.
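
The "first non-empty result wins" rule can be illustrated with a minimal, self-contained sketch. It deliberately does not use Tika's classes: the `Detector` interface, the `BOM` and `FALLBACK` detectors, and the `FirstMatchChain` class are hypothetical stand-ins invented for this illustration, and a `null` return is assumed to mean "no result".

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;

public class FirstMatchChain {

    // Hypothetical stand-in for an encoding detector (not Tika's interface).
    // Returns the detected charset, or null when the detector cannot decide.
    interface Detector {
        Charset detect(byte[] input);
    }

    // Recognizes only the UTF-8 byte order mark (EF BB BF).
    static final Detector BOM = input ->
            input.length >= 3
                    && (input[0] & 0xFF) == 0xEF
                    && (input[1] & 0xFF) == 0xBB
                    && (input[2] & 0xFF) == 0xBF
                    ? StandardCharsets.UTF_8
                    : null;

    // Always answers, so it only matters when earlier detectors return null.
    static final Detector FALLBACK = input -> StandardCharsets.ISO_8859_1;

    // Order matters: earlier detectors take precedence over later ones.
    static final List<Detector> DEFAULT_CHAIN = List.of(BOM, FALLBACK);

    // Runs the detectors in list order and returns the first non-null result.
    static Charset detect(List<Detector> chain, byte[] input) {
        for (Detector detector : chain) {
            Charset result = detector.detect(input);
            if (result != null) {
                return result;
            }
        }
        return null; // no detector produced a result
    }

    public static void main(String[] args) {
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        byte[] plain = {'h', 'i'};
        System.out.println(detect(DEFAULT_CHAIN, withBom)); // UTF-8
        System.out.println(detect(DEFAULT_CHAIN, plain));   // ISO-8859-1
    }
}
```

Because the loop stops at the first answer, a detector placed later in the list is consulted only when every earlier one declines, which is why detector order is significant.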

If you need to control the order of the detectors explicitly, construct your own CompositeEncodingDetector and pass in the detectors as a list in the required order.

Since:
Apache Tika 1.15

Constructor Details

    • DefaultEncodingDetector

      public DefaultEncodingDetector()
    • DefaultEncodingDetector

      public DefaultEncodingDetector(ServiceLoader loader)
    • DefaultEncodingDetector

      public DefaultEncodingDetector(ServiceLoader loader, Collection<Class<? extends EncodingDetector>> excludeEncodingDetectors)
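
The third constructor accepts a collection of detector classes to exclude. The sketch below shows one plausible reading of that parameter using stand-in types only: `Detector`, `HtmlLike`, `StatLike`, and `ExcludeByClass` are hypothetical names, and matching subclasses of an excluded class is an assumption rather than documented behavior.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

public class ExcludeByClass {

    // Hypothetical stand-ins for detector types (not Tika's classes).
    interface Detector {}
    static class HtmlLike implements Detector {}
    static class StatLike implements Detector {}

    // Keeps the discovered detectors, in their original order, except those
    // whose class is (or extends) one of the excluded classes.
    static List<Detector> withoutExcluded(
            List<? extends Detector> discovered,
            Collection<? extends Class<? extends Detector>> excluded) {
        List<Detector> kept = new ArrayList<>();
        for (Detector detector : discovered) {
            boolean skip = false;
            for (Class<? extends Detector> excludedClass : excluded) {
                // Assumption: subclasses of an excluded class are also dropped.
                if (excludedClass.isAssignableFrom(detector.getClass())) {
                    skip = true;
                    break;
                }
            }
            if (!skip) {
                kept.add(detector);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Detector> discovered = List.of(new HtmlLike(), new StatLike());
        List<Detector> kept = withoutExcluded(discovered, List.of(HtmlLike.class));
        System.out.println(kept.size()); // 1
    }
}
```

Excluding by class rather than by instance suits service-loaded detectors, since the caller never holds the instances that the loader creates.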