Package | Description |
---|---|
org.apache.tika.config | Tika configuration tools. |
org.apache.tika.detect | Media type detection. |
org.apache.tika.parser | Tika parsers. |
org.apache.tika.parser.code | |
org.apache.tika.parser.csv | |
org.apache.tika.parser.envi | |
org.apache.tika.parser.html | |
org.apache.tika.parser.html.charsetdetector | |
org.apache.tika.parser.txt | |
Modifier and Type | Method and Description |
---|---|
EncodingDetector | TikaConfig.getEncodingDetector(): Returns the configured encoding detector instance |
Modifier and Type | Class and Description |
---|---|
class | CompositeEncodingDetector |
class | DefaultEncodingDetector: A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism. |
class | NonDetectingEncodingDetector: Always returns the charset passed in via the initializer |
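
The composite and non-detecting classes listed above can be pictured with a small stand-alone sketch. It does not use Tika's classes: `CompositeDetectorSketch`, `asciiOnly`, and `fixed` are hypothetical names, and the rule that the first non-null answer wins is an assumption made for illustration rather than a statement about Tika's implementation.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.function.Function;

// Hypothetical stand-in for the composite idea; these are not Tika's classes.
class CompositeDetectorSketch {

    // Ask each detector in order; a detector returns null when it cannot decide.
    // Assumption: the first non-null answer wins.
    static Charset detect(List<Function<byte[], Charset>> detectors, byte[] input) {
        for (Function<byte[], Charset> detector : detectors) {
            Charset guess = detector.apply(input);
            if (guess != null) {
                return guess;
            }
        }
        return null; // no detector could decide
    }

    // Example detector: answers US-ASCII only when no byte has its high bit set.
    static Charset asciiOnly(byte[] input) {
        for (byte b : input) {
            if (b < 0) {
                return null; // high bit set, so this detector abstains
            }
        }
        return StandardCharsets.US_ASCII;
    }

    // Analogue of a non-detecting detector: ignores the input and always
    // returns the charset it was created with.
    static Function<byte[], Charset> fixed(Charset charset) {
        return input -> charset;
    }
}
```

Placing a fixed detector last in the list gives the chain a guaranteed fallback, which is one plausible reason to have a detector that never inspects its input.
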
Modifier and Type | Method and Description |
---|---|
List<EncodingDetector> | CompositeEncodingDetector.getDetectors() |
Constructor and Description |
---|
AutoDetectReader(InputStream stream, Metadata metadata, EncodingDetector encodingDetector) |
Constructor and Description |
---|
CompositeEncodingDetector(List<EncodingDetector> detectors) |
CompositeEncodingDetector(List<EncodingDetector> detectors, Collection<Class<? extends EncodingDetector>> excludeEncodingDetectors) |
DefaultEncodingDetector(ServiceLoader loader, Collection<Class<? extends EncodingDetector>> excludeEncodingDetectors) |
Modifier and Type | Method and Description |
---|---|
EncodingDetector | AbstractEncodingDetectorParser.getEncodingDetector() |
protected EncodingDetector | AbstractEncodingDetectorParser.getEncodingDetector(ParseContext parseContext): Look for an EncodingDetector in the ParseContext. |
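
The lookup described for `getEncodingDetector(ParseContext parseContext)` can be pictured as a per-call override with a fallback. The sketch below is stand-alone and hedged: `ContextLookupSketch`, its nested `Context`, and `resolve` are hypothetical names, and falling back to the parser's configured detector when the context holds none is an assumption, not something this page states.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "look in the context first"; not Tika's ParseContext.
class ContextLookupSketch {

    // A minimal type-keyed context: at most one value is stored per class.
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            return key.cast(entries.get(key)); // null when nothing is stored
        }
    }

    // Prefer the value supplied for this call; otherwise use the configured default.
    static <T> T resolve(Context context, Class<T> type, T configuredDefault) {
        T override = context.get(type);
        return override != null ? override : configuredDefault;
    }
}
```

A design like this lets a caller swap the detector for a single parse without rebuilding the parser.
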
Modifier and Type | Method and Description |
---|---|
void | AbstractEncodingDetectorParser.setEncodingDetector(EncodingDetector encodingDetector) |
Constructor and Description |
---|
AbstractEncodingDetectorParser(EncodingDetector encodingDetector) |
DefaultParser(MediaTypeRegistry registry, ServiceLoader loader, Collection<Class<? extends Parser>> excludeParsers, EncodingDetector encodingDetector) |
DefaultParser(MediaTypeRegistry registry, ServiceLoader loader, EncodingDetector encodingDetector) |
Constructor and Description |
---|
SourceCodeParser(EncodingDetector encodingDetector) |
Constructor and Description |
---|
TextAndCSVParser(EncodingDetector encodingDetector) |
Constructor and Description |
---|
EnviHeaderParser(EncodingDetector encodingDetector) |
Modifier and Type | Class and Description |
---|---|
class | HtmlEncodingDetector: Character encoding detector for determining the character encoding of an HTML document based on the potential charset parameter found in a Content-Type http-equiv meta tag somewhere near the beginning. |
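
The meta-tag approach described above can be approximated in a few lines. This is a minimal sketch, not Tika's `HtmlEncodingDetector`: the class name `MetaCharsetSketch`, the 8192-byte prefix, and the regular expression are all assumptions chosen for illustration, and the pattern only handles the common attribute order (`http-equiv` before `content`).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of sniffing a charset from a Content-Type meta tag.
class MetaCharsetSketch {

    // Assumed limit: only the start of the document is inspected.
    private static final int PREFIX_LENGTH = 8192;

    // Matches <meta http-equiv="Content-Type" content="...; charset=NAME"
    private static final Pattern META_CONTENT_TYPE = Pattern.compile(
            "<meta\\s+http-equiv\\s*=\\s*[\"']?content-type[\"']?\\s+"
                    + "content\\s*=\\s*[\"'][^\"'>]*charset\\s*=\\s*([A-Za-z0-9_:.\\-]+)",
            Pattern.CASE_INSENSITIVE);

    // Returns the declared charset, or null when none is declared or supported.
    static Charset detect(byte[] html) {
        int length = Math.min(html.length, PREFIX_LENGTH);
        // ISO-8859-1 maps every byte to one char, so ASCII markup survives decoding.
        String head = new String(html, 0, length, StandardCharsets.ISO_8859_1);
        Matcher matcher = META_CONTENT_TYPE.matcher(head);
        if (!matcher.find()) {
            return null;
        }
        try {
            String name = matcher.group(1);
            return Charset.isSupported(name) ? Charset.forName(name) : null;
        } catch (IllegalArgumentException e) {
            return null; // syntactically illegal charset name
        }
    }
}
```

Returning null rather than a default leaves the decision to whichever detector runs next in a chain.
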
Constructor and Description |
---|
HtmlParser(EncodingDetector encodingDetector) |
Modifier and Type | Class and Description |
---|---|
class | StandardHtmlEncodingDetector: An encoding detector that tries to respect the spirit of the HTML spec part 12.2.3 "The input byte stream", or at least the part that is compatible with the implementation of Tika. |
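
In the HTML specification's encoding sniffing, a byte order mark at the start of the stream takes precedence over other signals. The sketch below shows only that first step, as an illustration; `BomSniffSketch` is a hypothetical name, and nothing here is taken from the `StandardHtmlEncodingDetector` source.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of byte order mark (BOM) sniffing.
class BomSniffSketch {

    // Returns the charset indicated by a leading BOM, or null if there is none.
    static Charset sniff(byte[] bytes) {
        if (bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (bytes.length >= 2) {
            int first = bytes[0] & 0xFF;
            int second = bytes[1] & 0xFF;
            if (first == 0xFE && second == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
            if (first == 0xFF && second == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
        }
        return null; // no BOM: later steps (such as meta prescanning) would decide
    }
}
```
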
Modifier and Type | Class and Description |
---|---|
class | Icu4jEncodingDetector |
class | UniversalEncodingDetector |
Constructor and Description |
---|
TXTParser(EncodingDetector encodingDetector) |
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.