@Version("1.0.0")
Package org.apache.tika.extractor
Extraction of component documents.
Interface Summary

Interface                          Description
ContainerExtractor                 Tika container extractor interface.
DocumentSelector                   Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
EmbeddedDocumentExtractor
EmbeddedDocumentExtractorFactory
EmbeddedResourceHandler            Tika container extractor callback interface.
EmbeddedStreamTranslator           Interface for different filtering of embedded streams.

Class Summary

Class                                      Description
DefaultEmbeddedStreamTranslator            Loads EmbeddedStreamTranslators via service loading.
EmbeddedDocumentUtil                       Utility class to handle common issues with embedded documents.
ParserContainerExtractor                   An implementation of ContainerExtractor powered by the regular Parser API.
ParsingEmbeddedDocumentExtractor           Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
ParsingEmbeddedDocumentExtractorFactory
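
The DocumentSelector entry describes a selection strategy that decides which embedded documents get extracted. The sketch below illustrates that idea using only the Java standard library. SelectorSketch, SimpleSelector, selectNames and the "name"/"type" keys are hypothetical stand-ins invented for this example, not Tika types; the real interface works on Tika's Metadata objects rather than plain maps, so treat this as an illustration of the pattern rather than of the actual API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT the Tika types.
class SelectorSketch {

    /** Simplified analogue of a document selection strategy (assumed shape). */
    interface SimpleSelector {
        boolean select(Map<String, String> metadata);
    }

    /** Returns the names of the embedded entries that the selector accepts. */
    static List<String> selectNames(List<Map<String, String>> entries,
                                    SimpleSelector selector) {
        List<String> accepted = new ArrayList<>();
        for (Map<String, String> metadata : entries) {
            // Only entries accepted by the strategy would be handed on for extraction.
            if (selector.select(metadata)) {
                accepted.add(metadata.get("name"));
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        // Two made-up embedded entries described by minimal metadata.
        List<Map<String, String>> entries = List.of(
                Map.of("name", "report.pdf", "type", "application/pdf"),
                Map.of("name", "logo.png", "type", "image/png"));

        // A strategy that keeps PDF attachments and skips everything else.
        SimpleSelector pdfOnly =
                metadata -> "application/pdf".equals(metadata.get("type"));

        System.out.println(selectNames(entries, pdfOnly)); // prints [report.pdf]
    }
}
```

Keeping the strategy behind a single-method interface lets callers swap selection rules (by type, by name, by size) without changing the extraction loop.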