Package org.apache.tika.extractor
@Version("1.0.0")
package org.apache.tika.extractor
Extraction of component documents.
Class Description
Tika container extractor interface.
Loads EmbeddedStreamTranslators via service loading.
Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
This factory creates EmbeddedDocumentExtractors that require an UnpackHandler in the ParseContext should extend this.
Utility class to handle common issues with embedded documents.
Type of embedded resource, used for generating canonical resource names.
Tika container extractor callback interface.
Interface for different filtering of embedded streams.
Simple pointer class to allow parsers to pass on the parent contenthandler through to the embedded document's parse.
An implementation of ContainerExtractor powered by the regular Parser API.
Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
A DocumentSelector that skips all embedded documents.
Standard factory for creating ParsingEmbeddedDocumentExtractor instances.
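
The document selection strategy mentioned in the summary above can be illustrated with a small, self-contained sketch. It does not use the Tika classes themselves: the Selector interface, the SKIP_ALL constant, the byContentTypePrefix helper, and the "name" and "Content-Type" map keys are all hypothetical stand-ins chosen for this example, and plain maps take the place of Tika metadata. It only shows the general idea of a pluggable strategy that decides, per embedded document, whether extraction should proceed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SelectorSketch {

    /** Hypothetical stand-in for a document selection strategy (not Tika's interface). */
    interface Selector {
        boolean select(Map<String, String> metadata);
    }

    /** Mirrors the idea of a selector that skips all embedded documents. */
    static final Selector SKIP_ALL = metadata -> false;

    /** Hypothetical strategy: accept documents whose content type starts with a prefix. */
    static Selector byContentTypePrefix(String prefix) {
        return metadata -> {
            String type = metadata.get("Content-Type");
            return type != null && type.startsWith(prefix);
        };
    }

    /** Returns the names of the embedded documents that the selector accepts. */
    static List<String> selectNames(List<Map<String, String>> embedded, Selector selector) {
        List<String> names = new ArrayList<>();
        for (Map<String, String> metadata : embedded) {
            if (selector.select(metadata)) {
                names.add(metadata.get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Two made-up embedded documents, described only by their metadata.
        List<Map<String, String>> embedded = List.of(
                Map.of("name", "logo.png", "Content-Type", "image/png"),
                Map.of("name", "notes.txt", "Content-Type", "text/plain"));

        System.out.println(selectNames(embedded, byContentTypePrefix("image/")));
        System.out.println(selectNames(embedded, SKIP_ALL));
    }
}
```

Keeping the selection decision behind a single-method interface lets the caller swap strategies (accept everything, skip everything, filter by type) without changing the extraction loop.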