Package org.apache.tika.extractor
@Version("1.0.0")
package org.apache.tika.extractor
Extraction of component documents.
Class Description
For now, this is an EmbeddedDocumentBytesHandler that stores all the bytes in memory.
Tika container extractor interface.
Loads EmbeddedStreamTranslators via service loading.
Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
Factories that create EmbeddedDocumentExtractors requiring an EmbeddedDocumentBytesHandler in the ParseContext should extend this.
Utility class to handle common issues with embedded documents.
Tika container extractor callback interface.
Interface for different filtering of embedded streams.
Simple pointer class to allow parsers to pass the parent ContentHandler through to the embedded document's parse.
An implementation of ContainerExtractor powered by the regular Parser API.
Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
Recursive unpacker and text and metadata extractor.
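
The container extractor and its callback interface described above follow a common pattern: an extractor walks a compound document and hands each embedded resource to a handler. The following is a minimal sketch of that pattern using only the JDK. It is not Tika's API: ResourceHandler, extract, sampleZip and collect are hypothetical names chosen for illustration, and ZIP stands in for whatever container formats the real extractors support.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

final class ContainerCallbackSketch {

    /** Hypothetical callback, invoked once per embedded resource. Not a Tika type. */
    interface ResourceHandler {
        void handle(String name, InputStream stream) throws IOException;
    }

    /** Walks a ZIP container and hands each file entry to the callback. */
    static void extract(InputStream container, ResourceHandler handler) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(container)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    // The stream is positioned at this entry; reads stop at its end.
                    // The handler must not close the stream.
                    handler.handle(entry.getName(), zip);
                }
            }
        }
    }

    /** Builds a small in-memory ZIP so the sketch needs no files on disk. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("docs/b.txt"));
            zip.write("world".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Collects every embedded resource as text, keyed by entry name. */
    static Map<String, String> collect(byte[] zipBytes) throws IOException {
        Map<String, String> found = new LinkedHashMap<>();
        extract(new ByteArrayInputStream(zipBytes),
                (name, stream) -> found.put(name,
                        new String(stream.readAllBytes(), StandardCharsets.UTF_8)));
        return found;
    }

    public static void main(String[] args) throws IOException {
        collect(sampleZip()).forEach((name, text) -> System.out.println(name + " -> " + text));
    }
}
```

Keeping the traversal in extract and the per-resource logic in the handler means the same walk can feed different consumers (collecting bytes, filtering, or parsing) without changing the extractor itself.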