Uses of Interface
org.apache.tika.extractor.EmbeddedDocumentExtractor
Package
Description
Extraction of component documents.
Uses of EmbeddedDocumentExtractor in org.apache.tika.extractor
Modifier and Type
Class
Description
class
Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.

Modifier and Type
Method
Description
static EmbeddedDocumentExtractor EmbeddedDocumentUtil.getEmbeddedDocumentExtractor(ParseContext context)
This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.
EmbeddedDocumentExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext)
ParsingEmbeddedDocumentExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext)
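
The getEmbeddedDocumentExtractor(ParseContext context) entry above describes a uniform lookup of an extractor from a context. The sketch below illustrates that general pattern only. It does not use the Tika classes: Context, Extractor, ExtractorFactory and getExtractor are hypothetical, simplified stand-ins invented for this example, and the fallback-to-factory behavior is an assumption of the sketch, not a statement about how Tika implements the method.

```java
import java.util.HashMap;
import java.util.Map;

public class ContextLookupSketch {

    /** Hypothetical stand-in for an extractor interface. */
    interface Extractor {
        String name();
    }

    /** Hypothetical stand-in for a parse context: maps a type to one instance of that type. */
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast returns null when the stored value is null (key absent).
            return key.cast(entries.get(key));
        }
    }

    /** Hypothetical stand-in for a factory with a newInstance(metadata, context) method. */
    interface ExtractorFactory {
        Extractor newInstance(Map<String, String> metadata, Context context);
    }

    /**
     * Returns the extractor registered in the context, or asks the factory
     * for one when the context holds none (assumed fallback for this sketch).
     */
    static Extractor getExtractor(Context context,
                                  Map<String, String> metadata,
                                  ExtractorFactory fallback) {
        Extractor configured = context.get(Extractor.class);
        if (configured != null) {
            return configured;
        }
        return fallback.newInstance(metadata, context);
    }

    public static void main(String[] args) {
        ExtractorFactory factory = (metadata, context) -> () -> "default";

        // Nothing registered: the factory supplies the extractor.
        Context empty = new Context();
        System.out.println(getExtractor(empty, new HashMap<>(), factory).name());

        // A caller-supplied extractor takes precedence over the factory.
        Context customized = new Context();
        customized.set(Extractor.class, () -> "custom");
        System.out.println(getExtractor(customized, new HashMap<>(), factory).name());
    }
}
```

Keying the context by type lets callers override the extractor without the lookup code knowing which implementation is in use, which is the kind of uniformity the entry above refers to.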
Uses of EmbeddedDocumentExtractor in org.apache.tika.parser.microsoft
Modifier and Type
Method
Description
static void OfficeParser.extractMacros(org.apache.poi.poifs.filesystem.POIFSFileSystem fs, ContentHandler xhtml, EmbeddedDocumentExtractor embeddedDocumentExtractor)
Helper to extract macros from an NPOIFS/vbaProject.bin
Uses of EmbeddedDocumentExtractor in org.apache.tika.parser.pdf.image
Modifier and Type
Field
Description
protected final EmbeddedDocumentExtractor ImageGraphicsEngine.embeddedDocumentExtractor

Modifier and Type
Method
Description
ImageGraphicsEngineFactory.newEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream, Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext)

Modifier
Constructor
Description
protected ImageGraphicsEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream, Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext)