Uses of Interface
org.apache.tika.extractor.EmbeddedDocumentExtractor

Packages that use EmbeddedDocumentExtractor

  Package: org.apache.tika.extractor
  Description: Extraction of component documents.

  Package: org.apache.tika.parser.microsoft

  Package: org.apache.tika.parser.pdf.image

Uses of EmbeddedDocumentExtractor in org.apache.tika.extractor
Classes in org.apache.tika.extractor that implement EmbeddedDocumentExtractor

  Modifier and Type: class
  Class: ParsingEmbeddedDocumentExtractor
  Description: Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.

  Modifier and Type: class
  Class: RUnpackExtractor
  Description: Recursive Unpacker and text and metadata extractor.

Methods in org.apache.tika.extractor that return EmbeddedDocumentExtractor

  Modifier and Type: static EmbeddedDocumentExtractor
  Method: EmbeddedDocumentUtil.getEmbeddedDocumentExtractor(ParseContext context)
  Description: This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.

  Modifier and Type: EmbeddedDocumentExtractor
  Method: EmbeddedDocumentExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext)

  Modifier and Type: EmbeddedDocumentExtractor
  Method: ParsingEmbeddedDocumentExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext)

  Modifier and Type: EmbeddedDocumentExtractor
  Method: RUnpackExtractorFactory.newInstance(Metadata metadata, ParseContext parseContext)
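
The getEmbeddedDocumentExtractor(ParseContext context) entry above describes a uniform lookup of an extractor from a context object. The following is a minimal, self-contained sketch of that general pattern only. It does not use or reproduce the Tika API: SketchContext, SketchExtractor, DefaultSketchExtractor and ExtractorLookupSketch are hypothetical stand-in types, and the fallback to a default extractor when nothing is registered is an assumption of this sketch rather than documented behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an extractor interface; NOT the Tika interface.
interface SketchExtractor {
    boolean shouldParseEmbedded(String resourceName);
}

// Hypothetical type-keyed context, loosely modelled on the idea of a parse context.
final class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    <T> T get(Class<T> key) {
        // Class.cast passes null through, so a missing entry yields null.
        return key.cast(entries.get(key));
    }
}

// Hypothetical default used when the context holds no extractor.
final class DefaultSketchExtractor implements SketchExtractor {
    @Override
    public boolean shouldParseEmbedded(String resourceName) {
        return true; // this sketch's default accepts every embedded resource
    }
}

final class ExtractorLookupSketch {
    private ExtractorLookupSketch() {
    }

    // Single lookup point: return the registered extractor if present,
    // otherwise a default one (assumed fallback, see the lead-in).
    static SketchExtractor resolve(SketchContext context) {
        SketchExtractor registered = context.get(SketchExtractor.class);
        return registered != null ? registered : new DefaultSketchExtractor();
    }

    public static void main(String[] args) {
        SketchContext context = new SketchContext();
        // Nothing registered yet, so the default extractor answers.
        System.out.println(resolve(context).shouldParseEmbedded("image.png"));

        // Register a custom extractor that skips PNG resources.
        context.set(SketchExtractor.class, name -> !name.endsWith(".png"));
        System.out.println(resolve(context).shouldParseEmbedded("image.png"));
    }
}
```

Routing every caller through one lookup method means a caller can swap in its own extractor by registering it in the context, without the individual parsers needing to know which implementation they receive.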
Uses of EmbeddedDocumentExtractor in org.apache.tika.parser.microsoft
Methods in org.apache.tika.parser.microsoft with parameters of type EmbeddedDocumentExtractor

  Modifier and Type: static void
  Method: OfficeParser.extractMacros(org.apache.poi.poifs.filesystem.POIFSFileSystem fs, ContentHandler xhtml, EmbeddedDocumentExtractor embeddedDocumentExtractor)
  Description: Helper to extract macros from an NPOIFS/vbaProject.bin

Uses of EmbeddedDocumentExtractor in org.apache.tika.parser.pdf.image
Fields in org.apache.tika.parser.pdf.image declared as EmbeddedDocumentExtractor

  Modifier and Type: protected EmbeddedDocumentExtractor
  Field: ImageGraphicsEngine.embeddedDocumentExtractor

Methods in org.apache.tika.parser.pdf.image with parameters of type EmbeddedDocumentExtractor

  Modifier and Type: ImageGraphicsEngine
  Method: ImageGraphicsEngineFactory.newEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream,Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext)

Constructors in org.apache.tika.parser.pdf.image with parameters of type EmbeddedDocumentExtractor

  Constructor: ImageGraphicsEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream,Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext)