Class XPSExtractorDecorator
java.lang.Object
org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
All Implemented Interfaces:
OOXMLExtractor
Field Summary
Fields inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
config, EMBEDDED_RELATIONSHIPS, extractor
Constructor Summary
Constructors
Constructor    Description
XPSExtractorDecorator(ParseContext context, org.apache.poi.ooxml.extractor.POIXMLTextExtractor extractor)
Method Summary
Modifier and Type    Method    Description
protected void    buildXHTML(XHTMLContentHandler xhtml)    Populates the XHTMLContentHandler object received as parameter.
org.apache.poi.ooxml.POIXMLDocument    getDocument()    Returns the opened document.
protected List<org.apache.poi.openxml4j.opc.PackagePart>    getMainDocumentParts()    Return a list of the main parts of the document, used when searching for embedded resources.
Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getEmbeddedPartMetadataMap, getJustFileName, getMetadataExtractor, getXHTML, handleEmbeddedFile, loadLinkedRelationships
Constructor Details
XPSExtractorDecorator
public XPSExtractorDecorator(ParseContext context, org.apache.poi.ooxml.extractor.POIXMLTextExtractor extractor) throws TikaException
Throws:
TikaException
Method Details
getDocument
public org.apache.poi.ooxml.POIXMLDocument getDocument()
Description copied from interface: OOXMLExtractor
Returns the opened document.
Specified by:
getDocument in interface OOXMLExtractor
Overrides:
getDocument in class AbstractOOXMLExtractor
buildXHTML
protected void buildXHTML(XHTMLContentHandler xhtml) throws SAXException, IOException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
Specified by:
buildXHTML in class AbstractOOXMLExtractor
Throws:
SAXException
IOException
getMainDocumentParts
protected List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts() throws TikaException
Description copied from class: AbstractOOXMLExtractor
Return a list of the main parts of the document, used when searching for embedded resources. This should be all the parts of the document that end up with things embedded into them.
Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
Throws:
TikaException
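Example (illustrative sketch):
The two abstract methods above, buildXHTML and getMainDocumentParts, are hooks that a format-specific subclass fills in while the base class drives the overall extraction. The sketch below shows that relationship with no Tika or POI dependency. Every type in it (SimpleXhtml, BaseExtractor, FixedPageExtractor) is a hypothetical stand-in, the part names are invented, and the order of calls inside getXHTML is an assumption made for illustration, not a description of the actual AbstractOOXMLExtractor implementation.

```java
import java.util.ArrayList;
import java.util.List;

public class ExtractorSketch {

    /** Hypothetical stand-in for XHTMLContentHandler: records simple elements. */
    public static class SimpleXhtml {
        private final StringBuilder out = new StringBuilder();

        // Appends <name>text</name>; no escaping, so keep the text plain.
        public void element(String name, String text) {
            out.append('<').append(name).append('>')
               .append(text)
               .append("</").append(name).append('>');
        }

        @Override
        public String toString() {
            return out.toString();
        }
    }

    /** Simplified stand-in for the abstract base class (assumed flow). */
    public abstract static class BaseExtractor {
        // Hook: the subclass writes the document body into the handler.
        protected abstract void buildXHTML(SimpleXhtml xhtml);

        // Hook: the subclass names the parts to scan for embedded resources.
        protected abstract List<String> getMainDocumentParts();

        // Driver: body first, then one marker per main part.
        public final String getXHTML() {
            SimpleXhtml xhtml = new SimpleXhtml();
            buildXHTML(xhtml);
            for (String part : getMainDocumentParts()) {
                xhtml.element("div", "embedded scan: " + part);
            }
            return xhtml.toString();
        }
    }

    /** Format-specific subclass, playing the role of a decorator such as this class. */
    public static class FixedPageExtractor extends BaseExtractor {
        private final List<String> pages;

        public FixedPageExtractor(List<String> pages) {
            this.pages = new ArrayList<>(pages);
        }

        @Override
        protected void buildXHTML(SimpleXhtml xhtml) {
            // One paragraph per page of text.
            for (String page : pages) {
                xhtml.element("p", page);
            }
        }

        @Override
        protected List<String> getMainDocumentParts() {
            // Invented part names: one per page, numbered from 1.
            List<String> parts = new ArrayList<>();
            for (int i = 1; i <= pages.size(); i++) {
                parts.add("page-" + i);
            }
            return parts;
        }
    }

    public static void main(String[] args) {
        BaseExtractor extractor = new FixedPageExtractor(List.of("Hello", "World"));
        System.out.println(extractor.getXHTML());
    }
}
```

Callers only touch getXHTML; the subclass decides what the body contains and which parts are worth scanning, which is the division of labour the "Specified by" entries above describe.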