Class XPSExtractorDecorator
- java.lang.Object
  - org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
    - org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
- All Implemented Interfaces:
OOXMLExtractor
public class XPSExtractorDecorator extends AbstractOOXMLExtractor
Field Summary
Fields inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
config, EMBEDDED_RELATIONSHIPS, extractor
Constructor Summary
Constructor
XPSExtractorDecorator(ParseContext context, org.apache.poi.ooxml.extractor.POIXMLTextExtractor extractor)
Method Summary
Modifier and Type | Method | Description
protected void | buildXHTML(XHTMLContentHandler xhtml) | Populates the XHTMLContentHandler object received as parameter.
org.apache.poi.ooxml.POIXMLDocument | getDocument() | Returns the opened document.
protected List<org.apache.poi.openxml4j.opc.PackagePart> | getMainDocumentParts() | Return a list of the main parts of the document, used when searching for embedded resources.
Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getEmbeddedPartMetadataMap, getJustFileName, getMetadataExtractor, getXHTML, handleEmbeddedFile, loadLinkedRelationships
Constructor Detail
XPSExtractorDecorator
public XPSExtractorDecorator(ParseContext context, org.apache.poi.ooxml.extractor.POIXMLTextExtractor extractor) throws TikaException
- Throws:
TikaException
Method Detail
getDocument
public org.apache.poi.ooxml.POIXMLDocument getDocument()
Description copied from interface: OOXMLExtractor
Returns the opened document.
- Specified by:
getDocument in interface OOXMLExtractor
- Overrides:
getDocument in class AbstractOOXMLExtractor
- See Also:
OOXMLExtractor.getDocument()
buildXHTML
protected void buildXHTML(XHTMLContentHandler xhtml) throws SAXException, IOException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
- Specified by:
buildXHTML in class AbstractOOXMLExtractor
- Throws:
SAXException
IOException
getMainDocumentParts
protected List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts() throws TikaException
Description copied from class: AbstractOOXMLExtractor
Return a list of the main parts of the document, used when searching for embedded resources. This should be all the parts of the document that end up with things embedded into them.
- Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
- Throws:
TikaException
- Throws:
TikaException
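The "Specified by" entries above indicate that buildXHTML and getMainDocumentParts are hooks declared in AbstractOOXMLExtractor and implemented here, while getXHTML is inherited. The sketch below illustrates that general shape only. It does not use Tika or POI: every class name in it (SketchAbstractExtractor, SketchXpsExtractor, SketchDemo) is hypothetical, a StringBuilder stands in for XHTMLContentHandler, plain strings stand in for PackagePart, and the assumption that the inherited getXHTML calls buildXHTML is not stated on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for AbstractOOXMLExtractor; NOT the Tika class.
abstract class SketchAbstractExtractor {
    // Inherited entry point. Assumption: it wraps the output and
    // delegates the body content to the buildXHTML hook.
    public final String getXHTML() {
        StringBuilder xhtml = new StringBuilder("<body>");
        buildXHTML(xhtml);
        xhtml.append("</body>");
        return xhtml.toString();
    }

    // Hook: populate the handler object received as parameter.
    protected abstract void buildXHTML(StringBuilder xhtml);

    // Hook: the main parts of the document, used when searching
    // for embedded resources.
    protected abstract List<String> getMainDocumentParts();
}

// Hypothetical stand-in for XPSExtractorDecorator.
final class SketchXpsExtractor extends SketchAbstractExtractor {
    private final List<String> pages;

    SketchXpsExtractor(List<String> pages) {
        this.pages = new ArrayList<>(pages);
    }

    @Override
    protected void buildXHTML(StringBuilder xhtml) {
        // Emits one paragraph per page; no escaping is done in this sketch.
        for (String page : pages) {
            xhtml.append("<p>").append(page).append("</p>");
        }
    }

    @Override
    protected List<String> getMainDocumentParts() {
        // Illustrative part labels only; real parts would be PackagePart objects.
        List<String> parts = new ArrayList<>();
        for (int i = 0; i < pages.size(); i++) {
            parts.add("page-" + (i + 1));
        }
        return parts;
    }
}

class SketchDemo {
    public static void main(String[] args) {
        SketchXpsExtractor extractor =
                new SketchXpsExtractor(List.of("Page one", "Page two"));
        // prints <body><p>Page one</p><p>Page two</p></body>
        System.out.println(extractor.getXHTML());
    }
}
```

Keeping the entry point in the base class and the format-specific work in overridable hooks is one common way to let several document formats share the same surrounding logic; whether Tika structures its implementation exactly this way should be checked against the AbstractOOXMLExtractor source.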