Class XWPFWordExtractorDecorator
- java.lang.Object
  - org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
    - org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
- All Implemented Interfaces:
OOXMLExtractor
public class XWPFWordExtractorDecorator extends AbstractOOXMLExtractor
Field Summary
Fields inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
config, EMBEDDED_RELATIONSHIPS, extractor
Constructor Summary
XWPFWordExtractorDecorator(Metadata metadata, ParseContext context, org.apache.poi.xwpf.extractor.XWPFWordExtractor extractor)
Method Summary
Modifier and Type / Method / Description

protected void
buildXHTML(XHTMLContentHandler xhtml)
Populates the XHTMLContentHandler object received as parameter.

protected Map<String,EmbeddedPartMetadata>
getEmbeddedPartMetadataMap()

protected List<org.apache.poi.openxml4j.opc.PackagePart>
getMainDocumentParts()
Include main body and anything else that can have an attachment/embedded object

Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getDocument, getJustFileName, getMetadataExtractor, getXHTML, handleEmbeddedFile, loadLinkedRelationships
Constructor Detail
XWPFWordExtractorDecorator
public XWPFWordExtractorDecorator(Metadata metadata, ParseContext context, org.apache.poi.xwpf.extractor.XWPFWordExtractor extractor)
Method Detail
buildXHTML
protected void buildXHTML(XHTMLContentHandler xhtml) throws SAXException, org.apache.xmlbeans.XmlException, IOException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
- Specified by:
buildXHTML in class AbstractOOXMLExtractor
- Throws:
SAXException
org.apache.xmlbeans.XmlException
IOException
- See Also:
XWPFWordExtractor.getText()
getEmbeddedPartMetadataMap
protected Map<String,EmbeddedPartMetadata> getEmbeddedPartMetadataMap()
- Overrides:
getEmbeddedPartMetadataMap in class AbstractOOXMLExtractor
getMainDocumentParts
protected List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
Includes the main body and anything else that can have an attachment or embedded object.
- Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
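
The "Specified by" entries for buildXHTML and getMainDocumentParts indicate that both are abstract in AbstractOOXMLExtractor and implemented by this class. The following is a minimal sketch of that base-class/subclass arrangement, not Tika's actual code. Every type and method name in it (SketchExtractor, SketchWordExtractor, render, buildBody, mainParts) is a hypothetical stand-in, the part names are illustrative, and plain strings replace XHTMLContentHandler and PackagePart so the example runs without Tika or POI on the classpath.

```java
import java.util.ArrayList;
import java.util.List;

// Demo entry point; every type in this file is a hypothetical stand-in, not a Tika class.
class ExtractorSketchDemo {
    public static void main(String[] args) {
        SketchExtractor extractor = new SketchWordExtractor(
                List.of("Hello", "World"),
                List.of("/word/header1.xml"));
        System.out.println(extractor.render());
        System.out.println(extractor.mainParts());
    }
}

// Stand-in for an abstract base extractor: it fixes the overall flow and
// leaves the format-specific steps to subclasses.
abstract class SketchExtractor {
    // Template method: wraps whatever the subclass emits in a body element.
    final String render() {
        StringBuilder out = new StringBuilder("<body>");
        buildBody(out);
        return out.append("</body>").toString();
    }

    // Hook loosely analogous to buildXHTML: populate the output passed in.
    protected abstract void buildBody(StringBuilder out);

    // Hook loosely analogous to getMainDocumentParts: list the parts to scan.
    protected abstract List<String> mainParts();
}

// Stand-in for a Word-specific subclass that supplies both hooks.
final class SketchWordExtractor extends SketchExtractor {
    private final List<String> paragraphs;
    private final List<String> extraParts;

    SketchWordExtractor(List<String> paragraphs, List<String> extraParts) {
        this.paragraphs = List.copyOf(paragraphs);
        this.extraParts = List.copyOf(extraParts);
    }

    @Override
    protected void buildBody(StringBuilder out) {
        // Escaping is omitted to keep the sketch short.
        for (String paragraph : paragraphs) {
            out.append("<p>").append(paragraph).append("</p>");
        }
    }

    @Override
    protected List<String> mainParts() {
        // Main body first, then any other part that could carry embedded objects.
        List<String> parts = new ArrayList<>();
        parts.add("/word/document.xml"); // illustrative part name
        parts.addAll(extraParts);
        return parts;
    }
}
```

In this sketch the base class owns the wrapping logic and the subclass only decides what goes into the body and which parts are worth scanning. In practice this decorator is normally reached through Tika's parser entry points rather than constructed directly.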