org.apache.tika.parser.microsoft.ooxml
Class XWPFWordExtractorDecorator

java.lang.Object
  extended by org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
      extended by org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
All Implemented Interfaces:
OOXMLExtractor

public class XWPFWordExtractorDecorator
extends AbstractOOXMLExtractor


Field Summary
 
Fields inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
extractor
 
Constructor Summary
XWPFWordExtractorDecorator(org.apache.poi.xwpf.extractor.XWPFWordExtractor extractor)
           
 
Method Summary
protected  void buildXHTML(XHTMLContentHandler xhtml)
          Populates the XHTMLContentHandler object received as a parameter.
protected  java.util.List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
          Word documents are simple: they have only one main part.
 
Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getDocument, getMetadataExtractor, getXHTML, handleEmbedded
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XWPFWordExtractorDecorator

public XWPFWordExtractorDecorator(org.apache.poi.xwpf.extractor.XWPFWordExtractor extractor)

Method Detail

buildXHTML

protected void buildXHTML(XHTMLContentHandler xhtml)
                   throws org.xml.sax.SAXException,
                          org.apache.xmlbeans.XmlException,
                          java.io.IOException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as a parameter.

Specified by:
buildXHTML in class AbstractOOXMLExtractor
Throws:
org.xml.sax.SAXException
org.apache.xmlbeans.XmlException
java.io.IOException
See Also:
XWPFWordExtractor.getText()
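
To illustrate what populating a SAX content handler from extracted text can look like, the following is a minimal sketch that uses only the JDK's org.xml.sax classes. It is not Tika's implementation: BuildXhtmlSketch, emitParagraphs, RecordingHandler and renderEvents are hypothetical names, and the one-paragraph-per-line rule is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class BuildXhtmlSketch {

    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Hypothetical helper: emits one p element per non-blank line of text. */
    static void emitParagraphs(String text, ContentHandler handler) throws SAXException {
        for (String line : text.split("\\r?\\n")) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines rather than emit empty paragraphs
            }
            handler.startElement(XHTML_NS, "p", "p", new AttributesImpl());
            char[] chars = line.toCharArray();
            handler.characters(chars, 0, chars.length);
            handler.endElement(XHTML_NS, "p", "p");
        }
    }

    /** Records SAX events as strings so the output can be inspected. */
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            events.add("<" + localName + ">");
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            events.add(new String(ch, start, length));
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("</" + localName + ">");
        }
    }

    /** Convenience wrapper (an assumption of this sketch) returning the recorded events. */
    static List<String> renderEvents(String text) {
        RecordingHandler handler = new RecordingHandler();
        try {
            emitParagraphs(text, handler);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        // In a real extractor the text would come from the wrapped POI extractor.
        System.out.println(renderEvents("First paragraph\n\nSecond paragraph"));
    }
}
```

In the sketch, the handler is passed in rather than created by the helper, which mirrors the way buildXHTML receives its XHTMLContentHandler from the caller.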

getMainDocumentParts

protected java.util.List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
Word documents are simple: they have only one main part.

Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
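
The single-part contract can be pictured with a small self-contained sketch. All types here (MainPartsSketch, Part, Extractor, WordExtractor) are hypothetical stand-ins, not the Tika or POI classes, and the part name passed in main is only an example value. The sketch assumes that base-class logic iterates over whatever list the subclass returns, so a Word-style subclass can hand back a one-element list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class MainPartsSketch {

    /** Hypothetical stand-in for a package part; it only carries a name. */
    static final class Part {
        final String name;

        Part(String name) {
            this.name = name;
        }
    }

    /** Hypothetical base class: shared logic walks every main part. */
    abstract static class Extractor {
        protected abstract List<Part> getMainDocumentParts();

        List<String> visitedPartNames() {
            List<String> names = new ArrayList<>();
            for (Part part : getMainDocumentParts()) {
                names.add(part.name);
            }
            return names;
        }
    }

    /** Word-style subclass: exactly one main part, returned as an immutable list. */
    static class WordExtractor extends Extractor {
        private final Part mainPart;

        WordExtractor(String mainPartName) {
            this.mainPart = new Part(mainPartName);
        }

        @Override
        protected List<Part> getMainDocumentParts() {
            return Collections.singletonList(mainPart);
        }
    }

    public static void main(String[] args) {
        Extractor extractor = new WordExtractor("/word/document.xml");
        System.out.println(extractor.visitedPartNames());
    }
}
```

Returning a list even for the single-part case lets the base class treat every format uniformly; a subclass for a format with several main parts would simply return a longer list.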


Copyright © 2007-2010 The Apache Software Foundation. All Rights Reserved.