org.apache.tika.parser.microsoft.ooxml
Class POIXMLTextExtractorDecorator

java.lang.Object
  extended by org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
      extended by org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
All Implemented Interfaces:
OOXMLExtractor

public class POIXMLTextExtractorDecorator
extends AbstractOOXMLExtractor


Field Summary
 
Fields inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
extractor
 
Constructor Summary
POIXMLTextExtractorDecorator(ParseContext context, org.apache.poi.POIXMLTextExtractor extractor)
           
 
Method Summary
protected  void buildXHTML(XHTMLContentHandler xhtml)
          Populates the XHTMLContentHandler object received as parameter.
protected  java.util.List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
          Returns a list of the main parts of the document, used when searching for embedded resources.
 
Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getDocument, getMetadataExtractor, getXHTML, handleEmbedded
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

POIXMLTextExtractorDecorator

public POIXMLTextExtractorDecorator(ParseContext context,
                                    org.apache.poi.POIXMLTextExtractor extractor)

Method Detail

buildXHTML

protected void buildXHTML(XHTMLContentHandler xhtml)
                   throws org.xml.sax.SAXException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.

Specified by:
buildXHTML in class AbstractOOXMLExtractor
Throws:
org.xml.sax.SAXException
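
Example (illustrative sketch, not the Tika source): one plausible way a text-extractor decorator could populate a SAX handler is to emit the wrapped extractor's text as a single paragraph element. To stay runnable without Tika or POI on the classpath, the sketch uses the JDK's org.xml.sax.ContentHandler in place of XHTMLContentHandler. TextSource, AbstractExtractorSketch, TextDecoratorSketch, RecordingHandler and render are hypothetical names introduced here, and the single-paragraph layout is an assumption.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class BuildXhtmlSketch {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Hypothetical stand-in for the wrapped POIXMLTextExtractor. */
    interface TextSource {
        String getText();
    }

    /** Hypothetical stand-in for AbstractOOXMLExtractor: holds the extractor, defers output to subclasses. */
    abstract static class AbstractExtractorSketch {
        protected final TextSource extractor;

        AbstractExtractorSketch(TextSource extractor) {
            this.extractor = extractor;
        }

        protected abstract void buildXHTML(ContentHandler xhtml) throws SAXException;
    }

    /** Assumption: the decorator writes all extracted text as one <p> element. */
    static class TextDecoratorSketch extends AbstractExtractorSketch {
        TextDecoratorSketch(TextSource extractor) {
            super(extractor);
        }

        @Override
        protected void buildXHTML(ContentHandler xhtml) throws SAXException {
            char[] text = extractor.getText().toCharArray();
            xhtml.startElement(XHTML_NS, "p", "p", new AttributesImpl());
            xhtml.characters(text, 0, text.length);
            xhtml.endElement(XHTML_NS, "p", "p");
        }
    }

    /** Records SAX events as a string so the generated markup can be inspected. */
    static class RecordingHandler extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(localName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(localName).append('>');
        }
    }

    /** Runs the sketch decorator over the given text and returns the recorded markup. */
    static String render(String text) {
        RecordingHandler handler = new RecordingHandler();
        try {
            new TextDecoratorSketch(() -> text).buildXHTML(handler);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Hello from a document")); // prints <p>Hello from a document</p>
    }
}
```

The base class owns the extractor and the subclass only decides how its output is written to the handler, which mirrors the "Specified by" relationship listed above.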

getMainDocumentParts

protected java.util.List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
Description copied from class: AbstractOOXMLExtractor
Returns a list of the main parts of the document, used when searching for embedded resources. The list should include every part of the document that can have resources embedded in it.

Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
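
Example (illustrative sketch, not the Tika source): the description above suggests a template-method arrangement in which the base class searches only the parts returned by getMainDocumentParts() for embedded resources. The sketch below models that idea with plain Java collections. Part, BaseExtractorSketch, findEmbedded and withParts are hypothetical names, the part names are made-up sample data, and the real class works with org.apache.poi.openxml4j.opc.PackagePart and its relationships instead.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MainPartsSketch {
    /** Hypothetical stand-in for a PackagePart: a part name plus the names of resources embedded in it. */
    static final class Part {
        final String name;
        final List<String> embeddedTargets;

        public Part(String name, String... embeddedTargets) {
            this.name = name;
            this.embeddedTargets = Collections.unmodifiableList(Arrays.asList(embeddedTargets));
        }
    }

    /** Hypothetical stand-in for the abstract base extractor. */
    abstract static class BaseExtractorSketch {
        /** Subclasses report which parts may carry embedded resources. */
        protected abstract List<Part> getMainDocumentParts();

        /** Visits only the reported main parts and collects whatever is embedded in them. */
        public final List<String> findEmbedded() {
            List<String> found = new ArrayList<>();
            for (Part part : getMainDocumentParts()) {
                found.addAll(part.embeddedTargets);
            }
            return found;
        }
    }

    /** Builds a sketch extractor whose main parts are exactly the ones given. */
    static BaseExtractorSketch withParts(Part... parts) {
        final List<Part> list = Collections.unmodifiableList(Arrays.asList(parts));
        return new BaseExtractorSketch() {
            @Override
            protected List<Part> getMainDocumentParts() {
                return list;
            }
        };
    }

    public static void main(String[] args) {
        Part body = new Part("/word/document.xml", "/word/embeddings/sheet1.xlsx");
        Part header = new Part("/word/header1.xml", "/word/media/image1.png");
        // prints [/word/embeddings/sheet1.xlsx, /word/media/image1.png]
        System.out.println(withParts(body, header).findEmbedded());
        // An extractor that reports no main parts finds nothing; prints []
        System.out.println(withParts().findEmbedded());
    }
}
```

In this model, a part left out of the returned list is never searched, which is why the description asks for every part that can hold embedded content.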


Copyright © 2007-2011 The Apache Software Foundation. All Rights Reserved.