org.apache.tika.parser.microsoft.ooxml
Class XSSFExcelExtractorDecorator

java.lang.Object
  extended by org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
      extended by org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
All Implemented Interfaces:
OOXMLExtractor

public class XSSFExcelExtractorDecorator
extends AbstractOOXMLExtractor


Nested Class Summary
protected static class XSSFExcelExtractorDecorator.HeaderFooterFromString
           
protected static class XSSFExcelExtractorDecorator.SheetTextAsHTML
          Turns formatted sheet events into HTML
protected static class XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
          Captures information on interesting tags, whilst delegating the main work to the formatting handler
 
Constructor Summary
XSSFExcelExtractorDecorator(ParseContext context, org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor extractor, Locale locale)
           
 
Method Summary
protected  void buildXHTML(XHTMLContentHandler xhtml)
          Populates the XHTMLContentHandler object received as parameter.
protected  List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
          In Excel files, embedded objects belong to the sheets, and images belong to the sheet drawings.
 MetadataExtractor getMetadataExtractor()
          POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
 void processSheet(org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, org.apache.poi.xssf.model.StylesTable styles, org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable strings, InputStream sheetInputStream)
           
 
Methods inherited from class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
getDocument, getXHTML, handleEmbeddedFile
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XSSFExcelExtractorDecorator

public XSSFExcelExtractorDecorator(ParseContext context,
                                   org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor extractor,
                                   Locale locale)
Method Detail

buildXHTML

protected void buildXHTML(XHTMLContentHandler xhtml)
                   throws SAXException,
                          org.apache.xmlbeans.XmlException,
                          IOException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.

Specified by:
buildXHTML in class AbstractOOXMLExtractor
Throws:
SAXException
org.apache.xmlbeans.XmlException
IOException
See Also:
XSSFExcelExtractor.getText()

processSheet

public void processSheet(org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor,
                         org.apache.poi.xssf.model.StylesTable styles,
                         org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable strings,
                         InputStream sheetInputStream)
                  throws IOException,
                         SAXException
Throws:
IOException
SAXException
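
The event-driven flow behind processSheet, in which sheet parsing reports rows and cells to a SheetContentsHandler and a handler such as SheetTextAsHTML turns those events into HTML, can be sketched without POI or Tika on the classpath. The following is a minimal illustration under stated assumptions, not the actual implementation: SheetEvents, HtmlTableBuilder, and render are hypothetical names, and SheetEvents is only loosely modeled on POI's SheetContentsHandler.

```java
public class SheetEventsSketch {

    /** Hypothetical callback interface, loosely modeled on POI's SheetContentsHandler. */
    interface SheetEvents {
        void startRow(int rowNum);
        void cell(String cellReference, String formattedValue);
        void endRow();
    }

    /** Hypothetical handler that renders row and cell events as an HTML table. */
    static class HtmlTableBuilder implements SheetEvents {
        private final StringBuilder html = new StringBuilder("<table>");

        @Override
        public void startRow(int rowNum) {
            html.append("<tr>");
        }

        @Override
        public void cell(String cellReference, String formattedValue) {
            // The cell reference (e.g. "B2") is ignored here; only the value is emitted.
            html.append("<td>").append(escape(formattedValue)).append("</td>");
        }

        @Override
        public void endRow() {
            html.append("</tr>");
        }

        /** Closes the table and returns the markup; call once, after the last row. */
        String finish() {
            return html.append("</table>").toString();
        }

        /** Escapes the characters that would otherwise break the markup. */
        private static String escape(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        }
    }

    /** Stands in for the sheet parser: replays an in-memory grid as events. */
    static String render(String[][] rows) {
        HtmlTableBuilder builder = new HtmlTableBuilder();
        for (int r = 0; r < rows.length; r++) {
            builder.startRow(r);
            for (int c = 0; c < rows[r].length; c++) {
                // Builds references such as "A1"; valid for the first 26 columns only.
                String ref = String.valueOf((char) ('A' + c)) + (r + 1);
                builder.cell(ref, rows[r][c]);
            }
            builder.endRow();
        }
        return builder.finish();
    }
}
```

In this sketch the handler never sees the sheet as a whole. It only reacts to one row or cell at a time, which is the property that lets an event-based extractor stream large sheets.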

getMainDocumentParts

protected List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts()
                                                                       throws TikaException
In Excel files, embedded objects belong to the sheets, and images belong to the sheet drawings.

Specified by:
getMainDocumentParts in class AbstractOOXMLExtractor
Throws:
TikaException

getMetadataExtractor

public MetadataExtractor getMetadataExtractor()
Description copied from interface: OOXMLExtractor
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.

Specified by:
getMetadataExtractor in interface OOXMLExtractor
Overrides:
getMetadataExtractor in class AbstractOOXMLExtractor
See Also:
OOXMLExtractor.getMetadataExtractor()


Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.