public class XSSFExcelExtractorDecorator extends AbstractOOXMLExtractor
Modifier and Type | Class and Description |
---|---|
protected static class | XSSFExcelExtractorDecorator.HeaderFooterFromString |
protected static class | XSSFExcelExtractorDecorator.SheetTextAsHTML: Turns formatted sheet events into HTML. |
protected static class | XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer: Captures information on interesting tags, whilst delegating the main work to the formatting handler. |
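
The capture-and-delegate idea described for XSSFSheetInterestingPartsCapturer can be sketched with the JDK's SAX classes alone. This is a minimal illustration, not the Tika implementation: the class names `TagCapturer` and `TextCollector`, the helper `describe`, and the choice of `sheetProtection` as the "interesting" tag are all assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class CapturingHandlerSketch {

    /** Wraps another handler: notes one "interesting" tag, then forwards the event. */
    static class TagCapturer extends DefaultHandler {
        private final ContentHandler delegate;
        private boolean sawProtection = false;

        TagCapturer(ContentHandler delegate) {
            this.delegate = delegate;
        }

        boolean sawProtection() {
            return sawProtection;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts)
                throws SAXException {
            // Assumption: "sheetProtection" stands in for whatever tags are of interest.
            if ("sheetProtection".equals(qName)) {
                sawProtection = true;
            }
            delegate.startElement(uri, localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName) throws SAXException {
            delegate.endElement(uri, localName, qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            // Only the three callbacks used here are forwarded; a complete
            // wrapper would forward every ContentHandler callback.
            delegate.characters(ch, start, length);
        }
    }

    /** Hypothetical stand-in for the handler doing the main work: collects text. */
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the XML through the capturer and reports what both handlers saw. */
    static String describe(String xml) {
        TextCollector main = new TextCollector();
        TagCapturer capturer = new TagCapturer(main);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), capturer);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return "text=" + main.text + ", protected=" + capturer.sawProtection();
    }

    public static void main(String[] args) {
        System.out.println(describe(
                "<worksheet><sheetProtection/><sheetData><c><v>42</v></c></sheetData></worksheet>"));
    }
}
```

The wrapper never changes what the delegate receives, so the main handler stays unaware that anything is being recorded alongside it.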
Constructor and Description |
---|
XSSFExcelExtractorDecorator(ParseContext context, org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor extractor, Locale locale) |
Modifier and Type | Method and Description |
---|---|
protected void | buildXHTML(XHTMLContentHandler xhtml): Populates the XHTMLContentHandler object received as parameter. |
protected List<org.apache.poi.openxml4j.opc.PackagePart> | getMainDocumentParts(): In Excel files, the sheets have things embedded in them, and the sheet drawings hold the images. |
void | getXHTML(ContentHandler handler, Metadata metadata, ParseContext context): Parses the document into a sequence of XHTML SAX events sent to the given content handler. |
void | processSheet(org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, org.apache.poi.xssf.model.CommentsTable comments, org.apache.poi.xssf.model.StylesTable styles, org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable strings, InputStream sheetInputStream) |
Methods inherited from class AbstractOOXMLExtractor: getDocument, getJustFileName, getMetadataExtractor, handleEmbeddedFile
public XSSFExcelExtractorDecorator(ParseContext context, org.apache.poi.xssf.extractor.XSSFEventBasedExcelExtractor extractor, Locale locale)
public void getXHTML(ContentHandler handler, Metadata metadata, ParseContext context) throws SAXException, org.apache.xmlbeans.XmlException, IOException, TikaException
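Description copied from interface: OOXMLExtractor
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Specified by: getXHTML in interface OOXMLExtractor
Overrides: getXHTML in class AbstractOOXMLExtractor
Throws: SAXException, org.apache.xmlbeans.XmlException, IOException, TikaException
See Also: org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor#getXHTML(org.xml.sax.ContentHandler, org.apache.tika.metadata.Metadata)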
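
To make "a sequence of XHTML SAX events sent to the given content handler" concrete, the sketch below pushes a small grid of cell values to a `ContentHandler` as element and character events. It uses only JDK classes and is not Tika's code: `emitSheet`, `MarkupRecorder`, `render`, and the `div`/`h1`/`table` layout are assumptions chosen for illustration, and the real getXHTML also takes Metadata and ParseContext arguments that are left out here.

```java
import java.util.Arrays;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class SheetEventsSketch {
    private static final Attributes NO_ATTRS = new AttributesImpl();

    /** Sends one sheet to the handler as a stream of SAX events. */
    static void emitSheet(ContentHandler handler, String sheetName, List<List<String>> rows)
            throws SAXException {
        handler.startDocument();
        start(handler, "div");
        element(handler, "h1", sheetName);
        start(handler, "table");
        for (List<String> row : rows) {
            start(handler, "tr");
            for (String cell : row) {
                element(handler, "td", cell);
            }
            end(handler, "tr");
        }
        end(handler, "table");
        end(handler, "div");
        handler.endDocument();
    }

    private static void start(ContentHandler h, String name) throws SAXException {
        h.startElement("", name, name, NO_ATTRS);
    }

    private static void end(ContentHandler h, String name) throws SAXException {
        h.endElement("", name, name);
    }

    private static void element(ContentHandler h, String name, String text) throws SAXException {
        start(h, name);
        h.characters(text.toCharArray(), 0, text.length());
        end(h, name);
    }

    /** Hypothetical receiving handler: records the events as markup, without escaping. */
    static class MarkupRecorder extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            out.append('<').append(qName).append('>');
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            out.append("</").append(qName).append('>');
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            out.append(ch, start, length);
        }
    }

    /** Convenience wrapper: emits the sheet into a recorder and returns the markup. */
    static String render(String sheetName, List<List<String>> rows) {
        MarkupRecorder recorder = new MarkupRecorder();
        try {
            emitSheet(recorder, sheetName, rows);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return recorder.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(render("Sheet1",
                Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("1", "2"))));
    }
}
```

Because the producer only talks to the `ContentHandler` interface, the same events could go to a text collector, a serializer, or a filtering wrapper without changing `emitSheet`.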
protected void buildXHTML(XHTMLContentHandler xhtml) throws SAXException, org.apache.xmlbeans.XmlException, IOException
Description copied from class: AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
Specified by: buildXHTML in class AbstractOOXMLExtractor
Throws: SAXException, org.apache.xmlbeans.XmlException, IOException
See Also: XSSFExcelExtractor.getText()
public void processSheet(org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler sheetContentsExtractor, org.apache.poi.xssf.model.CommentsTable comments, org.apache.poi.xssf.model.StylesTable styles, org.apache.poi.xssf.eventusermodel.ReadOnlySharedStringsTable strings, InputStream sheetInputStream) throws IOException, SAXException
Throws: IOException, SAXException
protected List<org.apache.poi.openxml4j.opc.PackagePart> getMainDocumentParts() throws TikaException
Specified by: getMainDocumentParts in class AbstractOOXMLExtractor
Throws: TikaException
Copyright © 2007-2015 The Apache Software Foundation. All Rights Reserved.