Interface OOXMLExtractor
All Known Implementing Classes:
AbstractOOXMLExtractor, POIXMLTextExtractorDecorator, SXSLFPowerPointExtractorDecorator, SXWPFWordExtractorDecorator, XPSExtractorDecorator, XSLFPowerPointExtractorDecorator, XSSFBExcelExtractorDecorator, XSSFExcelExtractorDecorator, XWPFWordExtractorDecorator
public interface OOXMLExtractor
Interface implemented by all Tika OOXML extractors.
See Also:
POIXMLTextExtractor
Method Summary
Modifier and Type | Method | Description
org.apache.poi.ooxml.POIXMLDocument | getDocument() | Returns the opened document.
MetadataExtractor | getMetadataExtractor() | POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
void | getXHTML(ContentHandler handler, Metadata metadata, ParseContext context) | Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Method Detail
getDocument
org.apache.poi.ooxml.POIXMLDocument getDocument()
Returns the opened document.
See Also:
POIXMLTextExtractor.getDocument()
getMetadataExtractor
MetadataExtractor getMetadataExtractor()
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
getXHTML
void getXHTML(ContentHandler handler, Metadata metadata, ParseContext context) throws SAXException, org.apache.xmlbeans.XmlException, IOException, TikaException
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Throws:
SAXException
org.apache.xmlbeans.XmlException
IOException
TikaException
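
Example (illustrative sketch, not part of the Tika API): since getXHTML reports its output as SAX events, the caller chooses what to do with them by supplying a ContentHandler. The sketch below uses only the JDK's org.xml.sax classes. ParagraphCollector, XhtmlEventDemo, and emitSample are hypothetical names introduced for illustration; emitSample stands in for a real extractor by emitting a small fixed event stream, and the use of the XHTML namespace for those events is an assumption.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects character data, one line per p element. */
class ParagraphCollector extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        // Character data may arrive in several chunks, so append each one.
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        // Treat the end of each paragraph as a line break.
        if ("p".equals(localName)) {
            text.append('\n');
        }
    }

    String text() {
        return text.toString();
    }
}

class XhtmlEventDemo {
    // Assumption: events are reported in the XHTML namespace.
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Hypothetical stand-in for an extractor: emits a tiny fixed event stream. */
    static void emitSample(ContentHandler handler) throws SAXException {
        Attributes none = new AttributesImpl();
        handler.startDocument();
        handler.startElement(XHTML_NS, "html", "html", none);
        handler.startElement(XHTML_NS, "body", "body", none);
        for (String para : new String[] {"First paragraph", "Second paragraph"}) {
            handler.startElement(XHTML_NS, "p", "p", none);
            handler.characters(para.toCharArray(), 0, para.length());
            handler.endElement(XHTML_NS, "p", "p");
        }
        handler.endElement(XHTML_NS, "body", "body");
        handler.endElement(XHTML_NS, "html", "html");
        handler.endDocument();
    }

    public static void main(String[] args) throws SAXException {
        ParagraphCollector collector = new ParagraphCollector();
        emitSample(collector);
        System.out.print(collector.text());
    }
}
```

With Tika and POI on the classpath, the same handler would be passed as the first argument of getXHTML(handler, metadata, context) in place of the emitSample call, and the caller would also need to handle XmlException, IOException, and TikaException.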