Interface OOXMLExtractor
All Known Implementing Classes:
AbstractOOXMLExtractor, POIXMLTextExtractorDecorator, SXSLFPowerPointExtractorDecorator, SXWPFWordExtractorDecorator, XPSExtractorDecorator, XSLFPowerPointExtractorDecorator, XSSFBExcelExtractorDecorator, XSSFExcelExtractorDecorator, XWPFWordExtractorDecorator
public interface OOXMLExtractor
Interface implemented by all Tika OOXML extractors.
See Also:
POIXMLTextExtractor
Method Summary
| Modifier and Type | Method | Description |
| --- | --- | --- |
| org.apache.poi.ooxml.POIXMLDocument | getDocument() | Returns the opened document. |
| MetadataExtractor | getMetadataExtractor() | POIXMLTextExtractor.getMetadataTextExtractor() not yet supported for OOXML by POI. |
| void | getXHTML(ContentHandler handler, Metadata metadata, ParseContext context) | Parses the document into a sequence of XHTML SAX events sent to the given content handler. |
Method Detail
getDocument
org.apache.poi.ooxml.POIXMLDocument getDocument()
Returns the opened document.
See Also:
POIXMLTextExtractor.getDocument()
getMetadataExtractor
MetadataExtractor getMetadataExtractor()
POIXMLTextExtractor.getMetadataTextExtractor() not yet supported for OOXML by POI.
getXHTML
void getXHTML(ContentHandler handler, Metadata metadata, ParseContext context) throws SAXException, org.apache.xmlbeans.XmlException, IOException, TikaException
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
Throws:
SAXException
org.apache.xmlbeans.XmlException
IOException
TikaException
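As an illustration of what "a sequence of XHTML SAX events sent to the given content handler" can look like from the caller's side, the following is a minimal sketch that uses only the JDK's org.xml.sax classes. The names XhtmlEventDemo, ParagraphCollector, and emitSampleEvents are hypothetical and are not part of Tika or POI. In particular, emitSampleEvents is only a stand-in for an extractor: in real use the events would presumably come from a call to getXHTML(handler, metadata, context) on an implementing class, and the exact elements emitted depend on the extractor.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

class XhtmlEventDemo {

    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Hypothetical handler: collects character data, one line per <p> element. */
    static class ParagraphCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            // Terminate each paragraph with a newline.
            if ("p".equals(localName)) {
                text.append('\n');
            }
        }

        String getText() {
            return text.toString();
        }
    }

    /**
     * Stand-in for an extractor (an assumption for this sketch): emits a small,
     * hand-written XHTML event stream to the given handler.
     */
    static void emitSampleEvents(ContentHandler handler) throws SAXException {
        Attributes none = new AttributesImpl();
        handler.startDocument();
        handler.startElement(XHTML_NS, "html", "html", none);
        handler.startElement(XHTML_NS, "body", "body", none);
        emitParagraph(handler, "First paragraph");
        emitParagraph(handler, "Second paragraph");
        handler.endElement(XHTML_NS, "body", "body");
        handler.endElement(XHTML_NS, "html", "html");
        handler.endDocument();
    }

    private static void emitParagraph(ContentHandler handler, String content)
            throws SAXException {
        char[] chars = content.toCharArray();
        handler.startElement(XHTML_NS, "p", "p", new AttributesImpl());
        handler.characters(chars, 0, chars.length);
        handler.endElement(XHTML_NS, "p", "p");
    }

    public static void main(String[] args) throws SAXException {
        ParagraphCollector collector = new ParagraphCollector();
        emitSampleEvents(collector);
        System.out.print(collector.getText());
    }
}
```

Because the handler depends only on the ContentHandler contract, the same class could be passed to any source of XHTML SAX events without change.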