Package org.apache.tika.parser.microsoft.ooxml

Interface Summary

  OOXMLExtractor
      Interface implemented by all Tika OOXML extractors.
  OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler

Class Summary

  AbstractOOXMLExtractor
      Base class for all Tika OOXML extractors.
  MetadataExtractor
      OOXML metadata extractor.
  OOXMLExtractorFactory
      Figures out the correct OOXMLExtractor for the supplied document and returns it.
  OOXMLParser
      Office Open XML (OOXML) parser.
  OOXMLTikaBodyPartHandler
  OOXMLWordAndPowerPointTextHandler
      This class is intended to handle anything that might contain IBodyElements: main document, headers, footers, notes, slides, etc.
  ParagraphProperties
  POIXMLTextExtractorDecorator
  RunProperties
      WARNING: This class is mutable.
  SXSLFPowerPointExtractorDecorator
      SAX/Streaming pptx extractor.
  SXWPFWordExtractorDecorator
      This is an experimental, alternative extractor for docx files.
  XSLFPowerPointExtractorDecorator
  XSSFBExcelExtractorDecorator
  XSSFExcelExtractorDecorator
  XSSFExcelExtractorDecorator.HeaderFooterFromString
  XSSFExcelExtractorDecorator.SheetTextAsHTML
      Turns formatted sheet events into HTML.
  XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
      Captures information on interesting tags, whilst delegating the main work to the formatting handler.
  XWPFListManager
  XWPFWordExtractorDecorator
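
OOXMLExtractorFactory is described above as figuring out the correct OOXMLExtractor for the supplied document. The sketch below illustrates the general idea of such a dispatch using only the JDK: an OOXML file is a ZIP package whose [Content_Types].xml part declares the content type of the main document part. This is not Tika's implementation. The class OoxmlKindSketch, the Kind enum, and the helpers detectKind and samplePackage are hypothetical names introduced for illustration, and the part name /main.xml in the sample package is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch, not part of Tika: classifies an OOXML package by the
// main-part content type declared in [Content_Types].xml.
final class OoxmlKindSketch {

    enum Kind { WORD, EXCEL, POWERPOINT, UNKNOWN }

    // Main-part content types for docx, xlsx and pptx packages.
    static final String WORD_MAIN =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
    static final String EXCEL_MAIN =
        "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet.main+xml";
    static final String POWERPOINT_MAIN =
        "application/vnd.openxmlformats-officedocument.presentationml.presentation.main+xml";

    // Scans the ZIP entries for [Content_Types].xml and matches its text
    // against the known main-part content types.
    static Kind detectKind(byte[] packageBytes) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(packageBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!"[Content_Types].xml".equals(entry.getName())) {
                    continue;
                }
                String xml = new String(zip.readAllBytes(), StandardCharsets.UTF_8);
                if (xml.contains(WORD_MAIN)) return Kind.WORD;
                if (xml.contains(EXCEL_MAIN)) return Kind.EXCEL;
                if (xml.contains(POWERPOINT_MAIN)) return Kind.POWERPOINT;
                return Kind.UNKNOWN;
            }
            return Kind.UNKNOWN; // no content-types part found
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a tiny in-memory ZIP holding only a [Content_Types].xml part,
    // enough to exercise detectKind without a real Office file.
    static byte[] samplePackage(String mainContentType) {
        String xml = "<Types xmlns=\"http://schemas.openxmlformats.org/package/2006/content-types\">"
            + "<Override PartName=\"/main.xml\" ContentType=\"" + mainContentType + "\"/>"
            + "</Types>";
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(out)) {
            zip.putNextEntry(new ZipEntry("[Content_Types].xml"));
            zip.write(xml.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(detectKind(samplePackage(WORD_MAIN)));
        System.out.println(detectKind(samplePackage(EXCEL_MAIN)));
        System.out.println(detectKind(samplePackage(POWERPOINT_MAIN)));
    }
}
```

A real factory would go on to return an extractor object for the detected kind. The substring match here only keeps the sketch short; production code would parse the XML and handle macro-enabled and template variants as well.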

Enum Summary

  OOXMLWordAndPowerPointTextHandler.EditType