Uses of Class
org.apache.tika.sax.XHTMLContentHandler

Uses of XHTMLContentHandler in org.apache.tika.parser.executable

Methods in org.apache.tika.parser.executable with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | ExecutableParser.parseELF(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a Unix ELF file |
| void | ExecutableParser.parseMachO(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a Mach-O file |
| void | UniversalExecutableParser.parseMachO(XHTMLContentHandler xhtml, EmbeddedDocumentExtractor extractor, Metadata metadata, InputStream stream, byte[] first4) | Parses a Mach-O Universal file |
| void | ExecutableParser.parsePE(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a DOS or Windows PE file |

Uses of XHTMLContentHandler in org.apache.tika.parser.hwp

Methods in org.apache.tika.parser.hwp with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | HwpTextExtractorV5.extract(InputStream source, Metadata metadata, XHTMLContentHandler xhtml) | Extracts text from an HWP stream. |

Uses of XHTMLContentHandler in org.apache.tika.parser.isatab

Methods in org.apache.tika.parser.isatab with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static void | ISATabUtils.parseAssay(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) | |
| static void | ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context) | |
| static void | ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context, String studyFileName) | |
| static void | ISATabUtils.parseStudy(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft

Methods in org.apache.tika.parser.microsoft with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| static void | FormattingUtils.closeStyleTags(XHTMLContentHandler xhtml, Deque<FormattingUtils.Tag> formattingState) | Closes all formatting tags. |
| static void | FormattingUtils.ensureFormattingState(XHTMLContentHandler xhtml, EnumSet<FormattingUtils.Tag> desired, Deque<FormattingUtils.Tag> currentState) | Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state. |
| protected void | ExcelExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml, Locale locale) | |
| protected void | ExcelExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml, Locale locale) | Extracts text from an Excel Workbook, writing the extracted content to the specified Appendable. |
| protected void | HSLFExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) | |
| protected void | HSLFExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) | |
| protected void | OfficeParser.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, ParseContext context, Metadata metadata, XHTMLContentHandler xhtml) | |
| protected static void | OldExcelParser.parse(org.apache.poi.hssf.extractor.OldExcelExtractor extractor, XHTMLContentHandler xhtml) | |
| void | OutlookExtractor.parse(XHTMLContentHandler xhtml) | |
| protected void | WordExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) | |
| protected void | WordExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) | |
| protected void | WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) | |
| protected void | WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) | |
| void | Cell.render(XHTMLContentHandler handler) | Renders the content to the given XHTML SAX event stream. |
| void | CellDecorator.render(XHTMLContentHandler handler) | |
| void | LinkedCell.render(XHTMLContentHandler handler) | |
| void | NumberCell.render(XHTMLContentHandler handler) | |
| void | TextCell.render(XHTMLContentHandler handler) | |
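
The description of FormattingUtils.ensureFormattingState (close tags until only desired ones remain open, then open whatever is missing) can be illustrated with a small stack-based sketch. This is not Tika's implementation: the Tag constants B, I, and U are hypothetical, and a List<String> stands in for the XHTMLContentHandler so the example runs with the JDK alone.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;

class FormattingStateSketch {
    // Hypothetical stand-in for FormattingUtils.Tag; the real constants are not listed on this page.
    enum Tag { B, I, U }

    // Assumption: 'out' records events as strings in place of calls on an XHTMLContentHandler.
    static void ensureFormattingState(List<String> out, EnumSet<Tag> desired, Deque<Tag> currentState) {
        // Tags that are open now but not wanted any more.
        EnumSet<Tag> undesired = EnumSet.noneOf(Tag.class);
        undesired.addAll(currentState);
        undesired.removeAll(desired);

        // Close from the innermost tag outwards until no undesired tag is left open.
        // Desired tags nested inside an undesired one are closed too, to keep the output well nested.
        while (!undesired.isEmpty()) {
            Tag closed = currentState.pop();
            out.add("</" + closed.name().toLowerCase(Locale.ROOT) + ">");
            undesired.remove(closed);
        }

        // Open every desired tag that is not currently open.
        for (Tag tag : desired) {
            if (!currentState.contains(tag)) {
                currentState.push(tag);
                out.add("<" + tag.name().toLowerCase(Locale.ROOT) + ">");
            }
        }
    }

    // Closing all formatting tags is the same operation with an empty desired set.
    static void closeStyleTags(List<String> out, Deque<Tag> formattingState) {
        ensureFormattingState(out, EnumSet.noneOf(Tag.class), formattingState);
    }

    public static void main(String[] args) {
        List<String> out = new ArrayList<>();
        Deque<Tag> state = new ArrayDeque<>();
        ensureFormattingState(out, EnumSet.of(Tag.B, Tag.I), state); // bold italic run
        ensureFormattingState(out, EnumSet.of(Tag.I), state);        // italic-only run
        closeStyleTags(out, state);                                  // end of paragraph
        System.out.println(out); // prints [<b>, <i>, </i>, </b>, <i>, </i>]
    }
}
```

In the sketch, moving from bold italic to italic only closes both tags and reopens the italic one, because the bold tag sits outside the italic tag on the stack. Whether the real method orders its events the same way is not stated on this page.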

Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.libpst

Constructors in org.apache.tika.parser.microsoft.libpst with parameters of type XHTMLContentHandler

| Constructor | Description |
| --- | --- |
| EmailVisitor(Path root, boolean processEmailAsMsg, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.onenote.fsshttpb

Methods in org.apache.tika.parser.microsoft.onenote.fsshttpb with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | MSOneStorePackage.walkTree(OneNoteTreeWalkerOptions options, Metadata metadata, XHTMLContentHandler xhtml) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml

Methods in org.apache.tika.parser.microsoft.ooxml with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected abstract void | AbstractOOXMLExtractor.buildXHTML(XHTMLContentHandler xhtml) | Populates the XHTMLContentHandler object received as parameter. |
| protected void | POIXMLTextExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | SXSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | SXWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | XSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | XSSFBExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | XSSFExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | XWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |
| protected void | XSSFBExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml) | |
| protected void | XSSFExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml) | |
| protected void | XSSFExcelExtractorDecorator.extractHyperLinks(org.apache.poi.openxml4j.opc.PackagePart sheetPart, XHTMLContentHandler xhtml) | |
| protected void | AbstractOOXMLExtractor.handleEmbeddedFile(org.apache.poi.openxml4j.opc.PackagePart part, XHTMLContentHandler xhtml, String rel, EmbeddedPartMetadata embeddedPartMetadata, TikaCoreProperties.EmbeddedResourceType embeddedResourceType) | Handles an embedded file in the document |
| protected void | XSSFExcelExtractorDecorator.processShapes(List<org.apache.poi.xssf.usermodel.XSSFShape> shapes, XHTMLContentHandler xhtml) | |

Constructors in org.apache.tika.parser.microsoft.ooxml with parameters of type XHTMLContentHandler

| Constructor | Description |
| --- | --- |
| OOXMLTikaBodyPartHandler(XHTMLContentHandler xhtml) | |
| OOXMLTikaBodyPartHandler(XHTMLContentHandler xhtml, XWPFStylesShim styles, XWPFListManager listManager, OfficeParserConfig parserConfig) | |
| SheetTextAsHTML(OfficeParserConfig config, XHTMLContentHandler xhtml) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml.xps

Methods in org.apache.tika.parser.microsoft.ooxml.xps with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected void | XPSExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.mp4

Constructors in org.apache.tika.parser.mp4 with parameters of type XHTMLContentHandler

| Constructor | Description |
| --- | --- |
| TikaMp4BoxHandler(com.drew.metadata.Metadata metadata, Metadata tikaMetadata, XHTMLContentHandler xhtml) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.mp4.boxes

Constructors in org.apache.tika.parser.mp4.boxes with parameters of type XHTMLContentHandler

| Constructor | Description |
| --- | --- |
| TikaUserDataBox(String box, byte[] payload, Metadata metadata, XHTMLContentHandler xhtml) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.pdf.image

Fields in org.apache.tika.parser.pdf.image declared as XHTMLContentHandler

| Modifier and Type | Field | Description |
| --- | --- | --- |
| protected XHTMLContentHandler | ImageGraphicsEngine.xhtml | |

Methods in org.apache.tika.parser.pdf.image with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| ImageGraphicsEngine | ImageGraphicsEngineFactory.newEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream,Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) | |

Constructors in org.apache.tika.parser.pdf.image with parameters of type XHTMLContentHandler

| Constructor | Description |
| --- | --- |
| ImageGraphicsEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream,Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) | |

Uses of XHTMLContentHandler in org.apache.tika.parser.pkg

Methods in org.apache.tika.parser.pkg with parameters of type XHTMLContentHandler

| Modifier and Type | Method | Description |
| --- | --- | --- |
| protected static Metadata | PackageParser.handleEntryMetadata(String name, Date createAt, Date modifiedAt, Long size, XHTMLContentHandler xhtml) | |