Uses of Class
org.apache.tika.sax.XHTMLContentHandler
Packages that use XHTMLContentHandler
Package
Description
Uses of XHTMLContentHandler in org.apache.tika.parser.executable
Methods in org.apache.tika.parser.executable with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
void | ExecutableParser.parseELF(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a Unix ELF file
void | ExecutableParser.parseMachO(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a Mach-O file
void | ExecutableParser.parsePE(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a DOS or Windows PE file
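The three methods above each take the first four bytes of the stream (`first4`), which suggests that the caller picks a format by looking at the file's magic number. The following is a minimal sketch of such a check using only the JDK. The class `ExecutableMagicSketch`, the enum `Kind`, and the method `classify` are hypothetical names introduced for illustration and are not part of Tika; the sketch does not claim to reproduce how `ExecutableParser` actually dispatches. The magic values themselves are the standard ones for ELF, DOS/PE ("MZ"), and thin Mach-O files.

```java
public class ExecutableMagicSketch {
    /** Hypothetical result type for illustration; not a Tika class. */
    enum Kind { ELF, MACH_O, PE, UNKNOWN }

    /** Classifies an executable by its first four bytes (assumed dispatch rule). */
    static Kind classify(byte[] first4) {
        if (first4 == null || first4.length < 4) {
            return Kind.UNKNOWN;
        }
        // ELF files start with 0x7F followed by the ASCII letters "ELF".
        if (first4[0] == 0x7F && first4[1] == 'E' && first4[2] == 'L' && first4[3] == 'F') {
            return Kind.ELF;
        }
        // DOS and Windows PE files start with the ASCII letters "MZ".
        if (first4[0] == 'M' && first4[1] == 'Z') {
            return Kind.PE;
        }
        // Read the four bytes as one big-endian int to compare against Mach-O magics.
        int magic = ((first4[0] & 0xFF) << 24) | ((first4[1] & 0xFF) << 16)
                | ((first4[2] & 0xFF) << 8) | (first4[3] & 0xFF);
        switch (magic) {
            case 0xFEEDFACE: // 32-bit Mach-O
            case 0xFEEDFACF: // 64-bit Mach-O
            case 0xCEFAEDFE: // 32-bit Mach-O, bytes in reversed order
            case 0xCFFAEDFE: // 64-bit Mach-O, bytes in reversed order
                return Kind.MACH_O;
            default:
                return Kind.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        byte[] elf = {0x7F, 'E', 'L', 'F'};
        byte[] pe = {'M', 'Z', (byte) 0x90, 0x00};
        byte[] machO = {(byte) 0xCF, (byte) 0xFA, (byte) 0xED, (byte) 0xFE};
        System.out.println(classify(elf));
        System.out.println(classify(pe));
        System.out.println(classify(machO));
    }
}
```

Universal ("fat") Mach-O binaries are deliberately left out of this sketch because their magic number, 0xCAFEBABE, is shared with Java class files and needs extra disambiguation.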
Uses of XHTMLContentHandler in org.apache.tika.parser.hwp
Methods in org.apache.tika.parser.hwp with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
void | HwpTextExtractorV5.extract(InputStream source, Metadata metadata, XHTMLContentHandler xhtml) | Extracts text from an HWP stream.
Uses of XHTMLContentHandler in org.apache.tika.parser.isatab
Methods in org.apache.tika.parser.isatab with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
static void | ISATabUtils.parseAssay(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) |
static void | ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context) |
static void | ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context, String studyFileName) |
static void | ISATabUtils.parseStudy(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) |
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft
Methods in org.apache.tika.parser.microsoft with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
static void | FormattingUtils.closeStyleTags(XHTMLContentHandler xhtml, Deque<FormattingUtils.Tag> formattingState) | Closes all formatting tags.
static void | FormattingUtils.ensureFormattingState(XHTMLContentHandler xhtml, EnumSet<FormattingUtils.Tag> desired, Deque<FormattingUtils.Tag> currentState) | Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state.
protected void | ExcelExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml, Locale locale) |
protected void | ExcelExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml, Locale locale) | Extracts text from an Excel Workbook writing the extracted content to the specified Appendable.
protected void | HSLFExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) |
protected void | HSLFExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) |
protected void | OfficeParser.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, ParseContext context, Metadata metadata, XHTMLContentHandler xhtml) |
protected static void | OldExcelParser.parse(org.apache.poi.hssf.extractor.OldExcelExtractor extractor, XHTMLContentHandler xhtml) |
void | OutlookExtractor.parse(XHTMLContentHandler xhtml) |
protected void | WordExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) |
protected void | WordExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) |
protected void | WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) |
protected void | WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) |
void | Cell.render(XHTMLContentHandler handler) | Renders the content to the given XHTML SAX event stream.
void | CellDecorator.render(XHTMLContentHandler handler) |
void | LinkedCell.render(XHTMLContentHandler handler) |
void | NumberCell.render(XHTMLContentHandler handler) |
void | TextCell.render(XHTMLContentHandler handler) |
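The description of FormattingUtils.ensureFormattingState (close tags until the current state holds only desired tags, then open whatever is still missing) can be illustrated with a small JDK-only sketch. Everything here is an assumption made for illustration: the class `FormattingStateSketch`, its three-value `Tag` enum, and the `ensureState` method are hypothetical stand-ins, and the sketch records the tag events as strings in a list instead of writing to an XHTMLContentHandler. It is not Tika's implementation.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumSet;
import java.util.List;
import java.util.Locale;

public class FormattingStateSketch {
    /** Hypothetical formatting tags; stands in for FormattingUtils.Tag. */
    enum Tag { B, I, U }

    /**
     * Brings the stack of open tags in line with the desired set and
     * returns the open/close events that were needed, in order.
     */
    static List<String> ensureState(EnumSet<Tag> desired, Deque<Tag> currentState) {
        List<String> events = new ArrayList<>();
        // Close from the innermost tag outwards until no undesired tag remains open.
        while (!desired.containsAll(currentState)) {
            Tag closed = currentState.pop();
            events.add("</" + closed.name().toLowerCase(Locale.ROOT) + ">");
        }
        // Open every desired tag that is not already open.
        EnumSet<Tag> toOpen = EnumSet.copyOf(desired);
        toOpen.removeAll(currentState);
        for (Tag tag : toOpen) {
            currentState.push(tag);
            events.add("<" + tag.name().toLowerCase(Locale.ROOT) + ">");
        }
        return events;
    }

    public static void main(String[] args) {
        Deque<Tag> open = new ArrayDeque<>();
        open.push(Tag.B); // <b> was opened first
        open.push(Tag.I); // <i> is the innermost open tag
        System.out.println(ensureState(EnumSet.of(Tag.I, Tag.U), open));
    }
}
```

Note that a still-desired tag may be closed and reopened when it is nested inside an undesired one, as happens to `I` in the example: tags can only be closed from the innermost outwards, so this is the price of keeping the output well nested.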
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.libpst
Constructors in org.apache.tika.parser.microsoft.libpst with parameters of type XHTMLContentHandler
Modifier | Constructor | Description
| EmailVisitor(Path root, boolean processEmailAsMsg, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) |
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.onenote.fsshttpb
Methods in org.apache.tika.parser.microsoft.onenote.fsshttpb with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
void | MSOneStorePackage.walkTree(OneNoteTreeWalkerOptions options, Metadata metadata, XHTMLContentHandler xhtml) |
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml
Methods in org.apache.tika.parser.microsoft.ooxml with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
protected abstract void | AbstractOOXMLExtractor.buildXHTML(XHTMLContentHandler xhtml) | Populates the XHTMLContentHandler object received as parameter.
protected void | POIXMLTextExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | SXSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | SXWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | XSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | XSSFBExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | XSSFExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | XWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
protected void | XSSFBExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml) |
protected void | XSSFExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml) |
protected void | XSSFExcelExtractorDecorator.extractHyperLinks(org.apache.poi.openxml4j.opc.PackagePart sheetPart, XHTMLContentHandler xhtml) |
protected void | AbstractOOXMLExtractor.handleEmbeddedFile(org.apache.poi.openxml4j.opc.PackagePart part, XHTMLContentHandler xhtml, String rel, EmbeddedPartMetadata embeddedPartMetadata, TikaCoreProperties.EmbeddedResourceType embeddedResourceType) | Handles an embedded file in the document
protected void | XSSFExcelExtractorDecorator.processShapes(List<org.apache.poi.xssf.usermodel.XSSFShape> shapes, XHTMLContentHandler xhtml) |
Constructors in org.apache.tika.parser.microsoft.ooxml with parameters of type XHTMLContentHandler
Modifier | Constructor | Description
| OOXMLTikaBodyPartHandler(XHTMLContentHandler xhtml, XWPFStylesShim styles, XWPFListManager listManager, OfficeParserConfig parserConfig) |
protected | SheetTextAsHTML(OfficeParserConfig config, XHTMLContentHandler xhtml) |
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml.xps
Methods in org.apache.tika.parser.microsoft.ooxml.xps with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
protected void | XPSExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |
Uses of XHTMLContentHandler in org.apache.tika.parser.mp4
Constructors in org.apache.tika.parser.mp4 with parameters of type XHTMLContentHandler
Modifier | Constructor | Description
| TikaMp4BoxHandler(com.drew.metadata.Metadata metadata, Metadata tikaMetadata, XHTMLContentHandler xhtml) |
Uses of XHTMLContentHandler in org.apache.tika.parser.mp4.boxes
Constructors in org.apache.tika.parser.mp4.boxes with parameters of type XHTMLContentHandler
Modifier | Constructor | Description
| TikaUserDataBox(String box, byte[] payload, Metadata metadata, XHTMLContentHandler xhtml) |
Uses of XHTMLContentHandler in org.apache.tika.parser.pdf.image
Fields in org.apache.tika.parser.pdf.image declared as XHTMLContentHandler
Methods in org.apache.tika.parser.pdf.image with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
| ImageGraphicsEngineFactory.newEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream, Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) |
Constructors in org.apache.tika.parser.pdf.image with parameters of type XHTMLContentHandler
Modifier | Constructor | Description
protected | ImageGraphicsEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream, Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) |
Uses of XHTMLContentHandler in org.apache.tika.parser.pkg
Methods in org.apache.tika.parser.pkg with parameters of type XHTMLContentHandler
Modifier and Type | Method | Description
protected static Metadata | PackageParser.handleEntryMetadata(String name, Date createAt, Date modifiedAt, Long size, XHTMLContentHandler xhtml) |