Uses of Class
org.apache.tika.sax.XHTMLContentHandler
Uses of XHTMLContentHandler in org.apache.tika.parser.executable
| Modifier and Type | Method | Description |
| void | ExecutableParser.parseELF(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a Unix ELF file |
| void | ExecutableParser.parsePE(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4) | Parses a DOS or Windows PE file |
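The byte[] first4 parameter on parseELF and parsePE suggests that the caller reads the leading bytes of the stream before choosing a method. The sketch below shows one way such a choice could be made from the well-known magic numbers (0x7F 'E' 'L' 'F' for ELF, 'M' 'Z' for DOS and PE). ExecutableMagic, Kind, and classify are hypothetical names used only for illustration; this is not Tika's code, and the library's actual dispatch logic may differ.

```java
/** Hypothetical helper, not part of Tika: classifies a file by its first four bytes. */
final class ExecutableMagic {

    enum Kind { ELF, PE_OR_DOS, UNKNOWN }

    static Kind classify(byte[] first4) {
        if (first4 == null || first4.length < 4) {
            return Kind.UNKNOWN;
        }
        // ELF files start with 0x7F followed by the ASCII letters "ELF".
        if (first4[0] == 0x7F && first4[1] == 'E' && first4[2] == 'L' && first4[3] == 'F') {
            return Kind.ELF;
        }
        // DOS and Windows PE files start with the ASCII letters "MZ".
        if (first4[0] == 'M' && first4[1] == 'Z') {
            return Kind.PE_OR_DOS;
        }
        return Kind.UNKNOWN;
    }

    public static void main(String[] args) {
        System.out.println(classify(new byte[] {0x7F, 'E', 'L', 'F'}));
        System.out.println(classify(new byte[] {'M', 'Z', (byte) 0x90, 0x00}));
        System.out.println(classify(new byte[] {'%', 'P', 'D', 'F'}));
    }
}
```

A caller following this pattern would read four bytes, classify them, and then hand the same buffer to the matching parse method so that the bytes already consumed are not lost.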
Uses of XHTMLContentHandler in org.apache.tika.parser.hwp
| Modifier and Type | Method | Description |
| void | HwpTextExtractorV5.extract(InputStream source, Metadata metadata, XHTMLContentHandler xhtml) | Extracts text from an HWP stream. |
Uses of XHTMLContentHandler in org.apache.tika.parser.isatab
| Modifier and Type | Method | Description |
| static void | ISATabUtils.parseAssay(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) |  |
| static void | ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context) |  |
| static void | ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context, String studyFileName) |  |
| static void | ISATabUtils.parseStudy(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context) |  |
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft
| Modifier and Type | Method | Description |
| static void | FormattingUtils.closeStyleTags(XHTMLContentHandler xhtml, Deque<FormattingUtils.Tag> formattingState) | Closes all formatting tags. |
| static void | FormattingUtils.ensureFormattingState(XHTMLContentHandler xhtml, EnumSet<FormattingUtils.Tag> desired, Deque<FormattingUtils.Tag> currentState) | Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state. |
| protected void | ExcelExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml, Locale locale) |  |
| protected void | ExcelExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml, Locale locale) | Extracts text from an Excel Workbook, writing the extracted content to the specified Appendable. |
| protected void | HSLFExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) |  |
| protected void | HSLFExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) |  |
| protected void | OfficeParser.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, ParseContext context, Metadata metadata, XHTMLContentHandler xhtml) |  |
| protected static void | OldExcelParser.parse(org.apache.poi.hssf.extractor.OldExcelExtractor extractor, XHTMLContentHandler xhtml) |  |
| void | OutlookExtractor.parse(XHTMLContentHandler xhtml) |  |
| void | OutlookExtractor.parse(XHTMLContentHandler xhtml, Metadata metadata) | Deprecated. Use parse(XHTMLContentHandler); will be removed after 2.4.0. |
| protected void | WordExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) |  |
| protected void | WordExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) |  |
| protected void | WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml) |  |
| protected void | WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml) |  |
| void | Cell.render(XHTMLContentHandler handler) | Renders the content to the given XHTML SAX event stream. |
| void | CellDecorator.render(XHTMLContentHandler handler) |  |
| void | LinkedCell.render(XHTMLContentHandler handler) |  |
| void | NumberCell.render(XHTMLContentHandler handler) |  |
| void | TextCell.render(XHTMLContentHandler handler) |  |
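The description of FormattingUtils.ensureFormattingState (close tags until only desired ones remain open, then open whatever is missing) can be illustrated with a small stack-based sketch. The code below is an assumption-laden stand-in, not Tika's implementation: FormattingStateSketch and its Tag enum are hypothetical, and a StringBuilder replaces the XHTMLContentHandler so the example runs on the JDK alone.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.EnumSet;

/** Illustrative stand-in, not Tika's FormattingUtils. */
final class FormattingStateSketch {

    /** Hypothetical tag set; the real FormattingUtils.Tag may differ. */
    enum Tag {
        B("b"), I("i"), U("u");

        final String element;

        Tag(String element) {
            this.element = element;
        }
    }

    static void ensureFormattingState(StringBuilder out, EnumSet<Tag> desired,
                                      Deque<Tag> currentState) {
        // Collect the open tags that are no longer wanted.
        EnumSet<Tag> unwanted = EnumSet.noneOf(Tag.class);
        unwanted.addAll(currentState);
        unwanted.removeAll(desired);

        // Close innermost-first until no unwanted tag is left open. Wanted tags
        // nested inside an unwanted one are closed too, keeping the markup well formed.
        while (!unwanted.isEmpty()) {
            Tag closed = currentState.pop();
            out.append("</").append(closed.element).append('>');
            unwanted.remove(closed);
        }

        // Open (or reopen) whatever is missing from the desired set.
        for (Tag tag : desired) {
            if (!currentState.contains(tag)) {
                currentState.push(tag);
                out.append('<').append(tag.element).append('>');
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        Deque<Tag> state = new ArrayDeque<>();
        ensureFormattingState(out, EnumSet.of(Tag.B, Tag.I), state);
        out.append("bold italic");
        ensureFormattingState(out, EnumSet.of(Tag.I), state);
        out.append("italic only");
        ensureFormattingState(out, EnumSet.noneOf(Tag.class), state);
        System.out.println(out);
    }
}
```

In this sketch, moving from bold plus italic to italic alone closes both tags and reopens the italic one, because the bold tag encloses it. Calling the method with an empty desired set closes everything, which is the behaviour the table attributes to closeStyleTags.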
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.onenote.fsshttpb
| Modifier and Type | Method | Description |
| void | MSOneStorePackage.walkTree(OneNoteTreeWalkerOptions options, Metadata metadata, XHTMLContentHandler xhtml) |  |
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml
| Modifier and Type | Method | Description |
| protected abstract void | AbstractOOXMLExtractor.buildXHTML(XHTMLContentHandler xhtml) | Populates the XHTMLContentHandler object received as parameter. |
| protected void | POIXMLTextExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | SXSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | SXWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | XSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | XSSFBExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | XSSFExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | XWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
| protected void | XSSFBExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml) |  |
| protected void | XSSFExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml) |  |
| protected void | XSSFExcelExtractorDecorator.extractHyperLinks(org.apache.poi.openxml4j.opc.PackagePart sheetPart, XHTMLContentHandler xhtml) |  |
| protected void | AbstractOOXMLExtractor.handleEmbeddedFile(org.apache.poi.openxml4j.opc.PackagePart part, XHTMLContentHandler xhtml, String rel, EmbeddedPartMetadata embeddedPartMetadata, TikaCoreProperties.EmbeddedResourceType embeddedResourceType) | Handles an embedded file in the document |
| protected void | XSSFExcelExtractorDecorator.processShapes(List<org.apache.poi.xssf.usermodel.XSSFShape> shapes, XHTMLContentHandler xhtml) |  |

| Modifier | Constructor | Description |
|  | OOXMLTikaBodyPartHandler(XHTMLContentHandler xhtml, XWPFStylesShim styles, XWPFListManager listManager, OfficeParserConfig parserConfig) |  |
| protected | SheetTextAsHTML(OfficeParserConfig config, XHTMLContentHandler xhtml) |  |
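AbstractOOXMLExtractor.buildXHTML is abstract and is implemented by each format-specific decorator, which has the shape of a template method: a base class owns the document skeleton and subclasses fill in the body. The sketch below illustrates that shape under stated assumptions. The JDK's org.xml.sax.ContentHandler stands in for XHTMLContentHandler, and ExtractorSketch, ParagraphExtractorSketch, RecordingHandler, and writeTo are hypothetical names that do not exist in Tika.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Illustrative base class; hypothetical, not Tika's AbstractOOXMLExtractor. */
abstract class ExtractorSketch {
    static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Fixed skeleton: opens the document, delegates the body, closes the document. */
    final void writeTo(ContentHandler xhtml) throws SAXException {
        xhtml.startDocument();
        xhtml.startElement(XHTML_NS, "body", "body", new AttributesImpl());
        buildXHTML(xhtml);
        xhtml.endElement(XHTML_NS, "body", "body");
        xhtml.endDocument();
    }

    /** Format-specific subclasses populate the handler here. */
    protected abstract void buildXHTML(ContentHandler xhtml) throws SAXException;
}

/** Hypothetical subclass that emits each string as a paragraph. */
final class ParagraphExtractorSketch extends ExtractorSketch {
    private final String[] paragraphs;

    ParagraphExtractorSketch(String... paragraphs) {
        this.paragraphs = paragraphs;
    }

    @Override
    protected void buildXHTML(ContentHandler xhtml) throws SAXException {
        for (String text : paragraphs) {
            xhtml.startElement(XHTML_NS, "p", "p", new AttributesImpl());
            xhtml.characters(text.toCharArray(), 0, text.length());
            xhtml.endElement(XHTML_NS, "p", "p");
        }
    }
}

/** Records SAX events as unescaped markup so the result can be inspected. */
final class RecordingHandler extends DefaultHandler {
    final StringBuilder out = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        out.append('<').append(qName).append('>');
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        out.append("</").append(qName).append('>');
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        out.append(ch, start, length);
    }
}

final class BuildXhtmlDemo {
    public static void main(String[] args) throws SAXException {
        RecordingHandler handler = new RecordingHandler();
        new ParagraphExtractorSketch("Sheet1", "Sheet2").writeTo(handler);
        System.out.println(handler.out);
    }
}
```

Keeping the skeleton in one final method means every subclass produces a consistently framed document, while the per-format logic stays confined to buildXHTML.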
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml.xps
| Modifier and Type | Method | Description |
| protected void | XPSExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml) |  |
Uses of XHTMLContentHandler in org.apache.tika.parser.mp4
| Modifier | Constructor | Description |
|  | TikaMp4BoxHandler(com.drew.metadata.Metadata metadata, Metadata tikaMetadata, XHTMLContentHandler xhtml) |  |
Uses of XHTMLContentHandler in org.apache.tika.parser.mp4.boxes
| Modifier | Constructor | Description |
|  | TikaUserDataBox(String box, byte[] payload, Metadata metadata, XHTMLContentHandler xhtml) |  |
Uses of XHTMLContentHandler in org.apache.tika.parser.pdf.image
| Modifier and Type | Method | Description |
|  | ImageGraphicsEngineFactory.newEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream, Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) |  |

| Modifier | Constructor | Description |
| protected | ImageGraphicsEngine(org.apache.pdfbox.pdmodel.PDPage page, int pageNumber, EmbeddedDocumentExtractor embeddedDocumentExtractor, PDFParserConfig pdfParserConfig, Map<org.apache.pdfbox.cos.COSStream, Integer> processedInlineImages, AtomicInteger imageCounter, XHTMLContentHandler xhtml, Metadata parentMetadata, ParseContext parseContext) |  |
Uses of XHTMLContentHandler in org.apache.tika.parser.pkg
| Modifier and Type | Method | Description |
| protected static Metadata | PackageParser.handleEntryMetadata(String name, Date createAt, Date modifiedAt, Long size, XHTMLContentHandler xhtml) |  |