Uses of Class
org.apache.tika.sax.XHTMLContentHandler
-
Uses of XHTMLContentHandler in org.apache.tika.parser.executable
Methods in org.apache.tika.parser.executable with parameters of type XHTMLContentHandler
Modifier and Type Method Description
void
ExecutableParser.parseELF(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4)
Parses a Unix ELF file.
void
ExecutableParser.parsePE(XHTMLContentHandler xhtml, Metadata metadata, InputStream stream, byte[] first4)
Parses a DOS or Windows PE file.
-
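Both ExecutableParser methods in the executable package take a byte[] first4 argument, which suggests the caller has already read the leading bytes before choosing between the ELF and PE paths. The following is a minimal sketch of that kind of magic-number dispatch, not Tika's actual code: MagicSniffer, Format, readFirst4 and sniff are hypothetical names introduced only for illustration. The magic values themselves (0x7F 'E' 'L' 'F' for ELF, 'M' 'Z' for DOS and PE) come from the file formats.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical helper, not part of Tika: picks a format from the first four bytes. */
final class MagicSniffer {

    enum Format { ELF, PE_OR_DOS, UNKNOWN }

    /** Reads the first four bytes, or returns null if the stream is shorter (Java 11+). */
    static byte[] readFirst4(InputStream stream) throws IOException {
        byte[] first4 = stream.readNBytes(4);
        return first4.length == 4 ? first4 : null;
    }

    /** Classifies the leading bytes; anything unrecognised is UNKNOWN. */
    static Format sniff(byte[] first4) {
        if (first4 == null || first4.length < 4) {
            return Format.UNKNOWN;
        }
        // ELF files start with 0x7F 'E' 'L' 'F'.
        if (first4[0] == 0x7F && first4[1] == 'E' && first4[2] == 'L' && first4[3] == 'F') {
            return Format.ELF;
        }
        // DOS and PE executables start with the two-byte 'M' 'Z' header.
        if (first4[0] == 'M' && first4[1] == 'Z') {
            return Format.PE_OR_DOS;
        }
        return Format.UNKNOWN;
    }

    public static void main(String[] args) throws IOException {
        byte[] sample = {0x7F, 'E', 'L', 'F', 2, 1, 1, 0};
        System.out.println(sniff(readFirst4(new ByteArrayInputStream(sample))));
    }
}
```

In a real parser the sniffed bytes would be handed on together with the rest of the stream, which is presumably why first4 appears as a separate parameter next to stream.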
Uses of XHTMLContentHandler in org.apache.tika.parser.hwp
Methods in org.apache.tika.parser.hwp with parameters of type XHTMLContentHandler
Modifier and Type Method Description
void
HwpTextExtractorV5.extract(InputStream source, Metadata metadata, XHTMLContentHandler xhtml)
Extracts text from an HWP stream.
-
Uses of XHTMLContentHandler in org.apache.tika.parser.isatab
Methods in org.apache.tika.parser.isatab with parameters of type XHTMLContentHandler
Modifier and Type Method Description
static void
ISATabUtils.parseAssay(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context)
static void
ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context)
static void
ISATabUtils.parseInvestigation(InputStream stream, XHTMLContentHandler handler, Metadata metadata, ParseContext context, String studyFileName)
static void
ISATabUtils.parseStudy(InputStream stream, XHTMLContentHandler xhtml, Metadata metadata, ParseContext context)
-
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft
Methods in org.apache.tika.parser.microsoft with parameters of type XHTMLContentHandler
Modifier and Type Method Description
static void
FormattingUtils.closeStyleTags(XHTMLContentHandler xhtml, Deque<FormattingUtils.Tag> formattingState)
Closes all formatting tags.
static void
FormattingUtils.ensureFormattingState(XHTMLContentHandler xhtml, EnumSet<FormattingUtils.Tag> desired, Deque<FormattingUtils.Tag> currentState)
Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state.
protected void
ExcelExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml, Locale locale)
protected void
ExcelExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml, Locale locale)
Extracts text from an Excel Workbook, writing the extracted content to the specified Appendable.
protected void
HSLFExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml)
protected void
HSLFExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml)
protected void
OfficeParser.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, ParseContext context, Metadata metadata, XHTMLContentHandler xhtml)
protected static void
OldExcelParser.parse(org.apache.poi.hssf.extractor.OldExcelExtractor extractor, XHTMLContentHandler xhtml)
void
OutlookExtractor.parse(XHTMLContentHandler xhtml, Metadata metadata)
protected void
WordExtractor.parse(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml)
protected void
WordExtractor.parse(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml)
protected void
WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.DirectoryNode root, XHTMLContentHandler xhtml)
protected void
WordExtractor.parseWord6(org.apache.poi.poifs.filesystem.POIFSFileSystem filesystem, XHTMLContentHandler xhtml)
void
Cell.render(XHTMLContentHandler handler)
Renders the content to the given XHTML SAX event stream.
void
CellDecorator.render(XHTMLContentHandler handler)
void
LinkedCell.render(XHTMLContentHandler handler)
void
NumberCell.render(XHTMLContentHandler handler)
void
TextCell.render(XHTMLContentHandler handler)
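The description of FormattingUtils.ensureFormattingState in this package outlines a two-step stack algorithm: close tags until only desired ones remain open, then open whatever is still missing. Below is a minimal sketch of that idea under stated assumptions. It does not use Tika: FormattingStateSketch, its Tag enum and ensureState are hypothetical stand-ins, the handler is replaced by a list of recorded events, and the stack is assumed to hold each tag at most once.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.EnumSet;
import java.util.List;

/** Hypothetical illustration of the close-then-open algorithm; not Tika code. */
final class FormattingStateSketch {

    enum Tag { B, I, U } // stand-in for FormattingUtils.Tag

    /** Mutates current (innermost tag on top) and returns the emitted events. */
    static List<String> ensureState(EnumSet<Tag> desired, Deque<Tag> current) {
        List<String> events = new ArrayList<>();

        // Tags that are open now but must not stay open.
        EnumSet<Tag> undesired = EnumSet.noneOf(Tag.class);
        undesired.addAll(current);
        undesired.removeAll(desired);

        // Step 1: close innermost tags until no undesired tag is left open.
        // Desired tags nested inside an undesired one are closed as well,
        // which keeps the output properly nested.
        while (!undesired.isEmpty()) {
            Tag closed = current.pop();
            events.add("</" + closed + ">");
            undesired.remove(closed);
        }

        // Step 2: open every desired tag that is not open (or was just closed).
        for (Tag tag : desired) {
            if (!current.contains(tag)) {
                current.push(tag);
                events.add("<" + tag + ">");
            }
        }
        return events;
    }

    public static void main(String[] args) {
        Deque<Tag> current = new ArrayDeque<>();
        current.push(Tag.B); // outer tag
        current.push(Tag.I); // inner tag
        System.out.println(ensureState(EnumSet.of(Tag.I), current));
    }
}
```

Closing through the stack rather than removing the unwanted tag in place is what keeps the emitted markup well-formed: an inner tag cannot stay open once its enclosing tag has been closed.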
-
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml
Methods in org.apache.tika.parser.microsoft.ooxml with parameters of type XHTMLContentHandler
Modifier and Type Method Description
protected abstract void
AbstractOOXMLExtractor.buildXHTML(XHTMLContentHandler xhtml)
Populates the XHTMLContentHandler object received as parameter.
protected void
POIXMLTextExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
SXSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
SXWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
XSLFPowerPointExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
XSSFBExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
XSSFExcelExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
XWPFWordExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
protected void
XSSFBExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml)
protected void
XSSFExcelExtractorDecorator.extractHeaderFooter(String hf, XHTMLContentHandler xhtml)
protected void
XSSFExcelExtractorDecorator.extractHyperLinks(org.apache.poi.openxml4j.opc.PackagePart sheetPart, XHTMLContentHandler xhtml)
protected void
XSSFExcelExtractorDecorator.processShapes(List<org.apache.poi.xssf.usermodel.XSSFShape> shapes, XHTMLContentHandler xhtml)
Constructors in org.apache.tika.parser.microsoft.ooxml with parameters of type XHTMLContentHandler
Constructor Description
OOXMLTikaBodyPartHandler(XHTMLContentHandler xhtml)
OOXMLTikaBodyPartHandler(XHTMLContentHandler xhtml, XWPFStylesShim styles, XWPFListManager listManager, OfficeParserConfig parserConfig)
SheetTextAsHTML(OfficeParserConfig config, XHTMLContentHandler xhtml)
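AbstractOOXMLExtractor.buildXHTML is abstract and is overridden by each format-specific decorator listed in this package, which reads like a template-method arrangement: a base class owns the document envelope and subclasses only populate the handler they are given. The sketch below illustrates that pattern with the JDK's own SAX ContentHandler instead of XHTMLContentHandler. ExtractorSketch, TwoParagraphExtractor, TextCollector and their methods are hypothetical names, and the envelope shown is an assumption, not a description of what Tika emits.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical base class: owns the document envelope, delegates the body. */
abstract class ExtractorSketch {
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    /** Template method: start the document, let the subclass fill it, end it. */
    final void writeDocument(ContentHandler handler) throws SAXException {
        handler.startDocument();
        handler.startElement(XHTML, "body", "body", new AttributesImpl());
        buildBody(handler); // plays the role of buildXHTML in this sketch
        handler.endElement(XHTML, "body", "body");
        handler.endDocument();
    }

    /** Subclasses emit only the content events, never the envelope. */
    protected abstract void buildBody(ContentHandler handler) throws SAXException;

    /** Emits one paragraph element containing the given text. */
    static void paragraph(ContentHandler handler, String text) throws SAXException {
        handler.startElement(XHTML, "p", "p", new AttributesImpl());
        handler.characters(text.toCharArray(), 0, text.length());
        handler.endElement(XHTML, "p", "p");
    }

    /** Convenience for the example: runs an extractor and returns its plain text. */
    static String render(ExtractorSketch extractor) {
        TextCollector collector = new TextCollector();
        try {
            extractor.writeDocument(collector);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return collector.text.toString();
    }

    public static void main(String[] args) {
        System.out.print(render(new TwoParagraphExtractor()));
    }
}

/** Hypothetical format-specific subclass. */
final class TwoParagraphExtractor extends ExtractorSketch {
    @Override
    protected void buildBody(ContentHandler handler) throws SAXException {
        paragraph(handler, "first");
        paragraph(handler, "second");
    }
}

/** Collects character data and adds a newline after each paragraph. */
final class TextCollector extends DefaultHandler {
    final StringBuilder text = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("p".equals(localName)) {
            text.append('\n');
        }
    }
}
```

Keeping startDocument and endDocument in the base class means a subclass cannot accidentally open a second document on the same handler.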
-
Uses of XHTMLContentHandler in org.apache.tika.parser.microsoft.ooxml.xps
Methods in org.apache.tika.parser.microsoft.ooxml.xps with parameters of type XHTMLContentHandler
Modifier and Type Method Description
protected void
XPSExtractorDecorator.buildXHTML(XHTMLContentHandler xhtml)
-
Uses of XHTMLContentHandler in org.apache.tika.parser.ocr
Methods in org.apache.tika.parser.ocr with parameters of type XHTMLContentHandler
Modifier and Type Method Description
void
TesseractOCRParser.parseInline(InputStream stream, XHTMLContentHandler xhtml, TesseractOCRConfig config)
void
TesseractOCRParser.parseInline(InputStream stream, XHTMLContentHandler xhtml, ParseContext parseContext, TesseractOCRConfig config)
Use this to parse content without starting a new document.
-
Uses of XHTMLContentHandler in org.apache.tika.parser.pkg
Methods in org.apache.tika.parser.pkg with parameters of type XHTMLContentHandler
Modifier and Type Method Description
protected static Metadata
PackageParser.handleEntryMetadata(String name, Date createAt, Date modifiedAt, Long size, XHTMLContentHandler xhtml)
-