Class | Description |
---|---|
AccessChecker | Checks whether a document allows extraction generally or extraction for accessibility only. |
NoTextPDFRenderer | Extends the PDFRenderer to exclude rendering of electronic text. |
PDFMarkedContent2XHTML | Added in Tika 1.24 as an alpha version of a text extractor that builds the text from the marked text tree and includes/normalizes some of the structural tags. |
PDFParser | PDF parser. |
PDFParserConfig | Config for PDFParser. |
PDFParserConfig.OCRStrategyAuto | Encapsulates the numbers used to control the OCR strategy when it is set to auto. |
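
As a rough illustration of how the classes above fit together, the following minimal sketch configures a `PDFParserConfig`, passes it to `PDFParser` through a `ParseContext`, and prints the extracted text. It assumes the Tika core and PDF parser modules are on the classpath and that the path to a PDF is given as the first command-line argument; the class name `PdfTextExample` and the particular settings chosen are illustrative only, not recommended defaults.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.parser.pdf.PDFParserConfig;
import org.apache.tika.sax.BodyContentHandler;

// Hypothetical example class; not part of Tika.
public class PdfTextExample {
    public static void main(String[] args) throws Exception {
        // Assumption: the first argument is the path to a PDF file.
        String path = args[0];

        // Per-parse configuration: skip OCR and order text by position.
        PDFParserConfig config = new PDFParserConfig();
        config.setOcrStrategy(PDFParserConfig.OCR_STRATEGY.NO_OCR);
        config.setSortByPosition(true);

        // The parser looks up its config in the ParseContext by class.
        ParseContext context = new ParseContext();
        context.set(PDFParserConfig.class, config);

        // A write limit of -1 disables the handler's character limit.
        BodyContentHandler handler = new BodyContentHandler(-1);
        Metadata metadata = new Metadata();

        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            new PDFParser().parse(in, handler, metadata, context);
        }

        System.out.println(handler.toString());
    }
}
```

Supplying the config through the `ParseContext` keeps the settings scoped to a single parse call, so one `PDFParser` instance could be reused with different configurations.
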

Enum | Description |
---|---|
PDFParserConfig.OCR_RENDERING_STRATEGY | |
PDFParserConfig.OCR_STRATEGY | |

Copyright © 2007–2021 The Apache Software Foundation. All rights reserved.