Package org.apache.tika.parser.pdf
Class PDFParserConfig.OCRStrategyAuto
- java.lang.Object
  - org.apache.tika.parser.pdf.PDFParserConfig.OCRStrategyAuto
- All Implemented Interfaces:
- Serializable
- Enclosing class:
- PDFParserConfig
public static class PDFParserConfig.OCRStrategyAuto extends Object implements Serializable
Encapsulates the thresholds used to control the OCR strategy when it is set to auto. OCR is performed on a page if the total number of characters on the page < this.totalCharsPerPage, or if the number of unmapped Unicode characters on the page > this.unmappedUnicodeCharsPerPage.
If unmappedUnicodeCharsPerPage is an integer > 0, it is compared against the absolute number of unmapped Unicode characters on the page. If it is a float < 1, it is treated as a percentage (expressed as a fraction) and compared against the ratio of unmapped Unicode characters to total characters on the page.
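The decision rule described above can be sketched in plain Java. This is a minimal illustration written only from the description on this page, not Tika's actual implementation: the class name OcrAutoSketch, the method shouldOcr, and its parameter names are hypothetical, and the handling of a page with zero characters in the fraction branch is an assumption.

```java
// Hypothetical stand-in for the rule described above; not part of Tika.
class OcrAutoSketch {

    /**
     * Returns true if, per the description, a page would be sent to OCR.
     *
     * @param totalChars          characters extracted from the page
     * @param unmappedChars       unmapped Unicode characters on the page
     * @param unmappedThreshold   plays the role of unmappedUnicodeCharsPerPage
     * @param totalCharsThreshold plays the role of totalCharsPerPage
     */
    static boolean shouldOcr(int totalChars, int unmappedChars,
                             float unmappedThreshold, int totalCharsThreshold) {
        // Too little extractable text on the page: run OCR.
        if (totalChars < totalCharsThreshold) {
            return true;
        }
        if (unmappedThreshold < 1.0f) {
            // Threshold below 1 is read as a fraction of the page's characters.
            if (totalChars == 0) {
                return false; // assumption: no characters means no ratio to compare
            }
            float ratio = (float) unmappedChars / totalChars;
            return ratio > unmappedThreshold;
        }
        // Otherwise the threshold is an absolute character count.
        return unmappedChars > unmappedThreshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldOcr(5, 0, 10f, 10));     // true: too few characters
        System.out.println(shouldOcr(100, 20, 10f, 10));  // true: 20 unmapped > 10
        System.out.println(shouldOcr(100, 5, 10f, 10));   // false
        System.out.println(shouldOcr(100, 20, 0.1f, 10)); // true: 0.2 > 0.1
        System.out.println(shouldOcr(100, 5, 0.1f, 10));  // false: 0.05 <= 0.1
    }
}
```

The two float arguments 10f and 0.1f show the two readings of the same threshold: as an absolute count and as a fraction of the page's characters.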
- See Also:
- Serialized Form
Constructor Summary
Constructors
- OCRStrategyAuto(float unmappedUnicodeCharsPerPage, int totalCharsPerPage)
Method Summary
- int getTotalCharsPerPage()
- float getUnmappedUnicodeCharsPerPage()