Class PDFParserConfig.OCRStrategyAuto

  • All Implemented Interfaces:
    Serializable
    Enclosing class:
    PDFParserConfig

    public static class PDFParserConfig.OCRStrategyAuto
    extends Object
    implements Serializable
    Encapsulates the thresholds used to control the OCR strategy when it is set to auto.

    OCR is performed on a page if the total number of characters on the page is < this.totalCharsPerPage, or if the number of unmapped Unicode characters on the page is > this.unmappedUnicodeCharsPerPage.

    If unmappedUnicodeCharsPerPage is an integer > 0, it is compared against the absolute number of unmapped characters on the page. If it is a float < 1, it is treated as a fraction and compared against the ratio unmappedCharactersPerPage/totalCharsPerPage.
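
    The rule described above can be sketched as a standalone check. This is a minimal illustration, not the parser's actual code: the class OcrAutoSketch and the method shouldOcr are hypothetical names, and both the handling of a threshold of exactly 1 (treated as an absolute count) and of an empty page in the fractional branch are assumptions.

```java
public class OcrAutoSketch {

    /**
     * Hypothetical helper mirroring the documented rule.
     * unmappedThreshold plays the role of unmappedUnicodeCharsPerPage,
     * totalCharsThreshold the role of totalCharsPerPage.
     */
    static boolean shouldOcr(int totalChars, int unmappedChars,
                             float unmappedThreshold, int totalCharsThreshold) {
        // Too little extractable text on the page: run OCR.
        if (totalChars < totalCharsThreshold) {
            return true;
        }
        if (unmappedThreshold < 1.0f) {
            // Assumption: values below 1 are a fraction of the page's characters.
            if (totalChars == 0) {
                return false; // assumption: avoid dividing by zero on an empty page
            }
            float fraction = (float) unmappedChars / totalChars;
            return fraction > unmappedThreshold;
        }
        // Assumption: values of 1 or more are an absolute character count.
        return unmappedChars > unmappedThreshold;
    }

    public static void main(String[] args) {
        System.out.println(shouldOcr(5, 0, 10f, 10));     // true: fewer than 10 characters in total
        System.out.println(shouldOcr(100, 20, 0.1f, 10)); // true: 20% unmapped exceeds 10%
        System.out.println(shouldOcr(100, 5, 0.1f, 10));  // false: 5% unmapped is within 10%
        System.out.println(shouldOcr(100, 11, 10f, 10));  // true: 11 unmapped exceeds 10
    }
}
```

    Because a single float carries both meanings, a value such as 10 means "more than 10 unmapped characters" while 0.1 means "more than 10% unmapped"; callers may want to choose one convention and document it alongside their configuration.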

    See Also:
    Serialized Form
    • Constructor Detail

      • OCRStrategyAuto

        public OCRStrategyAuto​(float unmappedUnicodeCharsPerPage,
                               int totalCharsPerPage)
    • Method Detail

      • getUnmappedUnicodeCharsPerPage

        public float getUnmappedUnicodeCharsPerPage()
      • getTotalCharsPerPage

        public int getTotalCharsPerPage()