public class PDFParserConfig extends Object implements Serializable
Constructor and Description |
---|
PDFParserConfig() |
PDFParserConfig(InputStream is): Loads properties from InputStream and then tries to close InputStream. |
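The documented behaviour of the stream-based constructor (load properties, then try to close the stream) can be sketched with plain `java.util.Properties`. This is a stand-alone illustration, not Tika's source: the class `ConfigLoadSketch`, the helpers `loadFlags` and `flag`, and the property keys `sortByPosition` and `enableAutoSpace` are assumptions chosen to mirror the setter names, and falling back to defaults when a read fails is likewise an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

public class ConfigLoadSketch {

    /** Hypothetical helper: reads key=value pairs from the stream, then tries to close it. */
    static Properties loadFlags(InputStream is) {
        Properties props = new Properties();
        try {
            props.load(is);
        } catch (IOException e) {
            // Assumption: on a read failure, keep what was loaded and rely on defaults.
        } finally {
            try {
                is.close();
            } catch (IOException e) {
                // Closing is best effort only.
            }
        }
        return props;
    }

    /** Returns the boolean stored under key, or defaultValue when the key is absent. */
    static boolean flag(Properties props, String key, boolean defaultValue) {
        String raw = props.getProperty(key);
        return raw == null ? defaultValue : Boolean.parseBoolean(raw.trim());
    }

    public static void main(String[] args) {
        String text = "sortByPosition=true\n";
        InputStream is = new ByteArrayInputStream(text.getBytes(StandardCharsets.ISO_8859_1));
        Properties props = loadFlags(is);
        // Key present in the stream: the loaded value is used.
        System.out.println(flag(props, "sortByPosition", false));
        // Key absent from the stream: the supplied default is used.
        System.out.println(flag(props, "enableAutoSpace", true));
    }
}
```

Because the stream is closed inside the helper, callers in this sketch should not reuse it afterwards.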
Modifier and Type | Method and Description |
---|---|
boolean | equals(Object obj) |
boolean | getEnableAutoSpace() |
boolean | getExtractAcroFormContent() |
boolean | getExtractAnnotationText() |
boolean | getSortByPosition() |
boolean | getSuppressDuplicateOverlappingText() |
boolean | getUseNonSequentialParser() |
int | hashCode() |
void | setEnableAutoSpace(boolean enableAutoSpace): If true (the default), the parser should estimate where spaces should be inserted between words. |
void | setExtractAcroFormContent(boolean extractAcroFormContent): If true (the default), extract content from AcroForms at the end of the document. |
void | setExtractAnnotationText(boolean extractAnnotationText): If true (the default), text in annotations will be extracted. |
void | setSortByPosition(boolean sortByPosition): If true, sort text tokens by their x/y position before extracting text. |
void | setSuppressDuplicateOverlappingText(boolean suppressDuplicateOverlappingText): If true, the parser should try to remove duplicated text over the same region. |
void | setUseNonSequentialParser(boolean useNonSequentialParser): If true, uses PDFBox's non-sequential parser. |
String | toString() |
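public PDFParserConfig()
public PDFParserConfig(InputStream is)
Parameters: is - the InputStream to load properties from
public void setExtractAcroFormContent(boolean extractAcroFormContent)
Parameters: extractAcroFormContent - if true, extract content from AcroForms at the end of the document
public boolean getExtractAcroFormContent()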
See Also: setExtractAcroFormContent(boolean)
public boolean getEnableAutoSpace()
See Also: setEnableAutoSpace(boolean)
public void setEnableAutoSpace(boolean enableAutoSpace)
public boolean getSuppressDuplicateOverlappingText()
public void setSuppressDuplicateOverlappingText(boolean suppressDuplicateOverlappingText)
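To make the idea of removing duplicated text over the same region concrete, here is a minimal sketch. It is not the algorithm PDFBox or Tika uses: the `Glyph` type, the `suppress` method, and the `tolerance` parameter are hypothetical, and the sketch simply drops a glyph when an identical string has already been kept at nearly the same coordinates.

```java
import java.util.ArrayList;
import java.util.List;

public class DuplicateTextSketch {

    /** Hypothetical piece of text with the x/y position where it is drawn. */
    static final class Glyph {
        final String text;
        final float x;
        final float y;

        Glyph(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Keeps the first glyph of each group that shares text and lies within tolerance. */
    static List<Glyph> suppress(List<Glyph> glyphs, float tolerance) {
        List<Glyph> kept = new ArrayList<>();
        for (Glyph g : glyphs) {
            boolean duplicate = false;
            for (Glyph k : kept) {
                if (k.text.equals(g.text)
                        && Math.abs(k.x - g.x) <= tolerance
                        && Math.abs(k.y - g.y) <= tolerance) {
                    duplicate = true;
                    break;
                }
            }
            if (!duplicate) {
                kept.add(g);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Glyph> glyphs = new ArrayList<>();
        glyphs.add(new Glyph("A", 10.0f, 10.0f));
        glyphs.add(new Glyph("A", 10.2f, 10.0f)); // same letter redrawn with a tiny offset
        glyphs.add(new Glyph("B", 20.0f, 10.0f));
        System.out.println(suppress(glyphs, 0.5f).size());
    }
}
```

The pairwise scan is quadratic in the number of glyphs, which hints at why this kind of filtering can be costly on large pages.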
public boolean getExtractAnnotationText()
See Also: setExtractAnnotationText(boolean)
public void setExtractAnnotationText(boolean extractAnnotationText)
public boolean getSortByPosition()
See Also: setSortByPosition(boolean)
public void setSortByPosition(boolean sortByPosition)
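What sorting text tokens by x/y position means for extraction order can be illustrated with a small sketch. The `Token` type, the `extract` method, and the assumption that y grows downward are all hypothetical; the actual ordering rules inside PDFBox may differ.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SortByPositionSketch {

    /** Hypothetical text token: a string plus the x/y coordinates of its origin. */
    static final class Token {
        final String text;
        final float x;
        final float y;

        Token(String text, float x, float y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    /** Joins token texts with spaces, optionally ordering them by y and then by x first. */
    static String extract(List<Token> tokens, boolean sortByPosition) {
        List<Token> ordered = new ArrayList<>(tokens);
        if (sortByPosition) {
            // Assumption: y grows downward, so a smaller y is higher on the page.
            ordered.sort(Comparator.<Token>comparingDouble(t -> t.y)
                    .thenComparingDouble(t -> t.x));
        }
        StringBuilder sb = new StringBuilder();
        for (Token t : ordered) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(t.text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Tokens listed in an arbitrary stream order, not reading order.
        List<Token> tokens = new ArrayList<>();
        tokens.add(new Token("world", 60f, 10f));
        tokens.add(new Token("line", 60f, 30f));
        tokens.add(new Token("Hello", 10f, 10f));
        tokens.add(new Token("Second", 10f, 30f));
        System.out.println(extract(tokens, false)); // prints "world line Hello Second"
        System.out.println(extract(tokens, true));  // prints "Hello world Second line"
    }
}
```

With the flag off, the text comes out in stream order; with it on, the same tokens read top to bottom and left to right.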
public boolean getUseNonSequentialParser()
See Also: setUseNonSequentialParser(boolean)
public void setUseNonSequentialParser(boolean useNonSequentialParser)
Default is false (use the traditional parser).
Parameters: useNonSequentialParser - if true, use PDFBox's non-sequential parser
Copyright © 2007-2014 The Apache Software Foundation. All Rights Reserved.