public class StandardsExtractingContentHandler extends ContentHandlerDecorator
This handler relies on complex regular expressions which can be slow on some types of input data.
| Modifier and Type | Field and Description |
|---|---|
| static String | STANDARD_REFERENCES |

| Modifier | Constructor and Description |
|---|---|
| protected | StandardsExtractingContentHandler() - Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events. |
| public | StandardsExtractingContentHandler(ContentHandler handler, Metadata metadata) - Creates a decorator for the given SAX event handler and Metadata object. |

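
The forwarding behaviour described for the two constructors can be illustrated with a small JDK-only sketch. It is not Tika's ContentHandlerDecorator: the class name `ForwardingHandlerSketch`, the `setDelegate` method, and the `demo` helper are hypothetical names chosen for illustration. The sketch assumes that the dummy target is a plain `org.xml.sax.helpers.DefaultHandler`, which ignores every event.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for a decorating handler; this is not Tika's ContentHandlerDecorator.
class ForwardingHandlerSketch extends DefaultHandler {
    // DefaultHandler ignores every event, so it serves as the dummy target.
    private ContentHandler delegate = new DefaultHandler();

    // Mirrors the no-argument constructor: events go to the dummy handler.
    ForwardingHandlerSketch() {
    }

    // Mirrors the decorating constructor: events go to the given handler.
    ForwardingHandlerSketch(ContentHandler delegate) {
        this.delegate = delegate;
    }

    // Switches to a more useful underlying handler after construction.
    void setDelegate(ContentHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        delegate.characters(ch, start, length);
    }

    @Override
    public void endDocument() throws SAXException {
        delegate.endDocument();
    }

    // Feeds text before and after the delegate is switched and returns what the recorder saw.
    static String demo() {
        StringBuilder seen = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                seen.append(ch, start, length);
            }
        };
        ForwardingHandlerSketch handler = new ForwardingHandlerSketch();
        try {
            char[] first = "dropped ".toCharArray();
            handler.characters(first, 0, first.length);   // goes to the dummy handler
            handler.setDelegate(recorder);
            char[] second = "ISO 9001".toCharArray();
            handler.characters(second, 0, second.length); // reaches the recorder
            handler.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return seen.toString();
    }
}
```

In this sketch, text sent before `setDelegate` is silently discarded, which is why a subclass that uses the no-argument constructor would normally switch to a real handler before parsing starts.
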
| Modifier and Type | Method and Description |
|---|---|
| void | characters(char[] ch, int start, int length) - The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler. |
| void | endDocument() - This method is called whenever the Parser is done parsing the file. |
| double | getThreshold() - Gets the threshold to be used for selecting the standard references found within the text based on their score. |
| void | setMaxBufferLength(int maxBufferLength) - The number of characters to store in memory for checking for standards. |
| void | setThreshold(double score) - Sets the score to be used as threshold. |

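
The way these methods fit together can be sketched with the JDK alone: text arrives through `characters`, is kept in memory up to a maximum buffer length, and is scanned once in `endDocument`, where only matches that reach the threshold are kept. Everything in this sketch is an assumption made for illustration: the class `BufferedReferenceSketch`, the simple pattern, the `placeholderScore` scoring, and the `>=` comparison are hypothetical and are much simpler than the regular expressions and scoring that the real handler relies on.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch; the pattern and the scoring are placeholders, not Tika's.
class BufferedReferenceSketch extends DefaultHandler {
    // Placeholder pattern: an upper-case acronym followed by an identifier number.
    private static final Pattern REFERENCE =
            Pattern.compile("\\b([A-Z]{2,5})\\s(\\d{2,6})\\b");
    // Placeholder list of organisations that receive a high score.
    private static final List<String> KNOWN_ORGS = Arrays.asList("ISO", "IEEE", "ANSI");

    private final StringBuilder buffer = new StringBuilder();
    private final List<String> references = new ArrayList<>();
    private int maxBufferLength = 100_000;
    private double threshold = 0.5;

    void setMaxBufferLength(int maxBufferLength) {
        this.maxBufferLength = maxBufferLength;
    }

    void setThreshold(double score) {
        this.threshold = score;
    }

    double getThreshold() {
        return threshold;
    }

    List<String> getReferences() {
        return references;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Keep at most maxBufferLength characters in memory; drop the rest.
        int room = maxBufferLength - buffer.length();
        if (room > 0) {
            buffer.append(ch, start, Math.min(room, length));
        }
    }

    @Override
    public void endDocument() {
        // Scan the buffered text once and keep matches that reach the threshold.
        Matcher matcher = REFERENCE.matcher(buffer);
        while (matcher.find()) {
            if (placeholderScore(matcher.group(1)) >= threshold) {
                references.add(matcher.group());
            }
        }
    }

    // Placeholder scoring: known organisations score 1.0, anything else 0.25.
    private static double placeholderScore(String organisation) {
        return KNOWN_ORGS.contains(organisation) ? 1.0 : 0.25;
    }

    // Convenience helper that runs the whole cycle on a plain string.
    static List<String> extract(String text, double threshold, int maxBufferLength) {
        BufferedReferenceSketch handler = new BufferedReferenceSketch();
        handler.setThreshold(threshold);
        handler.setMaxBufferLength(maxBufferLength);
        char[] chars = text.toCharArray();
        handler.characters(chars, 0, chars.length);
        handler.endDocument();
        return handler.getReferences();
    }
}
```

With the input "Complies with ISO 9001 and refers to ABC 1234." and a threshold of 0.5, `extract` keeps only "ISO 9001". Lowering the threshold to 0.2 also keeps "ABC 1234", and a buffer length of 10 truncates the text before any reference is seen.
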
Methods inherited from class ContentHandlerDecorator:
endElement, endPrefixMapping, handleException, ignorableWhitespace, processingInstruction, setContentHandler, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, toString

Methods inherited from class DefaultHandler:
error, fatalError, notationDecl, resolveEntity, unparsedEntityDecl, warning

public static final String STANDARD_REFERENCES

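public StandardsExtractingContentHandler(ContentHandler handler, Metadata metadata)

Creates a decorator for the given SAX event handler and Metadata object.

Parameters:
handler - SAX event handler to be decorated.
metadata - Metadata object.

protected StandardsExtractingContentHandler()

Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events. Subclasses should use the ContentHandlerDecorator.setContentHandler(ContentHandler) method to switch to a more usable underlying content handler. Also creates a dummy Metadata object to store standard references in.

public double getThreshold()

Gets the threshold to be used for selecting the standard references found within the text based on their score.
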
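public void setThreshold(double score)

Sets the score to be used as threshold.

Parameters:
score - the score to be used as threshold.

public void characters(char[] ch, int start, int length) throws SAXException

The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler.

Specified by:
characters in interface ContentHandler
Overrides:
characters in class ContentHandlerDecorator
Throws:
SAXException

public void endDocument() throws SAXException

This method is called whenever the Parser is done parsing the file.

Specified by:
endDocument in interface ContentHandler
Overrides:
endDocument in class ContentHandlerDecorator
Throws:
SAXException

public void setMaxBufferLength(int maxBufferLength)

The number of characters to store in memory for checking for standards.
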
Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.