public class StandardsExtractingContentHandler extends ContentHandlerDecorator
This handler relies on complex regular expressions which can be slow on some types of input data.
Modifier and Type | Field and Description |
---|---|
static String | STANDARD_REFERENCES |
Modifier | Constructor and Description |
---|---|
protected | StandardsExtractingContentHandler() Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events. |
public | StandardsExtractingContentHandler(ContentHandler handler, Metadata metadata) Creates a decorator for the given SAX event handler and Metadata object. |
Modifier and Type | Method and Description |
---|---|
void | characters(char[] ch, int start, int length) The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler. |
void | endDocument() This method is called whenever the Parser is done parsing the file. |
double | getThreshold() Gets the threshold to be used for selecting the standard references found within the text based on their score. |
void | setMaxBufferLength(int maxBufferLength) The number of characters to store in memory for checking for standards. |
void | setThreshold(double score) Sets the score to be used as threshold. |
Methods inherited from class ContentHandlerDecorator: endElement, endPrefixMapping, handleException, ignorableWhitespace, processingInstruction, setContentHandler, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, toString
Methods inherited from class DefaultHandler: error, fatalError, notationDecl, resolveEntity, unparsedEntityDecl, warning
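The decorator idea behind the constructors and the overridden `characters` and `endDocument` methods can be illustrated with a minimal sketch that uses only the JDK's SAX classes. `TextCapturingDecorator` and `captureSlice` are hypothetical names introduced for this illustration; this is not Tika's `ContentHandlerDecorator`, only an assumption-laden picture of the pattern: keep a copy of the incoming text, then forward every event to the wrapped handler.

```java
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical JDK-only decorator sketch; not Tika's ContentHandlerDecorator. */
final class TextCapturingDecorator extends DefaultHandler {
    private final ContentHandler delegate;
    private final StringBuilder text = new StringBuilder();

    TextCapturingDecorator(ContentHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        text.append(ch, start, length);         // keep a copy for later analysis
        delegate.characters(ch, start, length); // forward the event unchanged
    }

    @Override
    public void endDocument() throws SAXException {
        delegate.endDocument();                 // forward the end-of-document event
    }

    String capturedText() {
        return text.toString();
    }

    /** Convenience helper: feeds one slice of the input through the decorator. */
    static String captureSlice(String input, int start, int length) {
        // A plain DefaultHandler ignores every event, much like a dummy handler.
        TextCapturingDecorator decorator = new TextCapturingDecorator(new DefaultHandler());
        try {
            decorator.startDocument();
            decorator.characters(input.toCharArray(), start, length);
            decorator.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return decorator.capturedText();
    }

    public static void main(String[] args) {
        // Characters 4..11 of the input are "ISO 9001".
        System.out.println(captureSlice("See ISO 9001 for details.", 4, 8));
    }
}
```

Wrapping a `DefaultHandler` here plays the role of the dummy handler mentioned in the constructor summary: the decorator still sees all the text even though the wrapped handler discards it.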
public static final String STANDARD_REFERENCES
public StandardsExtractingContentHandler(ContentHandler handler, Metadata metadata)
Parameters:
handler - SAX event handler to be decorated.
metadata - Metadata object.
protected StandardsExtractingContentHandler()
Use the ContentHandlerDecorator.setContentHandler(ContentHandler) method to switch to a more usable underlying content handler. Also creates a dummy Metadata object to store the extracted standard references in.
public double getThreshold()
public void setThreshold(double score)
Parameters:
score - the score to be used as threshold.
public void characters(char[] ch, int start, int length) throws SAXException
Specified by: characters in interface ContentHandler
Overrides: characters in class ContentHandlerDecorator
Throws: SAXException
public void endDocument() throws SAXException
Specified by: endDocument in interface ContentHandler
Overrides: endDocument in class ContentHandlerDecorator
Throws: SAXException
public void setMaxBufferLength(int maxBufferLength)
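How a buffer limit and a score threshold might interact can be sketched without Tika at all. Everything in the block below is an assumption made for illustration: the class name `BoundedStandardsSketch`, the simple acronym-plus-number pattern, the toy scores (0.75 for a known organization, 0.25 otherwise), and the choice to keep matches scoring at or above the threshold. It is not the regular expression or scoring that this handler actually uses. The point is only that text beyond the buffer limit is never examined, which bounds the cost of the regular expression, and that the threshold filters the matches that remain.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; the pattern and the scores are illustrative, not Tika's. */
final class BoundedStandardsSketch {
    // Toy pattern: an upper-case acronym followed by a (possibly dotted) number.
    private static final Pattern REFERENCE =
            Pattern.compile("\\b([A-Z]{2,})\\s+(\\d+(?:\\.\\d+)*)");
    // Assumed list of known organizations, for the toy score only.
    private static final Set<String> KNOWN = Set.of("ISO", "IEEE", "ANSI");

    private final StringBuilder buffer = new StringBuilder();
    private int maxBufferLength = 100_000;
    private double threshold = 0.5;

    void setMaxBufferLength(int maxBufferLength) {
        this.maxBufferLength = maxBufferLength;
    }

    void setThreshold(double score) {
        this.threshold = score;
    }

    /** Buffers incoming text, silently dropping anything past the limit. */
    void characters(char[] ch, int start, int length) {
        int room = maxBufferLength - buffer.length();
        if (room > 0) {
            buffer.append(ch, start, Math.min(length, room));
        }
    }

    /** Runs the pattern once over the buffered text and applies the threshold. */
    List<String> endDocument() {
        List<String> kept = new ArrayList<>();
        Matcher m = REFERENCE.matcher(buffer);
        while (m.find()) {
            double score = KNOWN.contains(m.group(1)) ? 0.75 : 0.25; // toy score
            if (score >= threshold) {  // assumption: inclusive comparison
                kept.add(m.group());
            }
        }
        return kept;
    }

    /** Convenience helper that runs the whole flow on one string. */
    static List<String> extract(String text, int maxBufferLength, double threshold) {
        BoundedStandardsSketch sketch = new BoundedStandardsSketch();
        sketch.setMaxBufferLength(maxBufferLength);
        sketch.setThreshold(threshold);
        char[] chars = text.toCharArray();
        sketch.characters(chars, 0, chars.length);
        return sketch.endDocument();
    }

    public static void main(String[] args) {
        String text = "Complies with ISO 9001 and PAGE 12.";
        System.out.println(extract(text, 1000, 0.5)); // prints [ISO 9001]
        System.out.println(extract(text, 13, 0.1));   // prints []
    }
}
```

In the second call the limit of 13 characters keeps only "Complies with", so neither reference is seen. A limit that is too small can therefore hide references, while a larger one gives the pattern more text to scan.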
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.