Package org.apache.tika.sax
Class AbstractRecursiveParserWrapperHandler
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- All Implemented Interfaces:
- Serializable, ContentHandler, DTDHandler, EntityResolver, ErrorHandler
- Direct Known Subclasses:
- RecursiveParserWrapperHandler
public abstract class AbstractRecursiveParserWrapperHandler
extends DefaultHandler
implements Serializable
This is a special handler to be used only with the RecursiveParserWrapper. It allows for finer-grained processing of embedded documents than the legacy handlers do. Subclasses can choose how to process individual embedded documents.
Field Summary
- EMBEDDED_RESOURCE_LIMIT_REACHED
Constructor Summary
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources)
Method Summary
- void endDocument(ContentHandler contentHandler, Metadata metadata): This is called after the full parse has completed.
- void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata): This is called after parsing each embedded document.
- getContentHandlerFactory
- getNewContentHandler
- getNewContentHandler(OutputStream os, Charset charset)
- boolean hasHitMaximumEmbeddedResources()
- void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata): This is called before parsing each embedded document.
Methods inherited from class org.xml.sax.helpers.DefaultHandler:
characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning
Field Details
EMBEDDED_RESOURCE_LIMIT_REACHED
 
Constructor Details
AbstractRecursiveParserWrapperHandler
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
 
AbstractRecursiveParserWrapperHandler
public AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources)
 
Method Details
getNewContentHandler
 
getNewContentHandler
getNewContentHandler(OutputStream os, Charset charset)
 
startEmbeddedDocument
public void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called before parsing each embedded document. Override this for custom behavior. Make sure to call super.startEmbeddedDocument(...) in subclasses, because this method tracks the number of embedded documents.
- Parameters:
- contentHandler - local handler to be used on this embedded document
- metadata - embedded document's metadata
- Throws:
- SAXException
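 
The reason for delegating to the superclass can be illustrated with a minimal sketch. The classes below are hypothetical stand-ins written for this example, not Tika's own classes: `CountingBase` models only the counting described above, metadata is reduced to a `Map<String, String>`, and both the `>=` comparison and the "negative limit means unlimited" rule are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the abstract handler; only the counting is modelled.
abstract class CountingBase extends DefaultHandler {
    private final int maxEmbeddedResources;
    private int embeddedResources = 0;

    CountingBase(int maxEmbeddedResources) {
        this.maxEmbeddedResources = maxEmbeddedResources;
    }

    public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
            throws SAXException {
        embeddedResources++; // the bookkeeping a subclass must not skip
    }

    public boolean hasHitMaximumEmbeddedResources() {
        // Assumption: a negative limit means "unlimited".
        return maxEmbeddedResources > -1 && embeddedResources >= maxEmbeddedResources;
    }
}

// Overrides the hook and delegates to the superclass first.
class CallsSuper extends CountingBase {
    CallsSuper(int max) { super(max); }

    @Override
    public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
            throws SAXException {
        super.startEmbeddedDocument(handler, metadata);
        metadata.put("seen", "true"); // custom behaviour (illustrative key)
    }
}

// Overrides the hook but forgets the superclass call, so nothing is counted.
class ForgetsSuper extends CountingBase {
    ForgetsSuper(int max) { super(max); }

    @Override
    public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata) {
        metadata.put("seen", "true");
    }
}

class StartEmbeddedDemo {
    // Feeds `count` embedded documents to the handler and reports the limit flag.
    static boolean limitHitAfter(CountingBase handler, int count) {
        try {
            for (int i = 0; i < count; i++) {
                handler.startEmbeddedDocument(new DefaultHandler(), new HashMap<>());
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.hasHitMaximumEmbeddedResources();
    }

    public static void main(String[] args) {
        System.out.println(limitHitAfter(new CallsSuper(2), 2));   // prints "true"
        System.out.println(limitHitAfter(new ForgetsSuper(2), 2)); // prints "false"
    }
}
```

In this sketch the subclass that skips the superclass call never reports the limit as reached, because its counter stays at zero.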
 
endEmbeddedDocument
public void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called after parsing each embedded document. Override this for custom behavior. This is currently a no-op.
- Parameters:
- contentHandler - content handler that was used on this embedded document
- metadata - metadata for this embedded document
- Throws:
- SAXException
 
endDocument
public void endDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called after the full parse has completed. Override this for custom behavior. Make sure to call this as super.endDocument(...) in subclasses, because this method records in the metadata whether the embedded resource maximum has been hit.
- Parameters:
- contentHandler - content handler that was used on the main document
- metadata - metadata that was gathered for the main document
- Throws:
- SAXException
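 
A rough sketch of why the super.endDocument(...) call matters follows. It uses hypothetical stand-in classes rather than Tika's: `LimitFlagBase` and `SummaryHandler` are invented for illustration, metadata is a plain `Map<String, String>`, and `LIMIT_KEY` is a placeholder string, not the real value of EMBEDDED_RESOURCE_LIMIT_REACHED. Writing the flag only when the limit was reached is also an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for the abstract handler; not Tika's implementation.
abstract class LimitFlagBase extends DefaultHandler {
    // Placeholder key; the real constant's value is not shown in this page.
    static final String LIMIT_KEY = "example:embedded-resource-limit-reached";

    private final int maxEmbeddedResources;
    private int embeddedResources = 0;

    LimitFlagBase(int maxEmbeddedResources) {
        this.maxEmbeddedResources = maxEmbeddedResources;
    }

    public void startEmbeddedDocument(ContentHandler handler, Map<String, String> metadata)
            throws SAXException {
        embeddedResources++;
    }

    public void endDocument(ContentHandler handler, Map<String, String> metadata)
            throws SAXException {
        // Assumption: the flag is written only when the limit was reached.
        if (hasHitMaximumEmbeddedResources()) {
            metadata.put(LIMIT_KEY, "true");
        }
    }

    public boolean hasHitMaximumEmbeddedResources() {
        return maxEmbeddedResources > -1 && embeddedResources >= maxEmbeddedResources;
    }
}

// Adds custom end-of-parse behaviour while keeping the base class bookkeeping.
class SummaryHandler extends LimitFlagBase {
    SummaryHandler(int max) { super(max); }

    @Override
    public void endDocument(ContentHandler handler, Map<String, String> metadata)
            throws SAXException {
        super.endDocument(handler, metadata); // lets the base class record the flag
        metadata.put("example:done", "true"); // custom behaviour (illustrative key)
    }
}

class EndDocumentDemo {
    // Simulates a parse with `embeddedCount` embedded documents and returns
    // the metadata gathered for the main document.
    static Map<String, String> parse(int max, int embeddedCount) {
        SummaryHandler handler = new SummaryHandler(max);
        Map<String, String> mainMetadata = new HashMap<>();
        try {
            for (int i = 0; i < embeddedCount; i++) {
                handler.startEmbeddedDocument(new DefaultHandler(), new HashMap<>());
            }
            handler.endDocument(new DefaultHandler(), mainMetadata);
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return mainMetadata;
    }

    public static void main(String[] args) {
        System.out.println(parse(1, 3).containsKey(LimitFlagBase.LIMIT_KEY)); // prints "true"
        System.out.println(parse(5, 3).containsKey(LimitFlagBase.LIMIT_KEY)); // prints "false"
    }
}
```

If `SummaryHandler` dropped the superclass call, the main document's metadata in this sketch would never carry the limit flag, regardless of how many embedded documents were seen.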
 
hasHitMaximumEmbeddedResources
public boolean hasHitMaximumEmbeddedResources()
- Returns:
- whether this handler has hit the maximum embedded resources during the parse
 
getContentHandlerFactory