Package org.apache.tika.sax
Class AbstractRecursiveParserWrapperHandler
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- All Implemented Interfaces:
Serializable, ContentHandler, DTDHandler, EntityResolver, ErrorHandler
- Direct Known Subclasses:
RecursiveParserWrapperHandler
public abstract class AbstractRecursiveParserWrapperHandler
extends DefaultHandler
implements Serializable
This is a special handler to be used only with the
RecursiveParserWrapper.
It allows for finer-grained processing of embedded documents than in the legacy handlers.
Subclasses can choose how to process individual embedded documents.
-
Field Summary
Fields -
Constructor Summary
Constructors
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
-
Method Summary
Methods
protected void decrementEmbeddedDepth()
    This is called by endEmbeddedDocument(ContentHandler, Metadata).
void endDocument(ContentHandler contentHandler, Metadata metadata)
    This is called after the full parse has completed.
void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata)
    This is called after parsing each embedded document.
void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata)
    This is called before parsing each embedded document.
Methods inherited from class org.xml.sax.helpers.DefaultHandler
characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.xml.sax.ContentHandler
declaration
-
Field Details
-
EMBEDDED_RESOURCE_LIMIT_REACHED
-
EMBEDDED_DEPTH_LIMIT_REACHED
-
-
Constructor Details
-
AbstractRecursiveParserWrapperHandler
-
-
Method Details
-
createHandler
-
startEmbeddedDocument
public void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called before parsing each embedded document. Override this for custom behavior. When overriding, make sure to call super.startEmbeddedDocument(contentHandler, metadata), because this method tracks the embedded depth.
- Parameters:
contentHandler - local handler to be used on this embedded document
metadata - embedded document's metadata
- Throws:
SAXException
-
endEmbeddedDocument
public void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called after parsing each embedded document. Override this for custom behavior. This is currently a no-op aside from tracking embedded depth. When overriding, make sure to call decrementEmbeddedDepth(), either directly or by calling super.endEmbeddedDocument(contentHandler, metadata).
- Parameters:
contentHandler - content handler that was used on this embedded document
metadata - metadata for this embedded document
- Throws:
SAXException
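The override contract above can be illustrated with a simplified stand-in. This is only a sketch: `DepthTrackingBase` and the three handlers are hypothetical and are not Tika classes. The model uses a plain `int` counter and `String` names in place of `ContentHandler` and `Metadata`, and it assumes the depth goes up by one in `startEmbeddedDocument` and down by one in `decrementEmbeddedDepth`.

```java
// Simplified model only; none of these classes are part of Tika.
class EmbeddedDepthSketch {
    // Simulates one embedded document nested inside another and returns
    // the depth the handler reports once both have ended.
    static int depthAfterNestedParse(DepthTrackingBase handler) {
        handler.startEmbeddedDocument("outer.zip");
        handler.startEmbeddedDocument("inner.txt");
        handler.endEmbeddedDocument("inner.txt");
        handler.endEmbeddedDocument("outer.zip");
        return handler.depth();
    }

    public static void main(String[] args) {
        System.out.println("via super:    " + depthAfterNestedParse(new ViaSuperHandler()));
        System.out.println("via explicit: " + depthAfterNestedParse(new ExplicitDecrementHandler()));
        System.out.println("forgotten:    " + depthAfterNestedParse(new ForgetfulHandler()));
    }
}

// Hypothetical stand-in for the abstract handler: it models only the
// depth bookkeeping (an assumption about how the counter behaves).
abstract class DepthTrackingBase {
    private int embeddedDepth = 0;

    void startEmbeddedDocument(String name) {
        embeddedDepth++;
    }

    void endEmbeddedDocument(String name) {
        decrementEmbeddedDepth();
    }

    protected void decrementEmbeddedDepth() {
        embeddedDepth--;
    }

    int depth() {
        return embeddedDepth;
    }
}

// Option 1: do the custom work, then delegate to super.
class ViaSuperHandler extends DepthTrackingBase {
    @Override
    void endEmbeddedDocument(String name) {
        // custom per-document processing would go here
        super.endEmbeddedDocument(name);
    }
}

// Option 2: skip super, but decrement the depth explicitly.
class ExplicitDecrementHandler extends DepthTrackingBase {
    @Override
    void endEmbeddedDocument(String name) {
        // custom per-document processing would go here
        decrementEmbeddedDepth();
    }
}

// Mistake: neither super nor decrementEmbeddedDepth() is called.
class ForgetfulHandler extends DepthTrackingBase {
    @Override
    void endEmbeddedDocument(String name) {
        // custom processing only; the depth is never decremented
    }
}
```

In this model the first two handlers finish at depth 0, while the handler that skips both calls finishes at depth 2, so any logic that depends on the depth would see the wrong value on later documents.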
-
decrementEmbeddedDepth
protected void decrementEmbeddedDepth()
This is called by endEmbeddedDocument(ContentHandler, Metadata). Users overriding endEmbeddedDocument(ContentHandler, Metadata) need to call this unless they trigger it via super.endEmbeddedDocument(contentHandler, metadata).
-
endDocument
public void endDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called after the full parse has completed. Override this for custom behavior. Make sure to call this as super.endDocument(...) in subclasses.
- Parameters:
contentHandler - content handler that was used on the main document
metadata - metadata that was gathered for the main document
- Throws:
SAXException
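A subclass that gathers results might combine the per-document callback with `endDocument` roughly as follows. This is a sketch with hypothetical stand-ins, not Tika code: `LifecycleBase` replaces the abstract handler, `String` names replace `ContentHandler` and `Metadata`, and the base `endDocument` merely records that it ran, since this page does not describe what the real base method does internally.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model only; none of these classes are part of Tika.
class EndDocumentSketch {
    public static void main(String[] args) {
        CollectingHandler handler = new CollectingHandler();
        handler.endEmbeddedDocument("attachment-1");
        handler.endEmbeddedDocument("attachment-2");
        handler.endDocument("container");
        System.out.println(handler.seen());
        System.out.println(handler.baseEndDocumentRan());
    }
}

// Hypothetical stand-in for the abstract handler.
abstract class LifecycleBase {
    private boolean endDocumentRan = false;

    void endEmbeddedDocument(String name) {
        // base bookkeeping for embedded documents would happen here
    }

    void endDocument(String name) {
        // placeholder for whatever the real base class does at the end
        endDocumentRan = true;
    }

    boolean baseEndDocumentRan() {
        return endDocumentRan;
    }
}

// Collects every document it is told about, main document last.
class CollectingHandler extends LifecycleBase {
    private final List<String> seen = new ArrayList<>();

    @Override
    void endEmbeddedDocument(String name) {
        super.endEmbeddedDocument(name);
        seen.add(name);
    }

    @Override
    void endDocument(String name) {
        // Delegate first so the base class can finish its own work.
        super.endDocument(name);
        seen.add(name);
    }

    List<String> seen() {
        return seen;
    }
}
```

Calling `super.endDocument(...)` before the subclass's own work is a design choice in this sketch; the documentation above only requires that the super call be made.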
-
getContentHandlerFactory
-