Package org.apache.tika.sax
Class RecursiveParserWrapperHandler
- java.lang.Object
  - org.xml.sax.helpers.DefaultHandler
    - org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
      - org.apache.tika.sax.RecursiveParserWrapperHandler

All Implemented Interfaces:
Serializable, ContentHandler, DTDHandler, EntityResolver, ErrorHandler
 
public class RecursiveParserWrapperHandler extends AbstractRecursiveParserWrapperHandler

This is the default implementation of AbstractRecursiveParserWrapperHandler. See its documentation for more details.

This caches a metadata object for each embedded file and for the container file. It places the extracted content in the metadata object, with this key: TikaCoreProperties.TIKA_CONTENT

If memory is a concern, subclass AbstractRecursiveParserWrapperHandler to handle each embedded document.

NOTE: This handler must only be used with the RecursiveParserWrapper.

- See Also:
  - Serialized Form
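
The trade-off between caching every metadata object and handling each embedded document as it finishes can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `DocumentCallback`, `CachingCallback`, `StreamingCallback`, `feed`, and the `"content"` key are hypothetical names invented for illustration, not Tika classes or constants, and plain `Map<String, String>` objects stand in for `Metadata`. It uses only the JDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Stand-alone model; none of these types are Tika classes.
class CachingVersusStreamingSketch {

    // Hypothetical stand-in for the TikaCoreProperties.TIKA_CONTENT key.
    static final String CONTENT_KEY = "content";

    /** Hypothetical stand-in for the abstract base handler: one callback per finished document. */
    abstract static class DocumentCallback {
        abstract void endEmbeddedDocument(String extractedText, Map<String, String> metadata);
    }

    /** Models the caching behavior: every metadata object is retained until the caller asks for it. */
    static class CachingCallback extends DocumentCallback {
        final List<Map<String, String>> metadataList = new ArrayList<>();

        @Override
        void endEmbeddedDocument(String extractedText, Map<String, String> metadata) {
            metadata.put(CONTENT_KEY, extractedText);
            metadataList.add(metadata);
        }
    }

    /** Models the memory-conscious alternative: pass each document to a sink and retain nothing. */
    static class StreamingCallback extends DocumentCallback {
        private final Consumer<Map<String, String>> sink;
        int handled = 0;

        StreamingCallback(Consumer<Map<String, String>> sink) {
            this.sink = sink;
        }

        @Override
        void endEmbeddedDocument(String extractedText, Map<String, String> metadata) {
            metadata.put(CONTENT_KEY, extractedText);
            sink.accept(metadata); // e.g. write to disk or an index, then let it be garbage collected
            handled++;
        }
    }

    /** Simulates a parse that finishes one embedded document per text. */
    static void feed(DocumentCallback callback, String... texts) {
        for (String text : texts) {
            callback.endEmbeddedDocument(text, new HashMap<>());
        }
    }

    public static void main(String[] args) {
        CachingCallback caching = new CachingCallback();
        feed(caching, "first attachment", "second attachment");
        System.out.println("cached documents: " + caching.metadataList.size());

        StreamingCallback streaming =
                new StreamingCallback(m -> System.out.println("streamed: " + m.get(CONTENT_KEY)));
        feed(streaming, "first attachment", "second attachment");
        System.out.println("handled without caching: " + streaming.handled);
    }
}
```

With Tika on the classpath, the streaming variant would roughly correspond to a subclass of AbstractRecursiveParserWrapperHandler that overrides endEmbeddedDocument; the exact wiring depends on the Tika version in use.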
 
Field Summary
- protected List<Metadata> metadataList

Fields inherited from class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler:
EMBEDDED_RESOURCE_LIMIT_REACHED
 
Constructor Summary
- RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
  Create a handler with no limit on the number of embedded resources.
- RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources)
  Create a handler that limits the number of embedded resources that will be parsed.
- RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources, MetadataFilter metadataFilter)
Method Summary
- void endDocument(ContentHandler contentHandler, Metadata metadata)
  This is called after the full parse has completed.
- void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata)
  This is called after parsing an embedded document.
- List<Metadata> getMetadataList()
- void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata)
  This is called before parsing an embedded document.

Methods inherited from class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler:
getContentHandlerFactory, getNewContentHandler, getNewContentHandler, hasHitMaximumEmbeddedResources

Methods inherited from class org.xml.sax.helpers.DefaultHandler:
characters, endDocument, endElement, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning
 
Constructor Detail

RecursiveParserWrapperHandler
public RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory)
Create a handler with no limit on the number of embedded resources.

RecursiveParserWrapperHandler
public RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources)
Create a handler that limits the number of embedded resources that will be parsed.
- Parameters:
  - maxEmbeddedResources - maximum number of embedded resources that will be parsed
 
RecursiveParserWrapperHandler
public RecursiveParserWrapperHandler(ContentHandlerFactory contentHandlerFactory, int maxEmbeddedResources, MetadataFilter metadataFilter)
 
Method Detail

startEmbeddedDocument
public void startEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called before parsing an embedded document.
- Overrides:
  - startEmbeddedDocument in class AbstractRecursiveParserWrapperHandler
- Parameters:
  - contentHandler - local content handler to use on the embedded document
  - metadata - metadata to use for the embedded document
- Throws:
  - SAXException
 
endEmbeddedDocument
public void endEmbeddedDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
This is called after parsing an embedded document.
- Overrides:
  - endEmbeddedDocument in class AbstractRecursiveParserWrapperHandler
- Parameters:
  - contentHandler - local content handler used on the embedded document
  - metadata - metadata from the embedded document
- Throws:
  - SAXException
 
endDocument
public void endDocument(ContentHandler contentHandler, Metadata metadata) throws SAXException
Description copied from class: AbstractRecursiveParserWrapperHandler
This is called after the full parse has completed. Override this for custom behavior. Make sure to call super.endDocument(...) in subclasses, because it records in the metadata whether the embedded resource maximum has been hit.
- Overrides:
  - endDocument in class AbstractRecursiveParserWrapperHandler
- Parameters:
  - contentHandler - content handler used on the main document
  - metadata - metadata from the main document
- Throws:
  - SAXException
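
The reason for calling super.endDocument(...) in subclasses can be illustrated with a stand-alone model of the contract. This is a sketch, not Tika's implementation: `BaseHandler`, `WellBehavedHandler`, `ForgetfulHandler`, `run`, and the `LIMIT_REACHED_KEY` string are hypothetical stand-ins, a `Map<String, String>` stands in for `Metadata`, and treating a negative maximum as "no limit" is an assumption of this model. It uses only the JDK.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Stand-alone model; none of these types are Tika classes.
class SuperEndDocumentSketch {

    // Hypothetical stand-in for the EMBEDDED_RESOURCE_LIMIT_REACHED metadata key.
    static final String LIMIT_REACHED_KEY = "embedded_resource_limit_reached";

    /** Hypothetical stand-in for the abstract base handler that tracks the limit. */
    static class BaseHandler {
        private final int maxEmbeddedResources; // assumption: a negative value means "no limit"
        private int embeddedResources = 0;

        BaseHandler(int maxEmbeddedResources) {
            this.maxEmbeddedResources = maxEmbeddedResources;
        }

        void startEmbeddedDocument(Map<String, String> metadata) {
            embeddedResources++;
        }

        boolean hasHitMaximumEmbeddedResources() {
            return maxEmbeddedResources >= 0 && embeddedResources >= maxEmbeddedResources;
        }

        /** Records in the container metadata whether the limit was reached. */
        void endDocument(Map<String, String> metadata) {
            if (hasHitMaximumEmbeddedResources()) {
                metadata.put(LIMIT_REACHED_KEY, "true");
            }
        }
    }

    /** Custom behavior plus the base-class bookkeeping. */
    static class WellBehavedHandler extends BaseHandler {
        WellBehavedHandler(int max) { super(max); }

        @Override
        void endDocument(Map<String, String> metadata) {
            super.endDocument(metadata); // keeps the limit flag
            metadata.put("custom", "done");
        }
    }

    /** Custom behavior only; the limit flag is silently lost. */
    static class ForgetfulHandler extends BaseHandler {
        ForgetfulHandler(int max) { super(max); }

        @Override
        void endDocument(Map<String, String> metadata) {
            metadata.put("custom", "done");
        }
    }

    /** Simulates a parse with the given number of embedded documents; returns the container metadata. */
    static Map<String, String> run(BaseHandler handler, int embeddedCount) {
        for (int i = 0; i < embeddedCount && !handler.hasHitMaximumEmbeddedResources(); i++) {
            handler.startEmbeddedDocument(new LinkedHashMap<>());
        }
        Map<String, String> container = new LinkedHashMap<>();
        handler.endDocument(container);
        return container;
    }

    public static void main(String[] args) {
        System.out.println("with super call:    " + run(new WellBehavedHandler(2), 5));
        System.out.println("without super call: " + run(new ForgetfulHandler(2), 5));
    }
}
```

In this model, only the handler that delegates to the base class leaves the limit flag in the container metadata, which is the behavior the description above asks subclasses to preserve.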
 
 