Class RecursiveParserWrapper
- java.lang.Object
  - org.apache.tika.parser.ParserDecorator
    - org.apache.tika.parser.RecursiveParserWrapper

- All Implemented Interfaces:
- Serializable, Parser
 
public class RecursiveParserWrapper extends ParserDecorator

This is a helper class that wraps a parser in a recursive handler. It takes care of setting the embedded parser in the ParseContext and handling the embedded path calculations.

After parsing a document, call getMetadata() to retrieve a list of Metadata objects, one for each embedded resource. The first item in the list will contain the Metadata for the outer container file.

Content can also be extracted and stored in the TikaCoreProperties.TIKA_CONTENT field of a Metadata object. Select the type of content to be stored at initialization.

If a WriteLimitReachedException is encountered, the wrapper will stop processing the current resource, and it will not process any of the child resources for the given resource. However, it will try to parse as much as it can. If a WriteLimitReachedException is reached in the parent document, no child resources will be parsed.

The implementation is based on Jukka's RecursiveMetadataParser and Nick's additions. See: RecursiveMetadataParser.

Note that this wrapper holds all data in memory and is not appropriate for files with content too large to be held in memory.

The unit tests for this class are in the tika-parsers module.

- See Also:
- Serialized Form
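The workflow described above can be sketched as follows. This is a minimal illustration, not the project's reference usage: it assumes a Tika 2.x classpath (tika-core plus whichever parser modules you need), and it assumes the metadata list is read from a RecursiveParserWrapperHandler via getMetadataList(), since the method summary on this page lists no getMetadata() on the wrapper itself. The class name RecursiveParseExample and the helper parseRecursively are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.RecursiveParserWrapper;
import org.apache.tika.sax.BasicContentHandlerFactory;
import org.apache.tika.sax.RecursiveParserWrapperHandler;

public class RecursiveParseExample {

    // Hypothetical helper: parses a stream and returns one Metadata object
    // per resource, with the container document first.
    static List<Metadata> parseRecursively(InputStream stream) throws Exception {
        // Wrap a parser; embedded exceptions are caught and recorded by default.
        RecursiveParserWrapper wrapper = new RecursiveParserWrapper(new AutoDetectParser());

        // The content type to store is chosen here: plain text, no write limit (-1).
        RecursiveParserWrapperHandler handler = new RecursiveParserWrapperHandler(
                new BasicContentHandlerFactory(
                        BasicContentHandlerFactory.HANDLER_TYPE.TEXT, -1));

        wrapper.parse(stream, handler, new Metadata(), new ParseContext());
        return handler.getMetadataList();
    }

    public static void main(String[] args) throws Exception {
        byte[] bytes = "hello recursive parser".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(bytes)) {
            List<Metadata> metadataList = parseRecursively(in);
            for (Metadata m : metadataList) {
                // Extracted content, if any, is stored under TIKA_CONTENT.
                System.out.println(m.get(TikaCoreProperties.TIKA_CONTENT));
            }
        }
    }
}
```

Because every Metadata object and its content is collected in a list, this pattern inherits the memory caveat above: it suits documents whose full extracted content fits in memory.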
 
Constructor Summary
- RecursiveParserWrapper(Parser wrappedParser): Initialize the wrapper with catchEmbeddedExceptions set to true as default.
- RecursiveParserWrapper(Parser wrappedParser, boolean catchEmbeddedExceptions)

Method Summary
- static String getResourceName(Metadata metadata, AtomicInteger counter)
- Set<MediaType> getSupportedTypes(ParseContext context): Delegates the method call to the decorated parser.
- void parse(InputStream stream, ContentHandler recursiveParserWrapperHandler, Metadata metadata, ParseContext context): Delegates the method call to the decorated parser.

Methods inherited from class org.apache.tika.parser.ParserDecorator: getDecorationName, getWrappedParser, withFallbacks, withoutTypes, withTypes
 
Constructor Detail

RecursiveParserWrapper
public RecursiveParserWrapper(Parser wrappedParser)
Initialize the wrapper with catchEmbeddedExceptions set to true as default.
- Parameters:
- wrappedParser - parser to use for the container documents and the embedded documents

RecursiveParserWrapper
public RecursiveParserWrapper(Parser wrappedParser, boolean catchEmbeddedExceptions)
- Parameters:
- wrappedParser - parser to wrap
- catchEmbeddedExceptions - whether or not to catch and record embedded exceptions. If set to false, embedded exceptions will be thrown and the rest of the file will not be parsed. The following will not be ignored: CorruptedFileException, RuntimeException
 
 
Method Detail

getSupportedTypes
public Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from class: ParserDecorator
Delegates the method call to the decorated parser. Subclasses should override this method (and use super.getSupportedTypes() to invoke the decorated parser) to implement extra decoration.
- Specified by:
- getSupportedTypes in interface Parser
- Overrides:
- getSupportedTypes in class ParserDecorator
- Parameters:
- context - parse context
- Returns:
- immutable set of media types
 
parse
public void parse(InputStream stream, ContentHandler recursiveParserWrapperHandler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException
Description copied from class: ParserDecorator
Delegates the method call to the decorated parser. Subclasses should override this method (and use super.parse() to invoke the decorated parser) to implement extra decoration.
- Specified by:
- parse in interface Parser
- Overrides:
- parse in class ParserDecorator
- Parameters:
- stream -
- recursiveParserWrapperHandler - handler must implement RecursiveParserWrapperHandler
- metadata -
- context -
- Throws:
- IOException
- SAXException
- TikaException
- IllegalStateException - if the handler is not a RecursiveParserWrapperHandler
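The handler requirement on parse can be illustrated with a short sketch. It assumes a Tika 2.x classpath; the class name HandlerTypeCheckExample and the helper rejectsHandler are hypothetical names used only here. Passing a plain SAX DefaultHandler, which is not a RecursiveParserWrapperHandler, is expected to trigger the IllegalStateException documented above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.RecursiveParserWrapper;
import org.xml.sax.ContentHandler;
import org.xml.sax.helpers.DefaultHandler;

public class HandlerTypeCheckExample {

    // Hypothetical helper: returns true if the wrapper refuses the given
    // handler with an IllegalStateException, false if parsing proceeds.
    static boolean rejectsHandler(ContentHandler handler) throws Exception {
        RecursiveParserWrapper wrapper = new RecursiveParserWrapper(new AutoDetectParser());
        byte[] bytes = "sample".getBytes(StandardCharsets.UTF_8);
        try (InputStream stream = new ByteArrayInputStream(bytes)) {
            wrapper.parse(stream, handler, new Metadata(), new ParseContext());
            return false;
        } catch (IllegalStateException e) {
            // Thrown when the handler is not a RecursiveParserWrapperHandler.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        // A plain DefaultHandler does not satisfy the handler requirement.
        System.out.println(rejectsHandler(new DefaultHandler()));
    }
}
```

Declaring the variable with the RecursiveParserWrapperHandler type at the call site, rather than ContentHandler, lets a wrong handler choice surface while reading the code instead of at run time.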
 
getResourceName
public static String getResourceName(Metadata metadata, AtomicInteger counter)