org.apache.tika.parser
Class DelegatingParser

java.lang.Object
  extended by org.apache.tika.parser.AbstractParser
      extended by org.apache.tika.parser.DelegatingParser
All Implemented Interfaces:
Serializable, Parser
Direct Known Subclasses:
CryptoParser

public class DelegatingParser
extends AbstractParser

Base class for parser implementations that delegate part of the parsing of an input document to another parser. The delegate parser is looked up from the parse context using the Parser class as the key.

Since:
Apache Tika 0.4, major changes in Tika 0.5
See Also:
Serialized Form

Constructor Summary
DelegatingParser()
           
 
Method Summary
protected  Parser getDelegateParser(ParseContext context)
          Returns the parser instance to which parsing tasks should be delegated.
 Set<MediaType> getSupportedTypes(ParseContext context)
          Returns the set of media types supported by this parser when used with the given parse context.
 void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context)
          Looks up the delegate parser from the parsing context and delegates the parse operation to it.
 
Methods inherited from class org.apache.tika.parser.AbstractParser
parse
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DelegatingParser

public DelegatingParser()

Method Detail

getDelegateParser

protected Parser getDelegateParser(ParseContext context)
Returns the parser instance to which parsing tasks should be delegated. The default implementation looks up the delegate parser from the given parse context, and uses an EmptyParser instance as a fallback. Subclasses can override this method to implement alternative delegation strategies.

Parameters:
context - parse context
Returns:
delegate parser
Since:
Apache Tika 0.7
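
The lookup-with-fallback behaviour described above can be illustrated with a minimal, dependency-free sketch. Every type below (SketchParser, SketchContext, SketchEmptyParser, SketchDelegatingParser, FixedDelegateParser) is a hypothetical stand-in written for this example, not a Tika class; the sketch only assumes that the context is keyed by class and that the fallback parser produces no content.

```java
import java.util.HashMap;
import java.util.Map;

class DelegateLookupDemo {
    public static void main(String[] args) {
        SketchContext context = new SketchContext();
        SketchDelegatingParser parser = new SketchDelegatingParser();

        // No delegate registered: the empty fallback is used.
        System.out.println("[" + parser.parse("abc", context) + "]"); // prints "[]"

        // Register a delegate under the parser interface as the key.
        context.set(SketchParser.class, (input, ctx) -> input.toUpperCase());
        System.out.println("[" + parser.parse("abc", context) + "]"); // prints "[ABC]"
    }
}

// Hypothetical stand-in for a parser interface, reduced to one method.
interface SketchParser {
    String parse(String input, SketchContext context);
}

// Hypothetical stand-in for a parse context: a map keyed by class.
final class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    <T> T get(Class<T> key, T fallback) {
        Object value = entries.get(key);
        return value != null ? key.cast(value) : fallback;
    }
}

// Stand-in for an empty fallback parser: it produces no content.
final class SketchEmptyParser implements SketchParser {
    static final SketchEmptyParser INSTANCE = new SketchEmptyParser();

    @Override
    public String parse(String input, SketchContext context) {
        return "";
    }
}

class SketchDelegatingParser implements SketchParser {
    // Default strategy: context lookup, with the empty parser as fallback.
    protected SketchParser getDelegateParser(SketchContext context) {
        return context.get(SketchParser.class, SketchEmptyParser.INSTANCE);
    }

    @Override
    public String parse(String input, SketchContext context) {
        return getDelegateParser(context).parse(input, context);
    }
}

// Alternative strategy: a subclass that ignores the context entirely.
final class FixedDelegateParser extends SketchDelegatingParser {
    private final SketchParser delegate;

    FixedDelegateParser(SketchParser delegate) {
        this.delegate = delegate;
    }

    @Override
    protected SketchParser getDelegateParser(SketchContext context) {
        return delegate;
    }
}
```

FixedDelegateParser shows why the hook is protected: a subclass can swap in a different delegation strategy without touching the parse logic.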

getSupportedTypes

public Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.

Parameters:
context - parse context
Returns:
immutable set of media types

parse

public void parse(InputStream stream,
                  ContentHandler handler,
                  Metadata metadata,
                  ParseContext context)
           throws SAXException,
                  IOException,
                  TikaException
Looks up the delegate parser from the parse context and delegates the parse operation to it. If no delegate parser is found, an empty XHTML document is produced instead.

Subclasses should override this method to parse the top-level structure of the given document stream. Parsed sub-streams can then be passed to this base class method, which hands them to the configured delegate parser.

Parameters:
stream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
Throws:
SAXException - if the SAX events could not be processed
IOException - if the document stream could not be read
TikaException - if the document could not be parsed
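
The override pattern described for this method (handle the outer structure in the subclass, then pass the inner stream to the base class) might look roughly like the sketch below. It uses only the Java standard library; MiniParser, MiniDelegatingParser and Base64EnvelopeParser are hypothetical stand-ins, a StringBuilder plays the role of the content handler, and a plain map stands in for the parse context. The Base64 "envelope" format is an assumption chosen purely for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

class EnvelopeDemo {
    // Wraps plainText in a Base64 envelope, parses it, and returns the handler output.
    static String parseEnvelope(String plainText, boolean registerDelegate) {
        Map<Class<?>, Object> context = new HashMap<>();
        if (registerDelegate) {
            // Delegate for the inner content: copies the text to the handler.
            MiniParser textParser = (in, out, ctx) ->
                    out.append(new String(in.readAllBytes(), StandardCharsets.UTF_8));
            context.put(MiniParser.class, textParser);
        }
        byte[] envelope = Base64.getEncoder().encode(plainText.getBytes(StandardCharsets.UTF_8));
        StringBuilder handler = new StringBuilder();
        try {
            new Base64EnvelopeParser().parse(new ByteArrayInputStream(envelope), handler, context);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return handler.toString();
    }

    public static void main(String[] args) {
        System.out.println(parseEnvelope("hello", true));            // prints "hello"
        System.out.println(parseEnvelope("hello", false).isEmpty()); // prints "true"
    }
}

// Hypothetical stand-in for a parser interface.
interface MiniParser {
    // Fallback that emits nothing, standing in for an empty parser.
    MiniParser EMPTY = (in, out, ctx) -> { };

    void parse(InputStream stream, StringBuilder handler, Map<Class<?>, Object> context)
            throws IOException;
}

class MiniDelegatingParser implements MiniParser {
    protected MiniParser getDelegateParser(Map<Class<?>, Object> context) {
        Object found = context.get(MiniParser.class);
        return found instanceof MiniParser ? (MiniParser) found : MiniParser.EMPTY;
    }

    @Override
    public void parse(InputStream stream, StringBuilder handler, Map<Class<?>, Object> context)
            throws IOException {
        getDelegateParser(context).parse(stream, handler, context);
    }
}

class Base64EnvelopeParser extends MiniDelegatingParser {
    @Override
    public void parse(InputStream stream, StringBuilder handler, Map<Class<?>, Object> context)
            throws IOException {
        // Top-level structure: the whole stream is treated as a Base64 envelope.
        InputStream payload = Base64.getDecoder().wrap(stream);
        // The decoded sub-stream goes to the base class, which delegates it.
        super.parse(payload, handler, context);
    }
}
```

With no delegate registered, the decoded stream is never read and the handler stays empty, mirroring the empty-document fallback described above.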


Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.