org.apache.tika.parser
Class DelegatingParser

java.lang.Object
  extended by org.apache.tika.parser.DelegatingParser
All Implemented Interfaces:
java.io.Serializable, Parser
Direct Known Subclasses:
ForkParser

public class DelegatingParser
extends java.lang.Object
implements Parser

Base class for parser implementations that delegate part of the work of parsing an input document to another parser. The delegate parser is looked up from the parsing context, using the Parser class as the key.

Since:
Apache Tika 0.4, major changes in Tika 0.5
See Also:
Serialized Form

Constructor Summary
DelegatingParser()
           
 
Method Summary
protected  Parser getDelegateParser(ParseContext context)
          Returns the parser instance to which parsing tasks should be delegated.
 java.util.Set<MediaType> getSupportedTypes(ParseContext context)
          Returns the set of media types supported by this parser when used with the given parse context.
 void parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata)
          Deprecated. This method will be removed in Apache Tika 1.0.
 void parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context)
          Looks up the delegate parser from the parsing context and delegates the parse operation to it.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DelegatingParser

public DelegatingParser()
Method Detail

getDelegateParser

protected Parser getDelegateParser(ParseContext context)
Returns the parser instance to which parsing tasks should be delegated. The default implementation looks up the delegate parser from the given parse context, and uses an EmptyParser instance as a fallback. Subclasses can override this method to implement alternative delegation strategies.

Parameters:
context - parse context
Returns:
delegate parser
Since:
Apache Tika 0.7
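
The lookup, fallback and override behaviour described above can be sketched without a Tika dependency. This is a minimal model only: SketchParser, SketchContext, SketchDelegatingParser and FixedDelegateParser are hypothetical stand-ins invented for this illustration, strings replace the real stream and handler arguments, and the EMPTY constant only plays the role of EmptyParser. It is not the Tika implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the Parser interface (simplified to strings).
interface SketchParser {
    String parse(String input, SketchContext context);
}

// Hypothetical stand-in for a parse context: a map from class keys to instances.
class SketchContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the stored instance, or the fallback when the key is absent.
    <T> T get(Class<T> key, T fallback) {
        Object value = entries.get(key);
        return value != null ? key.cast(value) : fallback;
    }
}

class SketchDelegatingParser implements SketchParser {
    // Plays the role of an empty parser: accepts any input, produces nothing.
    static final SketchParser EMPTY = (input, context) -> "";

    // Default strategy: look up the delegate under the parser interface key.
    protected SketchParser getDelegateParser(SketchContext context) {
        return context.get(SketchParser.class, EMPTY);
    }

    @Override
    public String parse(String input, SketchContext context) {
        return getDelegateParser(context).parse(input, context);
    }
}

// Alternative strategy: always delegate to a fixed parser, ignoring the context.
class FixedDelegateParser extends SketchDelegatingParser {
    private final SketchParser fixed;

    FixedDelegateParser(SketchParser fixed) {
        this.fixed = fixed;
    }

    @Override
    protected SketchParser getDelegateParser(SketchContext context) {
        return fixed;
    }
}
```

In this model, a context with no registered parser yields empty output, a registered parser receives the call, and the subclass bypasses the context entirely by overriding the lookup method.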

getSupportedTypes

public java.util.Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.

Specified by:
getSupportedTypes in interface Parser
Parameters:
context - parse context
Returns:
immutable set of media types

parse

public void parse(java.io.InputStream stream,
                  org.xml.sax.ContentHandler handler,
                  Metadata metadata,
                  ParseContext context)
           throws org.xml.sax.SAXException,
                  java.io.IOException,
                  TikaException
Looks up the delegate parser from the parsing context and delegates the parse operation to it. If no delegate parser is found, an empty XHTML document is written to the handler.

Subclasses should override this method to parse the top-level structure of the given document stream. Parsed sub-streams can then be passed to this base class method, which hands them to the configured delegate parser.

Specified by:
parse in interface Parser
Parameters:
stream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
Throws:
org.xml.sax.SAXException - if the SAX events could not be processed
java.io.IOException - if the document stream could not be read
TikaException - if the document could not be parsed
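
The subclassing pattern described for this method can be sketched as follows, again as a dependency-free model rather than real Tika code. MiniParser, MiniContext, MiniDelegatingParser and ContainerParser are hypothetical names, a StringBuilder stands in for the SAX content handler, and the '|'-separated container format is made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins: strings replace streams, a StringBuilder replaces SAX output.
interface MiniParser {
    void parse(String input, StringBuilder out, MiniContext context);
}

class MiniContext {
    private final Map<Class<?>, Object> entries = new HashMap<>();

    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    <T> T get(Class<T> key, T fallback) {
        Object value = entries.get(key);
        return value != null ? key.cast(value) : fallback;
    }
}

// Base class: forwards the whole parse call to the delegate found in the context.
class MiniDelegatingParser implements MiniParser {
    @Override
    public void parse(String input, StringBuilder out, MiniContext context) {
        MiniParser empty = (in, o, c) -> { };  // fallback that writes nothing
        context.get(MiniParser.class, empty).parse(input, out, context);
    }
}

// Subclass: handles the top-level structure itself and delegates each part.
class ContainerParser extends MiniDelegatingParser {
    @Override
    public void parse(String input, StringBuilder out, MiniContext context) {
        for (String part : input.split("\\|")) {
            out.append('[');                  // top-level structure, written here
            super.parse(part, out, context);  // embedded part, parsed by the delegate
            out.append(']');
        }
    }
}
```

The subclass owns only the container layout; what happens to each embedded part depends on whichever parser the caller placed in the context.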

parse

public void parse(java.io.InputStream stream,
                  org.xml.sax.ContentHandler handler,
                  Metadata metadata)
           throws java.io.IOException,
                  org.xml.sax.SAXException,
                  TikaException
Deprecated. This method will be removed in Apache Tika 1.0.

Description copied from interface: Parser
The parse() method from Tika 0.4 and earlier. Please use the Parser.parse(InputStream, ContentHandler, Metadata, ParseContext) method instead in new code. Calls to this backwards compatibility method are forwarded to the new parse() method with an empty parse context.

Specified by:
parse in interface Parser
Throws:
java.io.IOException
org.xml.sax.SAXException
TikaException


Copyright © 2007-2010 The Apache Software Foundation. All Rights Reserved.