org.apache.tika.parser
Class ParserPostProcessor

java.lang.Object
  extended by org.apache.tika.parser.AbstractParser
      extended by org.apache.tika.parser.ParserDecorator
          extended by org.apache.tika.parser.ParserPostProcessor
All Implemented Interfaces:
java.io.Serializable, Parser

public class ParserPostProcessor
extends ParserDecorator

Parser decorator that post-processes the results from a decorated parser. The post-processing takes care of filling in the "fulltext", "summary", and "outlinks" metadata entries based on the full text content returned by the decorated parser.
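
The post-processing step described above can be illustrated with a minimal, standard-library-only sketch. It is not Tika's implementation: the class name PostProcessSketch, the 500-character summary limit, the URL pattern, and the use of a plain Map in place of Metadata are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only; hypothetical class, not part of Tika. */
final class PostProcessSketch {

    // Assumed summary length; this page does not state the real limit.
    static final int SUMMARY_LENGTH = 500;

    // Assumed, deliberately simple pattern for spotting links in plain text.
    static final Pattern LINK = Pattern.compile("https?://[^\\s<>\"]+");

    /** Derives the three metadata entries from the extracted full text. */
    public static Map<String, List<String>> postProcess(String fullText) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();

        // "fulltext": the complete text content.
        metadata.put("fulltext", List.of(fullText));

        // "summary": a leading slice of the text, capped at SUMMARY_LENGTH.
        int end = Math.min(fullText.length(), SUMMARY_LENGTH);
        metadata.put("summary", List.of(fullText.substring(0, end)));

        // "outlinks": every match of the link pattern, in document order.
        List<String> links = new ArrayList<>();
        Matcher matcher = LINK.matcher(fullText);
        while (matcher.find()) {
            links.add(matcher.group());
        }
        metadata.put("outlinks", links);
        return metadata;
    }

    public static void main(String[] args) {
        Map<String, List<String>> md =
                postProcess("See http://tika.apache.org/ for details.");
        System.out.println(md.get("outlinks")); // prints [http://tika.apache.org/]
    }
}
```

The entries are multi-valued in this sketch because a document may contain several outlinks; how the real class stores them is not specified on this page.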

See Also:
Serialized Form

Constructor Summary
ParserPostProcessor(Parser parser)
          Creates a post-processing decorator for the given parser.
 
Method Summary
 void parse(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, ParseContext context)
          Forwards the call to the decorated parser, then post-processes the results as described above.
 
Methods inherited from class org.apache.tika.parser.ParserDecorator
getSupportedTypes, getWrappedParser, withTypes
 
Methods inherited from class org.apache.tika.parser.AbstractParser
parse
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

ParserPostProcessor

public ParserPostProcessor(Parser parser)
Creates a post-processing decorator for the given parser.

Parameters:
parser - the parser to be decorated

Method Detail

parse

public void parse(java.io.InputStream stream,
                  org.xml.sax.ContentHandler handler,
                  Metadata metadata,
                  ParseContext context)
           throws java.io.IOException,
                  org.xml.sax.SAXException,
                  TikaException
Forwards the call to the decorated parser, then fills in the "fulltext", "summary", and "outlinks" metadata entries as described in the class description.

Specified by:
parse in interface Parser
Overrides:
parse in class ParserDecorator
Parameters:
stream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
Throws:
java.io.IOException - if the document stream could not be read
org.xml.sax.SAXException - if the SAX events could not be processed
TikaException - if the document could not be parsed
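
The forward-then-post-process order of this method can be sketched with a small stand-alone decorator. Everything here is hypothetical: SimpleParser is a simplified stand-in for Parser, a StringBuilder replaces the SAX ContentHandler, and a plain Map replaces Metadata. The sketch buffers the delegate's output and then copies it to the caller, which is only one possible way to capture the text and may differ from what the real class does.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative only; hypothetical types, not part of Tika. */
final class DecoratorSketch {

    /** Simplified stand-in for a parser: writes text to out, may set metadata. */
    interface SimpleParser {
        void parse(String input, StringBuilder out, Map<String, String> metadata);
    }

    /** Forwards to the wrapped parser, then post-processes what it produced. */
    static final class PostProcessing implements SimpleParser {
        private final SimpleParser delegate;

        PostProcessing(SimpleParser delegate) {
            this.delegate = delegate;
        }

        @Override
        public void parse(String input, StringBuilder out, Map<String, String> metadata) {
            // Step 1: let the decorated parser do the actual work,
            // capturing its text output in a private buffer.
            StringBuilder captured = new StringBuilder();
            delegate.parse(input, captured, metadata);

            // Step 2: the caller still receives everything the delegate wrote.
            out.append(captured);

            // Step 3: post-process, adding entries derived from the captured text.
            metadata.put("fulltext", captured.toString());
        }
    }

    /** Runs the decorator once and returns the resulting metadata. */
    public static Map<String, String> demo() {
        SimpleParser trimming = (in, out, md) -> {
            out.append(in.trim());
            md.put("parsedBy", "trim");
        };
        Map<String, String> metadata = new HashMap<>();
        StringBuilder out = new StringBuilder();
        new PostProcessing(trimming).parse("  hello  ", out, metadata);
        metadata.put("output", out.toString());
        return metadata;
    }

    public static void main(String[] args) {
        System.out.println(demo().get("fulltext")); // prints hello
    }
}
```

Metadata set by the decorated parser ("parsedBy" in the sketch) survives alongside the entries added afterwards, which matches the "input and output" role of the metadata parameter.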


Copyright © 2007-2011 The Apache Software Foundation. All Rights Reserved.