org.apache.tika.parser
Class ExternalParser
java.lang.Object
org.apache.tika.parser.ExternalParser
- All Implemented Interfaces:
- Parser
public class ExternalParser
- extends java.lang.Object
- implements Parser
Parser that uses an external program (like catdoc or pdf2txt) to extract
text content from a given document.
Method Summary
void
parse(java.io.InputStream stream,
org.xml.sax.ContentHandler handler,
Metadata metadata)
Deprecated. This method will be removed in Apache Tika 1.0.
void
parse(java.io.InputStream stream,
org.xml.sax.ContentHandler handler,
Metadata metadata,
ParseContext context)
Executes the configured external command on the given document stream
and passes the extracted text as a simple XHTML document to the given
SAX content handler.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ExternalParser
public ExternalParser()
parse
public void parse(java.io.InputStream stream,
org.xml.sax.ContentHandler handler,
Metadata metadata,
ParseContext context)
throws java.io.IOException,
org.xml.sax.SAXException,
TikaException
- Executes the configured external command on the given document stream
and passes the extracted text as a simple XHTML document to the given
SAX content handler.
No metadata is extracted.
- Specified by:
parse
in interface Parser
- Parameters:
stream
- the document stream (input)
handler
- handler for the XHTML SAX events (output)
metadata
- document metadata (input and output)
context
- parse context
- Throws:
java.io.IOException
- if the document stream could not be read
org.xml.sax.SAXException
- if the SAX events could not be processed
TikaException
- if the document could not be parsed
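The behaviour described above (run an external command on the document stream, then report its output as a simple XHTML document through SAX events) can be illustrated with JDK classes alone. The following is a minimal sketch, not Tika's implementation: the class `ExternalCommandSketch`, its methods `extract` and `extractText`, the use of `cat` as the command, the UTF-8 decoding of the command output, and the single `p` element are all assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Apache Tika.
public class ExternalCommandSketch {

    static final String XHTML = "http://www.w3.org/1999/xhtml";

    /** Pipes the document through the command and reports its stdout as XHTML SAX events. */
    static void extract(InputStream document, ContentHandler handler, String... command)
            throws IOException, SAXException {
        Process process = new ProcessBuilder(command)
                .redirectError(ProcessBuilder.Redirect.DISCARD) // stderr is ignored in this sketch
                .start();

        // Feed stdin on a separate thread so a full pipe buffer cannot stall the reader below.
        Thread feeder = new Thread(() -> {
            try (OutputStream stdin = process.getOutputStream()) {
                document.transferTo(stdin);
            } catch (IOException e) {
                // The command may exit before consuming all input; ignored in this sketch.
            }
        });
        feeder.start();

        AttributesImpl noAttributes = new AttributesImpl();
        handler.startDocument();
        handler.startElement(XHTML, "html", "html", noAttributes);
        handler.startElement(XHTML, "body", "body", noAttributes);
        handler.startElement(XHTML, "p", "p", noAttributes);
        // Assumption: the command writes UTF-8 text to stdout.
        try (Reader stdout = new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            for (int n = stdout.read(buffer); n != -1; n = stdout.read(buffer)) {
                handler.characters(buffer, 0, n);
            }
        }
        handler.endElement(XHTML, "p", "p");
        handler.endElement(XHTML, "body", "body");
        handler.endElement(XHTML, "html", "html");
        handler.endDocument();

        try {
            feeder.join();
            process.waitFor();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Convenience wrapper that collects only the character data emitted by extract(). */
    static String extractText(String input, String... command) throws IOException, SAXException {
        StringBuilder text = new StringBuilder();
        ContentHandler collector = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        };
        extract(new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)), collector, command);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Assumes a Unix-like system where `cat` is on the PATH.
        System.out.print(extractText("hello external parser\n", "cat"));
    }
}
```

In this sketch the command is passed in by the caller; how the real ExternalParser is configured with its command is not shown on this page.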
parse
public void parse(java.io.InputStream stream,
org.xml.sax.ContentHandler handler,
Metadata metadata)
throws java.io.IOException,
org.xml.sax.SAXException,
TikaException
- Deprecated. This method will be removed in Apache Tika 1.0.
- Description copied from interface:
Parser
- The parse() method from Tika 0.4 and earlier. Please use the
parse(InputStream, ContentHandler, Metadata, ParseContext) method
instead in new code. Calls to this backwards compatibility method
are forwarded to the new parse() method with an empty parse context.
- Specified by:
parse
in interface Parser
- Throws:
java.io.IOException
org.xml.sax.SAXException
TikaException
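The forwarding described for the deprecated overload can be sketched as a plain delegation. The example below is a hedged illustration using hypothetical stand-in types (`SketchParser`, `SketchContext`) rather than the Tika classes, so that it runs without Tika on the classpath; it only shows the shape of the pattern, in which the old overload creates an empty context and calls the new one.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration class; the nested types are stand-ins, not Tika classes.
public class ForwardingSketch {

    /** Stand-in for a parse context; a new instance holds no entries. */
    static class SketchContext {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        boolean isEmpty() {
            return entries.isEmpty();
        }
    }

    /** Stand-in parser with a new four-argument style overload and a deprecated legacy one. */
    static class SketchParser {
        SketchContext lastContext; // recorded only so the forwarding can be observed

        void parse(String document, StringBuilder output, SketchContext context) {
            lastContext = context;
            output.append(document);
        }

        /** Backwards compatibility overload: forwards with an empty context. */
        @Deprecated
        void parse(String document, StringBuilder output) {
            parse(document, output, new SketchContext());
        }
    }

    public static void main(String[] args) {
        SketchParser parser = new SketchParser();
        StringBuilder output = new StringBuilder();
        parser.parse("text", output); // legacy call path
        System.out.println(output + " / empty context: " + parser.lastContext.isEmpty());
    }
}
```

New code would call the context-taking overload directly and pass its own context object instead of relying on the empty default.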
Copyright © 2010 The Apache Software Foundation. All Rights Reserved.