org.apache.tika.extractor
Class ParsingEmbeddedDocumentExtractor
java.lang.Object
org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- All Implemented Interfaces:
- EmbeddedDocumentExtractor
public class ParsingEmbeddedDocumentExtractor
- extends Object
- implements EmbeddedDocumentExtractor
Helper class for parsers of package archives or other compound document
formats that support embedded or attached component documents.
- Since:
- Apache Tika 0.8
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
ParsingEmbeddedDocumentExtractor
public ParsingEmbeddedDocumentExtractor(ParseContext context)
shouldParseEmbedded
public boolean shouldParseEmbedded(Metadata metadata)
- Specified by:
shouldParseEmbedded
in interface EmbeddedDocumentExtractor
parseEmbedded
public void parseEmbedded(InputStream stream,
ContentHandler handler,
Metadata metadata,
boolean outputHtml)
throws SAXException,
IOException
- Description copied from interface:
EmbeddedDocumentExtractor
- Processes the supplied embedded resource, calling the delegating
parser with the appropriate details.
- Specified by:
parseEmbedded
in interface EmbeddedDocumentExtractor
- Parameters:
stream - The embedded resource
handler - The handler to use
metadata - The metadata for the embedded resource
outputHtml - Should we output HTML for this resource, or has the parser already done so?
- Throws:
SAXException
IOException
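The two methods above suggest a check-then-parse calling pattern: a container parser asks shouldParseEmbedded for each entry and calls parseEmbedded only when the answer is true. The sketch below illustrates that pattern using JDK types only. ExtractorSketch, EmbeddedWalkSketch, the Map<String, String> used in place of Metadata, and the "resourceName" key are all illustrative stand-ins, not Tika classes; the real class is constructed from a ParseContext and takes a Metadata object.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in mirroring the two documented method shapes.
// Map<String, String> replaces Tika's Metadata so the sketch needs only the JDK.
interface ExtractorSketch {
    boolean shouldParseEmbedded(Map<String, String> metadata);

    void parseEmbedded(InputStream stream, ContentHandler handler,
                       Map<String, String> metadata, boolean outputHtml)
            throws SAXException, IOException;
}

final class EmbeddedWalkSketch {

    // Visits each (name -> bytes) entry the way a container parser might:
    // build per-entry metadata, ask the extractor, then hand over the stream.
    static List<String> walk(Map<String, byte[]> entries, ExtractorSketch extractor,
                             ContentHandler handler) throws SAXException, IOException {
        List<String> handled = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("resourceName", entry.getKey()); // illustrative key
            if (!extractor.shouldParseEmbedded(metadata)) {
                continue; // the extractor declined this entry
            }
            try (InputStream stream = new ByteArrayInputStream(entry.getValue())) {
                // true: this caller has not written any HTML for the entry itself
                extractor.parseEmbedded(stream, handler, metadata, true);
            }
            handled.add(entry.getKey());
        }
        return handled;
    }

    // Runs the walk with a toy extractor that accepts only ".txt" entries
    // and forwards their bytes to the handler as character data.
    static String demo() {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("a.txt", "hello".getBytes(StandardCharsets.UTF_8));
        entries.put("skip.bin", new byte[] {0, 1});

        ExtractorSketch extractor = new ExtractorSketch() {
            @Override
            public boolean shouldParseEmbedded(Map<String, String> metadata) {
                return metadata.get("resourceName").endsWith(".txt");
            }

            @Override
            public void parseEmbedded(InputStream stream, ContentHandler handler,
                                      Map<String, String> metadata, boolean outputHtml)
                    throws SAXException, IOException {
                String text = new String(stream.readAllBytes(), StandardCharsets.UTF_8);
                handler.characters(text.toCharArray(), 0, text.length());
            }
        };

        StringBuilder collected = new StringBuilder();
        ContentHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                collected.append(ch, start, length);
            }
        };

        try {
            List<String> handled = walk(entries, extractor, handler);
            return handled + " " + collected;
        } catch (SAXException | IOException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[a.txt] hello"
    }
}
```

With Tika on the classpath, the same loop would presumably pass a Metadata instance and an extractor built with new ParsingEmbeddedDocumentExtractor(context); the toy extractor here stands in for the delegating parser and does not reproduce its behavior.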
Copyright © 2007-2011 The Apache Software Foundation. All Rights Reserved.