Package org.apache.tika.extractor
Class ParsingEmbeddedDocumentExtractor
java.lang.Object
    org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
All Implemented Interfaces:
    EmbeddedDocumentExtractor
public class ParsingEmbeddedDocumentExtractor extends Object implements EmbeddedDocumentExtractor
Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
Since:
    Apache Tika 0.8
Constructor Summary
Constructor                                               Description
ParsingEmbeddedDocumentExtractor(ParseContext context)
Method Summary
Modifier and Type   Method and Description
void                parseEmbedded(InputStream stream, ContentHandler handler, Metadata metadata, boolean outputHtml)
                    Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
boolean             shouldParseEmbedded(Metadata metadata)
Constructor Detail
ParsingEmbeddedDocumentExtractor
public ParsingEmbeddedDocumentExtractor(ParseContext context)
Method Detail
shouldParseEmbedded
public boolean shouldParseEmbedded(Metadata metadata)
Specified by:
    shouldParseEmbedded in interface EmbeddedDocumentExtractor
parseEmbedded
public void parseEmbedded(InputStream stream, ContentHandler handler, Metadata metadata, boolean outputHtml) throws SAXException, IOException
Description copied from interface: EmbeddedDocumentExtractor
Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
Specified by:
    parseEmbedded in interface EmbeddedDocumentExtractor
Parameters:
    stream - The embedded resource
    handler - The handler to use
    metadata - The metadata for the embedded resource
    outputHtml - Should we output HTML for this resource, or has the parser already done so?
Throws:
    SAXException
    IOException
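
The two methods are typically used together: a container parser asks shouldParseEmbedded whether a component should be processed, and only then hands its stream to parseEmbedded. The sketch below illustrates that call pattern using JDK types only. Everything in it is a hypothetical stand-in, not Tika code: SimpleExtractor mirrors the shape of EmbeddedDocumentExtractor, a Map replaces Metadata, a List of strings replaces ContentHandler, and the ".bin" filter and "<h1>" heading are invented purely for illustration. In real code the extractor would be a ParsingEmbeddedDocumentExtractor built from a ParseContext, as in the constructor above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class EmbeddedExtractionSketch {

    // Hypothetical stand-in shaped like EmbeddedDocumentExtractor.
    // Map replaces Metadata; List<String> replaces ContentHandler.
    interface SimpleExtractor {
        boolean shouldParseEmbedded(Map<String, String> metadata);

        void parseEmbedded(InputStream stream, List<String> handler,
                           Map<String, String> metadata, boolean outputHtml)
                throws IOException;
    }

    // Illustrative implementation: skips ".bin" entries and treats
    // everything else as UTF-8 text. The filter rule is an assumption.
    static class TextExtractor implements SimpleExtractor {
        @Override
        public boolean shouldParseEmbedded(Map<String, String> metadata) {
            return !metadata.getOrDefault("name", "").endsWith(".bin");
        }

        @Override
        public void parseEmbedded(InputStream stream, List<String> handler,
                                  Map<String, String> metadata, boolean outputHtml)
                throws IOException {
            if (outputHtml) {
                // Assumed meaning of the flag: the extractor emits the
                // wrapper markup because the caller has not done so.
                handler.add("<h1>" + metadata.get("name") + "</h1>");
            }
            handler.add(new String(stream.readAllBytes(), StandardCharsets.UTF_8));
        }
    }

    // Container-side loop: check first, then parse each accepted entry.
    static List<String> extractAll(Map<String, byte[]> entries, SimpleExtractor extractor) {
        List<String> handler = new ArrayList<>();
        for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("name", entry.getKey());
            if (!extractor.shouldParseEmbedded(metadata)) {
                continue; // entry rejected, never opened
            }
            try (InputStream stream = new ByteArrayInputStream(entry.getValue())) {
                extractor.parseEmbedded(stream, handler, metadata, true);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return handler;
    }

    public static void main(String[] args) {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("notes.txt", "hello".getBytes(StandardCharsets.UTF_8));
        entries.put("blob.bin", new byte[] {0, 1});
        System.out.println(extractAll(entries, new TextExtractor()));
    }
}
```

Keeping the check separate from the parse lets a caller skip unwanted components before any stream is opened, which is the main reason to call shouldParseEmbedded first.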