Package org.apache.tika.extractor
Class ParsingEmbeddedDocumentExtractor
java.lang.Object
org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- All Implemented Interfaces:
EmbeddedDocumentExtractor
- Direct Known Subclasses:
RUnpackExtractor
Helper class for parsers of package archives or other compound document
formats that support embedded or attached component documents.
- Since:
Apache Tika 0.8
Field Summary
Constructor Summary
Method Summary
Modifier and Type | Method | Description
boolean | isWriteFileNameToContent() |
void | parseEmbedded(InputStream stream, ContentHandler handler, Metadata metadata, boolean outputHtml) | Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
void | setWriteFileNameToContent(boolean writeFileNameToContent) |
boolean | shouldParseEmbedded(Metadata metadata) |
Field Details
context
Constructor Details
ParsingEmbeddedDocumentExtractor
Method Details
shouldParseEmbedded
- Specified by:
shouldParseEmbedded in interface EmbeddedDocumentExtractor
parseEmbedded
public void parseEmbedded(InputStream stream, ContentHandler handler, Metadata metadata, boolean outputHtml) throws SAXException, IOException
Description copied from interface: EmbeddedDocumentExtractor
Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
- Specified by:
parseEmbedded in interface EmbeddedDocumentExtractor
- Parameters:
stream - The embedded resource
handler - The handler to use
metadata - The metadata for the embedded resource
outputHtml - Should we output HTML for this resource, or has the parser already done so?
- Throws:
SAXException
IOException
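The call pattern behind shouldParseEmbedded and parseEmbedded can be illustrated with a dependency-free sketch. Nothing below is a Tika class: Extractor, DelegateParser, ParsingExtractor, the "resourceName" key, and the selector predicate are hypothetical stand-ins (a Map replaces Metadata, a StringBuilder replaces ContentHandler). The sketch also assumes, purely for illustration, that the writeFileNameToContent flag decides whether the resource name is written as a heading before the delegate runs, and that the flag defaults to true; neither point is stated on this page.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.function.Predicate;

// Dependency-free sketch: every type here is a hypothetical stand-in, not a Tika class.
final class EmbeddedExtractorSketch {

    /** Stand-in for the two-method extractor contract. */
    interface Extractor {
        boolean shouldParseEmbedded(Map<String, String> metadata);

        void parseEmbedded(InputStream stream, StringBuilder handler,
                           Map<String, String> metadata, boolean outputHtml) throws IOException;
    }

    /** Stand-in for the delegating parser: writes the parsed content to the handler. */
    interface DelegateParser {
        void parse(InputStream stream, StringBuilder handler) throws IOException;
    }

    static final class ParsingExtractor implements Extractor {
        private final DelegateParser delegate;
        private final Predicate<Map<String, String>> selector;
        private boolean writeFileNameToContent = true; // assumed default

        ParsingExtractor(DelegateParser delegate, Predicate<Map<String, String>> selector) {
            this.delegate = delegate;
            this.selector = selector;
        }

        void setWriteFileNameToContent(boolean value) { this.writeFileNameToContent = value; }

        boolean isWriteFileNameToContent() { return writeFileNameToContent; }

        @Override
        public boolean shouldParseEmbedded(Map<String, String> metadata) {
            return selector.test(metadata);
        }

        @Override
        public void parseEmbedded(InputStream stream, StringBuilder handler,
                                  Map<String, String> metadata, boolean outputHtml) throws IOException {
            String name = metadata.get("resourceName");
            // Assumed behaviour: emit the name as a heading only when HTML output is requested.
            if (outputHtml && writeFileNameToContent && name != null && !name.isEmpty()) {
                handler.append("<h1>").append(name).append("</h1>");
            }
            delegate.parse(stream, handler); // hand the resource to the delegate
        }
    }

    /** Stand-in delegate and selector: plain-text parsing, skipping names ending in ".bin". */
    static ParsingExtractor demoExtractor() {
        DelegateParser plainText =
                (in, out) -> out.append(new String(in.readAllBytes(), StandardCharsets.UTF_8));
        return new ParsingExtractor(plainText, md -> !md.get("resourceName").endsWith(".bin"));
    }

    /** What a container parser would do per entry: ask first, then parse. */
    static String extract(Extractor extractor, String name, String body) {
        Map<String, String> metadata = Map.of("resourceName", name);
        StringBuilder out = new StringBuilder();
        if (extractor.shouldParseEmbedded(metadata)) {
            try (InputStream in = new ByteArrayInputStream(body.getBytes(StandardCharsets.UTF_8))) {
                extractor.parseEmbedded(in, out, metadata, true);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        ParsingExtractor extractor = demoExtractor();
        System.out.println(extract(extractor, "note.txt", "hello")); // <h1>note.txt</h1>hello
        extractor.setWriteFileNameToContent(false);
        System.out.println(extract(extractor, "note.txt", "hello")); // hello
        System.out.println(extract(extractor, "blob.bin", "ignored").isEmpty()); // true
    }
}
```

In this sketch the caller always consults shouldParseEmbedded before handing over the stream, so the selection policy stays in the extractor rather than in each container parser.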
getDelegatingParser
setWriteFileNameToContent
public void setWriteFileNameToContent(boolean writeFileNameToContent)
isWriteFileNameToContent
public boolean isWriteFileNameToContent()