Package org.apache.tika.extractor
Class ParsingEmbeddedDocumentExtractor
java.lang.Object
org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- All Implemented Interfaces:
EmbeddedDocumentExtractor
- Direct Known Subclasses:
RUnpackExtractor
Helper class for parsers of package archives or other compound document
formats that support embedded or attached component documents.
- Since:
- Apache Tika 0.8
-
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type, Method, Description
- boolean isWriteFileNameToContent()
- void parseEmbedded(InputStream stream, ContentHandler handler, Metadata metadata, boolean outputHtml): Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
- void setWriteFileNameToContent(boolean writeFileNameToContent)
- boolean shouldParseEmbedded(Metadata metadata)
-
Field Details
-
context
-
-
Constructor Details
-
ParsingEmbeddedDocumentExtractor
-
-
Method Details
-
shouldParseEmbedded
- Specified by:
shouldParseEmbedded in interface EmbeddedDocumentExtractor
-
parseEmbedded
public void parseEmbedded(InputStream stream, ContentHandler handler, Metadata metadata, boolean outputHtml) throws SAXException, IOException
Description copied from interface: EmbeddedDocumentExtractor
Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
- Specified by:
parseEmbedded in interface EmbeddedDocumentExtractor
- Parameters:
stream - The embedded resource
handler - The handler to use
metadata - The metadata for the embedded resource
outputHtml - Should we output HTML for this resource, or has the parser already done so?
- Throws:
SAXException
IOException
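The following is a minimal, dependency-free sketch of the check-then-parse pattern that the two interface methods suggest: a container parser asks shouldParseEmbedded for each embedded resource and calls parseEmbedded only when the answer is true. It does not use the Tika classes. SimpleEmbeddedExtractor, Entry, CopyingExtractor, processContainer, the "policy" metadata key, and the <div> wrapping are all hypothetical stand-ins chosen for illustration; a Map replaces Metadata, a byte array replaces InputStream, and a StringBuilder replaces ContentHandler.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class EmbeddedExtractorSketch {

    /** Hypothetical, simplified stand-in for the extractor contract (not the Tika interface). */
    interface SimpleEmbeddedExtractor {
        boolean shouldParseEmbedded(Map<String, String> metadata);

        void parseEmbedded(byte[] stream, StringBuilder handler,
                           Map<String, String> metadata, boolean outputHtml);
    }

    /** One embedded resource: its metadata plus its raw bytes. */
    static final class Entry {
        final Map<String, String> metadata = new HashMap<>();
        final byte[] data;

        Entry(String name, String text, boolean skip) {
            metadata.put("resourceName", name);
            if (skip) {
                metadata.put("policy", "skip"); // illustrative key, an assumption of this sketch
            }
            data = text.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Toy extractor: "parsing" only copies the bytes into the handler as text. */
    static final class CopyingExtractor implements SimpleEmbeddedExtractor {
        @Override
        public boolean shouldParseEmbedded(Map<String, String> metadata) {
            return !"skip".equals(metadata.get("policy"));
        }

        @Override
        public void parseEmbedded(byte[] stream, StringBuilder handler,
                                  Map<String, String> metadata, boolean outputHtml) {
            // Wrap the content only when the caller has not already emitted markup.
            if (outputHtml) {
                handler.append("<div>");
            }
            handler.append(new String(stream, StandardCharsets.UTF_8));
            if (outputHtml) {
                handler.append("</div>");
            }
        }
    }

    /** Container-side loop: check each entry first, then hand it to the extractor. */
    static String processContainer(SimpleEmbeddedExtractor extractor,
                                   List<Entry> entries, boolean outputHtml) {
        StringBuilder handler = new StringBuilder();
        for (Entry entry : entries) {
            if (extractor.shouldParseEmbedded(entry.metadata)) {
                extractor.parseEmbedded(entry.data, handler, entry.metadata, outputHtml);
            }
        }
        return handler.toString();
    }

    /** Builds two entries, one of which is marked to be skipped. */
    static String demo() {
        List<Entry> entries = new ArrayList<>();
        entries.add(new Entry("a.txt", "alpha", false));
        entries.add(new Entry("b.bin", "beta", true));
        return processContainer(new CopyingExtractor(), entries, true);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "<div>alpha</div>"
    }
}
```

In this sketch the skipped entry never reaches parseEmbedded, so only the first entry contributes to the output. With the real classes, the metadata argument would be a Metadata instance and the handler a SAX ContentHandler, as in the signature above.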
-
getDelegatingParser
-
setWriteFileNameToContent
public void setWriteFileNameToContent(boolean writeFileNameToContent)
-
isWriteFileNameToContent
public boolean isWriteFileNameToContent()
-