org.apache.tika.extractor
Class EmbeddedDocumentExtractor

java.lang.Object
  extended by org.apache.tika.extractor.EmbeddedDocumentExtractor

public class EmbeddedDocumentExtractor
extends java.lang.Object

Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.

Since:
Apache Tika 0.8

Constructor Summary
EmbeddedDocumentExtractor(ParseContext context)
           
 
Method Summary
 void parseEmbedded(java.io.InputStream stream, org.xml.sax.ContentHandler handler, Metadata metadata, boolean outputHtml)
          Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
 boolean shouldParseEmbedded(Metadata metadata)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

EmbeddedDocumentExtractor

public EmbeddedDocumentExtractor(ParseContext context)

Method Detail

shouldParseEmbedded

public boolean shouldParseEmbedded(Metadata metadata)
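
The listing gives no description for this method. One plausible reading, stated here as an assumption rather than documented behaviour, is that a parser calls it before reading an embedded entry so that a caller-supplied filter can skip unwanted entries. The sketch below illustrates that gating pattern using only JDK types. `EntryGate`, `gateFor`, `select` and the `"name"` key are hypothetical stand-ins and are not part of Tika; a plain `Map` replaces the `Metadata` type.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class ShouldParseSketch {

    // Hypothetical stand-in for shouldParseEmbedded(Metadata);
    // a Map of metadata fields replaces Tika's Metadata type.
    interface EntryGate {
        boolean shouldParseEmbedded(Map<String, String> metadata);
    }

    // Assumed policy: accept an entry unless the name filter rejects it.
    // Entries without a name are accepted.
    static EntryGate gateFor(Predicate<String> nameFilter) {
        return metadata -> {
            String name = metadata.get("name");
            return name == null || nameFilter.test(name);
        };
    }

    // Returns the names of the entries that the gate lets through.
    static List<String> select(EntryGate gate, List<String> names) {
        List<String> accepted = new ArrayList<>();
        for (String name : names) {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("name", name);
            if (gate.shouldParseEmbedded(metadata)) {
                accepted.add(name);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        EntryGate gate = gateFor(name -> !name.endsWith(".png"));
        // prints [report.doc, notes.txt]
        System.out.println(select(gate, Arrays.asList("report.doc", "logo.png", "notes.txt")));
    }
}
```

Keeping the decision in a separate boolean method lets a container parser skip an entry before it opens or decompresses the entry's stream.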

parseEmbedded

public void parseEmbedded(java.io.InputStream stream,
                          org.xml.sax.ContentHandler handler,
                          Metadata metadata,
                          boolean outputHtml)
                   throws org.xml.sax.SAXException,
                          java.io.IOException
Processes the supplied embedded resource, calling the delegating parser with the appropriate details.

Parameters:
stream - The embedded resource
handler - The SAX content handler that receives the output for the embedded resource
metadata - The metadata for the embedded resource
outputHtml - true if HTML should be output for this resource; false if the calling parser has already done so
Throws:
org.xml.sax.SAXException
java.io.IOException
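
As a rough illustration of the call pattern these parameters suggest, the sketch below shows a container parser walking its entries, asking whether each one should be parsed, and then handing the entry's stream to the helper. It is a JDK-only approximation under stated assumptions. `Extractor`, `TextExtractor`, `TextCollector`, `extractAll` and the `"name"` key are hypothetical stand-ins and not Tika classes. A `Map` replaces `Metadata`. The heading emitted when `outputHtml` is true is only one assumed reading of that flag.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

public class ParseEmbeddedSketch {

    // Hypothetical stand-in mirroring the two documented methods;
    // a Map replaces Tika's Metadata type.
    interface Extractor {
        boolean shouldParseEmbedded(Map<String, String> metadata);

        void parseEmbedded(InputStream stream, ContentHandler handler,
                           Map<String, String> metadata, boolean outputHtml)
                throws SAXException, IOException;
    }

    // Toy implementation: treats every entry as UTF-8 text.
    static class TextExtractor implements Extractor {
        public boolean shouldParseEmbedded(Map<String, String> metadata) {
            return true;
        }

        public void parseEmbedded(InputStream stream, ContentHandler handler,
                                  Map<String, String> metadata, boolean outputHtml)
                throws SAXException, IOException {
            String name = metadata.get("name");
            if (outputHtml && name != null) {
                // Assumed reading of the flag: emit a heading for the entry.
                handler.startElement("", "h1", "h1", new AttributesImpl());
                handler.characters(name.toCharArray(), 0, name.length());
                handler.endElement("", "h1", "h1");
            }
            // Read the whole entry and forward it as character events.
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = stream.read(chunk)) != -1) {
                buffer.write(chunk, 0, n);
            }
            char[] text = new String(buffer.toByteArray(), StandardCharsets.UTF_8).toCharArray();
            handler.characters(text, 0, text.length);
        }
    }

    // Collects character events so the result can be inspected.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // What a container parser might do for each of its entries.
    static String extractAll(Extractor extractor, Map<String, byte[]> entries, boolean outputHtml)
            throws SAXException, IOException {
        TextCollector handler = new TextCollector();
        for (Map.Entry<String, byte[]> entry : entries.entrySet()) {
            Map<String, String> metadata = new LinkedHashMap<>();
            metadata.put("name", entry.getKey());
            if (extractor.shouldParseEmbedded(metadata)) {
                try (InputStream stream = new ByteArrayInputStream(entry.getValue())) {
                    extractor.parseEmbedded(stream, handler, metadata, outputHtml);
                }
            }
        }
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<String, byte[]> entries = new LinkedHashMap<>();
        entries.put("a.txt", "hello".getBytes(StandardCharsets.UTF_8));
        entries.put("b.txt", " world".getBytes(StandardCharsets.UTF_8));
        // prints hello world
        System.out.println(extractAll(new TextExtractor(), entries, false));
    }
}
```

In real use the container parser would construct the helper from its `ParseContext` and pass a Tika `Metadata` object in place of the map. Both checked exceptions are left to propagate, matching the documented `throws` clause.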


Copyright © 2007-2010 The Apache Software Foundation. All Rights Reserved.