org.apache.tika.parser.pkg
Class PackageParser
java.lang.Object
org.apache.tika.parser.DelegatingParser
org.apache.tika.parser.pkg.PackageParser
- All Implemented Interfaces:
- Parser
- Direct Known Subclasses:
- ArParser, CpioParser, TarParser, ZipParser
public abstract class PackageParser
- extends DelegatingParser
Abstract base class for parsers that deal with package formats.
Subclasses can call the
parseEntry(InputStream, XHTMLContentHandler, Metadata)
method to parse the given package entry using the configured
entry parser. The entries are written to the XHTML event stream
as <div class="package-entry"> elements that contain the
(optional) entry name as an <h1> element, followed by the full
structured body content of the parsed entry.
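The shape of one emitted entry can be sketched with plain SAX calls. This is an illustration only, not Tika's implementation: the class `PackageEntrySketch`, its methods `emitEntry` and `render`, and the `<p>` element standing in for the parsed body are all hypothetical, and the sketch omits the XHTML namespace for brevity. It uses only the JDK's SAX and JAXP classes.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical sketch: emits one package entry as SAX events.
public class PackageEntrySketch {

    // Writes <div class="package-entry">, an optional <h1> with the entry
    // name, and a <p> that stands in for the parsed body content.
    static void emitEntry(ContentHandler handler, String name, String body)
            throws SAXException {
        AttributesImpl attrs = new AttributesImpl();
        attrs.addAttribute("", "class", "class", "CDATA", "package-entry");
        handler.startElement("", "div", "div", attrs);

        // The entry name is optional, so the heading is skipped when absent.
        if (name != null && !name.isEmpty()) {
            handler.startElement("", "h1", "h1", new AttributesImpl());
            handler.characters(name.toCharArray(), 0, name.length());
            handler.endElement("", "h1", "h1");
        }

        handler.startElement("", "p", "p", new AttributesImpl());
        handler.characters(body.toCharArray(), 0, body.length());
        handler.endElement("", "p", "p");

        handler.endElement("", "div", "div");
    }

    // Serializes the events to a string so the structure can be inspected.
    public static String render(String name, String body) {
        try {
            SAXTransformerFactory factory =
                    (SAXTransformerFactory) SAXTransformerFactory.newInstance();
            TransformerHandler handler = factory.newTransformerHandler();
            handler.getTransformer()
                    .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            handler.setResult(new StreamResult(out));
            handler.startDocument();
            emitEntry(handler, name, body);
            handler.endDocument();
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(render("docs/readme.txt", "Hello"));
    }
}
```

Calling `render(null, ...)` produces the same `<div>` without the heading, which mirrors the "(optional) entry name" wording above.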
Method Summary
protected void
parseArchive(org.apache.commons.compress.archivers.ArchiveInputStream archive,
org.xml.sax.ContentHandler handler,
Metadata metadata,
ParseContext context)
Parses the given stream as a package of multiple underlying files.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
PackageParser
public PackageParser()
parseArchive
protected void parseArchive(org.apache.commons.compress.archivers.ArchiveInputStream archive,
org.xml.sax.ContentHandler handler,
Metadata metadata,
ParseContext context)
throws java.io.IOException,
org.xml.sax.SAXException
- Parses the given stream as a package of multiple underlying files.
The package entries are parsed using the delegate parser instance.
It is not an error if an entry cannot be parsed; in that case
only the entry name (if given) is emitted.
- Parameters:
archive
- package stream
handler
- content handler
metadata
- package metadata
context
- parse context
- Throws:
java.io.IOException
- if an IO error occurs
org.xml.sax.SAXException
- if a SAX error occurs
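The error-handling contract described for parseArchive (a failure in one entry is tolerated, while an I/O failure on the package itself propagates) can be sketched as follows. This is a sketch under stated assumptions, not the Tika source: it substitutes the JDK's `java.util.zip.ZipInputStream` for Commons Compress's `ArchiveInputStream`, and the class `TolerantArchiveWalk`, the `EntryParser` interface, and the `walk` and `demo` methods are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical sketch of a tolerant walk over package entries.
public class TolerantArchiveWalk {

    /** Hypothetical stand-in for the delegate entry parser. */
    interface EntryParser {
        String parse(String name, byte[] content) throws Exception;
    }

    // Reads the remaining bytes of the current entry.
    static byte[] readEntry(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }

    // Visits every entry; I/O errors on the package propagate to the caller,
    // but a parser failure only downgrades the output to the entry name.
    static List<String> walk(InputStream stream, EntryParser parser)
            throws IOException {
        List<String> output = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(stream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue;
                }
                String name = entry.getName();
                byte[] content = readEntry(zip);
                try {
                    output.add(name + ": " + parser.parse(name, content));
                } catch (Exception e) {
                    // Not an error: emit just the entry name.
                    output.add(name);
                }
            }
        }
        return output;
    }

    // Builds a two-entry zip in memory and walks it with a parser that
    // only accepts ".txt" entries.
    public static List<String> demo() {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
                zip.putNextEntry(new ZipEntry("good.txt"));
                zip.write("hello".getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
                zip.putNextEntry(new ZipEntry("bad.bin"));
                zip.write(new byte[] {0, 1, 2});
                zip.closeEntry();
            }
            EntryParser parser = (name, content) -> {
                if (!name.endsWith(".txt")) {
                    throw new Exception("unsupported entry: " + name);
                }
                return new String(content, StandardCharsets.UTF_8);
            };
            return walk(new ByteArrayInputStream(bytes.toByteArray()), parser);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

In this sketch the unparseable `bad.bin` entry still appears in the output, reduced to its name, while the walk continues with any remaining entries.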
Copyright © 2010 The Apache Software Foundation. All Rights Reserved.