Package org.apache.tika.sax

SAX utilities.


Interface Summary
SafeContentHandler.Output Internal interface that allows both character and ignorable whitespace content to be filtered the same way.

Class Summary
BodyContentHandler Content handler decorator that passes only the content inside the XHTML <body/> element to the underlying handler.
ContentHandlerDecorator Decorator base class for the ContentHandler interface.
ElementMappingContentHandler Content handler decorator that maps element QNames using a Map.
EmbeddedContentHandler Content handler decorator that prevents the EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events from reaching the decorated handler.
EndDocumentShieldingContentHandler A wrapper around a ContentHandler that ignores normal SAX calls to EndDocumentShieldingContentHandler.endDocument() and fires them only later.
LinkContentHandler Content handler that collects links from an XHTML document.
OfflineContentHandler Content handler decorator that always returns an empty stream from the OfflineContentHandler.resolveEntity(String, String) method to prevent potential network or other external resources from being accessed by an XML parser.
SafeContentHandler Content handler decorator that makes sure that the character events (SafeContentHandler.characters(char[], int, int) or SafeContentHandler.ignorableWhitespace(char[], int, int)) passed to the decorated content handler contain only valid XML characters.
SecureContentHandler Content handler decorator that attempts to prevent denial of service attacks against Tika parsers.
TaggedContentHandler A content handler decorator that tags potential exceptions so that the handler that caused the exception can easily be identified.
TeeContentHandler Content handler proxy that forwards the received SAX events to zero or more underlying content handlers.
TextContentHandler Content handler decorator that passes only the TextContentHandler.characters(char[], int, int) and TextContentHandler.ignorableWhitespace(char[], int, int) events (plus the TextContentHandler.startDocument() and TextContentHandler.endDocument() events) to the decorated content handler.
ToHTMLContentHandler SAX event handler that serializes the HTML document to a character stream.
ToTextContentHandler SAX event handler that writes all character content out to a character stream.
ToXMLContentHandler SAX event handler that serializes the XML document to a character stream.
WriteOutContentHandler SAX event handler that writes content up to an optional write limit out to a character stream or other decorated handler.
XHTMLContentHandler Content handler decorator that simplifies the task of producing XHTML events for Tika content parsers.
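
Most of the classes above follow the same decorator pattern: a handler receives SAX events and forwards some or all of them to an underlying handler. The sketch below illustrates that pattern using only the JDK's org.xml.sax API. DecoratorSketch, CharactersOnlyHandler, TextSink and extractText are hypothetical names introduced for illustration; they are not Tika classes and do not reproduce Tika's implementation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class DecoratorSketch {

    /** Hypothetical decorator: forwards character events only, drops everything else. */
    static class CharactersOnlyHandler extends DefaultHandler {
        private final ContentHandler delegate;

        CharactersOnlyHandler(ContentHandler delegate) {
            this.delegate = delegate;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            delegate.characters(ch, start, length);
        }
    }

    /** Hypothetical sink: appends all character content it receives to a buffer. */
    static class TextSink extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the given XML and returns the character content seen by the sink. */
    static String extractText(String xml) throws Exception {
        TextSink sink = new TextSink();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // The parser talks to the decorator; the decorator talks to the sink.
        reader.setContentHandler(new CharactersOnlyHandler(sink));
        reader.parse(new InputSource(new StringReader(xml)));
        return sink.text.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(extractText("<html><body><p>Hello, <b>SAX</b>!</p></body></html>"));
    }
}
```

Because each decorator is itself a ContentHandler, decorators can be stacked: a filtering decorator can wrap a forwarding one, which in turn wraps a sink.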

Exception Summary
TaggedSAXException A SAXException wrapper that tags the wrapped exception with a given object reference.
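
The tagging idea behind TaggedContentHandler and TaggedSAXException can be sketched with the JDK alone: wrap the original exception together with an object reference, then compare that reference when the exception is caught. TaggedSketch, TaggedException, isTaggedWith and failingSource are hypothetical names used for illustration only; the actual Tika signatures may differ.

```java
import org.xml.sax.SAXException;

public class TaggedSketch {

    /** Hypothetical wrapper: carries the original exception plus a tag object. */
    static class TaggedException extends SAXException {
        private final transient Object tag;

        TaggedException(SAXException cause, Object tag) {
            super(cause.getMessage(), cause);
            this.tag = tag;
        }

        Object getTag() {
            return tag;
        }
    }

    /** Returns true if the throwable is a TaggedException carrying exactly this tag. */
    static boolean isTaggedWith(Throwable t, Object tag) {
        return t instanceof TaggedException && ((TaggedException) t).getTag() == tag;
    }

    /** Simulates two handlers, one of which fails, and reports which tag was found. */
    static String failingSource() {
        Object handlerA = new Object();
        Object handlerB = new Object();
        try {
            // Assumption for the demo: the failure originates in handler B.
            throw new TaggedException(new SAXException("boom"), handlerB);
        } catch (SAXException e) {
            if (isTaggedWith(e, handlerA)) {
                return "A";
            }
            if (isTaggedWith(e, handlerB)) {
                return "B";
            }
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println(failingSource());
    }
}
```

Comparing tags by reference rather than by equals() keeps two distinct handlers from being confused with each other, even if they happen to compare as equal.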

Copyright © 2007-2011 The Apache Software Foundation. All Rights Reserved.