org.apache.tika.sax
Class ToTextContentHandler

java.lang.Object
  extended by org.xml.sax.helpers.DefaultHandler
      extended by org.apache.tika.sax.ToTextContentHandler
All Implemented Interfaces:
ContentHandler, DTDHandler, EntityResolver, ErrorHandler
Direct Known Subclasses:
ToXMLContentHandler

public class ToTextContentHandler
extends org.xml.sax.helpers.DefaultHandler

SAX event handler that writes all character content out to a character stream. No escaping or other transformation is applied to the character content.

Since:
Apache Tika 0.10
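
The following is a minimal sketch of the behavior described above, written against JDK SAX classes only so that it runs without Tika on the classpath. PlainTextHandler and textOf are hypothetical names introduced for illustration; they are a stand-in and not Tika's source code. With Tika available, a ToTextContentHandler constructed with a Writer would presumably take the place of the stand-in.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class PlainTextSketch {

    /** Hypothetical stand-in for the documented behavior; not Tika's source code. */
    static class PlainTextHandler extends DefaultHandler {
        private final Writer writer;

        PlainTextHandler(Writer writer) {
            this.writer = writer;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            try {
                // Character content is written as-is: no escaping, no markup.
                writer.write(ch, start, length);
            } catch (IOException e) {
                throw new SAXException("Error writing characters", e);
            }
        }
    }

    /** Parses the XML string and returns only its character content. */
    static String textOf(String xml) {
        StringWriter out = new StringWriter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new PlainTextHandler(out));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse sample XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The parser decodes the entity; the handler does not re-escape it.
        System.out.println(textOf("<p>Fish &amp; <b>chips</b></p>"));
    }
}
```

Running main prints "Fish & chips": element tags are dropped because only character events are written, and the ampersand appears unescaped.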

Constructor Summary
ToTextContentHandler()
          Creates a content handler that writes character events to an internal string buffer.
ToTextContentHandler(OutputStream stream)
          Creates a content handler that writes character events to the given output stream using the platform default encoding.
ToTextContentHandler(OutputStream stream, String encoding)
          Creates a content handler that writes character events to the given output stream using the given encoding.
ToTextContentHandler(Writer writer)
          Creates a content handler that writes character events to the given writer.
 
Method Summary
 void characters(char[] ch, int start, int length)
          Writes the given characters to the given character stream.
 void endDocument()
          Flushes the character stream so that no characters are forgotten in internal buffers.
 void ignorableWhitespace(char[] ch, int start, int length)
          Writes the given ignorable characters to the given character stream.
 String toString()
          Returns the contents of the internal string buffer where all the received characters have been collected.
 
Methods inherited from class org.xml.sax.helpers.DefaultHandler
endElement, endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startElement, startPrefixMapping, unparsedEntityDecl, warning
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

ToTextContentHandler

public ToTextContentHandler(Writer writer)
Creates a content handler that writes character events to the given writer.

Parameters:
writer - the writer that receives the character content

ToTextContentHandler

public ToTextContentHandler(OutputStream stream)
Creates a content handler that writes character events to the given output stream using the platform default encoding.

Parameters:
stream - the output stream that receives the character content

ToTextContentHandler

public ToTextContentHandler(OutputStream stream,
                            String encoding)
                     throws UnsupportedEncodingException
Creates a content handler that writes character events to the given output stream using the given encoding.

Parameters:
stream - the output stream that receives the character content
encoding - name of the output character encoding
Throws:
UnsupportedEncodingException - if the encoding is unsupported
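
To illustrate what the encoding argument controls, here is a small sketch that uses only java.io. It assumes, without confirmation from this page, that the constructor converts characters to bytes in the same way a java.io.OutputStreamWriter created with an encoding name does. EncodingSketch and encode are hypothetical names.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.io.Writer;

class EncodingSketch {

    /** Returns the bytes produced when text is written with the named encoding. */
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream stream = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(stream, encoding)) {
            writer.write(text);
        } catch (UnsupportedEncodingException e) {
            // Same exception type that the two-argument constructor declares.
            throw new IllegalArgumentException("Unsupported encoding: " + encoding, e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The writer is closed (and therefore flushed) before the bytes are read.
        return stream.toByteArray();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // four characters, one of them non-ASCII
        System.out.println(encode(text, "UTF-8").length);
        System.out.println(encode(text, "ISO-8859-1").length);
    }
}
```

The same four characters occupy 5 bytes in UTF-8 and 4 bytes in ISO-8859-1, which is why naming the encoding explicitly is usually safer than relying on the platform default.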

ToTextContentHandler

public ToTextContentHandler()
Creates a content handler that writes character events to an internal string buffer. Use the toString() method to access the collected character content.

Method Detail

characters

public void characters(char[] ch,
                       int start,
                       int length)
                throws SAXException
Writes the given characters to the given character stream.

Specified by:
characters in interface ContentHandler
Overrides:
characters in class org.xml.sax.helpers.DefaultHandler
Throws:
SAXException - if the characters could not be written

ignorableWhitespace

public void ignorableWhitespace(char[] ch,
                                int start,
                                int length)
                         throws SAXException
Writes the given ignorable characters to the given character stream. The default implementation simply forwards the call to the characters(char[], int, int) method.

Specified by:
ignorableWhitespace in interface ContentHandler
Overrides:
ignorableWhitespace in class org.xml.sax.helpers.DefaultHandler
Throws:
SAXException - if the characters could not be written
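
The forwarding described for ignorableWhitespace can be sketched as follows. CollectingHandler is a hypothetical stand-in that collects text in a StringBuilder; it is not Tika's implementation, and the events are fired by hand rather than by a parser.

```java
import org.xml.sax.helpers.DefaultHandler;

class WhitespaceSketch {

    /** Hypothetical stand-in handler; it collects text in a StringBuilder. */
    static class CollectingHandler extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void ignorableWhitespace(char[] ch, int start, int length) {
            // Forward to characters() so the whitespace is kept in the output.
            characters(ch, start, length);
        }

        @Override
        public String toString() {
            return text.toString();
        }
    }

    /** Fires three events by hand and returns the collected text. */
    static String demo() {
        CollectingHandler handler = new CollectingHandler();
        char[] word = "a".toCharArray();
        char[] gap = "\n  ".toCharArray();
        handler.characters(word, 0, word.length);
        handler.ignorableWhitespace(gap, 0, gap.length);
        handler.characters(word, 0, word.length);
        return handler.toString();
    }

    public static void main(String[] args) {
        // Shows that the whitespace between the two words survived.
        System.out.println(demo().equals("a\n  a"));
    }
}
```

Because of the forwarding, a subclass that overrides only characters in a design like this also sees ignorable whitespace.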

endDocument

public void endDocument()
                 throws SAXException
Flushes the character stream so that no characters are forgotten in internal buffers.

Specified by:
endDocument in interface ContentHandler
Overrides:
endDocument in class org.xml.sax.helpers.DefaultHandler
Throws:
SAXException - if the stream cannot be flushed
See Also:
TIKA-179
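
The reason for flushing in endDocument can be seen with a buffering writer. The sketch below uses a hypothetical stand-in, FlushingHandler, and a java.io.BufferedWriter over a StringWriter; it illustrates the idea and is not Tika's code.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class FlushSketch {

    /** Hypothetical stand-in that writes characters and flushes at document end. */
    static class FlushingHandler extends DefaultHandler {
        private final Writer writer;

        FlushingHandler(Writer writer) {
            this.writer = writer;
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            try {
                writer.write(ch, start, length);
            } catch (IOException e) {
                throw new SAXException("Error writing characters", e);
            }
        }

        @Override
        public void endDocument() throws SAXException {
            try {
                // Push anything still held in the writer's buffer to the sink.
                writer.flush();
            } catch (IOException e) {
                throw new SAXException("Error flushing character output", e);
            }
        }
    }

    /** Returns the sink contents before and after endDocument() is called. */
    static String[] beforeAndAfter() {
        StringWriter sink = new StringWriter();
        FlushingHandler handler = new FlushingHandler(new BufferedWriter(sink));
        char[] text = "hello".toCharArray();
        try {
            handler.characters(text, 0, text.length);
            String before = sink.toString();
            handler.endDocument();
            return new String[] { before, sink.toString() };
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = beforeAndAfter();
        System.out.println("before: [" + result[0] + "] after: [" + result[1] + "]");
    }
}
```

Before endDocument the sink is still empty because the five characters sit in the BufferedWriter's buffer; after the flush the sink contains "hello".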

toString

public String toString()
Returns the contents of the internal string buffer where all the received characters have been collected. This only works when the object was created with the no-argument constructor or by passing a StringWriter to the ToTextContentHandler(Writer) constructor.

Overrides:
toString in class Object


Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.