Package org.apache.tika.sax
Class ToTextContentHandler
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.sax.ToTextContentHandler
- All Implemented Interfaces:
- ContentHandler, DTDHandler, EntityResolver, ErrorHandler
- Direct Known Subclasses:
- ToXMLContentHandler
SAX event handler that writes all character content out to a character
 stream. No escaping or other transformations are made on the character
 content.
 
As of Tika 1.20, this handler ignores content within <script> and <style> tags.
- Since:
- Apache Tika 0.10
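The behaviour described above can be approximated with a JDK-only sketch. TextCollector below is a hypothetical stand-in written for illustration, not the Tika class: it forwards character events to a Writer unchanged and, mirroring the Tika 1.20 note, skips content inside script and style elements. How Tika matches those elements internally is not stated here, so the case-insensitive comparison on the qualified name is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for illustration; not the Tika implementation.
final class TextCollector extends DefaultHandler {
    private final Writer writer;
    private int skipDepth = 0; // greater than 0 while inside <script> or <style>

    TextCollector(Writer writer) {
        this.writer = writer;
    }

    // Assumption: elements are matched by qualified name, ignoring case.
    private static boolean isSkipped(String qName) {
        String name = qName.toLowerCase(Locale.ROOT);
        return name.equals("script") || name.equals("style");
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (isSkipped(qName)) {
            skipDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (isSkipped(qName) && skipDepth > 0) {
            skipDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth > 0) {
            return; // content of script/style is ignored
        }
        try {
            writer.write(ch, start, length); // no escaping, no transformation
        } catch (IOException e) {
            throw new SAXException("Error writing character content", e);
        }
    }

    // Parses an XML string with the JDK SAX parser and returns the collected text.
    static String extract(String xml) {
        StringWriter out = new StringWriter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new TextCollector(out));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }
}
```

Because nothing is escaped on output, an entity such as &amp;amp; in the input is written as a plain & character.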
Constructor Summary
- ToTextContentHandler(): Creates a content handler that writes character events to an internal string buffer.
- ToTextContentHandler(OutputStream stream, String encoding): Creates a content handler that writes character events to the given output stream using the given encoding.
- ToTextContentHandler(Writer writer): Creates a content handler that writes character events to the given writer.

Method Summary
- void characters(char[] ch, int start, int length): Writes the given characters to the given character stream.
- void endDocument(): Flushes the character stream so that no characters are forgotten in internal buffers.
- void endElement(String uri, String localName, String qName)
- void ignorableWhitespace(char[] ch, int start, int length): Writes the given ignorable characters to the given character stream.
- void startElement(String uri, String localName, String qName, Attributes atts)
- String toString(): Returns the contents of the internal string buffer where all the received characters have been collected.

Methods inherited from class org.xml.sax.helpers.DefaultHandler
- endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, unparsedEntityDecl, warning
Constructor Details
ToTextContentHandler
public ToTextContentHandler(Writer writer)
Creates a content handler that writes character events to the given writer.
- Parameters:
- writer - writer
 
ToTextContentHandler
public ToTextContentHandler(OutputStream stream, String encoding) throws UnsupportedEncodingException
Creates a content handler that writes character events to the given output stream using the given encoding.
- Parameters:
- stream - output stream
- encoding - output encoding
- Throws:
- UnsupportedEncodingException - if the encoding is unsupported
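One way to picture the stream-and-encoding constructor is as a thin wrapper that turns the byte stream into a Writer for the named encoding. Whether Tika does exactly this internally is an assumption. The sketch below uses only JDK classes (EncodingSketch, open and encode are hypothetical names) and shows where an UnsupportedEncodingException would come from.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.io.Writer;

// Hypothetical helper for illustration; not part of Tika.
final class EncodingSketch {
    // Wraps the byte stream in a Writer that encodes characters with the named encoding.
    static Writer open(OutputStream stream, String encoding) throws UnsupportedEncodingException {
        return new OutputStreamWriter(stream, encoding);
    }

    // Writes the text in the given encoding and returns the resulting bytes.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = open(bytes, encoding)) {
            writer.write(text);
        } catch (IOException e) { // UnsupportedEncodingException is an IOException
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

The same character can occupy a different number of bytes depending on the encoding, which is why the encoding is fixed when the handler is constructed rather than per call.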
 
ToTextContentHandler
public ToTextContentHandler()
Creates a content handler that writes character events to an internal string buffer. Use the toString() method to access the collected character content.
 
Method Details
characters
public void characters(char[] ch, int start, int length) throws SAXException
Writes the given characters to the given character stream.
- Specified by:
- characters in interface ContentHandler
- Overrides:
- characters in class DefaultHandler
- Throws:
- SAXException
 
ignorableWhitespace
public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
Writes the given ignorable characters to the given character stream. The default implementation simply forwards the call to the characters(char[], int, int) method.
- Specified by:
- ignorableWhitespace in interface ContentHandler
- Overrides:
- ignorableWhitespace in class DefaultHandler
- Throws:
- SAXException
 
endDocument
public void endDocument() throws SAXException
Flushes the character stream so that no characters are forgotten in internal buffers.
- Specified by:
- endDocument in interface ContentHandler
- Overrides:
- endDocument in class DefaultHandler
- Throws:
- SAXException - if the stream cannot be flushed
- See Also:
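The reason endDocument flushes can be seen with any buffering writer. The sketch below is JDK-only and does not use Tika (FlushSketch and beforeAndAfterFlush are hypothetical names): characters written to a BufferedWriter stay in its internal buffer and only reach the underlying sink once flush() is called.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration; not part of Tika.
final class FlushSketch {
    // Returns what the underlying sink has received before and after flush().
    static String[] beforeAndAfterFlush(String text) {
        StringWriter sink = new StringWriter();
        BufferedWriter buffered = new BufferedWriter(sink);
        try {
            buffered.write(text);
            String before = sink.toString(); // short text is still held in the buffer
            buffered.flush();
            String after = sink.toString();  // now the sink has received the text
            return new String[] {before, after};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

For a short input the first element is empty and the second holds the full text, so a handler that wraps a buffered stream and is never sent endDocument may appear to have lost its trailing output.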
 
startElement
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException
- Specified by:
- startElement in interface ContentHandler
- Overrides:
- startElement in class DefaultHandler
- Throws:
- SAXException
 
endElement
public void endElement(String uri, String localName, String qName) throws SAXException
- Specified by:
- endElement in interface ContentHandler
- Overrides:
- endElement in class DefaultHandler
- Throws:
- SAXException
 
toString
public String toString()
Returns the contents of the internal string buffer where all the received characters have been collected. This only works when the object was constructed with the no-argument constructor or by passing a StringWriter to the ToTextContentHandler(Writer) constructor.
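The restriction above is what one would expect if toString() simply delegates to the underlying writer's own toString(); that delegation is an assumption about the implementation, not something this page states. The JDK-only sketch below (ToStringSketch and its methods are hypothetical names) shows the difference between a StringWriter, whose toString() returns the collected text, and a stream-backed writer, which inherits the default Object.toString().

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Tika.
final class ToStringSketch {
    // Writes the text, flushes, and returns whatever writer.toString() reports.
    static String writeAndDescribe(Writer writer, String text) {
        try {
            writer.write(text);
            writer.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writer.toString();
    }

    // StringWriter.toString() returns the buffered characters.
    static String viaStringWriter(String text) {
        return writeAndDescribe(new StringWriter(), text);
    }

    // OutputStreamWriter does not override toString(), so the text is not returned.
    static String viaStreamWriter(String text) {
        return writeAndDescribe(
                new OutputStreamWriter(new ByteArrayOutputStream(), StandardCharsets.UTF_8), text);
    }
}
```

When output goes to a stream or an arbitrary writer, the collected text has to be read back from that destination instead.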
 