public class ToTextContentHandler extends DefaultHandler
As of Tika 1.20, this handler ignores content within <script> and <style> tags.
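
The skipping behavior noted above can be illustrated with a small JDK-only sketch. `SkippingTextCollector` and its `extract` helper are hypothetical names introduced here for illustration; this is not Tika's source code, only one plausible way a SAX handler could drop text nested inside `script` and `style` elements by tracking nesting depth.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for illustration only; NOT Tika's implementation.
class SkippingTextCollector extends DefaultHandler {
    private final StringWriter buffer = new StringWriter();
    // Greater than zero while we are inside a <script> or <style> element.
    private int skipDepth = 0;

    private static boolean isSkipped(String name) {
        return "script".equalsIgnoreCase(name) || "style".equalsIgnoreCase(name);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Count every element opened inside a skipped region so that the
        // matching end tags bring the depth back to zero.
        if (skipDepth > 0 || isSkipped(qName)) {
            skipDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (skipDepth > 0) {
            skipDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only text outside skipped elements reaches the buffer.
        if (skipDepth == 0) {
            buffer.write(ch, start, length);
        }
    }

    @Override
    public String toString() {
        return buffer.toString();
    }

    // Parses a well-formed XML string and returns the collected text.
    static String extract(String xml) {
        SkippingTextCollector handler = new SkippingTextCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler.toString();
    }
}
```

The sketch matches on `qName` because the default `SAXParserFactory` is not namespace-aware; a handler fed by a namespace-aware parser would more likely compare `localName`.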
| Constructor | Description |
|---|---|
| ToTextContentHandler() | Creates a content handler that writes character events to an internal string buffer. |
| ToTextContentHandler(OutputStream stream) | Deprecated. |
| ToTextContentHandler(OutputStream stream, String encoding) | Creates a content handler that writes character events to the given output stream using the given encoding. |
| ToTextContentHandler(Writer writer) | Creates a content handler that writes character events to the given writer. |
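
To make the relationship between the no-argument and the `Writer` constructors concrete, here is a minimal sketch that uses only JDK classes. `WriterTextCollector` and `collect` are hypothetical names, and the delegation from the no-argument constructor to a `StringWriter` is an assumption made for illustration, not a statement about Tika's internals.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.Writer;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for illustration only; NOT Tika's implementation.
class WriterTextCollector extends DefaultHandler {
    private final Writer writer;

    // Writer-based variant: the caller chooses where the text goes.
    WriterTextCollector(Writer writer) {
        this.writer = writer;
    }

    // No-argument variant: collect into an internal string buffer
    // (assumed here to be a StringWriter).
    WriterTextCollector() {
        this(new StringWriter());
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            writer.write(ch, start, length);
        } catch (IOException e) {
            throw new SAXException("Error writing characters", e);
        }
    }

    // Returns the collected text only if the writer is a StringWriter;
    // other writers return whatever their own toString() produces.
    @Override
    public String toString() {
        return writer.toString();
    }

    // Feeds the given chunks as character events and returns the result.
    static String collect(String... chunks) {
        WriterTextCollector handler = new WriterTextCollector();
        try {
            for (String chunk : chunks) {
                handler.characters(chunk.toCharArray(), 0, chunk.length());
            }
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return handler.toString();
    }
}
```

With a design like this, passing a `FileWriter` or similar to the `Writer` constructor streams text out as it arrives, while the no-argument form keeps everything in memory for later retrieval.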
| Modifier and Type | Method | Description |
|---|---|---|
| void | characters(char[] ch, int start, int length) | Writes the given characters to the given character stream. |
| void | endDocument() | Flushes the character stream so that no characters are forgotten in internal buffers. |
| void | endElement(String uri, String localName, String qName) | |
| void | ignorableWhitespace(char[] ch, int start, int length) | Writes the given ignorable characters to the given character stream. |
| void | startElement(String uri, String localName, String qName, Attributes atts) | |
| String | toString() | Returns the contents of the internal string buffer where all the received characters have been collected. |
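
The interplay between an explicit output encoding and the flush in `endDocument()` can be sketched with JDK classes alone. `StreamTextCollector` and `encode` are hypothetical names used for illustration; wrapping the stream in an `OutputStreamWriter` is an assumption about how such a handler might be built, not a description of Tika's code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UnsupportedEncodingException;
import java.io.Writer;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in for illustration only; NOT Tika's implementation.
class StreamTextCollector extends DefaultHandler {
    private final Writer writer;

    // Encodes character events with the named charset before writing bytes.
    StreamTextCollector(OutputStream stream, String encoding)
            throws UnsupportedEncodingException {
        this.writer = new OutputStreamWriter(stream, encoding);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        try {
            writer.write(ch, start, length);
        } catch (IOException e) {
            throw new SAXException("Error writing characters", e);
        }
    }

    // OutputStreamWriter buffers encoded bytes internally, so the flush at
    // the end of the document is what pushes them to the underlying stream.
    @Override
    public void endDocument() throws SAXException {
        try {
            writer.flush();
        } catch (IOException e) {
            throw new SAXException("Error flushing character output", e);
        }
    }

    // Sends one text chunk through the handler and returns the bytes written.
    static byte[] encode(String text, String encoding) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            StreamTextCollector handler = new StreamTextCollector(out, encoding);
            handler.startDocument();
            handler.characters(text.toCharArray(), 0, text.length());
            handler.endDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toByteArray();
    }
}
```

Naming the encoding explicitly keeps the output independent of the platform default charset, which is a common reason for preferring a two-argument stream constructor over a single-argument one.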
Methods inherited from class DefaultHandler: endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, unparsedEntityDecl, warning

public ToTextContentHandler(Writer writer)
Parameters: writer - writer

public ToTextContentHandler(OutputStream stream)
See also: ToTextContentHandler(Writer)
Parameters: stream - output stream

public ToTextContentHandler(OutputStream stream, String encoding) throws UnsupportedEncodingException
Parameters: stream - output stream; encoding - output encoding
Throws: UnsupportedEncodingException - if the encoding is unsupported

public ToTextContentHandler()
Use the toString() method to access the collected character content.

public void characters(char[] ch, int start, int length) throws SAXException
Specified by: characters in interface ContentHandler
Overrides: characters in class DefaultHandler
Throws: SAXException

public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
Forwards the call to the characters(char[], int, int) method.
Specified by: ignorableWhitespace in interface ContentHandler
Overrides: ignorableWhitespace in class DefaultHandler
Throws: SAXException

public void endDocument() throws SAXException
Specified by: endDocument in interface ContentHandler
Overrides: endDocument in class DefaultHandler
Throws: SAXException - if the stream cannot be flushed

public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException
Specified by: startElement in interface ContentHandler
Overrides: startElement in class DefaultHandler
Throws: SAXException

public void endElement(String uri, String localName, String qName) throws SAXException
Specified by: endElement in interface ContentHandler
Overrides: endElement in class DefaultHandler
Throws: SAXException

public String toString()
Only works when the handler was created with the no-argument constructor or by passing a StringWriter to the other constructor, ToTextContentHandler(Writer).

Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.