org.apache.tika.sax
Class XHTMLContentHandler
java.lang.Object
  org.xml.sax.helpers.DefaultHandler
    org.apache.tika.sax.ContentHandlerDecorator
      org.apache.tika.sax.SafeContentHandler
        org.apache.tika.sax.XHTMLContentHandler
- All Implemented Interfaces:
- ContentHandler, DTDHandler, EntityResolver, ErrorHandler
public class XHTMLContentHandler
- extends SafeContentHandler
Content handler decorator that simplifies the task of producing XHTML
events for Tika content parsers.
Method Summary
void characters(char[] ch, int start, int length)
void characters(String characters)
void element(String name, String value)
void endDocument()
    Ends the XHTML document by writing the closing </body> and </html> tags and clearing the namespace mappings.
void endElement(String name)
void endElement(String uri, String local, String name)
    Ends the given element.
void newline()
void startDocument()
    Starts an XHTML document by setting up the namespace mappings.
void startElement(String name)
void startElement(String name, String attribute, String value)
void startElement(String uri, String local, String name, Attributes attributes)
    Starts the given element.
XHTML
public static final String XHTML
- The XHTML namespace URI
- See Also:
- Constant Field Values
ENDLINE
public static final Set<String> ENDLINE
- The elements that are automatically followed by a newline (NL) character when they end.
XHTMLContentHandler
public XHTMLContentHandler(ContentHandler handler,
Metadata metadata)
startDocument
public void startDocument()
throws SAXException
- Starts an XHTML document by setting up the namespace mappings.
The standard XHTML prefix is generated lazily when the first
element is started.
- Specified by:
startDocument in interface ContentHandler
- Overrides:
startDocument in class ContentHandlerDecorator
- Throws:
SAXException
endDocument
public void endDocument()
throws SAXException
- Ends the XHTML document by writing the following footer and
clearing the namespace mappings:
</body>
</html>
- Specified by:
endDocument in interface ContentHandler
- Overrides:
endDocument in class ContentHandlerDecorator
- Throws:
SAXException
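The lazy prefix and the documented footer can be pictured with a small stand-in handler. This is only a sketch built on the JDK's SAX classes, not Tika's source: the class LazyDocumentSketch, its demo() method and the recording handler are hypothetical, and the opening events are assumed to be just <html> and <body> (the real prefix may contain more).

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in built on JDK SAX classes; not Tika's implementation.
class LazyDocumentSketch extends DefaultHandler {
    // The W3C XHTML namespace URI.
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    private final ContentHandler delegate;
    private boolean prefixWritten = false;

    LazyDocumentSketch(ContentHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void startDocument() throws SAXException {
        delegate.startDocument();
        delegate.startPrefixMapping("", XHTML); // mapping only, no elements yet
    }

    // Assumption: the prefix is just <html><body>; the real one may carry more.
    private void lazyPrefix() throws SAXException {
        if (!prefixWritten) {
            prefixWritten = true;
            delegate.startElement(XHTML, "html", "html", new AttributesImpl());
            delegate.startElement(XHTML, "body", "body", new AttributesImpl());
        }
    }

    @Override
    public void startElement(String uri, String local, String name, Attributes atts)
            throws SAXException {
        lazyPrefix(); // the first element triggers the prefix
        delegate.startElement(uri, local, name, atts);
    }

    @Override
    public void endElement(String uri, String local, String name) throws SAXException {
        delegate.endElement(uri, local, name);
    }

    @Override
    public void endDocument() throws SAXException {
        lazyPrefix(); // keeps the footer balanced for an empty document
        delegate.endElement(XHTML, "body", "body");
        delegate.endElement(XHTML, "html", "html");
        delegate.endPrefixMapping("");
        delegate.endDocument();
    }

    // Feeds one <p/> through the sketch and returns the recorded tags.
    static String demo() {
        final StringBuilder log = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String n, Attributes a) {
                log.append('<').append(n).append('>');
            }
            @Override
            public void endElement(String u, String l, String n) {
                log.append("</").append(n).append('>');
            }
        };
        LazyDocumentSketch handler = new LazyDocumentSketch(recorder);
        try {
            handler.startDocument();
            handler.startElement(XHTML, "p", "p", new AttributesImpl());
            handler.endElement(XHTML, "p", "p");
            handler.endDocument();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }
}
```

In this sketch nothing is written at startDocument() beyond the namespace mapping. The opening tags appear only when the first element arrives, and endDocument() closes them in the order shown in the footer above.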
startElement
public void startElement(String uri,
String local,
String name,
Attributes attributes)
throws SAXException
- Starts the given element. Table cells and list items are automatically
indented by emitting a tab character as ignorable whitespace.
- Specified by:
startElement in interface ContentHandler
- Overrides:
startElement in class ContentHandlerDecorator
- Throws:
SAXException
endElement
public void endElement(String uri,
String local,
String name)
throws SAXException
- Ends the given element. Block elements are automatically followed
by a newline character.
- Specified by:
endElement in interface ContentHandler
- Overrides:
endElement in class ContentHandlerDecorator
- Throws:
SAXException
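The whitespace rules of startElement and endElement (a tab before table cells and list items, a newline after block elements) can be sketched as a small decorator. This is an illustration using only JDK SAX classes, not Tika's code. The class WhitespaceSketch and its INDENTED and BLOCK sets are hypothetical, and the element names in them are assumptions; the actual ENDLINE set may differ.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in built on JDK SAX classes; not Tika's implementation.
class WhitespaceSketch extends DefaultHandler {
    // Assumed element names, chosen for illustration only.
    static final Set<String> INDENTED = new HashSet<>(Arrays.asList("td", "th", "li"));
    static final Set<String> BLOCK = new HashSet<>(Arrays.asList("p", "tr", "li"));

    private static final char[] TAB = {'\t'};
    private static final char[] NL = {'\n'};

    private final ContentHandler delegate;

    WhitespaceSketch(ContentHandler delegate) {
        this.delegate = delegate;
    }

    @Override
    public void startElement(String uri, String local, String name, Attributes atts)
            throws SAXException {
        if (INDENTED.contains(name)) {
            // Indent with a tab reported as ignorable whitespace.
            delegate.ignorableWhitespace(TAB, 0, TAB.length);
        }
        delegate.startElement(uri, local, name, atts);
    }

    @Override
    public void endElement(String uri, String local, String name) throws SAXException {
        delegate.endElement(uri, local, name);
        if (BLOCK.contains(name)) {
            // Follow the closing tag with a newline.
            delegate.ignorableWhitespace(NL, 0, NL.length);
        }
    }

    // Sends a one-cell table row through the sketch; whitespace is shown escaped.
    static String demo() {
        final StringBuilder log = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String n, Attributes a) {
                log.append('<').append(n).append('>');
            }
            @Override
            public void endElement(String u, String l, String n) {
                log.append("</").append(n).append('>');
            }
            @Override
            public void ignorableWhitespace(char[] ch, int start, int length) {
                log.append(new String(ch, start, length)
                        .replace("\t", "\\t").replace("\n", "\\n"));
            }
        };
        WhitespaceSketch handler = new WhitespaceSketch(recorder);
        Attributes none = new AttributesImpl();
        try {
            handler.startElement("", "tr", "tr", none);
            handler.startElement("", "td", "td", none);
            handler.endElement("", "td", "td");
            handler.endElement("", "tr", "tr");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }
}
```

Reporting the tab and newline through ignorableWhitespace rather than characters lets a downstream handler that only cares about text content skip them.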
characters
public void characters(char[] ch,
int start,
int length)
throws SAXException
- Specified by:
characters in interface ContentHandler
- Overrides:
characters in class SafeContentHandler
- Throws:
SAXException
- See Also:
- TIKA-210
startElement
public void startElement(String name)
throws SAXException
- Throws:
SAXException
startElement
public void startElement(String name,
String attribute,
String value)
throws SAXException
- Throws:
SAXException
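The short startElement overloads carry no description here. One plausible reading, sketched below with JDK SAX classes only, is that they forward to the four-argument SAX call. The class ShortcutSketch is hypothetical, and the details are assumptions rather than documented behavior: the element is placed in the XHTML namespace, and the single attribute is added without a namespace and with type CDATA.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in built on JDK SAX classes; not Tika's implementation.
class ShortcutSketch {
    // The W3C XHTML namespace URI.
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    private final ContentHandler delegate;

    ShortcutSketch(ContentHandler delegate) {
        this.delegate = delegate;
    }

    // Assumed mapping: an XHTML element with no attributes.
    void startElement(String name) throws SAXException {
        delegate.startElement(XHTML, name, name, new AttributesImpl());
    }

    // Assumed mapping: an XHTML element with one un-namespaced CDATA attribute.
    void startElement(String name, String attribute, String value) throws SAXException {
        AttributesImpl attributes = new AttributesImpl();
        attributes.addAttribute("", attribute, attribute, "CDATA", value);
        delegate.startElement(XHTML, name, name, attributes);
    }

    void endElement(String name) throws SAXException {
        delegate.endElement(XHTML, name, name);
    }

    // Writes a paragraph containing a link and returns the recorded markup.
    static String demo() {
        final StringBuilder log = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String n, Attributes a) {
                log.append('<').append(n);
                for (int i = 0; i < a.getLength(); i++) {
                    log.append(' ').append(a.getQName(i))
                       .append("=\"").append(a.getValue(i)).append('"');
                }
                log.append('>');
            }
            @Override
            public void endElement(String u, String l, String n) {
                log.append("</").append(n).append('>');
            }
        };
        ShortcutSketch xhtml = new ShortcutSketch(recorder);
        try {
            xhtml.startElement("p");
            xhtml.startElement("a", "href", "http://example.com/");
            xhtml.endElement("a");
            xhtml.endElement("p");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }
}
```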
endElement
public void endElement(String name)
throws SAXException
- Throws:
SAXException
characters
public void characters(String characters)
throws SAXException
- Throws:
SAXException
newline
public void newline()
throws SAXException
- Throws:
SAXException
element
public void element(String name,
String value)
throws SAXException
- Throws:
SAXException
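The element(String, String) and characters(String) shortcuts are also undocumented on this page. A reasonable guess, shown below as a sketch on JDK SAX classes, is that element writes a start tag, the value as character data, and an end tag. The class ElementSketch is hypothetical, and how the real method treats null or empty values is not stated here, so the sketch does not model it.

```java
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in built on JDK SAX classes; not Tika's implementation.
class ElementSketch {
    // The W3C XHTML namespace URI.
    static final String XHTML = "http://www.w3.org/1999/xhtml";

    private final ContentHandler delegate;

    ElementSketch(ContentHandler delegate) {
        this.delegate = delegate;
    }

    // Assumed behavior: forward the whole string as one characters event.
    void characters(String characters) throws SAXException {
        char[] ch = characters.toCharArray();
        delegate.characters(ch, 0, ch.length);
    }

    // Assumed behavior: start tag, text, end tag in a single call.
    void element(String name, String value) throws SAXException {
        delegate.startElement(XHTML, name, name, new AttributesImpl());
        characters(value);
        delegate.endElement(XHTML, name, name);
    }

    // Writes one element and returns the recorded markup.
    static String demo() {
        final StringBuilder log = new StringBuilder();
        ContentHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String u, String l, String n, Attributes a) {
                log.append('<').append(n).append('>');
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                log.append(ch, start, length);
            }
            @Override
            public void endElement(String u, String l, String n) {
                log.append("</").append(n).append('>');
            }
        };
        try {
            new ElementSketch(recorder).element("title", "Hello");
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }
}
```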
Copyright © 2007-2010 The Apache Software Foundation. All Rights Reserved.