Package org.apache.tika.sax
Class ToMarkdownContentHandler
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.sax.ToMarkdownContentHandler
- All Implemented Interfaces:
ContentHandler, DTDHandler, EntityResolver, ErrorHandler
SAX event handler that writes content as Markdown.
Supports headings, paragraphs, bold, italic, links, images, lists (ordered
and unordered, including nested), tables (GFM pipe tables), code blocks,
inline code, blockquotes, horizontal rules, and definition lists.
Content within <script> and <style> tags is ignored.
- Since:
- Apache Tika 3.2
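To make the element-to-Markdown mapping described above concrete, here is a minimal sketch built only on the JDK's SAX classes. `MarkdownSketchHandler` and its `convert` helper are hypothetical names written for this illustration; this is not the Tika implementation, it covers only a handful of elements, and the exact output conventions of ToMarkdownContentHandler may differ.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical, simplified handler for illustration only (not Tika's code).
class MarkdownSketchHandler extends DefaultHandler {
    private final Writer out;
    private int skipDepth = 0; // greater than 0 while inside <script> or <style>

    MarkdownSketchHandler(Writer out) {
        this.out = out;
    }

    // Writes to the underlying Writer unless we are inside a skipped element.
    private void write(String s) throws SAXException {
        if (skipDepth > 0) {
            return;
        }
        try {
            out.write(s);
        } catch (IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts)
            throws SAXException {
        switch (qName) {
            case "script": case "style": skipDepth++; break;
            case "h1": write("# "); break;
            case "h2": write("## "); break;
            case "b": case "strong": write("**"); break;
            case "i": case "em": write("*"); break;
            default: break; // other elements contribute only their text
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) throws SAXException {
        switch (qName) {
            case "script": case "style": skipDepth--; break;
            case "h1": case "h2": case "p": write("\n\n"); break;
            case "b": case "strong": write("**"); break;
            case "i": case "em": write("*"); break;
            default: break;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        write(new String(ch, start, length));
    }

    @Override
    public String toString() {
        return out.toString();
    }

    // Parses a well-formed XHTML fragment with the JDK SAX parser
    // (not namespace-aware by default, so qName carries the element name).
    static String convert(String xhtml) throws Exception {
        StringWriter sw = new StringWriter();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), new MarkdownSketchHandler(sw));
        return sw.toString();
    }
}
```

With Tika on the classpath, the real handler would be used in the same position: construct it with one of the constructors listed below and pass it to whatever produces the SAX events.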
-
Constructor Summary
Constructors
Constructor    Description
ToMarkdownContentHandler()
ToMarkdownContentHandler(OutputStream stream, String encoding)
ToMarkdownContentHandler(Writer writer)
-
Method Summary
Modifier and Type    Method    Description
void      characters(char[] ch, int start, int length)
void      endDocument()
void      endElement(String uri, String localName, String qName)
void      ignorableWhitespace(char[] ch, int start, int length)
void      startElement(String uri, String localName, String qName, Attributes atts)
String    toString()
Methods inherited from class org.xml.sax.helpers.DefaultHandler
endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, unparsedEntityDecl, warning
-
Constructor Details
-
ToMarkdownContentHandler
public ToMarkdownContentHandler(Writer writer)
-
ToMarkdownContentHandler
public ToMarkdownContentHandler(OutputStream stream, String encoding) throws UnsupportedEncodingException
- Throws:
UnsupportedEncodingException
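The `throws UnsupportedEncodingException` clause is what one would expect if the stream is wrapped in a `java.io.OutputStreamWriter` for the named encoding. Whether this class does exactly that is an assumption, not something this page states. The sketch below uses a hypothetical `EncodingSketch` helper, written only for this illustration, to show how that JDK call behaves for a supported and an unsupported encoding name.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;
import java.io.Writer;

// Hypothetical helper for illustration; not part of Tika.
class EncodingSketch {
    // Same parameter shape as the (OutputStream, String) constructor above.
    // OutputStreamWriter throws UnsupportedEncodingException for unknown names.
    static Writer open(OutputStream stream, String encoding)
            throws UnsupportedEncodingException {
        return new OutputStreamWriter(stream, encoding);
    }

    // Returns true if a writer can be opened for the given encoding name.
    static boolean isSupported(String encoding) {
        try {
            open(new ByteArrayOutputStream(), encoding).close();
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In practice, callers would typically pass a well-known name such as "UTF-8" and treat the exception as a configuration error.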
-
ToMarkdownContentHandler
public ToMarkdownContentHandler()
-
-
Method Details
-
startElement
public void startElement(String uri, String localName, String qName, Attributes atts) throws SAXException
- Specified by:
startElement in interface ContentHandler
- Overrides:
startElement in class DefaultHandler
- Throws:
SAXException
-
endElement
public void endElement(String uri, String localName, String qName) throws SAXException
- Specified by:
endElement in interface ContentHandler
- Overrides:
endElement in class DefaultHandler
- Throws:
SAXException
-
characters
public void characters(char[] ch, int start, int length) throws SAXException
- Specified by:
characters in interface ContentHandler
- Overrides:
characters in class DefaultHandler
- Throws:
SAXException
-
ignorableWhitespace
public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
- Specified by:
ignorableWhitespace in interface ContentHandler
- Overrides:
ignorableWhitespace in class DefaultHandler
- Throws:
SAXException
-
endDocument
public void endDocument() throws SAXException
- Specified by:
endDocument in interface ContentHandler
- Overrides:
endDocument in class DefaultHandler
- Throws:
SAXException
-
toString
public String toString()
- Overrides:
toString in class Object