Package org.apache.tika.parser.html

Interface Summary
HtmlMapper                HTML mapper that makes incoming HTML documents easier for Tika clients to handle.

Class Summary
BoilerpipeContentHandler  Uses the boilerpipe library to automatically extract the main content from a web page.
DefaultHtmlMapper         The default HTML mapping rules in Tika.
HtmlParser                HTML parser.
IdentityHtmlMapper        Alternative HTML mapping rules that pass the input HTML as-is without any modifications.
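
The difference between normalizing mapping rules and identity mapping rules, as described for DefaultHtmlMapper and IdentityHtmlMapper above, can be illustrated with a minimal standalone sketch. It does not use Tika: the ElementMapper interface, the MapperSketch class, and the small rule table are all hypothetical stand-ins invented for illustration, and the actual rules and method signatures in this package may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class MapperSketch {

    /** Hypothetical stand-in for a mapper; this is NOT Tika's HtmlMapper interface. */
    interface ElementMapper {
        /** Returns the output element name, or null if the element should be dropped. */
        String mapElement(String name);
    }

    /** Normalizing rules: a small, made-up whitelist with a few renames. */
    static final ElementMapper NORMALIZING = new ElementMapper() {
        private final Map<String, String> safe = Map.of(
                "H1", "h1", "P", "p", "B", "b",
                "STRONG", "b", "I", "i", "EM", "i");

        @Override
        public String mapElement(String name) {
            // Lookup is case-insensitive; unknown elements map to null (dropped).
            return safe.get(name.toUpperCase(Locale.ROOT));
        }
    };

    /** Identity rules: every element name passes through unchanged. */
    static final ElementMapper IDENTITY = name -> name;

    /** Applies a mapper to a sequence of element names, skipping dropped ones. */
    static List<String> mapAll(ElementMapper mapper, List<String> names) {
        List<String> out = new ArrayList<>();
        for (String name : names) {
            String mapped = mapper.mapElement(name);
            if (mapped != null) {
                out.add(mapped);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = List.of("H1", "strong", "font", "P");
        System.out.println(mapAll(NORMALIZING, input)); // [h1, b, p]
        System.out.println(mapAll(IDENTITY, input));    // [H1, strong, font, P]
    }
}
```

With the normalizing rules, the unknown "font" element is dropped and the rest are renamed to a small canonical set, which gives clients a predictable vocabulary. With the identity rules, the input names reach the client untouched, which suits callers that need the original markup.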

Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.