java.lang.Object
  org.apache.tika.parser.html.DefaultHtmlMapper
All Implemented Interfaces: HtmlMapper
public class DefaultHtmlMapper extends java.lang.Object implements HtmlMapper
The default HTML mapping rules in Tika.
Constructor Summary | |
---|---|
DefaultHtmlMapper() | |
Method Summary | |
---|---|
boolean | isDiscardElement(java.lang.String name) - Checks whether all content within the given HTML element should be discarded instead of including it in the parse output. |
java.lang.String | mapSafeElement(java.lang.String name) - Maps "safe" HTML element names to semantic XHTML equivalents. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public DefaultHtmlMapper()
Method Detail |
---|
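public java.lang.String mapSafeElement(java.lang.String name)
Description copied from interface: HtmlMapper
Maps "safe" HTML element names to semantic XHTML equivalents. If the element is unsafe, this method returns null and the element will be ignored, but the content inside it is still processed. See the HtmlMapper.isDiscardElement(String) method for a way to discard the entire contents of an element.
Specified by: mapSafeElement in interface HtmlMapper
Parameters: name - HTML element name (upper case)
Returns: the XHTML equivalent of the element name, or null if the element is unsafe
public boolean isDiscardElement(java.lang.String name)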
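Description copied from interface: HtmlMapper
Checks whether all content within the given HTML element should be discarded instead of including it in the parse output.
Specified by: isDiscardElement in interface HtmlMapper
Parameters: name - HTML element name (upper case)
Returns: true if content inside the named element should be ignored, false otherwise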
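The two methods together give a parser three possible outcomes per element: discard the element with its content, drop only the tag, or emit a mapped XHTML tag. The sketch below illustrates that contract in isolation. It does not use Tika itself: the `Mapper` interface is a hypothetical stand-in that mirrors the two signatures documented here, the rules in `TOY_MAPPER` are invented for illustration and are not Tika's actual mapping table, and the order in which `decide` consults the two methods is an assumption.

```java
import java.util.Map;
import java.util.Set;

public class HtmlMapperSketch {

    /** Hypothetical stand-in mirroring the two documented method signatures. */
    interface Mapper {
        String mapSafeElement(String name);
        boolean isDiscardElement(String name);
    }

    /** Toy rules invented for this example; NOT Tika's real mapping table. */
    static final Mapper TOY_MAPPER = new Mapper() {
        private final Map<String, String> safe = Map.of("P", "p", "B", "b", "H1", "h1");
        private final Set<String> discard = Set.of("SCRIPT", "STYLE");

        @Override
        public String mapSafeElement(String name) {
            // Returns null for any element that is not in the safe table.
            return safe.get(name);
        }

        @Override
        public boolean isDiscardElement(String name) {
            return discard.contains(name);
        }
    };

    /** Describes how a caller following the contract could treat one element. */
    static String decide(Mapper mapper, String upperCaseName) {
        // Assumed order: check for full discard first, then look up the mapping.
        if (mapper.isDiscardElement(upperCaseName)) {
            return "discard element and its content";
        }
        String mapped = mapper.mapSafeElement(upperCaseName);
        if (mapped == null) {
            // Unsafe element: the tag is ignored, the content is still processed.
            return "drop tag, keep content";
        }
        return "emit <" + mapped + ">";
    }

    public static void main(String[] args) {
        for (String name : new String[] {"P", "FONT", "SCRIPT"}) {
            System.out.println(name + " -> " + decide(TOY_MAPPER, name));
        }
    }
}
```

Element names are passed in upper case, matching the parameter description above, and the mapped names come back in lower case only because the toy table is written that way.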