public interface HtmlMapper
HTML mapper used to make incoming HTML documents easier to handle by Tika clients. The HtmlParser looks up an optional HTML mapper from the parse context and uses it to map parsed HTML to "safe" XHTML. A client that wants to customize this mapping can place a custom HtmlMapper instance into the parse context.
Method Summary

Return type | Method | Description |
---|---|---|
boolean | isDiscardElement(java.lang.String name) | Checks whether all content within the given HTML element should be discarded instead of including it in the parse output. |
java.lang.String | mapSafeAttribute(java.lang.String elementName, java.lang.String attributeName) | Maps "safe" HTML attribute names to semantic XHTML equivalents. |
java.lang.String | mapSafeElement(java.lang.String name) | Maps "safe" HTML element names to semantic XHTML equivalents. |
Method Detail

java.lang.String mapSafeElement(java.lang.String name)

Maps "safe" HTML element names to semantic XHTML equivalents. For an unsafe element, this method returns null and the element will be ignored, but the content inside it is still processed. See the isDiscardElement(String) method for a way to discard the entire contents of an element.

Parameters:
name - HTML element name (upper case)
Returns:
the XHTML element name, or null if the element is unsafe

boolean isDiscardElement(java.lang.String name)
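
Checks whether all content within the given HTML element should be discarded instead of including it in the parse output.

Parameters:
name - HTML element name (upper case)
Returns:
true if content inside the named element should be ignored, false otherwise

java.lang.String mapSafeAttribute(java.lang.String elementName, java.lang.String attributeName)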
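
Maps "safe" HTML attribute names to semantic XHTML equivalents. For an unsafe attribute, this method returns null and the attribute will be ignored. This method assumes that the element name is valid and normalised.

Parameters:
elementName - HTML element name (lower case)
attributeName - HTML attribute name (lower case)
Returns:
the XHTML attribute name, or null if the attribute is unsafe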
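
The three methods above can be illustrated with a small custom mapper. This is a minimal sketch, not Tika's own default mapping: the class name StrictHtmlMapper, the chosen element and attribute lists, and the demo class are all hypothetical. The interface is declared locally, mirroring the signatures documented here, so that the example compiles without Tika on the classpath; a real client would presumably implement org.apache.tika.parser.html.HtmlMapper instead.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Local stand-in for the documented interface (assumption: same three
// signatures as the real Tika interface), so the sketch is self-contained.
interface HtmlMapper {
    String mapSafeElement(String name);
    boolean isDiscardElement(String name);
    String mapSafeAttribute(String elementName, String attributeName);
}

// Hypothetical mapper: keeps P and A, renames B to strong, and discards
// SCRIPT and STYLE together with everything inside them.
final class StrictHtmlMapper implements HtmlMapper {
    private static final Map<String, String> SAFE_ELEMENTS = new HashMap<>();
    private static final Map<String, Set<String>> SAFE_ATTRIBUTES = new HashMap<>();
    private static final Set<String> DISCARD_ELEMENTS =
            new HashSet<>(Arrays.asList("SCRIPT", "STYLE"));

    static {
        SAFE_ELEMENTS.put("P", "p");
        SAFE_ELEMENTS.put("A", "a");
        SAFE_ELEMENTS.put("B", "strong");
        SAFE_ATTRIBUTES.put("a", new HashSet<>(Arrays.asList("href", "title")));
    }

    @Override
    public String mapSafeElement(String name) {
        // Element names arrive in upper case. A null result means the tag
        // itself is skipped while its content is still processed.
        return SAFE_ELEMENTS.get(name.toUpperCase(Locale.ROOT));
    }

    @Override
    public boolean isDiscardElement(String name) {
        // true means the element and all of its content are dropped.
        return DISCARD_ELEMENTS.contains(name.toUpperCase(Locale.ROOT));
    }

    @Override
    public String mapSafeAttribute(String elementName, String attributeName) {
        // Both names are expected in lower case. A null result means the
        // attribute is ignored.
        Set<String> allowed = SAFE_ATTRIBUTES.get(elementName);
        return (allowed != null && allowed.contains(attributeName)) ? attributeName : null;
    }
}

class HtmlMapperDemo {
    public static void main(String[] args) {
        HtmlMapper mapper = new StrictHtmlMapper();
        System.out.println(mapper.mapSafeElement("B"));            // strong
        System.out.println(mapper.mapSafeElement("FONT"));         // null
        System.out.println(mapper.isDiscardElement("SCRIPT"));     // true
        System.out.println(mapper.mapSafeAttribute("a", "href"));  // href
    }
}
```

With Tika available, an instance like this would be placed into the parse context before parsing, for example with ParseContext.set(HtmlMapper.class, mapper), so that the HtmlParser picks it up in place of its default mapping.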