org.apache.tika.parser.html
Class DefaultHtmlMapper

java.lang.Object
  extended by org.apache.tika.parser.html.DefaultHtmlMapper
All Implemented Interfaces:
HtmlMapper

public class DefaultHtmlMapper
extends java.lang.Object
implements HtmlMapper

The default HTML mapping rules in Tika.

Since:
Apache Tika 0.6

Constructor Summary
DefaultHtmlMapper()
           
 
Method Summary
 boolean isDiscardElement(java.lang.String name)
          Checks whether all content within the given HTML element should be discarded instead of being included in the parse output.
 java.lang.String mapSafeElement(java.lang.String name)
          Maps "safe" HTML element names to semantic XHTML equivalents.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DefaultHtmlMapper

public DefaultHtmlMapper()

Method Detail

mapSafeElement

public java.lang.String mapSafeElement(java.lang.String name)
Description copied from interface: HtmlMapper
Maps "safe" HTML element names to semantic XHTML equivalents. If the given element is unknown or deemed unsafe for inclusion in the parse output, this method returns null; the element itself is then ignored, but the content inside it is still processed. To discard the entire contents of an element, see the HtmlMapper.isDiscardElement(String) method.

Specified by:
mapSafeElement in interface HtmlMapper
Parameters:
name - HTML element name (upper case)
Returns:
XHTML element name (lower case), or null if the element is unknown or unsafe
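
Example:
The contract described above (upper-case name in, lower-case XHTML name or null out) can be sketched as follows. This is a minimal illustration, not the actual DefaultHtmlMapper source: the class name SafeElementSketch and the contents of its element table are assumptions chosen for the example, and the class is declared locally so that it compiles without Tika on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that follows the documented mapSafeElement contract.
// The table below is an assumption for illustration, not Tika's real defaults.
class SafeElementSketch {
    private static final Map<String, String> SAFE = new HashMap<>();
    static {
        SAFE.put("H1", "h1");
        SAFE.put("P", "p");
        SAFE.put("B", "b");
    }

    // Returns the lower-case XHTML name for a known safe element,
    // or null when the element is unknown or considered unsafe.
    static String mapSafeElement(String name) {
        return SAFE.get(name);
    }

    public static void main(String[] args) {
        System.out.println(mapSafeElement("H1"));    // h1
        System.out.println(mapSafeElement("BLINK")); // null
    }
}
```

A null result only drops the element's own tags; a caller following the contract would still emit the text nested inside it.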

isDiscardElement

public boolean isDiscardElement(java.lang.String name)
Description copied from interface: HtmlMapper
Checks whether all content within the given HTML element should be discarded instead of being included in the parse output.

Specified by:
isDiscardElement in interface HtmlMapper
Parameters:
name - HTML element name (upper case)
Returns:
true if content inside the named element should be ignored, false otherwise
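
Example:
The difference between an ignored element (tags dropped, content kept) and a discarded element (content dropped as well) can be sketched with a small token filter. This is an illustration under stated assumptions, not how the Tika HTML parser is implemented: the class DiscardSketch, its token format, and the choice of SCRIPT and STYLE as the discard set are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of how a caller might apply isDiscardElement.
class DiscardSketch {
    // Assumed discard set, for illustration only.
    private static final Set<String> DISCARD =
            new HashSet<>(Arrays.asList("SCRIPT", "STYLE"));

    static boolean isDiscardElement(String name) {
        return DISCARD.contains(name);
    }

    // Tokens: "<NAME>" opens an element, "</NAME>" closes it,
    // and anything else is treated as text content.
    static List<String> keptText(List<String> tokens) {
        List<String> out = new ArrayList<>();
        int discardDepth = 0; // > 0 while inside a discarded element
        for (String t : tokens) {
            if (t.startsWith("</")) {
                String name = t.substring(2, t.length() - 1);
                if (isDiscardElement(name) && discardDepth > 0) {
                    discardDepth--;
                }
            } else if (t.startsWith("<")) {
                String name = t.substring(1, t.length() - 1);
                if (isDiscardElement(name)) {
                    discardDepth++;
                }
            } else if (discardDepth == 0) {
                out.add(t); // text outside any discarded element is kept
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "<P>", "hello", "</P>",
                "<SCRIPT>", "alert(1)", "</SCRIPT>",
                "<FONT>", "world", "</FONT>");
        System.out.println(keptText(tokens)); // [hello, world]
    }
}
```

In this sketch the text inside FONT survives even though FONT is not in any safe table, while everything inside SCRIPT is dropped; a depth counter is used so that nested discarded elements are handled consistently.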


Copyright © 2007-2010 The Apache Software Foundation. All Rights Reserved.