org.apache.tika.utils
Class RegexUtils

java.lang.Object
  extended by org.apache.tika.utils.RegexUtils

public class RegexUtils
extends java.lang.Object

Inspired by the Nutch class OutlinkExtractor. Applies regular expressions to extract content.


Constructor Summary
RegexUtils()
           
 
Method Summary
static java.util.List<java.lang.String> extractLinks(java.lang.String content)
          Extracts URLs from plain text.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RegexUtils

public RegexUtils()

Method Detail

extractLinks

public static java.util.List<java.lang.String> extractLinks(java.lang.String content)
Extracts URLs from plain text.

Parameters:
content - The plain text content to examine
Returns:
List of URLs found in the plain text
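
Example:
The following is a minimal, self-contained sketch of the general technique (regex-based URL extraction from plain text). It is not the Tika implementation: the class name LinkExtractorSketch, the URL pattern, the supported schemes (http, https, ftp), and the choice to return an empty list for null input are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not the source of org.apache.tika.utils.RegexUtils.
final class LinkExtractorSketch {

    // Assumed pattern: a scheme, "://", then a run of characters that are not
    // whitespace, angle brackets, or double quotes. The final character class
    // keeps trailing sentence punctuation out of the match.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "\\b(?:https?|ftp)://[^\\s<>\"]+[^\\s<>\".,;:!?)]");

    private LinkExtractorSketch() {
        // Static utility; no instances.
    }

    // Returns the URLs found in the text, in order of appearance.
    // Assumption: null input yields an empty list rather than an exception.
    static List<String> extractLinks(String content) {
        List<String> links = new ArrayList<>();
        if (content == null) {
            return links;
        }
        Matcher matcher = URL_PATTERN.matcher(content);
        while (matcher.find()) {
            links.add(matcher.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "See http://tika.apache.org/ and https://nutch.apache.org.";
        for (String link : extractLinks(text)) {
            System.out.println(link);
        }
    }
}
```

With Tika on the classpath, the documented method is called the same way: RegexUtils.extractLinks(content) takes the plain text and returns a java.util.List<java.lang.String>. Its exact matching rules may differ from the pattern assumed in this sketch.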


Copyright © 2007-2010 The Apache Software Foundation. All Rights Reserved.