org.apache.tika.utils
Class RegexUtils

java.lang.Object
  extended by org.apache.tika.utils.RegexUtils

public class RegexUtils
extends Object

Inspired by the Nutch class OutlinkExtractor. Applies regular expressions to extract content.


Constructor Summary
RegexUtils()
           
 
Method Summary
static List<String> extractLinks(String content)
          Extracts URLs from plain text.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

RegexUtils

public RegexUtils()
Method Detail

extractLinks

public static List<String> extractLinks(String content)
Extracts URLs from plain text.

Parameters:
content - The plain text content to examine
Returns:
List of URLs found in the plain text
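
Example:
This page does not show how extractLinks matches URLs, so the following is only a minimal sketch of regex-based link extraction with the same signature. The class name LinkExtractionSketch and the URL pattern are assumptions made for illustration; they are not taken from the Tika or Nutch sources, and the real method may match a different set of URLs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration only; this is NOT the Tika class.
final class LinkExtractionSketch {

    // Assumed pattern: an http, https, or ftp scheme followed by any run of
    // characters that are not whitespace, angle brackets, or quotes.
    private static final Pattern URL_PATTERN = Pattern.compile(
            "\\b(?:https?|ftp)://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    // Mirrors the documented signature: plain text in, list of URLs out.
    static List<String> extractLinks(String content) {
        List<String> links = new ArrayList<>();
        if (content == null || content.isEmpty()) {
            return links; // nothing to scan
        }
        Matcher matcher = URL_PATTERN.matcher(content);
        while (matcher.find()) {
            links.add(matcher.group()); // keep matches in order of appearance
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "See http://tika.apache.org/ and ftp://example.org/file.txt for details.";
        for (String link : extractLinks(text)) {
            System.out.println(link);
        }
    }
}
```

With tika-core on the classpath, the documented call has the same shape: RegexUtils.extractLinks(text) takes the plain text and returns a List<String> of the URLs it finds.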


Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.