org.apache.tika.utils
Class RegexUtils
java.lang.Object
org.apache.tika.utils.RegexUtils
public class RegexUtils
- extends Object
Inspired by the Nutch class OutlinkExtractor. Applies regular expressions to
extract content.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
RegexUtils
public RegexUtils()
extractLinks
public static List<String> extractLinks(String content)
- Extracts URLs from plain text.
- Parameters:
content
- The plain text content to examine
- Returns:
- List of URLs found in the plain text
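The following is a minimal, standalone sketch of the kind of regex-based extraction described above. It does not reproduce the RegexUtils implementation: the class name LinkExtractionSketch, the pattern, and the null handling are all assumptions made for illustration, and the actual pattern used by Tika may match a different set of URLs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the Tika RegexUtils class.
class LinkExtractionSketch {

    // Assumed pattern: an http, https, or ftp scheme, then "://", then a run of
    // characters that are not whitespace, angle brackets, or quotes.
    private static final Pattern URL_PATTERN =
            Pattern.compile("(?:https?|ftp)://[^\\s<>\"']+", Pattern.CASE_INSENSITIVE);

    // Returns every substring of content that matches URL_PATTERN, in order
    // of appearance. Returning an empty list for null or empty input is a
    // choice made for this sketch only.
    static List<String> extractLinks(String content) {
        if (content == null || content.isEmpty()) {
            return Collections.emptyList();
        }
        List<String> links = new ArrayList<>();
        Matcher matcher = URL_PATTERN.matcher(content);
        while (matcher.find()) {
            links.add(matcher.group());
        }
        return links;
    }

    public static void main(String[] args) {
        String text = "Docs at http://tika.apache.org/ and "
                + "https://nutch.apache.org/about.html today";
        for (String link : extractLinks(text)) {
            System.out.println(link);
        }
    }
}
```

With the real class on the classpath, the equivalent call would be RegexUtils.extractLinks(text). Note that the simple pattern in this sketch keeps any punctuation that directly follows a URL (for example a sentence-ending period), so callers may want to trim such characters.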
Copyright © 2007-2012 The Apache Software Foundation. All Rights Reserved.