Package org.apache.tika.parser.ner.regex
Class RegexNERecogniser
- java.lang.Object
- org.apache.tika.parser.ner.regex.RegexNERecogniser
 
- All Implemented Interfaces:
- NERecogniser
 
public class RegexNERecogniser extends Object implements NERecogniser

This class offers an implementation of NERecogniser based on regular expressions.

The default configuration file "ner-regex.txt" is used when the no-argument constructor is used to instantiate this class. The regex file is loaded via Class.getResourceAsStream(String), so the file should be placed in the same package path as this class.

The format of the regex configuration is as follows:
ENTITY_TYPE1=REGEX1
ENTITY_TYPE2=REGEX2

For example, to extract the week day from text:
WEEK_DAY=(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?

- Since:
- Nov. 7, 2015
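
The ENTITY_TYPE=REGEX configuration format from the class description can be illustrated with a minimal sketch that uses only java.util.regex. The class RegexConfigSketch and its method loadPatterns are hypothetical names introduced here and are not part of Tika. How Tika itself treats blank or malformed lines is not stated on this page, so skipping lines without a name before `=` is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Tika: parses ENTITY_TYPE=REGEX lines.
final class RegexConfigSketch {

    // Builds a map of entity type -> compiled pattern from configuration text.
    static Map<String, Pattern> loadPatterns(String config) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        for (String rawLine : config.split("\\R")) {
            String line = rawLine.trim();
            // Split on the first '=' only, so the regex itself may contain '='.
            int eq = line.indexOf('=');
            // Assumption: lines without an entity type before '=' are ignored.
            if (eq <= 0) {
                continue;
            }
            String entityType = line.substring(0, eq).trim();
            String regex = line.substring(eq + 1).trim();
            patterns.put(entityType, Pattern.compile(regex));
        }
        return patterns;
    }

    public static void main(String[] args) {
        String config =
            "WEEK_DAY=(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?\n";
        Map<String, Pattern> patterns = loadPatterns(config);
        System.out.println(patterns.keySet());
    }
}
```

With the real class, the same text would instead be stored in a resource file next to the class or passed to the RegexNERecogniser(InputStream stream) constructor.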
 
Field Summary
Fields (Modifier and Type, Field):
- Set<String> entityTypes
- static String NER_REGEX_FILE
- Map<String,Pattern> patterns

Fields inherited from interface org.apache.tika.parser.ner.NERecogniser
DATE, LOCATION, MISCELLANEOUS, MONEY, ORGANIZATION, PERCENT, PERSON, TIME
 
Constructor Summary
Constructors:
- RegexNERecogniser()
- RegexNERecogniser(InputStream stream)

Method Summary
Methods (Modifier and Type, Method, Description):
- Set<String> findMatches(String text, Pattern pattern): finds matching sub groups in text
- Set<String> getEntityTypes(): gets a set of entity types whose names are recognisable by this recogniser
- static RegexNERecogniser getInstance()
- boolean isAvailable(): checks if this Named Entity recogniser is available for service
- Map<String,Set<String>> recognise(String text): call for name recognition action from text
 
Constructor Detail

RegexNERecogniser
public RegexNERecogniser()

RegexNERecogniser
public RegexNERecogniser(InputStream stream)
 
Method Detail

getInstance
public static RegexNERecogniser getInstance()

isAvailable
public boolean isAvailable()
Description copied from interface: NERecogniser
Checks if this Named Entity recogniser is available for service.
- Specified by:
- isAvailable in interface NERecogniser
- Returns:
- true if this recogniser is ready to recognise, false otherwise
 

getEntityTypes
public Set<String> getEntityTypes()
Description copied from interface: NERecogniser
Gets a set of entity types whose names are recognisable by this recogniser.
- Specified by:
- getEntityTypes in interface NERecogniser
- Returns:
- set of entity types/classes
 

findMatches
public Set<String> findMatches(String text, Pattern pattern)
Finds matching sub groups in text.
- Parameters:
- text - text containing interesting sub strings
- pattern - pattern to find sub strings
- Returns:
- set of sub strings if any found, or null if none found
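
The documented contract of findMatches (a set of substrings, or null when nothing matches) can be sketched with java.util.regex.Matcher as below. This is an illustration under assumptions, not Tika's source: the class FindMatchesSketch is a hypothetical name, and collecting the whole match via Matcher.group() rather than individual capture groups is an assumption.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for illustration; not the Tika implementation.
final class FindMatchesSketch {

    // Returns every distinct substring matched by the pattern, or null if none.
    static Set<String> findMatches(String text, Pattern pattern) {
        Set<String> results = null;
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            if (results == null) {
                // Allocate lazily so that "no match" is reported as null.
                results = new LinkedHashSet<>();
            }
            // Assumption: the whole match is collected, not each sub group.
            results.add(matcher.group());
        }
        return results;
    }

    public static void main(String[] args) {
        Pattern weekDay = Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?");
        System.out.println(findMatches("Meet on Monday or Friday.", weekDay));
    }
}
```

Because the return value may be null, callers need a null check before iterating over the result.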
 

recognise
public Map<String,Set<String>> recognise(String text)
Description copied from interface: NERecogniser
Call for name recognition action from text.
- Specified by:
- recognise in interface NERecogniser
- Parameters:
- text - text which possibly contains names
- Returns:
- map of entityType -> set of names
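
The shape of the value returned by recognise (entity type mapped to a set of names) can be sketched by applying each configured pattern to the text. The class RecogniseSketch and the explicit patterns parameter are hypothetical; in the real class the patterns are held in the patterns field. Omitting entity types that have no match is an assumption, since this page does not state that behaviour.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration of the entityType -> names mapping; not Tika code.
final class RecogniseSketch {

    static Map<String, Set<String>> recognise(String text,
                                              Map<String, Pattern> patterns) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Set<String> names = new LinkedHashSet<>();
            Matcher matcher = entry.getValue().matcher(text);
            while (matcher.find()) {
                names.add(matcher.group());
            }
            // Assumption: entity types without any match are left out.
            if (!names.isEmpty()) {
                result.put(entry.getKey(), names);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        patterns.put("WEEK_DAY", Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(thurs)|(fri)|((sat)(ur)?))(day)?"));
        System.out.println(recognise("Released on Friday", patterns));
    }
}
```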
 
 