Package org.apache.tika.parser.ner.regex
Class RegexNERecogniser
java.lang.Object
org.apache.tika.parser.ner.regex.RegexNERecogniser
- All Implemented Interfaces:
- NERecogniser
This class offers an implementation of NERecogniser based on regular expressions.
 
 The default configuration file "ner-regex.txt" is used when the no-argument
 constructor is used to instantiate this class. The regex file is loaded via
 Class.getResourceAsStream(String), so the file should be placed in the same
 package path as this class.
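
The lookup rule described above can be tried out with a small standalone check. This is only a sketch: the class name `ResourceLookupSketch` and the helper `isLoadable` are hypothetical and not part of Tika. It relies on the standard behaviour of `Class.getResourceAsStream(String)`, which resolves a name without a leading `/` against the package of the class it is called on and returns null when no such resource exists.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper for illustration; not part of Tika. */
final class ResourceLookupSketch {

    /**
     * Returns true if a resource with the given name can be opened relative to
     * the package of the anchor class. A name without a leading '/' is resolved
     * against that package; a name with a leading '/' is resolved from the
     * classpath root.
     */
    static boolean isLoadable(Class<?> anchor, String name) {
        try (InputStream in = anchor.getResourceAsStream(name)) {
            // getResourceAsStream returns null when nothing is found
            return in != null;
        } catch (IOException e) {
            // only close() can throw here; surface it unchecked
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "ner-regex.txt" is the default file name mentioned above; whether it is
        // found depends on what sits next to the anchor class on the classpath.
        System.out.println(isLoadable(ResourceLookupSketch.class, "ner-regex.txt"));
    }
}
```

With RegexNERecogniser as the anchor class, the same rule means the file has to sit under org/apache/tika/parser/ner/regex/ on the classpath.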
 
The format of the regex configuration is as follows:
ENTITY_TYPE1=REGEX1
ENTITY_TYPE2=REGEX2
For example, to extract a week day from text:
WEEK_DAY=(?i)((sun)|(mon)|(tues)|(wednes)|(thurs)|(fri)|((sat)(ur)?))(day)?
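
As a rough illustration of this format, the following sketch parses ENTITY_TYPE=REGEX lines into compiled patterns using only `java.util.regex`. The class name `RegexConfigSketch`, the `parse` helper, the choice to split on the first `=`, and the skipping of blank or malformed lines are assumptions made for illustration; this is not RegexNERecogniser's actual loading code. The WEEK_DAY entry is modelled on the example above, with Wednesday covered by a `(wednes)` alternative.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Tika. */
final class RegexConfigSketch {

    /**
     * Parses one ENTITY_TYPE=REGEX entry per line. Only the first '=' is used
     * as the separator, so the regex itself may contain '='.
     */
    static Map<String, Pattern> parse(String config) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        for (String line : config.split("\\R")) {
            line = line.trim();
            int eq = line.indexOf('=');
            if (line.isEmpty() || eq <= 0) {
                continue; // assumption: skip blank lines and lines without a type name
            }
            String entityType = line.substring(0, eq).trim();
            String regex = line.substring(eq + 1).trim();
            patterns.put(entityType, Pattern.compile(regex));
        }
        return patterns;
    }

    public static void main(String[] args) {
        String config =
            "WEEK_DAY=(?i)((sun)|(mon)|(tues)|(wednes)|(thurs)|(fri)|((sat)(ur)?))(day)?\n";
        Pattern weekDay = parse(config).get("WEEK_DAY");
        // Reports whether the sample sentence contains a week day
        System.out.println(weekDay.matcher("See you on Friday").find());
    }
}
```

Because of the `(?i)` flag the pattern is case-insensitive, and the trailing `(day)?` lets it match both full names and stems such as "Mon".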
- Since:
- Nov. 7, 2015
Field Summary
Fields
NER_REGEX_FILE, entityTypes, patterns
Fields inherited from interface org.apache.tika.parser.ner.NERecogniser
DATE, LOCATION, MISCELLANEOUS, MONEY, ORGANIZATION, PERCENT, PERSON, TIME
Constructor Summary
Method Summary
Modifier and Type, Method, Description
- findMatches(String text, Pattern pattern): finds matching sub groups in text
- getEntityTypes: gets a set of entity types whose names are recognisable by this recogniser
- static RegexNERecogniser getInstance
- boolean isAvailable(): checks if this Named Entity recogniser is available for service
- recognise: call for name recognition action from text
Field Details
NER_REGEX_FILE
entityTypes
patterns
Constructor Details
RegexNERecogniser
public RegexNERecogniser()
RegexNERecogniser
Method Details
getInstance
isAvailable
public boolean isAvailable()
Description copied from interface: NERecogniser
checks if this Named Entity recogniser is available for service
- Specified by:
- isAvailable in interface NERecogniser
- Returns:
- true if this recogniser is ready to recognise, false otherwise
 
getEntityTypes
Description copied from interface: NERecogniser
gets a set of entity types whose names are recognisable by this recogniser
- Specified by:
- getEntityTypes in interface NERecogniser
- Returns:
- set of entity types/classes
 
findMatches
finds matching sub groups in text
- Parameters:
- text - text containing interesting substrings
- pattern - pattern to find substrings
- Returns:
- set of substrings if any found, or null if none found
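
The contract described for findMatches can be sketched with `java.util.regex.Matcher`. This is a minimal sketch under stated assumptions, not the library's implementation: the class name `FindMatchesSketch` is hypothetical, the return type is assumed to be `Set<String>`, and each match is taken to be the whole matched substring. It returns null when nothing matches, mirroring the documented return value.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not part of Tika. */
final class FindMatchesSketch {

    /** Collects every distinct substring of text matched by pattern, or null if there is none. */
    static Set<String> findMatches(String text, Pattern pattern) {
        Set<String> results = new LinkedHashSet<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            results.add(matcher.group()); // assumption: keep the whole match, not a sub group
        }
        // assumption: follow the documented "null if none found" contract
        return results.isEmpty() ? null : results;
    }

    public static void main(String[] args) {
        Pattern weekDay = Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(wednes)|(thurs)|(fri)|((sat)(ur)?))(day)?");
        System.out.println(findMatches("Meet on Monday or Friday.", weekDay));
        System.out.println(findMatches("No dates here.", weekDay));
    }
}
```

Given the documented null return, callers would be wise to null-check the result before iterating over it.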
 
recognise
Description copied from interface: NERecogniser
call for name recognition action from text
- Specified by:
- recognise in interface NERecogniser
- Parameters:
- text - text that possibly contains names
- Returns:
- map of entityType -> set of names
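
To illustrate the shape of the returned map, the sketch below applies each configured pattern to the text and groups the matches by entity type. The class name `RecogniseSketch`, the `Map<String, Set<String>>` return type, the extra `patterns` parameter, and the decision to omit entity types with no matches are assumptions for illustration, not confirmed details of RegexNERecogniser. The PERCENT regex is likewise made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for illustration; not part of Tika. */
final class RecogniseSketch {

    /** Builds a map of entityType -> set of matched names for the given text. */
    static Map<String, Set<String>> recognise(String text, Map<String, Pattern> patterns) {
        Map<String, Set<String>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            Set<String> names = new LinkedHashSet<>();
            Matcher matcher = entry.getValue().matcher(text);
            while (matcher.find()) {
                names.add(matcher.group());
            }
            if (!names.isEmpty()) {
                result.put(entry.getKey(), names); // assumption: skip types without matches
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Pattern> patterns = new LinkedHashMap<>();
        patterns.put("WEEK_DAY", Pattern.compile(
            "(?i)((sun)|(mon)|(tues)|(wednes)|(thurs)|(fri)|((sat)(ur)?))(day)?"));
        patterns.put("PERCENT", Pattern.compile("\\d+%")); // made-up regex for the example
        System.out.println(recognise("Prices rose 12% on Friday.", patterns));
    }
}
```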
 
 