Class AgeRecogniser
- java.lang.Object
- org.apache.tika.parser.AbstractParser
- org.apache.tika.parser.recognition.AgeRecogniser
- All Implemented Interfaces:
Serializable, Initializable, Parser
public class AgeRecogniser extends AbstractParser implements Initializable
Parser for extracting features from text. The following feature is extracted: author age.
- See Also:
- Serialized Form
-
Field Summary
Fields (columns: Modifier and Type, Field, Description)
static String
MD_KEY_ESTIMATED_AGE
static String
MD_KEY_ESTIMATED_AGE_RANGE
Tika
secondaryParser
-
Constructor Summary
Constructors (columns: Constructor, Description)
AgeRecogniser()
-
Method Summary
Methods (columns: Modifier and Type, Method, Description)
void
checkInitialization(InitializableProblemHandler problemHandler)
edu.usc.irds.agepredictor.authorage.AgePredicterLocal
getAgePredictorClient()
Set<MediaType>
getSupportedTypes(ParseContext parseContext)
Returns the set of media types supported by this parser when used with the given parse context.
void
initialize(Map<String,Param> params)
void
parse(InputStream inputStream, ContentHandler handler, Metadata metadata, ParseContext context)
Parses a document stream into a sequence of XHTML SAX events.
protected static void
setAgePredictorClient(edu.usc.irds.agepredictor.authorage.AgePredicterLocal agePredicter)
Used in test cases to mock the response of AgeClassifier.
Methods inherited from class org.apache.tika.parser.AbstractParser
parse
-
Field Detail
-
MD_KEY_ESTIMATED_AGE_RANGE
public static final String MD_KEY_ESTIMATED_AGE_RANGE
- See Also:
- Constant Field Values
-
MD_KEY_ESTIMATED_AGE
public static final String MD_KEY_ESTIMATED_AGE
- See Also:
- Constant Field Values
-
secondaryParser
public Tika secondaryParser
-
Method Detail
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
getSupportedTypes
public Set<MediaType> getSupportedTypes(ParseContext parseContext)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.
- Specified by:
getSupportedTypes in interface Parser
- Parameters:
parseContext - parse context
- Returns:
immutable set of media types
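The "immutable set" return contract can be illustrated with a minimal sketch. It uses plain strings as stand-ins for MediaType values so that it runs without Tika on the classpath. The class name SupportedTypesSketch and the media type string are hypothetical and are not taken from AgeRecogniser.

```java
import java.util.Collections;
import java.util.Set;

class SupportedTypesSketch {
    // Hypothetical stand-in: a real parser would hold MediaType objects, not strings.
    private static final Set<String> SUPPORTED_TYPES =
            Collections.singleton("application/x-example");

    // Mirrors the shape of getSupportedTypes: callers receive a read-only set.
    static Set<String> getSupportedTypes() {
        return SUPPORTED_TYPES;
    }

    public static void main(String[] args) {
        Set<String> types = getSupportedTypes();
        System.out.println("supported: " + types);
        try {
            // Attempts to modify the returned set are rejected.
            types.add("text/plain");
        } catch (UnsupportedOperationException e) {
            System.out.println("the returned set cannot be modified");
        }
    }
}
```

Callers that need a modifiable collection would copy the result, for example into a new HashSet, rather than changing the returned set.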
-
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
-
getAgePredictorClient
public edu.usc.irds.agepredictor.authorage.AgePredicterLocal getAgePredictorClient() throws opennlp.tools.util.InvalidFormatException, IOException
- Throws:
opennlp.tools.util.InvalidFormatException
IOException
-
setAgePredictorClient
protected static void setAgePredictorClient(edu.usc.irds.agepredictor.authorage.AgePredicterLocal agePredicter)
Used in test cases to mock the response of AgeClassifier.
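A minimal sketch of the general pattern this setter supports: a test swaps in a mock predictor through a static setter so that no model needs to be loaded. All names here (PredictorInjectionSketch, AgePredictor, predictAge, estimateAge) are hypothetical stand-ins. How AgeRecogniser actually stores or creates its client is not described on this page, and in the real class getAgePredictorClient() is an instance method.

```java
class PredictorInjectionSketch {
    // Hypothetical stand-in for the age predictor client type.
    interface AgePredictor {
        double predictAge(String text);
    }

    // Shared client; tests replace it through the static setter below.
    private static AgePredictor client;

    // Intended for tests: injects a mock in place of the real predictor.
    static void setAgePredictorClient(AgePredictor predictor) {
        client = predictor;
    }

    static AgePredictor getAgePredictorClient() {
        if (client == null) {
            // Assumption: a real implementation would load a model here.
            // The sketch falls back to a fixed value instead.
            client = text -> 0.0;
        }
        return client;
    }

    // Code under test goes through the getter, so it picks up the mock.
    static double estimateAge(String text) {
        return getAgePredictorClient().predictAge(text);
    }

    public static void main(String[] args) {
        setAgePredictorClient(text -> 27.5); // mock response
        System.out.println(estimateAge("any text"));
    }
}
```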
-
parse
public void parse(InputStream inputStream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException
Description copied from interface: Parser
Parses a document stream into a sequence of XHTML SAX events. Fills in related document metadata in the given metadata object.
The given document stream is consumed but not closed by this method. The responsibility to close the stream remains on the caller.
Information about the parsing context can be passed in the context parameter. See the parser implementations for the kinds of context information they expect.
- Specified by:
parse in interface Parser
- Parameters:
inputStream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
- Throws:
IOException - if the document stream could not be read
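The stream-ownership rule (consumed but not closed by parse, closed by the caller) can be sketched as follows. The consume method is a hypothetical stand-in for a parser's parse call, and TrackingStream exists only to make the close visible. Neither is part of Tika.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class StreamOwnershipSketch {
    // Hypothetical stand-in for parse(...): reads the stream to the end
    // but deliberately does not close it.
    static int consume(InputStream in) throws IOException {
        int total = 0;
        byte[] buffer = new byte[1024];
        int read;
        while ((read = in.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    // Records whether close() was called (for illustration only).
    static class TrackingStream extends ByteArrayInputStream {
        boolean closed = false;

        TrackingStream(byte[] data) {
            super(data);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    public static void main(String[] args) throws IOException {
        TrackingStream stream =
                new TrackingStream("some text".getBytes(StandardCharsets.UTF_8));
        // The caller owns the stream, so try-with-resources closes it.
        try (InputStream in = stream) {
            int bytes = consume(in);
            System.out.println("bytes read: " + bytes + ", closed: " + stream.closed);
        }
        System.out.println("closed after try block: " + stream.closed);
    }
}
```

With the real API, the same try-with-resources block would wrap the call to parse, and the Metadata object would be inspected after the call returns.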