Class TensorflowImageRecParser
- java.lang.Object
  - org.apache.tika.parser.external.ExternalParser
    - org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
- All Implemented Interfaces:
  Serializable, Initializable, Parser, ObjectRecogniser
public class TensorflowImageRecParser extends ExternalParser implements ObjectRecogniser
This is an implementation of ObjectRecogniser powered by a TensorFlow convolutional neural network (CNN). This implementation binds to the Python API using ExternalParser.
// NOTE: This is a proof of concept for an efficient implementation using a JNI binding to TensorFlow's C++ API.
Environment Setup:
- Python must be available.
- TensorFlow must be importable by the Python script (see the TensorFlow setup instructions).
- All dependencies of TensorFlow (such as NumPy) must also be available. Follow the TensorFlow image recognition guide and confirm that it works.
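The prerequisites above can be checked before configuring the parser. The following is a minimal sketch, not part of Tika: the class `TensorflowEnvCheck` and its `commandSucceeds` helper are hypothetical names, and the sketch assumes the interpreter is reachable on the `PATH` as `python` (pass a different executable name as the first argument if it is not). It uses only the JDK's `ProcessBuilder`.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper (not part of Apache Tika): probes the local environment
// for the prerequisites listed under "Environment Setup".
final class TensorflowEnvCheck {

    /** Runs a command and reports whether it exited with status 0 within the timeout. */
    static boolean commandSucceeds(long timeoutSeconds, String... command) {
        ProcessBuilder builder = new ProcessBuilder(command)
                .redirectErrorStream(true)                         // merge stderr into stdout
                .redirectOutput(ProcessBuilder.Redirect.DISCARD);  // output is not needed
        try {
            Process process = builder.start();
            if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
                process.destroyForcibly();                         // timed out
                return false;
            }
            return process.exitValue() == 0;
        } catch (IOException e) {
            // The executable was not found or could not be started.
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Assumption: the interpreter is called "python" unless overridden.
        String python = args.length > 0 ? args[0] : "python";
        System.out.println("python available:      "
                + commandSucceeds(30, python, "--version"));
        System.out.println("numpy importable:      "
                + commandSucceeds(60, python, "-c", "import numpy"));
        System.out.println("tensorflow importable: "
                + commandSucceeds(120, python, "-c", "import tensorflow"));
    }
}
```

If any line prints `false`, the corresponding prerequisite is likely missing for that interpreter, and `isAvailable()` on the parser would be expected to report `false` as well.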
- Since:
  Apache Tika 1.14
- See Also:
  TensorflowRESTRecogniser, Serialized Form
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.tika.parser.external.ExternalParser
ExternalParser.LineConsumer
Field Summary
Fields inherited from class org.apache.tika.parser.external.ExternalParser
INPUT_FILE_TOKEN, OUTPUT_FILE_TOKEN
Constructor Summary
Constructors:
- TensorflowImageRecParser()
Method Summary
Modifier and type, method, and description:
- void checkInitialization(InitializableProblemHandler handler)
- Set<MediaType> getSupportedMimes(): The mimes supported by this recogniser
- void initialize(Map<String,Param> params): This is the hook for configuring the recogniser
- boolean isAvailable(): Is this service available
- List<RecognisedObject> recognise(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context): Recognise the objects in the stream
Methods inherited from class org.apache.tika.parser.external.ExternalParser
check, check, getCommand, getIgnoredLineConsumer, getMetadataExtractionPatterns, getSupportedTypes, getSupportedTypes, parse, setCommand, setIgnoredLineConsumer, setMetadataExtractionPatterns, setSupportedTypes
Method Detail
getSupportedMimes
public Set<MediaType> getSupportedMimes()
Description copied from interface: ObjectRecogniser
The mimes supported by this recogniser
- Specified by:
  getSupportedMimes in interface ObjectRecogniser
- Returns:
  set of media types
isAvailable
public boolean isAvailable()
Description copied from interface: ObjectRecogniser
Is this service available
- Specified by:
  isAvailable in interface ObjectRecogniser
- Returns:
  true when the service is available, false otherwise
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
Description copied from interface: ObjectRecogniser
This is the hook for configuring the recogniser
- Specified by:
  initialize in interface Initializable
- Specified by:
  initialize in interface ObjectRecogniser
- Parameters:
  params - configuration instance in the form of context
- Throws:
  TikaConfigException - when there is an issue with configuration
checkInitialization
public void checkInitialization(InitializableProblemHandler handler) throws TikaConfigException
- Specified by:
  checkInitialization in interface Initializable
- Parameters:
  handler - if there is a problem and no custom InitializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
  TikaConfigException
recognise
public List<RecognisedObject> recognise(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException
Description copied from interface: ObjectRecogniser
Recognise the objects in the stream
- Specified by:
  recognise in interface ObjectRecogniser
- Parameters:
  stream - content stream
  handler - Tika's content handler
  metadata - metadata instance
  context - parser context
- Returns:
  List of RecognisedObjects
- Throws:
  IOException - when an I/O error occurs
  SAXException - when an issue with XML occurs
  TikaException - any generic error