Package org.apache.tika.example
Class TranscribeTranslateExample
java.lang.Object
org.apache.tika.example.TranscribeTranslateExample
This example demonstrates primitive logic for
chaining Tika API calls. In this case translation
could be considered as a downstream process to
transcription.
We simply pass the output of
a call to
Tika.parseToString(Path)
into Translator.translate(String, String).
The GoogleTranslator is configured with a target
language of "en-US".
- Author:
- lewismc
-
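The chaining described above can be sketched without any Tika dependency. The following is a minimal illustration, not the class's actual source: the class name ChainSketch, the method transcribeThenTranslate, and the two lambdas are hypothetical stand-ins. In the real example the first step would correspond to Tika.parseToString(Path) and the second to Translator.translate(String, String), both of which can throw checked exceptions that this sketch omits.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BiFunction;
import java.util.function.Function;

public class ChainSketch {

    /**
     * Feeds the output of a transcription step into a translation step.
     * Both steps are hypothetical stand-ins for the Tika calls named in the
     * class description; they are plain functions so the sketch runs alone.
     */
    public static String transcribeThenTranslate(
            Function<Path, String> transcriber,
            BiFunction<String, String, String> translator,
            Path file,
            String targetLanguage) {
        // Step 1: obtain text from the input file.
        String transcript = transcriber.apply(file);
        // Step 2: pass that text downstream to the translator.
        return translator.apply(transcript, targetLanguage);
    }

    public static void main(String[] args) {
        // Fake implementations, used only to show the data flow.
        Function<Path, String> fakeTranscriber =
                file -> "hola mundo (" + file.getFileName() + ")";
        BiFunction<String, String, String> fakeTranslator =
                (text, lang) -> "[" + lang + "] " + text;

        String result = transcribeThenTranslate(
                fakeTranscriber, fakeTranslator, Paths.get("audio.mp3"), "en-US");
        System.out.println(result); // prints "[en-US] hola mundo (audio.mp3)"
    }
}
```

Passing the two steps in as functions keeps the chaining logic separate from any particular transcription or translation backend.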
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
static String | amazonTranscribe(Path tikaConfig, Path file) | Use AmazonTranscribe to execute transcription on input data.
static String | googleTranslateToEnglish(String text) | Use GoogleTranslator to execute translation on input data.
static void | main(String[] args) | Main method to run this example.
-
Constructor Details
-
TranscribeTranslateExample
public TranscribeTranslateExample()
-
-
Method Details
-
googleTranslateToEnglish
Use GoogleTranslator to execute translation on input data. This implementation needs to be configured as explained in the Javadoc. In this implementation, Google will try to guess the input language. The target language is "en-US".
- Parameters:
text - input text to translate.
- Returns:
- translated text String.
-
amazonTranscribe
Use AmazonTranscribe to execute transcription on input data. This implementation needs to be configured as explained in the Javadoc.
- Parameters:
file - the name of the file (which needs to be on the Java Classpath) to transcribe.
- Returns:
- transcribed text.
- Throws:
Exception
-
main
Main method to run this example. This program can be invoked in one of two ways:
transcribe-translate ${tika-config.xml} ${file}
which executes transcription and then translation on the given resource, or
transcribe ${tika-config.xml} ${file}
which executes only transcription.
- Parameters:
args - either of the commands described above and the input file (which needs to be on the Java Classpath).
${tika-config.xml} must include credentials for AWS and a temporary storage bucket:
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <parser class="org.apache.tika.parser.transcribe.aws.AmazonTranscribe">
      <params>
        <param name="bucket" type="string">bucket</param>
        <param name="clientId" type="string">clientId</param>
        <param name="clientSecret" type="string">clientSecret</param>
      </params>
    </parser>
  </parsers>
</properties>
- Throws:
Exception
-
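The two invocation forms of main can be illustrated with a small dispatch sketch. This is an assumption-laden outline rather than the actual implementation: DispatchSketch, run, and the lambdas are hypothetical names, the transcription and translation steps are replaced by plain functions standing in for amazonTranscribe and googleTranslateToEnglish, and rejecting unknown commands with IllegalArgumentException is an assumed behavior that the documentation above does not specify.

```java
import java.util.function.BiFunction;
import java.util.function.UnaryOperator;

public class DispatchSketch {

    /**
     * Dispatches on the command in args[0]. The transcribe and translate
     * parameters are hypothetical stand-ins for the real service calls.
     */
    public static String run(String[] args,
                             BiFunction<String, String, String> transcribe,
                             UnaryOperator<String> translate) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                    "expected: <command> <tika-config.xml> <file>");
        }
        String command = args[0];
        String tikaConfig = args[1];
        String file = args[2];

        if ("transcribe-translate".equals(command)) {
            // Transcription first, then translation of the transcript.
            return translate.apply(transcribe.apply(tikaConfig, file));
        }
        if ("transcribe".equals(command)) {
            // Transcription only.
            return transcribe.apply(tikaConfig, file);
        }
        // Assumed behavior: unknown commands are rejected.
        throw new IllegalArgumentException("unknown command: " + command);
    }

    public static void main(String[] args) {
        // Fake steps, used only to make the control flow visible.
        BiFunction<String, String, String> fakeTranscribe =
                (config, file) -> "transcript of " + file;
        UnaryOperator<String> fakeTranslate = text -> "[en-US] " + text;

        String[] demo = {"transcribe-translate", "tika-config.xml", "audio.mp3"};
        System.out.println(run(demo, fakeTranscribe, fakeTranslate));
    }
}
```

The command is validated before any work is done, so an unrecognized command never triggers a transcription call.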