Package org.apache.tika.example
Class TranscribeTranslateExample
java.lang.Object
org.apache.tika.example.TranscribeTranslateExample
This example demonstrates primitive logic for chaining Tika API calls. In this case translation could be considered as a downstream process to transcription. We simply pass the output of a call to Tika.parseToString(Path) into Translator.translate(String, String). The GoogleTranslator is configured with a target language of "en-US".
Author:
lewismc
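The chaining described above can be sketched without any Tika dependency. This is a minimal illustration, not the class's actual source: `TRANSCRIBE` and `TRANSLATE` are hypothetical stand-ins that play the roles of Tika.parseToString(Path) and Translator.translate(String, String), and their bodies just build marker strings so the data flow is visible.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BiFunction;
import java.util.function.Function;

final class ChainingSketch {

    // Hypothetical stand-in for the transcription step (file -> text).
    // In the documented example this role is played by Tika.parseToString(Path).
    static final Function<Path, String> TRANSCRIBE =
            file -> "texto de " + file.getFileName();

    // Hypothetical stand-in for the translation step (text, target language -> text).
    // In the documented example this role is played by Translator.translate(String, String).
    static final BiFunction<String, String, String> TRANSLATE =
            (text, targetLanguage) -> "[" + targetLanguage + "] " + text;

    // Chain the two steps: the transcription output becomes the translation input.
    static String transcribeThenTranslate(Path file, String targetLanguage) {
        String transcript = TRANSCRIBE.apply(file);
        return TRANSLATE.apply(transcript, targetLanguage);
    }

    public static void main(String[] args) {
        // prints "[en-US] texto de audio.mp3"
        System.out.println(transcribeThenTranslate(Paths.get("audio.mp3"), "en-US"));
    }
}
```

Keeping each step behind a small function type makes it easy to swap the stand-ins for real Tika calls, or to add further downstream steps later.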
Constructor Summary
Method Summary
Modifier and Type | Method | Description
static String | amazonTranscribe(Path tikaConfig, Path file) | Use AmazonTranscribe to execute transcription on input data.
static String | googleTranslateToEnglish(text) | Use GoogleTranslator to execute translation on input data.
static void | main(args) | Main method to run this example.
Constructor Details
TranscribeTranslateExample
public TranscribeTranslateExample()
Method Details
googleTranslateToEnglish
Use GoogleTranslator to execute translation on input data. This implementation needs to be configured as explained in the Javadoc. In this implementation, Google will try to guess the input language. The target language is "en-US".
Parameters:
text - input text to translate.
Returns:
translated text String.
amazonTranscribe
Use AmazonTranscribe to execute transcription on input data. This implementation needs to be configured as explained in the Javadoc.
Parameters:
file - the name of the file (which needs to be on the Java Classpath) to transcribe.
Returns:
transcribed text.
Throws:
Exception
main
Main method to run this example. This program can be invoked as follows:
transcribe-translate ${tika-config.xml} ${file}
which executes transcription and then translation on the given resource, or
transcribe ${tika-config.xml} ${file}
which executes only transcription.
Parameters:
args - either of the commands described above and the input file (which needs to be on the Java Classpath).
${tika-config.xml} must include credentials for AWS and a temporary storage bucket:
<properties>
  <parsers>
    <parser class="org.apache.tika.parser.DefaultParser"/>
    <parser class="org.apache.tika.parser.transcribe.aws.AmazonTranscribe">
      <params>
        <param name="bucket" type="string">bucket</param>
        <param name="clientId" type="string">clientId</param>
        <param name="clientSecret" type="string">clientSecret</param>
      </params>
    </parser>
  </parsers>
</properties>
Throws:
Exception
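The command selection in main could look roughly like the sketch below. It is not the class's actual source. It assumes the argument order shown in the invocation lines (command, then config path, then file), and it assumes `transcribe` runs transcription alone while `transcribe-translate` runs both steps. The `transcriber` and `translator` parameters are hypothetical stand-ins for amazonTranscribe and googleTranslateToEnglish, so the sketch runs without AWS or Google credentials.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.BiFunction;
import java.util.function.UnaryOperator;

final class CommandDispatchSketch {

    // Selects the pipeline from args = {command, tika-config path, file path}.
    // 'transcriber' and 'translator' are hypothetical stand-ins for
    // amazonTranscribe and googleTranslateToEnglish.
    static String run(String[] args,
                      BiFunction<Path, Path, String> transcriber,
                      UnaryOperator<String> translator) {
        if (args.length < 3) {
            throw new IllegalArgumentException(
                    "usage: <transcribe|transcribe-translate> <tika-config.xml> <file>");
        }
        Path tikaConfig = Paths.get(args[1]);
        Path file = Paths.get(args[2]);
        switch (args[0]) {
            case "transcribe":
                // Transcription only.
                return transcriber.apply(tikaConfig, file);
            case "transcribe-translate":
                // Transcription first, then translation of the transcript.
                return translator.apply(transcriber.apply(tikaConfig, file));
            default:
                throw new IllegalArgumentException("unknown command: " + args[0]);
        }
    }

    public static void main(String[] args) {
        BiFunction<Path, Path, String> fakeTranscriber =
                (config, file) -> "transcript of " + file.getFileName();
        UnaryOperator<String> fakeTranslator = text -> "translated(" + text + ")";
        // prints "translated(transcript of audio.mp3)"
        System.out.println(run(
                new String[] {"transcribe-translate", "tika-config.xml", "audio.mp3"},
                fakeTranscriber, fakeTranslator));
    }
}
```

Rejecting unknown commands and short argument lists up front gives a clear usage message instead of a failure deep inside a parser call.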
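Before running the example, it can help to confirm that the configuration file carries the three parameters listed above. The following is a minimal sketch using only the JDK's DOM parser. It is not how Tika itself loads its configuration, and the class and method names (`TranscribeConfigCheck`, `missingParams`) are hypothetical. The parser class name and the parameter names are taken from the configuration shown above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

final class TranscribeConfigCheck {

    static final String PARSER_CLASS =
            "org.apache.tika.parser.transcribe.aws.AmazonTranscribe";
    static final String[] REQUIRED = {"bucket", "clientId", "clientSecret"};

    // Returns the required param names that are absent or empty
    // in the AmazonTranscribe <parser> entry of the given XML.
    static List<String> missingParams(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("config is not well-formed XML", e);
        }
        Set<String> present = new HashSet<>();
        NodeList parsers = doc.getElementsByTagName("parser");
        for (int i = 0; i < parsers.getLength(); i++) {
            Element parser = (Element) parsers.item(i);
            if (!PARSER_CLASS.equals(parser.getAttribute("class"))) {
                continue; // skip DefaultParser and any other parser entries
            }
            NodeList params = parser.getElementsByTagName("param");
            for (int j = 0; j < params.getLength(); j++) {
                Element param = (Element) params.item(j);
                if (!param.getTextContent().trim().isEmpty()) {
                    present.add(param.getAttribute("name"));
                }
            }
        }
        List<String> missing = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!present.contains(name)) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        String xml = "<properties><parsers>"
                + "<parser class=\"" + PARSER_CLASS + "\"><params>"
                + "<param name=\"bucket\" type=\"string\">bucket</param>"
                + "<param name=\"clientId\" type=\"string\">clientId</param>"
                + "</params></parser></parsers></properties>";
        // prints "[clientSecret]"
        System.out.println(missingParams(xml));
    }
}
```

An empty result means all three parameters are present with non-empty values. The check does not verify that the credentials are valid.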