public class S3Emitter extends AbstractEmitter implements Initializable, StreamEmitter
<properties>
  <emitters>
    <emitter class="org.apache.tika.pipes.emitter.s3.S3Emitter">
      <params>
        <!-- required -->
        <param name="name" type="string">s3e</param>
        <!-- required -->
        <param name="region" type="string">us-east-1</param>
        <!-- required -->
        <param name="credentialsProvider" type="string">(profile|instance)</param>
        <!-- required if credentialsProvider=profile -->
        <param name="profile" type="string">my-profile</param>
        <!-- required -->
        <param name="bucket" type="string">my-bucket</param>
        <!-- optional; prefix to add to the path before emitting; default is no prefix -->
        <param name="prefix" type="string">my-prefix</param>
        <!-- optional; default is 'json'; this will be added to the SOURCE_PATH
             if no emitter key is specified. Do not add a "." before the extension -->
        <param name="fileExtension" type="string">json</param>
        <!-- optional; default is 'true'; whether to copy the json to a local file
             before putting to s3 -->
        <param name="spoolToTemp" type="bool">true</param>
      </params>
    </emitter>
  </emitters>
</properties>
Constructor and Description |
---|
S3Emitter() |
Modifier and Type | Method and Description |
---|---|
void | checkInitialization(InitializableProblemHandler problemHandler) |
void | emit(String path, InputStream is, Metadata userMetadata) |
void | emit(String emitKey, List<Metadata> metadataList): Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH. |
void | initialize(Map<String,Param> params): This initializes the s3 client. |
void | setBucket(String bucket) |
void | setCredentialsProvider(String credentialsProvider) |
void | setFileExtension(String fileExtension): If you want to customize the output file's file extension. |
void | setMaxConnections(int maxConnections): Maximum number of http connections allowed. |
void | setPrefix(String prefix) |
void | setProfile(String profile) |
void | setRegion(String region) |
void | setSpoolToTemp(boolean spoolToTemp): Whether or not to spool the metadatalist to a tmp file before putting object. |
Methods inherited from class AbstractEmitter: emit, getName, setName
public void emit(String emitKey, List<Metadata> metadataList) throws IOException, TikaEmitterException
Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
Specified by: emit in interface Emitter
Parameters:
metadataList
Throws:
IOException
TikaException
TikaEmitterException
public void emit(String path, InputStream is, Metadata userMetadata) throws IOException, TikaEmitterException
Specified by: emit in interface StreamEmitter
Parameters:
path - object path, not including the bucket
is - inputStream to copy
userMetadata - this will be written to the s3 ObjectMetadata's userMetadata
Throws:
TikaEmitterException - or IOException if there is a runtime s3 client exception
IOException
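The prefix and fileExtension settings both affect the object key that is written. The following is a minimal sketch of one plausible way the key could be composed. It assumes the prefix is joined with "/" and the extension with "."; this page does not state the exact joining rules. The class S3KeySketch and the method buildKey are hypothetical names, not part of S3Emitter.

```java
public class S3KeySketch {

    /**
     * Hypothetical helper (not part of S3Emitter): joins an optional prefix,
     * the object path, and an optional file extension into a single key.
     * Assumes "/" joins the prefix and "." joins the extension.
     */
    static String buildKey(String prefix, String path, String fileExtension) {
        String key = path;
        if (prefix != null && !prefix.trim().isEmpty()) {
            key = prefix + "/" + key;
        }
        if (fileExtension != null && !fileExtension.trim().isEmpty()) {
            // The extension is configured without a leading ".", so add it here.
            key = key + "." + fileExtension;
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(buildKey("my-prefix", "path/to/my/file.txt", "json"));
        System.out.println(buildKey(null, "path/to/my/file.txt", "json"));
    }
}
```

Under these assumptions the bucket name never appears in the key, which matches the note that path does not include the bucket.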
@Field public void setSpoolToTemp(boolean spoolToTemp)
Whether or not to spool the metadatalist to a tmp file before putting object. Default: true. If this is set to false, this emitter writes the json object to memory and then puts that into s3.
Parameters:
spoolToTemp
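The trade-off behind spoolToTemp can be illustrated with a small, self-contained sketch. It is not the S3Emitter implementation: putObject here is a hypothetical stand-in that only counts bytes, and JsonSource is an assumed callback that serializes the json. With spooling on, the json goes to a temporary file first, so it is never held fully in memory; with spooling off, it is buffered in a byte array.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class SpoolSketch {

    /** Hypothetical callback that serializes the json to a writer. */
    interface JsonSource {
        void writeTo(Writer w) throws IOException;
    }

    /** Hypothetical stand-in for the s3 put: reads the stream and returns the byte count. */
    static long putObject(InputStream is) throws IOException {
        long total = 0;
        byte[] buf = new byte[8192];
        int read;
        while ((read = is.read(buf)) != -1) {
            total += read;
        }
        return total;
    }

    static long emit(JsonSource source, boolean spoolToTemp) {
        try {
            return spoolToTemp ? viaTempFile(source) : viaMemory(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Write to a temp file, stream it to the put call, then always clean up.
    private static long viaTempFile(JsonSource source) throws IOException {
        Path tmp = Files.createTempFile("s3-emit-", ".json");
        try {
            try (Writer w = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                source.writeTo(w);
            }
            try (InputStream is = Files.newInputStream(tmp)) {
                return putObject(is);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    // Buffer the whole json in memory before handing it to the put call.
    private static long viaMemory(JsonSource source) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(bos, StandardCharsets.UTF_8)) {
            source.writeTo(w);
        }
        return putObject(new ByteArrayInputStream(bos.toByteArray()));
    }

    public static void main(String[] args) {
        JsonSource json = w -> w.write("{\"title\":\"example\"}");
        // Both paths hand the same bytes to the put call.
        System.out.println(emit(json, true) == emit(json, false));
    }
}
```

Spooling costs some disk I/O but keeps memory use flat for large metadata lists, which is a plausible reason for the default of true.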
@Field public void setFileExtension(String fileExtension)
If you want to customize the output file's file extension.
Parameters:
fileExtension
@Field public void setMaxConnections(int maxConnections)
Maximum number of http connections allowed.
Parameters:
maxConnections
public void initialize(Map<String,Param> params) throws TikaConfigException
This initializes the s3 client.
Specified by: initialize in interface Initializable
Parameters:
params - params to use for initialization
Throws:
TikaConfigException
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
Specified by: checkInitialization in interface Initializable
Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
Throws:
TikaConfigException
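The required and optional markers in the configuration example suggest the kind of checks an initialization step might perform. The sketch below is an assumption-laden illustration, not the actual checkInitialization logic: S3ConfigCheckSketch, findProblems, and params are hypothetical names, and the rules are taken only from the comments in the XML example on this page.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class S3ConfigCheckSketch {

    /** Hypothetical convenience: builds a param map from alternating key/value strings. */
    static Map<String, String> params(String... keyValues) {
        Map<String, String> map = new HashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    /** Returns a list of problems; an empty list means the params look usable. */
    static List<String> findProblems(Map<String, String> params) {
        List<String> problems = new ArrayList<>();
        // Params marked "required" in the XML example.
        for (String required : new String[] {"name", "region", "credentialsProvider", "bucket"}) {
            if (isBlank(params.get(required))) {
                problems.add("missing required param: " + required);
            }
        }
        String provider = params.get("credentialsProvider");
        if (!isBlank(provider)) {
            if (!provider.equals("profile") && !provider.equals("instance")) {
                problems.add("credentialsProvider must be 'profile' or 'instance'");
            } else if (provider.equals("profile") && isBlank(params.get("profile"))) {
                // "profile" is required only when credentialsProvider=profile.
                problems.add("missing required param: profile");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        Map<String, String> config = params(
                "name", "s3e", "region", "us-east-1",
                "credentialsProvider", "profile", "bucket", "my-bucket");
        // "profile" is absent, so one problem is reported.
        findProblems(config).forEach(System.out::println);
    }
}
```

Collecting every problem before reporting, rather than failing on the first one, lets a caller fix a config file in one pass.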
Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.