Package org.apache.tika.pipes.emitter.s3
Class S3Emitter
- java.lang.Object
-
- org.apache.tika.pipes.emitter.AbstractEmitter
-
- org.apache.tika.pipes.emitter.s3.S3Emitter
-
- All Implemented Interfaces:
Initializable, Emitter, StreamEmitter
public class S3Emitter extends AbstractEmitter implements Initializable, StreamEmitter
Emits to an existing S3 bucket. Example configuration:

  <properties>
    <emitters>
      <emitter class="org.apache.tika.pipes.emitter.s3.S3Emitter">
        <params>
          <!-- required -->
          <param name="name" type="string">s3e</param>
          <!-- required -->
          <param name="region" type="string">us-east-1</param>
          <!-- required -->
          <param name="credentialsProvider" type="string">(profile|instance)</param>
          <!-- required if credentialsProvider=profile -->
          <param name="profile" type="string">my-profile</param>
          <!-- required -->
          <param name="bucket" type="string">my-bucket</param>
          <!-- optional; prefix to add to the path before emitting; default is no prefix -->
          <param name="prefix" type="string">my-prefix</param>
          <!-- optional; default is 'json'. This will be added to the SOURCE_PATH
               if no emitter key is specified. Do not add a "." before the extension -->
          <param name="fileExtension" type="string">json</param>
          <!-- optional; default is 'true'; whether to copy the json to a local file
               before putting to s3 -->
          <param name="spoolToTemp" type="bool">true</param>
        </params>
      </emitter>
    </emitters>
  </properties>
-
-
Constructor Summary
Constructors
Constructor    Description
S3Emitter()
-
Method Summary
All Methods | Instance Methods | Concrete Methods
Modifier and Type    Method    Description
void checkInitialization(InitializableProblemHandler problemHandler)
void emit(String path, InputStream is, Metadata userMetadata)
void emit(String emitKey, List<Metadata> metadataList)
    Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
void initialize(Map<String,Param> params)
    This initializes the s3 client.
void setAccessKey(String accessKey)
void setBucket(String bucket)
void setCredentialsProvider(String credentialsProvider)
void setEndpointConfigurationService(String endpointConfigurationService)
void setFileExtension(String fileExtension)
    If you want to customize the output file's file extension.
void setMaxConnections(int maxConnections)
    Maximum number of http connections allowed.
void setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
void setPrefix(String prefix)
void setProfile(String profile)
void setRegion(String region)
void setSecretKey(String secretKey)
void setSpoolToTemp(boolean spoolToTemp)
    Whether or not to spool the metadatalist to a tmp file before putting object.
-
Methods inherited from class org.apache.tika.pipes.emitter.AbstractEmitter
emit, getName, setName
-
-
-
-
Method Detail
-
emit
public void emit(String emitKey, List<Metadata> metadataList) throws IOException, TikaEmitterException
Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
- Specified by:
emit in interface Emitter
- Parameters:
metadataList -
- Throws:
IOException
TikaException
TikaEmitterException
-
emit
public void emit(String path, InputStream is, Metadata userMetadata) throws IOException, TikaEmitterException
- Specified by:
emit in interface StreamEmitter
- Parameters:
path - object path, not including the bucket
is - inputStream to copy
userMetadata - this will be written to the s3 ObjectMetadata's userMetadata
- Throws:
TikaEmitterException - or IOException if there is a runtime s3 client exception
IOException
-
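The userMetadata parameter above ends up in the S3 object's user metadata, which holds one string value per key, while a Tika Metadata object can hold several values per key. The following is a minimal sketch of one way to flatten such data, using only the JDK. The class `UserMetadataSketch`, the method `toUserMetadata`, and the "keep the first value" rule are assumptions made for illustration and are not taken from the S3Emitter source.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Tika: flattens multi-valued metadata into
// a single-valued map of the kind S3 user metadata can store.
class UserMetadataSketch {

    static Map<String, String> toUserMetadata(Map<String, List<String>> metadata) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : metadata.entrySet()) {
            List<String> values = entry.getValue();
            if (values == null || values.isEmpty()) {
                continue; // nothing to write for this key
            }
            // Assumption: when a key has several values, only the first is kept.
            out.put(entry.getKey(), values.get(0));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        metadata.put("source", List.of("crawler-1"));
        metadata.put("tags", List.of("a", "b"));
        metadata.put("empty", List.of());
        System.out.println(toUserMetadata(metadata));
    }
}
```

If every value must survive, a caller could join the values with a delimiter before emitting instead of dropping the extras.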
setSpoolToTemp
@Field public void setSpoolToTemp(boolean spoolToTemp)
Whether or not to spool the metadata list to a temp file before putting the object. Default: true. If this is set to false, this emitter writes the json object to memory and then puts that into s3.
- Parameters:
spoolToTemp -
-
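To illustrate the trade-off behind spoolToTemp, here is a minimal JDK-only sketch of the two paths: buffering the serialized JSON in memory versus writing it to a temp file and streaming from there. `SpoolSketch`, `emit`, and `upload` are hypothetical names; `upload` merely counts bytes and stands in for the actual S3 put, which is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch, not the S3Emitter implementation.
class SpoolSketch {

    // Stand-in for the S3 put: drains the stream and returns the byte count.
    static long upload(InputStream is) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = is.read(buffer)) != -1) {
            total += read;
        }
        return total;
    }

    static long emit(String json, boolean spoolToTemp) throws IOException {
        byte[] bytes = json.getBytes(StandardCharsets.UTF_8);
        if (!spoolToTemp) {
            // In-memory path: the whole payload stays on the heap until uploaded.
            try (InputStream is = new ByteArrayInputStream(bytes)) {
                return upload(is);
            }
        }
        // Spooled path: write to a temp file, stream from disk, then clean up.
        Path tmp = Files.createTempFile("emit-sketch-", ".json");
        try {
            Files.write(tmp, bytes);
            try (InputStream is = Files.newInputStream(tmp)) {
                return upload(is);
            }
        } finally {
            Files.deleteIfExists(tmp);
        }
    }

    public static void main(String[] args) throws IOException {
        String json = "[{\"k\":\"v\"}]";
        System.out.println(emit(json, true) == emit(json, false));
    }
}
```

In this sketch the JSON is already a String, so both paths hold it in memory at some point; a real spooling implementation would presumably serialize straight to the temp file so that large metadata lists never need to fit on the heap.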
setFileExtension
@Field public void setFileExtension(String fileExtension)
Customizes the output file's extension. Do not include the leading ".".
- Parameters:
fileExtension -
-
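The prefix and fileExtension settings both shape the final object key. The sketch below shows one plausible composition, assuming the prefix is joined to the path with "/" and the extension is appended after a "."; the actual emitter may differ. `EmitKeySketch` and `buildKey` are hypothetical names used only for illustration.

```java
// Hypothetical sketch of how an object key might be assembled; not the
// S3Emitter implementation.
class EmitKeySketch {

    static String buildKey(String prefix, String path, String fileExtension) {
        String key = path;
        if (prefix != null && !prefix.trim().isEmpty()) {
            // Assumption: prefix and path are joined with a single "/".
            key = prefix + "/" + key;
        }
        if (fileExtension != null && !fileExtension.trim().isEmpty()) {
            // Assumption: the "." is added here, so the extension must not carry one.
            key = key + "." + fileExtension;
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(buildKey("my-prefix", "path/to/my/file.txt", "json"));
        // Passing ".json" here shows why the leading "." should be left out.
        System.out.println(buildKey(null, "file.txt", ".json"));
    }
}
```

Under these assumptions, an extension configured as ".json" would yield a doubled dot in the key, which is the likely reason for the "do not include the '.'" rule.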
setMaxConnections
@Field public void setMaxConnections(int maxConnections)
Maximum number of http connections allowed. This should be greater than or equal to the number of threads emitting to S3.
- Parameters:
maxConnections -
-
setEndpointConfigurationService
@Field public void setEndpointConfigurationService(String endpointConfigurationService)
-
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
This initializes the s3 client. Note, we wrap S3's RuntimeExceptions, e.g. AmazonClientException, in a TikaConfigException.
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
-
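The wrapping behaviour described for initialize can be pictured with a small JDK-only sketch: a runtime failure during client construction is caught and rethrown as a checked configuration exception, with the original kept as the cause. `InitWrapSketch` and its nested `ConfigException` are hypothetical stand-ins (the latter for TikaConfigException), and the `Supplier` stands in for building the S3 client.

```java
import java.util.function.Supplier;

// Hypothetical sketch of wrapping runtime failures in a checked exception.
class InitWrapSketch {

    // Stand-in for TikaConfigException.
    static class ConfigException extends Exception {
        ConfigException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // The supplier stands in for constructing the S3 client.
    static <T> T initialize(Supplier<T> clientFactory) throws ConfigException {
        try {
            return clientFactory.get();
        } catch (RuntimeException e) {
            // Keep the original exception as the cause so details are not lost.
            throw new ConfigException("can't initialize s3 emitter", e);
        }
    }

    public static void main(String[] args) {
        try {
            InitWrapSketch.<String>initialize(() -> {
                throw new IllegalStateException("bad region");
            });
        } catch (ConfigException e) {
            System.out.println(e.getMessage() + " caused by " + e.getCause());
        }
    }
}
```

The benefit of this pattern is that callers only need to handle one checked exception type at configuration time, rather than anticipating SDK-specific runtime exceptions.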
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
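The page does not say what checkInitialization verifies. As a rough illustration only, the sketch below checks the "required" rules stated in the configuration comments for this class: region, bucket, and credentialsProvider must be set, and profile must be set when credentialsProvider=profile. `RequiredParamsSketch` and `findProblems` are hypothetical names, and the real method may check more, less, or different things.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical validation sketch based only on the documented "required" rules.
class RequiredParamsSketch {

    private static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    static List<String> findProblems(String region, String bucket,
                                     String credentialsProvider, String profile) {
        List<String> problems = new ArrayList<>();
        if (isBlank(region)) {
            problems.add("region must not be empty");
        }
        if (isBlank(bucket)) {
            problems.add("bucket must not be empty");
        }
        if (isBlank(credentialsProvider)) {
            problems.add("credentialsProvider must not be empty");
        } else if ("profile".equals(credentialsProvider) && isBlank(profile)) {
            problems.add("profile must be set when credentialsProvider=profile");
        }
        return problems;
    }

    public static void main(String[] args) {
        System.out.println(findProblems("us-east-1", "my-bucket", "profile", null));
    }
}
```

Collecting all problems into a list, rather than failing on the first one, lets a caller report every misconfiguration in a single pass.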
setPathStyleAccessEnabled
@Field public void setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
-
-