Package org.apache.tika.pipes.emitter.s3
Class S3Emitter
- java.lang.Object
- org.apache.tika.pipes.emitter.AbstractEmitter
- org.apache.tika.pipes.emitter.s3.S3Emitter
- All Implemented Interfaces:
Initializable, Emitter, StreamEmitter
public class S3Emitter extends AbstractEmitter implements Initializable, StreamEmitter
Emits to an existing S3 bucket. Example configuration:

<properties>
  <emitters>
    <emitter class="org.apache.tika.pipes.emitter.s3.S3Emitter">
      <params>
        <!-- required -->
        <param name="name" type="string">s3e</param>
        <!-- required -->
        <param name="region" type="string">us-east-1</param>
        <!-- required -->
        <param name="credentialsProvider" type="string">(profile|instance)</param>
        <!-- required if credentialsProvider=profile -->
        <param name="profile" type="string">my-profile</param>
        <!-- required -->
        <param name="bucket" type="string">my-bucket</param>
        <!-- optional; prefix to add to the path before emitting; default is no prefix -->
        <param name="prefix" type="string">my-prefix</param>
        <!-- optional; default is 'json'. This is added to the SOURCE_PATH if no emitter key
             is specified. Do not add a "." before the extension -->
        <param name="fileExtension" type="string">json</param>
        <!-- optional; default is 'true'; whether to copy the json to a local file
             before putting to s3 -->
        <param name="spoolToTemp" type="bool">true</param>
      </params>
    </emitter>
  </emitters>
</properties>
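The prefix and fileExtension parameters above both affect the key of the emitted object. The following is a minimal sketch of how such a key might be assembled. It is not the emitter's actual code: the class EmitKeySketch and the method buildKey are hypothetical, and joining the prefix with "/" and the extension with "." is an assumption based on the parameter descriptions.

```java
final class EmitKeySketch {

    /**
     * Hypothetical helper, not part of S3Emitter: joins an optional prefix,
     * the path, and an optional extension into one object key.
     */
    static String buildKey(String prefix, String path, String fileExtension) {
        StringBuilder key = new StringBuilder();
        if (prefix != null && !prefix.isEmpty()) {
            // Assumption: prefix and path are joined with a single "/".
            String trimmed = prefix.endsWith("/")
                    ? prefix.substring(0, prefix.length() - 1)
                    : prefix;
            key.append(trimmed).append('/');
        }
        key.append(path);
        if (fileExtension != null && !fileExtension.isEmpty()) {
            // The extension is configured without the leading ".", so it is added here.
            key.append('.').append(fileExtension);
        }
        return key.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildKey("my-prefix", "path/to/my/file.txt", "json"));
        System.out.println(buildKey(null, "path/to/my/file.txt", "json"));
    }
}
```

Under these assumptions, the values from the configuration above would map path/to/my/file.txt to my-prefix/path/to/my/file.txt.json.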
-
-
Constructor Summary
Constructors
Constructor    Description
S3Emitter()
-
Method Summary
Modifier and Type    Method    Description
void    checkInitialization(InitializableProblemHandler problemHandler)
void    emit(String path, InputStream is, Metadata userMetadata, ParseContext parseContext)
void    emit(String emitKey, List<Metadata> metadataList, ParseContext parseContext)
        Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
void    initialize(Map<String,Param> params)
        This initializes the s3 client.
void    setAccessKey(String accessKey)
void    setBucket(String bucket)
void    setCredentialsProvider(String credentialsProvider)
void    setEndpointConfigurationService(String endpointConfigurationService)
void    setFileExtension(String fileExtension)
        Customizes the output file's extension.
void    setMaxConnections(int maxConnections)
        Maximum number of HTTP connections allowed.
void    setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
void    setPrefix(String prefix)
void    setProfile(String profile)
void    setRegion(String region)
void    setSecretKey(String secretKey)
void    setSpoolToTemp(boolean spoolToTemp)
        Whether or not to spool the metadata list to a temp file before putting the object.
Methods inherited from class org.apache.tika.pipes.emitter.AbstractEmitter
emit, getName, setName
-
Method Detail
-
emit
public void emit(String emitKey, List<Metadata> metadataList, ParseContext parseContext) throws IOException, TikaEmitterException
Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
- Specified by:
emit in interface Emitter
- Parameters:
metadataList -
- Throws:
IOException
TikaException
TikaEmitterException
-
emit
public void emit(String path, InputStream is, Metadata userMetadata, ParseContext parseContext) throws IOException, TikaEmitterException
- Specified by:
emit in interface StreamEmitter
- Parameters:
path - object path, not including the bucket
is - inputStream to copy
userMetadata - this will be written to the s3 ObjectMetadata's userMetadata
- Throws:
TikaEmitterException - or IOException if there is a runtime S3 client exception
IOException
-
setSpoolToTemp
@Field public void setSpoolToTemp(boolean spoolToTemp)
Whether or not to spool the metadata list to a temp file before putting the object. Default: true. If this is set to false, this emitter writes the json object to memory and then puts that into s3.
- Parameters:
spoolToTemp -
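The two modes described for setSpoolToTemp can be sketched as follows. This is an illustration only, not the emitter's implementation: SpoolSketch, emit, and putObject are hypothetical names, and putObject is a stand-in that only counts bytes instead of calling S3.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SpoolSketch {

    /** Hypothetical stand-in for the S3 put: drains the stream and returns its length. */
    static long putObject(InputStream is) throws IOException {
        byte[] buf = new byte[8192];
        long total = 0;
        int n;
        while ((n = is.read(buf)) != -1) {
            total += n;
        }
        return total;
    }

    /** Writes the JSON either to a temp file or to memory, then hands it to putObject. */
    static long emit(String json, boolean spoolToTemp) {
        try {
            if (spoolToTemp) {
                // Spool to disk first so that large output does not sit in memory.
                Path tmp = Files.createTempFile("s3-emit-sketch", ".json");
                try {
                    try (Writer w = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                        w.write(json);
                    }
                    try (InputStream is = Files.newInputStream(tmp)) {
                        return putObject(is);
                    }
                } finally {
                    Files.deleteIfExists(tmp);
                }
            }
            // Otherwise buffer the whole JSON document in memory.
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (Writer w = new OutputStreamWriter(bos, StandardCharsets.UTF_8)) {
                w.write(json);
            }
            return putObject(new ByteArrayInputStream(bos.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String json = "[{\"title\":\"example\"}]";
        System.out.println(emit(json, true));
        System.out.println(emit(json, false));
    }
}
```

Both calls hand the same bytes to the upload step. The difference is where the serialized JSON lives in the meantime, which is the trade-off this setting controls.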
-
setFileExtension
@Field public void setFileExtension(String fileExtension)
Customizes the output file's extension. Do not include the leading ".".
- Parameters:
fileExtension -
-
setMaxConnections
@Field public void setMaxConnections(int maxConnections)
Maximum number of HTTP connections allowed. This should be greater than or equal to the number of threads emitting to S3.
- Parameters:
maxConnections -
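A toy model may help show why maxConnections should not be smaller than the number of emitting threads. The sketch below uses a Semaphore as a stand-in for the HTTP connection pool, which is an assumption made for illustration. ConnectionLimitSketch and connectionsGranted are hypothetical names, and no real threads or S3 calls are involved.

```java
import java.util.concurrent.Semaphore;

final class ConnectionLimitSketch {

    /**
     * Returns how many of the emitting threads could hold a connection at the
     * same time when the pool allows at most maxConnections.
     */
    static int connectionsGranted(int maxConnections, int emitterThreads) {
        // One permit per allowed connection.
        Semaphore pool = new Semaphore(maxConnections);
        int granted = 0;
        for (int i = 0; i < emitterThreads; i++) {
            // Each simulated thread tries to take a connection without waiting.
            if (pool.tryAcquire()) {
                granted++;
            }
        }
        return granted;
    }

    public static void main(String[] args) {
        System.out.println(connectionsGranted(2, 3));
        System.out.println(connectionsGranted(4, 3));
    }
}
```

With 2 connections and 3 threads only 2 requests are granted, so in this model the third thread would have to wait for a free connection. With 4 connections all 3 threads proceed.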
-
setEndpointConfigurationService
@Field public void setEndpointConfigurationService(String endpointConfigurationService)
-
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
This initializes the s3 client. Note: S3's RuntimeExceptions, e.g. AmazonClientException, are wrapped in a TikaConfigException.
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
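The wrapping behaviour noted for initialize follows a common pattern: an unchecked exception from client construction is caught and rethrown as a checked configuration exception. The sketch below shows that pattern in isolation. InitSketch and ConfigException are hypothetical stand-ins for the emitter and TikaConfigException, and the Supplier stands in for building the S3 client.

```java
import java.util.function.Supplier;

final class InitSketch {

    /** Hypothetical checked exception standing in for TikaConfigException. */
    static final class ConfigException extends Exception {
        ConfigException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Builds the client, converting any RuntimeException into a ConfigException. */
    static <T> T initialize(Supplier<T> clientFactory) throws ConfigException {
        try {
            return clientFactory.get();
        } catch (RuntimeException e) {
            // Keep the original exception as the cause so the root problem stays visible.
            throw new ConfigException("can't initialize client", e);
        }
    }

    public static void main(String[] args) {
        try {
            InitSketch.<String>initialize(() -> {
                throw new IllegalStateException("bad region");
            });
        } catch (ConfigException e) {
            System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
        }
    }
}
```

Callers then handle a single checked exception type at configuration time and can still inspect the underlying client error through getCause().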
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
setPathStyleAccessEnabled
@Field public void setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
-
-