Package org.apache.tika.pipes.fetcher.s3
Class S3Fetcher
java.lang.Object
  org.apache.tika.pipes.fetcher.AbstractFetcher
    org.apache.tika.pipes.fetcher.s3.S3Fetcher
- All Implemented Interfaces:
Initializable, Fetcher, RangeFetcher
public class S3Fetcher extends AbstractFetcher implements Initializable, RangeFetcher
Fetches files from S3. Example file: s3://my_bucket/path/to/my_file.pdf. The bucket (here, my_bucket) must be specified via the tika-config or set before initialization, and the fetch key is "path/to/my_file.pdf".
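To make the bucket/fetch-key split concrete, here is a minimal sketch that separates an s3:// location into its two parts. The class S3UriSketch and the method splitS3Uri are hypothetical names used only for illustration; they are not part of Tika, and S3Fetcher itself expects the bucket to be configured separately rather than parsed from a URI.

```java
public class S3UriSketch {

    /**
     * Splits an s3:// location into { bucket, fetchKey }.
     * Hypothetical helper for illustration only; not part of the Tika API.
     */
    static String[] splitS3Uri(String s3Uri) {
        String scheme = "s3://";
        if (!s3Uri.startsWith(scheme)) {
            throw new IllegalArgumentException("Not an s3:// location: " + s3Uri);
        }
        String rest = s3Uri.substring(scheme.length());
        int slash = rest.indexOf('/');
        // Require a non-empty bucket and a non-empty key.
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Expected s3://bucket/key: " + s3Uri);
        }
        return new String[] {rest.substring(0, slash), rest.substring(slash + 1)};
    }

    public static void main(String[] args) {
        String[] parts = splitS3Uri("s3://my_bucket/path/to/my_file.pdf");
        System.out.println("bucket   = " + parts[0]);
        System.out.println("fetchKey = " + parts[1]);
    }
}
```

The string is split by hand rather than with java.net.URI because bucket names such as my_bucket contain an underscore, which java.net.URI does not accept as a host name.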
-
-
Constructor Summary
Constructors
Constructor | Description
S3Fetcher() |
S3Fetcher(S3FetcherConfig s3FetcherConfig) |
-
Method Summary
Modifier and Type | Method | Description
void | checkInitialization(InitializableProblemHandler problemHandler) |
InputStream | fetch(String fetchKey, long startRange, long endRange, Metadata metadata, ParseContext parseContext) |
InputStream | fetch(String fetchKey, Metadata metadata, ParseContext parseContext) |
long[] | getThrottleSeconds() |
void | initialize(Map<String,Param> params) | This initializes the s3 client.
void | setAccessKey(String accessKey) |
void | setBucket(String bucket) |
void | setCredentialsProvider(String credentialsProvider) |
void | setEndpointConfigurationService(String endpointConfigurationService) |
void | setExtractUserMetadata(boolean extractUserMetadata) | Whether or not to extract user metadata from the S3Object
void | setMaxConnections(int maxConnections) |
void | setMaxLength(long maxLength) |
void | setPathStyleAccessEnabled(boolean pathStyleAccessEnabled) |
void | setPrefix(String prefix) | prefix to prepend to the fetch key before fetching.
void | setProfile(String profile) |
void | setRegion(String region) |
void | setSecretKey(String secretKey) |
void | setSleepBeforeRetryMillis(long sleepBeforeRetryMillis) | Deprecated.
void | setSpoolToTemp(boolean spoolToTemp) |
void | setThrottleSeconds(long[] throttleSeconds) |
void | setThrottleSeconds(String commaDelimitedLongs) | Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600
-
Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
getName, setName
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.tika.pipes.fetcher.RangeFetcher
fetch
-
-
-
-
Constructor Detail
-
S3Fetcher
public S3Fetcher()
-
S3Fetcher
public S3Fetcher(S3FetcherConfig s3FetcherConfig)
-
-
Method Detail
-
fetch
public InputStream fetch(String fetchKey, Metadata metadata, ParseContext parseContext) throws TikaException, IOException
- Specified by:
fetch in interface Fetcher
- Throws:
TikaException
IOException
-
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata, ParseContext parseContext) throws TikaException, IOException
- Specified by:
fetch in interface RangeFetcher
- Throws:
TikaException
IOException
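As a rough illustration of what a byte-range fetch asks for, the sketch below formats a start/end pair as an HTTP Range header value. It assumes that both startRange and endRange are inclusive byte offsets, which is how HTTP byte ranges work; this page does not state how S3Fetcher interprets endRange, so treat that as an assumption. RangeSketch, toRangeHeader and rangeLength are hypothetical names, not part of Tika.

```java
public class RangeSketch {

    /**
     * Formats an inclusive byte range as an HTTP Range header value.
     * Assumption: both offsets are inclusive, as in HTTP byte ranges.
     */
    static String toRangeHeader(long startRange, long endRange) {
        if (startRange < 0 || endRange < startRange) {
            throw new IllegalArgumentException(
                    "Invalid range: " + startRange + "-" + endRange);
        }
        return "bytes=" + startRange + "-" + endRange;
    }

    /** Number of bytes covered by an inclusive range. */
    static long rangeLength(long startRange, long endRange) {
        return endRange - startRange + 1;
    }

    public static void main(String[] args) {
        System.out.println(toRangeHeader(0, 1023));
        System.out.println(rangeLength(0, 1023) + " bytes");
    }
}
```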
-
setSpoolToTemp
@Field public void setSpoolToTemp(boolean spoolToTemp)
-
setThrottleSeconds
@Field public void setThrottleSeconds(String commaDelimitedLongs) throws TikaConfigException
Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600
- Parameters:
commaDelimitedLongs - comma-delimited list of retry throttle intervals, in seconds
- Throws:
TikaConfigException
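The comma-delimited format can be illustrated with a small, dependency-free parser. This is a sketch of the idea only, not Tika's implementation: ThrottleSecondsSketch and parseThrottleSeconds are hypothetical names, whitespace trimming is an assumption, and the sketch throws IllegalArgumentException where the real setter declares TikaConfigException.

```java
import java.util.Arrays;

public class ThrottleSecondsSketch {

    /**
     * Parses a list such as "30,60,120,600" into a long[] of seconds.
     * Hypothetical helper; the real setter throws TikaConfigException.
     */
    static long[] parseThrottleSeconds(String commaDelimitedLongs) {
        String[] parts = commaDelimitedLongs.split(",");
        long[] seconds = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            try {
                // Trimming surrounding whitespace is an assumption of this sketch.
                seconds[i] = Long.parseLong(parts[i].trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "Not a long in throttle list: '" + parts[i] + "'", e);
            }
        }
        return seconds;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseThrottleSeconds("30,60,120,600")));
    }
}
```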
-
setThrottleSeconds
public void setThrottleSeconds(long[] throttleSeconds)
-
getThrottleSeconds
public long[] getThrottleSeconds()
-
setPrefix
@Field public void setPrefix(String prefix)
Prefix to prepend to the fetch key before fetching. A '/' is automatically added at the end of the prefix.
- Parameters:
prefix - the prefix to prepend to the fetch key
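The effect of a prefix on the object key can be sketched as follows. The names PrefixSketch, normalizePrefix and toObjectKey are hypothetical, and the sketch assumes the '/' is only appended when the prefix does not already end with one; the description above does not say how an existing trailing '/' is handled.

```java
public class PrefixSketch {

    /**
     * Ensures the prefix ends with '/'.
     * Assumption: a prefix that already ends with '/' is left unchanged.
     */
    static String normalizePrefix(String prefix) {
        return prefix.endsWith("/") ? prefix : prefix + "/";
    }

    /** Builds the object key as normalized prefix + fetch key. */
    static String toObjectKey(String prefix, String fetchKey) {
        return normalizePrefix(prefix) + fetchKey;
    }

    public static void main(String[] args) {
        System.out.println(toObjectKey("docs", "path/to/my_file.pdf"));
    }
}
```

With a prefix of "docs" and a fetch key of "path/to/my_file.pdf", this sketch yields the object key "docs/path/to/my_file.pdf".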
-
setExtractUserMetadata
@Field public void setExtractUserMetadata(boolean extractUserMetadata)
Whether or not to extract user metadata from the S3Object.
- Parameters:
extractUserMetadata - whether to extract user metadata
-
setMaxConnections
@Field public void setMaxConnections(int maxConnections)
-
setMaxLength
@Field public void setMaxLength(long maxLength)
-
setSleepBeforeRetryMillis
@Deprecated @Field public void setSleepBeforeRetryMillis(long sleepBeforeRetryMillis)
Deprecated.
- Parameters:
sleepBeforeRetryMillis - amount of time in millis to sleep if there was a failure
-
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
This initializes the s3 client. Note that S3's RuntimeExceptions, e.g. AmazonClientException, are wrapped in a TikaConfigException.
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
setEndpointConfigurationService
@Field public void setEndpointConfigurationService(String endpointConfigurationService)
-
setPathStyleAccessEnabled
@Field public void setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
-
-