Package org.apache.tika.pipes.fetcher.s3
Class S3Fetcher
java.lang.Object
org.apache.tika.pipes.fetcher.AbstractFetcher
org.apache.tika.pipes.fetcher.s3.S3Fetcher
- All Implemented Interfaces:
Initializable, Fetcher, RangeFetcher
Fetches files from S3. Example file: s3://my_bucket/path/to/my_file.pdf
The bucket ("my_bucket" in this example) must be specified via the tika-config or set before
initialization; the fetch key is then "path/to/my_file.pdf".
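To make the bucket/fetch-key split concrete, here is a minimal sketch that separates the example URI into its two parts. The class `S3Location` and its `parse` method are hypothetical helpers written for illustration; they are not part of Tika, and S3Fetcher itself expects the bucket to be configured separately rather than parsed from a URI.

```java
// Hypothetical helper for illustration only; not part of Tika.
final class S3Location {
    final String bucket;
    final String key;

    private S3Location(String bucket, String key) {
        this.bucket = bucket;
        this.key = key;
    }

    // Splits "s3://bucket/key..." into the bucket name and the fetch key.
    static S3Location parse(String uri) {
        String scheme = "s3://";
        if (!uri.startsWith(scheme)) {
            throw new IllegalArgumentException("Not an s3:// URI: " + uri);
        }
        String rest = uri.substring(scheme.length());
        int slash = rest.indexOf('/');
        // Require a non-empty bucket and a non-empty key.
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Expected s3://bucket/key: " + uri);
        }
        return new S3Location(rest.substring(0, slash), rest.substring(slash + 1));
    }

    public static void main(String[] args) {
        S3Location loc = parse("s3://my_bucket/path/to/my_file.pdf");
        System.out.println("bucket=" + loc.bucket + ", fetchKey=" + loc.key);
    }
}
```

In this sketch the bucket part is what would go into the fetcher configuration, and the key part is what would be passed to fetch.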
-
Constructor Summary
-
Method Summary
Modifier and Type    Method    Description
void    checkInitialization(InitializableProblemHandler problemHandler)
InputStream    fetch(String fetchKey, Metadata metadata)
InputStream    fetch(String fetchKey, long startRange, long endRange, Metadata metadata)
long[]    getThrottleSeconds()
void    initialize(Map<String, Param> params)    This initializes the S3 client.
void    setAccessKey(String accessKey)
void    setBucket(String bucket)
void    setCredentialsProvider(String credentialsProvider)
void    setEndpointConfigurationService(String endpointConfigurationService)
void    setExtractUserMetadata(boolean extractUserMetadata)    Whether or not to extract user metadata from the S3Object.
void    setMaxConnections(int maxConnections)
void    setMaxLength(long maxLength)
void    setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
void    setPrefix(String prefix)    Prefix to prepend to the fetch key before fetching.
void    setProfile(String profile)
void    setRegion(String region)
void    setRetries(int retries)
void    setSecretKey(String secretKey)
void    setSleepBeforeRetryMillis(long sleepBeforeRetryMillis)    Deprecated.
void    setSpoolToTemp(boolean spoolToTemp)
void    setThrottleSeconds(long[] throttleSeconds)
void    setThrottleSeconds(String commaDelimitedLongs)    Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600
Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
getName, setName
-
Constructor Details
-
S3Fetcher
public S3Fetcher()
-
-
Method Details
-
fetch
- Specified by:
fetch in interface Fetcher
- Throws:
TikaException
IOException
-
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata) throws TikaException, IOException
- Specified by:
fetch in interface RangeFetcher
- Throws:
TikaException
IOException
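The page does not say whether startRange and endRange are inclusive. The sketch below assumes the HTTP byte-range convention, in which both ends are inclusive; that is an assumption, not something this documentation confirms. `RangeHeader` is a hypothetical helper, not part of Tika.

```java
// Hypothetical helper for illustration only; not part of Tika.
final class RangeHeader {

    // Assumption: both ends are inclusive, as in an HTTP "Range: bytes=..." header.
    static String of(long startRange, long endRange) {
        if (startRange < 0 || endRange < startRange) {
            throw new IllegalArgumentException(
                    "Invalid range: " + startRange + "-" + endRange);
        }
        return "bytes=" + startRange + "-" + endRange;
    }

    // Number of bytes covered by an inclusive range.
    static long length(long startRange, long endRange) {
        return endRange - startRange + 1;
    }

    public static void main(String[] args) {
        System.out.println(of(0, 99) + " covers " + length(0, 99) + " bytes");
    }
}
```

Under that assumption, requesting the range 0 to 99 would return the first 100 bytes of the object.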
-
setSpoolToTemp
-
setRegion
-
setProfile
-
setBucket
-
setThrottleSeconds
Set the number of seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600
- Parameters:
commaDelimitedLongs
- Throws:
TikaConfigException
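As an illustration of the comma-delimited format, the following sketch parses a string such as "30,60,120,600" into a long[]. It is not Tika's implementation: `ThrottleSecondsParser` is a hypothetical class, and it throws IllegalArgumentException as a stand-in where the real method declares TikaConfigException.

```java
// Hypothetical helper for illustration only; not Tika's implementation.
final class ThrottleSecondsParser {

    // Parses "30,60,120,600" into {30, 60, 120, 600}.
    // IllegalArgumentException stands in for TikaConfigException here.
    static long[] parse(String commaDelimitedLongs) {
        String[] parts = commaDelimitedLongs.split(",");
        long[] seconds = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            try {
                seconds[i] = Long.parseLong(parts[i].trim());
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException(
                        "Not a long: '" + parts[i] + "'", e);
            }
        }
        return seconds;
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(parse("30,60,120,600")));
    }
}
```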
-
setThrottleSeconds
public void setThrottleSeconds(long[] throttleSeconds)
-
getThrottleSeconds
public long[] getThrottleSeconds()
-
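One plausible reading of a throttle schedule is "wait throttleSeconds[i] before retry number i+1, and give up when the list is exhausted". The sketch below implements that reading; how S3Fetcher actually applies the values is not stated on this page, so treat the semantics as an assumption. `ThrottledRetry` is hypothetical, and the sleep is injected so the example can run without real delays.

```java
// Hypothetical helper for illustration only; not Tika's implementation.
final class ThrottledRetry {

    // Tries the action once, then once more after each throttle interval.
    // Assumption: one retry per entry in throttleSeconds.
    static <T> T call(java.util.concurrent.Callable<T> action,
                      long[] throttleSeconds,
                      java.util.function.LongConsumer sleeper) throws Exception {
        Exception last = null;
        for (int attempt = 0; attempt <= throttleSeconds.length; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e;
                // Sleep only if another attempt remains.
                if (attempt < throttleSeconds.length) {
                    sleeper.accept(throttleSeconds[attempt]);
                }
            }
        }
        // Every attempt failed; rethrow the most recent failure.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new java.io.IOException("transient failure");
            }
            return "ok";
        }, new long[]{30, 60, 120, 600},
           s -> System.out.println("would sleep " + s + " seconds"));
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

A production caller would pass a sleeper that actually blocks, for example one that calls Thread.sleep with the interval converted to milliseconds.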
setPrefix
Prefix to prepend to the fetch key before fetching. A '/' is automatically added at the end of the prefix.
- Parameters:
prefix
-
-
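A minimal sketch of how a prefix and a fetch key might be combined, with a '/' added after the prefix. `PrefixedKey` is a hypothetical helper, and skipping the extra slash when the prefix already ends with one is an assumption made for this example rather than documented behavior.

```java
// Hypothetical helper for illustration only; not Tika's implementation.
final class PrefixedKey {

    // Joins prefix and fetchKey with a single '/' between them.
    static String of(String prefix, String fetchKey) {
        if (prefix == null || prefix.isEmpty()) {
            return fetchKey;
        }
        // Assumption: do not double the slash if the prefix already ends with one.
        String normalized = prefix.endsWith("/") ? prefix : prefix + "/";
        return normalized + fetchKey;
    }

    public static void main(String[] args) {
        System.out.println(of("archive/2024", "path/to/my_file.pdf"));
    }
}
```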
setExtractUserMetadata
Whether or not to extract user metadata from the S3Object.
- Parameters:
extractUserMetadata
-
-
setMaxConnections
-
setRetries
-
setCredentialsProvider
-
setMaxLength
-
setSleepBeforeRetryMillis
Deprecated.
- Parameters:
sleepBeforeRetryMillis - amount of time in milliseconds to sleep if there was a failure
-
setAccessKey
-
setSecretKey
-
initialize
Initializes the S3 client. Note that S3's RuntimeExceptions, e.g. AmazonClientException, are wrapped in a TikaConfigException.
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
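The wrapping described above follows a common pattern: catch the unchecked exception at the initialization boundary and rethrow it as a checked configuration exception with the original as the cause. The sketch below shows the pattern only. `ClientInit` and its nested `ConfigException` are hypothetical stand-ins (for the initialization logic and for TikaConfigException respectively), and the message text is invented for the example.

```java
// Hypothetical sketch of the wrapping pattern; not Tika's implementation.
final class ClientInit {

    // Stand-in for TikaConfigException (a checked exception).
    static class ConfigException extends Exception {
        ConfigException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Builds a client, converting any RuntimeException into ConfigException.
    static <T> T build(java.util.function.Supplier<T> clientFactory)
            throws ConfigException {
        try {
            return clientFactory.get();
        } catch (RuntimeException e) {
            // Keep the original exception as the cause for diagnostics.
            throw new ConfigException("Could not initialize client", e);
        }
    }

    public static void main(String[] args) {
        try {
            build(() -> {
                throw new IllegalStateException("bad region");
            });
        } catch (ConfigException e) {
            System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
        }
    }
}
```

Callers then only need to handle one checked exception type during initialization, while the original failure stays available through getCause().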
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
setEndpointConfigurationService
-
setPathStyleAccessEnabled
-
setThrottleSeconds(String)