Package org.apache.tika.pipes.fetcher.s3
Class S3Fetcher
java.lang.Object
  org.apache.tika.pipes.fetcher.AbstractFetcher
    org.apache.tika.pipes.fetcher.s3.S3Fetcher

All Implemented Interfaces: Initializable, Fetcher, RangeFetcher
public class S3Fetcher extends AbstractFetcher implements Initializable, RangeFetcher
Fetches files from S3. Example file: s3://my_bucket/path/to/my_file.pdf. The bucket (here, my_bucket) must be specified via the tika-config or set before initialization, and the fetch key is "path/to/my_file.pdf".
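To make the bucket/fetch-key split above concrete, here is a minimal sketch that separates an s3:// location into the two parts. The class S3Location and its parse method are hypothetical helpers for illustration only; they are not part of Tika, and S3Fetcher itself expects the bucket to be configured separately and receives only the fetch key.

```java
/** Hypothetical helper (not part of Tika): splits an s3:// location into bucket and fetch key. */
final class S3Location {
    final String bucket;
    final String fetchKey;

    private S3Location(String bucket, String fetchKey) {
        this.bucket = bucket;
        this.fetchKey = fetchKey;
    }

    /** Parses "s3://bucket/key"; the key may itself contain slashes. */
    static S3Location parse(String uri) {
        String scheme = "s3://";
        if (uri == null || !uri.startsWith(scheme)) {
            throw new IllegalArgumentException("Expected an s3:// location: " + uri);
        }
        String rest = uri.substring(scheme.length());
        int slash = rest.indexOf('/');
        // Require a non-empty bucket and a non-empty key.
        if (slash <= 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Expected s3://bucket/key: " + uri);
        }
        return new S3Location(rest.substring(0, slash), rest.substring(slash + 1));
    }

    public static void main(String[] args) {
        S3Location loc = parse("s3://my_bucket/path/to/my_file.pdf");
        System.out.println(loc.bucket);   // my_bucket
        System.out.println(loc.fetchKey); // path/to/my_file.pdf
    }
}
```

Plain string handling is used here rather than java.net.URI because bucket names containing underscores are not valid URI host names.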
-
-
Constructor Summary
Constructors
S3Fetcher()
S3Fetcher(S3FetcherConfig s3FetcherConfig)
-
Method Summary
Modifier and Type, Method, and Description
void checkInitialization(InitializableProblemHandler problemHandler)
InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata, ParseContext parseContext)
InputStream fetch(String fetchKey, Metadata metadata, ParseContext parseContext)
long[] getThrottleSeconds()
void initialize(Map<String,Param> params)
    This initializes the s3 client.
void setAccessKey(String accessKey)
void setBucket(String bucket)
void setCredentialsProvider(String credentialsProvider)
void setEndpointConfigurationService(String endpointConfigurationService)
void setExtractUserMetadata(boolean extractUserMetadata)
    Whether or not to extract user metadata from the S3Object.
void setMaxConnections(int maxConnections)
void setMaxLength(long maxLength)
void setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
void setPrefix(String prefix)
    Prefix to prepend to the fetch key before fetching.
void setProfile(String profile)
void setRegion(String region)
void setSecretKey(String secretKey)
void setSleepBeforeRetryMillis(long sleepBeforeRetryMillis)
    Deprecated.
void setSpoolToTemp(boolean spoolToTemp)
void setThrottleSeconds(long[] throttleSeconds)
void setThrottleSeconds(String commaDelimitedLongs)
    Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600
Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
getName, setName
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.tika.pipes.fetcher.RangeFetcher
fetch
-
-
-
-
Constructor Detail
-
S3Fetcher
public S3Fetcher()
-
S3Fetcher
public S3Fetcher(S3FetcherConfig s3FetcherConfig)
-
-
Method Detail
-
fetch
public InputStream fetch(String fetchKey, Metadata metadata, ParseContext parseContext) throws TikaException, IOException
Specified by: fetch in interface Fetcher
Throws: TikaException, IOException
-
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata, ParseContext parseContext) throws TikaException, IOException
Specified by: fetch in interface RangeFetcher
Throws: TikaException, IOException
-
setSpoolToTemp
@Field public void setSpoolToTemp(boolean spoolToTemp)
-
setThrottleSeconds
@Field public void setThrottleSeconds(String commaDelimitedLongs) throws TikaConfigException
Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600
Parameters: commaDelimitedLongs
Throws: TikaConfigException
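As an illustration of the comma-delimited format described above, the following sketch converts a string such as 30,60,120,600 into a long[]. The class ThrottleParser is hypothetical and is not Tika's implementation; in particular, trimming whitespace around entries and reporting bad input with IllegalArgumentException (where the real setter declares TikaConfigException) are assumptions.

```java
import java.util.Arrays;

/** Hypothetical helper (not part of Tika): parses a comma-delimited list of seconds. */
final class ThrottleParser {

    static long[] parse(String commaDelimitedLongs) {
        String[] parts = commaDelimitedLongs.split(",");
        long[] seconds = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            try {
                // Assumption: whitespace around each entry is tolerated.
                seconds[i] = Long.parseLong(parts[i].trim());
            } catch (NumberFormatException e) {
                // The real setter declares TikaConfigException; this sketch uses an unchecked stand-in.
                throw new IllegalArgumentException("Not a long: '" + parts[i] + "'", e);
            }
        }
        return seconds;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("30,60,120,600"))); // [30, 60, 120, 600]
    }
}
```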
-
setThrottleSeconds
public void setThrottleSeconds(long[] throttleSeconds)
-
getThrottleSeconds
public long[] getThrottleSeconds()
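The page does not spell out how the throttle values are applied. One plausible reading, sketched below, is that each entry is the number of seconds to wait before the next retry, so an array of length n allows up to n retries after the first attempt. The class ThrottledRetrySketch, its call method, and the injectable sleeper are all hypothetical; this is not Tika's retry logic.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongConsumer;
import java.util.function.Supplier;

/** Hypothetical sketch (not Tika's code): retry an action, waiting throttleSeconds[i] after failure i. */
final class ThrottledRetrySketch {

    static <T> T call(Supplier<T> action, long[] throttleSeconds, LongConsumer sleeper) {
        RuntimeException last = null;
        // One initial attempt plus one retry per throttle entry.
        for (int attempt = 0; attempt <= throttleSeconds.length; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < throttleSeconds.length) {
                    // The sleeper is injected so callers decide how to wait (and tests need not sleep).
                    sleeper.accept(throttleSeconds[attempt]);
                }
            }
        }
        throw last; // every attempt failed
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<Long> waits = new ArrayList<>();
        String result = call(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("transient failure");
            }
            return "ok";
        }, new long[]{30, 60, 120, 600}, seconds -> waits.add(seconds));
        System.out.println(result + " after waits " + waits); // ok after waits [30, 60]
    }
}
```

In real use the sleeper would convert seconds to a Thread.sleep call; recording the waits instead keeps the example fast to run.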
-
setPrefix
@Field public void setPrefix(String prefix)
Prefix to prepend to the fetch key before fetching. A '/' is automatically added at the end of the prefix.
Parameters: prefix
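The effect of the prefix on the key that is actually requested can be sketched as follows. The class PrefixedKey is a hypothetical helper, not Tika code, and it assumes that a prefix already ending in '/' is left unchanged rather than receiving a second slash.

```java
/** Hypothetical helper (not part of Tika): shows how a prefix and a fetch key combine. */
final class PrefixedKey {

    /** Ensures the prefix ends with '/'. Assumption: an existing trailing slash is not doubled. */
    static String normalizePrefix(String prefix) {
        return prefix.endsWith("/") ? prefix : prefix + "/";
    }

    /** Returns the full key: normalized prefix followed by the fetch key. */
    static String resolve(String prefix, String fetchKey) {
        return normalizePrefix(prefix) + fetchKey;
    }

    public static void main(String[] args) {
        System.out.println(resolve("docs/2023", "path/to/my_file.pdf"));
        // docs/2023/path/to/my_file.pdf
    }
}
```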
-
-
setExtractUserMetadata
@Field public void setExtractUserMetadata(boolean extractUserMetadata)
Whether or not to extract user metadata from the S3Object.
Parameters: extractUserMetadata
-
setMaxConnections
@Field public void setMaxConnections(int maxConnections)
-
setMaxLength
@Field public void setMaxLength(long maxLength)
-
setSleepBeforeRetryMillis
@Field public void setSleepBeforeRetryMillis(long sleepBeforeRetryMillis)
Deprecated.
Parameters: sleepBeforeRetryMillis - amount of time in millis to sleep if there was a failure
-
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
Initializes the S3 client. Note that S3's RuntimeExceptions, e.g. AmazonClientException, are wrapped in a TikaConfigException.
Specified by: initialize in interface Initializable
Parameters: params - params to use for initialization
Throws: TikaConfigException
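The wrapping behaviour noted above follows a common pattern: catch the unchecked exception thrown while building the client and rethrow it as a checked configuration exception that keeps the original as its cause. The sketch below shows the pattern in isolation. ConfigException is a hypothetical stand-in for TikaConfigException, and the Supplier stands in for whatever code builds the S3 client; neither reflects Tika's actual source.

```java
import java.util.function.Supplier;

/** Hypothetical sketch (not Tika's code): wrap runtime failures from client construction. */
final class ClientInitSketch {

    /** Stand-in for TikaConfigException: a checked exception carrying the original cause. */
    static class ConfigException extends Exception {
        ConfigException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Builds a client, converting any RuntimeException into a checked ConfigException. */
    static <C> C initialize(Supplier<C> clientFactory) throws ConfigException {
        try {
            return clientFactory.get();
        } catch (RuntimeException e) {
            throw new ConfigException("can't initialize s3 client", e);
        }
    }

    public static void main(String[] args) {
        try {
            initialize(() -> {
                throw new IllegalStateException("bad region");
            });
        } catch (ConfigException e) {
            System.out.println(e.getMessage() + ": " + e.getCause().getMessage());
            // can't initialize s3 client: bad region
        }
    }
}
```

Callers then only need to handle one checked exception type at configuration time, while the original failure remains available through getCause().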
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
Specified by: checkInitialization in interface Initializable
Parameters: problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
Throws: TikaConfigException
-
setEndpointConfigurationService
@Field public void setEndpointConfigurationService(String endpointConfigurationService)
-
setPathStyleAccessEnabled
@Field public void setPathStyleAccessEnabled(boolean pathStyleAccessEnabled)
-
-