Class HttpFetcher
java.lang.Object
org.apache.tika.pipes.fetcher.AbstractFetcher
org.apache.tika.pipes.fetcher.http.HttpFetcher
All Implemented Interfaces: Initializable, Fetcher, RangeFetcher
Based on Apache httpclient
-
Method Summary
Modifier and Type    Method    Description
void    checkInitialization(InitializableProblemHandler problemHandler)
void    initialize(Map<String, Param> params)
void    setAuthScheme(String authScheme)
void    setConnectTimeout(int connectTimeout)
void    setHttpHeaders(List<String> headers)    Which HTTP headers should we capture in the metadata.
void    setMaxConnections(int maxConnections)
void    setMaxConnectionsPerRoute(int maxConnectionsPerRoute)
void    setMaxErrMsgSize(int maxErrMsgSize)
void    setMaxRedirects(int maxRedirects)
void    setMaxSpoolSize(long maxSpoolSize)    Set the maximum number of bytes to spool to a temp file.
void    setNtDomain(String domain)
void    setOverallTimeout(long overallTimeout)    This sets an overall timeout on the request.
void    setPassword(String password)
void    setProxyHost(String proxyHost)
void    setProxyPort(int proxyPort)
void    setRequestTimeout(int requestTimeout)
void    setSocketTimeout(int socketTimeout)
void    setUserAgent(String userAgent)    When making the request, what User-Agent is sent in the request.
void    setUserName(String userName)
Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
getName, setName
-
Field Details
-
HTTP_HEADER_PREFIX
-
HTTP_FETCH_PREFIX
-
HTTP_STATUS_CODE
HTTP status code
-
HTTP_NUM_REDIRECTS
Number of redirects
-
HTTP_TARGET_URL
If there were redirects, this captures the final URL visited
-
HTTP_TARGET_IP_ADDRESS
-
HTTP_FETCH_TRUNCATED
-
HTTP_CONTENT_ENCODING
-
HTTP_CONTENT_TYPE
-
-
Constructor Details
-
HttpFetcher
public HttpFetcher()
-
-
Method Details
-
fetch
Specified by: fetch in interface Fetcher
Throws: IOException, TikaException
-
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata) throws IOException
Specified by: fetch in interface RangeFetcher
Throws: IOException
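The range variant of fetch takes a startRange and an endRange. HTTP byte ranges are normally requested with a Range header of the form bytes=start-end, with both offsets inclusive. The following is a minimal sketch of that formatting. It assumes the two arguments are inclusive byte offsets and that they are sent as a standard Range header; this page does not confirm either point. RangeHeaderSketch and rangeHeaderValue are hypothetical names, not part of Tika.

```java
// Hypothetical helper, not part of Tika: formats an inclusive byte range
// as the value of a standard HTTP Range header.
final class RangeHeaderSketch {

    // Assumption: startRange and endRange are inclusive byte offsets.
    static String rangeHeaderValue(long startRange, long endRange) {
        if (startRange < 0 || endRange < startRange) {
            throw new IllegalArgumentException(
                    "invalid range: " + startRange + "-" + endRange);
        }
        return "bytes=" + startRange + "-" + endRange;
    }

    public static void main(String[] args) {
        // Request the first 1024 bytes of a resource.
        System.out.println(rangeHeaderValue(0, 1023));
    }
}
```

A server that honors the header typically answers with status 206 and only the requested bytes; a server that ignores it returns the full body with status 200.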
-
setUserName
-
setPassword
-
setNtDomain
-
setAuthScheme
-
setProxyHost
-
setProxyPort
-
setConnectTimeout
-
setRequestTimeout
-
setSocketTimeout
-
setMaxConnections
-
setMaxConnectionsPerRoute
-
setMaxSpoolSize
Set the maximum number of bytes to spool to a temp file. If this value is -1, the full stream will be spooled to a temp file. Default size is -1.
Parameters: maxSpoolSize
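The semantics of a spool limit can be illustrated with a small, self-contained sketch: copy at most a given number of bytes to a temp file, treat -1 as "no limit", and report whether input was left over. This is an illustration of the idea only, not Tika's implementation; SpoolLimitSketch and spool are hypothetical names, and the boolean "truncated" return value is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a spool limit; not Tika's implementation.
final class SpoolLimitSketch {

    // Copies at most maxSpoolSize bytes from in to target; -1 means no limit.
    // Returns true if the input held more bytes than were written.
    static boolean spool(InputStream in, Path target, long maxSpoolSize)
            throws IOException {
        long written = 0;
        byte[] buffer = new byte[8192];
        try (OutputStream out = Files.newOutputStream(target)) {
            while (true) {
                int toRead = buffer.length;
                if (maxSpoolSize >= 0) {
                    long remaining = maxSpoolSize - written;
                    if (remaining <= 0) {
                        // Limit reached: truncated only if more input remains.
                        return in.read() != -1;
                    }
                    toRead = (int) Math.min(buffer.length, remaining);
                }
                int n = in.read(buffer, 0, toRead);
                if (n == -1) {
                    return false; // whole stream was spooled
                }
                out.write(buffer, 0, n);
                written += n;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("spool-sketch", ".bin");
        try {
            boolean truncated =
                    spool(new ByteArrayInputStream(new byte[100]), tmp, 10);
            System.out.println("truncated=" + truncated
                    + ", bytes on disk=" + Files.size(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```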
-
setMaxRedirects
-
setHttpHeaders
Which HTTP headers should we capture in the metadata. Keys will be prepended with HTTP_HEADER_PREFIX.
Parameters: headers
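The key-prefixing behaviour described here can be sketched with plain maps. The example below copies only the requested response headers into a metadata map and prepends a prefix to each key. The value of HTTP_HEADER_PREFIX is not given on this page, so the sketch uses an obvious placeholder; HeaderCaptureSketch and capture are hypothetical names, and the case-insensitive name matching is an assumption based on HTTP header names being case-insensitive.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of prefixed header capture; not Tika code.
final class HeaderCaptureSketch {

    // Placeholder only: the real value of HTTP_HEADER_PREFIX is not shown here.
    static final String HEADER_PREFIX = "example-prefix:";

    // Copies the requested headers into a new map, keyed by prefix + requested name.
    static Map<String, String> capture(Map<String, String> responseHeaders,
                                       List<String> wanted) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String name : wanted) {
            for (Map.Entry<String, String> e : responseHeaders.entrySet()) {
                // Assumption: match header names case-insensitively.
                if (e.getKey().equalsIgnoreCase(name)) {
                    metadata.put(HEADER_PREFIX + name, e.getValue());
                }
            }
        }
        return metadata;
    }

    public static void main(String[] args) {
        Map<String, String> response = new LinkedHashMap<>();
        response.put("Content-Type", "text/html");
        response.put("Server", "example");
        response.put("ETag", "\"abc\"");

        // Only the two requested headers end up in the metadata map.
        System.out.println(capture(response, Arrays.asList("Content-Type", "ETag")));
    }
}
```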
-
setOverallTimeout
This sets an overall timeout on the request. If a server is very slow or the file is very large, the other timeouts might not be triggered.
Parameters: overallTimeout
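A socket read timeout generally measures inactivity between reads, so a server that trickles data steadily can keep a request alive far longer than intended. An overall deadline bounds the total time instead. The sketch below shows one general way to impose such a deadline with the JDK's executor API; it is not the mechanism HttpFetcher uses, and OverallTimeoutSketch, callWithDeadline, and the choice of milliseconds as the unit are all assumptions made for the example.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of an overall deadline; not Tika's implementation.
final class OverallTimeoutSketch {

    // Runs the task on a worker thread and gives up after overallTimeoutMillis.
    static <T> T callWithDeadline(Callable<T> task, long overallTimeoutMillis)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Future<T> future = executor.submit(task);
            try {
                return future.get(overallTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                future.cancel(true); // interrupt the worker
                throw e;
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulates a slow transfer: each "chunk" arrives after 20 ms, which a
        // per-read timeout would tolerate, but 50 chunks take about a second.
        Callable<String> slowTransfer = () -> {
            for (int i = 0; i < 50; i++) {
                Thread.sleep(20);
            }
            return "done";
        };
        try {
            callWithDeadline(slowTransfer, 200);
        } catch (TimeoutException e) {
            System.out.println("overall timeout reached");
        }
    }
}
```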
-
setMaxErrMsgSize
-
setUserAgent
Sets the User-Agent header sent with the request. By default, httpclient sends a value such as "Apache-HttpClient/4.5.13 (Java/x.y.z)".
Parameters: userAgent
-
initialize
Specified by: initialize in interface Initializable
Parameters: params - params to use for initialization
Throws: TikaConfigException
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
Specified by: checkInitialization in interface Initializable
Parameters: problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
Throws: TikaConfigException
-