Class HttpFetcher
java.lang.Object
org.apache.tika.pipes.fetcher.AbstractFetcher
org.apache.tika.pipes.fetcher.http.HttpFetcher
- All Implemented Interfaces:
Initializable, Fetcher, RangeFetcher
Based on Apache HttpClient.
-
Method Summary
Modifier and Type, Method, Description
void checkInitialization(InitializableProblemHandler problemHandler)
void initialize(Map<String, Param> params)
void setAuthScheme(String authScheme)
void setConnectTimeout(int connectTimeout)
void setHttpHeaders(List<String> headers)
    Which HTTP headers should we capture in the metadata.
void setMaxConnections(int maxConnections)
void setMaxConnectionsPerRoute(int maxConnectionsPerRoute)
void setMaxErrMsgSize(int maxErrMsgSize)
void setMaxRedirects(int maxRedirects)
void setMaxSpoolSize(long maxSpoolSize)
    Set the maximum number of bytes to spool to a temp file.
void setNtDomain(String domain)
void setOverallTimeout(long overallTimeout)
    This sets an overall timeout on the request.
void setPassword(String password)
void setProxyHost(String proxyHost)
void setProxyPort(int proxyPort)
void setRequestTimeout(int requestTimeout)
void setSocketTimeout(int socketTimeout)
void setUserAgent(String userAgent)
    When making the request, what User-Agent is sent in the request.
void setUserName(String userName)
Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
getName, setName
-
Field Details
-
HTTP_HEADER_PREFIX
-
HTTP_FETCH_PREFIX
-
HTTP_STATUS_CODE
HTTP status code
-
HTTP_NUM_REDIRECTS
Number of redirects
-
HTTP_TARGET_URL
If there were redirects, this captures the final URL visited.
-
HTTP_TARGET_IP_ADDRESS
-
HTTP_FETCH_TRUNCATED
-
HTTP_CONTENT_ENCODING
-
HTTP_CONTENT_TYPE
-
-
Constructor Details
-
HttpFetcher
public HttpFetcher()
-
-
Method Details
-
fetch
- Specified by:
fetch in interface Fetcher
- Throws:
IOException
TikaException
-
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata) throws IOException
- Specified by:
fetch in interface RangeFetcher
- Throws:
IOException
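Range requests over HTTP are conventionally expressed with a Range request header. The sketch below shows how a start/end pair could be turned into such a header value. It assumes both offsets are inclusive, as in the standard "bytes=first-last" syntax; this page does not confirm how HttpFetcher interprets startRange and endRange. The names RangeHeaderSketch and toRangeHeader are hypothetical and are not part of Tika.

```java
final class RangeHeaderSketch {
    private RangeHeaderSketch() {}

    /**
     * Builds an HTTP Range header value of the form "bytes=start-end".
     * Assumption: both offsets are inclusive (not confirmed for HttpFetcher).
     */
    static String toRangeHeader(long startRange, long endRange) {
        if (startRange < 0 || endRange < startRange) {
            throw new IllegalArgumentException(
                    "invalid range: " + startRange + "-" + endRange);
        }
        return "bytes=" + startRange + "-" + endRange;
    }

    public static void main(String[] args) {
        // Request the first 1024 bytes of a resource.
        System.out.println(toRangeHeader(0, 1023)); // prints "bytes=0-1023"
    }
}
```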
-
setUserName
-
setPassword
-
setNtDomain
-
setAuthScheme
-
setProxyHost
-
setProxyPort
-
setConnectTimeout
-
setRequestTimeout
-
setSocketTimeout
-
setMaxConnections
-
setMaxConnectionsPerRoute
-
setMaxSpoolSize
Set the maximum number of bytes to spool to a temp file. If this value is -1, the full stream will be spooled to a temp file. The default is -1.
- Parameters:
maxSpoolSize -
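To illustrate the -1 convention described above, here is a minimal sketch of spooling a stream to a temp file with an optional byte limit, using only the JDK. It is not HttpFetcher's implementation: the class and method names (SpoolSketch, spool) are hypothetical, and the choice to simply stop copying once the limit is reached is an assumption, since this page does not say what happens to bytes beyond the limit.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SpoolSketch {
    private SpoolSketch() {}

    /**
     * Copies at most maxSpoolSize bytes from the stream into a new temp file.
     * A negative maxSpoolSize (for example -1) means "no limit".
     * Assumption: bytes past the limit are simply not copied.
     */
    static Path spool(InputStream in, long maxSpoolSize) throws IOException {
        Path tmp = Files.createTempFile("spool-sketch", ".bin");
        byte[] buffer = new byte[8192];
        long written = 0;
        try (OutputStream out = Files.newOutputStream(tmp)) {
            while (maxSpoolSize < 0 || written < maxSpoolSize) {
                int want = buffer.length;
                if (maxSpoolSize >= 0) {
                    // Never read more than the remaining allowance.
                    want = (int) Math.min(want, maxSpoolSize - written);
                }
                int read = in.read(buffer, 0, want);
                if (read == -1) {
                    break; // end of stream
                }
                out.write(buffer, 0, read);
                written += read;
            }
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100];
        Path limited = spool(new ByteArrayInputStream(data), 10);
        Path full = spool(new ByteArrayInputStream(data), -1);
        System.out.println(Files.size(limited) + " " + Files.size(full)); // prints "10 100"
        Files.delete(limited);
        Files.delete(full);
    }
}
```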
-
setMaxRedirects
-
setHttpHeaders
Which HTTP headers to capture in the metadata. Each key is prefixed with HTTP_HEADER_PREFIX.
- Parameters:
headers -
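The key-prefixing behaviour described above can be pictured with a small JDK-only sketch. It is an illustration rather than HttpFetcher's code: the value of HTTP_HEADER_PREFIX is not given on this page, so PREFIX_PLACEHOLDER below is a made-up stand-in, and HeaderCaptureSketch and capture are hypothetical names. Matching header names case-insensitively and keying the result by the configured name are also assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class HeaderCaptureSketch {
    // Placeholder only: the real value of HTTP_HEADER_PREFIX is not shown on this page.
    static final String PREFIX_PLACEHOLDER = "example-prefix:";

    private HeaderCaptureSketch() {}

    /**
     * Returns the subset of response headers named in 'wanted',
     * with each key prefixed by PREFIX_PLACEHOLDER.
     * Header names are compared case-insensitively.
     */
    static Map<String, String> capture(Map<String, String> responseHeaders,
                                       List<String> wanted) {
        Map<String, String> metadata = new LinkedHashMap<>();
        for (String name : wanted) {
            for (Map.Entry<String, String> header : responseHeaders.entrySet()) {
                if (header.getKey().equalsIgnoreCase(name)) {
                    metadata.put(PREFIX_PLACEHOLDER + name, header.getValue());
                }
            }
        }
        return metadata;
    }
}
```

Headers that are not listed in the configured names are left out of the returned map entirely.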
-
setOverallTimeout
This sets an overall timeout on the request. If a server is very slow or the file is very large, the other timeouts might never be triggered.
- Parameters:
overallTimeout -
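The reason an overall timeout helps is that per-operation timeouts restart with every successful read, so a server that trickles data out slowly never trips them. The sketch below shows the general idea of a single deadline for a whole task, using only java.util.concurrent. It does not show how HttpFetcher implements setOverallTimeout. The names OverallTimeoutSketch and runWithDeadline are hypothetical, and the unit (milliseconds) is an assumption of this sketch.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class OverallTimeoutSketch {
    private OverallTimeoutSketch() {}

    /**
     * Runs the task on a worker thread and returns its result, or null
     * if the task has not finished within overallTimeoutMillis.
     */
    static <T> T runWithDeadline(Callable<T> task, long overallTimeoutMillis)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(overallTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            return null;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Simulated "slow drip": each step is short, but the total is long.
        String result = runWithDeadline(() -> {
            for (int i = 0; i < 100; i++) {
                Thread.sleep(50);
            }
            return "done";
        }, 200);
        System.out.println(result); // prints "null"
    }
}
```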
-
setMaxErrMsgSize
-
setUserAgent
The User-Agent value to send with the request. By default, httpclient sends a value such as "Apache-HttpClient/4.5.13 (Java/x.y.z)".
- Parameters:
userAgent -
-
initialize
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom InitializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-