Class HttpFetcher
- java.lang.Object
  - org.apache.tika.pipes.fetcher.AbstractFetcher
    - org.apache.tika.pipes.fetcher.http.HttpFetcher

All Implemented Interfaces:
- Initializable, Fetcher, RangeFetcher

public class HttpFetcher extends AbstractFetcher implements Initializable, RangeFetcher

Based on Apache httpclient.

Field Summary
- static Property HTTP_CONTENT_ENCODING
- static Property HTTP_CONTENT_TYPE
- static String HTTP_FETCH_PREFIX
- static Property HTTP_FETCH_TRUNCATED
- static String HTTP_HEADER_PREFIX
- static Property HTTP_NUM_REDIRECTS
  Number of redirects
- static Property HTTP_STATUS_CODE
  http status code
- static Property HTTP_TARGET_IP_ADDRESS
- static Property HTTP_TARGET_URL
  If there were redirects, this captures the final URL visited

Constructor Summary
- HttpFetcher()
- HttpFetcher(HttpFetcherConfig httpFetcherConfig)

Method Summary
- void checkInitialization(InitializableProblemHandler problemHandler)
- InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata, ParseContext parseContext)
- InputStream fetch(String fetchKey, Metadata metadata, ParseContext parseContext)
- org.apache.http.client.HttpClient getHttpClient()
- HttpFetcherConfig getHttpFetcherConfig()
- void initialize(Map<String,Param> params)
- static Map<String,List<String>> parseHeaders(String headersString)
- void setAuthScheme(String authScheme)
- void setConnectTimeout(int connectTimeout)
- void setHttpClient(org.apache.http.client.HttpClient httpClient)
- void setHttpClientFactory(HttpClientFactory httpClientFactory)
- void setHttpFetcherConfig(HttpFetcherConfig httpFetcherConfig)
- void setHttpHeaders(List<String> headers)
  Which http headers should we capture in the metadata.
- void setHttpRequestHeaders(List<String> headers)
  Which http request headers should we send in the http fetch requests.
- void setJwtExpiresInSeconds(int jwtExpiresInSeconds)
- void setJwtIssuer(String jwtIssuer)
- void setJwtPrivateKeyBase64(String jwtPrivateKeyBase64)
- void setJwtSecret(String jwtSecret)
- void setJwtSubject(String jwtSubject)
- void setMaxConnections(int maxConnections)
- void setMaxConnectionsPerRoute(int maxConnectionsPerRoute)
- void setMaxErrMsgSize(int maxErrMsgSize)
- void setMaxRedirects(int maxRedirects)
- void setMaxSpoolSize(long maxSpoolSize)
  Set the maximum number of bytes to spool to a temp file.
- void setNtDomain(String domain)
- void setOverallTimeout(long overallTimeout)
  This sets an overall timeout on the request.
- void setPassword(String password)
- void setProxyHost(String proxyHost)
- void setProxyPort(int proxyPort)
- void setRequestTimeout(int requestTimeout)
- void setSocketTimeout(int socketTimeout)
- void setUserAgent(String userAgent)
  When making the request, what User-Agent is sent in the request.
- void setUserName(String userName)

Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
- getName, setName

Methods inherited from class java.lang.Object
- clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.tika.pipes.fetcher.RangeFetcher
- fetch
 
Field Detail

HTTP_HEADER_PREFIX
public static String HTTP_HEADER_PREFIX

HTTP_FETCH_PREFIX
public static String HTTP_FETCH_PREFIX

HTTP_STATUS_CODE
public static Property HTTP_STATUS_CODE
HTTP status code

HTTP_NUM_REDIRECTS
public static Property HTTP_NUM_REDIRECTS
Number of redirects

HTTP_TARGET_URL
public static Property HTTP_TARGET_URL
If there were redirects, this captures the final URL visited

HTTP_TARGET_IP_ADDRESS
public static Property HTTP_TARGET_IP_ADDRESS

HTTP_FETCH_TRUNCATED
public static Property HTTP_FETCH_TRUNCATED

HTTP_CONTENT_ENCODING
public static Property HTTP_CONTENT_ENCODING

HTTP_CONTENT_TYPE
public static Property HTTP_CONTENT_TYPE
 
Constructor Detail

HttpFetcher
public HttpFetcher()

HttpFetcher
public HttpFetcher(HttpFetcherConfig httpFetcherConfig)
 
Method Detail

fetch
public InputStream fetch(String fetchKey, Metadata metadata, ParseContext parseContext) throws IOException, TikaException
- Specified by:
  - fetch in interface Fetcher
- Throws:
  - IOException
  - TikaException
 
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata, ParseContext parseContext) throws IOException, TikaException
- Specified by:
  - fetch in interface RangeFetcher
- Throws:
  - IOException
  - TikaException
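
The range variant of fetch takes a startRange and an endRange. The following is a minimal sketch of how such a pair could map onto a standard HTTP Range header value, assuming both offsets are inclusive byte positions as in the HTTP range-request semantics. The class RangeHeaderSketch and its methods are hypothetical helpers for illustration, not part of Tika, and whether HttpFetcher builds its header exactly this way is an assumption.

```java
/** Hypothetical helper, not part of Tika: formats an inclusive byte range for an HTTP Range header. */
final class RangeHeaderSketch {

    private RangeHeaderSketch() {
    }

    /** Returns a header value such as "bytes=0-99" for the inclusive range [startRange, endRange]. */
    static String rangeHeaderValue(long startRange, long endRange) {
        if (startRange < 0 || endRange < startRange) {
            throw new IllegalArgumentException(
                    "invalid range: " + startRange + "-" + endRange);
        }
        return "bytes=" + startRange + "-" + endRange;
    }

    /** Number of bytes covered by an inclusive range; both end points count. */
    static long rangeLength(long startRange, long endRange) {
        return endRange - startRange + 1;
    }
}
```

Under this assumption, a call with startRange 0 and endRange 99 would request the first 100 bytes of the resource.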
 
setProxyPort
@Field public void setProxyPort(int proxyPort)

setConnectTimeout
@Field public void setConnectTimeout(int connectTimeout)

setRequestTimeout
@Field public void setRequestTimeout(int requestTimeout)

setSocketTimeout
@Field public void setSocketTimeout(int socketTimeout)

setMaxConnections
@Field public void setMaxConnections(int maxConnections)

setMaxConnectionsPerRoute
@Field public void setMaxConnectionsPerRoute(int maxConnectionsPerRoute)

setMaxSpoolSize
@Field public void setMaxSpoolSize(long maxSpoolSize)
Set the maximum number of bytes to spool to a temp file. If this value is -1, the full stream will be spooled to a temp file. The default is -1.
- Parameters:
  - maxSpoolSize - the maximum number of bytes to spool to a temp file, or -1 to spool the full stream
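
To make the effect of a spool limit concrete, here is a minimal sketch of copying a stream to a temp file with an optional byte cap. It uses only the JDK; the class SpoolSketch, its Result type, and the way truncation is detected are hypothetical and are not taken from HttpFetcher's implementation. The sketch treats any negative limit as "no limit", whereas the documentation above only names -1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch, not Tika code: spool a stream to a temp file with an optional byte limit. */
final class SpoolSketch {

    /** Where the bytes went, how many were written, and whether the limit cut the stream short. */
    static final class Result {
        final Path file;
        final long bytesWritten;
        final boolean truncated;

        Result(Path file, long bytesWritten, boolean truncated) {
            this.file = file;
            this.bytesWritten = bytesWritten;
            this.truncated = truncated;
        }
    }

    private SpoolSketch() {
    }

    /** Copies at most maxSpoolSize bytes to a new temp file; a negative limit copies everything. */
    static Result spool(InputStream in, long maxSpoolSize) throws IOException {
        Path tmp = Files.createTempFile("spool-sketch", ".bin");
        long written = 0;
        boolean truncated = false;
        byte[] buf = new byte[8192];
        try (OutputStream out = Files.newOutputStream(tmp)) {
            while (true) {
                int want = buf.length;
                if (maxSpoolSize >= 0) {
                    long remaining = maxSpoolSize - written;
                    if (remaining <= 0) {
                        // Limit reached: probe one byte to learn whether data was left behind.
                        truncated = in.read() != -1;
                        break;
                    }
                    want = (int) Math.min(want, remaining);
                }
                int n = in.read(buf, 0, want);
                if (n == -1) {
                    break; // end of stream before the limit
                }
                out.write(buf, 0, n);
                written += n;
            }
        }
        return new Result(tmp, written, truncated);
    }
}
```

A caller could record the truncated flag alongside the fetched content; the HTTP_FETCH_TRUNCATED field listed above suggests HttpFetcher reports something similar, though this page does not say how.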
 
setMaxRedirects
@Field public void setMaxRedirects(int maxRedirects)

setHttpRequestHeaders
@Field public void setHttpRequestHeaders(List<String> headers)
Which http request headers should we send in the http fetch requests.
- Parameters:
  - headers - The headers to add to the HTTP GET requests.
 
setHttpHeaders
@Field public void setHttpHeaders(List<String> headers)
Which http headers should we capture in the metadata. Keys will be prepended with HTTP_HEADER_PREFIX.
- Parameters:
  - headers - the names of the HTTP headers to capture in the metadata
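
The key prefixing described for setHttpHeaders can be sketched as follows. This is an illustration only: the class HeaderCaptureSketch is hypothetical, the prefix value "http-header:" is an assumed placeholder because this page does not state the value of HTTP_HEADER_PREFIX, and plain maps stand in for Tika's Metadata object. Header names are matched case-insensitively, as HTTP header names are case-insensitive.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch, not Tika code: copy selected response headers into prefixed metadata keys. */
final class HeaderCaptureSketch {

    // Assumed placeholder; the real value of HttpFetcher.HTTP_HEADER_PREFIX is not given on this page.
    static final String HTTP_HEADER_PREFIX = "http-header:";

    private HeaderCaptureSketch() {
    }

    /** Returns prefix + requested name mapped to the header's values, for requested headers that are present. */
    static Map<String, List<String>> capture(Map<String, List<String>> responseHeaders,
                                             List<String> wanted) {
        // Index the response headers by lower-cased name for case-insensitive lookup.
        Map<String, List<String>> byLowerName = new HashMap<>();
        for (Map.Entry<String, List<String>> e : responseHeaders.entrySet()) {
            byLowerName
                    .computeIfAbsent(e.getKey().toLowerCase(Locale.ROOT), k -> new ArrayList<>())
                    .addAll(e.getValue());
        }
        Map<String, List<String>> metadata = new LinkedHashMap<>();
        for (String name : wanted) {
            List<String> values = byLowerName.get(name.toLowerCase(Locale.ROOT));
            if (values != null) {
                metadata.put(HTTP_HEADER_PREFIX + name, new ArrayList<>(values));
            }
        }
        return metadata;
    }
}
```

Headers that were requested but are absent from the response are simply skipped in this sketch; headers that are present but not requested are never copied.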
 
setOverallTimeout
@Field public void setOverallTimeout(long overallTimeout)
This sets an overall timeout on the request. If a server is very slow or the file is very large, the other timeouts might not be triggered.
- Parameters:
  - overallTimeout - the overall timeout for the request
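
The reason an overall timeout is useful is that a per-read timeout typically restarts whenever any data arrives, so a server that keeps trickling bytes may never trip it. The following minimal sketch shows one way to bound the total duration of a task with the JDK's concurrency utilities. The class OverallTimeoutSketch is hypothetical, the millisecond unit is an assumption (this page does not state the unit of overallTimeout), and HttpFetcher may enforce its timeout by a different mechanism.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch, not Tika code: enforce one deadline over a whole task. */
final class OverallTimeoutSketch {

    private OverallTimeoutSketch() {
    }

    /** Runs the task on a worker thread and gives up once overallTimeoutMillis has elapsed. */
    static <T> T callWithOverallTimeout(Callable<T> task, long overallTimeoutMillis)
            throws InterruptedException, ExecutionException, TimeoutException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            // The deadline covers the entire task, however many small reads it performs.
            return future.get(overallTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it can stop early
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Interrupting the worker only stops tasks that respond to interruption; blocking socket I/O usually needs the connection itself to be aborted, which is outside the scope of this sketch.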
 
setMaxErrMsgSize
@Field public void setMaxErrMsgSize(int maxErrMsgSize)

setUserAgent
@Field public void setUserAgent(String userAgent)
The User-Agent to send when making the request. By default, httpclient sends a value such as "Apache-HttpClient/4.5.13 (Java/x.y.z)".
- Parameters:
  - userAgent - the User-Agent value to send in the request
 
setJwtExpiresInSeconds
@Field public void setJwtExpiresInSeconds(int jwtExpiresInSeconds)

initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
- Specified by:
  - initialize in interface Initializable
- Parameters:
  - params - params to use for initialization
- Throws:
  - TikaConfigException
 
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
  - checkInitialization in interface Initializable
- Parameters:
  - problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
  - TikaConfigException
 
setHttpClientFactory
public void setHttpClientFactory(HttpClientFactory httpClientFactory)

setHttpClient
public void setHttpClient(org.apache.http.client.HttpClient httpClient)

getHttpClient
public org.apache.http.client.HttpClient getHttpClient()

getHttpFetcherConfig
public HttpFetcherConfig getHttpFetcherConfig()

setHttpFetcherConfig
public void setHttpFetcherConfig(HttpFetcherConfig httpFetcherConfig)
 