Class HttpFetcher
java.lang.Object
  org.apache.tika.pipes.fetcher.AbstractFetcher
    org.apache.tika.pipes.fetcher.http.HttpFetcher
All Implemented Interfaces:
Initializable, Fetcher, RangeFetcher
public class HttpFetcher extends AbstractFetcher implements Initializable, RangeFetcher
A fetcher based on Apache HttpClient.
-
-
Field Summary
Fields
Modifier and Type    Field                     Description
static Property      HTTP_CONTENT_ENCODING
static Property      HTTP_CONTENT_TYPE
static String        HTTP_FETCH_PREFIX
static Property      HTTP_FETCH_TRUNCATED
static String        HTTP_HEADER_PREFIX
static Property      HTTP_NUM_REDIRECTS        Number of redirects
static Property      HTTP_STATUS_CODE          HTTP status code
static Property      HTTP_TARGET_IP_ADDRESS
static Property      HTTP_TARGET_URL           If there were redirects, this captures the final URL visited
-
Constructor Summary
Constructors
Constructor      Description
HttpFetcher()
-
Method Summary
Modifier and Type    Method and Description
void                 checkInitialization(InitializableProblemHandler problemHandler)
InputStream          fetch(String fetchKey, long startRange, long endRange, Metadata metadata)
InputStream          fetch(String fetchKey, Metadata metadata)
void                 initialize(Map<String,Param> params)
void                 setAuthScheme(String authScheme)
void                 setConnectTimeout(int connectTimeout)
void                 setHttpHeaders(List<String> headers)
                     Which HTTP headers should we capture in the metadata.
void                 setMaxConnections(int maxConnections)
void                 setMaxConnectionsPerRoute(int maxConnectionsPerRoute)
void                 setMaxErrMsgSize(int maxErrMsgSize)
void                 setMaxRedirects(int maxRedirects)
void                 setMaxSpoolSize(long maxSpoolSize)
                     Set the maximum number of bytes to spool to a temp file.
void                 setNtDomain(String domain)
void                 setOverallTimeout(long overallTimeout)
                     This sets an overall timeout on the request.
void                 setPassword(String password)
void                 setProxyHost(String proxyHost)
void                 setProxyPort(int proxyPort)
void                 setRequestTimeout(int requestTimeout)
void                 setSocketTimeout(int socketTimeout)
void                 setUserAgent(String userAgent)
                     When making the request, what User-Agent is sent in the request.
void                 setUserName(String userName)
-
Methods inherited from class org.apache.tika.pipes.fetcher.AbstractFetcher
getName, setName
-
-
-
-
Field Detail
-
HTTP_HEADER_PREFIX
public static String HTTP_HEADER_PREFIX
-
HTTP_FETCH_PREFIX
public static String HTTP_FETCH_PREFIX
-
HTTP_STATUS_CODE
public static Property HTTP_STATUS_CODE
HTTP status code
-
HTTP_NUM_REDIRECTS
public static Property HTTP_NUM_REDIRECTS
Number of redirects
-
HTTP_TARGET_URL
public static Property HTTP_TARGET_URL
If there were redirects, this captures the final URL visited
-
HTTP_TARGET_IP_ADDRESS
public static Property HTTP_TARGET_IP_ADDRESS
-
HTTP_FETCH_TRUNCATED
public static Property HTTP_FETCH_TRUNCATED
-
HTTP_CONTENT_ENCODING
public static Property HTTP_CONTENT_ENCODING
-
HTTP_CONTENT_TYPE
public static Property HTTP_CONTENT_TYPE
-
-
Method Detail
-
fetch
public InputStream fetch(String fetchKey, Metadata metadata) throws IOException, TikaException
- Specified by:
fetch in interface Fetcher
- Throws:
IOException
TikaException
-
fetch
public InputStream fetch(String fetchKey, long startRange, long endRange, Metadata metadata) throws IOException
- Specified by:
fetch in interface RangeFetcher
- Throws:
IOException
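The range overload presumably maps startRange and endRange onto an HTTP Range request. The following is a minimal sketch of such a request built with the JDK's java.net.http API only; it is not HttpFetcher's own code. The class RangeRequestSketch, the method rangeRequest, and the example URL are hypothetical, and treating endRange as inclusive is an assumption taken from HTTP's byte-range syntax, not from this page.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical illustration; not part of Tika.
final class RangeRequestSketch {

    // Builds a GET request asking for bytes startRange..endRange (both inclusive,
    // as in the HTTP "bytes=first-last" syntax).
    static HttpRequest rangeRequest(String url, long startRange, long endRange) {
        if (startRange < 0 || endRange < startRange) {
            throw new IllegalArgumentException(
                    "invalid range: " + startRange + "-" + endRange);
        }
        return HttpRequest.newBuilder(URI.create(url))
                .header("Range", "bytes=" + startRange + "-" + endRange)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // The URL is a placeholder; no network call is made here.
        HttpRequest request = rangeRequest("https://example.com/file.pdf", 0, 1023);
        System.out.println(request.headers().firstValue("Range").orElse("(none)"));
    }
}
```

A server that honours the header answers with status 206 and only the requested bytes; a server that ignores it returns the full body, so callers should not rely on the length of the returned stream.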
-
setProxyPort
@Field public void setProxyPort(int proxyPort)
-
setConnectTimeout
@Field public void setConnectTimeout(int connectTimeout)
-
setRequestTimeout
@Field public void setRequestTimeout(int requestTimeout)
-
setSocketTimeout
@Field public void setSocketTimeout(int socketTimeout)
-
setMaxConnections
@Field public void setMaxConnections(int maxConnections)
-
setMaxConnectionsPerRoute
@Field public void setMaxConnectionsPerRoute(int maxConnectionsPerRoute)
-
setMaxSpoolSize
@Field public void setMaxSpoolSize(long maxSpoolSize)
Set the maximum number of bytes to spool to a temp file. If this value is -1, the full stream will be spooled to a temp file. The default is -1.
- Parameters:
maxSpoolSize
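The spool limit can be pictured with a small JDK-only sketch: copy at most maxSpoolSize bytes of a stream into a temp file, with -1 meaning no limit. This illustrates the documented semantics and is not HttpFetcher's implementation. SpoolSketch and spool are hypothetical names, and the sketch assumes that bytes beyond the limit are simply not copied, which this page does not state.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration; not part of Tika.
final class SpoolSketch {

    // Copies at most maxSpoolSize bytes of 'in' to a new temp file.
    // A negative maxSpoolSize (e.g. -1) means "spool the full stream".
    static Path spool(InputStream in, long maxSpoolSize) throws IOException {
        Path tmp = Files.createTempFile("spool-sketch-", ".bin");
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buffer = new byte[8192];
            long remaining = maxSpoolSize < 0 ? Long.MAX_VALUE : maxSpoolSize;
            while (remaining > 0) {
                int toRead = (int) Math.min(buffer.length, remaining);
                int read = in.read(buffer, 0, toRead);
                if (read == -1) {
                    break; // end of stream reached before the limit
                }
                out.write(buffer, 0, read);
                remaining -= read;
            }
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[10_000];
        Path limited = spool(new ByteArrayInputStream(data), 4_096);
        Path full = spool(new ByteArrayInputStream(data), -1);
        System.out.println(Files.size(limited) + " bytes vs " + Files.size(full) + " bytes");
        Files.delete(limited);
        Files.delete(full);
    }
}
```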
-
-
setMaxRedirects
@Field public void setMaxRedirects(int maxRedirects)
-
setHttpHeaders
@Field public void setHttpHeaders(List<String> headers)
Specifies which HTTP headers to capture in the metadata. Keys will be prepended with HTTP_HEADER_PREFIX.
- Parameters:
headers
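As a rough picture of the key prefixing, the sketch below copies selected response headers into a map under prefixed keys. It uses only the JDK and is not HttpFetcher's code. HeaderCaptureSketch, capture, and the prefix "example-prefix:" are hypothetical; the actual value of HTTP_HEADER_PREFIX is not given on this page. Header names are matched exactly here, whereas HTTP header names are case-insensitive.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of Tika.
final class HeaderCaptureSketch {

    // Returns the requested headers that are present in the response,
    // stored under prefix + headerName.
    static Map<String, String> capture(Map<String, String> responseHeaders,
                                       List<String> headersToCapture,
                                       String prefix) {
        Map<String, String> captured = new LinkedHashMap<>();
        for (String name : headersToCapture) {
            String value = responseHeaders.get(name);
            if (value != null) {
                captured.put(prefix + name, value);
            }
        }
        return captured;
    }

    public static void main(String[] args) {
        Map<String, String> response = Map.of(
                "Content-Type", "text/html",
                "Server", "nginx");
        // "example-prefix:" is a placeholder, not the real HTTP_HEADER_PREFIX value.
        Map<String, String> captured =
                capture(response, List.of("Content-Type", "ETag"), "example-prefix:");
        System.out.println(captured);
    }
}
```

Only headers named in the list and present in the response end up in the result; in the example, Server is not requested and ETag is not present, so neither is captured.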
-
-
setOverallTimeout
@Field public void setOverallTimeout(long overallTimeout)
This sets an overall timeout on the request. If a server is very slow or the file is very large, the other timeouts might not be triggered.
- Parameters:
overallTimeout
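A socket read timeout only bounds the wait for the next chunk of data, so a server that keeps trickling bytes never trips it; an overall timeout bounds the whole operation. The sketch below shows one generic way to enforce such a deadline with java.util.concurrent. It does not show how HttpFetcher implements its timeout. OverallTimeoutSketch and callWithOverallTimeout are hypothetical names, and milliseconds as the unit is an assumption.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration; not part of Tika.
final class OverallTimeoutSketch {

    // Runs 'task' on a worker thread and gives up after overallTimeoutMillis,
    // regardless of how that time was spent.
    static <T> T callWithOverallTimeout(Callable<T> task, long overallTimeoutMillis)
            throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(overallTimeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        try {
            // Stands in for a slow download that never hits a per-read timeout.
            callWithOverallTimeout(() -> {
                Thread.sleep(5_000);
                return "late";
            }, 100);
        } catch (TimeoutException e) {
            System.out.println("gave up after the overall timeout");
        }
    }
}
```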
-
-
setMaxErrMsgSize
@Field public void setMaxErrMsgSize(int maxErrMsgSize)
-
setUserAgent
@Field public void setUserAgent(String userAgent)
Sets the User-Agent sent with the request. By default, httpclient adds e.g. "Apache-HttpClient/4.5.13 (Java/x.y.z)".
- Parameters:
userAgent
-
-
initialize
public void initialize(Map<String,Param> params) throws TikaConfigException
- Specified by:
initialize in interface Initializable
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
-