Package org.apache.tika.io
Class SpoolingStrategy
java.lang.Object
org.apache.tika.io.SpoolingStrategy
Strategy for determining when to spool a TikaInputStream to disk.
Components (detectors, parsers) can check this strategy before calling
TikaInputStream.getFile() to determine if spooling is appropriate
for the given media type.
Default behavior (when no strategy is present in the ParseContext): components spool whenever they need to. Supplying a strategy gives fine-grained control over spooling decisions.
Configure via JSON:
{
  "spooling-strategy": {
    "spoolTypes": ["application/zip", "application/x-tika-msoffice", "application/pdf"]
  }
}
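The default-versus-strategy rule described above can be pictured from the caller's side. The following is a minimal stand-alone sketch, not Tika code: `SpoolGate`, `Strategy`, and `maySpool` are hypothetical stand-ins that use plain strings for media types so that the example runs without Tika on the classpath. A real component would presumably look the strategy up in the ParseContext and pass the TikaInputStream and Metadata along with the media type.

```java
import java.util.Set;

public class SpoolGate {

    /** Hypothetical stand-in for SpoolingStrategy, reduced to the media type check. */
    interface Strategy {
        boolean shouldSpool(String mediaType);
    }

    /**
     * Decides whether a component may spool.
     * A null strategy models "no strategy in ParseContext": spool when needed.
     */
    static boolean maySpool(Strategy strategy, String mediaType) {
        if (strategy == null) {
            return true; // default behavior: no restriction
        }
        return strategy.shouldSpool(mediaType);
    }

    public static void main(String[] args) {
        // Mirrors the spoolTypes list from the JSON configuration (exact match only here).
        Set<String> spoolTypes = Set.of(
                "application/zip", "application/x-tika-msoffice", "application/pdf");
        Strategy strategy = spoolTypes::contains;

        System.out.println(maySpool(null, "text/plain"));          // no strategy configured
        System.out.println(maySpool(strategy, "application/pdf")); // listed type
        System.out.println(maySpool(strategy, "text/plain"));      // unlisted type
    }
}
```

The sketch deliberately ignores specializations of the listed types; it only shows where the "no strategy" default and the strategy check sit relative to each other.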
Constructor Summary
Constructors
SpoolingStrategy()
Method Summary
Modifier and Type, Method, Description
MediaTypeRegistry getMediaTypeRegistry()
    Returns the media type registry.
Set<MediaType> getSpoolTypes()
    Returns the media types that should be spooled to disk.
void setMediaTypeRegistry(MediaTypeRegistry registry)
    Sets the media type registry used for checking type specializations.
void setSpoolTypes(Set<MediaType> spoolTypes)
    Sets the media types that should be spooled to disk.
boolean shouldSpool(TikaInputStream tis, Metadata metadata, MediaType mediaType)
    Determines whether the stream should be spooled to disk.
Constructor Details
SpoolingStrategy
public SpoolingStrategy()
Method Details
shouldSpool
public boolean shouldSpool(TikaInputStream tis, Metadata metadata, MediaType mediaType)
Determines whether the stream should be spooled to disk.
Parameters:
tis - the TikaInputStream (can check hasFile(), getLength())
metadata - metadata (can check content-type hints, filename)
mediaType - the detected or declared media type
Returns:
true if the stream should be spooled to disk
setSpoolTypes
public void setSpoolTypes(Set<MediaType> spoolTypes)
Sets the media types that should be spooled to disk. Specializations of these types are also included.
Parameters:
spoolTypes - set of media types to spool
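The statement that specializations of the configured types are also included can be illustrated with a small stand-alone model. This is a sketch under stated assumptions rather than Tika's implementation: the class `SpecializationSketch`, its `PARENTS` map, and the made-up `application/x-example-*` types are illustrative stand-ins for the hierarchy that a MediaTypeRegistry would supply.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SpecializationSketch {

    // Illustrative child -> parent links; a real registry would provide these.
    static final Map<String, String> PARENTS = Map.of(
            "application/x-example-archive", "application/zip",
            "application/x-example-document", "application/x-example-archive");

    /** True if type equals a spool type or reaches one by walking up the parent chain. */
    static boolean matches(Set<String> spoolTypes, String type) {
        Set<String> seen = new HashSet<>(); // guards against cycles in the map
        String current = type;
        while (current != null && seen.add(current)) {
            if (spoolTypes.contains(current)) {
                return true;
            }
            current = PARENTS.get(current);
        }
        return false;
    }

    public static void main(String[] args) {
        Set<String> spoolTypes = Set.of("application/zip");
        System.out.println(matches(spoolTypes, "application/zip"));                // exact match
        System.out.println(matches(spoolTypes, "application/x-example-document")); // two levels down
        System.out.println(matches(spoolTypes, "text/plain"));                     // unrelated type
    }
}
```

In this model, listing a single parent type is enough to cover everything beneath it in the hierarchy, which is why a short spoolTypes list can stand for a whole family of formats.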
getSpoolTypes
public Set<MediaType> getSpoolTypes()
Returns the media types that should be spooled to disk.
Returns:
set of media types to spool
setMediaTypeRegistry
public void setMediaTypeRegistry(MediaTypeRegistry registry)
Sets the media type registry used for checking type specializations.
Parameters:
registry - the media type registry
getMediaTypeRegistry
public MediaTypeRegistry getMediaTypeRegistry()
Returns the media type registry.
Returns:
the media type registry, or null if not set