Package org.apache.tika.detect.siegfried
Class SiegfriedDetector
java.lang.Object
org.apache.tika.detect.siegfried.SiegfriedDetector
- All Implemented Interfaces:
Serializable, org.apache.tika.detect.Detector
Simple wrapper around Siegfried https://github.com/richardlehane/siegfried
The default behavior is to run detection, report the results in the
metadata and then return null so that other detectors will be used.
Field Summary
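Fields
Modifier and Type  Field  Description
static String  BASIS
static String  ERRORS
static String  FORMAT
static String  ID
static String  MIME
static org.apache.tika.metadata.Property  SIEGFRIED_ERRORS
static org.apache.tika.metadata.Property  SIEGFRIED_IDENTIFIERS_DETAILS
static org.apache.tika.metadata.Property  SIEGFRIED_IDENTIFIERS_NAME
static final String  SIEGFRIED_PREFIX
static org.apache.tika.metadata.Property  SIEGFRIED_SIGNATURE
static org.apache.tika.metadata.Property  SIEGFRIED_STATUS
static org.apache.tika.metadata.Property  SIEGFRIED_VERSION
static String  VERSION
static String  WARNING
-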
Constructor Summary
Constructors
Constructor  Description
SiegfriedDetector()
-
Method Summary
Modifier and Type  Method  Description
static boolean  checkHasSiegfried(String siegfriedCommandPath)
org.apache.tika.mime.MediaType  detect(InputStream input, org.apache.tika.metadata.Metadata metadata)
boolean  isUseMime()
protected static org.apache.tika.mime.MediaType  processResult(org.apache.tika.utils.FileProcessResult result, org.apache.tika.metadata.Metadata metadata, boolean returnMime)
void  setMaxBytes(int maxBytes)  If detect is not called with a TikaInputStream, this detector will spool up to this many bytes to a file for Siegfried to identify.
void  setSiegfriedPath(String fileCommandPath)
void  setTimeoutMs(long timeoutMs)
void  setUseMime(boolean useMime)  By default, Tika runs Siegfried to add its detection results to the metadata but does NOT use them to select parsers.
-
Field Details
-
SIEGFRIED_PREFIX
public static final String SIEGFRIED_PREFIX -
SIEGFRIED_STATUS
public static org.apache.tika.metadata.Property SIEGFRIED_STATUS -
SIEGFRIED_VERSION
public static org.apache.tika.metadata.Property SIEGFRIED_VERSION -
SIEGFRIED_SIGNATURE
public static org.apache.tika.metadata.Property SIEGFRIED_SIGNATURE -
SIEGFRIED_IDENTIFIERS_NAME
public static org.apache.tika.metadata.Property SIEGFRIED_IDENTIFIERS_NAME -
SIEGFRIED_IDENTIFIERS_DETAILS
public static org.apache.tika.metadata.Property SIEGFRIED_IDENTIFIERS_DETAILS -
SIEGFRIED_ERRORS
public static org.apache.tika.metadata.Property SIEGFRIED_ERRORS -
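ID
public static String ID -
FORMAT
public static String FORMAT -
VERSION
public static String VERSION -
MIME
public static String MIME -
WARNING
public static String WARNING -
BASIS
public static String BASIS -
ERRORS
public static String ERRORS -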
-
Constructor Details
-
SiegfriedDetector
public SiegfriedDetector()
-
-
Method Details
-
checkHasSiegfried
public static boolean checkHasSiegfried(String siegfriedCommandPath) -
detect
public org.apache.tika.mime.MediaType detect(InputStream input, org.apache.tika.metadata.Metadata metadata) throws IOException
- Specified by:
detect in interface org.apache.tika.detect.Detector
- Parameters:
input - document input stream, or null
metadata - input metadata for the document
- Returns:
the mime identified by Siegfried, or application/octet-stream otherwise
- Throws:
IOException
-
setUseMime
@Field public void setUseMime(boolean useMime)
By default, Tika runs Siegfried to add its detection results to the metadata but does NOT use them to select parsers. If this is set to true, this detector will return the first mime detected by Siegfried, and the AutoDetectParser will use that mime to select the appropriate parser.
- Parameters:
useMime - whether to return the mime detected by Siegfried
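The effect of the useMime flag can be pictured with a small toy model. This is only a sketch of the behavior described above, not Tika's implementation: the class UseMimeSketch, the method report, and the metadata key "example:mime" are all hypothetical, a plain Map stands in for Tika's Metadata, and Optional.empty() stands in for whatever non-answer the detector gives when it defers to other detectors.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

final class UseMimeSketch {

    /**
     * Always records every detected mime in the metadata map.
     * Returns the first detected mime only when useMime is true;
     * otherwise returns an empty Optional so other detectors decide.
     */
    static Optional<String> report(List<String> detectedMimes,
                                   Map<String, List<String>> metadata,
                                   boolean useMime) {
        // "example:mime" is a made-up key for this sketch, not a Tika property name.
        metadata.computeIfAbsent("example:mime", k -> new ArrayList<>())
                .addAll(detectedMimes);
        if (useMime && !detectedMimes.isEmpty()) {
            return Optional.of(detectedMimes.get(0));
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<String> mimes = List.of("application/pdf", "application/postscript");
        Map<String, List<String>> metadata = new HashMap<>();
        System.out.println(report(mimes, metadata, false)); // prints Optional.empty
        System.out.println(report(mimes, metadata, true));  // prints Optional[application/pdf]
    }
}
```

In both calls the detections end up in the metadata; only the return value changes, which is what decides whether downstream parser selection sees Siegfried's answer.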
-
isUseMime
public boolean isUseMime() -
processResult
protected static org.apache.tika.mime.MediaType processResult(org.apache.tika.utils.FileProcessResult result, org.apache.tika.metadata.Metadata metadata, boolean returnMime) -
setSiegfriedPath
public void setSiegfriedPath(String fileCommandPath) -
setMaxBytes
@Field public void setMaxBytes(int maxBytes)
If detect is not called with a TikaInputStream, this detector will spool up to this many bytes to a file for Siegfried to identify.
- Parameters:
maxBytes - maximum number of bytes to spool
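Spooling a bounded prefix of a stream to a file can be sketched with the standard library alone. The following is a minimal illustration of the idea, assuming a stream that supports mark/reset; it is not Tika's actual code, and the names SpoolSketch and spool are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class SpoolSketch {

    /**
     * Copies at most maxBytes from input into a new temporary file, then
     * resets the stream so later readers see it from the original position.
     * The stream must support mark/reset (wrap it in a BufferedInputStream if not).
     */
    static Path spool(InputStream input, int maxBytes) throws IOException {
        if (!input.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        Path tmp = Files.createTempFile("spool-sketch-", ".bin");
        input.mark(maxBytes);
        try (OutputStream out = Files.newOutputStream(tmp)) {
            byte[] buffer = new byte[8192];
            int remaining = maxBytes;
            while (remaining > 0) {
                int read = input.read(buffer, 0, Math.min(buffer.length, remaining));
                if (read == -1) {
                    break; // stream ended before the limit was reached
                }
                out.write(buffer, 0, read);
                remaining -= read;
            }
        } finally {
            input.reset(); // rewind so the caller can still read the full stream
        }
        return tmp;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[100]);
        Path spooled = spool(in, 10);
        System.out.println(Files.size(spooled)); // prints 10
        Files.delete(spooled);
    }
}
```

A larger limit gives the external tool more of the document to inspect, at the cost of more temporary disk I/O per detection.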
-
setTimeoutMs
@Field public void setTimeoutMs(long timeoutMs)
-