public class MiscOLEDetector extends Object implements Detector
|Modifier and Type||Field and Description|
|static MediaType||HWP: Hangul Word Processor (Korean)|
|static MediaType||OLE: The OLE base file format|
|static MediaType||QUATTROPRO: Base QuattroPro MIME type|
|Constructor and Description|
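|Modifier and Type||Method and Description|
|MediaType||detect(InputStream input, Metadata metadata): Detects the content type of the given input document.|
|protected static MediaType||detect(Set<String> names, DirectoryEntry root): Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.|
|void||setMarkLimit(int markLimit): If a TikaInputStream is passed in to detect(InputStream, Metadata), and there is no underlying file, this detector will spool up to markLimit to disk.|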
public static final MediaType OLE
public static final MediaType HWP
public static final MediaType QUATTROPRO
protected static MediaType detect(Set<String> names)
Use detect(Set, DirectoryEntry) instead, and pass the root entry of the filesystem whose type is to be detected as the second argument.
protected static MediaType detect(Set<String> names, org.apache.poi.poifs.filesystem.DirectoryEntry root)
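Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file. Detection may need access to the root DirectoryEntry of that file for best results. The entry can be given as a second, optional argument.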
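The idea of classifying an OLE2 container by the names of its top-level entries can be sketched without Tika or POI on the classpath. This is only an illustration, not the detector's actual logic: the class `OleNameSketch`, the entry names it checks, and the MIME type strings are all assumptions chosen for the example.

```java
import java.util.Set;

// Hypothetical sketch: classify an OLE2 container by its top-level entry names.
// The entry names and MIME strings are illustrative assumptions, not values
// confirmed by this page.
final class OleNameSketch {
    static final String OLE = "application/x-tika-msoffice";
    static final String HWP = "application/x-hwp-v5";
    static final String QUATTROPRO = "application/x-quattro-pro";

    static String detect(Set<String> names) {
        // With no entry names to inspect, fall back to the generic OLE type.
        if (names == null || names.isEmpty()) {
            return OLE;
        }
        // Assumed marker entry for Hangul Word Processor documents.
        if (names.contains("\u0005HwpSummaryInformation")) {
            return HWP;
        }
        // Assumed marker entries for QuattroPro spreadsheets.
        if (names.contains("PerfectOffice_MAIN") || names.contains("NativeContent_MAIN")) {
            return QUATTROPRO;
        }
        // Anything else is reported as a plain OLE2 container.
        return OLE;
    }

    public static void main(String[] args) {
        System.out.println(detect(Set.of("NativeContent_MAIN")));
        System.out.println(detect(Set.of("WordDocument")));
    }
}
```

A real detector would likely also consult the root directory entry, as the method description notes, because entry names alone can be ambiguous.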
public void setMarkLimit(int markLimit)
If a TikaInputStream is passed in to detect(InputStream, Metadata), and there is no underlying file, this detector will spool up to markLimit to disk. If the stream was read in its entirety (i.e. the spooled file is not truncated), this detector will open the file with POI and perform detection. If the spooled file is truncated, the detector will return OLE (or MediaType.OCTET_STREAM if there's no OLE header).
As of Tika 1.21, this detector respects the legacy behavior of not performing detection on a non-TikaInputStream.
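The bounded spooling described above can be illustrated with plain JDK classes. The sketch below is not Tika's implementation: the class `SpoolSketch` and its `spool` method are hypothetical, and it only shows one way to copy at most `markLimit` bytes to a file while noticing whether the input was longer.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of bounded spooling; not Tika's implementation.
final class SpoolSketch {
    /**
     * Copies at most markLimit bytes from the stream to the target file.
     * Returns true if the whole stream fit (the spooled file is complete),
     * false if the stream was longer and the spooled file is truncated.
     */
    static boolean spool(InputStream in, int markLimit, Path target) throws IOException {
        if (markLimit < 0 || markLimit == Integer.MAX_VALUE) {
            throw new IllegalArgumentException("markLimit out of range: " + markLimit);
        }
        // Read one byte past the limit to tell "exactly at the limit" from "longer".
        byte[] data = in.readNBytes(markLimit + 1);
        boolean complete = data.length <= markLimit;
        try (OutputStream out = Files.newOutputStream(target)) {
            out.write(data, 0, Math.min(data.length, markLimit));
        }
        return complete;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("spool-sketch", ".bin");
        try {
            boolean complete = spool(new ByteArrayInputStream(new byte[100]), 16, tmp);
            System.out.println("complete=" + complete + ", spooled bytes=" + Files.size(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

A caller following the behavior described above would open the spooled file for full detection only when `spool` returns true, and otherwise fall back to a header check.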
public MediaType detect(InputStream input, Metadata metadata) throws IOException
Detects the content type of the given input document. Returns application/octet-stream if the type of the document cannot be detected.
If the document input stream is not available, then the first argument may be null. Otherwise the detector may read bytes from the start of the stream to help in type detection.
The given stream is guaranteed to support the mark feature and the detector is expected to mark the stream before reading any bytes from it, and to reset the stream before returning. The stream must not be closed by the detector.
The given input metadata is only read, not modified, by the detector.
Parameters:
input - document input stream, or null
metadata - input metadata for the document
Throws:
IOException - if the document input stream could not be read
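The mark/reset contract for detect(InputStream, Metadata) can be sketched with JDK streams alone. The example below is a simplified stand-in, not this detector's code: `MarkResetSketch` is a hypothetical class, it checks only the eight-byte OLE2 signature, and the MIME string it returns for a match is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical sketch of the mark/reset contract; not MiscOLEDetector itself.
final class MarkResetSketch {
    // OLE2 compound document signature (the first eight bytes of the file).
    private static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    static String detect(InputStream input) throws IOException {
        // A null stream means no content is available to inspect.
        if (input == null) {
            return "application/octet-stream";
        }
        if (!input.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        // Mark before reading so the caller gets the stream back untouched.
        input.mark(OLE2_MAGIC.length);
        try {
            byte[] header = input.readNBytes(OLE2_MAGIC.length);
            return Arrays.equals(header, OLE2_MAGIC)
                    ? "application/x-tika-msoffice" // assumed name for the generic OLE type
                    : "application/octet-stream";
        } finally {
            // Reset, but never close: the stream still belongs to the caller.
            input.reset();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] doc = Arrays.copyOf(OLE2_MAGIC, 32);
        InputStream in = new ByteArrayInputStream(doc);
        System.out.println(detect(in));
        System.out.println("bytes still available: " + in.available());
    }
}
```

Placing the reset in a `finally` block keeps the stream position intact even if reading fails partway through.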
Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.