Class MimeTypes
- java.lang.Object
-
- org.apache.tika.mime.MimeTypes
-
- All Implemented Interfaces:
Serializable, Detector
public final class MimeTypes extends Object implements Detector, Serializable
This class is a MimeType repository. It gathers a set of MimeTypes and lets a client retrieve a content type from its name, from a file name, or from a magic character sequence.

The MIME type detection methods that take an InputStream as an argument will never read more than getMinLength() bytes from the stream. Also, the given stream is never closed, marked, or reset by the methods. Thus a client can use the mark feature of the stream (if available) to restore the stream to the state it was in before type detection if it wants to process the stream based on the detected type.
- See Also:
- Serialized Form
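The mark-and-restore pattern described above can be sketched with JDK classes alone. This is a minimal illustration, not Tika code: `MarkResetSketch`, `peekPrefix`, and `peekAndRestore` are hypothetical names, and `peekPrefix` merely stands in for a detection call that reads a bounded prefix (in real use the bound would come from getMinLength()) without marking, resetting, or closing the stream.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; none of these names belong to Tika.
class MarkResetSketch {

    // Stand-in for a detection call: reads at most maxBytes from the stream
    // and never marks, resets, or closes it.
    static byte[] peekPrefix(InputStream in, int maxBytes) throws IOException {
        byte[] buffer = new byte[maxBytes];
        int total = 0;
        while (total < maxBytes) {
            int n = in.read(buffer, total, maxBytes - total);
            if (n < 0) {
                break; // end of stream reached before maxBytes were read
            }
            total += n;
        }
        byte[] prefix = new byte[total];
        System.arraycopy(buffer, 0, prefix, 0, total);
        return prefix;
    }

    // The client marks the stream, lets the stand-in consume the prefix,
    // then rewinds so later processing sees the stream from the start.
    static byte[] peekAndRestore(InputStream in, int maxBytes) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(maxBytes);
        byte[] prefix = peekPrefix(in, maxBytes);
        in.reset();
        return prefix;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "<?xml version=\"1.0\"?><doc/>".getBytes(StandardCharsets.US_ASCII);
        // BufferedInputStream adds mark/reset support to streams that lack it.
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        byte[] prefix = peekAndRestore(in, 5);
        System.out.println(new String(prefix, StandardCharsets.US_ASCII));
        System.out.println((char) in.read()); // first byte again, because the stream was reset
    }
}
```

Wrapping a stream in `BufferedInputStream` is a common way to obtain mark support when the underlying stream does not provide it.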
-
-
Field Summary
Fields
Modifier and Type   Field   Description
static String
OCTET_STREAM
Name of the root type, application/octet-stream.
static String
PLAIN_TEXT
Name of the text type, text/plain.
static String
XML
Name of the xml type, application/xml.
-
Constructor Summary
Constructors
Constructor   Description
MimeTypes()
-
Method Summary
Modifier and Type   Method   Description
void
addPattern(MimeType type, String pattern)
Adds a file name pattern for the given media type.
void
addPattern(MimeType type, String pattern, boolean isRegex)
Adds a file name pattern for the given media type.
MediaType
detect(InputStream input, Metadata metadata)
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
MimeType
forName(String name)
Returns the registered media type with the given name (or alias).
static MimeTypes
getDefaultMimeTypes()
Get the default MimeTypes.
static MimeTypes
getDefaultMimeTypes(ClassLoader classLoader)
Get the default MimeTypes.
MediaTypeRegistry
getMediaTypeRegistry()
MimeType
getMimeType(File file)
Deprecated. Use Tika.detect(File) instead.
MimeType
getMimeType(String name)
Deprecated. Use Tika.detect(String) instead.
int
getMinLength()
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
MimeType
getRegisteredMimeType(String name)
Returns the registered, normalised media type with the given name (or alias).
void
setSuperType(MimeType type, MediaType parent)
-
-
-
Field Detail
-
OCTET_STREAM
public static final String OCTET_STREAM
Name of the root type, application/octet-stream.
- See Also:
- Constant Field Values
-
PLAIN_TEXT
public static final String PLAIN_TEXT
Name of the text type, text/plain.
- See Also:
- Constant Field Values
-
XML
public static final String XML
Name of the xml type, application/xml.
- See Also:
- Constant Field Values
-
-
Method Detail
-
getDefaultMimeTypes
public static MimeTypes getDefaultMimeTypes()
Get the default MimeTypes. This includes all the built-in media types, and any custom override ones present.
- Returns:
- MimeTypes default type registry
-
getDefaultMimeTypes
public static MimeTypes getDefaultMimeTypes(ClassLoader classLoader)
Get the default MimeTypes. This includes all the built-in media types, and any custom override ones present.
- Parameters:
classLoader - to use, if not the default
- Returns:
- MimeTypes default type registry
-
getMimeType
public MimeType getMimeType(String name)
Deprecated. Use Tika.detect(String) instead.
Find the Mime Content Type of a document from its name. Returns application/octet-stream if no better match is found.
- Parameters:
name - of the document to analyze.
- Returns:
- the Mime Content Type of the specified document name
-
getMimeType
public MimeType getMimeType(File file) throws MimeTypeException, IOException
Deprecated. Use Tika.detect(File) instead.
Find the Mime Content Type of a document stored in the given file. Returns application/octet-stream if no better match is found.
- Parameters:
file - file to analyze
- Returns:
- the Mime Content Type of the specified document
- Throws:
MimeTypeException - if the type can't be detected
IOException - if the file can't be read
-
forName
public MimeType forName(String name) throws MimeTypeException
Returns the registered media type with the given name (or alias). The named media type is automatically registered (and returned) if it doesn't already exist.
- Parameters:
name - media type name (case-insensitive)
- Returns:
- the registered media type with the given name or alias
- Throws:
MimeTypeException - if the given media type name is invalid
-
getRegisteredMimeType
public MimeType getRegisteredMimeType(String name) throws MimeTypeException
Returns the registered, normalised media type with the given name (or alias).

Unlike forName(String), this function will not create a new MimeType and register it. Instead, null will be returned if there is no definition available for the given name.

Also, unlike forName(String), this function may return a mime type that has fewer parameters than were included in the supplied name. If the registered mime type has parameters (e.g. application/dita+xml;format=map), then those will be maintained. However, if the supplied name has parameters that the registered mime type does not (e.g. application/xml; charset=UTF-8 as a name, compared to just application/xml for the type in the registry), then those parameters will not be included in the returned type.
- Parameters:
name - media type name (case-insensitive)
- Returns:
- the registered media type with the given name or alias, or null if not found
- Throws:
MimeTypeException - if the given media type name is invalid
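The parameter handling described for getRegisteredMimeType can be illustrated with a small stand-alone sketch. It is not Tika's implementation: `RegisteredLookupSketch`, its `REGISTERED` set, and `lookup` are hypothetical, the registry contents are assumed to match the two examples in the text, and the normalisation (lower-casing, removing whitespace around `;`) is a simplification chosen for the illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for a type registry; this is NOT Tika's implementation.
class RegisteredLookupSketch {

    // Assumed registry contents, chosen to match the examples in the text.
    static final Set<String> REGISTERED = new HashSet<>(Arrays.asList(
            "application/xml",
            "application/dita+xml;format=map"));

    // Returns the registered name matching the supplied one, or null if
    // nothing is registered. Nothing is ever created or registered here.
    static String lookup(String name) {
        // Simplified normalisation: lower-case, no whitespace around ';'.
        String normalized = name.trim()
                .toLowerCase(Locale.ROOT)
                .replaceAll("\\s*;\\s*", ";");

        // A registered type that itself carries parameters keeps them.
        if (REGISTERED.contains(normalized)) {
            return normalized;
        }

        // Otherwise drop the parameters the registry does not know about.
        int semicolon = normalized.indexOf(';');
        if (semicolon >= 0) {
            String base = normalized.substring(0, semicolon);
            if (REGISTERED.contains(base)) {
                return base;
            }
        }
        return null;
    }
}
```

Under these assumptions, `lookup("application/xml; charset=UTF-8")` yields `application/xml`, `lookup("application/dita+xml;format=map")` keeps its parameter, and an unregistered name yields `null`.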
-
addPattern
public void addPattern(MimeType type, String pattern) throws MimeTypeException
Adds a file name pattern for the given media type. Assumes that the pattern being added is not a JDK standard regular expression.
- Parameters:
type - media type
pattern - file name pattern
- Throws:
MimeTypeException - if the pattern conflicts with existing ones
-
addPattern
public void addPattern(MimeType type, String pattern, boolean isRegex) throws MimeTypeException
Adds a file name pattern for the given media type. The caller can specify whether the pattern being added is a JDK standard regular expression via the isRegex parameter. If the value is set to true, then a JDK standard regex is assumed; otherwise the freedesktop glob type is assumed.
- Parameters:
type - media type
pattern - file name pattern
isRegex - set to true if JDK standard regexes are desired, otherwise set to false.
- Throws:
MimeTypeException - if the pattern conflicts with existing ones.
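To show why the isRegex flag matters, the following JDK-only sketch interprets the same pattern string either as a regex or as a glob. It is an assumption-laden illustration rather than Tika's matching code: `PatternKindSketch`, `compile`, and `matches` are hypothetical names, the glob translation handles only the `*` and `?` wildcards, and whole-name matching is assumed.

```java
import java.util.regex.Pattern;

// Hypothetical helper; it does not reproduce Tika's pattern matching.
class PatternKindSketch {

    // Compiles the pattern as a JDK regex when isRegex is true. Otherwise it
    // is treated as a simplified glob: '*' matches any run of characters,
    // '?' matches one character, and everything else is literal.
    static Pattern compile(String pattern, boolean isRegex) {
        if (isRegex) {
            return Pattern.compile(pattern);
        }
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                // Quote literals so characters such as '.' lose their regex meaning.
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString());
    }

    // Whole-name match, an assumption made for this sketch.
    static boolean matches(String pattern, boolean isRegex, String fileName) {
        return compile(pattern, isRegex).matcher(fileName).matches();
    }
}
```

With this sketch, the pattern `report.txt` matches the name `reportXtxt` when treated as a regex, because `.` is a wildcard there, but not when treated as a glob, where `.` is a literal dot.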
-
getMediaTypeRegistry
public MediaTypeRegistry getMediaTypeRegistry()
-
getMinLength
public int getMinLength()
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
- Returns:
- the minimum length of data to provide.
- See Also:
getMimeType(byte[])
-
detect
public MediaType detect(InputStream input, Metadata metadata) throws IOException
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.

The given stream is expected to support marks, so that this method can reset the stream to the position it was in before this method was called.
- Specified by:
detect in interface Detector
- Parameters:
input - document stream, or null
metadata - metadata hints
- Returns:
- MIME type of the document
- Throws:
IOException - if the document stream could not be read
-
-