org.apache.tika.mime.MimeTypes

All Implemented Interfaces: Serializable, Detector

public final class MimeTypes extends Object implements Detector, Serializable

This class is a MimeType repository. It gathers a set of MimeTypes and can retrieve a content type from its name, from a file name, or from a magic character sequence.

The MIME type detection methods that take an InputStream as an argument never read more than getMinLength() bytes from the stream. The given stream is also never closed, marked, or reset by these methods. A client can therefore use the mark feature of the stream (if available) to restore the stream to the state it was in before type detection, should it want to process the stream based on the detected type.
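
The client-side pattern described above can be sketched as follows, assuming Java 11 or later. This sketch does not call Tika: consumePrefix is a hypothetical stand-in for a detection call that reads a bounded prefix, and the value 4 stands in for whatever getMinLength() would return.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class MarkResetSketch {

    // Hypothetical stand-in for a detection call: reads at most maxBytes and
    // never closes, marks, or resets the stream.
    static byte[] consumePrefix(InputStream in, int maxBytes) throws IOException {
        return in.readNBytes(maxBytes);
    }

    // Client-side pattern: mark, let detection consume the prefix, then reset
    // and process the full content from the start.
    static String detectThenReadAll(InputStream raw, int minLength) throws IOException {
        // Wrap streams without mark support; all later reads must go through
        // the wrapper, not the raw stream.
        InputStream in = raw.markSupported() ? raw : new BufferedInputStream(raw);
        in.mark(minLength);
        consumePrefix(in, minLength);
        in.reset();
        return new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "%PDF-1.7 body".getBytes(StandardCharsets.UTF_8);
        System.out.println(detectThenReadAll(new ByteArrayInputStream(data), 4));
    }
}
```

The read limit passed to mark matches the number of bytes the stand-in may consume, so the reset is valid for any stream that honours its mark contract.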

See Also:

Serialized Form

Field Summary

Fields

| Modifier and Type | Field | Description |
| --- | --- | --- |
| static final String | OCTET_STREAM | Name of the root type, application/octet-stream. |
| static final String | PLAIN_TEXT | Name of the text type, text/plain. |
| static final String | XML | Name of the xml type, application/xml. |

Constructor Summary

Constructors

| Constructor | Description |
| --- | --- |
| MimeTypes() | |

Method Summary

| Modifier and Type | Method | Description |
| --- | --- | --- |
| void | addPattern(MimeType type, String pattern) | Adds a file name pattern for the given media type. |
| void | addPattern(MimeType type, String pattern, boolean isRegex) | Adds a file name pattern for the given media type. |
| MediaType | detect(InputStream input, Metadata metadata) | Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints. |
| MimeType | forName(String name) | Returns the registered media type with the given name (or alias). |
| static MimeTypes | getDefaultMimeTypes() | Get the default MimeTypes. |
| static MimeTypes | getDefaultMimeTypes(ClassLoader classLoader) | Get the default MimeTypes. |
| MediaTypeRegistry | getMediaTypeRegistry() | |
| MimeType | getMimeType(File file) | Deprecated. Use Tika.detect(File) instead |
| MimeType | getMimeType(String name) | Deprecated. Use Tika.detect(String) instead |
| int | getMinLength() | Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes. |
| MimeType | getRegisteredMimeType(String name) | Returns the registered, normalised media type with the given name (or alias). |
| void | setSuperType(MimeType type, MediaType parent) | |

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- OCTET_STREAM
  
  public static final String OCTET_STREAM
  
  Name of the root type, application/octet-stream.
  See Also:
  
  Constant Field Values
- PLAIN_TEXT
  
  public static final String PLAIN_TEXT
  
  Name of the text type, text/plain.
  See Also:
  
  Constant Field Values
- XML
  
  public static final String XML
  
  Name of the xml type, application/xml.
  See Also:
  
  Constant Field Values
Constructor Details
- MimeTypes
  
  public MimeTypes()
Method Details
- getDefaultMimeTypes
  
  public static MimeTypes getDefaultMimeTypes()
  
  Get the default MimeTypes. This includes all the built-in media types, and any custom override ones present.
  
  Returns:
  
  MimeTypes default type registry
- getDefaultMimeTypes
  
  public static MimeTypes getDefaultMimeTypes(ClassLoader classLoader)
  
  Get the default MimeTypes. This includes all the built-in media types, and any custom override ones present.
  
  Parameters:
  
  classLoader - the class loader to use, if not the default
  
  Returns:
  
  MimeTypes default type registry
- getMimeType
  
  @Deprecated public MimeType getMimeType(String name)
  
  Deprecated.
  Use Tika.detect(String) instead
  
  Find the Mime Content Type of a document from its name. Returns application/octet-stream if no better match is found.
  
  Parameters:
  
  name - name of the document to analyze
  
  Returns:
  
  the Mime Content Type of the specified document name
- getMimeType
  
  @Deprecated public MimeType getMimeType(File file) throws MimeTypeException, IOException
  
  Deprecated.
  Use Tika.detect(File) instead
  
  Find the Mime Content Type of a document stored in the given file. Returns application/octet-stream if no better match is found.
  
  Parameters:
  
  file - file to analyze
  
  Returns:
  
  the Mime Content Type of the specified document
  
  Throws:
  
  MimeTypeException - if the type can't be detected
  
  IOException - if the file can't be read
- forName
  
  public MimeType forName(String name) throws MimeTypeException
  
  Returns the registered media type with the given name (or alias). The named media type is automatically registered (and returned) if it doesn't already exist.
  
  Parameters:
  
  name - media type name (case-insensitive)
  
  Returns:
  
  the registered media type with the given name or alias
  
  Throws:
  
  MimeTypeException - if the given media type name is invalid
- getRegisteredMimeType
  
  public MimeType getRegisteredMimeType(String name) throws MimeTypeException
  
  Returns the registered, normalised media type with the given name (or alias).
  Unlike forName(String), this function will not create a new MimeType and register it. Instead, null will be returned if there is no definition available for the given name.
  Also, unlike forName(String), this function may return a mime type that has fewer parameters than were included in the supplied name. If the registered mime type has parameters (e.g. application/dita+xml;format=map), then those will be maintained. However, if the supplied name has parameters that the registered mime type does not (e.g. application/xml; charset=UTF-8 as a name, compared to just application/xml for the type in the registry), then those parameters will not be included in the returned type.
  
  Parameters:
  
  name - media type name (case-insensitive)
  
  Returns:
  
  the registered media type with the given name or alias, or null if not found
  
  Throws:
  
  MimeTypeException - if the given media type name is invalid
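
  The difference between the two lookups can be illustrated with a toy model. This is a sketch under stated assumptions, not Tika's implementation: RegistryLookupSketch, lookupOrRegister, and lookupRegistered are hypothetical names, types are modelled as plain strings, and normalisation is simplified to lower-casing and removing spaces.

  ```java
  import java.util.HashMap;
  import java.util.Locale;
  import java.util.Map;

  final class RegistryLookupSketch {

      // Toy registry of normalised type names; NOT Tika's data structure.
      private final Map<String, String> registered = new HashMap<>();

      RegistryLookupSketch() {
          registered.put("application/xml", "application/xml");
          registered.put("application/dita+xml;format=map", "application/dita+xml;format=map");
      }

      // Simplified normalisation: lower-case everything and drop spaces.
      private static String normalise(String name) {
          return name.toLowerCase(Locale.ROOT).replace(" ", "");
      }

      // forName-like lookup: an unknown name is registered and returned.
      String lookupOrRegister(String name) {
          return registered.computeIfAbsent(normalise(name), key -> key);
      }

      // getRegisteredMimeType-like lookup: never registers anything.
      String lookupRegistered(String name) {
          String key = normalise(name);
          String exact = registered.get(key);
          if (exact != null) {
              return exact;            // registered parameters are kept
          }
          int semicolon = key.indexOf(';');
          String base = semicolon < 0 ? key : key.substring(0, semicolon);
          return registered.get(base); // extra parameters dropped; null if unknown
      }

      public static void main(String[] args) {
          RegistryLookupSketch registry = new RegistryLookupSketch();
          System.out.println(registry.lookupRegistered("application/xml; charset=UTF-8"));
          System.out.println(registry.lookupRegistered("application/x-unknown"));
      }
  }
  ```

  In this model a name with extra parameters falls back to its base type, while a name with no registered base yields null rather than a new entry.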
- setSuperType
  
  public void setSuperType(MimeType type, MediaType parent)
- addPattern
  
  public void addPattern(MimeType type, String pattern) throws MimeTypeException
  
  Adds a file name pattern for the given media type. Assumes that the pattern being added is not a JDK standard regular expression.
  
  Parameters:
  
  type - media type
  
  pattern - file name pattern
  
  Throws:
  
  MimeTypeException - if the pattern conflicts with existing ones
- addPattern
  
  public void addPattern(MimeType type, String pattern, boolean isRegex) throws MimeTypeException
  
  Adds a file name pattern for the given media type. The caller can specify whether the pattern being added is or is not a JDK standard regular expression via the isRegex parameter. If the value is set to true, then a JDK standard regex is assumed, otherwise the freedesktop glob type is assumed.
  
  Parameters:
  
  type - media type
  
  pattern - file name pattern
  
  isRegex - true if the pattern is a JDK standard regular expression, false if it is a freedesktop glob
  
  Throws:
  
  MimeTypeException - if the pattern conflicts with existing ones.
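
  To illustrate why the isRegex flag matters, the sketch below interprets the same kind of pattern string in both ways using only java.util.regex. It is not Tika's matching code: PatternKindSketch is a hypothetical helper, and its glob handling covers only the * and ? wildcards, which is an assumption made for brevity.

  ```java
  import java.util.regex.Pattern;

  final class PatternKindSketch {

      // Compile the pattern either as a JDK regex or as a simple glob.
      static Pattern compile(String pattern, boolean isRegex) {
          if (isRegex) {
              return Pattern.compile(pattern);
          }
          StringBuilder regex = new StringBuilder();
          for (char c : pattern.toCharArray()) {
              if (c == '*') {
                  regex.append(".*");   // any run of characters
              } else if (c == '?') {
                  regex.append('.');    // exactly one character
              } else {
                  regex.append(Pattern.quote(String.valueOf(c))); // literal
              }
          }
          return Pattern.compile(regex.toString());
      }

      static boolean matches(String fileName, String pattern, boolean isRegex) {
          return compile(pattern, isRegex).matcher(fileName).matches();
      }

      public static void main(String[] args) {
          System.out.println(matches("report.txt", "*.txt", false));    // glob form
          System.out.println(matches("report.txt", ".*\\.txt", true));  // regex form
      }
  }
  ```

  A glob such as *.txt is not a valid JDK regular expression (the leading * has nothing to repeat), so passing it with isRegex set to true would fail at compile time in this sketch.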
- getMediaTypeRegistry
  
  public MediaTypeRegistry getMediaTypeRegistry()
- getMinLength
  
  public int getMinLength()
  
  Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
  
  Returns:
  
  the minimum length of data to provide.
  
  See Also:
  
  getMimeType(byte[])
- detect
  
  public MediaType detect(InputStream input, Metadata metadata) throws IOException
  
  Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
  The given stream is expected to support marks, so that this method can reset the stream to the position it was in before this method was called.
  
  Specified by:
  
  detect in interface Detector
  
  Parameters:
  
  input - document stream, or null
  
  metadata - metadata hints
  
  Returns:
  
  MIME type of the document
  
  Throws:
  
  IOException - if the document stream could not be read
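
The detect contract described above (inspect the stream prefix, restore the stream position, fall back to metadata hints) can be sketched with a toy detector, assuming Java 11 or later. This is not Tika's implementation: MagicDetectSketch is hypothetical, it knows a single magic marker (the PDF header), and a plain file name string stands in for the Metadata hints.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

final class MagicDetectSketch {

    private static final byte[] PDF_MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    // The stream is expected to support marks; reset() throws otherwise.
    static String detect(InputStream input, String fileNameHint) throws IOException {
        if (input != null) {
            input.mark(PDF_MAGIC.length);
            byte[] prefix;
            try {
                prefix = input.readNBytes(PDF_MAGIC.length);
            } finally {
                input.reset(); // restore the position the caller handed in
            }
            if (Arrays.equals(prefix, PDF_MAGIC)) {
                return "application/pdf";
            }
        }
        // Hint fallback (toy rule): only the .txt extension is recognised.
        if (fileNameHint != null && fileNameHint.toLowerCase(Locale.ROOT).endsWith(".txt")) {
            return "text/plain";
        }
        return "application/octet-stream"; // root type when nothing matches
    }

    public static void main(String[] args) throws IOException {
        InputStream pdf = new ByteArrayInputStream("%PDF-1.7".getBytes(StandardCharsets.US_ASCII));
        System.out.println(detect(pdf, null));
        System.out.println(detect(null, "notes.txt"));
    }
}
```

Because the reset happens in a finally block, the caller can read the stream from its original position even if reading the prefix fails partway.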
