Package org.apache.tika.mime
Class MimeTypesReader
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.mime.MimeTypesReader
- All Implemented Interfaces:
MimeTypesReaderMetKeys, ContentHandler, DTDHandler, EntityResolver, ErrorHandler
A reader for XML files compliant with the freedesktop MIME-info DTD.
<!DOCTYPE mime-info [
<!ELEMENT mime-info (mime-type)+>
<!ATTLIST mime-info xmlns CDATA #FIXED
"http://www.freedesktop.org/standards/shared-mime-info">
<!ELEMENT mime-type
(_comment|acronym|expanded-acronym|glob|magic|root-XML|alias|sub-class-of)*>
<!ATTLIST mime-type type CDATA #REQUIRED>
<!-- a comment describing a document with the respective MIME type. Example:
"WMV video" -->
<!ELEMENT _comment (#PCDATA)>
<!ATTLIST _comment xml:lang CDATA #IMPLIED>
<!-- a comment describing the respective unexpanded MIME type acronym. Example:
"WMV" -->
<!ELEMENT acronym (#PCDATA)>
<!ATTLIST acronym xml:lang CDATA #IMPLIED>
<!-- a comment describing the respective expanded MIME type acronym. Example:
"Windows Media Video" -->
<!ELEMENT expanded-acronym (#PCDATA)>
<!ATTLIST expanded-acronym xml:lang CDATA #IMPLIED>
<!ELEMENT glob EMPTY>
<!ATTLIST glob pattern CDATA #REQUIRED>
<!ATTLIST glob isregex CDATA #IMPLIED>
<!ELEMENT magic (match)+>
<!ATTLIST magic priority CDATA #IMPLIED>
<!ELEMENT match (match)*>
<!ATTLIST match offset CDATA #REQUIRED>
<!ATTLIST match type
(string|big16|big32|little16|little32|host16|host32|byte) #REQUIRED>
<!ATTLIST match value CDATA #REQUIRED>
<!ATTLIST match mask CDATA #IMPLIED>
<!ELEMENT root-XML EMPTY>
<!ATTLIST root-XML
namespaceURI CDATA #REQUIRED
localName CDATA #REQUIRED>
<!ELEMENT alias EMPTY>
<!ATTLIST alias
type CDATA #REQUIRED>
<!ELEMENT sub-class-of EMPTY>
<!ATTLIST sub-class-of
type CDATA #REQUIRED>
]>
In addition to the standard fields, this reader also reads two Tika-specific fields:
- link
- uti
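As a rough illustration of the format described above, the following sketch parses a small mime-info document with a plain SAX handler and collects the glob patterns declared for each MIME type. It uses only the JDK. The class `MimeInfoGlobSketch`, its `parse` helper, and the sample document are made up for this example and are not part of Tika, which may process these elements differently.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler for illustration; this is not Tika's MimeTypesReader.
class MimeInfoGlobSketch extends DefaultHandler {

    // Made-up sample document that follows the DTD above.
    static final String SAMPLE =
            "<mime-info xmlns=\"http://www.freedesktop.org/standards/shared-mime-info\">"
            + "<mime-type type=\"video/x-ms-wmv\">"
            + "<_comment>WMV video</_comment>"
            + "<acronym>WMV</acronym>"
            + "<glob pattern=\"*.wmv\"/>"
            + "</mime-type>"
            + "</mime-info>";

    private final Map<String, List<String>> globsByType = new LinkedHashMap<>();
    private String currentType;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        // The default SAXParserFactory is not namespace aware, so qName carries the element name.
        if ("mime-type".equals(qName)) {
            currentType = attributes.getValue("type");
            globsByType.put(currentType, new ArrayList<>());
        } else if ("glob".equals(qName) && currentType != null) {
            globsByType.get(currentType).add(attributes.getValue("pattern"));
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("mime-type".equals(qName)) {
            currentType = null;
        }
    }

    // Parses the given XML text and returns the glob patterns grouped by MIME type.
    static Map<String, List<String>> parse(String xml) {
        MimeInfoGlobSketch handler = new MimeInfoGlobSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse mime-info document", e);
        }
        return handler.globsByType;
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE));
    }
}
```

Calling `MimeInfoGlobSketch.parse(MimeInfoGlobSketch.SAMPLE)` yields a map with the single key `video/x-ms-wmv` mapped to the list containing `*.wmv`.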
-
Field Summary
Fields
Modifier and Type          Field       Description
protected StringBuilder    characters
protected int              priority
protected MimeType         type        Current type
protected final MimeTypes  types
Fields inherited from interface org.apache.tika.mime.MimeTypesReaderMetKeys
ACRONYM_TAG, ALIAS_TAG, ALIAS_TYPE_ATTR, COMMENT_TAG, GLOB_TAG, INTERPRETED_ATTR, ISREGEX_ATTR, LOCAL_NAME_ATTR, MAGIC_PRIORITY_ATTR, MAGIC_TAG, MATCH_MASK_ATTR, MATCH_MINSHOULDMATCH_ATTR, MATCH_OFFSET_ATTR, MATCH_TAG, MATCH_TYPE_ATTR, MATCH_VALUE_ATTR, MIME_INFO_TAG, MIME_TYPE_TAG, MIME_TYPE_TYPE_ATTR, NS_URI_ATTR, PATTERN_ATTR, ROOT_XML_TAG, SUB_CLASS_OF_TAG, SUB_CLASS_TYPE_ATTR, TIKA_LINK_TAG, TIKA_UTI_TAG
-
Constructor Summary
Constructors
-
Method Summary
Modifier and Type  Method  Description
void               characters(char[] ch, int start, int length)
void               endElement(String uri, String localName, String qName)
protected void     handleGlobError(MimeType type, String pattern, MimeTypeException ex, String qName, Attributes attributes)
protected void     handleMimeError(String input, MimeTypeException ex, String qName, Attributes attributes)
void               read(InputStream stream)
InputSource        resolveEntity(String publicId, String systemId)
static void        setPoolSize(int poolSize)  Set the pool size for cached XML parsers.
void               startElement(String uri, String localName, String qName, Attributes attributes)
Methods inherited from class org.xml.sax.helpers.DefaultHandler
endDocument, endPrefixMapping, error, fatalError, ignorableWhitespace, notationDecl, processingInstruction, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, unparsedEntityDecl, warning
-
Field Details
-
types
-
type
Current type
-
priority
protected int priority
-
characters
-
-
Constructor Details
-
MimeTypesReader
-
-
Method Details
-
setPoolSize
Set the pool size for cached XML parsers.
- Parameters:
poolSize -
- Throws:
TikaException
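The description of setPoolSize does not say how the parser cache is organised. Purely as an illustration of what a fixed-size pool of reusable SAX parsers can look like, here is a minimal JDK-only sketch. The class `SaxParserPoolSketch` and its `acquire`, `release`, and `available` methods are hypothetical names chosen for this example; they are not Tika's API and do not claim to mirror its implementation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;

// Hypothetical fixed-size parser pool for illustration; not Tika's implementation.
class SaxParserPoolSketch {

    private final BlockingQueue<SAXParser> pool;

    SaxParserPoolSketch(int poolSize) {
        if (poolSize < 1) {
            throw new IllegalArgumentException("poolSize must be at least 1");
        }
        pool = new ArrayBlockingQueue<>(poolSize);
        SAXParserFactory factory = SAXParserFactory.newInstance();
        try {
            // Create all parsers up front so later callers only borrow and return them.
            for (int i = 0; i < poolSize; i++) {
                pool.add(factory.newSAXParser());
            }
        } catch (ParserConfigurationException | SAXException e) {
            throw new IllegalStateException("could not create SAX parser", e);
        }
    }

    // Borrows a parser, blocking until one is free.
    SAXParser acquire() {
        try {
            return pool.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a parser", e);
        }
    }

    // Returns a parser to the pool after resetting it where the implementation allows.
    void release(SAXParser parser) {
        try {
            parser.reset();
        } catch (UnsupportedOperationException e) {
            // Some implementations do not support reset(); the parser is returned as-is.
        }
        pool.offer(parser);
    }

    // Number of parsers currently idle in the pool.
    int available() {
        return pool.size();
    }
}
```

A caller would typically pair `acquire()` with `release(parser)` in a `try`/`finally` block so that a parser is returned even when parsing fails.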
-
read
- Throws:
IOException
MimeTypeException
-
read
- Throws:
MimeTypeException
-
resolveEntity
- Specified by:
resolveEntity in interface EntityResolver
- Overrides:
resolveEntity in class DefaultHandler
-
startElement
public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
- Specified by:
startElement in interface ContentHandler
- Overrides:
startElement in class DefaultHandler
- Throws:
SAXException
-
endElement
- Specified by:
endElement in interface ContentHandler
- Overrides:
endElement in class DefaultHandler
-
characters
public void characters(char[] ch, int start, int length)
- Specified by:
characters in interface ContentHandler
- Overrides:
characters in class DefaultHandler
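A SAX parser may deliver the text of one element through several characters calls, so handlers usually append each chunk to a buffer and read the buffer in endElement. The sketch below shows that general pattern for the text-bearing elements of the DTD (_comment, acronym, expanded-acronym). The class `ElementTextSketch` and the sample fragment are hypothetical; whether this class uses its StringBuilder field in exactly this way is an assumption.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler showing the usual text-accumulation pattern; not Tika's code.
class ElementTextSketch extends DefaultHandler {

    // Made-up sample fragment using element names from the DTD.
    static final String SAMPLE =
            "<mime-type type=\"video/x-ms-wmv\">"
            + "<_comment>WMV video</_comment>"
            + "<acronym>WMV</acronym>"
            + "</mime-type>";

    private final Map<String, String> textByElement = new LinkedHashMap<>();
    private StringBuilder buffer; // null while no text-bearing element is open

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attributes) {
        if ("_comment".equals(qName) || "acronym".equals(qName) || "expanded-acronym".equals(qName)) {
            buffer = new StringBuilder();
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text may arrive in several chunks, so append rather than overwrite.
        if (buffer != null) {
            buffer.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (buffer != null) {
            textByElement.put(qName, buffer.toString().trim());
            buffer = null;
        }
    }

    // Parses the given XML text and returns the collected text keyed by element name.
    static Map<String, String> parse(String xml) {
        ElementTextSketch handler = new ElementTextSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse sample", e);
        }
        return handler.textByElement;
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE));
    }
}
```

For the sample fragment, `parse` returns `_comment` mapped to `WMV video` and `acronym` mapped to `WMV`.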
-
handleMimeError
protected void handleMimeError(String input, MimeTypeException ex, String qName, Attributes attributes) throws SAXException
- Throws:
SAXException
-
handleGlobError
protected void handleGlobError(MimeType type, String pattern, MimeTypeException ex, String qName, Attributes attributes) throws SAXException
- Throws:
SAXException
-