Class WordMLParser
java.lang.Object
org.apache.tika.parser.AbstractParser
org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
org.apache.tika.parser.microsoft.xml.WordMLParser
- All Implemented Interfaces:
Serializable, Parser
Parses WordML 2003 format Word files. These are single XML files
that predate OOXML.
See https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats
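To make "single XML files" concrete, the following is a minimal JDK-only sketch of pulling text out of a WordML 2003 document with SAX. It is an illustration, not Tika's implementation: the class name WordMLTextSketch and the method extractText are hypothetical, and the sketch assumes only the Word 2003 namespace and its w:p (paragraph) and w:t (text run) elements.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Apache Tika.
public class WordMLTextSketch {
    // Namespace used by Word 2003 XML (WordML) documents.
    static final String WORDML_NS =
        "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Collects the text of every w:t element, ending each w:p with a newline. */
    static String extractText(String xml) {
        StringBuilder out = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inText = false;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (WORDML_NS.equals(uri) && "t".equals(localName)) {
                    inText = true; // entering a text run
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inText) {
                    out.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (!WORDML_NS.equals(uri)) {
                    return; // ignore elements from other namespaces
                }
                if ("t".equals(localName)) {
                    inText = false;
                } else if ("p".equals(localName)) {
                    out.append('\n'); // paragraph boundary
                }
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match on namespace URI
            factory.newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse WordML input", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml =
            "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<w:wordDocument xmlns:w=\"" + WORDML_NS + "\">"
            + "<w:body>"
            + "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> WordML</w:t></w:r></w:p>"
            + "<w:p><w:r><w:t>Second paragraph</w:t></w:r></w:p>"
            + "</w:body></w:wordDocument>";
        System.out.print(extractText(xml));
    }
}
```

With Tika on the classpath, the same kind of file would normally be handed to this class through the inherited parse method rather than processed by hand as above.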
Constructor Summary
WordMLParser()
-
Method Summary
Modifier and Type | Method | Description
protected ContentHandler | getContentHandler(ContentHandler ch, Metadata metadata, ParseContext context) |
Set<MediaType> | getSupportedTypes(ParseContext context) | Returns the set of media types supported by this parser when used with the given parse context.
void | setContentType(Metadata metadata) |
Methods inherited from class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
parse
Methods inherited from class org.apache.tika.parser.AbstractParser
parse
-
Constructor Details
-
WordMLParser
public WordMLParser()
-
-
Method Details
-
getSupportedTypes
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.
- Parameters:
context - parse context
- Returns:
immutable set of media types
-
getContentHandler
protected ContentHandler getContentHandler(ContentHandler ch, Metadata metadata, ParseContext context)
- Overrides:
getContentHandler in class AbstractXML2003Parser
-
setContentType
- Specified by:
setContentType in class AbstractXML2003Parser
-