Class WordMLParser
- java.lang.Object
  - org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
    - org.apache.tika.parser.microsoft.xml.WordMLParser
- All Implemented Interfaces:
Serializable, Parser
public class WordMLParser extends AbstractXML2003Parser
Parses WordML 2003 format Word files. These are single XML files that predate OOXML. See https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats
- See Also:
- Serialized Form
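As an illustration of what "single XML files" means for this format, here is a minimal sketch that pulls the plain text out of a WordML 2003 document using only the JDK's SAX parser. It is not Tika's implementation. It assumes the WordprocessingML 2003 namespace `http://schemas.microsoft.com/office/word/2003/wordml`, with text runs in `w:t` elements nested inside `w:p` paragraphs. The class name `WordMlTextSketch` and the method `extractText` are hypothetical names chosen for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper for illustration only; not part of Apache Tika.
class WordMlTextSketch {
    // Namespace used by Word 2003 XML (WordprocessingML 2003) documents.
    static final String WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    // Returns the text of all w:t runs, with a newline after each w:p paragraph.
    static String extractText(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            // Refuse DOCTYPE declarations to avoid external entity expansion.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new DefaultHandler() {
                    private boolean inText;

                    @Override
                    public void startElement(String uri, String localName, String qName, Attributes atts) {
                        if (WORDML_NS.equals(uri) && "t".equals(localName)) {
                            inText = true;
                        }
                    }

                    @Override
                    public void endElement(String uri, String localName, String qName) {
                        if (!WORDML_NS.equals(uri)) {
                            return;
                        }
                        if ("t".equals(localName)) {
                            inText = false;
                        } else if ("p".equals(localName)) {
                            out.append('\n');
                        }
                    }

                    @Override
                    public void characters(char[] ch, int start, int length) {
                        if (inText) {
                            out.append(ch, start, length);
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException("Could not read WordML input", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String sample = "<w:wordDocument xmlns:w=\"" + WORDML_NS + "\">"
            + "<w:body><w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p></w:body>"
            + "</w:wordDocument>";
        System.out.print(extractText(sample));
    }
}
```

In practice, callers would normally go through the `parse` method that this class inherits from AbstractXML2003Parser rather than reading the XML directly; the sketch only shows the shape of the underlying format.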
Constructor Summary
Constructor | Description
WordMLParser() |
Method Summary
Modifier and Type | Method | Description
protected ContentHandler | getContentHandler(ContentHandler ch, Metadata metadata, ParseContext context) |
Set<MediaType> | getSupportedTypes(ParseContext context) | Returns the set of media types supported by this parser when used with the given parse context.
void | setContentType(Metadata metadata) |
Methods inherited from class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
parse
Method Detail
getSupportedTypes
public Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.
- Parameters:
context - parse context
- Returns:
immutable set of media types
getContentHandler
protected ContentHandler getContentHandler(ContentHandler ch, Metadata metadata, ParseContext context)
- Overrides:
getContentHandler in class AbstractXML2003Parser
setContentType
public void setContentType(Metadata metadata)
- Specified by:
setContentType in class AbstractXML2003Parser