Class WordMLParser
- java.lang.Object
  - org.apache.tika.parser.AbstractParser
    - org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
      - org.apache.tika.parser.microsoft.xml.WordMLParser
- All Implemented Interfaces:
Serializable, Parser
public class WordMLParser extends AbstractXML2003Parser
Parses WordML 2003 format Word files. These are single XML files that predate OOXML. See https://en.wikipedia.org/wiki/Microsoft_Office_XML_formats
- See Also:
- Serialized Form
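To make the "single XML file" point concrete, here is a minimal sketch of pulling paragraph text out of a Word 2003 XML document using only the JDK's SAX parser. It is not how WordMLParser is implemented. The class name WordMLTextSketch and the method extractText are hypothetical names chosen for illustration. The sketch assumes the Word 2003 XML namespace URI shown in the code, with text carried in w:t elements inside w:p paragraphs.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Apache Tika.
public class WordMLTextSketch {
    // Namespace of the Word 2003 XML vocabulary (root element: w:wordDocument).
    static final String WORDML_NS = "http://schemas.microsoft.com/office/word/2003/wordml";

    /** Collects the character data of every w:t element, ending each w:p with a newline. */
    static String extractText(InputStream in) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        // Ask the parser to apply its secure-processing limits to untrusted input.
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        StringBuilder out = new StringBuilder();
        factory.newSAXParser().parse(in, new DefaultHandler() {
            private boolean inText;

            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                if (WORDML_NS.equals(uri) && "t".equals(local)) {
                    inText = true;
                }
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if (!WORDML_NS.equals(uri)) {
                    return;
                }
                if ("t".equals(local)) {
                    inText = false;
                } else if ("p".equals(local)) {
                    out.append('\n'); // paragraph boundary
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inText) {
                    out.append(ch, start, length);
                }
            }
        });
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<?xml version=\"1.0\"?>"
            + "<w:wordDocument xmlns:w=\"" + WORDML_NS + "\"><w:body>"
            + "<w:p><w:r><w:t>Hello</w:t></w:r><w:r><w:t> world</w:t></w:r></w:p>"
            + "</w:body></w:wordDocument>";
        InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
        System.out.print(extractText(in)); // prints "Hello world" followed by a newline
    }
}
```

The sketch ignores document properties, tables, and embedded binary data, all of which a real parser for this format would need to handle.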
-
Constructor Summary
Constructors
- WordMLParser()
-
Method Summary
- protected ContentHandler getContentHandler(ContentHandler ch, Metadata metadata, ParseContext context)
- Set<MediaType> getSupportedTypes(ParseContext context)
  Returns the set of media types supported by this parser when used with the given parse context.
- void setContentType(Metadata metadata)
-
Methods inherited from class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
parse
-
Methods inherited from class org.apache.tika.parser.AbstractParser
parse
-
Method Detail
-
getSupportedTypes
public Set<MediaType> getSupportedTypes(ParseContext context)
Description copied from interface: Parser
Returns the set of media types supported by this parser when used with the given parse context.
- Parameters:
context - parse context
- Returns:
immutable set of media types
-
getContentHandler
protected ContentHandler getContentHandler(ContentHandler ch, Metadata metadata, ParseContext context)
- Overrides:
getContentHandler in class AbstractXML2003Parser
-
setContentType
public void setContentType(Metadata metadata)
- Specified by:
setContentType in class AbstractXML2003Parser
-