public class CompositeParser extends AbstractParser
Constructor and Description |
---|
CompositeParser() |
CompositeParser(MediaTypeRegistry registry, List<Parser> parsers) |
CompositeParser(MediaTypeRegistry registry, List<Parser> parsers, Collection<Class<? extends Parser>> excludeParsers) |
CompositeParser(MediaTypeRegistry registry, Parser... parsers) |
Modifier and Type | Method and Description |
---|---|
Map<MediaType,List<Parser>> | findDuplicateParsers(ParseContext context) Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support. |
List<Parser> | getAllComponentParsers() Returns all parsers registered with the Composite Parser, including ones which may not currently be active. |
Parser | getFallback() Returns the fallback parser. |
MediaTypeRegistry | getMediaTypeRegistry() Returns the media type registry used to infer type relationships. |
protected Parser | getParser(Metadata metadata) Returns the parser that best matches the given metadata. |
protected Parser | getParser(Metadata metadata, ParseContext context) |
Map<MediaType,Parser> | getParsers() Returns the component parsers. |
Map<MediaType,Parser> | getParsers(ParseContext context) |
Set<MediaType> | getSupportedTypes(ParseContext context) Returns the set of media types supported by this parser when used with the given parse context. |
void | parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) Delegates the call to the matching component parser. |
void | setFallback(Parser fallback) Sets the fallback parser. |
void | setMediaTypeRegistry(MediaTypeRegistry registry) Sets the media type registry used to infer type relationships. |
void | setParsers(Map<MediaType,Parser> parsers) Sets the component parsers. |
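The duplicate check that findDuplicateParsers is described as performing can be illustrated without Tika on the classpath. The sketch below is an assumption about the general approach, not the library's code: NamedParser and findDuplicates are hypothetical names, and media types are modelled as plain strings rather than MediaType objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Stand-alone sketch; none of these types are Tika classes. */
final class DuplicateParserSketch {

    /** Hypothetical stand-in for a component parser: a name plus the types it declares. */
    static final class NamedParser {
        final String name;
        final Set<String> supportedTypes;

        NamedParser(String name, Set<String> supportedTypes) {
            this.name = name;
            this.supportedTypes = supportedTypes;
        }
    }

    /** Returns each media type declared by more than one parser, with the names of those parsers. */
    static Map<String, List<String>> findDuplicates(List<NamedParser> parsers) {
        Map<String, List<String>> claims = new LinkedHashMap<>();
        // First pass: record every parser that declares each type.
        for (NamedParser parser : parsers) {
            for (String type : parser.supportedTypes) {
                claims.computeIfAbsent(type, t -> new ArrayList<>()).add(parser.name);
            }
        }
        // Second pass: drop the types claimed by a single parser only.
        claims.values().removeIf(names -> names.size() < 2);
        return claims;
    }
}
```

A map from type to the list of claimants keeps the result in the same shape as the Map<MediaType,List<Parser>> return type listed in the summary.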
Methods inherited from class AbstractParser: parse
public CompositeParser(MediaTypeRegistry registry, List<Parser> parsers, Collection<Class<? extends Parser>> excludeParsers)
public CompositeParser(MediaTypeRegistry registry, List<Parser> parsers)
public CompositeParser(MediaTypeRegistry registry, Parser... parsers)
public CompositeParser()
public Map<MediaType,Parser> getParsers(ParseContext context)
public Map<MediaType,List<Parser>> findDuplicateParsers(ParseContext context)
Parameters:
context - parsing context
public MediaTypeRegistry getMediaTypeRegistry()
public void setMediaTypeRegistry(MediaTypeRegistry registry)
Parameters:
registry - media type registry
public List<Parser> getAllComponentParsers()
public Map<MediaType,Parser> getParsers()
public void setParsers(Map<MediaType,Parser> parsers)
Parameters:
parsers - component parsers, keyed by media type
public Parser getFallback()
public void setFallback(Parser fallback)
Parameters:
fallback - fallback parser
protected Parser getParser(Metadata metadata)
Subclasses can override this method to provide more accurate parser resolution.
Parameters:
metadata - document metadata
protected Parser getParser(Metadata metadata, ParseContext context)
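Since getParser is documented as returning the best match, with a registry used to infer type relationships and a fallback parser, one plausible resolution strategy is to walk from the declared type up through its supertypes and use the fallback when nothing matches. The sketch below illustrates that idea under those assumptions only. ParserResolutionSketch, declareSupertype, register and resolve are hypothetical names, parsers and types are plain strings, and the supertype table is assumed to be acyclic.

```java
import java.util.HashMap;
import java.util.Map;

/** Stand-alone sketch; none of these types are Tika classes. */
final class ParserResolutionSketch {

    // Hypothetical stand-in for a media type registry: child type -> parent type.
    private final Map<String, String> supertypes = new HashMap<>();
    // Hypothetical stand-in for the component parsers, keyed by media type.
    private final Map<String, String> parsersByType = new HashMap<>();
    private final String fallback;

    ParserResolutionSketch(String fallback) {
        this.fallback = fallback;
    }

    void declareSupertype(String child, String parent) {
        supertypes.put(child, parent);
    }

    void register(String type, String parserName) {
        parsersByType.put(type, parserName);
    }

    /** Tries the declared type first, then each supertype in turn, then the fallback. */
    String resolve(String declaredType) {
        String type = declaredType;
        while (type != null) {
            String parser = parsersByType.get(type);
            if (parser != null) {
                return parser;
            }
            type = supertypes.get(type); // null once the chain is exhausted
        }
        return fallback;
    }
}
```

With this shape, a subclass that wants more accurate resolution only has to change how the starting type is chosen; the walk and the fallback stay the same.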
public Set<MediaType> getSupportedTypes(ParseContext context)
Specified by:
getSupportedTypes in interface Parser
Parameters:
context - parse context
public void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException
Potential RuntimeExceptions, IOExceptions and SAXExceptions unrelated to the given input stream and content handler are automatically wrapped into TikaExceptions to better honor the Parser contract.
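The distinction between failures that come from the input stream and failures that do not can be pictured with a tagging wrapper around the stream. The sketch below is an illustration under stated assumptions, not the CompositeParser source: ParseFailure stands in for TikaException, StreamParser for a component parser, and TaggingStream and delegate are hypothetical names. The content handler side, which would be treated the same way for SAXException, is left out for brevity.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Stand-alone sketch; none of these types are Tika classes. */
final class ExceptionWrappingSketch {

    /** Hypothetical stand-in for TikaException. */
    static final class ParseFailure extends Exception {
        ParseFailure(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for a component parser. */
    interface StreamParser {
        void parse(InputStream stream) throws IOException;
    }

    /** Marks an IOException as having been raised by the caller's stream. */
    private static final class TaggedIOException extends IOException {
        TaggedIOException(IOException cause) {
            super(cause.getMessage(), cause);
        }
    }

    /** Re-throws every read failure of the wrapped stream as a tagged exception. */
    private static final class TaggingStream extends FilterInputStream {
        TaggingStream(InputStream in) {
            super(in);
        }

        @Override
        public int read() throws IOException {
            try {
                return super.read();
            } catch (IOException e) {
                throw new TaggedIOException(e);
            }
        }

        @Override
        public int read(byte[] b, int off, int len) throws IOException {
            try {
                return super.read(b, off, len);
            } catch (IOException e) {
                throw new TaggedIOException(e);
            }
        }
    }

    /** Passes stream failures through unchanged and wraps everything else. */
    static void delegate(StreamParser parser, InputStream stream)
            throws IOException, ParseFailure {
        try {
            parser.parse(new TaggingStream(stream));
        } catch (TaggedIOException e) {
            // The caller's stream failed: surface the original IOException.
            throw (IOException) e.getCause();
        } catch (IOException | RuntimeException e) {
            // Anything else is a parser problem, not an input problem.
            throw new ParseFailure("Parser failed for a reason unrelated to the input stream", e);
        }
    }
}
```

Callers can then treat IOException as "the document could not be read" and the wrapped exception as "the document could not be parsed", which matches the two meanings listed for parse.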
Parameters:
stream - the document stream (input)
handler - handler for the XHTML SAX events (output)
metadata - document metadata (input and output)
context - parse context
Throws:
IOException - if the document stream could not be read
SAXException - if the SAX events could not be processed
TikaException - if the document could not be parsed
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.