public class OfficeParser extends AbstractOfficeParser
Modifier and Type | Class and Description |
---|---|
static class |
OfficeParser.POIFSDocumentType |
Constructor and Description |
---|
OfficeParser() |
Modifier and Type | Method and Description |
---|---|
static void |
extractMacros(org.apache.poi.poifs.filesystem.POIFSFileSystem fs,
ContentHandler xhtml,
EmbeddedDocumentExtractor embeddedDocumentExtractor)
Helper to extract macros from an NPOIFS/vbaProject.bin
|
Set<MediaType> |
getSupportedTypes(ParseContext context)
Returns the set of media types supported by this parser when used
with the given parse context.
|
protected void |
parse(org.apache.poi.poifs.filesystem.DirectoryNode root,
ParseContext context,
Metadata metadata,
XHTMLContentHandler xhtml) |
void |
parse(InputStream stream,
ContentHandler handler,
Metadata metadata,
ParseContext context)
Extracts properties and text from an MS Document input stream
|
configure, getByteArrayMaxOverride, getDateFormatOverride, isConcatenatePhoneticRuns, isExtractAllAlternativesFromMSG, isExtractMacros, isIncludeDeletedContent, isIncludeHeadersAndFooters, isIncludeMoveFromContent, isIncludeShapeBasedContent, isUseSAXDocxExtractor, isUseSAXPptxExtractor, setByteArrayMaxOverride, setConcatenatePhoneticRuns, setDateFormatOverride, setExtractAllAlternativesFromMSG, setExtractMacros, setIncludeDeletedContent, setIncludeHeadersAndFooters, setIncludeMoveFromContent, setIncludeShapeBasedContent, setUseSAXDocxExtractor, setUseSAXPptxExtractor
parse
public static void extractMacros(org.apache.poi.poifs.filesystem.POIFSFileSystem fs, ContentHandler xhtml, EmbeddedDocumentExtractor embeddedDocumentExtractor) throws IOException, SAXException
As of POI-3.15-final, there are still some bugs in VBAMacroReader. For now, we are swallowing NPE and other runtime exceptions
fs
- NPOIFS to extract fromxhtml
- SAX writerembeddedDocumentExtractor
- extractor for embedded documentsIOException
- on IOException if it occurs during the extraction of the embedded docSAXException
- on SAXException for writing to xhtmlpublic Set<MediaType> getSupportedTypes(ParseContext context)
Parser
context
- parse contextpublic void parse(InputStream stream, ContentHandler handler, Metadata metadata, ParseContext context) throws IOException, SAXException, TikaException
stream
- the document stream (input)handler
- handler for the XHTML SAX events (output)metadata
- document metadata (input and output)context
- parse contextIOException
- if the document stream could not be readSAXException
- if the SAX events could not be processedTikaException
- if the document could not be parsedprotected void parse(org.apache.poi.poifs.filesystem.DirectoryNode root, ParseContext context, Metadata metadata, XHTMLContentHandler xhtml) throws IOException, SAXException, TikaException
IOException
SAXException
TikaException
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.