Package org.apache.tika.sax
Class SecureContentHandler
java.lang.Object
org.xml.sax.helpers.DefaultHandler
org.apache.tika.sax.ContentHandlerDecorator
org.apache.tika.sax.SecureContentHandler
- All Implemented Interfaces:
ContentHandler, DTDHandler, EntityResolver, ErrorHandler
Content handler decorator that attempts to prevent denial of service
attacks against Tika parsers.
Currently this class simply compares the number of output characters to the number of input bytes and keeps track of the XML nesting levels. An exception is thrown if the output seems excessive compared to the input document, which is a strong indication of a zip bomb.
- Since:
- Apache Tika 0.4
Constructor Summary
Constructor and Description:
- SecureContentHandler(ContentHandler handler, TikaInputStream stream): Decorates the given content handler with zip bomb prevention based on the count of bytes read from the given counting input stream.
Method Summary
Modifier and Type, Method, and Description:
- protected void advance(int length): Records the given number of output characters (or more accurately UTF-16 code units).
- void characters(char[] ch, int start, int length)
- void endElement(String uri, String localName, String name)
- long getMaximumCompressionRatio(): Returns the maximum compression ratio.
- int getMaximumDepth(): Returns the maximum XML element nesting level.
- int getMaximumPackageEntryDepth(): Returns the maximum package entry nesting level.
- long getOutputThreshold(): Returns the configured output threshold.
- void ignorableWhitespace(char[] ch, int start, int length)
- void setMaximumCompressionRatio(long ratio): Sets the ratio between output characters and input bytes.
- void setMaximumDepth(int depth): Sets the maximum XML element nesting level.
- void setMaximumPackageEntryDepth(int depth): Sets the maximum package entry nesting level.
- void setOutputThreshold(long threshold): Sets the threshold for output characters before the zip bomb prevention is activated.
- void startElement(String uri, String localName, String name, Attributes atts)
- void throwIfCauseOf(SAXException e): Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb.
Methods inherited from class org.apache.tika.sax.ContentHandlerDecorator
endDocument, endPrefixMapping, error, fatalError, handleException, processingInstruction, setContentHandler, setDocumentLocator, skippedEntity, startDocument, startPrefixMapping, toString, warning
Methods inherited from class org.xml.sax.helpers.DefaultHandler
notationDecl, resolveEntity, unparsedEntityDecl
-
Constructor Details
-
SecureContentHandler
public SecureContentHandler(ContentHandler handler, TikaInputStream stream)
Decorates the given content handler with zip bomb prevention based on the count of bytes read from the given counting input stream. The resulting decorator can be passed to a Tika parser along with the given counting input stream.
- Parameters:
handler - the content handler to be decorated
stream - the input stream to be parsed
-
-
Method Details
-
getOutputThreshold
public long getOutputThreshold()
Returns the configured output threshold.
- Returns:
output threshold
-
setOutputThreshold
public void setOutputThreshold(long threshold)
Sets the threshold for output characters before the zip bomb prevention is activated. This avoids false positives in cases where an otherwise normal document for some reason starts with a highly compressible sequence of bytes.
- Parameters:
threshold - new output threshold
-
getMaximumCompressionRatio
public long getMaximumCompressionRatio()
Returns the maximum compression ratio.
- Returns:
maximum compression ratio
-
setMaximumCompressionRatio
public void setMaximumCompressionRatio(long ratio)
Sets the maximum allowed ratio between output characters and input bytes. If this ratio is exceeded (after the output threshold has been reached), an exception is thrown.
- Parameters:
ratio - new maximum compression ratio
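The interplay between the output threshold and the maximum compression ratio can be illustrated with a small stand-alone sketch. This is not Tika's implementation: the class `RatioGuard`, its method `recordAndCheck`, and the limits used are hypothetical names and values chosen only for illustration. The sketch assumes the check stays inactive until the output count passes the threshold, and trips once the output exceeds the input byte count multiplied by the ratio.

```java
// Hypothetical stand-alone model of a threshold-gated ratio check.
// Not Tika code: all names and limits here are assumptions for illustration.
final class RatioGuard {
    private final long outputThreshold; // characters allowed before the check is active
    private final long maximumRatio;    // allowed output characters per input byte
    private long outputCharacters = 0;  // running total of recorded output

    RatioGuard(long outputThreshold, long maximumRatio) {
        this.outputThreshold = outputThreshold;
        this.maximumRatio = maximumRatio;
    }

    /** Records new output and reports whether the limits are now exceeded. */
    boolean recordAndCheck(long newCharacters, long inputBytesRead) {
        outputCharacters += newCharacters;
        return outputCharacters > outputThreshold
                && outputCharacters > inputBytesRead * maximumRatio;
    }
}
```

With a threshold of 1000 and a ratio of 100, 500 characters produced from a single input byte would still pass in this model, because the threshold has not been reached yet.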
-
getMaximumDepth
public int getMaximumDepth()
Returns the maximum XML element nesting level.
- Returns:
maximum XML element nesting level
-
setMaximumDepth
public void setMaximumDepth(int depth)
Sets the maximum XML element nesting level. If this depth level is exceeded, an exception is thrown.
- Parameters:
depth - maximum XML element nesting level
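A nesting limit of this kind can be sketched with the SAX classes that ship with the JDK. The handler below is a hypothetical illustration, not Tika's code: the names `DepthLimitHandler` and `parsesWithinLimit` are assumptions, and the exact comparison Tika uses at the boundary is not confirmed by this page. The sketch counts open elements in `startElement`, decrements in `endElement`, and throws a `SAXException` once the count goes above the limit.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical depth-limiting handler built on the JDK's SAX API; not Tika code.
final class DepthLimitHandler extends DefaultHandler {
    private final int maximumDepth;
    private int currentDepth = 0;

    DepthLimitHandler(int maximumDepth) {
        this.maximumDepth = maximumDepth;
    }

    @Override
    public void startElement(String uri, String localName, String name, Attributes atts)
            throws SAXException {
        currentDepth++;
        if (currentDepth > maximumDepth) {
            throw new SAXException(
                    "Nesting depth " + currentDepth + " exceeds limit " + maximumDepth);
        }
    }

    @Override
    public void endElement(String uri, String localName, String name) {
        currentDepth--;
    }

    /**
     * Returns true if the XML parses without a SAXException.
     * Note that malformed XML also yields false here, since it raises a SAXException too.
     */
    static boolean parsesWithinLimit(String xml, int maximumDepth) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)),
                            new DepthLimitHandler(maximumDepth));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this model, `<a><b/></a>` stays within a limit of 2, while `<a><b><c/></b></a>` does not.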
-
getMaximumPackageEntryDepth
public int getMaximumPackageEntryDepth()
Returns the maximum package entry nesting level.
- Returns:
maximum package entry nesting level
-
setMaximumPackageEntryDepth
public void setMaximumPackageEntryDepth(int depth)
Sets the maximum package entry nesting level. If this depth level is exceeded, an exception is thrown.
- Parameters:
depth - maximum package entry nesting level
-
throwIfCauseOf
public void throwIfCauseOf(SAXException e) throws TikaException
Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb.
- Parameters:
e - SAX exception
- Throws:
TikaException - zip bomb exception
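One plausible way to tell "raised by this instance" apart from other SAX failures is to tag the exception with the object that raised it. The sketch below shows that pattern with hypothetical stand-ins only: `GuardSketch`, `TaggedSaxException`, `limitReached`, and `LimitExceededException` are invented for illustration and are not Tika classes, and this page does not confirm how Tika implements the check internally.

```java
import org.xml.sax.SAXException;

// Hypothetical stand-in for a library-specific checked exception; not a Tika class.
final class LimitExceededException extends Exception {
    LimitExceededException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical guard that tags the SAXExceptions it raises with itself.
final class GuardSketch {
    /** SAXException subclass that remembers which guard raised it. */
    private static final class TaggedSaxException extends SAXException {
        final GuardSketch owner;

        TaggedSaxException(GuardSketch owner, String message) {
            super(message);
            this.owner = owner;
        }
    }

    /** Creates the exception this guard would throw from a SAX callback. */
    SAXException limitReached(String message) {
        return new TaggedSaxException(this, message);
    }

    /** Rethrows as LimitExceededException only if this guard raised the given exception. */
    void throwIfCauseOf(SAXException e) throws LimitExceededException {
        if (e instanceof TaggedSaxException && ((TaggedSaxException) e).owner == this) {
            throw new LimitExceededException(e.getMessage(), e);
        }
    }
}
```

A caller would typically catch the `SAXException` from parsing, pass it to `throwIfCauseOf`, and rethrow the original exception if the call returns normally.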
-
advance
protected void advance(int length) throws SAXException
Records the given number of output characters (or more accurately UTF-16 code units). Throws an exception if the recorded number of characters greatly exceeds the number of input bytes read.
- Parameters:
length - number of new output characters produced
- Throws:
SAXException - if a zip bomb is detected
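The distinction between characters and UTF-16 code units matters for text outside the Basic Multilingual Plane, where one code point occupies two `char` values. The following stand-alone snippet illustrates the difference using only `java.lang.String`; the class `CodeUnitDemo` and its methods are hypothetical helpers, not part of Tika.

```java
// Stand-alone illustration of code units versus code points; not Tika code.
final class CodeUnitDemo {
    /** Number of UTF-16 code units, which is what a char[]-based length counts. */
    static int codeUnits(String text) {
        return text.length();
    }

    /** Number of Unicode code points, which can be smaller than the code unit count. */
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }
}
```

For the string `"a\uD83D\uDE00"` (the letter a followed by one supplementary code point written as a surrogate pair), `codeUnits` returns 3 and `codePoints` returns 2.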
-
startElement
public void startElement(String uri, String localName, String name, Attributes atts) throws SAXException
- Specified by:
startElement in interface ContentHandler
- Overrides:
startElement in class ContentHandlerDecorator
- Throws:
SAXException
-
endElement
public void endElement(String uri, String localName, String name) throws SAXException
- Specified by:
endElement in interface ContentHandler
- Overrides:
endElement in class ContentHandlerDecorator
- Throws:
SAXException
-
characters
public void characters(char[] ch, int start, int length) throws SAXException
- Specified by:
characters in interface ContentHandler
- Overrides:
characters in class ContentHandlerDecorator
- Throws:
SAXException
-
ignorableWhitespace
public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException
- Specified by:
ignorableWhitespace in interface ContentHandler
- Overrides:
ignorableWhitespace in class ContentHandlerDecorator
- Throws:
SAXException
-