Security

The following is an incomplete list of known and fixed Common Vulnerabilities and Exposures (CVEs) and other vulnerabilities in Apache Tika or its dependencies. Please help us fill this in with more details.

CVE or Vulnerability | Description | Reporter | Affected Versions
CVE-2018-11796 | XML Entity Expansion in Tika's SAXParsers after reset() | Slava Gorelik | ?-1.19
CVE-2018-11797 | Very long loop parsing page tree in PDFBox | Shawn Rasheed and Jens Dietrich | ?-1.19
CVE-2018-11771 | Infinite Loop in Commons-Compress ZipArchiveInputStream | Tobias Ospelt | ?-1.18
CVE-2018-8017 | Infinite Loop in IptcAnpaParser | Rohan Padhye and Tobias Ospelt | 1.2-1.18
CVE-2018-8036 | Infinite Loop leading to OOM in PDFBox's AFMParser | Tobias Ospelt | ?-1.18
CVE-2018-12418 | Infinite Loop in junrar | Tobias Ospelt | ?-1.18
CVE-2018-11761 | XML Entity Expansion Vulnerability | Renfei (Brian) Wang | 0.1-1.18
CVE-2018-11762 | Rare Zip Slip Vulnerability in tika-app | Tim Allison | 0.9-1.18
RIFFReader | Infinite Loop in AudioParser in Java 8 and 9 | Sergey Bylokhov and Tobias Ospelt | ?-1.18
TIKA-2446 | OOM detecting OPCPackage files with corrupt ZIP | Thorsten Schäfer | ?-1.18
PDFBOX-4014 | Infinite loop in JBig2 (versions less than 3.0.0) | Hanno Böck | (if user supplied) ?-1.17
CVE-2018-1339 | Infinite loop in ChmParser | Tobias Ospelt | ?-1.17
CVE-2018-1338 | Infinite loop in BPGParser | Tobias Ospelt | ?-1.17
CVE-2018-1335 | Command Execution in tika-server | Tim Allison | ?-1.17
CVE-2017-12626 | Apache POI - Infinite loops in WMF, EMF, MSG and macros; OOMs in DOC, PPT and XLS | Tim Allison, Luís Filipe Nassif and Jerome Lacoste | ?-1.17
CVE-2018-1324 and COMPRESS-432 | Commons Compress - Infinite loop in ZipFile | Luís Filipe Nassif and Anton Abashkin | ?-1.17
CVE-2018-7489 and TIKA-2634 | Jackson - Deserialization vulnerability | Richard Cyganiak (notified Tika team) | ?-1.17
PDFBOX-3919 | Apache PDFBox - Infinite loop | Hanno Böck and Andreas Bogk | ?-1.16
TIKA-2115 | Apache POI - OOM parsing OLE object | Thomas Galla | ?-1.15
COMPRESS-382 | Commons Compress - OOM detecting corrupt LZMA | Luís Filipe Nassif | ?-1.15
COMPRESS-386 and TIKA-1631 | Commons Compress - OOM detecting corrupt x-compress | Pavel Micka | ?-1.15
TIKA-2045 and TIKA-3442 | Apache PDFBox - OOM in font caching | Egbert | ?-1.13
TIKA-1866 and TIKA-954 | Apache POI - OOM in DOCX and PPTX because of bug in Piccolo parser | Rob Tulloh and Shawn Johnson | ?-1.13
TIKA-2040 | GC-Overload and OOM in CHMParser | Luís Filipe Nassif | ?-1.13
CVE-2016-6809 | jmatio - Deserialization Vulnerability in MATLAB parser | Pierre Ernst | 1.6-1.13
CVE-2016-4434 | XXE Vulnerability in several parsers | Arthur Khashaev, Seulgi Kim, Mesut Timur (and Tim Allison while remediating initial issue reported by Arthur et al.) | 0.10-1.12
CVE-2015-3271 | Remote Access to host files via tika-server | Tim Allison | 1.9?-1.10
PDFBOX-2811 | Apache PDFBox - Infinite Loop | Andreas Lehmkühler | ?-1.10
PDFBOX-2200 | Apache PDFBox - Slowly building memory leak because of static caching of fonts | Matthew Buckett | ?-1.6
TIKA-1471 | Apache PDFBox - OOM with corrupt PDF | Alan Burlison | ?-1.6
TIKA-788 | Infinite Loop in DWG | Stas Shaposhnikov | ?-1.4?
TIKA-1132 | Apache POI - Nearly Infinite Loop in XLS | Ryan Krueger | ?-1.4
TIKA-1179 | Infinite Loop in corrupt MP3 | Marius Dumitru Florea | ?-1.4
TIKA-866 | OOM reading Tika config file | Stephan Mühlstrasser | ?-1.1

Acronyms and Terms

  • Command Execution - A malicious client could execute arbitrary commands on tika-server's command line.
  • Deserialization Vulnerability - See OWASP's Cheat Sheet. A malicious actor could run arbitrary code on your computer.
  • OOM - Out of Memory Error - Parsers may allocate more memory than is available, sometimes because they do not perform sanity checks before allocation. See, for example, TIKA-1631.
  • XXE - XML External Entity Processing - A malicious client could access data on your system.
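
The OOM and XXE entries above can be illustrated with a minimal Java sketch. This is not Tika's actual remediation for any of the issues listed. The class name SafeParsingSketch, the length-prefixed record format, and the MAX_RECORD_BYTES cap are all hypothetical, chosen only for illustration. The sketch also assumes the JAXP parser bundled with the JDK, which recognizes the Xerces disallow-doctype-decl feature; other parser implementations may reject that feature name.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration class; not part of Apache Tika.
class SafeParsingSketch {

    // Hypothetical cap; a real parser would choose a limit suited to its format.
    static final int MAX_RECORD_BYTES = 1 << 20;

    /** Reads a 4-byte big-endian length, validates it, then reads that many bytes. */
    static byte[] readRecord(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int declared = data.readInt();
        // Sanity check BEFORE allocating: a corrupt file can declare any length.
        if (declared < 0 || declared > MAX_RECORD_BYTES) {
            throw new IOException("Implausible record length: " + declared);
        }
        byte[] buf = new byte[declared];
        data.readFully(buf); // throws EOFException if the stream ends early
        return buf;
    }

    /** Builds a SAX parser factory that refuses DOCTYPE declarations entirely. */
    static SAXParserFactory hardenedFactory()
            throws ParserConfigurationException, SAXException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumption: the underlying parser supports this Xerces feature.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        return factory;
    }

    /** Returns true if the XML parses under the hardened settings, false if rejected. */
    static boolean parses(String xml)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParser parser = hardenedFactory().newSAXParser();
        try {
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] record = {0, 0, 0, 2, 'h', 'i'};
        System.out.println(readRecord(new ByteArrayInputStream(record)).length);
        System.out.println(parses("<a/>"));
        System.out.println(parses(
            "<!DOCTYPE a [<!ENTITY x SYSTEM \"file:///etc/passwd\">]><a>&x;</a>"));
    }
}
```

Rejecting DOCTYPE declarations outright is the bluntest of several possible XXE defenses; applications that must accept DTDs would need a finer-grained configuration. Neither check helps against infinite loops, which is one reason untrusted files are often parsed in a separate process.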