Apache Tika 3.3.0

The most notable changes in Tika 3.3.0 over the previous release are:

  • Switch to poi-ooxml-full (TIKA-4563).
  • The FileSystemFetcher now fetches an absolute path only if users set "allowAbsolutePaths=true" (TIKA-4649).
  • Add a Markdown option for content handlers (TIKA-4563).
  • Improve ZIP parsing (TIKA-4650).
  • Add detection of compressed BMP (TIKA-4511).
  • Allow per-file timeouts in tika-pipes (TIKA-4497).
  • Add Matroska detector (TIKA-1180).
  • Allow multiple values for many Dublin Core keys (TIKA-4466).
  • Extract macros by default in tika-app's command line and GUI (TIKA-4472).

The following people have contributed to Tika 3.3.0 by submitting or commenting on the issues resolved in this release:

  • Chengxin Xu
  • Claude Warren
  • Diego Rivera
  • Eric Schoen
  • Grigorii Ioffe
  • Gus Heck
  • Hervé Boutemy
  • Iachimoe
  • Klara Mazurak
  • Laura Delmaestro
  • Lewis John McGibbney
  • Manish S N
  • Matt Dutton
  • Steven Huypens
  • Tiancheng Dai
  • Tilman Hausherr
  • Tim Allison
  • Tom Brisland
  • V. S.

See https://s.apache.org/li3jn for more details on these contributions.