Apache Tika 0.9

The most notable changes in Tika 0.9 over the previous release are:

  • A critical bug that prevented metadata from printing to the command line when the underlying Parser didn't generate XHTML output has been fixed. (TIKA-596)
  • The 0.8 version of Tika included a NetCDF jar file that pulled in a large number of redundant dependencies. Tika 0.9 addresses this by republishing a minimal NetCDF jar and depending on that instead. (TIKA-556)
  • MIME detection for iWork and OpenXML documents has been improved. (TIKA-533, TIKA-562, TIKA-588)
  • A critical, backwards-incompatible bug in PDF parsing that was introduced in Tika 0.8 has been fixed. (TIKA-548)
  • Support for forked parsing in separate processes was added. (TIKA-416)
  • Tika's language identifier now supports the Lithuanian language. (TIKA-582)

The following people have contributed to Tika 0.9 by submitting or commenting on the issues resolved in this release:

  • Alex Skochin
  • Alexander Chow
  • Antoine L.
  • Antoni Mylka
  • Benjamin Douglas
  • Benson Margulies
  • Chris A. Mattmann
  • Cristian Vat
  • Cyriel Vringer
  • David Benson
  • Erik Hetzner
  • Gabriel Miklos
  • Geoff Jarrad
  • Jukka Zitting
  • Ken Krugler
  • Kostya Gribov
  • Leszek Piotrowicz
  • Martijn van Groningen
  • Maxim Valyanskiy
  • Michel Tremblay
  • Nick Burch
  • paul
  • Paul Pearcy
  • Peter van Raamsdonk
  • Piotr Bartosiewicz
  • Reinhard Schwab
  • Scott Severtson
  • Shinsuke Sugaya
  • Staffan Olsson
  • Steve Kearns
  • Tom Klonikowski
  • Žygimantas Medelis

See http://s.apache.org/qi for more details on these contributions.