Apache Tika 1.27

The most notable changes in Tika 1.27 over the previous release are:

  • Migrate MP4 parsing to Drew Noakes' metadata-extractor (TIKA-3459). Note: to revert to the legacy parser, turn off NoakesMP4Parser and turn on MP4Parser via tika-config.xml.
  • Prevent rare infinite loop in tika-server's -spawnChild mode when restart fails because of failure to bind to the port (TIKA-3441).
  • Reduce the likelihood that Tesseract processes are orphaned on JVM restart in tika-server (TIKA-3441).
  • Deprecate experimental PDFPreflightParser (TIKA-3437).
  • Apply encoding detection to zip entry names, contributed by Ryan421 (TIKA-3374).
  • Add JSON output for the /tika endpoint in tika-server (TIKA-3352).
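
As a rough illustration of the TIKA-3459 note above, a tika-config.xml that reverts to the legacy MP4 parser might look like the sketch below. The fully qualified class names are assumptions: the release notes give only the simple names NoakesMP4Parser and MP4Parser, and the org.apache.tika.parser.mp4 package is assumed for both. Verify the names against the Tika 1.27 jars before relying on this.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<properties>
  <parsers>
    <!-- Keep the default parsers, but exclude the new metadata-extractor based
         MP4 parser (package name is an assumption). -->
    <parser class="org.apache.tika.parser.DefaultParser">
      <parser-exclude class="org.apache.tika.parser.mp4.NoakesMP4Parser"/>
    </parser>
    <!-- Explicitly turn the legacy MP4 parser back on (package name is an assumption). -->
    <parser class="org.apache.tika.parser.mp4.MP4Parser"/>
  </parsers>
</properties>
```

The file would then be passed to Tika in the usual way, for example through the --config option of tika-app or tika-server.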

The following people have contributed to Tika 1.27 by submitting or commenting on the issues resolved in this release:

  • Andrei Dobrescu
  • Carey Halton
  • Cristian Zamfir
  • Furkan Kamaci
  • Jukka Zitting
  • Konstantin Gribov
  • Lewis John McGibbney
  • Peter Kronenberg
  • Philip Southam
  • Shubhangi Raut
  • Subhajit Das
  • Tim Allison
  • Trevor Bentley

See https://s.apache.org/vodtk for more details on these contributions.