Apache Tika 2.4.1

The most notable changes in Tika 2.4.1 over the previous release are:

  • Implement bulk upload in the OpenSearch emitter (TIKA-3791).
  • Implement tika-server client via pipes mode (TIKA-3790).
  • Custom embedded parsers and EmbeddedDocumentHandlers can now add metadata to the container file's metadata (TIKA-3789).
  • Record embedded file exceptions in the container file's metadata (TIKA-3788).
  • Allow continuation of parsing after write limit has been reached (TIKA-3787).
  • Allow pass-through of 'Content-Length' header to metadata in TikaResource (TIKA-3786).
  • Add embedded depth to profiles tables in tika-eval (TIKA-3775).
  • Add stop() method to TikaServerCli so that it can be run with Apache Commons Daemon (TIKA-1570).
  • Fix bug in the ordering of Parsers during service loading (TIKA-3750).
  • Users can expand system properties from the forking process into forked tika-server processes (TIKA-3748).
  • Fix some files being wrongly detected as EML (TIKA-3771).
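
As a rough illustration of the write-limit change (TIKA-3787), the sketch below shows the general idea of continuing after a limit is reached: output stops being stored, a flag is recorded, and the remaining input is still processed. All names here (WriteLimitSketch, LimitedSink, parseAll) are hypothetical and are not Tika classes or Tika's actual implementation.

```java
import java.util.List;

// Hypothetical sketch only; none of these names are part of Tika's API.
class WriteLimitSketch {

    /** Collects text up to a character limit, then silently drops the rest. */
    static final class LimitedSink {
        private final int limit;
        private final StringBuilder text = new StringBuilder();
        private boolean limitReached = false;

        LimitedSink(int limit) {
            this.limit = limit;
        }

        void write(String chunk) {
            int room = limit - text.length();
            if (chunk.length() <= room) {
                text.append(chunk);
                return;
            }
            // Keep only the part that still fits, then record that the
            // limit was hit instead of throwing and aborting the parse.
            text.append(chunk, 0, Math.max(room, 0));
            limitReached = true;
        }

        String text() {
            return text.toString();
        }

        boolean limitReached() {
            return limitReached;
        }
    }

    /** Feeds every chunk to the sink and returns how many chunks were seen. */
    static int parseAll(List<String> chunks, LimitedSink sink) {
        int seen = 0;
        for (String chunk : chunks) {
            sink.write(chunk); // never aborts, even past the limit
            seen++;
        }
        return seen;
    }

    public static void main(String[] args) {
        LimitedSink sink = new LimitedSink(5);
        int seen = parseAll(List.of("hel", "lo wo", "rld"), sink);
        System.out.println(sink.text() + " " + sink.limitReached() + " " + seen);
    }
}
```

With a limit of 5 characters, the sink keeps "hello", reports that the limit was reached, and all three chunks are still visited. A caller could then store the flag alongside the truncated text rather than treating the document as a failure.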

The following people have contributed to Tika 2.4.1 by submitting or commenting on the issues resolved in this release:

  • Alexander
  • Aurélien Marocco
  • Dmitrii Kriukov
  • Jason Borg
  • Luís Filipe Nassif
  • Sam Stephens
  • Tilman Hausherr
  • Tim Allison
  • Tom Brisland

See https://s.apache.org/ts016 for more details on these contributions.