Apache Tika 4.0.0-alpha-1
The most notable changes in Tika 4.0.0-alpha-1 over the previous release are:
Breaking Changes
- Moved from XML-based to JSON-based configuration (TIKA-4544 and many others).
- tika-pipes implementation modules are now organized by resource (e.g. tika-pipes-solr) rather than by task (e.g. tika-pipes-fetcher-solr) (TIKA-4543). Note that the file-system pipes components have been moved out of tika-pipes-core into their own pf4j module: tika-pipes-file-system.
- tika-pipes implementation modules are now pf4j plugins (TIKA-4519).
- tika-pipes core classes have been moved to a new module: tika-pipes-core, and the FileSystem pipes components have moved (TIKA-4334).
- The old MetadataFilter has been removed, and MetadataListFilter has been renamed to MetadataFilter (TIKA-4546).
- Removed several modules, including: tika-batch (TIKA-4333), snaps deployment (TIKA-4502), dotnet (TIKA-4332), advanced media module (TIKA-4500), tika-dl module (TIKA-4499), tika-fuzzing module (TIKA-4506).
- Headers are no longer injected into the body/content of MSG files (TIKA-4345). Please open a ticket if you need this behavior across email formats.
- API changes in the EmbeddedStreamTranslator (TIKA-4518).
- Removed DigestingParser (TIKA-4607).
- tika-parsers-standard-package is now a pom, not a jar. Users must add <type>pom</type> in Maven or @pom in Gradle (TIKA-4712).
- Removed legacy ExternalParser; external parsers now require explicit JSON configuration (TIKA-4707).
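For Maven users, the tika-parsers-standard-package change (TIKA-4712) might look roughly like the following dependency declaration. This is a minimal sketch: the version is assumed to match this release, and any scope or exclusions a real project needs are omitted.

```xml
<!-- Sketch only: the version is an assumption based on this release's name. -->
<dependency>
  <groupId>org.apache.tika</groupId>
  <artifactId>tika-parsers-standard-package</artifactId>
  <version>4.0.0-alpha-1</version>
  <!-- Required now that the artifact is published as a pom rather than a jar. -->
  <type>pom</type>
</dependency>
```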
Other Changes
- Fix concurrency bug in TikaToXMP (TIKA-4393).
The following people have contributed to Tika 4.0.0-alpha-1 by submitting or commenting on the issues resolved in this release:
- Aashish Tudu
- Alexander Veit
- Chengxin Xu
- Claude Warren
- David Frizelle
- Eric Schoen
- Francesco
- Ghiles OUAREZKI
- Grigorii Ioffe
- Iachimoe
- james
- Justin Deoliveira
- Klara Mazurak
- Laura Delmaestro
- Leszek Sliwko
- Lewis John McGibbney
- Manish S N
- Matt Dutton
- Nino Skopac
- Olivier Ceulemans
- Peter Hoogendijk
- Pleeplop
- Ruairidh Williamson
- Sandeep Kulkarni
- Sebastian Nagel
- Stephen H
- Steven Huypens
- Subbu
- Tiancheng Dai
- Tilman Hausherr
- Tim Allison
- Tim Barrett
- Tom Brisland
- Valery Yatsynovich
- V. S.
See https://s.apache.org/6ctu5 for more details on these contributions.


