Apache Tika 1.24
The most notable changes in Tika 1.24 over the previous release are:
- Enable optional extraction of structural tags in PDFs (alpha-grade) (TIKA-3026).
- Tika app's --extract mode now outputs to STDOUT (TIKA-3035).
- Add an optional Preflight parser for PDFs (TIKA-3055).
- Improve detection of some zip-based formats (TIKA-3057).
- Upgrade Drew Noakes' metadata-extractor to 2.13.0 (TIKA-2952).
- Upgrade POI to 4.1.2 (TIKA-3047).
- Extract XMP from PSD files (TIKA-3050).
- Add XMLProfiler as an optional parser to profile XFA and XMP in PDFs (TIKA-3045).
- Extract inline images that rely on the DCT filter from PDFs (TIKA-3041).
- Upgrade PDFBox to 2.0.19 (TIKA-3033).
- Fix bug in ASM parser configuration (TIKA-2992).
- Upgrade java-libpst to 0.9.3 (TIKA-2546).
The following people have contributed to Tika 1.24 by submitting or commenting on the issues resolved in this release:
- Aman Mishra
- Arvind Jain
- Carina Antunes
- Clark Perkins
- David Eric Pugh
- David Pilato
- Don
- Jan Vlug
- Jorge Spinsanti
- Luís Filipe Nassif
- Markus Mandalka
- Michael Moritz
- MRIT64
- Nick Burch
- Richard Jones
- Soren Daugaard
- Steve
- Syed Osama Anwer
- Tilman Hausherr
- Tim Allison
- Zoltan Farago
See https://s.apache.org/xa01p for more details on these contributions.