Apache Tika 1.20
The most notable changes in Tika 1.20 over the previous release are:
- Upgrade to Apache POI 4.0.1 (TIKA-2751).
- Integrate/parameterize new angles handling in PDFBox (TIKA-2779).
- Upgrade to PDFBox 2.0.13 (TIKA-2788).
- Prevent content within <style/> and <script/> elements from being written by the ToTextContentHandler (TIKA-2550).
- Switch child-to-parent communication to a shared memory-mapped file in tika-server's -spawnChild mode.
- Fix bug in tika-server when run in legacy mode (not -spawnChild) that caused it to return 503 on documents submitted after it hit an OutOfMemoryError (TIKA-2776).
- Upgrade jaxb-runtime and javax.activation (TIKA-2778).
- tika-app in batch mode now requires an interrupt or kill signal to the parent process to stop the parent and the child processes (TIKA-2780).
- Bulk upgrade of dependencies (TIKA-2775).
- Improve language id efficiency in tika-eval (TIKA-2777).
- 25.2 (TIKA-2773).
- Remove duplication of notes in PPT slides (TIKA-2735).
- Use -javaHome or $JAVA_HOME (if they exist) when spawning the child process in tika-server's -spawnChild mode.
The following people have contributed to Tika 1.20 by submitting or commenting on the issues resolved in this release:
- Boris Petrov
- Dave Meikle
- feng ye
- Hans Brende
- Jeroen
- Julien Massiera
- Kristen Cheung
- Lewis John McGibbney
- Mario Bisonti
- Markus Jelsma
- Nick Sincaglia
- Ronan O'Sullivan
- Tim Allison
See https://s.apache.org/fScy for more details on these contributions.