Migrating to Tika 4.x
This section provides guides and background documentation for migrating to Apache Tika 4.x.
See the Roadmap for version timelines and support schedules.
Migration Guides
- Migration Guide - Step-by-step guide for upgrading from Tika 3.x to 4.x
- Tika Server Migration - Breaking changes and new endpoints in tika-server 4.x
- Metadata Changes - Detailed metadata key changes and migration examples
Background Documentation
- Design Notes - Architectural decisions and design rationale
- Serialization - JSON serialization design and implementation details
TODOs / Missing Features in 4.x
The following features from 3.x are not yet implemented in 4.x:
Config Serialization
The following tika-app options for dumping configuration are not yet available:
- --dump-minimal-config - Print minimal TikaConfig
- --dump-current-config - Print current TikaConfig
- --dump-static-config - Print static config
- --dump-static-full-config - Print static explicit config
These options depend on JSON serialization support for TikaConfig objects, which is not yet complete. The underlying serialization infrastructure exists (see Serialization), but the CLI integration is pending.
Workaround: Create JSON config files manually, using tika-pipes/tika-async-cli/src/main/resources/config-template.json as a starting point.
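One way to begin the workaround is to copy the template to a working file and edit the copy, leaving the original untouched. The sketch below is a minimal illustration, assuming it is run from the root of a Tika source checkout and a Java 11+ runtime. The class name ConfigTemplateCopy, the method copyTemplate, and the output file name my-tika-config.json are hypothetical and chosen only for this example. It uses only the standard java.nio.file API; it does not validate the JSON or load it into Tika.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class ConfigTemplateCopy {

    // Copies the template to a working location so the original stays untouched.
    // Creates the target's parent directory if needed and overwrites an existing target.
    static Path copyTemplate(Path template, Path target) throws IOException {
        if (!Files.isRegularFile(template)) {
            throw new IOException("Template not found: " + template);
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        return Files.copy(template, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Template path as given in the workaround; override with the first argument.
        Path template = Path.of(args.length > 0 ? args[0]
                : "tika-pipes/tika-async-cli/src/main/resources/config-template.json");
        // Hypothetical output name; override with the second argument.
        Path target = Path.of(args.length > 1 ? args[1] : "my-tika-config.json");

        Path written = copyTemplate(template, target);
        System.out.println("Wrote " + written + "; edit it by hand before use.");
    }
}
```

After copying, edit the new file by hand to match the settings you need. The template itself remains available as a reference for the expected structure.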