Migrating to Tika 4.x

This section provides guides and background documentation for migrating to Apache Tika 4.x.

See the Roadmap for version timelines and support schedules.

Migration Guides

Background Documentation

TODOs / Missing Features in 4.x

The following features from 3.x are not yet implemented in 4.x:

Config Serialization

The following tika-app options for dumping configuration are not yet available:

  • --dump-minimal-config - Print minimal TikaConfig

  • --dump-current-config - Print current TikaConfig

  • --dump-static-config - Print static config

  • --dump-static-full-config - Print static explicit config

These options require completing JSON serialization support for TikaConfig objects. The underlying serialization infrastructure exists (see Serialization), but the tika-app CLI integration is pending.

Workaround: Manually create JSON config files using the Tika Pipes config template as a starting point.
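
As a rough illustration of the workaround, the sketch below copies a template JSON file to a new location so it can be edited by hand. It uses only the Java standard library and is not part of Tika. The class name ConfigFromTemplate, the method createFromTemplate, and both default file names are hypothetical placeholders; substitute the actual path of the Tika Pipes config template in your checkout or distribution.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not a Tika API: copies a JSON config template
// to a new file that can then be edited manually.
public class ConfigFromTemplate {

    /**
     * Copies the template to the target path and returns the target.
     * Fails if the template is missing or if the target already exists,
     * so earlier manual edits are never silently overwritten.
     */
    static Path createFromTemplate(Path template, Path target) throws IOException {
        if (!Files.isRegularFile(template)) {
            throw new IOException("Template not found: " + template);
        }
        Path parent = target.toAbsolutePath().getParent();
        if (parent != null) {
            Files.createDirectories(parent);
        }
        // Without REPLACE_EXISTING, Files.copy throws
        // FileAlreadyExistsException when the target is already present.
        return Files.copy(template, target);
    }

    public static void main(String[] args) throws IOException {
        // Both default names are placeholders; pass real paths as arguments.
        Path template = Path.of(args.length > 0 ? args[0] : "tika-config-template.json");
        Path target = Path.of(args.length > 1 ? args[1] : "my-tika-config.json");
        System.out.println("Created " + createFromTemplate(template, target));
    }
}
```

Refusing to overwrite an existing target is a deliberate choice in this sketch: because the dump options are not yet available to regenerate a config, a hand-edited file is worth protecting.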