Publishing the Documentation Site
This guide covers how to build and publish the Apache Tika documentation site.
Overview
The documentation is built using Antora, a static site generator for AsciiDoc. The site supports multiple versions through Git branches and includes client-side search powered by Lunr.
Prerequisites
- Maven 3.9+
- Git
- Internet access on first build: the Antora plugin downloads Node.js into ~/.cache/tika-antora/ (~100 MB, one-time per machine; reused across clean builds and across worktrees)
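The Maven requirement can be checked before the first build. The following is a minimal sketch, not part of the Tika build: the function names `maven_version` and `version_at_least` are hypothetical, and it assumes the Maven version banner begins with `Apache Maven <version>`.

```shell
#!/bin/sh
# Sketch: check that an installed Maven meets the 3.9+ prerequisite.
# Both function names are illustrative, not part of the Tika build.

# Extract the version number from `mvn -v` style output on stdin.
maven_version() {
  sed -n 's/^Apache Maven \([0-9][0-9.]*\).*/\1/p'
}

# Succeed when dotted version $1 is >= dotted version $2.
# Only the first three numeric fields are compared.
version_at_least() {
  have="$1.0.0"   # pad so that "3.9" is treated as "3.9.0"
  want="$2.0.0"
  for field in 1 2 3; do
    h=$(printf '%s' "$have" | cut -d. -f"$field")
    w=$(printf '%s' "$want" | cut -d. -f"$field")
    if [ "$h" -gt "$w" ]; then return 0; fi
    if [ "$h" -lt "$w" ]; then return 1; fi
  done
  return 0
}
```

From the repo root, something like `version_at_least "$(./mvnw -v | maven_version)" 3.9 || echo "Maven 3.9+ required"` would then report an outdated installation.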
Building the Site Locally
The docs module is only included in the reactor under the apache-release profile. Build the site from the repo root:
./mvnw package -Papache-release -pl :tika-docs -DskipTests
The generated site will be at docs/target/site/. The current git commit and date are stamped automatically onto the home page; the build writes a generated copy of the playbook to docs/antora-playbook-stamped.yml, which is gitignored.
To skip the stamping or override the playbook:
# build directly with the unstamped playbook
cd docs && mvn antora:antora -Dplaybook=antora-playbook.yml
Previewing the Site
Option 1: Python HTTP server (recommended)
cd docs/target/site
python3 -m http.server 8000
Then open http://localhost:8000 in your browser.
Option 2: Node.js HTTP server
npx http-server docs/target/site -p 8000
Then open http://localhost:8000 in your browser.
Option 3: Open static HTML directly
# Linux
xdg-open docs/target/site/index.html
# macOS
open docs/target/site/index.html
# Windows
start docs/target/site/index.html
Note: Opening static HTML directly may not fully test search and relative links.
Living Documentation
The documentation includes examples that are symlinked to actual test configuration files in the codebase.
This ensures examples are always valid and tested. The symlinks are in docs/modules/ROOT/examples/ and
point to files in tika-parsers/…/config-examples/.
When you modify a config example in the codebase, the documentation automatically reflects the change on the next build.
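Because the examples are symlinks, a renamed or deleted config file leaves a dangling link behind. A minimal sketch for spotting these before a build follows; `find_broken_links` is a hypothetical helper name, not an existing script in the repo.

```shell
#!/bin/sh
# Sketch: print every symlink under directory $1 whose target does not exist.
# `test -e` follows the link, so it fails only for dangling symlinks.
find_broken_links() {
  find "$1" -type l ! -exec test -e {} \; -print
}
```

Running `find_broken_links docs/modules/ROOT/examples` from the repo root should print nothing when every example still points at a real file.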
Version Management
Documentation versions are managed through Git branches with the docs/ prefix.
Branch Structure
- HEAD (main branch) - Current development version (SNAPSHOT)
- docs/4.0.0 - Released 4.0.0 documentation
- docs/4.1.0 - Released 4.1.0 documentation
The playbook (antora-playbook.yml) is configured to build all docs/* branches automatically.
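To see which version branches exist in a checkout before building, the docs/ prefix can be queried with git. This is a sketch under assumptions: `list_docs_branches` is a hypothetical helper, and it lists local branches only (remote-tracking branches would sit under refs/remotes/origin/docs instead). It does not read the playbook itself.

```shell
#!/bin/sh
# Sketch: list local branches under the docs/ prefix, one per line.
# $1 is the repository path (defaults to the current directory).
list_docs_branches() {
  git -C "${1:-.}" for-each-ref --format='%(refname:short)' refs/heads/docs
}
```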
Publishing to the Site
Build the docs with Maven, then run publish-docs.sh to copy the output to a tika-site SVN checkout (with URL flattening so /docs/tika/X.Y.Z/… becomes /docs/X.Y.Z/…):
./mvnw package -Papache-release -pl :tika-docs -DskipTests
cd docs
./publish-docs.sh /path/to/tika-site/publish
# Then in the SVN checkout:
cd /path/to/tika-site
svn add publish/docs publish/_ --force
svn commit -m "Publish 4.0.0-SNAPSHOT docs"
The Maven package step builds the Antora site (stamping the current git
commit and date on the home page); publish-docs.sh copies the output to
the site checkout with the correct directory layout:
- publish/docs/4.0.0-SNAPSHOT/ - the documentation pages
- publish/_/ - CSS, JS, fonts (shared across versions)
- publish/docs/index.html - redirect to latest version
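The URL flattening mentioned above amounts to dropping the tika path segment under /docs/. The sketch below only illustrates that mapping on a single path; `flatten_docs_path` is a hypothetical name, and this is not the implementation used by publish-docs.sh.

```shell
#!/bin/sh
# Sketch: map an Antora output path to its flattened published path.
# Only a leading /docs/tika/ prefix is rewritten; other paths pass through.
flatten_docs_path() {
  printf '%s\n' "$1" | sed 's|^/docs/tika/|/docs/|'
}
```

For example, `flatten_docs_path /docs/tika/4.0.0/index.html` yields `/docs/4.0.0/index.html`, while shared assets such as `/_/css/site.css` are left unchanged.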
Publishing a Release
When releasing a new version (e.g., 4.0.0):
# 1. Tag the release as usual
git tag v4.0.0
# 2. Create docs branch from tag
git checkout -b docs/4.0.0 v4.0.0
# 3. Update version in antora.yml
sed -i "s/4\.0\.0-SNAPSHOT/4.0.0/" docs/antora.yml
git commit -am "Set docs version to 4.0.0"
git push origin docs/4.0.0
# 4. Build and publish
./mvnw package -Papache-release -pl :tika-docs -DskipTests
cd docs
./publish-docs.sh /path/to/tika-site/publish
# 5. Commit to SVN
cd /path/to/tika-site
svn add publish/docs publish/_ --force
svn commit -m "Publish 4.0.0 docs"
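The version substitution in step 3 can be made a little more defensive. The sketch below escapes the dots so they match literally and uses the `sed -i.bak` form, which both GNU and BSD sed accept. `set_docs_version` is a hypothetical helper, not an existing script, and it assumes the old version string appears verbatim in the file.

```shell
#!/bin/sh
# Sketch: replace version string $2 with $3 in file $1.
# set_docs_version is a hypothetical helper, not an existing script.
set_docs_version() {
  file=$1
  # Escape dots so "4.0.0" matches literally rather than as "any character".
  from_re=$(printf '%s' "$2" | sed 's/\./\\./g')
  # -i.bak is accepted by both GNU and BSD sed; drop the backup afterwards.
  sed -i.bak "s/$from_re/$3/" "$file" && rm -f "$file.bak"
}
```

A release would then call `set_docs_version docs/antora.yml 4.0.0-SNAPSHOT 4.0.0` before committing.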
Updating Released Documentation
To fix or update documentation for a released version:
# 1. Checkout the docs branch
git checkout docs/4.0.0
# 2. Make changes (docs or config examples)
# Edit files as needed...
# 3. Commit and push
git commit -am "Fix PDF parser example"
git push origin docs/4.0.0
# 4. Rebuild and republish
./mvnw package -Papache-release -pl :tika-docs -DskipTests
cd docs
./publish-docs.sh /path/to/tika-site/publish
# 5. Commit to SVN (add any newly generated files first)
cd /path/to/tika-site
svn add publish/docs publish/_ --force
svn commit -m "Update 4.0.0 docs"
Site Structure
The Antora configuration files:
- docs/antora.yml - Component descriptor (name, version, navigation)
- docs/antora-playbook.yml - Site-wide configuration (sources, UI, extensions)
- docs/modules/ROOT/nav.adoc - Navigation sidebar structure
- docs/modules/ROOT/pages/ - Documentation pages
- docs/modules/ROOT/examples/ - Symlinks to config examples
- docs/supplemental-ui/ - Custom UI components (header, footer, search)