Pipes Reporters
Reporters track the processing status of each document in the pipeline. They record whether a parse succeeded, failed, or timed out, along with timing information.
File System Reporter (file-system-reporter)
Writes a JSON status file that is updated periodically.
Module: tika-pipes-file-system
| Field | Default | Description |
|---|---|---|
| | required | Path to the JSON status file. |
| | | How often to update the status file (milliseconds). |
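
The periodic-update behavior can be sketched roughly as follows. This is a minimal Python illustration of the idea, not Tika's Java implementation: the class name `StatusFileReporter`, its parameters, the status strings, and the `{"counts": ...}` layout of the JSON file are all assumptions made for the example.

```python
import json
import os
import tempfile
import time
from collections import Counter


class StatusFileReporter:
    """Hypothetical reporter: counts outcomes and rewrites a JSON status file."""

    def __init__(self, status_path, update_interval_ms):
        self.status_path = status_path
        self.update_interval_s = update_interval_ms / 1000.0
        self.counts = Counter()
        self._last_write = time.monotonic()

    def report(self, doc_id, status):
        # Only aggregate counts are kept in this sketch; doc_id is not stored.
        self.counts[status] += 1
        if time.monotonic() - self._last_write >= self.update_interval_s:
            self.flush()

    def flush(self):
        # Write to a temp file in the same directory, then rename it into
        # place so a reader never sees a half-written status file.
        directory = os.path.dirname(os.path.abspath(self.status_path))
        fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump({"counts": dict(self.counts)}, f)
        os.replace(tmp_path, self.status_path)
        self._last_write = time.monotonic()
```

Writing to a temporary file and renaming it is one way to keep the status file readable while the pipeline is still running; whether the actual reporter does this is not stated here.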
JDBC Reporter (jdbc-reporter)
Writes per-document status to a SQL database table.
Module: tika-pipes-jdbc
| Field | Default | Description |
|---|---|---|
| | required | JDBC connection string. |
| | required | Table name for status records. |
| | | Auto-create the table if it does not exist. |
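
As a rough illustration of per-document status rows with optional table auto-creation, here is a small Python sketch that uses SQLite in place of a JDBC connection. The function name `open_reporter`, the column names (`id`, `status`, `elapsed_ms`), and the status strings are assumptions for the example, not the schema used by `tika-pipes-jdbc`.

```python
import sqlite3


def open_reporter(conn, table, create_table=True):
    """Return a report(doc_id, status, elapsed_ms) function bound to `table`."""
    # Table names cannot be passed as bound parameters, so validate the
    # name before interpolating it into SQL.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")

    if create_table:
        # Hypothetical schema; created only if the table is missing.
        conn.execute(
            f"CREATE TABLE IF NOT EXISTS {table} "
            "(id TEXT NOT NULL, status TEXT NOT NULL, elapsed_ms INTEGER)"
        )

    def report(doc_id, status, elapsed_ms):
        # One row per reported document.
        conn.execute(
            f"INSERT INTO {table} (id, status, elapsed_ms) VALUES (?, ?, ?)",
            (doc_id, status, elapsed_ms),
        )
        conn.commit()

    return report
```

Usage, again with made-up values: `report = open_reporter(sqlite3.connect(":memory:"), "tika_status")`, then `report("a.pdf", "PARSE_SUCCESS", 120)`.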
Elasticsearch Reporter (es-pipes-reporter)
Writes per-document parse status back into the Elasticsearch index via upsert.
Module: tika-pipes-es
| Field | Default | Description |
|---|---|---|
| | required | Elasticsearch endpoint (including index). |
| | | Prefix for status fields. |
| | | Include routing in upsert requests. |
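
To make the upsert concrete, the sketch below builds the URL and body for an Elasticsearch Update API request (`POST <index>/_update/<id>` with `doc_as_upsert`). It only constructs the request and does not send it. The function name `build_upsert`, the status field names, and the example prefix are assumptions; the fields the reporter actually writes are not specified here.

```python
import json
from urllib.parse import quote, urlencode


def build_upsert(es_url, doc_id, status, elapsed_ms, prefix="", routing=None):
    """Build (url, json_body) for an upsert of hypothetical status fields.

    es_url is the endpoint including the index, e.g. http://host:9200/my-index
    """
    url = f"{es_url.rstrip('/')}/_update/{quote(doc_id, safe='')}"
    if routing is not None:
        # Routing is passed as a query parameter on the update request.
        url += "?" + urlencode({"routing": routing})

    body = {
        # Prefixed field names keep status fields apart from parsed content.
        "doc": {
            f"{prefix}status": status,
            f"{prefix}elapsed_ms": elapsed_ms,
        },
        # Create the document if it does not exist yet, otherwise merge.
        "doc_as_upsert": True,
    }
    return url, json.dumps(body)
```

Because the update merges into an existing document, a prefix helps avoid collisions between status fields and fields written by the emitter.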