Class JDBCPipesReporter
java.lang.Object
org.apache.tika.pipes.PipesReporter
org.apache.tika.pipes.PipesReporterBase
org.apache.tika.pipes.reporters.jdbc.JDBCPipesReporter
- All Implemented Interfaces:
Closeable, AutoCloseable, Initializable
This is an initial draft of a JDBCPipesReporter. By default, it drops the
tika_status table if it exists and recreates it with each run; call
setCreateTable(false) to keep an existing table. If you'd like different
behavior, please open a ticket on our JIRA!
-
Field Summary
Fields inherited from class org.apache.tika.pipes.PipesReporter
NO_OP_REPORTER
-
Constructor Summary
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| void | checkInitialization(InitializableProblemHandler problemHandler) | |
| void | close() | No-op implementation. |
| void | error | This is called if the process has crashed. |
| void | error | This is called if the process has crashed. |
| void | initialize(Map<String, Param> params) | |
| boolean | isCreateTable() | |
| void | report(FetchEmitTuple t, PipesResult result, long elapsed) | |
| void | setCacheSize(int cacheSize) | Commit the reports if the cache is greater than or equal to this size. |
| void | setConnection(String connection) | |
| void | setCreateTable(boolean createTable) | The default is true. |
| void | setPostConnection(String postConnection) | This SQL will be called immediately after the connection is made. |
| void | setReportSql(String reportSql) | This is the SQL for the prepared statement to execute to store the report record. The default is: insert into tika_status (id, status, timestamp) values (?,?,?) |
| void | setReportVariables(List<String> variables) | ADVANCED: This is used to set the variables in the prepared statement for the report. |
| void | setReportWithinMs(long reportWithinMs) | Commit the reports if the amount of time elapsed since the last report commit exceeds this value. |
| void | setTableName(String tableName) | The default is TABLE_NAME. |
Methods inherited from class org.apache.tika.pipes.PipesReporterBase
accept, setExcludes, setIncludes
Methods inherited from class org.apache.tika.pipes.PipesReporter
report, supportsTotalCount
-
Field Details
-
TABLE_NAME
-
Constructor Details
-
JDBCPipesReporter
public JDBCPipesReporter()
-
-
Method Details
-
initialize
public void initialize(Map<String, Param> params) throws TikaConfigException
- Specified by:
initialize in interface Initializable
- Overrides:
initialize in class PipesReporterBase
- Parameters:
params - params to use for initialization
- Throws:
TikaConfigException
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Overrides:
checkInitialization in class PipesReporterBase
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-
setConnection
-
setCacheSize
Commit the reports if the cache is greater than or equal to this size. Default is DEFAULT_CACHE_SIZE. The reports will be committed if the cache size triggers reporting or if the amount of time since the last report commit (reportWithinMs) triggers reporting.
- Parameters:
cacheSize
-
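The two commit triggers described above (cache size and elapsed time) can be sketched as a small buffer. This is a minimal illustration only: ReportBuffer and its methods are hypothetical names, not part of Tika, and the sketch assumes the time check runs when a report arrives, which may differ from how the reporter actually schedules commits.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Tika: buffers reports and decides when to commit. */
final class ReportBuffer {
    private final int cacheSize;
    private final long reportWithinMs;
    private final List<String> pending = new ArrayList<>();
    private long lastCommitMs;
    private int commits = 0;

    ReportBuffer(int cacheSize, long reportWithinMs, long nowMs) {
        this.cacheSize = cacheSize;
        this.reportWithinMs = reportWithinMs;
        this.lastCommitMs = nowMs;
    }

    /** Adds one report, then commits if either trigger fires. Returns true if a commit happened. */
    boolean add(String report, long nowMs) {
        pending.add(report);
        // Size trigger: cache is "greater than or equal to" the configured size.
        boolean sizeTrigger = pending.size() >= cacheSize;
        // Time trigger: elapsed time since the last commit "exceeds" the configured value.
        boolean timeTrigger = nowMs - lastCommitMs > reportWithinMs;
        if (sizeTrigger || timeTrigger) {
            commit(nowMs);
            return true;
        }
        return false;
    }

    private void commit(long nowMs) {
        // A real reporter would execute its batched prepared statement here.
        pending.clear();
        lastCommitMs = nowMs;
        commits++;
    }

    int pendingCount() {
        return pending.size();
    }

    int commitCount() {
        return commits;
    }
}
```

Passing the clock value in explicitly keeps the sketch deterministic; with a cache size of 3, the third add commits regardless of how little time has passed.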
-
setCreateTable
The default is true. In a distributed setting with multiple servers, this should be set to false, and you'll need to set up the table on your own. NOTE: The default behavior is to drop the table if it exists and then create it. Make sure to set this to false if you do not want to drop the table.
- Parameters:
createTable
-
-
setTableName
The default is TABLE_NAME.
- Parameters:
tableName
-
-
setReportSql
This is the SQL for the prepared statement to execute to store the report record. The default is:
insert into tika_status (id, status, timestamp) values (?,?,?)
This can be modified for specific dialects of SQL or to run an upsert, merge or update instead of the default insert. Users need to coordinate this with setReportVariables(List).
- Parameters:
reportSql
-
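As an illustration of swapping the default insert for an upsert, the sketch below builds both statements and checks that the number of ? placeholders matches the number of report variables. ReportSqlCheck is a hypothetical helper, not part of Tika. The "on conflict" clause is PostgreSQL/SQLite syntax and assumes a unique constraint on id, which this page does not state; other databases need a different statement.

```java
import java.util.List;

/** Hypothetical helper, not part of Tika: builds report SQL and sanity-checks placeholders. */
final class ReportSqlCheck {

    /** The documented default insert, parameterized by table name. */
    static String defaultInsert(String tableName) {
        return "insert into " + tableName + " (id, status, timestamp) values (?,?,?)";
    }

    /**
     * An upsert variant in PostgreSQL/SQLite syntax.
     * Assumption: the table has a unique constraint (or primary key) on id.
     */
    static String upsert(String tableName) {
        return "insert into " + tableName + " (id, status, timestamp) values (?,?,?)"
                + " on conflict (id) do update set"
                + " status=excluded.status, timestamp=excluded.timestamp";
    }

    /** Naive placeholder count: also counts any '?' that appears inside a string literal. */
    static int countPlaceholders(String sql) {
        int count = 0;
        for (int i = 0; i < sql.length(); i++) {
            if (sql.charAt(i) == '?') {
                count++;
            }
        }
        return count;
    }

    /** True if the SQL has exactly one placeholder per report variable. */
    static boolean matches(String sql, List<String> variables) {
        return countPlaceholders(sql) == variables.size();
    }
}
```

Both statements here keep three placeholders in the order id, status, timestamp, so the default variable list still applies; a statement that reorders the placeholders needs a matching variable list.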
-
getTableName
-
getReportVariables
-
getReportSql
-
isCreateTable
public boolean isCreateTable()
-
setReportVariables
ADVANCED: This is used to set the variables in the prepared statement for the report. This needs to be coordinated with setReportSql(String). The available variables are "id", "status" and "timestamp". If you're modifying to an update statement like "update tika_status set status=?, timestamp=? where id=?", then the values for this would be ["status", "timestamp", "id"]. The default for the insert is ["id", "status", "timestamp"].
- Parameters:
variables
-
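The coordination between the report SQL and the variable list comes down to ordering: the n-th variable name supplies the value for the n-th ? placeholder. A minimal sketch of that mapping follows. ReportVariableBinder is a hypothetical helper, not part of Tika, and the value types used for status and timestamp are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of Tika: orders report values to match the placeholders. */
final class ReportVariableBinder {

    /**
     * Returns the values in the order given by the variable names, so that
     * element i can be bound to placeholder i + 1 of the prepared statement.
     */
    static List<Object> orderValues(List<String> variables, Map<String, Object> values) {
        List<Object> ordered = new ArrayList<>();
        for (String name : variables) {
            if (!values.containsKey(name)) {
                throw new IllegalArgumentException("Unknown report variable: " + name);
            }
            ordered.add(values.get(name));
        }
        // With a java.sql.PreparedStatement ps, binding would then be:
        //   for (int i = 0; i < ordered.size(); i++) ps.setObject(i + 1, ordered.get(i));
        return ordered;
    }
}
```

For an update statement whose placeholders are status, timestamp, id, passing the variable list ["status", "timestamp", "id"] yields the values in exactly that order, while the default list yields them in insert order.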
-
setReportWithinMs
Commit the reports if the amount of time elapsed since the last report commit exceeds this value. Default is DEFAULT_REPORT_WITHIN_MS. The reports will be committed if the cache size triggers reporting or if the amount of time since the last report commit triggers reporting.
- Parameters:
reportWithinMs
-
-
setPostConnection
This SQL will be called immediately after the connection is made. This was initially added for setting pragmas on sqlite3, but may be used for other connection configuration in other dbs. Note: This is called before the table is created if it needs to be created.
- Parameters:
postConnection
-
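The ordering noted above (post-connection SQL first, then table setup) can be sketched as a plan of statements. StartupSqlPlan is a hypothetical helper, not part of Tika. The drop and create statements and the column types are assumptions for illustration; this page does not show the DDL the reporter actually issues, and "drop table if exists" is not accepted by every database.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Tika: lists startup statements in execution order. */
final class StartupSqlPlan {

    static List<String> plan(String postConnection, boolean createTable, String tableName) {
        List<String> statements = new ArrayList<>();
        // Post-connection SQL runs first, e.g. a SQLite pragma.
        if (postConnection != null && !postConnection.trim().isEmpty()) {
            statements.add(postConnection);
        }
        // With createTable=true, the table is dropped if it exists and then created.
        if (createTable) {
            statements.add("drop table if exists " + tableName);
            // Column types are assumed for this sketch only.
            statements.add("create table " + tableName
                    + " (id varchar(1024), status varchar(32), timestamp timestamp)");
        }
        return statements;
    }
}
```

For example, plan("PRAGMA journal_mode=WAL", true, "tika_status") returns the pragma followed by the drop and create statements, while createTable=false with no post-connection SQL returns an empty list and leaves an existing table alone.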
-
report
- Specified by:
report in class PipesReporter
-
error
Description copied from class: PipesReporter
This is called if the process has crashed. Implementers should not rely on close() to be called after this.
- Specified by:
error in class PipesReporter
-
error
Description copied from class: PipesReporter
This is called if the process has crashed. Implementers should not rely on close() to be called after this.
- Specified by:
error in class PipesReporter
-
close
Description copied from class: PipesReporter
No-op implementation. Override for custom behavior.
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Overrides:
close in class PipesReporter
- Throws:
IOException
-