public abstract class FileResourceConsumer extends Object implements Callable<IFileProcessorFutureResult>
Modifier and Type | Field and Description |
---|---|
static String | ELAPSED_MILLIS |
static String | IO_IS |
static String | IO_OS |
protected static org.slf4j.Logger | LOG |
static String | OOM |
static String | PARSE_ERR |
static String | PARSE_EX |
static String | TIMED_OUT |
Constructor and Description |
---|
FileResourceConsumer(ArrayBlockingQueue<FileResource> fileQueue) |
Modifier and Type | Method and Description |
---|---|
IFileProcessorFutureResult | call() |
org.apache.tika.batch.FileStarted | checkForTimedOutMillis(long staleThresholdMillis): Checks to see if the currentFile being processed (if there is one) should be timed out (still being worked on after staleThresholdMillis). |
protected void | close(Closeable closeable) |
protected void | flushAndClose(Closeable closeable) |
org.apache.tika.batch.FileStarted | getCurrentFile(): Returns the name and start time of a file that is currently being processed. |
int | getNumHandledExceptions() |
int | getNumResourcesConsumed() |
protected String | getXMLifiedLogMsg(String type, String resourceId, String... attrs) |
protected String | getXMLifiedLogMsg(String type, String resourceId, Throwable t, String... attrs): Use this for structured output that captures resourceId and other attributes. |
protected void | incrementHandledExceptions(): Make sure to call this appropriately! |
boolean | isStillActive(): Returns whether the consumer could still process a file or is still processing one (ACTIVELY_CONSUMING or ASKED_TO_SHUTDOWN). |
protected void | parse(String resourceId, Parser parser, InputStream is, ContentHandler handler, Metadata m, ParseContext parseContext): Utility method to handle logging equivalently among all implementing classes. |
void | pleaseShutdown(): This politely asks the consumer to shut down. |
abstract boolean | processFileResource(FileResource fileResource): Main piece of code that needs to be implemented. |
protected static final org.slf4j.Logger LOG
public static String TIMED_OUT
public static String OOM
public static String IO_IS
public static String IO_OS
public static String PARSE_ERR
public static String PARSE_EX
public static String ELAPSED_MILLIS
public FileResourceConsumer(ArrayBlockingQueue<FileResource> fileQueue)
public IFileProcessorFutureResult call()
Specified by: call in interface Callable<IFileProcessorFutureResult>
public abstract boolean processFileResource(FileResource fileResource)
Make sure to call incrementHandledExceptions() appropriately in your implementation of this method.
fileResource - resource to process
protected void incrementHandledExceptions()
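The counting contract between processFileResource and incrementHandledExceptions() can be illustrated with a minimal, self-contained sketch. The classes below are hypothetical stand-ins that do not use the Tika types, and the driver loop (counting a resource as consumed only when processing returns true) is an assumption about how a caller might use the boolean result, not a statement of what call() does.

```java
import java.io.IOException;
import java.util.List;

// Hypothetical stand-in for the counting contract; not Tika's FileResourceConsumer.
abstract class SketchConsumer {
    private int numResourcesConsumed = 0;
    private int numHandledExceptions = 0;

    /** Subclasses do the real work; returns true if the resource was processed. */
    abstract boolean processResource(String resourceId);

    /** Subclasses call this whenever they catch and handle an exception themselves. */
    protected void incrementHandledExceptions() {
        numHandledExceptions++;
    }

    int getNumResourcesConsumed() {
        return numResourcesConsumed;
    }

    int getNumHandledExceptions() {
        return numHandledExceptions;
    }

    /** Assumed driver loop: a resource counts as consumed only on a true result. */
    void consumeAll(List<String> resourceIds) {
        for (String id : resourceIds) {
            if (processResource(id)) {
                numResourcesConsumed++;
            }
        }
    }
}

// Hypothetical subclass: ids starting with "bad" simulate an I/O failure.
class FailOnBadConsumer extends SketchConsumer {
    @Override
    boolean processResource(String resourceId) {
        try {
            if (resourceId.startsWith("bad")) {
                throw new IOException("cannot read " + resourceId);
            }
            return true;
        } catch (IOException e) {
            // The exception is handled here, so this subclass must count it.
            incrementHandledExceptions();
            return false;
        }
    }
}
```

With the ids "a.txt", "bad.doc" and "b.pdf", this sketch reports two consumed resources and one handled exception.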
public boolean isStillActive()
public void pleaseShutdown()
This offers another method for politely requesting that a FileResourceConsumer stop processing, besides passing it a PoisonFileResource.
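The two polite shutdown routes, a poison element on the queue and a pleaseShutdown() request, can be sketched as follows. This is a simplified stand-in under stated assumptions: the queue holds plain strings, POISON is a hypothetical marker playing the role of PoisonFileResource, and the polling loop is illustrative rather than Tika's actual call() logic.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in; only pleaseShutdown/isStillActive mirror names from this page.
class PoliteConsumer {
    static final String POISON = "__POISON__"; // plays the role of a poison resource

    private final ArrayBlockingQueue<String> queue;
    private volatile boolean askedToShutdown = false;
    private volatile boolean done = false;
    private int consumed = 0;

    PoliteConsumer(ArrayBlockingQueue<String> queue) {
        this.queue = queue;
    }

    /** Route 2: ask the consumer to stop before it takes another item. */
    void pleaseShutdown() {
        askedToShutdown = true;
    }

    boolean isStillActive() {
        return !done;
    }

    int getConsumed() {
        return consumed;
    }

    void run() {
        try {
            while (!askedToShutdown) {
                String next;
                try {
                    next = queue.poll(10, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt
                    break;
                }
                if (next == null) {
                    continue; // nothing queued yet; re-check the shutdown flag
                }
                if (POISON.equals(next)) {
                    break; // Route 1: poison element ends consumption
                }
                consumed++;
            }
        } finally {
            done = true;
        }
    }
}
```

In this sketch, items queued behind the poison element are left untouched, and a shutdown request made before run() means nothing is consumed at all.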
public org.apache.tika.batch.FileStarted getCurrentFile()
public int getNumResourcesConsumed()
public int getNumHandledExceptions()
public org.apache.tika.batch.FileStarted checkForTimedOutMillis(long staleThresholdMillis)
If the consumer should be timed out, this will return the currentFile and set the state to TIMED_OUT.
If the consumer was already timed out earlier, is not processing a file, or has been working on a file for less than staleThresholdMillis, then this will return null.
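The return rules above can be sketched with a small stand-in class. Everything here is hypothetical: StaleChecker and Started are illustrative names, the clock is passed in explicitly to keep the example deterministic, and treating an elapsed time exactly equal to the threshold as not yet stale is an assumption the description does not settle.

```java
// Hypothetical stand-in for the timeout rules; not Tika's implementation.
class StaleChecker {
    /** Illustrative record of what is being processed and since when. */
    static final class Started {
        final String resourceId;
        final long startedMillis;

        Started(String resourceId, long startedMillis) {
            this.resourceId = resourceId;
            this.startedMillis = startedMillis;
        }
    }

    private Started current;          // null while idle
    private boolean timedOut = false; // set once, on the first stale check

    synchronized void start(String resourceId, long nowMillis) {
        current = new Started(resourceId, nowMillis);
    }

    synchronized void finish() {
        current = null;
    }

    /** Returns the current file if it has gone stale, otherwise null. */
    synchronized Started checkForTimedOut(long staleThresholdMillis, long nowMillis) {
        if (timedOut || current == null) {
            return null; // already timed out earlier, or not processing anything
        }
        long elapsed = nowMillis - current.startedMillis;
        if (elapsed > staleThresholdMillis) { // assumption: equality is not stale
            timedOut = true;
            return current;
        }
        return null; // still within the threshold
    }
}
```

A monitoring thread could poll such a check periodically and act only on a non-null result, which is reported at most once per consumer in this sketch.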
staleThresholdMillis - threshold to determine whether the consumer has gone stale.
protected String getXMLifiedLogMsg(String type, String resourceId, String... attrs)
protected String getXMLifiedLogMsg(String type, String resourceId, Throwable t, String... attrs)
type - entity name for exception
resourceId - resourceId string
t - throwable can be null
attrs - (array of key0, value0, key1, value1, etc.)
protected void close(Closeable closeable)
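For getXMLifiedLogMsg, the attrs convention of alternating keys and values can be sketched as below. This is an illustrative rendering only: the element layout, the escaping, the rejection of odd-length arrays, and the names XmlLogSketch, xmlify and escape are all assumptions, and the output format of the real method may differ.

```java
// Hypothetical sketch of an alternating key/value varargs convention.
final class XmlLogSketch {
    private XmlLogSketch() {
    }

    /** Minimal escaping for attribute values. */
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    /** Renders type as the element name, then resourceId, then key/value pairs. */
    static String xmlify(String type, String resourceId, String... attrs) {
        if (attrs.length % 2 != 0) {
            // Assumption: an unmatched key is treated as a caller error.
            throw new IllegalArgumentException("attrs must be key0, value0, key1, value1, ...");
        }
        StringBuilder sb = new StringBuilder();
        sb.append('<').append(type);
        sb.append(" resourceId=\"").append(escape(resourceId)).append('"');
        for (int i = 0; i < attrs.length; i += 2) {
            sb.append(' ').append(attrs[i])
              .append("=\"").append(escape(attrs[i + 1])).append('"');
        }
        sb.append("/>");
        return sb.toString();
    }
}
```

For example, `XmlLogSketch.xmlify("timed_out", "a&b.pdf", "elapsedMS", "1200")` yields `<timed_out resourceId="a&amp;b.pdf" elapsedMS="1200"/>` in this sketch; the attribute name elapsedMS is made up for the illustration.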
protected void flushAndClose(Closeable closeable)
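Neither close nor flushAndClose declares a checked exception, which suggests that I/O failures are handled internally. A minimal sketch of that pattern follows, assuming failures are reported rather than rethrown; QuietCloser is a hypothetical helper, it writes to System.err where the real class would presumably use LOG, and the boolean return values are added only to make the behavior observable.

```java
import java.io.Closeable;
import java.io.Flushable;
import java.io.IOException;

// Hypothetical helper illustrating quiet close and flush-then-close.
final class QuietCloser {
    private QuietCloser() {
    }

    /** Closes the resource, reporting (not rethrowing) any IOException. */
    static boolean close(Closeable closeable) {
        if (closeable == null) {
            return true; // nothing to do
        }
        try {
            closeable.close();
            return true;
        } catch (IOException e) {
            System.err.println("close failed: " + e.getMessage());
            return false;
        }
    }

    /** Flushes first when the resource supports it, then always closes. */
    static boolean flushAndClose(Closeable closeable) {
        if (closeable == null) {
            return true;
        }
        boolean flushed = true;
        if (closeable instanceof Flushable) {
            try {
                ((Flushable) closeable).flush();
            } catch (IOException e) {
                System.err.println("flush failed: " + e.getMessage());
                flushed = false;
            }
        }
        // close is attempted even if the flush failed
        return close(closeable) && flushed;
    }
}
```

The design choice in this sketch is that cleanup never masks the outcome of the processing step: a failed flush or close is reported, and the close is still attempted.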
protected void parse(String resourceId, Parser parser, InputStream is, ContentHandler handler, Metadata m, ParseContext parseContext) throws Throwable
resourceId - resourceId
parser - parser to use
is - inputStream (will be closed by this method!)
handler - handler for the content
m - metadata
parseContext - parse context
Throwable - logs and then throws whatever was thrown (if anything)
Copyright © 2007–2019 The Apache Software Foundation. All rights reserved.