Package org.apache.tika.eval.app
Class ExtractProfiler
java.lang.Object
  org.apache.tika.batch.FileResourceConsumer
    org.apache.tika.eval.app.AbstractProfiler
      org.apache.tika.eval.app.ExtractProfiler

All Implemented Interfaces:
  Callable<IFileProcessorFutureResult>

 public class ExtractProfiler extends AbstractProfiler 

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.tika.eval.app.AbstractProfiler:
  AbstractProfiler.EXCEPTION_TYPE, AbstractProfiler.PARSE_ERROR_TYPE
 
Field Summary

  Modifier and Type    Field
  static TableInfo     CONTAINER_TABLE
  static TableInfo     CONTENTS_TABLE
  static TableInfo     EMBEDDED_FILE_PATH_TABLE
  static TableInfo     EXCEPTION_TABLE
  static TableInfo     EXTRACT_EXCEPTION_TABLE
  static TableInfo     PROFILE_TABLE
  static TableInfo     TAGS_TABLE

Fields inherited from class org.apache.tika.eval.app.AbstractProfiler:
  FALSE, ID, MIME_TABLE, REF_EXTRACT_EXCEPTION_TYPES, REF_PARSE_ERROR_TYPES, REF_PARSE_EXCEPTION_TYPES, TRUE, writer

Fields inherited from class org.apache.tika.batch.FileResourceConsumer:
  ELAPSED_MILLIS, IO_IS, IO_OS, OOM, PARSE_ERR, PARSE_EX, TIMED_OUT
 
Constructor Summary

  Constructor
  ExtractProfiler(ArrayBlockingQueue<FileResource> queue, Path inputDir, Path extracts, ExtractReader extractReader, IDBWriter dbWriter)

Method Summary

  Modifier and Type    Method                                           Description
  boolean              processFileResource(FileResource fileResource)   Main piece of code that needs to be implemented.
  static void          USAGE()

Methods inherited from class org.apache.tika.eval.app.AbstractProfiler:
  calcTextStats, closeWriter, getContent, getFileLength, getPathsFromExtractCrawl, getPathsFromSrcCrawl, getSourceFileLength, loadCommonTokens, setMaxContentLength, setMaxContentLengthForLangId, setMaxTokens, truncateContent, writeContentData, writeExceptionData, writeExtractException, writeProfileData

Methods inherited from class org.apache.tika.batch.FileResourceConsumer:
  call, checkForTimedOutMillis, close, flushAndClose, getCurrentFile, getNumHandledExceptions, getNumResourcesConsumed, getXMLifiedLogMsg, getXMLifiedLogMsg, incrementHandledExceptions, isStillActive, parse, pleaseShutdown
 
Field Detail

EXTRACT_EXCEPTION_TABLE
  public static TableInfo EXTRACT_EXCEPTION_TABLE

EXCEPTION_TABLE
  public static TableInfo EXCEPTION_TABLE

CONTAINER_TABLE
  public static TableInfo CONTAINER_TABLE

PROFILE_TABLE
  public static TableInfo PROFILE_TABLE

EMBEDDED_FILE_PATH_TABLE
  public static TableInfo EMBEDDED_FILE_PATH_TABLE

CONTENTS_TABLE
  public static TableInfo CONTENTS_TABLE

TAGS_TABLE
  public static TableInfo TAGS_TABLE
 
Constructor Detail

ExtractProfiler
  public ExtractProfiler(ArrayBlockingQueue<FileResource> queue, Path inputDir, Path extracts, ExtractReader extractReader, IDBWriter dbWriter)
 
Method Detail

USAGE
  public static void USAGE()

processFileResource
  public boolean processFileResource(FileResource fileResource)

  Description copied from class: FileResourceConsumer

  Main piece of code that needs to be implemented. Clients are responsible for closing streams and handling the exceptions that they'd like to handle. Unchecked throwables can be thrown past this, of course. When an unchecked throwable is thrown, this logs the error and then rethrows the exception. Clients/subclasses should make sure to catch and handle everything they can. The design goal is that the whole process should close up and shut down soon after an unchecked exception or error is thrown.

  Make sure to call FileResourceConsumer.incrementHandledExceptions() appropriately in your implementation of this method.

  Specified by:
    processFileResource in class FileResourceConsumer
  Parameters:
    fileResource - resource to process
  Returns:
    whether or not a file was successfully processed
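
The contract described for processFileResource can be illustrated with a minimal, self-contained sketch. It deliberately does not use Tika classes: ProcessContractSketch, SketchResource, SketchConsumer and ByteCountingConsumer are hypothetical stand-ins invented for this example, and only the pattern (close the streams you open, catch what you can handle, count handled exceptions, return whether processing succeeded, let unchecked throwables propagate) is taken from the description above.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins only; none of these types are Tika classes.
class ProcessContractSketch {

    /** Stand-in for a resource that can open an input stream. */
    interface SketchResource {
        InputStream openInputStream() throws IOException;
    }

    /** Stand-in for the consumer base class: it only counts handled exceptions. */
    abstract static class SketchConsumer {
        private int numHandledExceptions = 0;

        protected void incrementHandledExceptions() {
            numHandledExceptions++;
        }

        int getNumHandledExceptions() {
            return numHandledExceptions;
        }

        abstract boolean processFileResource(SketchResource resource);
    }

    /** Counts bytes; handles IOException itself and lets unchecked throwables propagate. */
    static class ByteCountingConsumer extends SketchConsumer {
        long bytesSeen = 0;

        @Override
        boolean processFileResource(SketchResource resource) {
            // try-with-resources: the implementation closes the stream it opened
            try (InputStream is = resource.openInputStream()) {
                byte[] buffer = new byte[4096];
                int n;
                while ((n = is.read(buffer)) != -1) {
                    bytesSeen += n;
                }
                return true;
            } catch (IOException e) {
                // A handled failure: record it and report that processing did not succeed.
                incrementHandledExceptions();
                return false;
            }
            // RuntimeException and Error are deliberately not caught here.
        }
    }

    public static void main(String[] args) {
        ByteCountingConsumer consumer = new ByteCountingConsumer();
        SketchResource ok =
                () -> new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8));
        SketchResource broken = () -> {
            throw new IOException("simulated read failure");
        };
        System.out.println(consumer.processFileResource(ok));
        System.out.println(consumer.processFileResource(broken));
        System.out.println(consumer.getNumHandledExceptions());
    }
}
```

In a real subclass, the same shape would presumably apply to processFileResource(FileResource) with FileResourceConsumer.incrementHandledExceptions() in the catch block; how ExtractProfiler itself implements the method is not shown on this page.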
 
 