public class ExtractProfiler extends AbstractProfiler
Nested classes/interfaces inherited from class AbstractProfiler: AbstractProfiler.EXCEPTION_TYPE, AbstractProfiler.PARSE_ERROR_TYPE
Modifier and Type | Field and Description |
---|---|
static TableInfo | CONTAINER_TABLE |
static TableInfo | CONTENTS_TABLE |
static TableInfo | EMBEDDED_FILE_PATH_TABLE |
static TableInfo | EXCEPTION_TABLE |
static TableInfo | EXTRACT_EXCEPTION_TABLE |
static TableInfo | PROFILE_TABLE |
static TableInfo | TAGS_TABLE |
Fields inherited from class AbstractProfiler: FALSE, ID, MIME_TABLE, REF_EXTRACT_EXCEPTION_TYPES, REF_PARSE_ERROR_TYPES, REF_PARSE_EXCEPTION_TYPES, TRUE, writer
Fields inherited from class FileResourceConsumer: ELAPSED_MILLIS, IO_IS, IO_OS, OOM, PARSE_ERR, PARSE_EX, TIMED_OUT
Constructor and Description |
---|
ExtractProfiler(ArrayBlockingQueue<FileResource> queue, Path inputDir, Path extracts, ExtractReader extractReader, IDBWriter dbWriter) |
Modifier and Type | Method and Description |
---|---|
boolean | processFileResource(FileResource fileResource) Main piece of code that needs to be implemented. |
static void | USAGE() |
Methods inherited from class AbstractProfiler: calcTextStats, closeWriter, getContent, getFileLength, getPathsFromExtractCrawl, getPathsFromSrcCrawl, getSourceFileLength, loadCommonTokens, setMaxContentLength, setMaxContentLengthForLangId, setMaxTokens, truncateContent, writeContentData, writeExceptionData, writeExtractException, writeProfileData
Methods inherited from class FileResourceConsumer: call, checkForTimedOutMillis, close, flushAndClose, getCurrentFile, getNumHandledExceptions, getNumResourcesConsumed, getXMLifiedLogMsg, getXMLifiedLogMsg, incrementHandledExceptions, isStillActive, parse, pleaseShutdown
public static TableInfo EXTRACT_EXCEPTION_TABLE
public static TableInfo EXCEPTION_TABLE
public static TableInfo CONTAINER_TABLE
public static TableInfo PROFILE_TABLE
public static TableInfo EMBEDDED_FILE_PATH_TABLE
public static TableInfo CONTENTS_TABLE
public static TableInfo TAGS_TABLE
public ExtractProfiler(ArrayBlockingQueue<FileResource> queue, Path inputDir, Path extracts, ExtractReader extractReader, IDBWriter dbWriter)
public static void USAGE()
public boolean processFileResource(FileResource fileResource)
Description copied from class: FileResourceConsumer
Main piece of code that needs to be implemented. Make sure to call FileResourceConsumer.incrementHandledExceptions() appropriately in your implementation of this method.
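The contract described here (do the per-resource work, count any exception that is caught and handled, report the outcome as a boolean) can be illustrated with a minimal, self-contained sketch. It deliberately does not use the Tika classes: `SketchConsumer`, `SketchProfiler`, `drain()`, `getNumSucceeded()`, the `String` resource type and the `.bad` failure rule are all hypothetical stand-ins, and treating `true` as "processed successfully" is an assumption made for the illustration only.

```java
import java.util.concurrent.ArrayBlockingQueue;

public class ProcessFileResourceSketch {

    /** Hypothetical stand-in for a queue-driven consumer base class; NOT the Tika class. */
    abstract static class SketchConsumer {
        private final ArrayBlockingQueue<String> queue;
        private int handledExceptions = 0;
        private int succeeded = 0;

        SketchConsumer(ArrayBlockingQueue<String> queue) {
            this.queue = queue;
        }

        /** The method subclasses must implement; true means success (sketch assumption). */
        abstract boolean processFileResource(String fileResource);

        /** Subclasses call this whenever they catch and handle an exception. */
        protected void incrementHandledExceptions() {
            handledExceptions++;
        }

        int getNumHandledExceptions() {
            return handledExceptions;
        }

        int getNumSucceeded() {
            return succeeded;
        }

        /** Polls the queue until it is empty, processing one resource per item. */
        void drain() {
            String next;
            while ((next = queue.poll()) != null) {
                if (processFileResource(next)) {
                    succeeded++;
                }
            }
        }
    }

    /** Hypothetical subclass showing where incrementHandledExceptions() is called. */
    static class SketchProfiler extends SketchConsumer {
        SketchProfiler(ArrayBlockingQueue<String> queue) {
            super(queue);
        }

        @Override
        boolean processFileResource(String fileResource) {
            try {
                // Illustrative failure rule: names ending in ".bad" cannot be read.
                if (fileResource.endsWith(".bad")) {
                    throw new IllegalStateException("cannot read extract: " + fileResource);
                }
                return true;
            } catch (IllegalStateException e) {
                // The exception is handled here, so it is counted here.
                incrementHandledExceptions();
                return false;
            }
        }
    }

    /** Runs the sketch and returns {succeeded, handledExceptions}. */
    public static int[] run(String... names) {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(Math.max(1, names.length));
        for (String name : names) {
            queue.add(name);
        }
        SketchProfiler profiler = new SketchProfiler(queue);
        profiler.drain();
        return new int[] {profiler.getNumSucceeded(), profiler.getNumHandledExceptions()};
    }

    public static void main(String[] args) {
        int[] counts = run("a.json", "b.bad", "c.json");
        // prints "succeeded=2, handledExceptions=1"
        System.out.println("succeeded=" + counts[0] + ", handledExceptions=" + counts[1]);
    }
}
```

Counting inside the `catch` block keeps the tally tied to the point where the exception is actually swallowed, so a caller that only sees the boolean result can still learn how many failures were absorbed.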
Specified by: processFileResource in class FileResourceConsumer
Parameters: fileResource - resource to process

Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.