public class ExtractComparer extends AbstractProfiler
Nested classes inherited from class AbstractProfiler: AbstractProfiler.EXCEPTION_TYPE, AbstractProfiler.PARSE_ERROR_TYPE
Modifier and Type | Field and Description |
---|---|
static TableInfo | COMPARISON_CONTAINERS |
static TableInfo | CONTENT_COMPARISONS |
static TableInfo | CONTENTS_TABLE_A |
static TableInfo | CONTENTS_TABLE_B |
static TableInfo | EMBEDDED_FILE_PATH_TABLE_A |
static TableInfo | EMBEDDED_FILE_PATH_TABLE_B |
static TableInfo | EXCEPTION_TABLE_A |
static TableInfo | EXCEPTION_TABLE_B |
static TableInfo | EXTRACT_EXCEPTION_TABLE_A |
static TableInfo | EXTRACT_EXCEPTION_TABLE_B |
static TableInfo | PROFILES_A |
static TableInfo | PROFILES_B |
static TableInfo | REF_PAIR_NAMES |
static TableInfo | TAGS_TABLE_A |
static TableInfo | TAGS_TABLE_B |
Fields inherited from class AbstractProfiler: FALSE, ID, MIME_TABLE, REF_EXTRACT_EXCEPTION_TYPES, REF_PARSE_ERROR_TYPES, REF_PARSE_EXCEPTION_TYPES, TRUE, writer
Fields inherited from class FileResourceConsumer: ELAPSED_MILLIS, IO_IS, IO_OS, OOM, PARSE_ERR, PARSE_EX, TIMED_OUT
Constructor and Description |
---|
ExtractComparer(ArrayBlockingQueue<FileResource> queue, Path inputDir, Path extractsA, Path extractsB, ExtractReader extractReader, IDBWriter writer) |
Modifier and Type | Method and Description |
---|---|
protected void | compareFiles(org.apache.tika.eval.EvalFilePaths fpsA, org.apache.tika.eval.EvalFilePaths fpsB) |
boolean | processFileResource(FileResource fileResource) Main piece of code that needs to be implemented. |
static void | USAGE() |
Methods inherited from class AbstractProfiler: calcTextStats, closeWriter, getContent, getFileLength, getPathsFromExtractCrawl, getPathsFromSrcCrawl, getSourceFileLength, loadCommonTokens, setMaxContentLength, setMaxContentLengthForLangId, setMaxTokens, truncateContent, writeContentData, writeExceptionData, writeExtractException, writeProfileData
Methods inherited from class FileResourceConsumer: call, checkForTimedOutMillis, close, flushAndClose, getCurrentFile, getNumHandledExceptions, getNumResourcesConsumed, getXMLifiedLogMsg, getXMLifiedLogMsg, incrementHandledExceptions, isStillActive, parse, pleaseShutdown
public static TableInfo REF_PAIR_NAMES
public static TableInfo COMPARISON_CONTAINERS
public static TableInfo CONTENT_COMPARISONS
public static TableInfo PROFILES_A
public static TableInfo PROFILES_B
public static TableInfo EMBEDDED_FILE_PATH_TABLE_A
public static TableInfo EMBEDDED_FILE_PATH_TABLE_B
public static TableInfo CONTENTS_TABLE_A
public static TableInfo CONTENTS_TABLE_B
public static TableInfo TAGS_TABLE_A
public static TableInfo TAGS_TABLE_B
public static TableInfo EXCEPTION_TABLE_A
public static TableInfo EXCEPTION_TABLE_B
public static TableInfo EXTRACT_EXCEPTION_TABLE_A
public static TableInfo EXTRACT_EXCEPTION_TABLE_B
public ExtractComparer(ArrayBlockingQueue<FileResource> queue, Path inputDir, Path extractsA, Path extractsB, ExtractReader extractReader, IDBWriter writer)
public static void USAGE()
public boolean processFileResource(FileResource fileResource)
Description copied from class: FileResourceConsumer
Unchecked throwables can propagate past this method. When an unchecked throwable is thrown, this class logs the error and then rethrows it. Clients and subclasses should catch and handle everything they can.
The design goal is that the whole process should close up and shut down soon after an unchecked exception or error is thrown.
Make sure to call FileResourceConsumer.incrementHandledExceptions() appropriately in your implementation of this method.
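The contract described above can be illustrated with a minimal, self-contained sketch. It does not use the Tika classes: SketchConsumer, SketchComparer, processResource, compare and consumeAll are hypothetical stand-ins, and only the idea of counting handled exceptions while letting unchecked throwables be logged and rethrown is taken from the description.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are NOT Tika classes.
class HandledExceptionSketch {

    /** Simplified stand-in for a consumer base class. */
    abstract static class SketchConsumer {
        private int numHandledExceptions = 0;
        private int numResourcesConsumed = 0;

        /** Subclasses implement this; returns true if the resource was processed. */
        abstract boolean processResource(String resource);

        void incrementHandledExceptions() { numHandledExceptions++; }
        int getNumHandledExceptions() { return numHandledExceptions; }
        int getNumResourcesConsumed() { return numResourcesConsumed; }

        /** Logs and rethrows unchecked throwables so the whole run stops quickly. */
        void consumeAll(List<String> resources) {
            for (String resource : resources) {
                try {
                    if (processResource(resource)) {
                        numResourcesConsumed++;
                    }
                } catch (RuntimeException | Error t) {
                    System.err.println("Unchecked throwable on '" + resource + "': " + t);
                    throw t;
                }
            }
        }
    }

    /** Subclass that handles what it can and records each handled exception. */
    static class SketchComparer extends SketchConsumer {
        @Override
        boolean processResource(String resource) {
            try {
                compare(resource);
                return true;
            } catch (IOException e) {
                // Checked exception: handle it locally and count it.
                incrementHandledExceptions();
                return false;
            }
        }

        /** Assumed behavior: names starting with "bad" fail with a checked exception. */
        void compare(String resource) throws IOException {
            if (resource.isEmpty()) {
                throw new IllegalStateException("empty resource name"); // unchecked
            }
            if (resource.startsWith("bad")) {
                throw new IOException("cannot read " + resource);
            }
        }
    }

    public static void main(String[] args) {
        SketchComparer comparer = new SketchComparer();
        comparer.consumeAll(Arrays.asList("a.json", "bad.json", "b.json"));
        System.out.println("consumed=" + comparer.getNumResourcesConsumed()
                + " handled=" + comparer.getNumHandledExceptions());
    }
}
```

In this sketch the checked IOException is absorbed and counted, while an IllegalStateException would be logged by consumeAll and rethrown to the caller.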
Specified by: processFileResource in class FileResourceConsumer
Parameters: fileResource - resource to process
protected void compareFiles(org.apache.tika.eval.EvalFilePaths fpsA, org.apache.tika.eval.EvalFilePaths fpsB) throws IOException
Throws: IOException
Copyright © 2007–2020 The Apache Software Foundation. All rights reserved.