public class FSDirectoryCrawler extends FileResourceCrawler
Modifier and Type | Class and Description |
---|---|
static class | FSDirectoryCrawler.CRAWL_ORDER |
Fields inherited from class FileResourceCrawler: ADDED, LOG, SKIPPED, STOP_NOW
Constructor and Description |
---|
FSDirectoryCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, FSDirectoryCrawler.CRAWL_ORDER crawlOrder) |
FSDirectoryCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, Path startDirectory, FSDirectoryCrawler.CRAWL_ORDER crawlOrder) |
Modifier and Type | Method and Description |
---|---|
void | handleFirstFileInDirectory(Path f): Override this if you have any special handling for the first actual file that the crawler comes across in a directory. |
void | start(): Implement this to control the addition of FileResources. |
Methods inherited from class FileResourceCrawler: call, getAdded, getConsidered, isActive, isQueueEmpty, select, setDocumentSelector, setMaxConsecWaitInMillis, setMaxFilesToAdd, setMaxFilesToConsider, shutDownNoPoison, tryToAdd, wasTimedOut
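The `handleFirstFileInDirectory(Path f)` hook summarized above can be illustrated with a small, JDK-only sketch. It is not Tika's implementation: `FirstFileHookSketch`, `SketchCrawler`, `crawl`, and `demo` are hypothetical names, and the sorted traversal is an assumption made so the result is deterministic. It only shows the general pattern of a crawler calling an overridable hook once per directory, on the first regular file it finds there.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class FirstFileHookSketch {

    /** Hypothetical stand-in for a directory crawler; not a Tika class. */
    static class SketchCrawler {
        final List<Path> visited = new ArrayList<>();

        /** Hook with an empty default body; subclasses may override it. */
        void handleFirstFileInDirectory(Path f) {
        }

        /** Visits the files of dir in sorted order, then recurses into subdirectories. */
        void crawl(Path dir) throws IOException {
            List<Path> files = new ArrayList<>();
            List<Path> subDirs = new ArrayList<>();
            try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
                for (Path p : stream) {
                    if (Files.isDirectory(p)) {
                        subDirs.add(p);
                    } else {
                        files.add(p);
                    }
                }
            }
            Collections.sort(files);
            Collections.sort(subDirs);

            boolean first = true;
            for (Path f : files) {
                if (first) {
                    // Called once per directory, before the file is recorded.
                    handleFirstFileInDirectory(f);
                    first = false;
                }
                visited.add(f);
            }
            for (Path d : subDirs) {
                crawl(d);
            }
        }
    }

    /** Builds a small temporary tree and returns the file names passed to the hook. */
    static List<String> demo() throws IOException {
        Path root = Files.createTempDirectory("crawl-sketch");
        Path sub = Files.createDirectory(root.resolve("sub"));
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(root.resolve("b.txt"));
        Files.createFile(sub.resolve("c.txt"));

        List<String> firsts = new ArrayList<>();
        SketchCrawler crawler = new SketchCrawler() {
            @Override
            void handleFirstFileInDirectory(Path f) {
                firsts.add(f.getFileName().toString());
            }
        };
        crawler.crawl(root);
        return firsts;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(demo());
    }
}
```

In this sketch the hook sees one file per directory that contains files; whether the real crawler applies the same ordering depends on the `CRAWL_ORDER` passed to its constructor.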
public FSDirectoryCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, FSDirectoryCrawler.CRAWL_ORDER crawlOrder)
public FSDirectoryCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, Path startDirectory, FSDirectoryCrawler.CRAWL_ORDER crawlOrder)
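public void start() throws InterruptedException
Description copied from class: FileResourceCrawler
Implement this to control the addition of FileResources. Call FileResourceCrawler.tryToAdd(org.apache.tika.batch.FileResource) to add FileResources to the queue.
Specified by: start in class FileResourceCrawler
Throws: InterruptedException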
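The division of labor between `start()` and `tryToAdd` can be sketched with plain JDK classes. Everything here is an assumption made for illustration rather than Tika's code: `QueueCrawlSketch` is a hypothetical class, the queue holds `Path` objects instead of `FileResource`, the `.txt` filter stands in for a document selector, the integer codes merely borrow the names `ADDED`, `SKIPPED`, and `STOP_NOW`, and this `start()` also declares `IOException` for simplicity.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class QueueCrawlSketch {
    // Hypothetical return codes; the values are arbitrary for this sketch.
    static final int SKIPPED = 0;
    static final int ADDED = 1;
    static final int STOP_NOW = 2;

    private final ArrayBlockingQueue<Path> queue;
    private final Path root;
    private final int maxFilesToAdd;
    private int added = 0;

    QueueCrawlSketch(ArrayBlockingQueue<Path> queue, Path root, int maxFilesToAdd) {
        this.queue = queue;
        this.root = root;
        this.maxFilesToAdd = maxFilesToAdd;
    }

    /** Decides whether a file is queued, skipped, or ends the crawl. */
    int tryToAdd(Path file) throws InterruptedException {
        if (added >= maxFilesToAdd) {
            return STOP_NOW;
        }
        // Hypothetical selector: only .txt files are of interest.
        if (!file.getFileName().toString().endsWith(".txt")) {
            return SKIPPED;
        }
        // Bounded wait: give up if consumers are not draining the queue.
        if (!queue.offer(file, 1, TimeUnit.SECONDS)) {
            return STOP_NOW;
        }
        added++;
        return ADDED;
    }

    /** Walks the tree in sorted order and feeds each regular file to tryToAdd. */
    void start() throws IOException, InterruptedException {
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        for (Path f : files) {
            if (tryToAdd(f) == STOP_NOW) {
                break;
            }
        }
    }

    /** Crawls four temporary files with a limit of two additions; returns the queue size. */
    static int demo() throws IOException, InterruptedException {
        Path root = Files.createTempDirectory("queue-sketch");
        Files.createFile(root.resolve("a.txt"));
        Files.createFile(root.resolve("b.bin"));
        Files.createFile(root.resolve("c.txt"));
        Files.createFile(root.resolve("d.txt"));

        ArrayBlockingQueue<Path> queue = new ArrayBlockingQueue<>(10);
        new QueueCrawlSketch(queue, root, 2).start();
        return queue.size();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(demo());
    }
}
```

Keeping the add/skip/stop decision in one method means the traversal code only has to react to a stop signal, which is one plausible reason for routing additions through `tryToAdd` rather than writing to the queue directly.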
public void handleFirstFileInDirectory(Path f)
Parameters: f - file to handle
Copyright © 2007–2022 The Apache Software Foundation. All rights reserved.