public class FSListCrawler extends FileResourceCrawler
Fields inherited from class FileResourceCrawler: ADDED, LOG, SKIPPED, STOP_NOW
Constructor and Description |
---|
FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, File root, File list, String encoding) Deprecated. |
FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, Path list, Charset charset) Constructor for a crawler that reads a list of files to process. |
Modifier and Type | Method and Description |
---|---|
void | start() Implement this to control the addition of FileResources. |
Methods inherited from class FileResourceCrawler: call, getAdded, getConsidered, isActive, isQueueEmpty, select, setDocumentSelector, setMaxConsecWaitInMillis, setMaxFilesToAdd, setMaxFilesToConsider, shutDownNoPoison, tryToAdd, wasTimedOut
@Deprecated public FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, File root, File list, String encoding) throws FileNotFoundException, UnsupportedEncodingException
Parameters:
fileQueue -
numConsumers -
root -
list -
encoding -
Throws:
FileNotFoundException
UnsupportedEncodingException
See Also:
FSListCrawler(ArrayBlockingQueue, int, Path, Path, Charset)
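Callers that still hold the `File` and `String` arguments of the deprecated constructor can convert them with standard JDK calls before using the `Path`/`Charset` constructor. The following is a minimal sketch; the class `LegacyArgsSketch` and its helper methods are hypothetical names introduced for illustration and are not part of Tika.

```java
import java.io.File;
import java.nio.charset.Charset;
import java.nio.file.Path;

class LegacyArgsSketch {

    // Hypothetical helper: converts a legacy File argument to a Path.
    static Path toPath(File file) {
        return file.toPath();
    }

    // Hypothetical helper: converts an encoding name to a Charset.
    // Charset.forName throws an unchecked exception for unknown or illegal
    // names, so no checked UnsupportedEncodingException is involved here.
    static Charset toCharset(String encoding) {
        return Charset.forName(encoding);
    }

    public static void main(String[] args) {
        Path root = toPath(new File("input"));
        Path list = toPath(new File("files.txt"));
        Charset charset = toCharset("UTF-8");
        System.out.println(root + " " + list + " " + charset);
        // The converted values could then be passed to the non-deprecated constructor:
        // new FSListCrawler(fileQueue, numConsumers, root, list, charset);
    }
}
```

The constructor call is left as a comment so that the sketch compiles without Tika on the classpath.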
public FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, Path list, Charset charset) throws IOException
The list should contain paths relative to the root.
Parameters:
fileQueue - queue for batch
numConsumers - number of consumers
root - root input directory
list - text file list (one file per line) of paths relative to the root for processing
charset - charset of the list file
Throws:
IOException
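To illustrate the list format described above (one path per line, relative to the root), the sketch below reads such a file with a given charset and resolves each entry against the root directory. It uses only JDK classes; `ListFileSketch` and `resolveAll` are hypothetical names, and skipping blank lines is an assumption rather than documented behavior of FSListCrawler.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class ListFileSketch {

    // Hypothetical helper, not part of Tika: reads one relative path per line
    // from the list file and resolves each against the root directory.
    static List<Path> resolveAll(Path root, Path list, Charset charset) throws IOException {
        List<Path> resolved = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(list, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue; // assumption: blank lines are skipped
                }
                resolved.add(root.resolve(line));
            }
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("root");
        Path list = Files.createTempFile("list", ".txt");
        Files.write(list, "a.pdf\nsub/b.docx\n".getBytes(StandardCharsets.UTF_8));
        for (Path p : resolveAll(root, list, StandardCharsets.UTF_8)) {
            System.out.println(p);
        }
    }
}
```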
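public void start() throws InterruptedException
Description copied from class: FileResourceCrawler
Implement this to control the addition of FileResources. Call FileResourceCrawler.tryToAdd(org.apache.tika.batch.FileResource) to add FileResources to the queue.
Specified by:
start in class FileResourceCrawler
Throws:
InterruptedException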
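The general pattern of a producer feeding a bounded queue, which start() and tryToAdd are described as doing, can be sketched with plain JDK classes. This is not Tika's implementation: `QueueFeedSketch` and `feed` are hypothetical names, `String` stands in for `FileResource`, and giving up after a single timed-out offer is an assumption made to keep the example short.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

class QueueFeedSketch {

    // Hypothetical stand-in for a crawler's start() loop.
    // Returns the number of items added before an offer timed out.
    static int feed(List<String> items, ArrayBlockingQueue<String> queue, long maxWaitMillis)
            throws InterruptedException {
        int added = 0;
        for (String item : items) {
            // offer(...) waits up to maxWaitMillis for free space, then returns false
            boolean accepted = queue.offer(item, maxWaitMillis, TimeUnit.MILLISECONDS);
            if (!accepted) {
                break; // assumption: stop feeding once the queue stays full
            }
            added++;
        }
        return added;
    }

    public static void main(String[] args) throws InterruptedException {
        // Capacity 2 and no consumers: the third offer times out.
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        int added = feed(Arrays.asList("a.pdf", "b.pdf", "c.pdf"), queue, 50);
        System.out.println("added " + added + " of 3");
    }
}
```

A timed `offer` is used instead of `put` so that the producer cannot block indefinitely when no consumer drains the queue.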
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.