Package org.apache.tika.batch.fs
Class FSListCrawler
java.lang.Object
org.apache.tika.batch.FileResourceCrawler
org.apache.tika.batch.fs.FSListCrawler
- All Implemented Interfaces:
Callable<IFileProcessorFutureResult>
Class that "crawls" a list of files.
-
Field Summary
Fields inherited from class org.apache.tika.batch.FileResourceCrawler
ADDED, LOG, SKIPPED, STOP_NOW
-
Constructor Summary
Constructor, Description
FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, File root, File list, String encoding)
Deprecated.
FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, Path list, Charset charset)
Constructor for a crawler that reads a list of files to process.
-
Method Summary
Modifier and Type, Method, Description
void start()
Implement this to control the addition of FileResources.
Methods inherited from class org.apache.tika.batch.FileResourceCrawler
call, getAdded, getConsidered, isActive, isQueueEmpty, select, setDocumentSelector, setMaxConsecWaitInMillis, setMaxFilesToAdd, setMaxFilesToConsider, shutDownNoPoison, tryToAdd, wasTimedOut
-
Constructor Details
-
FSListCrawler
@Deprecated
public FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, File root, File list, String encoding) throws FileNotFoundException, UnsupportedEncodingException
Deprecated.
- Parameters:
fileQueue -
numConsumers -
root -
list -
encoding -
- Throws:
FileNotFoundException
UnsupportedEncodingException
- See Also:
-
FSListCrawler
public FSListCrawler(ArrayBlockingQueue<FileResource> fileQueue, int numConsumers, Path root, Path list, Charset charset) throws IOException
Constructor for a crawler that reads a list of files to process. The list should contain paths relative to the root.
- Parameters:
fileQueue - queue for batch
numConsumers - number of consumers
root - root input directory
list - text file list (one file per line) of paths relative to the root for processing
charset - charset of the list file
- Throws:
IOException
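The list-file convention described above (one path per line, each relative to the root) can be illustrated with a minimal, standard-library-only sketch. This is not Tika's implementation: the class `ListFileSketch` and its methods are hypothetical names, and skipping blank lines and entries that resolve outside the root are choices made for this example, not documented behavior of FSListCrawler.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not part of org.apache.tika.batch.
public class ListFileSketch {

    // Resolve each non-blank line against root.
    // Assumption of this sketch: entries that normalize to a location
    // outside root (e.g. "../x" or an absolute path) are skipped.
    static List<Path> resolveAll(Path root, List<String> lines) {
        Path normalizedRoot = root.normalize();
        List<Path> resolved = new ArrayList<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            Path candidate = normalizedRoot.resolve(trimmed).normalize();
            if (candidate.startsWith(normalizedRoot)) {
                resolved.add(candidate);
            }
        }
        return resolved;
    }

    // Read the list file with the given charset, then resolve its entries.
    static List<Path> readList(Path root, Path list, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = Files.newBufferedReader(list, charset)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return resolveAll(root, lines);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("input");
        Path list = Files.createTempFile("list", ".txt");
        Files.write(list, Arrays.asList("a.pdf", "sub/b.docx"), StandardCharsets.UTF_8);
        for (Path p : readList(root, list, StandardCharsets.UTF_8)) {
            System.out.println(p);
        }
    }
}
```

Passing a Charset rather than an encoding name, as the non-deprecated constructor does, avoids the UnsupportedEncodingException declared by the deprecated variant.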
-
-
Method Details
-
start
Description copied from class: FileResourceCrawler
Implement this to control the addition of FileResources. Call FileResourceCrawler.tryToAdd(org.apache.tika.batch.FileResource) to add FileResources to the queue.
- Specified by:
start in class FileResourceCrawler
- Throws:
InterruptedException
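A producer that feeds a bounded queue, as start() is described as doing, might look roughly like the sketch below. It uses only java.util.concurrent and does not call Tika: `StartSketch` and `addAll` are hypothetical names, and the timed `offer` is an assumption standing in for whatever tryToAdd actually does internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of org.apache.tika.batch.
public class StartSketch {

    // Offer items one by one; stop at the first item that cannot be
    // queued within maxWaitMillis. Returns the number of items added.
    // InterruptedException propagates, mirroring the declared throws of start().
    static <T> int addAll(ArrayBlockingQueue<T> queue, List<T> items, long maxWaitMillis)
            throws InterruptedException {
        int added = 0;
        for (T item : items) {
            if (!queue.offer(item, maxWaitMillis, TimeUnit.MILLISECONDS)) {
                break; // queue stayed full for the whole wait
            }
            added++;
        }
        return added;
    }

    public static void main(String[] args) throws InterruptedException {
        // Capacity 2 and no consumer, so the third offer times out.
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(2);
        int added = addAll(queue, Arrays.asList("a.pdf", "b.pdf", "c.pdf"), 50);
        System.out.println(added + " added, " + queue.size() + " queued");
        // prints "2 added, 2 queued"
    }
}
```

In a real batch run, consumers draining the queue would free capacity, so the timeout would only trigger when they stall.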
-