Class CSVPipesIterator
java.lang.Object
org.apache.tika.config.ConfigBase
org.apache.tika.pipes.pipesiterator.PipesIterator
org.apache.tika.pipes.pipesiterator.csv.CSVPipesIterator
- All Implemented Interfaces:
Iterable<FetchEmitTuple>, Callable<Integer>, Initializable
Iterates through a UTF-8 CSV file. All columns
(except for the 'idColumn', 'fetchKeyColumn' and 'emitKeyColumn', if specified)
are added to the metadata object.
- If an 'idColumn' is specified, this will use that column's value as the id.
- If no 'idColumn' is specified, but a 'fetchKeyColumn' is specified, the string in the 'fetchKeyColumn' will be used as the 'id'.
- The 'idColumn' value is not added to the metadata.
- If a 'fetchKeyColumn' is specified, this will use that column's value as the fetchKey.
- If no 'fetchKeyColumn' is specified, this will send the metadata from the other columns.
- The 'fetchKeyColumn' value is not added to the metadata.
- If an 'emitKeyColumn' is specified, this will use that column's value as the emit key.
- If an 'emitKeyColumn' is not specified, this will use the value from the 'fetchKeyColumn'.
- The 'emitKeyColumn' value is not added to the metadata.
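The column rules above can be sketched in plain Java. This is an illustration only, not Tika's implementation: the class `CsvRowSketch`, its nested `Resolved` holder, and the `resolve` method are hypothetical names, and a row is modelled as a simple map from column name to cell value. The sketch assumes that a column left unconfigured is passed as null, and it leaves the id and emit key null when neither their own column nor a 'fetchKeyColumn' is configured, since that case is not described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: mirrors the documented column rules, not Tika's actual code. */
final class CsvRowSketch {

    /** Hypothetical holder for the values derived from one CSV row. */
    static final class Resolved {
        final String id;
        final String fetchKey;
        final String emitKey;
        final Map<String, String> metadata;

        Resolved(String id, String fetchKey, String emitKey, Map<String, String> metadata) {
            this.id = id;
            this.fetchKey = fetchKey;
            this.emitKey = emitKey;
            this.metadata = metadata;
        }
    }

    /**
     * Applies the documented rules to a single row.
     * Any of the three column names may be null, meaning "not configured".
     */
    static Resolved resolve(Map<String, String> row, String idColumn,
                            String fetchKeyColumn, String emitKeyColumn) {
        // fetchKey comes from the fetchKeyColumn, if one is configured
        String fetchKey = fetchKeyColumn == null ? null : row.get(fetchKeyColumn);
        // id: the idColumn value if configured, otherwise the fetch key
        String id = idColumn != null ? row.get(idColumn) : fetchKey;
        // emit key: the emitKeyColumn value if configured, otherwise the fetch key
        String emitKey = emitKeyColumn != null ? row.get(emitKeyColumn) : fetchKey;

        // every remaining column is copied into the metadata
        Map<String, String> metadata = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            String column = cell.getKey();
            boolean isKeyColumn = column.equals(idColumn)
                    || column.equals(fetchKeyColumn)
                    || column.equals(emitKeyColumn);
            if (!isKeyColumn) {
                metadata.put(column, cell.getValue());
            }
        }
        return new Resolved(id, fetchKey, emitKey, metadata);
    }
}
```

With only a 'fetchKeyColumn' configured, the same cell value serves as id, fetch key and emit key, and every other column ends up in the metadata.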
Field Summary
Fields inherited from class org.apache.tika.pipes.pipesiterator.PipesIterator
COMPLETED_SEMAPHORE, DEFAULT_MAX_WAIT_MS, DEFAULT_QUEUE_SIZE
Constructor Summary
Constructors
CSVPipesIterator()
Method Summary
Modifier and Type    Method
void                 checkInitialization(InitializableProblemHandler problemHandler)
protected void       enqueue()
void                 setCsvPath(String csvPath)
void                 setCsvPath(Path csvPath)
void                 setEmitKeyColumn(String emitKeyColumn)
void                 setFetchKeyColumn(String fetchKeyColumn)
void                 setIdColumn(String idColumn)
Methods inherited from class org.apache.tika.pipes.pipesiterator.PipesIterator
build, call, getEmitterName, getFetcherName, getHandlerConfig, getOnParseException, initialize, iterator, setEmitterName, setFetcherName, setHandlerType, setMaxEmbeddedResources, setMaxWaitMs, setOnParseException, setOnParseException, setParseMode, setParseMode, setQueueSize, setThrowOnWriteLimitReached, setWriteLimit, tryToAdd
Methods inherited from class org.apache.tika.config.ConfigBase
buildComposite, buildComposite, buildSingle, buildSingle, configure, handleSettings
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.tika.config.Initializable
initialize
Methods inherited from interface java.lang.Iterable
forEach, spliterator
Constructor Details
-
CSVPipesIterator
public CSVPipesIterator()
-
-
Method Details
-
setCsvPath
-
setFetchKeyColumn
-
setEmitKeyColumn
-
setIdColumn
-
setCsvPath
-
enqueue
- Specified by:
enqueue in class PipesIterator
- Throws:
InterruptedException, IOException, TimeoutException
-
checkInitialization
public void checkInitialization(InitializableProblemHandler problemHandler) throws TikaConfigException
- Specified by:
checkInitialization in interface Initializable
- Overrides:
checkInitialization in class PipesIterator
- Parameters:
problemHandler - if there is a problem and no custom initializableProblemHandler has been configured via Initializable parameters, this is called to respond.
- Throws:
TikaConfigException
-