Uses of Annotation Type
org.apache.tika.config.Field
Package
Description
Media type detection.
Extraction of component documents.
Tika parsers.
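The setters listed on this page are the ones that carry the Field annotation, which marks them as configurable (for example, ExternalParser.setSupportedTypes below is described as "set during initialization from a tika-config"). A minimal sketch of how annotation-driven setter binding can work in general is shown here. ConfigParam, DemoDetector and ConfigBinder are hypothetical names invented for illustration; this is not Tika's actual loading code.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical stand-in for a configuration annotation; NOT Tika's @Field.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface ConfigParam {
}

// Hypothetical component with one annotated setter and one plain setter.
class DemoDetector {
    private long timeoutMs = 1000L;
    private String label = "unset";

    @ConfigParam
    public void setTimeoutMs(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    // Not annotated, so the binder below ignores it.
    public void setLabel(String label) {
        this.label = label;
    }

    public long getTimeoutMs() {
        return timeoutMs;
    }

    public String getLabel() {
        return label;
    }
}

class ConfigBinder {
    // Applies string parameters to annotated single-argument setters.
    static void apply(Object target, Map<String, String> params) {
        for (Method m : target.getClass().getMethods()) {
            String methodName = m.getName();
            if (!m.isAnnotationPresent(ConfigParam.class)
                    || !methodName.startsWith("set")
                    || methodName.length() <= 3
                    || m.getParameterCount() != 1) {
                continue;
            }
            // setTimeoutMs -> timeoutMs
            String key = Character.toLowerCase(methodName.charAt(3)) + methodName.substring(4);
            String raw = params.get(key);
            if (raw == null) {
                continue;
            }
            try {
                m.invoke(target, convert(raw, m.getParameterTypes()[0]));
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Could not set " + key, e);
            }
        }
    }

    // Converts the raw string to the setter's parameter type.
    static Object convert(String raw, Class<?> type) {
        if (type == long.class) return Long.parseLong(raw);
        if (type == int.class) return Integer.parseInt(raw);
        if (type == boolean.class) return Boolean.parseBoolean(raw);
        if (type == String.class) return raw;
        throw new IllegalArgumentException("Unsupported parameter type: " + type);
    }

    public static void main(String[] args) {
        DemoDetector detector = new DemoDetector();
        apply(detector, Map.of("timeoutMs", "5000", "label", "ignored"));
        System.out.println(detector.getTimeoutMs()); // prints 5000
        System.out.println(detector.getLabel());     // prints unset
    }
}
```

Only the annotated setter is touched; the unannotated one keeps its default, which is the reason for marking configurable members explicitly.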
-
Uses of Field in org.apache.tika.detect
Modifier and TypeMethodDescriptionvoid
FileCommandDetector.setFilePath
(String fileCommandPath) void
FileCommandDetector.setMaxBytes
(int maxBytes) If this is not called on a TikaInputStream, this detector will spool up to this many bytes to a file to be detected by the 'file' command.void
FileCommandDetector.setTimeoutMs
(long timeoutMs) void
FileCommandDetector.setUseMime
(boolean useMime) -
Uses of Field in org.apache.tika.detect.siegfried
Modifier and TypeMethodDescriptionvoid
SiegfriedDetector.setMaxBytes
(int maxBytes) If this is not called on a TikaInputStream, this detector will spool up to this many bytes to a file to be detected by the 'file' command.void
SiegfriedDetector.setSiegfriedPath
(String fileCommandPath) void
SiegfriedDetector.setTimeoutMs
(long timeoutMs) void
SiegfriedDetector.setUseMime
(boolean useMime) As default behavior, Tika runs Siegfried to add its detection to the metadata, but NOT to use detection in determining parsers etc. -
Uses of Field in org.apache.tika.detect.zip
Modifier and TypeMethodDescriptionvoid
DefaultZipContainerDetector.setMarkLimit
(int markLimit) If this is less than 0, the file will be spooled to disk, and detection will run on the full file. -
Uses of Field in org.apache.tika.extractor
Modifier and TypeMethodDescriptionvoid
RUnpackExtractorFactory.setEmbeddedBytesExcludeEmbeddedResourceTypes
(List<String> excludeAttachmentTypes) void
RUnpackExtractorFactory.setEmbeddedBytesExcludeMimeTypes
(List<String> excludeMimeTypes) void
RUnpackExtractorFactory.setEmbeddedBytesIncludeEmbeddedResourceTypes
(List<String> includeAttachmentTypes) void
RUnpackExtractorFactory.setEmbeddedBytesIncludeMimeTypes
(List<String> includeMimeTypes) void
RUnpackExtractorFactory.setMaxEmbeddedBytesForExtraction
(long maxEmbeddedBytesForExtraction) Total number of bytes to write out.void
ParsingEmbeddedDocumentExtractorFactory.setWriteFileNameToContent
(boolean writeFileNameToContent) void
RUnpackExtractorFactory.setWriteFileNameToContent
(boolean writeFileNameToContent) -
Uses of Field in org.apache.tika.langdetect.opennlp.metadatafilter
Modifier and TypeMethodDescriptionvoid
OpenNLPMetadataFilter.setMaxCharsForDetection
(int maxCharsForDetection) -
Uses of Field in org.apache.tika.langdetect.optimaize.metadatafilter
Modifier and TypeMethodDescriptionvoid
OptimaizeMetadataFilter.setMaxCharsForDetection
(int maxCharsForDetection) -
Uses of Field in org.apache.tika.metadata.filter
Modifier and TypeMethodDescriptionvoid
DateNormalizingMetadataFilter.setDefaultTimeZone
(String timeZoneId) void
ExcludeFieldMetadataFilter.setExclude
(List<String> exclude) void
FieldNameMappingFilter.setExcludeUnmapped
(boolean excludeUnmapped) If this is true (default), this means that only the fields that have a "from" value in the mapper will be passed through. void
GeoPointMetadataFilter.setGeoPointFieldName
(String geoPointFieldName) Set the field for the concatenated LATITUDE,LONGITUDE string.void
IncludeFieldMetadataFilter.setInclude
(List<String> include) void
FieldNameMappingFilter.setMappings
(Map<String, String> mappings) void
void
void
CaptureGroupMetadataFilter.setSourceField
(String sourceField) void
CaptureGroupMetadataFilter.setTargetField
(String targetField) void
For types see TikaCoreProperties.EmbeddedResourceType
-
Uses of Field in org.apache.tika.parser
Modifier and TypeMethodDescriptionvoid
RegexCaptureParser.setCaptureMap
(Map<String, String> map) void
RegexCaptureParser.setMatchMap
(Map<String, String> map) void
RegexCaptureParser.setWriteContent
(boolean writeContent) -
Uses of Field in org.apache.tika.parser.csv
Modifier and TypeMethodDescriptionvoid
TextAndCSVParser.setNameToDelimiterMap
(Map<String, String> map) -
Uses of Field in org.apache.tika.parser.digestutils
Modifier and TypeMethodDescriptionvoid
CommonsDigesterFactory.setAlgorithmString
(String algorithmString) void
CommonsDigesterFactory.setMarkLimit
(int markLimit) void
CommonsDigesterFactory.setSkipContainerDocument
(boolean skipContainerDocument) -
Uses of Field in org.apache.tika.parser.dwg
Modifier and TypeMethodDescriptionvoid
AbstractDWGParser.setCleanDwgReadOutput
(boolean cleanDwgReadOutput) void
AbstractDWGParser.setCleanDwgReadOutputBatchSize
(int cleanDwgReadOutputBatchSize) void
AbstractDWGParser.setCleanDwgReadRegexToReplace
(String cleanDwgReadRegexToReplace) void
AbstractDWGParser.setCleanDwgReadReplaceWith
(String cleanDwgReadReplaceWith) void
AbstractDWGParser.setDwgReadExecutable
(String dwgReadExecutable) void
AbstractDWGParser.setDwgReadTimeout
(long dwgReadTimeout) -
Uses of Field in org.apache.tika.parser.epub
-
Uses of Field in org.apache.tika.parser.external2
Modifier and TypeMethodDescriptionvoid
ExternalParser.setCommandLine
(List<String> commandLine) Use this to specify the full commandLine.void
ExternalParser.setMaxStdErr
(int maxStdErr) void
ExternalParser.setMaxStdOut
(int maxStdOut) void
ExternalParser.setOutputParser
(Parser parser) This parser is called on the output of the process.void
ExternalParser.setReturnStderr
(boolean returnStderr) If set to true, this will return the stderr in the metadata via ExternalProcess.STD_ERR. void
ExternalParser.setReturnStdout
(boolean returnStdout) If set to true, this will return the stdout in the metadata via ExternalProcess.STD_OUT. void
ExternalParser.setSupportedTypes
(List<String> supportedTypes) This is set during initialization from a tika-config.void
ExternalParser.setTimeoutMs
(long timeoutMs) -
Uses of Field in org.apache.tika.parser.geo.topic
Modifier and TypeMethodDescriptionvoid
GeoParser.setGazetteerRestEndpoint
(String gazetteerRestEndpoint) void
GeoParser.setNerModelUrl
(String nerModelUrl) -
Uses of Field in org.apache.tika.parser.geopkg
Modifier and TypeMethodDescriptionvoid
GeoPkgParser.setIgnoreBlobColumns
(List<String> ignoreBlobColumns) -
Uses of Field in org.apache.tika.parser.html
Modifier and TypeMethodDescriptionvoid
JSoupParser.setExtractScripts
(boolean extractScripts) Whether or not to extract contents in script entities.void
HtmlEncodingDetector.setMarkLimit
(int markLimit) How far into the stream to read for charset detection. -
Uses of Field in org.apache.tika.parser.html.charsetdetector
Modifier and TypeMethodDescriptionvoid
StandardHtmlEncodingDetector.setMarkLimit
(int markLimit) How far into the stream to read for charset detection. -
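The markLimit setters above bound how many bytes a detector may read before the stream has to be rewound for the parser. A minimal sketch of that mark/read/reset pattern using only java.io follows. MarkLimitPeek and peek are hypothetical names, and the snippet does not reproduce the detectors' own logic.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class MarkLimitPeek {
    // Reads at most markLimit bytes for inspection, then rewinds the stream
    // so that a later consumer sees it from the original position.
    static byte[] peek(InputStream in, int markLimit) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(markLimit);
        try {
            byte[] head = in.readNBytes(markLimit);
            in.reset();
            return head;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] html = "<meta charset=\"utf-8\"><p>body</p>".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(html));
        byte[] head = peek(in, 8);
        System.out.println(new String(head, StandardCharsets.US_ASCII));
        // The stream is back at its start, so the next byte read is '<'.
        System.out.println((char) in.read());
    }
}
```

A larger limit lets detection see more of the document at the cost of buffering more bytes; a charset declaration that appears after the limit would not be seen.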
Uses of Field in org.apache.tika.parser.image
Modifier and TypeMethodDescriptionvoid
PSDParser.setMaxDataLengthBytes
(int maxDataLengthBytes) void
BPGParser.setMaxRecordLength
(int maxRecordLength) -
Uses of Field in org.apache.tika.parser.microsoft
Modifier and TypeMethodDescriptionvoid
AbstractOfficeParser.setByteArrayMaxOverride
(int maxOverride) WARNING: this sets a static variable in POI.void
AbstractOfficeParser.setConcatenatePhoneticRuns
(boolean concatenatePhoneticRuns) void
AbstractOfficeParser.setDateFormatOverride
(String format) void
AbstractOfficeParser.setExtractAllAlternativesFromMSG
(boolean extractAllAlternativesFromMSG) Some .msg files can contain body content in html, rtf and/or text.void
AbstractOfficeParser.setExtractMacros
(boolean extractMacros) void
AbstractOfficeParser.setIncludeDeletedContent
(boolean includeDeletedConent) void
AbstractOfficeParser.setIncludeHeadersAndFooters
(boolean includeHeadersAndFooters) void
AbstractOfficeParser.setIncludeMoveFromContent
(boolean includeMoveFromContent) void
AbstractOfficeParser.setIncludeShapeBasedContent
(boolean includeShapeBasedContent) void
AbstractOfficeParser.setUseSAXDocxExtractor
(boolean useSAXDocxExtractor) void
AbstractOfficeParser.setUseSAXPptxExtractor
(boolean useSAXPptxExtractor) -
Uses of Field in org.apache.tika.parser.microsoft.libpst
Modifier and TypeMethodDescriptionvoid
LibPstParser.setIncludeDeleted
(boolean includeDeleted) void
LibPstParser.setMaxEmails
(int maxEmails) void
LibPstParser.setProcessEmailAsMsg
(boolean processEmailAsMsg) void
LibPstParser.setTimeoutSeconds
(long timeoutSeconds) -
Uses of Field in org.apache.tika.parser.microsoft.rtf
-
Uses of Field in org.apache.tika.parser.mp3
Modifier and TypeMethodDescriptionvoid
Mp3Parser.setMaxRecordSize
(int maxRecordSize) This statically sets the max record size in ID3v2Frame
-
Uses of Field in org.apache.tika.parser.ocr
Modifier and TypeMethodDescriptionvoid
TesseractOCRParser.setApplyRotation
(boolean applyRotation) void
TesseractOCRParser.setColorspace
(String colorspace) void
TesseractOCRParser.setDensity
(int density) void
TesseractOCRParser.setDepth
(int depth) void
TesseractOCRParser.setEnableImagePreprocessing
(boolean enableImagePreprocessing) void
void
TesseractOCRParser.setImageMagickPath
(String imageMagickPath) Set the path to the ImageMagick executable directory, needed if it is not on system path.void
TesseractOCRParser.setInlineContent
(boolean inlineContent) void
TesseractOCRParser.setLanguage
(String language) void
TesseractOCRParser.setMaxFileSizeToOcr
(long maxFileSizeToOcr) void
TesseractOCRParser.setMinFileSizeToOcr
(long minFileSizeToOcr) void
TesseractOCRParser.setOtherTesseractSettings
(List<String> settings) void
TesseractOCRParser.setOutputType
(String outputType) void
TesseractOCRParser.setPageSegMode
(String pageSegMode) void
TesseractOCRParser.setPreloadLangs
(boolean preloadLangs) If set to true and if tesseract is found, this will load the langs that result from --list-langs. void
TesseractOCRParser.setPreserveInterwordSpacing
(boolean preserveInterwordSpacing) void
TesseractOCRParser.setResize
(int resize) void
TesseractOCRParser.setSkipOCR
(boolean skipOCR) void
TesseractOCRParser.setTessdataPath
(String tessdataPath) Set the path to the 'tessdata' folder, which contains language files and config files.void
TesseractOCRParser.setTesseractPath
(String tesseractPath) Set the path to the Tesseract executable's directory, needed if it is not on system path.void
TesseractOCRParser.setTimeout
(int timeout) Set default timeout in seconds. -
Uses of Field in org.apache.tika.parser.odf
Modifier and TypeMethodDescriptionvoid
FlatOpenDocumentParser.setExtractMacros
(boolean extractMacros) void
OpenDocumentParser.setExtractMacros
(boolean extractMacros) -
Uses of Field in org.apache.tika.parser.pdf
Modifier and TypeMethodDescriptionvoid
PDFParser.setAllowExtractionForAccessibility
(boolean allowExtractionForAccessibility) void
PDFParser.setAverageCharTolerance
(float averageCharTolerance) void
PDFParser.setCatchIntermediateExceptions
(boolean catchIntermediateExceptions) void
PDFParser.setDetectAngles
(boolean detectAngles) void
PDFParser.setDropThreshold
(float dropThreshold) void
PDFParser.setEnableAutoSpace
(boolean v) If true (the default), the parser should estimate where spaces should be inserted between words.void
PDFParser.setExtractAcroFormContent
(boolean extractAcroFormContent) void
PDFParser.setExtractActions
(boolean extractActions) void
PDFParser.setExtractAnnotationText
(boolean v) If true (the default), text in annotations will be extracted.void
PDFParser.setExtractBookmarksText
(boolean extractBookmarksText) void
PDFParser.setExtractFontNames
(boolean extractFontNames) void
PDFParser.setExtractIncrementalUpdateInfo
(boolean setExtractIncrementalUpdateInfo) Whether or not to scan a PDF for incremental updates.void
PDFParser.setExtractInlineImageMetadataOnly
(boolean extractInlineImageMetadataOnly) void
PDFParser.setExtractInlineImages
(boolean extractInlineImages) void
PDFParser.setExtractMarkedContent
(boolean extractMarkedContent) void
PDFParser.setExtractUniqueInlineImagesOnly
(boolean extractUniqueInlineImagesOnly) void
PDFParser.setIfXFAExtractOnlyXFA
(boolean ifXFAExtractOnlyXFA) void
PDFParser.setImageGraphicsEngineFactory
(ImageGraphicsEngineFactory imageGraphicsEngineFactory) void
PDFParser.setImageStrategy
(String imageStrategy) void
PDFParser.setMaxIncrementalUpdates
(int maxIncrementalUpdates) Set the maximum number of incremental updates to parse. void
PDFParser.setMaxMainMemoryBytes
(long maxMainMemoryBytes) void
PDFParser.setOcrDPI
(int dpi) void
PDFParser.setOcrImageFormatName
(String formatName) void
PDFParser.setOcrImageQuality
(float imageQuality) void
PDFParser.setOcrImageType
(String imageType) void
PDFParser.setOcrRenderingStrategy
(String ocrRenderingStrategy) void
PDFParser.setOcrStrategy
(String ocrStrategyString) void
PDFParser.setOcrStrategyAuto
(String ocrStrategyAuto) void
PDFParser.setParseIncrementalUpdates
(boolean parseIncrementalUpdates) If set to true, this will parse incremental updates if they exist within a PDF.void
PDFParser.setSetKCMS
(boolean setKCMS) void
PDFParser.setSortByPosition
(boolean v) If true, sort text tokens by their x/y position before extracting text.void
PDFParser.setSpacingTolerance
(float spacingTolerance) void
PDFParser.setSuppressDuplicateOverlappingText
(boolean v) If true, the parser should try to remove duplicated text over the same region.void
PDFParser.setThrowOnEncryptedPayload
(boolean throwOnEncryptedPayload) If the file is a 'Collection' and contains an embedded file with a defined 'AssociatedFile' value of 'EncryptedPayload', then throw an EncryptedDocumentException. -
Uses of Field in org.apache.tika.parser.pkg
Modifier and TypeMethodDescriptionvoid
CompressorParser.setDecompressConcatenated
(boolean decompressConcatenated) void
PackageParser.setDetectCharsetsInEntryNames
(boolean detectCharsetsInEntryNames) Whether or not to run the default charset detector against entry names in ZipFiles.void
CompressorParser.setMemoryLimitInKb
(int memoryLimitInKb) -
Uses of Field in org.apache.tika.parser.recognition
Modifier and TypeMethodDescriptionvoid
ObjectRecognitionParser.setRecogniser
(String recogniserClass) -
Uses of Field in org.apache.tika.parser.recognition.tf
Modifier and TypeFieldDescriptionprotected URI
TensorflowRESTRecogniser.apiBaseUri
protected double
TensorflowRESTRecogniser.minConfidence
protected int
TensorflowRESTRecogniser.topN
-
Uses of Field in org.apache.tika.parser.strings
Modifier and TypeMethodDescriptionvoid
StringsParser.setEncoding
(String encoding) void
StringsParser.setMinLength
(int minLength) void
StringsParser.setStringsPath
(String path) Sets the "strings" installation folder.void
StringsParser.setTimeoutSeconds
(int timeoutSeconds) -
Uses of Field in org.apache.tika.parser.transcribe.aws
Modifier and TypeMethodDescriptionvoid
Sets the client secret for the transcriber API.void
AmazonTranscribe.setClientId
(String id) Sets the client Id for the transcriber API.void
AmazonTranscribe.setClientSecret
(String secret) Sets the client secret for the transcriber API.void
-
Uses of Field in org.apache.tika.parser.txt
Modifier and TypeMethodDescriptionvoid
Icu4jEncodingDetector.setIgnoreCharsets
(List<String> charsetsToIgnore) void
Icu4jEncodingDetector.setMarkLimit
(int markLimit) How far into the stream to read for charset detection.void
UniversalEncodingDetector.setMarkLimit
(int markLimit) How far into the stream to read for charset detection.void
Icu4jEncodingDetector.setStripMarkup
(boolean stripMarkup) Whether or not to attempt to strip html-ish markup from the stream before sending it to the underlying detector. -
Uses of Field in org.apache.tika.parser.wordperfect
Modifier and TypeMethodDescriptionvoid
WordPerfectParser.setIncludeDeletedContent
(boolean includeDeletedContent) Whether or not to include deleted content. -
Uses of Field in org.apache.tika.pipes
Modifier and TypeMethodDescriptionvoid
CompositePipesReporter.addPipesReporter
(PipesReporter pipesReporter) void
PipesReporterBase.setExcludes
(List<String> excludes) void
PipesReporterBase.setIncludes
(List<String> includes) -
Uses of Field in org.apache.tika.pipes.emitter.azblob
Modifier and TypeMethodDescriptionvoid
AZBlobEmitter.setContainer
(String container) void
AZBlobEmitter.setEndpoint
(String endpoint) void
AZBlobEmitter.setFileExtension
(String fileExtension) If you want to customize the output file's file extension.void
AZBlobEmitter.setOverwriteExisting
(boolean overwriteExisting) void
void
AZBlobEmitter.setSasToken
(String sasToken) -
Uses of Field in org.apache.tika.pipes.emitter.fs
Modifier and TypeMethodDescriptionvoid
FileSystemEmitter.setBasePath
(String basePath) void
FileSystemEmitter.setFileExtension
(String fileExtension) If you want to customize the output file's file extension.void
FileSystemEmitter.setOnExists
(String onExists) What to do if the target file already exists.void
FileSystemEmitter.setPrettyPrint
(boolean prettyPrint) -
Uses of Field in org.apache.tika.pipes.emitter.gcs
Modifier and TypeMethodDescriptionvoid
void
GCSEmitter.setFileExtension
(String fileExtension) If you want to customize the output file's file extension.void
void
GCSEmitter.setProjectId
(String projectId) -
Uses of Field in org.apache.tika.pipes.emitter.jdbc
Modifier and TypeMethodDescriptionvoid
JDBCEmitter.setAttachmentStrategy
(String attachmentStrategy) void
JDBCEmitter.setConnection
(String connection) void
JDBCEmitter.setCreateTable
(String createTable) void
void
The implementation of keys should be a LinkedHashMap because order matters!void
JDBCEmitter.setMaxStringLength
(int maxStringLength) Set the maximum string length in characters (not bytes).void
JDBCEmitter.setMultivaluedFieldDelimiter
(String delimiter) void
JDBCEmitter.setMultivaluedFieldStrategy
(String strategy) This applies to fields of type 'string' or 'varchar'.void
JDBCEmitter.setPostConnection
(String postConnection) This sql will be called immediately after the connection is made. -
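The note above that the keys should be a LinkedHashMap "because order matters" can be illustrated with a small sketch: when column names and statement placeholders are generated by iterating a map, the iteration order has to be stable and match the intended column order. OrderedKeys, buildInsert, the table name docs and the column names are hypothetical and chosen only for illustration; the emitter's real SQL generation is not shown in this page.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

class OrderedKeys {
    // Builds a parameterized insert whose column order follows the map's
    // iteration order, which is insertion order for a LinkedHashMap.
    static String buildInsert(String table, Map<String, String> keys) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner placeholders = new StringJoiner(", ");
        for (String column : keys.keySet()) {
            columns.add(column);
            placeholders.add("?");
        }
        return "insert into " + table + " (" + columns + ") values (" + placeholders + ")";
    }

    public static void main(String[] args) {
        // Hypothetical column-to-type mapping.
        Map<String, String> keys = new LinkedHashMap<>();
        keys.put("path", "varchar(512)");
        keys.put("title", "varchar(256)");
        keys.put("length", "int");
        System.out.println(buildInsert("docs", keys));
        // prints insert into docs (path, title, length) values (?, ?, ?)
    }
}
```

With a plain HashMap the iteration order is unspecified, so values bound by position could end up in the wrong columns.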
Uses of Field in org.apache.tika.pipes.emitter.kafka
Modifier and TypeMethodDescriptionvoid
void
KafkaEmitter.setBootstrapServers
(String bootstrapServers) void
KafkaEmitter.setBufferMemory
(int bufferMemory) void
KafkaEmitter.setClientId
(String clientId) void
KafkaEmitter.setCompressionType
(String compressionType) void
KafkaEmitter.setConnectionsMaxIdleMs
(int connectionsMaxIdleMs) void
KafkaEmitter.setDeliveryTimeoutMs
(int deliveryTimeoutMs) void
KafkaEmitter.setEnableIdempotence
(boolean enableIdempotence) void
KafkaEmitter.setInterceptorClasses
(String interceptorClasses) void
KafkaEmitter.setKeySerializer
(String keySerializer) void
KafkaEmitter.setLingerMs
(int lingerMs) void
KafkaEmitter.setMaxBlockMs
(int maxBlockMs) void
KafkaEmitter.setMaxInFlightRequestsPerConnection
(int maxInFlightRequestsPerConnection) void
KafkaEmitter.setMaxRequestSize
(int maxRequestSize) void
KafkaEmitter.setMetadataMaxAgeMs
(int metadataMaxAgeMs) void
KafkaEmitter.setRequestTimeoutMs
(int requestTimeoutMs) void
KafkaEmitter.setRetries
(int retries) void
KafkaEmitter.setRetryBackoffMs
(int retryBackoffMs) void
void
KafkaEmitter.setTransactionalId
(String transactionalId) void
KafkaEmitter.setTransactionTimeoutMs
(int transactionTimeoutMs) void
KafkaEmitter.setValueSerializer
(String valueSerializer) -
Uses of Field in org.apache.tika.pipes.emitter.opensearch
Modifier and TypeMethodDescriptionvoid
OpenSearchEmitter.setAttachmentStrategy
(String attachmentStrategy) Options: SEPARATE_DOCUMENTS, PARENT_CHILD.void
OpenSearchEmitter.setAuthScheme
(String authScheme) void
OpenSearchEmitter.setCommitWithin
(int commitWithin) void
OpenSearchEmitter.setConnectionTimeout
(int connectionTimeout) void
OpenSearchEmitter.setEmbeddedFileFieldName
(String embeddedFileFieldName) If using the OpenSearchEmitter.AttachmentStrategy.PARENT_CHILD, this is the field name used to store the child documents. void
OpenSearchEmitter.setIdField
(String idField) Specify the field in the first Metadata that should be used as the id field for the document.void
OpenSearchEmitter.setOpenSearchUrl
(String openSearchUrl) void
OpenSearchEmitter.setPassword
(String password) void
OpenSearchEmitter.setProxyHost
(String proxyHost) void
OpenSearchEmitter.setProxyPort
(int proxyPort) void
OpenSearchEmitter.setSocketTimeout
(int socketTimeout) void
OpenSearchEmitter.setUserName
(String userName) -
Uses of Field in org.apache.tika.pipes.emitter.s3
Modifier and TypeMethodDescriptionvoid
S3Emitter.setAccessKey
(String accessKey) void
void
S3Emitter.setCredentialsProvider
(String credentialsProvider) void
S3Emitter.setEndpointConfigurationService
(String endpointConfigurationService) void
S3Emitter.setFileExtension
(String fileExtension) If you want to customize the output file's file extension.void
S3Emitter.setMaxConnections
(int maxConnections) Maximum number of HTTP connections allowed. void
S3Emitter.setPathStyleAccessEnabled
(boolean pathStyleAccessEnabled) void
void
S3Emitter.setProfile
(String profile) void
void
S3Emitter.setSecretKey
(String secretKey) void
S3Emitter.setSpoolToTemp
(boolean spoolToTemp) Whether or not to spool the metadatalist to a tmp file before putting object. -
Uses of Field in org.apache.tika.pipes.emitter.solr
Modifier and TypeMethodDescriptionvoid
SolrEmitter.setAttachmentStrategy
(String attachmentStrategy) Options: SKIP, CONCATENATE_CONTENT, PARENT_CHILD.void
SolrEmitter.setAuthScheme
(String authScheme) void
SolrEmitter.setCommitWithin
(int commitWithin) void
SolrEmitter.setConnectionTimeout
(int connectionTimeout) void
SolrEmitter.setEmbeddedFileFieldName
(String embeddedFileFieldName) If using the SolrEmitter.AttachmentStrategy.PARENT_CHILD, this is the field name used to store the child documents. void
SolrEmitter.setIdField
(String idField) Specify the field in the first Metadata that should be used as the id field for the document.void
SolrEmitter.setPassword
(String password) void
SolrEmitter.setProxyHost
(String proxyHost) void
SolrEmitter.setProxyPort
(int proxyPort) void
SolrEmitter.setSocketTimeout
(int socketTimeout) void
SolrEmitter.setSolrCollection
(String solrCollection) void
SolrEmitter.setSolrUrls
(List<String> solrUrls) void
SolrEmitter.setSolrZkChroot
(String solrZkChroot) void
SolrEmitter.setSolrZkHosts
(List<String> solrZkHosts) void
SolrEmitter.setUpdateStrategy
(String updateStrategy) void
SolrEmitter.setUserName
(String userName) -
Uses of Field in org.apache.tika.pipes.fetcher
-
Uses of Field in org.apache.tika.pipes.fetcher.azblob
Modifier and TypeMethodDescriptionvoid
AZBlobFetcher.setContainer
(String container) void
AZBlobFetcher.setEndpoint
(String endpoint) void
AZBlobFetcher.setExtractUserMetadata
(boolean extractUserMetadata) Whether or not to extract user metadata from the blob object. void
AZBlobFetcher.setSasToken
(String sasToken) void
AZBlobFetcher.setSpoolToTemp
(boolean spoolToTemp) -
Uses of Field in org.apache.tika.pipes.fetcher.fs
Modifier and TypeMethodDescriptionvoid
FileSystemFetcher.setBasePath
(String basePath) Default behavior is that clients will send in relative paths; this must be set to allow this fetcher to fetch the full path. void
FileSystemFetcher.setExtractFileSystemMetadata
(boolean extractFileSystemMetadata) Extract file system metadata (created, modified, accessed) when fetching file. -
Uses of Field in org.apache.tika.pipes.fetcher.gcs
Modifier and TypeMethodDescriptionvoid
void
GCSFetcher.setExtractUserMetadata
(boolean extractUserMetadata) Whether or not to extract user metadata from the S3Object. void
GCSFetcher.setProjectId
(String projectId) void
GCSFetcher.setSpoolToTemp
(boolean spoolToTemp) -
Uses of Field in org.apache.tika.pipes.fetcher.http
Modifier and TypeMethodDescriptionvoid
HttpFetcher.setAuthScheme
(String authScheme) void
HttpFetcher.setConnectTimeout
(int connectTimeout) void
HttpFetcher.setHttpHeaders
(List<String> headers) Which http headers should we capture in the metadata.void
HttpFetcher.setHttpRequestHeaders
(List<String> headers) Which http request headers should we send in the http fetch requests.void
HttpFetcher.setJwtExpiresInSeconds
(int jwtExpiresInSeconds) void
HttpFetcher.setJwtIssuer
(String jwtIssuer) void
HttpFetcher.setJwtPrivateKeyBase64
(String jwtPrivateKeyBase64) void
HttpFetcher.setJwtSecret
(String jwtSecret) void
HttpFetcher.setJwtSubject
(String jwtSubject) void
HttpFetcher.setMaxConnections
(int maxConnections) void
HttpFetcher.setMaxConnectionsPerRoute
(int maxConnectionsPerRoute) void
HttpFetcher.setMaxErrMsgSize
(int maxErrMsgSize) void
HttpFetcher.setMaxRedirects
(int maxRedirects) void
HttpFetcher.setMaxSpoolSize
(long maxSpoolSize) Set the maximum number of bytes to spool to a temp file.void
HttpFetcher.setNtDomain
(String domain) void
HttpFetcher.setOverallTimeout
(long overallTimeout) This sets an overall timeout on the request.void
HttpFetcher.setPassword
(String password) void
HttpFetcher.setProxyHost
(String proxyHost) void
HttpFetcher.setProxyPort
(int proxyPort) void
HttpFetcher.setRequestTimeout
(int requestTimeout) void
HttpFetcher.setSocketTimeout
(int socketTimeout) void
HttpFetcher.setUserAgent
(String userAgent) When making the request, what User-Agent is sent in the request.void
HttpFetcher.setUserName
(String userName) -
Uses of Field in org.apache.tika.pipes.fetcher.s3
Modifier and TypeMethodDescriptionvoid
S3Fetcher.setAccessKey
(String accessKey) void
void
S3Fetcher.setCredentialsProvider
(String credentialsProvider) void
S3Fetcher.setEndpointConfigurationService
(String endpointConfigurationService) void
S3Fetcher.setExtractUserMetadata
(boolean extractUserMetadata) Whether or not to extract user metadata from the S3Object. void
S3Fetcher.setMaxConnections
(int maxConnections) void
S3Fetcher.setMaxLength
(long maxLength) void
S3Fetcher.setPathStyleAccessEnabled
(boolean pathStyleAccessEnabled) void
Prefix to prepend to the fetch key before fetching. void
S3Fetcher.setProfile
(String profile) void
void
S3Fetcher.setSecretKey
(String secretKey) void
S3Fetcher.setSleepBeforeRetryMillis
(long sleepBeforeRetryMillis) Deprecated.void
S3Fetcher.setSpoolToTemp
(boolean spoolToTemp) void
S3Fetcher.setThrottleSeconds
(String commaDelimitedLongs) Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600 -
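setThrottleSeconds takes its values as one comma-delimited string, such as 30,60,120,600. A minimal sketch of turning such a string into numeric values follows. ThrottleSeconds and parse are hypothetical names, and how the fetcher applies the values between retries is not shown here.

```java
import java.util.Arrays;

class ThrottleSeconds {
    // Parses a comma-delimited list of longs, tolerating spaces around entries.
    static long[] parse(String commaDelimitedLongs) {
        String[] parts = commaDelimitedLongs.split(",");
        long[] seconds = new long[parts.length];
        for (int i = 0; i < parts.length; i++) {
            seconds[i] = Long.parseLong(parts[i].trim());
        }
        return seconds;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("30,60,120,600")));
        // prints [30, 60, 120, 600]
    }
}
```

A non-numeric entry makes Long.parseLong throw NumberFormatException, so a malformed value fails fast rather than being silently skipped.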
Uses of Field in org.apache.tika.pipes.fetchers.microsoftgraph
Modifier and TypeMethodDescriptionvoid
MicrosoftGraphFetcher.setThrottleSeconds
(String commaDelimitedLongs) Set seconds to throttle retries as a comma-delimited list, e.g.: 30,60,120,600 -
Uses of Field in org.apache.tika.pipes.pipesiterator
Modifier and TypeMethodDescriptionvoid
PipesIterator.setEmitterName
(String emitterName) void
PipesIterator.setFetcherName
(String fetcherName) void
PipesIterator.setHandlerType
(String handlerType) void
PipesIterator.setMaxEmbeddedResources
(int maxEmbeddedResources) void
PipesIterator.setMaxWaitMs
(long maxWaitMs) void
PipesIterator.setOnParseException
(String onParseException) void
PipesIterator.setParseMode
(String parseModeString) void
PipesIterator.setQueueSize
(int queueSize) void
PipesIterator.setThrowOnWriteLimitReached
(boolean throwOnWriteLimitReached) void
PipesIterator.setWriteLimit
(int writeLimit) -
Uses of Field in org.apache.tika.pipes.pipesiterator.azblob
Modifier and TypeMethodDescriptionvoid
AZBlobPipesIterator.setContainer
(String container) void
AZBlobPipesIterator.setEndpoint
(String endpoint) void
void
AZBlobPipesIterator.setSasToken
(String sasToken) -
Uses of Field in org.apache.tika.pipes.pipesiterator.csv
Modifier and TypeMethodDescriptionvoid
CSVPipesIterator.setCsvPath
(String csvPath) void
CSVPipesIterator.setCsvPath
(Path csvPath) void
CSVPipesIterator.setEmitKeyColumn
(String emitKeyColumn) void
CSVPipesIterator.setFetchKeyColumn
(String fetchKeyColumn) void
CSVPipesIterator.setIdColumn
(String idColumn) -
Uses of Field in org.apache.tika.pipes.pipesiterator.filelist
Modifier and TypeMethodDescriptionvoid
FileListPipesIterator.setFileList
(String path) void
FileListPipesIterator.setHasHeader
(boolean hasHeader) -
Uses of Field in org.apache.tika.pipes.pipesiterator.fs
Modifier and TypeMethodDescriptionvoid
FileSystemPipesIterator.setBasePath
(String basePath) void
FileSystemPipesIterator.setCountTotal
(boolean countTotal) -
Uses of Field in org.apache.tika.pipes.pipesiterator.gcs
-
Uses of Field in org.apache.tika.pipes.pipesiterator.jdbc
Modifier and TypeMethodDescriptionvoid
JDBCPipesIterator.setConnection
(String connection) void
JDBCPipesIterator.setEmitKeyColumn
(String fetchKeyColumn) void
JDBCPipesIterator.setFetchKeyColumn
(String fetchKeyColumn) void
JDBCPipesIterator.setFetchKeyRangeEndColumn
(String fetchKeyRangeEndColumn) void
JDBCPipesIterator.setFetchKeyRangeStartColumn
(String fetchKeyRangeStartColumn) void
JDBCPipesIterator.setFetchSize
(int fetchSize) void
JDBCPipesIterator.setIdColumn
(String idColumn) void
-
Uses of Field in org.apache.tika.pipes.pipesiterator.kafka
Modifier and TypeMethodDescriptionvoid
KafkaPipesIterator.setAutoOffsetReset
(String autoOffsetReset) void
KafkaPipesIterator.setBootstrapServers
(String bootstrapServers) void
KafkaPipesIterator.setEmitMax
(int emitMax) The Kafka pipes iterator will keep polling for more documents until it returns an empty result. void
KafkaPipesIterator.setGroupId
(String groupId) void
KafkaPipesIterator.setGroupInitialRebalanceDelayMs
(int groupInitialRebalanceDelayMs) void
KafkaPipesIterator.setKeySerializer
(String keySerializer) void
KafkaPipesIterator.setPollDelayMs
(int pollDelayMs) void
void
KafkaPipesIterator.setValueSerializer
(String valueSerializer) -
Uses of Field in org.apache.tika.pipes.pipesiterator.s3
Modifier and TypeMethodDescriptionvoid
S3PipesIterator.setAccessKey
(String accessKey) void
void
S3PipesIterator.setCredentialsProvider
(String credentialsProvider) void
S3PipesIterator.setEndpointConfigurationService
(String endpointConfigurationService) void
S3PipesIterator.setFileNamePattern
(String fileNamePattern) void
S3PipesIterator.setFileNamePattern
(Pattern fileNamePattern) void
S3PipesIterator.setMaxConnections
(int maxConnections) void
S3PipesIterator.setPathStyleAccessEnabled
(boolean pathStyleAccessEnabled) void
void
S3PipesIterator.setProfile
(String profile) void
void
S3PipesIterator.setSecretKey
(String secretKey) -
Uses of Field in org.apache.tika.pipes.pipesiterator.solr
Modifier and TypeMethodDescriptionvoid
SolrPipesIterator.setAdditionalFields
(List<String> additionalFields) void
SolrPipesIterator.setAuthScheme
(String authScheme) void
SolrPipesIterator.setConnectionTimeout
(int connectionTimeout) void
SolrPipesIterator.setFailCountField
(String failCountField) void
SolrPipesIterator.setFilters
(List<String> filters) void
SolrPipesIterator.setIdField
(String idField) void
SolrPipesIterator.setParsingIdField
(String parsingIdField) void
SolrPipesIterator.setPassword
(String password) void
SolrPipesIterator.setProxyHost
(String proxyHost) void
SolrPipesIterator.setProxyPort
(int proxyPort) void
SolrPipesIterator.setRows
(int rows) void
SolrPipesIterator.setSizeFieldName
(String sizeFieldName) void
SolrPipesIterator.setSocketTimeout
(int socketTimeout) void
SolrPipesIterator.setSolrCollection
(String solrCollection) void
SolrPipesIterator.setSolrUrls
(List<String> solrUrls) void
SolrPipesIterator.setSolrZkChroot
(String solrZkChroot) void
SolrPipesIterator.setSolrZkHosts
(List<String> solrZkHosts) void
SolrPipesIterator.setUserName
(String userName) -
Uses of Field in org.apache.tika.pipes.reporters.fs
Modifier and TypeMethodDescriptionvoid
FileSystemStatusReporter.setReportUpdateMillis
(long millis) void
FileSystemStatusReporter.setStatusFile
(String path) -
Uses of Field in org.apache.tika.pipes.reporters.jdbc
Modifier and TypeMethodDescriptionvoid
JDBCPipesReporter.setCacheSize
(int cacheSize) Commit the reports if the cache is greater than or equal to this size.void
JDBCPipesReporter.setConnection
(String connection) void
JDBCPipesReporter.setCreateTable
(boolean createTable) The default is true.void
JDBCPipesReporter.setPostConnection
(String postConnection) This sql will be called immediately after the connection is made.void
JDBCPipesReporter.setReportSql
(String reportSql) This is the sql for the prepared statement to execute to store the report record. The default is: insert into tika_status (id, status, timestamp) values (?
void
JDBCPipesReporter.setReportVariables
(List<String> variables) ADVANCED: This is used to set the variables in the prepared statement for the report.void
JDBCPipesReporter.setReportWithinMs
(long reportWithinMs) Commit the reports if the amount of time elapsed since the last report commit exceeds this value.void
JDBCPipesReporter.setTableName
(String tableName) The default is JDBCPipesReporter.TABLE_NAME
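The descriptions of setCacheSize and setReportWithinMs above give two commit triggers: the cache reaching a size, and the time since the last commit exceeding a limit. A minimal sketch of such a check follows. CommitPolicy and shouldCommit are hypothetical names, and combining the two triggers with a logical OR is an assumption rather than something this page states.

```java
class CommitPolicy {
    // Returns true when either trigger fires:
    // the cache holds at least cacheSize reports, or more than
    // reportWithinMs milliseconds have passed since the last commit.
    static boolean shouldCommit(int cachedReports, int cacheSize,
                                long elapsedSinceLastCommitMs, long reportWithinMs) {
        return cachedReports >= cacheSize || elapsedSinceLastCommitMs > reportWithinMs;
    }

    public static void main(String[] args) {
        System.out.println(shouldCommit(100, 100, 0L, 10_000L));     // prints true
        System.out.println(shouldCommit(99, 100, 10_000L, 10_000L)); // prints false
        System.out.println(shouldCommit(1, 100, 10_001L, 10_000L));  // prints true
    }
}
```

The size trigger uses "greater than or equal to" and the time trigger uses "exceeds", matching the wording of the two setter descriptions.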
-
Uses of Field in org.apache.tika.pipes.reporters.opensearch
Modifier and TypeMethodDescriptionvoid
OpenSearchPipesReporter.setAuthScheme
(String authScheme) void
OpenSearchPipesReporter.setConnectionTimeout
(int connectionTimeout) void
OpenSearchPipesReporter.setExcludeStatuses
(List<String> statusList) void
OpenSearchPipesReporter.setIncludeRouting
(boolean includeRouting) void
OpenSearchPipesReporter.setIncludeStatuses
(List<String> statusList) void
OpenSearchPipesReporter.setKeyPrefix
(String keyPrefix) This prefixes the keys before sending them to OpenSearch.void
OpenSearchPipesReporter.setOpenSearchUrl
(String openSearchUrl) void
OpenSearchPipesReporter.setPassword
(String password) void
OpenSearchPipesReporter.setProxyHost
(String proxyHost) void
OpenSearchPipesReporter.setProxyPort
(int proxyPort) void
OpenSearchPipesReporter.setSocketTimeout
(int socketTimeout) void
OpenSearchPipesReporter.setUserName
(String userName)
S3Fetcher.setThrottleSeconds(String)