Index
A
- AadCredentialConfigBase<T> - Interface in org.apache.tika.pipes.fetchers.microsoftgraph.config
- ABOUT - Static variable in interface org.apache.tika.metadata.XMP
-
Unordered text strings of advisories.
- ABOUT - Static variable in interface org.apache.tika.metadata.XMPPDF
-
Unordered text strings of about.
- ABS_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The absolute path to the file's peak audio file.
- AbstractArchiveParser - Class in org.apache.tika.parser.pkg
-
Abstract base class for archive parsers that provides common functionality for handling embedded documents within archives.
- AbstractArchiveParser() - Constructor for class org.apache.tika.parser.pkg.AbstractArchiveParser
- AbstractArchiveParser(EncodingDetector) - Constructor for class org.apache.tika.parser.pkg.AbstractArchiveParser
- AbstractChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
Base class for file chunking.
- AbstractChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
-
Initializes a new instance of the AbstractChunking class.
- AbstractComponentManager<T extends TikaExtension, F extends TikaExtensionFactory<T>> - Class in org.apache.tika.pipes.core
-
Abstract base class for managing Tika components (Fetchers, Emitters, etc.).
- AbstractComponentManager(PluginManager, Map<String, ExtensionConfig>, boolean) - Constructor for class org.apache.tika.pipes.core.AbstractComponentManager
- AbstractComponentManager(PluginManager, Map<String, ExtensionConfig>, boolean, ConfigStore) - Constructor for class org.apache.tika.pipes.core.AbstractComponentManager
- AbstractConverter - Class in org.apache.tika.xmp.convert
-
Base class for Tika Metadata to XMP converter which provides some needed common functionality.
- AbstractConverter() - Constructor for class org.apache.tika.xmp.convert.AbstractConverter
- AbstractDBParser - Class in org.apache.tika.parser.jdbc
-
Abstract class that handles iterating through tables within a database.
- AbstractDBParser() - Constructor for class org.apache.tika.parser.jdbc.AbstractDBParser
- AbstractDWGParser - Class in org.apache.tika.parser.dwg
- AbstractDWGParser() - Constructor for class org.apache.tika.parser.dwg.AbstractDWGParser
- AbstractDWGParser(JsonConfig) - Constructor for class org.apache.tika.parser.dwg.AbstractDWGParser
- AbstractDWGParser(DWGParserConfig) - Constructor for class org.apache.tika.parser.dwg.AbstractDWGParser
- AbstractEmbeddingFilter - Class in org.apache.tika.inference
-
Base class for metadata filters that chunk text content and call a remote embeddings endpoint to produce vectors for each chunk.
- AbstractEmbeddingFilter() - Constructor for class org.apache.tika.inference.AbstractEmbeddingFilter
- AbstractEmbeddingFilter(InferenceConfig) - Constructor for class org.apache.tika.inference.AbstractEmbeddingFilter
- AbstractEmitter - Class in org.apache.tika.pipes.api.emitter
- AbstractEmitter(ExtensionConfig) - Constructor for class org.apache.tika.pipes.api.emitter.AbstractEmitter
- AbstractEncodingDetectorParser - Class in org.apache.tika.parser
-
Abstract base class for parsers that use the AutoDetectReader and need to use an EncodingDetector.
- AbstractEncodingDetectorParser() - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
- AbstractEncodingDetectorParser(EncodingDetector) - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
- AbstractExternalProcessParser - Class in org.apache.tika.parser
-
Abstract base class for parsers that call external processes.
- AbstractExternalProcessParser() - Constructor for class org.apache.tika.parser.AbstractExternalProcessParser
- AbstractImageParser - Class in org.apache.tika.parser.image
- AbstractImageParser() - Constructor for class org.apache.tika.parser.image.AbstractImageParser
- AbstractListManager - Class in org.apache.tika.parser.microsoft
- AbstractListManager() - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager
- AbstractListManager.LevelTuple - Class in org.apache.tika.parser.microsoft
- AbstractListManager.ParagraphLevelCounter - Class in org.apache.tika.parser.microsoft
- AbstractMultipleParser - Class in org.apache.tika.parser.multiple
-
Abstract base class for parser wrappers which may / will process a given stream multiple times, merging the results of the various parsers used.
- AbstractMultipleParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Collection<? extends Parser>) - Constructor for class org.apache.tika.parser.multiple.AbstractMultipleParser
- AbstractMultipleParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Parser...) - Constructor for class org.apache.tika.parser.multiple.AbstractMultipleParser
- AbstractMultipleParser.MetadataPolicy - Enum Class in org.apache.tika.parser.multiple
-
The various strategies for handling metadata emitted by multiple parsers.
- AbstractOfficeParser - Class in org.apache.tika.parser.microsoft
-
Intermediate layer to set OfficeParserConfig uniformly.
- AbstractOfficeParser() - Constructor for class org.apache.tika.parser.microsoft.AbstractOfficeParser
- AbstractOOXMLExtractor - Class in org.apache.tika.parser.microsoft.ooxml
-
Base class for all Tika OOXML extractors.
- AbstractOOXMLExtractor(ParseContext, OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- AbstractParser - Class in org.apache.tika.parser
-
Deprecated. For removal in 4.x.
- AbstractParser() - Constructor for class org.apache.tika.parser.AbstractParser
-
Deprecated.
- AbstractRecursiveParserWrapperHandler - Class in org.apache.tika.sax
-
This is a special handler to be used only with the RecursiveParserWrapper.
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- AbstractSpiComponentLoader<T> - Class in org.apache.tika.config.loader
-
Base loader for components that support SPI fallback with exclusions.
- AbstractSpiComponentLoader(String, String, Class<T>) - Constructor for class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Creates a new SPI component loader.
- AbstractStreamEmitter - Class in org.apache.tika.pipes.api.emitter
- AbstractStreamEmitter(ExtensionConfig) - Constructor for class org.apache.tika.pipes.api.emitter.AbstractStreamEmitter
- AbstractTikaExtension - Class in org.apache.tika.plugins
- AbstractTikaExtension(ExtensionConfig) - Constructor for class org.apache.tika.plugins.AbstractTikaExtension
- AbstractTranslator - Class in org.apache.tika.language.translate.impl
- AbstractTranslator() - Constructor for class org.apache.tika.language.translate.impl.AbstractTranslator
- AbstractUnpackHandler - Class in org.apache.tika.pipes.core.extractor
- AbstractUnpackHandler() - Constructor for class org.apache.tika.pipes.core.extractor.AbstractUnpackHandler
- AbstractVLMParser - Class in org.apache.tika.parser.vlm
-
Abstract base class for parsers that delegate to a remote Vision-Language Model (VLM) endpoint for OCR and document understanding.
- AbstractVLMParser(VLMOCRConfig) - Constructor for class org.apache.tika.parser.vlm.AbstractVLMParser
- AbstractVLMParser.HttpCall - Record Class in org.apache.tika.parser.vlm
-
Encapsulates a fully built HTTP request for a VLM API call.
- AbstractXML2003Parser - Class in org.apache.tika.parser.microsoft.xml
- AbstractXML2003Parser() - Constructor for class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
- accept(PipesResult.RESULT_STATUS) - Method in class org.apache.tika.pipes.reporters.PipesReporterBase
-
Implementations must call this for the includes/excludes filters to work!
- ACCEPT_ALL - Static variable in interface org.apache.tika.extractor.UnpackSelector
- AcceptAll() - Constructor for class org.apache.tika.extractor.UnpackSelector.AcceptAll
- ACCESS_PERMISSION - Enum constant in enum class org.apache.tika.eval.app.ProfilerBase.EXCEPTION_TYPE
- ACCESSED - Static variable in interface org.apache.tika.metadata.FileSystem
- accessKey() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the accessKey record component.
- AccessPermissionException - Exception in org.apache.tika.exception
-
Exception to be thrown when a document does not allow content extraction.
- AccessPermissionException() - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissionException(String) - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissionException(String, Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissionException(Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
- AccessPermissions - Interface in org.apache.tika.metadata
-
Until we can find a common standard, we'll use these options.
- ack() - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- ACK - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- ACKNOWLEDGEMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
- acks() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the acks record component.
- ACRONYM_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ACTION_TRIGGER - Static variable in interface org.apache.tika.metadata.PDF
-
This specifies where an action or destination would be found/triggered in the document: on document open, before close, etc.
- ACTION_TRIGGERS - Static variable in interface org.apache.tika.metadata.PDF
-
This is a list of all action or destination triggers contained within a given PDF.
- ACTION_TYPES - Static variable in interface org.apache.tika.metadata.PDF
- ActionItemSchemaVersion - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ActionItemStatus - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ActionItemType - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- actionPerformed(ActionEvent) - Method in class org.apache.tika.gui.TikaGUI
- ActiveMimeParser - Class in org.apache.tika.parser.microsoft.activemime
-
ActiveMime is a macro container format used in some mso files.
- ActiveMimeParser() - Constructor for class org.apache.tika.parser.microsoft.activemime.ActiveMimeParser
- AdapterHelper - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- AdapterHelper() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
- add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- add(int, Metadata, InputStream) - Method in interface org.apache.tika.extractor.UnpackHandler
- add(int, Metadata, InputStream) - Method in class org.apache.tika.pipes.core.extractor.AbstractUnpackHandler
- add(int, Metadata, InputStream) - Method in class org.apache.tika.pipes.core.extractor.EmittingUnpackHandler
- add(int, Metadata, InputStream) - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
- add(int, Metadata, InputStream) - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
- add(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- add(String, long) - Method in class org.apache.tika.eval.core.tokens.LangModel
- add(String, String) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata name/value mapping.
- add(String, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
As this API could only possibly work for simple properties in XMP, it just calls the set method, which replaces any existing value.
- add(String, String[]) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata name/value mapping.
- add(String, String, Map<String, String[]>) - Method in interface org.apache.tika.metadata.writefilter.MetadataWriteLimiter
-
Based on the field and value, this limiter modifies the field and/or the value to something that should be added to the Metadata object.
- add(String, String, Map<String, String[]>) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiter
- add(Property, int) - Method in class org.apache.tika.metadata.Metadata
-
Adds the integer value of the identified metadata property.
- add(Property, String) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata property/value mapping.
- add(Property, Calendar) - Method in class org.apache.tika.metadata.Metadata
-
Adds the date value of the identified metadata property.
- add(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- add(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- add(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- add(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- add(RenderResult) - Method in class org.apache.tika.renderer.PageBasedRenderResults
- add(RenderResult) - Method in class org.apache.tika.renderer.RenderResults
- ADD - Enum constant in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.UpdateStrategy
- addAlias(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- addAllCharacters(String, ContentHandler) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- addAllGetFetcherReplies(Iterable<? extends GetFetcherReply>) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addAlternative(GeoTag) - Method in class org.apache.tika.parser.geo.topic.GeoTag
- addCloseableResource(Closeable) - Method in class org.apache.tika.io.TikaInputStream
- addData(byte[], int, int) - Method in class org.apache.tika.detect.TextStatistics
- addDrawingHyperLinks(PackagePart) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- addEmitter(String, String, Map<String, Object>) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Add an emitter configuration.
- addEvenIfNull(Property, String, Metadata) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
- addException(Exception) - Method in class org.apache.tika.parser.ParseRecord
- addFetcher(String, String, Map<String, Object>) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Add a fetcher configuration.
- addGetFetcherReplies(int, GetFetcherReply) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addGetFetcherReplies(int, GetFetcherReply.Builder) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addGetFetcherReplies(GetFetcherReply) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addGetFetcherReplies(GetFetcherReply.Builder) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addGetFetcherRepliesBuilder() - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addGetFetcherRepliesBuilder(int) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- addingService(ServiceReference) - Method in class org.apache.tika.config.TikaActivator
- ADDITIONAL_FETCH_CONFIG_JSON_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseRequest
- ADDITIONAL_MODEL_INFO - Static variable in interface org.apache.tika.metadata.IPTC
-
Information about the ethnicity and other facets of the model(s) in a model-released image.
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.OpenDocumentConverter
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.RTFConverter
- AdditionalFlags - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Additional Flags
- addJvmArg(String) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Add a JVM argument for the forked process.
- addMetadata(Mp4Directory) - Method in class org.apache.tika.parser.mp4.boxes.TikaUserDataBox
- addMetadata(String) - Method in class org.apache.tika.parser.xml.AttributeMetadataHandler
-
Adds the given metadata value.
- addMetadata(String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
- addMetadata(String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- addMetadata(String) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- addMetadata(Metadata) - Method in class org.apache.tika.parser.ParseRecord
- addMulti(Metadata, Property, String) - Static method in class org.apache.tika.parser.microsoft.SummaryExtractor
- addOtherTesseractConfig(String, String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Add a key-value pair to pass to Tesseract using its -c command line option.
- addPaginated(PaginatedLocator) - Method in class org.apache.tika.inference.locator.Locators
- addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
-
Adds a file name pattern for the given media type.
- addPattern(MimeType, String, boolean) - Method in class org.apache.tika.mime.MimeTypes
-
Adds a file name pattern for the given media type.
- addPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mailcommons.MailUtil
-
This tries to split a "from" or "to" value into a person field and an email field.
- addPipesReporter(PipesReporter) - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- addPrefix(String, String) - Method in class org.apache.tika.sax.xpath.XPathParser
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.FetchAndParseReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.ListFetchersReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.ListFetchersRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SaveFetcherReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- addRepeatedField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- addResource(Closeable) - Method in class org.apache.tika.io.TemporaryResources
-
Adds a new resource to the set of tracked resources that will all be closed when the TemporaryResources.close() method is called.
- addResource(String, String, long, String, String) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Adds a resource to this data package with all parameters.
- addResource(FrictionlessResource) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Adds a resource to this data package.
- addResult(List<EncodingResult>, String) - Method in class org.apache.tika.detect.EncodingDetectorContext
-
Record the ranked results from a child detector.
- addSpatial(SpatialLocator) - Method in class org.apache.tika.inference.locator.Locators
- addSuperType(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- addTemporal(TemporalLocator) - Method in class org.apache.tika.inference.locator.Locators
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
-
This will buffer up to OpenNLPDetector.setMaxLength(int) and then ignore the rest of the text.
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- addText(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Add statistics about this text for the current document.
- addText(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Add text to the statistics being accumulated for the current document.
- addText(TextLocator) - Method in class org.apache.tika.inference.locator.Locators
- addType(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- addWarning(String) - Method in class org.apache.tika.parser.ParseRecord
- AdobeFontMetricParser - Class in org.apache.tika.parser.font
-
Parser for AFM Font Files
- AdobeFontMetricParser() - Constructor for class org.apache.tika.parser.font.AdobeFontMetricParser
- advance(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Records the given number of output characters (or more accurately UTF-16 code units).
- AdvancedTypeDetector - Class in org.apache.tika.example
- AdvancedTypeDetector() - Constructor for class org.apache.tika.example.AdvancedTypeDetector
- ADVISORY - Static variable in interface org.apache.tika.metadata.XMP
-
Unordered text strings of advisories.
- AES_ENV_VAR - Static variable in class org.apache.tika.client.HttpClientFactory
- afterRead(int) - Method in class org.apache.tika.io.TikaInputStream
- ALBUM - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the album."
- ALBUM_ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the album artist or group for compilation albums."
- ALIAS_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ALIAS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- ALIGNED_OFFSET - Static variable in class org.apache.tika.parser.microsoft.chm.ChmCommons
- alignedLenTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- alignedTreeTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- ALL - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.RenderingStrategy
- ALL - Enum constant in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.AttachmentStrategy
- AllocateExtendedGuidRange - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Allocate extended GUID range.
- AllocateExtendedGUIDRangeRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Allocate Extended GUID Range Request
- AllocateExtendedGUIDRangeResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Allocate Extended GUID Range Response
- ALLOW_EXTRACTION_FOR_ACCESSIBILITY - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.AccessCheckMode
-
Check permissions, but allow extraction for accessibility purposes if extraction for accessibility is allowed.
- allowedPolicies - Static variable in class org.apache.tika.parser.multiple.FallbackParser
-
The different Metadata Policies we support (all)
- allowedPolicies - Static variable in class org.apache.tika.parser.multiple.SupplementingParser
-
The different Metadata Policies we support (not discard)
- alpha - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
- ALT - Enum constant in enum class org.apache.tika.metadata.Property.PropertyType
-
An ordered array with some sort of criteria
- ALT_TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"An alternative tape name, set via the project window or timecode dialog in Premiere.
- ALTERNATE_FORMAT_CHUNK - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- AlternativePackaging - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- AlternativePackaging - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Alternative Packaging
- AlternativePackaging - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Alternative Packaging
- AlternativePackaging() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- alterTable() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the alterTable record component.
- ALTITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Altitude of the Point
- ALTITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- ALWAYS_ADD_FIELDS - Static variable in class org.apache.tika.metadata.writefilter.StandardMetadataLimiter
- ALWAYS_SET_FIELDS - Static variable in class org.apache.tika.metadata.writefilter.StandardMetadataLimiter
- amazonTranscribe(Path, Path) - Static method in class org.apache.tika.example.TranscribeTranslateExample
-
Use AmazonTranscribe to execute transcription on input data.
- AmazonTranscribe - Class in org.apache.tika.parser.transcribe.aws
-
Amazon Transcribe implementation.
- AmazonTranscribe() - Constructor for class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- AmazonTranscribe(JsonConfig) - Constructor for class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- AmazonTranscribe(AmazonTranscribeConfig) - Constructor for class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- AmazonTranscribeConfig - Class in org.apache.tika.parser.transcribe.aws
- AmazonTranscribeConfig() - Constructor for class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- AmazonTranscribeConfig.RuntimeConfig - Class in org.apache.tika.parser.transcribe.aws
-
RuntimeConfig blocks modification of security-sensitive credential and infrastructure fields at runtime.
- AMBIGUOUS - Enum constant in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
-
Sample is structurally valid but contains no complete multi-byte sequence (pure ASCII, or only a truncated lead at probe-end).
- AnalyzerManager - Class in org.apache.tika.eval.core.tokens
-
Manages tokenization for tika-eval.
- analyzeStorageIndexDataElement(List<DataElement>, ExGuid, AtomicReference<ExGuid>, AtomicReference<HashMap<CellID, ExGuid>>, AtomicReference<HashMap<ExGuid, ExGuid>>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to analyze the storage index data element to get all the mappings.
- ANNOTATION_SUBTYPES - Static variable in interface org.apache.tika.metadata.PDF
- ANNOTATION_TYPES - Static variable in interface org.apache.tika.metadata.PDF
- ANSICPG_MAP - Static variable in class org.apache.tika.parser.microsoft.rtf.jflex.RTFCharsetMaps
-
Maps \ansicpgN values to Java charsets.
- apiKey() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the apiKey record component.
- apiKey() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the apiKey record component.
- APP_VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- appendByteArrayToListOfByte(List<Byte>, byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
- appendGUID(UUID) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Append a specified GUID value into the buffer.
- appendInit32(int, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Append a specified Int32 type value into the buffer with the specified bit length.
- appendRectangle(Point2D, Point2D, Point2D, Point2D) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- appendUInit32(int, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Append a specified UInt32 type value into the buffer with the specified bit length.
- appendUInt64(long, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Append a specified UInt64 type value into the buffer with the specified bit length.
- AppleSingleFileParser - Class in org.apache.tika.parser.apple
-
Parser that strips the header off of AppleSingle and AppleDouble files.
- AppleSingleFileParser() - Constructor for class org.apache.tika.parser.apple.AppleSingleFileParser
- application(String) - Static method in class org.apache.tika.mime.MediaType
- APPLICATION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- APPLICATION_COMMENT - Static variable in interface org.apache.tika.metadata.DWG
- APPLICATION_NAME - Static variable in interface org.apache.tika.metadata.DWG
- APPLICATION_VERSION - Static variable in interface org.apache.tika.metadata.DWG
- APPLICATION_XML - Static variable in class org.apache.tika.mime.MediaType
- APPLICATION_ZIP - Static variable in class org.apache.tika.mime.MediaType
- applyReplacements(JsonNode, Map<String, Object>) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Applies replacements to a JsonNode tree, modifying it in place.
- applyStyleAndValue(int, ResultSet, Cell) - Method in class org.apache.tika.eval.app.reports.XLSXHREFFormatter
- AR - Static variable in class org.apache.tika.detect.zip.PackageConstants
- ARABIC - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- ARC_GZ - Static variable in class org.apache.tika.detect.gzip.GZipSpecializationDetector
- ARCHITECTURE_BITS - Static variable in interface org.apache.tika.metadata.MachineMetadata
- ARJ - Static variable in class org.apache.tika.detect.zip.PackageConstants
- ARMENIAN - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- ArrayNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
The class is used to represent the number of the array.
- ArrayNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
- ArrayOfContextIDs - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains an array of CompactID structures in the ObjectSpaceObjectPropSet.ContextIDs.body stream field.
- ArrayOfObjectIDs - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains an array of CompactID structures in the ObjectSpaceObjectPropSet.OIDs.body stream field.
- ArrayOfObjectSpaceIDs - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains an array of CompactID structures in the ObjectSpaceObjectPropSet.OSIDs.body stream field.
- ArrayOfPropertyValues - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains a prtArrayOfPropertyValues structure in the PropertySet.rgData stream field.
- ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the artist or artists."
- ARTWORK_OR_OBJECT - Static variable in interface org.apache.tika.metadata.IPTC
-
A set of metadata about artwork or an object in the item
- ARTWORK_OR_OBJECT_DETAIL_COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains any necessary copyright notice for claiming the intellectual property for artwork or an object in the image and should identify the current owner of the copyright of this work with associated intellectual property rights.
- ARTWORK_OR_OBJECT_DETAIL_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the name of the artist who has created artwork or an object in the image.
- ARTWORK_OR_OBJECT_DETAIL_DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
Designates the date and optionally the time the artwork or object in the image was created.
- ARTWORK_OR_OBJECT_DETAIL_SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
-
The organisation or body holding and registering the artwork or object in the image for inventory purposes.
- ARTWORK_OR_OBJECT_DETAIL_SOURCE_INVENTORY_NUMBER - Static variable in interface org.apache.tika.metadata.IPTC
-
The inventory number issued by the organisation or body holding and registering the artwork or object in the image.
- ARTWORK_OR_OBJECT_DETAIL_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
A reference for the artwork or object in the image.
- AS_IS - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReader.ALTER_METADATA_LIST
- asBytes(UUID) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
- asInputSource() - Method in class org.apache.tika.detect.AutoDetectReader
- ASSEMBLE_DOCUMENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user insert/rotate/delete pages.
- assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks if byte[] is not null
- assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
- assertChmAccessorNotNull(ChmAccessor<?>) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks if ChmAccessor is not null. In case of null, throws an exception.
- assertChmAccessorParameters(byte[], ChmAccessor<?>, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks validity of ChmAccessor parameters
- assertChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks the validity of the chmBlockSegment parameters
- assertCopyingDataIndex(int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
- assertDirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks the validity of the DirectoryListingEntry's parameters. In case of invalid parameter(s), throws an exception.
- assertInputStreamNotNull(InputStream) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks if InputStream is not null
- assertPositiveInt(int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
-
Checks if the int param is greater than zero. In case param <= 0, throws an exception.
- ASSOCIATED_FILE_RELATIONSHIP - Static variable in interface org.apache.tika.metadata.PDF
- asUuid(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
- AsyncEmitter - Class in org.apache.tika.pipes.core.async
-
Worker thread that takes EmitData off the queue, batches it and tries to emit it as a batch
- AsyncEmitter(PipesConfig, ArrayBlockingQueue<EmitDataPair>, EmitterManager) - Constructor for class org.apache.tika.pipes.core.async.AsyncEmitter
- AsyncHelper - Class in org.apache.tika.cli
- AsyncHelper() - Constructor for class org.apache.tika.cli.AsyncHelper
- AsyncProcessor - Class in org.apache.tika.pipes.core.async
-
This is the main class for handling async requests.
- AsyncRequest - Class in org.apache.tika.server.core.resource
- AsyncRequest(List<FetchEmitTuple>) - Constructor for class org.apache.tika.server.core.resource.AsyncRequest
- AsyncResource - Class in org.apache.tika.server.core.resource
- AsyncResource(Path) - Constructor for class org.apache.tika.server.core.resource.AsyncResource
- AtlassianJwtFetcher - Class in org.apache.tika.pipes.fetcher.atlassianjwt
- AtlassianJwtFetcher(ExtensionConfig, AtlassianJwtFetcherConfig) - Constructor for class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- AtlassianJwtFetcherConfig - Class in org.apache.tika.pipes.fetcher.atlassianjwt.config
- AtlassianJwtFetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- AtlassianJwtFetcherFactory - Class in org.apache.tika.pipes.fetcher.atlassianjwt
-
Factory for creating Atlassian JWT fetchers.
- AtlassianJwtFetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherFactory
- AtlassianJwtFetcherPlugin - Class in org.apache.tika.pipes.fetcher.atlassianjwt
- AtlassianJwtFetcherPlugin() - Constructor for class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherPlugin
- AtlassianJwtGenerator - Class in org.apache.tika.pipes.fetcher.atlassianjwt
- AtlassianJwtGenerator(String, String, String, int) - Constructor for class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtGenerator
- AtlassianJwtPipesPlugin - Class in org.apache.tika.pipes.plugin.atlassianjwt
- AtlassianJwtPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.atlassianjwt.AtlassianJwtPipesPlugin
- ATTACH_CONTENT_ID - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_EXTENSION - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_FILE_NAME - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_FLAGS - Static variable in interface org.apache.tika.metadata.MAPI
-
PidTagAttachFlags (0x3714) — indicates which body formats might reference this attachment.
- ATTACH_HIDDEN - Static variable in interface org.apache.tika.metadata.MAPI
-
PidTagAttachmentHidden (0x7FFE) — indicates whether this attachment is hidden from the end user.
- ATTACH_LANGUAGE - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_LONG_FILE_NAME - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_LONG_PATH_NAME - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACH_MIME - Static variable in interface org.apache.tika.metadata.MAPI
- ATTACHMENT - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- ATTACHMENT_TYPE - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- attachmentStrategy() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the attachmentStrategy record component.
- attachmentStrategy() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the attachmentStrategy record component.
- attachmentStrategy() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the attachmentStrategy record component.
- attachmentStrategy() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the attachmentStrategy record component.
- AttributeDependantMetadataHandler - Class in org.apache.tika.parser.xml
-
This adds a Metadata entry for a given node.
- AttributeDependantMetadataHandler(Metadata, String, String) - Constructor for class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
- AttributeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .
- AttributeMatcher() - Constructor for class org.apache.tika.sax.xpath.AttributeMatcher
- AttributeMetadataHandler - Class in org.apache.tika.parser.xml
-
SAX event handler that maps the contents of an XML attribute into a metadata field.
- AttributeMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
- AttributeMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
- audio(String) - Static method in class org.apache.tika.mime.MediaType
- AUDIO_CHANNEL_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio channel type."
- AUDIO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio compression used.
- AUDIO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the audio was last modified."
- AUDIO_SAMPLE_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio sample rate.
- AUDIO_SAMPLE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio sample type."
- AudioFrame - Class in org.apache.tika.parser.mp3
-
An Audio Frame in an MP3 file.
- AudioFrame(int, int, int, int, int, int, float) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
-
Creates a new instance of AudioFrame and initializes all properties.
- AudioFrame(int, int, int, int, InputStream) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
-
Deprecated. Use the constructor which is passed all values directly.
- AudioFrame(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
-
Deprecated. Use the constructor which is passed all values directly.
- AudioParser - Class in org.apache.tika.parser.audio
- AudioParser() - Constructor for class org.apache.tika.parser.audio.AudioParser
- AUTH_TOKEN_LENGTH_BYTES - Static variable in class org.apache.tika.pipes.core.server.PipesServer
- Author - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the principal author(s) of a document
- AuthorMostRecent - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- AuthorOriginal - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- AUTHORS_POSITION - Static variable in interface org.apache.tika.metadata.Photoshop
- authScheme() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the authScheme record component.
- authScheme() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the authScheme record component.
- authScheme() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the authScheme record component.
- authScheme() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the authScheme record component.
- AUTO - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.Strategy
- AutoDetectParser - Class in org.apache.tika.parser
- AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
-
Creates an auto-detecting parser instance using the default Tika configuration.
- AutoDetectParser(Detector) - Constructor for class org.apache.tika.parser.AutoDetectParser
- AutoDetectParser(Detector, Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
- AutoDetectParser(MediaTypeRegistry, Parser, Detector, AutoDetectParserConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
- AutoDetectParser(Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
-
Creates an auto-detecting parser instance using the specified set of parsers.
- AutoDetectParserConfig - Class in org.apache.tika.parser
-
Configuration for AutoDetectParser behavior.
- AutoDetectParserConfig() - Constructor for class org.apache.tika.parser.AutoDetectParserConfig
- AutoDetectReader - Class in org.apache.tika.detect
-
An input stream reader that automatically detects the character encoding to be used for converting bytes to characters.
- AutoDetectReader(InputStream) - Constructor for class org.apache.tika.detect.AutoDetectReader
- AutoDetectReader(InputStream, Metadata) - Constructor for class org.apache.tika.detect.AutoDetectReader
- AutoDetectReader(InputStream, Metadata, ServiceLoader) - Constructor for class org.apache.tika.detect.AutoDetectReader
- AutoDetectReader(InputStream, Metadata, EncodingDetector) - Constructor for class org.apache.tika.detect.AutoDetectReader
- autoTranslatePost(InputStream, String, String) - Method in class org.apache.tika.server.core.resource.TranslateResource
- autoTranslatePut(InputStream, String, String) - Method in class org.apache.tika.server.core.resource.TranslateResource
- available() - Method in class org.apache.tika.io.BoundedInputStream
- available() - Method in class org.apache.tika.io.LookaheadInputStream
- awaitAck() - Method in class org.apache.tika.pipes.core.server.ServerProtocolIO
-
Reads a framed message and verifies it is an ACK.
- AZBlobEmitter - Class in org.apache.tika.pipes.emitter.azblob
-
Emitter to write files to Azure Blob Storage.
- AZBlobEmitterConfig - Record Class in org.apache.tika.pipes.emitter.azblob
- AZBlobEmitterConfig(String, String, String, String, String, boolean) - Constructor for record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Creates an instance of an AZBlobEmitterConfig record class.
- AZBlobEmitterFactory - Class in org.apache.tika.pipes.emitter.azblob
-
Factory for creating Azure Blob Storage emitters.
- AZBlobEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterFactory
- AZBlobFetcher - Class in org.apache.tika.pipes.fetcher.azblob
-
Fetches files from Azure blob storage.
- AZBlobFetcherConfig - Class in org.apache.tika.pipes.fetcher.azblob.config
- AZBlobFetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- AZBlobFetcherFactory - Class in org.apache.tika.pipes.fetcher.azblob
-
Factory for creating Azure Blob Storage fetchers.
- AZBlobFetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcherFactory
- AZBlobPipesIterator - Class in org.apache.tika.pipes.iterator.azblob
- AZBlobPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.azblob
- AZBlobPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- AZBlobPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.azblob
-
Factory for creating Azure Blob Storage pipes iterators.
- AZBlobPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorFactory
- AZBlobPipesPlugin - Class in org.apache.tika.pipes.plugin.azblob
- AZBlobPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.azblob.AZBlobPipesPlugin
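The AsyncEmitter entry in this section describes a worker thread that takes EmitData off a queue, batches it, and tries to emit it as a batch. The following is a minimal sketch of that general pattern using only java.util.concurrent. It is not Tika's implementation: the class name BatchingSketch, the method nextBatch, and its parameters are hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of queue batching; not part of the Tika API.
public class BatchingSketch {

    /**
     * Waits up to timeoutMillis for a first item, then drains whatever else
     * is immediately available, up to maxBatchSize items in total.
     * Returns an empty list if nothing arrived before the timeout.
     */
    static <T> List<T> nextBatch(ArrayBlockingQueue<T> queue, int maxBatchSize, long timeoutMillis)
            throws InterruptedException {
        List<T> batch = new ArrayList<>();
        T first = queue.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        if (first == null) {
            return batch;
        }
        batch.add(first);
        // drainTo does not block; it moves at most (maxBatchSize - 1) more items.
        queue.drainTo(batch, maxBatchSize - 1);
        return batch;
    }

    public static void main(String[] args) throws InterruptedException {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        for (int i = 0; i < 5; i++) {
            queue.offer("doc-" + i);
        }
        // A real worker would loop until shutdown and pass each batch to an emitter.
        List<String> batch;
        while (!(batch = nextBatch(queue, 2, 50)).isEmpty()) {
            System.out.println("emitting batch of " + batch.size() + ": " + batch);
        }
    }
}
```

Blocking only for the first item keeps the worker idle while the queue is empty, and the non-blocking drain avoids delaying a partial batch once at least one item is ready.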
B
- B - Enum constant in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
- BAG - Enum constant in enum class org.apache.tika.metadata.Property.PropertyType
-
An unordered array
- BASE_MIME - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- BASE32 - Enum constant in enum class org.apache.tika.digest.DigestDef.Encoding
- BASE64 - Enum constant in enum class org.apache.tika.digest.DigestDef.Encoding
- basePath() - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Returns the value of the basePath record component.
- baseRevisionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
- BasicContentHandlerFactory - Class in org.apache.tika.sax
-
Basic factory for creating common types of ContentHandlers.
- BasicContentHandlerFactory() - Constructor for class org.apache.tika.sax.BasicContentHandlerFactory
-
No-arg constructor for bean-style configuration (e.g., Jackson deserialization).
- BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE, int) - Constructor for class org.apache.tika.sax.BasicContentHandlerFactory
-
Create a BasicContentHandlerFactory with BasicContentHandlerFactory.throwOnWriteLimitReached set to true
- BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE, int, boolean, ParseContext) - Constructor for class org.apache.tika.sax.BasicContentHandlerFactory
- BasicContentHandlerFactory.HANDLER_TYPE - Enum Class in org.apache.tika.sax
-
Common handler types for content.
- BasicObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
Base object for FSSHTTPB.
- BasicObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
- BasicTokenCountStatsCalculator - Class in org.apache.tika.eval.core.textstats
- BasicTokenCountStatsCalculator() - Constructor for class org.apache.tika.eval.core.textstats.BasicTokenCountStatsCalculator
- BASIS - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- batchInsert(PreparedStatement, TableInfo, Map<Cols, String>) - Static method in class org.apache.tika.eval.app.db.JDBCUtil
- batchSize() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the batchSize record component.
- BCC - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
- BEGIN - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- BenchmarkCharsetDetectors - Class in org.apache.tika.ml.chardetect.tools
-
Micro-benchmark comparing charset detector throughput.
- BenchmarkCharsetDetectors() - Constructor for class org.apache.tika.ml.chardetect.tools.BenchmarkCharsetDetectors
- BENGALI - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- BETTER - Static variable in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- BIG - Static variable in class org.apache.tika.metadata.MachineMetadata.Endian
- BIGENDIAN_16_BIT - Enum constant in enum class org.apache.tika.parser.strings.StringsEncoding
- BIGENDIAN_32_BIT - Enum constant in enum class org.apache.tika.parser.strings.StringsEncoding
- BIN - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- BinaryItem - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- BinaryItem() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
Initializes a new instance of the BinaryItem class.
- BinaryItem(Collection<Byte>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
Initializes a new instance of the BinaryItem class with the specified content.
- BIND_EXCEPTION - Static variable in class org.apache.tika.server.core.TikaServerProcess
- bindService() - Method in class org.apache.tika.TikaGrpc.TikaImplBase
- bindService(TikaGrpc.AsyncService) - Static method in class org.apache.tika.TikaGrpc
- Bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
-
This class is used to read and set bit values in a byte array
- Bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
- BitConverter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- BitConverter() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- BitReader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
-
A class used to extract values across byte boundaries at arbitrary bit positions.
- BitReader(byte[], int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Initializes a new instance of the BitReader class with the specified byte buffer and start position in bytes.
- BITS_PER_SAMPLE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Number of bits per component in each channel."
- BITUNES - Static variable in class org.apache.tika.detect.apple.BPListDetector
- BitWriter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- BitWriter(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Initializes a new instance of the BitWriter class with the specified buffer size in bytes.
- blobExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
- blockUntilShutdown() - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
-
Await termination on the main thread since the grpc library uses daemon threads.
- BMEMGRAPH - Static variable in class org.apache.tika.detect.apple.BPListDetector
- body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
- body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
- body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
- body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
- BODY - Enum constant in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- BODY_TYPES_PROCESSED - Static variable in interface org.apache.tika.metadata.MAPI
- BodyContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that only passes everything inside the XHTML <body/> tag to the underlying handler.
- BodyContentHandler() - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to an internal string buffer.
- BodyContentHandler(int) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to an internal string buffer.
- BodyContentHandler(Writer) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that writes XHTML body character events to the given writer.
- BodyContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.BodyContentHandler
-
Creates a content handler that passes all XHTML body events to the given underlying content handler.
- BodyTextAlignment - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- BoilerpipeContentHandler - Class in org.apache.tika.sax.boilerpipe
-
Uses the boilerpipe library to automatically extract the main content from a web page.
- BoilerpipeContentHandler(Writer) - Constructor for class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
-
Creates a content handler that writes XHTML body character events to the given writer.
- BoilerpipeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
-
Creates a new boilerpipe-based content extractor, using the DefaultExtractor extraction rules and "delegate" as the content handler.
- BoilerpipeContentHandler(ContentHandler, BoilerpipeExtractor) - Constructor for class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
-
Creates a new boilerpipe-based content extractor, using the given extraction rules.
- Bold - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- BOMDetector - Class in org.apache.tika.detect
-
Encoding detector that identifies the character set from a byte-order mark (BOM) at the start of the stream.
- BOMDetector() - Constructor for class org.apache.tika.detect.BOMDetector
- Bool - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property is a Boolean value specified by boolValue.
- BOOLEAN - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- boolValue - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
- bootstrapServers() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the bootstrapServers record component.
- BOPOMOFO - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- BouncyCastleDigester - Class in org.apache.tika.parser.digestutils
-
Digester that relies on BouncyCastle for MessageDigest implementations.
- BouncyCastleDigester(List<DigestDef>) - Constructor for class org.apache.tika.parser.digestutils.BouncyCastleDigester
- BouncyCastleDigester(DigestDef.Algorithm...) - Constructor for class org.apache.tika.parser.digestutils.BouncyCastleDigester
-
Convenience constructor using Algorithm enum with HEX encoding.
- BouncyCastleDigesterFactory - Class in org.apache.tika.parser.digestutils
-
Factory for BouncyCastleDigester with configurable algorithms and encodings.
- BouncyCastleDigesterFactory() - Constructor for class org.apache.tika.parser.digestutils.BouncyCastleDigesterFactory
- BoundedInputStream - Class in org.apache.tika.io
-
Very slight modification of Commons' BoundedInputStream so that we can figure out if this hit the bound or not.
- BoundedInputStream(long, InputStream) - Constructor for class org.apache.tika.io.BoundedInputStream
- BPGParser - Class in org.apache.tika.parser.image
-
Parser for the Better Portable Graphics (BPG) File Format.
- BPGParser() - Constructor for class org.apache.tika.parser.image.BPGParser
- BPLIST - Static variable in class org.apache.tika.detect.apple.BPListDetector
- BPListDetector - Class in org.apache.tika.detect.apple
-
Detector for BPList with utility functions for PList.
- BPListDetector() - Constructor for class org.apache.tika.detect.apple.BPListDetector
- BROTLI - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- bucket() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Returns the value of the bucket record component.
- bucket() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the bucket record component.
- bufferMemory() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the bufferMemory record component.
- BufferUnderrunException() - Constructor for exception org.apache.tika.io.EndianUtils.BufferUnderrunException
- build() - Method in class org.apache.tika.client.HttpClientFactory
- build() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- build() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- build() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- build() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- build() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- build() - Method in interface org.apache.tika.digest.DigesterFactory
-
Build a new Digester instance using the factory's configured properties.
- build() - Method in class org.apache.tika.FetchAndParseReply.Builder
- build() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- build() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- build() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- build() - Method in class org.apache.tika.GetFetcherReply.Builder
- build() - Method in class org.apache.tika.GetFetcherRequest.Builder
- build() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- build() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- build() - Method in class org.apache.tika.ListFetchersReply.Builder
- build() - Method in class org.apache.tika.ListFetchersRequest.Builder
- build() - Method in class org.apache.tika.parser.digestutils.BouncyCastleDigesterFactory
- build() - Method in class org.apache.tika.parser.digestutils.CommonsDigesterFactory
- build() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Build the ConfigOverrides instance.
- build() - Method in class org.apache.tika.SaveFetcherReply.Builder
- build() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- build() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- build() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- build() - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- build() - Method in class org.apache.tika.serialization.ComponentConfig.Builder
-
Build the ComponentConfig.
- build(int) - Static method in class org.apache.tika.http.TikaHttpClient
-
Create a new TikaHttpClient with a daemon-thread executor.
- build(Channel, CallOptions) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
- build(Channel, CallOptions) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
- build(Channel, CallOptions) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
- build(Channel, CallOptions) - Method in class org.apache.tika.TikaGrpc.TikaStub
- build(Path) - Static method in class org.apache.tika.eval.app.reports.ResultsReporter
- build(Path) - Static method in class org.apache.tika.server.client.TikaServerClientConfig
- build(CompositeParser, Detector, AutoDetectParserConfig) - Static method in class org.apache.tika.parser.AutoDetectParser
- build(NodeObject) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData.Builder
-
This method is used to build a list of DataElement from a node object
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.es.ESEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.gcs.GCSEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.jdbc.JDBCEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.kafka.KafkaEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.s3.S3Emitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetcher.gcs.GCSFetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetcher.s3.S3Fetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.fetchers.microsoftgraph.MicrosoftGraphFetcher
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.csv.CSVPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.s3.S3PipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.iterator.solr.SolrPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIterator
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- build(ExtensionConfig) - Static method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- Build(byte[]) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject.RootNodeObjectBuilder
-
This method is used to build a root node object from a byte array
- Build(byte[], SignatureObject) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
-
This method is used to build an intermediate node object from a byte array with a signature
- Build(List<ObjectGroupDataElementData>, ObjectGroupObjectData, ExGuid) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
-
This method is used to build an intermediate node object from a list of object group data elements
- BUILD - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Build.
- build2() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
Initialize the MimeTypes with this builder instance
- BuildCharsetTrainingData - Class in org.apache.tika.ml.chardetect.tools
-
Generates charset-detection training, devtest, and test data from MADLAD-400 and Cantonese Wikipedia sentence files.
- BuildCharsetTrainingData() - Constructor for class org.apache.tika.ml.chardetect.tools.BuildCharsetTrainingData
- buildConfig(JsonConfig, Class<T>) - Static method in class org.apache.tika.config.ConfigDeserializer
-
Deserializes a JSON configuration to a configuration object.
- buildDataElements(byte[], AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to build a list of data elements to represent a file.
- buildDataPackage(String) - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Builds the DataPackage manifest from collected files.
- buildDOM(InputStream) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Builds a Document with a DocumentBuilder from the pool
- buildDOM(InputStream, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified DocumentBuilder.
- buildDOM(Reader, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified DocumentBuilder.
- buildDOM(String) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Builds a Document with a DocumentBuilder from the pool
- buildDOM(Path) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Builds a Document with a DocumentBuilder from the pool
- builder() - Static method in class org.apache.tika.pipes.core.config.ConfigOverrides
- builder(String, Class<T>) - Static method in class org.apache.tika.serialization.ComponentConfig
-
Creates a new builder for ComponentConfig.
- Builder() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- Builder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData.Builder
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStoreFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.es.ESEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.kafka.KafkaEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.s3.S3EmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.gcs.GCSFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetcher.s3.S3FetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.MicrosoftGraphFetcherFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStoreFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.reporter.es.ESReporterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.reporter.fs.FileSystemReporterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterFactory
- buildExtension(ExtensionConfig) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterFactory
- buildExtension(ExtensionConfig) - Method in interface org.apache.tika.plugins.TikaExtensionFactory
- buildGroupIndices(String[]) - Static method in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Build a per-class group-index array from a label array (e.g. from a LinearModel), using CharsetConfusables.GROUPS (both symmetric and superset chains) for probability collapsing in inference.
- buildHttpCall(VLMOCRConfig, String, String) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
-
Build a fully formed AbstractVLMParser.HttpCall for the target API.
- buildHttpCall(VLMOCRConfig, String, String) - Method in class org.apache.tika.parser.vlm.ClaudeVLMParser
- buildHttpCall(VLMOCRConfig, String, String) - Method in class org.apache.tika.parser.vlm.GeminiVLMParser
- buildHttpCall(VLMOCRConfig, String, String) - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- BuildJunkTrainingData - Class in org.apache.tika.ml.junkdetect.tools
-
Builds per-script positive training data for the junk detector from MADLAD-400 and Wikipedia sentence files.
- BuildJunkTrainingData() - Constructor for class org.apache.tika.ml.junkdetect.tools.BuildJunkTrainingData
- buildParagraphTagAndStyle(String, boolean) - Static method in class org.apache.tika.parser.microsoft.WordExtractor
-
Given a style name, return what tag should be used, and what style should be applied to it.
- buildPartial() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- buildPartial() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- buildPartial() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- buildPartial() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- buildPartial() - Method in class org.apache.tika.FetchAndParseReply.Builder
- buildPartial() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- buildPartial() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- buildPartial() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- buildPartial() - Method in class org.apache.tika.GetFetcherReply.Builder
- buildPartial() - Method in class org.apache.tika.GetFetcherRequest.Builder
- buildPartial() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- buildPartial() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- buildPartial() - Method in class org.apache.tika.ListFetchersReply.Builder
- buildPartial() - Method in class org.apache.tika.ListFetchersRequest.Builder
- buildPartial() - Method in class org.apache.tika.SaveFetcherReply.Builder
- buildPartial() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- buildPartial() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- buildPartial() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Populates the XHTMLContentHandler object received as parameter.
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.VSDXExtractorDecorator
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
- buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- BundleActivator - Class in org.apache.tika.bundle.internal
-
Registers Tika Parser and Detector services when the bundle starts in an OSGi container.
- BundleActivator() - Constructor for class org.apache.tika.bundle.internal.BundleActivator
- BWEBARCHIVE - Static variable in class org.apache.tika.detect.apple.BPListDetector
- BYTE_ARRAY_LENGHT - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- byteIdenticalOnProbe(byte[], Charset, Charset) - Static method in class org.apache.tika.ml.chardetect.DecodeEquivalence
-
Returns true if decoding probe under charsets a and b produces bit-identical character sequences.
- bytes() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns the value of the bytes record component.
- bytes() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the bytes record component.
- BytesRefCalculator<T> - Interface in org.apache.tika.eval.core.textstats
-
Interface for calculators that require a string
- BytesRefCalculator.BytesRefCalcInstance<T> - Interface in org.apache.tika.eval.core.textstats
- ByteUtil - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- ByteUtil() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
- BZIP - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- BZIP2 - Static variable in class org.apache.tika.detect.zip.CompressorConstants
C
- CachedTitleString - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- CachedTitleStringFromPage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- CachedTranslator - Class in org.apache.tika.language.translate.impl
-
CachedTranslator.
- CachedTranslator() - Constructor for class org.apache.tika.language.translate.impl.CachedTranslator
-
Create a new CachedTranslator (must set the Translator with CachedTranslator.setTranslator(Translator) before use!)
- CachedTranslator(Translator) - Constructor for class org.apache.tika.language.translate.impl.CachedTranslator
-
Create a new CachedTranslator.
- cacheSize() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the cacheSize record component.
- calcTextStats(ContentTags) - Method in class org.apache.tika.eval.app.ProfilerBase
- calculate(String) - Method in class org.apache.tika.eval.core.langid.LanguageIDWrapper
- calculate(String) - Method in class org.apache.tika.eval.core.textstats.CompositeTextStatsCalculator
- calculate(String) - Method in class org.apache.tika.eval.core.textstats.ContentLengthCalculator
- calculate(String) - Method in interface org.apache.tika.eval.core.textstats.StringStatsCalculator
- calculate(String) - Method in class org.apache.tika.eval.core.textstats.UnicodeBlockCounter
- calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokens
- calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensBhattacharyya
- calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensCosine
- calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensHellinger
- calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensKLDivergence
- calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensKLDNormed
- calculate(List<LanguageResult>, TokenCounts) - Method in interface org.apache.tika.eval.core.textstats.LanguageAwareTokenCountStats
- calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.BasicTokenCountStatsCalculator
- calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TextProfileSignature
- calculate(TokenCounts) - Method in interface org.apache.tika.eval.core.textstats.TokenCountStatsCalculator
- calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TokenEntropy
- calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TokenLengths
- calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TopNTokens
- calculateContrastStatistics(TokenCounts, TokenCounts) - Method in class org.apache.tika.eval.core.tokens.TokenContraster
- calculateExtension(Metadata, String) - Static method in class org.apache.tika.io.FilenameUtils
-
Calculate the extension based on the HttpHeaders.CONTENT_TYPE value.
- call() - Method in class org.apache.tika.eval.app.StatusReporter
- call() - Method in class org.apache.tika.pipes.core.async.AsyncEmitter
- call() - Method in class org.apache.tika.pipes.core.pipesiterator.CallablePipesIterator
- call() - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- CallablePipesIterator - Class in org.apache.tika.pipes.core.pipesiterator
-
This is a simple wrapper around PipesIterator that allows it to be called in its own thread.
- CallablePipesIterator(PipesIterator, ArrayBlockingQueue<FetchEmitTuple>) - Constructor for class org.apache.tika.pipes.core.pipesiterator.CallablePipesIterator
-
This sets timeoutMillis to -1, meaning that this will block forever trying to add fetchemittuples to the queue.
- CallablePipesIterator(PipesIterator, ArrayBlockingQueue<FetchEmitTuple>, long) - Constructor for class org.apache.tika.pipes.core.pipesiterator.CallablePipesIterator
-
This sets the number of PipesIterator.COMPLETED_SEMAPHORE to 1.
- CallablePipesIterator(PipesIterator, ArrayBlockingQueue<FetchEmitTuple>, long, int) - Constructor for class org.apache.tika.pipes.core.pipesiterator.CallablePipesIterator
- CAN_MODIFY - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can any modifications be made to the document
- CAN_MODIFY_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user modify annotations
- CAN_PRINT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user print the document
- CAN_PRINT_FAITHFUL - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user print an image-degraded version of the document.
- CANADIAN_ABORIGINAL - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- CannotBeSelected - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- canRun() - Static method in class org.apache.tika.langdetect.mitll.TextLangDetector
- canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
- CAPTION_WRITER - Static variable in interface org.apache.tika.metadata.Photoshop
- CaptureGroupMetadataFilter - Class in org.apache.tika.metadata.filter
-
This filter runs a regex against the first value in the "sourceField".
- CaptureGroupMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- CaptureGroupMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
-
Constructor for JSON configuration.
- CaptureGroupMetadataFilter(CaptureGroupMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
-
Constructor with explicit Config object.
- CaptureGroupMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- CATEGORY - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- CATEGORY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
A categorization of the content of this package.
- CATEGORY - Static variable in interface org.apache.tika.metadata.Photoshop
- cb - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
- CC - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
- cell(String, String, XSSFComment) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
Bridge for POI's XSSFSheetXMLHandler.SheetContentsHandler interface, used by the XLSB (binary) path via XSSFBSheetHandler.
- cell(String, String, XSSFCommentsShim.CommentData) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- Cell - Interface in org.apache.tika.parser.microsoft
-
Cell of content.
- CellDecorator - Class in org.apache.tika.parser.microsoft
-
Cell decorator.
- CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
- CellError - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Cell Error
- cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
- cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
- CellID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- CellID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Initializes a new instance of the CellID class, this is default constructor.
- CellID(CellID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Initializes a new instance of the CellID class, this is the copy constructor.
- CellID(ExGuid, ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Initializes a new instance of the CellID class with specified ExGuids.
- cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
- cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
- CellIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- CellIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
Initializes a new instance of the CellIDArray class, this is default constructor.
- CellIDArray(long, List<CellID>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
Initializes a new instance of the CellIDArray class.
- CellIDArray(CellIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
Initializes a new instance of the CellIDArray class, this is copy constructor.
- CellKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Cell Knowledge
- CellKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Cell Knowledge
- CellKnowledgeEntry - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Cell Knowledge Entry
- CellKnowledgeRange - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Cell Knowledge Range
- cellManifestCurrentRevision - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
- CellManifestCurrentRevision - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- CellManifestCurrentRevision - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Cell Manifest Current Revision
- CellManifestCurrentRevision() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
Initializes a new instance of the CellManifestCurrentRevision class.
- cellManifestCurrentRevisionExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
- CellManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Cell manifest data element
- CellManifestDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Cell Manifest Data Element
- CellManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
Initializes a new instance of the CellManifestDataElementData class.
- cellManifests - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- cellMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
- cellMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
- cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
- cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
- CellRoundtripOptions - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Cell Roundtrip Options
- CellSecondExGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
- CENTRAL_DIRECTORY_ONLY_ENTRIES - Static variable in interface org.apache.tika.metadata.Zip
-
Entry names that exist in central directory but not in local headers.
- CERTIFICATE - Static variable in interface org.apache.tika.metadata.XMPRights
-
A Web URL for a rights management certificate.
- ChannelTypePropertyConverter() - Constructor for class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated.
- CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Characters in the document
- CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.Office
-
The number of Characters in the document, including spaces
- characters - Variable in class org.apache.tika.mime.MimeTypesReader
- characters(char[], int, int) - Method in class org.apache.tika.mime.MimeTypesReader
- characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- characters(char[], int, int) - Method in class org.apache.tika.parser.mif.MIFContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.tmx.TMXContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- characters(char[], int, int) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- characters(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
-
The characters method is called whenever a Parser wants to pass raw...
- characters(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler.
- characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToMarkdownContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
-
Writes the given characters to the given character stream.
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToXMLContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
-
Writes the given characters to the given character stream.
- characters(char[], int, int) - Method in class org.apache.tika.sax.XHTMLContentHandler
- characters(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- CHARACTERS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
- charset - Variable in class org.apache.tika.detect.OverrideEncodingDetector.Config
- Charset - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- CharsetConfusables - Class in org.apache.tika.ml.chardetect
-
Charset relationships used for lenient evaluation of charset detectors.
- CharsetContentHandlerFactory() - Constructor for class org.apache.tika.example.PickBestTextEncodingParser.CharsetContentHandlerFactory
-
Deprecated.
- CharsetDetector - Class in org.apache.tika.parser.txt
-
CharsetDetector provides a facility for detecting the charset or encoding of character data in an unknown format.
- CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
Constructor
- CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
- charsetFromContentEncoding(Metadata) - Static method in class org.apache.tika.detect.MetadataCharsetDetector
-
Returns the charset named in HttpHeaders.CONTENT_ENCODING, or null if absent or unparseable.
- charsetFromContentType(Metadata) - Static method in class org.apache.tika.detect.MetadataCharsetDetector
-
Returns the charset named in the charset parameter of the HttpHeaders.CONTENT_TYPE value, or null if absent or unparseable.
- CharsetMatch - Class in org.apache.tika.parser.txt
-
This class represents a charset that has been identified by a CharsetDetector as a possible encoding for a set of input data.
- CharsetSupersets - Class in org.apache.tika.detect
-
Maps detected charsets to safer superset charsets for decoding.
- CharsetTester() - Constructor for class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
-
Deprecated.
- CharsetUtils - Class in org.apache.tika.utils
- CharsetUtils() - Constructor for class org.apache.tika.utils.CharsetUtils
- CharSoupFeatureExtractor - Class in org.apache.tika.langdetect.charsoup
-
Extracts character n-gram features from text using the hashing trick (FNV-1a).
- CharSoupFeatureExtractor(int) - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Create an extractor with bigrams only.
- CharSoupFeatureExtractor(int, boolean) - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Create an extractor with configurable n-gram mode.
- CharSoupLanguageDetector - Class in org.apache.tika.langdetect.charsoup
-
CharSoup language detector using INT8-quantized multinomial logistic regression trained on Wikipedia (primary corpus) with MADLAD supplements for thin languages.
- CharSoupLanguageDetector() - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Constructs a detector using the default classpath-loaded model.
- CharSoupLanguageDetector(CharSoupModel) - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Constructs a detector that uses a caller-supplied model instead of the classpath default.
- CharSoupMetadataFilter - Class in org.apache.tika.langdetect.charsoup
-
A MetadataFilter that runs CharSoup language detection on the extracted text content and writes the detected language and confidence into the metadata.
- CharSoupMetadataFilter() - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupMetadataFilter
- CharSoupModel - Class in org.apache.tika.langdetect.charsoup
-
INT8-quantized multinomial logistic regression model for language detection.
- CharSoupModel(int, int, String[], float[], float[], byte[][]) - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Construct from class-major byte[][] weights with default feature configuration (word unigrams only; backward compatible with v1).
- CharSoupModel(int, int, String[], float[], float[], byte[][], int) - Constructor for class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Construct from class-major byte[][] weights with explicit feature flags.
- check(String[], int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Checks to see if the command can be run.
- check(String, int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Checks to see if the command can be run.
- checkActive() - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- checkAscii(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if bytes contains no bytes with value >= 0x80 (i.e. pure 7-bit ASCII, which is a strict subset of UTF-8).
- checkAscii(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- checkAvail() - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Ping lucene-geo-gazetteer API
- checkBit(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- checkCertificateExpiration() - Method in class org.apache.tika.server.core.TlsConfig
-
Check certificate expiration dates and log warnings for certificates expiring within the configured threshold.
- checkCommand(String[], int...) - Static method in class org.apache.tika.utils.ProcessUtils
-
Checks to see if the command can be run.
- checkCommand(String, int...) - Method in class org.apache.tika.language.translate.impl.ExternalTranslator
-
Checks to see if the command can be run.
- checkCommand(String, int...) - Static method in class org.apache.tika.utils.ProcessUtils
-
Checks to see if the command can be run.
- checkConfig(FileSystemPipesIteratorConfig) - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIterator
- checkEmbeddedLimits(ParseRecord) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
-
Checks embedded document limits from ParseRecord.
- checkHasFile() - Static method in class org.apache.tika.detect.FileCommandDetector
- checkHasFile(String) - Static method in class org.apache.tika.detect.FileCommandDetector
- checkHasMagika(String) - Static method in class org.apache.tika.detect.magika.MagikaDetector
- checkHasSiegfried(String) - Static method in class org.apache.tika.detect.siegfried.SiegfriedDetector
- checkHz(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if HZ-GB-2312 switching sequences are present.
- checkHz(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- checkIbm424(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Detects IBM424 (EBCDIC Hebrew) by examining the sub-0x80 byte landscape.
- checkIbm424(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- checkIbm500(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Detects IBM500 (International EBCDIC / EBCDIC-500) by looking for the combination of the EBCDIC space byte and high-byte Latin letter density.
- checkIbm500(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- checkInitialization() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- checkInitialization() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRParser
- checkInitialization() - Method in class org.apache.tika.server.client.TikaServerClientConfig
- checkInitialization() - Method in class org.apache.tika.server.core.TlsConfig
- checkIso2022Jp(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Deprecated. Use StructuralEncodingRules.detectIso2022(byte[]), which distinguishes JP/KR/CN.
- checkQuietly() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParser
- checkUtf8(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Validates the UTF-8 byte grammar of the sample and returns one of three outcomes: StructuralEncodingRules.Utf8Result.LIKELY_UTF8: all multi-byte sequences are valid and the sample contains enough high bytes to be informative.
- checkUtf8(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- ChildGraphSpaceElementNodes - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ChildMatcher - Class in org.apache.tika.sax.xpath
-
Intermediate evaluation state of a .../*... XPath expression.
- ChildMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.ChildMatcher
- CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_VER_1 - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_VER_2 - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_VER_3 - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- ChmAccessor<T> - Interface in org.apache.tika.parser.microsoft.chm
-
Defines an accessor interface
- ChmAssert - Class in org.apache.tika.parser.microsoft.chm
-
Contains chm extractor assertions
- ChmAssert() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmAssert
- ChmBlockInfo - Class in org.apache.tika.parser.microsoft.chm
-
A container for chm block information: i. the initial block is used to reset the main tree; ii. the start block indicates where to start; iii. the end block indicates where to stop; iv. the start offset indicates where to start reading; v. the end offset indicates where to stop reading.
- ChmCommons - Class in org.apache.tika.parser.microsoft.chm
- ChmCommons.EntryType - Enum Class in org.apache.tika.parser.microsoft.chm
-
Represents entry types: uncompressed, compressed
- ChmCommons.IntelState - Enum Class in org.apache.tika.parser.microsoft.chm
-
Represents intel file states during decompression
- ChmCommons.LzxState - Enum Class in org.apache.tika.parser.microsoft.chm
-
Represents lzx states: started decoding, not started decoding
- ChmConstants - Class in org.apache.tika.parser.microsoft.chm
- ChmDirectoryListingSet - Class in org.apache.tika.parser.microsoft.chm
-
Holds chm listing entries
- ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Constructs chm directory listing set
- ChmExtractor - Class in org.apache.tika.parser.microsoft.chm
-
Extracts text from chm file.
- ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmExtractor
- ChmItsfHeader - Class in org.apache.tika.parser.microsoft.chm
-
The Header
0000: char[4] 'ITSF'
0004: DWORD 3 (Version number)
0008: DWORD Total header length, including header section table and following data.
000C: DWORD 1 (unknown)
0010: DWORD a timestamp
0014: DWORD Windows Language ID
0018: GUID {7C01FD10-7BAA-11D0-9E0C-00A0-C922-E6EC}
0028: GUID {7C01FD11-7BAA-11D0-9E0C-00A0-C922-E6EC}
Note: a GUID is $10 bytes, arranged as 1 DWORD, 2 WORDs, and 8 BYTEs.
0000: QWORD Offset of section from beginning of file
0008: QWORD Length of section
Following the header section table is 8 bytes of additional header data.
- ChmItsfHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
- ChmItspHeader - Class in org.apache.tika.parser.microsoft.chm
-
Directory header
The directory starts with a header; its format is as follows:
0000: char[4] 'ITSP'
0004: DWORD Version number 1
0008: DWORD Length of the directory header
000C: DWORD $0a (unknown)
0010: DWORD $1000 Directory chunk size
0014: DWORD "Density" of quickref section, usually 2
0018: DWORD Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
0020: DWORD Chunk number of first PMGL (listing) chunk
0024: DWORD Chunk number of last PMGL (listing) chunk
0028: DWORD -1 (unknown)
002C: DWORD Number of directory chunks (total)
0030: DWORD Windows language ID
0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
0044: DWORD $54 (This is the length again)
0048: DWORD -1 (unknown)
004C: DWORD -1 (unknown)
0050: DWORD -1 (unknown)
- ChmItspHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmItspHeader
- ChmLzxBlock - Class in org.apache.tika.parser.microsoft.chm
-
Decompresses a chm block.
- ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- ChmLzxcControlData - Class in org.apache.tika.parser.microsoft.chm
-
::DataSpace/Storage/
/ControlData This file contains $20 bytes of information on the compression.
- ChmLzxcControlData() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
- ChmLzxcResetTable - Class in org.apache.tika.parser.microsoft.chm
-
LZXC reset table, used to ensure decompression.
- ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
- ChmLzxState - Class in org.apache.tika.parser.microsoft.chm
- ChmLzxState(int) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxState
- ChmParser - Class in org.apache.tika.parser.microsoft.chm
- ChmParser() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmParser
- ChmParsingException - Exception in org.apache.tika.parser.microsoft.chm
- ChmParsingException(String) - Constructor for exception org.apache.tika.parser.microsoft.chm.ChmParsingException
- ChmPmgiHeader - Class in org.apache.tika.parser.microsoft.chm
-
Description
Note: does not always exist.
An index chunk has the following format:
0000: char[4] 'PMGI'
0004: DWORD Length of quickref/free area at end of directory chunk
0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL.
The format of a directory index entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: directory listing chunk which starts with name
Encoded Integers aka ENCINT
An ENCINT is a variable-length integer.
- ChmPmgiHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
- ChmPmglHeader - Class in org.apache.tika.parser.microsoft.chm
-
Description: There are two types of directory chunks: index chunks and listing chunks.
- ChmPmglHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- ChmSection - Class in org.apache.tika.parser.microsoft.chm
- ChmSection(byte[]) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmSection
- ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmSection
- ChmWrapper - Class in org.apache.tika.parser.microsoft.chm
- ChmWrapper() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmWrapper
- chunk(String) - Method in class org.apache.tika.inference.MarkdownChunker
-
Chunk the given markdown text.
- Chunk - Class in org.apache.tika.inference
-
A content chunk with multimodal locators and an optional embedding vector.
- Chunk(String, int, int) - Constructor for class org.apache.tika.inference.Chunk
-
Convenience constructor for text-only chunks with character offsets.
- Chunk(String, Locators) - Constructor for class org.apache.tika.inference.Chunk
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
-
This method is used to chunk the file data.
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.RDCAnalysisChunking
-
This method is used to chunk the file data.
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
-
This method is used to chunk the file data.
- chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ZipFilesChunking
-
This method is used to chunk the file data.
- ChunkingFactory - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
This class is used to create instance of AbstractChunking.
- ChunkingMethod - Enum Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
- chunksComplete() - Method in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
-
Used to flag that all the chunks of the NameID have now been located.
- ChunkSerializer - Class in org.apache.tika.inference
-
Serializes and deserializes a list of Chunk objects to/from JSON.
- CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city the content is focussing on -- either the place shown in visual media or referenced by text or audio media.
- CITY - Static variable in interface org.apache.tika.metadata.Photoshop
- className - Variable in class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
- ClassParser - Class in org.apache.tika.parser.asm
-
Parser for Java .class files.
- ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
- ClaudeVLMParser - Class in org.apache.tika.parser.vlm
-
VLM parser for the Anthropic Claude Messages API.
- ClaudeVLMParser() - Constructor for class org.apache.tika.parser.vlm.ClaudeVLMParser
- ClaudeVLMParser(JsonConfig) - Constructor for class org.apache.tika.parser.vlm.ClaudeVLMParser
- ClaudeVLMParser(VLMOCRConfig) - Constructor for class org.apache.tika.parser.vlm.ClaudeVLMParser
- clean(String) - Static method in class org.apache.tika.sax.CleanPhoneText
- clean(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Handle various common charset name errors, and return something that will be considered valid (and is normalized)
- CleanPhoneText - Class in org.apache.tika.sax
-
Class to help de-obfuscate phone numbers in text.
- CleanPhoneText() - Constructor for class org.apache.tika.sax.CleanPhoneText
- cleanSubstitutions - Static variable in class org.apache.tika.sax.CleanPhoneText
- cleanup() - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Deletes the zip file.
- cleanupDwgString(String) - Method in class org.apache.tika.parser.dwg.DWGReadFormatRemover
- clear() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- clear() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- clear() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- clear() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- clear() - Method in class org.apache.tika.FetchAndParseReply.Builder
- clear() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- clear() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- clear() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- clear() - Method in class org.apache.tika.GetFetcherReply.Builder
- clear() - Method in class org.apache.tika.GetFetcherRequest.Builder
- clear() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- clear() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- clear() - Method in class org.apache.tika.ListFetchersReply.Builder
- clear() - Method in class org.apache.tika.ListFetchersRequest.Builder
- clear() - Method in class org.apache.tika.SaveFetcherReply.Builder
- clear() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- clear() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- clear() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- clearAdditionalFetchConfigJson() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
You can supply additional fetch configuration using this.
- clearBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
-
Set a bit value to "Off" in the specified byte array with the specified bit position.
- ClearByAttachmentTypeMetadataFilter - Class in org.apache.tika.metadata.filter
-
This class clears the entire metadata object if the attachment type matches one of the types.
- ClearByAttachmentTypeMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
- ClearByAttachmentTypeMetadataFilter(Set<String>) - Constructor for class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
- ClearByAttachmentTypeMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
-
Constructor for JSON configuration.
- ClearByAttachmentTypeMetadataFilter(ClearByAttachmentTypeMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
-
Constructor with explicit Config object.
- ClearByAttachmentTypeMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- clearEmitterId() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the emitter to use (optional).
- clearErrorMessage() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
If there was an error, this will contain the error message.
- clearFetcherClass() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
-
The full Java class name of the fetcher config for which to fetch the JSON schema.
- clearFetcherClass() - Method in class org.apache.tika.GetFetcherReply.Builder
-
The full Java class name of the Fetcher.
- clearFetcherClass() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
The full Java class name of the fetcher class.
- clearFetcherConfigJson() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
JSON string of the fetcher config object.
- clearFetcherConfigJsonSchema() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
-
The JSON schema that describes the fetcher config in string format.
- clearFetcherId() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
-
ID of the fetcher to delete.
- clearFetcherId() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- clearFetcherId() - Method in class org.apache.tika.GetFetcherReply.Builder
-
Echoes the ID of the fetcher being returned.
- clearFetcherId() - Method in class org.apache.tika.GetFetcherRequest.Builder
-
ID of the fetcher for which to return config.
- clearFetcherId() - Method in class org.apache.tika.SaveFetcherReply.Builder
-
The fetcher_id that was saved.
- clearFetcherId() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
A unique identifier for each fetcher.
- clearFetchKey() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Echoes the fetch_key that was sent in the request.
- clearFetchKey() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The "Fetch Key" of the item that will be fetched.
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.FetchAndParseReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.GetFetcherReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.GetFetcherRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.ListFetchersReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.ListFetchersRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.SaveFetcherReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- clearField(Descriptors.FieldDescriptor) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- clearFields() - Method in class org.apache.tika.FetchAndParseReply.Builder
- clearGetFetcherReplies() - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the List Fetchers service.
- clearIteratorClass() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The full Java class name of the pipes iterator
- clearIteratorClass() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
The full Java class name of the pipes iterator class.
- clearIteratorConfigJson() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
JSON string of the pipes iterator config object
- clearIteratorConfigJson() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
JSON string of the pipes iterator config object.
- clearIteratorId() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
-
The pipes iterator ID to delete
- clearIteratorId() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The pipes iterator ID
- clearIteratorId() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
-
The pipes iterator ID to retrieve
- clearIteratorId() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
A unique identifier for each pipes iterator.
- clearMessage() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
-
Status message
- clearMessage() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
-
Status message
- clearNumFetchersPerPage() - Method in class org.apache.tika.ListFetchersRequest.Builder
-
List this many fetchers per page.
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.FetchAndParseReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.GetFetcherReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.GetFetcherRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.ListFetchersReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.ListFetchersRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.SaveFetcherReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- clearOneof(Descriptors.OneofDescriptor) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- clearPageNumber() - Method in class org.apache.tika.ListFetchersRequest.Builder
-
List the fetchers starting at this page number
- clearParams() - Method in class org.apache.tika.GetFetcherReply.Builder
- clearParseContextJson() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- clearStatus() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
The status from the message.
- clearSuccess() - Method in class org.apache.tika.DeleteFetcherReply.Builder
-
Success if the fetcher was successfully removed from the fetch store.
- CLIENT_UNAVAILABLE_WITHIN_MS - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- CLIENT_UNAVAILABLE_WITHIN_MS - Static variable in class org.apache.tika.pipes.core.PipesResults
- Client2CertificateCredentialsConfig - Class in org.apache.tika.pipes.fetchers.microsoftgraph.config
- Client2CertificateCredentialsConfig() - Constructor for class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- ClientCertificateCredentialsConfig - Class in org.apache.tika.pipes.fetchers.microsoftgraph.config
- ClientCertificateCredentialsConfig() - Constructor for class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- clientId() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the clientId record component.
- ClientSecretCredentialsConfig - Class in org.apache.tika.pipes.fetchers.microsoftgraph.config
- ClientSecretCredentialsConfig() - Constructor for class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- ClimateForcast - Interface in org.apache.tika.metadata
-
Met keys from NCAR CCSM files in the Climate Forecast Convention.
- clip(int) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- clone() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- clone() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- clone() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- clone() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- clone() - Method in class org.apache.tika.FetchAndParseReply.Builder
- clone() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- clone() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- clone() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- clone() - Method in class org.apache.tika.GetFetcherReply.Builder
- clone() - Method in class org.apache.tika.GetFetcherRequest.Builder
- clone() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- clone() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- clone() - Method in class org.apache.tika.ListFetchersReply.Builder
- clone() - Method in class org.apache.tika.ListFetchersRequest.Builder
- clone() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- clone() - Method in class org.apache.tika.SaveFetcherReply.Builder
- clone() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- clone() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- clone() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- cloneMetadata(Metadata) - Static method in class org.apache.tika.utils.ParserUtils
-
Does a deep clone of a Metadata object.
- close() - Method in class org.apache.tika.eval.app.db.DBBuffer
- close() - Method in class org.apache.tika.eval.app.db.MimeBuffer
- close() - Method in class org.apache.tika.eval.app.io.DBWriter
-
This closes the writer by executing batch and committing changes.
- close() - Method in interface org.apache.tika.eval.app.io.IDBWriter
- close() - Method in class org.apache.tika.eval.core.tokens.CommonTokenCountManager
- close() - Method in class org.apache.tika.http.TikaHttpClient
- close() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- close() - Method in class org.apache.tika.io.LookaheadInputStream
- close() - Method in class org.apache.tika.io.TemporaryResources
-
Closes all tracked resources.
- close() - Method in class org.apache.tika.io.TikaInputStream
- close() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Ignored.
- close() - Method in class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
-
Close the connection to the Marian Server.
- close() - Method in class org.apache.tika.metadata.filter.CompositeMetadataFilter
- close() - Method in class org.apache.tika.metadata.filter.MetadataFilter
-
Releases any resources held by this filter (e.g.
- close() - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
Override this for any special handling of closing the connection.
- close() - Method in class org.apache.tika.parser.microsoft.ooxml.OPCPackageWrapper
- close() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFObjDataStreamParser
- close() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFPictStreamParser
- close() - Method in class org.apache.tika.parser.ParsingReader
-
Closes the read end of the pipe.
- close() - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- close() - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- close() - Method in class org.apache.tika.pipes.core.extractor.EmittingUnpackHandler
- close() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
- close() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
- close() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- close() - Method in class org.apache.tika.pipes.core.PipesClient
- close() - Method in class org.apache.tika.pipes.core.PipesParser
- close() - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
-
Tries to close all resources.
- close() - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- close() - Method in class org.apache.tika.pipes.core.server.ConnectionHandler
- close() - Method in class org.apache.tika.pipes.core.server.PipesServer
- close() - Method in class org.apache.tika.pipes.core.SharedServerManager
- close() - Method in class org.apache.tika.pipes.emitter.jdbc.JDBCEmitter
- close() - Method in class org.apache.tika.pipes.fork.PipesForkParser
- close() - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- close() - Method in class org.apache.tika.pipes.ignite.server.IgniteStoreServer
- close() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIterator
- close() - Method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- close() - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- close() - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- close() - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- close() - Method in class org.apache.tika.renderer.RenderResult
- close() - Method in class org.apache.tika.renderer.RenderResults
- close() - Method in class org.apache.tika.server.core.resource.PipesResource
- CLOSED_CHOICE - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- closePath() - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
-
Closes all formatting tags.
- closeWriter() - Method in class org.apache.tika.eval.app.ProfilerBase
- ColInfo - Class in org.apache.tika.eval.app.db
- ColInfo(Cols, int) - Constructor for class org.apache.tika.eval.app.db.ColInfo
- ColInfo(Cols, int, Integer) - Constructor for class org.apache.tika.eval.app.db.ColInfo
- ColInfo(Cols, int, Integer, String) - Constructor for class org.apache.tika.eval.app.db.ColInfo
- ColInfo(Cols, int, String) - Constructor for class org.apache.tika.eval.app.db.ColInfo
- collapseGroups(float[], int[][]) - Static method in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Collapse confusable group probabilities: within each group, sum all members' probabilities and assign the total to the highest-scoring member; the other members get 0.
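The collapse step described above can be sketched as follows. This is a minimal illustration, not the Tika implementation: the class name `CollapseGroupsSketch`, the method name `collapse`, the choice to return a copy rather than modify the input, and the assumption that each group is a disjoint array of indices into the probability array are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch of collapsing confusable groups; names and layout are assumptions.
final class CollapseGroupsSketch {

    // probs: one probability per class; groups: each inner array lists the
    // indices of classes that are considered confusable with each other.
    static float[] collapse(float[] probs, int[][] groups) {
        float[] out = Arrays.copyOf(probs, probs.length);
        for (int[] group : groups) {
            if (group.length == 0) {
                continue; // nothing to collapse
            }
            int best = group[0];
            float total = 0f;
            for (int idx : group) {
                total += probs[idx];
                // strict comparison: on a tie the earliest member wins
                if (probs[idx] > probs[best]) {
                    best = idx;
                }
            }
            // zero every member, then give the whole mass to the top scorer
            for (int idx : group) {
                out[idx] = 0f;
            }
            out[best] = total;
        }
        return out;
    }
}
```

Classes that belong to no group keep their original probability in this sketch.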
- COLOR_MODE - Static variable in interface org.apache.tika.metadata.Photoshop
- Cols - Enum Class in org.apache.tika.eval.app.db
- COLUMN_COUNT - Static variable in interface org.apache.tika.metadata.Database
- COLUMN_NAME - Static variable in interface org.apache.tika.metadata.Database
- ColumnCount - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- COMMAND_LINE - Static variable in interface org.apache.tika.metadata.ClimateForcast
- COMMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
- COMMENT - Static variable in interface org.apache.tika.metadata.Zip
-
Comment associated with a ZIP entry.
- COMMENT_PERSONS - Static variable in interface org.apache.tika.metadata.Office
- COMMENT_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- CommentPersonHandler - Class in org.apache.tika.parser.microsoft.ooxml
- commentReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- commentReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
-
Called when a comment reference is encountered in the document body.
- COMMENTS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- COMMENTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- commitWithin() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the commitWithin record component.
- commitWithin() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the commitWithin record component.
- commitWithin() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the commitWithin record component.
- COMMON_TOKENS - Enum constant in enum class org.apache.tika.eval.core.tokens.TikaEvalTokenizer.Mode
-
Common-token analysis — letters and ideographs only.
- COMMON_TOKENS_LANG - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- CommonsDigester - Class in org.apache.tika.parser.digestutils
-
Implementation of Digester that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
- CommonsDigester(List<DigestDef>) - Constructor for class org.apache.tika.parser.digestutils.CommonsDigester
- CommonsDigester(DigestDef.Algorithm...) - Constructor for class org.apache.tika.parser.digestutils.CommonsDigester
- CommonsDigesterFactory - Class in org.apache.tika.parser.digestutils
-
Factory for CommonsDigester with configurable algorithms and encodings.
- CommonsDigesterFactory() - Constructor for class org.apache.tika.parser.digestutils.CommonsDigesterFactory
- CommonTokenCountManager - Class in org.apache.tika.eval.core.tokens
- CommonTokenCountManager() - Constructor for class org.apache.tika.eval.core.tokens.CommonTokenCountManager
- CommonTokenCountManager(Path, String) - Constructor for class org.apache.tika.eval.core.tokens.CommonTokenCountManager
- CommonTokenOverlapCounter - Class in org.apache.tika.eval.app.tools
- CommonTokenOverlapCounter() - Constructor for class org.apache.tika.eval.app.tools.CommonTokenOverlapCounter
- CommonTokenResult - Class in org.apache.tika.eval.core.tokens
- CommonTokenResult(String, int, int, int, int) - Constructor for class org.apache.tika.eval.core.tokens.CommonTokenResult
- CommonTokens - Class in org.apache.tika.eval.core.textstats
- CommonTokens() - Constructor for class org.apache.tika.eval.core.textstats.CommonTokens
- CommonTokens(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokens
- CommonTokensBhattacharyya - Class in org.apache.tika.eval.core.textstats
- CommonTokensBhattacharyya(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensBhattacharyya
- CommonTokensCosine - Class in org.apache.tika.eval.core.textstats
- CommonTokensCosine(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensCosine
- CommonTokensHellinger - Class in org.apache.tika.eval.core.textstats
- CommonTokensHellinger(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensHellinger
- CommonTokensKLDivergence - Class in org.apache.tika.eval.core.textstats
- CommonTokensKLDivergence(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensKLDivergence
- CommonTokensKLDNormed - Class in org.apache.tika.eval.core.textstats
- CommonTokensKLDNormed(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensKLDNormed
- COMP_OBJ - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- COMP_OBJ - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Some other kind of embedded document, in a CompObj container within another OLE2 document
- COMPACT_ID_MISSING - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- Compact64bitInt - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
A 9-byte encoding of values in the range 0x0002000000000000 through 0xFFFFFFFFFFFFFFFF
- Compact64bitInt() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Initializes a new instance of the Compact64bitInt class, this is the default constructor.
- Compact64bitInt(long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Initializes a new instance of the Compact64bitInt class with specified value.
- CompactID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
This class is used to represent the CompactID structure.
- CompactID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
- CompactUint14bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 14 bits type value.
- CompactUint21bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 21 bits type value.
- CompactUint28bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 28 bits type value.
- CompactUint35bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 35 bits type value.
- CompactUint42bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 42 bits type value.
- CompactUint49bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 49 bits type value.
- CompactUint64bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 64 bits type value.
- CompactUint7bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint 7 bits type value.
- CompactUintNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
Specify the type value for compact uint zero type value.
- COMPANY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- compare(long, long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- compare(String, String) - Method in class org.apache.tika.serialization.PrettyMetadataKeyComparator
- compare(String, String, String, String) - Method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Compares two candidate strings and returns which is higher-quality (cleaner text).
- compare(String, String, String, String) - Method in interface org.apache.tika.quality.TextQualityDetector
-
Compares two candidate strings and returns which is higher-quality (cleaner text).
- compare(ClassResourceInfo, ClassResourceInfo, Message) - Method in class org.apache.tika.server.core.ProduceTypeResourceComparator
-
Compares the class to handle.
- compare(OperationResourceInfo, OperationResourceInfo, Message) - Method in class org.apache.tika.server.core.ProduceTypeResourceComparator
-
Compares the method to handle.
- compareClassName(Object, Object) - Static method in class org.apache.tika.utils.CompareUtils
-
Compare two classes by class names.
- compareFiles(EvalFilePaths, EvalFilePaths) - Method in class org.apache.tika.eval.app.ExtractComparer
- compareLanguageSignal(Map<K, String>) - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Compare multiple candidate texts and return the key of the one with the strongest language signal.
- compareTo(TokenIntPair) - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
-
Descending by value, ascending by token
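The ordering named above (higher values first, ties broken alphabetically by token) can be sketched as follows. The class `TokenCountSketch` and its fields are hypothetical stand-ins for illustration and are not taken from the TokenIntPair source.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a (token, count) pair; not the Tika class.
final class TokenCountSketch implements Comparable<TokenCountSketch> {
    final String token;
    final int value;

    TokenCountSketch(String token, int value) {
        this.token = token;
        this.value = value;
    }

    @Override
    public int compareTo(TokenCountSketch other) {
        // Arguments swapped so that larger values sort first (descending).
        int byValue = Integer.compare(other.value, this.value);
        if (byValue != 0) {
            return byValue;
        }
        // Equal values: fall back to ascending token order.
        return this.token.compareTo(other.token);
    }

    // Helper for the sketch: returns the tokens in sorted order.
    static List<String> sortedTokens(List<TokenCountSketch> pairs) {
        List<TokenCountSketch> copy = new ArrayList<>(pairs);
        Collections.sort(copy);
        List<String> tokens = new ArrayList<>();
        for (TokenCountSketch p : copy) {
            tokens.add(p.token);
        }
        return tokens;
    }
}
```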
- compareTo(Property) - Method in class org.apache.tika.metadata.Property
- compareTo(MediaType) - Method in class org.apache.tika.mime.MediaType
- compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
- compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
-
Sorts in descending order of confidence
- compareTo(ExtendedGUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- compareTo(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- compareTo(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- compareTo(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- compareTo(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- compareTo(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Compare to other CharsetMatch objects.
- CompareUtils - Class in org.apache.tika.utils
- CompareUtils() - Constructor for class org.apache.tika.utils.CompareUtils
- COMPARISON_CONTAINERS - Static variable in class org.apache.tika.eval.app.ExtractComparer
- COMPILATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"An album created by various artists."
- complete(long) - Method in class org.apache.tika.server.core.ServerStatus
-
Removes the task from the collection of currently running tasks.
- COMPLETED - Enum constant in enum class org.apache.tika.pipes.api.pipesiterator.TotalCountResult.STATUS
- COMPLETED_SEMAPHORE - Static variable in interface org.apache.tika.pipes.api.pipesiterator.PipesIterator
- COMPLETED_VAL - Static variable in class org.apache.tika.eval.app.StatusReporter
- componentClass() - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Returns the value of the componentClass record component.
- ComponentConfig<T> - Class in org.apache.tika.serialization
-
Configuration for how to load a top-level component from JSON.
- ComponentConfig.Builder<T> - Class in org.apache.tika.serialization
-
Builder for ComponentConfig.
- ComponentInfo - Record Class in org.apache.tika.config.loader
-
Information about a registered Tika component.
- ComponentInfo(Class<?>, boolean) - Constructor for record class org.apache.tika.config.loader.ComponentInfo
-
Creates a ComponentInfo with no explicit context key (auto-detect) and not default.
- ComponentInfo(Class<?>, boolean, Class<?>) - Constructor for record class org.apache.tika.config.loader.ComponentInfo
-
Creates a ComponentInfo with explicit context key but not default.
- ComponentInfo(Class<?>, boolean, Class<?>, boolean) - Constructor for record class org.apache.tika.config.loader.ComponentInfo
-
Creates an instance of a ComponentInfo record class.
- ComponentInstantiator - Class in org.apache.tika.config.loader
-
Utility class for instantiating Tika components from JSON configuration.
- ComponentInstantiator() - Constructor for class org.apache.tika.config.loader.ComponentInstantiator
- ComponentLoader<T> - Interface in org.apache.tika.config.loader
-
Strategy interface for loading components from JSON config.
- ComponentNameResolver - Class in org.apache.tika.serialization
-
Utility class that resolves friendly component names to classes using ComponentRegistry.
- ComponentRegistry - Class in org.apache.tika.config.loader
-
Registry for looking up Tika component classes by name.
- ComponentRegistry(String, ClassLoader) - Constructor for class org.apache.tika.config.loader.ComponentRegistry
-
Creates a component registry by loading the specified index file.
- COMPOSER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The composer's name."
- composite(Property, Property[]) - Static method in class org.apache.tika.metadata.Property
-
Constructs a new composite property from the given primary and array of secondary properties.
- COMPOSITE - Enum constant in enum class org.apache.tika.metadata.Property.PropertyType
-
Multiple child properties
- CompositeDetector - Class in org.apache.tika.detect
-
Content type detector that combines multiple different detection mechanisms.
- CompositeDetector(List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDetector(Detector...) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDetector(MediaTypeRegistry, List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDetector(MediaTypeRegistry, List<Detector>, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.CompositeDetector
- CompositeDigester - Class in org.apache.tika.digest
- CompositeDigester(Digester...) - Constructor for class org.apache.tika.digest.CompositeDigester
- CompositeEncodingDetector - Class in org.apache.tika.detect
-
A composite encoding detector that runs child detectors.
- CompositeEncodingDetector(List<EncodingDetector>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
- CompositeEncodingDetector(List<EncodingDetector>, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
- CompositeMatcher - Class in org.apache.tika.sax.xpath
-
Composite XPath evaluation state.
- CompositeMatcher(Matcher, Matcher) - Constructor for class org.apache.tika.sax.xpath.CompositeMatcher
- CompositeMetadataFilter - Class in org.apache.tika.metadata.filter
- CompositeMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.CompositeMetadataFilter
- CompositeMetadataFilter(List<MetadataFilter>) - Constructor for class org.apache.tika.metadata.filter.CompositeMetadataFilter
- CompositeParser - Class in org.apache.tika.parser
-
Composite parser that delegates parsing tasks to a component parser based on the declared content type of the incoming document.
- CompositeParser() - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(MediaTypeRegistry, List<Parser>) - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(MediaTypeRegistry, List<Parser>, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(MediaTypeRegistry, Parser...) - Constructor for class org.apache.tika.parser.CompositeParser
- CompositeParser(CompositeParser) - Constructor for class org.apache.tika.parser.CompositeParser
- CompositePipesReporter - Class in org.apache.tika.pipes.core.reporter
- CompositePipesReporter(List<PipesReporter>) - Constructor for class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- CompositeRenderer - Class in org.apache.tika.renderer
- CompositeRenderer(List<Renderer>) - Constructor for class org.apache.tika.renderer.CompositeRenderer
- CompositeRenderer(ServiceLoader) - Constructor for class org.apache.tika.renderer.CompositeRenderer
- CompositeTagHandler - Class in org.apache.tika.parser.mp3
- CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
- CompositeTextStatsCalculator - Class in org.apache.tika.eval.core.textstats
- CompositeTextStatsCalculator(List<TextStatsCalculator>) - Constructor for class org.apache.tika.eval.core.textstats.CompositeTextStatsCalculator
- CompositeTextStatsCalculator(List<TextStatsCalculator>, AnalyzerManager, LanguageIDWrapper) - Constructor for class org.apache.tika.eval.core.textstats.CompositeTextStatsCalculator
- compound - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Gets or sets a value that, if set, specifies that a compound parse type is needed and MUST be ended with either an 8-bit stream object header end or a 16-bit stream object header end.
- COMPRESS - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- COMPRESSED - Enum constant in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.EntryType
- COMPRESSED_SIZE - Static variable in interface org.apache.tika.metadata.Zip
-
Compressed size of the entry in bytes.
- COMPRESSION_METHOD - Static variable in interface org.apache.tika.metadata.Zip
-
Compression method used for the entry (0=stored, 8=deflated, etc.).
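The two codes named above match the constants `ZipEntry.STORED` (0) and `ZipEntry.DEFLATED` (8) in `java.util.zip`. A minimal sketch of turning such a code into a label follows. The class `ZipMethodSketch` and the method `describe` are hypothetical, and how Tika itself stores or formats this metadata value is not shown here.

```java
import java.util.zip.ZipEntry;

// Hypothetical helper for illustration; not part of Tika.
final class ZipMethodSketch {

    // Maps a ZIP compression-method code to a short label.
    static String describe(int method) {
        switch (method) {
            case ZipEntry.STORED:   // 0: entry is stored without compression
                return "stored";
            case ZipEntry.DEFLATED: // 8: entry is compressed with DEFLATE
                return "deflated";
            default:                // any other method code is passed through
                return "other (" + method + ")";
        }
    }
}
```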
- compressionType() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the compressionType record component.
- CompressorConstants - Class in org.apache.tika.detect.zip
- CompressorConstants() - Constructor for class org.apache.tika.detect.zip.CompressorConstants
- CompressorParser - Class in org.apache.tika.parser.pkg
-
Parser for various compression formats.
- CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
- CompressorParser(JsonConfig) - Constructor for class org.apache.tika.parser.pkg.CompressorParser
-
Constructor for JSON configuration.
- CompressorParser(CompressorParser.Config) - Constructor for class org.apache.tika.parser.pkg.CompressorParser
-
Constructor with explicit Config object.
- CompressorParser.Config - Class in org.apache.tika.parser.pkg
-
Configuration class for JSON deserialization.
- CompressorParserOptions - Interface in org.apache.tika.parser.pkg
-
Interface for setting options for the CompressorParser by passing via the ParseContext.
- computeFontHeight(PDFont) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- CONCATENATE - Enum constant in enum class org.apache.tika.pipes.api.ParseMode
-
Concatenates content from all embedded files into a single document.
- CONCATENATE - Enum constant in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.MultivaluedFieldStrategy
- CONCATENATE_CONTENT_INTO_FIRST - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReader.ALTER_METADATA_LIST
- ConcurrentUtils - Class in org.apache.tika.utils
-
Utility Class for Concurrency in Tika
- ConcurrentUtils() - Constructor for class org.apache.tika.utils.ConcurrentUtils
- CONDITIONAL - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- CONFIDENCE - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- config - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- Config() - Constructor for class org.apache.tika.detect.magika.MagikaDetector.Config
- Config() - Constructor for class org.apache.tika.detect.OverrideEncodingDetector.Config
- Config() - Constructor for class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.FieldNameMappingFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.GeoPointMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter.Config
- Config() - Constructor for class org.apache.tika.parser.html.HtmlEncodingDetector.Config
- Config() - Constructor for class org.apache.tika.parser.html.JSoupParser.Config
- Config() - Constructor for class org.apache.tika.parser.mail.RFC822Parser.Config
- Config() - Constructor for class org.apache.tika.parser.microsoft.rtf.RTFParser.Config
- Config() - Constructor for class org.apache.tika.parser.odf.FlatOpenDocumentParser.Config
- Config() - Constructor for class org.apache.tika.parser.odf.OpenDocumentParser.Config
- Config() - Constructor for class org.apache.tika.parser.pkg.CompressorParser.Config
- Config() - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- Config() - Constructor for class org.apache.tika.parser.txt.UniversalEncodingDetector.Config
- CONFIG_KEY - Static variable in class org.apache.tika.pipes.core.pipesiterator.PipesIteratorManager
- CONFIG_KEY - Static variable in class org.apache.tika.pipes.core.reporter.ReporterManager
- ConfigDeserializer - Class in org.apache.tika.config
-
Utility for deserializing JSON configuration without compile-time dependency on Jackson.
- ConfigDeserializer - Class in org.apache.tika.serialization
-
Helper utility for SelfConfiguring components to deserialize their configuration from ParseContext at run time.
- ConfigDeserializer() - Constructor for class org.apache.tika.config.ConfigDeserializer
- ConfigDeserializer() - Constructor for class org.apache.tika.serialization.ConfigDeserializer
- ConfigEndpointSecurityFilter - Class in org.apache.tika.server.core
-
JAX-RS filter that gates /config endpoints behind the enableUnsecureFeatures flag.
- ConfigEndpointSecurityFilter(boolean) - Constructor for class org.apache.tika.server.core.ConfigEndpointSecurityFilter
- configKey() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- configKey() - Method in class org.apache.tika.parser.vlm.ClaudeVLMParser
- configKey() - Method in class org.apache.tika.parser.vlm.GeminiVLMParser
- configKey() - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- ConfigLoader - Class in org.apache.tika.config.loader
-
Loader for configuration objects from the "parse-context" section.
- ConfigMerger - Class in org.apache.tika.pipes.core.config
-
Utility for merging configuration overrides with existing Tika JSON configuration.
- ConfigMerger.MergeResult - Record Class in org.apache.tika.pipes.core.config
-
Result of a config merge operation.
- ConfigOverrides - Class in org.apache.tika.pipes.core.config
-
Configuration overrides for merging with or creating Tika JSON configuration.
- ConfigOverrides.Builder - Class in org.apache.tika.pipes.core.config
-
Builder for ConfigOverrides.
- ConfigOverrides.EmitterOverride - Class in org.apache.tika.pipes.core.config
-
Represents an emitter configuration override.
- ConfigOverrides.FetcherOverride - Class in org.apache.tika.pipes.core.config
-
Represents a fetcher configuration override.
- ConfigOverrides.PipesConfigOverride - Class in org.apache.tika.pipes.core.config
-
Represents pipes configuration overrides.
- configPath() - Method in record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Returns the value of the configPath record component.
- ConfigStore - Interface in org.apache.tika.pipes.core.config
-
Interface for storing and retrieving component configurations.
- ConfigStoreFactory - Interface in org.apache.tika.pipes.core.config
-
Factory interface for creating ConfigStore instances.
- ConfigurableThreadPoolExecutor - Interface in org.apache.tika.concurrent
-
Allows Thread Pool to be Configurable.
- configure(ParseContext) - Method in class org.apache.tika.parser.dwg.AbstractDWGParser
- configure(ParseContext) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
Checks to see if the user has specified an OfficeParserConfig.
- configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Configures the given pdf2XHTML.
- ConfigValidator - Class in org.apache.tika.config
-
Utility class for validating configuration parameters.
- ConflictingUserName - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ConfusableGroups - Class in org.apache.tika.langdetect.charsoup
-
Loads the shared confusable language groups from confusables.txt on the classpath.
- connect(int) - Method in class org.apache.tika.pipes.core.PerClientServerManager
- connect(int) - Method in interface org.apache.tika.pipes.core.ServerManager
-
Establishes a connection to the server and returns a connected Socket.
- connect(int) - Method in class org.apache.tika.pipes.core.SharedServerManager
- connection() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the connection record component.
- ConnectionHandler - Class in org.apache.tika.pipes.core.server
-
Handles a single client connection in shared server mode.
- ConnectionHandler(Socket, SharedServerResources, PipesConfig) - Constructor for class org.apache.tika.pipes.core.server.ConnectionHandler
-
Creates a new ConnectionHandler.
- connectionsMaxIdleMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the connectionsMaxIdleMs record component.
- connectionString() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the connectionString record component.
- connectionTimeoutMillis() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the connectionTimeoutMillis record component.
- connectionTimeoutMillis() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the connectionTimeoutMillis record component.
- connectionTimeoutMillis() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the connectionTimeoutMillis record component.
- connectionTimeoutMillis() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the connectionTimeoutMillis record component.
- CONTACT - Static variable in interface org.apache.tika.metadata.ClimateForcast
- CONTACT_INFO_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information address part.
- CONTACT_INFO_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information city part.
- CONTACT_INFO_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information country part.
- CONTACT_INFO_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information email address part.
- CONTACT_INFO_PHONE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information phone number part.
- CONTACT_INFO_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information part denoting the local postal code.
- CONTACT_INFO_STATE_PROVINCE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information part denoting regional information such as state or province.
- CONTACT_INFO_WEB_URL - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information web address part.
- container() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns the value of the container record component.
- CONTAINER_EXCEPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CONTAINER_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- CONTAINER_STACK_TRACE - Static variable in class org.apache.tika.pipes.core.serialization.EmitDataSerializer
- CONTAINER_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- ContainerExtractor - Interface in org.apache.tika.extractor
-
Tika container extractor interface.
- contains(String) - Method in class org.apache.tika.eval.core.tokens.LangModel
- contains(String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
-
Check whether this CachedTranslator's cache contains a translation of the text to the target language, attempting to auto-detect the source language.
- contains(String, String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
-
Check whether this CachedTranslator's cache contains a translation of the text from the source language to the target language.
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
- CONTAINS_DAMAGED_FONT - Static variable in interface org.apache.tika.metadata.PDF
-
Contains at least one damaged font for at least one character
- CONTAINS_ENCAPSULATED_HTML - Static variable in interface org.apache.tika.metadata.RTFMetadata
- CONTAINS_NON_EMBEDDED_FONT - Static variable in interface org.apache.tika.metadata.PDF
-
Contains at least one font that is not embedded
- containsColumn(Cols) - Method in class org.apache.tika.eval.app.db.TableInfo
- containsEmail(String) - Static method in class org.apache.tika.parser.mailcommons.MailUtil
-
If the chunk looks like it contains an email
- containsFields(String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- containsFields(String) - Method in class org.apache.tika.FetchAndParseReply
-
Metadata fields from the parse output.
- containsFields(String) - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Metadata fields from the parse output.
- containsKey(String) - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Checks if a configuration exists.
- containsKey(String) - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- containsKey(String) - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- containsKey(String) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- containsParams(String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- containsParams(String) - Method in class org.apache.tika.GetFetcherReply
-
The configuration parameters.
- containsParams(String) - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The configuration parameters.
- containsTable(String) - Method in class org.apache.tika.eval.app.db.JDBCUtil
- content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
- content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
- content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
Gets or sets an extended GUID array
- CONTENT - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CONTENT_COMPARISONS - Static variable in class org.apache.tika.eval.app.ExtractComparer
- CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_LENGTH - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_ONLY - Enum constant in enum class org.apache.tika.pipes.api.ParseMode
-
Concatenates content and emits only the raw content string, with no metadata and no JSON wrapper.
- CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The status of the content.
- CONTENT_TRUNCATED_AT_MAX_LEN - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
- CONTENT_TYPE_HINT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is currently used to identify Content-Type that may be included within a document, such as in html documents (e.g.
- CONTENT_TYPE_MAGIC_DETECTED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is set by DefaultDetector to store the result of MimeTypes (magic byte) detection.
- CONTENT_TYPE_PARSER_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is used by parsers to override detection of embedded resources with the override detector.
- CONTENT_TYPE_USER_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is used by users to override detection with the override detector.
- ContentChildNodesOfOutlineElement - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ContentChildNodesOfPageManifest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ContentHandlerDecorator - Class in org.apache.tika.sax
-
Decorator base class for the ContentHandler interface.
- ContentHandlerDecorator() - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
-
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
- ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
-
Creates a decorator for the given SAX event handler.
- ContentHandlerDecoratorFactory - Interface in org.apache.tika.sax
- ContentHandlerExample - Class in org.apache.tika.example
-
Examples of using different Content Handlers to get different parts of the file's contents
- ContentHandlerExample() - Constructor for class org.apache.tika.example.ContentHandlerExample
- ContentHandlerFactory - Interface in org.apache.tika.sax
-
Factory interface for creating ContentHandler instances.
- ContentLengthCalculator - Class in org.apache.tika.eval.core.textstats
- ContentLengthCalculator() - Constructor for class org.apache.tika.eval.core.textstats.ContentLengthCalculator
- CONTENTS_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- CONTENTS_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
- CONTENTS_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
- ContentTagKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Content Tag Knowledge
- ContentTagKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Content Tag Knowledge
- ContentTagKnowledgeEntry - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Content Tag Knowledge Entry
- ContentTagParser - Class in org.apache.tika.eval.core.util
- ContentTagParser() - Constructor for class org.apache.tika.eval.core.util.ContentTagParser
- ContentTags - Class in org.apache.tika.eval.core.util
- ContentTags(String) - Constructor for class org.apache.tika.eval.core.util.ContentTags
- ContentTags(String, boolean) - Constructor for class org.apache.tika.eval.core.util.ContentTags
- ContentTags(String, Map<String, Integer>) - Constructor for class org.apache.tika.eval.core.util.ContentTags
- context - Variable in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- context - Variable in class org.apache.tika.parser.microsoft.OutlookExtractor
- ContextID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains one CompactID in the ObjectSpaceObjectPropSet.ContextIDs.body stream field.
- contextIDs - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
- contextKey() - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Returns the value of the contextKey record component.
- contextKey() - Element in annotation interface org.apache.tika.config.TikaComponent
-
The class to use as the key when adding this component to ParseContext.
- ContrastStatistics - Class in org.apache.tika.eval.core.tokens
- ContrastStatistics() - Constructor for class org.apache.tika.eval.core.tokens.ContrastStatistics
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity responsible for making contributions to the content of the resource.
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.XMPDC
-
An entity responsible for making contributions to the content of the resource.
- CONTROL_DATA - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- CONTROL_SYMBOL - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- CONTROL_WORD - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- CONTROLLED_VOCABULARY_TERM - Static variable in interface org.apache.tika.metadata.IPTC
-
A term to describe the content of the image by a value from a Controlled Vocabulary.
- CONVENTIONS - Static variable in interface org.apache.tika.metadata.ClimateForcast
- CONVERSATION_INDEX - Static variable in interface org.apache.tika.metadata.MAPI
- CONVERSATION_TOPIC - Static variable in interface org.apache.tika.metadata.MAPI
- convert(InputStream, OutputStream) - Static method in class org.apache.tika.cli.XmlToJsonConfigConverter
-
Converts an XML Tika configuration stream to JSON format.
- convert(InputStream, OutputStream, ClassLoader) - Static method in class org.apache.tika.cli.XmlToJsonConfigConverter
-
Converts an XML Tika configuration stream to JSON format.
- convert(Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated. How a standalone converter might work
- convert(Path, Path) - Static method in class org.apache.tika.cli.XmlToJsonConfigConverter
-
Converts an XML Tika configuration file to JSON format.
- convert(Metadata) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
- convert(Metadata, String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
Convert the given Tika metadata map to XMP object.
- convertAndSet(Metadata, Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated. How convert+set might work
- convertBase64ToPrivateKey(String) - Static method in class org.apache.tika.pipes.fetcher.http.jwt.JwtPrivateKeyCreds
- convertPrivateKeyToBase64(PrivateKey) - Static method in class org.apache.tika.pipes.fetcher.http.jwt.JwtPrivateKeyCreds
- convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Converts JSON Object to JSON Array
- convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Parses a JSON String and converts it to a JSON Object
- copy() - Method in class org.apache.tika.client.HttpClientFactory
- copy(DirectoryEntry, DirectoryEntry) - Method in class org.apache.tika.extractor.microsoft.MSEmbeddedStreamTranslator
- copyAtMost(Reader, Writer, int) - Method in class org.apache.tika.langdetect.LanguageDetectorTest
- copyFrom(ParseContext) - Method in class org.apache.tika.parser.ParseContext
-
Copies all entries from the source ParseContext into this one.
- copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
- COPYRIGHT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The copyright information."
- COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains any necessary copyright notice for claiming the intellectual property for this item and should identify the current owner of the copyright for the item.
- COPYRIGHT_OWNER - Static variable in interface org.apache.tika.metadata.IPTC
-
Owner or owners of the copyright in the licensed image.
- COPYRIGHT_OWNER_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the owner or owners of the copyright in the licensed image.
- COPYRIGHT_OWNER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- COPYRIGHT_OWNER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the owner or owners of the copyright in the licensed image.
- copyUpToMaxLength(InputStream, OutputStream) - Static method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
-
This class offers an implementation of NERecogniser based on CRF classifiers from Stanford CoreNLP.
- CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Creates a NERecogniser by loading model from given path
- CorruptedFileException - Exception in org.apache.tika.exception
-
This exception should be thrown when the parse absolutely, positively has to stop.
- CorruptedFileException(String) - Constructor for exception org.apache.tika.exception.CorruptedFileException
- CorruptedFileException(String, Throwable) - Constructor for exception org.apache.tika.exception.CorruptedFileException
- count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
- count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
- count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
- count - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
- count() - Method in class org.apache.tika.detect.TextStatistics
-
Returns the total number of bytes seen so far.
- count(int) - Method in class org.apache.tika.detect.TextStatistics
-
Returns the number of occurrences of the given byte.
- COUNT - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
-
Number of distinct categories.
- countControl() - Method in class org.apache.tika.detect.TextStatistics
-
Counts control characters (i.e.
- countEightBit() - Method in class org.apache.tika.detect.TextStatistics
-
Counts eight bit characters, i.e. bytes with their highest bit set.
- COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
Full name of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
- COUNTRY - Static variable in interface org.apache.tika.metadata.Photoshop
- COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Code of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
- countSafeAscii() - Method in class org.apache.tika.detect.TextStatistics
-
Counts "safe" (i.e. seven-bit non-control) ASCII characters.
- countUtf8Errors(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Counts the number of malformed UTF-8 sequences in the sample — one event per bad lead, orphaned continuation, overlong, surrogate, or out-of-range codepoint, regardless of how many bytes the bad sequence spans.
- countUtf8Errors(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
-
The extent or scope of the content of the resource.
- COVERAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- COVERAGE - Static variable in interface org.apache.tika.metadata.XMPDC
-
The extent or scope of the content of the resource.
- CPIO - Static variable in class org.apache.tika.detect.zip.PackageConstants
- cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
- cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
- crash(PipesMessageType, byte[]) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- CRC32 - Static variable in interface org.apache.tika.metadata.Zip
-
CRC-32 checksum of the uncompressed entry data.
- create() - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates an empty instance; same as calling new MimeTypes().
- create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
- create(InputStream...) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified input stream.
- create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified file path, as interpreted by the class loader in getResource().
- create(String, String) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance.
- create(String, String, long, String) - Static method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Creates a FrictionlessResource without the optional name field.
- create(String, String, long, String, String) - Static method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Creates a FrictionlessResource with all fields.
- create(String, String, ClassLoader) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance.
- create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
- create(URL...) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the resource at the location specified by the URL.
- create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified document.
- CREATE_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time the resource was created.
- createArrayProperty(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates an array property from a list of values.
- createArrayProperty(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
- createCellMainifestDataElement(ExGuid, Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the cell manifest data element.
- createChunkingInstance(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
-
This method is used to create the instance of AbstractChunking.
- createChunkingInstance(byte[], ChunkingMethod) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
-
This method is used to create the instance of AbstractChunking.
- createChunkingInstance(IntermediateNodeObject) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
-
This method is used to create the instance of AbstractChunking.
- createCommaSeparatedArray(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates an array property from a comma separated list.
- createCommaSeparatedArray(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
- createConfigStore(PluginManager, String, ExtensionConfig) - Static method in interface org.apache.tika.pipes.core.config.ConfigStoreFactory
-
Creates a ConfigStore instance based on configuration.
- CREATED - Static variable in interface org.apache.tika.metadata.DublinCore
-
Date of creation of the resource.
- CREATED - Static variable in interface org.apache.tika.metadata.FileSystem
- CREATED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CREATED - Static variable in interface org.apache.tika.metadata.XMPDC
-
Date of creation of the resource.
- createDecryptStream(InputStream, Key) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
- createDefaultComposite(Set<Class<? extends Detector>>, LoaderContext) - Method in class org.apache.tika.config.loader.DetectorLoader
- createDefaultComposite(Set<Class<? extends EncodingDetector>>, LoaderContext) - Method in class org.apache.tika.config.loader.EncodingDetectorLoader
- createDefaultComposite(Set<Class<? extends Parser>>, LoaderContext) - Method in class org.apache.tika.config.loader.ParserLoader
- createDefaultComposite(Set<Class<? extends T>>, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Create the SPI-backed default composite with exclusions.
- createExtensionFinder() - Method in class org.apache.tika.plugins.TikaPluginManager
-
Override to disable classpath scanning for extensions.
- createExtractor() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Create the production FeatureExtractor for this model by dispatching on the CharSoupModel.featureFlags embedded in the binary.
- createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the next ID3v2 Frame in the file, or null if the next batch of data doesn't correspond to an ID3v2 header.
- createHandler() - Method in class org.apache.tika.example.PickBestTextEncodingParser.CharsetContentHandlerFactory
-
Deprecated.
- createHandler() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- createHandler() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- createHandler() - Method in interface org.apache.tika.sax.ContentHandlerFactory
-
Creates a new ContentHandler for extracting content.
- createHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- createHandler(OutputStream, Charset) - Method in interface org.apache.tika.sax.StreamingContentHandlerFactory
-
Creates a new ContentHandler that writes output directly to the given OutputStream.
- createInstance(ExGuid, ObjectGroupDataElementData, boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
- createInstance(ObjectGroupDataElementData) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
-
Create the instance of Header Cell.
- createLangAltProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates a language alternative property in the x-default language
- createLangAltProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
- createMapper() - Static method in class org.apache.tika.config.loader.TikaObjectMapperFactory
-
Creates an ObjectMapper configured for Tika serialization.
- createMapper(JsonFactory) - Static method in class org.apache.tika.config.loader.TikaObjectMapperFactory
-
Creates an ObjectMapper configured for Tika serialization with a custom JsonFactory.
- createMergedParseContext(ParseContext) - Method in class org.apache.tika.pipes.core.server.SharedServerResources
-
Creates a merged ParseContext with defaults from tika-config overlaid with request values.
- createNotFoundException(String) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Creates a not-found exception for this component type.
- createNotFoundException(String) - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
- createNotFoundException(String) - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
- createObjectGroupDataElement(byte[], AtomicReference<ExGuid>, List<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create object group data/blob element list.
- createOneNoteDocumentFromDirectFileResource(OneNoteDirectFileResource) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
-
Create a OneNoteDocument object.
- createPageDrawer(PageDrawerParameters) - Method in class org.apache.tika.renderer.pdf.pdfbox.NoTextPDFRenderer
-
Returns a new PageDrawer instance, using the given parameters.
- createPageDrawer(PageDrawerParameters) - Method in class org.apache.tika.renderer.pdf.pdfbox.TextOnlyPDFRenderer
-
Returns a new PageDrawer instance, using the given parameters.
- createPageDrawer(PageDrawerParameters) - Method in class org.apache.tika.renderer.pdf.pdfbox.VectorGraphicsOnlyPDFRenderer
-
Returns a new PageDrawer instance, using the given parameters.
- createParseContext() - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Creates a new ParseContext with defaults loaded from tika-config.
- createParser() - Static method in class org.apache.tika.server.core.resource.TikaResource
- createPluginDescriptorFinder() - Method in class org.apache.tika.plugins.TikaPluginManager
-
Override to use PropertiesPluginDescriptorFinder in development mode.
- createPluginRepository() - Method in class org.apache.tika.plugins.TikaPluginManager
-
Override to prevent scanning subdirectories in development mode.
- createProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates a simple property.
- createProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
- createRevisionManifestDataElement(ExGuid, ExGuid, List<ExGuid>, Map<ExGuid, ExGuid>, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the revision manifest data element.
- createStorageIndexDataElement(ExGuid, Map<CellID, ExGuid>, Map<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the storage index data element.
- createStorageManifestDataElement(Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to create the storage manifest data element.
- createTable() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the createTable record component.
- createTable() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the createTable record component.
- createTables(List<TableInfo>, JDBCUtil.CREATE_TABLE) - Method in class org.apache.tika.eval.app.db.JDBCUtil
- createTempFile() - Method in class org.apache.tika.io.TemporaryResources
- createTempFile(String) - Method in class org.apache.tika.io.TemporaryResources
-
Creates a temporary file that will automatically be deleted when the TemporaryResources.close() method is called, returning its path.
- createTempFile(Metadata) - Method in class org.apache.tika.io.TemporaryResources
-
Creates a temporary file that will automatically be deleted when the TemporaryResources.close() method is called, returning its path.
- createTemporaryFile() - Method in class org.apache.tika.io.TemporaryResources
-
Creates and returns a temporary file that will automatically be deleted when the TemporaryResources.close() method is called.
- CREATION_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document created?
- CreationTimeStamp - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- CreativeCommons - Interface in org.apache.tika.metadata
-
A collection of Creative Commons properties names.
- CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity primarily responsible for making the content of the resource.
- CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the name of the person who created the content of this item, a photographer for photos, a graphic artist for graphics, or a writer for textual news, but in cases where the photographer should not be identified the name of a company or organisation may be appropriate.
- CREATOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CREATOR - Static variable in interface org.apache.tika.metadata.XMPDC
-
An entity primarily responsible for making the content of the resource.
- CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.XMP
-
The name of the first known tool used to create the resource.
- CREATORS_CONTACT_INFO - Static variable in interface org.apache.tika.metadata.IPTC
-
The creator's contact information provides all necessary information to get in contact with the creator of this item and comprises a set of sub-properties for proper addressing.
- CREATORS_JOB_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the job title of the person who created the content of this item.
- credentialsProvider() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the credentialsProvider record component.
- CREDIT - Static variable in interface org.apache.tika.metadata.Photoshop
- CREDIT_LINE - Static variable in interface org.apache.tika.metadata.IPTC
-
The credit to person(s) and/or organisation(s) required by the supplier of the item to be used when published.
- CRLF - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- CryptoParser - Class in org.apache.tika.parser
-
Decrypts the incoming document stream and delegates further parsing to another parser instance.
- CryptoParser(String, Provider, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
- CryptoParser(String, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
- CSVMessageBodyWriter - Class in org.apache.tika.server.core.writer
- CSVMessageBodyWriter() - Constructor for class org.apache.tika.server.core.writer.CSVMessageBodyWriter
- CSVParams - Class in org.apache.tika.parser.csv
- CSVPipesIterator - Class in org.apache.tika.pipes.iterator.csv
-
Iterates through a UTF-8 CSV file.
- CSVPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.csv
- CSVPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- CSVPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.csv
-
Factory for creating CSV pipes iterators.
- CSVPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorFactory
- CSVPipesPlugin - Class in org.apache.tika.pipes.plugin.csv
- CSVPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.csv.CSVPipesPlugin
- CSVResult - Class in org.apache.tika.parser.csv
- CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
- CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
- CTAKESAnnotationProperty - Enum Class in org.apache.tika.parser.ctakes
-
This enumeration includes the properties that an IdentifiedAnnotation object can provide.
- CTAKESConfig - Class in org.apache.tika.parser.ctakes
-
Configuration for CTAKESContentHandler.
- CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Default constructor.
- CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Loads properties from InputStream and then tries to close InputStream.
- CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
-
Class used to extract biomedical information while parsing.
- CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Default constructor.
- CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
- CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
- CTAKESParser - Class in org.apache.tika.parser.ctakes
-
CTAKESParser decorates a Parser and leverages CTAKESContentHandler to extract biomedical information from clinical text using Apache cTAKES.
- CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser
- CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the specified Parser
- CTAKESSerializer - Enum Class in org.apache.tika.parser.ctakes
-
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
- CTAKESUtils - Class in org.apache.tika.parser.ctakes
-
This class provides methods to extract biomedical information from plain text using CTAKESContentHandler that relies on Apache cTAKES.
- CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
- curveTo(float, float, float, float, float, float) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- CUSTOM - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.KEY_BASE_STRATEGY
-
Custom pattern using emitKeyBase
- CUSTOM_MIMES_SYS_PROP - Static variable in class org.apache.tika.mime.MimeTypesFactory
-
System property to set a path to an additional external custom mimetypes XML file to be loaded.
- customCompositeDetector() - Static method in class org.apache.tika.example.CustomMimeInfo
- customLoader(ComponentLoader<T>) - Method in class org.apache.tika.serialization.ComponentConfig.Builder
-
Configure a custom loader for complex components that need special handling (SPI fallback, dependency injection, etc.).
- customMimeInfo() - Static method in class org.apache.tika.example.CustomMimeInfo
- CustomMimeInfo - Class in org.apache.tika.example
- CustomMimeInfo() - Constructor for class org.apache.tika.example.CustomMimeInfo
- CYRILLIC - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
D
- d - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
- DAALA_VIDEO - Static variable in class org.apache.tika.parser.ogg.OggParser
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a value that is unique to the file data represented by this root node object.
- data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
- data - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
- Database - Interface in org.apache.tika.metadata
- databaseExists(Path) - Static method in class org.apache.tika.eval.app.db.H2Util
- DataElement - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- DataElement - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Data Element
- DataElement - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Data Element
- DataElement() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Initializes a new instance of the DataElement class.
- DataElement(DataElementType, DataElementData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Initializes a new instance of the DataElement class.
- DataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Base class of data element
- DataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
- dataElementExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
- DataElementFragment - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Data Element Fragment
- dataElementHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
- DataElementHash - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies a data element hash stream object
- DataElementHash - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Data Element Hash
- DataElementHash() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
Initializes a new instance of the DataElementHash class.
- dataElementHashData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
- dataElementHashScheme - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
- dataElementPackage - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- DataElementPackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- DataElementPackage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Data Element Package
- DataElementPackage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Data Element Package
- DataElementPackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
Initializes a new instance of the DataElementPackage class.
- DataElementParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.exception
- DataElementParseErrorException(int, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
- DataElementParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
- dataElements - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
- dataElementType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
- DataElementType - Enum Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
The enumeration of the data element type
- DataElementUtils - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- DataElementUtils() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
- dataHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
- DataHashObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- DataHashObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Data Hash Object
- DataHashObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Initializes a new instance of the DataHashObject class.
- dataNodeObjectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
- DataNodeObjectData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
Data Node Object data
- DataNodeObjectData(byte[], int, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
-
Initializes a new instance of the DataNodeObjectData class.
- DataPackage - Class in org.apache.tika.pipes.core.extractor.frictionless
-
Represents a Frictionless Data Package manifest (datapackage.json).
- DataPackage() - Constructor for class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Creates an empty DataPackage for deserialization.
- DataPackage(String) - Constructor for class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Creates a new DataPackage with the given name and current timestamp.
- dataRoot - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
- dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
- DataSizeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Data Size Object
- DataSizeObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Data Size Object
- DataSizeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
Initializes a new instance of the DataSizeObject class.
- DataURIScheme - Class in org.apache.tika.parser.html
- DataURISchemeParseException - Exception in org.apache.tika.parser.html
- DataURISchemeParseException(String) - Constructor for exception org.apache.tika.parser.html.DataURISchemeParseException
- DataURISchemeUtil - Class in org.apache.tika.parser.html
-
Not thread safe.
- DataURISchemeUtil() - Constructor for class org.apache.tika.parser.html.DataURISchemeUtil
- DATE - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- DATE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A date associated with an event in the life cycle of the resource.
- DATE - Static variable in interface org.apache.tika.metadata.XMPDC
-
A date associated with an event in the life cycle of the resource.
- DATE - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
Designates the date and optionally the time the intellectual content was created rather than the date of the creation of the physical representation.
- DATE_CREATED - Static variable in interface org.apache.tika.metadata.Photoshop
- DATE_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- DateNormalizingMetadataFilter - Class in org.apache.tika.metadata.filter
-
Some dates in some file formats do not have a timezone.
- DateNormalizingMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
- DateNormalizingMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
-
Constructor for JSON configuration.
- DateNormalizingMetadataFilter(DateNormalizingMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
-
Constructor with explicit Config object.
- DateNormalizingMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- DateUtils - Class in org.apache.tika.utils
-
Date related utility methods and constants
- DateUtils() - Constructor for class org.apache.tika.utils.DateUtils
- DBBuffer - Class in org.apache.tika.eval.app.db
- DBBuffer(Connection, String, String, String) - Constructor for class org.apache.tika.eval.app.db.DBBuffer
- DBFParser - Class in org.apache.tika.parser.dbf
-
This is a Tika wrapper around the DBFReader.
- DBFParser() - Constructor for class org.apache.tika.parser.dbf.DBFParser
- DBWriter - Class in org.apache.tika.eval.app.io
-
This is still in its early stages.
- DBWriter(Connection, List<TableInfo>, JDBCUtil, MimeBuffer) - Constructor for class org.apache.tika.eval.app.io.DBWriter
- DcXMLParser - Class in org.apache.tika.parser.xml
-
Dublin Core metadata parser
- DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
- DD_MMM_YY - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- DD_SLASH_MM_SLASH_YYYY - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- DECLARATIVE - Enum constant in enum class org.apache.tika.detect.EncodingResult.ResultType
-
The document explicitly declared its encoding (BOM, HTML meta charset).
- decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode an array of hex chars
- decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode an array of hex chars.
- decode(String) - Static method in class org.apache.tika.inference.VectorSerializer
-
Decode a base64 string back to a float array (big-endian float32).
- decode(String) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode a hex string
- DECODED_CHARSET - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
The charset actually used to decode the stream when a superset override was applied.
- DecodeEquivalence - Class in org.apache.tika.ml.chardetect
-
Cheap byte-wise decode-equivalence check for single-byte charsets.
- decompressConcatenated(Metadata) - Method in interface org.apache.tika.parser.pkg.CompressorParserOptions
- decorate(ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.sax.ContentHandlerDecoratorFactory
- decorateDefaultComposite(Parser, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.ParserLoader
- decorateDefaultComposite(T, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Decorate the default composite with additional behavior.
- decrementEmbeddedDepth() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- DEFAULT - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.KEY_BASE_STRATEGY
-
Default pattern: {containerKey}-embed/{id}{suffix}
- DEFAULT - Static variable in class org.apache.tika.parser.AutoDetectParserConfig
- DEFAULT_CERT_EXPIRATION_WARNING_DAYS - Static variable in class org.apache.tika.server.core.TlsConfig
-
Default warning threshold for certificate expiration (30 days).
- DEFAULT_CHARSET - Static variable in class org.apache.tika.detect.OverrideEncodingDetector
- DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.html.JSoupParser
- DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- DEFAULT_DIRECT_EMIT_THRESHOLD_BYTES - Static variable in class org.apache.tika.pipes.core.EmitStrategyConfig
-
Default threshold in bytes for direct emission from PipesServer.
- DEFAULT_EMBEDDED_FILE_FIELD_NAME - Static variable in class org.apache.tika.pipes.emitter.es.ESEmitter
- DEFAULT_EMBEDDED_FILE_FIELD_NAME - Static variable in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
- DEFAULT_EMBEDDED_FILE_FIELD_NAME - Static variable in class org.apache.tika.pipes.emitter.solr.SolrEmitter
- DEFAULT_EMIT_MAX_ESTIMATED_BYTES - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_EMIT_STRATEGY - Static variable in class org.apache.tika.pipes.core.EmitStrategyConfig
-
Default emit strategy for PipesServer.
- DEFAULT_EMIT_WITHIN_MILLIS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_EXIT_VALUE_KEY - Static variable in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- DEFAULT_EXIT_VALUE_KEY - Static variable in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- DEFAULT_FETCHER_ID - Static variable in class org.apache.tika.server.core.resource.PipesParsingHelper
-
The fetcher ID used for reading temp files.
- DEFAULT_FETCHER_NAME - Static variable in class org.apache.tika.pipes.fork.PipesForkParser
- DEFAULT_FORKED_STARTUP_MILLIS - Static variable in class org.apache.tika.server.core.TikaServerConfig
-
Number of milliseconds to wait for the forked process to start up
- DEFAULT_HANDLER_TYPE - Static variable in class org.apache.tika.server.core.resource.RecursiveMetadataResource
- DEFAULT_HEARTBEAT_INTERVAL_MS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_HOST - Static variable in class org.apache.tika.server.core.TikaServerConfig
- DEFAULT_ID - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
- DEFAULT_MAX_CHARS_FOR_DETECTION - Static variable in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- DEFAULT_MAX_CHARS_FOR_SHORT_DETECTION - Static variable in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- DEFAULT_MAX_ENTITY_EXPANSIONS - Static variable in class org.apache.tika.utils.XMLReaderUtils
- DEFAULT_MAX_FIELD_SIZE - Static variable in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- DEFAULT_MAX_FILE_SIZE_TO_OCR - Static variable in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- DEFAULT_MAX_FILES_PROCESSED_PER_PROCESS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_MAX_KEY_SIZE - Static variable in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- DEFAULT_MAX_TOTAL_BYTES - Static variable in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- DEFAULT_MAX_UNPACK_BYTES - Static variable in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Default maximum bytes to unpack per file: 10 GB.
- DEFAULT_MAX_VALUES_PER_FIELD - Static variable in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- DEFAULT_MAX_WAIT_FOR_CLIENT_MS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_MAX_WAIT_MS - Static variable in class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- DEFAULT_MODEL_PATH - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Default model path
- DEFAULT_MODEL_RESOURCE - Static variable in class org.apache.tika.ml.chardetect.MojibusterEncodingDetector
-
Default NB bigram model on the classpath.
- DEFAULT_MODEL_RESOURCE - Static variable in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Default classpath resource for the trained UTF-16 specialist model.
- DEFAULT_MODEL_RESOURCE - Static variable in class org.apache.tika.ml.junkdetect.JunkDetector
-
Classpath resource path for the bundled production model.
- DEFAULT_MODELS - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- DEFAULT_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
- DEFAULT_NUM_CLIENTS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_NUM_EMITTERS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_NUM_REUSES - Static variable in class org.apache.tika.utils.XMLReaderUtils
- DEFAULT_ON_PARSE_EXCEPTION - Static variable in class org.apache.tika.pipes.api.FetchEmitTuple
- DEFAULT_PARSE_STATUS_KEY - Static variable in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- DEFAULT_PARSE_STATUS_KEY - Static variable in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- DEFAULT_PARSE_TIME_KEY - Static variable in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- DEFAULT_PARSE_TIME_KEY - Static variable in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- DEFAULT_POOL_SIZE - Static variable in class org.apache.tika.utils.XMLReaderUtils
-
Default size for the pool of SAX Parsers and the pool of DOM builders
- DEFAULT_PORT - Static variable in class org.apache.tika.server.core.TikaServerConfig
- DEFAULT_PROGRESS_TIMEOUT_MILLIS - Static variable in class org.apache.tika.config.TimeoutLimits
- DEFAULT_PROTOCOLS - Static variable in class org.apache.tika.server.core.TlsConfig
-
Default TLS protocols - only TLS 1.2 and 1.3 are enabled by default.
- DEFAULT_QUEUE_SIZE - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_QUEUE_SIZE - Static variable in class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- DEFAULT_SECRET - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
- DEFAULT_SHUTDOWN_CLIENT_AFTER_MILLS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_SOCKET_TIMEOUT_MS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_STALE_FETCHER_DELAY_SECONDS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_STALE_FETCHER_TIMEOUT_SECONDS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_STARTUP_TIMEOUT_MILLIS - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DEFAULT_TIMEOUT_MS - Static variable in class org.apache.tika.parser.external.ExternalParser
- DEFAULT_TIMEOUT_MS - Static variable in class org.apache.tika.parser.gdal.GDALParser
- DEFAULT_TOTAL_TASK_TIMEOUT_MILLIS - Static variable in class org.apache.tika.config.TimeoutLimits
- DEFAULT_USE_SHARED_SERVER - Static variable in class org.apache.tika.pipes.core.PipesConfig
- DefaultDetector - Class in org.apache.tika.detect
-
A composite detector that orchestrates the detection pipeline: (1) MimeTypes (magic byte) detection; (2) Container and other detectors loaded via SPI; (3) TextDetector as fallback for unknown types. Returns the most specific type detected.
- DefaultDetector() - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetector(MimeTypes, ServiceLoader, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.DefaultDetector
- DefaultDetectorSerializer - Class in org.apache.tika.serialization.serdes
-
Serializer for DefaultDetector that outputs exclusions.
- DefaultDetectorSerializer() - Constructor for class org.apache.tika.serialization.serdes.DefaultDetectorSerializer
- DefaultEmbeddedStreamTranslator - Class in org.apache.tika.extractor
-
Loads EmbeddedStreamTranslators via service loading.
- DefaultEmbeddedStreamTranslator() - Constructor for class org.apache.tika.extractor.DefaultEmbeddedStreamTranslator
- DefaultEncodingDetector - Class in org.apache.tika.detect
-
A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism.
- DefaultEncodingDetector() - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
- DefaultEncodingDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
- DefaultEncodingDetector(ServiceLoader, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
- defaultFor() - Element in annotation interface org.apache.tika.config.TikaComponent
-
Marks this component as the default implementation for the specified interface.
- DefaultHtmlMapper - Class in org.apache.tika.parser.html
-
The default HTML mapping rules in Tika.
- DefaultHtmlMapper() - Constructor for class org.apache.tika.parser.html.DefaultHtmlMapper
- DefaultMetadataFilter - Class in org.apache.tika.metadata.filter
- DefaultMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.DefaultMetadataFilter
- DefaultMetadataFilter(List<MetadataFilter>) - Constructor for class org.apache.tika.metadata.filter.DefaultMetadataFilter
- DefaultMetadataFilter(ServiceLoader) - Constructor for class org.apache.tika.metadata.filter.DefaultMetadataFilter
- DefaultParser - Class in org.apache.tika.parser
-
A composite parser based on all the Parser implementations available through the service provider mechanism.
- DefaultParser() - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>, EncodingDetector, Renderer) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParser(MediaTypeRegistry, ServiceLoader, EncodingDetector, Renderer) - Constructor for class org.apache.tika.parser.DefaultParser
- DefaultParserSerializer - Class in org.apache.tika.serialization.serdes
-
Serializer for DefaultParser that outputs exclusions.
- DefaultParserSerializer() - Constructor for class org.apache.tika.serialization.serdes.DefaultParserSerializer
- DefaultProbDetector - Class in org.apache.tika.detect
-
A version of DefaultDetector for probabilistic mime detectors, which use statistical techniques to blend the results of differing underlying detectors when attempting to detect the type of a given file.
- DefaultProbDetector() - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(ProbabilisticMimeDetectionSelector, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- DefaultProbDetector(ProbabilisticMimeDetectionSelector, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
- defaultProvider(Supplier<T>) - Method in class org.apache.tika.serialization.ComponentConfig.Builder
-
Configure a default value to return when the JSON field is absent.
- defaultTimeZone - Variable in class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter.Config
- DefaultTranslator - Class in org.apache.tika.language.translate
-
A translator which picks the first Translator implementation available through the service provider mechanism.
- DefaultTranslator() - Constructor for class org.apache.tika.language.translate.DefaultTranslator
- DefaultTranslator(ServiceLoader) - Constructor for class org.apache.tika.language.translate.DefaultTranslator
- DefaultZipContainerDetector - Class in org.apache.tika.detect.zip
-
This class is designed to detect subtypes of zip-based file formats.
- DefaultZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.DefaultZipContainerDetector
- DefaultZipContainerDetector(List<ZipContainerDetector>) - Constructor for class org.apache.tika.detect.zip.DefaultZipContainerDetector
- DefaultZipContainerDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.zip.DefaultZipContainerDetector
- DEFLATE64 - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- DelegatingParser - Class in org.apache.tika.parser
-
Base class for parser implementations that want to delegate parts of the task of parsing an input document to another parser.
- DelegatingParser() - Constructor for class org.apache.tika.parser.DelegatingParser
- Deletable - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- delete() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.atlassianjwt.AtlassianJwtPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.azblob.AZBlobPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.csv.CSVPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.es.ESPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.fs.FileSystemPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.gcs.GCSPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.googledrive.GoogleDrivePipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.http.HttpPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.jdbc.JDBCPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.JsonPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.kafka.KafkaPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.microsoftgraph.MicrosoftGraphPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.opensearch.OpenSearchPipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.s3.S3PipesPlugin
- delete() - Method in class org.apache.tika.pipes.plugin.solr.SolrPipesPlugin
- DELETE - Enum constant in enum class org.apache.tika.parser.microsoft.ooxml.EditType
- deleteComponent(String) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Deletes a component configuration by ID.
- deleteEmitter(String) - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Deletes an emitter configuration by ID.
- deleteFetcher(String) - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Deletes a fetcher configuration by ID.
- deleteFetcher(DeleteFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Delete a fetcher from the fetcher store.
- deleteFetcher(DeleteFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Delete a fetcher from the fetcher store.
- deleteFetcher(DeleteFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Delete a fetcher from the fetcher store.
- deleteFetcher(DeleteFetcherRequest, StreamObserver<DeleteFetcherReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Delete a fetcher from the fetcher store.
- deleteFetcher(DeleteFetcherRequest, StreamObserver<DeleteFetcherReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Delete a fetcher from the fetcher store.
- DeleteFetcherReply - Class in org.apache.tika
-
Protobuf type tika.DeleteFetcherReply
- DeleteFetcherReply.Builder - Class in org.apache.tika
-
Protobuf type tika.DeleteFetcherReply
- DeleteFetcherReplyOrBuilder - Interface in org.apache.tika
- DeleteFetcherRequest - Class in org.apache.tika
-
Protobuf type tika.DeleteFetcherRequest
- DeleteFetcherRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.DeleteFetcherRequest
- DeleteFetcherRequestOrBuilder - Interface in org.apache.tika
- deleteNamespace(String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Deletes a namespace from the registry.
- deletePipesIterator(DeletePipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Delete a pipes iterator from the iterator store.
- deletePipesIterator(DeletePipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Delete a pipes iterator from the iterator store.
- deletePipesIterator(DeletePipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Delete a pipes iterator from the iterator store.
- deletePipesIterator(DeletePipesIteratorRequest, StreamObserver<DeletePipesIteratorReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Delete a pipes iterator from the iterator store.
- deletePipesIterator(DeletePipesIteratorRequest, StreamObserver<DeletePipesIteratorReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Delete a pipes iterator from the iterator store.
- DeletePipesIteratorReply - Class in org.apache.tika
-
Protobuf type tika.DeletePipesIteratorReply
- DeletePipesIteratorReply.Builder - Class in org.apache.tika
-
Protobuf type tika.DeletePipesIteratorReply
- DeletePipesIteratorReplyOrBuilder - Interface in org.apache.tika
- DeletePipesIteratorRequest - Class in org.apache.tika
-
Protobuf type tika.DeletePipesIteratorRequest
- DeletePipesIteratorRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.DeletePipesIteratorRequest
- DeletePipesIteratorRequestOrBuilder - Interface in org.apache.tika
- DELIMITER_PROPERTY - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
- deliveryTimeoutMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the deliveryTimeoutMs record component.
- delta() - Method in class org.apache.tika.quality.TextQualityComparison
-
Absolute difference in z-scores between the two candidates.
- DeprecatedStreamingZipContainerDetector - Class in org.apache.tika.detect.zip
- DeprecatedStreamingZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.DeprecatedStreamingZipContainerDetector
- DeprecatedZipContainerDetector - Class in org.apache.tika.detect.zip
-
A detector that works on Zip documents and tries to figure out basic types -- epub, jar, ear, war, kmz and StarOffice
- DeprecatedZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.DeprecatedZipContainerDetector
- DERIVED_FROM_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Document id for the document that this document was derived from
- DERIVED_FROM_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Instance id for the document instance that this document was derived from
- descend(String, String) - Method in class org.apache.tika.sax.xpath.ChildMatcher
- descend(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- descend(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns the XPath evaluation state that results from descending to a child element with the given name.
- descend(String, String) - Method in class org.apache.tika.sax.xpath.NamedElementMatcher
- descend(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- DescendantsCannotBeMoved - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- describeMediaType() - Static method in class org.apache.tika.example.MediaTypeExample
- DescribeMetadata - Class in org.apache.tika.example
-
Print the supported Tika Metadata models and their fields.
- DescribeMetadata() - Constructor for class org.apache.tika.example.DescribeMetadata
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
-
An account of the content of the resource.
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.IPTC
-
A textual description, including captions, of the item's content, particularly used where the object is not text.
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.XMPDC
-
An account of the content of the resource.
- DESCRIPTION_WRITER - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifier or the name of the person involved in writing, editing or correcting the description of the content.
- DESCRIPTOR_NODE_ID - Static variable in interface org.apache.tika.metadata.PST
- deserialize(JsonParser, DeserializationContext) - Method in class org.apache.tika.pipes.core.serialization.EmitDataDeserializer
- deserialize(JsonParser, DeserializationContext) - Method in class org.apache.tika.pipes.core.serialization.FetchEmitTupleDeserializer
- deserialize(JsonParser, DeserializationContext) - Method in class org.apache.tika.pipes.core.serialization.PipesResultDeserializer
- deserialize(JsonParser, DeserializationContext) - Method in class org.apache.tika.serialization.serdes.MetadataDeserializer
- deserialize(JsonParser, DeserializationContext) - Method in class org.apache.tika.serialization.serdes.ParseContextDeserializer
- deserialize(ObjectMapper, String, Class<T>) - Static method in class org.apache.tika.config.loader.JsonMergeUtils
-
Deserializes JSON to a configuration object without merging.
- deserialize(String, Class<T>) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Deserializes a configuration value for the given key.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
Used to return the length of this element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
-
De-serialize data element data from byte array.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
-
Used to return the length of this element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
-
Used to return the length of this element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
Used to de-serialize the data element.
- deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
Used to de-serialize data element.
- deserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to return the length of this element.
- deserializeFromByteArray(StreamObjectHeaderStart, byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Used to return the length of this element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Used to de-serialize the element.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
Used to de-serialize the items.
- deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
De-serialize items from byte array.
- detect() - Method in class org.apache.tika.language.detect.LanguageDetector
- detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Return the charset that best matches the supplied input data.
- detect(byte[]) - Method in class org.apache.tika.ml.chardetect.MojibusterEncodingDetector
-
Byte-array entry point without metadata, same as passing null.
- detect(byte[]) - Method in class org.apache.tika.ml.chardetect.NaiveBayesBigramEncodingDetector
- detect(byte[]) - Method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Byte-array entry point for callers that already hold a probe (e.g.
- detect(byte[]) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(byte[], String) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(byte[], Metadata) - Method in class org.apache.tika.ml.chardetect.MojibusterEncodingDetector
-
Byte-array entry point with optional metadata.
- detect(File) - Method in class org.apache.tika.Tika
-
Detects the media type of the given file.
- detect(InputStream) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.DetectorResource
- detect(InputStream, String) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
- detect(String) - Method in class org.apache.tika.Tika
-
Detects the media type of a document with the given file name.
- detect(URL) - Method in class org.apache.tika.Tika
-
Detects the media type of the resource at the given URL.
- detect(Path) - Method in class org.apache.tika.Tika
-
Detects the media type of the file at the given path.
- detect(Set<String>) - Static method in class org.apache.tika.detect.ole.MiscOLEDetector
-
Deprecated. Use MiscOLEDetector.detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected as a second argument.
- detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.
- detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.detect.ole.MiscOLEDetector
-
Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.
- detect(ZipFile) - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- detect(ZipFile) - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.apple.IWorkDetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.FrictionlessPackageDetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.IPADetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.JarDetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.KMZDetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.OpenDocumentDetector
- detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.StarOfficeDetector
- detect(ZipFile, TikaInputStream) - Method in interface org.apache.tika.detect.zip.ZipContainerDetector
-
If detection is successful, the ZipDetector should set the zip file or OPCPackage in TikaInputStream.setOpenContainer(). Implementations should _not_ close the ZipFile.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.apple.BPListDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.BOMDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.CompositeDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.CompositeEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.DefaultDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in interface org.apache.tika.detect.Detector
-
Detects the content type of the given input document.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.EmptyDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in interface org.apache.tika.detect.EncodingDetector
-
Detects the character encoding of the given text document.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.FileCommandDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.gzip.GZipSpecializationDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.MagicDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.magika.MagikaDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.MatroskaDetector
-
Detects the media type of the input stream by inspecting EBML headers.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.MetadataCharsetDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.microsoft.POIFSContainerDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.NameDetector
-
Detects the content type of an input document based on the document name given in the input metadata.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.ogg.OggDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.ole.MiscOLEDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.OverrideDetector
-
Deprecated.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.OverrideEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.TextDetector
-
Looks at the beginning of the document input stream to determine whether the document is text or not.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.TrainedModelDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.TypeDetector
-
Detects the content type of an input document based on a type hint given in the input metadata.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.ZeroSizeFileDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.zip.DefaultZipContainerDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.zip.DeprecatedStreamingZipContainerDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.detect.zip.StreamingZipContainerDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.example.EncryptedPrescriptionDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.mime.MimeTypes
-
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.ml.chardetect.MojibusterEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.ml.chardetect.NaiveBayesBigramEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.ml.junkdetect.JunkFilterEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
- detect(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
- DETECT - Enum constant in enum class org.apache.tika.server.core.ServerStatus.TASK
- detectAll() - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- detectAll() - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- detectAll() - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- detectAll() - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
- detectAll() - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
-
Detect languages based on previously submitted text (via addText calls).
- detectAll() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Detect languages based on previously submitted text (via addText calls).
- detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Return an array of all charsets that appear to be plausible matches with the input data.
- detectAll(String) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Utility wrapper that detects the language of a given chunk of text.
- DETECTED - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.SUFFIX_STRATEGY
- DETECTED_ENCODING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
When an EncodingDetector detects an encoding, the encoding should be stored in this field.
- detectFilename(MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
- DetectHelper - Class in org.apache.tika.detect
-
Utility methods for content detection.
- DetectHelper() - Constructor for class org.apache.tika.detect.DetectHelper
- detectIfPossible(ZipEntry) - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- detectIfPossible(ZipEntry) - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
- DETECTION_CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
When content is truncated for detection, this stores the number of bytes that were actually buffered for detection.
- detectIso2022(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Detects ISO-2022-JP, ISO-2022-KR, and ISO-2022-CN by scanning for their characteristic ESC designation sequences.
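The escape-sequence scan described for detectIso2022 can be illustrated with a minimal standalone sketch. This is not Tika's implementation: the class name Iso2022Sketch, the returned charset labels, and the exact set of sequences checked are assumptions for illustration, taken from the designation sequences commonly listed for these encodings (ESC $ @, ESC $ B and ESC ( J for ISO-2022-JP; ESC $ ) C for ISO-2022-KR; ESC $ ) A, ESC $ ) G and ESC $ * H for ISO-2022-CN).

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the StructuralEncodingRules implementation.
final class Iso2022Sketch {
    private static final byte ESC = 0x1B;

    // Returns a charset label for the first recognized designation
    // sequence, or null if none is found.
    static String detect(byte[] bytes) {
        for (int i = 0; i + 2 < bytes.length; i++) {
            if (bytes[i] != ESC) {
                continue;
            }
            byte b1 = bytes[i + 1];
            byte b2 = bytes[i + 2];
            if (b1 == '$') {
                // Three-byte JP designations: ESC $ @ and ESC $ B
                if (b2 == '@' || b2 == 'B') {
                    return "ISO-2022-JP";
                }
                // Four-byte KR and CN designations
                if (i + 3 < bytes.length) {
                    byte b3 = bytes[i + 3];
                    if (b2 == ')' && b3 == 'C') {
                        return "ISO-2022-KR";
                    }
                    if (b2 == ')' && (b3 == 'A' || b3 == 'G')) {
                        return "ISO-2022-CN";
                    }
                    if (b2 == '*' && b3 == 'H') {
                        return "ISO-2022-CN";
                    }
                }
            } else if (b1 == '(' && b2 == 'J') {
                // JIS X 0201 Roman; ESC ( B (ASCII) is skipped as ambiguous
                return "ISO-2022-JP";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] sample = {0x1B, '$', 'B', 0x30, 0x21, 0x1B, '(', 'B'};
        System.out.println(detect(sample));
        System.out.println(detect("plain".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

The sketch stops at the first recognized sequence; a production detector would likely also need to reject input that contains bytes outside the 7-bit range.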
- detectIso2022(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- detectLanguage(String) - Method in class org.apache.tika.example.LanguageDetectorExample
- detectLanguage(String) - Method in class org.apache.tika.language.translate.impl.AbstractTranslator
- detectOfficeOpenXML(OPCPackage) - Static method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
-
Detects the type of an OfficeOpenXML (OOXML) file from opened Package
- detectOnKeys(Set<String>) - Static method in class org.apache.tika.detect.apple.BPListDetector
- Detector - Interface in org.apache.tika.detect
-
Content type detector.
- DETECTOR_DATA_DESCRIPTOR_REQUIRED - Static variable in interface org.apache.tika.metadata.Zip
-
Set by the detector to indicate whether streaming required DATA_DESCRIPTOR support.
- DETECTOR_ZIPFILE_OPENED - Static variable in interface org.apache.tika.metadata.Zip
-
Set by the detector to indicate whether it successfully opened the ZIP as a ZipFile.
- DetectorLoader - Class in org.apache.tika.config.loader
-
Loader for detectors with support for SPI fallback via "default-detector" marker.
- DetectorLoader() - Constructor for class org.apache.tika.config.loader.DetectorLoader
- DetectorResource - Class in org.apache.tika.server.core.resource
- DetectorResource(ServerStatus) - Constructor for class org.apache.tika.server.core.resource.DetectorResource
- detectPostStream(InputStream) - Method in class org.apache.tika.server.core.resource.LanguageResource
- detectPostString(String) - Method in class org.apache.tika.server.core.resource.LanguageResource
- detectPutStream(InputStream) - Method in class org.apache.tika.server.core.resource.LanguageResource
- detectPutString(String) - Method in class org.apache.tika.server.core.resource.LanguageResource
- detectType(InputStream) - Static method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- detectType(ZipArchiveEntry, ZipArchiveInputStream) - Static method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- detectType(ZipArchiveEntry, ZipFile) - Static method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- detectType(DirectoryEntry) - Static method in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- detectType(POIFSFileSystem) - Static method in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- detectWithCustomConfig(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
- detectWithCustomDetector(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
- detectXMLOnKeys(Set<String>) - Static method in class org.apache.tika.detect.apple.BPListDetector
- determineContextKey(ComponentInfo) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Determines the ParseContext key for a component.
- DEVANAGARI - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- DGN_8 - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
- DGN8Parser - Class in org.apache.tika.parser.dgn
-
This is a VERY LIMITED parser.
- DGN8Parser() - Constructor for class org.apache.tika.parser.dgn.DGN8Parser
- DiagnosticRequestOptionInput - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Diagnostic Request Option Input
- DiagnosticRequestOptionOutput - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Diagnostic Request Option Output
- DICE_COEFFICIENT - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- DIFContentHandler - Class in org.apache.tika.parser.dif
- DIFContentHandler - Class in org.apache.tika.sax
- DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.dif.DIFContentHandler
- DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.DIFContentHandler
- DIFParser - Class in org.apache.tika.parser.dif
- DIFParser() - Constructor for class org.apache.tika.parser.dif.DIFParser
- digest(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.digest.CompositeDigester
- digest(TikaInputStream, Metadata, ParseContext) - Method in interface org.apache.tika.digest.Digester
-
Digests a TikaInputStream and sets the appropriate value(s) in the metadata.
- digest(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.digest.InputStreamDigester
-
Digests the TikaInputStream and stores the result in metadata.
- DigestDef - Class in org.apache.tika.digest
-
Defines a digest algorithm with its output encoding.
- DigestDef() - Constructor for class org.apache.tika.digest.DigestDef
- DigestDef(DigestDef.Algorithm) - Constructor for class org.apache.tika.digest.DigestDef
- DigestDef(DigestDef.Algorithm, DigestDef.Encoding) - Constructor for class org.apache.tika.digest.DigestDef
- DigestDef.Algorithm - Enum Class in org.apache.tika.digest
-
Supported digest algorithms.
- DigestDef.Encoding - Enum Class in org.apache.tika.digest
-
Supported digest output encodings.
- Digester - Interface in org.apache.tika.digest
-
Interface for digester implementations.
- DigesterFactory - Interface in org.apache.tika.digest
-
Factory interface for creating Digester instances.
- DigestHelper - Class in org.apache.tika.digest
-
Utility class for computing digests on streams.
- DigestHelper() - Constructor for class org.apache.tika.digest.DigestHelper
- DIGITAL_IMAGE_GUID - Static variable in interface org.apache.tika.metadata.IPTC
-
Globally unique identifier for the item.
- DIGITAL_SOURCE_FILE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- DIGITAL_SOURCE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
-
The type of the source of this digital image
- DIR_NAME_A - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- DIR_NAME_B - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- DIRAC_VIDEO - Static variable in class org.apache.tika.parser.ogg.OggParser
- DIRECTORY - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_MODE
-
Emit files directly to the configured emitter as separate items
- DirectoryListingEntry - Class in org.apache.tika.parser.microsoft.chm
-
The format of a directory listing entry is as follows: BYTE: length of name; BYTEs: name (UTF-8 encoded); ENCINT: content section; ENCINT: offset; ENCINT: length. The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).
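The entry layout above can be read with a few lines of code. The following is a minimal sketch, not the DirectoryListingEntry implementation: the class name ChmEntrySketch and its fields are hypothetical, and it assumes ENCINT is the variable-length integer usually described for CHM files (7 data bits per byte, most significant group first, high bit set on every byte except the last).

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of reading one directory listing entry.
final class ChmEntrySketch {
    final String name;
    final long section;
    final long offset;
    final long length;

    ChmEntrySketch(String name, long section, long offset, long length) {
        this.name = name;
        this.section = section;
        this.offset = offset;
        this.length = length;
    }

    // Assumed ENCINT encoding: 7 bits per byte, big-endian,
    // high bit set means another byte follows.
    static long readEncInt(ByteBuffer buf) {
        long value = 0;
        byte b;
        do {
            b = buf.get();
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    // Reads the fields in the order given in the description:
    // name length, name, content section, offset, length.
    static ChmEntrySketch parse(ByteBuffer buf) {
        int nameLength = buf.get() & 0xFF;
        byte[] nameBytes = new byte[nameLength];
        buf.get(nameBytes);
        String name = new String(nameBytes, StandardCharsets.UTF_8);
        long section = readEncInt(buf);
        long offset = readEncInt(buf);
        long length = readEncInt(buf);
        return new ChmEntrySketch(name, section, offset, length);
    }

    public static void main(String[] args) {
        byte[] raw = {4, '/', 'a', '.', 'h', 0, (byte) 0x81, 0x00, 5};
        ChmEntrySketch e = parse(ByteBuffer.wrap(raw));
        System.out.println(e.name + " section=" + e.section
                + " offset=" + e.offset + " length=" + e.length);
    }
}
```

A ByteBuffer is used here so that the read position advances implicitly; the sketch does no bounds or sanity checking on the decoded values.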
- DirectoryListingEntry() - Constructor for class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- DirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Constructor for class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
-
Constructs directoryListingEntry
- DirListParser - Class in org.apache.tika.example
-
Parses the output of /bin/ls and counts the number of files and the number of executables using Tika.
- DirListParser() - Constructor for class org.apache.tika.example.DirListParser
- DISC_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The disc number for part of an album set."
- DISCARD_ALL - Enum constant in enum class org.apache.tika.parser.multiple.AbstractMultipleParser.MetadataPolicy
-
Before moving onto another parser, throw away all previously seen metadata
- DISCOVERY_TECNIQUE - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- DisplayedPageNumber - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- DisplayMetInstance - Class in org.apache.tika.example
-
Grabs a PDF file from a URL and prints its Metadata
- DisplayMetInstance() - Constructor for class org.apache.tika.example.DisplayMetInstance
- dispose() - Method in class org.apache.tika.io.TemporaryResources
-
Calls the TemporaryResources.close() method and wraps the potential IOException into a TikaException for convenience when used within Tika.
- dispose() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Assign the internal read buffer to null.
- DO_NOT_RESTART_EXIT_VALUE - Static variable in class org.apache.tika.server.core.TikaServerProcess
- DOC - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Word
- DOC_INFO_CREATED - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_CREATOR - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_KEY_WORDS - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_MODIFICATION_DATE - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_PRODUCER - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_SUBJECT - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_TITLE - Static variable in interface org.apache.tika.metadata.PDF
- DOC_INFO_TRAPPED - Static variable in interface org.apache.tika.metadata.PDF
- DOC_SECURITY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- DOC_SECURITY_STRING - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
The common identifier for all versions and renditions of a resource.
- DocumentSelector - Interface in org.apache.tika.extractor
-
Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
-
This method is used to deserialize the number of array from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
-
This method is used to deserialize the EightBytesOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
-
This method is used to deserialize the FourBytesOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
-
This method is used to deserialize the property from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
-
This method is used to deserialize the NoData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
-
This method is used to deserialize the OneByteOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
-
This method is used to deserialize the prtArrayOfPropertyValues from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
-
This method is used to deserialize the prtFourBytesOfLengthFollowedByData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
-
This method is used to deserialize the TwoBytesOfData from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
-
This method is used to deserialize the Alternative Packaging object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to return the length of this element.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
This method is used to de-serialize the BinaryItem basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
This method is used to deserialize the CellID basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
This method is used to deserialize the CellIDArray basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
This method is used to deserialize the Compact64bitInt basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
-
This method is used to deserialize the CompactID object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
This method is used to deserialize the ExGuid basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
This method is used to deserialize the ExGUIDArray basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
-
This method is used to deserialize the JCID object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
-
This method is used to deserialize the PropertyID object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
This method is used to deserialize the SerialNumber basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
-
This method is used to deserialize the PropertySet from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
-
This method is used to deserialize the ObjectSpaceObjectPropSet from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
-
This method is used to deserialize the ObjectSpaceObjectStreamHeader object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
-
This method is used to deserialize the ObjectSpaceObjectStreamOfContextIDs object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
-
This method is used to deserialize the ObjectSpaceObjectStreamOfOIDs object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
-
This method is used to deserialize the ObjectSpaceObjectStreamOfOSIDs object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
This method is used to deserialize the StreamObjectHeaderEnd16bit basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
This method is used to deserialize the StreamObjectHeaderEnd8bit basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
This method is used to deserialize the StreamObjectHeaderStart16bit basic object from the specified byte array and start index.
- doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
This method is used to deserialize the StreamObjectHeaderStart32bit basic object from the specified byte array and start index.
- DONT_CHECK - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.AccessCheckMode
-
Don't check extraction permissions.
- doubleByte - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
- doubleToInt64Bits(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- drawImage(PDImage) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- drawingHyperlinks - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- DRM_ENCRYPTED - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
TIKA-3666 MSOffice or other file encrypted with DRM in an OLE container
- DRMENCRYPTED - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- DROP_IF_EXISTS - Enum constant in enum class org.apache.tika.eval.app.db.JDBCUtil.CREATE_TABLE
- dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.app.db.H2Util
- dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.app.db.JDBCUtil
- DublinCore - Interface in org.apache.tika.metadata
-
A collection of Dublin Core metadata names.
- DUMP - Static variable in class org.apache.tika.detect.zip.PackageConstants
- DUPLICATE_ENTRY_NAMES - Static variable in interface org.apache.tika.metadata.Zip
-
Entry names that appear multiple times in the local headers (streaming).
- DURATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The duration of the media file."
- DurationFormatUtils - Class in org.apache.tika.utils
-
Functionality and naming conventions (roughly) copied from org.apache.commons.lang3 so that we didn't have to add another dependency.
- DurationFormatUtils() - Constructor for class org.apache.tika.utils.DurationFormatUtils
- DWG - Interface in org.apache.tika.metadata
-
DWG-specific properties surfaced by LibreDWG's dwgread JSON output.
- DWG_CUSTOM_META_PREFIX - Static variable in class org.apache.tika.parser.dwg.DWGParser
- DWG_PREFIX - Static variable in interface org.apache.tika.metadata.DWG
- DWGParser - Class in org.apache.tika.parser.dwg
-
DWG (CAD Drawing) parser.
- DWGParser() - Constructor for class org.apache.tika.parser.dwg.DWGParser
- DWGParser(JsonConfig) - Constructor for class org.apache.tika.parser.dwg.DWGParser
- DWGParser(DWGParserConfig) - Constructor for class org.apache.tika.parser.dwg.DWGParser
- DWGParserConfig - Class in org.apache.tika.parser.dwg
- DWGParserConfig() - Constructor for class org.apache.tika.parser.dwg.DWGParserConfig
- DWGParserConfig.RuntimeConfig - Class in org.apache.tika.parser.dwg
-
RuntimeConfig blocks modification of security-sensitive path fields at runtime.
- DWGReadFormatRemover - Class in org.apache.tika.parser.dwg
-
DWGReadFormatRemover removes the formatting from the text from libredwg files so only the raw text remains.
- DWGReadFormatRemover() - Constructor for class org.apache.tika.parser.dwg.DWGReadFormatRemover
- DWGReadParser - Class in org.apache.tika.parser.dwg
-
DWGReadParser (CAD Drawing) parser.
- DWGReadParser() - Constructor for class org.apache.tika.parser.dwg.DWGReadParser
- DYNAMIC - Enum constant in enum class org.apache.tika.pipes.core.EmitStrategy
E
- EditRootRTL - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- EditType - Enum Class in org.apache.tika.parser.microsoft.ooxml
- EightBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
This class represents a property that contains 8 bytes of data in the PropertySet.rgData stream field.
- EightBytesOfData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains 8 bytes of data in the PropertySet.rgData stream field.
- EightBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
- ELAPSED_TIME_MILLIS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- element(String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Emits an XHTML element with the given text content.
- ElementChildNodesOfOutline - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfOutlineElement - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfPage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfSection - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfTable - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfTableCell - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfTableRow - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfTitle - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementChildNodesOfVersionHistory - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ElementMappingContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that maps element QNames using a Map.
- ElementMappingContentHandler(ContentHandler, Map<QName, ElementMappingContentHandler.TargetElement>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler
- ElementMappingContentHandler.TargetElement - Class in org.apache.tika.sax
- ElementMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of an XPath expression that targets an element.
- ElementMatcher() - Constructor for class org.apache.tika.sax.xpath.ElementMatcher
- ElementMetadataHandler - Class in org.apache.tika.parser.xml
-
SAX event handler that maps the contents of an XML element into a metadata field.
- ElementMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
-
Constructor for string metadata keys.
- ElementMetadataHandler(String, String, Metadata, String, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
-
Constructor for string metadata keys which allows change of behavior for duplicate and empty entry values.
- ElementMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
-
Constructor for Property metadata keys.
- ElementMetadataHandler(String, String, Metadata, Property, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
-
Constructor for Property metadata keys which allows change of behavior for duplicate and empty entry values.
- EmailVisitor - Class in org.apache.tika.parser.microsoft.libpst
- EmailVisitor(Path, boolean, XHTMLContentHandler, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.libpst.EmailVisitor
- EMB_APP_VERSION - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
If an application and version are given as part of the embedded object, this is the literal string.
- EMB_CLASS - Static variable in interface org.apache.tika.metadata.RTFMetadata
- EMB_ITEM - Static variable in interface org.apache.tika.metadata.RTFMetadata
- EMB_TOPIC - Static variable in interface org.apache.tika.metadata.RTFMetadata
- embed(List<Chunk>, InferenceConfig) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
-
Call the embeddings endpoint to fill in vectors on each chunk.
- embed(List<Chunk>, InferenceConfig) - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
- embed(Metadata, InputStream, OutputStream, ParseContext) - Method in interface org.apache.tika.embedder.Embedder
-
Embeds related document metadata from the given metadata object into the given output stream.
- embed(Metadata, InputStream, OutputStream, ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler.
- EMBEDDED - Enum constant in enum class org.apache.tika.extractor.EmbeddedDocumentUtil.EmbeddedResourcePrefix
- EMBEDDED_BYTES_EXCEPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_DEPTH - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EMBEDDED_DEPTH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_DEPTH_LIMIT_REACHED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_DEPTH_LIMIT_REACHED - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- EMBEDDED_EXCEPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_FILE_ANNOTATION_TYPE - Static variable in interface org.apache.tika.metadata.PDF
-
If the file came from an annotation and there was a type
- EMBEDDED_FILE_DESCRIPTION - Static variable in interface org.apache.tika.metadata.PDF
- EMBEDDED_FILE_PATH - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EMBEDDED_FILE_PATH_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- EMBEDDED_FILE_PATH_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
- EMBEDDED_FILE_PATH_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
- EMBEDDED_FILE_SUBTYPE - Static variable in interface org.apache.tika.metadata.PDF
-
Literal string from PDEmbeddedFile#getSubtype(); this should be what the PDF alleges is the embedded file's MIME type.
- EMBEDDED_ID - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is a 1-based counter for embedded files, used by the RecursiveParserWrapper.
- EMBEDDED_ID_PATH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This tracks the embedded file paths based on the embedded file's TikaCoreProperties.EMBEDDED_ID.
- EMBEDDED_PARSER - Static variable in class org.apache.tika.utils.ParserUtils
- EMBEDDED_RELATIONSHIP_ID - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_RELATIONSHIPS - Static variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- EMBEDDED_RESOURCE_PATH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This tracks the embedded file paths based on the name of embedded files where available.
- EMBEDDED_RESOURCE_TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Embedded resource type property
- EMBEDDED_RESOURCE_TYPE_KEY - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EMBEDDED_STORAGE_CLASS_ID - Static variable in interface org.apache.tika.metadata.Office
- EMBEDDED_WARNING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- EmbeddedContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that prevents the EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events from reaching the decorated handler.
- EmbeddedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EmbeddedContentHandler
-
Creates a decorator that prevents the given handler from receiving EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events.
- EmbeddedDocumentByteStoreExtractorFactory - Interface in org.apache.tika.extractor
-
Factories that create EmbeddedDocumentExtractors requiring an UnpackHandler in the ParseContext should extend this.
- embeddedDocumentExtractor - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- EmbeddedDocumentExtractor - Interface in org.apache.tika.extractor
- EmbeddedDocumentExtractorFactory - Interface in org.apache.tika.extractor
- EmbeddedDocumentUtil - Class in org.apache.tika.extractor
-
Utility class to handle common issues with embedded documents.
- EmbeddedDocumentUtil(ParseContext) - Constructor for class org.apache.tika.extractor.EmbeddedDocumentUtil
- EmbeddedDocumentUtil.EmbeddedResourcePrefix - Enum Class in org.apache.tika.extractor
-
Type of embedded resource, used for generating canonical resource names.
- EmbeddedFileContainer - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- embeddedFileFieldName() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the embeddedFileFieldName record component.
- embeddedFileFieldName() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the embeddedFileFieldName record component.
- embeddedFileFieldName() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the embeddedFileFieldName record component.
- EmbeddedFileInfo(int, String, Path, Metadata) - Constructor for record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Creates an instance of an EmbeddedFileInfo record class.
- EmbeddedFileName - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- EmbeddedLimitReachedException - Exception in org.apache.tika.exception
-
Runtime exception thrown when an embedded document limit is reached and the configuration specifies that parsing should stop with an exception.
- EmbeddedLimitReachedException(EmbeddedLimitReachedException.LimitType, int) - Constructor for exception org.apache.tika.exception.EmbeddedLimitReachedException
- EmbeddedLimitReachedException.LimitType - Enum Class in org.apache.tika.exception
- EmbeddedLimits - Class in org.apache.tika.config
-
Configuration for limits on embedded document processing.
- EmbeddedLimits() - Constructor for class org.apache.tika.config.EmbeddedLimits
-
No-arg constructor for Jackson deserialization.
- EmbeddedLimits(int, boolean, int, boolean) - Constructor for class org.apache.tika.config.EmbeddedLimits
-
Constructor with all parameters.
- embeddedOLERef(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- embeddedOLERef(String, String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- EmbeddedPartMetadata - Class in org.apache.tika.parser.microsoft.ooxml
-
This class records metadata about embedded parts that exist in the XML of the main document.
- EmbeddedPartMetadata(String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- embeddedPicRef(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- embeddedPicRef(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- EmbeddedResourceHandler - Interface in org.apache.tika.extractor
-
Tika container extractor callback interface.
- EmbeddedStreamTranslator - Interface in org.apache.tika.extractor
-
Interface for different filtering of embedded streams.
- Embedder - Interface in org.apache.tika.embedder
-
Tika embedder interface
- EMF_ICON_ONLY - Static variable in class org.apache.tika.parser.microsoft.EMFParser
- EMF_ICON_STRING - Static variable in class org.apache.tika.parser.microsoft.EMFParser
- EMFParser - Class in org.apache.tika.parser.microsoft
-
Extracts files embedded in EMF and offers a very rough capability to extract text if there is text stored in the EMF.
- EMFParser() - Constructor for class org.apache.tika.parser.microsoft.EMFParser
- emit(String, InputStream, Metadata, ParseContext) - Method in interface org.apache.tika.pipes.api.emitter.StreamEmitter
- emit(String, InputStream, Metadata, ParseContext) - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
- emit(String, InputStream, Metadata, ParseContext) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
- emit(String, InputStream, Metadata, ParseContext) - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitter
- emit(String, InputStream, Metadata, ParseContext) - Method in class org.apache.tika.pipes.emitter.s3.S3Emitter
- emit(String, List<Metadata>, ParseContext) - Method in interface org.apache.tika.pipes.api.emitter.Emitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.core.emitter.EmptyEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.es.ESEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.jdbc.JDBCEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.kafka.KafkaEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.s3.S3Emitter
- emit(String, List<Metadata>, ParseContext) - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
- emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.api.emitter.AbstractEmitter
- emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.api.emitter.AbstractStreamEmitter
- emit(List<? extends EmitData>) - Method in interface org.apache.tika.pipes.api.emitter.Emitter
- emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.es.ESEmitter
- emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.jdbc.JDBCEmitter
- emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
- emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
- EMIT - Enum constant in enum class org.apache.tika.pipes.api.FetchEmitTuple.ON_PARSE_EXCEPTION
- EMIT_ALL - Enum constant in enum class org.apache.tika.pipes.core.EmitStrategy
- EMIT_DATA - Static variable in class org.apache.tika.pipes.core.serialization.PipesResultSerializer
- EMIT_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- EMIT_KEY - Static variable in class org.apache.tika.pipes.core.serialization.EmitDataSerializer
- EMIT_KEY - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- EMIT_SUCCESS - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- EMIT_SUCCESS - Static variable in class org.apache.tika.pipes.core.PipesResults
- EMIT_SUCCESS_PARSE_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- EMIT_SUCCESS_PASSBACK - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- emitData() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Returns the value of the emitData record component.
- emitData() - Method in record class org.apache.tika.pipes.core.async.EmitDataPair
-
Returns the value of the emitData record component.
- EmitData - Interface in org.apache.tika.pipes.api.emitter
- EmitDataDeserializer - Class in org.apache.tika.pipes.core.serialization
- EmitDataDeserializer() - Constructor for class org.apache.tika.pipes.core.serialization.EmitDataDeserializer
- EmitDataImpl - Class in org.apache.tika.pipes.core.emitter
- EmitDataImpl(String, List<Metadata>) - Constructor for class org.apache.tika.pipes.core.emitter.EmitDataImpl
- EmitDataImpl(String, List<Metadata>, String) - Constructor for class org.apache.tika.pipes.core.emitter.EmitDataImpl
- EmitDataPair - Record Class in org.apache.tika.pipes.core.async
- EmitDataPair(String, EmitData) - Constructor for record class org.apache.tika.pipes.core.async.EmitDataPair
-
Creates an instance of an EmitDataPair record class.
- EmitDataSerializer - Class in org.apache.tika.pipes.core.serialization
- EmitDataSerializer() - Constructor for class org.apache.tika.pipes.core.serialization.EmitDataSerializer
- emitDocument(String, String, Metadata) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- emitDocument(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.es.ESClient
- emitDocument(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- emitDocuments(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.es.ESClient
- emitDocuments(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- EmitKey - Class in org.apache.tika.pipes.api.emitter
- EmitKey() - Constructor for class org.apache.tika.pipes.api.emitter.EmitKey
- EmitKey(String, String) - Constructor for class org.apache.tika.pipes.api.emitter.EmitKey
- EmitStrategy - Enum Class in org.apache.tika.pipes.core
-
Strategy for how the forked PipesServer handles emitting data.
- EmitStrategyConfig - Class in org.apache.tika.pipes.core
-
Configuration for emit strategy.
- EmitStrategyConfig() - Constructor for class org.apache.tika.pipes.core.EmitStrategyConfig
- EmitStrategyConfig(EmitStrategy) - Constructor for class org.apache.tika.pipes.core.EmitStrategyConfig
- EmitStrategyConfig(EmitStrategy, Long) - Constructor for class org.apache.tika.pipes.core.EmitStrategyConfig
- Emitter - Interface in org.apache.tika.pipes.api.emitter
- EMITTER - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- EMITTER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseRequest
- EMITTER_INITIALIZATION_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- EMITTER_NOT_FOUND - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- EmitterFactory - Interface in org.apache.tika.pipes.api.emitter
- emitterId() - Method in record class org.apache.tika.pipes.core.async.EmitDataPair
-
Returns the value of the emitterId record component.
- emitterId() - Method in record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Returns the value of the emitterId record component.
- EmitterManager - Class in org.apache.tika.pipes.core.emitter
-
Utility class that will apply the appropriate emitter to the emitterString based on the prefix.
- EmitterNotFoundException - Exception in org.apache.tika.pipes.api.emitter
-
Exception thrown when a requested emitter configuration does not exist.
- EmitterNotFoundException(String) - Constructor for exception org.apache.tika.pipes.api.emitter.EmitterNotFoundException
- EmitterNotFoundException(String, Throwable) - Constructor for exception org.apache.tika.pipes.api.emitter.EmitterNotFoundException
- EmitterOverride(String, String, Map<String, Object>) - Constructor for class org.apache.tika.pipes.core.config.ConfigOverrides.EmitterOverride
- EmittingUnpackHandler - Class in org.apache.tika.pipes.core.extractor
- EmittingUnpackHandler(FetchEmitTuple, EmitterManager, ParseContext) - Constructor for class org.apache.tika.pipes.core.extractor.EmittingUnpackHandler
- EMPTY - Static variable in class org.apache.tika.mime.MediaType
- EMPTY - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFNumberingShim
- EMPTY - Static variable in class org.apache.tika.utils.StringUtils
-
The empty String "".
- EMPTY_CONTENT_TAGS - Static variable in class org.apache.tika.eval.core.util.ContentTags
- EMPTY_LIST - Static variable in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
Empty singleton to be used when there is no list manager.
- EMPTY_MODEL - Static variable in class org.apache.tika.eval.core.tokens.LangModel
- EMPTY_OUTPUT - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- EMPTY_OUTPUT - Static variable in class org.apache.tika.pipes.core.PipesResults
- EMPTY_STYLES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
-
Empty singleton to be used when there is no style info
- EmptyDetector - Class in org.apache.tika.detect
-
Dummy detector that returns application/octet-stream for all documents.
- EmptyDetector() - Constructor for class org.apache.tika.detect.EmptyDetector
- EmptyEmitter - Class in org.apache.tika.pipes.core.emitter
- EmptyEmitter(ExtensionConfig) - Constructor for class org.apache.tika.pipes.core.emitter.EmptyEmitter
- EmptyFetcher - Class in org.apache.tika.pipes.core.fetcher
- EmptyFetcher() - Constructor for class org.apache.tika.pipes.core.fetcher.EmptyFetcher
- emptyGuid() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.GuidUtil
- EmptyParser - Class in org.apache.tika.parser
-
Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream.
- EmptyParser() - Constructor for class org.apache.tika.parser.EmptyParser
- EmptyTranslator - Class in org.apache.tika.language.translate
-
Dummy translator that always declines to give any text.
- EmptyTranslator() - Constructor for class org.apache.tika.language.translate.EmptyTranslator
- EnableHistory - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- enableIdempotence() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the enableIdempotence record component.
- enableInputFilter(boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Enable filtering of input text.
- enableRewind() - Method in class org.apache.tika.io.TikaInputStream
-
Enables full rewind capability for this stream.
- encode(byte[]) - Method in interface org.apache.tika.digest.Encoder
-
Encode a byte array to a string representation.
- encode(byte[]) - Static method in class org.apache.tika.mime.HexCoDec
-
Hex encode an array of bytes
- encode(byte[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
-
Hex encode an array of bytes
- encode(float[]) - Static method in class org.apache.tika.inference.VectorSerializer
-
Encode a float array as a base64 string (big-endian float32).
- EncodeOCRConfig - Class in org.apache.tika.parser.ocrencode
-
Configuration for EncodeOCRParser.
- EncodeOCRConfig() - Constructor for class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- EncodeOCRParser - Class in org.apache.tika.parser.ocrencode
-
Parser that base64-encodes image content instead of performing OCR text extraction.
- EncodeOCRParser() - Constructor for class org.apache.tika.parser.ocrencode.EncodeOCRParser
- EncodeOCRParser(JsonConfig) - Constructor for class org.apache.tika.parser.ocrencode.EncodeOCRParser
-
Constructor for JSON configuration.
- EncodeOCRParser(EncodeOCRConfig) - Constructor for class org.apache.tika.parser.ocrencode.EncodeOCRParser
- Encoder - Interface in org.apache.tika.digest
-
Encodes byte array from a MessageDigest to String.
- encoding - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
- ENCODING_DETECTION_TRACE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Diagnostic trace showing which encoding detectors ran and what each returned, plus the arbitration method used when detectors disagreed.
- ENCODING_DETECTOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This should be the simple class name for the EncodingDetectors whose detected encoding was used in the parse.
- EncodingDetector - Interface in org.apache.tika.detect
-
Character encoding detector.
- EncodingDetectorContext - Class in org.apache.tika.detect
-
Context object that collects encoding detection results from base detectors.
- EncodingDetectorContext() - Constructor for class org.apache.tika.detect.EncodingDetectorContext
- EncodingDetectorContext.Result - Class in org.apache.tika.detect
-
A single detector's contribution: its ranked list of candidates and its name.
- EncodingDetectorLoader - Class in org.apache.tika.config.loader
-
Loader for encoding detectors with support for SPI fallback via "default-encoding-detector" marker.
- EncodingDetectorLoader() - Constructor for class org.apache.tika.config.loader.EncodingDetectorLoader
- EncodingResult - Class in org.apache.tika.detect
-
A charset detection result pairing a Charset with a confidence score and an EncodingResult.ResultType indicating the nature of the evidence.
- EncodingResult(Charset, float) - Constructor for class org.apache.tika.detect.EncodingResult
-
Constructs a STATISTICAL result.
- EncodingResult(Charset, float, String) - Constructor for class org.apache.tika.detect.EncodingResult
-
Constructs a STATISTICAL result with a detector-specific label.
- EncodingResult(Charset, float, String, EncodingResult.ResultType) - Constructor for class org.apache.tika.detect.EncodingResult
-
Constructs a result with an explicit EncodingResult.ResultType.
- EncodingResult.ResultType - Enum Class in org.apache.tika.detect
-
The nature of the evidence that produced this result.
- encodings - Static variable in class org.apache.tika.parser.mp3.ID3v2Frame
- ENCRYPTED - Enum constant in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- ENCRYPTED - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- ENCRYPTED - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Is encrypted?
- ENCRYPTED - Static variable in interface org.apache.tika.metadata.Zip
-
Whether the entry is encrypted.
- EncryptedDocumentException - Exception in org.apache.tika.exception
- EncryptedDocumentException() - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedDocumentException(String) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedDocumentException(String, Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedDocumentException(Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
- EncryptedPrescriptionDetector - Class in org.apache.tika.example
- EncryptedPrescriptionDetector() - Constructor for class org.apache.tika.example.EncryptedPrescriptionDetector
- EncryptedPrescriptionParser - Class in org.apache.tika.example
- EncryptedPrescriptionParser() - Constructor for class org.apache.tika.example.EncryptedPrescriptionParser
- ENCRYPTION - Enum constant in enum class org.apache.tika.eval.app.ProfilerBase.EXCEPTION_TYPE
- encryptionObjects - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
- END - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- endBookmark(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endBookmark(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endDescription() - Method in class org.apache.tika.sax.XMPContentHandler
- endDocument() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
- endDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
- endDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- endDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- endDocument() - Method in class org.apache.tika.parser.mif.MIFContentHandler
- endDocument() - Method in class org.apache.tika.parser.tmx.TMXContentHandler
- endDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
- endDocument() - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- endDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
- endDocument() - Method in class org.apache.tika.sax.DIFContentHandler
- endDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
-
Ignored.
- endDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
- endDocument() - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
-
This method is called whenever the Parser is done parsing the file.
- endDocument() - Method in class org.apache.tika.sax.SafeContentHandler
- endDocument() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
This method is called whenever the Parser is done parsing the file.
- endDocument() - Method in class org.apache.tika.sax.TeeContentHandler
- endDocument() - Method in class org.apache.tika.sax.TextContentHandler
- endDocument() - Method in class org.apache.tika.sax.ToMarkdownContentHandler
- endDocument() - Method in class org.apache.tika.sax.ToTextContentHandler
-
Flushes the character stream so that no characters are forgotten in internal buffers.
- endDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Ends the XHTML document by writing the following footer and clearing the namespace mappings:
- endDocument() - Method in class org.apache.tika.sax.XMPContentHandler
-
Ends the XMP document by writing the following footer and clearing the namespace mappings:
- endDocument(PDDocument) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called after the full parse has completed.
- endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
- EndDocumentShieldingContentHandler - Class in org.apache.tika.sax
-
A wrapper around a ContentHandler which will ignore normal SAX calls to EndDocumentShieldingContentHandler.endDocument(), and only fire them later.
- EndDocumentShieldingContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EndDocumentShieldingContentHandler
-
Creates a decorator for the given SAX event handler.
- endEditedSection() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endEditedSection() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.mime.MimeTypesReader
- endElement(String, String, String) - Method in class org.apache.tika.parser.dif.DIFContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- endElement(String, String, String) - Method in class org.apache.tika.parser.mif.MIFContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.tmx.TMXContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- endElement(String, String, String) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- endElement(String, String, String) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- endElement(String, String, String) - Method in class org.apache.tika.sax.DIFContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ElementMappingContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.LinkContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.SafeContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.SecureContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.TeeContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToHTMLContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToMarkdownContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToTextContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
- endElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Ends the given element.
- endElement(String, String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called after parsing each embedded document.
- endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
This is called after parsing an embedded document.
- ENDIAN - Static variable in interface org.apache.tika.metadata.MachineMetadata
- EndianUtils - Class in org.apache.tika.io
-
General Endian Related Utilities.
- EndianUtils() - Constructor for class org.apache.tika.io.EndianUtils
- EndianUtils.BufferUnderrunException - Exception in org.apache.tika.io
- ENDLINE - Static variable in class org.apache.tika.sax.XHTMLContentHandler
-
The elements that get appended with the XHTMLContentHandler.NL character.
- endnoteReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endnoteReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endPage(PDPage) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- endParagraph() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endParagraph() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endPath() - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- endpoint() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns the value of the endpoint record component.
- Endpoint(Class<?>, Method, String, String, String[]) - Constructor for class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
- endpointConfigurationService() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the endpointConfigurationService record component.
- endPrefixMapping(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- endPrefixMapping(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- endPrefixMapping(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- endPrefixMapping(String) - Method in class org.apache.tika.sax.TeeContentHandler
- endRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- endSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endSheet() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- endTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- endTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- endTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- EnforceOutlineStructure - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ENGINEER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The engineer's name."
- enqueue() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIterator
- enqueue() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIterator
- enqueue() - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- ensureFormattingState(XHTMLContentHandler, EnumSet<FormattingUtils.Tag>, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
-
Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state.
- ensureRunning() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- ensureRunning() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Ensures the server is running, starting or restarting it if necessary.
- ensureRunning() - Method in class org.apache.tika.pipes.core.SharedServerManager
-
Ensures the shared server is running, starting it if necessary.
- ENTITY_LOCAL_NAMES - Static variable in class org.apache.tika.parser.xml.XMLProfiler
- ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
- ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
- ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
Some common entities identified by NLTK.
- ENTITY_URIS - Static variable in class org.apache.tika.parser.xml.XMLProfiler
- entityTypes - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- entropy(float[]) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Shannon entropy (in bits) of a probability distribution.
- entropy(float[]) - Static method in class org.apache.tika.ml.LinearModel
-
Shannon entropy (in bits) of a probability distribution.
- enumerateChm() - Method in class org.apache.tika.parser.microsoft.chm.ChmExtractor
-
Enumerates chm entities
- ENVI_MIME_TYPE - Static variable in class org.apache.tika.parser.envi.EnviHeaderParser
- EnviHeaderParser - Class in org.apache.tika.parser.envi
- EnviHeaderParser() - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
- EnviHeaderParser(EncodingDetector) - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
- EOF - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- EOF_OFFSETS - Static variable in interface org.apache.tika.metadata.PDF
-
Number of %%EOF as extracted by the StartXRefScanner.
- Epub - Interface in org.apache.tika.metadata
-
EPub properties collection.
- EPUB_PREFIX - Static variable in interface org.apache.tika.metadata.Epub
- EpubContentParser - Class in org.apache.tika.parser.epub
-
Parser for EPUB OPS *.html files.
- EpubContentParser() - Constructor for class org.apache.tika.parser.epub.EpubContentParser
- EpubParser - Class in org.apache.tika.parser.epub
-
Epub parser
- EpubParser() - Constructor for class org.apache.tika.parser.epub.EpubParser
- equals(Object) - Method in class org.apache.tika.config.EmbeddedLimits
- equals(Object) - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in class org.apache.tika.config.OutputLimits
- equals(Object) - Method in class org.apache.tika.config.TimeoutLimits
- equals(Object) - Method in class org.apache.tika.DeleteFetcherReply
- equals(Object) - Method in class org.apache.tika.DeleteFetcherRequest
- equals(Object) - Method in class org.apache.tika.DeletePipesIteratorReply
- equals(Object) - Method in class org.apache.tika.DeletePipesIteratorRequest
- equals(Object) - Method in class org.apache.tika.eval.app.db.ColInfo
- equals(Object) - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
- equals(Object) - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- equals(Object) - Method in class org.apache.tika.FetchAndParseReply
- equals(Object) - Method in class org.apache.tika.FetchAndParseRequest
- equals(Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- equals(Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- equals(Object) - Method in class org.apache.tika.GetFetcherReply
- equals(Object) - Method in class org.apache.tika.GetFetcherRequest
- equals(Object) - Method in class org.apache.tika.GetPipesIteratorReply
- equals(Object) - Method in class org.apache.tika.GetPipesIteratorRequest
- equals(Object) - Method in class org.apache.tika.ListFetchersReply
- equals(Object) - Method in class org.apache.tika.ListFetchersRequest
- equals(Object) - Method in class org.apache.tika.metadata.Metadata
- equals(Object) - Method in class org.apache.tika.metadata.Property
- equals(Object) - Method in class org.apache.tika.mime.MediaType
- equals(Object) - Method in class org.apache.tika.mime.MimeType
- equals(Object) - Method in class org.apache.tika.parser.csv.CSVResult
- equals(Object) - Method in class org.apache.tika.parser.html.DataURIScheme
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Override the Equals method.
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Override the Equals method.
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- equals(Object) - Method in class org.apache.tika.parser.ParseContext
- equals(Object) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Compares this CharsetMatch to another based on confidence value.
- equals(Object) - Method in record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in class org.apache.tika.pipes.api.emitter.EmitKey
- equals(Object) - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- equals(Object) - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- equals(Object) - Method in record class org.apache.tika.pipes.api.PipesResult
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.core.async.EmitDataPair
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- equals(Object) - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- equals(Object) - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorConfig
- equals(Object) - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- equals(Object) - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.plugins.ExtensionConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in class org.apache.tika.renderer.PageRangeRequest
- equals(Object) - Method in class org.apache.tika.SaveFetcherReply
- equals(Object) - Method in class org.apache.tika.SaveFetcherRequest
- equals(Object) - Method in class org.apache.tika.SavePipesIteratorReply
- equals(Object) - Method in class org.apache.tika.SavePipesIteratorRequest
- equals(Object) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- equals(Object) - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Indicates whether some other object is "equal to" this one.
- equals(Object) - Method in class org.apache.tika.xmp.XMPMetadata
-
This method is not implemented yet.
- equals(String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
- EQUIPMENT_MAKE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Manufacturer of the recording equipment."
- EQUIPMENT_MODEL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Model name or number of the recording equipment."
- error(String) - Method in interface org.apache.tika.pipes.api.reporter.PipesReporter
-
This is called if the process has crashed.
- error(String) - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- error(String) - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- error(String) - Method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- error(String) - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- error(String) - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- error(String) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- error(Throwable) - Method in interface org.apache.tika.pipes.api.reporter.PipesReporter
-
This is called if the process has crashed.
- error(Throwable) - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- error(Throwable) - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- error(Throwable) - Method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- error(Throwable) - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- error(Throwable) - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- error(Throwable) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- error(SAXParseException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- Error - Enum Class in org.apache.tika.parser.microsoft.onenote
- Error - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
The Error type
- ERROR_MESSAGE_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseReply
- ErrorParser - Class in org.apache.tika.parser
-
Dummy parser that always throws a TikaException without even attempting to parse the given document stream.
- ErrorParser() - Constructor for class org.apache.tika.parser.ErrorParser
- ERRORS - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- ErrorStringSupplementalInfo - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
ErrorStringSupplementalInfo type in the ResponseError
- escapeCommandLine(String) - Static method in class org.apache.tika.utils.ProcessUtils
-
This should correctly put double-quotes around an argument if ProcessBuilder doesn't seem to work (as it doesn't on paths with spaces on Windows)
- ESClient - Class in org.apache.tika.pipes.emitter.es
-
Plain HTTP client for the ES REST API.
- ESClient(ESEmitterConfig, HttpClient) - Constructor for class org.apache.tika.pipes.emitter.es.ESClient
- ESEmitter - Class in org.apache.tika.pipes.emitter.es
-
Emitter that sends documents to an ES-compatible REST API.
- ESEmitter(ExtensionConfig, ESEmitterConfig) - Constructor for class org.apache.tika.pipes.emitter.es.ESEmitter
- ESEmitterConfig - Record Class in org.apache.tika.pipes.emitter.es
-
Configuration for the ES emitter.
- ESEmitterConfig(String, String, ESEmitterConfig.AttachmentStrategy, ESEmitterConfig.UpdateStrategy, int, String, String, HttpClientConfig) - Constructor for record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Creates an instance of a ESEmitterConfig record class.
- ESEmitterConfig.AttachmentStrategy - Enum Class in org.apache.tika.pipes.emitter.es
- ESEmitterConfig.UpdateStrategy - Enum Class in org.apache.tika.pipes.emitter.es
- ESEmitterFactory - Class in org.apache.tika.pipes.emitter.es
-
Factory for creating ES emitters.
- ESEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.es.ESEmitterFactory
- ESPipesPlugin - Class in org.apache.tika.pipes.plugin.es
- ESPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.es.ESPipesPlugin
- ESPipesReporter - Class in org.apache.tika.pipes.reporter.es
- ESPipesReporter(ExtensionConfig, ESReporterConfig) - Constructor for class org.apache.tika.pipes.reporter.es.ESPipesReporter
- ESReporterConfig - Record Class in org.apache.tika.pipes.reporter.es
- ESReporterConfig(String, Set<String>, Set<String>, String, boolean, String, HttpClientConfig) - Constructor for record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Creates an instance of a ESReporterConfig record class.
- ESReporterFactory - Class in org.apache.tika.pipes.reporter.es
-
Factory for creating ES pipes reporters.
- ESReporterFactory() - Constructor for class org.apache.tika.pipes.reporter.es.ESReporterFactory
- ESRI_LAYER - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
- estimateSize(PSTMessage) - Static method in class org.apache.tika.parser.microsoft.pst.OutlookPSTParser
- esUrl() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the esUrl record component.
- esUrl() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the esUrl record component.
- ETHIOPIC - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- EvalCharsetDetectors - Class in org.apache.tika.ml.chardetect.tools
-
Compares MojibusterEncodingDetector against ICU4J and juniversalchardet.
- EvalCharsetDetectors() - Constructor for class org.apache.tika.ml.chardetect.tools.EvalCharsetDetectors
- EvalConfig - Class in org.apache.tika.eval.app
- EvalConfig() - Constructor for class org.apache.tika.eval.app.EvalConfig
- EvalExceptionUtils - Class in org.apache.tika.eval.core.util
- EvalExceptionUtils() - Constructor for class org.apache.tika.eval.core.util.EvalExceptionUtils
- EvalJunkDetector - Class in org.apache.tika.ml.junkdetect.tools
-
Ablation evaluation for the junk detector.
- EvalJunkDetector() - Constructor for class org.apache.tika.ml.junkdetect.tools.EvalJunkDetector
- EVENT - Static variable in interface org.apache.tika.metadata.IPTC
-
Names or describes the specific event the content relates to.
- ExcelExtractor - Class in org.apache.tika.parser.microsoft
-
Excel parser implementation which uses POI's Event API to handle the contents of a Workbook.
- ExcelExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.ExcelExtractor
- EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.pipesiterator.TotalCountResult.STATUS
- EXCEPTION - Enum constant in enum class org.apache.tika.renderer.RenderResult.STATUS
- EXCEPTION_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- EXCEPTION_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
- EXCEPTION_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
- ExceptionUtils - Class in org.apache.tika.utils
- ExceptionUtils() - Constructor for class org.apache.tika.utils.ExceptionUtils
- exclude - Variable in class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter.Config
- ExcludeFieldMetadataFilter - Class in org.apache.tika.metadata.filter
- ExcludeFieldMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
- ExcludeFieldMetadataFilter(Set<String>) - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
- ExcludeFieldMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
-
Constructor for JSON configuration.
- ExcludeFieldMetadataFilter(ExcludeFieldMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
-
Constructor with explicit Config object.
- ExcludeFieldMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- excludes() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the excludes record component.
- excludes() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the excludes record component.
- excludes() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns the value of the excludes record component.
- excludeUnmapped - Variable in class org.apache.tika.metadata.filter.FieldNameMappingFilter.Config
- ExecutableParser - Class in org.apache.tika.parser.executable
-
Parser for executable files.
- ExecutableParser() - Constructor for class org.apache.tika.parser.executable.ExecutableParser
- execute(ProcessBuilder, long, int, int) - Static method in class org.apache.tika.utils.ProcessUtils
-
This writes stdout and stderr to the FileProcessResult.
- execute(ProcessBuilder, long, Path, int) - Static method in class org.apache.tika.utils.ProcessUtils
-
This redirects stdout to stdoutRedirect path.
- execute(Connection, Path) - Method in class org.apache.tika.eval.app.reports.ResultsReporter
- execute(ParseContext, Runnable) - Static method in class org.apache.tika.utils.ConcurrentUtils
-
Execute a runnable using an ExecutorService from the ParseContext if possible.
- exGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
- exGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
- ExGuid - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- ExGuid() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Initializes a new instance of the ExGuid class, this is a default constructor.
- ExGuid(int, UUID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Initializes a new instance of the ExGuid class with specified value.
- ExGuid(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Initializes a new instance of the ExGuid class, this is the copy constructor.
- ExGUIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- ExGUIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
Initializes a new instance of the ExGUIDArray class, this is the default constructor.
- ExGUIDArray(List<ExGuid>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
Initializes a new instance of the ExGUIDArray class with specified value.
- ExGUIDArray(ExGUIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
Initializes a new instance of the ExGUIDArray class, this is the copy constructor.
- EXIF_PAGE_COUNT - Static variable in interface org.apache.tika.metadata.TIFF
- EXISTING - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.SUFFIX_STRATEGY
- EXIT_VALUE - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
Exit value of the sub process
- ExpandedTitleContentHandler - Class in org.apache.tika.sax
-
Content handler decorator which wraps a TransformerHandler in order to allow the TITLE tag to render as <title></title> rather than <title/> which is accomplished by calling the ContentHandler.characters(char[], int, int) method with a length of 1 but a zero length char array.
- ExpandedTitleContentHandler() - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
- ExpandedTitleContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
- EXPERIMENT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- EXPOSURE_TIME - Static variable in interface org.apache.tika.metadata.TIFF
-
"Exposure time in seconds."
- ExtendedGUID - Class in org.apache.tika.parser.microsoft.onenote
- ExtendedGUID() - Constructor for class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- ExtendedGUID(GUID, long) - Constructor for class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- ExtendedGUID10BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Specify the extended GUID 10 Bit int type value.
- ExtendedGUID17BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Specify the extended GUID 17 Bit int type value.
- ExtendedGUID32BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Specify the extended GUID 32 Bit int type value.
- ExtendedGUID5BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Specify the extended GUID 5 Bit int type value.
- ExtendedGUIDNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Specify the extended GUID null type value.
- ExtendedMetadataExtractor - Class in org.apache.tika.parser.microsoft.msg
-
This class extracts mapi properties as defined in the props_table.txt, which was generated from MS-OXPROPS.
- ExtendedMetadataExtractor() - Constructor for class org.apache.tika.parser.microsoft.msg.ExtendedMetadataExtractor
- extendedStreamsPresent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
- extendGUID1 - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
- extendGUID2 - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
- extension_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- EXTENSION_TAG_EXIF - Static variable in class org.apache.tika.parser.image.BPGParser
- EXTENSION_TAG_ICC_PROFILE - Static variable in class org.apache.tika.parser.image.BPGParser
- EXTENSION_TAG_THUMBNAIL - Static variable in class org.apache.tika.parser.image.BPGParser
- EXTENSION_TAG_XMP - Static variable in class org.apache.tika.parser.image.BPGParser
- extension_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- ExtensionConfig - Record Class in org.apache.tika.plugins
-
Configuration for a plugin extension.
- ExtensionConfig(String, String, String) - Constructor for record class org.apache.tika.plugins.ExtensionConfig
-
Creates an instance of a ExtensionConfig record class.
- ExtensionConfigDTO - Class in org.apache.tika.pipes.ignite
-
Value object for storing configuration in an Ignite 3.x KeyValueView.
- ExtensionConfigDTO() - Constructor for class org.apache.tika.pipes.ignite.ExtensionConfigDTO
- ExtensionConfigDTO(String, String) - Constructor for class org.apache.tika.pipes.ignite.ExtensionConfigDTO
- externalBoolean(String) - Static method in class org.apache.tika.metadata.Property
- externalBooleanSeq(String) - Static method in class org.apache.tika.metadata.Property
- externalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- externalDate(String) - Static method in class org.apache.tika.metadata.Property
- ExternalEmbedder - Class in org.apache.tika.embedder
-
Embedder that uses an external program (like sed or exiftool) to embed text content and metadata into a given document.
- ExternalEmbedder() - Constructor for class org.apache.tika.embedder.ExternalEmbedder
- externalInteger(String) - Static method in class org.apache.tika.metadata.Property
- externalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- ExternalParser - Class in org.apache.tika.parser.external
-
Parser that uses an external program (like ffmpeg, exiftool or sox) to extract text content and metadata from a given document.
- ExternalParser() - Constructor for class org.apache.tika.parser.external.ExternalParser
-
Default constructor - not typically useful since ExternalParser requires configuration.
- ExternalParser(JsonConfig) - Constructor for class org.apache.tika.parser.external.ExternalParser
-
JSON config constructor - used for deserialization.
- ExternalParser(ExternalParserConfig) - Constructor for class org.apache.tika.parser.external.ExternalParser
-
Programmatic constructor with typed config.
- ExternalParserConfig - Class in org.apache.tika.parser.external
-
Configuration for ExternalParser.
- ExternalParserConfig() - Constructor for class org.apache.tika.parser.external.ExternalParserConfig
- ExternalProcess - Interface in org.apache.tika.metadata
- externalReal(String) - Static method in class org.apache.tika.metadata.Property
- externalRealSeq(String) - Static method in class org.apache.tika.metadata.Property
- externalRef(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- externalRef(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
-
Called when an external reference URL is found in a field code.
- externalText(String) - Static method in class org.apache.tika.metadata.Property
- externalTextBag(String) - Static method in class org.apache.tika.metadata.Property
- ExternalTranslator - Class in org.apache.tika.language.translate.impl
-
Abstract class used to interact with command line/external Translators.
- ExternalTranslator() - Constructor for class org.apache.tika.language.translate.impl.ExternalTranslator
- EXTRA_BITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- extract(byte[]) - Method in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
- extract(byte[]) - Static method in class org.apache.tika.parser.microsoft.msg.RTFEncapsulatedHTMLExtractor
-
Extracts the HTML content from an encapsulated-HTML RTF document.
- extract(byte[]) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFHtmlDecapsulator
- extract(byte[], int, int) - Method in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
-
Extract from a sub-range of a byte array.
- extract(JsonNode, ObjectMapper) - Static method in class org.apache.tika.config.loader.FrameworkConfig
-
Extracts framework config from JSON node, returning the cleaned component config.
- extract(InputStream, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
-
Extracts text from an HWP stream.
- extract(String) - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Full preprocessing + feature extraction pipeline.
- extract(String) - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
-
Full preprocessing + feature extraction pipeline.
- extract(String) - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- extract(String) - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- extract(String) - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- extract(String) - Method in class org.apache.tika.parser.html.DataURISchemeUtil
-
Extracts DataURISchemes from free text, as in javascript.
- extract(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Extract features into a caller-supplied buffer, avoiding allocation.
- extract(String, int[]) - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
-
Extract into caller-supplied buffer (zeroed first).
- extract(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- extract(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- extract(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- extract(XMPMetadata, Metadata, ParseContext) - Static method in class org.apache.tika.parser.pdf.PDMetadataExtractor
- extract(PDMetadata, Metadata, ParseContext) - Static method in class org.apache.tika.parser.pdf.PDMetadataExtractor
- extract(MAPIMessage, Metadata) - Static method in class org.apache.tika.parser.microsoft.msg.ExtendedMetadataExtractor
- extract(TikaInputStream, Path) - Method in class org.apache.tika.example.ExtractEmbeddedFiles
- extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler, ParseContext) - Method in interface org.apache.tika.extractor.ContainerExtractor
-
Processes a container file, and extracts all the embedded resources from within it.
- extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler, ParseContext) - Method in class org.apache.tika.extractor.ParserContainerExtractor
- extract(Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
- extract(T) - Method in interface org.apache.tika.ml.FeatureExtractor
-
Extract features from the given input.
- EXTRACT_CONTENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Should content be extracted, generally.
- EXTRACT_EXCEPTION_DESCRIPTION - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EXTRACT_EXCEPTION_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EXTRACT_EXCEPTION_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- EXTRACT_EXCEPTION_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
- EXTRACT_EXCEPTION_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
- EXTRACT_FILE_LENGTH - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EXTRACT_FILE_LENGTH_A - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EXTRACT_FILE_LENGTH_B - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- EXTRACT_FILE_TOO_LONG - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- EXTRACT_FILE_TOO_SHORT - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- EXTRACT_FOR_ACCESSIBILITY - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Should content be extracted for the purposes of accessibility.
- EXTRACT_PARSE_EXCEPTION - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- extractAndCount(String, int[]) - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
-
Extract features into counts and return the total n-gram emission count.
- extractAndCount(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- extractAndCount(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- extractAndCount(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- extractChannelInfo(Metadata, int) - Static method in class org.apache.tika.parser.ogg.OggAudioParser
- extractChannelInfo(Metadata, OggAudioInfoHeader) - Static method in class org.apache.tika.parser.ogg.OggAudioParser
- extractChmEntry(DirectoryListingEntry) - Method in class org.apache.tika.parser.microsoft.chm.ChmExtractor
-
Decompresses a chm entry
- extractComments(Metadata, XHTMLContentHandler, VorbisStyleComments) - Static method in class org.apache.tika.parser.ogg.OggAudioParser
- ExtractComparer - Class in org.apache.tika.eval.app
- ExtractComparer(Path, Path, Path, ExtractReader, IDBWriter) - Constructor for class org.apache.tika.eval.app.ExtractComparer
- ExtractComparerRunner - Class in org.apache.tika.eval.app
- ExtractComparerRunner() - Constructor for class org.apache.tika.eval.app.ExtractComparerRunner
- extractDublinCore(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
-
Tries to extract Dublin Core schema from XMP.
- extractDublinCoreSchema(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
-
Extracts Dublin Core.
- extractDuration(Metadata, XHTMLContentHandler, double) - Static method in class org.apache.tika.parser.ogg.OggAudioParser
- extractDuration(Metadata, XHTMLContentHandler, OggAudioHeaders, OggAudioStream) - Static method in class org.apache.tika.parser.ogg.OggAudioParser
- extractEmbeddedDocumentsExample(Path) - Method in class org.apache.tika.example.ParsingExample
- ExtractEmbeddedFiles - Class in org.apache.tika.example
- ExtractEmbeddedFiles() - Constructor for class org.apache.tika.example.ExtractEmbeddedFiles
- extractFromPreprocessed(String) - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Extract features from already-preprocessed text (no NFC, no URL stripping, no truncation).
- extractFromPreprocessed(String) - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
-
Extract from already-preprocessed text.
- extractFromPreprocessed(String) - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- extractFromPreprocessed(String) - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- extractFromPreprocessed(String) - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- extractFromPreprocessed(String, int[]) - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Extract features from already-preprocessed text into a caller-supplied buffer.
- extractFromPreprocessed(String, int[], boolean) - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Extract features from already-preprocessed text into a caller-supplied buffer, optionally clearing it first.
- extractFromPreprocessed(String, int[], boolean) - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
-
Extract from already-preprocessed text into a caller-supplied buffer.
- extractFromPreprocessed(String, int[], boolean) - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- extractFromPreprocessed(String, int[], boolean) - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- extractFromPreprocessed(String, int[], boolean) - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- extractGenre(String) - Static method in class org.apache.tika.parser.mp3.ID3v22Handler
- extractHeaderFooter(String, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
- extractHeaderFooter(String, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- extractHyperLinks(PackagePart, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- extractInfo(Metadata, FlacInfo) - Method in class org.apache.tika.parser.ogg.FlacParser
- extractInfo(Metadata, OpusInfo) - Method in class org.apache.tika.parser.ogg.OpusParser
- extractInfo(Metadata, SpeexInfo) - Method in class org.apache.tika.parser.ogg.SpeexParser
- extractInfo(Metadata, TheoraInfo) - Method in class org.apache.tika.parser.ogg.TheoraParser
- extractInfo(Metadata, VorbisInfo) - Method in class org.apache.tika.parser.ogg.VorbisParser
- extractInlineImageMetadataOnly - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- extractInlineImageMetadataOnly(PDImage, Metadata) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- extractLinks(String) - Static method in class org.apache.tika.utils.RegexUtils
-
Extract URLs from plain text.
- extractMacros - Variable in class org.apache.tika.parser.odf.FlatOpenDocumentParser.Config
- extractMacros - Variable in class org.apache.tika.parser.odf.OpenDocumentParser.Config
- extractMacros(POIFSFileSystem, ContentHandler, EmbeddedDocumentExtractor, ParseContext) - Static method in class org.apache.tika.parser.microsoft.OfficeParser
-
Helper to extract macros from an NPOIFS/vbaProject.bin
- extractMetadata(Connection, Metadata) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
This is called before parsing the tables to extract metadata from the db, if any.
- extractMetadata(Connection, Metadata) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- extractPhoneNumbers(String) - Static method in class org.apache.tika.sax.CleanPhoneText
- ExtractProfiler - Class in org.apache.tika.eval.app
- ExtractProfileRunner - Class in org.apache.tika.eval.app
- ExtractProfileRunner() - Constructor for class org.apache.tika.eval.app.ExtractProfileRunner
- ExtractReader - Class in org.apache.tika.eval.app.io
- ExtractReader() - Constructor for class org.apache.tika.eval.app.io.ExtractReader
-
Reads full extract, no modification of metadata list, no min or max extract length checking
- ExtractReader(ExtractReader.ALTER_METADATA_LIST) - Constructor for class org.apache.tika.eval.app.io.ExtractReader
- ExtractReader(ExtractReader.ALTER_METADATA_LIST, long, long) - Constructor for class org.apache.tika.eval.app.io.ExtractReader
- ExtractReader.ALTER_METADATA_LIST - Enum Class in org.apache.tika.eval.app.io
- ExtractReaderException - Exception in org.apache.tika.eval.app.io
-
Exception when trying to read extract
- ExtractReaderException(ExtractReaderException.TYPE) - Constructor for exception org.apache.tika.eval.app.io.ExtractReaderException
- ExtractReaderException(ExtractReaderException.TYPE, Throwable) - Constructor for exception org.apache.tika.eval.app.io.ExtractReaderException
- ExtractReaderException.TYPE - Enum Class in org.apache.tika.eval.app.io
- extractResponseText(String, Metadata) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
-
Parse the API response body and extract the model's text output.
- extractResponseText(String, Metadata) - Method in class org.apache.tika.parser.vlm.ClaudeVLMParser
- extractResponseText(String, Metadata) - Method in class org.apache.tika.parser.vlm.GeminiVLMParser
- extractResponseText(String, Metadata) - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- extractRootElement(byte[]) - Method in class org.apache.tika.detect.XmlRootExtractor
- extractRootElement(InputStream) - Method in class org.apache.tika.detect.XmlRootExtractor
- extractScripts - Variable in class org.apache.tika.parser.html.JSoupParser.Config
- extractSparseInto(byte[], int[], int[]) - Method in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
-
Sparse extraction into caller-owned, reusable buffers.
- extractSparseInto(T, int[], int[]) - Method in interface org.apache.tika.ml.FeatureExtractor
-
Sparse extraction into caller-owned reusable buffers: populates dense with feature counts, writes the indices of non-zero entries into touched, and returns how many indices were written.
- extractStandardReferences(String, double) - Static method in class org.apache.tika.sax.StandardsText
-
Extracts the standard references found within the given text.
- extractXMPBasicSchema(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
-
Extracts basic schema metadata from XMP.
- extractXMPMM(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
-
Extracts Media Management metadata from XMP.
- extractXMPMM(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
-
Extracts Media Management metadata from XMP.
F
- F_NUMBER - Static variable in interface org.apache.tika.metadata.TIFF
-
"F-Number."
- FAIL - Static variable in class org.apache.tika.sax.xpath.Matcher
-
State of a failed XPath evaluation, where nothing is matched.
- FAILED_TO_INITIALIZE - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- FallbackParser - Class in org.apache.tika.parser.multiple
-
Tries multiple parsers in turn, until one succeeds.
- FallbackParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Collection<? extends Parser>) - Constructor for class org.apache.tika.parser.multiple.FallbackParser
- FallbackParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Parser...) - Constructor for class org.apache.tika.parser.multiple.FallbackParser
- FALSE - Static variable in class org.apache.tika.eval.app.ProfilerBase
- FASTER - Static variable in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- FATAL - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Fatal system error - cannot continue, must be fixed and restarted
- fatalError(SAXParseException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- FCHARSET_MAP - Static variable in class org.apache.tika.parser.microsoft.rtf.jflex.RTFCharsetMaps
-
Maps \fcharsetN values to Java charsets.
- FEATURE_FLAGS - Static variable in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- FEATURE_FLAGS - Static variable in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
-
Bitmask of CharSoupModel.FLAG_* constants that exactly describes the features this extractor emits.
- FEATURE_FLAGS - Static variable in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
-
Bitmask of CharSoupModel.FLAG_* constants that exactly describes the features this extractor emits.
- FEATURE_FLAGS_LEGACY - Static variable in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
-
Flags used by models trained before script block features were added.
- FEATURE_FLAGS_LEGACY - Static variable in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- FEATURE_FLAGS_V11 - Static variable in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- FEATURE_FLAGS_WITH_WORD_BIGRAMS - Static variable in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- FeatureExtractor - Interface in org.apache.tika.langdetect.charsoup
-
Common interface for feature extractors used by the bigram language detector.
- FeatureExtractor<T> - Interface in org.apache.tika.ml
-
Generic feature extractor that maps an input of type T to a fixed-length integer feature vector suitable for a LinearModel.
- featureLabel(int) - Static method in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
-
Human-readable label for feature index i (for debugging).
- FeedParser - Class in org.apache.tika.parser.feed
-
Feed parser.
- FeedParser() - Constructor for class org.apache.tika.parser.feed.FeedParser
- fetch(String, long, long, Metadata) - Method in interface org.apache.tika.pipes.api.fetcher.RangeFetcher
- fetch(String, long, long, Metadata, ParseContext) - Method in interface org.apache.tika.pipes.api.fetcher.RangeFetcher
- fetch(String, long, long, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- fetch(String, long, long, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.s3.S3Fetcher
- fetch(String, Metadata, ParseContext) - Method in interface org.apache.tika.pipes.api.fetcher.Fetcher
-
Fetches a resource and returns it as a TikaInputStream.
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.core.fetcher.EmptyFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.gcs.GCSFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetcher.s3.S3Fetcher
- fetch(String, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.MicrosoftGraphFetcher
- FETCH_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- FETCH_KEY - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- FETCH_KEY_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseReply
- FETCH_KEY_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseRequest
- FETCH_RANGE_END - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- FETCH_RANGE_START - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- fetchAndParse(FetchAndParseRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParse(FetchAndParseRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParse(FetchAndParseRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParse(FetchAndParseRequest, StreamObserver<FetchAndParseReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParse(FetchAndParseRequest, StreamObserver<FetchAndParseReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParseBiDirectionalStreaming() - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParseBiDirectionalStreaming(StreamObserver<FetchAndParseReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParseBiDirectionalStreaming(StreamObserver<FetchAndParseReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- FetchAndParseReply - Class in org.apache.tika
-
Protobuf type tika.FetchAndParseReply
- FetchAndParseReply.Builder - Class in org.apache.tika
-
Protobuf type tika.FetchAndParseReply
- FetchAndParseReplyOrBuilder - Interface in org.apache.tika
- FetchAndParseRequest - Class in org.apache.tika
-
Protobuf type tika.FetchAndParseRequest
- FetchAndParseRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.FetchAndParseRequest
- FetchAndParseRequestOrBuilder - Interface in org.apache.tika
- fetchAndParseServerSideStreaming(FetchAndParseRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParseServerSideStreaming(FetchAndParseRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParseServerSideStreaming(FetchAndParseRequest, StreamObserver<FetchAndParseReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- fetchAndParseServerSideStreaming(FetchAndParseRequest, StreamObserver<FetchAndParseReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Using a Fetcher in the fetcher store, send a FetchAndParse request.
- FetchEmitTuple - Class in org.apache.tika.pipes.api
- FetchEmitTuple(String, FetchKey, EmitKey) - Constructor for class org.apache.tika.pipes.api.FetchEmitTuple
- FetchEmitTuple(String, FetchKey, EmitKey, Metadata) - Constructor for class org.apache.tika.pipes.api.FetchEmitTuple
- FetchEmitTuple(String, FetchKey, EmitKey, Metadata, ParseContext) - Constructor for class org.apache.tika.pipes.api.FetchEmitTuple
- FetchEmitTuple(String, FetchKey, EmitKey, Metadata, ParseContext, FetchEmitTuple.ON_PARSE_EXCEPTION) - Constructor for class org.apache.tika.pipes.api.FetchEmitTuple
- FetchEmitTuple.ON_PARSE_EXCEPTION - Enum Class in org.apache.tika.pipes.api
- FetchEmitTupleDeserializer - Class in org.apache.tika.pipes.core.serialization
- FetchEmitTupleDeserializer() - Constructor for class org.apache.tika.pipes.core.serialization.FetchEmitTupleDeserializer
- FetchEmitTupleSerializer - Class in org.apache.tika.pipes.core.serialization
- FetchEmitTupleSerializer() - Constructor for class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- Fetcher - Interface in org.apache.tika.pipes.api.fetcher
-
Interface for an object that will fetch a TikaInputStream given a fetch string.
- FETCHER - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- FETCHER_CLASS_FIELD_NUMBER - Static variable in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- FETCHER_CLASS_FIELD_NUMBER - Static variable in class org.apache.tika.GetFetcherReply
- FETCHER_CLASS_FIELD_NUMBER - Static variable in class org.apache.tika.SaveFetcherRequest
- FETCHER_CONFIG_JSON_FIELD_NUMBER - Static variable in class org.apache.tika.SaveFetcherRequest
- FETCHER_CONFIG_JSON_SCHEMA_FIELD_NUMBER - Static variable in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- FETCHER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.DeleteFetcherRequest
- FETCHER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseRequest
- FETCHER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.GetFetcherReply
- FETCHER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.GetFetcherRequest
- FETCHER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.SaveFetcherReply
- FETCHER_ID_FIELD_NUMBER - Static variable in class org.apache.tika.SaveFetcherRequest
- FETCHER_INITIALIZATION_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- FETCHER_NOT_FOUND - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- FetcherFactory - Interface in org.apache.tika.pipes.api.fetcher
- fetcherId() - Method in record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Returns the value of the fetcherId record component.
- FetcherManager - Class in org.apache.tika.pipes.core.fetcher
-
Utility class to hold multiple fetchers.
- FetcherNotFoundException - Exception in org.apache.tika.pipes.api.fetcher
-
Exception thrown when a requested fetcher configuration does not exist.
- FetcherNotFoundException(String) - Constructor for exception org.apache.tika.pipes.api.fetcher.FetcherNotFoundException
- FetcherNotFoundException(String, Throwable) - Constructor for exception org.apache.tika.pipes.api.fetcher.FetcherNotFoundException
- FetcherOverride(String, String, Map<String, Object>) - Constructor for class org.apache.tika.pipes.core.config.ConfigOverrides.FetcherOverride
- FetcherStringException - Exception in org.apache.tika.pipes.core.fetcher
-
Thrown if something goes wrong when parsing the fetcher string.
- FetcherStringException(String) - Constructor for exception org.apache.tika.pipes.core.fetcher.FetcherStringException
- FetchKey - Class in org.apache.tika.pipes.api.fetcher
-
Pair of fetcherId (which fetcher to call) and the key to send to that fetcher to retrieve a specific file.
- FetchKey() - Constructor for class org.apache.tika.pipes.api.fetcher.FetchKey
- FetchKey(String, String) - Constructor for class org.apache.tika.pipes.api.fetcher.FetchKey
- FetchKey(String, String, long, long) - Constructor for class org.apache.tika.pipes.api.fetcher.FetchKey
- FictionBookParser - Class in org.apache.tika.parser.xml
- FictionBookParser() - Constructor for class org.apache.tika.parser.xml.FictionBookParser
- fieldCodeHyperlinkStart(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- fieldCodeHyperlinkStart(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
-
Called when a hyperlink is found via a field code (instrText HYPERLINK).
- FieldCodeParser - Class in org.apache.tika.parser.microsoft.ooxml
-
Parses OOXML field codes (instrText) to extract URLs from HYPERLINK, INCLUDEPICTURE, INCLUDETEXT, IMPORT, and LINK fields.
- FieldNameMappingFilter - Class in org.apache.tika.metadata.filter
- FieldNameMappingFilter() - Constructor for class org.apache.tika.metadata.filter.FieldNameMappingFilter
- FieldNameMappingFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.FieldNameMappingFilter
-
Constructor for JSON configuration.
- FieldNameMappingFilter(FieldNameMappingFilter.Config) - Constructor for class org.apache.tika.metadata.filter.FieldNameMappingFilter
-
Constructor with explicit Config object.
- FieldNameMappingFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- FIELDS_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseReply
- FILE_DATA_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The file data rate in megabytes per second."
- FILE_EXTENSION - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- FILE_ID - Static variable in interface org.apache.tika.metadata.WordPerfect
-
File identifier.
- FILE_MIME - Static variable in class org.apache.tika.detect.FileCommandDetector
- FILE_MIME_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- FILE_NAME - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- FILE_PATH - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- FILE_SIZE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
File size as defined in document header.
- FILE_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
File type.
- FileBasedConfigStore - Class in org.apache.tika.pipes.core.config
-
File-based implementation of ConfigStore that persists configurations to a JSON file.
- FileBasedConfigStore(Path) - Constructor for class org.apache.tika.pipes.core.config.FileBasedConfigStore
- FileBasedConfigStoreFactory - Class in org.apache.tika.pipes.core.config
-
Factory for creating FileBasedConfigStore instances.
- FileBasedConfigStoreFactory() - Constructor for class org.apache.tika.pipes.core.config.FileBasedConfigStoreFactory
- FileCommandDetector - Class in org.apache.tika.detect
-
This runs the Linux 'file' command against a file.
- FileCommandDetector() - Constructor for class org.apache.tika.detect.FileCommandDetector
- fileContent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
- fileDataObject - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- fileExtension() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns the value of the fileExtension record component.
- fileExtension() - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Returns the value of the fileExtension record component.
- fileExtension() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Returns the value of the fileExtension record component.
- fileExtension() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the fileExtension record component.
- FileHash - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
File Hash
- fileName() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the fileName record component.
- fileName() - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Returns the value of the fileName record component.
- FilenameUtils - Class in org.apache.tika.io
- FilenameUtils() - Constructor for class org.apache.tika.io.FilenameUtils
- filePath() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the filePath record component.
- filePath() - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Returns the value of the filePath record component.
- FileProcessResult - Class in org.apache.tika.utils
- FileProcessResult() - Constructor for class org.apache.tika.utils.FileProcessResult
- FileSystem - Interface in org.apache.tika.metadata
-
A collection of metadata elements for file system level metadata
- FileSystemEmitter - Class in org.apache.tika.pipes.emitter.fs
-
Emitter to write to a file system.
- FileSystemEmitter(ExtensionConfig) - Constructor for class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
- FileSystemEmitterConfig - Record Class in org.apache.tika.pipes.emitter.fs
- FileSystemEmitterConfig(String, String, FileSystemEmitterConfig.ON_EXISTS, boolean) - Constructor for record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Creates an instance of a FileSystemEmitterConfig record class.
- FileSystemEmitterFactory - Class in org.apache.tika.pipes.emitter.fs
-
Factory for creating file system emitters.
- FileSystemEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.fs.FileSystemEmitterFactory
- FileSystemEmitterRuntimeConfig - Class in org.apache.tika.pipes.emitter.fs
-
Runtime configuration for FileSystemEmitter.
- FileSystemEmitterRuntimeConfig() - Constructor for class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- FileSystemFetcher - Class in org.apache.tika.pipes.fetcher.fs
-
Fetches files from a local/mounted file system.
- FileSystemFetcher(ExtensionConfig) - Constructor for class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
- FileSystemFetcherConfig - Class in org.apache.tika.pipes.fetcher.fs
- FileSystemFetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- FileSystemFetcherFactory - Class in org.apache.tika.pipes.fetcher.fs
-
Factory for creating file system fetchers.
- FileSystemFetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherFactory
- FileSystemPipesIterator - Class in org.apache.tika.pipes.iterator.fs
- FileSystemPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.fs
- FileSystemPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorConfig
- FileSystemPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.fs
-
Factory for creating file system pipes iterators.
- FileSystemPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorFactory
- FileSystemPipesPlugin - Class in org.apache.tika.pipes.plugin.fs
- FileSystemPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.fs.FileSystemPipesPlugin
- FileSystemReporterConfig - Record Class in org.apache.tika.pipes.reporter.fs
- FileSystemReporterConfig(Path, long) - Constructor for record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
-
Creates an instance of a FileSystemReporterConfig record class.
- FileSystemReporterFactory - Class in org.apache.tika.pipes.reporter.fs
-
Factory for creating file system status reporters.
- FileSystemReporterFactory() - Constructor for class org.apache.tika.pipes.reporter.fs.FileSystemReporterFactory
- FileSystemStatusReporter - Class in org.apache.tika.pipes.reporter.fs
-
This is intended to write summary statistics to disk periodically.
- FileTooLongException - Exception in org.apache.tika.exception
- FileTooLongException(long, long) - Constructor for exception org.apache.tika.exception.FileTooLongException
- FileTooLongException(String) - Constructor for exception org.apache.tika.exception.FileTooLongException
- FILL_IN_FORM - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user fill in a form?
- fillAndStrokePath(int) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- fillMetadata(Parser, Metadata, MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
- fillPath(int) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- filter(ContainerRequestContext) - Method in class org.apache.tika.server.core.ConfigEndpointSecurityFilter
- filter(ContainerRequestContext) - Method in class org.apache.tika.server.core.TikaLoggingFilter
- filter(List<Metadata>) - Method in class org.apache.tika.metadata.filter.MetadataFilter
-
Convenience overload for callers that have no per-request context.
- filter(List<Metadata>) - Method in class org.apache.tika.pipes.core.PassbackFilter
-
Filters the metadata list in place, selecting which data to pass back to the client.
- filter(List<Metadata>, ParseContext) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- filter(List<Metadata>, ParseContext) - Method in class org.apache.tika.metadata.filter.CompositeMetadataFilter
- filter(List<Metadata>, ParseContext) - Method in class org.apache.tika.metadata.filter.MetadataFilter
-
Filters the metadata list in place using per-request context.
- filter(List<Metadata>, ParseContext) - Method in class org.apache.tika.metadata.filter.MetadataFilterBase
- filter(List<Metadata>, ParseContext) - Method in class org.apache.tika.metadata.filter.NoOpFilter
- filter(List<Metadata>, ParseContext) - Method in class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.langdetect.charsoup.CharSoupMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.langdetect.opennlp.metadatafilter.OpenNLPMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.langdetect.optimaize.metadatafilter.OptimaizeMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.FieldNameMappingFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.GeoPointMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
- filter(Metadata) - Method in class org.apache.tika.metadata.filter.MetadataFilterBase
- FINAL_EMBEDDED_RESOURCE_PATH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is calculated in RecursiveParserWrapperHandler.
- findContextKeyInterface(Class<?>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Finds the appropriate context key interface for a given type.
- findDuplicateParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support.
- findInFile(String, Path) - Method in class org.apache.tika.example.InterruptableParsingExample
- findMatches(String, Pattern) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
Finds matching subgroups in text.
- findNames(String[]) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
Finds names from the given array of tokens.
- findServiceResources(String) - Method in class org.apache.tika.config.ServiceLoader
-
Returns all the available service resources matching the given pattern, such as all instances of tika-mimetypes.xml on the classpath, or all org.apache.tika.parser.Parser service files.
- findStorageIndexCellMapping(CellID) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
This method is used to find the Storage Index Cell Mapping that matches the Cell ID.
- findStorageIndexRevisionMapping(ExGuid) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
-
This method is used to find the Storage Index Revision Mapping that matches the Revision Mapping Extended GUID.
- finish() - Method in interface org.apache.tika.eval.core.textstats.BytesRefCalculator.BytesRefCalcInstance
- finished() - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- finished(byte[]) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- FINISHED - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- FIRST_ONLY - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReader.ALTER_METADATA_LIST
- FIRST_ONLY - Enum constant in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.AttachmentStrategy
- FIRST_ONLY - Enum constant in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.MultivaluedFieldStrategy
- FIRST_WINS - Enum constant in enum class org.apache.tika.parser.multiple.AbstractMultipleParser.MetadataPolicy
-
The first parser to output a given key wins; other, non-clashing keys are merged in.
- FlacParser - Class in org.apache.tika.parser.ogg
-
Parser for FLAC audio files (both native FLAC and OGG-FLAC).
- FlacParser() - Constructor for class org.apache.tika.parser.ogg.FlacParser
- flag - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
- FLAG_4GRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable character 4-grams.
- FLAG_5GRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable character 5-grams.
- FLAG_CHAR_UNIGRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable non-CJK character unigrams.
- FLAG_L2_NORM - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: L2-normalize the feature vector before prediction.
- FLAG_PREFIX - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable 3-char word prefixes.
- FLAG_SCRIPT_BLOCKS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable script-block presence + transition features.
- FLAG_SKIP_BIGRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable skip bigrams.
- FLAG_SUFFIX4 - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable 4-char word suffixes.
- FLAG_SUFFIXES - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable 3-char word suffixes.
- FLAG_TRIGRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable character trigrams.
- FLAG_WORD_BIGRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: short-word-anchored word bigrams (hash pairs where anchor is 1–3 chars).
- FLAG_WORD_LENGTH - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: non-CJK word length features (exact length, capped).
- FLAG_WORD_UNIGRAMS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Feature flag: enable whole-word unigrams.
- FLASH_FIRED - Static variable in interface org.apache.tika.metadata.TIFF
-
Did the Flash fire when taking this image?
- FlatOpenDocumentParser - Class in org.apache.tika.parser.odf
- FlatOpenDocumentParser() - Constructor for class org.apache.tika.parser.odf.FlatOpenDocumentParser
- FlatOpenDocumentParser(JsonConfig) - Constructor for class org.apache.tika.parser.odf.FlatOpenDocumentParser
-
Constructor for JSON configuration.
- FlatOpenDocumentParser(FlatOpenDocumentParser.Config) - Constructor for class org.apache.tika.parser.odf.FlatOpenDocumentParser
-
Constructor with explicit Config object.
- FlatOpenDocumentParser.Config - Class in org.apache.tika.parser.odf
-
Configuration class for JSON deserialization.
- floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- flush() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Ignored.
- FLVParser - Class in org.apache.tika.parser.video
-
Parser for metadata contained in Flash Videos (.flv).
- FLVParser() - Constructor for class org.apache.tika.parser.video.FLVParser
- FOCAL_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Focal length of the lens, in millimeters."
- Font - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- Font - Interface in org.apache.tika.metadata
- FONT - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- FONT_NAME - Static variable in interface org.apache.tika.metadata.Font
-
Basic name of a font used in a file
- FontColor - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- FontSize - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- footers - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- footnoteReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- footnoteReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- format(Object, StringBuffer, FieldPosition) - Method in class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
- FORMAT - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- FORMAT - Static variable in interface org.apache.tika.metadata.DublinCore
-
Typically, Format may include the media-type or dimensions of the resource.
- FORMAT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- FORMAT - Static variable in interface org.apache.tika.metadata.XMPDC
-
Typically, Format may include the media-type or dimensions of the resource.
- formatDate(Calendar) - Static method in class org.apache.tika.utils.DateUtils
-
Returns an ISO 8601 representation of the given date in UTC, truncated to the seconds unit.
- formatDate(Date) - Static method in class org.apache.tika.utils.DateUtils
-
Returns an ISO 8601 representation of the given date in UTC, truncated to the seconds unit.
- formatDateUnknownTimezone(Date) - Static method in class org.apache.tika.utils.DateUtils
-
Returns an ISO 8601 representation of the given date in UTC, truncated to the seconds unit.
- formatHash(byte[]) - Static method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Formats a SHA256 byte array as the Frictionless hash string format.
- formatMillis(long) - Static method in class org.apache.tika.utils.DurationFormatUtils
- formatRawCellContents(double, int, String, boolean) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
- formatter - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- FormattingUtils - Class in org.apache.tika.parser.microsoft
- FormattingUtils.Tag - Enum Class in org.apache.tika.parser.microsoft
- forName(String) - Method in class org.apache.tika.mime.MimeTypes
-
Returns the registered media type with the given name (or alias).
- forName(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Returns Charset impl, if one exists.
- FourBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
This class is used to represent a property that contains 4 bytes of data in the PropertySet.rgData stream field.
- FourBytesOfData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains 4 bytes of data in the PropertySet.rgData stream field.
- FourBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
- FourBytesOfLengthFollowedByData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains a prtFourBytesOfLengthFollowedByData in the PropertySet.rgData stream field.
- FragmentDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Fragment Data Element
- FragmentKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Fragment Knowledge
- FragmentKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Fragment Knowledge
- FragmentKnowledgeEntry - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Fragment Knowledge Entry
- FrameworkConfig - Class in org.apache.tika.config.loader
-
Extracts framework-level configuration from component JSON, separating fields prefixed with underscore from component-specific config.
- FrameworkConfig.ParserDecoration - Class in org.apache.tika.config.loader
-
Parser decoration configuration for mime type filtering.
- FRICTIONLESS - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_FORMAT
-
Frictionless Data Package format with datapackage.json manifest, SHA256 hashes, mimetypes, and files in unpacked/ subdirectory
- FrictionlessFileInfo(int, String, Path, Metadata, String, long, String) - Constructor for record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Creates an instance of a FrictionlessFileInfo record class.
- FrictionlessPackageDetector - Class in org.apache.tika.detect.zip
- FrictionlessPackageDetector() - Constructor for class org.apache.tika.detect.zip.FrictionlessPackageDetector
- FrictionlessResource - Record Class in org.apache.tika.pipes.core.extractor.frictionless
-
Represents a resource entry in a Frictionless Data Package.
- FrictionlessResource(String, String, long, String, String) - Constructor for record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Creates an instance of a FrictionlessResource record class.
- FrictionlessUnpackHandler - Class in org.apache.tika.pipes.core.extractor
-
An UnpackHandler that collects embedded files for Frictionless Data Package output.
- FrictionlessUnpackHandler(EmitKey, UnpackConfig) - Constructor for class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Creates a new FrictionlessUnpackHandler.
- FrictionlessUnpackHandler.FrictionlessFileInfo - Record Class in org.apache.tika.pipes.core.extractor
-
Information about an embedded file including its SHA256 hash.
- FROM_REPRESENTING_EMAIL - Static variable in interface org.apache.tika.metadata.MAPI
- FROM_REPRESENTING_NAME - Static variable in interface org.apache.tika.metadata.MAPI
- fromBytes(byte[], Class<T>) - Static method in class org.apache.tika.pipes.core.serialization.JsonPipesIpc
-
Deserialize Smile binary format bytes to an object.
- fromCurlyBraceUTF16Bytes(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
-
Converts a GUID of format: {AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE} (in bytes) to a GUID object.
- fromIntVal(int) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
- fromIntVal(int) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
- fromIntVal(int) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
- fromIntVal(int) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
- fromJson(Reader) - Static method in class org.apache.tika.pipes.core.serialization.JsonFetchEmitTuple
- fromJson(Reader) - Static method in class org.apache.tika.pipes.core.serialization.JsonFetchEmitTupleList
- fromJson(Reader) - Static method in class org.apache.tika.serialization.JsonMetadata
-
Read metadata from reader.
- fromJson(Reader) - Static method in class org.apache.tika.serialization.JsonMetadataList
-
Read metadata from reader.
- fromJson(String) - Static method in class org.apache.tika.inference.ChunkSerializer
-
Deserialize a JSON array string back to a list of chunks.
- fromJson(String) - Static method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Parses a DataPackage from JSON string.
- FsshttpbResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
The Response
- FsshttpbSubResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
FSSHTTPB Sub Response
G
- GCSEmitter - Class in org.apache.tika.pipes.emitter.gcs
-
Emitter to write parsed documents to Google Cloud Storage.
- GCSEmitterConfig - Record Class in org.apache.tika.pipes.emitter.gcs
- GCSEmitterConfig(String, String, String, String) - Constructor for record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Creates an instance of a GCSEmitterConfig record class.
- GCSEmitterFactory - Class in org.apache.tika.pipes.emitter.gcs
-
Factory for creating Google Cloud Storage emitters.
- GCSEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.gcs.GCSEmitterFactory
- GCSFetcher - Class in org.apache.tika.pipes.fetcher.gcs
-
Fetches files from Google Cloud Storage.
- GCSFetcherConfig - Class in org.apache.tika.pipes.fetcher.gcs.config
- GCSFetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- GCSFetcherFactory - Class in org.apache.tika.pipes.fetcher.gcs
-
Factory for creating Google Cloud Storage fetchers.
- GCSFetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.gcs.GCSFetcherFactory
- GCSPipesIterator - Class in org.apache.tika.pipes.iterator.gcs
- GCSPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.gcs
- GCSPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- GCSPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.gcs
-
Factory for creating Google Cloud Storage pipes iterators.
- GCSPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorFactory
- GCSPipesPlugin - Class in org.apache.tika.pipes.plugin.gcs
- GCSPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.gcs.GCSPipesPlugin
- GDALParser - Class in org.apache.tika.parser.gdal
-
Wraps execution of the Geospatial Data Abstraction Library (GDAL) gdalinfo tool used to extract geospatial information out of hundreds of geo file formats.
- GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
- GeminiVLMParser - Class in org.apache.tika.parser.vlm
-
VLM parser for the Google Gemini generateContent API.
- GeminiVLMParser() - Constructor for class org.apache.tika.parser.vlm.GeminiVLMParser
- GeminiVLMParser(JsonConfig) - Constructor for class org.apache.tika.parser.vlm.GeminiVLMParser
- GeminiVLMParser(VLMOCRConfig) - Constructor for class org.apache.tika.parser.vlm.GeminiVLMParser
- GENERAL_EMBEDDED - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
General embedded document type within an OLE2 container
- generateFooter(StringBuffer) - Method in class org.apache.tika.server.core.HTMLHelper
- generateHeader(StringBuffer, String) - Method in class org.apache.tika.server.core.HTMLHelper
-
Generates the HTML header for the user-facing page, adding in the given title as required.
- generateJwt(String, String) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtGenerator
- generateResourceName(EmbeddedDocumentUtil.EmbeddedResourcePrefix, int, String) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Generates a canonical resource name from a type, counter, and media type.
- generateRSS(Path) - Method in class org.apache.tika.example.RecentFiles
- GENERIC - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- GenericConverter - Class in org.apache.tika.xmp.convert
-
Tries to convert as many of the properties in the Metadata map as possible to XMP namespaces.
- GenericConverter() - Constructor for class org.apache.tika.xmp.convert.GenericConverter
- GENRE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the genre."
- GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
-
List of predefined genres.
- GeoGazetteerClient - Class in org.apache.tika.parser.geo.topic.gazetteer
- GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Pass the URL at which lucene-geo-gazetteer is available, e.g. http://localhost:8765/api/search
- GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
- Geographic - Interface in org.apache.tika.metadata
-
Geographic schema.
- GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
- GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
- geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
- GeoParser - Class in org.apache.tika.parser.geo.topic
- GeoParser() - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
- GeoParser(JsonConfig) - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
- GeoParser(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
- GeoParserConfig - Class in org.apache.tika.parser.geo.topic
- GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig
- GeoParserConfig.RuntimeConfig - Class in org.apache.tika.parser.geo.topic
-
RuntimeConfig blocks modification of security-sensitive URL/path fields at runtime.
- GeoPkgParser - Class in org.apache.tika.parser.geopkg
-
Customization of sqlite parser to skip certain common blob columns.
- GeoPkgParser() - Constructor for class org.apache.tika.parser.geopkg.GeoPkgParser
-
Checks to see if class is available for org.sqlite.JDBC.
- geoPointFieldName - Variable in class org.apache.tika.metadata.filter.GeoPointMetadataFilter.Config
- GeoPointMetadataFilter - Class in org.apache.tika.metadata.filter
-
If Metadata contains a TikaCoreProperties.LATITUDE and a TikaCoreProperties.LONGITUDE, this filter concatenates those with a comma in the order LATITUDE,LONGITUDE.
- GeoPointMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.GeoPointMetadataFilter
- GeoPointMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.GeoPointMetadataFilter
-
Constructor for JSON configuration.
- GeoPointMetadataFilter(GeoPointMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.GeoPointMetadataFilter
-
Constructor with explicit Config object.
- GeoPointMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- GEORGIAN - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- GeoTag - Class in org.apache.tika.parser.geo.topic
- GeoTag() - Constructor for class org.apache.tika.parser.geo.topic.GeoTag
- get() - Method in enum class org.apache.tika.parser.strings.StringsEncoding
- get() - Method in class org.apache.tika.pipes.core.server.IntermediateResult
- get(byte[]) - Static method in class org.apache.tika.io.TikaInputStream
- get(byte[], Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(File) - Static method in class org.apache.tika.io.TikaInputStream
- get(File, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
- get(InputStream, TemporaryResources, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(InputStream, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(Class<T>) - Method in interface org.apache.tika.config.loader.LoaderContext.DependencyProvider
- get(Class<T>) - Method in class org.apache.tika.config.loader.LoaderContext
-
Get a dependency by class type.
- get(Class<T>) - Method in class org.apache.tika.config.loader.TikaLoader
-
Gets a component by its class type.
- get(Class<T>) - Method in class org.apache.tika.detect.zip.StreamingDetectContext
-
Returns the object in this context that implements the given interface.
- get(Class<T>) - Method in class org.apache.tika.parser.ParseContext
-
Returns the object in this context that implements the given interface.
- get(Class<T>, T) - Method in class org.apache.tika.detect.zip.StreamingDetectContext
-
Returns the object in this context that implements the given interface, or the given default value if such an object is not found.
- get(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
-
Returns the object in this context that implements the given interface, or the given default value if such an object is not found.
- get(String) - Method in class org.apache.tika.config.loader.TikaLoader
-
Gets a component by its JSON field name.
- get(String) - Method in class org.apache.tika.metadata.Metadata
-
Get the value associated with a metadata name.
- get(String) - Static method in class org.apache.tika.metadata.Property
-
Retrieve the property object that corresponds to the given key
- get(String) - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Retrieves a configuration by ID.
- get(String) - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- get(String) - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- get(String) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- get(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Returns the value of a simple property or the first one of an array.
- get(String, Map<String, String>, int) - Method in class org.apache.tika.http.TikaHttpClient
-
GET url and return the response body as a string.
- get(URI) - Static method in class org.apache.tika.io.TikaInputStream
- get(URI, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(URL) - Static method in class org.apache.tika.io.TikaInputStream
- get(URL, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(Path) - Static method in class org.apache.tika.io.TikaInputStream
- get(Path, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(Path, Metadata, TemporaryResources) - Static method in class org.apache.tika.io.TikaInputStream
- get(Blob) - Static method in class org.apache.tika.io.TikaInputStream
- get(Blob, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- get(HttpClientFactory, List<String>) - Static method in class org.apache.tika.server.client.TikaClient
- get(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value (if any) of the identified metadata property.
- get(Property) - Method in class org.apache.tika.xmp.XMPMetadata
- get(ParseContext) - Static method in class org.apache.tika.config.EmbeddedLimits
-
Helper method to get EmbeddedLimits from ParseContext with defaults.
- get(ParseContext) - Static method in class org.apache.tika.config.OutputLimits
-
Helper method to get OutputLimits from ParseContext with defaults.
- get(ParseContext) - Static method in class org.apache.tika.config.TimeoutLimits
-
Helper method to get TimeoutLimits from ParseContext with defaults.
- GET_FETCHER_REPLIES_FIELD_NUMBER - Static variable in class org.apache.tika.ListFetchersReply
- get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
AKA a Synchsafe integer. 4 bytes hold a 28 bit number.
- getAbstractNumId(int) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFNumberingShim
- getAbstractNumLevels(int) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFNumberingShim
- getAccessCheckMode() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getAccessKey() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getAccessKey() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getAcronym() - Method in class org.apache.tika.mime.MimeType
-
Returns an acronym for this mime type.
- getAdditionalFetchConfigJson() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
You can supply additional fetch configuration using this.
- getAdditionalFetchConfigJson() - Method in class org.apache.tika.FetchAndParseRequest
-
You can supply additional fetch configuration using this.
- getAdditionalFetchConfigJson() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
You can supply additional fetch configuration using this.
- getAdditionalFetchConfigJsonBytes() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
You can supply additional fetch configuration using this.
- getAdditionalFetchConfigJsonBytes() - Method in class org.apache.tika.FetchAndParseRequest
-
You can supply additional fetch configuration using this.
- getAdditionalFetchConfigJsonBytes() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
You can supply additional fetch configuration using this.
- getAdditionalFields() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Every Converter has to provide information about namespaces that are used in addition to the core set of XMP namespaces.
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.GenericConverter
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.OpenDocumentConverter
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.RTFConverter
- getAdmin1Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- getAdmin2Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the path to XML descriptor for AnalysisEngine.
- getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the overall album / compilation of albums
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have album-wide artists, so this returns null.
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getAlgorithm() - Method in class org.apache.tika.digest.DigestDef
- getAliases(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of known aliases of the given canonical media type.
- getAlignedLenTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getAlignedTreeTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getAll() - Method in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
- getAllComponentParsers() - Method in class org.apache.tika.parser.CompositeParser
-
Returns all parsers registered with the Composite Parser, including ones which may not currently be active.
- getAllComponentParsers() - Method in class org.apache.tika.parser.DefaultParser
- getAllComponents() - Method in class org.apache.tika.config.loader.ComponentRegistry
-
Returns all registered component names.
- getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
-
Get the names of all charsets supported by the CharsetDetector class.
- getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
- getAllowedHostsForRedirect() - Method in class org.apache.tika.client.HttpClientFactory
- getAllParsers() - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
- getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
-
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers for each supported set of tags.
- getAlpha(int) - Method in class org.apache.tika.parser.ocr.tess4j.ImageDeskew
- getAlphabeticTokens() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
- getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new UIMA Analysis Engine (AE).
- getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns the annotation value based on the given annotation type.
- getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns an array of CTAKESAnnotationProperty values that will be included in cTAKES metadata.
- getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of CTAKESAnnotationProperty names that will be included in cTAKES metadata.
- getAnsiSkip() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the number of ANSI chars remaining to skip.
- getApiKey() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getApiKey() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getApiKey() - Method in class org.apache.tika.inference.InferenceConfig
- getApiKey() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getApiKey() - Method in class org.apache.tika.language.translate.impl.YandexTranslator
-
Get the API Key in use for client authentication
- getApiKey() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getApiKey() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getApiKeyHeaderName() - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
- getApiKeyHeaderName() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getApiKeyHeaderName() - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- getApiKeyPrefix() - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
- getApiKeyPrefix() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getApiKeyPrefix() - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- getApplicationName() - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- getArbitrationInfo() - Method in class org.apache.tika.detect.EncodingDetectorContext
- getArray() - Method in class org.apache.tika.eval.core.textstats.TokenCountPriorityQueue
- getArray() - Method in class org.apache.tika.eval.core.tokens.TokenCountPriorityQueue
- getArrayComponents(String) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Gets component configurations for a specific type (array format - used for detectors, etc.).
- getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the track
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getAttachmentStrategyEnum() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
- getAttachmentStrategyEnum() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getAttributesMapping() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
- getAttrValue(String, Attributes) - Static method in class org.apache.tika.utils.XMLReaderUtils
- getAuthScheme() - Method in class org.apache.tika.client.HttpClientFactory
- getAuthScheme() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getAuthScheme() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getAutoDetectParser() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getAutoDetectParserConfig() - Method in class org.apache.tika.parser.AutoDetectParser
- getAutoOffsetReset() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getBasePath() - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- getBasePath() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorConfig
- getBaseType() - Method in class org.apache.tika.mime.MediaType
-
Returns the base form of the MediaType, excluding any parameters, such as "text/plain" for "text/plain; charset=utf-8"
- getBaseUrl() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getBaseUrl() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getBaseUrl() - Method in class org.apache.tika.inference.InferenceConfig
- getBaseUrl() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getBaseUrl() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getBaseUrl() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getBbox() - Method in class org.apache.tika.inference.locator.PaginatedLocator
- getBbox() - Method in class org.apache.tika.inference.locator.SpatialLocator
- getBestNameEntity() - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
- getBiases() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getBiases() - Method in class org.apache.tika.ml.LinearModel
- getBigInteger(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the bit rate in bits per second.
- getBlob(ResultSet, int, Metadata) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- getBlob(ResultSet, int, Metadata) - Method in class org.apache.tika.parser.sqlite3.SQLite3TableReader
- getBlock_len() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns block's length
- getBlockAddress() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Returns block addresses
- getBlockCount() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Gets a block count
- getBlockidx_intvl() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns block index interval
- getBlockLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Gets a block length
- getBlockLength() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getBlockNext() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- getBlockNumber() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- getBlockPrev() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- getBlockRemaining() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getBlockType() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getBody() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- getBootstrapServers() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getBucket() - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- getBucket() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getBucket() - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- getBucket() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getBucketName() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- getByte() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getByte() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- getByte() - Method in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
-
Returns the single byte used on the wire for this message type.
- getByteArrayMaxOverride() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
- getByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
- getBytes() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
-
Gets a copy of the byte array that contains the bytes written so far.
- getBytes(boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(char) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(float) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns the specified 32-bit unsigned integer value as an array of bytes.
- getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns the specified 64-bit unsigned integer value as an array of bytes.
- getBytes(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytes(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- getBytesWritten() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFPictStreamParser
-
Returns the number of bytes written so far.
- getCapacity() - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- getCaptureMap() - Method in class org.apache.tika.parser.RegexCaptureParserConfig
- getCategory() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Gets the high-level category for this result.
- getCategory() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Gets the high-level category for this result status.
- getCause() - Method in exception org.apache.tika.sax.TaggedSAXException
-
Returns the wrapped exception.
- getCellManifestDataElementData(List<DataElement>, StorageManifestDataElementData, HashMap<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
Gets the cell manifest data element from a list of data elements.
- getCenter() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- getCertExpirationWarningDays() - Method in class org.apache.tika.server.core.TlsConfig
- getCertificateBytes() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- getCertificatePassword() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the number of channels (1=mono, 2=stereo)
- getChar() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
-
For TEXT and CONTROL_SYMBOL tokens: the single character, without allocating a String.
- getCharset() - Method in class org.apache.tika.detect.AutoDetectReader
- getCharset() - Method in class org.apache.tika.detect.EncodingDetectorContext.Result
-
The top-ranked charset from this detector.
- getCharset() - Method in class org.apache.tika.detect.EncodingResult
- getCharset() - Method in class org.apache.tika.detect.OverrideEncodingDetector
- getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
- getCheckCommandLine() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getCheckErrorCodes() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getChildTypes(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of known children of the given canonical media type
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Deprecated.
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
- getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
- getChmDirList() - Method in class org.apache.tika.parser.microsoft.chm.ChmExtractor
- getChmDirList() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getChmItsfHeader() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getChmItspHeader() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getChmLzxcControlData() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getChmLzxcResetTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getChoices() - Method in class org.apache.tika.metadata.Property
-
Returns the (immutable) set of choices for the values of this property.
- getChunks() - Method in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
- getCiHigh() - Method in class org.apache.tika.quality.TextQualityScore
-
Upper bound of the 95% confidence interval on zScore.
- getCiLow() - Method in class org.apache.tika.quality.TextQualityScore
-
Lower bound of the 95% confidence interval on zScore.
- getClassID() - Method in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- getClassID() - Method in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PropertySetType
- getClassLoader() - Method in class org.apache.tika.config.loader.LoaderContext
- getClassLoader() - Method in class org.apache.tika.config.loader.TikaLoader
-
Gets the class loader used for loading components.
- getClassLogits() - Method in class org.apache.tika.ml.chardetect.SpecialistOutput
- getClassMean() - Method in class org.apache.tika.ml.LinearModel
- getClassName() - Method in enum class org.apache.tika.parser.ctakes.CTAKESSerializer
- getClassStd() - Method in class org.apache.tika.ml.LinearModel
- getCleanDwgReadOutputBatchSize() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- getCleanDwgReadRegexToReplace() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- getCleanDwgReadReplaceWith() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- getClientCertificateCredentialsConfig() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- getClientId() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- getClientId() - Method in interface org.apache.tika.pipes.fetchers.microsoftgraph.config.AadCredentialConfigBase
- getClientId() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- getClientId() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- getClientId() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- getClientSecret() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- getClientSecret() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- getClientSecret() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- getClientSecretCredentialsConfig() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- getColInfos() - Method in class org.apache.tika.eval.app.db.TableInfo
- getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getCommand() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the command to be run.
- getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
- getCommandAppendOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the operator to append rather than replace a value for the command line tool, e.g. "+=".
- getCommandAssignmentDelimeter() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the delimiter for multiple assignments for the command line tool, e.g. ", ".
- getCommandAssignmentOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the assignment operator for the command line tool, e.g. "=".
- getCommandLine() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getCommandMetadataSegments(Metadata) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Constructs a collection of command line arguments responsible for setting individual metadata fields based on the given metadata.
- getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Builds up the ID3 comment, by parsing and extracting the comment string parts from the given data.
- getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Retrieves the comments, if any.
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getCommitWithinOrDefault() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getCommonTokens() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
- getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have compilations, so this returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
ID3v22 doesn't have compilations, so this returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getCompletionsPath() - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- getCompletionsPath() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getComponent() - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Convenience method that returns a component if only one component is configured.
- getComponent(String) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Gets a component by ID, lazily instantiating it if needed.
- getComponentClass() - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
- getComponentClass() - Method in class org.apache.tika.serialization.ComponentConfig
- getComponentClass(String) - Method in class org.apache.tika.config.loader.ComponentRegistry
-
Looks up a component class by name.
- getComponentConfig(Class<T>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Gets component configuration by component class.
- getComponentConfig(String) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Gets the configuration for a specific component by ID.
- getComponentConfig(String) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Gets component configuration by JSON field name.
- getComponentConfigJson() - Method in class org.apache.tika.config.loader.FrameworkConfig
- getComponentConfigNode() - Method in class org.apache.tika.config.loader.FrameworkConfig
- getComponentFields() - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Gets all registered component JSON field names.
- getComponentInfo(String) - Method in class org.apache.tika.config.loader.ComponentRegistry
-
Looks up full component information by name.
- getComponentInfo(String) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Gets the component info for a given friendly name.
- getComponentName() - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Returns the component name for error messages (e.g., "fetcher", "emitter").
- getComponentName() - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
- getComponentName() - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
- getComponents(String) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Gets component configurations for a specific type (object format - used for parsers).
- getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have composers, so this returns null.
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getCompoundTypes() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Gets the StreamObjectTypeHeaderStart
- getCompressedLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Gets compressed length
- getConfidence() - Method in class org.apache.tika.detect.EncodingDetectorContext.Result
-
The confidence of the top-ranked result from this detector.
- getConfidence() - Method in class org.apache.tika.detect.EncodingResult
-
Detection confidence in [0.0, 1.0].
- getConfidence() - Method in class org.apache.tika.language.detect.LanguageResult
- getConfidence() - Method in class org.apache.tika.ml.Prediction
-
Calibration-independent confidence (0–1).
- getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
- getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get an indication of the confidence in the charset detected.
- getConfidenceScore() - Method in class org.apache.tika.language.detect.LanguageResult
-
Detector-agnostic confidence score (0.0 to 1.0).
- getConfig() - Method in class org.apache.tika.config.loader.TikaLoader
-
Gets the underlying JSON configuration.
- getConfig() - Method in class org.apache.tika.parser.external.ExternalParser
-
Returns the configuration for this parser.
- getConfig() - Method in class org.apache.tika.parser.RegexCaptureParser
- getConfig() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.EmitterOverride
- getConfig() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.FetcherOverride
- getConfig(String) - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Gets the configuration for a specific emitter by ID.
- getConfig(String) - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Gets the configuration for a specific fetcher by ID.
- getConfig(ParseContext) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getConfig(ParseContext, String, Class<T>) - Static method in class org.apache.tika.serialization.ConfigDeserializer
-
Retrieves and deserializes a configuration from ParseContext.
- getConfig(ParseContext, String, Class<T>, T) - Static method in class org.apache.tika.config.ParseContextConfig
-
Retrieves runtime configuration from ParseContext.
- getConfig(ParseContext, String, Class<T>, T) - Static method in class org.apache.tika.serialization.ConfigDeserializer
-
Retrieves and deserializes a configuration from ParseContext.
- getConfigKey() - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Returns the JSON configuration key for this component type (e.g., "fetchers", "emitters").
- getConfigKey() - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
- getConfigKey() - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
- getConfigPath() - Method in class org.apache.tika.server.core.TikaServerConfig
- getConfigStore() - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Returns the config store used by this manager.
- getConfigStore() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getConfigStoreParams() - Method in class org.apache.tika.pipes.core.PipesConfig
- getConfigStoreType() - Method in class org.apache.tika.pipes.core.PipesConfig
- getConnection() - Method in class org.apache.tika.eval.app.db.JDBCUtil
-
Override this for any optimizations you want to apply to the db before writing/reading.
- getConnection() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getConnection(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
Override this for special configuration of the connection, such as limiting the number of rows to be held in memory.
- getConnection(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getConnectionString() - Method in class org.apache.tika.eval.app.db.H2Util
- getConnectionString() - Method in class org.apache.tika.eval.app.db.JDBCUtil
- getConnectionString(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
Implement for db specific connection information, e.g.
- getConnectionString(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getConnectionTimeoutMillis() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getConnectionTimeoutMillisOrDefault() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getConnectTimeoutMillis() - Method in class org.apache.tika.client.HttpClientFactory
- getConnectTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getConnectTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getConstraints() - Method in class org.apache.tika.eval.app.db.ColInfo
- getContainer() - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- getContainer() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- getContainerEmitKey() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns the container emit key.
- getContainerStackTrace() - Method in interface org.apache.tika.pipes.api.emitter.EmitData
- getContainerStackTrace() - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
- getContent() - Method in class org.apache.tika.eval.core.util.ContentTags
- getContent() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Get all the content which is represented by the root node object.
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Get all the content which is represented by the intermediate node object.
- getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
-
Get all the content which is represented by the node object.
- getContent() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Get the content from the container document only.
- getContent(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- getContent(int, int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- getContent(EvalFilePaths, Metadata) - Static method in class org.apache.tika.eval.app.ProfilerBase
- getContentField() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getContentField() - Method in class org.apache.tika.inference.InferenceConfig
- getContentHandler() - Method in class org.apache.tika.extractor.ParentContentHandler
- getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.mif.MIFParser
-
Get the content handler to use.
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.OPFParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.TextAndAttributeXMLParser
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
- getContentHandlerDecoratorFactory() - Method in class org.apache.tika.parser.AutoDetectParserConfig
- getContentHandlerFactory() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the content handler factory that specifies how content should be handled.
- getContentHandlerFactory() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
- getContentLanguage() - Method in class org.apache.tika.example.ImportContextImpl
- getContentLength() - Method in class org.apache.tika.example.ImportContextImpl
- getContentLength() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
- getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- getContentSource() - Method in class org.apache.tika.parser.external.ExternalParserConfig
-
Which stream provides the XHTML content output.
- getContextIDs() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- getContextKey(Class<?>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Gets the contextKey for a class from the component registry.
- getContextKeyInterfaces() - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Returns the set of interfaces that use compact format serialization.
- getContextMap() - Method in class org.apache.tika.parser.ParseContext
-
Returns the internal context map for serialization purposes.
- getContributingSpecialists() - Method in class org.apache.tika.ml.chardetect.ScoredCandidate
- getControlDataIndex() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Returns the control data index located in the list.
- getConverter(String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
Retrieve a specific converter according to the mimetype
- getCors() - Method in class org.apache.tika.server.core.TikaServerConfig
- getCount() - Method in class org.apache.tika.parser.pdf.OCRPageCounter
- getCount(String) - Method in class org.apache.tika.eval.core.tokens.LangModel
- getCountryCode() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- getCounts() - Method in class org.apache.tika.eval.core.tokens.LangModel
- getCoveredLabels() - Method in class org.apache.tika.ml.chardetect.SpecialistOutput
- getCreated() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- getCredentialsProvider() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getCredentialsProvider() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getCsvPath() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- getCurrent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
- getCurrent(byte[], AtomicInteger, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Get current stream object.
- getCurrentCharset() - Method in class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
-
Deprecated.
- getCurrentCharset() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the charset that should be used to decode the current hex escape or text byte.
- getCurrentFSSHTTPBSubRequestID() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
This method is used to get the current sub request ID and atomically increment it by 1.
- getCurrentGroup() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the current group state.
- getCurrentPageNo() - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
-
We need to override this because we are overriding PDFTextStripper.processPages(PDPageTree).
- getCurrentPoint() - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- GetCurrentSerialNumber() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
This method is used to get the current serial number and atomically increment it by 1.
- getCurrentServerPort() - Method in class org.apache.tika.pipes.core.PipesParser
-
Returns the current server port.
- getCurrentToken() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
This method is used to get the current token value and atomically increment it by 1.
- getCustomLoader() - Method in class org.apache.tika.serialization.ComponentConfig
-
Gets the custom loader for this component.
- getData() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getData() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
- getData(Class<T>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Used to get data.
- getDataObjectDataElementData(List<DataElement>, ExGuid, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to get the list of object group data elements from a list of data elements.
- getDataObjectDataElementData(List<DataElement>, RevisionManifestDataElementData, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to get a list of object group data elements from a list of data elements.
- getDataOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Returns data offset
- getDataOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns data offset
- getDataPath() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getDataPath() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getDate(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value of the identified Date based metadata property.
- getDate(Property) - Method in class org.apache.tika.xmp.XMPMetadata
- getDateFormatOverride() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- getDecodedValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
- getDecoration() - Method in class org.apache.tika.config.loader.FrameworkConfig
- getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
- getDecorationName() - Method in class org.apache.tika.parser.ParserDecorator
- getDecorationName() - Method in class org.apache.tika.parser.ParserDecorator.MimeFilteringDecorator
- getDectorsHTML() - Method in class org.apache.tika.server.core.resource.TikaDetectors
- getDefault() - Method in class org.apache.tika.serialization.ComponentConfig
- getDefaultComponents() - Method in class org.apache.tika.config.loader.ComponentRegistry
-
Returns all components marked as defaults.
- getDefaultConfig() - Method in class org.apache.tika.detect.magika.MagikaDetector
- getDefaultConfig() - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector
- getDefaultConfig() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getDefaultConfig() - Method in class org.apache.tika.parser.dwg.AbstractDWGParser
- getDefaultConfig() - Method in class org.apache.tika.parser.geo.topic.GeoParser
- getDefaultConfig() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
- getDefaultConfig() - Method in class org.apache.tika.parser.image.PSDParser
- getDefaultConfig() - Method in class org.apache.tika.parser.mail.RFC822Parser
- getDefaultConfig() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
- getDefaultConfig() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParser
- getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- getDefaultConfig() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRParser
- getDefaultConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
- getDefaultConfig() - Method in class org.apache.tika.parser.pkg.CompressorParser
- getDefaultConfig() - Method in class org.apache.tika.parser.strings.StringsParser
- getDefaultConfig() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- getDefaultConfig() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
- getDefaultConfig() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
- getDefaultConfig() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getDefaultContentHandlerFactory() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getDefaultInstance() - Static method in class org.apache.tika.DeleteFetcherReply
- getDefaultInstance() - Static method in class org.apache.tika.DeleteFetcherRequest
- getDefaultInstance() - Static method in class org.apache.tika.DeletePipesIteratorReply
- getDefaultInstance() - Static method in class org.apache.tika.DeletePipesIteratorRequest
- getDefaultInstance() - Static method in class org.apache.tika.FetchAndParseReply
- getDefaultInstance() - Static method in class org.apache.tika.FetchAndParseRequest
- getDefaultInstance() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- getDefaultInstance() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- getDefaultInstance() - Static method in class org.apache.tika.GetFetcherReply
- getDefaultInstance() - Static method in class org.apache.tika.GetFetcherRequest
- getDefaultInstance() - Static method in class org.apache.tika.GetPipesIteratorReply
- getDefaultInstance() - Static method in class org.apache.tika.GetPipesIteratorRequest
- getDefaultInstance() - Static method in class org.apache.tika.ListFetchersReply
- getDefaultInstance() - Static method in class org.apache.tika.ListFetchersRequest
- getDefaultInstance() - Static method in class org.apache.tika.SaveFetcherReply
- getDefaultInstance() - Static method in class org.apache.tika.SaveFetcherRequest
- getDefaultInstance() - Static method in class org.apache.tika.SavePipesIteratorReply
- getDefaultInstance() - Static method in class org.apache.tika.SavePipesIteratorRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.DeleteFetcherReply
- getDefaultInstanceForType() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.DeleteFetcherRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.DeletePipesIteratorReply
- getDefaultInstanceForType() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.DeletePipesIteratorRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.FetchAndParseReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.FetchAndParseReply
- getDefaultInstanceForType() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.FetchAndParseRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherReply
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.GetFetcherRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.GetPipesIteratorReply
- getDefaultInstanceForType() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.GetPipesIteratorRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.ListFetchersReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.ListFetchersReply
- getDefaultInstanceForType() - Method in class org.apache.tika.ListFetchersRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.ListFetchersRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.SaveFetcherReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.SaveFetcherReply
- getDefaultInstanceForType() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.SaveFetcherRequest
- getDefaultInstanceForType() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.SavePipesIteratorReply
- getDefaultInstanceForType() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- getDefaultInstanceForType() - Method in class org.apache.tika.SavePipesIteratorRequest
- getDefaultLanguageDetector() - Static method in class org.apache.tika.language.detect.LanguageDetector
- getDefaultMarkerName() - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
- getDefaultMetadataFilter() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getDefaultMetadataWriteLimiterFactory() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getDefaultMimeTypes() - Static method in class org.apache.tika.mime.MimeTypes
-
Get the default MimeTypes.
- getDefaultMimeTypes(ClassLoader) - Static method in class org.apache.tika.mime.MimeTypes
-
Get the default MimeTypes.
- getDefaultRegistry() - Static method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the built-in media type registry included in Tika.
- getDefaultTimeZone() - Method in class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
- getDelegateParser(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
Returns the parser instance to which parsing tasks should be delegated.
- getDelegatingParser() - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- getDeleteFetcherMethod() - Static method in class org.apache.tika.TikaGrpc
- getDeletePipesIteratorMethod() - Static method in class org.apache.tika.TikaGrpc
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
- getDelimiterToNameMap() - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
- getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getDepth() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the current group nesting depth.
- getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getDepth() - Method in class org.apache.tika.parser.ParseRecord
- getDescription() - Method in class org.apache.tika.mime.MimeType
-
Returns the description of this media type.
- getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the description, if present
- getDescription() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- getDescriptor() - Static method in class org.apache.tika.DeleteFetcherReply.Builder
- getDescriptor() - Static method in class org.apache.tika.DeleteFetcherReply
- getDescriptor() - Static method in class org.apache.tika.DeleteFetcherRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.DeleteFetcherRequest
- getDescriptor() - Static method in class org.apache.tika.DeletePipesIteratorReply.Builder
- getDescriptor() - Static method in class org.apache.tika.DeletePipesIteratorReply
- getDescriptor() - Static method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.DeletePipesIteratorRequest
- getDescriptor() - Static method in class org.apache.tika.FetchAndParseReply.Builder
- getDescriptor() - Static method in class org.apache.tika.FetchAndParseReply
- getDescriptor() - Static method in class org.apache.tika.FetchAndParseRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.FetchAndParseRequest
- getDescriptor() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- getDescriptor() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- getDescriptor() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- getDescriptor() - Static method in class org.apache.tika.GetFetcherReply.Builder
- getDescriptor() - Static method in class org.apache.tika.GetFetcherReply
- getDescriptor() - Static method in class org.apache.tika.GetFetcherRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.GetFetcherRequest
- getDescriptor() - Static method in class org.apache.tika.GetPipesIteratorReply.Builder
- getDescriptor() - Static method in class org.apache.tika.GetPipesIteratorReply
- getDescriptor() - Static method in class org.apache.tika.GetPipesIteratorRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.GetPipesIteratorRequest
- getDescriptor() - Static method in class org.apache.tika.ListFetchersReply.Builder
- getDescriptor() - Static method in class org.apache.tika.ListFetchersReply
- getDescriptor() - Static method in class org.apache.tika.ListFetchersRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.ListFetchersRequest
- getDescriptor() - Static method in class org.apache.tika.SaveFetcherReply.Builder
- getDescriptor() - Static method in class org.apache.tika.SaveFetcherReply
- getDescriptor() - Static method in class org.apache.tika.SaveFetcherRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.SaveFetcherRequest
- getDescriptor() - Static method in class org.apache.tika.SavePipesIteratorReply.Builder
- getDescriptor() - Static method in class org.apache.tika.SavePipesIteratorReply
- getDescriptor() - Static method in class org.apache.tika.SavePipesIteratorRequest.Builder
- getDescriptor() - Static method in class org.apache.tika.SavePipesIteratorRequest
- getDescriptor() - Static method in class org.apache.tika.TikaProto
- getDescriptorForType() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.FetchAndParseReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.GetFetcherReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.GetFetcherRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.ListFetchersReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.ListFetchersRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.SaveFetcherReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- getDescriptorForType() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- getDescriptorForType() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Deprecated. This API is ICU internal only.
- getDetectionContentLength(Metadata) - Static method in class org.apache.tika.detect.DetectHelper
-
Gets the number of bytes buffered for detection.
- getDetector() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getDetector() - Method in class org.apache.tika.language.detect.LanguageHandler
-
Returns the language detector used by this content handler.
- getDetector() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Returns the language detector used by this writer.
- getDetector() - Method in class org.apache.tika.parser.AutoDetectParser
-
Returns the type detector used by this parser to auto-detect the type of a document.
- getDetector() - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
- getDetector() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getDetector() - Method in class org.apache.tika.Tika
-
Returns the detector instance used by this facade.
- getDetectorName() - Method in class org.apache.tika.detect.EncodingDetectorContext.Result
- getDetectors() - Method in class org.apache.tika.detect.CompositeDetector
-
Returns the component detectors.
- getDetectors() - Method in class org.apache.tika.detect.CompositeEncodingDetector
- getDetectors() - Method in class org.apache.tika.detect.DefaultDetector
- getDetectors() - Method in class org.apache.tika.detect.DefaultProbDetector
- getDetectorsJSON() - Method in class org.apache.tika.server.core.resource.TikaDetectors
- getDetectorsPlain() - Method in class org.apache.tika.server.core.resource.TikaDetectors
- getDiceCoefficient() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
- getDigest() - Method in class org.apache.tika.server.core.TikaServerConfig
-
Digest configuration string, e.g. md5 or sha256, optionally with a 16 or 32 encoding; for example, md5:32,sha256:16 would result in two digests per file.
- getDigestMarkLimit() - Method in class org.apache.tika.server.core.TikaServerConfig
- getDigests() - Method in class org.apache.tika.parser.digestutils.BouncyCastleDigesterFactory
- getDigests() - Method in class org.apache.tika.parser.digestutils.CommonsDigesterFactory
- getDir_uuid() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns directory uuid
- getDirectoryListingEntryList() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Returns chm directory listing entry list
- getDirLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns directory length
- getDirOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns directory offset
- getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the disc this belongs to, within the set
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have disc numbers, so returns null.
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getDistributionEntropy() - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Returns the Shannon entropy (in bits) of the probability distribution from the most recent CharSoupLanguageDetector.detectAll() call, or Float.NaN if detectAll() has not been called since the last CharSoupLanguageDetector.reset().
- getDocumentBuilder() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder specified in this parsing context.
- getDocumentBuilder(ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder specified in this parsing context.
- getDocumentBuilderFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder factory specified in this parsing context.
- getDominantScript() - Method in class org.apache.tika.quality.TextQualityScore
-
Name of the dominant Unicode script detected, e.g.
- getDpi() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getDpi() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getDpi() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getDpi() - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- getDPI(ParseContext) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- getDropThreshold() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the duration in milliseconds.
- getDwgReadExecutable() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- getDwgReadTimeout() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- getEffectiveMaxStringLength() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the effective maxStringLength, using the default of 64000 if not set or 0.
- getEmbeddedCount() - Method in class org.apache.tika.parser.ParseRecord
-
Gets the current count of embedded documents processed.
- getEmbeddedDocumentExtractor(ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.
- getEmbeddedFileFieldNameOrDefault() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getEmbeddedFiles() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns information about all embedded files.
- getEmbeddedFiles() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Returns information about all embedded files stored.
- getEmbeddedIdPrefix() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- getEmbeddedLimits() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the embedded limits configuration.
- getEmbeddedPartMetadataMap() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- getEmbeddedPartMetadataMap() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- getEmbeddedPartMetadataMap() - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
- getEmbeddingsPath() - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
- getEmbeddingsPath() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getEmfRelationshipId() - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- getEmitDataQueue(int) - Method in class org.apache.tika.server.core.resource.AsyncResource
- getEmitKey() - Method in interface org.apache.tika.pipes.api.emitter.EmitData
- getEmitKey() - Method in class org.apache.tika.pipes.api.emitter.EmitKey
- getEmitKey() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- getEmitKey() - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
- getEmitKey(String, int, UnpackConfig, Metadata) - Method in class org.apache.tika.pipes.core.extractor.AbstractUnpackHandler
- getEmitKeyBase() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- getEmitKeyColumn() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- getEmitKeyColumn() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getEmitMax() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getEmitMaxEstimatedBytes() - Method in class org.apache.tika.pipes.core.PipesConfig
-
When the emit queue hits this estimated size (sum of estimated extract sizes), emit the batch.
- getEmitStrategy() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides
- getEmitStrategy() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Get the emit strategy configuration.
- getEmitStrategy() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getEmittedCommentIds() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
Returns the set of comment IDs that were inlined during parsing.
- getEmitter() - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Convenience method that returns an emitter if only one emitter is configured.
- getEmitter() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- getEmitter(String) - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Gets an emitter by ID, lazily instantiating it if needed.
- getEmitterId() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the emitter to use (optional).
- getEmitterId() - Method in class org.apache.tika.FetchAndParseRequest
-
The ID of the emitter to use (optional).
- getEmitterId() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
The ID of the emitter to use (optional).
- getEmitterId() - Method in class org.apache.tika.pipes.api.emitter.EmitKey
- getEmitterId() - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- getEmitterIdBytes() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the emitter to use (optional).
- getEmitterIdBytes() - Method in class org.apache.tika.FetchAndParseRequest
-
The ID of the emitter to use (optional).
- getEmitterIdBytes() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
The ID of the emitter to use (optional).
- getEmitterManager() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getEmitters() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides
- getEmitWithinMillis() - Method in class org.apache.tika.pipes.core.PipesConfig
- getEncint() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getEncoding() - Method in class org.apache.tika.digest.DigestDef
- getEncoding() - Method in class org.apache.tika.example.ImportContextImpl
- getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the character encoding of the strings that are to be found.
- getEncodingDetector() - Method in class org.apache.tika.config.loader.LoaderContext
-
Get the EncodingDetector for injection into parsers.
- getEncodingDetector() - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
- getEncodingDetector(ParseContext) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
-
Look for an EncodingDetector in the ParseContext.
- getEncodingDetector(ParseContext) - Method in class org.apache.tika.parser.html.JSoupParser
-
Look for an EncodingDetector in the ParseContext.
- getEncodingResults() - Method in class org.apache.tika.detect.EncodingDetectorContext.Result
-
All ranked results from this detector, highest confidence first.
- getEndBlock() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Returns the end block index
- getEndEofOffset() - Method in class org.apache.tika.parser.pdf.updates.StartXRefOffset
- getEndMs() - Method in class org.apache.tika.inference.locator.TemporalLocator
- getEndOffset() - Method in class org.apache.tika.inference.Chunk
-
Convenience: returns the end offset from the first TextLocator, or -1 if none.
- getEndOffset() - Method in class org.apache.tika.inference.locator.TextLocator
- getEndOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Returns the end offset index
- getEndpoint() - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- getEndpoint() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- getEndpointConfigurationService() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getEndpointConfigurationService() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getEndpoints() - Method in class org.apache.tika.server.core.TikaServerConfig
- getEnqueued() - Method in class org.apache.tika.pipes.core.pipesiterator.CallablePipesIterator
- getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Gets a set of entity types whose names are recognisable by this
- getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- getEntriesToCopy() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- getEntropy() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- getEntryEncoding() - Method in class org.apache.tika.parser.pkg.ZipParserConfig
- getEntryType() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
-
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
- getErrorLogFile() - Method in class org.apache.tika.eval.app.EvalConfig
- getErrorMessage() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
If there was an error, this will contain the error message.
- getErrorMessage() - Method in class org.apache.tika.FetchAndParseReply
-
If there was an error, this will contain the error message.
- getErrorMessage() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
If there was an error, this will contain the error message.
- getErrorMessageBytes() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
If there was an error, this will contain the error message.
- getErrorMessageBytes() - Method in class org.apache.tika.FetchAndParseReply
-
If there was an error, this will contain the error message.
- getErrorMessageBytes() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
If there was an error, this will contain the error message.
- getEstimatedSizeBytes() - Method in interface org.apache.tika.pipes.api.emitter.EmitData
- getEstimatedSizeBytes() - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
- getExceptions() - Method in class org.apache.tika.parser.ParseRecord
- getExceptions() - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- getExclude() - Method in class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
- getExcludedCipherSuites() - Method in class org.apache.tika.server.core.TlsConfig
- getExcludedClasses() - Method in class org.apache.tika.detect.DefaultDetector
-
Returns the classes that were explicitly excluded when constructing this detector.
- getExcludedClasses() - Method in class org.apache.tika.parser.DefaultParser
-
Returns the classes that were explicitly excluded when constructing this parser.
- getExcludedClasses(DefaultDetector) - Method in class org.apache.tika.serialization.serdes.DefaultDetectorSerializer
- getExcludedClasses(DefaultParser) - Method in class org.apache.tika.serialization.serdes.DefaultParserSerializer
- getExcludedClasses(T) - Method in class org.apache.tika.serialization.serdes.SpiCompositeSerializer
-
Get the excluded classes from the composite instance.
- getExcludedProtocols() - Method in class org.apache.tika.server.core.TlsConfig
- getExcludeEmbeddedResourceTypes() - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- getExcludeFields() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- getExcludeMimeTypes() - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- getExcludeTypes() - Method in class org.apache.tika.parser.ParserDecorator.MimeFilteringDecorator
- getExitCode() - Method in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
-
Returns the exit code the server should use when exiting due to this condition, or empty if this message type does not trigger an exit.
- getExitValue() - Method in class org.apache.tika.utils.FileProcessResult
- getExpiresInSeconds() - Method in class org.apache.tika.pipes.fetcher.http.jwt.JwtCreds
- getExtendedGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
- getExtension() - Method in class org.apache.tika.mime.MimeType
-
Returns the preferred file extension of this type, or an empty string if no extensions are known.
- getExtension() - Method in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- getExtension(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getExtensionConfig() - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- getExtensionConfig() - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- getExtensionConfig() - Method in class org.apache.tika.pipes.core.fetcher.EmptyFetcher
- getExtensionConfig() - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- getExtensionConfig() - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- getExtensionConfig() - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- getExtensionConfig() - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- getExtensionConfig() - Method in class org.apache.tika.plugins.AbstractTikaExtension
- getExtensionConfig() - Method in interface org.apache.tika.plugins.TikaExtension
- getExtensionForMediaType(String) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getExtensions() - Method in class org.apache.tika.mime.MimeType
-
Returns the list of all known file extensions of this media type.
- getFactories(PluginManager) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
- getFactoryClass() - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Returns the factory class for this component type.
- getFactoryClass() - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
- getFactoryClass() - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
- getFailCountField() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getFallback() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the fallback parser.
- getFeatureFlags() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getFeatureFlags() - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
-
Returns the bitmask of CharSoupModel FLAG_* constants that describes which feature types this extractor emits.
- getFeatureFlags() - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- getFeatureFlags() - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- getFeatureFlags() - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- getFetchAndParseBiDirectionalStreamingMethod() - Static method in class org.apache.tika.TikaGrpc
- getFetchAndParseMethod() - Static method in class org.apache.tika.TikaGrpc
- getFetchAndParseServerSideStreamingMethod() - Static method in class org.apache.tika.TikaGrpc
- getFetchEmitQueue(int) - Method in class org.apache.tika.server.core.resource.AsyncResource
- getFetcher() - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Convenience method that returns a fetcher if only one fetcher is configured.
- getFetcher(String) - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Gets a fetcher by ID, lazily instantiating it if needed.
- getFetcher(GetFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Get a fetcher's data from the fetcher store.
- getFetcher(GetFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Get a fetcher's data from the fetcher store.
- getFetcher(GetFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Get a fetcher's data from the fetcher store.
- getFetcher(GetFetcherRequest, StreamObserver<GetFetcherReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Get a fetcher's data from the fetcher store.
- getFetcher(GetFetcherRequest, StreamObserver<GetFetcherReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Get a fetcher's data from the fetcher store.
- getFetcherClass() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
-
The full java class name of the fetcher config for which to fetch json schema.
- getFetcherClass() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
-
The full java class name of the fetcher config for which to fetch json schema.
- getFetcherClass() - Method in interface org.apache.tika.GetFetcherConfigJsonSchemaRequestOrBuilder
-
The full java class name of the fetcher config for which to fetch json schema.
- getFetcherClass() - Method in class org.apache.tika.GetFetcherReply.Builder
-
The full Java class name of the Fetcher.
- getFetcherClass() - Method in class org.apache.tika.GetFetcherReply
-
The full Java class name of the Fetcher.
- getFetcherClass() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The full Java class name of the Fetcher.
- getFetcherClass() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
The full java class name of the fetcher class.
- getFetcherClass() - Method in class org.apache.tika.SaveFetcherRequest
-
The full java class name of the fetcher class.
- getFetcherClass() - Method in interface org.apache.tika.SaveFetcherRequestOrBuilder
-
The full java class name of the fetcher class.
- getFetcherClassBytes() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
-
The full java class name of the fetcher config for which to fetch json schema.
- getFetcherClassBytes() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
-
The full java class name of the fetcher config for which to fetch json schema.
- getFetcherClassBytes() - Method in interface org.apache.tika.GetFetcherConfigJsonSchemaRequestOrBuilder
-
The full java class name of the fetcher config for which to fetch json schema.
- getFetcherClassBytes() - Method in class org.apache.tika.GetFetcherReply.Builder
-
The full Java class name of the Fetcher.
- getFetcherClassBytes() - Method in class org.apache.tika.GetFetcherReply
-
The full Java class name of the Fetcher.
- getFetcherClassBytes() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The full Java class name of the Fetcher.
- getFetcherClassBytes() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
The full java class name of the fetcher class.
- getFetcherClassBytes() - Method in class org.apache.tika.SaveFetcherRequest
-
The full java class name of the fetcher class.
- getFetcherClassBytes() - Method in interface org.apache.tika.SaveFetcherRequestOrBuilder
-
The full java class name of the fetcher class.
- getFetcherConfigJson() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
JSON string of the fetcher config object.
- getFetcherConfigJson() - Method in class org.apache.tika.SaveFetcherRequest
-
JSON string of the fetcher config object.
- getFetcherConfigJson() - Method in interface org.apache.tika.SaveFetcherRequestOrBuilder
-
JSON string of the fetcher config object.
- getFetcherConfigJsonBytes() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
JSON string of the fetcher config object.
- getFetcherConfigJsonBytes() - Method in class org.apache.tika.SaveFetcherRequest
-
JSON string of the fetcher config object.
- getFetcherConfigJsonBytes() - Method in interface org.apache.tika.SaveFetcherRequestOrBuilder
-
JSON string of the fetcher config object.
- getFetcherConfigJsonSchema() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
-
The JSON schema that describes the fetcher config, in string format.
- getFetcherConfigJsonSchema() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
-
The JSON schema that describes the fetcher config, in string format.
- getFetcherConfigJsonSchema() - Method in interface org.apache.tika.GetFetcherConfigJsonSchemaReplyOrBuilder
-
The JSON schema that describes the fetcher config, in string format.
- getFetcherConfigJsonSchema(GetFetcherConfigJsonSchemaRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Get the Fetcher Config schema for a given fetcher class.
- getFetcherConfigJsonSchema(GetFetcherConfigJsonSchemaRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Get the Fetcher Config schema for a given fetcher class.
- getFetcherConfigJsonSchema(GetFetcherConfigJsonSchemaRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Get the Fetcher Config schema for a given fetcher class.
- getFetcherConfigJsonSchema(GetFetcherConfigJsonSchemaRequest, StreamObserver<GetFetcherConfigJsonSchemaReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Get the Fetcher Config schema for a given fetcher class.
- getFetcherConfigJsonSchema(GetFetcherConfigJsonSchemaRequest, StreamObserver<GetFetcherConfigJsonSchemaReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Get the Fetcher Config schema for a given fetcher class.
- getFetcherConfigJsonSchemaBytes() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
-
The JSON schema that describes the fetcher config, in string format.
- getFetcherConfigJsonSchemaBytes() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
-
The JSON schema that describes the fetcher config, in string format.
- getFetcherConfigJsonSchemaBytes() - Method in interface org.apache.tika.GetFetcherConfigJsonSchemaReplyOrBuilder
-
The JSON schema that describes the fetcher config, in string format.
- GetFetcherConfigJsonSchemaReply - Class in org.apache.tika
-
Protobuf type tika.GetFetcherConfigJsonSchemaReply
- GetFetcherConfigJsonSchemaReply.Builder - Class in org.apache.tika
-
Protobuf type tika.GetFetcherConfigJsonSchemaReply
- GetFetcherConfigJsonSchemaReplyOrBuilder - Interface in org.apache.tika
- GetFetcherConfigJsonSchemaRequest - Class in org.apache.tika
-
Protobuf type tika.GetFetcherConfigJsonSchemaRequest
- GetFetcherConfigJsonSchemaRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.GetFetcherConfigJsonSchemaRequest
- GetFetcherConfigJsonSchemaRequestOrBuilder - Interface in org.apache.tika
- getFetcherId() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
-
ID of the fetcher to delete.
- getFetcherId() - Method in class org.apache.tika.DeleteFetcherRequest
-
ID of the fetcher to delete.
- getFetcherId() - Method in interface org.apache.tika.DeleteFetcherRequestOrBuilder
-
ID of the fetcher to delete.
- getFetcherId() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- getFetcherId() - Method in class org.apache.tika.FetchAndParseRequest
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- getFetcherId() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- getFetcherId() - Method in class org.apache.tika.GetFetcherReply.Builder
-
Echoes the ID of the fetcher being returned.
- getFetcherId() - Method in class org.apache.tika.GetFetcherReply
-
Echoes the ID of the fetcher being returned.
- getFetcherId() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
Echoes the ID of the fetcher being returned.
- getFetcherId() - Method in class org.apache.tika.GetFetcherRequest.Builder
-
ID of the fetcher for which to return config.
- getFetcherId() - Method in class org.apache.tika.GetFetcherRequest
-
ID of the fetcher for which to return config.
- getFetcherId() - Method in interface org.apache.tika.GetFetcherRequestOrBuilder
-
ID of the fetcher for which to return config.
- getFetcherId() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- getFetcherId() - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- getFetcherId() - Method in class org.apache.tika.SaveFetcherReply.Builder
-
The fetcher_id that was saved.
- getFetcherId() - Method in class org.apache.tika.SaveFetcherReply
-
The fetcher_id that was saved.
- getFetcherId() - Method in interface org.apache.tika.SaveFetcherReplyOrBuilder
-
The fetcher_id that was saved.
- getFetcherId() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
A unique identifier for each fetcher.
- getFetcherId() - Method in class org.apache.tika.SaveFetcherRequest
-
A unique identifier for each fetcher.
- getFetcherId() - Method in interface org.apache.tika.SaveFetcherRequestOrBuilder
-
A unique identifier for each fetcher.
- getFetcherIdBytes() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
-
ID of the fetcher to delete.
- getFetcherIdBytes() - Method in class org.apache.tika.DeleteFetcherRequest
-
ID of the fetcher to delete.
- getFetcherIdBytes() - Method in interface org.apache.tika.DeleteFetcherRequestOrBuilder
-
ID of the fetcher to delete.
- getFetcherIdBytes() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- getFetcherIdBytes() - Method in class org.apache.tika.FetchAndParseRequest
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- getFetcherIdBytes() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- getFetcherIdBytes() - Method in class org.apache.tika.GetFetcherReply.Builder
-
Echoes the ID of the fetcher being returned.
- getFetcherIdBytes() - Method in class org.apache.tika.GetFetcherReply
-
Echoes the ID of the fetcher being returned.
- getFetcherIdBytes() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
Echoes the ID of the fetcher being returned.
- getFetcherIdBytes() - Method in class org.apache.tika.GetFetcherRequest.Builder
-
ID of the fetcher for which to return config.
- getFetcherIdBytes() - Method in class org.apache.tika.GetFetcherRequest
-
ID of the fetcher for which to return config.
- getFetcherIdBytes() - Method in interface org.apache.tika.GetFetcherRequestOrBuilder
-
ID of the fetcher for which to return config.
- getFetcherIdBytes() - Method in class org.apache.tika.SaveFetcherReply.Builder
-
The fetcher_id that was saved.
- getFetcherIdBytes() - Method in class org.apache.tika.SaveFetcherReply
-
The fetcher_id that was saved.
- getFetcherIdBytes() - Method in interface org.apache.tika.SaveFetcherReplyOrBuilder
-
The fetcher_id that was saved.
- getFetcherIdBytes() - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
A unique identifier for each fetcher.
- getFetcherIdBytes() - Method in class org.apache.tika.SaveFetcherRequest
-
A unique identifier for each fetcher.
- getFetcherIdBytes() - Method in interface org.apache.tika.SaveFetcherRequestOrBuilder
-
A unique identifier for each fetcher.
- getFetcherManager() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getFetcherName() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the fetcher name used for file system fetching.
- GetFetcherReply - Class in org.apache.tika
-
Protobuf type tika.GetFetcherReply
- GetFetcherReply.Builder - Class in org.apache.tika
-
Protobuf type tika.GetFetcherReply
- GetFetcherReplyOrBuilder - Interface in org.apache.tika
- GetFetcherRequest - Class in org.apache.tika
-
Protobuf type tika.GetFetcherRequest
- GetFetcherRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.GetFetcherRequest
- GetFetcherRequestOrBuilder - Interface in org.apache.tika
- getFetchers() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides
- getFetchKey() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Echoes the fetch_key that was sent in the request.
- getFetchKey() - Method in class org.apache.tika.FetchAndParseReply
-
Echoes the fetch_key that was sent in the request.
- getFetchKey() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Echoes the fetch_key that was sent in the request.
- getFetchKey() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The "Fetch Key" of the item that will be fetched.
- getFetchKey() - Method in class org.apache.tika.FetchAndParseRequest
-
The "Fetch Key" of the item that will be fetched.
- getFetchKey() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
The "Fetch Key" of the item that will be fetched.
- getFetchKey() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- getFetchKey() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- getFetchKeyBytes() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Echoes the fetch_key that was sent in the request.
- getFetchKeyBytes() - Method in class org.apache.tika.FetchAndParseReply
-
Echoes the fetch_key that was sent in the request.
- getFetchKeyBytes() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Echoes the fetch_key that was sent in the request.
- getFetchKeyBytes() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The "Fetch Key" of the item that will be fetched.
- getFetchKeyBytes() - Method in class org.apache.tika.FetchAndParseRequest
-
The "Fetch Key" of the item that will be fetched.
- getFetchKeyBytes() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
The "Fetch Key" of the item that will be fetched.
- getFetchKeyColumn() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- getFetchKeyColumn() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getFetchKeyRangeEndColumn() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getFetchKeyRangeStartColumn() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getFetchSize() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getFields() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Deprecated.
- getFields() - Method in class org.apache.tika.FetchAndParseReply
-
Deprecated.
- getFields() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Deprecated.
- getFieldsCount() - Method in class org.apache.tika.FetchAndParseReply.Builder
- getFieldsCount() - Method in class org.apache.tika.FetchAndParseReply
- getFieldsCount() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Metadata fields from the parse output.
- getFieldsMap() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- getFieldsMap() - Method in class org.apache.tika.FetchAndParseReply
-
Metadata fields from the parse output.
- getFieldsMap() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Metadata fields from the parse output.
- getFieldsOrDefault(String, String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- getFieldsOrDefault(String, String) - Method in class org.apache.tika.FetchAndParseReply
-
Metadata fields from the parse output.
- getFieldsOrDefault(String, String) - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Metadata fields from the parse output.
- getFieldsOrThrow(String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- getFieldsOrThrow(String) - Method in class org.apache.tika.FetchAndParseReply
-
Metadata fields from the parse output.
- getFieldsOrThrow(String) - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
Metadata fields from the parse output.
- getFile() - Method in class org.apache.tika.io.TikaInputStream
- getFileChannel() - Method in class org.apache.tika.io.TikaInputStream
- getFileExtension() - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- getFileExtensionOrDefault() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
- getFileLength(Path) - Method in class org.apache.tika.eval.app.ProfilerBase
- getFileName() - Method in class org.apache.tika.server.core.TaskStatus
- getFileNamePattern() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getFilePath() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the path to the "file" command.
- getFilesProcessed() - Method in class org.apache.tika.pipes.core.PipesClient
- getFilesProcessed() - Method in class org.apache.tika.server.core.ServerStatus
-
Returns the total number of tasks started since server startup.
- getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getFilteredStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Simple util to get stack trace.
- getFilters() - Method in class org.apache.tika.metadata.filter.CompositeMetadataFilter
- getFilters() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
- getFontToCharset() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the font-to-charset mapping table.
- getForkedJvmArgs() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.PipesConfigOverride
- getForkedJvmArgs() - Method in class org.apache.tika.pipes.core.PipesConfig
- getFormat() - Method in class org.apache.tika.language.translate.impl.YandexTranslator
-
Retrieve the current text format setting.
- getFormatName() - Method in enum class org.apache.tika.parser.pdf.OcrConfig.ImageFormat
- getFormattedNumber(BigInteger, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
- getFormattedNumber(Paragraph) - Method in class org.apache.tika.parser.microsoft.ListManager
-
Get the formatted number for a given paragraph
- getFormattedNumber(XWPFParagraph) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
- getFramesRead() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getFreeSpace() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
-
Returns the PMGI free space.
- getFreeSpace() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- getFriendlyName(Class<?>) - Method in class org.apache.tika.config.loader.ComponentRegistry
-
Looks up a component's friendly name by its class.
- getFriendlyName(Class<?>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Gets the friendly name for a class, or null if not registered.
- getFrom() - Method in class org.apache.tika.renderer.PageRangeRequest
- getFromContainer(Object, long, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
- getFullName() - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
- getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getGeoPointFieldName() - Method in class org.apache.tika.metadata.filter.GeoPointMetadataFilter
- getGetFetcherConfigJsonSchemaMethod() - Static method in class org.apache.tika.TikaGrpc
- getGetFetcherMethod() - Static method in class org.apache.tika.TikaGrpc
- getGetFetcherReplies(int) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherReplies(int) - Method in class org.apache.tika.ListFetchersReply
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherReplies(int) - Method in interface org.apache.tika.ListFetchersReplyOrBuilder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesBuilder(int) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesBuilderList() - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesCount() - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesCount() - Method in class org.apache.tika.ListFetchersReply
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesCount() - Method in interface org.apache.tika.ListFetchersReplyOrBuilder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesList() - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesList() - Method in class org.apache.tika.ListFetchersReply
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesList() - Method in interface org.apache.tika.ListFetchersReplyOrBuilder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesOrBuilder(int) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesOrBuilder(int) - Method in class org.apache.tika.ListFetchersReply
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesOrBuilder(int) - Method in interface org.apache.tika.ListFetchersReplyOrBuilder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesOrBuilderList() - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesOrBuilderList() - Method in class org.apache.tika.ListFetchersReply
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetFetcherRepliesOrBuilderList() - Method in interface org.apache.tika.ListFetchersReplyOrBuilder
-
List of fetcher configs returned by the Lists Fetchers service.
- getGetPipesIteratorMethod() - Static method in class org.apache.tika.TikaGrpc
- getGlobalCharset() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the global charset (\ansicpgN).
- getGlobalSettings() - Method in class org.apache.tika.config.loader.TikaLoader
-
Gets the global settings if they have been loaded.
- getGroupId() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getGroupInitialRebalanceDelayMs() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
- getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- getGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- getHadStarted() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getHeader_len() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns the header length.
- getHeaderLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns the ITSF header length.
- getHeaders() - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- getHeaders() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- getHealthCheckUrl(VLMOCRConfig) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getHealthCheckUrl(VLMOCRConfig) - Method in class org.apache.tika.parser.vlm.ClaudeVLMParser
- getHealthCheckUrl(VLMOCRConfig) - Method in class org.apache.tika.parser.vlm.GeminiVLMParser
- getHealthCheckUrl(VLMOCRConfig) - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- getHeartbeatIntervalMs() - Method in class org.apache.tika.pipes.core.PipesConfig
- getHexValue() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- getHlinkClickUrl() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- getHost() - Method in class org.apache.tika.server.core.TikaServerConfig
- getHtml(InputStream, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return HTML content.
- getHttpClientFactory() - Method in class org.apache.tika.server.client.TikaServerClientConfig
- getHttpFetcherConfig() - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- getHttpHeaders() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getHttpHeaders() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getHttpRequestHeaders() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getHttpRequestHeaders() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getId() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- getId() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.EmitterOverride
- getId() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.FetcherOverride
- getId() - Method in class org.apache.tika.renderer.RenderResult
- getId() - Method in class org.apache.tika.server.core.TikaServerConfig
- getId(String) - Method in class org.apache.tika.eval.app.db.DBBuffer
- getIdBase() - Method in class org.apache.tika.server.core.TikaServerConfig
- getIdColumn() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- getIdColumn() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getIdentifier() - Method in class org.apache.tika.sax.StandardReference
- getIdField() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getIdFieldOrDefault() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getIds() - Method in interface org.apache.tika.extractor.UnpackHandler
- getIds() - Method in class org.apache.tika.pipes.core.extractor.AbstractUnpackHandler
- getIgniteInstanceName() - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- getIgnoreCharsets() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- getIlvl() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- getImageFormat() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getImageFormatName(ParseContext) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- getImageGraphicsEngineFactory() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getImageMagickProg() - Static method in class org.apache.tika.parser.ocr.TesseractOCRParser
- getImageQuality() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getImageStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getImageType() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getImageType(ParseContext) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- getImportRoot() - Method in class org.apache.tika.example.ImportContextImpl
- getInclude() - Method in class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
- getIncludedCipherSuites() - Method in class org.apache.tika.server.core.TlsConfig
- getIncludedProtocols() - Method in class org.apache.tika.server.core.TlsConfig
- getIncludeEmbeddedResourceTypes() - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- getIncludeFields() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- getIncludeMimeTypes() - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- getIncludeTypes() - Method in class org.apache.tika.parser.ParserDecorator.MimeFilteringDecorator
- getIndex() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
- getIndex_depth() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns the index depth.
- getIndex_head() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns the index head.
- getIndex_root() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns the index root.
- getIndexCopyFromStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- getIndexCopyToStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- getIndexOfContent() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getIndexOfResetData() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getIndexOfResetTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getIniBlock() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Returns the initial block index.
- getInlineBool(OneNotePropertyEnum) - Static method in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- getInputStream() - Method in class org.apache.tika.example.ImportContextImpl
-
Returns a new InputStream to the temporary file created during instantiation, or null if this context does not provide a stream.
- getInputStream() - Method in class org.apache.tika.parser.html.DataURIScheme
- getInputStream() - Method in class org.apache.tika.renderer.RenderResult
- getInputTempDirectory() - Method in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Gets the input temp directory path.
- getInstance() - Method in interface org.apache.tika.eval.core.textstats.BytesRefCalculator
- getInstance() - Method in class org.apache.tika.eval.core.textstats.TextSha256Signature
- getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- getInt(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value of the identified Integer based metadata property.
- getInt(Property) - Method in class org.apache.tika.xmp.XMPMetadata
- getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- getIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from the beginning of a byte array
- getIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from a byte array
- getIntelCurrentPossition() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getIntelFileSize() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getIntelState() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from the beginning of a byte array
- getIntLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from a byte array
- getIntVal() - Method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
- getIntVal() - Method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
- getIntVal() - Method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
- getIntVal() - Method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
- getIntVal() - Method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
- getIntValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Gets the array of ints of the identified "seq" integer metadata property.
- getIOListener() - Method in class org.apache.tika.example.ImportContextImpl
- getIssuer() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getIssuer() - Method in class org.apache.tika.pipes.fetcher.http.jwt.JwtCreds
- getIsTruncated() - Method in class org.apache.tika.utils.StreamGobbler
- getIteratorClass() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The full Java class name of the pipes iterator.
- getIteratorClass() - Method in class org.apache.tika.GetPipesIteratorReply
-
The full Java class name of the pipes iterator.
- getIteratorClass() - Method in interface org.apache.tika.GetPipesIteratorReplyOrBuilder
-
The full Java class name of the pipes iterator.
- getIteratorClass() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
The full Java class name of the pipes iterator class.
- getIteratorClass() - Method in class org.apache.tika.SavePipesIteratorRequest
-
The full Java class name of the pipes iterator class.
- getIteratorClass() - Method in interface org.apache.tika.SavePipesIteratorRequestOrBuilder
-
The full Java class name of the pipes iterator class.
- getIteratorClassBytes() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The full Java class name of the pipes iterator.
- getIteratorClassBytes() - Method in class org.apache.tika.GetPipesIteratorReply
-
The full Java class name of the pipes iterator.
- getIteratorClassBytes() - Method in interface org.apache.tika.GetPipesIteratorReplyOrBuilder
-
The full Java class name of the pipes iterator.
- getIteratorClassBytes() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
The full Java class name of the pipes iterator class.
- getIteratorClassBytes() - Method in class org.apache.tika.SavePipesIteratorRequest
-
The full Java class name of the pipes iterator class.
- getIteratorClassBytes() - Method in interface org.apache.tika.SavePipesIteratorRequestOrBuilder
-
The full Java class name of the pipes iterator class.
- getIteratorConfigJson() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
JSON string of the pipes iterator config object
- getIteratorConfigJson() - Method in class org.apache.tika.GetPipesIteratorReply
-
JSON string of the pipes iterator config object
- getIteratorConfigJson() - Method in interface org.apache.tika.GetPipesIteratorReplyOrBuilder
-
JSON string of the pipes iterator config object
- getIteratorConfigJson() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
JSON string of the pipes iterator config object.
- getIteratorConfigJson() - Method in class org.apache.tika.SavePipesIteratorRequest
-
JSON string of the pipes iterator config object.
- getIteratorConfigJson() - Method in interface org.apache.tika.SavePipesIteratorRequestOrBuilder
-
JSON string of the pipes iterator config object.
- getIteratorConfigJsonBytes() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
JSON string of the pipes iterator config object
- getIteratorConfigJsonBytes() - Method in class org.apache.tika.GetPipesIteratorReply
-
JSON string of the pipes iterator config object
- getIteratorConfigJsonBytes() - Method in interface org.apache.tika.GetPipesIteratorReplyOrBuilder
-
JSON string of the pipes iterator config object
- getIteratorConfigJsonBytes() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
JSON string of the pipes iterator config object.
- getIteratorConfigJsonBytes() - Method in class org.apache.tika.SavePipesIteratorRequest
-
JSON string of the pipes iterator config object.
- getIteratorConfigJsonBytes() - Method in interface org.apache.tika.SavePipesIteratorRequestOrBuilder
-
JSON string of the pipes iterator config object.
- getIteratorId() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
-
The pipes iterator ID to delete
- getIteratorId() - Method in class org.apache.tika.DeletePipesIteratorRequest
-
The pipes iterator ID to delete
- getIteratorId() - Method in interface org.apache.tika.DeletePipesIteratorRequestOrBuilder
-
The pipes iterator ID to delete
- getIteratorId() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The pipes iterator ID
- getIteratorId() - Method in class org.apache.tika.GetPipesIteratorReply
-
The pipes iterator ID
- getIteratorId() - Method in interface org.apache.tika.GetPipesIteratorReplyOrBuilder
-
The pipes iterator ID
- getIteratorId() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
-
The pipes iterator ID to retrieve
- getIteratorId() - Method in class org.apache.tika.GetPipesIteratorRequest
-
The pipes iterator ID to retrieve
- getIteratorId() - Method in interface org.apache.tika.GetPipesIteratorRequestOrBuilder
-
The pipes iterator ID to retrieve
- getIteratorId() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
A unique identifier for each pipes iterator.
- getIteratorId() - Method in class org.apache.tika.SavePipesIteratorRequest
-
A unique identifier for each pipes iterator.
- getIteratorId() - Method in interface org.apache.tika.SavePipesIteratorRequestOrBuilder
-
A unique identifier for each pipes iterator.
- getIteratorIdBytes() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
-
The pipes iterator ID to delete
- getIteratorIdBytes() - Method in class org.apache.tika.DeletePipesIteratorRequest
-
The pipes iterator ID to delete
- getIteratorIdBytes() - Method in interface org.apache.tika.DeletePipesIteratorRequestOrBuilder
-
The pipes iterator ID to delete
- getIteratorIdBytes() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The pipes iterator ID
- getIteratorIdBytes() - Method in class org.apache.tika.GetPipesIteratorReply
-
The pipes iterator ID
- getIteratorIdBytes() - Method in interface org.apache.tika.GetPipesIteratorReplyOrBuilder
-
The pipes iterator ID
- getIteratorIdBytes() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
-
The pipes iterator ID to retrieve
- getIteratorIdBytes() - Method in class org.apache.tika.GetPipesIteratorRequest
-
The pipes iterator ID to retrieve
- getIteratorIdBytes() - Method in interface org.apache.tika.GetPipesIteratorRequestOrBuilder
-
The pipes iterator ID to retrieve
- getIteratorIdBytes() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
A unique identifier for each pipes iterator.
- getIteratorIdBytes() - Method in class org.apache.tika.SavePipesIteratorRequest
-
A unique identifier for each pipes iterator.
- getIteratorIdBytes() - Method in interface org.apache.tika.SavePipesIteratorRequestOrBuilder
-
A unique identifier for each pipes iterator.
- getJavaName() - Method in enum class org.apache.tika.digest.DigestDef.Algorithm
-
Returns the Java Security name for this algorithm (for use with MessageDigest.getInstance()).
- getJavaPath() - Method in class org.apache.tika.pipes.core.PipesConfig
- getJCas(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new JCas appropriate for the given Analysis Engine.
- getJDBCClassName() - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
JDBC class name, e.g. org.sqlite.JDBC
- getJDBCClassName() - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getJdbcDriverClass() - Method in class org.apache.tika.eval.app.EvalConfig
- getJDBCDriverClass() - Method in class org.apache.tika.eval.app.db.H2Util
- getJDBCDriverClass() - Method in class org.apache.tika.eval.app.db.JDBCUtil
-
JDBC driver class.
- getJdbcString() - Method in class org.apache.tika.eval.app.EvalConfig
- getJson() - Method in class org.apache.tika.pipes.emitter.es.JsonResponse
- getJson() - Method in class org.apache.tika.pipes.emitter.opensearch.JsonResponse
- getJson() - Method in class org.apache.tika.pipes.ignite.ExtensionConfigDTO
- getJson() - Method in class org.apache.tika.pipes.reporter.opensearch.JsonResponse
- getJson(InputStream, HttpHeaders, String) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return JSON with metadata and specified content type.
- getJsonConfig(String) - Method in class org.apache.tika.parser.ParseContext
-
Gets a JSON configuration by component name.
- getJsonConfigs() - Method in class org.apache.tika.parser.ParseContext
-
Returns all JSON configurations for serialization.
- getJsonDefault(InputStream, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return JSON with metadata and text content.
- getJsonField() - Method in class org.apache.tika.serialization.ComponentConfig
- getJsonPath() - Method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorConfig
- getJunkDetector() - Static method in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
-
Shared JunkDetector, or null if the model failed to load.
- getJustFileName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- getJwtExpiresInSeconds() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getJwtExpiresInSeconds() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getJwtGenerator() - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- getJwtIssuer() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getJwtPrivateKeyBase64() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getJwtSecret() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getJwtSubject() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getKeepAliveOnBadKeepAliveValueMs() - Method in class org.apache.tika.client.HttpClientFactory
- getKeepAliveStrategy() - Method in class org.apache.tika.client.HttpClientFactory
- getKey() - Static method in class org.apache.tika.example.Pharmacy
- getKeyBaseStrategy() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- getKeySerializer() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getKeyStoreFile() - Method in class org.apache.tika.server.core.TlsConfig
- getKeyStorePassword() - Method in class org.apache.tika.server.core.TlsConfig
- getKeyStoreType() - Method in class org.apache.tika.server.core.TlsConfig
- getLabel() - Method in class org.apache.tika.detect.EncodingResult
-
The detector's original label for this result.
- getLabel() - Method in class org.apache.tika.inference.locator.SpatialLocator
- getLabel() - Method in class org.apache.tika.ml.chardetect.ScoredCandidate
- getLabel() - Method in class org.apache.tika.ml.Prediction
-
The predicted class label (e.g.
- getLabel(int) - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getLabel(int) - Method in class org.apache.tika.ml.chardetect.NaiveBayesBigramEncodingDetector
- getLabel(int) - Method in class org.apache.tika.ml.LinearModel
- getLabels() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getLabels() - Method in class org.apache.tika.ml.LinearModel
- getLang_id() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns language id
- getLangCode() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
- getLangId() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns language ID
- getLangs() - Method in class org.apache.tika.eval.core.tokens.CommonTokenCountManager
- getLangs() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- getLangs(String, Set<String>, Set<String>) - Static method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
This takes a language string, parses it and then bins individual langs into valid or invalid based on regexes against the language codes
- getLangTokens(String) - Method in class org.apache.tika.eval.core.tokens.CommonTokenCountManager
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageHandler
-
Returns the detected language based on text handled thus far.
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageResult
-
The ISO 639-1 language code (plus optional country code)
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Returns the detected language based on text written thus far.
- getLanguage() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the language, if present
- getLanguage() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getLanguage() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getLanguage() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the ISO code for the language of the detected charset.
- getLanguage(long) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Returns textual representation of LangID
- getLanguageDetectors() - Static method in class org.apache.tika.language.detect.LanguageDetector
- getLanguageDetectors(ServiceLoader) - Static method in class org.apache.tika.language.detect.LanguageDetector
- getLastClosedGroup() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns the group state that was just closed on the most recent GROUP_CLOSE.
- getLastModified() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns last modified date of the chm file
- getLastProgressMillis() - Method in class org.apache.tika.config.TikaProgressTracker
-
Returns the epoch millis of the last progress update.
- getLatitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- getLayer() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the audio layer code.
- getLeafRenderer(MediaType) - Method in class org.apache.tika.renderer.CompositeRenderer
- getLeft() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getLeft() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- getLength() - Method in class org.apache.tika.detect.MagicDetector
- getLength() - Method in class org.apache.tika.io.TikaInputStream
- getLength() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- getLength() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the frame length in bytes.
- getLength() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
- getLengthTreeLengtsTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getLengthTreeTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getLimit() - Method in exception org.apache.tika.exception.EmbeddedLimitReachedException
- getLimitType() - Method in exception org.apache.tika.exception.EmbeddedLimitReachedException
- getLines() - Method in class org.apache.tika.utils.StreamGobbler
- getLinks() - Method in class org.apache.tika.mime.MimeType
-
Get a list of links to help document this mime type
- getLinks() - Method in class org.apache.tika.sax.LinkContentHandler
-
Returns the list of collected links.
- getListFetchersMethod() - Static method in class org.apache.tika.TikaGrpc
- getLoader() - Method in class org.apache.tika.config.ServiceLoader
- getLocations(List<String>) - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Calls API of lucene-geo-gazetteer to search location name in gazetteer.
- getLocators() - Method in class org.apache.tika.inference.Chunk
- getLogit(String) - Method in class org.apache.tika.ml.chardetect.SpecialistOutput
-
Raw logit for label, or null if not covered.
- getLogLevel() - Method in class org.apache.tika.server.core.TikaServerConfig
- getLongitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- getLongLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE long value from a byte array
- getLongValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Gets the array of longs of the identified "seq" integer metadata property.
- getLzxBlockLength() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getLzxBlockOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getLzxBlocksCache() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
-
If language is a specific variant of a macro language (e.g.
- getMagikaPath() - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Return a list of the main parts of the document, used when searching for embedded resources.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them, and slide drawings which have the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
This returns all items that might contain embedded objects: main document, headers, footers, comments, etc.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.VSDXExtractorDecorator
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
In Excel files, sheets have things embedded in them, and sheet drawings which have the images
- getMainOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
- getMainTreeElements() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getMainTreeLengtsTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getMainTreeTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getMajorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
- getMap() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- getMappedTagName() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
- getMapper() - Static method in class org.apache.tika.config.JsonConfigHelper
-
Returns the ObjectMapper used by this helper.
- getMapper() - Static method in class org.apache.tika.config.loader.TikaObjectMapperFactory
- getMapper() - Static method in class org.apache.tika.pipes.core.serialization.JsonPipesIpc
-
Get the configured ObjectMapper for direct use if needed.
- getMappins() - Method in class org.apache.tika.metadata.filter.FieldNameMappingFilter
- getMarkdown(InputStream, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return Markdown content.
- getMarkLimit() - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
- getMarkLimit() - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
- getMarkLimit() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector.Config
- getMarkLimit() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- getMarkLimit() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector.Config
- getMatchMap() - Method in class org.apache.tika.parser.RegexCaptureParserConfig
- getMaxBatchSize() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getMaxBatchSize() - Method in class org.apache.tika.inference.InferenceConfig
- getMaxBytes() - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- getMaxBytes() - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- getMaxChunkChars() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getMaxChunkChars() - Method in class org.apache.tika.inference.InferenceConfig
- getMaxChunks() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getMaxChunks() - Method in class org.apache.tika.inference.InferenceConfig
- getMaxConnections() - Method in class org.apache.tika.client.HttpClientFactory
- getMaxConnections() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getMaxConnections() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getMaxConnections() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getMaxConnections() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getMaxConnectionsPerRoute() - Method in class org.apache.tika.client.HttpClientFactory
- getMaxConnectionsPerRoute() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getMaxConnectionsPerRoute() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getMaxContentLength() - Method in class org.apache.tika.eval.app.EvalConfig
- getMaxCount() - Method in class org.apache.tika.config.EmbeddedLimits
-
Gets the maximum number of embedded documents to process.
- getMaxDataLengthBytes() - Method in class org.apache.tika.parser.image.PSDParser.PSDParserConfig
- getMaxDepth() - Method in class org.apache.tika.config.EmbeddedLimits
-
Gets the maximum nesting depth for embedded documents.
- getMaxEmails() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- getMaxEmbeddedCount() - Method in class org.apache.tika.parser.ParseRecord
-
Gets the maximum number of embedded documents to parse.
- getMaxEmbeddedDepth() - Method in class org.apache.tika.parser.ParseRecord
-
Gets the maximum depth for parsing embedded documents.
- getMaxEntityExpansions() - Method in class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- getMaxEntityExpansions() - Static method in class org.apache.tika.utils.XMLReaderUtils
- getMaxErrMsgSize() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getMaxErrMsgSize() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getMaxExtractLength() - Method in class org.apache.tika.eval.app.EvalConfig
- getMaxFieldSize() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- getMaxFileSizeToEmbed() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getMaxFileSizeToEmbed() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getMaxFilesProcessedPerProcess() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.PipesConfigOverride
- getMaxFilesProcessedPerProcess() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Restart the forked PipesServer after it has processed this many files to avoid slow-building memory leaks.
- getMaxFilesToAdd() - Method in class org.apache.tika.eval.app.EvalConfig
- getMaxImagePixels() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getMaxImagePixels() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getMaxImagePixels() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getMaxImagePixels() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getMaxImagesToOcr() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- getMaximumCompressionRatio() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum compression ratio.
- getMaximumDepth() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum XML element nesting level.
- getMaximumPackageEntryDepth() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum package entry nesting level.
- getMaxIncrementalUpdates() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getMaxKeySize() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- getMaxLength() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getMaxMainMemoryBytes() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The maximum amount of memory to use when loading a pdf into a PDDocument.
- getMaxNumReuses() - Method in class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- getMaxNumReuses() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Get the maximum number of times a SAXParser or DOMBuilder may be reused.
- getMaxOverride() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- getMaxPackageEntryDepth() - Method in class org.apache.tika.config.OutputLimits
-
Gets the maximum package entry nesting depth.
- getMaxPagesToOcr() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getMaxRecordLength() - Method in class org.apache.tika.parser.image.BPGParser
- getMaxRecordSize() - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- getMaxRecordSize() - Method in class org.apache.tika.parser.mp3.Mp3Parser
- getMaxRedirects() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getMaxRedirects() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getMaxScaleTo() - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- getMaxSpoolSize() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getMaxSpoolSize() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getMaxStdErr() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getMaxStdOut() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getMaxStringLength() - Method in class org.apache.tika.Tika
-
Returns the maximum length of strings returned by the parseToString methods.
- getMaxTokens() - Method in class org.apache.tika.eval.app.EvalConfig
- getMaxTokens() - Method in class org.apache.tika.eval.core.tokens.AnalyzerManager
-
Get the max token limit.
- getMaxTokens() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getMaxTokens() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getMaxTotalBytes() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- getMaxUnpackBytes() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Maximum total bytes to unpack per file.
- getMaxValuesPerField() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- getMaxWaitForClientMillis() - Method in class org.apache.tika.pipes.core.PipesConfig
- getMaxWaitMillis() - Method in class org.apache.tika.server.client.TikaServerClientConfig
- getMaxXmlDepth() - Method in class org.apache.tika.config.OutputLimits
-
Gets the maximum XML element nesting depth.
- getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
- getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVParams
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVResult
- getMediaType() - Method in class org.apache.tika.parser.html.DataURIScheme
- getMediaType(String) - Static method in class org.apache.tika.detect.zip.CompressorConstants
- getMediaType(String) - Static method in class org.apache.tika.detect.zip.PackageConstants
- getMediaType(String, String) - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMediaTypeRegistry() - Static method in class org.apache.tika.config.loader.TikaLoader
-
Gets the media type registry.
- getMediaTypeRegistry() - Method in class org.apache.tika.io.SpoolingStrategy
-
Returns the media type registry.
- getMediaTypeRegistry() - Method in class org.apache.tika.mime.MimeTypes
- getMediaTypeRegistry() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- getMediaTypeRegistry() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the media type registry used to infer type relationships.
- getMediaTypeRegistry() - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
-
Returns the media type registry used to infer type relationships.
- getMediaTypes() - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMemoryLimitInKb() - Method in class org.apache.tika.parser.pkg.CompressorParser.Config
- getMessage() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
-
Status message
- getMessage() - Method in class org.apache.tika.DeletePipesIteratorReply
-
Status message
- getMessage() - Method in interface org.apache.tika.DeletePipesIteratorReplyOrBuilder
-
Status message
- getMessage() - Method in exception org.apache.tika.exception.WriteLimitReachedException
- getMessage() - Method in exception org.apache.tika.pipes.core.async.OfferLargerThanQueueSize
- getMessage() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Get any error message associated with the result.
- getMessage() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
-
Status message
- getMessage() - Method in class org.apache.tika.SavePipesIteratorReply
-
Status message
- getMessage() - Method in interface org.apache.tika.SavePipesIteratorReplyOrBuilder
-
Status message
- getMessage() - Method in class org.apache.tika.server.core.resource.TikaResource
- getMessageBytes() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
-
Status message
- getMessageBytes() - Method in class org.apache.tika.DeletePipesIteratorReply
-
Status message
- getMessageBytes() - Method in interface org.apache.tika.DeletePipesIteratorReplyOrBuilder
-
Status message
- getMessageBytes() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
-
Status message
- getMessageBytes() - Method in class org.apache.tika.SavePipesIteratorReply
-
Status message
- getMessageBytes() - Method in interface org.apache.tika.SavePipesIteratorReplyOrBuilder
-
Status message
- getMet(URL) - Static method in class org.apache.tika.example.DisplayMetInstance
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns an array of metadata whose values will be analyzed using cTAKES.
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Returns metadata that includes cTAKES annotations.
- getMetadata() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- getMetadata() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Get the container document's metadata only.
- getMetadata() - Method in class org.apache.tika.renderer.RenderResult
- getMetadata() - Method in class org.apache.tika.server.core.MetadataList
- getMetadata(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.MetadataResource
- getMetadata(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.standard.resource.XMPMetadataResource
- getMetadata(InputStream, HttpHeaders, String) - Method in class org.apache.tika.server.core.resource.RecursiveMetadataResource
-
Returns an InputStream that can be deserialized as a list of Metadata objects.
- getMetadataAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of metadata whose values will be analyzed using cTAKES.
- getMetadataCommandArguments() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the map of Metadata keys to command line parameters.
- getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- getMetadataExtractor() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
- getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- getMetadataField(InputStream, HttpHeaders, UriInfo, String) - Method in class org.apache.tika.server.core.resource.MetadataResource
-
Get a specific metadata field.
- getMetadataField(InputStream, HttpHeaders, UriInfo, String) - Method in class org.apache.tika.server.standard.resource.XMPMetadataResource
- getMetadataFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.core.resource.MetadataResource
- getMetadataFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.standard.resource.XMPMetadataResource
- getMetadataFromMultipart(Attachment, String) - Method in class org.apache.tika.server.core.resource.RecursiveMetadataResource
-
Returns an InputStream that can be deserialized as a list of Metadata objects.
- getMetadataList() - Method in class org.apache.tika.parser.ParseRecord
- getMetadataList() - Method in interface org.apache.tika.pipes.api.emitter.EmitData
- getMetadataList() - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
- getMetadataList() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Get the list of metadata objects from parsing.
- getMetadataList() - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
- getMetadataPolicy() - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
- getMetadataWithConfig(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.RecursiveMetadataResource
-
Multipart endpoint with per-request ParseContext configuration.
- getMetadataWithConfig(List<Attachment>, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.MetadataResource
-
Multipart endpoint with per-request ParseContext configuration.
- getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
- getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- getMillisSinceLastParseStarted() - Method in class org.apache.tika.server.core.ServerStatus
-
Returns milliseconds since the last task was started.
- getMimeExclude() - Method in class org.apache.tika.config.loader.FrameworkConfig.ParserDecoration
- getMimeId(String) - Method in class org.apache.tika.eval.app.io.DBWriter
- getMimeId(String) - Method in interface org.apache.tika.eval.app.io.IDBWriter
- getMimeInclude() - Method in class org.apache.tika.config.loader.FrameworkConfig.ParserDecoration
- getMimes() - Method in class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
- getMimeType() - Method in class org.apache.tika.example.ImportContextImpl
- getMimeType(File) - Method in class org.apache.tika.mime.MimeTypes
-
Deprecated. Use Tika.detect(File) instead.
- getMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
-
Deprecated. Use Tika.detect(String) instead.
- getMimeTypeDetailsHTML(String, String) - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMimeTypeDetailsJSON(String, String) - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMimeTypes() - Static method in class org.apache.tika.config.loader.TikaLoader
- getMimeTypes() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getMimeTypesHTML() - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMimeTypesJSON() - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMimeTypesPlain() - Method in class org.apache.tika.server.core.resource.TikaMimeTypes
- getMinConfidence() - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
- getMinExtractLength() - Method in class org.apache.tika.eval.app.EvalConfig
- getMinFileSizeToEmbed() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getMinFileSizeToEmbed() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getMinLength() - Method in class org.apache.tika.detect.TrainedModelDetector
- getMinLength() - Method in class org.apache.tika.mime.MimeTypes
-
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
- getMinLength() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the minimum sequence length (characters) to print.
- getMinorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
- getMinSize() - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Returns the minimum size of a character sequence to be extracted.
- getMode() - Method in class org.apache.tika.server.client.TikaServerClientConfig
- getModel() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getModel() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getModel() - Method in class org.apache.tika.inference.InferenceConfig
- getModel() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getModel() - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Returns the model this detector instance is using for predictions.
- getModel() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getModel() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getModelVersion() - Method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Returns the version of the loaded model (1, 2, or 3).
- getModificationTime() - Method in class org.apache.tika.example.ImportContextImpl
- getMSB() - Method in class org.apache.tika.metadata.MachineMetadata.Endian
- getMsg() - Method in class org.apache.tika.pipes.emitter.es.JsonResponse
- getMsg() - Method in class org.apache.tika.pipes.emitter.opensearch.JsonResponse
- getMsg() - Method in class org.apache.tika.pipes.reporter.opensearch.JsonResponse
- getMsg() - Method in class org.apache.tika.server.client.TikaEmitterResult
- getMultivaluedFieldStrategyEnum() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
- getMutableFields() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Deprecated.
- getMutableParams() - Method in class org.apache.tika.GetFetcherReply.Builder
-
Deprecated.
- getN() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- getName() - Method in class org.apache.tika.eval.app.db.ColInfo
- getName() - Method in class org.apache.tika.eval.app.db.TableInfo
- getName() - Method in class org.apache.tika.metadata.MachineMetadata.Endian
- getName() - Method in class org.apache.tika.metadata.Property
- getName() - Method in class org.apache.tika.mime.MimeType
-
Returns the name of this media type.
- getName() - Method in interface org.apache.tika.ml.chardetect.StatisticalSpecialist
-
Short name: "utf16", "sbcs", etc.
- getName() - Method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
- getName() - Method in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- getName() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- getName() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
-
Returns an entry name
- getName() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
-
For CONTROL_WORD tokens: the control word name.
- getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the name of the detected charset.
- getName() - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStoreFactory
- getName() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- getName() - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.es.ESEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.kafka.KafkaEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.s3.S3EmitterFactory
- getName() - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitterFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.gcs.GCSFetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetcher.s3.S3FetcherFactory
- getName() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.MicrosoftGraphFetcherFactory
- getName() - Method in class org.apache.tika.pipes.ignite.ExtensionConfigDTO
- getName() - Method in class org.apache.tika.pipes.ignite.IgniteConfigStoreFactory
- getName() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorFactory
- getName() - Method in class org.apache.tika.pipes.reporter.es.ESReporterFactory
- getName() - Method in class org.apache.tika.pipes.reporter.fs.FileSystemReporterFactory
- getName() - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterFactory
- getName() - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterFactory
- getName() - Method in interface org.apache.tika.plugins.TikaExtensionFactory
- getName(String) - Static method in class org.apache.tika.io.FilenameUtils
-
This is a duplication of the algorithm and functionality available in commons io FilenameUtils.
- getNameLength() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
-
Returns the length of the entry name.
- getNamespace() - Method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- getNamespacePrefix(String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Obtain the prefix for a registered namespace URI.
- getNamespaces() - Static method in class org.apache.tika.xmp.XMPMetadata
- getNamespaceURI(String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Obtain the URI for a registered namespace prefix.
- getNameToDelimiterMap() - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
- getNativeLibPath() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getNativeLibPath() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
- getNextCharset() - Method in class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
-
Deprecated.
- getNextId() - Method in class org.apache.tika.renderer.RenderingTracker
- getNormalizedMessageClass(String) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
- getNormalizedName() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
strips e.g.
- getNormalizedPrefix() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Get the prefix, stripping any trailing slash.
- getNormalizedPrefix() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Get the prefix, stripping any trailing slash.
- getNtDomain() - Method in class org.apache.tika.client.HttpClientFactory
- getNtDomain() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getNum_blocks() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns the number of blocks.
- getNumberOfLevels() - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
- getNumBuckets() - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
- getNumBuckets() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getNumBuckets() - Method in interface org.apache.tika.langdetect.charsoup.FeatureExtractor
- getNumBuckets() - Method in class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- getNumBuckets() - Method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- getNumBuckets() - Method in class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- getNumBuckets() - Method in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
- getNumBuckets() - Method in interface org.apache.tika.ml.FeatureExtractor
- getNumBuckets() - Method in class org.apache.tika.ml.LinearModel
- getNumClasses() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getNumClasses() - Method in class org.apache.tika.ml.chardetect.NaiveBayesBigramEncodingDetector
- getNumClasses() - Method in class org.apache.tika.ml.LinearModel
- getNumClients() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.PipesConfigOverride
- getNumClients() - Method in class org.apache.tika.pipes.core.PipesConfig
- getNumClients() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the number of forked JVM processes configured.
- getNumEmitters() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Number of emitters
- getNumFetchersPerPage() - Method in class org.apache.tika.ListFetchersRequest.Builder
-
List this many fetchers per page.
- getNumFetchersPerPage() - Method in class org.apache.tika.ListFetchersRequest
-
List this many fetchers per page.
- getNumFetchersPerPage() - Method in interface org.apache.tika.ListFetchersRequestOrBuilder
-
List this many fetchers per page.
- getNumId() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- getNumOfHidden() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getNumOfInputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getNumOfOutputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getNumThreads() - Method in class org.apache.tika.server.client.TikaServerClientConfig
- getNumTranslationPairs() - Method in class org.apache.tika.language.translate.impl.CachedTranslator
-
Get the number of different source/target translation pairs this CachedTranslator currently has in its cache.
- getNumTranslationsFor(String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
-
Get the number of different translations from the source language to the target language this CachedTranslator has in its cache.
- getNumWorkers() - Method in class org.apache.tika.eval.app.EvalConfig
- getNumWrites() - Method in class org.apache.tika.eval.app.db.DBBuffer
- getObjectMapper() - Method in class org.apache.tika.config.loader.LoaderContext
- getOcr() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrDPI() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrEngineMode() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getOcrEngineMode() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getOcrImageFormat() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrImageQuality() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrImageType() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrMaxImagePixels() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrMaxPagesToOcr() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrRenderingStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOcrStrategyAuto() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getOffset() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- getOffsets() - Method in class org.apache.tika.parser.pdf.updates.IncrementalUpdateRecord
- getOids() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- getOnExists() - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- getOnParseException() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- getOnParseException() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Gets the default behavior when a parse exception occurs.
- getOOV() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
- getOOV(String) - Method in class org.apache.tika.example.TextStatsFromTikaEval
-
Use the default language id models and the default common tokens lists in tika-eval to calculate the out-of-vocabulary percentage for a given string.
- getOPCPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.OPCPackageWrapper
- getOpenContainer() - Method in class org.apache.tika.io.TikaInputStream
- getOrganizations() - Static method in class org.apache.tika.sax.StandardOrganizations
-
Returns the map containing the collection of the most important technical standard organizations.
- getOrganzationsRegex() - Static method in class org.apache.tika.sax.StandardOrganizations
-
Returns the regular expression containing the most important technical standard organizations.
- getOriginalDocumentName() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns the name of the original document if stored.
- getOriginalDocumentName() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Returns the name of the original document if stored.
- getOriginalDocumentPath() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns the path to the original document if stored.
- getOriginalDocumentPath() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Returns the path to the original document if stored.
- getOsids() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- getOtherTesseractConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getOtherTesseractSettings() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- getOuterClass() - Method in interface org.apache.tika.eval.core.textstats.BytesRefCalculator.BytesRefCalcInstance
- getOutputField() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getOutputField() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getOutputField() - Method in class org.apache.tika.inference.InferenceConfig
- getOutputFileHandler() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getOutputFormat() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Get the output format for UNPACK mode.
- getOutputMode() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Get the output mode for how embedded files are delivered.
- getOutputStream() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns an OutputStream object used to write the CAS.
- getOutputThreshold() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the configured output threshold.
- getOutputType() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getOverallTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getOverallTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getOverlap() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
- getOverlapChars() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getOverlapChars() - Method in class org.apache.tika.inference.InferenceConfig
- getOverrideLevels(int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFNumberingShim
-
Build override level tuples array for a given numId with the specified length.
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
- getPage() - Method in class org.apache.tika.inference.locator.PaginatedLocator
- getPage(int) - Method in class org.apache.tika.renderer.PageBasedRenderResults
- getPageNumber() - Method in class org.apache.tika.ListFetchersRequest.Builder
-
List the fetchers starting at this page number
- getPageNumber() - Method in class org.apache.tika.ListFetchersRequest
-
List the fetchers starting at this page number
- getPageNumber() - Method in interface org.apache.tika.ListFetchersRequestOrBuilder
-
List the fetchers starting at this page number
- getPageSegMode() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getPageSegMode() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getPageSegMode() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getPageSeparator() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getPaginated() - Method in class org.apache.tika.inference.locator.Locators
- getParameter() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- getParameters() - Method in class org.apache.tika.mime.MediaType
-
Returns an immutable sorted map of the parameters of this media type.
- getParams() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getParams() - Method in class org.apache.tika.GetFetcherReply.Builder
-
Deprecated.
- getParams() - Method in class org.apache.tika.GetFetcherReply
-
Deprecated.
- getParams() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
Deprecated.
- getParamsCount() - Method in class org.apache.tika.GetFetcherReply.Builder
- getParamsCount() - Method in class org.apache.tika.GetFetcherReply
- getParamsCount() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The configuration parameters.
- getParamsMap() - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- getParamsMap() - Method in class org.apache.tika.GetFetcherReply
-
The configuration parameters.
- getParamsMap() - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The configuration parameters.
- getParamsOrDefault(String, String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- getParamsOrDefault(String, String) - Method in class org.apache.tika.GetFetcherReply
-
The configuration parameters.
- getParamsOrDefault(String, String) - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The configuration parameters.
- getParamsOrThrow(String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- getParamsOrThrow(String) - Method in class org.apache.tika.GetFetcherReply
-
The configuration parameters.
- getParamsOrThrow(String) - Method in interface org.apache.tika.GetFetcherReplyOrBuilder
-
The configuration parameters.
- getParseContext() - Method in interface org.apache.tika.pipes.api.emitter.EmitData
-
Gets the ParseContext.
- getParseContext() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- getParseContext() - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
-
Gets the ParseContext.
- getParseContextJson() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- getParseContextJson() - Method in class org.apache.tika.FetchAndParseRequest
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- getParseContextJson() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- getParseContextJsonBytes() - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- getParseContextJsonBytes() - Method in class org.apache.tika.FetchAndParseRequest
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- getParseContextJsonBytes() - Method in interface org.apache.tika.FetchAndParseRequestOrBuilder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- getParseException() - Method in class org.apache.tika.eval.core.util.ContentTags
- getParseMode() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Gets the default parse mode for how embedded documents are handled.
- getParseMode() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the parse mode.
- getParser() - Method in class org.apache.tika.Tika
-
Returns the parser instance used by this facade.
- getParser(Metadata) - Method in class org.apache.tika.parser.CompositeParser
-
Returns the parser that best matches the given metadata.
- getParser(Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
- getParserClassname(Parser) - Static method in class org.apache.tika.utils.ParserUtils
-
Identifies the real class name of the Parser, unwrapping any ParserDecorator decorations on top of it.
- getParserDetailsHTML() - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParserDetailsJSON() - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParserDetailssPlain() - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParserForType() - Method in class org.apache.tika.DeleteFetcherReply
- getParserForType() - Method in class org.apache.tika.DeleteFetcherRequest
- getParserForType() - Method in class org.apache.tika.DeletePipesIteratorReply
- getParserForType() - Method in class org.apache.tika.DeletePipesIteratorRequest
- getParserForType() - Method in class org.apache.tika.FetchAndParseReply
- getParserForType() - Method in class org.apache.tika.FetchAndParseRequest
- getParserForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- getParserForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- getParserForType() - Method in class org.apache.tika.GetFetcherReply
- getParserForType() - Method in class org.apache.tika.GetFetcherRequest
- getParserForType() - Method in class org.apache.tika.GetPipesIteratorReply
- getParserForType() - Method in class org.apache.tika.GetPipesIteratorRequest
- getParserForType() - Method in class org.apache.tika.ListFetchersReply
- getParserForType() - Method in class org.apache.tika.ListFetchersRequest
- getParserForType() - Method in class org.apache.tika.SaveFetcherReply
- getParserForType() - Method in class org.apache.tika.SaveFetcherRequest
- getParserForType() - Method in class org.apache.tika.SavePipesIteratorReply
- getParserForType() - Method in class org.apache.tika.SavePipesIteratorRequest
- getParsers() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the component parsers.
- getParsers() - Method in class org.apache.tika.parser.ParseRecord
- getParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
- getParsers(ParseContext) - Method in class org.apache.tika.parser.DefaultParser
- getParsersHTML() - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParsersHTML(boolean) - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParsersJSON() - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParsersJSON(boolean) - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParsersPlain() - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParsersPlain(boolean) - Method in class org.apache.tika.server.core.resource.TikaParsers
- getParsingIdField() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getPart() - Method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- getPart() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFUA
- getPartitions() - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- getPassword() - Method in class org.apache.tika.client.HttpClientFactory
- getPassword() - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
Returns the password to be used for this file, or null if no / default password should be used
- getPassword() - Method in class org.apache.tika.parser.SimplePasswordProvider
- getPassword() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getPassword() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getPassword(Metadata) - Method in interface org.apache.tika.parser.PasswordProvider
-
Looks up the password for a document with the given metadata, and returns it for the Parser.
- getPassword(Metadata) - Method in class org.apache.tika.parser.SimplePasswordProvider
- getPasswordProvider() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- getPath() - Method in class org.apache.tika.io.TikaInputStream
- getPath() - Method in class org.apache.tika.parser.pdf.updates.IncrementalUpdateRecord
- getPathsFromExtractCrawl(FetchKey, Path) - Method in class org.apache.tika.eval.app.ProfilerBase
- getPathsFromSrcCrawl(FetchKey, Path, Path) - Method in class org.apache.tika.eval.app.ProfilerBase
- getPClean() - Method in class org.apache.tika.quality.TextQualityScore
-
Probability in [0,1] that this string is clean text.
- getPDDocument(Path, String, RandomAccessStreamCache.StreamCacheCreateFunction, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
- getPDDocument(TikaInputStream, String, RandomAccessStreamCache.StreamCacheCreateFunction, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
- getPDDocumentFromStream(InputStream, String, RandomAccessStreamCache.StreamCacheCreateFunction, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
- getPdfBoxImageType() - Method in enum class org.apache.tika.parser.pdf.OcrConfig.ImageType
- getPDFParserConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
- getPdftoppmPath() - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- getPDFVTModified() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFVT
- getPDFVTVersion() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFVT
- getPDFXConformance() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFX
- getPDFXVersion() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFX
- getPDFXVersion() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFXId
- getPipesClientId() - Method in class org.apache.tika.pipes.core.PipesClient
- getPipesConfig() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides
- getPipesConfig() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getPipesConfig() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the underlying PipesConfig for advanced configuration.
- getPipesConfig() - Method in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Gets the PipesConfig instance.
- getPipesIterator(GetPipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Get a pipes iterator's data from the iterator store.
- getPipesIterator(GetPipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Get a pipes iterator's data from the iterator store.
- getPipesIterator(GetPipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Get a pipes iterator's data from the iterator store.
- getPipesIterator(GetPipesIteratorRequest, StreamObserver<GetPipesIteratorReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Get a pipes iterator's data from the iterator store.
- getPipesIterator(GetPipesIteratorRequest, StreamObserver<GetPipesIteratorReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Get a pipes iterator's data from the iterator store.
- GetPipesIteratorReply - Class in org.apache.tika
-
Protobuf type tika.GetPipesIteratorReply
- GetPipesIteratorReply.Builder - Class in org.apache.tika
-
Protobuf type tika.GetPipesIteratorReply
- GetPipesIteratorReplyOrBuilder - Interface in org.apache.tika
- GetPipesIteratorRequest - Class in org.apache.tika
-
Protobuf type tika.GetPipesIteratorRequest
- GetPipesIteratorRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.GetPipesIteratorRequest
- GetPipesIteratorRequestOrBuilder - Interface in org.apache.tika
- getPipesParser() - Method in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Gets the PipesParser instance.
- getPipesParsingHelper() - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Gets the PipesParsingHelper instance.
- getPipesReporters() - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- getPipesResult() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Get the underlying PipesResult for advanced access.
- getPlainMapper() - Static method in class org.apache.tika.config.loader.TikaObjectMapperFactory
-
Returns a shared plain ObjectMapper without TikaModule registration.
- getPluginRoots() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides
- getPluginsDir() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the plugins directory.
- getPollDelayMs() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getPoolSize() - Method in class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- getPoolSize() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getPoolSize() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getPoolSize() - Static method in class org.apache.tika.utils.XMLReaderUtils
- getPort() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- getPort() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Returns the port number the server is listening on.
- getPort() - Method in class org.apache.tika.pipes.core.SharedServerManager
-
Returns the current server port, blocking if a restart is in progress.
- getPort() - Method in class org.apache.tika.server.core.TikaServerConfig
- getPos() - Method in class org.apache.tika.io.BoundedInputStream
- getPosition() - Method in class org.apache.tika.io.TikaInputStream
- getPrecision() - Method in class org.apache.tika.eval.app.db.ColInfo
-
Gets the precision.
- getPrefix() - Method in enum class org.apache.tika.extractor.EmbeddedDocumentUtil.EmbeddedResourcePrefix
- getPrefix() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getPrefix() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- getPrefix() - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- getPrefix() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getPrefixes() - Static method in class org.apache.tika.xmp.XMPMetadata
- getPrevContent() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getPrimaryProperty() - Method in class org.apache.tika.metadata.Property
-
Gets the primary property for a composite property
- getPrivateKey() - Method in class org.apache.tika.pipes.fetcher.http.jwt.JwtPrivateKeyCreds
- getProbability() - Method in class org.apache.tika.ml.Prediction
-
Softmax probability of this label (0–1), relative to all other labels.
- getProbability(String) - Method in class org.apache.tika.eval.core.tokens.LangModel
- getProcess() - Method in class org.apache.tika.pipes.core.PerClientServerManager
-
Returns the server process.
- getProcessTimeMillis() - Method in class org.apache.tika.utils.FileProcessResult
- getProcessTimeoutMillis(ParseContext, long) - Static method in class org.apache.tika.config.TimeoutLimits
-
Returns the per-process timeout to use for external process execution.
- getProfile() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getProfile() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getProgId() - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- getProgressTimeoutMillis() - Method in class org.apache.tika.config.TimeoutLimits
-
Gets the maximum time in milliseconds between progress updates before the task is considered stalled.
- getProjectId() - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- getProjectId() - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- getPrompt() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getPrompt() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getProperties(String) - Static method in class org.apache.tika.metadata.Property
- getProperty(Object) - Method in class org.apache.tika.example.ImportContextImpl
- getPropertyTag(ClassID, String, long) - Method in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
-
Get property tag id by property set GUID and string name or numerical name from named properties mapping
- getPropertyType() - Method in class org.apache.tika.metadata.Property
- getPropertyType(String) - Static method in class org.apache.tika.metadata.Property
-
Get the type of a property
- getProvider() - Method in class org.apache.tika.digest.InputStreamDigester
-
When subclassing this, be careful to ensure that your provider is thread-safe (not likely) or return a new provider with each call.
- getProxyHost() - Method in class org.apache.tika.client.HttpClientFactory
- getProxyHost() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getProxyHost() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getProxyPort() - Method in class org.apache.tika.client.HttpClientFactory
- getProxyPort() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getProxyPort() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getQNameAsString(QName) - Static method in class org.apache.tika.sax.ElementMappingContentHandler
- getQueryTimeoutSeconds() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getQueueSize() - Method in exception org.apache.tika.pipes.core.async.OfferLargerThanQueueSize
- getQueueSize() - Method in class org.apache.tika.pipes.core.PipesConfig
-
FetchEmitTuple queue size
- getR0() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getR1() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getR2() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getRangeEnd() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- getRangeStart() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- getRawScore() - Method in class org.apache.tika.language.detect.LanguageResult
- getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a java.io.Reader for reading the Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
- getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of an inputStream, and return a Java Reader to access the converted input data.
- getReadLimit() - Method in class org.apache.tika.ml.junkdetect.JunkFilterEncodingDetector
- getReadPstPath() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- getRegex() - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- getRegion() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- getRegion() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getRegion() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getRegisteredMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
-
Returns the registered, normalised media type with the given name (or alias).
- getRel() - Method in class org.apache.tika.sax.Link
- getRenderedName() - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- getRenderer() - Method in class org.apache.tika.config.loader.LoaderContext
-
Get the Renderer for injection into rendering parsers.
- getRenderer() - Method in class org.apache.tika.parser.pdf.PDFParser
- getRenderingStrategy() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getRenderResults() - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFRenderingState
- getReplicas() - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- getRequestTimeoutMillis() - Method in class org.apache.tika.client.HttpClientFactory
- getRequestTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getRequestTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getResetInterval() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns reset interval
- getResetTableIndex() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Return index of reset table
- getResize() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getResolvedConfig(String) - Method in class org.apache.tika.parser.ParseContext
-
Gets a resolved configuration object from the cache.
- getResource(Class<T>) - Method in class org.apache.tika.io.TemporaryResources
-
Returns the latest of the tracked resources that implements or extends the given interface or class.
- getResourceAsStream(String) - Method in class org.apache.tika.config.ServiceLoader
-
Returns an input stream for reading the specified resource from the configured class loader.
- getResourceName(Metadata, AtomicInteger) - Static method in class org.apache.tika.parser.RecursiveParserWrapper
- getResources() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- getResults() - Method in class org.apache.tika.detect.EncodingDetectorContext
- getResults() - Method in class org.apache.tika.renderer.RenderResults
- getResultType() - Method in class org.apache.tika.detect.EncodingDetectorContext.Result
-
The EncodingResult.ResultType of the top-ranked result from this detector.
- getResultType() - Method in class org.apache.tika.detect.EncodingResult
-
The nature of the evidence that produced this result.
- getRevisionManifestDataElementData(List<DataElement>, CellManifestDataElementData, HashMap<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to get the revision manifest data element from a list of data elements.
- getRight() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- getRMetaParser() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getRootNode() - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Gets the raw root JSON node.
- getRows() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getRSSFooters() - Method in class org.apache.tika.example.RecentFiles
- getRSSHeaders() - Method in class org.apache.tika.example.RecentFiles
- getRSSItem(Document) - Method in class org.apache.tika.example.RecentFiles
- getRtfEmbeddedMaxBytesInKb() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Maximum bytes (in KB) per embedded object/pict when extracting from RTF within MSG files.
- getSampleRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the sampling rate, in Hz
- getSanitizedEmbeddedFileName(Metadata, String, int) - Static method in class org.apache.tika.io.FilenameUtils
- getSanitizedEmbeddedFilePath(Metadata, String, int) - Static method in class org.apache.tika.io.FilenameUtils
-
This tries to sanitize dangerous user generated embedded file paths.
- getSasToken() - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- getSasToken() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- getSaveFetcherMethod() - Static method in class org.apache.tika.TikaGrpc
- getSavePipesIteratorMethod() - Static method in class org.apache.tika.TikaGrpc
- getSAXParser() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the SAX parser specified in this parsing context.
- getSAXParserFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the SAX parser factory specified in this parsing context.
- getSAXTransformerFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns a SAXTransformerFactory.
- getScales() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
- getScales() - Method in class org.apache.tika.ml.LinearModel
- getScopes() - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- getScopes() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- getScore() - Method in class org.apache.tika.ml.chardetect.ScoredCandidate
- getScore() - Method in class org.apache.tika.sax.StandardReference
- getSecondaryExtractProperties() - Method in class org.apache.tika.metadata.Property
-
Gets the secondary properties for a composite property
- getSecondOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
- getSecret() - Method in class org.apache.tika.pipes.fetcher.http.jwt.JwtSecretCreds
- getSecretKey() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getSecretKey() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- getSectionName() - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
- getSelect() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- getSeparator() - Method in class org.apache.tika.sax.StandardReference
- getSeparatorChar() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the separator character used for annotation properties.
- getSerializedSize() - Method in class org.apache.tika.DeleteFetcherReply
- getSerializedSize() - Method in class org.apache.tika.DeleteFetcherRequest
- getSerializedSize() - Method in class org.apache.tika.DeletePipesIteratorReply
- getSerializedSize() - Method in class org.apache.tika.DeletePipesIteratorRequest
- getSerializedSize() - Method in class org.apache.tika.FetchAndParseReply
- getSerializedSize() - Method in class org.apache.tika.FetchAndParseRequest
- getSerializedSize() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- getSerializedSize() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- getSerializedSize() - Method in class org.apache.tika.GetFetcherReply
- getSerializedSize() - Method in class org.apache.tika.GetFetcherRequest
- getSerializedSize() - Method in class org.apache.tika.GetPipesIteratorReply
- getSerializedSize() - Method in class org.apache.tika.GetPipesIteratorRequest
- getSerializedSize() - Method in class org.apache.tika.ListFetchersReply
- getSerializedSize() - Method in class org.apache.tika.ListFetchersRequest
- getSerializedSize() - Method in class org.apache.tika.SaveFetcherReply
- getSerializedSize() - Method in class org.apache.tika.SaveFetcherRequest
- getSerializedSize() - Method in class org.apache.tika.SavePipesIteratorReply
- getSerializedSize() - Method in class org.apache.tika.SavePipesIteratorRequest
- getSerializerType() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the type of cTAKES (UIMA) serializer used to write the CAS.
- getServerSocket() - Method in class org.apache.tika.pipes.core.PerClientServerManager
-
Returns the ServerSocket for accepting client connections.
- getServiceAccountKeyBase64() - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- getServiceClass(Class<T>, String) - Method in class org.apache.tika.config.ServiceLoader
-
Loads and returns the named service class that's expected to implement the given interface.
- getServiceDescriptor() - Static method in class org.apache.tika.TikaGrpc
- getServiceLoader() - Method in class org.apache.tika.config.GlobalSettings
- getSharedMapper() - Static method in class org.apache.tika.serialization.TikaModule
-
Gets the shared ObjectMapper.
- getSharedSecret() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getShortBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from the beginning of a byte array
- getShortBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from a byte array
- getShortLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from the beginning of a byte array
- getShortLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from a byte array
- getShutdownClientAfterMillis() - Method in class org.apache.tika.pipes.core.PipesConfig
- getSiegfriedPath() - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- getSignature() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns a signature of itsf header
- getSignature() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns a signature of the header
- getSignature() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns a signature of control data block
- getSignature() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
-
Returns pmgi signature if exists
- getSignature() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- getSize() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns a size of control data
- getSize() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
- getSize(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.TarWriter
- getSize(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.ZipWriter
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.CSVMessageBodyWriter
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.JSONMessageBodyWriter
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.JSONObjWriter
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.TextMessageBodyWriter
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.standard.writer.XMPMessageBodyWriter
- getSize(MetadataList, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.MetadataListMessageBodyWriter
- getSizeFieldName() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getSizeOffered() - Method in exception org.apache.tika.pipes.core.async.OfferLargerThanQueueSize
- getSkewAngle() - Method in class org.apache.tika.parser.ocr.tess4j.ImageDeskew
- getSleepOnStartupTimeoutMillis() - Method in class org.apache.tika.pipes.core.PipesConfig
- getSocketTimeoutMillis() - Method in class org.apache.tika.client.HttpClientFactory
- getSocketTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getSocketTimeoutMillis() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getSocketTimeoutMillis() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getSocketTimeoutMillisOrDefault() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getSocketTimeoutMs() - Method in class org.apache.tika.pipes.core.PipesConfig
- getSolrCollection() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getSolrUrls() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getSolrZkChroot() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getSolrZkHosts() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getSourceField() - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- getSourceFileLength(EvalFilePaths, List<Metadata>) - Method in class org.apache.tika.eval.app.ProfilerBase
- getSpacingTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- getSpatial() - Method in class org.apache.tika.inference.locator.Locators
- getSpecialistName() - Method in class org.apache.tika.ml.chardetect.SpecialistOutput
- getSpoolTypes() - Method in class org.apache.tika.io.SpoolingStrategy
-
Returns the media types that should be spooled to disk.
- getSqlDef() - Method in class org.apache.tika.eval.app.db.ColInfo
- getStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Get the full stacktrace as a string
- getStaleFetcherDelaySeconds() - Method in class org.apache.tika.pipes.core.PipesConfig
- getStaleFetcherTimeoutSeconds() - Method in class org.apache.tika.pipes.core.PipesConfig
- getStartBlock() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Returns the start block index
- getStarted() - Method in class org.apache.tika.server.core.TaskStatus
- getStartIndex() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- getStartMs() - Method in class org.apache.tika.inference.locator.TemporalLocator
- getStartOffset() - Method in class org.apache.tika.inference.Chunk
-
Convenience: returns the start offset from the first TextLocator, or -1 if none.
- getStartOffset() - Method in class org.apache.tika.inference.locator.TextLocator
- getStartOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Returns the start offset index
- getStartupTimeoutMillis() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.PipesConfigOverride
- getStartupTimeoutMillis() - Method in class org.apache.tika.pipes.core.PipesConfig
- getStartxref() - Method in class org.apache.tika.parser.pdf.updates.StartXRefOffset
- getStartXrefOffset() - Method in class org.apache.tika.parser.pdf.updates.StartXRefOffset
- getState() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- getStatelessParser(ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Utility function to get the Parser that was sent in to the ParseContext to handle embedded documents.
- getStatus() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
The status from the message.
- getStatus() - Method in class org.apache.tika.FetchAndParseReply
-
The status from the message.
- getStatus() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
The status from the message.
- getStatus() - Method in class org.apache.tika.pipes.api.pipesiterator.TotalCountResult
- getStatus() - Method in class org.apache.tika.pipes.emitter.es.JsonResponse
- getStatus() - Method in class org.apache.tika.pipes.emitter.opensearch.JsonResponse
- getStatus() - Method in exception org.apache.tika.pipes.fork.PipesForkParserException
-
Get the result status that caused this exception.
- getStatus() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Get the result status.
- getStatus() - Method in class org.apache.tika.pipes.reporter.opensearch.JsonResponse
- getStatus() - Method in class org.apache.tika.renderer.RenderResult
- getStatus() - Method in class org.apache.tika.server.client.TikaEmitterResult
- getStatus() - Method in class org.apache.tika.server.core.resource.TikaServerStatus
- getStatusBytes() - Method in class org.apache.tika.FetchAndParseReply.Builder
-
The status from the message.
- getStatusBytes() - Method in class org.apache.tika.FetchAndParseReply
-
The status from the message.
- getStatusBytes() - Method in interface org.apache.tika.FetchAndParseReplyOrBuilder
-
The status from the message.
- getStderr() - Method in class org.apache.tika.utils.FileProcessResult
- getStderrHandler() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getStderrLength() - Method in class org.apache.tika.utils.FileProcessResult
- getStdout() - Method in class org.apache.tika.utils.FileProcessResult
- getStdoutHandler() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getStdoutLength() - Method in class org.apache.tika.utils.FileProcessResult
- getStorageManifestDataElementData(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to get the storage manifest data element from a list of data elements.
- getStrategy() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getStrategyAuto() - Method in class org.apache.tika.parser.pdf.OcrConfig
- getStream_uuid() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns stream uuid
- getStreamForDetectionOnly(InputStream, int) - Static method in class org.apache.tika.detect.DetectHelper
-
Creates a TikaInputStream suitable for detection-only purposes by reading up to maxLength bytes from the input stream into a byte array.
- getStreamForDetectionOnly(InputStream, int, Metadata) - Static method in class org.apache.tika.detect.DetectHelper
-
Creates a TikaInputStream suitable for detection-only purposes by reading up to maxLength bytes from the input stream into a byte array.
- getStreamLength() - Method in class org.apache.tika.utils.StreamGobbler
- getStreamObjectTypeMapping() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Gets the StreamObjectTypeMapping
- getStreamReadConstraints() - Static method in class org.apache.tika.serialization.JsonMetadata
-
Gets the current stream read constraints.
- getStreamReadConstraints() - Static method in class org.apache.tika.serialization.JsonMetadataList
-
Gets the current stream read constraints.
- getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
- getString() - Static method in class org.apache.tika.Tika
- getString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the String at the given offset and length.
- getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of a byte array, and return a String containing the converted input data.
- getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
- getStringsPath() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the "strings" installation folder.
- getStringsProg() - Static method in class org.apache.tika.parser.strings.StringsParser
- getStyleClass() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
- getStyleID() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- getStyleName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
- getSubject() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getSubject() - Method in class org.apache.tika.pipes.fetcher.http.jwt.JwtCreds
- getSubjectUser() - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- getSubtype() - Method in class org.apache.tika.mime.MediaType
-
Return the Sub-Type of the MediaType, such as "plain" for "text/plain"
- getSuccess() - Method in class org.apache.tika.DeleteFetcherReply.Builder
-
Success if the fetcher was successfully removed from the fetch store.
- getSuccess() - Method in class org.apache.tika.DeleteFetcherReply
-
Success if the fetcher was successfully removed from the fetch store.
- getSuccess() - Method in interface org.apache.tika.DeleteFetcherReplyOrBuilder
-
Success if the fetcher was successfully removed from the fetch store.
- getSuffix(InputStream, int) - Static method in class org.apache.tika.parser.mp3.LyricsHandler
-
Reads and returns the last length bytes from the given stream.
- getSuffix(PDImage, Metadata) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- getSuffixFromPath(String) - Static method in class org.apache.tika.io.FilenameUtils
-
This includes the period, e.g. ".pdf".
- getSuffixStrategy() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- getSummaryStatistics() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- getSupertype(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the supertype of the given type.
- getSupported() - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Returns the set of supported component IDs.
- getSupportedEmbedTypes() - Method in class org.apache.tika.embedder.ExternalEmbedder
- getSupportedEmbedTypes(ParseContext) - Method in interface org.apache.tika.embedder.Embedder
-
Returns the set of media types supported by this embedder when used with the given parse context.
- getSupportedEmbedTypes(ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
- getSupportedLanguages() - Static method in class org.apache.tika.eval.core.langid.LanguageIDWrapper
- getSupportedLanguages() - Static method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Returns all language codes supported by the loaded model.
- getSupportedLanguages() - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
- getSupportedMediaTypes() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getSupportedMediaTypes() - Method in class org.apache.tika.parser.vlm.ClaudeVLMParser
- getSupportedMediaTypes() - Method in class org.apache.tika.parser.vlm.GeminiVLMParser
- getSupportedMediaTypes() - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
- getSupportedTypes() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.DirListParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.EncryptedPrescriptionParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.PListParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CryptoParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dgn.DGN8Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGReadParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.EmptyParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ErrorParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.UniversalExecutableParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geopkg.GeoPkgParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.html.JSoupParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.http.HttpParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.HeifParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.JpegParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.JXLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.indesign.IDMLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.activemime.ActiveMimeParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.chm.ChmParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.pst.OutlookPSTParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.pst.PSTMailItemParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.rtf.RTFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mif.MIFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.NetworkParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ogg.FlacParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ogg.OggParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ogg.OpusParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ogg.SpeexParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ogg.TheoraParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ogg.VorbisParser
- getSupportedTypes(ParseContext) - Method in interface org.apache.tika.parser.Parser
-
Returns the set of media types supported by this parser when used with the given parse context.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
-
Delegates the method call to the decorated parser.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ParserDecorator.MimeFilteringDecorator
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.SevenZParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.UnrarParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.ZipParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.RegexCaptureParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.tmx.TMXParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wacz.WACZParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.warc.WARCParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.renderer.CompositeRenderer
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- getSupportedTypes(ParseContext) - Method in interface org.apache.tika.renderer.Renderer
-
Returns the set of media types supported by this renderer when used with the given parse context.
- getSwath() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getSyncBits(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getSystem_uuid() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns system uuid
- getSystemId() - Method in class org.apache.tika.example.ImportContextImpl
- getTableName() - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- getTableName() - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- getTableNames(Connection, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
Returns the names of the tables to process
- getTableNames(Connection, Metadata, ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getTableOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Gets a table offset
- getTableReader(Connection, String, EmbeddedDocumentUtil) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
-
Given a connection and a table name, return the JDBCTableReader for this db.
- getTableReader(Connection, String, EmbeddedDocumentUtil) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getTableReader(Connection, String, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
- getTableReader(Connection, String, ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- getTables(Connection) - Method in class org.apache.tika.eval.app.db.H2Util
- getTables(Connection) - Method in class org.apache.tika.eval.app.db.JDBCUtil
- getTag() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
- getTag() - Method in exception org.apache.tika.sax.TaggedSAXException
-
Returns the object reference used as the tag for this exception.
- getTags() - Method in class org.apache.tika.eval.core.util.ContentTags
- getTags(int) - Method in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getTagsPresent() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Does the file contain this kind of tag?
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getTagString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the (possibly null padded) String at the given offset and length.
- getTail() - Method in class org.apache.tika.io.TailStream
-
Returns an array with the last data read from the underlying stream.
- getTargetField() - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- getTask() - Method in class org.apache.tika.server.core.TaskStatus
- getTasks() - Method in class org.apache.tika.server.core.ServerStatus
-
Returns a snapshot of currently running tasks.
- getTempDirectory() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns the temporary directory where files are stored.
- getTempDirectory() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Returns the temporary directory where embedded files are stored.
- getTempDirectory() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- getTempDirectory() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Gets the directory for temporary files during pipes-based parsing.
- getTempDirectory() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Returns the path to the temporary directory used by the server.
- getTempDirectory() - Method in class org.apache.tika.pipes.core.SharedServerManager
- getTemporal() - Method in class org.apache.tika.inference.locator.Locators
- getTenantId() - Method in interface org.apache.tika.pipes.fetchers.microsoftgraph.config.AadCredentialConfigBase
- getTenantId() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- getTenantId() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- getTenantId() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- getTessdataPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getTesseractPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getTesseractProg() - Static method in class org.apache.tika.parser.ocr.TesseractOCRParser
- getTestLanguages() - Method in class org.apache.tika.langdetect.LanguageDetectorTest
- getText() - Method in class org.apache.tika.inference.Chunk
- getText() - Method in class org.apache.tika.inference.locator.Locators
- getText() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the text, if present
- getText() - Method in class org.apache.tika.sax.Link
- getText(InputStream, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return plain text content.
- getTextDocument() - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
-
Retrieves the built TextDocument
- getThreshold() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
Gets the threshold to be used for selecting the standard references found within the text based on their score.
- getThresholdBytes() - Method in class org.apache.tika.pipes.core.EmitStrategyConfig
-
Get the threshold in bytes for DYNAMIC strategy.
- getThrottleSeconds() - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- getThrottleSeconds() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- getThrottleSeconds() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- getThrowOnWriteLimitReached(MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
- getThrowOnZeroBytes() - Method in class org.apache.tika.parser.AutoDetectParserConfig
- getTikaEndpoints() - Method in class org.apache.tika.server.client.TikaServerClientConfig
- getTikaInputStream() - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFRenderingState
- getTikaLoader() - Method in class org.apache.tika.pipes.core.server.SharedServerResources
- getTikaLoader() - Static method in class org.apache.tika.server.core.resource.TikaResource
- getTimeElapsed() - Method in class org.apache.tika.server.client.TikaEmitterResult
- getTimeoutLimits() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides
- getTimeoutLimits() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the timeout limits.
- getTimeoutMillis() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- getTimeoutMs() - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- getTimeoutMs() - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- getTimeoutMs() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- getTimeoutMs() - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- getTimeoutSeconds() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- getTimeoutSeconds() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- getTimeoutSeconds() - Method in class org.apache.tika.inference.InferenceConfig
- getTimeoutSeconds() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- getTimeoutSeconds() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- getTimeoutSeconds() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- getTimeoutSeconds() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- getTimeoutSeconds() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- getTimeoutSeconds() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the maximum time (in seconds) to wait for the "strings" command to terminate.
- getTimeoutSeconds() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- getTimeoutSeconds() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- getTitle() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getTitle() - Method in interface org.apache.tika.parser.mp3.ID3Tags
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getTitle() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- getTitle() - Method in class org.apache.tika.sax.Link
- getTlsConfig() - Method in class org.apache.tika.server.core.TikaServerConfig
- getTo() - Method in class org.apache.tika.renderer.PageRangeRequest
- getToken() - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
- getToken() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Returns the reusable token instance.
- getTokens() - Method in class org.apache.tika.eval.core.tokens.LangModel
- getTokens() - Method in class org.apache.tika.eval.core.tokens.TokenCounts
- getTokens(String) - Method in class org.apache.tika.eval.core.tokens.CommonTokenCountManager
- getTopConfidenceFor(Charset) - Method in class org.apache.tika.detect.EncodingDetectorContext
-
Returns the highest confidence seen for the given charset across all detector results (not just top results).
- getTopic() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getTopN() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- getTopNMoreA() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
- getTopNMoreB() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
- getTopNUniqueA() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
- getTopNUniqueB() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
- getTotal() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- getTotalCharsPerPage() - Method in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- getTotalCount() - Method in interface org.apache.tika.pipes.api.pipesiterator.TotalCounter
-
Returns the total count so far.
- getTotalCount() - Method in class org.apache.tika.pipes.api.pipesiterator.TotalCountResult
- getTotalCount() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIterator
- getTotalProcessed() - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- getTotalTaskTimeoutMillis() - Method in class org.apache.tika.config.TimeoutLimits
-
Gets the maximum wall-clock time in milliseconds for a parse task.
- getTotalTokens() - Method in class org.apache.tika.eval.core.tokens.TokenCounts
- getTotalTokens() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- getTotalUniqueTokens() - Method in class org.apache.tika.eval.core.tokens.TokenCounts
- getTotalUniqueTokens() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- getTrackingMetadata() - Method in class org.apache.tika.parser.mbox.MboxParser
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getTrackNumber() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the track within the album / recording
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getTransformer() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns a new transformer
- getTransformer(ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the transformer specified in this parsing context.
- getTransformerFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns a TransformerFactory.
- getTranslator() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Returns the current translator
- getTranslator() - Method in class org.apache.tika.language.translate.impl.CachedTranslator
- getTranslator() - Method in class org.apache.tika.Tika
-
Returns the translator instance used by this facade.
- getTranslators() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Returns all available translators
- getTrustStoreFile() - Method in class org.apache.tika.server.core.TlsConfig
- getTrustStorePassword() - Method in class org.apache.tika.server.core.TlsConfig
- getTrustStoreType() - Method in class org.apache.tika.server.core.TlsConfig
- getTuples() - Method in class org.apache.tika.server.core.resource.AsyncRequest
- getType() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- getType() - Method in class org.apache.tika.eval.app.db.ColInfo
- getType() - Method in exception org.apache.tika.eval.app.io.ExtractReaderException
- getType() - Method in class org.apache.tika.mime.MediaType
-
Return the Type of the MediaType, such as "text" for "text/plain"
- getType() - Method in class org.apache.tika.mime.MimeType
-
Returns the normalized media type name.
- getType() - Method in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- getType() - Method in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
- getType() - Method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- getType() - Method in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- getType() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
- getType() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- getType() - Method in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaIllustrator
- getType() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.EmitterOverride
- getType() - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.FetcherOverride
- getType() - Method in class org.apache.tika.pipes.core.EmitStrategyConfig
-
Get the emit strategy type.
- getType() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- getType() - Method in class org.apache.tika.sax.Link
- getType(OneNotePropertyEnum) - Static method in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- getTypeFromVal(int) - Static method in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
- getTypes() - Method in class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
- getTypes() - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of all known canonical media types.
- getUByte(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get the unsigned value of a byte.
- getUCEntry(DirectoryEntry, String) - Static method in class org.apache.tika.parser.microsoft.OfficeParser
-
Looks for an entry within root (non-recursive) whose upper-cased name equals ucTarget
- getUIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from a byte array
- getUIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from a byte array
- getUIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from a byte array
- getUIntLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from a byte array
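The BE/LE unsigned int entries above differ only in byte order. As a minimal, standalone sketch of that difference (this is not the EndianUtils source; the class and method names `EndianSketch`, `uintBE` and `uintLE` are hypothetical), four bytes can be combined into a `long` so that values above `Integer.MAX_VALUE` stay non-negative:

```java
public class EndianSketch {

    // Hypothetical helper: reads 4 bytes at offset, most significant byte first.
    public static long uintBE(byte[] data, int offset) {
        return ((data[offset] & 0xFFL) << 24)
                | ((data[offset + 1] & 0xFFL) << 16)
                | ((data[offset + 2] & 0xFFL) << 8)
                | (data[offset + 3] & 0xFFL);
    }

    // Hypothetical helper: reads 4 bytes at offset, least significant byte first.
    public static long uintLE(byte[] data, int offset) {
        return ((data[offset + 3] & 0xFFL) << 24)
                | ((data[offset + 2] & 0xFFL) << 16)
                | ((data[offset + 1] & 0xFFL) << 8)
                | (data[offset] & 0xFFL);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xFF, 0x00, 0x00, 0x01};
        System.out.println(uintBE(bytes, 0)); // 4278190081 (0xFF000001)
        System.out.println(uintLE(bytes, 0)); // 16777471   (0x010000FF)
    }
}
```

Masking each byte with `0xFFL` before shifting avoids sign extension, which is why the result is returned as a `long` rather than an `int` in this sketch.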
- getUMLSPass() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS password.
- getUMLSUser() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS username.
- getUncompressedLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Gets uncompressed length
- getUnderline() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- getUniformTypeIdentifier() - Method in class org.apache.tika.mime.MimeType
-
Get the UTI for this mime type.
- getUniqueAlphabeticTokens() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
- getUniqueCharsets() - Method in class org.apache.tika.detect.EncodingDetectorContext
-
Returns the unique charsets from ALL results of every detector, in detection order (top result first within each detector).
- getUniqueCommonTokens() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
- getUnknown() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Gets unknown
- getUnknown_000c() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns unknown_000c value
- getUnknown_000c() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns 000c unknown bytes
- getUnknown_0024() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns 0024 unknown bytes
- getUnknown_002c() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns 002c unknown bytes
- getUnknown_0044() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns 0044 unknown bytes
- getUnknown_18() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns unknown 18 bytes
- getUnknown0008() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- getUnknownLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns unknown length
- getUnknownOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns unknown offset
- getUnmappedUnicodeCharsPerPage() - Method in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- getUnpackConfig() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns the UnpackConfig used by this handler.
- getUnpackedDirectory() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns the unpacked subdirectory where embedded files are stored.
- getUnseenProbability() - Method in class org.apache.tika.eval.core.tokens.LangModel
- getUpdateStrategyEnum() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- getUri() - Method in class org.apache.tika.sax.Link
- getUserAgent() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- getUserAgent() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getUserConfigPath() - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Get the user-provided configuration file path.
- getUserName() - Method in class org.apache.tika.client.HttpClientFactory
- getUserName() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- getUserName() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- getUShortBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned short value from the beginning of a byte array
- getUShortBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned short value from a byte array
- getUShortLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned short value from the beginning of a byte array
- getUShortLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned short value from a byte array
- getUtf16PropertiesToPrint() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Print file node data in UTF-16 format when they match these props.
- getValue() - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
- getValues(String) - Method in class org.apache.tika.metadata.Metadata
-
Get the values associated with a metadata name.
- getValues(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Returns the value of a simple property or all if the property is an array and the elements are of simple type.
- getValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Get the values associated with a metadata name.
- getValues(Property) - Method in class org.apache.tika.xmp.XMPMetadata
- getValueSerializer() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- getValueType() - Method in class org.apache.tika.metadata.Property
- getVector() - Method in class org.apache.tika.inference.Chunk
- getVersion() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Returns itsf header version
- getVersion() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Returns version of itsp header
- getVersion() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns a version of control data block
- getVersion() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Returns the version
- getVersion() - Method in class org.apache.tika.parser.mp3.AudioFrame
- getVersion() - Method in class org.apache.tika.server.core.resource.TikaVersion
- getVersionCode() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the version code.
- getWarnings() - Method in class org.apache.tika.parser.ParseRecord
- getWeights() - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Return weights in class-major [class][bucket] layout.
- getWeights() - Method in class org.apache.tika.ml.LinearModel
-
Return weights in class-major [class][bucket] layout.
- getWelcomeHTML() - Method in class org.apache.tika.server.core.resource.TikaWelcome
- getWelcomePlain() - Method in class org.apache.tika.server.core.resource.TikaWelcome
- getWindow() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getWindowPosition() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getWindowSize() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns a window size
- getWindowSize() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- getWindowSize(int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X, i.e. the exponent in 2^X.
- getWindowsPerReset() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns windows per reset
- getWrappedParser() - Method in class org.apache.tika.parser.ParserDecorator
-
Gets the parser wrapped by this ParserDecorator
- getWriteLimit() - Method in class org.apache.tika.config.OutputLimits
-
Gets the maximum characters to write.
- getWriteLimit() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- getWriteLimit() - Method in interface org.apache.tika.sax.WriteLimiter
- getWriteLimit(MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Parses the writeLimit header value from HTTP headers.
- getXhtml(InputStream, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return XHTML content.
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- getXml(InputStream, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse document and return XML content.
- getXMLInputFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the StAX input factory specified in this parsing context.
- getXMLInputFactory(ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the StAX input factory specified in this parsing context.
- getXMLReader() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the XMLReader specified in this parsing context.
- getXmlReaderUtils() - Method in class org.apache.tika.config.GlobalSettings
- getXMPData() - Method in class org.apache.tika.xmp.XMPMetadata
-
Provides direct access to the XMP data model, in case a client prefers to work directly on it instead of using the Metadata API
- getXMPMeta() - Method in class org.apache.tika.xmp.convert.AbstractConverter
- getYear() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
- getYear() - Method in interface org.apache.tika.parser.mp3.ID3Tags
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
- getZeroPadName() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- getZipBombRatio() - Method in class org.apache.tika.config.OutputLimits
-
Gets the zip bomb ratio (maximum output:input ratio).
- getZipBombThreshold() - Method in class org.apache.tika.config.OutputLimits
-
Gets the zip bomb threshold (characters before check activates).
- getZipInputStream() - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Returns an InputStream for the zip file.
- getZScore() - Method in class org.apache.tika.quality.TextQualityScore
-
Z-score relative to clean text for the detected script. 0 = average clean; negative = worse.
- GLOB_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- GlobalIdTableEntry3FNDX - Class in org.apache.tika.parser.microsoft.onenote
- GlobalIdTableEntry3FNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- GlobalIdTableEntryFNDX - Class in org.apache.tika.parser.microsoft.onenote
- GlobalIdTableEntryFNDX() - Constructor for class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
- GlobalSettings - Class in org.apache.tika.config
-
Global Tika configuration settings that don't belong to specific components.
- GlobalSettings() - Constructor for class org.apache.tika.config.GlobalSettings
- GlobalSettings.ServiceLoaderConfig - Class in org.apache.tika.config
-
Service loader configuration.
- GlobalSettings.XmlReaderUtilsConfig - Class in org.apache.tika.config
-
XML reader utilities security configuration.
- GoogleDriveFetcher - Class in org.apache.tika.pipes.fetcher.googledrive
- GoogleDriveFetcher(ExtensionConfig, GoogleDriveFetcherConfig) - Constructor for class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcher
- GoogleDriveFetcherConfig - Class in org.apache.tika.pipes.fetcher.googledrive.config
- GoogleDriveFetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- GoogleDriveFetcherFactory - Class in org.apache.tika.pipes.fetcher.googledrive
-
Factory for creating Google Drive fetchers.
- GoogleDriveFetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcherFactory
- GoogleDrivePipesPlugin - Class in org.apache.tika.pipes.plugin.googledrive
- GoogleDrivePipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.googledrive.GoogleDrivePipesPlugin
- googleTranslateToEnglish(String) - Static method in class org.apache.tika.example.TranscribeTranslateExample
-
Use GoogleTranslator to execute translation on input data.
- GoogleTranslator - Class in org.apache.tika.language.translate.impl
-
An implementation of a REST client to the Google Translate v2 API.
- GoogleTranslator() - Constructor for class org.apache.tika.language.translate.impl.GoogleTranslator
- GrabPhoneNumbersExample - Class in org.apache.tika.example
-
Class to demonstrate how to use the PhoneExtractingContentHandler to get a list of all of the phone numbers from every file in a directory.
- GrabPhoneNumbersExample() - Constructor for class org.apache.tika.example.GrabPhoneNumbersExample
- GRAPH - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- GRAY - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.ImageType
- GREEK - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- GREETING - Static variable in class org.apache.tika.server.core.resource.TikaResource
- GRIB_MIME_TYPE - Static variable in class org.apache.tika.parser.grib.GribParser
- GribParser - Class in org.apache.tika.parser.grib
- GribParser() - Constructor for class org.apache.tika.parser.grib.GribParser
- GrobidNERecogniser - Class in org.apache.tika.parser.ner.grobid
- GrobidNERecogniser() - Constructor for class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
- GrobidRESTParser - Class in org.apache.tika.parser.journal
- GrobidRESTParser() - Constructor for class org.apache.tika.parser.journal.GrobidRESTParser
- GROUP_CLOSE - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- GROUP_OPEN - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- GROUPS - Static variable in class org.apache.tika.ml.chardetect.CharsetConfusables
-
All confusable groups (both symmetric and superset chains), used for probability collapsing during inference via CharsetConfusables.collapseGroups(float[], int[][]).
- GTAR - Static variable in class org.apache.tika.detect.zip.PackageConstants
- guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
- guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
- guid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
- GUID - Class in org.apache.tika.parser.microsoft.onenote
- GUID(int[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.GUID
- guidCellSchemaId - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- guidFile - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- guidFileFormat - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- guidFileType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- guidIndex - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
- guidLegacyFileVersion - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- GuidUtil - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- GuidUtil() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.GuidUtil
- GZ - Static variable in class org.apache.tika.detect.gzip.GZipSpecializationDetector
- GZIP - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- GZIP_ALT - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- GZipSpecializationDetector - Class in org.apache.tika.detect.gzip
-
This is designed to detect commonly gzipped file types such as warc.gz.
- GZipSpecializationDetector() - Constructor for class org.apache.tika.detect.gzip.GZipSpecializationDetector
H
- H2Util - Class in org.apache.tika.eval.app.db
- H2Util(Path) - Constructor for class org.apache.tika.eval.app.db.H2Util
- HAN - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- HAN_COMPAT - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- HAN_EXT_A - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- HAN_EXT_B - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- handle(Metadata) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
Copies extracted tags to tika metadata using registered handlers.
- handle(String, MediaType, TikaInputStream, ParseContext) - Method in interface org.apache.tika.extractor.EmbeddedResourceHandler
-
Called to process an embedded resource within the container.
- handle(Iterator<Directory>) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
Copies extracted tags to tika metadata using registered handlers.
- handleBlob(String, String, int, ResultSet, int, ContentHandler, ParseContext) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- handleCatchableIOE(IOException) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- handleClob(String, String, int, ResultSet, int, ContentHandler, ParseContext) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- handleClob(String, String, int, ResultSet, int, ContentHandler, ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3TableReader
-
No-op for now in SQLite3TableReader.
- handleCrashAndGetExitCode() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- handleCrashAndGetExitCode() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Handles a crash by checking the process exit code and marking for restart.
- handleDate(ResultSet, int, ContentHandler) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- handleEmbeddedFile(PackagePart, XHTMLContentHandler, String, EmbeddedPartMetadata, TikaCoreProperties.EmbeddedResourceType) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Handles an embedded file in the document
- handleEmbeddedOfficeDoc(DirectoryEntry, String, XHTMLContentHandler, boolean) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
Handle an office document that's embedded at the POIFS level
- handleEmbeddedOfficeDoc(DirectoryEntry, Metadata, String, XHTMLContentHandler, boolean) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
Handle an office document that's embedded at the POIFS level
- handleEmbeddedOfficeDoc(DirectoryEntry, XHTMLContentHandler, boolean) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
Handle an office document that's embedded at the POIFS level
- handleEmbeddedResource(TikaInputStream, String, String, String, XHTMLContentHandler, boolean) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
- handleEmbeddedResource(TikaInputStream, String, String, ClassID, String, XHTMLContentHandler, boolean) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
- handleEmbeddedResource(TikaInputStream, Metadata, String, String, ClassID, String, XHTMLContentHandler, boolean) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
- handleEntryMetadata(String, Date, Date, Long, XHTMLContentHandler, ParseContext) - Static method in class org.apache.tika.parser.pkg.AbstractArchiveParser
-
Handles metadata for an archive entry and writes appropriate XHTML elements.
- handleException(SAXException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
Handle any exceptions thrown by methods in this class.
- handleException(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Tags any SAXExceptions thrown, wrapping and re-throwing.
- handleGlobError(MimeType, String, MimeTypeException, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
- handleInteger(ResultSet, int, ContentHandler) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- handleMimeError(String, MimeTypeException, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
- HANDLER_TYPE_HEADER - Static variable in class org.apache.tika.server.core.resource.TikaResource
-
Header to specify the handler type for content extraction.
- HANDLER_TYPE_PARAM - Static variable in class org.apache.tika.server.core.resource.RecursiveMetadataResource
- handlerTypeName() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- handlerTypeName() - Method in interface org.apache.tika.sax.ContentHandlerFactory
-
Returns the name of the handler type produced by this factory (e.g.
- handleSpecialName(String, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Handle special component names that require custom loading.
- handleSpecialName(String, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.DetectorLoader
- handleTimeStamp(ResultSet, int, ContentHandler) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- handleXMP(InputStream, int, ImageMetadataExtractor) - Method in class org.apache.tika.parser.image.BPGParser
- HANGUL - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- HAS_3D - Static variable in interface org.apache.tika.metadata.PDF
-
If the PDF has an annotation of type 3D
- HAS_ACROFORM_FIELDS - Static variable in interface org.apache.tika.metadata.PDF
-
Has > 0 AcroForm fields
- HAS_ANIMATIONS - Static variable in interface org.apache.tika.metadata.Office
- HAS_ATTACHED_TEMPLATE - Static variable in interface org.apache.tika.metadata.Office
- HAS_COLLECTION - Static variable in interface org.apache.tika.metadata.PDF
-
Has a collection element in the root.
- HAS_COMMENTS - Static variable in interface org.apache.tika.metadata.Office
- HAS_CONTENT - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- HAS_DATA_CONNECTIONS - Static variable in interface org.apache.tika.metadata.Office
- HAS_DDE_LINKS - Static variable in interface org.apache.tika.metadata.Office
- HAS_EXTERNAL_CHART_DATA - Static variable in interface org.apache.tika.metadata.Office
- HAS_EXTERNAL_LINKS - Static variable in interface org.apache.tika.metadata.Office
- HAS_EXTERNAL_OLE_OBJECTS - Static variable in interface org.apache.tika.metadata.Office
- HAS_EXTERNAL_PIVOT_DATA - Static variable in interface org.apache.tika.metadata.Office
- HAS_FIELD_HYPERLINKS - Static variable in interface org.apache.tika.metadata.Office
- HAS_FRAMESETS - Static variable in interface org.apache.tika.metadata.Office
- HAS_HIDDEN_COLUMNS - Static variable in interface org.apache.tika.metadata.Office
- HAS_HIDDEN_ROWS - Static variable in interface org.apache.tika.metadata.Office
- HAS_HIDDEN_SHEETS - Static variable in interface org.apache.tika.metadata.Office
- HAS_HIDDEN_SLIDES - Static variable in interface org.apache.tika.metadata.Office
- HAS_HIDDEN_TEXT - Static variable in interface org.apache.tika.metadata.Office
- HAS_HOVER_HYPERLINKS - Static variable in interface org.apache.tika.metadata.Office
- HAS_LINKED_OLE_OBJECTS - Static variable in interface org.apache.tika.metadata.Office
- HAS_MAIL_MERGE - Static variable in interface org.apache.tika.metadata.Office
- HAS_MARKED_CONTENT - Static variable in interface org.apache.tika.metadata.PDF
- HAS_POWER_QUERY - Static variable in interface org.apache.tika.metadata.Office
- HAS_SIGNATURE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- HAS_SUBDOCUMENTS - Static variable in interface org.apache.tika.metadata.Office
- HAS_TRACK_CHANGES - Static variable in interface org.apache.tika.metadata.Office
- HAS_VERY_HIDDEN_SHEETS - Static variable in interface org.apache.tika.metadata.Office
- HAS_VML_HYPERLINKS - Static variable in interface org.apache.tika.metadata.Office
- HAS_WEB_QUERIES - Static variable in interface org.apache.tika.metadata.Office
- HAS_XFA - Static variable in interface org.apache.tika.metadata.PDF
-
Has XFA
- HAS_XMP - Static variable in interface org.apache.tika.metadata.PDF
-
Has XMP, whether or not it is valid
- has2ByteColumnAsymmetry(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if the probe's byte distribution across stride-2 columns is sufficiently asymmetric to be plausible UTF-16 of some script.
- has2ByteColumnAsymmetryEvidence(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Evidence-based variant of StructuralEncodingRules.has2ByteColumnAsymmetry(byte[]) with no conservative short-probe default: returns true only when the bytes themselves demonstrate column asymmetry, regardless of probe length.
- hasAnimations() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- hasApplicationError() - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
-
Returns true if an application error has occurred during processing.
- hasArrayComponents(String) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Checks if a component type has any configured components (array format).
- hasC1Bytes(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if the probe contains any byte in the C1 control range 0x80–0x9F.
- hasC1Bytes(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- hasCalibration() - Method in class org.apache.tika.ml.LinearModel
-
true if this model carries per-class calibration statistics.
- hasComponent(String) - Method in class org.apache.tika.config.loader.ComponentRegistry
-
Checks if a component with the given name is registered.
- hasComponent(String) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Checks if a component with the given name is registered in any registry.
- hasComponentConfig(Class<?>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Checks if a component config is registered for the given class.
- hasComponentConfig(String) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Checks if a component config is registered for the given JSON field.
- hasComponents(String) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Checks if a component type has any configured components (object format).
- hasComponentSection(String) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Checks if a component type section exists in the config (even if empty).
- hasConfig(ParseContext, String) - Static method in class org.apache.tika.config.ParseContextConfig
-
Checks if runtime configuration exists for the given key.
- hasConfig(ParseContext, String) - Static method in class org.apache.tika.serialization.ConfigDeserializer
-
Checks if a configuration exists in the ParseContext.
- hasConfigFile() - Method in class org.apache.tika.server.core.TikaServerConfig
- hasConfiguration(JsonConfig, ObjectMapper) - Static method in class org.apache.tika.config.loader.ComponentInstantiator
-
Checks if the JsonConfig contains actual configuration (non-empty JSON object with fields).
- hasCrlfBytes(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if the probe contains at least one CRLF pair (0x0D 0x0A).
- hasCrlfBytes(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- hasCustomLoader() - Method in class org.apache.tika.serialization.ComponentConfig
-
Returns true if this component has a custom loader.
- hasDefault() - Method in class org.apache.tika.serialization.ComponentConfig
- hasDwgRead() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- hasEmbeddedFiles() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns true if there are any embedded files.
- hasEmbeddedFiles() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Returns true if there are any embedded files stored.
- hasEnoughText() - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- hasEnoughText() - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- hasEnoughText() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Tell the caller whether more text is required for the current document before the language can be reliably detected.
- hasFile() - Method in class org.apache.tika.io.TikaInputStream
- hasFiltering() - Method in class org.apache.tika.config.loader.FrameworkConfig.ParserDecoration
- hasGb18030FourByteSequence(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if the probe contains at least one GB18030-specific 4-byte sequence.
- hasGb18030FourByteSequence(byte[], int, int) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
- hash() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns the value of the hash record component.
- hashCode() - Method in class org.apache.tika.config.EmbeddedLimits
- hashCode() - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Returns a hash code value for this object.
- hashCode() - Method in class org.apache.tika.config.OutputLimits
- hashCode() - Method in class org.apache.tika.config.TimeoutLimits
- hashCode() - Method in class org.apache.tika.DeleteFetcherReply
- hashCode() - Method in class org.apache.tika.DeleteFetcherRequest
- hashCode() - Method in class org.apache.tika.DeletePipesIteratorReply
- hashCode() - Method in class org.apache.tika.DeletePipesIteratorRequest
- hashCode() - Method in class org.apache.tika.eval.app.db.ColInfo
- hashCode() - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
- hashCode() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- hashCode() - Method in class org.apache.tika.FetchAndParseReply
- hashCode() - Method in class org.apache.tika.FetchAndParseRequest
- hashCode() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- hashCode() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- hashCode() - Method in class org.apache.tika.GetFetcherReply
- hashCode() - Method in class org.apache.tika.GetFetcherRequest
- hashCode() - Method in class org.apache.tika.GetPipesIteratorReply
- hashCode() - Method in class org.apache.tika.GetPipesIteratorRequest
- hashCode() - Method in class org.apache.tika.ListFetchersReply
- hashCode() - Method in class org.apache.tika.ListFetchersRequest
- hashCode() - Method in class org.apache.tika.metadata.Metadata
- hashCode() - Method in class org.apache.tika.metadata.Property
- hashCode() - Method in class org.apache.tika.mime.MediaType
- hashCode() - Method in class org.apache.tika.mime.MimeType
- hashCode() - Method in class org.apache.tika.parser.csv.CSVResult
- hashCode() - Method in class org.apache.tika.parser.html.DataURIScheme
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
Override the GetHashCode.
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
Override the GetHashCode.
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- hashCode() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- hashCode() - Method in class org.apache.tika.parser.ParseContext
- hashCode() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Generates a hash code based on the confidence value.
- hashCode() - Method in record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Returns a hash code value for this object.
- hashCode() - Method in class org.apache.tika.pipes.api.emitter.EmitKey
- hashCode() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- hashCode() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- hashCode() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.core.async.EmitDataPair
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Returns a hash code value for this object.
- hashCode() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- hashCode() - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- hashCode() - Method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorConfig
- hashCode() - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- hashCode() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.plugins.ExtensionConfig
-
Returns a hash code value for this object.
- hashCode() - Method in class org.apache.tika.renderer.PageRangeRequest
- hashCode() - Method in class org.apache.tika.SaveFetcherReply
- hashCode() - Method in class org.apache.tika.SaveFetcherRequest
- hashCode() - Method in class org.apache.tika.SavePipesIteratorReply
- hashCode() - Method in class org.apache.tika.SavePipesIteratorRequest
- hashCode() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- hashCode() - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Returns a hash code value for this object.
- hashCode() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns a hash code value for this object.
- hashCode() - Method in class org.apache.tika.xmp.XMPMetadata
- hasHitBound() - Method in class org.apache.tika.io.BoundedInputStream
- hasID3v1() - Method in class org.apache.tika.parser.mp3.LyricsHandler
- hasImplementationsOf(Class<?>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Checks if any registered component implements or extends the given abstract type.
- hasJsonConfig(String) - Method in class org.apache.tika.parser.ParseContext
-
Checks if a JSON configuration exists for the given component name.
- hasKey(String) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Checks if a configuration key exists in the JSON config.
- hasKey(String) - Method in class org.apache.tika.config.loader.TikaJsonConfig
-
Checks if a configuration key exists.
- hasLength() - Method in class org.apache.tika.io.TikaInputStream
- hasListWrapper() - Method in class org.apache.tika.serialization.ComponentConfig
- hasLyrics() - Method in class org.apache.tika.parser.mp3.LyricsHandler
- hasMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
- hasMagic() - Method in class org.apache.tika.mime.MimeType
- hasMasks(PDImage) - Static method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- hasModel(String) - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- hasModel(String) - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- hasModel(String) - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- hasModel(String) - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
- hasModel(String) - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- hasModel(String) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Provide information about whether a model exists for a specific language.
- hasNext() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
- hasOriginalDocument() - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Returns true if the original document was stored.
- hasOriginalDocument() - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Returns true if the original document was stored.
- hasParameter() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- hasParameters() - Method in class org.apache.tika.mime.MediaType
-
Checks whether this media type contains parameters.
- hasRange() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- hasResources() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Returns true if this package has any resources.
- hasSkip(DirectoryListingEntry) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Checks skippable patterns
- hasStream() - Method in class org.apache.tika.example.ImportContextImpl
- hasTesseract() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- hasTestLanguage(String) - Method in class org.apache.tika.langdetect.LanguageDetectorTest
- hasTrustStore() - Method in class org.apache.tika.server.core.TlsConfig
- HasVersionPages - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- hasWarned() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- HDFParser - Class in org.apache.tika.parser.hdf
-
Since the NetCDFParser depends on the NetCDF-Java API, we are able to use it to parse HDF files as well.
- HDFParser() - Constructor for class org.apache.tika.parser.hdf.HDFParser
- header - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
- header - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
- header - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
- headerCell - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- HeaderCell - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- HeaderCell() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
- headerCellCellManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- headerCellRevisionManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- headerFooter(String, boolean, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- HeaderFooterFromString(String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- headers - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- headers() - Method in record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Returns the value of the headers record component.
- headerType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Gets or sets the type of the stream object.
- HEADLINE - Static variable in interface org.apache.tika.metadata.IPTC
-
A brief synopsis of the caption.
- HEADLINE - Static variable in interface org.apache.tika.metadata.Photoshop
- HEBREW - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- HeifParser - Class in org.apache.tika.parser.image
- HeifParser() - Constructor for class org.apache.tika.parser.image.HeifParser
- HEX - Enum constant in enum class org.apache.tika.digest.DigestDef.Encoding
- HEX_ESCAPE - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- HEX_OUT_OF_RANGE - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- HexCoDec - Class in org.apache.tika.mime
-
A set of Hex encoding and decoding utility methods.
- HexCoDec() - Constructor for class org.apache.tika.mime.HexCoDec
- hfHelper - Static variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
Allows access to headers/footers from raw xml strings
- Hidden - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- HIDDEN_SHEET_NAMES - Static variable in interface org.apache.tika.metadata.Office
- HIDDEN_SLIDES - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- HIGH - Enum constant in enum class org.apache.tika.language.detect.LanguageConfidence
- Highlight - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- HIRAGANA - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- HISTORY - Static variable in interface org.apache.tika.metadata.ClimateForcast
- HISTORY_ACTION - Static variable in interface org.apache.tika.metadata.XMPMM
-
Action in the XMPMM's history section
- HISTORY_EVENT_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Instance id in the XMPMM's history section
- HISTORY_OF - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- HISTORY_SOFTWARE_AGENT - Static variable in interface org.apache.tika.metadata.XMPMM
-
Software agent that created the action in the XMPMM's history section
- HISTORY_WHEN - Static variable in interface org.apache.tika.metadata.XMPMM
-
When the action occurred in the XMPMM's history section
- HOCR - Enum constant in enum class org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
- HoughLine() - Constructor for class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
- HRESULTError - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
HRESULT Error
- HSLFExtractor - Class in org.apache.tika.parser.microsoft
- HSLFExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.HSLFExtractor
- HTML - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.BODY_TYPES_PROCESSED
- HTML - Enum constant in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- HTML - Interface in org.apache.tika.metadata
- HtmlByteStripper - Class in org.apache.tika.ml.chardetect
-
Byte-level HTML tag stripper used as a preprocess for charset detection.
- HtmlByteStripper.Result - Class in org.apache.tika.ml.chardetect
-
Result of a strip operation: new content length and the number of well-formed tags (including comments) successfully parsed.
- HtmlEncodingDetector - Class in org.apache.tika.parser.html
-
Character encoding detector for determining the character encoding of an HTML document based on the potential charset parameter found in a Content-Type http-equiv meta tag somewhere near the beginning.
- HtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.HtmlEncodingDetector
-
Default constructor for SPI loading.
- HtmlEncodingDetector(JsonConfig) - Constructor for class org.apache.tika.parser.html.HtmlEncodingDetector
-
Constructor for JSON configuration.
- HtmlEncodingDetector(HtmlEncodingDetector.Config) - Constructor for class org.apache.tika.parser.html.HtmlEncodingDetector
-
Constructor with explicit Config object.
- HtmlEncodingDetector.Config - Class in org.apache.tika.parser.html
-
Configuration class for JSON deserialization.
- HTMLHelper - Class in org.apache.tika.server.core
-
Helps produce user facing HTML output.
- HTMLHelper() - Constructor for class org.apache.tika.server.core.HTMLHelper
- HtmlMapper - Interface in org.apache.tika.parser.html
-
HTML mapper used to make incoming HTML documents easier to handle by Tika clients.
- HTTP_CONTENT_ENCODING - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_CONTENT_ENCODING - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HTTP_CONTENT_TYPE - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_CONTENT_TYPE - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HTTP_FETCH_PREFIX - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_FETCH_PREFIX - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HTTP_FETCH_TRUNCATED - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_FETCH_TRUNCATED - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HTTP_HEADER_PREFIX - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_HEADER_PREFIX - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HTTP_NUM_REDIRECTS - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_NUM_REDIRECTS - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
-
Number of redirects
- HTTP_STATUS_CODE - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_STATUS_CODE - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
-
HTTP status code
- HTTP_TARGET_IP_ADDRESS - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_TARGET_IP_ADDRESS - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HTTP_TARGET_URL - Static variable in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- HTTP_TARGET_URL - Static variable in class org.apache.tika.pipes.fetcher.http.HttpFetcher
-
If there were redirects, this captures the final URL visited
- HttpCall(String, String, Map<String, String>) - Constructor for record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Creates an instance of a HttpCall record class.
- httpClient - Variable in class org.apache.tika.pipes.emitter.es.ESClient
- httpClient - Variable in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- httpClient - Variable in class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- httpClientConfig() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the httpClientConfig record component.
- httpClientConfig() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the httpClientConfig record component.
- httpClientConfig() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the httpClientConfig record component.
- httpClientConfig() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns the value of the httpClientConfig record component.
- HttpClientConfig - Record Class in org.apache.tika.pipes.emitter.es
-
HTTP client settings for the ES emitter and reporter.
- HttpClientConfig - Record Class in org.apache.tika.pipes.emitter.opensearch
- HttpClientConfig - Record Class in org.apache.tika.pipes.reporter.opensearch
- HttpClientConfig(String, String, String, int, int, String, int) - Constructor for record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Creates an instance of a HttpClientConfig record class.
- HttpClientConfig(String, String, String, int, int, String, int) - Constructor for record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Creates an instance of a HttpClientConfig record class.
- HttpClientConfig(String, String, String, int, int, String, int, boolean) - Constructor for record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Creates an instance of a HttpClientConfig record class.
- HttpClientFactory - Class in org.apache.tika.client
-
This holds quite a bit of state and is not thread safe.
- HttpClientFactory() - Constructor for class org.apache.tika.client.HttpClientFactory
- HttpClientUtil - Class in org.apache.tika.client
- HttpClientUtil() - Constructor for class org.apache.tika.client.HttpClientUtil
- HttpFetcher - Class in org.apache.tika.pipes.fetcher.http
-
Based on Apache httpclient
- HttpFetcher(ExtensionConfig, HttpFetcherConfig) - Constructor for class org.apache.tika.pipes.fetcher.http.HttpFetcher
- HttpFetcherConfig - Class in org.apache.tika.pipes.fetcher.http.config
- HttpFetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- HttpFetcherFactory - Class in org.apache.tika.pipes.fetcher.http
-
Factory for creating HTTP fetchers.
- HttpFetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.http.HttpFetcherFactory
- HttpHeaders - Class in org.apache.tika.pipes.fetcher.http.config
- HttpHeaders - Interface in org.apache.tika.metadata
-
A collection of HTTP header names.
- HttpHeaders() - Constructor for class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- HttpHeaders(Map<String, List<String>>) - Constructor for class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- httpMethod - Variable in class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
- HttpParser - Class in org.apache.tika.parser.http
- HttpParser() - Constructor for class org.apache.tika.parser.http.HttpParser
- HttpPipesPlugin - Class in org.apache.tika.pipes.plugin.http
- HttpPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.http.HttpPipesPlugin
- HWP - Static variable in class org.apache.tika.detect.ole.MiscOLEDetector
-
Hangul Word Processor (Korean)
- HWP_MIME_TYPE - Static variable in class org.apache.tika.parser.hwp.HwpV5Parser
- HwpStreamReader - Class in org.apache.tika.parser.hwp
- HwpStreamReader(InputStream) - Constructor for class org.apache.tika.parser.hwp.HwpStreamReader
- HwpTextExtractorV5 - Class in org.apache.tika.parser.hwp
- HwpTextExtractorV5() - Constructor for class org.apache.tika.parser.hwp.HwpTextExtractorV5
- HwpV5Parser - Class in org.apache.tika.parser.hwp
- HwpV5Parser() - Constructor for class org.apache.tika.parser.hwp.HwpV5Parser
- Hyperlink - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- hyperlinkEnd() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- hyperlinkEnd() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- HyperlinkProtected - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- hyperlinkStart(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- hyperlinkStart(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- hyperlinkUpdate(HyperlinkEvent) - Method in class org.apache.tika.gui.TikaGUI
I
- I - Enum constant in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
- ICC_NS - Static variable in class org.apache.tika.parser.image.ImageMetadataExtractor
- ICNS_MIME_TYPE - Static variable in class org.apache.tika.parser.image.ICNSParser
- ICNSParser - Class in org.apache.tika.parser.image
-
A basic parser class for Apple ICNS icon files
- ICNSParser() - Constructor for class org.apache.tika.parser.image.ICNSParser
- Icu4jEncodingDetector - Class in org.apache.tika.parser.txt
- Icu4jEncodingDetector() - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
Default constructor for SPI loading.
- Icu4jEncodingDetector(JsonConfig) - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
Constructor for JSON configuration.
- Icu4jEncodingDetector(Icu4jEncodingDetector.Config) - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
Constructor with explicit Config object.
- Icu4jEncodingDetector.Config - Class in org.apache.tika.parser.txt
-
Configuration class for JSON deserialization.
- id - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
- id - Variable in class org.apache.tika.parser.microsoft.rtf.ListDescriptor
- id() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the id record component.
- id() - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Returns the value of the id record component.
- id() - Method in record class org.apache.tika.plugins.ExtensionConfig
-
Returns the value of the id record component.
- ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- ID - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- ID - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- ID - Static variable in class org.apache.tika.eval.app.ProfilerBase
- ID - Static variable in interface org.apache.tika.metadata.QuattroPro
-
ID.
- ID - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- ID_PROPERTY - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
- ID3Comment(String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Creates an ID3 v1 style comment tag
- ID3Comment(String, String, String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Creates an ID3 v2 style comment tag
- ID3Tags - Interface in org.apache.tika.parser.mp3
-
Interface that defines the common interface for ID3 tag parsers, such as ID3v1 and ID3v2.3.
- ID3Tags.ID3Comment - Class in org.apache.tika.parser.mp3
-
Represents a comment in ID3 (especially ID3 v2), which is made up of several parts
- ID3TagsAndAudio() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser.ID3TagsAndAudio
- ID3v1Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 1 Tag information from an MP3 file, if available.
- ID3v1Handler(byte[]) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
-
Creates from the last 128 bytes of a stream.
- ID3v1Handler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
- ID3v22Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.2 Tag information from an MP3 file, if available.
- ID3v22Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v22Handler
- ID3v23Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.3 Tag information from an MP3 file, if available.
- ID3v23Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v23Handler
- ID3v24Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.4 Tag information from an MP3 file, if available.
- ID3v24Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v24Handler
- ID3v2Frame - Class in org.apache.tika.parser.mp3
-
A frame of ID3v2 data, which is then passed to a handler to be turned into useful data.
- ID3v2Frame.RawTag - Class in org.apache.tika.parser.mp3
- ID3v2Frame.RawTagIterator - Class in org.apache.tika.parser.mp3
-
Iterates over id3v2 raw tags.
- ID3v2Frame.TextEncoding - Class in org.apache.tika.parser.mp3
- IDBWriter - Interface in org.apache.tika.eval.app.io
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.DublinCore
-
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.XMP
-
An unordered array of text strings that unambiguously identify the resource within a given context.
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.XMPDC
-
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
- identifyEndpoints() - Method in class org.apache.tika.server.core.resource.TikaWelcome
- identifyStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the defined static service providers of the given type, without attempting to load them.
- IdentityHtmlMapper - Class in org.apache.tika.parser.html
-
Alternative HTML mapping rules that pass the input HTML as-is without any modifications.
- IdentityHtmlMapper() - Constructor for class org.apache.tika.parser.html.IdentityHtmlMapper
- idField() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the idField record component.
- idField() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the idField record component.
- idField() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the idField record component.
- IDMLParser - Class in org.apache.tika.parser.indesign
-
Adobe InDesign IDML Parser.
- IDMLParser() - Constructor for class org.apache.tika.parser.indesign.IDMLParser
- IFSSHTTPBSerializable - Interface in org.apache.tika.parser.microsoft.onenote.fsshttpb
-
FSSHTTPB Serialize interface.
- IgniteConfigStore - Class in org.apache.tika.pipes.ignite
-
Apache Ignite 3.x-based implementation of ConfigStore.
- IgniteConfigStore() - Constructor for class org.apache.tika.pipes.ignite.IgniteConfigStore
- IgniteConfigStore(String) - Constructor for class org.apache.tika.pipes.ignite.IgniteConfigStore
- IgniteConfigStore(ExtensionConfig) - Constructor for class org.apache.tika.pipes.ignite.IgniteConfigStore
- IgniteConfigStoreConfig - Class in org.apache.tika.pipes.ignite.config
-
Configuration for IgniteConfigStore.
- IgniteConfigStoreConfig() - Constructor for class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- IgniteConfigStoreFactory - Class in org.apache.tika.pipes.ignite
-
Factory for creating Ignite-based ConfigStore instances.
- IgniteConfigStoreFactory() - Constructor for class org.apache.tika.pipes.ignite.IgniteConfigStoreFactory
- IgniteStoreServer - Class in org.apache.tika.pipes.ignite.server
-
Embedded Ignite 3.x server node that hosts the config store table.
- IgniteStoreServer() - Constructor for class org.apache.tika.pipes.ignite.server.IgniteStoreServer
- IgniteStoreServer(String, String) - Constructor for class org.apache.tika.pipes.ignite.server.IgniteStoreServer
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ToMarkdownContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
-
Writes the given ignorable characters to the given character stream.
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- IGNORE - Enum constant in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- IGNORE_ACCESSIBILITY_ALLOWANCE - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.AccessCheckMode
-
If extraction is blocked, throw an AccessPermissionException even if the document allows extraction for accessibility.
- IGNORE_LENGTH - Static variable in class org.apache.tika.eval.app.io.ExtractReader
- IGNORE_ZERO_BYTE_FILE_EXCEPTION - Static variable in exception org.apache.tika.exception.ZeroByteFileException
-
If this is in the ParseContext, the AutoDetectParser and the RecursiveParserWrapper will ignore embedded files with zero-byte length inputstreams
- ignoreCharsets - Variable in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- ignoreListMarkup - Variable in class org.apache.tika.parser.microsoft.rtf.RTFParser.Config
- IgnoreZeroByteFileException() - Constructor for class org.apache.tika.exception.ZeroByteFileException.IgnoreZeroByteFileException
- ILLUSTRATOR - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaIllustrator
- ILLUSTRATOR_TYPE - Static variable in interface org.apache.tika.metadata.PDF
- image(String) - Static method in class org.apache.tika.mime.MediaType
- IMAGE - Enum constant in enum class org.apache.tika.extractor.EmbeddedDocumentUtil.EmbeddedResourcePrefix
- IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Images in the document
- IMAGE_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Creator or creators of the image.
- IMAGE_CREATOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the creator or creators of the image.
- IMAGE_CREATOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- IMAGE_CREATOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the creator or creators of the image.
- IMAGE_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Image height in pixels."
- IMAGE_MAGICK - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- IMAGE_REGISTRY_ENTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
Both a Registry Item Id and a Registry Organisation Id to record any registration of this item with a registry.
- IMAGE_ROTATION - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- IMAGE_SUPPLIER - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
- IMAGE_SUPPLIER_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
- IMAGE_SUPPLIER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- IMAGE_SUPPLIER_IMAGE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier assigned by the Image Supplier to the image.
- IMAGE_SUPPLIER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
- IMAGE_WIDTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Image width in pixels."
- ImageAltText - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- imageCounter - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- ImageDeskew - Class in org.apache.tika.parser.ocr.tess4j
-
Copied and pasted from Tess4j (https://sourceforge.net/projects/tess4j/)
- ImageDeskew(BufferedImage) - Constructor for class org.apache.tika.parser.ocr.tess4j.ImageDeskew
- ImageDeskew.HoughLine - Class in org.apache.tika.parser.ocr.tess4j
- ImageEmbeddingConfig - Class in org.apache.tika.inference
-
Configuration for image embedding parsers that call a CLIP-like vector endpoint.
- ImageEmbeddingConfig() - Constructor for class org.apache.tika.inference.ImageEmbeddingConfig
- ImageEmbeddingConfig.RuntimeConfig - Class in org.apache.tika.inference
-
Runtime-only config that prevents modification of security-sensitive and cost-sensitive fields (baseUrl, apiKey, model) at parse time.
- ImageFilename - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ImageGraphicsEngine - Class in org.apache.tika.parser.pdf.image
-
Copied nearly verbatim from PDFBox
- ImageGraphicsEngine(PDPage, int, EmbeddedDocumentExtractor, PDFParserConfig, Map<COSStream, Integer>, AtomicInteger, XHTMLContentHandler, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- ImageGraphicsEngineFactory - Class in org.apache.tika.parser.pdf.image
- ImageGraphicsEngineFactory() - Constructor for class org.apache.tika.parser.pdf.image.ImageGraphicsEngineFactory
- ImageMetadataExtractor - Class in org.apache.tika.parser.image
-
Uses the Metadata Extractor library to read EXIF and IPTC image metadata and map to Tika fields.
- ImageMetadataExtractor(Metadata) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
- ImageMetadataExtractor(Metadata, ImageMetadataExtractor.DirectoryHandler...) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
- ImageParser - Class in org.apache.tika.parser.image
- ImageParser() - Constructor for class org.apache.tika.parser.image.ImageParser
- ImageUploadState - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ImageUtil - Class in org.apache.tika.parser.ocr.tess4j
- ImageUtil() - Constructor for class org.apache.tika.parser.ocr.tess4j.ImageUtil
- IMPORTANCE - Static variable in interface org.apache.tika.metadata.MAPI
- ImportContextImpl - Class in org.apache.tika.example
-
ImportContextImpl...
- ImportContextImpl(Item, String, InputContext, InputStream, IOListener, Detector) - Constructor for class org.apache.tika.example.ImportContextImpl
-
Creates a new item import context.
- IN_REPLY_TO_ID - Static variable in interface org.apache.tika.metadata.MAPI
- include - Variable in class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter.Config
- IncludeFieldMetadataFilter - Class in org.apache.tika.metadata.filter
- IncludeFieldMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
- IncludeFieldMetadataFilter(Set<String>) - Constructor for class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
- IncludeFieldMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
-
Constructor for JSON configuration.
- IncludeFieldMetadataFilter(IncludeFieldMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
-
Constructor with explicit Config object.
- IncludeFieldMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- includeRouting() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the includeRouting record component.
- includeRouting() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns the value of the includeRouting record component.
- includes() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the includes record component.
- includes() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the includes record component.
- includes() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns the value of the includes record component.
- inclusiveOr(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- inclusiveOr(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- inclusiveOr(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- INCORRECT_EXTRACT_FILE_SUFFIX - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- increaseFramesRead() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- increment() - Method in class org.apache.tika.parser.pdf.OCRPageCounter
- increment(String) - Method in class org.apache.tika.eval.core.tokens.TokenCounts
- INCREMENTAL_UPDATE_NUMBER - Static variable in interface org.apache.tika.metadata.PDF
-
This is a zero-based number for incremental updates within a PDF -- 0 is the first update, 1 is the second, etc.
- IncrementalUpdateRecord - Class in org.apache.tika.parser.pdf.updates
- IncrementalUpdateRecord(Path, List<StartXRefOffset>) - Constructor for class org.apache.tika.parser.pdf.updates.IncrementalUpdateRecord
- incrementEmbeddedCount() - Method in class org.apache.tika.parser.ParseRecord
-
Increments the embedded document count.
- incrementFilesProcessed(long) - Method in class org.apache.tika.pipes.core.PerClientServerManager
- incrementFilesProcessed(long) - Method in interface org.apache.tika.pipes.core.ServerManager
-
Increments the count of files processed and marks for restart if limit reached.
- incrementFilesProcessed(long) - Method in class org.apache.tika.pipes.core.SharedServerManager
-
Increments the count of files processed and marks for restart if limit reached.
- incrementLevel(int, AbstractListManager.LevelTuple[]) - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
Apply this to every numbered paragraph in order.
- index - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- index - Variable in exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
- index - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
- indexContentSpecificMet(File) - Method in class org.apache.tika.example.MetadataAwareLuceneIndexer
- indexDocument(File) - Method in class org.apache.tika.example.LuceneIndexer
- indexDocument(File) - Method in class org.apache.tika.example.LuceneIndexerExtended
- indexOfDataSpaceStorageElement(byte[], byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Searches for some pattern in a byte[]
- indexOfDataSpaceStorageElement(List<DirectoryListingEntry>, String) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Searches for some pattern in the directory listing entry list. This requires that the entry name start with "::DataSpaceStorage". See TIKA-4204.
- indexOfResetTableBlock(byte[], byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Returns an index of the reset table
- indexWithDublinCore(File) - Method in class org.apache.tika.example.MetadataAwareLuceneIndexer
- InferenceConfig - Class in org.apache.tika.inference
-
Configuration for the inference metadata filters.
- InferenceConfig() - Constructor for class org.apache.tika.inference.InferenceConfig
- InferenceConfig.RuntimeConfig - Class in org.apache.tika.inference
-
Runtime-only config that prevents modification of security-sensitive and cost-sensitive fields (baseUrl, apiKey, model) at parse time.
- informCompleted(boolean) - Method in class org.apache.tika.example.ImportContextImpl
- init() - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Initializes the configuration store.
- init() - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- init() - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- init() - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- init(ProcessingEnvironment) - Method in class org.apache.tika.annotation.TikaComponentProcessor
- init(TikaLoader, ServerStatus, PipesParsingHelper) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Initialize TikaResource with pipes-based parsing for process isolation.
- INITIAL_AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the initial creator/author of a document
- Initializable - Interface in org.apache.tika.config
-
Components that must do special processing across multiple fields at initialization time should implement this interface.
- INITIALIZATION_FAILURE - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Component initialization failed - processing should stop, might be transient
- initialize() - Method in interface org.apache.tika.config.Initializable
-
Called after all properties have been set to allow for validation and initialization that depends on multiple properties.
- initialize() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- initialize() - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- initialize() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- initialize() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParser
- initialize() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- initialize() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- initialize() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRParser
- initialize() - Method in class org.apache.tika.parser.strings.StringsParser
- initialize() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- initialize() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- initialize() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcher
- initialize() - Method in class org.apache.tika.pipes.fetcher.googledrive.GoogleDriveFetcher
- initialize(GeoParserConfig) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
Initializes this parser
- initializeResources() - Method in class org.apache.tika.pipes.core.server.PipesServer
- INLINE - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- InMemoryConfigStore - Class in org.apache.tika.pipes.core.config
-
Default in-memory implementation of ConfigStore using a ConcurrentHashMap.
- InMemoryConfigStore() - Constructor for class org.apache.tika.pipes.core.config.InMemoryConfigStore
- INPUT_FILE_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
- INPUT_FILE_TOKEN - Static variable in class org.apache.tika.parser.external.ExternalParser
- inputFilterEnabled() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Test whether or not input filtering is enabled.
- InputStreamDigester - Class in org.apache.tika.digest
-
Digester that uses TikaInputStream.enableRewind() and TikaInputStream.rewind() to read the entire stream for digesting, then rewind for subsequent processing.
- InputStreamDigester(String, String, Encoder) - Constructor for class org.apache.tika.digest.InputStreamDigester
- insert() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the insert record component.
- INSERT - Enum constant in enum class org.apache.tika.parser.microsoft.ooxml.EditType
- insertWithOverflow(TokenIntPair) - Method in class org.apache.tika.eval.core.textstats.TokenCountPriorityQueue
- insertWithOverflow(TokenIntPair) - Method in class org.apache.tika.eval.core.tokens.TokenCountPriorityQueue
- INSTANCE - Static variable in class org.apache.tika.detect.EmptyDetector
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.digest.SkipContainerDocumentDigest
- INSTANCE - Static variable in class org.apache.tika.parser.EmptyParser
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.ErrorParser
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.html.DefaultHtmlMapper
- INSTANCE - Static variable in class org.apache.tika.parser.html.IdentityHtmlMapper
- INSTANCE - Static variable in exception org.apache.tika.sax.StoppingEarlyException
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.AttributeMatcher
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.ElementMatcher
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.NodeMatcher
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.TextMatcher
- INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
An identifier for a specific incarnation of a resource, updated each time a file is saved.
- instantiate(Class<?>, JsonNode, ObjectMapper) - Static method in class org.apache.tika.config.loader.ComponentInstantiator
-
Instantiates a component from a JsonNode configuration.
- instantiate(Class<?>, JsonConfig, ClassLoader, String, ObjectMapper) - Static method in class org.apache.tika.config.loader.ComponentInstantiator
-
Instantiates a component with JsonConfig constructor or falls back to zero-arg constructor.
- instantiate(String, JsonNode) - Method in class org.apache.tika.config.loader.LoaderContext
-
Instantiate a component by name and config.
- instantiate(String, JsonNode, ObjectMapper, ClassLoader) - Static method in class org.apache.tika.config.loader.ComponentInstantiator
-
Instantiates a component by resolving a friendly name or FQCN to a class.
- instantiateComponent(String, JsonNode, ObjectMapper, ClassLoader, Class<?>) - Static method in class org.apache.tika.config.loader.ComponentInstantiator
-
Instantiates a Tika component with full special-case handling.
- inStartElement - Variable in class org.apache.tika.sax.ToXMLContentHandler
- INSTITUTION - Static variable in interface org.apache.tika.metadata.ClimateForcast
- INSTRUCTIONS - Static variable in interface org.apache.tika.metadata.IPTC
-
Any of a number of instructions from the provider or creator to the receiver of the item.
- INSTRUCTIONS - Static variable in interface org.apache.tika.metadata.Photoshop
- INSTRUMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The musical instrument."
- int64BitsToDouble(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- INTEGER - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- INTEGRITY_CHECK_RESULT - Static variable in interface org.apache.tika.metadata.Zip
-
Result of the integrity check comparing central directory to local headers.
- intelE8Decoding() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- INTELLECTUAL_GENRE - Static variable in interface org.apache.tika.metadata.IPTC
-
Describes the nature, intellectual, artistic or journalistic characteristic of an item, not specifically its content.
- interceptorClasses() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the interceptorClasses record component.
- INTERMEDIATE_RESULT - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- IntermediateNodeEnd - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Intermediate Node End
- IntermediateNodeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- IntermediateNodeObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Root Node Object
- IntermediateNodeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Initializes a new instance of the IntermediateNodeObject class.
- IntermediateNodeObject.RootNodeObjectBuilder - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The class is used to build a root node object.
- IntermediateNodeObjectBuilder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
- intermediateNodeObjectList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
- intermediateResult(byte[]) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- IntermediateResult - Class in org.apache.tika.pipes.core.server
- IntermediateResult() - Constructor for class org.apache.tika.pipes.core.server.IntermediateResult
- INTERNAL_PATH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This records the metadata as stored within a file for an embedded file's path including the file name.
- internalBoolean(String) - Static method in class org.apache.tika.metadata.Property
- internalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- internalDate(String) - Static method in class org.apache.tika.metadata.Property
- internalDateBag(String) - Static method in class org.apache.tika.metadata.Property
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeleteFetcherReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeleteFetcherRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeletePipesIteratorReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.DeletePipesIteratorRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.FetchAndParseReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.FetchAndParseReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.FetchAndParseRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetFetcherRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetPipesIteratorReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.GetPipesIteratorRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.ListFetchersReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.ListFetchersReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.ListFetchersRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.ListFetchersRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SaveFetcherReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SaveFetcherReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SaveFetcherRequest
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SavePipesIteratorReply
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- internalGetFieldAccessorTable() - Method in class org.apache.tika.SavePipesIteratorRequest
- internalGetMapFieldReflection(int) - Method in class org.apache.tika.FetchAndParseReply.Builder
- internalGetMapFieldReflection(int) - Method in class org.apache.tika.FetchAndParseReply
- internalGetMapFieldReflection(int) - Method in class org.apache.tika.GetFetcherReply.Builder
- internalGetMapFieldReflection(int) - Method in class org.apache.tika.GetFetcherReply
- internalGetMutableMapFieldReflection(int) - Method in class org.apache.tika.FetchAndParseReply.Builder
- internalGetMutableMapFieldReflection(int) - Method in class org.apache.tika.GetFetcherReply.Builder
- internalInteger(String) - Static method in class org.apache.tika.metadata.Property
- internalIntegerSequence(String) - Static method in class org.apache.tika.metadata.Property
- internalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
- internalRational(String) - Static method in class org.apache.tika.metadata.Property
- internalReal(String) - Static method in class org.apache.tika.metadata.Property
- internalText(String) - Static method in class org.apache.tika.metadata.Property
- internalTextBag(String) - Static method in class org.apache.tika.metadata.Property
- internalURI(String) - Static method in class org.apache.tika.metadata.Property
- INTERNET_MESSAGE_ID - Static variable in interface org.apache.tika.metadata.MAPI
- INTERNET_REFERENCES - Static variable in interface org.apache.tika.metadata.MAPI
- INTERPRETED_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- InterruptableParsingExample - Class in org.apache.tika.example
-
This example demonstrates how to interrupt document parsing if some condition is met.
- InterruptableParsingExample() - Constructor for class org.apache.tika.example.InterruptableParsingExample
- intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- intValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- INVALID_CONSTANT - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- IO_EXCEPTION - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- IPADetector - Class in org.apache.tika.detect.zip
- IPADetector() - Constructor for class org.apache.tika.detect.zip.IPADetector
- IProperty - Interface in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
The interface of the property in OneNote file.
- IPTC - Interface in org.apache.tika.metadata
-
IPTC photo metadata schema.
- IPTC_LAST_EDITED - Static variable in interface org.apache.tika.metadata.IPTC
-
The date and optionally time when any of the IPTC photo metadata fields has been last edited
- IptcAnpaParser - Class in org.apache.tika.parser.iptc
-
Parser for IPTC ANPA News Wire Feeds
- IptcAnpaParser() - Constructor for class org.apache.tika.parser.iptc.IptcAnpaParser
- IRecordMedia - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- IS_EMBEDDED - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- IS_ENCRYPTED - Static variable in interface org.apache.tika.metadata.PDF
- IS_ENCRYPTED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- IS_FLAGGED - Static variable in interface org.apache.tika.metadata.MAPI
- IS_INCREMENTAL_UPDATE - Static variable in class org.apache.tika.parser.pdf.updates.IsIncrementalUpdate
- IS_OS_AIX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_HP_UX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_IRIX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_LINUX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_MAC - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_MAC_OSX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_OS2 - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_SOLARIS - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_SUN_OS - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_UNIX - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_VERSION_WSL - Static variable in class org.apache.tika.utils.SystemUtils
- IS_OS_WINDOWS - Static variable in class org.apache.tika.utils.SystemUtils
- IS_TIMEOUT - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
Whether the process timed out
- IS_VALID - Static variable in interface org.apache.tika.metadata.PST
- isActive() - Method in class org.apache.tika.server.core.TlsConfig
- isAllowAbsolutePaths() - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
-
If true, allows fetchKey to be an absolute path when basePath is not set.
- isAllowRuntimePrompt() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- isAnchor() - Method in class org.apache.tika.sax.Link
- isApplyRotation() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- ISArchiveParser - Class in org.apache.tika.parser.isatab
- ISArchiveParser() - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
-
Default constructor.
- ISArchiveParser(String) - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
-
Constructor that accepts the pathname of ISArchive folder.
- ISATabUtils - Class in org.apache.tika.parser.isatab
- ISATabUtils() - Constructor for class org.apache.tika.parser.isatab.ISATabUtils
- isAudioHeader(int, int, int, int) - Static method in class org.apache.tika.parser.mp3.AudioFrame
-
Does this appear to be a 4 byte audio frame header?
- isAutoClose() - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- isAvailable() - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- isAvailable() - Method in class org.apache.tika.language.translate.DefaultTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.EmptyTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.CachedTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.GoogleTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.JoshuaNetworkTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.Lingo24Translator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.MarianTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.MicrosoftTranslator
-
Check whether this instance has a working property file and its keys are not the defaults.
- isAvailable() - Method in class org.apache.tika.language.translate.impl.MosesTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.RTGTranslator
- isAvailable() - Method in class org.apache.tika.language.translate.impl.YandexTranslator
- isAvailable() - Method in interface org.apache.tika.language.translate.Translator
- isAvailable() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- isAvailable() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
- isAvailable() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
- isAvailable() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Checks if this Named Entity recogniser is available for service
- isAvailable() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
- isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
- isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- isAvailable() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- isAvailable() - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- isAvailable(String, String) - Method in class org.apache.tika.language.translate.impl.MarianTranslator
-
Checks if the appropriate Marian engine is available.
- isAvailable(GeoParserConfig) - Method in class org.apache.tika.parser.geo.topic.GeoParser
- IsBackground - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- isBase64() - Method in class org.apache.tika.parser.html.DataURIScheme
- isBinary - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- isBitSet(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
-
Read a bit value from a byte array with the specified bit position.
- isBlack(BufferedImage, int, int) - Static method in class org.apache.tika.parser.ocr.tess4j.ImageUtil
- isBlack(BufferedImage, int, int, int) - Static method in class org.apache.tika.parser.ocr.tess4j.ImageUtil
- isBlank(String) - Static method in class org.apache.tika.utils.StringUtils
- IsBoilerText - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- isBold() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- isCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isCauseOf(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Tests if the given exception was caused by this handler.
- isCjkOrKana(int) - Static method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- isCjkScript(int) - Static method in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- isCleanDwgReadOutput() - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- isClearContentAfterChunking() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- isClearContentAfterChunking() - Method in class org.apache.tika.inference.InferenceConfig
- isClientAuthenticationRequired() - Method in class org.apache.tika.server.core.TlsConfig
- isClientAuthenticationWanted() - Method in class org.apache.tika.server.core.TlsConfig
- isCloseShield() - Method in class org.apache.tika.io.TikaInputStream
- isComplete() - Method in class org.apache.tika.parser.csv.CSVParams
- isCompleted() - Method in class org.apache.tika.example.ImportContextImpl
- isConcatenatePhoneticRuns() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isConfigDeserializerAvailable() - Static method in class org.apache.tika.config.ParseContextConfig
-
Checks if ConfigDeserializer is available on the classpath.
- isConfigurationError() - Method in exception org.apache.tika.pipes.fork.PipesForkParserException
-
Check if this exception was caused by a configuration error.
- IsConflictObjectForRender - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- IsConflictObjectForSelection - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- IsConflictPage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- isContentTruncatedForDetection(Metadata) - Static method in class org.apache.tika.detect.DetectHelper
-
Checks if the given metadata indicates that the content was truncated for detection.
- isConverterAvailable(String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
Checks whether a converter is available that can convert the Tika metadata to XMP
- isCountTotal() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorConfig
- isCrawlAllFileNodesFromRoot() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Do this to ignore revisions and just parse all file nodes from the root recursively.
- isDebug() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- isDecisive() - Method in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
-
Returns true when the grammar check produced a directional answer (either LIKELY_UTF8 or NOT_UTF8).
- isDecompressConcatenated() - Method in class org.apache.tika.parser.pkg.CompressorParser.Config
- isDefault() - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Returns the value of the isDefault record component.
- isDetectAngles() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isDetectCharsetsInEntryNames() - Method in class org.apache.tika.parser.pkg.ZipParserConfig
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
- isDiscardElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Checks whether all content within the given HTML element should be discarded instead of including it in the parse output.
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
- isDynamic() - Method in class org.apache.tika.config.ServiceLoader
-
Returns whether the service loader is static or dynamic
- isEbcdicLikely(byte[]) - Static method in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Returns true if the probe is plausibly EBCDIC based on the word-separator distribution.
- isEmbeddedCountLimitReached() - Method in class org.apache.tika.parser.ParseRecord
-
Returns whether the embedded count limit was reached during parsing.
- isEmbeddedDepthLimitReached() - Method in class org.apache.tika.parser.ParseRecord
-
Returns whether the embedded depth limit was reached during parsing.
- isEmitIntermediateResults() - Method in class org.apache.tika.pipes.core.PipesConfig
- isEmpty() - Method in class org.apache.tika.inference.locator.Locators
- isEmpty() - Method in class org.apache.tika.parser.csv.CSVParams
- isEmpty() - Method in class org.apache.tika.parser.ParseContext
- isEmpty(CharSequence) - Static method in class org.apache.tika.utils.StringUtils
- isEmpty(String) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
- isEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isEnableImagePreprocessing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- isEnableUnsecureFeatures() - Method in class org.apache.tika.server.core.TikaServerConfig
- isEndDocumentWasCalled() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
- isEOL(int) - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
-
This will tell if the next byte to be read is an end of line byte.
- isExternal() - Method in class org.apache.tika.metadata.Property
- isExtractAcroFormContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractActions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractAllAlternatives() - Method in class org.apache.tika.parser.mail.RFC822Parser.Config
- isExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractBookmarksText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractFileSystemMetadata() - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- isExtractFontNames() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractIncrementalUpdateInfo() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractInlineImageMetadataOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractInlineImages() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractMacros() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isExtractMacros() - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
- isExtractMacros() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- isExtractMarkedContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractScripts() - Method in class org.apache.tika.parser.html.JSoupParser
- isExtractUniqueInlineImagesOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isExtractUserMetadata() - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- isExtractUserMetadata() - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- isExtractUserMetadata() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- isFatal() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Checks if this result represents a fatal error.
- isFatal() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Checks if this status represents a fatal error.
- isFatal() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Check if there was a fatal error (failed to initialize pipes system).
- isFileData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- isFileHeader(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ZipHeader
-
Check the input data is a local file header.
- isForceDrop() - Method in class org.apache.tika.eval.app.EvalConfig
- isGraphNode - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- isGray() - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- isHasEof() - Method in class org.apache.tika.parser.pdf.updates.StartXRefOffset
- isHeading() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
- isHiddenSlide() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- isIframe() - Method in class org.apache.tika.sax.Link
- isIfXFAExtractOnlyXFA() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isIgnoreContentStreamSpaceGlyphs() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isImage() - Method in class org.apache.tika.sax.Link
- isIncludeDeleted() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- isIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeDeletedContent() - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
- isIncludeDeletedText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- isIncludeDeletedText() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- isIncludeEmpty() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- isIncludeFullMetadata() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Whether to include full RMETA-style metadata in metadata.json.
- isIncludeGlossary() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeGlossary() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
- isIncludeHeadersAndFooters() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeMarkup() - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- isIncludeMetadataInZip() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Whether to include the metadata JSON for each embedded document in the zip file.
- isIncludeMissingRows() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeMoveFromText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- isIncludeMoveFromText() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- isIncludeOriginal() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- isIncludeShapeBasedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeSlideMasterContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeSlideNotes() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- isIncludeTitle() - Method in class org.apache.tika.sax.SAXOutputConfig
- isIncludeTrigrams() - Method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
- IsIncrementalUpdate - Class in org.apache.tika.parser.pdf.updates
- IsIncrementalUpdate() - Constructor for class org.apache.tika.parser.pdf.updates.IsIncrementalUpdate
- isInHeader() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Returns true if we're still in the RTF header (before body content).
- isInitializationFailure() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Checks if this result represents an initialization failure.
- isInitializationFailure() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Checks if this status represents an initialization failure.
- isInitializationFailure() - Method in exception org.apache.tika.pipes.fork.PipesForkParserException
-
Check if this exception was caused by an initialization failure.
- isInitializationFailure() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Check if there was an initialization failure (fetcher/emitter initialization issues).
- isInitialized() - Method in class org.apache.tika.DeleteFetcherReply.Builder
- isInitialized() - Method in class org.apache.tika.DeleteFetcherReply
- isInitialized() - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- isInitialized() - Method in class org.apache.tika.DeleteFetcherRequest
- isInitialized() - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- isInitialized() - Method in class org.apache.tika.DeletePipesIteratorReply
- isInitialized() - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- isInitialized() - Method in class org.apache.tika.DeletePipesIteratorRequest
- isInitialized() - Method in class org.apache.tika.FetchAndParseReply.Builder
- isInitialized() - Method in class org.apache.tika.FetchAndParseReply
- isInitialized() - Method in class org.apache.tika.FetchAndParseRequest.Builder
- isInitialized() - Method in class org.apache.tika.FetchAndParseRequest
- isInitialized() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- isInitialized() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- isInitialized() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- isInitialized() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- isInitialized() - Method in class org.apache.tika.GetFetcherReply.Builder
- isInitialized() - Method in class org.apache.tika.GetFetcherReply
- isInitialized() - Method in class org.apache.tika.GetFetcherRequest.Builder
- isInitialized() - Method in class org.apache.tika.GetFetcherRequest
- isInitialized() - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- isInitialized() - Method in class org.apache.tika.GetPipesIteratorReply
- isInitialized() - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- isInitialized() - Method in class org.apache.tika.GetPipesIteratorRequest
- isInitialized() - Method in class org.apache.tika.ListFetchersReply.Builder
- isInitialized() - Method in class org.apache.tika.ListFetchersReply
- isInitialized() - Method in class org.apache.tika.ListFetchersRequest.Builder
- isInitialized() - Method in class org.apache.tika.ListFetchersRequest
- isInitialized() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
-
Returns whether the parser has been successfully initialized (i.e., Tess4J native library is available).
- isInitialized() - Method in class org.apache.tika.SaveFetcherReply.Builder
- isInitialized() - Method in class org.apache.tika.SaveFetcherReply
- isInitialized() - Method in class org.apache.tika.SaveFetcherRequest.Builder
- isInitialized() - Method in class org.apache.tika.SaveFetcherRequest
- isInitialized() - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- isInitialized() - Method in class org.apache.tika.SavePipesIteratorReply
- isInitialized() - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- isInitialized() - Method in class org.apache.tika.SavePipesIteratorRequest
- isInlineContent() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- isInlineContent() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- isInlineContent() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- isInlineContent() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- isInstanceOf(String, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Parses and normalises the given media type string and checks whether the result equals the given base type or is a specialization of it.
- isInstanceOf(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Checks whether the given media type equals the given base type or is a specialization of it.
- isIntegrityCheck() - Method in class org.apache.tika.parser.pkg.ZipParserConfig
- isInternal() - Method in class org.apache.tika.metadata.Property
- isInvalid(int) - Method in class org.apache.tika.sax.SafeContentHandler
-
Checks whether the given Unicode character is an invalid XML character and should be replaced for output.
- isInvalid(int) - Method in class org.apache.tika.sax.XHTMLContentHandler
- isItalics() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- isJacksonAvailable() - Static method in class org.apache.tika.config.ConfigDeserializer
-
Checks if Jackson ObjectMapper is available on the classpath.
- isLanguage(String) - Method in class org.apache.tika.language.detect.LanguageResult
-
Return true if the target language matches the detected language.
- IsLayoutSizeSetByUser - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- isLenientMatch(String, String) - Static method in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Return true if predicting predicted when the true charset is actual is an acceptable ("lenient") result.
- isLink() - Method in class org.apache.tika.sax.Link
- isListenForAllRecords() - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Returns true if this parser is configured to listen for all records instead of just the specified few.
- isLoadAsList() - Method in class org.apache.tika.serialization.ComponentConfig
- isMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
- isMatchingElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- isMatchingParentElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- isMetadataField(String) - Static method in class org.apache.tika.parser.image.MetadataFields
- isMetadataField(Property) - Static method in class org.apache.tika.parser.image.MetadataFields
- isMixedLanguages() - Method in class org.apache.tika.language.detect.LanguageDetector
- isMostlyAscii() - Method in class org.apache.tika.detect.TextStatistics
-
Checks whether at least one byte was seen and that the bytes that were seen were mostly plain text (i.e.
- isMSB() - Method in class org.apache.tika.metadata.MachineMetadata.Endian
- isMultiValued(String) - Method in class org.apache.tika.metadata.Metadata
-
Returns true if named value is multivalued.
- isMultiValued(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Checks if the named property is an array.
- isMultiValued(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns true if named value is multivalued.
- isMultiValued(Property) - Method in class org.apache.tika.xmp.XMPMetadata
- isMultiValuePermitted() - Method in class org.apache.tika.metadata.Property
-
Is the PropertyType one which accepts multiple values?
- ISO_SPEED_RATINGS - Static variable in interface org.apache.tika.metadata.TIFF
-
"ISO Speed and ISO Latitude of the input device as specified in ISO 12232"
- ISO_TO_WINDOWS - Static variable in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Maps each ISO-8859-X charset to its Windows-12XX equivalent.
- isOnlyLatestRevision() - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Only parse the latest revision.
- isParseIncrementalUpdates() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isPathStyleAccessEnabled() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- isPathStyleAccessEnabled() - Method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- isPreferAlternateContentChoice() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
In OOXML, mc:AlternateContent wraps mc:Choice (newer/richer rendering, e.g.
- isPreloadLangs() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- isPreserveInterwordSpacing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- isPrettyPrint() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if formatted output is enabled, false otherwise.
- isPrettyPrint() - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- isProcessCrash() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Checks if this result represents a process crash (OOM, timeout, etc.).
- isProcessCrash() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Checks if this status represents a process crash (OOM, timeout, etc.).
- isProcessCrash() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Check if there was a process crash (OOM, timeout, etc.).
- isProcessEmailAsMsg() - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- isPropertySet - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- isQuoteAssignmentValues() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets whether or not to quote assignment values, i.e. tag='value'.
- isReadOnly - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- IsReadOnly - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- isReasonablyCertain() - Method in class org.apache.tika.language.detect.LanguageResult
- ISREGEX_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- isReturnStackTrace() - Method in class org.apache.tika.server.core.TikaServerConfig
- isReturnStderr() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- isReturnStdout() - Method in class org.apache.tika.parser.external.ExternalParserConfig
- isRunning() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- isRunning() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Checks if the server process is currently running.
- isRunning() - Method in class org.apache.tika.pipes.core.SharedServerManager
- isRunning() - Method in class org.apache.tika.pipes.ignite.server.IgniteStoreServer
- isScript() - Method in class org.apache.tika.sax.Link
- isSerialize() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if CAS serialization is enabled, false otherwise.
- isServerAvailable() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- isSetKCMS() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isSharedMode() - Method in class org.apache.tika.pipes.core.PipesParser
-
Returns whether this parser is using shared server mode.
- isShortText() - Method in class org.apache.tika.language.detect.LanguageDetector
- isSkipContainerDocumentDigest() - Method in interface org.apache.tika.digest.DigesterFactory
-
Returns whether to skip digesting for container (top-level) documents.
- isSkipContainerDocumentDigest() - Method in class org.apache.tika.parser.digestutils.BouncyCastleDigesterFactory
- isSkipContainerDocumentDigest() - Method in class org.apache.tika.parser.digestutils.CommonsDigesterFactory
- isSkipEmbedding() - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- isSkipEmbedding() - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- isSkipEmbedding() - Method in class org.apache.tika.inference.InferenceConfig
- isSkipEmbedding() - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- isSkipOcr() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- isSkipOcr() - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- isSkipOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- isSkipOcr() - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- isSkipOcr() - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- isSkipOcr() - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- isSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isSpecializationOf(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Checks whether the given media type a is a specialization of a more generic type b.
- isSpoolToTemp() - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- isSpoolToTemp() - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- isSpoolToTemp() - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- isSpoolToTemp() - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- isSpoolToTemp() - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- isStderrTruncated() - Method in class org.apache.tika.utils.FileProcessResult
- isStdoutTruncated() - Method in class org.apache.tika.utils.FileProcessResult
- isStopOnlyOnFatal() - Method in class org.apache.tika.pipes.core.PipesConfig
-
When true, only stop processing on fatal errors (FAILED_TO_INITIALIZE).
- isStrikeThrough() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- isStripMarkup() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- isStyle - Variable in class org.apache.tika.parser.microsoft.rtf.ListDescriptor
- isSuccess() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Checks if this result represents successful processing.
- isSuccess() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Checks if this status represents successful processing.
- isSuccess() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Check if the parsing was successful.
- isSupported(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Safely return whether the named charset is supported, without throwing exceptions.
- isSupported(TikaInputStream, ParseContext) - Method in interface org.apache.tika.extractor.ContainerExtractor
-
Is this Container Extractor able to process the supplied container?
- isSupported(TikaInputStream, ParseContext) - Method in class org.apache.tika.extractor.ParserContainerExtractor
- isSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isTaskException() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Checks if this result represents a task-level exception.
- isTaskException() - Method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Checks if this status represents a task-level exception.
- isTaskException() - Method in class org.apache.tika.pipes.fork.PipesForkResult
-
Check if there was a task exception (fetch/emit/parse issues for a specific request).
- isText() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if content text analysis is enabled, false otherwise.
- isThrowOnEncryptedPayload() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- isThrowOnMaxCount() - Method in class org.apache.tika.config.EmbeddedLimits
-
Gets whether to throw an exception when maxCount is reached.
- isThrowOnMaxCount() - Method in class org.apache.tika.parser.ParseRecord
-
Returns whether throwing is configured when max count is reached.
- isThrowOnMaxDepth() - Method in class org.apache.tika.config.EmbeddedLimits
-
Gets whether to throw an exception when maxDepth is reached.
- isThrowOnMaxDepth() - Method in class org.apache.tika.parser.ParseRecord
-
Returns whether throwing is configured when max depth is reached.
- isThrowOnWriteLimit() - Method in class org.apache.tika.config.OutputLimits
-
Gets whether to throw an exception when writeLimit is reached.
- isThrowOnWriteLimitReached() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
- isThrowOnWriteLimitReached() - Method in interface org.apache.tika.sax.WriteLimiter
- isTimeout() - Method in class org.apache.tika.utils.FileProcessResult
- IsTitleDate - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- IsTitleText - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- IsTitleTime - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- isTracking() - Method in class org.apache.tika.parser.mbox.MboxParser
- isTransparent(int) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Determine whether a codepoint should be treated as transparent (skipped) during bigram extraction and word tokenization.
- isUnknown() - Method in class org.apache.tika.language.detect.LanguageResult
- isUnknown() - Method in class org.apache.tika.quality.TextQualityScore
-
True if scoring could not be performed (e.g. empty or unsupported-script input).
- isUnordered(int) - Method in class org.apache.tika.parser.microsoft.rtf.ListDescriptor
- isUseMime() - Method in class org.apache.tika.detect.FileCommandDetector
- isUseMime() - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- isUseMime() - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- isUseSharedServer() - Method in class org.apache.tika.pipes.core.PipesConfig
-
Returns whether shared server mode is enabled.
- isValid(String) - Static method in class org.apache.tika.mime.MimeType
-
Checks that the given string is a valid Internet media type name based on rules from RFC 2054 section 5.3.
- isWhitespace(int) - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.CSVMessageBodyWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.JSONMessageBodyWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.JSONObjWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.MetadataListMessageBodyWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.TarWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.TextMessageBodyWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.core.writer.ZipWriter
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.standard.writer.XMPMessageBodyWriter
- isWriteContent() - Method in class org.apache.tika.parser.RegexCaptureParserConfig
- isWriteFileNameToContent() - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
-
Returns whether to write file names to content based on SAXOutputConfig in the ParseContext.
- isWriteFileNameToContent() - Method in class org.apache.tika.sax.SAXOutputConfig
- isWriteLimitReached() - Method in class org.apache.tika.parser.ParseRecord
- isWriteLimitReached(Throwable) - Static method in exception org.apache.tika.exception.WriteLimitReachedException
-
Checks whether the given exception (or any of its root causes) was thrown by this handler as a signal of reaching the write limit.
- isWriteMetadataToHead() - Method in class org.apache.tika.sax.SAXOutputConfig
- isWriteSelectHeadersInBody() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
The default changed to false in 4.x.
- isZipEmbeddedFiles() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Whether to zip all embedded files into a single archive before emitting.
- Italic - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- iterator() - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- ITERATOR_CLASS_FIELD_NUMBER - Static variable in class org.apache.tika.GetPipesIteratorReply
- ITERATOR_CLASS_FIELD_NUMBER - Static variable in class org.apache.tika.SavePipesIteratorRequest
- ITERATOR_CONFIG_JSON_FIELD_NUMBER - Static variable in class org.apache.tika.GetPipesIteratorReply
- ITERATOR_CONFIG_JSON_FIELD_NUMBER - Static variable in class org.apache.tika.SavePipesIteratorRequest
- ITERATOR_ID_FIELD_NUMBER - Static variable in class org.apache.tika.DeletePipesIteratorRequest
- ITERATOR_ID_FIELD_NUMBER - Static variable in class org.apache.tika.GetPipesIteratorReply
- ITERATOR_ID_FIELD_NUMBER - Static variable in class org.apache.tika.GetPipesIteratorRequest
- ITERATOR_ID_FIELD_NUMBER - Static variable in class org.apache.tika.SavePipesIteratorRequest
- ITikaToXMPConverter - Interface in org.apache.tika.xmp.convert
-
Interface for the specific Metadata to XMP converters
- ITSF - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- ITSP - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- ITUNES - Static variable in class org.apache.tika.detect.apple.BPListDetector
- IWORK_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
-
All iWork files contain one of these, so we can detect based on it
- IWORK_CONTENT_ENTRIES - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
-
Which files within an iWork file contain the actual content?
- IWORK13_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
All iWork 13 files contain this, so we can detect based on it
- IWORK13_MAIN_ENTRY - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
- IWork13PackageParser - Class in org.apache.tika.parser.iwork.iwana
- IWork13PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
- IWork13PackageParser.IWork13DocumentType - Enum Class in org.apache.tika.parser.iwork.iwana
- IWork18PackageParser - Class in org.apache.tika.parser.iwork.iwana
-
For now, this parser isn't even registered.
- IWork18PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
- IWork18PackageParser.IWork18DocumentType - Enum Class in org.apache.tika.parser.iwork.iwana
- IWorkDetector - Class in org.apache.tika.detect.apple
- IWorkDetector() - Constructor for class org.apache.tika.detect.apple.IWorkDetector
- IWorkPackageParser - Class in org.apache.tika.parser.iwork
-
A parser for the IWork container files.
- IWorkPackageParser() - Constructor for class org.apache.tika.parser.iwork.IWorkPackageParser
- IWorkPackageParser.IWORKDocumentType - Enum Class in org.apache.tika.parser.iwork
- IWORKS_BUILD_VERSION_HISTORY - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
- IWORKS_DOC_ID - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
- IWORKS_PREFIX - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
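The isInvalid(int) entry for SafeContentHandler above describes checking whether a Unicode character is an invalid XML character that should be replaced for output. A minimal sketch of such a check, written against the Char production of the XML 1.0 specification, might look like the following. The class name XmlCharCheck, the helper sanitize, and the choice of U+FFFD as the replacement are assumptions made for illustration; this is not Tika's implementation.

```java
// Sketch only: XmlCharCheck and its methods are hypothetical names, not Tika API.
final class XmlCharCheck {

    private XmlCharCheck() {
    }

    /** Returns true if the code point falls outside the XML 1.0 Char production. */
    static boolean isInvalidXmlChar(int ch) {
        if (ch < 0x20) {
            // Of the C0 controls, only tab, line feed and carriage return are allowed.
            return ch != 0x09 && ch != 0x0A && ch != 0x0D;
        }
        if (ch <= 0xD7FF) {
            return false;
        }
        if (ch < 0xE000) {
            return true; // UTF-16 surrogate range
        }
        if (ch <= 0xFFFD) {
            return false;
        }
        if (ch < 0x10000) {
            return true; // U+FFFE and U+FFFF
        }
        return ch > 0x10FFFF; // beyond the Unicode code space
    }

    /** Replaces every invalid code point with U+FFFD (an assumed replacement choice). */
    static String sanitize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp ->
                out.appendCodePoint(isInvalidXmlChar(cp) ? 0xFFFD : cp));
        return out.toString();
    }

    public static void main(String[] args) {
        // The NUL character is not allowed in XML 1.0, so it is replaced.
        System.out.println(sanitize("a\u0000b").equals("a\uFFFDb"));
    }
}
```

Working on code points rather than char values lets the same check cover supplementary characters and flag unpaired surrogates.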
J
- JackcessParser - Class in org.apache.tika.parser.microsoft
-
Parser that handles Microsoft Access files via Jackcess
- JackcessParser() - Constructor for class org.apache.tika.parser.microsoft.JackcessParser
- JAR - Static variable in class org.apache.tika.detect.zip.PackageConstants
- JarDetector - Class in org.apache.tika.detect.zip
-
This detector detects JAR files and file type variants of zip subtypes that may contain a MANIFEST.MF
- JarDetector() - Constructor for class org.apache.tika.detect.zip.JarDetector
- JB2 - Static variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- jcid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.JCIDObject
- jcid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- JCID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
This class is used to represent a JCID
- JCID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- JCIDObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
This class is used to represent the JCID object.
- JCIDObject(ObjectGroupObjectDeclare, ObjectGroupObjectData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.JCIDObject
-
Construct the JCIDObject instance.
- JDBCEmitter - Class in org.apache.tika.pipes.emitter.jdbc
-
Emitter to write parsed documents to a JDBC database.
- JDBCEmitterConfig - Record Class in org.apache.tika.pipes.emitter.jdbc
- JDBCEmitterConfig(String, String, String, String, String, int, int, LinkedHashMap<String, String>, String, String, String) - Constructor for record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Creates an instance of a JDBCEmitterConfig record class.
- JDBCEmitterConfig.AttachmentStrategy - Enum Class in org.apache.tika.pipes.emitter.jdbc
- JDBCEmitterConfig.MultivaluedFieldStrategy - Enum Class in org.apache.tika.pipes.emitter.jdbc
- JDBCEmitterFactory - Class in org.apache.tika.pipes.emitter.jdbc
-
Factory for creating JDBC emitters.
- JDBCEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterFactory
- JDBCPipesIterator - Class in org.apache.tika.pipes.iterator.jdbc
-
Iterates through the results of a SQL call via JDBC.
- JDBCPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.jdbc
- JDBCPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- JDBCPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.jdbc
-
Factory for creating JDBC pipes iterators.
- JDBCPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorFactory
- JDBCPipesPlugin - Class in org.apache.tika.pipes.plugin.jdbc
- JDBCPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.jdbc.JDBCPipesPlugin
- JDBCPipesReporter - Class in org.apache.tika.pipes.reporter.jdbc
-
This is an initial draft of a JDBCPipesReporter.
- JDBCPipesReporter(ExtensionConfig, JDBCPipesReporterConfig) - Constructor for class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- JDBCPipesReporterConfig - Record Class in org.apache.tika.pipes.reporter.jdbc
- JDBCPipesReporterConfig(String, Set<String>, Set<String>) - Constructor for record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
- JDBCPipesReporterConfig(String, Set<String>, Set<String>, String, String, boolean, String, List<String>, long, int) - Constructor for record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Creates an instance of a JDBCPipesReporterConfig record class.
- JDBCPipesReporterFactory - Class in org.apache.tika.pipes.reporter.jdbc
-
Factory for creating JDBC pipes reporters.
- JDBCPipesReporterFactory() - Constructor for class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterFactory
- JDBCTableReader - Class in org.apache.tika.parser.jdbc
-
General base class to iterate through rows of a JDBC table
- JDBCTableReader(Connection, String, EmbeddedDocumentUtil) - Constructor for class org.apache.tika.parser.jdbc.JDBCTableReader
- JDBCUtil - Class in org.apache.tika.eval.app.db
- JDBCUtil(String, String) - Constructor for class org.apache.tika.eval.app.db.JDBCUtil
- JDBCUtil.CREATE_TABLE - Enum Class in org.apache.tika.eval.app.db
- JempboxExtractor - Class in org.apache.tika.parser.xmp
- JempboxExtractor(Metadata) - Constructor for class org.apache.tika.parser.xmp.JempboxExtractor
- JOB_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Number or identifier for the purpose of improved workflow handling.
- joinCreators(List<String>) - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
- joinWith(String, List<String>) - Static method in class org.apache.tika.utils.StringUtils
- JoshuaNetworkTranslator - Class in org.apache.tika.language.translate.impl
-
This translator is designed to work with a TCP-IP available Joshua translation server, specifically the REST-based Joshua server.
- JoshuaNetworkTranslator() - Constructor for class org.apache.tika.language.translate.impl.JoshuaNetworkTranslator
-
Default constructor which first checks for the presence of the translator.joshua.properties file.
- JournalParser - Class in org.apache.tika.parser.journal
- JournalParser() - Constructor for class org.apache.tika.parser.journal.JournalParser
- JP2 - Static variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- JPEG - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.ImageFormat
- JPEG - Static variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- JpegParser - Class in org.apache.tika.parser.image
- JpegParser() - Constructor for class org.apache.tika.parser.image.JpegParser
- JS_NAME - Static variable in interface org.apache.tika.metadata.PDF
-
When JavaScript is stored in the names tree, there's a name associated with that script.
- json() - Method in interface org.apache.tika.config.JsonConfig
-
Returns the JSON configuration string.
- json() - Method in record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Returns the value of the json record component.
- json() - Method in record class org.apache.tika.plugins.ExtensionConfig
-
Returns the value of the json record component.
- JsonConfig - Interface in org.apache.tika.config
-
Interface for objects that provide JSON configuration strings.
- JsonConfigHelper - Class in org.apache.tika.config
-
Helper class for loading JSON config templates with placeholder replacement.
- JsonConfigHelper() - Constructor for class org.apache.tika.config.JsonConfigHelper
- JsonEmitData - Class in org.apache.tika.pipes.core.serialization
- JsonEmitData() - Constructor for class org.apache.tika.pipes.core.serialization.JsonEmitData
- JsonFetchEmitTuple - Class in org.apache.tika.pipes.core.serialization
- JsonFetchEmitTuple() - Constructor for class org.apache.tika.pipes.core.serialization.JsonFetchEmitTuple
- JsonFetchEmitTupleList - Class in org.apache.tika.pipes.core.serialization
- JsonFetchEmitTupleList() - Constructor for class org.apache.tika.pipes.core.serialization.JsonFetchEmitTupleList
- JsonMergeUtils - Class in org.apache.tika.config.loader
-
Utility methods for merging JSON configurations with default values.
- JSONMessageBodyWriter - Class in org.apache.tika.server.core.writer
- JSONMessageBodyWriter() - Constructor for class org.apache.tika.server.core.writer.JSONMessageBodyWriter
- JsonMetadata - Class in org.apache.tika.serialization
- JsonMetadata() - Constructor for class org.apache.tika.serialization.JsonMetadata
- JsonMetadataList - Class in org.apache.tika.serialization
- JsonMetadataList() - Constructor for class org.apache.tika.serialization.JsonMetadataList
- JSONObjWriter - Class in org.apache.tika.server.core.writer
- JSONObjWriter() - Constructor for class org.apache.tika.server.core.writer.JSONObjWriter
- JsonPipesIpc - Class in org.apache.tika.pipes.core.serialization
-
Binary serialization/deserialization for IPC communication between PipesClient and PipesServer.
- JsonPipesIpc() - Constructor for class org.apache.tika.pipes.core.serialization.JsonPipesIpc
- JsonPipesIterator - Class in org.apache.tika.pipes.pipesiterator.json
-
Iterates through a UTF-8 text file with one FetchEmitTuple json object per line.
- JsonPipesIteratorConfig - Class in org.apache.tika.pipes.pipesiterator.json
- JsonPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorConfig
- JsonPipesIteratorFactory - Class in org.apache.tika.pipes.pipesiterator.json
-
Factory for creating JSON pipes iterators.
- JsonPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorFactory
- JsonPipesPlugin - Class in org.apache.tika.pipes.plugin
- JsonPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.JsonPipesPlugin
- JsonResponse - Class in org.apache.tika.pipes.emitter.es
- JsonResponse - Class in org.apache.tika.pipes.emitter.opensearch
- JsonResponse - Class in org.apache.tika.pipes.reporter.opensearch
- JsonResponse(int, JsonNode) - Constructor for class org.apache.tika.pipes.emitter.es.JsonResponse
- JsonResponse(int, JsonNode) - Constructor for class org.apache.tika.pipes.emitter.opensearch.JsonResponse
- JsonResponse(int, JsonNode) - Constructor for class org.apache.tika.pipes.reporter.opensearch.JsonResponse
- JsonResponse(int, String) - Constructor for class org.apache.tika.pipes.emitter.es.JsonResponse
- JsonResponse(int, String) - Constructor for class org.apache.tika.pipes.emitter.opensearch.JsonResponse
- JsonResponse(int, String) - Constructor for class org.apache.tika.pipes.reporter.opensearch.JsonResponse
- JSoupParser - Class in org.apache.tika.parser.html
-
HTML parser.
- JSoupParser() - Constructor for class org.apache.tika.parser.html.JSoupParser
- JSoupParser(JsonConfig) - Constructor for class org.apache.tika.parser.html.JSoupParser
-
Constructor for JSON configuration.
- JSoupParser(EncodingDetector) - Constructor for class org.apache.tika.parser.html.JSoupParser
- JSoupParser(JSoupParser.Config) - Constructor for class org.apache.tika.parser.html.JSoupParser
-
Constructor with explicit Config object.
- JSoupParser.Config - Class in org.apache.tika.parser.html
-
Configuration class for JSON deserialization.
- JunkDetector - Class in org.apache.tika.ml.junkdetect
-
Language-agnostic text quality scorer.
- JunkFilterEncodingDetector - Class in org.apache.tika.ml.junkdetect
-
A MetaEncodingDetector that arbitrates charset candidates by asking a TextQualityDetector which decoded candidate looks most like natural text.
- JunkFilterEncodingDetector() - Constructor for class org.apache.tika.ml.junkdetect.JunkFilterEncodingDetector
- JunkFilterEncodingDetector(TextQualityDetector) - Constructor for class org.apache.tika.ml.junkdetect.JunkFilterEncodingDetector
-
Test-only / deterministic-wiring constructor.
- jwt() - Method in class org.apache.tika.pipes.fetcher.http.jwt.JwtGenerator
- JwtCreds - Class in org.apache.tika.pipes.fetcher.http.jwt
- JwtCreds(String, String, int) - Constructor for class org.apache.tika.pipes.fetcher.http.jwt.JwtCreds
- JwtGenerator - Class in org.apache.tika.pipes.fetcher.http.jwt
- JwtGenerator(JwtCreds) - Constructor for class org.apache.tika.pipes.fetcher.http.jwt.JwtGenerator
- JwtPrivateKeyCreds - Class in org.apache.tika.pipes.fetcher.http.jwt
- JwtPrivateKeyCreds(PrivateKey, String, String, int) - Constructor for class org.apache.tika.pipes.fetcher.http.jwt.JwtPrivateKeyCreds
- JwtSecretCreds - Class in org.apache.tika.pipes.fetcher.http.jwt
- JwtSecretCreds(byte[], String, String, int) - Constructor for class org.apache.tika.pipes.fetcher.http.jwt.JwtSecretCreds
- JXLParser - Class in org.apache.tika.parser.image
-
Tries to scrape XMP out of JXL
- JXLParser() - Constructor for class org.apache.tika.parser.image.JXLParser
K
- KafkaEmitter - Class in org.apache.tika.pipes.emitter.kafka
-
Emitter to write parsed documents into a specified Apache Kafka topic.
- KafkaEmitterConfig - Record Class in org.apache.tika.pipes.emitter.kafka
- KafkaEmitterConfig(String, String, String, int, int, int, String, int, int, boolean, String, int, int, int, int, int, int, int, int, String, String, String, String) - Constructor for record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Creates an instance of a KafkaEmitterConfig record class.
- KafkaEmitterFactory - Class in org.apache.tika.pipes.emitter.kafka
-
Factory for creating Kafka emitters.
- KafkaEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.kafka.KafkaEmitterFactory
- KafkaPipesIterator - Class in org.apache.tika.pipes.iterator.kafka
- KafkaPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.kafka
- KafkaPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- KafkaPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.kafka
-
Factory for creating Kafka pipes iterators.
- KafkaPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorFactory
- KafkaPipesPlugin - Class in org.apache.tika.pipes.plugin.kafka
- KafkaPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.kafka.KafkaPipesPlugin
- KATAKANA - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- KATE - Static variable in class org.apache.tika.parser.ogg.OggParser
- KebabCaseConverter - Class in org.apache.tika.annotation
-
Utility for converting Java class names to kebab-case.
- KebabCaseConverter - Class in org.apache.tika.config.loader
-
Utility for converting Java class names to kebab-case.
- KEEP_ALL - Enum constant in enum class org.apache.tika.parser.multiple.AbstractMultipleParser.MetadataPolicy
-
Where multiple parsers output a given key, store all their different (unique) values
- KEY - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio's musical key."
- KEY_WORDS - Static variable in interface org.apache.tika.metadata.XMPPDF
-
Unordered text strings of keywords.
- KEYNOTE - Enum constant in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- KEYNOTE13 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- KEYNOTE18 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
- keyPrefix() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns the value of the keyPrefix record component.
- keyPrefix() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns the value of the keyPrefix record component.
- keys() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the keys record component.
- keySerializer() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the keySerializer record component.
- keySet() - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Returns all configuration IDs.
- keySet() - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- keySet() - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- keySet() - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- KEYWORDS - Static variable in interface org.apache.tika.metadata.IPTC
-
Keywords to express the subject of the content.
- KEYWORDS - Static variable in interface org.apache.tika.metadata.Office
-
Keywords pertaining to a document.
- KHMER - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- KMZ - Static variable in class org.apache.tika.detect.zip.PackageConstants
- KMZDetector - Class in org.apache.tika.detect.zip
-
This looks for a single file with a name ending in ".kml" at the root level of the zip file.
- KMZDetector() - Constructor for class org.apache.tika.detect.zip.KMZDetector
- Knowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
The Knowledge
- Knowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
The Knowledge
- knownScripts() - Method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Returns the set of script names this model knows about.
L
- LABEL - Static variable in interface org.apache.tika.metadata.XMP
-
A word or short phrase that identifies a resource as a member of a user-defined collection.
- labelA() - Method in class org.apache.tika.quality.TextQualityComparison
-
Label supplied for candidate A (e.g. a charset name or encoding description).
- labelB() - Method in class org.apache.tika.quality.TextQualityComparison
-
Label supplied for candidate B.
- LANG_ID_1 - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- LANG_ID_2 - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- LANG_ID_PROB_1 - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- LANG_ID_PROB_2 - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- LangModel - Class in org.apache.tika.eval.core.tokens
- LangModel(long) - Constructor for class org.apache.tika.eval.core.tokens.LangModel
- Language - Class in org.apache.tika.example
- Language() - Constructor for class org.apache.tika.example.Language
- LANGUAGE - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- LANGUAGE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A language of the intellectual content of the resource.
- LANGUAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- LANGUAGE - Static variable in interface org.apache.tika.metadata.XMPDC
-
A language of the intellectual content of the resource.
- LANGUAGE_CONFIDENCE - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- LanguageAwareTokenCountStats<T> - Interface in org.apache.tika.eval.core.textstats
-
Interface for calculators that require language probabilities and token stats
- LanguageConfidence - Enum Class in org.apache.tika.language.detect
- LanguageDetectingParser - Class in org.apache.tika.example
- LanguageDetectingParser() - Constructor for class org.apache.tika.example.LanguageDetectingParser
- languageDetection() - Static method in class org.apache.tika.example.Language
- languageDetectionWithHandler() - Static method in class org.apache.tika.example.Language
- languageDetectionWithWriter() - Static method in class org.apache.tika.example.Language
- LanguageDetector - Class in org.apache.tika.language.detect
- LanguageDetector() - Constructor for class org.apache.tika.language.detect.LanguageDetector
- LanguageDetectorExample - Class in org.apache.tika.example
- LanguageDetectorExample() - Constructor for class org.apache.tika.example.LanguageDetectorExample
- LanguageDetectorTest - Class in org.apache.tika.langdetect
- LanguageDetectorTest() - Constructor for class org.apache.tika.langdetect.LanguageDetectorTest
- LanguageHandler - Class in org.apache.tika.language.detect
-
SAX content handler that updates a language detector based on all the received character content.
- LanguageHandler() - Constructor for class org.apache.tika.language.detect.LanguageHandler
- LanguageHandler(LanguageDetector) - Constructor for class org.apache.tika.language.detect.LanguageHandler
- LanguageHandler(LanguageWriter) - Constructor for class org.apache.tika.language.detect.LanguageHandler
- LanguageID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LanguageIDWrapper - Class in org.apache.tika.eval.core.langid
- LanguageIDWrapper() - Constructor for class org.apache.tika.eval.core.langid.LanguageIDWrapper
- LanguageNames - Class in org.apache.tika.language.detect
-
Support for language tags (as defined by https://tools.ietf.org/html/bcp47)
- LanguageNames() - Constructor for class org.apache.tika.language.detect.LanguageNames
- LANGUAGENESS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- LANGUAGENESS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- LanguageResource - Class in org.apache.tika.server.core.resource
- LanguageResource() - Constructor for class org.apache.tika.server.core.resource.LanguageResource
- LanguageResult - Class in org.apache.tika.language.detect
- LanguageResult(String, LanguageConfidence, float) - Constructor for class org.apache.tika.language.detect.LanguageResult
- LanguageResult(String, LanguageConfidence, float, float) - Constructor for class org.apache.tika.language.detect.LanguageResult
- LanguageWriter - Class in org.apache.tika.language.detect
-
Writer that builds a language profile based on all the written content.
- LanguageWriter(LanguageDetector) - Constructor for class org.apache.tika.language.detect.LanguageWriter
- largeLength - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Gets or sets an optional compact uint64 that specifies the length in bytes for additional data (if any).
- LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the last (most recent) author of a document
- LAST_MODIFIED_BY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The user who performed the last modification.
- LAST_PRINTED - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The date and time of the last printing.
- LAST_WINS - Enum constant in enum class org.apache.tika.parser.multiple.AbstractMultipleParser.MetadataPolicy
-
The last parser to output a given key wins, overriding previous parser values for a clashing key.
- LastModifiedTime - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LastModifiedTimeStamp - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- lastProgressMillis() - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Extracts the last-progress timestamp from a WORKING message payload.
- LATIN - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- Latin1StringsParser - Class in org.apache.tika.parser.strings
-
Parser to extract printable Latin1 strings from arbitrary files with pure java without running any external process.
- Latin1StringsParser() - Constructor for class org.apache.tika.parser.strings.Latin1StringsParser
- LATITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Latitude of the Point
- LATITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- LAYER_1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for audio layer 1.
- LAYER_2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for audio layer 2.
- LAYER_3 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for audio layer 3.
- LayoutAlignmentInParent - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutAlignmentSelf - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutCollisionPriority - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutMaxHeight - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutMaxWidth - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutMinimumOutlineWidth - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutOutlineReservedWidth - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutResolveChildCollisions - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutTightAlignment - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LayoutTightLayout - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- LeafNodeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- LeafNodeObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Intermediate Node Object
- LeafNodeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Initializes a new instance of the LeafNodeObjectData class.
- LeafNodeObject.IntermediateNodeObjectBuilder - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The class is used to build an intermediate node object.
- leftPad(String, int, char) - Static method in class org.apache.tika.utils.StringUtils
- leftPad(String, int, String) - Static method in class org.apache.tika.utils.StringUtils
-
Left pad a String with a specified String.
- leftShift(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- LeipzigHelper - Class in org.apache.tika.eval.app.tools
- LeipzigHelper() - Constructor for class org.apache.tika.eval.app.tools.LeipzigHelper
- LeipzigSampler - Class in org.apache.tika.eval.app.tools
- LeipzigSampler() - Constructor for class org.apache.tika.eval.app.tools.LeipzigSampler
- length - Variable in class org.apache.tika.ml.chardetect.HtmlByteStripper.Result
-
Content byte count written into the destination.
- length - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
- length - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
- LENGTH - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- lengthTreeLengtsTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- lengthTreeTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- LevelTuple(int, int, String, String, boolean) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.LevelTuple
- LevelTuple(String) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.LevelTuple
- LibPstParser - Class in org.apache.tika.parser.microsoft.libpst
-
This is an optional PST parser that relies on the user installing the GPL-3 libpst/readpst commandline tool and configuring Tika to call this library via tika-config.xml
- LibPstParser() - Constructor for class org.apache.tika.parser.microsoft.libpst.LibPstParser
- LibPstParser(JsonConfig) - Constructor for class org.apache.tika.parser.microsoft.libpst.LibPstParser
- LibPstParser(LibPstParserConfig) - Constructor for class org.apache.tika.parser.microsoft.libpst.LibPstParser
- LibPstParserConfig - Class in org.apache.tika.parser.microsoft.libpst
- LibPstParserConfig() - Constructor for class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- LibPstParserConfig.RuntimeConfig - Class in org.apache.tika.parser.microsoft.libpst
-
RuntimeConfig blocks modification of security-sensitive path fields at runtime.
- LICENSE_LOCATION - Static variable in interface org.apache.tika.metadata.CreativeCommons
- LICENSE_URL - Static variable in interface org.apache.tika.metadata.CreativeCommons
- LICENSOR - Static variable in interface org.apache.tika.metadata.IPTC
-
A person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
The city of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
The country of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
-
The email of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_EXTENDED_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The extended address of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated. Use IPTC.LICENSOR_ID
- LICENSOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The postal code of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The region of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_STREET_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The street address of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_TELEPHONE_1 - Static variable in interface org.apache.tika.metadata.IPTC
-
The phone number of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_TELEPHONE_2 - Static variable in interface org.apache.tika.metadata.IPTC
-
The phone number of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LICENSOR_URL - Static variable in interface org.apache.tika.metadata.IPTC
-
The URL of a person or company that should be contacted to obtain a licence for using the item or who has licensed the item.
- LIKELY_UTF8 - Enum constant in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
-
Sample is grammatically valid UTF-8 and contains at least one complete multi-byte sequence.
- LINE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of lines in the document
- LinearModel - Class in org.apache.tika.ml
-
INT8-quantized multinomial logistic regression model for classification.
- LinearModel(int, int, String[], float[], float[], byte[][]) - Constructor for class org.apache.tika.ml.LinearModel
-
Construct without calibration (V1-compatible).
- LinearModel(int, int, String[], float[], float[], byte[][], float[], float[]) - Constructor for class org.apache.tika.ml.LinearModel
-
Construct with optional calibration.
- lineTo(float, float) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- lingerMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the lingerMs record component.
- Lingo24LangDetector - Class in org.apache.tika.langdetect.lingo24
-
An implementation of a Language Detector using the Premium MT API v1.
- Lingo24LangDetector() - Constructor for class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
-
Default constructor which first checks for the presence of the langdetect.lingo24.properties file to set the API Key.
- Lingo24Translator - Class in org.apache.tika.language.translate.impl
-
An implementation of a REST client for the Premium MT API v1.
- Lingo24Translator() - Constructor for class org.apache.tika.language.translate.impl.Lingo24Translator
- Link - Class in org.apache.tika.sax
- Link(String, String, String, String) - Constructor for class org.apache.tika.sax.Link
- Link(String, String, String, String, String) - Constructor for class org.apache.tika.sax.Link
- LinkContentHandler - Class in org.apache.tika.sax
-
Content handler that collects links from an XHTML document.
- LinkContentHandler() - Constructor for class org.apache.tika.sax.LinkContentHandler
-
Default constructor
- LinkContentHandler(boolean) - Constructor for class org.apache.tika.sax.LinkContentHandler
-
Default constructor
- LinkedCell - Class in org.apache.tika.parser.microsoft
-
Linked cell.
- LinkedCell(Cell, String) - Constructor for class org.apache.tika.parser.microsoft.LinkedCell
- linkedOLERef(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- linkedOLERef(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
-
Called when a linked (vs embedded) OLE object is found.
- listAllTypes() - Static method in class org.apache.tika.example.MediaTypeExample
- ListDescriptor - Class in org.apache.tika.parser.microsoft.rtf
-
Contains the information for a single list in the list or list override tables.
- ListDescriptor() - Constructor for class org.apache.tika.parser.microsoft.rtf.ListDescriptor
- listFetchers(ListFetchersRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
List fetchers that are currently in the fetcher store.
- listFetchers(ListFetchersRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
List fetchers that are currently in the fetcher store.
- listFetchers(ListFetchersRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
List fetchers that are currently in the fetcher store.
- listFetchers(ListFetchersRequest, StreamObserver<ListFetchersReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
List fetchers that are currently in the fetcher store.
- listFetchers(ListFetchersRequest, StreamObserver<ListFetchersReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
List fetchers that are currently in the fetcher store.
- ListFetchersReply - Class in org.apache.tika
-
Protobuf type tika.ListFetchersReply
- ListFetchersReply.Builder - Class in org.apache.tika
-
Protobuf type
tika.ListFetchersReply - ListFetchersReplyOrBuilder - Interface in org.apache.tika
- ListFetchersRequest - Class in org.apache.tika
-
Protobuf type
tika.ListFetchersRequest - ListFetchersRequest.Builder - Class in org.apache.tika
-
Protobuf type
tika.ListFetchersRequest - ListFetchersRequestOrBuilder - Interface in org.apache.tika
- ListFont - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- listLevelMap - Variable in class org.apache.tika.parser.microsoft.AbstractListManager
- ListManager - Class in org.apache.tika.parser.microsoft
-
Computes the number text which goes at the beginning of each list paragraph
- ListManager(HWPFDocument) - Constructor for class org.apache.tika.parser.microsoft.ListManager
-
Ordinary constructor for a new list reader
- ListMSAAIndex - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ListNodes - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ListRestart - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ListSpacingMu - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- listZipEntries(String) - Static method in class org.apache.tika.example.ZipListFiles
- LITTLE - Static variable in class org.apache.tika.metadata.MachineMetadata.Endian
- LITTLEENDIAN_16_BIT - Enum constant in enum class org.apache.tika.parser.strings.StringsEncoding
- LITTLEENDIAN_32_BIT - Enum constant in enum class org.apache.tika.parser.strings.StringsEncoding
- LittleEndianBitConverter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
-
Implement a converter which converts to/from little-endian byte arrays
- load() - Static method in class org.apache.tika.langdetect.charsoup.ConfusableGroups
-
Load and return the confusable groups.
- load() - Static method in class org.apache.tika.server.core.TikaServerConfig
-
Config with only the defaults
- load(int, Path) - Static method in class org.apache.tika.pipes.core.server.PipesServer
- load(InputStream) - Static method in class org.apache.tika.config.loader.TikaJsonConfig
-
Loads configuration from an input stream.
- load(InputStream) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Load a model from an input stream.
- load(InputStream) - Static method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Loads a model from an InputStream.
- load(InputStream) - Static method in class org.apache.tika.ml.LinearModel
-
Load a model from an input stream.
- load(Class<T>) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Loads a configuration object using the class name converted to kebab-case.
- load(Class<T>, T) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Loads a configuration object using the class name, with a default value.
- load(String) - Static method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
- load(String) - Static method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
- load(String) - Static method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- load(String) - Static method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- load(String) - Static method in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorConfig
- load(String) - Static method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
- load(String) - Static method in record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
- load(String) - Static method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
- load(String) - Static method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
- load(String) - Static method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
- load(String, Class<T>) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Loads a configuration object from the specified JSON key.
- load(String, Class<T>, T) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Loads a configuration object from the specified JSON key, with a default value.
- load(Path) - Static method in class org.apache.tika.config.loader.TikaJsonConfig
-
Loads configuration from a file.
- load(Path) - Static method in class org.apache.tika.config.loader.TikaLoader
-
Loads a Tika configuration from a file.
- load(Path) - Static method in class org.apache.tika.eval.app.EvalConfig
- load(Path) - Static method in class org.apache.tika.pipes.core.async.AsyncProcessor
-
Loads an AsyncProcessor from a configuration file path.
- load(Path) - Static method in class org.apache.tika.pipes.core.PipesParser
-
Loads a PipesParser from a configuration file path.
- load(Path) - Static method in class org.apache.tika.plugins.TikaPluginManager
-
Loads plugin manager from a configuration file.
- load(Path, ClassLoader) - Static method in class org.apache.tika.config.loader.TikaLoader
-
Loads a Tika configuration from a file with a specific class loader.
- load(Path, Map<String, Object>) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Loads a JSON config template from a file path and applies replacements.
- load(Path, PipesIterator) - Static method in class org.apache.tika.pipes.core.async.AsyncProcessor
-
Loads an AsyncProcessor from a configuration file path with a custom PipesIterator.
- load(CommandLine) - Static method in class org.apache.tika.server.core.TikaServerConfig
- load(TikaJsonConfig) - Static method in class org.apache.tika.pipes.core.PipesConfig
-
Loads PipesConfig from the "pipes" section of the JSON configuration.
- load(TikaJsonConfig) - Static method in class org.apache.tika.plugins.TikaPluginManager
-
Loads plugin manager from a pre-parsed TikaJsonConfig.
- load(TikaJsonConfig, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
- load(TikaJsonConfig, LoaderContext) - Method in interface org.apache.tika.config.loader.ComponentLoader
-
Load components from the JSON config.
- load(TikaJsonConfig, PipesConfig, Path) - Static method in class org.apache.tika.pipes.core.PipesParser
-
Loads a PipesParser from pre-loaded configuration objects.
- load(TikaLoader, PipesConfig) - Static method in class org.apache.tika.pipes.core.server.SharedServerResources
-
Loads shared server resources from configuration.
- load(PluginManager, TikaJsonConfig) - Static method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Loads an EmitterManager without allowing runtime modifications.
- load(PluginManager, TikaJsonConfig) - Static method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Loads a FetcherManager without allowing runtime modifications.
- load(PluginManager, TikaJsonConfig) - Static method in class org.apache.tika.pipes.core.pipesiterator.PipesIteratorManager
- load(PluginManager, TikaJsonConfig) - Static method in class org.apache.tika.pipes.core.reporter.ReporterManager
- load(PluginManager, TikaJsonConfig, boolean) - Static method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Loads an EmitterManager with optional support for runtime modifications.
- load(PluginManager, TikaJsonConfig, boolean) - Static method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Loads a FetcherManager with optional support for runtime modifications.
- load(PluginManager, TikaJsonConfig, boolean, ConfigStore) - Static method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Loads an EmitterManager with optional support for runtime modifications and a custom config store.
- load(PluginManager, TikaJsonConfig, boolean, ConfigStore) - Static method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Loads a FetcherManager with optional support for runtime modifications and a custom config store.
- loadAsList() - Method in class org.apache.tika.serialization.ComponentConfig.Builder
-
Configure this component to be loaded as a list from JSON.
- loadAutoDetectParser() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns an AutoDetectParser configured with this loader's parsers and detectors.
- loadCommonTokens(Path, String) - Static method in class org.apache.tika.eval.app.ProfilerBase
- loadComponent(String, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Load a single component from config.
- loadComponent(String, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.DetectorLoader
- loadComponent(String, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.EncodingDetectorLoader
- loadComponent(String, JsonNode, LoaderContext) - Method in class org.apache.tika.config.loader.ParserLoader
- loadConfig(Class<T>, T) - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads a configuration object from the "parse-context" section, merging with defaults.
- loadConfig(String, Class<T>, T) - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads a configuration object from the "parse-context" section by explicit key, merging with defaults.
- loadContentHandlerFactory() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns the content handler factory.
- loadDefault() - Static method in class org.apache.tika.config.loader.TikaJsonConfig
-
Creates an empty configuration (no config file).
- loadDefault() - Static method in class org.apache.tika.config.loader.TikaLoader
-
Creates a default Tika loader with no configuration file.
- loadDefault(ClassLoader) - Static method in class org.apache.tika.config.loader.TikaLoader
-
Creates a default Tika loader with no configuration file and a specific class loader.
- loadDefaultModels(File) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDefaultModels(InputStream) - Method in class org.apache.tika.detect.NNExampleModelDetector
- loadDefaultModels(InputStream) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDefaultModels(ClassLoader) - Method in class org.apache.tika.detect.NNExampleModelDetector
-
This method is overridden to register and load neural network models.
- loadDefaultModels(ClassLoader) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDefaultModels(Path) - Method in class org.apache.tika.detect.TrainedModelDetector
- loadDetectors() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns all detectors.
- loadDynamicServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the available dynamic service providers of the given type.
- loadEncodingDetectors() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns all encoding detectors.
- LoaderContext - Class in org.apache.tika.config.loader
-
Shared context passed to ComponentLoaders.
- LoaderContext(ClassLoader, ObjectMapper, LoaderContext.DependencyProvider) - Constructor for class org.apache.tika.config.loader.LoaderContext
- LoaderContext.DependencyProvider - Interface in org.apache.tika.config.loader
-
Interface for lazy access to cross-component dependencies.
- loadExtract(Path) - Method in class org.apache.tika.eval.app.io.ExtractReader
- loadFromClasspath() - Static method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Loads the bundled model from the classpath.
- loadFromClasspath(String) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Load a model from the classpath.
- loadFromClasspath(String) - Static method in class org.apache.tika.ml.LinearModel
-
Load a model from the classpath.
- loadFromPath(Path) - Static method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Loads a model from the given file path.
- loadFromPath(Path) - Static method in class org.apache.tika.ml.LinearModel
-
Load a model from a file on disk.
- loadFromPaths(String) - Static method in class org.apache.tika.plugins.TikaPluginManager
-
Loads plugin manager from a comma-separated string of paths.
- loadFromResource(String, Class<?>, Map<String, Object>) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Loads a JSON config template from a resource path and applies replacements.
- loadFromString(String, Map<String, Object>) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Loads a JSON config template from a string and applies replacements.
- loadGlobalSettings() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads global configuration settings from the JSON config.
- loadInstances(PluginManager, Class<? extends TikaExtensionFactory<T>>, JsonNode) - Static method in class org.apache.tika.plugins.PluginComponentLoader
-
Load multiple named instances from config, grouped by type.
- loadLinkedRelationships(PackagePart, boolean, Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
This is used by the SAX docx and pptx decorators to load hyperlinks and other linked objects
- loadMetadataFilters() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns all metadata filters.
- loadModels() - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- loadModels() - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- loadModels() - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- loadModels() - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
-
No-op.
- loadModels() - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- loadModels() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Load (or re-load) all available language models.
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
-
NOT SUPPORTED.
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- loadModels(Set<String>) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Load (or re-load) the models specified in the given set.
- loadParseContext() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns a ParseContext populated with components from the "parse-context" section.
- loadParsers() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns all parsers.
- loadRenderers() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns all renderers.
- loadServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns all the available service providers of the given type.
- loadSingleton(PluginManager, Class<? extends TikaExtensionFactory<T>>, JsonNode) - Static method in class org.apache.tika.plugins.PluginComponentLoader
-
Load a singleton component from config.
- loadStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
- loadStaticServiceProviders(Class<T>, Collection<Class<? extends T>>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the available static service providers of the given type.
- loadTranslator() - Method in class org.apache.tika.config.loader.TikaLoader
-
Loads and returns the translator.
- loadUnnamedInstances(PluginManager, Class<? extends TikaExtensionFactory<T>>, JsonNode) - Static method in class org.apache.tika.plugins.PluginComponentLoader
-
Load multiple unnamed instances from config, keyed by type name.
- loadWithDefaults(Class<T>, T) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Loads a configuration object by class name with defaults, merging JSON properties.
- loadWithDefaults(String, Class<T>, T) - Method in class org.apache.tika.config.loader.ConfigLoader
-
Loads a configuration object by merging JSON properties into a copy of the default instance.
- LOCAL_FILE_HEADER - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ZipHeader
-
The file header in zip.
- LOCAL_HEADER_ONLY_ENTRIES - Static variable in interface org.apache.tika.metadata.Zip
-
Entry names that exist in local headers but not in central directory.
- LOCAL_NAME_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- LOCALE - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- Location - Class in org.apache.tika.parser.geo.topic.gazetteer
- Location() - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.Location
- LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
- LOCATION - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- LOCATION_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
The location the content of the item was created.
- LOCATION_CREATED_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city of a location.
- LOCATION_CREATED_COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The ISO code of a country of a location.
- LOCATION_CREATED_COUNTRY_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a country of a location.
- LOCATION_CREATED_PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a subregion of a country - a province or state - of a location.
- LOCATION_CREATED_SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation.
- LOCATION_CREATED_WORLD_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a world region of a location.
- LOCATION_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- LOCATION_SHOWN - Static variable in interface org.apache.tika.metadata.IPTC
-
A location the content of the item is about.
- LOCATION_SHOWN_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city of a location.
- LOCATION_SHOWN_COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The ISO code of a country of a location.
- LOCATION_SHOWN_COUNTRY_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a country of a location.
- LOCATION_SHOWN_PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a subregion of a country - a province or state - of a location.
- LOCATION_SHOWN_SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation.
- LOCATION_SHOWN_WORLD_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a world region of a location.
- Locators - Class in org.apache.tika.inference.locator
-
Container for all locator types that identify where a chunk comes from in the original content.
- Locators() - Constructor for class org.apache.tika.inference.locator.Locators
- LOG - Static variable in class org.apache.tika.parser.hwp.HwpTextExtractorV5
- LOG - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
- LOG - Static variable in interface org.apache.tika.pipes.core.config.ConfigStoreFactory
- LOG - Static variable in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- LOG_COMMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"User's log comments."
- LOG_LEVELS - Static variable in class org.apache.tika.server.core.TikaServerConfig
- LOG_LEVELS - Static variable in class org.apache.tika.server.core.TikaServerProcess
- logRequest(Logger, String, Metadata) - Static method in class org.apache.tika.server.core.resource.TikaResource
- LONGITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Longitude of the Point
- LONGITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- longValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- LookaheadInputStream - Class in org.apache.tika.io
-
Stream wrapper that makes it easy to read up to n bytes ahead from a stream that supports the mark feature.
- LookaheadInputStream(InputStream, int) - Constructor for class org.apache.tika.io.LookaheadInputStream
-
Creates a lookahead wrapper for the given input stream.
- looksLikeUTF8() - Method in class org.apache.tika.detect.TextStatistics
-
Checks whether the observed byte stream looks like UTF-8 encoded text.
- lookup(int) - Static method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- lookup(int) - Static method in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
-
Looks up a message type by its wire byte.
- LOOP - Static variable in interface org.apache.tika.metadata.XMPDM
-
"When true, the clip can be looped seamlessly."
- LOW - Enum constant in enum class org.apache.tika.language.detect.LanguageConfidence
- LOWEST_VERSION - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Lowest version.
- LuceneIndexer - Class in org.apache.tika.example
- LuceneIndexer(Tika, IndexWriter) - Constructor for class org.apache.tika.example.LuceneIndexer
- LuceneIndexerExtended - Class in org.apache.tika.example
- LuceneIndexerExtended(IndexWriter, Tika) - Constructor for class org.apache.tika.example.LuceneIndexerExtended
- LyricsHandler - Class in org.apache.tika.parser.mp3
-
This is used to parse Lyrics3 tag information from an MP3 file, if available.
- LyricsHandler(byte[]) - Constructor for class org.apache.tika.parser.mp3.LyricsHandler
-
Looks for the Lyrics data, which will be just before the ID3v1 data (if present), and processes it.
- LyricsHandler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.LyricsHandler
- LZ4_BLOCK - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- LZ4_FRAMED - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- LZMA - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- LZX_ALIGNED_MAXSYMBOLS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_ALIGNED_NUM_ELEMENTS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_ALIGNED_TABLEBITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_BLOCKTYPE_ALIGNED - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_BLOCKTYPE_INVALID - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_BLOCKTYPE_UNCOMPRESSED - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_BLOCKTYPE_VERBATIM - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_LENGTH_MAXSYMBOLS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_LENGTH_TABLEBITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_LENTABLE_SAFETY - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_MAIN_MAXSYMBOLS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_MAINTREE_MAXSYMBOLS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_MAINTREE_TABLEBITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_MAX_MATCH - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_MIN_MATCH - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_NUM_CHARS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_NUM_PRIMARY_LENGTHS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_NUM_SECONDARY_LENGTHS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_PRETREE_MAXSYMBOLS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_PRETREE_NUM_ELEMENTS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_PRETREE_NUM_ELEMENTS_BITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZX_PRETREE_TABLEBITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- LZXC - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
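The LookaheadInputStream entry above describes a wrapper for reading up to n bytes ahead from a stream that supports the mark feature. As a rough illustration of that underlying mark/reset idea, here is a minimal sketch using only java.io. The class LookaheadSketch and its peek method are hypothetical names chosen for this example; they are not part of Tika and do not show how LookaheadInputStream is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Tika: peeks at the first bytes of a stream
// with the standard mark/reset mechanism and then rewinds it.
final class LookaheadSketch {

    // Reads up to n bytes ahead, then resets the stream so the caller can
    // read the same bytes again. The stream must support mark/reset.
    static byte[] peek(InputStream in, int n) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(n);
        try {
            byte[] buf = new byte[n];
            int total = 0;
            while (total < n) {
                int read = in.read(buf, total, n - total);
                if (read < 0) {
                    break; // end of stream reached before n bytes
                }
                total += read;
            }
            return Arrays.copyOf(buf, total);
        } finally {
            in.reset(); // rewind to the marked position
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(
                "%PDF-1.7 rest".getBytes(StandardCharsets.US_ASCII));
        byte[] head = peek(in, 4);
        System.out.println(new String(head, StandardCharsets.US_ASCII));
        // The stream was rewound, so the next read returns the first byte again.
        System.out.println((char) in.read());
    }
}
```

A stream that does not support mark (for example a raw FileInputStream) can be wrapped in a java.io.BufferedInputStream before calling peek.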
M
- MACHINE_ALPHA - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_ARM - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_EFI - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_IA_64 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_M32R - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_M68K - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_M88K - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_MIPS - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_PPC - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_S370 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_S390 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_SH3 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_SH4 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_SH5 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_SPARC - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_TYPE - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_UNKNOWN - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_VAX - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_x86_32 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MACHINE_x86_64 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- MachineMetadata - Interface in org.apache.tika.metadata
-
Metadata for describing machines, such as their architecture, type and endian-ness
- MachineMetadata.Endian - Class in org.apache.tika.metadata
- MACRO - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- MAGIC - Static variable in class org.apache.tika.ml.chardetect.tools.TrainNaiveBayesBigram
-
Binary magic for the saved model — "NBB3".
- MAGIC - Static variable in class org.apache.tika.ml.LinearModel
- magic_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- MAGIC_PRIORITY_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MAGIC_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- magic_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- MagicDetector - Class in org.apache.tika.detect
-
Content type detection based on magic bytes, i.e. type-specific patterns near the beginning of the document input stream.
- MagicDetector(MediaType, byte[]) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that have the exact given byte pattern at the beginning of the document stream.
- MagicDetector(MediaType, byte[], byte[], boolean, boolean, int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic match.
- MagicDetector(MediaType, byte[], byte[], boolean, int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic match.
- MagicDetector(MediaType, byte[], byte[], int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic match.
- MagicDetector(MediaType, byte[], int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that have the exact given byte pattern at the given offset of the document stream.
- MAGIKA_DESCRIPTION - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_ERRORS - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_GROUP - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_IS_TEXT - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_LABEL - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_MIME - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_PREFIX - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_SCORE - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_STATUS - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MAGIKA_VERSION - Static variable in class org.apache.tika.detect.magika.MagikaDetector
- MagikaDetector - Class in org.apache.tika.detect.magika
-
Simple wrapper around Google's magika (https://github.com/google/magika). The tool must be installed on the host where Tika is running.
- MagikaDetector() - Constructor for class org.apache.tika.detect.magika.MagikaDetector
-
Default constructor.
- MagikaDetector(JsonConfig) - Constructor for class org.apache.tika.detect.magika.MagikaDetector
-
Constructor for JSON configuration.
- MagikaDetector.Config - Class in org.apache.tika.detect.magika
-
Configuration class for JSON deserialization.
- MagikaDetector.RuntimeConfig - Class in org.apache.tika.detect.magika
-
RuntimeConfig blocks modification of security-sensitive path fields at runtime.
- MAIL_MAX_SIZE - Static variable in class org.apache.tika.parser.mbox.MboxParser
- MailDateParser - Class in org.apache.tika.parser.mailcommons
-
Dates in emails are a mess.
- MailDateParser() - Constructor for class org.apache.tika.parser.mailcommons.MailDateParser
- MailUtil - Class in org.apache.tika.parser.mailcommons
- MailUtil() - Constructor for class org.apache.tika.parser.mailcommons.MailUtil
- main(String[]) - Static method in class org.apache.tika.async.cli.TikaAsyncCLI
- main(String[]) - Static method in class org.apache.tika.cli.TikaCLI
- main(String[]) - Static method in class org.apache.tika.eval.app.ExtractComparerRunner
- main(String[]) - Static method in class org.apache.tika.eval.app.ExtractProfileRunner
- main(String[]) - Static method in class org.apache.tika.eval.app.reports.ResultsReporter
- main(String[]) - Static method in class org.apache.tika.eval.app.TikaEvalCLI
- main(String[]) - Static method in class org.apache.tika.eval.app.tools.CommonTokenOverlapCounter
- main(String[]) - Static method in class org.apache.tika.eval.app.tools.LeipzigSampler
- main(String[]) - Static method in class org.apache.tika.eval.app.tools.TrainTestSplit
- main(String[]) - Static method in class org.apache.tika.example.CustomMimeInfo
- main(String[]) - Static method in class org.apache.tika.example.DescribeMetadata
- main(String[]) - Static method in class org.apache.tika.example.DirListParser
- main(String[]) - Static method in class org.apache.tika.example.DisplayMetInstance
- main(String[]) - Static method in class org.apache.tika.example.GrabPhoneNumbersExample
- main(String[]) - Static method in class org.apache.tika.example.LuceneIndexerExtended
- main(String[]) - Static method in class org.apache.tika.example.MediaTypeExample
- main(String[]) - Static method in class org.apache.tika.example.MyFirstTika
- main(String[]) - Static method in class org.apache.tika.example.PipesForkParserExample
- main(String[]) - Static method in class org.apache.tika.example.RollbackSoftware
- main(String[]) - Static method in class org.apache.tika.example.SimpleTextExtractor
- main(String[]) - Static method in class org.apache.tika.example.SimpleTypeDetector
- main(String[]) - Static method in class org.apache.tika.example.SpringExample
- main(String[]) - Static method in class org.apache.tika.example.StandardsExtractionExample
- main(String[]) - Static method in class org.apache.tika.example.TranscribeTranslateExample
-
Main method to run this example.
- main(String[]) - Static method in class org.apache.tika.example.ZipListFiles
- main(String[]) - Static method in class org.apache.tika.gui.TikaGUI
-
Main method.
- main(String[]) - Static method in class org.apache.tika.ml.chardetect.tools.BenchmarkCharsetDetectors
- main(String[]) - Static method in class org.apache.tika.ml.chardetect.tools.BuildCharsetTrainingData
- main(String[]) - Static method in class org.apache.tika.ml.chardetect.tools.EvalCharsetDetectors
- main(String[]) - Static method in class org.apache.tika.ml.chardetect.tools.TrainNaiveBayesBigram
- main(String[]) - Static method in class org.apache.tika.ml.junkdetect.tools.BuildJunkTrainingData
- main(String[]) - Static method in class org.apache.tika.ml.junkdetect.tools.EvalJunkDetector
- main(String[]) - Static method in class org.apache.tika.ml.junkdetect.tools.TrainJunkModel
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmSection
- main(String[]) - Static method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- main(String[]) - Static method in class org.apache.tika.pipes.core.server.PipesServer
- main(String[]) - Static method in class org.apache.tika.pipes.grpc.TikaGrpcServer
-
Main launches the server from the command line.
- main(String[]) - Static method in class org.apache.tika.server.client.TikaClientCLI
- main(String[]) - Static method in class org.apache.tika.server.core.TikaServerCli
- main(String[]) - Static method in class org.apache.tika.server.core.TikaServerProcess
- mainLoop() - Method in class org.apache.tika.pipes.core.server.PipesServer
- mainTreeLengtsTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- mainTreeTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- MAJOR_VERSION - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Major version.
- makeName(String, String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
- MANAGER - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- manifestMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
- manifestMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
- mapAttributes(Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
- MAPI - Interface in org.apache.tika.metadata
-
Properties that typically appear in MSG/PST message format files.
- MAPITag - Class in org.apache.tika.parser.microsoft.msg
- MAPITag(int, String, ClassID) - Constructor for class org.apache.tika.parser.microsoft.msg.MAPITag
- mappings - Variable in class org.apache.tika.metadata.filter.FieldNameMappingFilter.Config
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
Normalizes an attribute name.
- mapSafeAttribute(String, String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Maps "safe" HTML attribute names to semantic XHTML equivalents.
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
- mapSafeElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Maps "safe" HTML element names to semantic XHTML equivalents.
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
- mapStatusToHttpResponse(PipesResult.RESULT_STATUS) - Static method in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Maps PipesResult status to HTTP response status.
- MarianServerClient(URI, File) - Constructor for class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
-
Marian Server Web Socket Client.
- MarianTranslator - Class in org.apache.tika.language.translate.impl
-
Translator that uses the Marian NMT decoder for translation.
- MarianTranslator() - Constructor for class org.apache.tika.language.translate.impl.MarianTranslator
-
Default constructor.
- MarianTranslator.MarianServerClient - Class in org.apache.tika.language.translate.impl
-
Internal Client for marian-server Web Socket Server.
- mark(int) - Method in class org.apache.tika.io.BoundedInputStream
- mark(int) - Method in class org.apache.tika.io.LookaheadInputStream
- mark(int) - Method in class org.apache.tika.io.TailStream
-
This implementation saves the internal state including the content of the tail buffer so that it can be restored when reset() is called later.
- mark(int) - Method in class org.apache.tika.io.TikaInputStream
- MARKDOWN - Enum constant in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- MarkdownChunker - Class in org.apache.tika.inference
-
Splits markdown text into chunks that respect structural boundaries.
- MarkdownChunker(int, int) - Constructor for class org.apache.tika.inference.MarkdownChunker
- MarkdownSummaryWriter - Class in org.apache.tika.eval.app.reports
-
Writes a markdown summary of a tika-eval comparison run.
- MarkdownSummaryWriter() - Constructor for class org.apache.tika.eval.app.reports.MarkdownSummaryWriter
- MARKED - Static variable in interface org.apache.tika.metadata.XMPRights
-
When true, indicates that this is a rights-managed resource.
- markLimit - Variable in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- markServerForRestart() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- markServerForRestart() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Marks the server for restart due to a fatal error (OOM, timeout, etc.).
- markServerForRestart() - Method in class org.apache.tika.pipes.core.SharedServerManager
-
Marks the server for restart due to a fatal error (OOM, timeout).
- markSupported() - Method in class org.apache.tika.io.BoundedInputStream
- markSupported() - Method in class org.apache.tika.io.LookaheadInputStream
- markSupported() - Method in class org.apache.tika.io.TikaInputStream
- MATCH_MASK_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_MINSHOULDMATCH_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_OFFSET_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MATCH_VALUE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- Matcher - Class in org.apache.tika.sax.xpath
-
XPath element matcher.
- Matcher() - Constructor for class org.apache.tika.sax.xpath.Matcher
- matches(byte[]) - Method in class org.apache.tika.detect.MagicDetector
-
Checks if the given byte array matches this magic pattern.
- matches(byte[]) - Method in class org.apache.tika.mime.MimeType
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.AttributeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches the named attribute of the element associated with this evaluation state.
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NamedAttributeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NodeMatcher
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.ElementMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches the element associated with this evaluation state.
- matchesElement() - Method in class org.apache.tika.sax.xpath.NodeMatcher
- matchesElement() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- matchesMagic(byte[]) - Method in class org.apache.tika.mime.MimeType
- matchesText() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
- matchesText() - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches all text nodes whose parent is the element associated with this evaluation state.
- matchesText() - Method in class org.apache.tika.sax.xpath.NodeMatcher
- matchesText() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
- matchesText() - Method in class org.apache.tika.sax.xpath.TextMatcher
- MatchingContentHandler - Class in org.apache.tika.sax.xpath
-
Content handler decorator that only passes the elements, attributes, and text nodes that match the given XPath expression.
- MatchingContentHandler(ContentHandler, Matcher) - Constructor for class org.apache.tika.sax.xpath.MatchingContentHandler
- MathFormatting - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- MATLAB_MIME_TYPE - Static variable in class org.apache.tika.parser.mat.MatParser
- MatParser - Class in org.apache.tika.parser.mat
- MatParser() - Constructor for class org.apache.tika.parser.mat.MatParser
- MatroskaDetector - Class in org.apache.tika.detect
-
Detector for Matroska (MKV and WEBM) files based on the EBML header.
- MatroskaDetector() - Constructor for class org.apache.tika.detect.MatroskaDetector
- max(UByte, UByte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two UByte values.
- max(UInteger, UInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two UInteger values.
- max(ULong, ULong) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two ULong values.
- max(UShort, UShort) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the greater of two UShort values.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the maximum value an unsigned byte can have as UByte, 2^8-1.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the maximum value an unsigned int can have as UInteger, 2^32-1.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the maximum value + 1 a signed long can have as ULong, 2^63.
- MAX - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the maximum value an unsigned short can have as UShort, 2^16-1.
- MAX_AVAIL_HEIGHT - Static variable in interface org.apache.tika.metadata.IPTC
-
The maximum available height in pixels of the original photo from which this photo has been derived by downsizing.
- MAX_AVAIL_WIDTH - Static variable in interface org.apache.tika.metadata.IPTC
-
The maximum available width in pixels of the original photo from which this photo has been derived by downsizing.
- MAX_COUNT - Enum constant in enum class org.apache.tika.exception.EmbeddedLimitReachedException.LimitType
- MAX_DEPTH - Enum constant in enum class org.apache.tika.exception.EmbeddedLimitReachedException.LimitType
- MAX_IMAGE_LENGTH_BYTES - Static variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- MAX_PAYLOAD_BYTES - Static variable in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Maximum payload size: 100 MB (same as old MAX_FETCH_EMIT_TUPLE_BYTES).
- MAX_PROBE_BYTES - Static variable in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Default number of probe bytes read.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the maximum value an unsigned byte can have, 2^8-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the maximum value an unsigned int can have, 2^32-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the maximum value an unsigned long can have, 2^64-1.
- MAX_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the maximum value an unsigned short can have, 2^16-1.
- MAX_VALUE_LONG - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the maximum value + 1 a signed long can have, 2^63.
- maxBlockMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the maxBlockMs record component.
- maxConnections() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the maxConnections record component.
- maxEmbeddedCount() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns the value of the maxEmbeddedCount record component.
- MAXIMUM_TEXT_CHUNK_SIZE - Variable in class org.apache.tika.example.ContentHandlerExample
- maxInFlightRequestsPerConnection() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the maxInFlightRequestsPerConnection record component.
- maxRequestSize() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the maxRequestSize record component.
- maxRetries() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the maxRetries record component.
- maxStringLength() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the maxStringLength record component.
- MAXSUBREQUESTID - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Specify the max sub request ID.
- MAXTOKENVALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
-
Specify the max token value.
- maybeDigest(TikaInputStream, Metadata, ParseContext) - Static method in class org.apache.tika.digest.DigestHelper
-
Computes digests on the stream if a DigesterFactory is configured in ParseContext.
- MBOX_MIME_TYPE - Static variable in class org.apache.tika.parser.mbox.MboxParser
- MBOX_RECORD_DIVIDER - Static variable in class org.apache.tika.parser.mbox.MboxParser
- MboxParser - Class in org.apache.tika.parser.mbox
-
Mbox (mailbox) parser.
- MboxParser() - Constructor for class org.apache.tika.parser.mbox.MboxParser
- MD_KEY_PREFIX - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
- MD2 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- MD5 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- MD5 - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- MDB_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
- MDB_PW - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
- MEDIA_TYPE - Static variable in class org.apache.tika.parser.pdf.PDFParser
- MEDIA_TYPES - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
- mediatype() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns the value of the mediatype record component.
- mediatype() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the mediatype record component.
- MediaType - Class in org.apache.tika.mime
-
Internet media type.
- MediaType(String, String) - Constructor for class org.apache.tika.mime.MediaType
- MediaType(String, String, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
- MediaType(MediaType, String, String) - Constructor for class org.apache.tika.mime.MediaType
-
Creates a media type by adding a parameter to a base type.
- MediaType(MediaType, Charset) - Constructor for class org.apache.tika.mime.MediaType
-
Creates a media type by adding the "charset" parameter to a base type.
- MediaType(MediaType, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
- MediaTypeExample - Class in org.apache.tika.example
- MediaTypeExample() - Constructor for class org.apache.tika.example.MediaTypeExample
- MediaTypeRegistry - Class in org.apache.tika.mime
-
Registry of known Internet media types.
- MediaTypeRegistry() - Constructor for class org.apache.tika.mime.MediaTypeRegistry
- MEDIUM - Enum constant in enum class org.apache.tika.language.detect.LanguageConfidence
- memcmp(int[], int[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
- MEMGRAPH - Static variable in class org.apache.tika.detect.apple.BPListDetector
- memoryLimitInKb - Variable in class org.apache.tika.parser.microsoft.rtf.RTFParser.Config
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.FetchAndParseReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.GetFetcherReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.GetFetcherRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.ListFetchersReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.ListFetchersRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.SaveFetcherReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- mergeFrom(CodedInputStream, ExtensionRegistryLite) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.FetchAndParseReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.GetFetcherReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.GetFetcherRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.ListFetchersReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.ListFetchersRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.SaveFetcherReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- mergeFrom(Message) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- mergeFrom(Message) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- mergeFrom(DeleteFetcherReply) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- mergeFrom(DeleteFetcherRequest) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- mergeFrom(DeletePipesIteratorReply) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- mergeFrom(DeletePipesIteratorRequest) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- mergeFrom(FetchAndParseReply) - Method in class org.apache.tika.FetchAndParseReply.Builder
- mergeFrom(FetchAndParseRequest) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- mergeFrom(GetFetcherConfigJsonSchemaReply) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- mergeFrom(GetFetcherConfigJsonSchemaRequest) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- mergeFrom(GetFetcherReply) - Method in class org.apache.tika.GetFetcherReply.Builder
- mergeFrom(GetFetcherRequest) - Method in class org.apache.tika.GetFetcherRequest.Builder
- mergeFrom(GetPipesIteratorReply) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- mergeFrom(GetPipesIteratorRequest) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- mergeFrom(ListFetchersReply) - Method in class org.apache.tika.ListFetchersReply.Builder
- mergeFrom(ListFetchersRequest) - Method in class org.apache.tika.ListFetchersRequest.Builder
- mergeFrom(SaveFetcherReply) - Method in class org.apache.tika.SaveFetcherReply.Builder
- mergeFrom(SaveFetcherRequest) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- mergeFrom(SavePipesIteratorReply) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- mergeFrom(SavePipesIteratorRequest) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- mergeInto(Metadata, List<Chunk>) - Static method in class org.apache.tika.inference.ChunkSerializer
-
Reads any existing chunks from the metadata field, appends the new chunks, and writes the merged list back.
- mergeMetadata(Metadata, Metadata, AbstractMultipleParser.MetadataPolicy) - Static method in class org.apache.tika.parser.multiple.AbstractMultipleParser
- mergeOrCreate(Path, ConfigOverrides) - Static method in class org.apache.tika.pipes.core.config.ConfigMerger
-
Merges overrides with an existing config, or creates a new config if none exists.
- mergeParseContextFromConfig(String, ParseContext) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Parses config JSON and merges parseContext entries into the provided ParseContext.
- MergeResult(Path, String, String) - Constructor for record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Creates an instance of a MergeResult record class.
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.FetchAndParseReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.ListFetchersReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.ListFetchersRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SaveFetcherReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- mergeUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- mergeWithDefaults(ObjectMapper, JsonNode, Class<T>, T) - Static method in class org.apache.tika.config.loader.JsonMergeUtils
-
Deserializes a JsonNode and merges it with a default configuration object.
- mergeWithDefaults(ObjectMapper, String, Class<T>, T) - Static method in class org.apache.tika.config.loader.JsonMergeUtils
-
Deserializes JSON and merges it with a default configuration object.
- message() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Returns the value of the message record component.
- Message - Interface in org.apache.tika.metadata
-
A collection of Message related property names.
- MESSAGE - Static variable in class org.apache.tika.pipes.core.serialization.PipesResultSerializer
- MESSAGE_BCC - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_BCC_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_BCC_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the bcc field.
- MESSAGE_BCC_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "bcc-name" and "bcc-display-name".
- MESSAGE_CC - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_CC_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_CC_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the cc field.
- MESSAGE_CC_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "cc-name" and "cc-display-name".
- MESSAGE_CLASS - Static variable in interface org.apache.tika.metadata.MAPI
-
MAPI message class.
- MESSAGE_CLASS_RAW - Static variable in interface org.apache.tika.metadata.MAPI
-
MAPI message class.
- MESSAGE_FIELD_NUMBER - Static variable in class org.apache.tika.DeletePipesIteratorReply
- MESSAGE_FIELD_NUMBER - Static variable in class org.apache.tika.SavePipesIteratorReply
- MESSAGE_FROM - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_FROM_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the from field.
- MESSAGE_FROM_NAME - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the value from the name field.
- MESSAGE_PREFIX - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_RAW_HEADER_PREFIX - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_RECIPIENT_ADDRESS - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_TO - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_TO_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
- MESSAGE_TO_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the to field.
- MESSAGE_TO_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "to-name" and "to-display-name".
- meta - Variable in class org.apache.tika.xmp.convert.AbstractConverter
- meta_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- meta_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- metadata - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- metadata() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the metadata record component.
- metadata() - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Returns the value of the metadata record component.
- metadata(Metadata) - Method in class org.apache.tika.sax.XMPContentHandler
- Metadata - Class in org.apache.tika.metadata
-
A multi-valued metadata container.
- Metadata() - Constructor for class org.apache.tika.metadata.Metadata
-
Constructs a new, empty metadata.
- Metadata(MetadataWriteLimiter) - Constructor for class org.apache.tika.metadata.Metadata
-
Constructs a new, empty metadata with the specified write limiter.
- METADATA - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- METADATA_COMMAND_ARGUMENTS_SERIALIZED_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
-
Token to be replaced with a String array of metadata assignment command arguments
- METADATA_COMMAND_ARGUMENTS_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
-
Token to be replaced with a String array of metadata assignment command arguments
- METADATA_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- METADATA_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time that any metadata for this resource was last changed.
- METADATA_KEY - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- METADATA_KEYS - Static variable in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- METADATA_LIST - Static variable in class org.apache.tika.pipes.core.serialization.EmitDataSerializer
- METADATA_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the metadata was last modified."
- METADATA_POLICY_CONFIG_KEY - Static variable in class org.apache.tika.parser.multiple.AbstractMultipleParser
- MetadataAwareLuceneIndexer - Class in org.apache.tika.example
-
Builds on the LuceneIndexer from Chapter 5 and adds indexing of Metadata.
- MetadataAwareLuceneIndexer(IndexWriter, Tika) - Constructor for class org.apache.tika.example.MetadataAwareLuceneIndexer
- MetadataCharsetDetector - Class in org.apache.tika.detect
-
Encoding detector that extracts a declared charset from Tika metadata without reading any bytes from the stream.
- MetadataCharsetDetector() - Constructor for class org.apache.tika.detect.MetadataCharsetDetector
- MetadataDeserializer - Class in org.apache.tika.serialization.serdes
- MetadataDeserializer() - Constructor for class org.apache.tika.serialization.serdes.MetadataDeserializer
- MetadataExtractor - Class in org.apache.tika.parser.microsoft.ooxml
-
OOXML metadata extractor base class.
- MetadataExtractor() - Constructor for class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
- MetadataFields - Class in org.apache.tika.parser.image
-
Knows about all declared Metadata fields.
- MetadataFields() - Constructor for class org.apache.tika.parser.image.MetadataFields
- MetadataFilter - Class in org.apache.tika.metadata.filter
- MetadataFilter() - Constructor for class org.apache.tika.metadata.filter.MetadataFilter
- MetadataFilterBase - Class in org.apache.tika.metadata.filter
-
Base class for iterating a call to MetadataFilterBase.filter(Metadata) on a list of metadata objects.
- MetadataFilterBase() - Constructor for class org.apache.tika.metadata.filter.MetadataFilterBase
- MetadataHandler - Class in org.apache.tika.parser.xml
-
Deprecated. Use the AttributeMetadataHandler and ElementMetadataHandler classes instead.
- MetadataHandler(Metadata, String) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- MetadataHandler(Metadata, Property) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- metadataKey() - Method in class org.apache.tika.digest.DigestDef
-
Returns the metadata key for storing this digest value.
- metadataList - Variable in class org.apache.tika.sax.RecursiveParserWrapperHandler
- metadataList() - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Returns the value of the metadataList record component.
- MetadataList - Class in org.apache.tika.server.core
-
Wrapper class to make isWriteable in MetadataListMBW simpler.
- MetadataList(List<Metadata>) - Constructor for class org.apache.tika.server.core.MetadataList
- MetadataListMessageBodyWriter - Class in org.apache.tika.server.core.writer
- MetadataListMessageBodyWriter() - Constructor for class org.apache.tika.server.core.writer.MetadataListMessageBodyWriter
- metadataMaxAgeMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the metadataMaxAgeMs record component.
- MetaDataObjectsAboveGraphSpace - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- MetadataResource - Class in org.apache.tika.server.core.resource
- MetadataResource() - Constructor for class org.apache.tika.server.core.resource.MetadataResource
- MetadataSerializer - Class in org.apache.tika.serialization.serdes
- MetadataSerializer() - Constructor for class org.apache.tika.serialization.serdes.MetadataSerializer
- MetadataSerializer(boolean) - Constructor for class org.apache.tika.serialization.serdes.MetadataSerializer
- metadataToJsonContainerInsert(Metadata, OpenSearchEmitterConfig.AttachmentStrategy) - Static method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- metadataToJsonEmbeddedInsert(Metadata, OpenSearchEmitterConfig.AttachmentStrategy, String, String) - Static method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- MetadataWriteLimiter - Interface in org.apache.tika.metadata.writefilter
- MetadataWriteLimiterFactory - Interface in org.apache.tika.metadata.writefilter
-
Factory interface for creating MetadataWriteLimiter instances.
- MetaEncodingDetector - Interface in org.apache.tika.detect
-
Marker interface for encoding detectors that arbitrate among candidates collected by base detectors rather than detecting encoding directly from the stream.
- methodName - Variable in class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
- MicrosoftGraphFetcher - Class in org.apache.tika.pipes.fetchers.microsoftgraph
-
Fetches files from Microsoft Graph API.
- MicrosoftGraphFetcherConfig - Class in org.apache.tika.pipes.fetchers.microsoftgraph.config
- MicrosoftGraphFetcherConfig() - Constructor for class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- MicrosoftGraphFetcherFactory - Class in org.apache.tika.pipes.fetchers.microsoftgraph
-
Factory for creating Microsoft Graph fetchers.
- MicrosoftGraphFetcherFactory() - Constructor for class org.apache.tika.pipes.fetchers.microsoftgraph.MicrosoftGraphFetcherFactory
- MicrosoftGraphPipesPlugin - Class in org.apache.tika.pipes.plugin.microsoftgraph
- MicrosoftGraphPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.microsoftgraph.MicrosoftGraphPipesPlugin
- microsoftTranslateToFrench(String) - Method in class org.apache.tika.example.TranslatorExample
- MicrosoftTranslator - Class in org.apache.tika.language.translate.impl
-
Wrapper class to access the Windows translation service.
- MicrosoftTranslator() - Constructor for class org.apache.tika.language.translate.impl.MicrosoftTranslator
-
Create a new MicrosoftTranslator with the client keys specified in resources/org/apache/tika/language/translate/translator.microsoft.properties.
- MIDDAY - Static variable in class org.apache.tika.utils.DateUtils
-
Custom time zone used to interpret date values without a time component in a way that most likely falls within the same day regardless of in which time zone it is later interpreted.
- MidiParser - Class in org.apache.tika.parser.audio
- MidiParser() - Constructor for class org.apache.tika.parser.audio.MidiParser
- MIFContentHandler - Class in org.apache.tika.parser.mif
-
Content handler for MIF Content and Metadata.
- MIFExtractor - Class in org.apache.tika.parser.mif
-
Helper Class to Parse and Extract Adobe MIF Files.
- MIFExtractor() - Constructor for class org.apache.tika.parser.mif.MIFExtractor
- MIFParser - Class in org.apache.tika.parser.mif
- MIFParser() - Constructor for class org.apache.tika.parser.mif.MIFParser
- MIFParser(EncodingDetector) - Constructor for class org.apache.tika.parser.mif.MIFParser
- MIME - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- MIME_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- MIME_INFO_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MIME_STRING - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- MIME_TABLE - Static variable in class org.apache.tika.eval.app.ProfilerBase
- MIME_TYPE - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- MIME_TYPE_MAGIC - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
- MIME_TYPE_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MIME_TYPE_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- MimeBuffer - Class in org.apache.tika.eval.app.db
- MimeBuffer(Connection, TableInfo, MimeTypes) - Constructor for class org.apache.tika.eval.app.db.MimeBuffer
- MimeFilteringDecorator(Parser, Set<MediaType>, Set<MediaType>) - Constructor for class org.apache.tika.parser.ParserDecorator.MimeFilteringDecorator
- mimes - Variable in class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter.Config
- MimeType - Class in org.apache.tika.mime
-
Internet media type.
- MimeTypeException - Exception in org.apache.tika.mime
-
A class to encapsulate MimeType related exceptions.
- MimeTypeException(String) - Constructor for exception org.apache.tika.mime.MimeTypeException
-
Constructs a MimeTypeException with the specified detail message.
- MimeTypeException(String, Throwable) - Constructor for exception org.apache.tika.mime.MimeTypeException
-
Constructs a MimeTypeException with the specified detail message and root cause.
- MimeTypes - Class in org.apache.tika.mime
-
This class is a MimeType repository.
- MimeTypes() - Constructor for class org.apache.tika.mime.MimeTypes
- MimeTypesFactory - Class in org.apache.tika.mime
-
Creates instances of MimeTypes.
- MimeTypesFactory() - Constructor for class org.apache.tika.mime.MimeTypesFactory
- MimeTypesReader - Class in org.apache.tika.mime
-
A reader for XML files compliant with the freedesktop MIME-info DTD.
- MimeTypesReader(MimeTypes) - Constructor for class org.apache.tika.mime.MimeTypesReader
- MimeTypesReaderMetKeys - Interface in org.apache.tika.mime
-
Met Keys used by the MimeTypesReader.
- min(UByte, UByte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two UByte values.
- min(UInteger, UInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two UInteger values.
- min(ULong, ULong) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two ULong values.
- min(UShort, UShort) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UMath
-
Returns the smaller of two UShort values.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the minimum value an unsigned byte can have as UByte, 0.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the minimum value an unsigned int can have as UInteger, 0.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the minimum value an unsigned long can have as ULong, 0.
- MIN - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the minimum value an unsigned short can have as UShort, 0.
- MIN_COLUMN_ASYMMETRY_PROBE - Static variable in class org.apache.tika.ml.chardetect.StructuralEncodingRules
-
Minimum probe length before StructuralEncodingRules.has2ByteColumnAsymmetry(byte[]) produces meaningful diversity counts.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
A constant holding the minimum value an unsigned byte can have, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
A constant holding the minimum value an unsigned int can have, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
A constant holding the minimum value an unsigned long can have, 0.
- MIN_VALUE - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
A constant holding the minimum value an unsigned short can have, 0.
- MINOR_MODEL_AGE_DISCLOSURE - Static variable in interface org.apache.tika.metadata.IPTC
-
Age of the youngest model pictured in the image, at the time that the image was made.
- MINOR_VERSION - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Minor version.
- MISCELLANEOUS - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- MiscOLEDetector - Class in org.apache.tika.detect.ole
-
A detector that works on a POIFS OLE2 document to figure out exactly what the file is.
- MiscOLEDetector() - Constructor for class org.apache.tika.detect.ole.MiscOLEDetector
- MITIENERecogniser - Class in org.apache.tika.parser.ner.mitie
-
This class offers an implementation of NERecogniser based on trained models using state-of-the-art information extraction tools.
- MITIENERecogniser() - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
- MITIENERecogniser(String) - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Creates a NERecogniser by loading a model from the given path.
- mixedLanguages - Variable in class org.apache.tika.language.detect.LanguageDetector
- MM_SLASH_DD_SLASH_YY_HH_MM - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- MM_SLASH_DD_SLASH_YY_HH_MM_AM_PM - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- MM_SLASH_DD_SLASH_YYYY - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- MMM_D_YYYY_HH_MM - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- MMM_D_YYYY_HH_MM_AM_PM - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- MMM_DD_YY - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- MODEL_AGE - Static variable in interface org.apache.tika.metadata.IPTC
-
Age of the human model(s) at the time this image was taken in a model released image.
- MODEL_NAME_ENGLISH - Static variable in interface org.apache.tika.metadata.ClimateForcast
- MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
- MODEL_RELEASE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier associated with each Model Release.
- MODEL_RELEASE_STATUS - Static variable in interface org.apache.tika.metadata.IPTC
-
Summarizes the availability and scope of model releases authorizing usage of the likenesses of persons appearing in the photograph.
- MODELS_DIR - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- MODIFIED - Static variable in interface org.apache.tika.metadata.DublinCore
-
Date on which the resource was changed.
- MODIFIED - Static variable in interface org.apache.tika.metadata.FileSystem
- MODIFIED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- MODIFIED - Static variable in interface org.apache.tika.metadata.XMPDC
-
Date on which the resource was changed.
- modifiedService(ServiceReference, Object) - Method in class org.apache.tika.config.TikaActivator
- MODIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- MODIFY_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time the resource was last modified.
- MojibusterEncodingDetector - Class in org.apache.tika.ml.chardetect
-
Naive-Bayes pipeline detector: structural checks for wide Unicode + BOMs before falling through to the bigram NB classifier for everything else.
- MojibusterEncodingDetector() - Constructor for class org.apache.tika.ml.chardetect.MojibusterEncodingDetector
-
Default SPI constructor: load the NB bigram model from the classpath at MojibusterEncodingDetector.DEFAULT_MODEL_RESOURCE.
- MojibusterEncodingDetector(Path) - Constructor for class org.apache.tika.ml.chardetect.MojibusterEncodingDetector
- MONEY - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- MONEY_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- moreToTest() - Method in class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
-
Deprecated.
- MosesTranslator - Class in org.apache.tika.language.translate.impl
-
Translator that uses the Moses decoder for translation.
- MosesTranslator() - Constructor for class org.apache.tika.language.translate.impl.MosesTranslator
-
Default constructor that attempts to read the smt jar and script paths from the translator.moses.properties file.
- MosesTranslator(String, String) - Constructor for class org.apache.tika.language.translate.impl.MosesTranslator
-
Create a Moses Translator with the specified smt jar and script paths.
- MOVE_FROM - Enum constant in enum class org.apache.tika.parser.microsoft.ooxml.EditType
- MOVE_TO - Enum constant in enum class org.apache.tika.parser.microsoft.ooxml.EditType
- moveNext() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Advances the enumerator to the next bit of the byte array.
- moveTo(float, float) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- MP3Frame - Interface in org.apache.tika.parser.mp3
-
A frame in an MP3 file, such as ID3v2 Tags or some audio.
- Mp3Parser - Class in org.apache.tika.parser.mp3
-
The Mp3Parser is used to parse ID3 Version 1 Tag information from an MP3 file, if available.
- Mp3Parser() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser
- Mp3Parser.ID3TagsAndAudio - Class in org.apache.tika.parser.mp3
- MP4Parser - Class in org.apache.tika.parser.mp4
-
Parser for the MP4 media container format, as well as the older QuickTime format that MP4 is based on.
- MP4Parser() - Constructor for class org.apache.tika.parser.mp4.MP4Parser
- MPEG_V1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 1.
- MPEG_V2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 2.
- MPEG_V2_5 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 2.5.
- MPP - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Project
- MS_EQUATION - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Equation embedded in Office docs
- MS_GRAPH_CHART - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Graph/Charts embedded in PowerPoint and Excel
- MS_OUTLOOK_PST_MIMETYPE - Static variable in class org.apache.tika.parser.microsoft.libpst.LibPstParser
- MS_OUTLOOK_PST_MIMETYPE - Static variable in class org.apache.tika.parser.microsoft.pst.OutlookPSTParser
- MSEmbeddedStreamTranslator - Class in org.apache.tika.extractor.microsoft
- MSEmbeddedStreamTranslator() - Constructor for class org.apache.tika.extractor.microsoft.MSEmbeddedStreamTranslator
- MSG - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Outlook
- MSOfficeBinaryConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the binary MS formats Word (.doc), Excel (.xls) and PowerPoint (.ppt).
- MSOfficeBinaryConverter() - Constructor for class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
- MSOfficeXMLConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the Office Open XML formats Word (.docx), Excel (.xlsx) and PowerPoint (.pptx).
- MSOfficeXMLConverter() - Constructor for class org.apache.tika.xmp.convert.MSOfficeXMLConverter
- MSOneStorePackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb
- MSOneStorePackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- MSOneStoreParser - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb
- MSOneStoreParser() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStoreParser
- MSOwnerFileParser - Class in org.apache.tika.parser.microsoft
-
Parser for temporary MSOffice files.
- MSOwnerFileParser() - Constructor for class org.apache.tika.parser.microsoft.MSOwnerFileParser
- MULTIPART_BOUNDARY - Static variable in interface org.apache.tika.metadata.Message
- MULTIPART_SUBTYPE - Static variable in interface org.apache.tika.metadata.Message
- multivaluedFieldDelimiter() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the multivaluedFieldDelimiter record component.
- multivaluedFieldStrategy() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the multivaluedFieldStrategy record component.
- mustNotBeEmpty(String, String) - Static method in class org.apache.tika.config.ConfigValidator
-
Validates that a string parameter is not null or blank.
- mustNotBeEmpty(String, Path) - Static method in class org.apache.tika.config.ConfigValidator
-
Validates that a Path parameter is not null.
- MYANMAR - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- MyFirstTika - Class in org.apache.tika.example
- MyFirstTika() - Constructor for class org.apache.tika.example.MyFirstTika
N
- n - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
- N_PAGES - Static variable in interface org.apache.tika.metadata.PagedText
-
"The number of pages in the document (including any in contained documents)."
- NaiveBayesBigramEncodingDetector - Class in org.apache.tika.ml.chardetect
-
Naive-Bayes byte-bigram charset classifier.
- NaiveBayesBigramEncodingDetector(InputStream) - Constructor for class org.apache.tika.ml.chardetect.NaiveBayesBigramEncodingDetector
- NaiveBayesBigramEncodingDetector(Path) - Constructor for class org.apache.tika.ml.chardetect.NaiveBayesBigramEncodingDetector
- name - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
- name() - Element in annotation interface org.apache.tika.config.TikaComponent
-
The component name used in JSON configuration.
- name() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns the value of the name record component.
- name() - Method in record class org.apache.tika.plugins.ExtensionConfig
-
Returns the value of the name record component.
- name(int) - Static method in class org.apache.tika.langdetect.charsoup.ScriptCategory
-
Human-readable name of a category.
- NAME - Static variable in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
- NAME - Static variable in class org.apache.tika.pipes.emitter.es.ESEmitterFactory
- NAME - Static variable in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterFactory
- NAME - Static variable in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcherFactory
- NAME - Static variable in class org.apache.tika.pipes.fetcher.gcs.GCSFetcherFactory
- NAME - Static variable in class org.apache.tika.pipes.fetcher.s3.S3FetcherFactory
- NAME - Static variable in class org.apache.tika.pipes.fetchers.microsoftgraph.MicrosoftGraphFetcherFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.azblob.AZBlobPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.csv.CSVPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.gcs.GCSPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.jdbc.JDBCPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.kafka.KafkaPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.s3.S3PipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.pipesiterator.json.JsonPipesIteratorFactory
- NAME - Static variable in class org.apache.tika.pipes.reporter.es.ESReporterFactory
- NAME - Static variable in class org.apache.tika.pipes.reporter.fs.FileSystemReporterFactory
- NAME - Static variable in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterFactory
- NAME - Static variable in class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterFactory
- NamedAttributeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a ...
- NamedAttributeMatcher(String, String) - Constructor for class org.apache.tika.sax.xpath.NamedAttributeMatcher
- NamedElementMatcher - Class in org.apache.tika.sax.xpath
-
Intermediate evaluation state of a ...
- NamedElementMatcher(String, String, Matcher) - Constructor for class org.apache.tika.sax.xpath.NamedElementMatcher
- NamedEntityParser - Class in org.apache.tika.parser.ner
-
This implementation of Parser extracts entity names from text content and adds them to the metadata.
- NamedEntityParser() - Constructor for class org.apache.tika.parser.ner.NamedEntityParser
- NameDetector - Class in org.apache.tika.detect
-
Content type detection based on the resource name.
- NameDetector(Map<Pattern, MediaType>) - Constructor for class org.apache.tika.detect.NameDetector
-
Creates a new content type detector based on the given name patterns.
- NameEntityExtractor - Class in org.apache.tika.parser.geo.topic
- NameEntityExtractor(NameFinderME) - Constructor for class org.apache.tika.parser.geo.topic.NameEntityExtractor
- names() - Method in class org.apache.tika.metadata.Metadata
-
Returns an array of the names contained in the metadata.
- names() - Method in class org.apache.tika.xmp.XMPMetadata
-
For XMP it is not clear what that API should return, therefore not implemented.
- Namespace - Class in org.apache.tika.xmp.convert
-
Utility class to hold namespace information.
- Namespace(String, String) - Constructor for class org.apache.tika.xmp.convert.Namespace
- NAMESPACE - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaIllustrator
- NAMESPACE - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFUA
- NAMESPACE - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFVT
- NAMESPACE - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFX
- NAMESPACE - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFXId
- NAMESPACE_PREFIX_DELIMITER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
The common delimiter used between the namespace abbreviation and the property name
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.XMP
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.XMPIdq
- NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.XMPMM
- NAMESPACE_URI - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaIllustrator
- NAMESPACE_URI - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFUA
- NAMESPACE_URI - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFVT
- NAMESPACE_URI - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFX
- NAMESPACE_URI - Static variable in class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFXId
- NAMESPACE_URI_DC - Static variable in interface org.apache.tika.metadata.DublinCore
- NAMESPACE_URI_DC_TERMS - Static variable in interface org.apache.tika.metadata.DublinCore
- NAMESPACE_URI_DOC_META - Static variable in interface org.apache.tika.metadata.Office
- NAMESPACE_URI_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
- NAMESPACE_URI_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
- NAMESPACE_URI_PHOTOSHOP - Static variable in interface org.apache.tika.metadata.Photoshop
- NAMESPACE_URI_PLUS - Static variable in interface org.apache.tika.metadata.IPTC
- NAMESPACE_URI_XMP_RIGHTS - Static variable in interface org.apache.tika.metadata.XMPRights
- namespaces - Variable in class org.apache.tika.sax.ToXMLContentHandler
- NATIVE_FLAC - Static variable in class org.apache.tika.parser.ogg.FlacParser
- needsRestart() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- needsRestart() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Checks if the server has been marked for restart.
- needsRestart() - Method in class org.apache.tika.pipes.core.SharedServerManager
-
Checks if the server has been marked for restart.
- NER_3CLASS_MODEL - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- NER_4CLASS_MODEL - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- NER_7CLASS_MODEL - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
- NER_DATE_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NER_LOCATION_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NER_MONEY_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NER_ORGANIZATION_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NER_PERCENT_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NER_PERSON_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NER_REGEX_FILE - Static variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- NER_TIME_MODEL - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- NERecogniser - Interface in org.apache.tika.parser.ner
-
Defines a contract for a named entity recogniser.
- NetCDFParser - Class in org.apache.tika.parser.netcdf
- NetCDFParser() - Constructor for class org.apache.tika.parser.netcdf.NetCDFParser
- NetworkParser - Class in org.apache.tika.parser
- NetworkParser(URI) - Constructor for class org.apache.tika.parser.NetworkParser
- NetworkParser(URI, Set<MediaType>) - Constructor for class org.apache.tika.parser.NetworkParser
- NEW_REQUEST - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- newBlockingStub(Channel) - Static method in class org.apache.tika.TikaGrpc
-
Creates a new blocking-style stub that supports unary and streaming output calls on the service
- newBlockingV2Stub(Channel) - Static method in class org.apache.tika.TikaGrpc
-
Creates a new blocking-style stub that supports all types of calls on the service
- newBuilder() - Static method in class org.apache.tika.DeleteFetcherReply
- newBuilder() - Static method in class org.apache.tika.DeleteFetcherRequest
- newBuilder() - Static method in class org.apache.tika.DeletePipesIteratorReply
- newBuilder() - Static method in class org.apache.tika.DeletePipesIteratorRequest
- newBuilder() - Static method in class org.apache.tika.FetchAndParseReply
- newBuilder() - Static method in class org.apache.tika.FetchAndParseRequest
- newBuilder() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- newBuilder() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- newBuilder() - Static method in class org.apache.tika.GetFetcherReply
- newBuilder() - Static method in class org.apache.tika.GetFetcherRequest
- newBuilder() - Static method in class org.apache.tika.GetPipesIteratorReply
- newBuilder() - Static method in class org.apache.tika.GetPipesIteratorRequest
- newBuilder() - Static method in class org.apache.tika.ListFetchersReply
- newBuilder() - Static method in class org.apache.tika.ListFetchersRequest
- newBuilder() - Static method in class org.apache.tika.SaveFetcherReply
- newBuilder() - Static method in class org.apache.tika.SaveFetcherRequest
- newBuilder() - Static method in class org.apache.tika.SavePipesIteratorReply
- newBuilder() - Static method in class org.apache.tika.SavePipesIteratorRequest
- newBuilder(DeleteFetcherReply) - Static method in class org.apache.tika.DeleteFetcherReply
- newBuilder(DeleteFetcherRequest) - Static method in class org.apache.tika.DeleteFetcherRequest
- newBuilder(DeletePipesIteratorReply) - Static method in class org.apache.tika.DeletePipesIteratorReply
- newBuilder(DeletePipesIteratorRequest) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- newBuilder(FetchAndParseReply) - Static method in class org.apache.tika.FetchAndParseReply
- newBuilder(FetchAndParseRequest) - Static method in class org.apache.tika.FetchAndParseRequest
- newBuilder(GetFetcherConfigJsonSchemaReply) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- newBuilder(GetFetcherConfigJsonSchemaRequest) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- newBuilder(GetFetcherReply) - Static method in class org.apache.tika.GetFetcherReply
- newBuilder(GetFetcherRequest) - Static method in class org.apache.tika.GetFetcherRequest
- newBuilder(GetPipesIteratorReply) - Static method in class org.apache.tika.GetPipesIteratorReply
- newBuilder(GetPipesIteratorRequest) - Static method in class org.apache.tika.GetPipesIteratorRequest
- newBuilder(ListFetchersReply) - Static method in class org.apache.tika.ListFetchersReply
- newBuilder(ListFetchersRequest) - Static method in class org.apache.tika.ListFetchersRequest
- newBuilder(SaveFetcherReply) - Static method in class org.apache.tika.SaveFetcherReply
- newBuilder(SaveFetcherRequest) - Static method in class org.apache.tika.SaveFetcherRequest
- newBuilder(SavePipesIteratorReply) - Static method in class org.apache.tika.SavePipesIteratorReply
- newBuilder(SavePipesIteratorRequest) - Static method in class org.apache.tika.SavePipesIteratorRequest
- newBuilderForType() - Method in class org.apache.tika.DeleteFetcherReply
- newBuilderForType() - Method in class org.apache.tika.DeleteFetcherRequest
- newBuilderForType() - Method in class org.apache.tika.DeletePipesIteratorReply
- newBuilderForType() - Method in class org.apache.tika.DeletePipesIteratorRequest
- newBuilderForType() - Method in class org.apache.tika.FetchAndParseReply
- newBuilderForType() - Method in class org.apache.tika.FetchAndParseRequest
- newBuilderForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- newBuilderForType() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- newBuilderForType() - Method in class org.apache.tika.GetFetcherReply
- newBuilderForType() - Method in class org.apache.tika.GetFetcherRequest
- newBuilderForType() - Method in class org.apache.tika.GetPipesIteratorReply
- newBuilderForType() - Method in class org.apache.tika.GetPipesIteratorRequest
- newBuilderForType() - Method in class org.apache.tika.ListFetchersReply
- newBuilderForType() - Method in class org.apache.tika.ListFetchersRequest
- newBuilderForType() - Method in class org.apache.tika.SaveFetcherReply
- newBuilderForType() - Method in class org.apache.tika.SaveFetcherRequest
- newBuilderForType() - Method in class org.apache.tika.SavePipesIteratorReply
- newBuilderForType() - Method in class org.apache.tika.SavePipesIteratorRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.DeleteFetcherReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.DeleteFetcherRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.DeletePipesIteratorReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.DeletePipesIteratorRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.FetchAndParseReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.FetchAndParseRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.GetFetcherReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.GetFetcherRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.GetPipesIteratorReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.GetPipesIteratorRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.ListFetchersReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.ListFetchersRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.SaveFetcherReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.SaveFetcherRequest
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.SavePipesIteratorReply
- newBuilderForType(GeneratedMessageV3.BuilderParent) - Method in class org.apache.tika.SavePipesIteratorRequest
- newDecoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
- newDecoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
- newEncoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
- newEncoder() - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
- newEngine(PDPage, int, EmbeddedDocumentExtractor, PDFParserConfig, Map<COSStream, Integer>, AtomicInteger, XHTMLContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngineFactory
- newFutureStub(Channel) - Static method in class org.apache.tika.TikaGrpc
-
Creates a new ListenableFuture-style stub that supports unary calls on the service
- newInstance() - Method in interface org.apache.tika.metadata.writefilter.MetadataWriteLimiterFactory
-
Creates a new limiter instance.
- newInstance() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- newInstance(int) - Static method in class org.apache.tika.eval.core.tokens.AnalyzerManager
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.DeleteFetcherReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.DeleteFetcherRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.DeletePipesIteratorReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.DeletePipesIteratorRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.FetchAndParseReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.FetchAndParseRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.GetFetcherReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.GetFetcherRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.GetPipesIteratorReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.GetPipesIteratorRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.ListFetchersReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.ListFetchersRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.SaveFetcherReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.SaveFetcherRequest
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.SavePipesIteratorReply
- newInstance(GeneratedMessageV3.UnusedPrivateParameter) - Method in class org.apache.tika.SavePipesIteratorRequest
- newInstance(Class, ServiceLoader) - Static method in class org.apache.tika.utils.ServiceLoaderUtils
-
Loads a class and instantiates it.
- newInstance(Metadata, ParseContext) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractorFactory
- newInstance(Metadata, ParseContext) - Method in class org.apache.tika.extractor.StandardExtractorFactory
- newInstance(Metadata, ParseContext) - Method in class org.apache.tika.pipes.core.extractor.UnpackExtractorFactory
- newInstance(ParseContext) - Static method in class org.apache.tika.metadata.Metadata
-
Creates a new Metadata instance configured from the ParseContext.
- newInstance(ParseContext) - Static method in class org.apache.tika.parser.ParseRecord
-
Creates a new ParseRecord configured from EmbeddedLimits in the ParseContext.
- newInstance(BasicContentHandlerFactory.HANDLER_TYPE, ParseContext) - Static method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Creates a new BasicContentHandlerFactory configured from OutputLimits in the ParseContext.
- newInstance(ContentHandler, TikaInputStream, ParseContext) - Static method in class org.apache.tika.sax.SecureContentHandler
-
Creates a new SecureContentHandler configured from OutputLimits in the ParseContext.
- newInstance(ContentHandler, ParseContext) - Static method in class org.apache.tika.sax.WriteOutContentHandler
-
Creates a new WriteOutContentHandler configured from OutputLimits in the ParseContext.
- newline() - Method in class org.apache.tika.sax.XHTMLContentHandler
- newMetadata() - Method in class org.apache.tika.parser.ParseContext
-
Creates a new Metadata object with any configured limits applied.
- newRequest(byte[]) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- newStub(Channel) - Static method in class org.apache.tika.TikaGrpc
-
Creates a new async stub that supports all call types for the service
- next() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
- nextRow(ContentHandler, ParseContext) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
- NextStyle - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NICKNAME - Static variable in interface org.apache.tika.metadata.XMP
-
A word or short phrase that represents the nickname of the file
- nil() - Static method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- nil() - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
- NLTKNERecogniser - Class in org.apache.tika.parser.ner.nltk
-
This class offers an implementation of NERecogniser based on ne_chunk() module of NLTK.
- NLTKNERecogniser() - Constructor for class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
- NNExampleModelDetector - Class in org.apache.tika.detect
- NNExampleModelDetector() - Constructor for class org.apache.tika.detect.NNExampleModelDetector
- NNExampleModelDetector(File) - Constructor for class org.apache.tika.detect.NNExampleModelDetector
- NNExampleModelDetector(Path) - Constructor for class org.apache.tika.detect.NNExampleModelDetector
- NNTrainedModel - Class in org.apache.tika.detect
- NNTrainedModel(int, int, int, float[]) - Constructor for class org.apache.tika.detect.NNTrainedModel
- NNTrainedModelBuilder - Class in org.apache.tika.detect
- NNTrainedModelBuilder() - Constructor for class org.apache.tika.detect.NNTrainedModelBuilder
- NO_EMIT - Static variable in class org.apache.tika.pipes.api.emitter.EmitKey
- NO_EXTRACT_FILE - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- NO_OCR - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.Strategy
- NO_OP - Static variable in class org.apache.tika.pipes.core.reporter.NoOpReporter
- NO_PARSE - Enum constant in enum class org.apache.tika.pipes.api.ParseMode
-
Performs digest (if configured) and content type detection only.
- NO_TEXT - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.RenderingStrategy
- NoData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
This class represents a property that contains no data.
- NoData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains no data.
- NoData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
- NodeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a ...
- NodeMatcher() - Constructor for class org.apache.tika.sax.xpath.NodeMatcher
- NodeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- NodeObject(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
-
Initializes a new instance of the NodeObject class.
- None - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
None data element type
- NONE - Enum constant in enum class org.apache.tika.language.detect.LanguageConfidence
- NONE - Enum constant in enum class org.apache.tika.parser.microsoft.ooxml.EditType
- NONE - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.IMAGE_STRATEGY
- NONE - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.SUFFIX_STRATEGY
- NOOP_FILTER - Static variable in class org.apache.tika.metadata.filter.NoOpFilter
- NoOpFilter - Class in org.apache.tika.metadata.filter
-
This filter performs no operations on the metadata and leaves it untouched.
- NoOpFilter() - Constructor for class org.apache.tika.metadata.filter.NoOpFilter
- NoOpReporter - Class in org.apache.tika.pipes.core.reporter
- NoOpReporter() - Constructor for class org.apache.tika.pipes.core.reporter.NoOpReporter
- normalize(String) - Static method in class org.apache.tika.eval.core.util.EvalExceptionUtils
- normalize(String) - Static method in class org.apache.tika.io.FilenameUtils
-
Scans the given file name for reserved characters on different OSs and file systems and returns a sanitized version of the name with the reserved chars replaced by their hexadecimal value.
- normalize(String) - Static method in class org.apache.tika.parser.mailcommons.MailDateParser
- normalize(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
- normalizedPrefix() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
- normalizeMediaType(String) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Normalizes internal OCR routing media types (e.g., image/ocr-png) back to standard media types (e.g., image/png).
- normalizeName(String) - Static method in class org.apache.tika.language.detect.LanguageNames
- NOT_COMPLETED - Enum constant in enum class org.apache.tika.pipes.api.pipesiterator.TotalCountResult.STATUS
- NOT_STARTED - Enum constant in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.IntelState
- NOT_STARTED_DECODING - Enum constant in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.LzxState
- NOT_UTF8 - Enum constant in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
-
Sample contains at least one invalid UTF-8 sequence.
- NotebookElementOrderingID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NotebookManagementEntityGuid - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NOTES - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- NoteTagCompleted - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagCreated - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagDefinitionOid - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagHighlightColor - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagLabel - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagPropertyStatus - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagShape - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagStates - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoteTagTextColor - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NoTextPDFRenderer - Class in org.apache.tika.renderer.pdf.pdfbox
-
This class extends the PDFRenderer to exclude rendering of electronic text.
- NoTextPDFRenderer(PDDocument) - Constructor for class org.apache.tika.renderer.pdf.pdfbox.NoTextPDFRenderer
- NotImplementedException(String) - Constructor for exception org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset.NotImplementedException
- NS_URI_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- NSNormalizerContentHandler - Class in org.apache.tika.parser.odf
-
Content handler decorator that maps old OpenOffice 1.0 namespaces to the OpenDocument ones and returns a fake DTD when the parser requests the OpenOffice DTD.
- NSNormalizerContentHandler(ContentHandler) - Constructor for class org.apache.tika.parser.odf.NSNormalizerContentHandler
- NULL - Static variable in class org.apache.tika.language.detect.LanguageResult
- NUM_3D_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.PDF
-
Number of 3D annotations a PDF contains.
- NUM_ALPHA_TOKENS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- NUM_ALPHABETIC_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_ATTACHMENTS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_COLUMNS - Static variable in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
-
Number of columns (even-offset vs odd-offset).
- NUM_COLUMNS - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
-
If the file is detected as a csv/tsv, this is the number of columns in the first row.
- NUM_COMMON_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_COMMON_TOKENS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- NUM_FEATURES - Static variable in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
-
Total feature-vector dimension: ranges * columns.
- NUM_FETCHERS_PER_PAGE_FIELD_NUMBER - Static variable in class org.apache.tika.ListFetchersRequest
- NUM_HIDDEN_SLIDES - Static variable in interface org.apache.tika.metadata.Office
- NUM_IMAGES - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is the number of images (as in a multi-frame gif) returned by Java's ImageReader.getNumImages(boolean).
- NUM_METADATA_VALUES - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_OCR_PAGES - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_PAGES - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_RANGES - Static variable in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
-
Number of byte-value ranges tracked.
- NUM_ROWS - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
-
If the file is detected as a csv/tsv, this is the number of rows if the file is successfully read (e.g. no encapsulation exceptions, etc).
- NUM_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_TOKENS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- NUM_UNIQUE_ALPHA_TOKENS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- NUM_UNIQUE_ALPHABETIC_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_UNIQUE_COMMON_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_UNIQUE_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- NUM_UNIQUE_TOKENS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- NUM_UNLISTED_SLIDES - Static variable in interface org.apache.tika.metadata.Office
- number - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
- NUMBER_OF_BEATS - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The number of beats."
- NUMBER_TYPE_BULLET - Static variable in class org.apache.tika.parser.microsoft.rtf.ListDescriptor
- NumberCell - Class in org.apache.tika.parser.microsoft
-
Number cell.
- NumberCell(double, NumberFormat) - Constructor for class org.apache.tika.parser.microsoft.NumberCell
- NumberListFormat - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- NUMBERS - Enum constant in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- NUMBERS13 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- NUMBERS18 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
- numberType - Variable in class org.apache.tika.parser.microsoft.rtf.ListDescriptor
O
- OBJECT_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Objects in the document.
- ObjectChangeFrequency - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
-
Gets or sets a compact unsigned 64-bit integer that specifies the expected change frequency of the object.
- objectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
- objectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
- ObjectDataBLOB - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Data BLOB
- ObjectDataBLOBDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Object Data BLOB Data Element
- objectDataBLOBExGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
- objectDataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
Gets or sets a compact unsigned 64-bit integer that specifies the size in bytes of the opaque binary data for the declared object.
- objectDataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
Gets or sets a compact unsigned 64-bit integer that specifies the size in bytes of the binary data, opaque to this protocol, for the declared object.
- objectDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
- objectDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.JCIDObject
- objectDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySetObject
- objectDeclarationList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
- objectExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
- objectExGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
- objectExGUIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
- objectExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
- objectExtendedGUIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
- objectGroupData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
- ObjectGroupData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The ObjectGroupData class.
- ObjectGroupData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Object Group Data
- ObjectGroupData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Data
- ObjectGroupData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
-
Initializes a new instance of the ObjectGroupData class.
- ObjectGroupDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- ObjectGroupDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Object Group Data Element
- ObjectGroupDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
-
Initializes a new instance of the ObjectGroupDataElementData class.
- ObjectGroupDataElementData.Builder - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The internal class for building a list of DataElement objects from a node object.
- objectGroupDeclarations - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
- ObjectGroupDeclarations - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Object Group Declarations
- ObjectGroupDeclarations - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Object Group Declarations
- ObjectGroupDeclarations - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Declarations
- ObjectGroupDeclarations() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
-
Initializes a new instance of the ObjectGroupDeclarations class.
- objectGroupExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
- objectGroupID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- objectGroupID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
- ObjectGroupMetadata - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies an object group metadata
- ObjectGroupMetadata - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Metadata
- ObjectGroupMetadata() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
-
Initializes a new instance of the ObjectGroupMetadata class.
- ObjectGroupMetadataDeclarations - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Object Metadata Declaration
- ObjectGroupMetadataDeclarations - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Object Group Metadata Declarations, newly added in MOSS2013.
- ObjectGroupMetadataDeclarations - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Metadata Declarations
- ObjectGroupMetadataDeclarations() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
-
Initializes a new instance of the ObjectGroupMetadataDeclarations class.
- objectGroupMetadataList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
- ObjectGroupObjectBLOBDataDeclaration - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
object data BLOB declaration
- ObjectGroupObjectBLOBDataDeclaration - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Object BLOB Data Declaration
- ObjectGroupObjectBLOBDataDeclaration() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
Initializes a new instance of the ObjectGroupObjectBLOBDataDeclaration class.
- objectGroupObjectBLOBDataDeclarationList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
- ObjectGroupObjectData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- ObjectGroupObjectData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Object Data
- ObjectGroupObjectData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
Initializes a new instance of the ObjectGroupObjectData class.
- ObjectGroupObjectDataBLOBReference - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
object data BLOB reference
- ObjectGroupObjectDataBLOBReference - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Object Data BLOB Reference
- ObjectGroupObjectDataBLOBReference() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
-
Initializes a new instance of the ObjectGroupObjectDataBLOBReference class.
- objectGroupObjectDataBLOBReferenceList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
- objectGroupObjectDataList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
- ObjectGroupObjectDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- ObjectGroupObjectDeclare - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Object Group Object Declare
- ObjectGroupObjectDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
Initializes a new instance of the ObjectGroupObjectDeclare class.
- objectID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- ObjectID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains one CompactID in the ObjectSpaceObjectPropSet.OIDs.body stream field.
- objectMetadataDeclaration - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
- objectPartitionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
- objectPartitionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
- objectReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
- objectReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
- objects - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
- ObjectSpaceID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains one CompactID structure in the ObjectSpaceObjectPropSet.OSIDs.body stream field.
- objectSpaceObjectPropSet - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySetObject
- ObjectSpaceObjectPropSet - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
-
This class is used to represent an ObjectSpaceObjectPropSet.
- ObjectSpaceObjectPropSet - Class in org.apache.tika.parser.microsoft.onenote
- ObjectSpaceObjectPropSet() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
- ObjectSpaceObjectPropSet() - Constructor for class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- ObjectSpaceObjectStreamHeader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
- ObjectSpaceObjectStreamHeader() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
- ObjectSpaceObjectStreamOfContextIDs - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
-
This class is used to represent an ObjectSpaceObjectStreamOfContextIDs.
- ObjectSpaceObjectStreamOfContextIDs() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
- ObjectSpaceObjectStreamOfOIDs - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
-
This class is used to represent an ObjectSpaceObjectStreamOfOIDs.
- ObjectSpaceObjectStreamOfOIDs() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
- ObjectSpaceObjectStreamOfOSIDs - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
-
This class is used to represent an ObjectSpaceObjectStreamOfOSIDs.
- ObjectSpaceObjectStreamOfOSIDs() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
- OCR_AND_TEXT_EXTRACTION - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.Strategy
- OCR_MEDIATYPE_PREFIX - Static variable in class org.apache.tika.parser.image.AbstractImageParser
- OCR_ONLY - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.Strategy
- OCR_PAGE_COUNT - Static variable in interface org.apache.tika.metadata.PDF
-
This counts the number of pages that would have been OCR'd or were OCR'd depending on the OCR settings.
- OcrConfig - Class in org.apache.tika.parser.pdf
-
Configuration for OCR processing in PDF parsing.
- OcrConfig() - Constructor for class org.apache.tika.parser.pdf.OcrConfig
- OcrConfig.ImageFormat - Enum Class in org.apache.tika.parser.pdf
- OcrConfig.ImageType - Enum Class in org.apache.tika.parser.pdf
- OcrConfig.RenderingStrategy - Enum Class in org.apache.tika.parser.pdf
- OcrConfig.Strategy - Enum Class in org.apache.tika.parser.pdf
- OcrConfig.StrategyAuto - Class in org.apache.tika.parser.pdf
-
Configuration for AUTO strategy behavior.
- OCRPageCounter - Class in org.apache.tika.parser.pdf
-
This counts the number of pages on which OCR would have been run or was run, depending on the settings.
- OCRPageCounter() - Constructor for class org.apache.tika.parser.pdf.OCRPageCounter
- OCTET_STREAM - Static variable in class org.apache.tika.mime.MediaType
- OCTET_STREAM - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the root type, application/octet-stream.
- OCX_NAME - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
- OCX_NAME - Static variable in interface org.apache.tika.metadata.Office
- ODF_VERSION_KEY - Static variable in class org.apache.tika.parser.odf.OpenDocumentMetaParser
- of(int) - Static method in class org.apache.tika.langdetect.charsoup.ScriptCategory
-
Map a codepoint to its coarse script category.
- of(Long) - Static method in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- offer(List<FetchEmitTuple>, long) - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- offer(FetchEmitTuple, long) - Method in class org.apache.tika.pipes.core.async.AsyncProcessor
- OfferLargerThanQueueSize - Exception in org.apache.tika.pipes.core.async
- OfferLargerThanQueueSize(int, int) - Constructor for exception org.apache.tika.pipes.core.async.OfferLargerThanQueueSize
- Office - Interface in org.apache.tika.metadata
-
Office Document properties collection.
- OfficeOpenXMLCore - Interface in org.apache.tika.metadata
-
Core properties as defined in the Office Open XML specification part Two that are not in the DublinCore namespace.
- OfficeOpenXMLExtended - Interface in org.apache.tika.metadata
-
Extended properties as defined in the Office Open XML specification part Four.
- OfficeParser - Class in org.apache.tika.parser.microsoft
-
Defines a Microsoft document content extractor.
- OfficeParser() - Constructor for class org.apache.tika.parser.microsoft.OfficeParser
- OfficeParser(JsonConfig) - Constructor for class org.apache.tika.parser.microsoft.OfficeParser
- OfficeParser(OfficeParserConfig) - Constructor for class org.apache.tika.parser.microsoft.OfficeParser
- OfficeParser.POIFSDocumentType - Enum Class in org.apache.tika.parser.microsoft
- officeParserConfig - Variable in class org.apache.tika.parser.microsoft.OutlookExtractor
- OfficeParserConfig - Class in org.apache.tika.parser.microsoft
- OfficeParserConfig() - Constructor for class org.apache.tika.parser.microsoft.OfficeParserConfig
- OfflineContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that always returns an empty stream from the OfflineContentHandler.resolveEntity(String, String) method to prevent potential network or other external resources from being accessed by an XML parser.
- OfflineContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.OfflineContentHandler
- OffsetFromParentHoriz - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- OffsetFromParentVert - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- OGG_AUDIO - Static variable in class org.apache.tika.detect.ogg.OggDetector
- OGG_AUDIO - Static variable in class org.apache.tika.parser.ogg.OggParser
- OGG_FLAC - Static variable in class org.apache.tika.parser.ogg.FlacParser
- OGG_GENERAL - Static variable in class org.apache.tika.detect.ogg.OggDetector
- OGG_GENERAL - Static variable in class org.apache.tika.parser.ogg.OggParser
- OGG_PCM - Static variable in class org.apache.tika.parser.ogg.OggParser
- OGG_RGB - Static variable in class org.apache.tika.parser.ogg.OggParser
- OGG_UVS - Static variable in class org.apache.tika.parser.ogg.OggParser
- OGG_VIDEO - Static variable in class org.apache.tika.detect.ogg.OggDetector
- OGG_VIDEO - Static variable in class org.apache.tika.parser.ogg.OggParser
- OGG_VORBIS - Static variable in class org.apache.tika.parser.ogg.VorbisParser
- OGG_YUV - Static variable in class org.apache.tika.parser.ogg.OggParser
- OggAudioParser - Class in org.apache.tika.parser.ogg
-
Parent parser for the various Ogg Audio formats, such as Vorbis and Opus.
- OggAudioParser() - Constructor for class org.apache.tika.parser.ogg.OggAudioParser
- OggDetector - Class in org.apache.tika.detect.ogg
-
Detector for identifying specific file types stored within an Ogg container.
- OggDetector() - Constructor for class org.apache.tika.detect.ogg.OggDetector
- OggParser - Class in org.apache.tika.parser.ogg
-
General parser for Ogg files where we don't know what the specific kind is.
- OggParser() - Constructor for class org.apache.tika.parser.ogg.OggParser
- OGM_VIDEO - Static variable in class org.apache.tika.parser.ogg.OggParser
- oids - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
- OK - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- OldExcelParser - Class in org.apache.tika.parser.microsoft
-
A POI-powered Tika Parser for very old versions of Excel, from pre-OLE2 days, such as Excel 4.
- OldExcelParser() - Constructor for class org.apache.tika.parser.microsoft.OldExcelParser
- OLE - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
The OLE base file format
- OLE - Static variable in class org.apache.tika.detect.ole.MiscOLEDetector
-
The OLE base file format
- OLE10_NATIVE - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- OLE10_NATIVE - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
An OLE10 Native embedded document within another OLE2 document
- ON_PARSE_EXCEPTION - Static variable in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- onByte(int) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFObjDataStreamParser
-
Receive a single decoded byte from the objdata hex stream.
- onByte(int) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFPictStreamParser
-
Receive a single decoded byte from the pict hex stream.
- onClose(Session) - Method in class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
- onComplete(Metadata) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFPictStreamParser
-
Called when the pict group closes.
- onComplete(Metadata, AtomicInteger) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFObjDataStreamParser
-
Called when the objdata group closes.
- ONE_NOTE_PREFIX - Static variable in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
- OneByteOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
This class is used to represent a property that contains 1 byte of data in the PropertySet.rgData stream field.
- OneByteOfData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains 1 byte of data in the PropertySet.rgData stream field.
- OneByteOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
- OneNoteParser - Class in org.apache.tika.parser.microsoft.onenote
-
OneNote Tika parser capable of parsing Microsoft OneNote files.
- OneNoteParser() - Constructor for class org.apache.tika.parser.microsoft.onenote.OneNoteParser
- OneNotePropertyEnum - Enum Class in org.apache.tika.parser.microsoft.onenote
- OneNoteTreeWalkerOptions - Class in org.apache.tika.parser.microsoft.onenote
-
Options for walking the OneNote tree.
- OneNoteTreeWalkerOptions() - Constructor for class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
- onExists() - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Returns the value of the onExists record component.
- onOpen(Session) - Method in class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
- ONTOLOGY_CONCEPT_ARR - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- OOM - Enum constant in enum class org.apache.tika.eval.app.ProfilerBase.PARSE_ERROR_TYPE
- OOM - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- OOM - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- OOM - Static variable in class org.apache.tika.pipes.core.PipesResults
- OOV - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- OOXML_PROTECTED - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
The protected OOXML base file format
- OOXMLExtractor - Interface in org.apache.tika.parser.microsoft.ooxml
-
Interface implemented by all Tika OOXML extractors.
- OOXMLExtractorFactory - Class in org.apache.tika.parser.microsoft.ooxml
-
Figures out the correct OOXMLExtractor for the supplied document and returns it.
- OOXMLExtractorFactory() - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
- OOXMLParser - Class in org.apache.tika.parser.microsoft.ooxml
-
Office Open XML (OOXML) parser.
- OOXMLParser() - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- OOXMLParser(JsonConfig) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- OOXMLParser(OfficeParserConfig) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- OOXMLTikaBodyPartHandler - Class in org.apache.tika.parser.microsoft.ooxml
- OOXMLTikaBodyPartHandler(XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- OOXMLTikaBodyPartHandler(XHTMLContentHandler, Metadata) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- OOXMLTikaBodyPartHandler(XHTMLContentHandler, XWPFStylesShim, XWPFListManager, OfficeParserConfig) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- OOXMLTikaBodyPartHandler(XHTMLContentHandler, XWPFStylesShim, XWPFListManager, OfficeParserConfig, Metadata) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- OOXMLWordAndPowerPointTextHandler - Class in org.apache.tika.parser.microsoft.ooxml
-
This class is intended to handle anything that might contain IBodyElements: main document, headers, footers, notes, slides, etc.
- OOXMLWordAndPowerPointTextHandler(XWPFBodyContentsHandler, Map<String, String>) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- OOXMLWordAndPowerPointTextHandler(XWPFBodyContentsHandler, Map<String, String>, boolean, boolean) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- OOXMLWordAndPowerPointTextHandler(XWPFBodyContentsHandler, Map<String, String>, boolean, boolean, boolean) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- opcPackage - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
- OPCPackageDetector - Class in org.apache.tika.detect.microsoft.ooxml
- OPCPackageDetector() - Constructor for class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
- OPCPackageWrapper - Class in org.apache.tika.parser.microsoft.ooxml
-
This is a wrapper around OPCPackage that calls revert() instead of close().
- OPCPackageWrapper(OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.OPCPackageWrapper
- OPEN_CHOICE - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- OpenAIEmbeddingFilter - Class in org.apache.tika.inference
-
Metadata filter that calls an OpenAI-compatible /v1/embeddings endpoint to produce vectors for each text chunk.
- OpenAIEmbeddingFilter() - Constructor for class org.apache.tika.inference.OpenAIEmbeddingFilter
- OpenAIEmbeddingFilter(InferenceConfig) - Constructor for class org.apache.tika.inference.OpenAIEmbeddingFilter
- OpenAIImageEmbeddingParser - Class in org.apache.tika.inference
-
Parser that sends images to a CLIP-like embedding endpoint (OpenAI-compatible /v1/embeddings with image input) and stores the resulting vector in metadata.
- OpenAIImageEmbeddingParser() - Constructor for class org.apache.tika.inference.OpenAIImageEmbeddingParser
- OpenAIImageEmbeddingParser(JsonConfig) - Constructor for class org.apache.tika.inference.OpenAIImageEmbeddingParser
- OpenAIImageEmbeddingParser(ImageEmbeddingConfig) - Constructor for class org.apache.tika.inference.OpenAIImageEmbeddingParser
- OpenAIVLMParser - Class in org.apache.tika.parser.vlm
-
VLM parser for OpenAI-compatible chat completions endpoints (OpenAI, Azure OpenAI, OpenRouter, vLLM, Ollama, LiteLLM, Together AI, Groq, Fireworks, Mistral, NVIDIA NIM, Jina, local FastAPI wrappers, etc.).
- OpenAIVLMParser() - Constructor for class org.apache.tika.parser.vlm.OpenAIVLMParser
- OpenAIVLMParser(JsonConfig) - Constructor for class org.apache.tika.parser.vlm.OpenAIVLMParser
- OpenAIVLMParser(VLMOCRConfig) - Constructor for class org.apache.tika.parser.vlm.OpenAIVLMParser
- OpenDocumentContentParser - Class in org.apache.tika.parser.odf
-
Parser for ODF content.xml files.
- OpenDocumentContentParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentContentParser
- OpenDocumentConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the Open Document formats: Text (.odt), Spreadsheet (.ods), Graphics (.odg) and Presentation (.odp).
- OpenDocumentConverter() - Constructor for class org.apache.tika.xmp.convert.OpenDocumentConverter
- OpenDocumentDetector - Class in org.apache.tika.detect.zip
- OpenDocumentDetector() - Constructor for class org.apache.tika.detect.zip.OpenDocumentDetector
- OpenDocumentMetaParser - Class in org.apache.tika.parser.odf
-
Parser for OpenDocument meta.xml files.
- OpenDocumentMetaParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentMetaParser
- OpenDocumentParser - Class in org.apache.tika.parser.odf
-
OpenOffice parser
- OpenDocumentParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentParser
- OpenDocumentParser(JsonConfig) - Constructor for class org.apache.tika.parser.odf.OpenDocumentParser
-
Constructor for JSON configuration.
- OpenDocumentParser(OpenDocumentParser.Config) - Constructor for class org.apache.tika.parser.odf.OpenDocumentParser
-
Constructor with explicit Config object.
- OpenDocumentParser.Config - Class in org.apache.tika.parser.odf
-
Configuration class for JSON deserialization.
- openFile(File) - Method in class org.apache.tika.gui.TikaGUI
- OpenNLPDetector - Class in org.apache.tika.langdetect.opennlp
-
This is based on OpenNLP's language detector.
- OpenNLPDetector() - Constructor for class org.apache.tika.langdetect.opennlp.OpenNLPDetector
- OpenNLPMetadataFilter - Class in org.apache.tika.langdetect.opennlp.metadatafilter
- OpenNLPMetadataFilter() - Constructor for class org.apache.tika.langdetect.opennlp.metadatafilter.OpenNLPMetadataFilter
- OpenNLPNameFinder - Class in org.apache.tika.parser.ner.opennlp
-
An implementation of NERecogniser that finds names in text using an OpenNLP model.
- OpenNLPNameFinder(String, String) - Constructor for class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
Creates OpenNLP name finder
- OpenNLPNERecogniser - Class in org.apache.tika.parser.ner.opennlp
-
This implementation of NERecogniser chains an array of OpenNLPNameFinders for which NER models are available on the classpath.
- OpenNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
Creates a default chain of Name finders using default OpenNLP recognizers
- OpenNLPNERecogniser(Map<String, String>) - Constructor for class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
Creates a chain of Named Entity recognisers
- OpenSearchClient - Class in org.apache.tika.pipes.emitter.opensearch
- OpenSearchClient - Class in org.apache.tika.pipes.reporter.opensearch
- OpenSearchClient(String, HttpClient) - Constructor for class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- OpenSearchClient(OpenSearchEmitterConfig, HttpClient) - Constructor for class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- OpenSearchEmitter - Class in org.apache.tika.pipes.emitter.opensearch
- OpenSearchEmitter(ExtensionConfig, OpenSearchEmitterConfig) - Constructor for class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
- OpenSearchEmitterConfig - Record Class in org.apache.tika.pipes.emitter.opensearch
- OpenSearchEmitterConfig(String, String, OpenSearchEmitterConfig.AttachmentStrategy, OpenSearchEmitterConfig.UpdateStrategy, int, String, HttpClientConfig) - Constructor for record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Creates an instance of an OpenSearchEmitterConfig record class.
- OpenSearchEmitterConfig.AttachmentStrategy - Enum Class in org.apache.tika.pipes.emitter.opensearch
- OpenSearchEmitterConfig.UpdateStrategy - Enum Class in org.apache.tika.pipes.emitter.opensearch
- OpenSearchEmitterFactory - Class in org.apache.tika.pipes.emitter.opensearch
-
Factory for creating OpenSearch emitters.
- OpenSearchEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterFactory
- OpenSearchPipesPlugin - Class in org.apache.tika.pipes.plugin.opensearch
- OpenSearchPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.opensearch.OpenSearchPipesPlugin
- OpenSearchPipesReporter - Class in org.apache.tika.pipes.reporter.opensearch
-
As of the 2.5.0 release, this is an ALPHA version.
- OpenSearchPipesReporter(ExtensionConfig, OpenSearchReporterConfig) - Constructor for class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- OpenSearchReporterConfig - Record Class in org.apache.tika.pipes.reporter.opensearch
- OpenSearchReporterConfig(String, Set<String>, Set<String>, String, boolean, HttpClientConfig) - Constructor for record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Creates an instance of an OpenSearchReporterConfig record class.
- OpenSearchReporterFactory - Class in org.apache.tika.pipes.reporter.opensearch
-
Factory for creating OpenSearch pipes reporters.
- OpenSearchReporterFactory() - Constructor for class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterFactory
- openSearchUrl - Variable in class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- openSearchUrl() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the openSearchUrl record component.
- openSearchUrl() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns the value of the openSearchUrl record component.
- openURL(URL) - Method in class org.apache.tika.gui.TikaGUI
- OPFParser - Class in org.apache.tika.parser.epub
-
Use this to parse the .opf files
- OPFParser() - Constructor for class org.apache.tika.parser.epub.OPFParser
- OptimaizeLangDetector - Class in org.apache.tika.langdetect.optimaize
-
Implementation of the LanguageDetector API that uses https://github.com/optimaize/language-detector
- OptimaizeLangDetector() - Constructor for class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- OptimaizeLangDetector(int) - Constructor for class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- OptimaizeMetadataFilter - Class in org.apache.tika.langdetect.optimaize.metadatafilter
- OptimaizeMetadataFilter() - Constructor for class org.apache.tika.langdetect.optimaize.metadatafilter.OptimaizeMetadataFilter
- OPUS_AUDIO - Static variable in class org.apache.tika.parser.ogg.OpusParser
- OPUS_AUDIO_ALT - Static variable in class org.apache.tika.parser.ogg.OpusParser
- OpusParser - Class in org.apache.tika.parser.ogg
-
Parser for OGG Opus audio files.
- OpusParser() - Constructor for class org.apache.tika.parser.ogg.OpusParser
- org.apache.tika - package org.apache.tika
-
Apache Tika.
- org.apache.tika.annotation - package org.apache.tika.annotation
- org.apache.tika.async.cli - package org.apache.tika.async.cli
- org.apache.tika.bundle.internal - package org.apache.tika.bundle.internal
- org.apache.tika.cli - package org.apache.tika.cli
- org.apache.tika.client - package org.apache.tika.client
- org.apache.tika.concurrent - package org.apache.tika.concurrent
- org.apache.tika.config - package org.apache.tika.config
-
Tika configuration tools.
- org.apache.tika.config.loader - package org.apache.tika.config.loader
- org.apache.tika.detect - package org.apache.tika.detect
-
Media type detection.
- org.apache.tika.detect.apple - package org.apache.tika.detect.apple
- org.apache.tika.detect.gzip - package org.apache.tika.detect.gzip
- org.apache.tika.detect.magika - package org.apache.tika.detect.magika
- org.apache.tika.detect.microsoft - package org.apache.tika.detect.microsoft
- org.apache.tika.detect.microsoft.ooxml - package org.apache.tika.detect.microsoft.ooxml
- org.apache.tika.detect.ogg - package org.apache.tika.detect.ogg
- org.apache.tika.detect.ole - package org.apache.tika.detect.ole
- org.apache.tika.detect.siegfried - package org.apache.tika.detect.siegfried
- org.apache.tika.detect.zip - package org.apache.tika.detect.zip
- org.apache.tika.digest - package org.apache.tika.digest
- org.apache.tika.embedder - package org.apache.tika.embedder
- org.apache.tika.eval.app - package org.apache.tika.eval.app
- org.apache.tika.eval.app.db - package org.apache.tika.eval.app.db
- org.apache.tika.eval.app.io - package org.apache.tika.eval.app.io
- org.apache.tika.eval.app.reports - package org.apache.tika.eval.app.reports
- org.apache.tika.eval.app.tools - package org.apache.tika.eval.app.tools
- org.apache.tika.eval.core.langid - package org.apache.tika.eval.core.langid
- org.apache.tika.eval.core.metadata - package org.apache.tika.eval.core.metadata
- org.apache.tika.eval.core.textstats - package org.apache.tika.eval.core.textstats
- org.apache.tika.eval.core.tokens - package org.apache.tika.eval.core.tokens
- org.apache.tika.eval.core.util - package org.apache.tika.eval.core.util
- org.apache.tika.example - package org.apache.tika.example
- org.apache.tika.exception - package org.apache.tika.exception
-
Tika exception.
- org.apache.tika.extractor - package org.apache.tika.extractor
-
Extraction of component documents.
- org.apache.tika.extractor.microsoft - package org.apache.tika.extractor.microsoft
- org.apache.tika.filetypedetector - package org.apache.tika.filetypedetector
-
Tika Java-7 FileTypeDetector implementations.
- org.apache.tika.gui - package org.apache.tika.gui
- org.apache.tika.http - package org.apache.tika.http
- org.apache.tika.inference - package org.apache.tika.inference
- org.apache.tika.inference.locator - package org.apache.tika.inference.locator
- org.apache.tika.io - package org.apache.tika.io
-
IO utilities.
- org.apache.tika.langdetect - package org.apache.tika.langdetect
- org.apache.tika.langdetect.charsoup - package org.apache.tika.langdetect.charsoup
- org.apache.tika.langdetect.lingo24 - package org.apache.tika.langdetect.lingo24
- org.apache.tika.langdetect.mitll - package org.apache.tika.langdetect.mitll
- org.apache.tika.langdetect.opennlp - package org.apache.tika.langdetect.opennlp
- org.apache.tika.langdetect.opennlp.metadatafilter - package org.apache.tika.langdetect.opennlp.metadatafilter
- org.apache.tika.langdetect.optimaize - package org.apache.tika.langdetect.optimaize
- org.apache.tika.langdetect.optimaize.metadatafilter - package org.apache.tika.langdetect.optimaize.metadatafilter
- org.apache.tika.language.detect - package org.apache.tika.language.detect
- org.apache.tika.language.translate - package org.apache.tika.language.translate
- org.apache.tika.language.translate.impl - package org.apache.tika.language.translate.impl
- org.apache.tika.metadata - package org.apache.tika.metadata
-
Multi-valued metadata container, and set of constant metadata fields.
- org.apache.tika.metadata.filter - package org.apache.tika.metadata.filter
- org.apache.tika.metadata.writefilter - package org.apache.tika.metadata.writefilter
- org.apache.tika.mime - package org.apache.tika.mime
-
Media type information.
- org.apache.tika.ml - package org.apache.tika.ml
- org.apache.tika.ml.chardetect - package org.apache.tika.ml.chardetect
- org.apache.tika.ml.chardetect.tools - package org.apache.tika.ml.chardetect.tools
- org.apache.tika.ml.junkdetect - package org.apache.tika.ml.junkdetect
- org.apache.tika.ml.junkdetect.tools - package org.apache.tika.ml.junkdetect.tools
- org.apache.tika.parser - package org.apache.tika.parser
-
Tika parsers.
- org.apache.tika.parser.apple - package org.apache.tika.parser.apple
- org.apache.tika.parser.asm - package org.apache.tika.parser.asm
- org.apache.tika.parser.audio - package org.apache.tika.parser.audio
- org.apache.tika.parser.code - package org.apache.tika.parser.code
- org.apache.tika.parser.crypto - package org.apache.tika.parser.crypto
- org.apache.tika.parser.csv - package org.apache.tika.parser.csv
- org.apache.tika.parser.ctakes - package org.apache.tika.parser.ctakes
- org.apache.tika.parser.dbf - package org.apache.tika.parser.dbf
- org.apache.tika.parser.dgn - package org.apache.tika.parser.dgn
- org.apache.tika.parser.dif - package org.apache.tika.parser.dif
- org.apache.tika.parser.digestutils - package org.apache.tika.parser.digestutils
- org.apache.tika.parser.dwg - package org.apache.tika.parser.dwg
- org.apache.tika.parser.envi - package org.apache.tika.parser.envi
- org.apache.tika.parser.epub - package org.apache.tika.parser.epub
- org.apache.tika.parser.executable - package org.apache.tika.parser.executable
- org.apache.tika.parser.external - package org.apache.tika.parser.external
- org.apache.tika.parser.feed - package org.apache.tika.parser.feed
- org.apache.tika.parser.font - package org.apache.tika.parser.font
- org.apache.tika.parser.gdal - package org.apache.tika.parser.gdal
- org.apache.tika.parser.geo.topic - package org.apache.tika.parser.geo.topic
- org.apache.tika.parser.geo.topic.gazetteer - package org.apache.tika.parser.geo.topic.gazetteer
- org.apache.tika.parser.geoinfo - package org.apache.tika.parser.geoinfo
- org.apache.tika.parser.geopkg - package org.apache.tika.parser.geopkg
- org.apache.tika.parser.grib - package org.apache.tika.parser.grib
- org.apache.tika.parser.hdf - package org.apache.tika.parser.hdf
- org.apache.tika.parser.html - package org.apache.tika.parser.html
- org.apache.tika.parser.html.charsetdetector - package org.apache.tika.parser.html.charsetdetector
- org.apache.tika.parser.html.charsetdetector.charsets - package org.apache.tika.parser.html.charsetdetector.charsets
- org.apache.tika.parser.http - package org.apache.tika.parser.http
- org.apache.tika.parser.hwp - package org.apache.tika.parser.hwp
- org.apache.tika.parser.image - package org.apache.tika.parser.image
- org.apache.tika.parser.indesign - package org.apache.tika.parser.indesign
- org.apache.tika.parser.iptc - package org.apache.tika.parser.iptc
- org.apache.tika.parser.isatab - package org.apache.tika.parser.isatab
- org.apache.tika.parser.iwork - package org.apache.tika.parser.iwork
- org.apache.tika.parser.iwork.iwana - package org.apache.tika.parser.iwork.iwana
- org.apache.tika.parser.jdbc - package org.apache.tika.parser.jdbc
- org.apache.tika.parser.journal - package org.apache.tika.parser.journal
- org.apache.tika.parser.mail - package org.apache.tika.parser.mail
- org.apache.tika.parser.mailcommons - package org.apache.tika.parser.mailcommons
- org.apache.tika.parser.mat - package org.apache.tika.parser.mat
- org.apache.tika.parser.mbox - package org.apache.tika.parser.mbox
- org.apache.tika.parser.microsoft - package org.apache.tika.parser.microsoft
- org.apache.tika.parser.microsoft.activemime - package org.apache.tika.parser.microsoft.activemime
- org.apache.tika.parser.microsoft.chm - package org.apache.tika.parser.microsoft.chm
- org.apache.tika.parser.microsoft.libpst - package org.apache.tika.parser.microsoft.libpst
- org.apache.tika.parser.microsoft.msg - package org.apache.tika.parser.microsoft.msg
- org.apache.tika.parser.microsoft.onenote - package org.apache.tika.parser.microsoft.onenote
- org.apache.tika.parser.microsoft.onenote.fsshttpb - package org.apache.tika.parser.microsoft.onenote.fsshttpb
- org.apache.tika.parser.microsoft.onenote.fsshttpb.exception - package org.apache.tika.parser.microsoft.onenote.fsshttpb.exception
- org.apache.tika.parser.microsoft.onenote.fsshttpb.property - package org.apache.tika.parser.microsoft.onenote.fsshttpb.property
- org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
- org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space - package org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space
- org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned - package org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
- org.apache.tika.parser.microsoft.onenote.fsshttpb.util - package org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- org.apache.tika.parser.microsoft.ooxml - package org.apache.tika.parser.microsoft.ooxml
- org.apache.tika.parser.microsoft.ooxml.xps - package org.apache.tika.parser.microsoft.ooxml.xps
- org.apache.tika.parser.microsoft.ooxml.xslf - package org.apache.tika.parser.microsoft.ooxml.xslf
- org.apache.tika.parser.microsoft.ooxml.xwpf - package org.apache.tika.parser.microsoft.ooxml.xwpf
- org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006 - package org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006
- org.apache.tika.parser.microsoft.pst - package org.apache.tika.parser.microsoft.pst
- org.apache.tika.parser.microsoft.rtf - package org.apache.tika.parser.microsoft.rtf
- org.apache.tika.parser.microsoft.rtf.jflex - package org.apache.tika.parser.microsoft.rtf.jflex
- org.apache.tika.parser.microsoft.xml - package org.apache.tika.parser.microsoft.xml
- org.apache.tika.parser.mif - package org.apache.tika.parser.mif
- org.apache.tika.parser.mp3 - package org.apache.tika.parser.mp3
- org.apache.tika.parser.mp4 - package org.apache.tika.parser.mp4
- org.apache.tika.parser.mp4.boxes - package org.apache.tika.parser.mp4.boxes
- org.apache.tika.parser.multiple - package org.apache.tika.parser.multiple
- org.apache.tika.parser.ner - package org.apache.tika.parser.ner
- org.apache.tika.parser.ner.corenlp - package org.apache.tika.parser.ner.corenlp
- org.apache.tika.parser.ner.grobid - package org.apache.tika.parser.ner.grobid
- org.apache.tika.parser.ner.mitie - package org.apache.tika.parser.ner.mitie
- org.apache.tika.parser.ner.nltk - package org.apache.tika.parser.ner.nltk
- org.apache.tika.parser.ner.opennlp - package org.apache.tika.parser.ner.opennlp
- org.apache.tika.parser.ner.regex - package org.apache.tika.parser.ner.regex
- org.apache.tika.parser.netcdf - package org.apache.tika.parser.netcdf
- org.apache.tika.parser.ocr - package org.apache.tika.parser.ocr
- org.apache.tika.parser.ocr.tess4j - package org.apache.tika.parser.ocr.tess4j
- org.apache.tika.parser.ocrencode - package org.apache.tika.parser.ocrencode
- org.apache.tika.parser.odf - package org.apache.tika.parser.odf
- org.apache.tika.parser.ogg - package org.apache.tika.parser.ogg
- org.apache.tika.parser.pdf - package org.apache.tika.parser.pdf
- org.apache.tika.parser.pdf.image - package org.apache.tika.parser.pdf.image
- org.apache.tika.parser.pdf.updates - package org.apache.tika.parser.pdf.updates
- org.apache.tika.parser.pdf.xmpschemas - package org.apache.tika.parser.pdf.xmpschemas
- org.apache.tika.parser.pkg - package org.apache.tika.parser.pkg
- org.apache.tika.parser.prt - package org.apache.tika.parser.prt
- org.apache.tika.parser.sas - package org.apache.tika.parser.sas
- org.apache.tika.parser.sqlite3 - package org.apache.tika.parser.sqlite3
- org.apache.tika.parser.strings - package org.apache.tika.parser.strings
- org.apache.tika.parser.tmx - package org.apache.tika.parser.tmx
- org.apache.tika.parser.transcribe.aws - package org.apache.tika.parser.transcribe.aws
- org.apache.tika.parser.txt - package org.apache.tika.parser.txt
- org.apache.tika.parser.video - package org.apache.tika.parser.video
- org.apache.tika.parser.vlm - package org.apache.tika.parser.vlm
- org.apache.tika.parser.wacz - package org.apache.tika.parser.wacz
- org.apache.tika.parser.warc - package org.apache.tika.parser.warc
- org.apache.tika.parser.wordperfect - package org.apache.tika.parser.wordperfect
- org.apache.tika.parser.xliff - package org.apache.tika.parser.xliff
- org.apache.tika.parser.xml - package org.apache.tika.parser.xml
- org.apache.tika.parser.xmp - package org.apache.tika.parser.xmp
- org.apache.tika.pipes.api - package org.apache.tika.pipes.api
- org.apache.tika.pipes.api.emitter - package org.apache.tika.pipes.api.emitter
- org.apache.tika.pipes.api.fetcher - package org.apache.tika.pipes.api.fetcher
- org.apache.tika.pipes.api.pipesiterator - package org.apache.tika.pipes.api.pipesiterator
- org.apache.tika.pipes.api.reporter - package org.apache.tika.pipes.api.reporter
- org.apache.tika.pipes.core - package org.apache.tika.pipes.core
- org.apache.tika.pipes.core.async - package org.apache.tika.pipes.core.async
- org.apache.tika.pipes.core.config - package org.apache.tika.pipes.core.config
- org.apache.tika.pipes.core.emitter - package org.apache.tika.pipes.core.emitter
- org.apache.tika.pipes.core.extractor - package org.apache.tika.pipes.core.extractor
- org.apache.tika.pipes.core.extractor.frictionless - package org.apache.tika.pipes.core.extractor.frictionless
- org.apache.tika.pipes.core.fetcher - package org.apache.tika.pipes.core.fetcher
- org.apache.tika.pipes.core.pipesiterator - package org.apache.tika.pipes.core.pipesiterator
- org.apache.tika.pipes.core.protocol - package org.apache.tika.pipes.core.protocol
- org.apache.tika.pipes.core.reporter - package org.apache.tika.pipes.core.reporter
- org.apache.tika.pipes.core.serialization - package org.apache.tika.pipes.core.serialization
- org.apache.tika.pipes.core.server - package org.apache.tika.pipes.core.server
- org.apache.tika.pipes.emitter.azblob - package org.apache.tika.pipes.emitter.azblob
- org.apache.tika.pipes.emitter.es - package org.apache.tika.pipes.emitter.es
- org.apache.tika.pipes.emitter.fs - package org.apache.tika.pipes.emitter.fs
- org.apache.tika.pipes.emitter.gcs - package org.apache.tika.pipes.emitter.gcs
- org.apache.tika.pipes.emitter.jdbc - package org.apache.tika.pipes.emitter.jdbc
- org.apache.tika.pipes.emitter.kafka - package org.apache.tika.pipes.emitter.kafka
- org.apache.tika.pipes.emitter.opensearch - package org.apache.tika.pipes.emitter.opensearch
- org.apache.tika.pipes.emitter.s3 - package org.apache.tika.pipes.emitter.s3
- org.apache.tika.pipes.emitter.solr - package org.apache.tika.pipes.emitter.solr
- org.apache.tika.pipes.fetcher.atlassianjwt - package org.apache.tika.pipes.fetcher.atlassianjwt
- org.apache.tika.pipes.fetcher.atlassianjwt.config - package org.apache.tika.pipes.fetcher.atlassianjwt.config
- org.apache.tika.pipes.fetcher.azblob - package org.apache.tika.pipes.fetcher.azblob
- org.apache.tika.pipes.fetcher.azblob.config - package org.apache.tika.pipes.fetcher.azblob.config
- org.apache.tika.pipes.fetcher.fs - package org.apache.tika.pipes.fetcher.fs
- org.apache.tika.pipes.fetcher.gcs - package org.apache.tika.pipes.fetcher.gcs
- org.apache.tika.pipes.fetcher.gcs.config - package org.apache.tika.pipes.fetcher.gcs.config
- org.apache.tika.pipes.fetcher.googledrive - package org.apache.tika.pipes.fetcher.googledrive
- org.apache.tika.pipes.fetcher.googledrive.config - package org.apache.tika.pipes.fetcher.googledrive.config
- org.apache.tika.pipes.fetcher.http - package org.apache.tika.pipes.fetcher.http
- org.apache.tika.pipes.fetcher.http.config - package org.apache.tika.pipes.fetcher.http.config
- org.apache.tika.pipes.fetcher.http.jwt - package org.apache.tika.pipes.fetcher.http.jwt
- org.apache.tika.pipes.fetcher.s3 - package org.apache.tika.pipes.fetcher.s3
- org.apache.tika.pipes.fetcher.s3.config - package org.apache.tika.pipes.fetcher.s3.config
- org.apache.tika.pipes.fetchers.microsoftgraph - package org.apache.tika.pipes.fetchers.microsoftgraph
- org.apache.tika.pipes.fetchers.microsoftgraph.config - package org.apache.tika.pipes.fetchers.microsoftgraph.config
- org.apache.tika.pipes.fork - package org.apache.tika.pipes.fork
- org.apache.tika.pipes.grpc - package org.apache.tika.pipes.grpc
- org.apache.tika.pipes.ignite - package org.apache.tika.pipes.ignite
- org.apache.tika.pipes.ignite.config - package org.apache.tika.pipes.ignite.config
- org.apache.tika.pipes.ignite.server - package org.apache.tika.pipes.ignite.server
- org.apache.tika.pipes.iterator.azblob - package org.apache.tika.pipes.iterator.azblob
- org.apache.tika.pipes.iterator.csv - package org.apache.tika.pipes.iterator.csv
- org.apache.tika.pipes.iterator.fs - package org.apache.tika.pipes.iterator.fs
- org.apache.tika.pipes.iterator.gcs - package org.apache.tika.pipes.iterator.gcs
- org.apache.tika.pipes.iterator.jdbc - package org.apache.tika.pipes.iterator.jdbc
- org.apache.tika.pipes.iterator.kafka - package org.apache.tika.pipes.iterator.kafka
- org.apache.tika.pipes.iterator.s3 - package org.apache.tika.pipes.iterator.s3
- org.apache.tika.pipes.iterator.solr - package org.apache.tika.pipes.iterator.solr
- org.apache.tika.pipes.pipesiterator - package org.apache.tika.pipes.pipesiterator
- org.apache.tika.pipes.pipesiterator.json - package org.apache.tika.pipes.pipesiterator.json
- org.apache.tika.pipes.plugin - package org.apache.tika.pipes.plugin
- org.apache.tika.pipes.plugin.atlassianjwt - package org.apache.tika.pipes.plugin.atlassianjwt
- org.apache.tika.pipes.plugin.azblob - package org.apache.tika.pipes.plugin.azblob
- org.apache.tika.pipes.plugin.csv - package org.apache.tika.pipes.plugin.csv
- org.apache.tika.pipes.plugin.es - package org.apache.tika.pipes.plugin.es
- org.apache.tika.pipes.plugin.fs - package org.apache.tika.pipes.plugin.fs
- org.apache.tika.pipes.plugin.gcs - package org.apache.tika.pipes.plugin.gcs
- org.apache.tika.pipes.plugin.googledrive - package org.apache.tika.pipes.plugin.googledrive
- org.apache.tika.pipes.plugin.http - package org.apache.tika.pipes.plugin.http
- org.apache.tika.pipes.plugin.jdbc - package org.apache.tika.pipes.plugin.jdbc
- org.apache.tika.pipes.plugin.kafka - package org.apache.tika.pipes.plugin.kafka
- org.apache.tika.pipes.plugin.microsoftgraph - package org.apache.tika.pipes.plugin.microsoftgraph
- org.apache.tika.pipes.plugin.opensearch - package org.apache.tika.pipes.plugin.opensearch
- org.apache.tika.pipes.plugin.s3 - package org.apache.tika.pipes.plugin.s3
- org.apache.tika.pipes.plugin.solr - package org.apache.tika.pipes.plugin.solr
- org.apache.tika.pipes.reporter.es - package org.apache.tika.pipes.reporter.es
- org.apache.tika.pipes.reporter.fs - package org.apache.tika.pipes.reporter.fs
- org.apache.tika.pipes.reporter.jdbc - package org.apache.tika.pipes.reporter.jdbc
- org.apache.tika.pipes.reporter.opensearch - package org.apache.tika.pipes.reporter.opensearch
- org.apache.tika.pipes.reporters - package org.apache.tika.pipes.reporters
- org.apache.tika.plugins - package org.apache.tika.plugins
- org.apache.tika.quality - package org.apache.tika.quality
- org.apache.tika.renderer - package org.apache.tika.renderer
- org.apache.tika.renderer.pdf.pdfbox - package org.apache.tika.renderer.pdf.pdfbox
- org.apache.tika.renderer.pdf.poppler - package org.apache.tika.renderer.pdf.poppler
- org.apache.tika.sax - package org.apache.tika.sax
-
SAX utilities.
- org.apache.tika.sax.boilerpipe - package org.apache.tika.sax.boilerpipe
- org.apache.tika.sax.xpath - package org.apache.tika.sax.xpath
-
XPath utilities
- org.apache.tika.serialization - package org.apache.tika.serialization
- org.apache.tika.serialization.serdes - package org.apache.tika.serialization.serdes
- org.apache.tika.server.client - package org.apache.tika.server.client
- org.apache.tika.server.core - package org.apache.tika.server.core
- org.apache.tika.server.core.resource - package org.apache.tika.server.core.resource
- org.apache.tika.server.core.writer - package org.apache.tika.server.core.writer
- org.apache.tika.server.standard.resource - package org.apache.tika.server.standard.resource
- org.apache.tika.server.standard.writer - package org.apache.tika.server.standard.writer
- org.apache.tika.utils - package org.apache.tika.utils
-
Utilities.
- org.apache.tika.xmp - package org.apache.tika.xmp
- org.apache.tika.xmp.convert - package org.apache.tika.xmp.convert
- org.apache.tika.zip.utils - package org.apache.tika.zip.utils
- ORGANISATION_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
A set of metadata about artwork or an object in the item
- ORGANISATION_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the organisation or company which is featured in the content.
- ORGANIZATION - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- ORGANIZATION_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- ORIENTATION - Static variable in interface org.apache.tika.metadata.TIFF
-
"The Orientation of the image." 1 = 0th row at top, 0th column at left; 2 = 0th row at top, 0th column at right; 3 = 0th row at bottom, 0th column at right; 4 = 0th row at bottom, 0th column at left; 5 = 0th row at left, 0th column at top; 6 = 0th row at right, 0th column at top; 7 = 0th row at right, 0th column at bottom; 8 = 0th row at left, 0th column at bottom
- ORIG_STACK_TRACE - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- ORIGINAL_DATE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Date and time when original image was generated"
- ORIGINAL_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
The common identifier for the original resource from which the current resource is derived.
- ORIGINAL_RESOURCE_NAME - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Some file formats can store information about their original file name/location or about their attachment's original file name/location within the file.
- OS_NAME - Static variable in class org.apache.tika.utils.SystemUtils
- OS_VERSION - Static variable in class org.apache.tika.utils.SystemUtils
- osids - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
- osidStreamNotPresent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
- OTHER - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- OtherFileNodeList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- OUT_OF_VOCABULARY - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- OutlineElementChildLevel - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- OutlineElementRTL - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- OUTLOOK - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- OutlookExtractor - Class in org.apache.tika.parser.microsoft
-
Outlook Message Parser.
- OutlookExtractor(DirectoryNode, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.OutlookExtractor
- OutlookExtractor.BODY_TYPES_PROCESSED - Enum Class in org.apache.tika.parser.microsoft
- OutlookExtractor.RECIPIENT_TYPE - Enum Class in org.apache.tika.parser.microsoft
- OutlookPSTParser - Class in org.apache.tika.parser.microsoft.pst
-
Parser for MS Outlook PST email storage files
- OutlookPSTParser() - Constructor for class org.apache.tika.parser.microsoft.pst.OutlookPSTParser
- OUTPUT_FILE_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
- OUTPUT_FILE_TOKEN - Static variable in class org.apache.tika.parser.external.ExternalParser
- OutputLimits - Class in org.apache.tika.config
-
Configuration for output and security limits.
- OutputLimits() - Constructor for class org.apache.tika.config.OutputLimits
-
No-arg constructor for Jackson deserialization.
- OutputLimits(int, boolean, int, int, long, long) - Constructor for class org.apache.tika.config.OutputLimits
-
Constructor with all parameters.
- OVERALL_PERCENTAGE_UNMAPPED_UNICODE_CHARS - Static variable in interface org.apache.tika.metadata.PDF
- OVERLAP - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- OverrideDetector - Class in org.apache.tika.detect
-
Deprecated. After 2.5.0, this functionality was moved to the CompositeDetector
- OverrideDetector() - Constructor for class org.apache.tika.detect.OverrideDetector
-
Deprecated.
- OverrideEncodingDetector - Class in org.apache.tika.detect
-
Always returns the charset passed in via the initializer
- OverrideEncodingDetector() - Constructor for class org.apache.tika.detect.OverrideEncodingDetector
-
Sets charset to UTF-8.
- OverrideEncodingDetector(Charset) - Constructor for class org.apache.tika.detect.OverrideEncodingDetector
-
Constructor with explicit Charset object.
- OverrideEncodingDetector(JsonConfig) - Constructor for class org.apache.tika.detect.OverrideEncodingDetector
-
Constructor for JSON configuration.
- OverrideEncodingDetector(OverrideEncodingDetector.Config) - Constructor for class org.apache.tika.detect.OverrideEncodingDetector
-
Constructor with explicit Config object.
- OverrideEncodingDetector.Config - Class in org.apache.tika.detect
-
Configuration class for JSON deserialization.
- overrideTupleMap - Variable in class org.apache.tika.parser.microsoft.AbstractListManager
- OVERWRITE - Enum constant in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.UpdateStrategy
- OVERWRITE - Enum constant in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.UpdateStrategy
- overwriteExisting() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns the value of the overwriteExisting record component.
- OWNER - Static variable in interface org.apache.tika.metadata.XMPRights
-
A list of legal owners of the resource.
P
- PACK - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- PackageConstants - Class in org.apache.tika.detect.zip
- PackageConstants() - Constructor for class org.apache.tika.detect.zip.PackageConstants
- PackageParser - Class in org.apache.tika.parser.pkg
-
Parser for streaming archive formats: AR, ARJ, CPIO, DUMP, TAR.
- PackageParser() - Constructor for class org.apache.tika.parser.pkg.PackageParser
- packagingEnd - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- packagingStart - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- padding - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
- PAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of pages in the (paged) document
- PAGE_NUMBER - Static variable in interface org.apache.tika.metadata.TikaPagedText
-
1-based page number for a specific page
- PAGE_NUMBER_FIELD_NUMBER - Static variable in class org.apache.tika.ListFetchersRequest
- PAGE_ROTATION - Static variable in interface org.apache.tika.metadata.TikaPagedText
- PageBasedRenderResults - Class in org.apache.tika.renderer
- PageBasedRenderResults(TemporaryResources) - Constructor for class org.apache.tika.renderer.PageBasedRenderResults
- PagedText - Interface in org.apache.tika.metadata
-
XMP Paged-text schema.
- PageHeight - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageLevel - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageMarginBottom - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageMarginLeft - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageMarginOriginX - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageMarginOriginY - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageMarginRight - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageMarginTop - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- pageNumber - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- PageRangeRequest - Class in org.apache.tika.renderer
-
The range of pages to render.
- PageRangeRequest(int, int) - Constructor for class org.apache.tika.renderer.PageRangeRequest
- PAGES - Enum constant in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
- PAGES13 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- PAGES18 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
- PageSize - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PageWidth - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PaginatedLocator - Class in org.apache.tika.inference.locator
-
Locator for paginated documents (PDF, PPTX, DOCX, etc.).
- PaginatedLocator(int) - Constructor for class org.apache.tika.inference.locator.PaginatedLocator
- PaginatedLocator(int, float[]) - Constructor for class org.apache.tika.inference.locator.PaginatedLocator
- PARAGRAPH_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of individual Paragraphs in the document
- ParagraphAlignment - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ParagraphLevelCounter(AbstractListManager.LevelTuple[]) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
- ParagraphLineSpacingExact - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ParagraphProperties - Class in org.apache.tika.parser.microsoft.ooxml
- ParagraphProperties() - Constructor for class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- ParagraphProperties(ParagraphProperties) - Constructor for class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- ParagraphSpaceAfter - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ParagraphSpaceBefore - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ParagraphStyle - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ParagraphStyleId - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PARAMS_FIELD_NUMBER - Static variable in class org.apache.tika.GetFetcherReply
- PARENT_CHILD - Enum constant in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.AttachmentStrategy
- PARENT_CHILD - Enum constant in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.AttachmentStrategy
- PARENT_CHILD - Enum constant in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.AttachmentStrategy
- ParentContentHandler - Class in org.apache.tika.extractor
-
Simple pointer class that lets parsers pass the parent ContentHandler through to the embedded document's parse
- ParentContentHandler(ContentHandler) - Constructor for class org.apache.tika.extractor.ParentContentHandler
- parentMetadata - Variable in class org.apache.tika.parser.microsoft.OutlookExtractor
- parentMetadata - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- parse(byte[], AtomicInteger, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to parse byte array to special object.
- parse(byte[], ChmItsfHeader) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
- parse(byte[], ChmItspHeader) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
- parse(byte[], ChmLzxcControlData) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
- parse(byte[], ChmLzxcResetTable) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
- parse(byte[], ChmPmgiHeader) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
- parse(byte[], ChmPmglHeader) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- parse(byte[], T) - Method in interface org.apache.tika.parser.microsoft.chm.ChmAccessor
-
Parses chm accessor
- parse(Image, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- parse(File) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parse(File, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parse(InputStream) - Method in class org.apache.tika.parser.xmp.JempboxExtractor
- parse(InputStream) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parse(InputStream, OutputStream) - Method in class org.apache.tika.parser.xmp.XMPPacketScanner
-
Locates an XMP packet in a stream, parses it and returns the XMP metadata.
- parse(InputStream, Metadata) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
-
Parse the XMP Packets.
- parse(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parse(String) - Static method in class org.apache.tika.mime.MediaType
-
Parses the given string to a media type.
- parse(String) - Method in class org.apache.tika.parser.html.DataURISchemeUtil
- parse(String) - Static method in enum class org.apache.tika.pipes.api.ParseMode
-
Parses a string to a ParseMode enum value.
- parse(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.KEY_BASE_STRATEGY
- parse(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_FORMAT
- parse(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_MODE
- parse(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.SUFFIX_STRATEGY
- parse(String) - Method in class org.apache.tika.sax.xpath.XPathParser
-
Parses the given simple XPath expression to an evaluation state initialized at the document node.
- parse(String, ParseContext) - Method in class org.apache.tika.parser.journal.TEIDOMParser
- parse(String, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.journal.GrobidRESTParser
- parse(URL) - Method in class org.apache.tika.Tika
-
Parses the resource at the given URL and returns the extracted text content.
- parse(Path) - Method in class org.apache.tika.pipes.fork.PipesForkParser
-
Parse a file in a forked JVM process.
- parse(Path) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parse(Path, Metadata) - Method in class org.apache.tika.pipes.fork.PipesForkParser
-
Parse a file in a forked JVM process with the specified metadata.
- parse(Path, Metadata) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parse(Path, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fork.PipesForkParser
-
Parse a file in a forked JVM process with the specified metadata and parse context.
- parse(OldExcelExtractor, XHTMLContentHandler) - Static method in class org.apache.tika.parser.microsoft.OldExcelParser
- parse(DirectoryNode, ParseContext, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.OfficeParser
- parse(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.HSLFExtractor
- parse(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
- parse(DirectoryNode, XHTMLContentHandler, Locale) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
- parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.HSLFExtractor
- parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
- parse(POIFSFileSystem, XHTMLContentHandler, Locale) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Extracts text from an Excel Workbook writing the extracted content to the specified Appendable.
- parse(TikaInputStream) - Method in class org.apache.tika.pipes.fork.PipesForkParser
-
Parse a file in a forked JVM process.
- parse(TikaInputStream, Metadata) - Method in class org.apache.tika.pipes.fork.PipesForkParser
-
Parse a file in a forked JVM process with the specified metadata.
- parse(TikaInputStream, Metadata, ParseContext) - Method in class org.apache.tika.pipes.fork.PipesForkParser
-
Parse a file in a forked JVM process with the specified metadata and parse context.
- parse(TikaInputStream, Metadata, ParseContext, ParseMode) - Method in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Parses content using pipes-based parsing with process isolation.
- parse(TikaInputStream, ContentHandlerFactory, Metadata, ParseContext) - Method in class org.apache.tika.example.PickBestTextEncodingParser
-
Deprecated.
- parse(TikaInputStream, ContentHandlerFactory, Metadata, ParseContext) - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
-
Deprecated. The ContentHandlerFactory override is still experimental and the method signature is subject to change before Tika 2.0
- parse(TikaInputStream, ContentHandler, Metadata) - Method in class org.apache.tika.example.DirListParser
- parse(TikaInputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AbstractParser
-
Deprecated. Use the Parser.parse(TikaInputStream, ContentHandler, Metadata, ParseContext) method instead
- parse(TikaInputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AutoDetectParser
- parse(TikaInputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
-
Deprecated. This method will be removed in Apache Tika 1.0.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.DirListParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.EncryptedPrescriptionParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.LanguageDetectingParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.PickBestTextEncodingParser
-
Deprecated.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.apple.PListParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.AutoDetectParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
Delegates the call to the matching component parser.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CryptoParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ctakes.CTAKESParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
Looks up the delegate parser from the parsing context and delegates the parse operation to it.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dgn.DGN8Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dwg.DWGReadParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.EmptyParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ErrorParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.executable.UniversalExecutableParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geopkg.GeoPkgParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.JSoupParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.http.HttpParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.AbstractImageParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.JXLParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.indesign.IDMLParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.activemime.ActiveMimeParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.chm.ChmParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
Extracts owner from MS temp file
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
-
Extracts properties and text from an MS Document input stream
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
-
Extracts properties and text from an MS Document input stream
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.pst.OutlookPSTParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.pst.PSTMailItemParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.rtf.RTFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
-
Extracts properties and text from an MS Document input stream
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mif.MIFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
-
Processes the given Stream through one or more parsers, resetting things between parsers as requested by policy.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.NetworkParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ogg.FlacParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ogg.OggParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ogg.OpusParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ogg.SpeexParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ogg.TheoraParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ogg.VorbisParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.Parser
-
Parses a document stream into a sequence of XHTML SAX events.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
-
Delegates the method call to the decorated parser.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserPostProcessor
-
Forwards the call to the delegated parser and post-processes the results as described above.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.SevenZParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.UnrarParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.ZipParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.RegexCaptureParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.sqlite3.SQLite3Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.tmx.TMXParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
-
Starts AWS Transcribe Job with language specification.
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wacz.WACZParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.warc.WARCParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
- parse(TikaInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
- parse(MediaType, String, String, String, String) - Static method in class org.apache.tika.detect.MagicDetector
- parse(DataElementPackage) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStoreParser
- parse(Parser, Logger, String, TikaInputStream, ContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Use this to call a parser and unify exception handling.
- parse(FetchEmitTuple) - Method in class org.apache.tika.pipes.core.PipesParser
- parse(FetchEmitTuple) - Method in class org.apache.tika.server.client.TikaClient
- parse(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
- PARSE - Enum constant in enum class org.apache.tika.server.core.ServerStatus.TASK
- PARSE_CONTEXT - Static variable in class org.apache.tika.serialization.serdes.ParseContextSerializer
- PARSE_CONTEXT_JSON_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseRequest
- PARSE_ERROR_DESCRIPTION - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- PARSE_ERROR_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- PARSE_EXCEPTION_DESCRIPTION - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- PARSE_EXCEPTION_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- PARSE_EXCEPTION_NO_EMIT - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- PARSE_SUCCESS - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- PARSE_SUCCESS_WITH_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- PARSE_TIME_MILLIS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- parseAssay(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
- parseBodyToHTML() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting just the body as HTML, without the head part, as a string
- parseContext - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- parseContext - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- ParseContext - Class in org.apache.tika.parser
-
Parse context.
- ParseContext() - Constructor for class org.apache.tika.parser.ParseContext
- ParseContextConfig - Class in org.apache.tika.config
-
Facade for accessing runtime configuration from ParseContext's jsonConfigs.
- ParseContextConfig() - Constructor for class org.apache.tika.config.ParseContextConfig
- ParseContextDeserializer - Class in org.apache.tika.serialization.serdes
-
Deserializes ParseContext from JSON.
- ParseContextDeserializer() - Constructor for class org.apache.tika.serialization.serdes.ParseContextDeserializer
- ParseContextSerializer - Class in org.apache.tika.serialization.serdes
-
Serializes ParseContext to JSON.
- ParseContextSerializer() - Constructor for class org.apache.tika.serialization.serdes.ParseContextSerializer
- ParseContextUtils - Class in org.apache.tika.serialization
-
Utility methods for working with ParseContext objects in JSON-based configurations.
- ParseContextUtils() - Constructor for class org.apache.tika.serialization.ParseContextUtils
- parseDateLenient(String) - Static method in class org.apache.tika.parser.mailcommons.MailDateParser
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.DeleteFetcherReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.FetchAndParseReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.FetchAndParseRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.GetFetcherReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.GetFetcherRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.ListFetchersReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.ListFetchersRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.SaveFetcherReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.SaveFetcherRequest
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseDelimitedFrom(InputStream) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherRequest
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseDelimitedFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseELF(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
Parses a Unix ELF file
- parseEmbedded(TikaInputStream, ContentHandler, Metadata, boolean) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- parseEmbedded(TikaInputStream, ContentHandler, Metadata, ParseContext, boolean) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractor
-
Processes the supplied embedded resource, calling the delegating parser with the appropriate details.
- parseEmbedded(TikaInputStream, ContentHandler, Metadata, ParseContext, boolean) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- parseEmbedded(TikaInputStream, ContentHandler, Metadata, ParseContext, boolean) - Method in class org.apache.tika.pipes.core.extractor.UnpackExtractor
- parseEmbeddedDocumentsConcatenate(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of parsing documents with embedded files using CONCATENATE mode (legacy).
- parseEmbeddedDocumentsRmeta(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of parsing documents with embedded files using RMETA mode.
- parseEmbeddedExample() - Method in class org.apache.tika.example.ParsingExample
-
This example shows how to extract content from the outer document and all embedded documents.
- parseExample() - Method in class org.apache.tika.example.ParsingExample
-
Example of how to use Tika to parse a file when you do not know its file type ahead of time.
- parseExternalRefFromInstrText(String, StringBuilder) - Static method in class org.apache.tika.parser.microsoft.ooxml.FieldCodeParser
-
Parses URLs from instrText field codes that reference external resources.
- parseFileAllContent(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of parsing a file and getting ALL content (container + embedded documents).
- parseFileBasic(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Basic example of parsing a file using PipesForkParser with default settings.
- parseFileInputStream(String) - Static method in class org.apache.tika.example.TIAParsingExample
- parseFrom(byte[]) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(byte[]) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(byte[]) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(byte[]) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(byte[]) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(byte[]) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(byte[]) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(byte[]) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(byte[]) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(byte[]) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(byte[]) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(byte[], ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(ByteString) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(ByteString) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(ByteString) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(ByteString) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(ByteString) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(ByteString) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(ByteString) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(ByteString) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(ByteString) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(ByteString) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(ByteString, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(CodedInputStream) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(CodedInputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(InputStream) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(InputStream) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(InputStream) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(InputStream) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(InputStream) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(InputStream) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(InputStream) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(InputStream) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(InputStream) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(InputStream) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(InputStream, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(ByteBuffer) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.DeleteFetcherRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.FetchAndParseRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.GetFetcherRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.GetPipesIteratorRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.ListFetchersRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.SaveFetcherRequest
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorReply
- parseFrom(ByteBuffer, ExtensionRegistryLite) - Static method in class org.apache.tika.SavePipesIteratorRequest
- parseHandlerType(String, BasicContentHandlerFactory.HANDLER_TYPE) - Static method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Tries to parse string into handler type.
- parseHeaders(String) - Static method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- parseHeif(InputStream) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parseHTML(String, Set<String>) - Static method in class org.apache.tika.eval.core.util.ContentTagParser
- parseHyperlinkFromInstrText(String) - Static method in class org.apache.tika.parser.microsoft.ooxml.FieldCodeParser
-
Parses a HYPERLINK URL from instrText field code content.
- parseInline(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.rtf.RTFParser
-
This bypasses wrapping the handler for inline parsing (in at least the OutlookExtractor).
- parseInputStream(InputStream) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of parsing from an InputStream.
- parseInvestigation(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
- parseInvestigation(InputStream, XHTMLContentHandler, Metadata, ParseContext, String) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
- parseJpeg(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parseMachO(XHTMLContentHandler, EmbeddedDocumentExtractor, Metadata, InputStream, byte[], ParseContext) - Method in class org.apache.tika.parser.executable.UniversalExecutableParser
-
Parses a Mach-O Universal file
- parseMachO(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
Parses a Mach-O file
- parseManyFiles(List<Path>) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of reusing PipesForkParser for multiple documents.
- parseMetadata(TikaInputStream, Metadata, MultivaluedMap<String, String>, UriInfo) - Method in class org.apache.tika.server.core.resource.MetadataResource
- parseMetadata(TikaInputStream, Metadata, MultivaluedMap<String, String>, ServerHandlerConfig) - Static method in class org.apache.tika.server.core.resource.RecursiveMetadataResource
-
Parses content and returns metadata list.
- parseMode() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns the value of the parseMode record component.
- ParseMode - Enum Class in org.apache.tika.pipes.api
-
Controls how embedded documents are handled during parsing.
- parseNoEmbeddedExample() - Method in class org.apache.tika.example.ParsingExample
-
If you don't want content from embedded documents, send in a ParseContext that contains an EmptyParser.
- parseObject(String, ParsePosition) - Method in class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
- parseOnePartToHTML() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting just one part of the document's body, as HTML as a string, excluding the rest
- parseOOXMLRels(InputStream) - Static method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
- parsePE(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
Parses a DOS or Windows PE file
- parser() - Static method in class org.apache.tika.DeleteFetcherReply
- parser() - Static method in class org.apache.tika.DeleteFetcherRequest
- parser() - Static method in class org.apache.tika.DeletePipesIteratorReply
- parser() - Static method in class org.apache.tika.DeletePipesIteratorRequest
- parser() - Static method in class org.apache.tika.FetchAndParseReply
- parser() - Static method in class org.apache.tika.FetchAndParseRequest
- parser() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- parser() - Static method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- parser() - Static method in class org.apache.tika.GetFetcherReply
- parser() - Static method in class org.apache.tika.GetFetcherRequest
- parser() - Static method in class org.apache.tika.GetPipesIteratorReply
- parser() - Static method in class org.apache.tika.GetPipesIteratorRequest
- parser() - Static method in class org.apache.tika.ListFetchersReply
- parser() - Static method in class org.apache.tika.ListFetchersRequest
- parser() - Static method in class org.apache.tika.SaveFetcherReply
- parser() - Static method in class org.apache.tika.SaveFetcherRequest
- parser() - Static method in class org.apache.tika.SavePipesIteratorReply
- parser() - Static method in class org.apache.tika.SavePipesIteratorRequest
- Parser - Interface in org.apache.tika.parser
-
Tika parser interface.
- parseRawExif(byte[]) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parseRawExif(InputStream, int, boolean) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parseRawXMP(byte[]) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parserCompleted(Parser, Metadata, ContentHandler, ParseContext, Exception) - Method in class org.apache.tika.example.PickBestTextEncodingParser
-
Deprecated.
- parserCompleted(Parser, Metadata, ContentHandler, ParseContext, Exception) - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
-
Used to notify implementations that a Parser has Finished or Failed, and to allow them to decide to continue or abort further parsing
- parserCompleted(Parser, Metadata, ContentHandler, ParseContext, Exception) - Method in class org.apache.tika.parser.multiple.FallbackParser
- parserCompleted(Parser, Metadata, ContentHandler, ParseContext, Exception) - Method in class org.apache.tika.parser.multiple.SupplementingParser
- ParserContainerExtractor - Class in org.apache.tika.extractor
-
An implementation of ContainerExtractor powered by the regular Parser API.
- ParserContainerExtractor() - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
- ParserContainerExtractor(Parser, Detector) - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
- ParserDecoration(List<String>, List<String>) - Constructor for class org.apache.tika.config.loader.FrameworkConfig.ParserDecoration
- ParserDecorator - Class in org.apache.tika.parser
-
Decorator base class for the Parser interface.
- ParserDecorator(Parser) - Constructor for class org.apache.tika.parser.ParserDecorator
-
Creates a decorator for the given parser.
- ParserDecorator.MimeFilteringDecorator - Class in org.apache.tika.parser
-
A ParserDecorator that filters supported mime types.
- ParseRecord - Class in org.apache.tika.parser
-
Use this class to store exceptions, warnings and other information during the parse.
- ParseRecord() - Constructor for class org.apache.tika.parser.ParseRecord
- parseRFC5322(String) - Static method in class org.apache.tika.parser.mailcommons.MailDateParser
- ParserLoader - Class in org.apache.tika.config.loader
-
Loader for parsers with support for: SPI fallback via "default-parser" marker with exclusions; mime type filtering decorations (_mime-include, _mime-exclude); EncodingDetector and Renderer dependency injection.
- ParserLoader() - Constructor for class org.apache.tika.config.loader.ParserLoader
- ParserPostProcessor - Class in org.apache.tika.parser
-
Parser decorator that post-processes the results from a decorated parser.
- ParserPostProcessor(Parser) - Constructor for class org.apache.tika.parser.ParserPostProcessor
-
Creates a post-processing decorator for the given parser.
- parserPrepare(Parser, Metadata, ParseContext) - Method in class org.apache.tika.example.PickBestTextEncodingParser
-
Deprecated.
- parserPrepare(Parser, Metadata, ParseContext) - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
-
Used to allow implementations to prepare or change things before parsing occurs
- ParserUtils - Class in org.apache.tika.utils
-
Helper util methods for Parsers themselves.
- ParserUtils() - Constructor for class org.apache.tika.utils.ParserUtils
- parseSAX(InputStream, ContentHandler, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified SAXParser.
- parseSAX(Reader, ContentHandler, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified SAXParser.
- parseStreamObject(StreamObjectHeaderStart, byte[], AtomicInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Parse stream object from byte array.
- parseString(String, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.JSoupParser
- parseStudy(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
- parseSuffixes(String) - Static method in class org.apache.tika.eval.app.io.ExtractReader
- parseSummaries(DirectoryNode) - Method in class org.apache.tika.parser.microsoft.SummaryExtractor
- parseSummaries(POIFSFileSystem) - Method in class org.apache.tika.parser.microsoft.SummaryExtractor
- parseTiff(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parseTikaInputStream(String) - Static method in class org.apache.tika.example.TIAParsingExample
- parseToHTML() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting the contents as HTML, as a string.
- parseToPlainText() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting the plain text of the contents.
- parseToPlainTextChunks() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting the plain text in chunks, with each chunk of no more than a certain maximum size
- parseToReaderExample() - Static method in class org.apache.tika.example.TIAParsingExample
- parseToString(File) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parseToString(InputStream) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(InputStream, Metadata, int) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(URL) - Method in class org.apache.tika.Tika
-
Parses the resource at the given URL and returns the extracted text content.
- parseToString(Path) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parseToStringExample() - Method in class org.apache.tika.example.ParsingExample
-
Example of how to use Tika's parseToString method to parse the content of a file, and return any text found.
- parseToStringExample() - Static method in class org.apache.tika.example.TIAParsingExample
- parseUnpack(TikaInputStream, Metadata, ParseContext, boolean) - Method in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Parses content using UNPACK mode and returns a path to the zip file containing extracted embedded documents.
- parseURLStream(String) - Static method in class org.apache.tika.example.TIAParsingExample
- parseUsingAutoDetect(String, TikaLoader, Metadata) - Static method in class org.apache.tika.example.MyFirstTika
- parseUsingComponents(String, TikaLoader, Metadata) - Static method in class org.apache.tika.example.MyFirstTika
- parseWebP(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
- parseWithContentTypeHint(Path, String) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of providing initial metadata hints.
- parseWithCustomConfig(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of parsing with custom configuration.
- parseWithErrorHandling(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of proper error handling with PipesForkParser.
- parseWithMetadata(Path) - Method in class org.apache.tika.example.PipesForkParserExample
-
Example of parsing with metadata extraction.
- parseWithPipes(TikaInputStream, Metadata, ParseContext, ParseMode) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Parses using pipes-based parsing with process isolation.
- parseWord6(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
- parseWord6(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
- parseXML(String, Set<String>) - Static method in class org.apache.tika.eval.core.util.ContentTagParser
- ParsingEmbeddedDocumentExtractor - Class in org.apache.tika.extractor
-
Helper class for parsers of package archives or other compound document formats that support embedded or attached component documents.
- ParsingEmbeddedDocumentExtractor(ParseContext) - Constructor for class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- ParsingExample - Class in org.apache.tika.example
- ParsingExample() - Constructor for class org.apache.tika.example.ParsingExample
- ParsingIntent - Class in org.apache.tika.parser
-
Marker class to indicate parsing intent in ParseContext.
- ParsingReader - Class in org.apache.tika.parser
-
Reader for the text content from a given binary stream.
- ParsingReader(File) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given file.
- ParsingReader(InputStream) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream.
- ParsingReader(InputStream, String) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream with the given name.
- ParsingReader(Path) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the file at the given path.
- ParsingReader(Parser, InputStream, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream with the given document metadata.
- ParsingReader(Parser, InputStream, Metadata, ParseContext, Executor) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream with the given document metadata.
- PASSBACK_ALL - Enum constant in enum class org.apache.tika.pipes.core.EmitStrategy
- PassbackFilter - Class in org.apache.tika.pipes.core
-
Filter/Select some of the emitted output and pass it back to the client parser.
- PassbackFilter() - Constructor for class org.apache.tika.pipes.core.PassbackFilter
- password() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the password record component.
- password() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the password record component.
- password() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the password record component.
- password() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the password record component.
- PasswordProvider - Interface in org.apache.tika.parser
-
Interface for providing a password to a Parser for handling Encrypted and Password Protected Documents.
- path - Variable in class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
- path() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns the value of the path record component.
- pathStyleAccessEnabled() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the pathStyleAccessEnabled record component.
- PATTERN_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- patterns - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- payload() - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Returns the value of the payload record component.
- PDDocumentRenderer - Interface in org.apache.tika.renderer.pdf.pdfbox
-
Stub interface for the PDFParser to use to figure out if it needs to pass on the PDDocument or create a temp file to be used by a file-based renderer down the road.
- PDF - Interface in org.apache.tika.metadata
-
PDF properties collection.
- PDF_DOC_INFO_CUSTOM_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDF_DOC_INFO_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
Prefix to be used for properties that record what was stored in the docinfo section (as opposed to XMP)
- PDF_EXTENSION_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDF_INCREMENTAL_UPDATE_COUNT - Static variable in interface org.apache.tika.metadata.PDF
-
Incremental updates as extracted by the StartXRefScanner.
- PDF_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDF_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDF_VERSION - Static variable in interface org.apache.tika.metadata.XMPPDF
- PDFA_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDFA_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDFAID_CONFORMANCE - Static variable in interface org.apache.tika.metadata.PDF
- PDFAID_PART - Static variable in interface org.apache.tika.metadata.PDF
- PDFAID_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
- PDFBOX_IMAGE_WRITING_TIME_MS - Static variable in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
-
This is the amount of time it takes for PDFBox/java to write the image after it has been rendered into a BufferedImage.
- PDFBOX_RENDERING_TIME_MS - Static variable in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
-
This is the amount of time it takes for PDFBox to render the page to a BufferedImage
- PDFBoxRenderer - Class in org.apache.tika.renderer.pdf.pdfbox
- PDFBoxRenderer() - Constructor for class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- PDFMarkedContent2XHTML - Class in org.apache.tika.parser.pdf
-
This was added in Tika 1.24 as an alpha version of a text extractor that builds the text from the marked text tree and includes/normalizes some of the structural tags.
- PDFParser - Class in org.apache.tika.parser.pdf
-
PDF parser.
- PDFParser() - Constructor for class org.apache.tika.parser.pdf.PDFParser
- PDFParser(JsonConfig) - Constructor for class org.apache.tika.parser.pdf.PDFParser
-
Constructor for JSON configuration.
- PDFParser(PDFParserConfig) - Constructor for class org.apache.tika.parser.pdf.PDFParser
-
Constructor with explicit PDFParserConfig object.
- pdfParserConfig - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- PDFParserConfig - Class in org.apache.tika.parser.pdf
-
Config for PDFParser.
- PDFParserConfig() - Constructor for class org.apache.tika.parser.pdf.PDFParserConfig
- PDFParserConfig.AccessCheckMode - Enum Class in org.apache.tika.parser.pdf
-
Mode for checking document access permissions.
- PDFParserConfig.IMAGE_STRATEGY - Enum Class in org.apache.tika.parser.pdf
- PDFRenderingState - Class in org.apache.tika.renderer.pdf.pdfbox
- PDFRenderingState(TikaInputStream) - Constructor for class org.apache.tika.renderer.pdf.pdfbox.PDFRenderingState
- PDFUAID_PART - Static variable in interface org.apache.tika.metadata.PDF
- PDFVT_MODIFIED - Static variable in interface org.apache.tika.metadata.PDF
- PDFVT_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDFX_CONFORMANCE - Static variable in interface org.apache.tika.metadata.PDF
- PDFX_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDFXID_VERSION - Static variable in interface org.apache.tika.metadata.PDF
- PDMetadataExtractor - Class in org.apache.tika.parser.pdf
- PDMetadataExtractor() - Constructor for class org.apache.tika.parser.pdf.PDMetadataExtractor
- peek(byte[]) - Method in class org.apache.tika.io.TikaInputStream
- peekBits(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- PERCENT - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- PERCENT_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- PerClientServerManager - Class in org.apache.tika.pipes.core
-
Manages a dedicated PipesServer process for a single PipesClient.
- PerClientServerManager(PipesConfig, Path, int) - Constructor for class org.apache.tika.pipes.core.PerClientServerManager
- PERSON - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a person the content of the item is about.
- PERSON - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- PERSON_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- PERSON_RELATION - Static variable in class org.apache.tika.parser.microsoft.ooxml.OPCPackageWrapper
- Pharmacy - Class in org.apache.tika.example
- Pharmacy() - Constructor for class org.apache.tika.example.Pharmacy
- PhoneExtractingContentHandler - Class in org.apache.tika.sax
-
Class used to extract phone numbers while parsing.
- PhoneExtractingContentHandler() - Constructor for class org.apache.tika.sax.PhoneExtractingContentHandler
-
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
- PhoneExtractingContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.PhoneExtractingContentHandler
-
Creates a decorator for the given SAX event handler and Metadata object.
- Photoshop - Interface in org.apache.tika.metadata
-
XMP Photoshop metadata schema.
- PickBestTextEncodingParser - Class in org.apache.tika.example
-
Deprecated. Currently not suitable for real use, more a demo / prototype!
- PickBestTextEncodingParser(MediaTypeRegistry, String[]) - Constructor for class org.apache.tika.example.PickBestTextEncodingParser
-
Deprecated.
- PickBestTextEncodingParser.CharsetContentHandlerFactory - Class in org.apache.tika.example
-
Deprecated.
- PickBestTextEncodingParser.CharsetTester - Class in org.apache.tika.example
-
Deprecated.
- PictureContainer - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PictureHeight - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- PictureWidth - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- ping() - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- PING - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- PIPES_RESULT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- PipesClient - Class in org.apache.tika.pipes.core
-
The PipesClient is designed to be single-threaded.
- PipesClient(PipesConfig, Path) - Constructor for class org.apache.tika.pipes.core.PipesClient
-
Creates a PipesClient with its own dedicated server process.
- PipesClient(PipesConfig, ServerManager) - Constructor for class org.apache.tika.pipes.core.PipesClient
-
Creates a PipesClient with the given server manager.
- PipesConfig - Class in org.apache.tika.pipes.core
- PipesConfig() - Constructor for class org.apache.tika.pipes.core.PipesConfig
- PipesConfigOverride(int, long, int, List<String>) - Constructor for class org.apache.tika.pipes.core.config.ConfigOverrides.PipesConfigOverride
- PipesException - Exception in org.apache.tika.pipes.core
-
Fatal exception that means that something went seriously wrong.
- PipesException(String) - Constructor for exception org.apache.tika.pipes.core.PipesException
- PipesException(String, Throwable) - Constructor for exception org.apache.tika.pipes.core.PipesException
- PipesException(Throwable) - Constructor for exception org.apache.tika.pipes.core.PipesException
- PipesForkParser - Class in org.apache.tika.pipes.fork
-
A ForkParser implementation backed by PipesParser.
- PipesForkParser() - Constructor for class org.apache.tika.pipes.fork.PipesForkParser
-
Creates a new PipesForkParser with default configuration.
- PipesForkParser(PipesForkParserConfig) - Constructor for class org.apache.tika.pipes.fork.PipesForkParser
-
Creates a new PipesForkParser with the specified configuration.
- PipesForkParserConfig - Class in org.apache.tika.pipes.fork
-
Configuration for PipesForkParser.
- PipesForkParserConfig() - Constructor for class org.apache.tika.pipes.fork.PipesForkParserConfig
- PipesForkParserExample - Class in org.apache.tika.example
-
Examples of how to use the PipesForkParser to parse documents in a forked JVM process.
- PipesForkParserExample() - Constructor for class org.apache.tika.example.PipesForkParserExample
- PipesForkParserException - Exception in org.apache.tika.pipes.fork
-
Exception thrown when PipesForkParser encounters an application error.
- PipesForkParserException(PipesResult.RESULT_STATUS, String) - Constructor for exception org.apache.tika.pipes.fork.PipesForkParserException
-
Creates a new exception with the given status and message.
- PipesForkParserException(PipesResult.RESULT_STATUS, String, Throwable) - Constructor for exception org.apache.tika.pipes.fork.PipesForkParserException
-
Creates a new exception with the given status, message, and cause.
- PipesForkResult - Class in org.apache.tika.pipes.fork
-
Result from parsing a file with PipesForkParser.
- PipesForkResult(PipesResult) - Constructor for class org.apache.tika.pipes.fork.PipesForkResult
- PipesIterator - Interface in org.apache.tika.pipes.api.pipesiterator
- PipesIteratorBase - Class in org.apache.tika.pipes.pipesiterator
-
Abstract class that handles the testing for timeouts/thread safety issues.
- PipesIteratorBase(ExtensionConfig) - Constructor for class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- PipesIteratorConfig - Class in org.apache.tika.pipes.pipesiterator
-
Abstract base class for pipes iterator configurations.
- PipesIteratorConfig() - Constructor for class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- PipesIteratorFactory - Interface in org.apache.tika.pipes.api.pipesiterator
- PipesIteratorManager - Class in org.apache.tika.pipes.core.pipesiterator
-
Utility class to hold a single pipes iterator
- PipesIteratorManager() - Constructor for class org.apache.tika.pipes.core.pipesiterator.PipesIteratorManager
- PipesMessage - Record Class in org.apache.tika.pipes.core.protocol
-
Uniform framed message for the PipesClient/PipesServer IPC protocol.
- PipesMessage(PipesMessageType, byte[]) - Constructor for record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Creates an instance of a PipesMessage record class.
- PipesMessageType - Enum Class in org.apache.tika.pipes.core.protocol
-
Unified message types for the PipesClient/PipesServer IPC protocol.
- PipesParser - Class in org.apache.tika.pipes.core
- PipesParsingHelper - Class in org.apache.tika.server.core.resource
-
Helper class for pipes-based parsing in tika-server endpoints.
- PipesParsingHelper(PipesParser, PipesConfig, Path, Path) - Constructor for class org.apache.tika.server.core.resource.PipesParsingHelper
-
Creates a PipesParsingHelper.
- PipesParsingHelper.UnpackResult - Record Class in org.apache.tika.server.core.resource
-
Result of UNPACK parsing containing the zip file path and metadata.
- PipesReporter - Interface in org.apache.tika.pipes.api.reporter
-
This is called asynchronously by the AsyncProcessor.
- PipesReporterBase - Class in org.apache.tika.pipes.reporters
-
Base class that includes filtering by PipesResult.RESULT_STATUS
- PipesReporterBase(ExtensionConfig, Set<String>, Set<String>) - Constructor for class org.apache.tika.pipes.reporters.PipesReporterBase
- PipesReporterFactory - Interface in org.apache.tika.pipes.api.reporter
- PipesResource - Class in org.apache.tika.server.core.resource
- PipesResource(Path) - Constructor for class org.apache.tika.server.core.resource.PipesResource
- PipesResult - Record Class in org.apache.tika.pipes.api
- PipesResult(PipesResult.RESULT_STATUS) - Constructor for record class org.apache.tika.pipes.api.PipesResult
- PipesResult(PipesResult.RESULT_STATUS, String) - Constructor for record class org.apache.tika.pipes.api.PipesResult
- PipesResult(PipesResult.RESULT_STATUS, EmitData) - Constructor for record class org.apache.tika.pipes.api.PipesResult
- PipesResult(PipesResult.RESULT_STATUS, EmitData, String) - Constructor for record class org.apache.tika.pipes.api.PipesResult
-
Creates an instance of a PipesResult record class.
- PipesResult.CATEGORY - Enum Class in org.apache.tika.pipes.api
-
High-level categorization of result statuses.
- PipesResult.RESULT_STATUS - Enum Class in org.apache.tika.pipes.api
- PipesResultDeserializer - Class in org.apache.tika.pipes.core.serialization
- PipesResultDeserializer() - Constructor for class org.apache.tika.pipes.core.serialization.PipesResultDeserializer
- PipesResults - Class in org.apache.tika.pipes.core
- PipesResults() - Constructor for class org.apache.tika.pipes.core.PipesResults
- PipesResultSerializer - Class in org.apache.tika.pipes.core.serialization
- PipesResultSerializer() - Constructor for class org.apache.tika.pipes.core.serialization.PipesResultSerializer
- PipesServer - Class in org.apache.tika.pipes.core.server
-
This server is forked from the PipesClient.
- PipesServer(String, TikaLoader, PipesConfig, Socket, DataInputStream, DataOutputStream, MetadataFilter, ContentHandlerFactory, MetadataWriteLimiterFactory) - Constructor for class org.apache.tika.pipes.core.server.PipesServer
- Pkcs7Parser - Class in org.apache.tika.parser.crypto
-
Basic parser for PKCS7 data.
- Pkcs7Parser() - Constructor for class org.apache.tika.parser.crypto.Pkcs7Parser
- PLAIN_TEXT - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the text type, text/plain.
- PLATFORM - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM - Static variable in interface org.apache.tika.metadata.Zip
-
Platform that created the entry (0=MS-DOS, 3=Unix, etc.).
- PLATFORM_AIX - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_ARM - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_EMBEDDED - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_FREEBSD - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_HPUX - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_IRIX - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_LINUX - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_NETBSD - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_SOLARIS - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_SYSV - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_TRU64 - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLATFORM_WINDOWS - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PLIST - Static variable in class org.apache.tika.detect.apple.BPListDetector
- PListParser - Class in org.apache.tika.parser.apple
-
Parser for Apple's plist and bplist.
- PListParser() - Constructor for class org.apache.tika.parser.apple.PListParser
- PluginComponentLoader - Class in org.apache.tika.plugins
- PluginComponentLoader() - Constructor for class org.apache.tika.plugins.PluginComponentLoader
- pluginConfig - Variable in class org.apache.tika.plugins.AbstractTikaExtension
- pluginManager - Variable in class org.apache.tika.pipes.core.AbstractComponentManager
- PluginsWriter - Class in org.apache.tika.async.cli
- PluginsWriter(SimpleAsyncConfig, Path) - Constructor for class org.apache.tika.async.cli.PluginsWriter
- PLUS_VERSION - Static variable in interface org.apache.tika.metadata.IPTC
-
The version number of the PLUS standards in place at the time of the transaction.
- PMGL - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- PNG - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.ImageFormat
- POIFSContainerDetector - Class in org.apache.tika.detect.microsoft
-
A detector that works on a POIFS OLE2 document to figure out exactly what the file is.
- POIFSContainerDetector() - Constructor for class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Default constructor for SPI loading.
- POLARITY - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
- pop() - Method in class org.apache.tika.eval.core.tokens.TokenCountPriorityQueue
- popGroup() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Close the current group: pop and restore the parent state.
- PopplerRenderer - Class in org.apache.tika.renderer.pdf.poppler
-
Renderer that uses Poppler's pdftoppm command to convert PDF pages to PNG images.
- PopplerRenderer() - Constructor for class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- PortraitPage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- POSITION_BASE - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- post(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.AsyncResource
-
The client posts a json request.
- postConnection() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns the value of the postConnection record component.
- postConnectionSql() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the postConnectionSql record component.
- postHtml(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse multipart document with optional config, return HTML.
- postJson(String, String) - Method in class org.apache.tika.pipes.emitter.es.ESClient
- postJson(String, String) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
- postJson(String, String) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- postJson(String, String, Map<String, String>, int) - Method in class org.apache.tika.http.TikaHttpClient
-
POST a JSON body to url and return the response body as a string.
- postJson(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse multipart document with optional config, return JSON.
- postJson(HttpClient, String, byte[], boolean) - Static method in class org.apache.tika.client.HttpClientUtil
- postJson(HttpClient, String, String) - Static method in class org.apache.tika.client.HttpClientUtil
- postMarkdown(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse multipart document with optional config, return Markdown.
- postProcess(Parser, LoaderContext) - Method in class org.apache.tika.config.loader.ParserLoader
- postProcess(T, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Post-process a single component (e.g., inject dependencies).
- postProcessList(List<T>, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Post-process a list of components (e.g., inject dependencies).
- postRaw(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse multipart document with optional config, return XHTML output.
- postRmeta(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.PipesResource
-
The client posts a json request.
- postText(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse multipart document with optional config, return plain text.
- postVisitDirectory(Path, IOException) - Method in class org.apache.tika.parser.microsoft.libpst.EmailVisitor
- postXml(List<Attachment>, HttpHeaders) - Method in class org.apache.tika.server.core.resource.TikaResource
-
Parse multipart document with optional config, return XML.
- POWERPOINT - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- PPT - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft PowerPoint
- predict(double[]) - Method in class org.apache.tika.detect.NNTrainedModel
- predict(double[]) - Method in class org.apache.tika.detect.TrainedModel
- predict(float[]) - Method in class org.apache.tika.detect.NNTrainedModel
-
Given an unseen input vector of size m = (256 + 1) by n = 1, returns a prediction probability.
- predict(float[]) - Method in class org.apache.tika.detect.TrainedModel
- predict(int[]) - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Compute softmax probabilities for the given feature vector.
- predict(int[]) - Method in class org.apache.tika.ml.LinearModel
-
Compute softmax probabilities for the given feature vector.
- predictCalibratedLogits(int[]) - Method in class org.apache.tika.ml.LinearModel
-
Compute calibrated logits: (raw - classMean[c]) / classStd[c] for each class, if the model carries calibration statistics, else raw logits (no-op).
- Prediction - Class in org.apache.tika.ml
-
The result of a single-label classification from a LinearModel.
- Prediction(String, float, float) - Constructor for class org.apache.tika.ml.Prediction
-
Construct a prediction from a raw logit and its softmax probability.
- predictLogits(int[]) - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Compute raw logits (pre-softmax scores) for the given feature vector.
- predictLogits(int[]) - Method in class org.apache.tika.ml.LinearModel
-
Compute raw logits for the given feature vector (before softmax).
- predictLogitsDense(float[]) - Method in class org.apache.tika.ml.LinearModel
-
Compute logits for a dense float feature vector.
- preExtractPlugins(TikaJsonConfig) - Static method in class org.apache.tika.plugins.TikaPluginManager
-
Pre-extracts plugin zip files without loading them.
- prefix - Variable in class org.apache.tika.xmp.convert.Namespace
- prefix() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns the value of the prefix record component.
- prefix() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Returns the value of the prefix record component.
- prefix() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the prefix record component.
- PREFIX - Static variable in interface org.apache.tika.metadata.AccessPermissions
- PREFIX - Static variable in interface org.apache.tika.metadata.Database
- PREFIX - Static variable in interface org.apache.tika.metadata.FileSystem
- PREFIX - Static variable in interface org.apache.tika.metadata.MachineMetadata
- PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
- PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- PREFIX - Static variable in interface org.apache.tika.metadata.WARC
- PREFIX - Static variable in interface org.apache.tika.metadata.XMP
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPIdq
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPMM
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPPDF
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMP
-
The xmp prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPIdq
-
The xmpidq prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPMM
-
The xmpMM prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPRights
-
The xmpRights prefix followed by the colon delimiter
- PREFIX_DC - Static variable in interface org.apache.tika.metadata.DublinCore
- PREFIX_DC - Static variable in interface org.apache.tika.metadata.XMPDC
- PREFIX_DC_TERMS - Static variable in interface org.apache.tika.metadata.DublinCore
- PREFIX_DC_TERMS - Static variable in interface org.apache.tika.metadata.XMPDC
- PREFIX_DOC_META - Static variable in interface org.apache.tika.metadata.Office
- PREFIX_EXTERNAL_META - Static variable in interface org.apache.tika.metadata.ExternalProcess
- PREFIX_FONT_META - Static variable in interface org.apache.tika.metadata.Font
- PREFIX_HTML_META - Static variable in interface org.apache.tika.metadata.HTML
- PREFIX_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
- PREFIX_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
- PREFIX_MAPI_ATTACH_META - Static variable in interface org.apache.tika.metadata.MAPI
- PREFIX_MAPI_META - Static variable in interface org.apache.tika.metadata.MAPI
- PREFIX_MAPI_PROPERTY - Static variable in interface org.apache.tika.metadata.MAPI
- PREFIX_PHOTOSHOP - Static variable in interface org.apache.tika.metadata.Photoshop
- PREFIX_PLUS - Static variable in interface org.apache.tika.metadata.IPTC
- PREFIX_RTF_META - Static variable in interface org.apache.tika.metadata.RTFMetadata
- PREFIX_XMP_RIGHTS - Static variable in interface org.apache.tika.metadata.XMPRights
- preparePostHeaderMap(Attachment, HttpHeaders) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Prepares a multivalued map, combining attachment headers and request headers.
- preprocess(String) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Preprocessing: truncate, strip URLs/emails, NFC normalize.
- preprocessNoTruncate(String) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupFeatureExtractor
-
Preprocessing without the length truncation: strip URLs/emails and NFC-normalize.
- PrescriptionParser - Class in org.apache.tika.example
- PrescriptionParser() - Constructor for class org.apache.tika.example.PrescriptionParser
- PRESENTATION_FORMAT - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- PrettyMetadataKeyComparator - Class in org.apache.tika.serialization
- PrettyMetadataKeyComparator() - Constructor for class org.apache.tika.serialization.PrettyMetadataKeyComparator
- prettyPrint() - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Returns the value of the prettyPrint record component.
- preVisitDirectory(Path, BasicFileAttributes) - Method in class org.apache.tika.parser.microsoft.libpst.EmailVisitor
- PRINT_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document last printed?
- PRINT_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- priorExtensionFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- PRIORITIZED_MEDIA_LIST - Static variable in class org.apache.tika.server.core.ProduceTypeResourceComparator
-
The prioritized MediaType list.
- priority - Variable in class org.apache.tika.mime.MimeTypesReader
- priorMagicFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- priorMetaFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- PRIORTY - Static variable in interface org.apache.tika.metadata.MAPI
- ProbabilisticMimeDetectionSelector - Class in org.apache.tika.mime
-
Selector for combining different mime detection results based on probability
- ProbabilisticMimeDetectionSelector() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector(MimeTypes) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector(MimeTypes, ProbabilisticMimeDetectionSelector.Builder) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector(ProbabilisticMimeDetectionSelector.Builder) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
- ProbabilisticMimeDetectionSelector.Builder - Class in org.apache.tika.mime
-
Builder class for setting probability parameters.
- probeContentType(Path) - Method in class org.apache.tika.filetypedetector.TikaFileTypeDetector
- process(String) - Method in class org.apache.tika.cli.TikaCLI
- process(Path) - Static method in class org.apache.tika.example.GrabPhoneNumbersExample
- process(Path) - Static method in class org.apache.tika.example.StandardsExtractionExample
- process(Set<? extends TypeElement>, RoundEnvironment) - Method in class org.apache.tika.annotation.TikaComponentProcessor
- process(PDDocument, ContentHandler, ParseContext, Metadata, PDFParserConfig, Renderer) - Static method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
-
Converts the given PDF document (and related metadata) to a stream of XHTML SAX events sent to the given content handler.
- process(PackagePart, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFFeatureExtractor
- process(XWPFDocument, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFFeatureExtractor
- process(Metadata) - Method in class org.apache.tika.xmp.convert.AbstractConverter
- process(Metadata) - Method in class org.apache.tika.xmp.convert.GenericConverter
- process(Metadata) - Method in interface org.apache.tika.xmp.convert.ITikaToXMPConverter
-
Converts a Tika Metadata-object into an XMPMeta containing the useful properties.
- process(Metadata) - Method in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
- process(Metadata) - Method in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
- process(Metadata) - Method in class org.apache.tika.xmp.convert.OpenDocumentConverter
- process(Metadata) - Method in class org.apache.tika.xmp.convert.RTFConverter
- process(Metadata) - Method in class org.apache.tika.xmp.XMPMetadata
- process(Metadata, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Converts the Metadata information to XMP.
- process(FetchEmitTuple) - Method in class org.apache.tika.pipes.core.PipesClient
- PROCESS_CRASH - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Forked process crashed due to OOM, timeout, or other system failure - auto-restart
- processBox(String, byte[], long, Mp4Context) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
- processCommand(TikaInputStream) - Method in class org.apache.tika.parser.gdal.GDALParser
- processDrawings(PackagePart, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- processedInlineImages - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- processFileResource(FetchKey) - Method in class org.apache.tika.eval.app.ExtractComparer
- processFileResource(FetchKey) - Method in class org.apache.tika.eval.app.ExtractProfiler
- processFileResource(FetchKey) - Method in class org.apache.tika.eval.app.ProfilerBase
- processFolder(Path) - Static method in class org.apache.tika.example.GrabPhoneNumbersExample
- processFolder(Path) - Static method in class org.apache.tika.example.StandardsExtractionExample
- processImage(PDImage, int) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- processingInstruction(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- processingInstruction(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- processingInstruction(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
- processingInstruction(String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- processMessage(String) - Method in class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
- processPage(PDPage) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- processPages(PDPageTree) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- processResult(FileProcessResult, Metadata, boolean) - Static method in class org.apache.tika.detect.magika.MagikaDetector
- processResult(FileProcessResult, Metadata, boolean) - Static method in class org.apache.tika.detect.siegfried.SiegfriedDetector
- processSheet(TikaSheetContentsHandler, XSSFCommentsShim, XSSFStylesShim, XSSFSharedStringsShim, InputStream) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- processToken(RTFToken) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Process a single token to update internal state.
- processToken(RTFToken, RTFState, RTFGroupState) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFEmbeddedHandler
-
Process a token for embedded object/pict handling.
- ProcessUtils - Class in org.apache.tika.utils
- ProcessUtils() - Constructor for class org.apache.tika.utils.ProcessUtils
- PRODUCER - Static variable in interface org.apache.tika.metadata.PDF
- PRODUCER - Static variable in interface org.apache.tika.metadata.XMPPDF
- produces - Variable in class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
- ProduceTypeResourceComparator - Class in org.apache.tika.server.core
-
Resource comparator based on produce type.
- ProduceTypeResourceComparator() - Constructor for class org.apache.tika.server.core.ProduceTypeResourceComparator
-
Initiates the comparator.
- PRODUCT_INFO - Static variable in interface org.apache.tika.metadata.DWG
- PRODUCT_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Product type.
- profile() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the profile record component.
- PROFILE_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- ProfilerBase - Class in org.apache.tika.eval.app
- ProfilerBase(IDBWriter) - Constructor for class org.apache.tika.eval.app.ProfilerBase
- ProfilerBase.EXCEPTION_TYPE - Enum Class in org.apache.tika.eval.app
- ProfilerBase.PARSE_ERROR_TYPE - Enum Class in org.apache.tika.eval.app
-
If information was gathered from the log file about a parse error
- PROFILES_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
- PROFILES_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
- PROG_ID - Static variable in interface org.apache.tika.metadata.Office
-
Embedded files may have a "progID" associated with them, such as Word.Document.12 or AcroExch.Document.DC
- PROGRAM_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- PROJECT - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- PROJECT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- projectId() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Returns the value of the projectId record component.
- PROPER_NAME - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- PROPERTIES_FILE - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
- property(String, String) - Method in class org.apache.tika.sax.XMPContentHandler
- Property - Class in org.apache.tika.metadata
-
XMP property definition.
- PROPERTY - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- PROPERTY_GROUP_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
- PROPERTY_GROUP_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
- PROPERTY_RELEASE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier associated with each Property Release.
- PROPERTY_RELEASE_STATUS - Static variable in interface org.apache.tika.metadata.IPTC
-
Summarises the availability and scope of property releases authorizing usage of the properties appearing in the photograph.
- Property.PropertyType - Enum Class in org.apache.tika.metadata
- Property.ValueType - Enum Class in org.apache.tika.metadata
- propertyID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
- PropertyID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
This class is used to represent a PropertyID.
- PropertyID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
- propertySet - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- PropertySet - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
This class is used to represent a PropertySet.
- PropertySet - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains a child PropertySet structure in the PropertySet.rgData stream field of the parent PropertySet.
- PropertySet() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
- PropertySetObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
This class is used to represent the property set.
- PropertySetObject(ObjectGroupObjectDeclare, ObjectGroupObjectData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySetObject
-
Construct the PropertySetObject instance.
- PropertyType - Enum Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- PropertyTypeException - Exception in org.apache.tika.metadata
-
XMP property definition violation exception.
- PropertyTypeException(String) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PropertyTypeException(Property.PropertyType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PropertyTypeException(Property.PropertyType, Property.PropertyType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PropertyTypeException(Property.ValueType, Property.ValueType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
- PROTECTED_WORKSHEET - Static variable in interface org.apache.tika.metadata.Office
- ProtocolDesyncException - Exception in org.apache.tika.pipes.core.protocol
-
Thrown when the framing magic bytes do not match, indicating that the IPC stream is desynchronized and the connection is unsalvageable.
- ProtocolDesyncException(String) - Constructor for exception org.apache.tika.pipes.core.protocol.ProtocolDesyncException
- ProtocolError - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Protocol Error
- provider() - Static method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
ServiceLoader-compatible provider method.
- provider() - Static method in class org.apache.tika.ml.junkdetect.JunkDetector
-
ServiceLoader provider hook (Java 9+).
- PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the subregion of a country -- either called province or state or anything else -- the content is focussing on -- either the subregion shown in visual media or referenced by text or audio media.
- proxyHost() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the proxyHost record component.
- proxyHost() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the proxyHost record component.
- proxyHost() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the proxyHost record component.
- proxyHost() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the proxyHost record component.
- proxyPort() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the proxyPort record component.
- proxyPort() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the proxyPort record component.
- proxyPort() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the proxyPort record component.
- proxyPort() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the proxyPort record component.
- PRT_MIME_TYPE - Static variable in class org.apache.tika.parser.prt.PRTParser
- PrtArrayOfPropertyValues - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
The class is used to represent the prtArrayOfPropertyValues.
- PrtArrayOfPropertyValues() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
- PrtFourBytesOfLengthFollowedByData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
This class is used to represent the prtFourBytesOfLengthFollowedByData.
- PrtFourBytesOfLengthFollowedByData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
- PRTParser - Class in org.apache.tika.parser.prt
-
A basic text extracting parser for the CADKey PRT (CAD Drawing) format.
- PRTParser() - Constructor for class org.apache.tika.parser.prt.PRTParser
- PS_INTERNET_HEADERS - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PropertySetType
- PS_MAPI - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PropertySetType
- PS_PUBLIC_STRINGS - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PropertySetType
- PSDParser - Class in org.apache.tika.parser.image
-
Parser for the Adobe Photoshop PSD File Format.
- PSDParser() - Constructor for class org.apache.tika.parser.image.PSDParser
- PSDParser(JsonConfig) - Constructor for class org.apache.tika.parser.image.PSDParser
- PSDParser(PSDParser.PSDParserConfig) - Constructor for class org.apache.tika.parser.image.PSDParser
- PSDParser.PSDParserConfig - Class in org.apache.tika.parser.image
-
Configuration class for PSDParser.
- PSDParserConfig() - Constructor for class org.apache.tika.parser.image.PSDParser.PSDParserConfig
- PSETID_ADDRESS - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_AIR_SYNC - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_APPOINTMENT - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_ATTACHMENT - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_CALENDAR_ASSISTANT - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_COMMON - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_LOG - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_MEETING - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_MESSAGING - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_NOTE - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_POST_RSS - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_SHARING - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_TASK - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_UNIFIED_MESSAGING - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSETID_XML_EXTRACTED_ENTITIES - Enum constant in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
- PSM0_ORIENTATION - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- PSM0_ORIENTATION_CONFIDENCE - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- PSM0_PAGE_NUMBER - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- PSM0_ROTATE - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- PSM0_SCRIPT - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- PSM0_SCRIPT_CONFIDENCE - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- PST - Interface in org.apache.tika.metadata
- PST_MAIL_ITEM - Static variable in class org.apache.tika.parser.microsoft.pst.PSTMailItemParser
- PST_MAIL_ITEM_STRING - Static variable in class org.apache.tika.parser.microsoft.pst.PSTMailItemParser
- PST_PREFIX - Static variable in interface org.apache.tika.metadata.PST
- PSTEmailStreamTranslator - Class in org.apache.tika.extractor.microsoft
- PSTEmailStreamTranslator() - Constructor for class org.apache.tika.extractor.microsoft.PSTEmailStreamTranslator
- PSTMailItemParser - Class in org.apache.tika.parser.microsoft.pst
- PSTMailItemParser() - Constructor for class org.apache.tika.parser.microsoft.pst.PSTMailItemParser
- PUB - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Publisher
- PUBLISHER - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- PUBLISHER - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity responsible for making the resource available.
- PUBLISHER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- PUBLISHER - Static variable in interface org.apache.tika.metadata.XMPDC
-
An entity responsible for making the resource available.
- PULL_DOWN - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The sampling phase of film to be converted to video (pull-down)."
- pushGroup() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
-
Open a new group: push current state and create a child.
- put(String, ExtensionConfig) - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Stores a configuration.
- put(String, ExtensionConfig) - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- put(String, ExtensionConfig) - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- put(String, ExtensionConfig) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- putAllFields(Map<String, String>) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- putAllParams(Map<String, String>) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- PutChanges - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Put changes.
- PutChangesLockId - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Put changes lock id
- PutChangesRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
PutChanges Request
- PutChangesResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Put Changes Response
- PutChangesResponseSerialNumberReassign - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
PutChanges Response SerialNumberReassign
- PutChangesResponseSerialNumberReassignAll - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
PutChanges Response SerialNumber ReassignAll
- putFields(String, String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- putParams(String, String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- PutRawStorage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Put raw storage.
Q
- QP_7_8 - Static variable in class org.apache.tika.parser.wordperfect.QuattroProParser
- QP_9 - Static variable in class org.apache.tika.parser.wordperfect.QuattroProParser
- QuattroPro - Interface in org.apache.tika.metadata
-
QuattroPro properties collection.
- QUATTROPRO - Static variable in class org.apache.tika.detect.ole.MiscOLEDetector
-
Base QuattroPro mime
- QUATTROPRO_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.QuattroPro
- QuattroProParser - Class in org.apache.tika.parser.wordperfect
-
Parser for Corel QuattroPro documents (part of Corel WordPerfect Office Suite).
- QuattroProParser() - Constructor for class org.apache.tika.parser.wordperfect.QuattroProParser
- QueryAccess - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Query access.
- QueryChanges - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Query changes.
- QueryChangesDataConstraint - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Data Constraint
- QueryChangesFilter - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Query Changes Filter
- QueryChangesFilter - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Filter
- QueryChangesFilterCellID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Filter Cell ID
- QueryChangesFilterDataElementIDs - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
QueryChanges Filter DataElement IDs
- QueryChangesFilterDataElementType - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
QueryChanges Filter Data Element Type
- QueryChangesFilterFlags - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Filter Flags
- QueryChangesFilterHierarchy - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Filter Hierarchy
- QueryChangesFilterSchemaSpecific - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
QueryChanges Filter Schema Specific
- QueryChangesRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Query Changes Request
- QueryChangesRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
QueryChanges Request
- QueryChangesRequestArguments - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Request Arguments
- QueryChangesResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Response
- QueryChangesVersioning - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Changes Versioning
- QueryDataElementRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Query Data Element Request
- QueryDiagnosticStoreInfo - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Query diagnostic store info.
- QueryKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Query knowledge.
- QueryRawStorage - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Query raw storage.
R
- RangeFetcher - Interface in org.apache.tika.pipes.api.fetcher
-
This class extracts a range of bytes from a given fetch key.
- RarParser - Class in org.apache.tika.parser.pkg
-
Parser for Rar files.
- RarParser() - Constructor for class org.apache.tika.parser.pkg.RarParser
- RATING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- RATING - Static variable in interface org.apache.tika.metadata.XMP
-
A user-assigned rating for this file.
- RATIONAL - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- RAW_IMAGES - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.IMAGE_STRATEGY
-
This is the more modern version of PDFParserConfig.extractInlineImages
- RawTagIterator(int, int, int, int) - Constructor for class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
- RDCAnalysis - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
File data is passed to the RDC Analysis chunking method.
- RDCAnalysisChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
This class is used to process RDC analysis chunking
- RDCAnalysisChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.RDCAnalysisChunking
-
Initializes a new instance of the RDCAnalysisChunking class.
- RDF - Static variable in class org.apache.tika.sax.XMPContentHandler
-
The RDF namespace URI
- read() - Method in class org.apache.tika.io.BoundedInputStream
- read() - Method in class org.apache.tika.io.LookaheadInputStream
- read() - Method in class org.apache.tika.io.TailStream
-
This implementation adds the read byte to the internal tail buffer.
- read(byte[]) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's read(byte[]) method.
- read(byte[]) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the underlying stream and then adds the correct portion of the read buffer to the internal tail buffer.
- read(byte[], int, int) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's read(byte[], int, int) method.
- read(byte[], int, int) - Method in class org.apache.tika.io.LookaheadInputStream
- read(byte[], int, int) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the underlying stream and then adds the correct portion of the read buffer to the internal tail buffer.
- read(char[], int, int) - Method in class org.apache.tika.parser.ParsingReader
-
Reads parsed text from the pipe connected to the parsing thread.
- read(DataInputStream) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Reads one framed message from the stream.
- read(InputStream) - Method in class org.apache.tika.mime.MimeTypesReader
- read(Document) - Method in class org.apache.tika.mime.MimeTypesReader
- ReadAccessResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Read Access Response
- ReadAccessResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Read Access Response
- readByteFrequencies(InputStream) - Method in class org.apache.tika.detect.TrainedModelDetector
-
Read the inputstream and build a byte frequency histogram
- readBytes(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Reading the bytes specified by the byte length.
- readFully(InputStream, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- readFully(InputStream, int, boolean) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- readGuid() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Read a GUID from the current offset position and increase the bit offset by 128 bits.
- readGuid(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
-
This method is used to read the Guid from a byte array.
- ReadingOrderRTL - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- readInt16(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Read specified bit length content as an UInt16 type and increase the bit offset with the specified length.
- readInt32(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Read specified bit length content as an Int32 type and increase the bit offset with the specified length.
- readIntBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from an InputStream
- readIntLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from an InputStream
- readIntME(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a PDP-11 style Middle Endian int value from an InputStream
- readLong() - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
- readLongBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE long value from an InputStream
- readLongLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE long value from an InputStream
- readNBytes(byte[], int, int) - Method in class org.apache.tika.io.BoundedInputStream
- readNBytes(int) - Method in class org.apache.tika.io.BoundedInputStream
- readParseContext(JsonNode, ObjectMapper) - Static method in class org.apache.tika.serialization.serdes.ParseContextDeserializer
-
Deserializes a ParseContext from a JsonNode.
- readShortBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from an InputStream
- readShortLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from an InputStream
- readStringNumber() - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
-
This method is used to read a token by the StartXRefScanner.readLong() method.
- readUE7(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Gets the integer value that is stored in a UTF-8-like fashion, in Big Endian but with the high bit on each number indicating if it continues or not
- readUInt16(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
- readUInt32(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Read specified bit length content as an UInt32 type and increase the bit offset with the specified length.
- readUInt64(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Read specified bit length content as an UInt64 type and increase the bit offset.
- readUIntBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from an InputStream
- readUIntLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from an InputStream
- readUShortBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
- readUShortLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
- ready() - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- READY - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- REAL - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- REALIZATION - Static variable in interface org.apache.tika.metadata.ClimateForcast
- reallyEndDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
- RecentFiles - Class in org.apache.tika.example
-
Builds on top of the LuceneIndexer and the Metadata discussions in Chapter 6 to output an RSS (or RDF) feed of files crawled by the LuceneIndexer within the last N minutes.
- RecentFiles() - Constructor for class org.apache.tika.example.RecentFiles
- RECIPIENTS_STRING - Static variable in interface org.apache.tika.metadata.MAPI
- recognise(String) - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in interface org.apache.tika.parser.ner.NERecogniser
-
call for name recognition action from text
- recognise(String) - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
- recognise(String) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- recognise(String) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
- record(Chunk) - Method in class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
-
Called by the parser whenever a chunk is found.
- recordEmbeddedStreamException(Throwable, Metadata) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- recordException(Exception, ParseContext) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- recordException(Throwable, Metadata) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- recordParserDetails(String, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
- recordParserDetails(Parser, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
- recordParserFailure(Parser, Throwable, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
- RecursiveMetadataResource - Class in org.apache.tika.server.core.resource
- RecursiveMetadataResource() - Constructor for class org.apache.tika.server.core.resource.RecursiveMetadataResource
- RecursiveParserWrapper - Class in org.apache.tika.parser
-
This is a helper class that wraps a parser in a recursive handler.
- RecursiveParserWrapper(Parser) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
Initialize the wrapper with RecursiveParserWrapper.catchEmbeddedExceptions set to true as default.
- RecursiveParserWrapper(Parser, boolean) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
- recursiveParserWrapperExample() - Method in class org.apache.tika.example.ParsingExample
-
For documents that may contain embedded documents, it might be helpful to create a list of metadata objects, one for the container document and one for each embedded document.
- RecursiveParserWrapperHandler - Class in org.apache.tika.sax
-
This is the default implementation of AbstractRecursiveParserWrapperHandler.
- RecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.RecursiveParserWrapperHandler
-
Create a handler for recursive parsing.
- REF_EXTRACT_EXCEPTION_TYPES - Static variable in class org.apache.tika.eval.app.ProfilerBase
- REF_PAIR_NAMES - Static variable in class org.apache.tika.eval.app.ExtractComparer
- REF_PARSE_ERROR_TYPES - Static variable in class org.apache.tika.eval.app.ProfilerBase
- REF_PARSE_EXCEPTION_TYPES - Static variable in class org.apache.tika.eval.app.ProfilerBase
- referencedObjectID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- referencedObjectSpacesID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
- REFERENCES - Static variable in interface org.apache.tika.metadata.ClimateForcast
- regex - Variable in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter.Config
- RegexCaptureParser - Class in org.apache.tika.parser
- RegexCaptureParser() - Constructor for class org.apache.tika.parser.RegexCaptureParser
- RegexCaptureParser(JsonConfig) - Constructor for class org.apache.tika.parser.RegexCaptureParser
- RegexCaptureParser(RegexCaptureParserConfig) - Constructor for class org.apache.tika.parser.RegexCaptureParser
- RegexCaptureParserConfig - Class in org.apache.tika.parser
-
Configuration for RegexCaptureParser.
- RegexCaptureParserConfig() - Constructor for class org.apache.tika.parser.RegexCaptureParserConfig
- RegexNERecogniser - Class in org.apache.tika.parser.ner.regex
-
This class offers an implementation of NERecogniser based on Regular Expressions.
- RegexNERecogniser() - Constructor for class org.apache.tika.parser.ner.regex.RegexNERecogniser
- RegexNERecogniser(InputStream) - Constructor for class org.apache.tika.parser.ner.regex.RegexNERecogniser
- RegexUtils - Class in org.apache.tika.utils
-
Inspired by the Nutch code class OutlinkExtractor.
- RegexUtils() - Constructor for class org.apache.tika.utils.RegexUtils
- region() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the region record component.
- register() - Method in class org.apache.tika.serialization.ComponentConfig.Builder
-
Build and register with ComponentNameResolver.
- register(Process) - Method in class org.apache.tika.parser.AbstractExternalProcessParser
- registerAllExtensions(ExtensionRegistry) - Static method in class org.apache.tika.TikaProto
- registerAllExtensions(ExtensionRegistryLite) - Static method in class org.apache.tika.TikaProto
- registerComponentConfig(ComponentConfig<T>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Registers a ComponentConfig for top-level component loading.
- registerModels(MediaType, TrainedModel) - Method in class org.apache.tika.detect.TrainedModelDetector
- registerNamespace(String, String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Register a namespace URI with a suggested prefix.
- registerNamespaces(Set<Namespace>) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Registers a number of Namespace entries with XMPCore.
- registerRegistry(String, ComponentRegistry) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Registers a ComponentRegistry for name resolution.
- REGISTRY_ENTRY_CREATED_ITEM_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
A unique identifier created by a registry and applied by the creator of the item.
- REGISTRY_ENTRY_CREATED_ORGANISATION_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
An identifier for the registry which issued the corresponding Registry Image Id.
- REGULAR - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_FORMAT
-
Regular output - embedded files emitted individually or as simple zip
- RELATION - Static variable in interface org.apache.tika.metadata.DublinCore
-
A reference to a related resource.
- RELATION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- RELATION - Static variable in interface org.apache.tika.metadata.XMPDC
-
A reference to a related resource.
- RELATIVE_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The relative path to the file's peak audio file.
- release(String) - Method in class org.apache.tika.parser.AbstractExternalProcessParser
- RELEASE_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date the title was released."
- remove() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
- remove(Class) - Method in class org.apache.tika.detect.zip.StreamingDetectContext
- remove(String) - Method in class org.apache.tika.metadata.Metadata
-
Removes a metadata entry and all its associated values.
- remove(String) - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Removes a configuration by ID.
- remove(String) - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- remove(String) - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- remove(String) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- remove(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Removes the given property from the XMP data.
- remove(Property) - Method in class org.apache.tika.xmp.XMPMetadata
- RemoveByMimeMetadataFilter - Class in org.apache.tika.metadata.filter
-
This class removes the entire metadata object if the mime matches the mime filter.
- RemoveByMimeMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
- RemoveByMimeMetadataFilter(Set<String>) - Constructor for class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
- RemoveByMimeMetadataFilter(JsonConfig) - Constructor for class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
-
Constructor for JSON configuration.
- RemoveByMimeMetadataFilter(RemoveByMimeMetadataFilter.Config) - Constructor for class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
-
Constructor with explicit Config object.
- RemoveByMimeMetadataFilter.Config - Class in org.apache.tika.metadata.filter
-
Configuration class for JSON deserialization.
- removeCloseShield() - Method in class org.apache.tika.io.TikaInputStream
- removedService(ServiceReference, Object) - Method in class org.apache.tika.config.TikaActivator
- removeFields(String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Metadata fields from the parse output.
- removeGetFetcherReplies(int) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the Lists Fetchers service.
- removeParams(String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The configuration parameters.
- render(TikaInputStream, Metadata, ParseContext, RenderRequest...) - Method in class org.apache.tika.renderer.CompositeRenderer
- render(TikaInputStream, Metadata, ParseContext, RenderRequest...) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- render(TikaInputStream, Metadata, ParseContext, RenderRequest...) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
- render(TikaInputStream, Metadata, ParseContext, RenderRequest...) - Method in interface org.apache.tika.renderer.Renderer
- render(XHTMLContentHandler) - Method in interface org.apache.tika.parser.microsoft.Cell
-
Renders the content to the given XHTML SAX event stream.
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.CellDecorator
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.LinkedCell
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.NumberCell
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.TextCell
- RENDER_ALL - Static variable in class org.apache.tika.renderer.PageRangeRequest
- RENDER_PAGES_AT_PAGE_END - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.IMAGE_STRATEGY
-
This renders each page, one at a time, at the end of the page.
- RENDER_PAGES_BEFORE_PARSE - Enum constant in enum class org.apache.tika.parser.pdf.PDFParserConfig.IMAGE_STRATEGY
-
If you want the rendered images, and you don't care that there's markup in the xhtml handler per page then go with this option.
- RENDERED_BY - Static variable in interface org.apache.tika.metadata.Rendering
- RENDERED_MS - Static variable in interface org.apache.tika.metadata.Rendering
- Renderer - Interface in org.apache.tika.renderer
-
Interface for a renderer.
- Rendering - Interface in org.apache.tika.metadata
- RENDERING - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- RENDERING_PREFIX - Static variable in interface org.apache.tika.metadata.Rendering
- RenderingParser - Interface in org.apache.tika.parser
- RenderingState - Class in org.apache.tika.renderer
-
This should be used to track state for each file (embedded or otherwise).
- RenderingState() - Constructor for class org.apache.tika.renderer.RenderingState
- RenderingTracker - Class in org.apache.tika.renderer
-
Use this in the ParseContext to keep track of unique ids for rendered images in embedded docs.
- RenderingTracker() - Constructor for class org.apache.tika.renderer.RenderingTracker
- renderPage(PDFRenderer, int, int, Metadata, ParseContext) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- RenderRequest - Interface in org.apache.tika.renderer
-
Empty interface for requests to a renderer.
- RenderResult - Class in org.apache.tika.renderer
- RenderResult(RenderResult.STATUS, int, Object, Metadata) - Constructor for class org.apache.tika.renderer.RenderResult
- RenderResult.STATUS - Enum Class in org.apache.tika.renderer
- RenderResults - Class in org.apache.tika.renderer
- RenderResults(TemporaryResources) - Constructor for class org.apache.tika.renderer.RenderResults
- RENDITION_CLASS - Static variable in interface org.apache.tika.metadata.XMPMM
-
The rendition class name for this resource.
- RENDITION_LAYOUT - Static variable in interface org.apache.tika.metadata.Epub
-
This is set to "pre-paginated" if any itemref on the spine or the metadata has a "pre-paginated" value, "reflowable" otherwise.
- RENDITION_PARAMS - Static variable in interface org.apache.tika.metadata.XMPMM
-
Can be used to provide additional rendition parameters that are too complex or verbose to encode in xmpMM:RenditionClass
- repeat(char, int) - Static method in class org.apache.tika.utils.StringUtils
-
Returns padding using the specified delimiter repeated to a given length.
- repeat(String, int) - Static method in class org.apache.tika.utils.StringUtils
-
Repeat a String repeat times to form a new String.
- ReplacementCharset - Class in org.apache.tika.parser.html.charsetdetector.charsets
-
An implementation of the standard "replacement" charset defined by the W3C.
- ReplacementCharset() - Constructor for class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
- report(FetchEmitTuple, PipesResult, long) - Method in interface org.apache.tika.pipes.api.reporter.PipesReporter
- report(FetchEmitTuple, PipesResult, long) - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- report(FetchEmitTuple, PipesResult, long) - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- report(FetchEmitTuple, PipesResult, long) - Method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- report(FetchEmitTuple, PipesResult, long) - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- report(FetchEmitTuple, PipesResult, long) - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- report(FetchEmitTuple, PipesResult, long) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- report(TotalCountResult) - Method in interface org.apache.tika.pipes.api.reporter.PipesReporter
-
Make sure to override PipesReporter.supportsTotalCount() to return true.
- report(TotalCountResult) - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- report(TotalCountResult) - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- report(TotalCountResult) - Method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- report(TotalCountResult) - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- report(TotalCountResult) - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- report(TotalCountResult) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- Report - Class in org.apache.tika.eval.app.reports
-
This class represents a single report.
- Report() - Constructor for class org.apache.tika.eval.app.reports.Report
- ReporterManager - Class in org.apache.tika.pipes.core.reporter
-
Utility class to hold multiple reporters.
- ReporterManager() - Constructor for class org.apache.tika.pipes.core.reporter.ReporterManager
- reportSql() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the reportSql record component.
- reportUpdateMs() - Method in record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
-
Returns the value of the reportUpdateMs record component.
- reportVariables() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the reportVariables record component.
- reportWithinMs() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the reportWithinMs record component.
- Request - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
The Request
- Request - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
The Request
- RequestHashOptions - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Request Hash Options
- requestTimeoutMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the requestTimeoutMs record component.
- RequestTypes - Enum Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
-
The enumeration of request type.
- requiresAck() - Method in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
-
Returns true if the receiver must send an ACK after reading a message of this type.
- reserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
- reserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
- reserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
- RESERVED_FILENAME_CHARACTERS - Static variable in class org.apache.tika.io.FilenameUtils
-
Reserved characters
- RESERVED_NONZERO - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- reset() - Method in class org.apache.tika.io.BoundedInputStream
- reset() - Method in class org.apache.tika.io.LookaheadInputStream
- reset() - Method in class org.apache.tika.io.TailStream
-
This implementation restores this stream's state to the state when mark() was last called.
- reset() - Method in class org.apache.tika.io.TikaInputStream
- reset() - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- reset() - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- reset() - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- reset() - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
- reset() - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- reset() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Reset statistics about the current document being processed.
- reset() - Method in class org.apache.tika.language.detect.LanguageWriter
- reset() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
-
Sets the enumerator to its initial position, which is before the first bit in the byte array.
- reset() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- reset(XSSFWorkbook) - Method in class org.apache.tika.eval.app.reports.XLSXHREFFormatter
- reset(RTFTokenType) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- reset(ParseContext) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Reset statistics about the current document being processed, applying any per-document configuration present in context.
- reset(AnalysisEngine, JCas) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Resets cTAKES objects, if created.
- RESET_TABLE - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- resetAE(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Resets the AE (AnalysisEngine), releasing all resources held by the current AE.
- resetCAS(JCas) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Resets the CAS (Common Analysis System), emptying it of all content.
- RESOLUTION_HORIZONTAL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Horizontal resolution in pixels per unit."
- RESOLUTION_UNIT - Static variable in interface org.apache.tika.metadata.TIFF
-
"Units used for Horizontal and Vertical Resolutions."
- RESOLUTION_VERTICAL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Vertical resolution in pixels per unit."
- resolveAll(ParseContext, ClassLoader) - Static method in class org.apache.tika.serialization.ParseContextUtils
-
Resolves all JSON configs from ParseContext and adds them to the resolved cache.
- resolveClass(String, ClassLoader) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Resolves a friendly name or FQCN to a Class.
- resolveCodePage(int) - Static method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFCharsetMaps
-
Resolve an ANSI code page number to a Java Charset.
- resolveEntity(String, String) - Method in class org.apache.tika.mime.MimeTypesReader
- resolveEntity(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
Does not load any DTDs (these may be requested by the parser).
- resolveEntity(String, String) - Method in class org.apache.tika.sax.OfflineContentHandler
-
Returns an empty stream.
- RESOURCE_NAME_EXTENSION_INFERRED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Indicates that the file extension on the resource name was inferred by Tika (e.g., from content type detection) rather than provided by the original document.
- RESOURCE_NAME_KEY - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- resourceCount() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Returns the number of resources in this package.
- Response - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
The Response
- ResponseError - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Response Error
- Result(int, int) - Constructor for class org.apache.tika.ml.chardetect.HtmlByteStripper.Result
- Result(List<EncodingResult>, String) - Constructor for class org.apache.tika.detect.EncodingDetectorContext.Result
- ResultsReporter - Class in org.apache.tika.eval.app.reports
- ResultsReporter() - Constructor for class org.apache.tika.eval.app.reports.ResultsReporter
- retries() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the retries record component.
- retryBackoffMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the retryBackoffMs record component.
- reverse(byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Reverses the order of the given array.
- reverseByteOrder(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- REVISION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The revision number.
- revisionExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
- revisionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
- revisionManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
- RevisionManifest - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- RevisionManifest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Revision Manifest
- RevisionManifest() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
-
Initializes a new instance of the RevisionManifest class.
- RevisionManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- RevisionManifestDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Revision Manifest Data Element
- RevisionManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
-
Initializes a new instance of the RevisionManifestDataElementData class.
- revisionManifestObjectGroupReferences - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
- RevisionManifestObjectGroupReferences - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies revision manifest object group references, each followed by object group extended GUIDs
- RevisionManifestObjectGroupReferences - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Revision Manifest Object Group References
- RevisionManifestObjectGroupReferences() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
-
Initializes a new instance of the RevisionManifestObjectGroupReferences class.
- RevisionManifestObjectGroupReferences(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
-
Initializes a new instance of the RevisionManifestObjectGroupReferences class.
- RevisionManifestRootDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies a revision manifest root declare, each followed by root and object extended GUIDs
- RevisionManifestRootDeclare - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Revision Manifest Root Declare
- RevisionManifestRootDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
-
Initializes a new instance of the RevisionManifestRootDeclare class.
- revisionManifestRootDeclareList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
- revisionManifests - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- revisionMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
- revisionMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
- RevisionStoreObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The class is used to represent the revision store object.
- RevisionStoreObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
-
Initialize the class.
- RevisionStoreObjectGroup - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- RevisionStoreObjectGroup(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
- rewind() - Method in class org.apache.tika.io.TikaInputStream
-
Rewind the stream to the beginning.
- RFC_5322 - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- RFC_5322_AMPM_LENIENT - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- RFC_5322_LENIENT - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- RFC822Parser - Class in org.apache.tika.parser.mail
-
Uses apache-mime4j to parse emails.
- RFC822Parser() - Constructor for class org.apache.tika.parser.mail.RFC822Parser
- RFC822Parser(JsonConfig) - Constructor for class org.apache.tika.parser.mail.RFC822Parser
-
Constructor for JSON configuration.
- RFC822Parser(RFC822Parser.Config) - Constructor for class org.apache.tika.parser.mail.RFC822Parser
-
Constructor with explicit Config object.
- RFC822Parser.Config - Class in org.apache.tika.parser.mail
-
Configuration class for JSON deserialization.
- RGB - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.ImageType
- rgbReserved - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- rgData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
- RgOutlineIndentDistance - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- rgPrids - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
- RichEditTextLangID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- RichEditTextUnicode - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- RichTextContentHandler - Class in org.apache.tika.sax
-
Content handler for Rich Text; it extracts the alt attribute of XHTML <img/> tags and the name attribute of XHTML <a/> tags into the output.
- RichTextContentHandler(Writer) - Constructor for class org.apache.tika.sax.RichTextContentHandler
-
Creates a content handler that writes XHTML body character events to the given writer.
- RIGHTS - Static variable in interface org.apache.tika.metadata.DublinCore
-
Information about rights held in and over the resource.
- RIGHTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- RIGHTS - Static variable in interface org.apache.tika.metadata.XMPDC
-
Information about rights held in and over the resource.
- RIGHTS_USAGE_TERMS - Static variable in interface org.apache.tika.metadata.IPTC
-
The licensing parameters of the item expressed in free-text.
- rightShift(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- RMETA - Enum constant in enum class org.apache.tika.pipes.api.ParseMode
-
Each embedded file gets its own metadata object in a list.
- rollback(File) - Method in class org.apache.tika.example.RollbackSoftware
- RollbackSoftware - Class in org.apache.tika.example
-
Demonstrates Tika and its ability to sense symlinks.
- RollbackSoftware() - Constructor for class org.apache.tika.example.RollbackSoftware
- ROOT_ENTITY - Static variable in class org.apache.tika.parser.xml.XMLProfiler
- ROOT_XML_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- rootExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
- rootExGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
- RootExGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
- RootNodeEnd - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Root Node End
- RootNodeObjectBuilder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject.RootNodeObjectBuilder
- rotate(BufferedImage, double, int, int) - Static method in class org.apache.tika.parser.ocr.tess4j.ImageUtil
- ROW_COUNT - Static variable in interface org.apache.tika.metadata.Database
- RowCount - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- RTF - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.BODY_TYPES_PROCESSED
- RTF_PICT_META_PREFIX - Static variable in interface org.apache.tika.metadata.RTFMetadata
- RTFCharsetMaps - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
Shared charset maps for RTF parsing.
- RTFConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the RTF format.
- RTFConverter() - Constructor for class org.apache.tika.xmp.convert.RTFConverter
- RTFEmbeddedHandler - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
Handles embedded objects and pictures within the JFlex-based RTF token stream.
- RTFEmbeddedHandler(ContentHandler, ParseContext, int) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFEmbeddedHandler
- RTFEncapsulatedHTMLExtractor - Class in org.apache.tika.parser.microsoft.msg
-
Extracts the original HTML from an RTF document that contains encapsulated HTML (as indicated by the \fromhtml1 control word).
- RTFEncapsulatedHTMLExtractor() - Constructor for class org.apache.tika.parser.microsoft.msg.RTFEncapsulatedHTMLExtractor
- RTFGroupState - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
State associated with a single RTF group (\{ ... \}).
- RTFGroupState() - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFGroupState
-
Create a root group state with defaults.
- RTFGroupState(RTFGroupState) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFGroupState
-
Create a child group state inheriting from the parent.
- RTFHtmlDecapsulator - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
Extracts the original HTML from an RTF document that contains encapsulated HTML (as indicated by the \fromhtml1 control word), using a JFlex-based tokenizer and shared RTFState for font/codepage tracking.
- RTFHtmlDecapsulator(ContentHandler, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFHtmlDecapsulator
- RTFHtmlDecapsulator(ContentHandler, ParseContext, int) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFHtmlDecapsulator
- RTFMetadata - Interface in org.apache.tika.metadata
- RTFObjDataStreamParser - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
Parses OLE objdata from an RTF stream inline, byte by byte.
- RTFObjDataStreamParser(long) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFObjDataStreamParser
- RTFParser - Class in org.apache.tika.parser.microsoft.rtf
-
RTF parser
- RTFParser() - Constructor for class org.apache.tika.parser.microsoft.rtf.RTFParser
- RTFParser(JsonConfig) - Constructor for class org.apache.tika.parser.microsoft.rtf.RTFParser
- RTFParser(RTFParser.Config) - Constructor for class org.apache.tika.parser.microsoft.rtf.RTFParser
- RTFParser.Config - Class in org.apache.tika.parser.microsoft.rtf
-
Configuration class for JSON deserialization.
- RTFPictStreamParser - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
Streams decoded bytes from an RTF \pict group to a temp file.
- RTFPictStreamParser(long) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFPictStreamParser
- RTFState - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
Shared RTF parsing state: group stack, font table, codepage tracking, and unicode skip handling.
- RTFState() - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFState
- RTFToken - Class in org.apache.tika.parser.microsoft.rtf.jflex
-
A single token produced by the RTF tokenizer.
- RTFToken() - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- RTFTokenizer - Class in org.apache.tika.parser.microsoft.rtf.jflex
- RTFTokenizer(Reader) - Constructor for class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Creates a new scanner
- RTFTokenType - Enum Class in org.apache.tika.parser.microsoft.rtf.jflex
- RTG_PROPS - Static variable in class org.apache.tika.language.translate.impl.RTGTranslator
- RTG_TRANSLATE_URL_BASE - Static variable in class org.apache.tika.language.translate.impl.RTGTranslator
- RTGTranslator - Class in org.apache.tika.language.translate.impl
-
This translator is designed to work with a TCP-IP available RTG translation server, specifically the REST-based RTG server.
- RTGTranslator() - Constructor for class org.apache.tika.language.translate.impl.RTGTranslator
- run() - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- run() - Method in class org.apache.tika.pipes.core.server.ConnectionHandler
- run() - Method in class org.apache.tika.utils.StreamGobbler
- run(RunProperties, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- run(RunProperties, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- runAndGetOutput(String, String[], File) - Method in class org.apache.tika.language.translate.impl.ExternalTranslator
-
Run the given command and return the output written to standard out.
- RunProperties - Class in org.apache.tika.parser.microsoft.ooxml
-
WARNING: This class is mutable.
- RunProperties() - Constructor for class org.apache.tika.parser.microsoft.ooxml.RunProperties
- RUNTIME - Enum constant in enum class org.apache.tika.eval.app.ProfilerBase.EXCEPTION_TYPE
- RuntimeConfig() - Constructor for class org.apache.tika.detect.magika.MagikaDetector.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.detect.siegfried.SiegfriedDetector.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.inference.ImageEmbeddingConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.inference.InferenceConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.dwg.DWGParserConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.ocr.tess4j.Tess4JConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.strings.StringsConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig.RuntimeConfig
- RuntimeConfig() - Constructor for class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- RuntimeConfig(VLMOCRConfig) - Constructor for class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
-
Creates a RuntimeConfig that inherits the init-time allowRuntimePrompt setting and the maxTokens ceiling from the given parent config.
- RuntimeSAXException - Exception in org.apache.tika.exception
-
Use this to throw a SAXException in subclassed methods that don't throw SAXExceptions
- RuntimeSAXException(SAXException) - Constructor for exception org.apache.tika.exception.RuntimeSAXException
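The RuntimeSAXException entry above describes throwing a SAXException from methods that cannot declare it. The following is a minimal sketch of that wrap-and-unwrap pattern, not Tika's implementation: UncheckedSaxException is a hypothetical stand-in for an unchecked wrapper, and emit, forEachChecked, and describe are illustrative names that do not appear in this index. Only JDK classes are used.

```java
import java.util.List;
import org.xml.sax.SAXException;

public class UncheckedSaxSketch {

    // Hypothetical stand-in for an unchecked wrapper such as RuntimeSAXException.
    static final class UncheckedSaxException extends RuntimeException {
        UncheckedSaxException(SAXException cause) {
            super(cause);
        }
    }

    // A callback that may fail with the checked SAXException.
    public static void emit(String text) throws SAXException {
        if (text.isEmpty()) {
            throw new SAXException("empty text");
        }
    }

    // Iterable.forEach takes a Consumer, which cannot throw checked exceptions,
    // so the SAXException is wrapped inside the lambda and unwrapped outside it.
    public static void forEachChecked(List<String> items) throws SAXException {
        try {
            items.forEach(item -> {
                try {
                    emit(item);
                } catch (SAXException e) {
                    throw new UncheckedSaxException(e);
                }
            });
        } catch (UncheckedSaxException e) {
            // The cause is always the SAXException passed to the constructor.
            throw (SAXException) e.getCause();
        }
    }

    // Convenience wrapper that reports the outcome as a string.
    public static String describe(List<String> items) {
        try {
            forEachChecked(items);
            return "ok";
        } catch (SAXException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(List.of("a", "b")));
        System.out.println(describe(List.of("a", "")));
    }
}
```

The caller still sees a checked SAXException, so the unchecked wrapper stays an internal detail of the code that crosses the non-throwing boundary.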
S
- S - Enum constant in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
- S3Emitter - Class in org.apache.tika.pipes.emitter.s3
-
Emitter to write to an existing S3 bucket.
- S3EmitterConfig - Record Class in org.apache.tika.pipes.emitter.s3
- S3EmitterConfig(String, String, String, String, String, String, String, String, String, boolean, int, boolean) - Constructor for record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Creates an instance of a S3EmitterConfig record class.
- S3EmitterFactory - Class in org.apache.tika.pipes.emitter.s3
-
Factory for creating S3 emitters.
- S3EmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.s3.S3EmitterFactory
- S3Fetcher - Class in org.apache.tika.pipes.fetcher.s3
-
Fetches files from S3.
- S3FetcherConfig - Class in org.apache.tika.pipes.fetcher.s3.config
- S3FetcherConfig() - Constructor for class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- S3FetcherFactory - Class in org.apache.tika.pipes.fetcher.s3
-
Factory for creating S3 fetchers.
- S3FetcherFactory() - Constructor for class org.apache.tika.pipes.fetcher.s3.S3FetcherFactory
- S3PipesIterator - Class in org.apache.tika.pipes.iterator.s3
- S3PipesIteratorConfig - Class in org.apache.tika.pipes.iterator.s3
- S3PipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.s3.S3PipesIteratorConfig
- S3PipesIteratorFactory - Class in org.apache.tika.pipes.iterator.s3
-
Factory for creating S3 pipes iterators.
- S3PipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.s3.S3PipesIteratorFactory
- S3PipesPlugin - Class in org.apache.tika.pipes.plugin.s3
- S3PipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.s3.S3PipesPlugin
- SafeContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that makes sure that the character events (SafeContentHandler.characters(char[], int, int) or SafeContentHandler.ignorableWhitespace(char[], int, int)) passed to the decorated content handler contain only valid XML characters.
- SafeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.SafeContentHandler
- SafeContentHandler.Output - Interface in org.apache.tika.sax
-
Internal interface that allows both character and ignorable whitespace content to be filtered the same way.
- safeGetRelatedPart(PackagePart, PackageRelationship) - Static method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Safely resolves a related part, returning null if the part cannot be found instead of throwing IllegalArgumentException.
- SaltedNgramFeatureExtractor - Class in org.apache.tika.langdetect.charsoup
-
Feature extractor using positional salt (BOW/EOW/FULL_WORD) instead of sentinel characters in n-grams.
- SaltedNgramFeatureExtractor(int) - Constructor for class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- SaltedNgramFeatureExtractor(int, boolean) - Constructor for class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- SaltedNgramFeatureExtractor(int, boolean, boolean) - Constructor for class org.apache.tika.langdetect.charsoup.SaltedNgramFeatureExtractor
- salvageCopy(Path, Path) - Static method in class org.apache.tika.zip.utils.ZipSalvager
-
Streams a broken zip from a Path and rebuilds a valid zip file.
- salvageCopy(TikaInputStream, Path, boolean) - Static method in class org.apache.tika.zip.utils.ZipSalvager
-
Streams the broken zip and rebuilds a new zip that is at least a valid zip file.
- SALVAGED - Static variable in interface org.apache.tika.metadata.Zip
-
Set to true if the ZIP file was salvaged (rebuilt from a corrupt/truncated original).
- SAMPLES_PER_PIXEL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Number of components per pixel."
- SAS7BDATParser - Class in org.apache.tika.parser.sas
-
Processes the SAS7BDAT data columnar database file used by SAS and other similar languages.
- SAS7BDATParser() - Constructor for class org.apache.tika.parser.sas.SAS7BDATParser
- sasToken() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns the value of the sasToken record component.
- save(File) - Method in class org.apache.tika.config.loader.TikaLoader
-
Saves the current configuration to a JSON file (pretty-printed).
- save(OutputStream) - Method in class org.apache.tika.config.loader.TikaLoader
-
Saves the current configuration to an output stream (pretty-printed).
- save(OutputStream) - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Write the model in LDM2 binary format (includes feature flags).
- save(OutputStream) - Method in class org.apache.tika.ml.LinearModel
-
Write the model in LDM binary format.
- SAVE_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document last saved?
- saveComponent(ExtensionConfig) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Dynamically adds or updates a component configuration at runtime.
- saveEmitter(ExtensionConfig) - Method in class org.apache.tika.pipes.core.emitter.EmitterManager
-
Dynamically adds or updates an emitter configuration at runtime.
- saveFetcher(ExtensionConfig) - Method in class org.apache.tika.pipes.core.fetcher.FetcherManager
-
Dynamically adds or updates a fetcher configuration at runtime.
- saveFetcher(SaveFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Save a fetcher to the fetcher store.
- saveFetcher(SaveFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Save a fetcher to the fetcher store.
- saveFetcher(SaveFetcherRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Save a fetcher to the fetcher store.
- saveFetcher(SaveFetcherRequest, StreamObserver<SaveFetcherReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Save a fetcher to the fetcher store.
- saveFetcher(SaveFetcherRequest, StreamObserver<SaveFetcherReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Save a fetcher to the fetcher store.
- SaveFetcherReply - Class in org.apache.tika
-
Protobuf type tika.SaveFetcherReply
- SaveFetcherReply.Builder - Class in org.apache.tika
-
Protobuf type tika.SaveFetcherReply
- SaveFetcherReplyOrBuilder - Interface in org.apache.tika
- SaveFetcherRequest - Class in org.apache.tika
-
Protobuf type tika.SaveFetcherRequest
- SaveFetcherRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.SaveFetcherRequest
- SaveFetcherRequestOrBuilder - Interface in org.apache.tika
- savePipesIterator(SavePipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingStub
-
Save a pipes iterator to the iterator store.
- savePipesIterator(SavePipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaBlockingV2Stub
-
Save a pipes iterator to the iterator store.
- savePipesIterator(SavePipesIteratorRequest) - Method in class org.apache.tika.TikaGrpc.TikaFutureStub
-
Save a pipes iterator to the iterator store.
- savePipesIterator(SavePipesIteratorRequest, StreamObserver<SavePipesIteratorReply>) - Method in interface org.apache.tika.TikaGrpc.AsyncService
-
Save a pipes iterator to the iterator store.
- savePipesIterator(SavePipesIteratorRequest, StreamObserver<SavePipesIteratorReply>) - Method in class org.apache.tika.TikaGrpc.TikaStub
-
Save a pipes iterator to the iterator store.
- SavePipesIteratorReply - Class in org.apache.tika
-
Protobuf type tika.SavePipesIteratorReply
- SavePipesIteratorReply.Builder - Class in org.apache.tika
-
Protobuf type tika.SavePipesIteratorReply
- SavePipesIteratorReplyOrBuilder - Interface in org.apache.tika
- SavePipesIteratorRequest - Class in org.apache.tika
-
Protobuf type tika.SavePipesIteratorRequest
- SavePipesIteratorRequest.Builder - Class in org.apache.tika
-
Protobuf type tika.SavePipesIteratorRequest
- SavePipesIteratorRequestOrBuilder - Interface in org.apache.tika
- SAXOutputConfig - Class in org.apache.tika.sax
-
Configuration for SAX output behavior.
- SAXOutputConfig() - Constructor for class org.apache.tika.sax.SAXOutputConfig
- SBCS_LATIN_FAMILY - Static variable in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Single-byte Latin-family charsets that may decode byte-identically to windows-1252 on sparse probes (where the only high bytes present fall in positions the family agrees on — e.g. 0xE4='ä' in every member).
- SCALE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The musical scale used in the music.
- scan() - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
- SCENE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the scene."
- SCENE_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Describes the scene of a news content.
- SchemaGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
- SchemaRevisionInOrderToRead - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- SCHEME - Static variable in interface org.apache.tika.metadata.XMPIdq
-
A qualifier providing the name of the formal identification scheme used for an item in the xmp:Identifier array.
- score(byte[]) - Method in interface org.apache.tika.ml.chardetect.StatisticalSpecialist
-
Per-class logits for the probe, or null to decline (probe too short, hard-gated, etc.).
- score(byte[]) - Method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
StatisticalSpecialist entry point: raw per-class logits, or null for a probe too short to evaluate (fewer than 2 bytes) or missing a model.
- score(String) - Method in class org.apache.tika.ml.junkdetect.JunkDetector
-
Scores the given string for text quality.
- score(String) - Method in interface org.apache.tika.quality.TextQualityDetector
-
Scores the given string for text quality.
- score(TikaInputStream) - Method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Convenience: mark/reset the stream, read a probe, and score it.
- scoreA() - Method in class org.apache.tika.quality.TextQualityComparison
-
Quality score for candidate A.
- scoreB() - Method in class org.apache.tika.quality.TextQualityComparison
-
Quality score for candidate B.
- scoreBytes(byte[]) - Method in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Deprecated. Use Utf16SpecialistEncodingDetector.score(byte[]). Kept for existing tests.
- ScoredCandidate - Class in org.apache.tika.ml.chardetect
-
Pooled candidate from LogLinearCombiner: label, raw summed score (larger is better, not normalized), and the specialists that contributed.
- ScoredCandidate(String, float, Set<String>) - Constructor for class org.apache.tika.ml.chardetect.ScoredCandidate
- SCRIPT_BASIS - Static variable in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- SCRIPT_SOURCE - Static variable in interface org.apache.tika.metadata.HTML
-
If a script element contains a src value, this value is set in the embedded document's metadata
- SCRIPT_TRANS_BASIS - Static variable in class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- ScriptAwareFeatureExtractor - Class in org.apache.tika.langdetect.charsoup
-
Production feature extractor for the CharSoup language detection model.
- ScriptAwareFeatureExtractor(int) - Constructor for class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- ScriptAwareFeatureExtractor(int, boolean) - Constructor for class org.apache.tika.langdetect.charsoup.ScriptAwareFeatureExtractor
- ScriptCategory - Class in org.apache.tika.langdetect.charsoup
-
Coarse Unicode script categories for language detection.
- SDA - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
StarOffice Draw
- SDC - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
StarOffice Calc
- SDD - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
StarOffice Impress
- SDW - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
StarOffice Writer
- searchGeoNames(ArrayList<String>) - Method in class org.apache.tika.parser.geo.topic.GeoParser
- secondaryParser - Variable in class org.apache.tika.parser.ner.NamedEntityParser
- SECRET_PROPERTY - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
- secretKey() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the secretKey record component.
- SectionDisplayName - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- SecureContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that attempts to prevent denial of service attacks against Tika parsers.
- SecureContentHandler(ContentHandler, TikaInputStream) - Constructor for class org.apache.tika.sax.SecureContentHandler
-
Decorates the given content handler with zip bomb prevention based on the count of bytes read from the given counting input stream.
- SECURITY_LOCKED_FOR_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- SECURITY_NONE - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- SECURITY_PASSWORD_PROTECTED - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- SECURITY_READ_ONLY_ENFORCED - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- SECURITY_READ_ONLY_RECOMMENDED - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- SECURITY_UNKNOWN - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- SEGV - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- select(Metadata) - Method in interface org.apache.tika.extractor.DocumentSelector
-
Checks if a document with the given metadata matches the specified selection criteria.
- select(Metadata) - Method in class org.apache.tika.extractor.SkipEmbeddedDocumentSelector
- select(Metadata) - Method in class org.apache.tika.extractor.UnpackSelector.AcceptAll
- select(Metadata) - Method in interface org.apache.tika.extractor.UnpackSelector
- select(Metadata) - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- selfConfiguring() - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Returns the value of the selfConfiguring record component.
- SelfConfiguring - Interface in org.apache.tika.config
-
Marker interface indicating that a component reads its own configuration from ParseContext's jsonConfigs at runtime.
- SENT_BY_SERVER_TYPE - Static variable in interface org.apache.tika.metadata.MAPI
- SEPARATE_DOCUMENTS - Enum constant in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.AttachmentStrategy
- SEPARATE_DOCUMENTS - Enum constant in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.AttachmentStrategy
- SEPARATE_DOCUMENTS - Enum constant in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.AttachmentStrategy
- SEQ - Enum constant in enum class org.apache.tika.metadata.Property.PropertyType
-
An ordered array
- SequenceNumberGenerator - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- SequenceNumberGenerator() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
- serialize(Metadata, JsonGenerator, SerializerProvider) - Method in class org.apache.tika.serialization.serdes.MetadataSerializer
- serialize(ParseContext, JsonGenerator, SerializerProvider) - Method in class org.apache.tika.serialization.serdes.ParseContextSerializer
- serialize(EmitData, JsonGenerator, SerializerProvider) - Method in class org.apache.tika.pipes.core.serialization.EmitDataSerializer
- serialize(FetchEmitTuple, JsonGenerator, SerializerProvider) - Method in class org.apache.tika.pipes.core.serialization.FetchEmitTupleSerializer
- serialize(PipesResult, JsonGenerator, SerializerProvider) - Method in class org.apache.tika.pipes.core.serialization.PipesResultSerializer
- serialize(JCas, CTAKESSerializer, boolean, OutputStream) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Serializes a CAS in the given format.
- serialize(T, JsonGenerator, SerializerProvider) - Method in class org.apache.tika.serialization.serdes.SpiCompositeSerializer
- serializedRecursiveParserWrapperExample() - Method in class org.apache.tika.example.ParsingExample
-
We include a simple JSON serializer for a list of metadata with JsonMetadataList.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
-
Used to convert the element into a byte List
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
Used to convert the element into a byte List.
- serializeItemsToByteList(List<Byte>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Serialize items to byte list.
- serializeMetadata(List<String>) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Serializes a collection of metadata command line arguments into a single string.
- serializeToByteList() - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.IFSSHTTPBSerializable
-
Serialize to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
-
This method is used to convert the element of the number of array into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
-
This method is used to convert the element of EightBytesOfData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
-
This method is used to convert the element of FourBytesOfData into a byte List.
- serializeToByteList() - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
-
This method is used to convert the element of property into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
-
This method is used to convert the element of NoData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
-
This method is used to convert the element of OneByteOfData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
-
This method is used to convert the element of the prtArrayOfPropertyValues into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
-
This method is used to convert the element of prtFourBytesOfLengthFollowedByData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
-
This method is used to convert the element of TwoBytesOfData into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
-
Used to serialize item to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
-
This method is used to convert the element of BinaryItem basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
-
This method is used to convert the element of CellID basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
-
This method is used to convert the element of CellIDArray basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
-
This method is used to convert the element of Compact64bitInt basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
-
This method is used to convert the element of CompactID object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
-
This method is used to convert the element of ExGuid basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
-
This method is used to convert the element of ExGUIDArray basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
-
This method is used to convert the element of JCID object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
-
This method is used to convert the element of PropertyID object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
This method is used to convert the element of SerialNumber basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
-
Serialize item to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
-
This method is used to convert the element of PropertySet into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
-
This method is used to convert the element of the ObjectSpaceObjectPropSet into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
-
This method is used to convert the element of ObjectSpaceObjectStreamHeader into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
-
This method is used to convert the element of ObjectSpaceObjectStreamOfContextIDs object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
-
This method is used to convert the element of ObjectSpaceObjectStreamOfOIDs object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
-
This method is used to convert the element of ObjectSpaceObjectStreamOfOSIDs object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
Used to convert the element into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Serialize item to byte list.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
This method is used to convert the element of StreamObjectHeaderEnd16bit basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
This method is used to convert the element of StreamObjectHeaderEnd8bit basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
This method is used to convert the element of StreamObjectHeaderStart16bit basic object into a byte List.
- serializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
This method is used to convert the element of StreamObjectHeaderStart32bit basic object into a byte List.
- SerializeToByteList() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
-
This method is used to convert the element of ExtendedGUID object into a byte List.
- serialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
- SerialNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- SerialNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
Initializes a new instance of the SerialNumber class; this is the default constructor.
- SerialNumber(UUID, long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
Initializes a new instance of the SerialNumber class with specified values.
- SerialNumber(SerialNumber) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
-
Initializes a new instance of the SerialNumber class, this is the copy constructor.
- ServerHandlerConfig - Record Class in org.apache.tika.server.core.resource
-
Server-internal configuration for request handlers.
- ServerHandlerConfig(BasicContentHandlerFactory.HANDLER_TYPE, ParseMode, int, int, boolean) - Constructor for record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Creates an instance of a ServerHandlerConfig record class.
- ServerInitializationException - Exception in org.apache.tika.pipes.core
-
Exception thrown when the PipesServer fails to initialize.
- ServerInitializationException(String) - Constructor for exception org.apache.tika.pipes.core.ServerInitializationException
- ServerInitializationException(String, Throwable) - Constructor for exception org.apache.tika.pipes.core.ServerInitializationException
- ServerManager - Interface in org.apache.tika.pipes.core
-
Manages the lifecycle of a PipesServer process and client connections.
- ServerProtocolIO - Class in org.apache.tika.pipes.core.server
-
Centralizes protocol I/O operations shared by PipesServer and ConnectionHandler.
- ServerProtocolIO(DataInputStream, DataOutputStream) - Constructor for class org.apache.tika.pipes.core.server.ServerProtocolIO
- ServerStatus - Class in org.apache.tika.server.core
-
Read-only server status for tracking active tasks and statistics.
- ServerStatus() - Constructor for class org.apache.tika.server.core.ServerStatus
- ServerStatus.TASK - Enum Class in org.apache.tika.server.core
- ServerStatusResource - Interface in org.apache.tika.server.core
- SERVICE_NAME - Static variable in class org.apache.tika.TikaGrpc
- ServiceLoader - Class in org.apache.tika.config
-
Internal utility class that Tika uses to look up service providers.
- ServiceLoader() - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoader(ClassLoader) - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoader(ClassLoader, boolean) - Constructor for class org.apache.tika.config.ServiceLoader
- ServiceLoaderConfig() - Constructor for class org.apache.tika.config.GlobalSettings.ServiceLoaderConfig
- ServiceLoaderUtils - Class in org.apache.tika.utils
-
Service Loading and Ordering related utils
- ServiceLoaderUtils() - Constructor for class org.apache.tika.utils.ServiceLoaderUtils
- set(Class<T>, T) - Method in class org.apache.tika.detect.zip.StreamingDetectContext
-
Adds the given value to the context as an implementation of the given interface.
- set(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
-
Adds the given value to the context as an implementation of the given interface.
- set(String...) - Static method in class org.apache.tika.mime.MediaType
-
Convenience method that parses the given media type strings and returns an unmodifiable set that contains all the parsed types.
- set(String, String) - Method in class org.apache.tika.metadata.Metadata
-
Set metadata name/value.
- set(String, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Sets the given property.
- set(String, String[]) - Method in class org.apache.tika.metadata.Metadata
- set(String, String, Map<String, String[]>) - Method in interface org.apache.tika.metadata.writefilter.MetadataWriteLimiter
-
Based on the field and the value, this limiter modifies the field and/or the value to something that should be set in the Metadata object.
- set(String, String, Map<String, String[]>) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiter
- set(Metadata) - Method in class org.apache.tika.pipes.core.server.IntermediateResult
- set(Property, boolean) - Method in class org.apache.tika.metadata.Metadata
-
Sets the boolean value of the identified metadata property.
- set(Property, double) - Method in class org.apache.tika.metadata.Metadata
-
Sets the real or rational value of the identified metadata property.
- set(Property, double) - Method in class org.apache.tika.xmp.XMPMetadata
- set(Property, int) - Method in class org.apache.tika.metadata.Metadata
-
Sets the integer value of the identified metadata property.
- set(Property, int) - Method in class org.apache.tika.xmp.XMPMetadata
- set(Property, long) - Method in class org.apache.tika.metadata.Metadata
-
Sets the integer value of the identified metadata property.
- set(Property, String) - Method in class org.apache.tika.metadata.Metadata
-
Sets the value of the identified metadata property.
- set(Property, String) - Method in class org.apache.tika.xmp.XMPMetadata
- set(Property, String[]) - Method in class org.apache.tika.metadata.Metadata
-
Sets the values of the identified metadata property.
- set(Property, String[]) - Method in class org.apache.tika.xmp.XMPMetadata
-
Sets array properties.
- set(Property, Calendar) - Method in class org.apache.tika.metadata.Metadata
-
Sets the date value of the identified metadata property.
- set(Property, Date) - Method in class org.apache.tika.metadata.Metadata
-
Sets the date value of the identified metadata property.
- set(Property, Date) - Method in class org.apache.tika.xmp.XMPMetadata
- set(MediaType...) - Static method in class org.apache.tika.mime.MediaType
-
Convenience method that returns an unmodifiable set that contains all the given media types.
- set(RTFTokenType, String, int, boolean) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- setAccessCheckMode(PDFParserConfig.AccessCheckMode) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setAccessKey(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setActive(boolean) - Method in class org.apache.tika.server.core.TlsConfig
- setAdditionalFetchConfigJson(String) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
You can supply additional fetch configuration using this.
- setAdditionalFetchConfigJsonBytes(ByteString) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
You can supply additional fetch configuration using this.
- setAdmin1Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- setAdmin2Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- setAeDescriptorPath(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the path to XML descriptor for AnalysisEngine.
- setAlgorithm(DigestDef.Algorithm) - Method in class org.apache.tika.digest.DigestDef
- setAlignedLenTable(short[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setAlignedTreeTable(short[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setAll(Properties) - Method in class org.apache.tika.metadata.Metadata
-
Copies all key-value pairs from properties.
- setAll(Properties) - Method in class org.apache.tika.xmp.XMPMetadata
-
It will set all simple and array properties that have QName keys in registered namespaces.
- setAllowAbsolutePaths(boolean) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- setAllowedHostsForRedirect(Set<String>) - Method in class org.apache.tika.client.HttpClientFactory
- setAllowRuntimePrompt(boolean) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setAllowRuntimePrompt(boolean) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setAnnotationProps(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the CTAKESAnnotationProperty values that will be included in cTAKES metadata.
- setAnnotationProps(CTAKESAnnotationProperty[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the CTAKESAnnotationProperty values that will be included in cTAKES metadata.
- setApiKey(String) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setApiKey(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig.RuntimeConfig
- setApiKey(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setApiKey(String) - Method in class org.apache.tika.inference.InferenceConfig.RuntimeConfig
- setApiKey(String) - Method in class org.apache.tika.inference.InferenceConfig
- setApiKey(String) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setApiKey(String) - Method in class org.apache.tika.language.translate.impl.YandexTranslator
-
Set the API Key for client authentication
- setApiKey(String) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setApiKey(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setApiKey(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setApiKeyHeaderName(String) - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
-
Set the HTTP header name for API key authentication.
- setApiKeyHeaderName(String) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
-
Set the HTTP header name for API key authentication.
- setApiKeyHeaderName(String) - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
-
Set the HTTP header name for API key authentication.
- setApiKeyPrefix(String) - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
-
Set the prefix prepended to the API key in the auth header.
- setApiKeyPrefix(String) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
-
Set the prefix prepended to the API key in the auth header.
- setApiKeyPrefix(String) - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
-
Set the prefix prepended to the API key in the auth header.
- setApplicationName(String) - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Sets whether or not a rotation value should be calculated and passed to ImageMagick.
- setArbitrationInfo(String) - Method in class org.apache.tika.detect.EncodingDetectorContext
-
Set by the meta detector to describe how it reached its decision.
- setAuthScheme(String) - Method in class org.apache.tika.client.HttpClientFactory
-
Only basic and ntlm are supported.
- setAuthScheme(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setAutoClose(boolean) - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- setAutoClose(boolean) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- setAutoDetectParserConfig(AutoDetectParserConfig) - Method in class org.apache.tika.parser.AutoDetectParser
-
Sets the configuration that will be used to create SecureContentHandlers that will be used for parsing.
- setAverageCharTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setAverageCharTolerance(float)
- setBasePath(String) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- setBaseUrl(String) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setBaseUrl(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig.RuntimeConfig
- setBaseUrl(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setBaseUrl(String) - Method in class org.apache.tika.inference.InferenceConfig.RuntimeConfig
- setBaseUrl(String) - Method in class org.apache.tika.inference.InferenceConfig
- setBaseUrl(String) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setBaseUrl(String) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setBaseUrl(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setBaseUrl(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
-
Set a bit value to "On" in the specified byte array with the specified bit position.
- setBlock_len(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets block length
- setBlockAddress(long[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets block addresses
- setBlockCount(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets a block count
- setBlockidx_intvl(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets block index interval
- setBlockLength(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setBlockLlen(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets a block length
- setBlockNext(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- setBlockPrev(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- setBlockRemaining(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setBlockType(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setBody(PropertySet) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- setBold(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- setBucket(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig.RuntimeConfig
- setBucket(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- setBucket(String) - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- setBucket(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setByteArrayMaxOverride(int) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
WARNING: this sets a static variable in POI.
- setCaptureMap(Map<String, String>) - Method in class org.apache.tika.parser.RegexCaptureParserConfig
- setCatchIntermediateIOExceptions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The PDFBox parser will throw an IOException if there is a problem with a stream.
- setCenter(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- setCertChain(File) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setCertExpirationWarningDays(int) - Method in class org.apache.tika.server.core.TlsConfig
- setCertificateBytes(byte[]) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- setCertificatePassword(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- setChar(RTFTokenType, char) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- setCharset(Charset) - Method in class org.apache.tika.parser.csv.CSVParams
- setCheckCommandLine(List<String>) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setCheckErrorCodes(List<Integer>) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setChmDirList(ChmDirectoryListingSet) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setChmItsfHeader(ChmItsfHeader) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setChmItspHeader(ChmItspHeader) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setChmLzxcControlData(ChmLzxcControlData) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setChmLzxcResetTable(ChmLzxcResetTable) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setCleanDwgReadOutput(boolean) - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- setCleanDwgReadOutputBatchSize(int) - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- setCleanDwgReadRegexToReplace(String) - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- setCleanDwgReadReplaceWith(String) - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- setClearContentAfterChunking(boolean) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setClearContentAfterChunking(boolean) - Method in class org.apache.tika.inference.InferenceConfig
- setClientAuthenticationRequired(boolean) - Method in class org.apache.tika.server.core.TlsConfig
- setClientAuthenticationWanted(boolean) - Method in class org.apache.tika.server.core.TlsConfig
- setClientAuthRequired(boolean) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setClientCertificateCredentialsConfig(ClientCertificateCredentialsConfig) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- setClientId(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig.RuntimeConfig
- setClientId(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- setClientId(String) - Method in interface org.apache.tika.pipes.fetchers.microsoftgraph.config.AadCredentialConfigBase
- setClientId(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- setClientId(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- setClientId(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- setClientSecret(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig.RuntimeConfig
- setClientSecret(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- setClientSecret(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- setClientSecret(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- setClientSecretCredentialsConfig(ClientSecretCredentialsConfig) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- setCloseShield() - Method in class org.apache.tika.io.TikaInputStream
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setCommand(String) - Method in class org.apache.tika.parser.gdal.GDALParser
- setCommand(String...) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the command to be run.
- setCommandAppendOperator(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the operator to append rather than replace a value for the command line tool, e.g. "+=".
- setCommandAssignmentDelimeter(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the delimiter for multiple assignments for the command line tool, e.g. ", ".
- setCommandAssignmentOperator(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the assignment operator for the command line tool, e.g. "=".
- setCommandLine(List<String>) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setCompletionsPath(String) - Method in class org.apache.tika.parser.vlm.OpenAIVLMParser
-
Set the URL path for chat completions requests.
- setCompletionsPath(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setCompressedLen(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets compressed length
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Microsoft Excel files can sometimes contain phonetic (furigana) strings.
- setConfigPath(String) - Method in class org.apache.tika.server.core.TikaServerConfig
- setConfigStoreParams(String) - Method in class org.apache.tika.pipes.core.PipesConfig
- setConfigStoreType(String) - Method in class org.apache.tika.pipes.core.PipesConfig
- setConnectTimeoutMillis(int) - Method in class org.apache.tika.client.HttpClientFactory
- setConnectTimeoutMillis(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setConnectTimeoutMillis(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setContainer(String) - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- setContent(List<ExGuid>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
- setContentField(String) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setContentField(String) - Method in class org.apache.tika.inference.InferenceConfig
- setContentHandler(ContentHandler) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
Sets the underlying content handler.
- setContentHandlerDecoratorFactory(ContentHandlerDecoratorFactory) - Method in class org.apache.tika.parser.AutoDetectParserConfig
- setContentHandlerFactory(ContentHandlerFactory) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the content handler factory.
- setContentLength(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
- setContentParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
- setContentParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- setContentSource(String) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
- setContextClassLoader(ClassLoader) - Static method in class org.apache.tika.config.ServiceLoader
-
Sets the context class loader to use for all threads that access this class.
- setContextIDs(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- setControlDataIndex(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Sets control data index
- setCorePoolSize(int) - Method in interface org.apache.tika.concurrent.ConfigurableThreadPoolExecutor
- setCors(String) - Method in class org.apache.tika.server.core.TikaServerConfig
- setCountryCode(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- setCrawlAllFileNodesFromRoot(boolean) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Do this to ignore revisions and just parse all file nodes from the root recursively.
- setCreated(String) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- setCredentialsAESEncrypted(boolean) - Method in class org.apache.tika.client.HttpClientFactory
- setCredentialsProvider(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setData(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setDataOffset(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets data offset
- setDataPath(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig.RuntimeConfig
- setDataPath(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set the path to the tessdata directory.
- setDataPath(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setDateFormatOverride(String) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
- setDateOverrideFormat(String) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
A user may wish to override the date formats in xls and xlsx files.
- setDebug(boolean) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- setDeclaredEncoding(String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the declared encoding for charset detection.
- setDecodedValue(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
- setDecompressConcatenated(boolean) - Method in class org.apache.tika.parser.pkg.CompressorParser.Config
- setDefaultOfficeParserConfig(OfficeParserConfig) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
Allows subclasses to set the default configuration during construction.
- setDefaultTimeZone(String) - Method in class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
- setDelimiter(Character) - Method in class org.apache.tika.parser.csv.CSVParams
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setDescription(String) - Method in class org.apache.tika.mime.MimeType
-
Set the description of this media type.
- setDescription(String) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- setDetectableCharset(String, boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Deprecated. This API is ICU internal only.
- setDetectAngles(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setDetectCharsetsInEntryNames(boolean) - Method in class org.apache.tika.parser.pkg.ZipParserConfig
- setDetector(Detector) - Method in class org.apache.tika.parser.AutoDetectParser
-
Sets the type detector used by this parser to auto-detect the type of a document.
- setDigest(String) - Method in class org.apache.tika.server.core.TikaServerConfig
- setDigestMarkLimit(int) - Method in class org.apache.tika.server.core.TikaServerConfig
- setDigests(List<DigestDef>) - Method in class org.apache.tika.parser.digestutils.BouncyCastleDigesterFactory
- setDigests(List<DigestDef>) - Method in class org.apache.tika.parser.digestutils.CommonsDigesterFactory
- setDir_uuid(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets directory uuid
- setDirectoryListingEntryList(List<DirectoryListingEntry>) - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Sets chm directory listing entry list
- setDirLen(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets directory length
- setDirOffset(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets directory offset
- setDisableContentCompression(boolean) - Method in class org.apache.tika.client.HttpClientFactory
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.dif.DIFContentHandler
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.DIFContentHandler
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TeeContentHandler
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TextContentHandler
- setDpi(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set the DPI for image rendering.
- setDpi(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setDpi(int) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setDpi(int) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
-
Set the rendering resolution in DPI.
- setDPI(int) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- setDropThreshold(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setDropThreshold(float)
- setDwgReadExecutable(String) - Method in class org.apache.tika.parser.dwg.DWGParserConfig.RuntimeConfig
- setDwgReadExecutable(String) - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- setDwgReadTimeout(long) - Method in class org.apache.tika.parser.dwg.DWGParserConfig
- setEmbeddedCountLimitReached(boolean) - Method in class org.apache.tika.parser.ParseRecord
-
Sets the flag indicating the embedded count limit was reached.
- setEmbeddedDepthLimitReached(boolean) - Method in class org.apache.tika.parser.ParseRecord
-
Sets the flag indicating the embedded depth limit was reached.
- setEmbeddedIdPrefix(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setEmbeddedLimits(EmbeddedLimits) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the embedded limits configuration.
- setEmbeddingsPath(String) - Method in class org.apache.tika.inference.OpenAIEmbeddingFilter
-
Set the URL path for embeddings requests.
- setEmbeddingsPath(String) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
-
Set the URL path for embeddings requests.
- setEmitIntermediateResults(boolean) - Method in class org.apache.tika.pipes.core.PipesConfig
- setEmitKey(EmitKey) - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- setEmitKeyBase(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setEmitMaxEstimatedBytes(long) - Method in class org.apache.tika.pipes.core.PipesConfig
- setEmitStrategy(EmitStrategy) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Set the emit strategy.
- setEmitStrategy(EmitStrategyConfig) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Set the emit strategy configuration.
- setEmitter(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setEmitterId(String) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the emitter to use (optional).
- setEmitterId(String) - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- setEmitterIdBytes(ByteString) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the emitter to use (optional).
- setEmitWithinMillis(long) - Method in class org.apache.tika.pipes.core.PipesConfig
-
If nothing has been emitted in this amount of time and the PipesConfig.getEmitMaxEstimatedBytes() has not been reached yet, emit what's in the emit queue.
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), the parser should estimate where spaces should be inserted between words.
- setEnableImagePreprocessing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the value to true if processing is to be enabled.
- setEnableUnsecureFeatures(boolean) - Method in class org.apache.tika.server.core.TikaServerConfig
- setEncoding(DigestDef.Encoding) - Method in class org.apache.tika.digest.DigestDef
- setEncoding(StringsEncoding) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the character encoding of the strings that are to be found.
- setEncodingDetector(EncodingDetector) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
- setEndBookmark(PDOutlineItem) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- setEndpoint(String) - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- setEndpointConfigurationService(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setEndpoints(ArrayList<String>) - Method in class org.apache.tika.server.core.TikaServerConfig
- setEntriesToCopy(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- setEntryEncoding(Charset) - Method in class org.apache.tika.parser.pkg.ZipParserConfig
- setEntryEncodingName(String) - Method in class org.apache.tika.parser.pkg.ZipParserConfig
-
Set the entry encoding from a string (for JSON deserialization).
- setEntryType(ChmCommons.EntryType) - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- setErrorMessage(String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
If there was an error, this will contain the error message.
- setErrorMessageBytes(ByteString) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
If there was an error, this will contain the error message.
- setExclude(List<String>) - Method in class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
- setExcludedCipherSuites(List<String>) - Method in class org.apache.tika.server.core.TlsConfig
- setExcludedProtocols(List<String>) - Method in class org.apache.tika.server.core.TlsConfig
- setExcludeEmbeddedResourceTypes(Set<String>) - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- setExcludeFields(Set<String>) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setExcludeMimeTypes(Set<String>) - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- setExcludeUnmapped(boolean) - Method in class org.apache.tika.metadata.filter.FieldNameMappingFilter
-
If this is true (default), this means that only the fields that have a "from" value in the mapper will be passed through.
- setExitValue(int) - Method in class org.apache.tika.utils.FileProcessResult
- setExtensionConfig(ExtensionConfig) - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- setExtensionConfig(ExtensionConfig) - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- setExtractAcroFormContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), extract content from AcroForms at the end of the document.
- setExtractActions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether or not to extract PDActions from the file.
- setExtractAllAlternatives(boolean) - Method in class org.apache.tika.parser.mail.RFC822Parser.Config
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), text in annotations will be extracted.
- setExtractBookmarksText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract bookmarks (document outline) text.
- setExtractFileSystemMetadata(boolean) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcherConfig
- setExtractFontNames(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Extract font names into a metadata field
- setExtractIncrementalUpdateInfo(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setExtractInlineImageMetadataOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Use this when you want to know how many images of what formats are in a PDF but you don't need to render the images (e.g. for OCR).
- setExtractInlineImages(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract the literal inline embedded OBXImages.
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not MSOffice parsers should extract macros.
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.odf.FlatOpenDocumentParser
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- setExtractMarkedContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If the PDF contains marked content, try to extract text and its marked structure.
- setExtractScripts(boolean) - Method in class org.apache.tika.parser.html.JSoupParser
-
Whether or not to extract contents in script entities.
- setExtractUniqueInlineImagesOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Multiple pages within a PDF file might refer to the same underlying image.
- setExtractUserMetadata(boolean) - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- setExtractUserMetadata(boolean) - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- setExtractUserMetadata(boolean) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setFallback(Parser) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the fallback parser.
- setFetcherClass(String) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
-
The full Java class name of the fetcher config for which to fetch the JSON schema.
- setFetcherClass(String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The full Java class name of the Fetcher.
- setFetcherClass(String) - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
The full Java class name of the fetcher class.
- setFetcherClassBytes(ByteString) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
-
The full Java class name of the fetcher config for which to fetch the JSON schema.
- setFetcherClassBytes(ByteString) - Method in class org.apache.tika.GetFetcherReply.Builder
-
The full Java class name of the Fetcher.
- setFetcherClassBytes(ByteString) - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
The full Java class name of the fetcher class.
- setFetcherConfigJson(String) - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
JSON string of the fetcher config object.
- setFetcherConfigJsonBytes(ByteString) - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
JSON string of the fetcher config object.
- setFetcherConfigJsonSchema(String) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
-
The JSON schema that describes the fetcher config in string format.
- setFetcherConfigJsonSchemaBytes(ByteString) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
-
The JSON schema that describes the fetcher config in string format.
- setFetcherId(String) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
-
ID of the fetcher to delete.
- setFetcherId(String) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- setFetcherId(String) - Method in class org.apache.tika.GetFetcherReply.Builder
-
Echoes the ID of the fetcher being returned.
- setFetcherId(String) - Method in class org.apache.tika.GetFetcherRequest.Builder
-
ID of the fetcher for which to return config.
- setFetcherId(String) - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorConfig
- setFetcherId(String) - Method in class org.apache.tika.SaveFetcherReply.Builder
-
The fetcher_id that was saved.
- setFetcherId(String) - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
A unique identifier for each fetcher.
- setFetcherIdBytes(ByteString) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
-
ID of the fetcher to delete.
- setFetcherIdBytes(ByteString) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The ID of the fetcher in the fetcher store (previously saved by SaveFetcher) to use for the fetch.
- setFetcherIdBytes(ByteString) - Method in class org.apache.tika.GetFetcherReply.Builder
-
Echoes the ID of the fetcher being returned.
- setFetcherIdBytes(ByteString) - Method in class org.apache.tika.GetFetcherRequest.Builder
-
ID of the fetcher for which to return config.
- setFetcherIdBytes(ByteString) - Method in class org.apache.tika.SaveFetcherReply.Builder
-
The fetcher_id that was saved.
- setFetcherIdBytes(ByteString) - Method in class org.apache.tika.SaveFetcherRequest.Builder
-
A unique identifier for each fetcher.
- setFetcherName(String) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the fetcher name.
- setFetchKey(String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Echoes the fetch_key that was sent in the request.
- setFetchKey(String) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The "Fetch Key" of the item that will be fetched.
- setFetchKeyBytes(ByteString) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
Echoes the fetch_key that was sent in the request.
- setFetchKeyBytes(ByteString) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
The "Fetch Key" of the item that will be fetched.
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.FetchAndParseReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetFetcherRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.ListFetchersReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.ListFetchersRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SaveFetcherReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- setField(Descriptors.FieldDescriptor, Object) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- setFileExtension(String) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- setFilePath(String) - Method in class org.apache.tika.detect.FileCommandDetector
- setFilePath(String) - Method in class org.apache.tika.parser.strings.StringsConfig.RuntimeConfig
- setFilePath(String) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the path to the "file" command.
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setFilters(List<MetadataFilter>) - Method in class org.apache.tika.metadata.filter.CompositeMetadataFilter
- setForkedJvmArgs(ArrayList<String>) - Method in class org.apache.tika.pipes.core.PipesConfig
- setFormat(String) - Method in class org.apache.tika.language.translate.impl.YandexTranslator
-
Set the text format to use (plain/html)
- setFramesRead(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setFreeSpace(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
-
Sets pmgi free space
- setFreeSpace(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- setFullName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig.RuntimeConfig
- setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
Configure REST endpoint for lucene-geo-gazetteer
- setGeneratedResourceName(Metadata, EmbeddedDocumentUtil.EmbeddedResourcePrefix, int, String) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Sets a generated resource name on the metadata and marks the extension as inferred.
- setGeoPointFieldName(String) - Method in class org.apache.tika.metadata.filter.GeoPointMetadataFilter
-
Set the field for the concatenated LATITUDE,LONGITUDE string.
- setGetFetcherReplies(int, GetFetcherReply) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the List Fetchers service.
- setGetFetcherReplies(int, GetFetcherReply.Builder) - Method in class org.apache.tika.ListFetchersReply.Builder
-
List of fetcher configs returned by the List Fetchers service.
- setGray(boolean) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
-
If true (the default), render in grayscale.
- setGuid(int[]) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- setGuid(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
- setHadStarted(ChmCommons.LzxState) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setHandlerType(BasicContentHandlerFactory.HANDLER_TYPE) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the handler type (TEXT, HTML, XML, etc.).
- setHeader_len(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets itsp header length
- setHeaderLen(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets itsf header length
- setHeaders(Multimap<String, String>) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- setHeartbeatIntervalMs(long) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Interval in milliseconds between heartbeat messages sent from server to client.
- setHlinkClickUrl(String) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- setHost(String) - Method in class org.apache.tika.server.core.TikaServerConfig
- setHttpClient(HttpClient) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- setHttpClientFactory(HttpClientFactory) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- setHttpClientFactory(HttpClientFactory) - Method in class org.apache.tika.server.client.TikaServerClientConfig
- setHttpFetcherConfig(HttpFetcherConfig) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- setHttpHeaders(List<String>) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setHttpHeaders(List<String>) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setHttpRequestHeaders(Map<String, List<String>>) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setHttpRequestHeaders(HttpHeaders) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setId(String) - Method in class org.apache.tika.language.translate.impl.MicrosoftTranslator
-
Sets the client Id for the translator API.
- setId(String) - Method in class org.apache.tika.server.core.TikaServerConfig
- setIdentifier(String) - Method in class org.apache.tika.sax.StandardReference
- setIfXFAExtractOnlyXFA(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If false (the default), extract content from the full PDF as well as the XFA form.
- setIgniteInstanceName(String) - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- setIgniteInstanceName(String) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- setIgnoreBlobColumns(List<String>) - Method in class org.apache.tika.parser.geopkg.GeoPkgParser
- setIgnoreCharsets(List<String>) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- setIgnoreContentStreamSpaceGlyphs(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, the parser should ignore spaces in the content stream and rely purely on the algorithm to determine where word breaks are (PDFBOX-3774).
- setIlvl(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- setImageFormat(OcrConfig.ImageFormat) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setImageFormatName(String) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- setImageGraphicsEngineFactory(ImageGraphicsEngineFactory) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
EXPERT: Customize the class that handles inline images within a PDF page.
- setImageGraphicsEngineFactoryClass(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
EXPERT: Customize the class that handles inline images within a PDF page.
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig.RuntimeConfig
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setImageQuality(float) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setImageStrategy(PDFParserConfig.IMAGE_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setImageType(ImageType) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFBoxRenderer
- setImageType(OcrConfig.ImageType) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setInclude(List<String>) - Method in class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
- setIncludedCipherSuites(List<String>) - Method in class org.apache.tika.server.core.TlsConfig
- setIncludeDeleted(boolean) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not the parser should include deleted content.
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
Whether or not to include deleted content.
- setIncludedProtocols(List<String>) - Method in class org.apache.tika.server.core.TlsConfig
- setIncludeEmbeddedResourceTypes(Set<String>) - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- setIncludeEmpty(boolean) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setIncludeFields(Set<String>) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setIncludeFullMetadata(boolean) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setIncludeGlossary(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include the glossary (building blocks / AutoText) document from docx files.
- setIncludeGlossary(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
- setIncludeHeadersAndFooters(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include headers and footers.
- setIncludeMarkup(boolean) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- setIncludeMetadataInZip(boolean) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setIncludeMimeTypes(Set<String>) - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- setIncludeMissingRows(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
For table-like formats, and tables within other formats, should missing rows in sparse tables be output where detected?
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
With track changes on, when a section is moved, the content is stored in both the "moveFrom" section and in the "moveTo" section.
- setIncludeOriginal(boolean) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
In Excel and Word, there can be text stored within drawing shapes.
- setIncludeSlideMasterContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include contents from any of the three types of masters -- slide, notes, handout -- in a .ppt or ppt[xm] file.
- setIncludeSlideNotes(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to process slide notes content.
- setIncludeTitle(boolean) - Method in class org.apache.tika.sax.SAXOutputConfig
- setIndex(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
- setIndex_depth(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets an index depth
- setIndex_head(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets an index head
- setIndex_root(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets an index root
- setIndexCopyFromStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- setIndexCopyToStart(long) - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
- setIndexOfContent(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setIndexOfResetData(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setIndexOfResetTable(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setInlineBodyPartMap(OOXMLInlineBodyPartMap, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
Sets pre-parsed inline body part content (footnotes, endnotes, comments) so that references encountered during main document parsing can be resolved inline.
- setInlineContent(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setInlineContent(boolean) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
- setInlineContent(boolean) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setInlineContent(boolean) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setIntegrityCheck(boolean) - Method in class org.apache.tika.parser.pkg.ZipParserConfig
- setIntelCurrentPossition(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setIntelFileSize(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setIntelState(ChmCommons.IntelState) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setIssuer(String) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setItalics(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- setIteratorClass(String) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The full Java class name of the pipes iterator.
- setIteratorClass(String) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
The full Java class name of the pipes iterator class.
- setIteratorClassBytes(ByteString) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The full Java class name of the pipes iterator.
- setIteratorClassBytes(ByteString) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
The full Java class name of the pipes iterator class.
- setIteratorConfigJson(String) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
JSON string of the pipes iterator config object
- setIteratorConfigJson(String) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
JSON string of the pipes iterator config object.
- setIteratorConfigJsonBytes(ByteString) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
JSON string of the pipes iterator config object
- setIteratorConfigJsonBytes(ByteString) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
JSON string of the pipes iterator config object.
- setIteratorId(String) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
-
The pipes iterator ID to delete
- setIteratorId(String) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The pipes iterator ID
- setIteratorId(String) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
-
The pipes iterator ID to retrieve
- setIteratorId(String) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
A unique identifier for each pipes iterator.
- setIteratorIdBytes(ByteString) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
-
The pipes iterator ID to delete
- setIteratorIdBytes(ByteString) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
-
The pipes iterator ID
- setIteratorIdBytes(ByteString) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
-
The pipes iterator ID to retrieve
- setIteratorIdBytes(ByteString) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
-
A unique identifier for each pipes iterator.
- setJavaPath(String) - Method in class org.apache.tika.pipes.core.PipesConfig
- setJavaPath(String) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the Java executable path.
- setJson(String) - Method in class org.apache.tika.pipes.ignite.ExtensionConfigDTO
- setJsonConfig(String, String) - Method in class org.apache.tika.parser.ParseContext
-
Sets a JSON configuration by component name using a raw JSON string.
- setJsonConfig(String, JsonConfig) - Method in class org.apache.tika.parser.ParseContext
-
Sets a JSON configuration by component name.
- setJvmArgs(List<String>) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the JVM arguments for the forked process.
- setJwtExpiresInSeconds(int) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setJwtExpiresInSeconds(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setJwtGenerator(JwtGenerator) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
- setJwtIssuer(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setJwtPrivateKeyBase64(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setJwtSecret(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setJwtSubject(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setKeepAliveOnBadKeepAliveValueMs(int) - Method in class org.apache.tika.client.HttpClientFactory
- setKey(Key) - Static method in class org.apache.tika.example.Pharmacy
- setKeyBaseStrategy(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setKeyBaseStrategy(UnpackConfig.KEY_BASE_STRATEGY) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setKeyStoreFile(String) - Method in class org.apache.tika.server.core.TlsConfig
- setKeyStorePassword(String) - Method in class org.apache.tika.server.core.TlsConfig
- setKeyStoreType(String) - Method in class org.apache.tika.server.core.TlsConfig
- setLang_id(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets language id
- setLangId(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets language_id
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set tesseract language dictionary to be used.
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract language dictionary to be used.
- setLastModified(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets last modified date of the chm file
- setLatitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- setLeft(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- setLength(int) - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- setLengthTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setLengthTreeTable(short[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setListenForAllRecords(boolean) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Specifies whether this parser should listen for all records or just for the specified few.
- setLogLevel(String) - Method in class org.apache.tika.server.core.TikaServerConfig
- setLongitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- setLzxBlockLength(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setLzxBlockOffset(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setLzxBlocksCache(List<ChmLzxBlock>) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setMagikaPath(String) - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- setMagikaPath(String) - Method in class org.apache.tika.detect.magika.MagikaDetector.RuntimeConfig
- setMain(String, String, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
- setMainOrganizationAcronym(String) - Method in class org.apache.tika.sax.StandardReference
- setMainTreeElements(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setMainTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setMainTreeTable(short[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setMap(Map<String, Collection<String>>) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpHeaders
- setMappings(Map<String, String>) - Method in class org.apache.tika.metadata.filter.FieldNameMappingFilter
- setMarkLimit(int) - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
How far into the stream to scan for a <meta charset> declaration.
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector.Config
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector.Config
- setMatchMap(Map<String, String>) - Method in class org.apache.tika.parser.RegexCaptureParserConfig
- setMaxBatchSize(int) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setMaxBatchSize(int) - Method in class org.apache.tika.inference.InferenceConfig.RuntimeConfig
- setMaxBatchSize(int) - Method in class org.apache.tika.inference.InferenceConfig
-
Set the maximum number of chunks per embeddings API request.
- setMaxBufferLength(int) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
The number of characters to store in memory for checking for standards.
- setMaxBytes(int) - Method in class org.apache.tika.detect.FileCommandDetector
-
If this is not called on a TikaInputStream, this detector will spool up to this many bytes to a file to be detected by the 'file' command.
- setMaxBytes(int) - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- setMaxBytes(int) - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- setMaxCharsForDetection(int) - Method in class org.apache.tika.langdetect.opennlp.metadatafilter.OpenNLPMetadataFilter
- setMaxCharsForDetection(int) - Method in class org.apache.tika.langdetect.optimaize.metadatafilter.OptimaizeMetadataFilter
- setMaxChunkChars(int) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setMaxChunkChars(int) - Method in class org.apache.tika.inference.InferenceConfig
- setMaxChunks(int) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setMaxChunks(int) - Method in class org.apache.tika.inference.InferenceConfig.RuntimeConfig
- setMaxChunks(int) - Method in class org.apache.tika.inference.InferenceConfig
-
Set the maximum number of chunks per document.
- setMaxConnections(int) - Method in class org.apache.tika.client.HttpClientFactory
- setMaxConnections(int) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setMaxConnections(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setMaxConnections(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setMaxConnectionsPerRoute(int) - Method in class org.apache.tika.client.HttpClientFactory
- setMaxConnectionsPerRoute(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setMaxConnectionsPerRoute(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setMaxContentLength(int) - Method in class org.apache.tika.eval.app.ProfilerBase
-
Truncate the content string to this length if it is longer.
- setMaxContentLengthForLangId(int) - Method in class org.apache.tika.eval.app.ProfilerBase
-
Truncate the content string to this length, if it is longer, for language identification.
- setMaxCount(int) - Method in class org.apache.tika.config.EmbeddedLimits
-
Sets the maximum number of embedded documents to process.
- setMaxDataLengthBytes(int) - Method in class org.apache.tika.parser.image.PSDParser.PSDParserConfig
- setMaxDepth(int) - Method in class org.apache.tika.config.EmbeddedLimits
-
Sets the maximum nesting depth for embedded documents.
- setMaxEmails(int) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- setMaxEmbeddedCount(int) - Method in class org.apache.tika.parser.ParseRecord
-
Sets the maximum number of embedded documents to parse.
- setMaxEmbeddedCount(int) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the maximum number of embedded resources to process.
- setMaxEmbeddedDepth(int) - Method in class org.apache.tika.parser.ParseRecord
-
Sets the maximum depth for parsing embedded documents.
- setMaxEntityExpansions(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Set the maximum number of entity expansions allowable in SAX/DOM/StAX parsing.
- setMaxEntityExpansions(Integer) - Method in class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- setMaxErrMsgSize(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setMaxErrMsgSize(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setMaxExtractLength(long) - Method in class org.apache.tika.eval.app.EvalConfig
- setMaxFieldSize(int) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setMaxFileSizeToEmbed(long) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setMaxFileSizeToEmbed(long) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the maximum size of a file that will be submitted to OCR.
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
-
Set the maximum image size (in bytes) accepted for base64 encoding.
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setMaxFilesPerProcess(int) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the maximum number of files to process before restarting the forked process.
- setMaxFilesProcessedPerProcess(int) - Method in class org.apache.tika.pipes.core.PipesConfig
- setMaxImagePixels(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig.RuntimeConfig
- setMaxImagePixels(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set the maximum total pixels (width × height) allowed for an image before OCR is skipped.
- setMaxImagePixels(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setMaxImagePixels(long) - Method in class org.apache.tika.parser.pdf.OcrConfig
-
Set the maximum total pixels (width × height) for a rendered page image.
- setMaxImagePixels(long) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setMaxImagePixels(long) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
-
Set the maximum total pixels (width × height) for an image.
- setMaxImagesToOcr(int) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
-
Sets the maximum number of images to base64-encode per parse (across the whole document, tracked via ParseContext).
- setMaximumCompressionRatio(long) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the ratio between output characters and input bytes.
- setMaximumDepth(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the maximum XML element nesting level.
- setMaximumPackageEntryDepth(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the maximum package entry nesting level.
- setMaximumPoolSize(int) - Method in interface org.apache.tika.concurrent.ConfigurableThreadPoolExecutor
- setMaxIncrementalUpdates(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The maximum number of incremental updates to parse if PDFParserConfig.setParseIncrementalUpdates(boolean) is set to true
- setMaxKeySize(int) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setMaxLength(int) - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Sets the maximum text length (in characters) that will be buffered for detection.
- setMaxLength(int) - Method in class org.apache.tika.langdetect.charsoup.CharSoupMetadataFilter
- setMaxLength(int) - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
- setMaxLength(long) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setMaxMainMemoryBytes(long) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setMaxNumReuses(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
- setMaxNumReuses(Integer) - Method in class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- setMaxOverride(int) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- setMaxPackageEntryDepth(int) - Method in class org.apache.tika.config.OutputLimits
-
Sets the maximum package entry nesting depth.
- setMaxPagesToOcr(int) - Method in class org.apache.tika.parser.pdf.OcrConfig
-
Set the maximum number of pages to OCR per document.
- setMaxRecordLength(int) - Method in class org.apache.tika.parser.image.BPGParser
- setMaxRecordSize(int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
- setMaxRecordSize(int) - Method in class org.apache.tika.parser.mp3.Mp3Parser
-
This statically sets the max record size in ID3v2Frame
- setMaxRedirects(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setMaxRedirects(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setMaxScaleTo(int) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
-
Set the maximum pixel dimension (in pixels) for the longest edge of rendered page images.
- setMaxSpoolSize(Long) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setMaxSpoolSize(Long) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setMaxStdErr(int) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setMaxStdErr(int) - Method in class org.apache.tika.parser.gdal.GDALParser
- setMaxStdOut(int) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setMaxStdOut(int) - Method in class org.apache.tika.parser.gdal.GDALParser
- setMaxStringLength(int) - Method in class org.apache.tika.Tika
-
Sets the maximum length of strings returned by the parseToString methods.
- setMaxTextLength(int) - Static method in class org.apache.tika.eval.core.langid.LanguageIDWrapper
- setMaxTokens(int) - Method in class org.apache.tika.eval.app.ProfilerBase
-
Add a LimitTokenCountFilterFactory if > -1
- setMaxTokens(int) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setMaxTokens(int) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setMaxTokens(int) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setMaxTotalBytes(int) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setMaxUnpackBytes(long) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setMaxValuesPerField(int) - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- setMaxWaitForClientMillis(long) - Method in class org.apache.tika.pipes.core.PipesConfig
- setMaxWaitMillis(long) - Method in class org.apache.tika.server.client.TikaServerClientConfig
-
Maximum time in milliseconds to wait for a new FetchEmitTuple to be available from the queue.
- setMaxXmlDepth(int) - Method in class org.apache.tika.config.OutputLimits
-
Sets the maximum XML element nesting depth.
- setMaxXMPMMHistory(int) - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
-
Maximum number of events to extract from the event history in the XMP Media Management (XMPMM) section.
- setMaxXMPMMHistory(int) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
-
Maximum number of events to extract from the event history in the XMP Media Management (XMPMM) section.
- setMediaType(MediaType) - Method in class org.apache.tika.parser.csv.CSVParams
- setMediaTypeRegistry(MediaTypeRegistry) - Method in class org.apache.tika.io.SpoolingStrategy
-
Sets the media type registry used for checking type specializations.
- setMediaTypeRegistry(MediaTypeRegistry) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the media type registry used to infer type relationships.
- setMediaTypeRegistry(MediaTypeRegistry) - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
-
Sets the media type registry used to infer type relationships.
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.pkg.CompressorParser.Config
- setMessage(String) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
-
Status message
- setMessage(String) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
-
Status message
- setMessageBytes(ByteString) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
-
Status message
- setMessageBytes(ByteString) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
-
Status message
- setMetadata(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the metadata whose values will be analyzed using cTAKES.
- setMetadata(Metadata) - Method in class org.apache.tika.xmp.convert.AbstractConverter
- setMetadataCommandArguments(Map<Property, String[]>) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the map of Metadata keys to command line parameters.
- setMetaParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
- setMetaParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
- setMimes(List<String>) - Method in class org.apache.tika.metadata.filter.RemoveByMimeMetadataFilter
- setMinConfidence(double) - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
- setMinFileSizeToEmbed(long) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setMinFileSizeToEmbed(long) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the minimum size of a file that will be submitted to OCR.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
-
Set the minimum image size (in bytes) accepted for base64 encoding.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setMinLength(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the minimum sequence length (characters) to print.
- setMinSize(int) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Sets the minimum size of a character sequence to be extracted.
- setMinTokenLength(int) - Method in class org.apache.tika.eval.core.textstats.TextProfileSignature
-
Be careful -- for CJK languages, the default analyzer uses character bigrams.
- setMixedLanguages(boolean) - Method in class org.apache.tika.language.detect.LanguageDetector
- setMode(String) - Method in class org.apache.tika.server.client.TikaServerClientConfig
- setModel(String) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setModel(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig.RuntimeConfig
- setModel(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setModel(String) - Method in class org.apache.tika.inference.InferenceConfig.RuntimeConfig
- setModel(String) - Method in class org.apache.tika.inference.InferenceConfig
- setModel(String) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setModel(String) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setModel(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setModel(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setN(long) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- setName(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
- setName(String) - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
-
Sets entry name
- setName(String) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- setName(String) - Method in class org.apache.tika.pipes.ignite.ExtensionConfigDTO
- setNameLength(int) - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
-
Sets an entry name length
- setNamePrefix(String) - Method in class org.apache.tika.eval.app.db.TableInfo
- setNameToDelimiterCharacterMap(Map<String, Character>) - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
-
Set the name-to-delimiter map with Character values.
- setNameToDelimiterMap(Map<String, String>) - Method in class org.apache.tika.parser.csv.TextAndCSVConfig
-
Set the name-to-delimiter map from String values (for JSON deserialization).
- setNativeLibPath(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig.RuntimeConfig
- setNativeLibPath(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set the path to the directory containing native Tesseract/Leptonica shared libraries.
- setNativeLibPath(String) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setNERModelPath(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig.RuntimeConfig
- setNERModelPath(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
- setNerModelUrl(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig.RuntimeConfig
- setNerModelUrl(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
- setNtDomain(String) - Method in class org.apache.tika.client.HttpClientFactory
- setNtDomain(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setNum_blocks(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets the number of blocks contained in the CHM file
- setNumClients(int) - Method in class org.apache.tika.pipes.core.PipesConfig
- setNumClients(int) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
EXPERT: Set the number of forked JVM processes (clients) to use for parsing.
- setNumEmitters(int) - Method in class org.apache.tika.pipes.core.PipesConfig
- setNumFetchersPerPage(int) - Method in class org.apache.tika.ListFetchersRequest.Builder
-
List this many fetchers per page.
- setNumId(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- setNumOfHidden(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setNumOfInputs(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setNumOfOutputs(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setNumThreads(int) - Method in class org.apache.tika.server.client.TikaServerClientConfig
- setNumWorkers(int) - Method in class org.apache.tika.eval.app.EvalConfig
- setOcr(OcrConfig) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setOcrDPI(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR.
- setOcrEngineMode(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set OCR Engine Mode.
- setOcrEngineMode(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setOcrImageFormat(OcrConfig.ImageFormat) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setOcrImageQuality(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- setOcrImageType(OcrConfig.ImageType) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setOcrMaxImagePixels(long) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Set the maximum total pixels (width × height) for a rendered page image.
- setOcrMaxPagesToOcr(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Set the maximum number of pages to OCR per document.
- setOcrRenderingStrategy(OcrConfig.RenderingStrategy) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
When rendering the page for OCR, do you want to include the rendering of the electronic text, ALL, or do you only want to run OCR on the images and vector graphics (NO_TEXT)?
- setOcrStrategy(OcrConfig.Strategy) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOcrStrategyAuto(OcrConfig.StrategyAuto) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Sets the OCR strategy auto configuration.
- setOffset(int) - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- setOids(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- setOnExists(FileSystemEmitterConfig.ON_EXISTS) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- setOnlyLatestRevision(boolean) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Only parse the latest revision.
- setOnParseException(FetchEmitTuple.ON_PARSE_EXCEPTION) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Sets the default behavior when a parse exception occurs.
- setOpenContainer(Object) - Method in class org.apache.tika.io.TikaInputStream
- setOsids(ObjectSpaceObjectStreamOfOIDsOSIDsOrContextIDs) - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
- setOtherTesseractConfig(Map<String, String>) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the map of other Tesseract config parameters.
- setOtherTesseractSettings(List<String>) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- setOutputField(String) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setOutputField(String) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setOutputField(String) - Method in class org.apache.tika.inference.InferenceConfig
- setOutputFileHandler(Parser) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setOutputFormat(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setOutputFormat(UnpackConfig.OUTPUT_FORMAT) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setOutputMode(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setOutputMode(UnpackConfig.OUTPUT_MODE) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setOutputStream(OutputStream) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the OutputStream object used to write the CAS.
- setOutputThreshold(long) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the threshold for output characters before the zip bomb prevention is activated.
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setOutputType(TesseractOCRConfig.OUTPUT_TYPE) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the output type of the OCR process.
- setOverallTimeoutMillis(Long) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setOverallTimeoutMillis(Long) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setOverlapChars(int) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setOverlapChars(int) - Method in class org.apache.tika.inference.InferenceConfig
- setPageNumber(int) - Method in class org.apache.tika.ListFetchersRequest.Builder
-
List the fetchers starting at this page number
- setPageSegMode(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set tesseract page segmentation mode.
- setPageSegMode(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract page segmentation mode.
- setPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
The page separator to use in plain text output.
- setPaginated(List<PaginatedLocator>) - Method in class org.apache.tika.inference.locator.Locators
- setParams(float[]) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setParseContext(ParseContext) - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
-
Sets the ParseContext.
- setParseContext(ParseContext) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Sets the parse context for storing warnings when throwOnWriteLimitReached is false.
- setParseContextJson(String) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- setParseContextJsonBytes(ByteString) - Method in class org.apache.tika.FetchAndParseRequest.Builder
-
Optional JSON object to configure the ParseContext for this request, overriding server defaults.
- setParseException(boolean) - Method in class org.apache.tika.eval.core.util.ContentTags
- setParseIncrementalUpdates(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setParseMode(String) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Sets the default parse mode from a string.
- setParseMode(ParseMode) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Sets the default parse mode for how embedded documents are handled.
- setParseMode(ParseMode) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the parse mode (RMETA for recursive metadata, CONCATENATE for single document).
- setParsers(Map<MediaType, Parser>) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the component parsers.
- setPartitions(int) - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- setPartitions(int) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- setPassword(String) - Method in class org.apache.tika.client.HttpClientFactory
- setPassword(String) - Method in class org.apache.tika.parser.SimplePasswordProvider
- setPassword(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setPathStyleAccessEnabled(boolean) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setPdftoppmPath(String) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
-
Set the path to the pdftoppm executable.
- setPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mailcommons.MailUtil
-
This tries to split a "from" or "to" value into a person field and an email field.
- setPipesConfig(int, long, int, List<String>) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Set pipes configuration with all options.
- setPipesConfig(int, List<String>) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Set pipes configuration with basic options.
- setPluginRoots(String) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Set the plugin roots path.
- setPluginsDir(Path) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the plugins directory where plugin zips are located.
- setPoolSize(int) - Static method in class org.apache.tika.mime.MimeTypesReader
-
Set the pool size for cached XML parsers.
- setPoolSize(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig.RuntimeConfig
- setPoolSize(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set the number of Tesseract instances to keep in the pool.
- setPoolSize(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setPoolSize(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Set the pool size for cached XML parsers.
- setPoolSize(Integer) - Method in class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- setPort(int) - Method in class org.apache.tika.server.core.TikaServerConfig
- setPort(Integer) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setPosition(long) - Method in class org.apache.tika.io.TikaInputStream
- setPreferAlternateContentChoice(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- setPrefix(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setPreloadLangs(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Whether or not to maintain interword spacing.
- setPrettyPrint(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables the formatted output for serializer.
- setPrettyPrint(boolean) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitterRuntimeConfig
- setPrettyPrinting(boolean) - Static method in class org.apache.tika.serialization.JsonMetadata
- setPrettyPrinting(boolean) - Static method in class org.apache.tika.serialization.JsonMetadataList
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
-
NOT YET SUPPORTED.
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
- setPriors(Map<String, Float>) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Set the a-priori probabilities for these languages.
- setPrivateKey(File) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setPrivateKeyPassword(String) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setProcessEmailAsMsg(boolean) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- setProcessTimeMillis(long) - Method in class org.apache.tika.utils.FileProcessResult
- setProfile(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setProgId(String) - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- setProgressTimeoutMillis(long) - Method in class org.apache.tika.config.TimeoutLimits
-
Sets the maximum time in milliseconds between progress updates before the task is considered stalled.
- setProjectId(String) - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- setPrompt(String) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setPrompt(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig.RuntimeConfig
- setPrompt(String) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setProxyHost(String) - Method in class org.apache.tika.client.HttpClientFactory
- setProxyHost(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setProxyPort(int) - Method in class org.apache.tika.client.HttpClientFactory
- setProxyPort(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setQuantRate(float) - Method in class org.apache.tika.eval.core.textstats.TextProfileSignature
- setQueueSize(int) - Method in class org.apache.tika.pipes.core.PipesConfig
- setQuoteAssignmentValues(boolean) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets whether or not to quote assignment values, i.e. tag='value'.
- setR0(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setR1(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setR2(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setReadLimit(int) - Method in class org.apache.tika.ml.junkdetect.JunkFilterEncodingDetector
- setReadPstPath(String) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig.RuntimeConfig
- setReadPstPath(String) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- setRegex(String) - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- setRegion(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig.RuntimeConfig
- setRegion(String) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribeConfig
- setRegion(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setRenderedName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.EmbeddedPartMetadata
- setRenderer(Renderer) - Method in class org.apache.tika.parser.pdf.PDFParser
- setRenderer(Renderer) - Method in interface org.apache.tika.parser.RenderingParser
- setRenderingStrategy(OcrConfig.RenderingStrategy) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setRenderResults(RenderResults) - Method in class org.apache.tika.renderer.pdf.pdfbox.PDFRenderingState
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.FetchAndParseReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.GetFetcherReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.GetFetcherRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.ListFetchersReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.ListFetchersRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.SaveFetcherReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- setRepeatedField(Descriptors.FieldDescriptor, int, Object) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- setReplicas(int) - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- setReplicas(int) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- setRequestTimeoutMillis(int) - Method in class org.apache.tika.client.HttpClientFactory
- setRequestTimeoutMillis(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setRequestTimeoutMillis(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setResetInterval(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets a reset interval
- setResetTableIndex(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
-
Sets reset table index
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setResolvedConfig(String, Object) - Method in class org.apache.tika.parser.ParseContext
-
Caches a resolved configuration object.
- setResources(List<FrictionlessResource>) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- setReturnStackTrace(boolean) - Method in class org.apache.tika.server.core.TikaServerConfig
- setReturnStderr(boolean) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setReturnStdout(boolean) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setRight(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
- setRtfEmbeddedMaxBytesInKb(int) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- setSasToken(String) - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- setScopes(List<String>) - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- setScopes(List<String>) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- setScore(double) - Method in class org.apache.tika.sax.StandardReference
- setScore(double) - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- setSecondOrganization(String, String) - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- setSecondOrganizationAcronym(String) - Method in class org.apache.tika.sax.StandardReference
- setSecret(String) - Method in class org.apache.tika.language.translate.impl.MicrosoftTranslator
-
Sets the client secret for the translator API.
- setSecretKey(String) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setSecure(boolean) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setSeparator(String) - Method in class org.apache.tika.sax.StandardReference
- setSeparatorChar(char) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the separator character used for annotation properties.
- setSerialize(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables CAS serialization.
- setSerializerType(CTAKESSerializer) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the type of cTAKES (UIMA) serializer used to write CAS.
- setServer(Server) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setServerStatus(ServerStatus) - Method in interface org.apache.tika.server.core.ServerStatusResource
- setServiceAccountKeyBase64(String) - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- setServiceLoader(GlobalSettings.ServiceLoaderConfig) - Method in class org.apache.tika.config.GlobalSettings
- setSetKCMS(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether to call System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider").
- setSharedMapper(ObjectMapper) - Static method in class org.apache.tika.serialization.TikaModule
-
Sets the shared ObjectMapper for use during deserialization.
- setSharedSecret(String) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setShortText(boolean) - Method in class org.apache.tika.language.detect.LanguageDetector
- setShutdownClientAfterMillis(long) - Method in class org.apache.tika.pipes.core.PipesConfig
-
If the client has been inactive for this many milliseconds, shut it down.
- setSiegfriedPath(String) - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- setSiegfriedPath(String) - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.RuntimeConfig
- setSignature(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets itsf header signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets itsp signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets a signature of control data block
- setSignature(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
-
Sets pmgi signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- setSize(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets a size of control data
- setSkipContainerDocumentDigest(boolean) - Method in class org.apache.tika.parser.digestutils.BouncyCastleDigesterFactory
- setSkipContainerDocumentDigest(boolean) - Method in class org.apache.tika.parser.digestutils.CommonsDigesterFactory
- setSkipEmbedding(boolean) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setSkipEmbedding(boolean) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setSkipEmbedding(boolean) - Method in class org.apache.tika.inference.InferenceConfig
- setSkipEmbedding(boolean) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setSkipOcr(boolean) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- setSkipOcr(boolean) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setSkipOcr(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
If you want to turn off OCR at run time for a specific file, set this to true
- setSkipOcr(boolean) - Method in class org.apache.tika.parser.ocrencode.EncodeOCRConfig
-
If set to true, disables base64 encoding at runtime: the parser reports no supported types and parse() is a no-op.
- setSkipOcr(boolean) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setSkipOcr(boolean) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setSleepOnStartupTimeoutMillis(long) - Method in class org.apache.tika.pipes.core.PipesConfig
- setSocketTimeoutMillis(int) - Method in class org.apache.tika.client.HttpClientFactory
- setSocketTimeoutMillis(Integer) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setSocketTimeoutMillis(Integer) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setSocketTimeoutMs(long) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Socket timeout in milliseconds for reading from the forked process.
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, sort text tokens by their x/y position before extracting text.
- setSourceField(String) - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- setSpacingTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setSpacingTolerance(float)
- setSpatial(List<SpatialLocator>) - Method in class org.apache.tika.inference.locator.Locators
- setSpoolToTemp(boolean) - Method in class org.apache.tika.pipes.fetcher.azblob.config.AZBlobFetcherConfig
- setSpoolToTemp(boolean) - Method in class org.apache.tika.pipes.fetcher.gcs.config.GCSFetcherConfig
- setSpoolToTemp(boolean) - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- setSpoolToTemp(boolean) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setSpoolToTemp(boolean) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- setSpoolTypes(Set<MediaType>) - Method in class org.apache.tika.io.SpoolingStrategy
-
Sets the media types that should be spooled to disk.
- setStaleFetcherDelaySeconds(int) - Method in class org.apache.tika.pipes.core.PipesConfig
- setStaleFetcherTimeoutSeconds(int) - Method in class org.apache.tika.pipes.core.PipesConfig
- setStartBookmark(PDOutlineItem) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- setStartIndex(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
- setStartupTimeoutMillis(long) - Method in class org.apache.tika.pipes.core.PipesConfig
- setStartupTimeoutMillis(long) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the startup timeout in milliseconds.
- setStatus(String) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
The status from the message.
- setStatusBytes(ByteString) - Method in class org.apache.tika.FetchAndParseReply.Builder
-
The status from the message.
- setStderr(String) - Method in class org.apache.tika.utils.FileProcessResult
- setStderrHandler(Parser) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setStderrLength(long) - Method in class org.apache.tika.utils.FileProcessResult
- setStderrTruncated(boolean) - Method in class org.apache.tika.utils.FileProcessResult
- setStdout(String) - Method in class org.apache.tika.utils.FileProcessResult
- setStdoutHandler(Parser) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setStdoutLength(long) - Method in class org.apache.tika.utils.FileProcessResult
- setStdoutTruncated(boolean) - Method in class org.apache.tika.utils.FileProcessResult
- setStopOnlyOnFatal(boolean) - Method in class org.apache.tika.pipes.core.PipesConfig
- setStrategy(OcrConfig.Strategy) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setStrategyAuto(OcrConfig.StrategyAuto) - Method in class org.apache.tika.parser.pdf.OcrConfig
- setStream_uuid(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets stream uuid
- setStreamReadConstraints(StreamReadConstraints) - Static method in class org.apache.tika.serialization.JsonMetadata
-
Sets the stream read constraints for JSON parsing of metadata.
- setStreamReadConstraints(StreamReadConstraints) - Static method in class org.apache.tika.serialization.JsonMetadataList
-
Sets the stream read constraints for JSON parsing of metadata lists.
- setStrike(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- setStringsPath(String) - Method in class org.apache.tika.parser.strings.StringsConfig.RuntimeConfig
- setStringsPath(String) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the "strings" installation folder.
- setStripMarkup(boolean) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- setStyleID(String) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
- setSubject(String) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setSubjectUser(String) - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- setSuccess(boolean) - Method in class org.apache.tika.DeleteFetcherReply.Builder
-
Success if the fetcher was successfully removed from the fetch store.
- setSuffixStrategy(String) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setSuffixStrategy(UnpackConfig.SUFFIX_STRATEGY) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setSuperType(MimeType, MediaType) - Method in class org.apache.tika.mime.MimeTypes
- setSupportedEmbedTypes(Set<MediaType>) - Method in class org.apache.tika.embedder.ExternalEmbedder
- setSupportedTypes(List<String>) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, the parser should try to remove duplicated text over the same region.
- setSwath(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- setSystem_uuid(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets system uuid
- setTableName(String) - Method in class org.apache.tika.pipes.ignite.config.IgniteConfigStoreConfig
- setTableName(String) - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- setTableOffset(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets a table offset
- setTargetField(String) - Method in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter
- setTempDirectory(String) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Sets the directory for temporary files during pipes-based parsing.
- setTemporal(List<TemporalLocator>) - Method in class org.apache.tika.inference.locator.Locators
- setTemporaryFileDirectory(File) - Method in class org.apache.tika.io.TemporaryResources
-
Sets the directory to be used for the temporary files created by the TemporaryResources.createTempFile(String) method.
- setTemporaryFileDirectory(Path) - Method in class org.apache.tika.io.TemporaryResources
-
Sets the directory to be used for the temporary files created by the TemporaryResources.createTempFile(String) method.
- setTenantId(String) - Method in interface org.apache.tika.pipes.fetchers.microsoftgraph.config.AadCredentialConfigBase
- setTenantId(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.Client2CertificateCredentialsConfig
- setTenantId(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientCertificateCredentialsConfig
- setTenantId(String) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.ClientSecretCredentialsConfig
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig.RuntimeConfig
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig.RuntimeConfig
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
- setText(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables content text analysis using cTAKES.
- setText(byte[]) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setText(InputStream) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setText(List<TextLocator>) - Method in class org.apache.tika.inference.locator.Locators
- setThreshold(double) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
Sets the score to be used as threshold.
- setThresholdBytes(Long) - Method in class org.apache.tika.pipes.core.EmitStrategyConfig
-
Set the threshold in bytes for DYNAMIC strategy.
- setThrottleSeconds(long[]) - Method in class org.apache.tika.pipes.fetcher.s3.config.S3FetcherConfig
- setThrottleSeconds(long[]) - Method in class org.apache.tika.pipes.fetchers.microsoftgraph.config.MicrosoftGraphFetcherConfig
- setThrottleSeconds(List<Long>) - Method in class org.apache.tika.pipes.fetcher.googledrive.config.GoogleDriveFetcherConfig
- setThrowOnEncryptedPayload(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
- setThrowOnMaxCount(boolean) - Method in class org.apache.tika.config.EmbeddedLimits
-
Sets whether to throw an exception when maxCount is reached.
- setThrowOnMaxDepth(boolean) - Method in class org.apache.tika.config.EmbeddedLimits
-
Sets whether to throw an exception when maxDepth is reached.
- setThrowOnWriteLimit(boolean) - Method in class org.apache.tika.config.OutputLimits
-
Sets whether to throw an exception when writeLimit is reached.
- setThrowOnWriteLimitReached(boolean) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Sets whether to throw an exception when write limit is reached.
- setThrowOnZeroBytes(boolean) - Method in class org.apache.tika.parser.AutoDetectParserConfig
- setTikaConfig(File) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setTikaEndpoints(List<String>) - Method in class org.apache.tika.server.client.TikaServerClientConfig
- setTimeout(boolean) - Method in class org.apache.tika.utils.FileProcessResult
- setTimeoutLimits(TimeoutLimits) - Method in class org.apache.tika.pipes.core.config.ConfigOverrides.Builder
-
Set the timeout limits to write to the parse-context section.
- setTimeoutLimits(TimeoutLimits) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the timeout limits for parsing operations.
- setTimeoutMs(int) - Method in class org.apache.tika.renderer.pdf.poppler.PopplerRenderer
-
Set the timeout in milliseconds for the pdftoppm process.
- setTimeoutMs(long) - Method in class org.apache.tika.detect.FileCommandDetector
- setTimeoutMs(long) - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- setTimeoutMs(long) - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- setTimeoutMs(long) - Method in class org.apache.tika.parser.external.ExternalParserConfig
- setTimeoutMs(long) - Method in class org.apache.tika.parser.gdal.GDALParser
- setTimeoutSeconds(int) - Method in class org.apache.tika.inference.AbstractEmbeddingFilter
- setTimeoutSeconds(int) - Method in class org.apache.tika.inference.ImageEmbeddingConfig
- setTimeoutSeconds(int) - Method in class org.apache.tika.inference.InferenceConfig
- setTimeoutSeconds(int) - Method in class org.apache.tika.inference.OpenAIImageEmbeddingParser
- setTimeoutSeconds(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
-
Set maximum time (seconds) to wait for a pooled Tesseract instance.
- setTimeoutSeconds(int) - Method in class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- setTimeoutSeconds(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set maximum time (seconds) to wait for the OCR process to terminate.
- setTimeoutSeconds(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the maximum time (in seconds) to wait for the "strings" command to terminate.
- setTimeoutSeconds(int) - Method in class org.apache.tika.parser.vlm.AbstractVLMParser
- setTimeoutSeconds(int) - Method in class org.apache.tika.parser.vlm.VLMOCRConfig
- setTimeoutSeconds(long) - Method in class org.apache.tika.parser.microsoft.libpst.LibPstParserConfig
- setTitle(String) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- setTlsConfig(TlsConfig) - Method in class org.apache.tika.server.core.TikaServerConfig
- setTotal(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- setTotalCharsPerPage(int) - Method in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- setTotalTaskTimeoutMillis(long) - Method in class org.apache.tika.config.TimeoutLimits
-
Sets the maximum wall-clock time in milliseconds for a parse task.
- setTracking(boolean) - Method in class org.apache.tika.parser.mbox.MboxParser
- setTranslator(Translator) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
- setTrustCertCollection(File) - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- setTrustedPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig.RuntimeConfig
- setTrustedPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Same as TesseractOCRConfig.setPageSeparator(String) but does not perform any checks on the string.
- setTrustStoreFile(String) - Method in class org.apache.tika.server.core.TlsConfig
- setTrustStorePassword(String) - Method in class org.apache.tika.server.core.TlsConfig
- setTrustStoreType(String) - Method in class org.apache.tika.server.core.TlsConfig
- setType(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
- setType(MediaType) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
- setType(EmitStrategy) - Method in class org.apache.tika.pipes.core.EmitStrategyConfig
-
Set the emit strategy type.
- setType(BasicContentHandlerFactory.HANDLER_TYPE) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Sets the handler type.
- setTypes(List<String>) - Method in class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter
-
For types see TikaCoreProperties.EmbeddedResourceType
- setUMLSPass(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS password.
- setUMLSUser(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS username.
- setUncompressedLen(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets uncompressed length
- setUnderline(String) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
- setUnknown(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets an unknown
- setUnknown_000c(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets unknown_000c
- setUnknown_000c(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets the 000c unknown bytes. "Unknown" here means that those who reverse-engineered the CHM format do not know what the field is for.
- setUnknown_0024(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets 0024 unknown bytes
- setUnknown_002c(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets 002c unknown bytes
- setUnknown_0044(byte[]) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets 0044 unknown bytes
- setUnknown_18(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets unknown 18 bytes
- setUnknown0008(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeleteFetcherReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeleteFetcherRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeletePipesIteratorReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.DeletePipesIteratorRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.FetchAndParseReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.FetchAndParseRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetFetcherRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetPipesIteratorReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.GetPipesIteratorRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.ListFetchersReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.ListFetchersRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SaveFetcherReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SaveFetcherRequest.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SavePipesIteratorReply.Builder
- setUnknownFields(UnknownFieldSet) - Method in class org.apache.tika.SavePipesIteratorRequest.Builder
- setUnknownLen(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets unknown length
- setUnknownOffset(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets unknown offset
- setUnmappedUnicodeCharsPerPage(float) - Method in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- setupContentHandlerFactory(ParseContext, String, int, boolean) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Sets up the ContentHandlerFactory in the ParseContext based on explicit parameters.
- setupContentHandlerFactory(ParseContext, String, MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Sets up the ContentHandlerFactory in the ParseContext based on handler type and HTTP headers.
- setupContentHandlerFactoryIfNeeded(ParseContext, String, int, boolean) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Sets up the ContentHandlerFactory in the ParseContext if not already set.
- setupContentHandlerFactoryIfNeeded(ParseContext, String, MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Sets up the ContentHandlerFactory in the ParseContext if not already set.
- setupModule(Module.SetupContext) - Method in class org.apache.tika.serialization.TikaModule
- setupMultipartConfig(List<Attachment>, Metadata, ParseContext) - Static method in class org.apache.tika.server.core.resource.TikaResource
-
Processes multipart attachments for /config endpoints.
- setUseMime(boolean) - Method in class org.apache.tika.detect.FileCommandDetector
- setUseMime(boolean) - Method in class org.apache.tika.detect.magika.MagikaDetector.Config
- setUseMime(boolean) - Method in class org.apache.tika.detect.siegfried.SiegfriedDetector.Config
- setUserAgent(String) - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.config.AtlassianJwtFetcherConfig
- setUserAgent(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setUserConfigPath(Path) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set a user-provided configuration file path.
- setUserName(String) - Method in class org.apache.tika.client.HttpClientFactory
- setUserName(String) - Method in class org.apache.tika.pipes.fetcher.http.config.HttpFetcherConfig
- setUseSharedServer(boolean) - Method in class org.apache.tika.pipes.core.PipesConfig
-
Sets whether to use shared server mode.
- setUtf16PropertiesToPrint(Set<OneNotePropertyEnum>) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteTreeWalkerOptions
-
Print file node data in UTF-16 format when they match these props.
- setVector(float[]) - Method in class org.apache.tika.inference.Chunk
- setVerifySsl(boolean) - Method in class org.apache.tika.client.HttpClientFactory
- setVersion(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Sets itsf version
- setVersion(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
-
Sets a version of itsp header
- setVersion(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets version of control data block
- setVersion(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
-
Sets the version
- setWindow(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setWindowPosition(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setWindowSize(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets a window size
- setWindowSize(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
- setWindowsPerReset(long) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Sets windows per reset
- setWriteContent(boolean) - Method in class org.apache.tika.parser.RegexCaptureParserConfig
- setWriteFileNameToContent(boolean) - Method in class org.apache.tika.sax.SAXOutputConfig
- setWriteLimit(int) - Method in class org.apache.tika.config.OutputLimits
-
Sets the maximum characters to write.
- setWriteLimit(int) - Method in class org.apache.tika.pipes.fork.PipesForkParserConfig
-
Set the write limit for content extraction.
- setWriteLimit(int) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Sets the write limit.
- setWriteLimitReached(boolean) - Method in class org.apache.tika.parser.ParseRecord
- setWriteMetadataToHead(boolean) - Method in class org.apache.tika.sax.SAXOutputConfig
- setWriteSelectHeadersInBody(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
- setXmlReaderUtils(GlobalSettings.XmlReaderUtilsConfig) - Method in class org.apache.tika.config.GlobalSettings
- setZeroPadName(int) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- setZipBombRatio(long) - Method in class org.apache.tika.config.OutputLimits
-
Sets the zip bomb ratio (maximum output:input ratio).
- setZipBombThreshold(long) - Method in class org.apache.tika.config.OutputLimits
-
Sets the zip bomb threshold (characters before check activates).
- setZipEmbeddedFiles(boolean) - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- SEVENZ - Static variable in class org.apache.tika.detect.zip.PackageConstants
- SevenZParser - Class in org.apache.tika.parser.pkg
-
Parser for 7z (Seven Zip) archives.
- SevenZParser() - Constructor for class org.apache.tika.parser.pkg.SevenZParser
- SHA1 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- SHA256 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- SHA256 - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- sha256Hash() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns the value of the sha256Hash record component.
- SHA3_256 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- SHA3_384 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- SHA3_512 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- SHA384 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- SHA512 - Enum constant in enum class org.apache.tika.digest.DigestDef.Algorithm
- shadingFill(COSName) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- SharedServerManager - Class in org.apache.tika.pipes.core
-
Manages a single shared PipesServer process for multiple PipesClients.
- SharedServerManager(PipesConfig, Path, int) - Constructor for class org.apache.tika.pipes.core.SharedServerManager
-
Creates a SharedServerManager.
- SharedServerResources - Class in org.apache.tika.pipes.core.server
-
Holds shared resources for a shared PipesServer.
- sheetParts - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- SheetTextAsHTML(OfficeParserConfig, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- shortText - Variable in class org.apache.tika.language.detect.LanguageDetector
- ShortTextFeatureExtractor - Class in org.apache.tika.langdetect.charsoup
-
Production feature extractor for the CharSoup short-text language detection model.
- ShortTextFeatureExtractor(int) - Constructor for class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- ShortTextFeatureExtractor(int, boolean) - Constructor for class org.apache.tika.langdetect.charsoup.ShortTextFeatureExtractor
- SHOT_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the video was shot."
- SHOT_LOCATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the location where the video was shot.
- SHOT_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the shot or take."
- shouldAcceptBox(String) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
- shouldAcceptContainer(String) - Method in class org.apache.tika.parser.mp4.TikaMp4BoxHandler
- shouldParseEmbedded(Metadata) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractor
-
Determines whether the given embedded document should be parsed.
- shouldParseEmbedded(Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
- shouldParseEmbedded(Metadata) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
- shouldSkip(ParseContext) - Static method in class org.apache.tika.digest.SkipContainerDocumentDigest
-
Checks if container document digesting should be skipped for this parse.
- shouldSpool(TikaInputStream, Metadata, MediaType) - Method in class org.apache.tika.io.SpoolingStrategy
-
Determines whether the stream should be spooled to disk.
- shouldTranslate(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.DefaultEmbeddedStreamTranslator
-
This should sniff the stream to determine if it needs to be translated.
- shouldTranslate(TikaInputStream, Metadata) - Method in interface org.apache.tika.extractor.EmbeddedStreamTranslator
- shouldTranslate(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.microsoft.MSEmbeddedStreamTranslator
- shouldTranslate(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.microsoft.PSTEmailStreamTranslator
- showGlyph(Matrix, PDFont, int, Vector) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- showGlyph(Matrix, PDFont, int, Vector) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- SHUT_DOWN - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- shutdown() - Method in class org.apache.tika.pipes.core.PerClientServerManager
- shutdown() - Method in interface org.apache.tika.pipes.core.ServerManager
-
Shuts down the server process and cleans up resources.
- shutdown() - Method in class org.apache.tika.pipes.core.SharedServerManager
- shutDown() - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- shutdownNow() - Method in class org.apache.tika.server.core.resource.AsyncResource
- ShutDownReceivedException - Exception in org.apache.tika.pipes.core.protocol
-
Thrown when a SHUT_DOWN message is received where an ACK was expected.
- ShutDownReceivedException() - Constructor for exception org.apache.tika.pipes.core.protocol.ShutDownReceivedException
- SIEGFRIED_ERRORS - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SIEGFRIED_IDENTIFIERS_DETAILS - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SIEGFRIED_IDENTIFIERS_NAME - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SIEGFRIED_PREFIX - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SIEGFRIED_SIGNATURE - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SIEGFRIED_STATUS - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SIEGFRIED_VERSION - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- SiegfriedDetector - Class in org.apache.tika.detect.siegfried
-
Simple wrapper around Siegfried (https://github.com/richardlehane/siegfried). The default behavior is to run detection, report the results in the metadata, and then return null so that other detectors will be used.
- SiegfriedDetector() - Constructor for class org.apache.tika.detect.siegfried.SiegfriedDetector
-
Default constructor.
- SiegfriedDetector(JsonConfig) - Constructor for class org.apache.tika.detect.siegfried.SiegfriedDetector
-
Constructor for JSON configuration.
- SiegfriedDetector.Config - Class in org.apache.tika.detect.siegfried
-
Configuration class for JSON deserialization.
- SiegfriedDetector.RuntimeConfig - Class in org.apache.tika.detect.siegfried
-
RuntimeConfig blocks modification of security-sensitive path fields at runtime.
- signature - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
- SIGNATURE_CONTACT_INFO - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SIGNATURE_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SIGNATURE_FILTER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SIGNATURE_LOCATION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SIGNATURE_NAME - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SIGNATURE_REASON - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SIGNATURE_RELATIONSHIP - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- signatureData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a value that is unique to the file data represented by this root node object.
- SignatureObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Signature Object
- SignatureObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Signature Object
- SignatureObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
-
Initializes a new instance of the SignatureObject class.
- SIMPLE - Enum constant in enum class org.apache.tika.metadata.Property.PropertyType
-
A single value
- SimpleAlgorithm - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
File data is passed to the Simple algorithm chunking method.
- SimpleChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
- SimpleChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
-
Initializes a new instance of the SimpleChunking class
- SimplePasswordProvider - Class in org.apache.tika.parser
-
A simple PasswordProvider that returns a configured password for all documents.
- SimplePasswordProvider() - Constructor for class org.apache.tika.parser.SimplePasswordProvider
- SimplePasswordProvider(String) - Constructor for class org.apache.tika.parser.SimplePasswordProvider
- SimpleTextExtractor - Class in org.apache.tika.example
- SimpleTextExtractor() - Constructor for class org.apache.tika.example.SimpleTextExtractor
- SimpleThreadPoolExecutor - Class in org.apache.tika.concurrent
-
Simple Thread Pool Executor
- SimpleThreadPoolExecutor() - Constructor for class org.apache.tika.concurrent.SimpleThreadPoolExecutor
- SimpleTypeDetector - Class in org.apache.tika.example
- SimpleTypeDetector() - Constructor for class org.apache.tika.example.SimpleTypeDetector
- SINGLE_7_BIT - Enum constant in enum class org.apache.tika.parser.strings.StringsEncoding
- SINGLE_8_BIT - Enum constant in enum class org.apache.tika.parser.strings.StringsEncoding
- size() - Method in class org.apache.tika.eval.core.textstats.TokenCountPriorityQueue
- size() - Method in class org.apache.tika.eval.core.tokens.TokenCountPriorityQueue
- size() - Method in class org.apache.tika.metadata.Metadata
-
Returns the number of metadata names in this metadata.
- size() - Method in interface org.apache.tika.pipes.core.config.ConfigStore
-
Returns the number of stored configurations.
- size() - Method in class org.apache.tika.pipes.core.config.FileBasedConfigStore
- size() - Method in class org.apache.tika.pipes.core.config.InMemoryConfigStore
- size() - Method in class org.apache.tika.pipes.ignite.IgniteConfigStore
- size() - Method in class org.apache.tika.xmp.XMPMetadata
-
Returns the number of top-level namespaces
- skip(long) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's skip(long) method.
- skip(long) - Method in class org.apache.tika.io.LookaheadInputStream
- skip(long) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the read() method to ensure that the tail buffer is also filled if data is skipped.
- skip(long) - Method in class org.apache.tika.io.TikaInputStream
-
Skips up to n bytes.
- SKIP - Enum constant in enum class org.apache.tika.pipes.api.FetchEmitTuple.ON_PARSE_EXCEPTION
- SKIP_IF_EXISTS - Enum constant in enum class org.apache.tika.eval.app.db.JDBCUtil.CREATE_TABLE
- SkipContainerDocumentDigest - Class in org.apache.tika.digest
-
Marker class to signal that container document digesting should be skipped for a particular parse operation.
- SkipEmbeddedDocumentSelector - Class in org.apache.tika.extractor
-
A DocumentSelector that skips all embedded documents.
- SkipEmbeddedDocumentSelector() - Constructor for class org.apache.tika.extractor.SkipEmbeddedDocumentSelector
- skipFully(long) - Method in class org.apache.tika.parser.hwp.HwpStreamReader
- skippedEntity(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- skippedEntity(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- skippedEntity(String) - Method in class org.apache.tika.sax.TeeContentHandler
- skippedEntity(String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- skipSpaces() - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
-
This will skip all spaces and comments that are present.
- skipWhiteSpaces() - Method in class org.apache.tika.parser.pdf.updates.StartXRefScanner
- SLDWORKS - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
SolidWorks CAD file
- SLIDE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of slides in the (presentation) document.
- SNAPPY_FRAMED - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- SNAPPY_RAW - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- SOCKET_CONNECT_TIMEOUT_MS - Static variable in class org.apache.tika.pipes.core.PerClientServerManager
- SOCKET_CONNECT_TIMEOUT_MS - Static variable in class org.apache.tika.pipes.core.PipesClient
- SOCKET_CONNECT_TIMEOUT_MS - Static variable in class org.apache.tika.pipes.core.SharedServerManager
- SOCKET_TIMEOUT_MS - Static variable in class org.apache.tika.pipes.core.PipesClient
- socketTimeoutMillis() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the socketTimeoutMillis record component.
- socketTimeoutMillis() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the socketTimeoutMillis record component.
- socketTimeoutMillis() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the socketTimeoutMillis record component.
- socketTimeoutMillis() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the socketTimeoutMillis record component.
- softmax(float[]) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
In-place softmax with numerical stability.
- softmax(float[]) - Static method in class org.apache.tika.ml.LinearModel
-
In-place softmax with numerical stability.
- SOFTWARE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Software or firmware used to generate the image."
- SOLIDWORKS_ASSEMBLY - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- SOLIDWORKS_DRAWING - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- SOLIDWORKS_PART - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- solrCollection() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the solrCollection record component.
- SolrEmitter - Class in org.apache.tika.pipes.emitter.solr
-
Emitter to write parsed documents to Apache Solr.
- SolrEmitterConfig - Record Class in org.apache.tika.pipes.emitter.solr
- SolrEmitterConfig(String, List<String>, List<String>, String, String, int, int, int, String, String, String, String, String, String, String, Integer) - Constructor for record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Creates an instance of a SolrEmitterConfig record class.
- SolrEmitterConfig.AttachmentStrategy - Enum Class in org.apache.tika.pipes.emitter.solr
- SolrEmitterConfig.UpdateStrategy - Enum Class in org.apache.tika.pipes.emitter.solr
- SolrEmitterFactory - Class in org.apache.tika.pipes.emitter.solr
-
Factory for creating Solr emitters.
- SolrEmitterFactory() - Constructor for class org.apache.tika.pipes.emitter.solr.SolrEmitterFactory
- SolrPipesIterator - Class in org.apache.tika.pipes.iterator.solr
-
Iterates through results from a Solr query.
- SolrPipesIteratorConfig - Class in org.apache.tika.pipes.iterator.solr
- SolrPipesIteratorConfig() - Constructor for class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorConfig
- SolrPipesIteratorFactory - Class in org.apache.tika.pipes.iterator.solr
-
Factory for creating Solr pipes iterators.
- SolrPipesIteratorFactory() - Constructor for class org.apache.tika.pipes.iterator.solr.SolrPipesIteratorFactory
- SolrPipesPlugin - Class in org.apache.tika.pipes.plugin.solr
- SolrPipesPlugin(PluginWrapper) - Constructor for class org.apache.tika.pipes.plugin.solr.SolrPipesPlugin
- solrUrls() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the solrUrls record component.
- solrZkChroot() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the solrZkChroot record component.
- solrZkHosts() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the solrZkHosts record component.
- SORT_STACK_TRACE - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- sortLoadedClasses(List<T>) - Static method in class org.apache.tika.utils.ServiceLoaderUtils
-
Sorts a list of loaded classes, so that non-Tika ones come before Tika ones, and otherwise in reverse alphabetical order
- SOURCE - Static variable in interface org.apache.tika.metadata.ClimateForcast
- SOURCE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A reference to a resource from which the present resource is derived.
- SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the original owner of the copyright for the intellectual content of the item.
- SOURCE - Static variable in interface org.apache.tika.metadata.Photoshop
- SOURCE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- SOURCE - Static variable in interface org.apache.tika.metadata.XMPDC
-
A reference to a resource from which the present resource is derived.
- SOURCE_PATH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This should be used to store the path (relative or full) of the source/container file, including the file name, e.g. doc/path/to/my_pdf.pdf
- SourceCodeParser - Class in org.apache.tika.parser.code
-
Generic Source code parser for Java, Groovy, C++.
- SourceCodeParser() - Constructor for class org.apache.tika.parser.code.SourceCodeParser
- SourceCodeParser(EncodingDetector) - Constructor for class org.apache.tika.parser.code.SourceCodeParser
- sourceField - Variable in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter.Config
- SourceFilepath - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- SPACE - Static variable in class org.apache.tika.utils.StringUtils
-
A String for a space character.
- SpatialLocator - Class in org.apache.tika.inference.locator
-
Locator for a spatial region in an image or diagram.
- SpatialLocator(float[]) - Constructor for class org.apache.tika.inference.locator.SpatialLocator
- SpatialLocator(float[], String) - Constructor for class org.apache.tika.inference.locator.SpatialLocator
- SPEAKER_PLACEMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"A description of the speaker angles from center front in degrees.
- SPECIALIST_NAME - Static variable in class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Specialist name used in SpecialistOutput for provenance.
- SpecialistOutput - Class in org.apache.tika.ml.chardetect
-
Raw per-class logits from a single MoE specialist.
- SpecialistOutput(String, Map<String, Float>) - Constructor for class org.apache.tika.ml.chardetect.SpecialistOutput
- SpecializedKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Specialized Knowledge
- SpecializedKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Specialized Knowledge
- SPEEX_AUDIO - Static variable in class org.apache.tika.parser.ogg.SpeexParser
- SPEEX_AUDIO_ALT - Static variable in class org.apache.tika.parser.ogg.SpeexParser
- SpeexParser - Class in org.apache.tika.parser.ogg
-
Parser for OGG Speex audio files.
- SpeexParser() - Constructor for class org.apache.tika.parser.ogg.SpeexParser
- spi() - Element in annotation interface org.apache.tika.config.TikaComponent
-
Whether this component should be included in SPI files for automatic discovery via ServiceLoader.
- SpiCompositeSerializer<T> - Class in org.apache.tika.serialization.serdes
-
Abstract base serializer for SPI-loaded composite types that support exclusions.
- SpiCompositeSerializer(String) - Constructor for class org.apache.tika.serialization.serdes.SpiCompositeSerializer
- SpoolingStrategy - Class in org.apache.tika.io
-
Strategy for determining when to spool a TikaInputStream to disk.
- SpoolingStrategy() - Constructor for class org.apache.tika.io.SpoolingStrategy
- spoolToTemp() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns the value of the spoolToTemp record component.
- SpreadsheetMLParser - Class in org.apache.tika.parser.microsoft.xml
-
Parses SpreadsheetML 2003 format Excel files.
- SpreadsheetMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
- SpringExample - Class in org.apache.tika.example
- SpringExample() - Constructor for class org.apache.tika.example.SpringExample
- SQLITE_APPLICATION_ID - Static variable in class org.apache.tika.parser.sqlite3.SQLite3Parser
-
Base16 encoded integer representing the "application id"
- SQLITE_CLASS_NAME - Static variable in class org.apache.tika.parser.sqlite3.SQLite3DBParser
- SQLITE_USER_VERSION - Static variable in class org.apache.tika.parser.sqlite3.SQLite3Parser
-
Base16 encoded integer representing the "user version"
- SQLITE3_PREFIX - Static variable in class org.apache.tika.parser.sqlite3.SQLite3Parser
- SQLite3DBParser - Class in org.apache.tika.parser.sqlite3
-
This is the implementation of the db parser for SQLite.
- SQLite3DBParser() - Constructor for class org.apache.tika.parser.sqlite3.SQLite3DBParser
- SQLite3Parser - Class in org.apache.tika.parser.sqlite3
-
This is the main class for parsing SQLite3 files.
- SQLite3Parser() - Constructor for class org.apache.tika.parser.sqlite3.SQLite3Parser
-
Checks to see if class is available for org.sqlite.JDBC.
- SQLite3TableReader - Class in org.apache.tika.parser.sqlite3
-
Concrete class for SQLite table parsing.
- SQLite3TableReader(Connection, String, EmbeddedDocumentUtil) - Constructor for class org.apache.tika.parser.sqlite3.SQLite3TableReader
- STANDARD - Enum constant in enum class org.apache.tika.eval.core.tokens.TikaEvalTokenizer.Mode
-
General token counting — letters, ideographs, and numbers.
- STANDARD_REFERENCES - Static variable in class org.apache.tika.sax.StandardsExtractingContentHandler
- StandardExtractorFactory - Class in org.apache.tika.extractor
-
Standard factory for creating ParsingEmbeddedDocumentExtractor instances.
- StandardExtractorFactory() - Constructor for class org.apache.tika.extractor.StandardExtractorFactory
- StandardHtmlEncodingDetector - Class in org.apache.tika.parser.html.charsetdetector
-
Full WHATWG prescan charset detector for HTML: HTTP Content-Type header → <meta charset>/<meta http-equiv> tag, per https://html.spec.whatwg.org/multipage/parsing.html#the-input-byte-stream.
- StandardHtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
- StandardMetadataLimiter - Class in org.apache.tika.metadata.writefilter
-
Standard implementation of MetadataWriteLimiter that limits the amount of metadata a parser can add based on StandardMetadataLimiter.maxTotalEstimatedSize, StandardMetadataLimiter.maxFieldSize, StandardMetadataLimiter.maxValuesPerField, and StandardMetadataLimiter.maxKeySize.
- StandardMetadataLimiter(int, int, int, int, Set<String>, Set<String>, boolean) - Constructor for class org.apache.tika.metadata.writefilter.StandardMetadataLimiter
- StandardMetadataLimiterFactory - Class in org.apache.tika.metadata.writefilter
-
Standard factory for creating StandardMetadataLimiter instances.
- StandardMetadataLimiterFactory() - Constructor for class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- StandardOrganizations - Class in org.apache.tika.sax
-
This class provides a collection of the most important technical standard organizations.
- StandardOrganizations() - Constructor for class org.apache.tika.sax.StandardOrganizations
- StandardReference - Class in org.apache.tika.sax
-
Class that represents a standard reference.
- StandardReference.StandardReferenceBuilder - Class in org.apache.tika.sax
- StandardReferenceBuilder(String, String) - Constructor for class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
- StandardsExtractingContentHandler - Class in org.apache.tika.sax
-
StandardsExtractingContentHandler is a Content Handler used to extract standard references while parsing.
- StandardsExtractingContentHandler() - Constructor for class org.apache.tika.sax.StandardsExtractingContentHandler
-
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
- StandardsExtractingContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.StandardsExtractingContentHandler
-
Creates a decorator for the given SAX event handler and Metadata object.
- StandardsExtractionExample - Class in org.apache.tika.example
-
Class to demonstrate how to use the StandardsExtractingContentHandler to get a list of the standard references from every file in a directory.
- StandardsExtractionExample() - Constructor for class org.apache.tika.example.StandardsExtractionExample
- StandardsText - Class in org.apache.tika.sax
-
StandardsText relies on regular expressions to extract standard references from text.
- StandardsText() - Constructor for class org.apache.tika.sax.StandardsText
- StandardUnpackSelector - Class in org.apache.tika.pipes.core.extractor
-
Selector for filtering which embedded documents should have their bytes extracted during UNPACK mode.
- StandardUnpackSelector() - Constructor for class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- StarOfficeDetector - Class in org.apache.tika.detect.zip
- StarOfficeDetector() - Constructor for class org.apache.tika.detect.zip.StarOfficeDetector
- start() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherPlugin
- start() - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- start() - Method in class org.apache.tika.pipes.ignite.server.IgniteStoreServer
-
Start the Ignite server node and initialize the cluster synchronously.
- start() - Method in class org.apache.tika.pipes.plugin.atlassianjwt.AtlassianJwtPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.azblob.AZBlobPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.csv.CSVPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.es.ESPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.fs.FileSystemPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.gcs.GCSPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.googledrive.GoogleDrivePipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.http.HttpPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.jdbc.JDBCPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.JsonPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.kafka.KafkaPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.microsoftgraph.MicrosoftGraphPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.opensearch.OpenSearchPipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.s3.S3PipesPlugin
- start() - Method in class org.apache.tika.pipes.plugin.solr.SolrPipesPlugin
- start(ServerStatus.TASK, String) - Method in class org.apache.tika.server.core.ServerStatus
-
Records the start of a task and returns a task ID for tracking.
- start(BundleContext) - Method in class org.apache.tika.bundle.internal.BundleActivator
- start(BundleContext) - Method in class org.apache.tika.config.TikaActivator
- START_PMGL - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
- startBookmark(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startBookmark(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startDescription(String, String, String) - Method in class org.apache.tika.sax.XMPContentHandler
- startDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- startDocument() - Method in class org.apache.tika.parser.tmx.TMXContentHandler
- startDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
- startDocument() - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- startDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
- startDocument() - Method in class org.apache.tika.sax.DIFContentHandler
- startDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
-
Ignored.
- startDocument() - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- startDocument() - Method in class org.apache.tika.sax.TeeContentHandler
- startDocument() - Method in class org.apache.tika.sax.TextContentHandler
- startDocument() - Method in class org.apache.tika.sax.ToHTMLContentHandler
- startDocument() - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the XML prefix.
- startDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Starts an XHTML document by setting up the namespace mappings when called for the first time.
- startDocument() - Method in class org.apache.tika.sax.XMPContentHandler
-
Starts an XMP document by setting up the namespace mappings and writing out the following header:
- startDocument(PDDocument) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- STARTED - Enum constant in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.IntelState
- STARTED_DECODING - Enum constant in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.LzxState
- startEditedSection(String, Date, EditType) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startEditedSection(String, Date, EditType) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- startElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.dif.DIFContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.CommentPersonHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.mif.MIFContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.tmx.TMXContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeMetadataHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.DIFContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.LinkContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.RichTextContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.SafeContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.SecureContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TeeContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TextAndAttributeContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TextContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToMarkdownContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToTextContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToXMLContentHandler
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Starts the given element.
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
- startElement(String, AttributesImpl) - Method in class org.apache.tika.sax.XHTMLContentHandler
- startEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called before parsing each embedded document.
- startEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
This is called before parsing an embedded document
- startPage(PDPage) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- startParagraph(ParagraphProperties) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startParagraph(ParagraphProperties) - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
- startRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
- startSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startsWith(byte[], String) - Static method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
- startTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
- startTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.XWPFBodyContentsHandler
- startTotalCount() - Method in interface org.apache.tika.pipes.api.pipesiterator.TotalCounter
- startTotalCount() - Method in class org.apache.tika.pipes.iterator.fs.FileSystemPipesIterator
- STARTUP_FAILED - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- startupFailed(byte[]) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
- StartXRefOffset - Class in org.apache.tika.parser.pdf.updates
- StartXRefOffset(long, long, long, boolean) - Constructor for class org.apache.tika.parser.pdf.updates.StartXRefOffset
- StartXRefScanner - Class in org.apache.tika.parser.pdf.updates
-
This is a first draft of a scanner to extract incremental updates out of PDFs.
- StartXRefScanner(RandomAccessRead) - Constructor for class org.apache.tika.parser.pdf.updates.StartXRefScanner
- STATE - Static variable in interface org.apache.tika.metadata.Photoshop
- StatefulParser - Class in org.apache.tika.parser
-
The RecursiveParserWrapper wraps the parser sent into the parsecontext and then uses that parser to store state (among many other things).
- StatefulParser(Parser) - Constructor for class org.apache.tika.parser.StatefulParser
-
Creates a decorator for the given parser.
- staticZipDetectors - Variable in class org.apache.tika.detect.zip.DefaultZipContainerDetector
- STATISTICAL - Enum constant in enum class org.apache.tika.detect.EncodingResult.ResultType
-
Probabilistic inference from a statistical model.
- StatisticalSpecialist - Interface in org.apache.tika.ml.chardetect
-
SPI contract for an MoE charset-detection specialist.
- status() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Returns the value of the status record component.
- STATUS - Static variable in class org.apache.tika.pipes.core.serialization.PipesResultSerializer
- STATUS_FIELD_NUMBER - Static variable in class org.apache.tika.FetchAndParseReply
- statusFile() - Method in record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
-
Returns the value of the statusFile record component.
- StatusReporter - Class in org.apache.tika.eval.app
- StatusReporter(CallablePipesIterator, AtomicInteger, AtomicInteger, AtomicBoolean) - Constructor for class org.apache.tika.eval.app.StatusReporter
- STD_ERR - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
STD_ERR
- STD_ERR_IS_TRUNCATED - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
Whether or not stderr was truncated
- STD_ERR_LENGTH - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
Stderr length whether or not it was truncated.
- STD_OUT - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
STD_OUT
- STD_OUT_IS_TRUNCATED - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
Whether or not stdout was truncated
- STD_OUT_LENGTH - Static variable in interface org.apache.tika.metadata.ExternalProcess
-
Stdout length whether or not it was truncated.
- stillTesting() - Method in class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
-
Deprecated.
- stop() - Method in class org.apache.tika.pipes.fetcher.atlassianjwt.AtlassianJwtFetcherPlugin
- stop() - Method in class org.apache.tika.pipes.grpc.TikaGrpcServer
- stop() - Method in class org.apache.tika.pipes.plugin.atlassianjwt.AtlassianJwtPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.azblob.AZBlobPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.csv.CSVPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.es.ESPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.fs.FileSystemPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.gcs.GCSPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.googledrive.GoogleDrivePipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.http.HttpPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.jdbc.JDBCPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.JsonPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.kafka.KafkaPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.microsoftgraph.MicrosoftGraphPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.opensearch.OpenSearchPipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.s3.S3PipesPlugin
- stop() - Method in class org.apache.tika.pipes.plugin.solr.SolrPipesPlugin
- stop(BundleContext) - Method in class org.apache.tika.bundle.internal.BundleActivator
- stop(BundleContext) - Method in class org.apache.tika.config.TikaActivator
- StoppingEarlyException - Exception in org.apache.tika.sax
-
Sentinel exception to stop parsing xml once target is found while SAX parsing.
- StoppingEarlyException() - Constructor for exception org.apache.tika.sax.StoppingEarlyException
- storageIndex - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- StorageIndexCellMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies the storage index cell mappings (with cell identifier, cell mapping extended GUID, and cell mapping serial number)
- StorageIndexCellMapping - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Storage Index Cell Mapping
- StorageIndexCellMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
-
Initializes a new instance of the StorageIndexCellMapping class.
- storageIndexCellMappingList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
- StorageIndexDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StorageIndexDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Storage Index Data Element
- StorageIndexDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
-
Initializes a new instance of the StorageIndexDataElementData class.
- storageIndexExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
- storageIndexManifestMapping - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
- StorageIndexManifestMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StorageIndexManifestMapping - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Storage Index Manifest Mapping
- StorageIndexManifestMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
-
Initializes a new instance of the StorageIndexManifestMapping class.
- StorageIndexRevisionMapping - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies the storage index revision mappings (with revision and revision mapping extended GUIDs, and revision mapping serial number)
- StorageIndexRevisionMapping - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Storage Index Revision Mapping
- StorageIndexRevisionMapping() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
-
Initializes a new instance of the StorageIndexRevisionMapping class.
- storageIndexRevisionMappingList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
- storageManifest - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- StorageManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StorageManifestDataElementData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Storage Manifest Data Element
- StorageManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
-
Initializes a new instance of the StorageManifestDataElementData class.
- StorageManifestRootDeclare - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies one or more storage manifest root declare.
- StorageManifestRootDeclare - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Storage Manifest Root Declare
- StorageManifestRootDeclare() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
-
Initializes a new instance of the StorageManifestRootDeclare class.
- storageManifestRootDeclareList - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
- storageManifestSchemaGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
- StorageManifestSchemaGUID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
Specifies a storage manifest schema GUID
- StorageManifestSchemaGUID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Storage Manifest Schema GUID
- StorageManifestSchemaGUID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
-
Initializes a new instance of the StorageManifestSchemaGUID class.
- storeOriginalDocument(InputStream, String) - Method in class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler
-
Stores the original container document for optional inclusion.
- storeOriginalDocument(InputStream, String) - Method in class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
-
Stores the original container document for inclusion in the zip.
- StrategyAuto() - Constructor for class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- StrategyAuto(float, int) - Constructor for class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- STREAM_OBJECT_HEADER_START_16_BIT - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Specifies a 16-bit stream object header start.
- STREAM_OBJECT_HEADER_START_32_BIT - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Specifies a 32-bit stream object header start.
- StreamEmitter - Interface in org.apache.tika.pipes.api.emitter
- StreamGobbler - Class in org.apache.tika.utils
- StreamGobbler(InputStream, int) - Constructor for class org.apache.tika.utils.StreamGobbler
- StreamingContentHandlerFactory - Interface in org.apache.tika.sax
-
Extended factory interface for creating ContentHandler instances that write directly to an OutputStream.
- StreamingDetectContext - Class in org.apache.tika.detect.zip
- StreamingDetectContext() - Constructor for class org.apache.tika.detect.zip.StreamingDetectContext
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.apple.IWorkDetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.zip.FrictionlessPackageDetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.zip.IPADetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.zip.JarDetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.zip.KMZDetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.zip.OpenDocumentDetector
- streamingDetectFinal(StreamingDetectContext) - Method in class org.apache.tika.detect.zip.StarOfficeDetector
- streamingDetectFinal(StreamingDetectContext) - Method in interface org.apache.tika.detect.zip.ZipContainerDetector
-
After we've finished streaming the zip archive entries, a detector may make a final decision.
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.apple.IWorkDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.zip.FrictionlessPackageDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.zip.IPADetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.zip.JarDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.zip.KMZDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.zip.OpenDocumentDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in class org.apache.tika.detect.zip.StarOfficeDetector
- streamingDetectUpdate(ZipArchiveEntry, InputStream, StreamingDetectContext) - Method in interface org.apache.tika.detect.zip.ZipContainerDetector
-
Try to detect on a specific entry.
- StreamingZipContainerDetector - Class in org.apache.tika.detect.zip
-
A zip container detector that uses only streaming detection, never opening the file as a ZipFile.
- StreamingZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.StreamingZipContainerDetector
- StreamingZipContainerDetector(List<ZipContainerDetector>) - Constructor for class org.apache.tika.detect.zip.StreamingZipContainerDetector
- StreamingZipContainerDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.zip.StreamingZipContainerDetector
- StreamObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StreamObject(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Initializes a new instance of the StreamObject class.
- StreamObjectHeaderEnd - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StreamObjectHeaderEnd() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd
- StreamObjectHeaderEnd16bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
A 16-bit header for a compound object would indicate the end of a stream object
- StreamObjectHeaderEnd16bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
Initializes a new instance of the StreamObjectHeaderEnd16bit class, this is the default constructor.
- StreamObjectHeaderEnd16bit(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
Initializes a new instance of the StreamObjectHeaderEnd16bit class with the specified type value.
- StreamObjectHeaderEnd16bit(StreamObjectTypeHeaderEnd) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
Initializes a new instance of the StreamObjectHeaderEnd16bit class with the specified type value.
- StreamObjectHeaderEnd8bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
An 8-bit header for a compound object would indicate the end of a stream object
- StreamObjectHeaderEnd8bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
Initializes a new instance of the StreamObjectHeaderEnd8bit class, this is the default constructor.
- StreamObjectHeaderEnd8bit(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
Initializes a new instance of the StreamObjectHeaderEnd8bit class with the specified type value.
- StreamObjectHeaderEnd8bit(StreamObjectTypeHeaderEnd) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
Initializes a new instance of the StreamObjectHeaderEnd8bit class with the specified type value.
- StreamObjectHeaderStart - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
This class specifies the base class for 16-bit or 32-bit stream object header start
- StreamObjectHeaderStart() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Initializes a new instance of the StreamObjectHeaderStart class.
- StreamObjectHeaderStart(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
Initializes a new instance of the StreamObjectHeaderStart class with specified header type.
- StreamObjectHeaderStart16bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
A 16-bit header for a compound object would indicate the start of a stream object
- StreamObjectHeaderStart16bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
Initializes a new instance of the StreamObjectHeaderStart16bit class, this is the default constructor.
- StreamObjectHeaderStart16bit(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
Initializes a new instance of the StreamObjectHeaderStart16bit class with specified type.
- StreamObjectHeaderStart16bit(StreamObjectTypeHeaderStart, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
Initializes a new instance of the StreamObjectHeaderStart16bit class with specified type and length.
- StreamObjectHeaderStart32bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
A 32-bit header for a compound object would indicate the start of a stream object
- StreamObjectHeaderStart32bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Initializes a new instance of the StreamObjectHeaderStart32bit class, this is the default constructor.
- StreamObjectHeaderStart32bit(StreamObjectTypeHeaderStart) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Initializes a new instance of the StreamObjectHeaderStart32bit class with specified type.
- StreamObjectHeaderStart32bit(StreamObjectTypeHeaderStart, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
-
Initializes a new instance of the StreamObjectHeaderStart32bit class with specified type and length.
- StreamObjectParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StreamObjectParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
-
Initializes a new instance of the StreamObjectParseErrorException class
- StreamObjectParseErrorException(int, String, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
-
Initializes a new instance of the StreamObjectParseErrorException class
- StreamObjectTypeHeaderEnd - Enum Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
- StreamObjectTypeHeaderStart - Enum Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
-
The enumeration of the stream object type header start
- streamObjectTypeName - Variable in exception org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectParseErrorException
- STRETCH_MODE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio stretch mode."
- Strikethrough - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- STRING_TOO_SHORT - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- StringsConfig - Class in org.apache.tika.parser.strings
-
Configuration for the "strings" (or strings-alternative) command.
- StringsConfig() - Constructor for class org.apache.tika.parser.strings.StringsConfig
- StringsConfig.RuntimeConfig - Class in org.apache.tika.parser.strings
-
RuntimeConfig blocks modification of security-sensitive path fields at runtime.
- StringsEncoding - Enum Class in org.apache.tika.parser.strings
-
Character encoding of the strings that are to be found using the "strings" command.
- StringsParser - Class in org.apache.tika.parser.strings
-
Parser that uses the "strings" (or strings-alternative) command to find the printable strings in an object, or other binary, file (application/octet-stream).
- StringsParser() - Constructor for class org.apache.tika.parser.strings.StringsParser
- StringsParser(JsonConfig) - Constructor for class org.apache.tika.parser.strings.StringsParser
- StringsParser(StringsConfig) - Constructor for class org.apache.tika.parser.strings.StringsParser
- StringStatsCalculator<T> - Interface in org.apache.tika.eval.core.textstats
-
Interface for calculators that require a string
- stringToAsciiBytes(String) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- StringUtils - Class in org.apache.tika.utils
- StringUtils() - Constructor for class org.apache.tika.utils.StringUtils
- strip(byte[], int, int, byte[], int) - Static method in class org.apache.tika.ml.chardetect.HtmlByteStripper
-
Strip HTML/XML tags, comments, and the bodies of <script> and <style> elements from src[srcOffset .. srcOffset+srcLen) into dst starting at dstOffset.
- stripMarkup - Variable in class org.apache.tika.parser.txt.Icu4jEncodingDetector.Config
- stripTrailingSlash(String) - Static method in class org.apache.tika.parser.vlm.AbstractVLMParser
- strokePath() - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- STRUCTURAL - Enum constant in enum class org.apache.tika.detect.EncodingResult.ResultType
-
The encoding is proven by byte-grammar structure (ISO-2022 escape sequences, UTF-8 multibyte validation).
- StructuralEncodingRules - Class in org.apache.tika.ml.chardetect
-
Fast, rule-based encoding checks that run before the statistical model.
- StructuralEncodingRules.Utf8Result - Enum Class in org.apache.tika.ml.chardetect
-
Outcome of the UTF-8 structural check.
- STRUCTURE - Enum constant in enum class org.apache.tika.metadata.Property.PropertyType
- StructureElementChildNodes - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- SUB_CLASS_OF_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- SUB_CLASS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- SUBJECT - Static variable in interface org.apache.tika.metadata.DublinCore
-
The topic of the content of the resource.
- SUBJECT - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
Deprecated.
- SUBJECT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
DublinCore.SUBJECT; should include both subject and keywords if a document format has both.
- SUBJECT - Static variable in interface org.apache.tika.metadata.XMPDC
-
The topic of the content of the resource.
- SUBJECT_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Specifies one or more Subjects from the IPTC Subject-NewsCodes taxonomy to categorise the content.
- SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation the content is focussing on -- either the location shown in visual media or referenced by text or audio media.
- SUBMISSION_ACCEPTED_AT_TIME - Static variable in interface org.apache.tika.metadata.MAPI
- SUBMISSION_ID - Static variable in interface org.apache.tika.metadata.MAPI
- SubRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Sub Request
- SubRequest - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Sub Request
- SubResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Sub Response
- Subscript - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- subtract(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- subtract(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- subtract(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- subtract(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- subtract(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- subtract(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- SubtreeMatcher - Class in org.apache.tika.sax.xpath
-
Evaluation state of a ...//... XPath expression.
- SubtreeMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.SubtreeMatcher
- SUCCESS - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Processing completed successfully (possibly with warnings)
- SUCCESS - Enum constant in enum class org.apache.tika.renderer.RenderResult.STATUS
- SUCCESS_FIELD_NUMBER - Static variable in class org.apache.tika.DeleteFetcherReply
- summarize(File) - Method in class org.apache.tika.example.TrecDocumentGenerator
- SUMMARY_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
- SummaryExtractor - Class in org.apache.tika.parser.microsoft
-
Extractor for Common OLE2 (HPSF) metadata
- SummaryExtractor(Metadata) - Constructor for class org.apache.tika.parser.microsoft.SummaryExtractor
- Superscript - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- SUPERSET_MAP - Static variable in class org.apache.tika.detect.CharsetSupersets
-
Maps detected charset canonical names (case-sensitive, as returned by Charset.name()) to their superset charset canonical name.
- SUPERSET_OF - Static variable in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Directional superset relationships: key is a charset, value is its immediate superset.
- supersetOf(Charset) - Static method in class org.apache.tika.detect.CharsetSupersets
-
Returns the superset charset to use for decoding, or null if detected has no superset override.
- SUPPLEMENTAL_CATEGORIES - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- SUPPLEMENTAL_CATEGORIES - Static variable in interface org.apache.tika.metadata.Photoshop
- SupplementingParser - Class in org.apache.tika.parser.multiple
-
Runs the input stream through all available parsers, merging the metadata from them based on the AbstractMultipleParser.MetadataPolicy chosen.
- SupplementingParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Collection<? extends Parser>) - Constructor for class org.apache.tika.parser.multiple.SupplementingParser
- SupplementingParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Parser...) - Constructor for class org.apache.tika.parser.multiple.SupplementingParser
- SUPPORTED_ITEMS - Static variable in class org.apache.tika.parser.microsoft.pst.PSTMailItemParser
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
- supportsTotalCount() - Method in interface org.apache.tika.pipes.api.reporter.PipesReporter
-
Override this if your reporter supports total count.
- supportsTotalCount() - Method in class org.apache.tika.pipes.core.reporter.CompositePipesReporter
- supportsTotalCount() - Method in class org.apache.tika.pipes.core.reporter.NoOpReporter
- supportsTotalCount() - Method in class org.apache.tika.pipes.reporter.es.ESPipesReporter
- supportsTotalCount() - Method in class org.apache.tika.pipes.reporter.fs.FileSystemStatusReporter
- supportsTotalCount() - Method in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- supportsTotalCount() - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchPipesReporter
- SXSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
SAX/Streaming pptx extractor
- SXSLFPowerPointExtractorDecorator(Metadata, ParseContext, XSLFEventBasedPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
- SXWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
This is an experimental, alternative extractor for docx files.
- SXWPFWordExtractorDecorator(Metadata, ParseContext, XWPFEventBasedWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
- SYMMETRIC_GROUPS - Static variable in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Symmetric-only confusable groups.
- symmetricPeersOf(String) - Static method in class org.apache.tika.ml.chardetect.CharsetConfusables
-
Return the set of charsets that are symmetrically confusable with charset, not including charset itself.
- SYS_PROP_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
- SystemUtils - Class in org.apache.tika.utils
-
Copied from commons-lang to avoid requiring the dependency
- SystemUtils() - Constructor for class org.apache.tika.utils.SystemUtils
T
- TABLE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Tables in the document
- TABLE_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
- TABLE_NAME - Static variable in interface org.apache.tika.metadata.Database
- TABLE_NAME - Static variable in class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporter
- TableBordersVisible - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TableColumnsLocked - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TableColumnWidths - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TableInfo - Class in org.apache.tika.eval.app.db
- TableInfo(String, List<ColInfo>) - Constructor for class org.apache.tika.eval.app.db.TableInfo
- TableInfo(String, ColInfo...) - Constructor for class org.apache.tika.eval.app.db.TableInfo
- tableName() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns the value of the tableName record component.
- TagAndStyle(String, String) - Constructor for class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
- tagCount - Variable in class org.apache.tika.ml.chardetect.HtmlByteStripper.Result
-
Number of well-formed tags parsed (including comments).
- TaggedContentHandler - Class in org.apache.tika.sax
-
A content handler decorator that tags potential exceptions so that the handler that caused the exception can easily be identified.
- TaggedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TaggedContentHandler
-
Creates a tagging decorator for the given content handler.
- TaggedSAXException - Exception in org.apache.tika.sax
-
A SAXException wrapper that tags the wrapped exception with a given object reference.
- TaggedSAXException(SAXException, Object) - Constructor for exception org.apache.tika.sax.TaggedSAXException
-
Creates a tagged wrapper for the given exception.
- tagName() - Method in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
- TAGS_A - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_B - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_DIV - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_I - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_IMG - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_LI - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_OL - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_P - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_PARSE_EXCEPTION - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_TABLE - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
- TAGS_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
- TAGS_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
- TAGS_TD - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_TITLE - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_TR - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_U - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TAGS_UL - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TailStream - Class in org.apache.tika.io
-
A specialized input stream implementation which records the last portion read from an underlying stream.
- TailStream(InputStream, int) - Constructor for class org.apache.tika.io.TailStream
-
Creates a new instance of TailStream.
- TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the tape from which the clip was captured, as set during the capture process."
- TAR - Static variable in class org.apache.tika.detect.zip.PackageConstants
- TargetElement(String, String) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
A shortcut that automatically creates the QName object
- TargetElement(String, String, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
A shortcut that automatically creates the QName object
- TargetElement(QName) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
Creates a TargetElement with no attributes; all attributes will be deleted from the SAX stream
- TargetElement(QName, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
Creates a TargetElement; attributes of this element will be mapped as specified
- targetField - Variable in class org.apache.tika.metadata.filter.CaptureGroupMetadataFilter.Config
- TargetPartitionId - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Target PartitionId, newly added in MOSS2013.
- TargetPartitionId - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Target Partition Id
- TarWriter - Class in org.apache.tika.server.core.writer
- TarWriter() - Constructor for class org.apache.tika.server.core.writer.TarWriter
- TASK_EXCEPTION - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Task-level exception - this task failed, log and continue with next task
- TaskStatus - Class in org.apache.tika.server.core
-
Represents the status of an active task for observability purposes.
- TaskTagDueDate - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TeeContentHandler - Class in org.apache.tika.sax
-
Content handler proxy that forwards the received SAX events to zero or more underlying content handlers.
- TeeContentHandler(ContentHandler...) - Constructor for class org.apache.tika.sax.TeeContentHandler
- TEIDOMParser - Class in org.apache.tika.parser.journal
- TEIDOMParser() - Constructor for class org.apache.tika.parser.journal.TEIDOMParser
- TempFileUnpackHandler - Class in org.apache.tika.pipes.core.extractor
-
An UnpackHandler that writes embedded bytes to a temporary directory for later zipping.
- TempFileUnpackHandler(EmitKey, UnpackConfig) - Constructor for class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler
- TempFileUnpackHandler.EmbeddedFileInfo - Record Class in org.apache.tika.pipes.core.extractor
-
Information about an embedded file stored in the temp directory.
- TEMPLATE - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- templateID - Variable in class org.apache.tika.parser.microsoft.rtf.ListDescriptor
- TEMPO - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio's tempo."
- TemporalLocator - Class in org.apache.tika.inference.locator
-
Locator for a time range in audio or video content.
- TemporalLocator(long, long) - Constructor for class org.apache.tika.inference.locator.TemporalLocator
- TemporaryResources - Class in org.apache.tika.io
-
Utility class for tracking and ultimately closing or otherwise disposing a collection of temporary resources.
- TemporaryResources() - Constructor for class org.apache.tika.io.TemporaryResources
- TESS_META - Static variable in class org.apache.tika.parser.ocr.TesseractOCRParser
- Tess4JConfig - Class in org.apache.tika.parser.ocr.tess4j
-
Configuration for Tess4JParser.
- Tess4JConfig() - Constructor for class org.apache.tika.parser.ocr.tess4j.Tess4JConfig
- Tess4JConfig.RuntimeConfig - Class in org.apache.tika.parser.ocr.tess4j
-
Runtime-only Tess4JConfig that prevents modification of paths and pool settings during parse-time configuration.
- Tess4JParser - Class in org.apache.tika.parser.ocr.tess4j
-
OCR parser using Tess4J, which provides a Java JNA wrapper around the native Tesseract library.
- Tess4JParser() - Constructor for class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- Tess4JParser(JsonConfig) - Constructor for class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- Tess4JParser(Tess4JConfig) - Constructor for class org.apache.tika.parser.ocr.tess4j.Tess4JParser
- TesseractOCRConfig - Class in org.apache.tika.parser.ocr
-
Configuration for TesseractOCRParser.
- TesseractOCRConfig() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRConfig
- TesseractOCRConfig.OUTPUT_TYPE - Enum Class in org.apache.tika.parser.ocr
- TesseractOCRConfig.RuntimeConfig - Class in org.apache.tika.parser.ocr
-
Runtime-only TesseractOCRConfig that prevents modification of paths.
- TesseractOCRParser - Class in org.apache.tika.parser.ocr
-
TesseractOCRParser powered by tesseract-ocr engine.
- TesseractOCRParser() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRParser
- TesseractOCRParser(JsonConfig) - Constructor for class org.apache.tika.parser.ocr.TesseractOCRParser
-
Constructor for JSON configuration.
- TesseractOCRParser(TesseractOCRConfig) - Constructor for class org.apache.tika.parser.ocr.TesseractOCRParser
- testCompositeDocument() - Static method in class org.apache.tika.example.TIAParsingExample
- testHtmlMapper() - Static method in class org.apache.tika.example.TIAParsingExample
- testLocale() - Static method in class org.apache.tika.example.TIAParsingExample
- testTeeContentHandler(String) - Static method in class org.apache.tika.example.TIAParsingExample
- text(String) - Static method in class org.apache.tika.mime.MediaType
- TEXT - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- TEXT - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.BODY_TYPES_PROCESSED
- TEXT - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- TEXT - Enum constant in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- TEXT_HTML - Static variable in class org.apache.tika.mime.MediaType
- TEXT_ONLY - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.RenderingStrategy
- TEXT_PLAIN - Static variable in class org.apache.tika.mime.MediaType
- TextAndAttributeContentHandler - Class in org.apache.tika.sax
- TextAndAttributeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TextAndAttributeContentHandler
- TextAndAttributeContentHandler(ContentHandler, boolean) - Constructor for class org.apache.tika.sax.TextAndAttributeContentHandler
- TextAndAttributeXMLParser - Class in org.apache.tika.parser.xml
- TextAndAttributeXMLParser() - Constructor for class org.apache.tika.parser.xml.TextAndAttributeXMLParser
- TextAndCSVConfig - Class in org.apache.tika.parser.csv
- TextAndCSVConfig() - Constructor for class org.apache.tika.parser.csv.TextAndCSVConfig
- TextAndCSVParser - Class in org.apache.tika.parser.csv
-
Unless the TikaCoreProperties.CONTENT_TYPE_USER_OVERRIDE is set, this parser tries to assess whether the file is a text file, csv or tsv.
- TextAndCSVParser() - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
- TextAndCSVParser(JsonConfig) - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
-
This constructor is called by the JSON-based configuration loader.
- TextAndCSVParser(EncodingDetector) - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
- TextAndCSVParser(TextAndCSVConfig) - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
- TextCell - Class in org.apache.tika.parser.microsoft
-
Text cell.
- TextCell(String) - Constructor for class org.apache.tika.parser.microsoft.TextCell
- TextContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that only passes the TextContentHandler.characters(char[], int, int) and TextContentHandler.ignorableWhitespace(char[], int, int) (plus TextContentHandler.startDocument() and TextContentHandler.endDocument()) events to the decorated content handler.
- TextContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TextContentHandler
- TextContentHandler(ContentHandler, boolean) - Constructor for class org.apache.tika.sax.TextContentHandler
- TextDetector - Class in org.apache.tika.detect
-
Content type detection of plain text documents.
- TextDetector() - Constructor for class org.apache.tika.detect.TextDetector
-
Constructs a TextDetector which will look at the default number of bytes from the beginning of the document.
- TextDetector(int) - Constructor for class org.apache.tika.detect.TextDetector
-
Constructs a TextDetector which will look at a given number of bytes from the beginning of the document.
- TextExtendedAscii - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TextLangDetector - Class in org.apache.tika.langdetect.mitll
-
Language Detection using MIT Lincoln Lab’s Text.jl library https://github.com/trevorlewis/TextREST.jl
- TextLangDetector() - Constructor for class org.apache.tika.langdetect.mitll.TextLangDetector
- TextLocator - Class in org.apache.tika.inference.locator
-
Character-offset locator into the extracted text content.
- TextLocator(int, int) - Constructor for class org.apache.tika.inference.locator.TextLocator
- TextMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a ...
- TextMatcher() - Constructor for class org.apache.tika.sax.xpath.TextMatcher
- TextMessageBodyWriter - Class in org.apache.tika.server.core.writer
-
Returns simple text string for a particular metadata value.
- TextMessageBodyWriter() - Constructor for class org.apache.tika.server.core.writer.TextMessageBodyWriter
- TextOnlyPDFRenderer - Class in org.apache.tika.renderer.pdf.pdfbox
-
This class extends the PDFRenderer to render only the textual elements
- TextOnlyPDFRenderer(PDDocument) - Constructor for class org.apache.tika.renderer.pdf.pdfbox.TextOnlyPDFRenderer
- TextProfileSignature - Class in org.apache.tika.eval.core.textstats
-
Copied nearly directly from Apache Nutch: https://github.com/apache/nutch/blob/master/src/java/org/apache/nutch/crawl/TextProfileSignature.java
- TextProfileSignature() - Constructor for class org.apache.tika.eval.core.textstats.TextProfileSignature
- TextQualityComparison - Class in org.apache.tika.quality
-
Result of comparing two candidate strings for text quality via TextQualityDetector.compare(java.lang.String, java.lang.String, java.lang.String, java.lang.String).
- TextQualityComparison(String, float, TextQualityScore, TextQualityScore, String, String) - Constructor for class org.apache.tika.quality.TextQualityComparison
- TextQualityDetector - Interface in org.apache.tika.quality
-
Scores a string for text quality and arbitrates between two candidate strings.
- TextQualityScore - Class in org.apache.tika.quality
-
Result of scoring a string for text quality via a TextQualityDetector.
- TextQualityScore(float, float, float, float, String) - Constructor for class org.apache.tika.quality.TextQualityScore
- TextRunData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TextRunDataObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TextRunFormatting - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TextRunIndex - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TextRunIsEmbeddedObject - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- TextSha256Signature - Class in org.apache.tika.eval.core.textstats
-
Calculates the base32 encoded SHA-256 checksum on the analyzed text
- TextSha256Signature() - Constructor for class org.apache.tika.eval.core.textstats.TextSha256Signature
- TextStatistics - Class in org.apache.tika.detect
-
Utility class for computing a histogram of the bytes seen in a stream.
- TextStatistics() - Constructor for class org.apache.tika.detect.TextStatistics
- TextStatsCalculator - Interface in org.apache.tika.eval.core.textstats
-
Base text stats interface
- TextStatsFromTikaEval - Class in org.apache.tika.example
-
These examples create a new CompositeTextStatsCalculator for each call.
- TextStatsFromTikaEval() - Constructor for class org.apache.tika.example.TextStatsFromTikaEval
- THAI - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- THEORA_VIDEO - Static variable in class org.apache.tika.parser.ogg.TheoraParser
- TheoraParser - Class in org.apache.tika.parser.ogg
-
Parser for OGG Theora video files, which may also contain one or more soundtrack streams.
- TheoraParser() - Constructor for class org.apache.tika.parser.ogg.TheoraParser
- THREADED_COMMENT_RELATION - Static variable in class org.apache.tika.parser.microsoft.ooxml.OPCPackageWrapper
- ThreadSafeUnzipper - Class in org.apache.tika.plugins
-
Thread-safe and process-safe plugin unzipper using atomic rename.
- ThreadSafeUnzipper() - Constructor for class org.apache.tika.plugins.ThreadSafeUnzipper
- threshold(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
- THROW_EX_IF_EXISTS - Enum constant in enum class org.apache.tika.eval.app.db.JDBCUtil.CREATE_TABLE
- throwIfCauseOf(Exception) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Re-throws the original exception thrown by this handler.
- throwIfCauseOf(SAXException) - Method in class org.apache.tika.sax.SecureContentHandler
-
Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb.
- throwIfWriteLimitReached(Exception) - Static method in exception org.apache.tika.exception.WriteLimitReachedException
- throwOnWriteLimitReached() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns the value of the throwOnWriteLimitReached record component.
- THUMBNAIL - Enum constant in enum class org.apache.tika.extractor.EmbeddedDocumentUtil.EmbeddedResourcePrefix
- THUMBNAIL - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- THUMBNAIL - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
If set to true, this means that an image file is probably a "thumbnail" any time a pict/emf/wmf is in an object
- TIAParsingExample - Class in org.apache.tika.example
- TIAParsingExample() - Constructor for class org.apache.tika.example.TIAParsingExample
- TIBETAN - Static variable in class org.apache.tika.langdetect.charsoup.ScriptCategory
- TIFF - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.ImageFormat
- TIFF - Interface in org.apache.tika.metadata
-
XMP Exif TIFF schema.
- TiffParser - Class in org.apache.tika.parser.image
- TiffParser() - Constructor for class org.apache.tika.parser.image.TiffParser
- Tika - Class in org.apache.tika
-
Facade class for accessing Tika functionality.
- Tika() - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the default configuration.
- Tika(Detector) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector instance, the default parser configuration, and the default Translator.
- Tika(Detector, Parser) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector and parser instances, but the default Translator.
- Tika(Detector, Parser, Translator) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector, parser, and translator instances.
- TIKA_CHUNKS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
JSON array of chunks (text segments with optional embedding vectors and locators).
- TIKA_CONTENT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TIKA_CONTENT_HANDLER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Deprecated. Use TikaCoreProperties.TIKA_CONTENT_HANDLER_TYPE for the handler type enum value.
- TIKA_CONTENT_HANDLER_TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
The handler type used to produce TikaCoreProperties.TIKA_CONTENT.
- TIKA_DETECTED_LANGUAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TIKA_DETECTED_LANGUAGE_CONFIDENCE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TIKA_DETECTED_LANGUAGE_CONFIDENCE_RAW - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TIKA_EVAL_NS - Static variable in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- TIKA_LINK_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- TIKA_META_EXCEPTION_EMBEDDED_STREAM - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store exceptions caught while trying to read the stream of an embedded resource.
- TIKA_META_EXCEPTION_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store parse exception information in the Metadata object.
- TIKA_META_EXCEPTION_WARNING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store exceptions caught during a parse that are non-fatal, e.g. if a parser is in lenient mode and more content can be extracted if we ignore an exception thrown by a dependency.
- TIKA_META_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to prefix metadata properties that store information about the parsing process.
- TIKA_META_WARN_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store warnings that happened during the parse.
- TIKA_MIME_FILE - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
- TIKA_MIME_ID - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TIKA_OOXML - Static variable in class org.apache.tika.detect.zip.PackageConstants
- TIKA_PAGED_TEXT_PREFIX - Static variable in interface org.apache.tika.metadata.TikaPagedText
- TIKA_PARSED_BY - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TIKA_PARSED_BY_FULL_SET - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store a record of all parsers that touched a given file in the container file's metadata.
- TIKA_SERVER_GRPC_DEFAULT_PORT - Static variable in class org.apache.tika.pipes.grpc.TikaGrpcServer
- TIKA_SERVER_ID_ENV - Static variable in class org.apache.tika.server.core.TikaServerCli
-
This value is set to the server's id in the forked process.
- TIKA_UTI_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
- TikaActivator - Class in org.apache.tika.config
-
Bundle activator that adjusts the class loading mechanism of the ServiceLoader class to work correctly in an OSGi environment.
- TikaActivator() - Constructor for class org.apache.tika.config.TikaActivator
- TikaAsyncCLI - Class in org.apache.tika.async.cli
- TikaAsyncCLI() - Constructor for class org.apache.tika.async.cli.TikaAsyncCLI
- TikaCLI - Class in org.apache.tika.cli
-
Simple command line interface for Apache Tika.
- TikaCLI() - Constructor for class org.apache.tika.cli.TikaCLI
- TikaClient - Class in org.apache.tika.server.client
- TikaClientCLI - Class in org.apache.tika.server.client
- TikaClientCLI() - Constructor for class org.apache.tika.server.client.TikaClientCLI
- TikaClientConfigException - Exception in org.apache.tika.server.client
- TikaClientConfigException(String) - Constructor for exception org.apache.tika.server.client.TikaClientConfigException
- TikaClientConfigException(String, Throwable) - Constructor for exception org.apache.tika.server.client.TikaClientConfigException
- TikaClientException - Exception in org.apache.tika.client
- TikaClientException(String) - Constructor for exception org.apache.tika.client.TikaClientException
- TikaClientException(String, Throwable) - Constructor for exception org.apache.tika.client.TikaClientException
- TikaComponent - Annotation Interface in org.apache.tika.config
-
Annotation for Tika components (parsers, detectors, etc.) that enables automatic SPI file generation (META-INF/services/...) and a name-based component registry for JSON configuration.
- TikaComponentProcessor - Class in org.apache.tika.annotation
-
Annotation processor for TikaComponent that generates standard Java SPI files (META-INF/services/*) for ServiceLoader and component index files (META-INF/tika/*.idx) for name-based lookup.
- TikaComponentProcessor() - Constructor for class org.apache.tika.annotation.TikaComponentProcessor
- TikaConfigException - Exception in org.apache.tika.exception
-
Tika Config Exception is thrown when there is an error in the Tika config file and/or one or more of the parsers failed to initialize from that erroneous config.
- TikaConfigException(String) - Constructor for exception org.apache.tika.exception.TikaConfigException
-
Creates an instance of the exception
- TikaConfigException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaConfigException
- TikaCoreProperties - Interface in org.apache.tika.metadata
-
Contains a core set of basic Tika metadata properties, which all parsers will attempt to supply (where the file format permits).
- TikaCoreProperties.EmbeddedResourceType - Enum Class in org.apache.tika.metadata
-
A file might contain different types of embedded documents.
- TikaDetectors - Class in org.apache.tika.server.core.resource
-
Provides details of all the Detectors registered with Apache Tika, similar to --list-detectors with the Tika CLI.
- TikaDetectors() - Constructor for class org.apache.tika.server.core.resource.TikaDetectors
- TikaEmitterException - Exception in org.apache.tika.pipes.core.emitter
- TikaEmitterException(String) - Constructor for exception org.apache.tika.pipes.core.emitter.TikaEmitterException
- TikaEmitterException(String, Throwable) - Constructor for exception org.apache.tika.pipes.core.emitter.TikaEmitterException
- TikaEmitterResult - Class in org.apache.tika.server.client
- TikaEmitterResult(TikaEmitterResult.STATUS, long, String) - Constructor for class org.apache.tika.server.client.TikaEmitterResult
- TikaEvalCLI - Class in org.apache.tika.eval.app
- TikaEvalCLI() - Constructor for class org.apache.tika.eval.app.TikaEvalCLI
- TikaEvalMetadataFilter - Class in org.apache.tika.eval.core.metadata
- TikaEvalMetadataFilter() - Constructor for class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
- TikaEvalTokenizer - Class in org.apache.tika.eval.core.tokens
-
Tokenizer for tika-eval text analysis.
- TikaEvalTokenizer.Mode - Enum Class in org.apache.tika.eval.core.tokens
-
Tokenization mode.
- TikaExcelDataFormatter - Class in org.apache.tika.parser.microsoft
-
Overrides Excel's General format to include more significant digits than the MS Spec allows.
- TikaExcelDataFormatter() - Constructor for class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
- TikaExcelDataFormatter(Locale) - Constructor for class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
- TikaExcelGeneralFormat - Class in org.apache.tika.parser.microsoft
-
A Format that allows up to 15 significant digits for integers.
- TikaExcelGeneralFormat(Locale) - Constructor for class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
- TikaException - Exception in org.apache.tika.exception
-
Tika exception
- TikaException(String) - Constructor for exception org.apache.tika.exception.TikaException
- TikaException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaException
- TikaExtension - Interface in org.apache.tika.plugins
-
Interface for TikaExtensions
- TikaExtensionFactory<T extends TikaExtension> - Interface in org.apache.tika.plugins
- TikaFileTypeDetector - Class in org.apache.tika.filetypedetector
- TikaFileTypeDetector() - Constructor for class org.apache.tika.filetypedetector.TikaFileTypeDetector
- TikaGrpc - Class in org.apache.tika
-
The Tika Grpc Service definition
- TikaGrpc.AsyncService - Interface in org.apache.tika
-
The Tika Grpc Service definition
- TikaGrpc.TikaBlockingStub - Class in org.apache.tika
-
A stub to allow clients to do limited synchronous rpc calls to service Tika.
- TikaGrpc.TikaBlockingV2Stub - Class in org.apache.tika
-
A stub to allow clients to do synchronous rpc calls to service Tika.
- TikaGrpc.TikaFutureStub - Class in org.apache.tika
-
A stub to allow clients to do ListenableFuture-style rpc calls to service Tika.
- TikaGrpc.TikaImplBase - Class in org.apache.tika
-
Base class for the server implementation of the service Tika.
- TikaGrpc.TikaStub - Class in org.apache.tika
-
A stub to allow clients to do asynchronous rpc calls to service Tika.
- TikaGrpcServer - Class in org.apache.tika.pipes.grpc
-
Server that manages startup/shutdown of the GRPC Tika server.
- TikaGrpcServer() - Constructor for class org.apache.tika.pipes.grpc.TikaGrpcServer
- TikaGUI - Class in org.apache.tika.gui
-
Simple Swing GUI for Apache Tika.
- TikaGUI(Parser, TikaLoader) - Constructor for class org.apache.tika.gui.TikaGUI
- TikaHttpClient - Class in org.apache.tika.http
-
Lightweight HTTP client for Tika parser modules that call external REST endpoints (embedding APIs, VLM services, etc.).
- TikaImplBase() - Constructor for class org.apache.tika.TikaGrpc.TikaImplBase
- TikaInputStream - Class in org.apache.tika.io
-
Input stream with extended capabilities for detection and parsing.
- TikaInputStream(InputStream, long) - Constructor for class org.apache.tika.io.TikaInputStream
-
Protected constructor for subclasses.
- tikaInputStreamGetFile(String) - Static method in class org.apache.tika.example.TIAParsingExample
- TikaJsonConfig - Class in org.apache.tika.config.loader
-
Parsed representation of a Tika JSON configuration file.
- TikaLoader - Class in org.apache.tika.config.loader
-
Main entry point for loading Tika components from JSON configuration.
- TikaLoggingFilter - Class in org.apache.tika.server.core
- TikaLoggingFilter(boolean) - Constructor for class org.apache.tika.server.core.TikaLoggingFilter
- TikaMemoryLimitException - Exception in org.apache.tika.exception
- TikaMemoryLimitException(long, long) - Constructor for exception org.apache.tika.exception.TikaMemoryLimitException
- TikaMemoryLimitException(String) - Constructor for exception org.apache.tika.exception.TikaMemoryLimitException
- TikaMimeKeys - Interface in org.apache.tika.metadata
-
A collection of Tika metadata keys used in Mime Type resolution
- TikaMimeTypes - Class in org.apache.tika.server.core.resource
-
Provides details of all the mimetypes known to Apache Tika, similar to --list-supported-types with the Tika CLI.
- TikaMimeTypes() - Constructor for class org.apache.tika.server.core.resource.TikaMimeTypes
- TikaModule - Class in org.apache.tika.serialization
-
Jackson module that provides compact serialization for Tika components.
- TikaModule() - Constructor for class org.apache.tika.serialization.TikaModule
- TikaMp4BoxHandler - Class in org.apache.tika.parser.mp4
- TikaMp4BoxHandler(Metadata, Metadata, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.mp4.TikaMp4BoxHandler
- TikaNameIdChunks - Class in org.apache.tika.parser.microsoft.msg
-
Collection of convenience chunks for the NameID part of an Outlook file
- TikaNameIdChunks() - Constructor for class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks
- TikaNameIdChunks.PredefinedPropertySet - Enum Class in org.apache.tika.parser.microsoft.msg
- TikaNameIdChunks.PropertySetType - Enum Class in org.apache.tika.parser.microsoft.msg
- TikaObjectMapperFactory - Class in org.apache.tika.config.loader
-
Factory for creating ObjectMappers configured for Tika serialization.
- TikaObjectMapperFactory() - Constructor for class org.apache.tika.config.loader.TikaObjectMapperFactory
- TikaPagedText - Interface in org.apache.tika.metadata
-
Metadata properties for paged text, metadata appropriate for an individual page (useful for embedded document handlers called on individual pages).
- TikaParsers - Class in org.apache.tika.server.core.resource
-
Provides details of all the Parsers registered with Apache Tika, similar to --list-parsers and --list-parser-details within the Tika CLI.
- TikaParsers() - Constructor for class org.apache.tika.server.core.resource.TikaParsers
- TikaPluginManager - Class in org.apache.tika.plugins
-
PF4J-based plugin manager for Tika pipes components.
- TikaPluginManager(List<Path>) - Constructor for class org.apache.tika.plugins.TikaPluginManager
- TikaProgressTracker - Class in org.apache.tika.config
-
Tracks parse progress for the two-tier timeout system.
- TikaProgressTracker() - Constructor for class org.apache.tika.config.TikaProgressTracker
-
Creates a tracker initialized to the current time.
- TikaProto - Class in org.apache.tika
- TikaResource - Class in org.apache.tika.server.core.resource
- TikaResource() - Constructor for class org.apache.tika.server.core.resource.TikaResource
- TikaSerializationException - Exception in org.apache.tika.serialization
- TikaSerializationException(String) - Constructor for exception org.apache.tika.serialization.TikaSerializationException
- TikaSerializationException(String, Throwable) - Constructor for exception org.apache.tika.serialization.TikaSerializationException
- TikaServerCli - Class in org.apache.tika.server.core
- TikaServerCli() - Constructor for class org.apache.tika.server.core.TikaServerCli
- TikaServerClientConfig - Class in org.apache.tika.server.client
- TikaServerClientConfig() - Constructor for class org.apache.tika.server.client.TikaServerClientConfig
- TikaServerConfig - Class in org.apache.tika.server.core
- TikaServerConfig() - Constructor for class org.apache.tika.server.core.TikaServerConfig
- TikaServerParseException - Exception in org.apache.tika.server.core
-
Simple wrapper exception to be thrown for consistent handling of exceptions that can happen during a parse.
- TikaServerParseException(Exception) - Constructor for exception org.apache.tika.server.core.TikaServerParseException
- TikaServerParseException(String) - Constructor for exception org.apache.tika.server.core.TikaServerParseException
- TikaServerParseExceptionMapper - Class in org.apache.tika.server.core
- TikaServerParseExceptionMapper(boolean) - Constructor for class org.apache.tika.server.core.TikaServerParseExceptionMapper
- TikaServerProcess - Class in org.apache.tika.server.core
- TikaServerProcess() - Constructor for class org.apache.tika.server.core.TikaServerProcess
- TikaServerResource - Interface in org.apache.tika.server.core.resource
-
Stub interface to allow for loading of resources via SPI
- TikaServerStatus - Class in org.apache.tika.server.core.resource
- TikaServerStatus(ServerStatus) - Constructor for class org.apache.tika.server.core.resource.TikaServerStatus
- TikaServerWriter<T> - Interface in org.apache.tika.server.core.writer
-
Stub interface to allow for SPI loading from other modules without opening up service loading to any generic MessageBodyWriter
- TikaTimeoutException - Exception in org.apache.tika.exception
-
Runtime/unchecked version of TimeoutException
- TikaTimeoutException(String) - Constructor for exception org.apache.tika.exception.TikaTimeoutException
- TikaToXMP - Class in org.apache.tika.xmp.convert
- TikaToXMP() - Constructor for class org.apache.tika.xmp.convert.TikaToXMP
- TikaUserDataBox - Class in org.apache.tika.parser.mp4.boxes
- TikaUserDataBox(String, byte[], Metadata, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.mp4.boxes.TikaUserDataBox
- TikaVersion - Class in org.apache.tika.server.core.resource
- TikaVersion() - Constructor for class org.apache.tika.server.core.resource.TikaVersion
- TikaWelcome - Class in org.apache.tika.server.core.resource
-
Provides a basic welcome to the Apache Tika Server.
- TikaWelcome(List<ResourceProvider>) - Constructor for class org.apache.tika.server.core.resource.TikaWelcome
- TikaWelcome.Endpoint - Class in org.apache.tika.server.core.resource
- TIME - Static variable in interface org.apache.tika.parser.ner.NERecogniser
- TIME_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
- TIME_SIGNATURE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The time signature of the music."
- TIMEOUT - Enum constant in enum class org.apache.tika.eval.app.ProfilerBase.PARSE_ERROR_TYPE
- TIMEOUT - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- TIMEOUT - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- TIMEOUT - Enum constant in enum class org.apache.tika.renderer.RenderResult.STATUS
- TIMEOUT - Static variable in class org.apache.tika.pipes.core.PipesResults
- TimeoutLimits - Class in org.apache.tika.config
-
Configuration for the two-tier task timeout system.
- TimeoutLimits() - Constructor for class org.apache.tika.config.TimeoutLimits
-
No-arg constructor for Jackson deserialization.
- TimeoutLimits(long, long) - Constructor for class org.apache.tika.config.TimeoutLimits
-
Constructor with both timeout parameters.
- TIMESTAMP - Static variable in interface org.apache.tika.metadata.Geographic
-
This is the timestamp derived from a GPS record.
- TITLE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A name given to the resource.
- TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
A shorthand reference for the item.
- TITLE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TITLE - Static variable in interface org.apache.tika.metadata.XMP
-
This doesn't belong to the XMP Basic schema.
- TITLE - Static variable in interface org.apache.tika.metadata.XMPDC
-
A name given to the resource.
- TlsConfig - Class in org.apache.tika.server.core
- TlsConfig() - Constructor for class org.apache.tika.server.core.TlsConfig
- tmp - Variable in class org.apache.tika.io.TikaInputStream
- TMXContentHandler - Class in org.apache.tika.parser.tmx
-
Content Handler for Translation Memory eXchange (TMX) files.
- TMXParser - Class in org.apache.tika.parser.tmx
-
Parser for Translation Memory eXchange (TMX) files.
- TMXParser() - Constructor for class org.apache.tika.parser.tmx.TMXParser
- TNEFParser - Class in org.apache.tika.parser.microsoft
-
A POI-powered Tika Parser for TNEF (Transport Neutral Encoding Format) messages, aka winmail.dat
- TNEFParser() - Constructor for class org.apache.tika.parser.microsoft.TNEFParser
- TO - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
- toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UNumber
-
Get this number as a BigInteger.
- toBigInteger() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- toBoolean(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toBuilder() - Method in class org.apache.tika.DeleteFetcherReply
- toBuilder() - Method in class org.apache.tika.DeleteFetcherRequest
- toBuilder() - Method in class org.apache.tika.DeletePipesIteratorReply
- toBuilder() - Method in class org.apache.tika.DeletePipesIteratorRequest
- toBuilder() - Method in class org.apache.tika.FetchAndParseReply
- toBuilder() - Method in class org.apache.tika.FetchAndParseRequest
- toBuilder() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- toBuilder() - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- toBuilder() - Method in class org.apache.tika.GetFetcherReply
- toBuilder() - Method in class org.apache.tika.GetFetcherRequest
- toBuilder() - Method in class org.apache.tika.GetPipesIteratorReply
- toBuilder() - Method in class org.apache.tika.GetPipesIteratorRequest
- toBuilder() - Method in class org.apache.tika.ListFetchersReply
- toBuilder() - Method in class org.apache.tika.ListFetchersRequest
- toBuilder() - Method in class org.apache.tika.SaveFetcherReply
- toBuilder() - Method in class org.apache.tika.SaveFetcherRequest
- toBuilder() - Method in class org.apache.tika.SavePipesIteratorReply
- toBuilder() - Method in class org.apache.tika.SavePipesIteratorRequest
- toByte() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
-
This method is used to get the byte value of the 8bit stream object header End.
- toByteArray() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- toByteArray(List<Byte>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
- toBytes(Object) - Static method in class org.apache.tika.pipes.core.serialization.JsonPipesIpc
-
Serialize an object to Smile binary format bytes.
- toChar(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toCharset() - Method in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
- toDouble(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toGeoTag(Map<String, List<Location>>, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
- ToHTMLContentHandler - Class in org.apache.tika.sax
-
SAX event handler that serializes the HTML document to a character stream.
- ToHTMLContentHandler() - Constructor for class org.apache.tika.sax.ToHTMLContentHandler
- ToHTMLContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToHTMLContentHandler
- toInt16(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toInt16(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns a 16-bit signed integer converted from two bytes at a specified position in a byte array.
- toInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns a 32-bit signed integer converted from four bytes at a specified position in a byte array.
- toInt64(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toJson() - Method in class org.apache.tika.config.loader.TikaLoader
-
Converts the current configuration to a JSON string (pretty-printed).
- toJson() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Serializes this DataPackage to JSON.
- toJson(List<Chunk>) - Static method in class org.apache.tika.inference.ChunkSerializer
-
Serialize chunks to a JSON array string.
- toJson(List<Metadata>, Writer) - Static method in class org.apache.tika.serialization.JsonMetadataList
-
Serializes a list of Metadata objects to JSON.
- toJson(List<Metadata>, Writer, boolean) - Static method in class org.apache.tika.serialization.JsonMetadataList
-
Serializes a list of Metadata objects to JSON.
- toJson(List<FetchEmitTuple>) - Static method in class org.apache.tika.pipes.core.serialization.JsonFetchEmitTupleList
- toJson(List<FetchEmitTuple>, Writer) - Static method in class org.apache.tika.pipes.core.serialization.JsonFetchEmitTupleList
- toJson(Metadata, Writer) - Static method in class org.apache.tika.serialization.JsonMetadata
-
Serializes a Metadata object to Json.
- toJson(FetchEmitTuple) - Static method in class org.apache.tika.pipes.core.serialization.JsonFetchEmitTuple
- toJson(FetchEmitTuple, Writer) - Static method in class org.apache.tika.pipes.core.serialization.JsonFetchEmitTuple
- toJson(EmitDataImpl, Writer) - Static method in class org.apache.tika.pipes.core.serialization.JsonEmitData
- toJsonPath(Path) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Converts a Path to a JSON-safe string with forward slashes.
- toKebabCase(String) - Static method in class org.apache.tika.annotation.KebabCaseConverter
-
Converts a Java class name to kebab-case.
- toKebabCase(String) - Static method in class org.apache.tika.config.loader.KebabCaseConverter
-
Converts a Java class name to kebab-case.
- TOKEN_ENTROPY_RATE - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOKEN_LENGTH_MEAN - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOKEN_LENGTH_STD_DEV - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOKEN_LENGTH_SUM - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TokenContraster - Class in org.apache.tika.eval.core.tokens
-
Computes some corpus contrast statistics.
- TokenContraster() - Constructor for class org.apache.tika.eval.core.tokens.TokenContraster
- TokenCountPriorityQueue - Class in org.apache.tika.eval.core.textstats
-
Bounded min-heap that keeps the top-N TokenIntPairs by value.
- TokenCountPriorityQueue - Class in org.apache.tika.eval.core.tokens
-
Bounded min-heap that keeps the top-N TokenIntPairs by value.
- TokenCountPriorityQueue(int) - Constructor for class org.apache.tika.eval.core.textstats.TokenCountPriorityQueue
- TokenCounts - Class in org.apache.tika.eval.core.tokens
- TokenCounts() - Constructor for class org.apache.tika.eval.core.tokens.TokenCounts
- TokenCountStatsCalculator<T> - Interface in org.apache.tika.eval.core.textstats
-
Interface for calculators that require token stats
- TokenEntropy - Class in org.apache.tika.eval.core.textstats
- TokenEntropy() - Constructor for class org.apache.tika.eval.core.textstats.TokenEntropy
- TokenIntPair - Class in org.apache.tika.eval.core.tokens
- TokenIntPair(String, int) - Constructor for class org.apache.tika.eval.core.tokens.TokenIntPair
- tokenize(String) - Method in class org.apache.tika.eval.core.tokens.AnalyzerManager
-
Tokenize the given text and return a TokenCounts object.
- tokenize(String) - Static method in class org.apache.tika.eval.core.tokens.TikaEvalTokenizer
-
Tokenize in TikaEvalTokenizer.Mode.COMMON_TOKENS mode and return tokens as a list.
- tokenize(String) - Static method in class org.apache.tika.langdetect.charsoup.WordTokenizer
-
Tokenize the given raw text with full preprocessing (truncate, strip URLs/emails, NFC normalize, case fold) and return tokens as a list.
- tokenize(String) - Static method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
- tokenize(String, Consumer<String>) - Method in class org.apache.tika.eval.core.tokens.AnalyzerManager
-
Tokenize and stream tokens to a consumer, respecting maxTokens limit.
- tokenize(String, Consumer<String>) - Static method in class org.apache.tika.eval.core.tokens.TikaEvalTokenizer
-
Tokenize in TikaEvalTokenizer.Mode.COMMON_TOKENS mode, streaming tokens to a consumer.
- tokenize(String, Consumer<String>) - Static method in class org.apache.tika.langdetect.charsoup.WordTokenizer
-
Tokenize with full preprocessing, streaming tokens to a consumer.
- tokenize(String, TikaEvalTokenizer.Mode) - Static method in class org.apache.tika.eval.core.tokens.TikaEvalTokenizer
-
Tokenize in the specified mode and return tokens as a list.
- tokenize(String, TikaEvalTokenizer.Mode, int, Consumer<String>) - Static method in class org.apache.tika.eval.core.tokens.TikaEvalTokenizer
-
Tokenize in the specified mode, streaming at most maxTokens tokens to a consumer.
- tokenize(String, TikaEvalTokenizer.Mode, Consumer<String>) - Static method in class org.apache.tika.eval.core.tokens.TikaEvalTokenizer
-
Tokenize in the specified mode, streaming tokens to a consumer.
- tokenizeAlphanumeric(String, Consumer<String>) - Static method in class org.apache.tika.langdetect.charsoup.WordTokenizer
-
Tokenize the given raw text with full preprocessing, including numeric tokens.
- TokenLengths - Class in org.apache.tika.eval.core.textstats
- TokenLengths() - Constructor for class org.apache.tika.eval.core.textstats.TokenLengths
- TokenStatistics - Class in org.apache.tika.eval.core.tokens
- TokenStatistics(int, int, TokenIntPair[], double, SummaryStatistics) - Constructor for class org.apache.tika.eval.core.tokens.TokenStatistics
- toListOfByte(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
- ToMarkdownContentHandler - Class in org.apache.tika.sax
-
SAX event handler that writes content as Markdown.
- ToMarkdownContentHandler() - Constructor for class org.apache.tika.sax.ToMarkdownContentHandler
- ToMarkdownContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToMarkdownContentHandler
- ToMarkdownContentHandler(Writer) - Constructor for class org.apache.tika.sax.ToMarkdownContentHandler
- toMediaType(OggStreamIdentifier.OggStreamType) - Static method in class org.apache.tika.detect.ogg.OggDetector
-
Converts from vorbis-java type to Tika's type
- top() - Method in class org.apache.tika.eval.core.textstats.TokenCountPriorityQueue
- top() - Method in class org.apache.tika.eval.core.tokens.TokenCountPriorityQueue
- TOP_10_MORE_IN_A - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOP_10_MORE_IN_B - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOP_10_UNIQUE_TOKEN_DIFFS_A - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOP_10_UNIQUE_TOKEN_DIFFS_B - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- TOP_N_TOKENS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- topic() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the topic record component.
- TopNTokens - Class in org.apache.tika.eval.core.textstats
- TopNTokens(int) - Constructor for class org.apache.tika.eval.core.textstats.TopNTokens
- TopologyCreationTimeStamp - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- topShortTextLanguages(String, int) - Static method in class org.apache.tika.langdetect.charsoup.CharSoupLanguageDetector
-
Return the top n language codes from the short-text discriminative model, ranked by raw logit (descending).
- toResponse(TikaServerParseException) - Method in class org.apache.tika.server.core.TikaServerParseExceptionMapper
- toSingle(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toString() - Method in class org.apache.tika.config.EmbeddedLimits
- toString() - Method in record class org.apache.tika.config.loader.ComponentInfo
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.config.loader.TikaJsonConfig
- toString() - Method in class org.apache.tika.config.OutputLimits
- toString() - Method in class org.apache.tika.config.TimeoutLimits
- toString() - Method in class org.apache.tika.detect.EncodingDetectorContext.Result
- toString() - Method in class org.apache.tika.detect.EncodingResult
- toString() - Method in class org.apache.tika.detect.MagicDetector
-
Returns a string representation of the Detection Rule.
- toString() - Method in class org.apache.tika.eval.app.EvalConfig
- toString() - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
- toString() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
- toString() - Method in class org.apache.tika.io.TikaInputStream
- toString() - Method in class org.apache.tika.language.detect.LanguageResult
- toString() - Method in class org.apache.tika.metadata.filter.CompositeMetadataFilter
- toString() - Method in class org.apache.tika.metadata.filter.FieldNameMappingFilter
- toString() - Method in class org.apache.tika.metadata.Metadata
- toString() - Method in class org.apache.tika.metadata.writefilter.StandardMetadataLimiterFactory
- toString() - Method in class org.apache.tika.mime.MediaType
- toString() - Method in class org.apache.tika.mime.MimeType
-
Returns the name of this media type.
- toString() - Method in class org.apache.tika.ml.chardetect.ScoredCandidate
- toString() - Method in class org.apache.tika.ml.chardetect.SpecialistOutput
- toString() - Method in class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
- toString() - Method in class org.apache.tika.ml.Prediction
- toString() - Method in class org.apache.tika.parser.AutoDetectParserConfig
- toString() - Method in class org.apache.tika.parser.csv.CSVResult
- toString() - Method in class org.apache.tika.parser.dif.DIFContentHandler
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
-
Returns textual representation of ChmBlockInfo
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
-
Prints the values of ChmfHeader
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
-
Returns textual representation of ChmLzxcControlData
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
-
Suitable for informational output.
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
-
Returns textual representation of the pmgi header
- toString() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- toString() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
- toString() - Method in class org.apache.tika.parser.microsoft.NumberCell
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
- toString() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
- toString() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFToken
- toString() - Method in class org.apache.tika.parser.microsoft.TextCell
- toString() - Method in class org.apache.tika.parser.ParseContext
- toString() - Method in class org.apache.tika.parser.pdf.OcrConfig.StrategyAuto
- toString() - Method in class org.apache.tika.parser.pdf.updates.StartXRefOffset
- toString() - Method in enum class org.apache.tika.parser.strings.StringsEncoding
- toString() - Method in class org.apache.tika.parser.txt.CharsetMatch
- toString() - Method in record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.api.emitter.EmitKey
- toString() - Method in class org.apache.tika.pipes.api.FetchEmitTuple
- toString() - Method in class org.apache.tika.pipes.api.fetcher.FetchKey
- toString() - Method in class org.apache.tika.pipes.api.pipesiterator.TotalCountResult
- toString() - Method in record class org.apache.tika.pipes.api.PipesResult
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.core.async.EmitDataPair
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.core.config.ConfigMerger.MergeResult
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.core.emitter.EmitDataImpl
- toString() - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
- toString() - Method in record class org.apache.tika.pipes.core.extractor.frictionless.FrictionlessResource
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.core.extractor.FrictionlessUnpackHandler.FrictionlessFileInfo
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.core.extractor.StandardUnpackSelector
- toString() - Method in record class org.apache.tika.pipes.core.extractor.TempFileUnpackHandler.EmbeddedFileInfo
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.core.extractor.UnpackConfig
- toString() - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Overrides the record default to prevent apiKey leaking into logs.
- toString() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.emitter.es.JsonResponse
- toString() - Method in record class org.apache.tika.pipes.emitter.fs.FileSystemEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.emitter.opensearch.JsonResponse
- toString() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
- toString() - Method in class org.apache.tika.pipes.fork.PipesForkResult
- toString() - Method in record class org.apache.tika.pipes.reporter.es.ESReporterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.reporter.fs.FileSystemReporterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.reporter.jdbc.JDBCPipesReporterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.pipes.reporter.opensearch.JsonResponse
- toString() - Method in record class org.apache.tika.pipes.reporter.opensearch.OpenSearchReporterConfig
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.plugins.ExtensionConfig
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.quality.TextQualityComparison
- toString() - Method in class org.apache.tika.quality.TextQualityScore
- toString() - Method in class org.apache.tika.sax.ContentHandlerDecorator
- toString() - Method in class org.apache.tika.sax.DIFContentHandler
- toString() - Method in class org.apache.tika.sax.Link
- toString() - Method in class org.apache.tika.sax.StandardReference
- toString() - Method in class org.apache.tika.sax.TextContentHandler
- toString() - Method in class org.apache.tika.sax.ToMarkdownContentHandler
- toString() - Method in class org.apache.tika.sax.ToTextContentHandler
-
Returns the contents of the internal string buffer where all the received characters have been collected.
- toString() - Method in class org.apache.tika.server.client.TikaEmitterResult
- toString() - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Returns a string representation of this record class.
- toString() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns a string representation of this record class.
- toString() - Method in class org.apache.tika.server.core.TaskStatus
- toString() - Method in class org.apache.tika.server.core.TlsConfig
- toString() - Method in class org.apache.tika.Tika
- toString() - Method in class org.apache.tika.utils.FileProcessResult
- toString() - Method in class org.apache.tika.xmp.XMPMetadata
-
Serializes the XMP data in compact form without packet wrapper
- toString(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toTags(CharacterRun) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
- TOTAL_TIME - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- TOTAL_UNMAPPED_UNICODE_CHARS - Static variable in interface org.apache.tika.metadata.PDF
- TotalCounter - Interface in org.apache.tika.pipes.api.pipesiterator
-
Interface for pipesiterators that allow counting of total documents.
- TotalCountResult - Class in org.apache.tika.pipes.api.pipesiterator
- TotalCountResult() - Constructor for class org.apache.tika.pipes.api.pipesiterator.TotalCountResult
- TotalCountResult(long, TotalCountResult.STATUS) - Constructor for class org.apache.tika.pipes.api.pipesiterator.TotalCountResult
- TotalCountResult.STATUS - Enum Class in org.apache.tika.pipes.api.pipesiterator
- ToTextContentHandler - Class in org.apache.tika.sax
-
SAX event handler that writes all character content out to a character stream.
- ToTextContentHandler() - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to an internal string buffer.
- ToTextContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to the given output stream using the given encoding.
- ToTextContentHandler(Writer) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to the given writer.
- toUint16() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
-
This method is used to get the Uint16 value of the 16-bit stream object header End.
- ToUint16() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
-
This method is used to get the Uint16 value of the 16bit stream object header.
- ToUInt16(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns a 16-bit unsigned integer converted from two bytes at a specified position in a byte array.
- toUInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
- toUInt32(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns a 32-bit unsigned integer converted from four bytes at a specified position in a byte array.
- toUInt64(byte[], int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
-
Returns a 64-bit unsigned integer converted from eight bytes at a specified position in a byte array.
- ToXMLContentHandler - Class in org.apache.tika.sax
-
SAX event handler that serializes the XML document to a character stream.
- ToXMLContentHandler() - Constructor for class org.apache.tika.sax.ToXMLContentHandler
- ToXMLContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToXMLContentHandler
-
Creates an XML serializer that writes to the given byte stream using the given character encoding.
- ToXMLContentHandler(String) - Constructor for class org.apache.tika.sax.ToXMLContentHandler
- TRACK_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"A numeric value indicating the order of the audio file within its original recording."
- TrainedModel - Class in org.apache.tika.detect
- TrainedModel() - Constructor for class org.apache.tika.detect.TrainedModel
- TrainedModelDetector - Class in org.apache.tika.detect
- TrainedModelDetector() - Constructor for class org.apache.tika.detect.TrainedModelDetector
- TrainJunkModel - Class in org.apache.tika.ml.junkdetect.tools
-
Trains the junk detector model from per-script corpus files produced by BuildJunkTrainingData.
- TrainJunkModel() - Constructor for class org.apache.tika.ml.junkdetect.tools.TrainJunkModel
- TrainNaiveBayesBigram - Class in org.apache.tika.ml.chardetect.tools
-
Naive-Bayes byte-bigram charset classifier trainer.
- TrainNaiveBayesBigram() - Constructor for class org.apache.tika.ml.chardetect.tools.TrainNaiveBayesBigram
- TrainTestSplit - Class in org.apache.tika.eval.app.tools
- TrainTestSplit() - Constructor for class org.apache.tika.eval.app.tools.TrainTestSplit
- transactionalId() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the transactionalId record component.
- transactionTimeoutMs() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the transactionTimeoutMs record component.
- TranscribeTranslateExample - Class in org.apache.tika.example
-
This example demonstrates primitive logic for chaining Tika API calls.
- TranscribeTranslateExample() - Constructor for class org.apache.tika.example.TranscribeTranslateExample
- transferTo(OutputStream) - Method in class org.apache.tika.io.BoundedInputStream
- translate(String) - Method in class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
-
Translate the passed text using the Marian Server.
- translate(String) - Method in class org.apache.tika.language.translate.impl.RTGTranslator
- translate(String, String) - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Translate, using the first available service-loaded translator
- translate(String, String) - Method in class org.apache.tika.language.translate.EmptyTranslator
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.ExternalTranslator
-
Default translate method which uses built-in Tika language identification.
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.GoogleTranslator
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.JoshuaNetworkTranslator
-
Make an attempt to guess the source language via org.apache.tika.language.translate.AbstractTranslator#detectLanguage(String) before making the call to JoshuaNetworkTranslator.translate(String, String, String)
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.Lingo24Translator
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.MarianTranslator
-
Default translate method which uses built-in Tika language identification.
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.MicrosoftTranslator
-
Use the Microsoft service to translate the given text to the given target language.
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.RTGTranslator
- translate(String, String) - Method in class org.apache.tika.language.translate.impl.YandexTranslator
- translate(String, String) - Method in interface org.apache.tika.language.translate.Translator
-
Translate text to the given language. This method attempts to auto-detect the source language of the text.
- translate(String, String) - Method in class org.apache.tika.Tika
-
Translate the given text String to the given language, attempting to auto-detect the source language.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Translate, using the first available service-loaded translator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.EmptyTranslator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.GoogleTranslator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.JoshuaNetworkTranslator
-
First checks whether the source language has been provided.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.Lingo24Translator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.MarianTranslator
-
Translate method with specific source and target languages.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.MicrosoftTranslator
-
Use the Microsoft service to translate the given text from the given source language to the given target.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.MosesTranslator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.RTGTranslator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.impl.YandexTranslator
- translate(String, String, String) - Method in interface org.apache.tika.language.translate.Translator
-
Translate text between given languages.
- translate(String, String, String) - Method in class org.apache.tika.Tika
-
Translate the given text String to and from the given languages.
- translate(TikaInputStream, Metadata, OutputStream) - Method in class org.apache.tika.extractor.DefaultEmbeddedStreamTranslator
-
This will consume the InputStream and write the stream to the output stream
- translate(TikaInputStream, Metadata, OutputStream) - Method in interface org.apache.tika.extractor.EmbeddedStreamTranslator
- translate(TikaInputStream, Metadata, OutputStream) - Method in class org.apache.tika.extractor.microsoft.MSEmbeddedStreamTranslator
- translate(TikaInputStream, Metadata, OutputStream) - Method in class org.apache.tika.extractor.microsoft.PSTEmailStreamTranslator
- TRANSLATE - Enum constant in enum class org.apache.tika.server.core.ServerStatus.TASK
- translateArgs(String[]) - Static method in class org.apache.tika.cli.AsyncHelper
- translatePost(InputStream, String, String, String) - Method in class org.apache.tika.server.core.resource.TranslateResource
- translatePut(InputStream, String, String, String) - Method in class org.apache.tika.server.core.resource.TranslateResource
- TranslateResource - Class in org.apache.tika.server.core.resource
- TranslateResource(ServerStatus) - Constructor for class org.apache.tika.server.core.resource.TranslateResource
- Translator - Interface in org.apache.tika.language.translate
-
Interface for Translator services.
- TranslatorExample - Class in org.apache.tika.example
- TranslatorExample() - Constructor for class org.apache.tika.example.TranslatorExample
- TRANSMISSION_REFERENCE - Static variable in interface org.apache.tika.metadata.Photoshop
- TrecDocumentGenerator - Class in org.apache.tika.example
-
Generates document summaries for corpus analysis in the Open Relevance project.
- TrecDocumentGenerator() - Constructor for class org.apache.tika.example.TrecDocumentGenerator
- trimMessage(String) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Utility method to trim the message from a stack trace string.
- TRUE - Static variable in class org.apache.tika.eval.app.ProfilerBase
- TrueTypeParser - Class in org.apache.tika.parser.font
-
Parser for TrueType font files (TTF).
- TrueTypeParser() - Constructor for class org.apache.tika.parser.font.TrueTypeParser
- truncateContent(ContentTags, int, Map<Cols, String>) - Static method in class org.apache.tika.eval.app.ProfilerBase
-
Get the content and record in the data Cols.CONTENT_TRUNCATED_AT_MAX_LEN whether the string was truncated
- TRUNCATED_CONTENT_FOR_DETECTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This indicates that only a portion of the file content was provided for detection.
- TRUNCATED_METADATA - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This means that metadata keys or metadata values were truncated.
- tryAnalyzeWhetherConfirmSchema(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to analyze whether the data elements conform to the schema defined in MS-FSSHTTPD.
- tryAnalyzeWhetherFullDataElementList(List<DataElement>, ExGuid) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
-
This method is used to try to analyze whether the returned data elements are complete.
- tryGetCurrent(byte[], AtomicInteger, AtomicReference<T>, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
-
Tries to get the current object; returns true on success.
- tryOpenContainerOnTikaInputStream(TikaInputStream, Metadata) - Static method in class org.apache.tika.detect.microsoft.POIFSContainerDetector
- tryParse(byte[], int, AtomicReference<StreamObjectHeaderStart>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
-
This method is used to parse the actual 16bit or 32bit stream header.
- tryToAdd(FetchEmitTuple) - Method in class org.apache.tika.pipes.pipesiterator.PipesIteratorBase
- tryToFindExistingLeafParser(Class, ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Tries to find an existing parser within the ParseContext.
- tryToGetMsgTitle(DirectoryEntry, String) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
- tryToOpenZipFile(TikaInputStream, Metadata) - Static method in class org.apache.tika.zip.utils.ZipSalvager
-
Tries to open a ZipFile from the TikaInputStream using default charset.
- tryToOpenZipFile(TikaInputStream, Metadata, Charset) - Static method in class org.apache.tika.zip.utils.ZipSalvager
-
Tries to open a ZipFile from the TikaInputStream.
- tryToParse(String) - Method in class org.apache.tika.utils.DateUtils
-
Tries to parse the date string; returns null if no parse was possible.
- TSD_MIME_TYPE - Static variable in class org.apache.tika.parser.crypto.TSDParser
- TSDParser - Class in org.apache.tika.parser.crypto
-
Tika parser for Time Stamped Data Envelope (application/timestamped-data)
- TSDParser() - Constructor for class org.apache.tika.parser.crypto.TSDParser
- TwoBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
-
This class is used to represent a property that contains 2 bytes of data in the PropertySet.rgData stream field.
- TwoBytesOfData - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
The property contains 2 bytes of data in the PropertySet.rgData stream field.
- TwoBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
- TXT - Enum constant in enum class org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
- TXTParser - Class in org.apache.tika.parser.txt
-
Plain text parser.
- TXTParser() - Constructor for class org.apache.tika.parser.txt.TXTParser
- TXTParser(EncodingDetector) - Constructor for class org.apache.tika.parser.txt.TXTParser
- type - Variable in class org.apache.tika.mime.MimeTypesReader
-
Current type
- type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
- type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
- type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
- type - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
- type() - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Returns the value of the type record component.
- type() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns the value of the type record component.
- TYPE - Static variable in interface org.apache.tika.metadata.DublinCore
-
The nature or genre of the content of the resource.
- TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- TYPE - Static variable in interface org.apache.tika.metadata.XMPDC
-
The nature or genre of the content of the resource.
- TYPED - Static variable in class org.apache.tika.serialization.serdes.ParseContextSerializer
- TypeDetector - Class in org.apache.tika.detect
-
Content type detection based on a content type hint.
- TypeDetector() - Constructor for class org.apache.tika.detect.TypeDetector
- typeName - Variable in class org.apache.tika.serialization.serdes.SpiCompositeSerializer
- types - Variable in class org.apache.tika.metadata.filter.ClearByAttachmentTypeMetadataFilter.Config
- types - Variable in class org.apache.tika.mime.MimeTypesReader
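The ToUInt16, toUInt32 and toUInt64 entries above describe reading an unsigned integer from consecutive bytes of an array. The following is a minimal standalone sketch of that idea for little-endian data. It is not the source of Tika's LittleEndianBitConverter: the class name LittleEndianSketch and both helper methods are hypothetical and exist only for illustration.

```java
// Hypothetical illustration only; not part of Apache Tika.
final class LittleEndianSketch {
    private LittleEndianSketch() {
    }

    /** Reads 2 bytes starting at offset, least significant byte first (range 0..65535). */
    static int toUInt16(byte[] data, int offset) {
        // Mask each byte with 0xFF so that negative bytes do not sign-extend.
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    /** Reads 4 bytes starting at offset; a long keeps values above Integer.MAX_VALUE positive. */
    static long toUInt32(byte[] data, int offset) {
        long value = 0;
        for (int i = 0; i < 4; i++) {
            // Byte i contributes bits 8*i .. 8*i+7 of the result.
            value |= (long) (data[offset + i] & 0xFF) << (8 * i);
        }
        return value;
    }

    public static void main(String[] args) {
        byte[] data = {0x34, 0x12, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF, (byte) 0xFF};
        System.out.println(toUInt16(data, 0)); // 0x1234
        System.out.println(toUInt32(data, 2)); // 0xFFFFFFFF as an unsigned value
    }
}
```

A 16-bit value needs two bytes, a 32-bit value four and a 64-bit value eight. A full 64-bit unsigned result does not fit in a signed long, which is presumably why the index also lists dedicated unsigned types.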
U
- U - Enum constant in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
- ubyte(byte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned byte by masking it with 0xFF.
- ubyte(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned byte
- ubyte(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned byte
- ubyte(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned byte
- ubyte(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned byte
- UByte - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
-
The unsigned byte type
- ubyteToInt(byte) - Static method in class org.apache.tika.io.EndianUtils
-
Convert an 'unsigned' byte to an integer, i.e., don't carry across the sign.
- uint(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned int by masking it with 0xFFFFFFFF.
- uint(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned int
- uint(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned int
- uint16() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
-
unsigned 2 byte
- uint16(int) - Method in class org.apache.tika.parser.hwp.HwpStreamReader
-
unsigned 2 byte array
- uint32() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
-
unsigned 4 byte
- uint8() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
-
unsigned 1 byte
- UInteger - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
-
The unsigned int type
- ulong(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned long by masking it with 0xFFFFFFFFFFFFFFFF.
- ulong(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned long
- ulong(BigInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned long
- ULong - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
-
The unsigned long type
- UMath - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
- UNCOMPRESSED - Enum constant in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.EntryType
- UNCOMPRESSED - Static variable in class org.apache.tika.parser.microsoft.chm.ChmCommons
- UNCOMPRESSED_SIZE - Static variable in interface org.apache.tika.metadata.Zip
-
Uncompressed size of the entry in bytes.
- UNDEFINED - Static variable in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Represents lzx block types in order to decompress differently
- Underline - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- UnderlineType - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- unescapeCommandLine(String) - Static method in class org.apache.tika.utils.ProcessUtils
- UNICODE_CHAR_BLOCKS - Enum constant in enum class org.apache.tika.eval.app.db.Cols
- UNICODE_ESCAPE - Enum constant in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
- UnicodeBlockCounter - Class in org.apache.tika.eval.core.textstats
- UnicodeBlockCounter(int) - Constructor for class org.apache.tika.eval.core.textstats.UnicodeBlockCounter
- UniversalEncodingDetector - Class in org.apache.tika.parser.txt
- UniversalEncodingDetector() - Constructor for class org.apache.tika.parser.txt.UniversalEncodingDetector
-
Default constructor for SPI loading.
- UniversalEncodingDetector(JsonConfig) - Constructor for class org.apache.tika.parser.txt.UniversalEncodingDetector
-
Constructor for JSON configuration.
- UniversalEncodingDetector(UniversalEncodingDetector.Config) - Constructor for class org.apache.tika.parser.txt.UniversalEncodingDetector
-
Constructor with explicit Config object.
- UniversalEncodingDetector.Config - Class in org.apache.tika.parser.txt
-
Configuration class for JSON deserialization.
- UniversalExecutableParser - Class in org.apache.tika.parser.executable
-
Parser for universal executable files.
- UniversalExecutableParser() - Constructor for class org.apache.tika.parser.executable.UniversalExecutableParser
- UNIX_MODE - Static variable in interface org.apache.tika.metadata.Zip
-
Unix file mode/permissions for the entry.
- Unknown - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- UNKNOWN - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- UNKNOWN - Static variable in class org.apache.tika.quality.TextQualityScore
-
Sentinel z-score returned when scoring could not be run (e.g. null or empty input).
- UNKNOWN_ENUM - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- UNKNOWN_GUID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.Error
- UNKNOWN_IMG_NS - Static variable in class org.apache.tika.parser.image.ImageMetadataExtractor
- UNKNOWN13 - Enum constant in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
- UNLIMITED - Static variable in class org.apache.tika.config.EmbeddedLimits
- UNLIMITED - Static variable in class org.apache.tika.config.OutputLimits
- UNLISTED_SLIDE_NAMES - Static variable in interface org.apache.tika.metadata.Office
- UNMAPPED_UNICODE_CHARS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
- unmarshalBytes(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- unmarshalCharArray(byte[], ChmPmglHeader, int) - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
- unmarshalUByte() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- unmarshalUtfChar() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
- unpack(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.UnpackerResource
-
Extracts embedded documents from a container file (simple PUT, no config).
- UNPACK - Enum constant in enum class org.apache.tika.pipes.api.ParseMode
-
Extracts embedded document bytes and emits them, with full RMETA metadata.
- UNPACK_EMITTER_ID - Static variable in class org.apache.tika.server.core.resource.PipesParsingHelper
-
Name of the file-system emitter used for UNPACK mode.
- unpackAll(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.UnpackerResource
-
Extracts embedded documents plus original document and metadata (simple PUT).
- unpackAllWithConfig(List<Attachment>, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.UnpackerResource
-
Extracts embedded documents plus original/metadata with config (multipart POST).
- UnpackConfig - Class in org.apache.tika.pipes.core.extractor
- UnpackConfig() - Constructor for class org.apache.tika.pipes.core.extractor.UnpackConfig
-
Create an UnpackConfig with default settings.
- UnpackConfig.KEY_BASE_STRATEGY - Enum Class in org.apache.tika.pipes.core.extractor
- UnpackConfig.OUTPUT_FORMAT - Enum Class in org.apache.tika.pipes.core.extractor
-
Output format for UNPACK mode.
- UnpackConfig.OUTPUT_MODE - Enum Class in org.apache.tika.pipes.core.extractor
-
Output mode for how embedded files are delivered.
- UnpackConfig.SUFFIX_STRATEGY - Enum Class in org.apache.tika.pipes.core.extractor
- UnpackerResource - Class in org.apache.tika.server.core.resource
-
JAX-RS resource for unpacking embedded documents from container files.
- UnpackerResource() - Constructor for class org.apache.tika.server.core.resource.UnpackerResource
- UnpackExtractor - Class in org.apache.tika.pipes.core.extractor
-
Embedded document extractor that parses and unpacks embedded documents, extracting both text/metadata and raw bytes.
- UnpackExtractor(ParseContext) - Constructor for class org.apache.tika.pipes.core.extractor.UnpackExtractor
- UnpackExtractorFactory - Class in org.apache.tika.pipes.core.extractor
- UnpackExtractorFactory() - Constructor for class org.apache.tika.pipes.core.extractor.UnpackExtractorFactory
- UnpackHandler - Interface in org.apache.tika.extractor
- UnpackResult(Path, List<Metadata>) - Constructor for record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Creates an instance of a UnpackResult record class.
- UnpackSelector - Interface in org.apache.tika.extractor
- UnpackSelector.AcceptAll - Class in org.apache.tika.extractor
- unpackWithConfig(List<Attachment>, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.UnpackerResource
-
Extracts embedded documents with configuration (multipart POST).
- UnrarParser - Class in org.apache.tika.parser.pkg
-
Parser for Rar files.
- UnrarParser() - Constructor for class org.apache.tika.parser.pkg.UnrarParser
- unravelStringMet(NetcdfFile, Group, Metadata) - Method in class org.apache.tika.parser.hdf.HDFParser
- UNRECOGNIZED - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
- Unsigned - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
-
A utility class for static access to unsigned number functionality.
- UNSPECIFIED - Enum constant in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
- UNSPECIFIED_CRASH - Enum constant in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
- UNSPECIFIED_CRASH - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- UNSPECIFIED_CRASH - Static variable in class org.apache.tika.pipes.core.PipesResults
- UNSPECIFIED_MEDIA_TYPE - Static variable in class org.apache.tika.parser.html.DataURISchemeUtil
- UNSUPPORTED - Enum constant in enum class org.apache.tika.pipes.api.pipesiterator.TotalCountResult.STATUS
- UNSUPPORTED - Static variable in class org.apache.tika.pipes.api.pipesiterator.TotalCountResult
- UNSUPPORTED_OOXML_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
We claim to support all OOXML files, but we actually don't support a small number of them.
- UNSUPPORTED_VERSION - Enum constant in enum class org.apache.tika.eval.app.ProfilerBase.EXCEPTION_TYPE
- UnsupportedFormatException - Exception in org.apache.tika.exception
-
Parsers should throw this exception when they encounter a file format that they do not support.
- UnsupportedFormatException(String) - Constructor for exception org.apache.tika.exception.UnsupportedFormatException
- UNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
-
A base type for unsigned numbers.
- UNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UNumber
- unwrapClass(Parser) - Method in class org.apache.tika.config.loader.ParserLoader
- unwrapClass(T) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Unwrap a component to get the underlying class for auto-exclusion purposes.
- unzipPlugin(Path) - Static method in class org.apache.tika.plugins.ThreadSafeUnzipper
-
Unzips a plugin zip file to a directory with the same name (minus .zip extension).
- update() - Method in class org.apache.tika.config.TikaProgressTracker
-
Signals progress directly on this tracker instance.
- update(byte[], int, int) - Method in interface org.apache.tika.eval.core.textstats.BytesRefCalculator.BytesRefCalcInstance
- update(ParseContext) - Static method in class org.apache.tika.config.TikaProgressTracker
-
Signals progress from a ParseContext.
- UPDATE_MUST_EXIST - Enum constant in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.UpdateStrategy
- UPDATE_MUST_NOT_EXIST - Enum constant in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.UpdateStrategy
- updateInsertStatement(int, PreparedStatement, ColInfo, String) - Static method in class org.apache.tika.eval.app.db.JDBCUtil
- updateStrategy() - Method in record class org.apache.tika.pipes.emitter.es.ESEmitterConfig
-
Returns the value of the updateStrategy record component.
- updateStrategy() - Method in record class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig
-
Returns the value of the updateStrategy record component.
- updateStrategy() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the updateStrategy record component.
- UPSERT - Enum constant in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.UpdateStrategy
- UPSERT - Enum constant in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.UpdateStrategy
- URGENCY - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- URGENCY - Static variable in interface org.apache.tika.metadata.Photoshop
- uri - Variable in class org.apache.tika.xmp.convert.Namespace
- URI - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- url() - Method in record class org.apache.tika.parser.vlm.AbstractVLMParser.HttpCall
-
Returns the value of the url record component.
- URL - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- USAGE() - Static method in class org.apache.tika.eval.app.ExtractComparer
- USAGE() - Static method in class org.apache.tika.eval.app.reports.ResultsReporter
- USAGE_TERMS - Static variable in interface org.apache.tika.metadata.XMPRights
-
A word or short phrase that identifies a resource as a member of a user-defined collection.
- useAutoDetectParser() - Static method in class org.apache.tika.example.TIAParsingExample
- useCompositeParser() - Static method in class org.apache.tika.example.TIAParsingExample
- useDirectJPEG - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- useHtmlParser() - Static method in class org.apache.tika.example.TIAParsingExample
- USER_DEFINED_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.Office
-
For user defined metadata entries in the document, what prefix should be attached to the key names.
- USER_DEFINED_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
- UserAgent - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
User Agent
- UserAgent - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
User Agent
- UserAgentClientandPlatform - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
User Agent Client and Platform
- UserAgentGUID - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
User Agent GUID
- UserAgentversion - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
User Agent version
- userName() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the userName record component.
- userName() - Method in record class org.apache.tika.pipes.emitter.opensearch.HttpClientConfig
-
Returns the value of the userName record component.
- userName() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
-
Returns the value of the userName record component.
- userName() - Method in record class org.apache.tika.pipes.reporter.opensearch.HttpClientConfig
-
Returns the value of the userName record component.
- usesCompactFormat(Class<?>) - Static method in class org.apache.tika.serialization.ComponentNameResolver
-
Checks if a type should use compact format serialization.
- ushort(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned short
- ushort(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned short by masking it with 0xFFFF.
- ushort(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.Unsigned
-
Create an unsigned short
- UShort - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned
-
The unsigned short type
- UTC - Static variable in class org.apache.tika.utils.DateUtils
-
The UTC time zone.
- Utf16ColumnFeatureExtractor - Class in org.apache.tika.ml.chardetect
-
Feature extractor for the UTF-16 specialist of the mixture-of-experts charset detector.
- Utf16ColumnFeatureExtractor() - Constructor for class org.apache.tika.ml.chardetect.Utf16ColumnFeatureExtractor
- Utf16SpecialistEncodingDetector - Class in org.apache.tika.ml.chardetect
-
UTF-16 specialist detector of the mixture-of-experts charset detection architecture.
- Utf16SpecialistEncodingDetector() - Constructor for class org.apache.tika.ml.chardetect.Utf16SpecialistEncodingDetector
-
Load the model from the default classpath location.
- UuidUtils - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
- UuidUtils() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
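Several entries in this section describe creating an unsigned value "by masking it with" 0xFF, 0xFFFF or 0xFFFFFFFF. The following is a minimal sketch of what that masking does in plain Java. The class UnsignedMaskSketch and its helpers are hypothetical names chosen for illustration and are not the Tika Unsigned or EndianUtils implementations. The JDK methods used for comparison (Byte.toUnsignedInt, Short.toUnsignedInt, Integer.toUnsignedLong) are standard since Java 8.

```java
// Hypothetical illustration only; not part of Apache Tika.
final class UnsignedMaskSketch {
    private UnsignedMaskSketch() {
    }

    /** Widens a byte to int without sign extension: (byte) -1 becomes 255. */
    static int ubyteToInt(byte b) {
        return b & 0xFF;
    }

    /** Widens a short to int without sign extension: (short) -1 becomes 65535. */
    static int ushortToInt(short s) {
        return s & 0xFFFF;
    }

    /** Widens an int to long without sign extension; the L suffix on the mask is required. */
    static long uintToLong(int i) {
        return i & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        // Each helper should agree with the corresponding JDK method.
        System.out.println(ubyteToInt((byte) -1) == Byte.toUnsignedInt((byte) -1));
        System.out.println(ushortToInt((short) -1) == Short.toUnsignedInt((short) -1));
        System.out.println(uintToLong(-1) == Integer.toUnsignedLong(-1));
    }
}
```

Without the mask, widening a negative byte, short or int copies the sign bit into the new high-order bits, so the value stays negative. The mask clears those bits.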
V
- V1_DEFAULT_FLAGS - Static variable in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Default flags for v1 models (word unigrams only).
- validate() - Method in record class org.apache.tika.pipes.emitter.azblob.AZBlobEmitterConfig
- validate() - Method in record class org.apache.tika.pipes.emitter.gcs.GCSEmitterConfig
- validate() - Method in record class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig
- validate() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
- validate() - Method in record class org.apache.tika.pipes.emitter.s3.S3EmitterConfig
- validate() - Method in record class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig
- validateAndCollectConfigs(PluginManager, JsonNode) - Method in class org.apache.tika.pipes.core.AbstractComponentManager
-
Validates the configuration and collects component configs without instantiating.
- validateFetchEmitTuple(FetchEmitTuple) - Static method in class org.apache.tika.pipes.core.server.ServerProtocolIO
-
Validates that a FetchEmitTuple's configuration is consistent.
- value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
- value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
- value - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
- valueOf(byte) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte by masking it with 0xFF i.e.
- valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
Create an unsigned int by masking it with 0xFFFFFFFF i.e.
- valueOf(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
Create an unsigned short
- valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
Create an unsigned int
- valueOf(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
Create an unsigned long by masking it with 0xFFFFFFFFFFFFFFFF i.e.
- valueOf(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
Create an unsigned short by masking it with 0xFFFF i.e.
- valueOf(String) - Static method in enum class org.apache.tika.detect.EncodingResult.ResultType
-
Returns the enum constant of this class with the specified name.
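The unsigned valueOf overloads listed above are summarized as masking a signed input (0xFF, 0xFFFF, 0xFFFFFFFF). The following is a minimal sketch of that masking idea in plain Java. The class and method names (UnsignedMaskSketch, toUnsignedByte, toUnsignedShort, toUnsignedInt) are hypothetical and are not part of Tika; the sketch does not claim to reproduce how UByte, UShort, or UInteger are implemented.

```java
/** Hypothetical helper, not a Tika class: illustrates masking a signed value to its unsigned range. */
final class UnsignedMaskSketch {

    private UnsignedMaskSketch() {
    }

    /** Widen a signed byte to its unsigned value in the range 0..255. */
    static int toUnsignedByte(byte b) {
        return b & 0xFF;
    }

    /** Widen a signed short to its unsigned value in the range 0..65535. */
    static int toUnsignedShort(short s) {
        return s & 0xFFFF;
    }

    /** Widen a signed int to its unsigned value in the range 0..4294967295. */
    static long toUnsignedInt(int i) {
        // The mask must be a long literal, otherwise the result stays a signed int.
        return i & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        System.out.println(toUnsignedByte((byte) -1));   // 255
        System.out.println(toUnsignedShort((short) -1)); // 65535
        System.out.println(toUnsignedInt(-1));           // 4294967295
        // The JDK provides the same byte conversion as a library call.
        System.out.println(Byte.toUnsignedInt((byte) -1) == toUnsignedByte((byte) -1)); // true
    }
}
```

The mask keeps only the low-order bits of the widened value, so a negative input maps to the upper half of the unsigned range instead of being sign-extended.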
- valueOf(String) - Static method in enum class org.apache.tika.digest.DigestDef.Algorithm
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.digest.DigestDef.Encoding
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.app.db.Cols
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.app.db.JDBCUtil.CREATE_TABLE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.app.io.ExtractReader.ALTER_METADATA_LIST
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.app.ProfilerBase.EXCEPTION_TYPE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.app.ProfilerBase.PARSE_ERROR_TYPE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.eval.core.tokens.TikaEvalTokenizer.Mode
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.exception.EmbeddedLimitReachedException.LimitType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.extractor.EmbeddedDocumentUtil.EmbeddedResourcePrefix
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.language.detect.LanguageConfidence
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.metadata.Property.PropertyType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.metadata.Property.ValueType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.EntryType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.IntelState
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.LzxState
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PropertySetType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.Error
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
-
Get an instance of an unsigned byte
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
-
Create an unsigned int
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
Create an unsigned long
- valueOf(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
-
Create an unsigned short
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.ooxml.EditType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.OutlookExtractor.BODY_TYPES_PROCESSED
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.multiple.AbstractMultipleParser.MetadataPolicy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.ImageFormat
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.ImageType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.RenderingStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.Strategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.pdf.PDFParserConfig.AccessCheckMode
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.pdf.PDFParserConfig.IMAGE_STRATEGY
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.parser.strings.StringsEncoding
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.api.FetchEmitTuple.ON_PARSE_EXCEPTION
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.api.ParseMode
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.api.pipesiterator.TotalCountResult.STATUS
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.core.EmitStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.KEY_BASE_STRATEGY
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_FORMAT
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_MODE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.SUFFIX_STRATEGY
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.AttachmentStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.UpdateStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.AttachmentStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.MultivaluedFieldStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.AttachmentStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.UpdateStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.AttachmentStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.UpdateStrategy
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.renderer.RenderResult.STATUS
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class org.apache.tika.server.core.ServerStatus.TASK
-
Returns the enum constant of this class with the specified name.
- valueOf(BigInteger) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
-
Create an unsigned long
- values() - Static method in enum class org.apache.tika.detect.EncodingResult.ResultType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.digest.DigestDef.Algorithm
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.digest.DigestDef.Encoding
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.app.db.Cols
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.app.db.JDBCUtil.CREATE_TABLE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.app.io.ExtractReader.ALTER_METADATA_LIST
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.app.ProfilerBase.EXCEPTION_TYPE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.app.ProfilerBase.PARSE_ERROR_TYPE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.eval.core.tokens.TikaEvalTokenizer.Mode
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.exception.EmbeddedLimitReachedException.LimitType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.extractor.EmbeddedDocumentUtil.EmbeddedResourcePrefix
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.language.detect.LanguageConfidence
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.metadata.Property.PropertyType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.metadata.Property.ValueType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.ml.chardetect.StructuralEncodingRules.Utf8Result
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.EntryType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.IntelState
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.chm.ChmCommons.LzxState
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PredefinedPropertySet
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.msg.TikaNameIdChunks.PropertySetType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.Error
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.RequestTypes
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.ooxml.EditType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.OutlookExtractor.BODY_TYPES_PROCESSED
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.multiple.AbstractMultipleParser.MetadataPolicy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.ImageFormat
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.ImageType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.RenderingStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.pdf.OcrConfig.Strategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.pdf.PDFParserConfig.AccessCheckMode
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.pdf.PDFParserConfig.IMAGE_STRATEGY
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.parser.strings.StringsEncoding
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.api.FetchEmitTuple.ON_PARSE_EXCEPTION
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.api.ParseMode
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.api.pipesiterator.TotalCountResult.STATUS
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.api.PipesResult.CATEGORY
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.api.PipesResult.RESULT_STATUS
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.core.EmitStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.KEY_BASE_STRATEGY
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_FORMAT
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_MODE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.SUFFIX_STRATEGY
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.AttachmentStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.es.ESEmitterConfig.UpdateStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.AttachmentStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.jdbc.JDBCEmitterConfig.MultivaluedFieldStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.AttachmentStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitterConfig.UpdateStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.AttachmentStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.pipes.emitter.solr.SolrEmitterConfig.UpdateStrategy
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.renderer.RenderResult.STATUS
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class org.apache.tika.server.core.ServerStatus.TASK
-
Returns an array containing the constants of this enum class, in the order they are declared.
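Every enum class listed above exposes the two compiler-generated static methods valueOf(String) and values(). The sketch below shows how they behave, using a hypothetical enum (Mode) and a hypothetical helper (parseOrDefault) that exist only for illustration and are not Tika types.

```java
import java.util.Arrays;

/** Illustration only: Mode and parseOrDefault are hypothetical, not part of Tika. */
final class EnumLookupSketch {

    enum Mode { FAST, THOROUGH }

    private EnumLookupSketch() {
    }

    /** Resolve a name to a constant, returning the fallback when no constant has that name. */
    static Mode parseOrDefault(String name, Mode fallback) {
        try {
            // valueOf requires an exact, case-sensitive match on the constant's identifier.
            return Mode.valueOf(name);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // values() returns a fresh array in declaration order.
        System.out.println(Arrays.toString(Mode.values()));         // [FAST, THOROUGH]
        System.out.println(parseOrDefault("FAST", Mode.THOROUGH));  // FAST
        System.out.println(parseOrDefault("fast", Mode.THOROUGH));  // THOROUGH
    }
}
```

Because valueOf throws IllegalArgumentException for an unknown name, code that accepts names from configuration or user input usually wraps the call as shown, or normalizes the case first.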
- valueSerializer() - Method in record class org.apache.tika.pipes.emitter.kafka.KafkaEmitterConfig
-
Returns the value of the valueSerializer record component.
- VECTOR_GRAPHICS_ONLY - Enum constant in enum class org.apache.tika.parser.pdf.OcrConfig.RenderingStrategy
- VectorGraphicsOnlyPDFRenderer - Class in org.apache.tika.renderer.pdf.pdfbox
-
This class extends the PDFRenderer to render only the vector graphics elements
- VectorGraphicsOnlyPDFRenderer(PDDocument) - Constructor for class org.apache.tika.renderer.pdf.pdfbox.VectorGraphicsOnlyPDFRenderer
- VectorSerializer - Class in org.apache.tika.inference
-
Serializes and deserializes float vectors as base64-encoded big-endian float32 byte arrays.
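The VectorSerializer entry above describes its format as base64-encoded big-endian float32 bytes. The following is a minimal sketch of that encoding using only the JDK. The class and method names (FloatVectorBase64Sketch, encode, decode) are assumptions made for illustration; this is not the VectorSerializer API and may differ from it in details such as the base64 variant used.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

/** Hypothetical sketch, not Tika's VectorSerializer: float[] to and from base64 big-endian float32. */
final class FloatVectorBase64Sketch {

    private FloatVectorBase64Sketch() {
    }

    /** Write each float as 4 big-endian bytes, then base64-encode the whole buffer. */
    static String encode(float[] vector) {
        ByteBuffer buf = ByteBuffer.allocate(vector.length * Float.BYTES).order(ByteOrder.BIG_ENDIAN);
        for (float v : vector) {
            buf.putFloat(v);
        }
        return Base64.getEncoder().encodeToString(buf.array());
    }

    /** Base64-decode, then read consecutive 4-byte big-endian floats. */
    static float[] decode(String encoded) {
        byte[] bytes = Base64.getDecoder().decode(encoded);
        if (bytes.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("Byte length is not a multiple of 4: " + bytes.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes).order(ByteOrder.BIG_ENDIAN);
        float[] vector = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = buf.getFloat();
        }
        return vector;
    }

    public static void main(String[] args) {
        float[] original = {1.0f, -2.5f, 0.125f};
        String encoded = encode(original);
        float[] roundTripped = decode(encoded);
        System.out.println(encoded);
        System.out.println(java.util.Arrays.equals(original, roundTripped)); // true
    }
}
```

Fixing the byte order explicitly keeps the encoded string portable between platforms, and float32 values survive the round trip bit for bit.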
- VERBATIM - Static variable in class org.apache.tika.parser.microsoft.chm.ChmCommons
- verifySsl() - Method in record class org.apache.tika.pipes.emitter.es.HttpClientConfig
-
Returns the value of the verifySsl record component.
- VERSION - Enum constant in enum class org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
- VERSION - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- VERSION - Static variable in interface org.apache.tika.metadata.Epub
- VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The version number.
- VERSION - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Version.
- VERSION - Static variable in class org.apache.tika.ml.chardetect.tools.TrainNaiveBayesBigram
- VERSION - Static variable in class org.apache.tika.ml.LinearModel
-
Latest version we emit.
- VERSION_COUNT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
General metadata key for the count of non-final versions available within a file.
- VERSION_MADE_BY - Static variable in interface org.apache.tika.metadata.Zip
-
Version of ZIP specification used to create the entry.
- VERSION_NUMBER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
General metadata key for the version number of a given file that contains earlier versions within it.
- VERSION_V1 - Static variable in class org.apache.tika.ml.LinearModel
- VERSION_V2 - Static variable in class org.apache.tika.ml.LinearModel
- VersionHistoryGraphSpaceContextNodes - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- VersionTokenKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Version Token Knowledge
- VERY_HIDDEN_SHEET_NAMES - Static variable in interface org.apache.tika.metadata.Office
- video(String) - Static method in class org.apache.tika.mime.MediaType
- VIDEO_ALPHA_MODE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The alpha mode."
- VIDEO_ALPHA_UNITY_IS_TRANSPARENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"When true, unity is clear, when false, it is opaque."
- VIDEO_COLOR_SPACE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The color space."
- VIDEO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
-
"Video compression used.
- VIDEO_FIELD_ORDER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The field order for video."
- VIDEO_FRAME_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The video frame rate."
- VIDEO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the video was last modified."
- VIDEO_PIXEL_ASPECT_RATIO - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The aspect ratio, expressed as wd/ht.
- VIDEO_PIXEL_DEPTH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The size in bits of each color component of a pixel.
- VISIO - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- visitFile(Path, BasicFileAttributes) - Method in class org.apache.tika.parser.microsoft.libpst.EmailVisitor
- visitFileFailed(Path, IOException) - Method in class org.apache.tika.parser.microsoft.libpst.EmailVisitor
- VLM_COMPLETION_TOKENS - Static variable in class org.apache.tika.parser.vlm.AbstractVLMParser
- VLM_META - Static variable in class org.apache.tika.parser.vlm.AbstractVLMParser
-
Metadata namespace for VLM properties.
- VLM_MODEL - Static variable in class org.apache.tika.parser.vlm.AbstractVLMParser
- VLM_PROMPT_TOKENS - Static variable in class org.apache.tika.parser.vlm.AbstractVLMParser
- VLMOCRConfig - Class in org.apache.tika.parser.vlm
-
Configuration for VLMOCRParser.
- VLMOCRConfig() - Constructor for class org.apache.tika.parser.vlm.VLMOCRConfig
- VLMOCRConfig.RuntimeConfig - Class in org.apache.tika.parser.vlm
-
Runtime-only config that prevents modification of security-sensitive and cost-sensitive fields at parse time.
- VorbisParser - Class in org.apache.tika.parser.ogg
-
Parser for OGG Vorbis audio files.
- VorbisParser() - Constructor for class org.apache.tika.parser.ogg.VorbisParser
- VSD - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Visio
- VSDXExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
SAX-based extractor for Visio OOXML (.vsdx) files.
- VSDXExtractorDecorator(ParseContext, OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.VSDXExtractorDecorator
W
- W_NS - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
- WACZParser - Class in org.apache.tika.parser.wacz
- WACZParser() - Constructor for class org.apache.tika.parser.wacz.WACZParser
- walkTree(OneNoteTreeWalkerOptions, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
- WARC - Interface in org.apache.tika.metadata
- WARC_GZ - Static variable in class org.apache.tika.detect.gzip.GZipSpecializationDetector
- WARC_HTTP_PREFIX - Static variable in class org.apache.tika.parser.warc.WARCParser
- WARC_HTTP_STATUS - Static variable in class org.apache.tika.parser.warc.WARCParser
- WARC_HTTP_STATUS_REASON - Static variable in class org.apache.tika.parser.warc.WARCParser
- WARC_PAYLOAD_CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.WARC
- WARC_PREFIX - Static variable in class org.apache.tika.parser.warc.WARCParser
- WARC_RECORD_CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.WARC
- WARC_RECORD_ID - Static variable in interface org.apache.tika.metadata.WARC
- WARC_WARNING - Static variable in interface org.apache.tika.metadata.WARC
- WARCParser - Class in org.apache.tika.parser.warc
-
This uses jwarc to parse warc files and arc files
- WARCParser() - Constructor for class org.apache.tika.parser.warc.WARCParser
- warn() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
- warning(SAXParseException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
- WARNING - Static variable in class org.apache.tika.detect.siegfried.SiegfriedDetector
- WaterlineKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Waterline Knowledge
- WaterlineKnowledge - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Waterline Knowledge
- WaterlineKnowledgeEntry - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Waterline Knowledge Entry
- WEB_STATEMENT - Static variable in interface org.apache.tika.metadata.XMPRights
-
A Web URL for a statement of the ownership and usage rights for this resource.
- WEBARCHIVE - Static variable in class org.apache.tika.detect.apple.BPListDetector
- WebPictureContainer14 - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
- WebPParser - Class in org.apache.tika.parser.image
- WebPParser() - Constructor for class org.apache.tika.parser.image.WebPParser
- WILL_PARSE - Static variable in class org.apache.tika.parser.ParsingIntent
-
Singleton instance indicating that parsing will follow detection.
- Win32Error - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Win32 Error
- WINDOWS_1252 - Static variable in class org.apache.tika.parser.microsoft.rtf.jflex.RTFCharsetMaps
- winner() - Method in class org.apache.tika.quality.TextQualityComparison
-
Returns "A" if candidate A is cleaner, "B" otherwise.
- withFallbacks(Collection<? extends Parser>, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Deprecated. This has been replaced by FallbackParser
- withFeatureFlags(int) - Method in class org.apache.tika.langdetect.charsoup.CharSoupModel
-
Returns a new model with the same weights but a different feature-flags bitmask.
- withMimeFilters(Parser, Set<MediaType>, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Decorates the given parser with mime type filtering.
- withoutTypes(Parser, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Decorates the given parser so that it never claims to support parsing of the given media types, but will work for all others.
- withTypes(Parser, Set<MediaType>) - Static method in class org.apache.tika.parser.ParserDecorator
-
Decorates the given parser so that it always claims to support parsing of the given media types.
- WMFParser - Class in org.apache.tika.parser.microsoft
-
This parser offers a very rough capability to extract text if there is text stored in the WMF files.
- WMFParser() - Constructor for class org.apache.tika.parser.microsoft.WMFParser
- WORD_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Words in the document
- WORD_PROCESSING_NAMESPACE_URI - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- WORD_PROCESSING_PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
- Word2006MLParser - Class in org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006
- Word2006MLParser() - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
- WORDDOCUMENT - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- WordExtractor - Class in org.apache.tika.parser.microsoft
- WordExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.WordExtractor
- WordExtractor.TagAndStyle - Class in org.apache.tika.parser.microsoft
- WordMLParser - Class in org.apache.tika.parser.microsoft.xml
-
Parses wordml 2003 format word files.
- WordMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.WordMLParser
- WordPerfect - Interface in org.apache.tika.metadata
-
WordPerfect properties collection.
- WORDPERFECT_METADATA_NAME_PREFIX - Static variable in interface org.apache.tika.metadata.WordPerfect
- WordPerfectParser - Class in org.apache.tika.parser.wordperfect
-
Parser for Corel WordPerfect documents.
- WordPerfectParser() - Constructor for class org.apache.tika.parser.wordperfect.WordPerfectParser
- WordTokenizer - Class in org.apache.tika.langdetect.charsoup
-
General-purpose word tokenizer that shares the same preprocessing pipeline as CharSoupFeatureExtractor: NFC normalization, URL/email stripping, case folding via Character.toLowerCase(int).
- WORK_TYPE - Static variable in interface org.apache.tika.metadata.CreativeCommons
- WORKBOOK - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- WORKBOOK_CODENAME - Static variable in interface org.apache.tika.metadata.Office
- working(long) - Static method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Creates a WORKING heartbeat with the last-progress timestamp in the payload.
- WORKING - Enum constant in enum class org.apache.tika.pipes.core.protocol.PipesMessageType
- WORKS - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- WPS - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Works
- wrapInComposite(List<Detector>, LoaderContext) - Method in class org.apache.tika.config.loader.DetectorLoader
- wrapInComposite(List<EncodingDetector>, LoaderContext) - Method in class org.apache.tika.config.loader.EncodingDetectorLoader
- wrapInComposite(List<Parser>, LoaderContext) - Method in class org.apache.tika.config.loader.ParserLoader
- wrapInComposite(List<T>, LoaderContext) - Method in class org.apache.tika.config.loader.AbstractSpiComponentLoader
-
Wrap a list of components in a composite.
- wrapList(List<?>) - Method in class org.apache.tika.serialization.ComponentConfig
- wrapWith(Function<List<?>, T>) - Method in class org.apache.tika.serialization.ComponentConfig.Builder
-
Configure how to wrap a list into the component type.
- write(char) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the given character as-is.
- write(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageWriter
- write(char[], int, int) - Method in interface org.apache.tika.sax.SafeContentHandler.Output
- write(int, String) - Method in class org.apache.tika.eval.app.db.DBBuffer
- write(int, String) - Method in class org.apache.tika.eval.app.db.MimeBuffer
- write(DataOutputStream) - Method in record class org.apache.tika.pipes.core.protocol.PipesMessage
-
Writes this message to the stream and flushes.
- write(String) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the given string of characters as-is.
- write(Connection, Path) - Static method in class org.apache.tika.eval.app.reports.MarkdownSummaryWriter
- WRITE_LIMIT_REACHED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
- WriteAccessResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
-
Write Access Response
- WriteAccessResponse - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
-
Write Access Response
- writeBulkRequest(String, String, StringWriter) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- writeCharacters(TextPosition) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- writeConfig(Path, Map<String, Object>, Path) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Loads a template, applies replacements, and writes to an output file.
- writeConfigFromResource(String, Class<?>, Map<String, Object>, Path) - Static method in class org.apache.tika.config.JsonConfigHelper
-
Loads a template from resources, applies replacements, and writes to an output file.
- writeContentData(String, Map<Class, Object>, TableInfo) - Method in class org.apache.tika.eval.app.ProfilerBase
-
Checks to see if metadata is null or content is empty (null or only whitespace).
- writeCrash(PipesMessageType, Throwable) - Method in class org.apache.tika.pipes.core.server.ServerProtocolIO
-
Writes a crash message (OOM, TIMEOUT, or UNSPECIFIED_CRASH) with the serialized stack trace and waits for ACK.
- writeDoc(Metadata, StringWriter) - Method in class org.apache.tika.pipes.reporter.opensearch.OpenSearchClient
- writeExceptionData(String, Metadata, TableInfo) - Method in class org.apache.tika.eval.app.ProfilerBase
- writeExtractException(TableInfo, String, String, ExtractReaderException.TYPE) - Method in class org.apache.tika.eval.app.ProfilerBase
- writeFile(byte[][], String) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
-
Writes byte[][] to the file
- writeFinished(PipesResult) - Method in class org.apache.tika.pipes.core.server.ServerProtocolIO
-
Writes a FINISHED message with the serialized result and waits for ACK.
- writeIntermediate(Metadata) - Method in class org.apache.tika.pipes.core.server.ServerProtocolIO
-
Writes an INTERMEDIATE_RESULT message with the serialized metadata and waits for ACK.
- writeLimit() - Method in record class org.apache.tika.server.core.resource.ServerHandlerConfig
-
Returns the value of the writeLimit record component.
- WriteLimiter - Interface in org.apache.tika.sax
- WriteLimitReachedException - Exception in org.apache.tika.exception
- WriteLimitReachedException(int) - Constructor for exception org.apache.tika.exception.WriteLimitReachedException
- writeLineSeparator() - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- WriteOutContentHandler - Class in org.apache.tika.sax
-
SAX event handler that writes content up to an optional write limit out to a character stream or other decorated handler.
- WriteOutContentHandler() - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to an internal string buffer.
- WriteOutContentHandler(int) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to an internal string buffer.
- WriteOutContentHandler(Writer) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes character events to the given writer.
- WriteOutContentHandler(Writer, int) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes content up to the given write limit to the given character stream.
- WriteOutContentHandler(ContentHandler, int) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
Creates a content handler that writes content up to the given write limit to the given content handler.
- WriteOutContentHandler(ContentHandler, int, boolean, ParseContext) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
-
The default is to throw a WriteLimitReachedException
- writeParagraphEnd() - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- writeParagraphStart() - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- writeProfileData(EvalFilePaths, int, ContentTags, Metadata, String, String, List<Integer>, TableInfo) - Method in class org.apache.tika.eval.app.ProfilerBase
- writer - Variable in class org.apache.tika.eval.app.ProfilerBase
- writeReplacement(SafeContentHandler.Output) - Method in class org.apache.tika.sax.SafeContentHandler
-
Outputs the replacement for an invalid character.
- writeReport(Connection, Path) - Method in class org.apache.tika.eval.app.reports.Report
- writeRow(TableInfo, Map<Cols, String>) - Method in class org.apache.tika.eval.app.io.DBWriter
- writeRow(TableInfo, Map<Cols, String>) - Method in interface org.apache.tika.eval.app.io.IDBWriter
- writeString(String) - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- writeTo(CodedOutputStream) - Method in class org.apache.tika.DeleteFetcherReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.DeleteFetcherRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.DeletePipesIteratorReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.DeletePipesIteratorRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.FetchAndParseReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.FetchAndParseRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.GetFetcherConfigJsonSchemaRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.GetFetcherReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.GetFetcherRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.GetPipesIteratorReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.GetPipesIteratorRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.ListFetchersReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.ListFetchersRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.SaveFetcherReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.SaveFetcherRequest
- writeTo(CodedOutputStream) - Method in class org.apache.tika.SavePipesIteratorReply
- writeTo(CodedOutputStream) - Method in class org.apache.tika.SavePipesIteratorRequest
- writeTo(OutputStream) - Method in class org.apache.tika.pipes.core.extractor.frictionless.DataPackage
-
Writes this DataPackage as JSON to the given output stream.
- writeTo(String, Writer) - Method in class org.apache.tika.langdetect.LanguageDetectorTest
- writeTo(String, Writer, int) - Method in class org.apache.tika.langdetect.LanguageDetectorTest
- writeTo(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.TarWriter
- writeTo(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.ZipWriter
- writeTo(Map<String, Object>, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.JSONObjWriter
- writeTo(Metadata, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.CSVMessageBodyWriter
- writeTo(Metadata, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.JSONMessageBodyWriter
- writeTo(Metadata, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.TextMessageBodyWriter
- writeTo(Metadata, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.standard.writer.XMPMessageBodyWriter
- writeTo(MetadataList, Class<?>, Type, Annotation[], MediaType, MultivaluedMap<String, Object>, OutputStream) - Method in class org.apache.tika.server.core.writer.MetadataListMessageBodyWriter
- writeToBuffer(PDImage, String, boolean, OutputStream) - Method in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- writeWordSeparator() - Method in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- WzHyperlinkUrl - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
X
- XCAS - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESSerializer
- xhtml - Variable in class org.apache.tika.parser.pdf.image.ImageGraphicsEngine
- XHTML - Static variable in class org.apache.tika.sax.XHTMLContentHandler
-
The XHTML namespace URI
- XHTMLContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that simplifies the task of producing XHTML events for Tika content parsers.
- XHTMLContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.XHTMLContentHandler
- XHTMLContentHandler(ContentHandler, Metadata, ParseContext) - Constructor for class org.apache.tika.sax.XHTMLContentHandler
-
Creates a new XHTMLContentHandler with configuration from ParseContext.
- XLIFF12ContentHandler - Class in org.apache.tika.parser.xliff
-
Content Handler for XLIFF 1.2 documents.
- XLIFF12Parser - Class in org.apache.tika.parser.xliff
-
Parser for XLIFF 1.2 files.
- XLIFF12Parser() - Constructor for class org.apache.tika.parser.xliff.XLIFF12Parser
- XLR - Enum constant in enum class org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
- XLR - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Works Spreadsheet 7.0
- XLS - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
-
Microsoft Excel
- XLSXHREFFormatter - Class in org.apache.tika.eval.app.reports
- XLSXHREFFormatter(String, HyperlinkType) - Constructor for class org.apache.tika.eval.app.reports.XLSXHREFFormatter
- XLZParser - Class in org.apache.tika.parser.xliff
-
Parser for XLZ Archives.
- XLZParser() - Constructor for class org.apache.tika.parser.xliff.XLZParser
- XMI - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESSerializer
- XML - Enum constant in enum class org.apache.tika.parser.ctakes.CTAKESSerializer
- XML - Enum constant in enum class org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
- XML - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the xml type, application/xml.
- XMLParser - Class in org.apache.tika.parser.xml
-
XML parser.
- XMLParser() - Constructor for class org.apache.tika.parser.xml.XMLParser
- XMLProfiler - Class in org.apache.tika.parser.xml
- XMLProfiler() - Constructor for class org.apache.tika.parser.xml.XMLProfiler
- XMLReaderUtils - Class in org.apache.tika.utils
-
Utility functions for reading XML.
- XMLReaderUtils() - Constructor for class org.apache.tika.utils.XMLReaderUtils
- XmlReaderUtilsConfig() - Constructor for class org.apache.tika.config.GlobalSettings.XmlReaderUtilsConfig
- XmlRootExtractor - Class in org.apache.tika.detect
-
Utility class that uses a SAXParser to determine the namespace URI and local name of the root element of an XML file.
- XmlRootExtractor() - Constructor for class org.apache.tika.detect.XmlRootExtractor
- XmlToJsonConfigConverter - Class in org.apache.tika.cli
-
Converts legacy XML Tika configuration files to the new JSON format.
- XMP - Interface in org.apache.tika.metadata
-
Metadata keys for the XMP Basic Schema
- XMP - Static variable in class org.apache.tika.sax.XMPContentHandler
-
The XMP namespace URI
- XMP_DOCUMENT_CATALOG_LOCATION - Static variable in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- XMP_LOCATION - Static variable in interface org.apache.tika.metadata.PDF
-
If xmp is extracted by, e.g. the XMLProfiler, where did it come from?
- XMP_PAGE_LOCATION_PREFIX - Static variable in class org.apache.tika.parser.pdf.PDFMarkedContent2XHTML
- XMPContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that simplifies the task of producing XMP output.
- XMPContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.XMPContentHandler
- XMPDC - Interface in org.apache.tika.metadata
-
Metadata keys for the XMP DublinCore schema.
- XMPDM - Interface in org.apache.tika.metadata
-
XMP Dynamic Media schema.
- XMPDM.ChannelTypePropertyConverter - Class in org.apache.tika.metadata
-
Deprecated. Experimental method, will change shortly
- XMPIdq - Interface in org.apache.tika.metadata
- XMPMessageBodyWriter - Class in org.apache.tika.server.standard.writer
- XMPMessageBodyWriter() - Constructor for class org.apache.tika.server.standard.writer.XMPMessageBodyWriter
- XMPMetadata - Class in org.apache.tika.xmp
-
Provides a conversion of the Metadata map from Tika to the XMP data model by also providing the Metadata API for clients to ease transition.
- XMPMetadata() - Constructor for class org.apache.tika.xmp.XMPMetadata
-
Initializes with an empty XMP packet
- XMPMetadata(Metadata) - Constructor for class org.apache.tika.xmp.XMPMetadata
- XMPMetadata(Metadata, String) - Constructor for class org.apache.tika.xmp.XMPMetadata
-
Initializes the data by converting the Metadata information to XMP.
- XMPMetadataExtractor - Class in org.apache.tika.parser.xmp
-
XMP Metadata Extractor based on Apache XmpBox.
- XMPMetadataExtractor() - Constructor for class org.apache.tika.parser.xmp.XMPMetadataExtractor
- XMPMetadataResource - Class in org.apache.tika.server.standard.resource
- XMPMetadataResource() - Constructor for class org.apache.tika.server.standard.resource.XMPMetadataResource
- XMPMM - Interface in org.apache.tika.metadata
- XMPPacketScanner - Class in org.apache.tika.parser.xmp
-
This class is a parser for XMP packets.
- XMPPacketScanner() - Constructor for class org.apache.tika.parser.xmp.XMPPacketScanner
- XMPPDF - Interface in org.apache.tika.metadata
-
Metadata keys for the XMP PDF Schema
- XMPRights - Interface in org.apache.tika.metadata
-
XMP Rights management schema.
- XMPSchemaIllustrator - Class in org.apache.tika.parser.pdf.xmpschemas
- XMPSchemaIllustrator(XMPMetadata) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaIllustrator
- XMPSchemaIllustrator(Element, String) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaIllustrator
- XMPSchemaPDFUA - Class in org.apache.tika.parser.pdf.xmpschemas
- XMPSchemaPDFUA(XMPMetadata) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFUA
- XMPSchemaPDFUA(Element, String) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFUA
- XMPSchemaPDFVT - Class in org.apache.tika.parser.pdf.xmpschemas
- XMPSchemaPDFVT(XMPMetadata) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFVT
- XMPSchemaPDFVT(Element, String) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFVT
- XMPSchemaPDFX - Class in org.apache.tika.parser.pdf.xmpschemas
-
This is somewhat of a hack to handle the older pdfx: See also the more modern XMPSchemaPDFXId
- XMPSchemaPDFX(XMPMetadata) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFX
- XMPSchemaPDFX(Element, String) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFX
- XMPSchemaPDFXId - Class in org.apache.tika.parser.pdf.xmpschemas
- XMPSchemaPDFXId(XMPMetadata) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFXId
- XMPSchemaPDFXId(Element, String) - Constructor for class org.apache.tika.parser.pdf.xmpschemas.XMPSchemaPDFXId
- xor(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- xor(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- xor(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
- xorExtendedGUID(ExtendedGUID, ExtendedGUID) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
-
XOR two ExtendedGUID instances.
- XPATH - Enum constant in enum class org.apache.tika.metadata.Property.ValueType
- XPathParser - Class in org.apache.tika.sax.xpath
-
Parser for a very simple XPath subset.
- XPathParser() - Constructor for class org.apache.tika.sax.xpath.XPathParser
- XPathParser(String, String) - Constructor for class org.apache.tika.sax.xpath.XPathParser
- XPS - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
- XPSExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml.xps
- XPSExtractorDecorator(ParseContext, OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
- XSLFEventBasedPowerPointExtractor - Class in org.apache.tika.parser.microsoft.ooxml.xslf
-
Lightweight holder for an OPCPackage for PPTX files.
- XSLFEventBasedPowerPointExtractor(OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
- XSSFBExcelExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
- XSSFBExcelExtractorDecorator(ParseContext, OPCPackage, Locale) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
- XSSFExcelExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
- XSSFExcelExtractorDecorator(ParseContext, OPCPackage, Locale) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
- XSSFExcelExtractorDecorator.HeaderFooterFromString - Class in org.apache.tika.parser.microsoft.ooxml
- XSSFExcelExtractorDecorator.SheetTextAsHTML - Class in org.apache.tika.parser.microsoft.ooxml
-
Turns formatted sheet events into HTML
- XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer - Class in org.apache.tika.parser.microsoft.ooxml
-
Captures information on interesting tags, whilst delegating the main work to the formatting handler
- XSSFSheetInterestingPartsCapturer(ContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
- XUserDefinedCharset - Class in org.apache.tika.parser.html.charsetdetector.charsets
- XUserDefinedCharset() - Constructor for class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
- XUserDefinedCharset.NotImplementedException - Exception in org.apache.tika.parser.html.charsetdetector.charsets
- XWPFBodyContentsHandler - Interface in org.apache.tika.parser.microsoft.ooxml
-
Callback interface for receiving structured document events from the OOXML SAX dispatcher.
- XWPFEventBasedWordExtractor - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
-
Lightweight holder for an OPCPackage for DOCX files.
- XWPFEventBasedWordExtractor(OPCPackage) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
- XWPFFeatureExtractor - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
-
This is designed to extract features that are useful for forensics, e-discovery and digital preservation.
- XWPFFeatureExtractor() - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFFeatureExtractor
- XWPFListManager - Class in org.apache.tika.parser.microsoft.ooxml
- XWPFListManager(XWPFNumberingShim) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
- XWPFNumberingShim - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
-
SAX-based parser for numbering.xml that replaces the XMLBeans-dependent POI XWPFNumbering.
- XWPFNumberingShim(PackagePart, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFNumberingShim
- XWPFStylesShim - Class in org.apache.tika.parser.microsoft.ooxml.xwpf
-
For Tika, all we need (so far) is a mapping between styleId and a style's name.
- XWPFStylesShim(PackagePart, ParseContext) - Constructor for class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
- XZ - Static variable in class org.apache.tika.detect.zip.CompressorConstants
Y
- YandexTranslator - Class in org.apache.tika.language.translate.impl
-
An implementation of a REST client for the YANDEX Translate API.
- YandexTranslator() - Constructor for class org.apache.tika.language.translate.impl.YandexTranslator
- YY_SLASH_MM_SLASH_DD - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- yyatEOF() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Returns whether the scanner has reached the end of the reader it reads from.
- yybegin(int) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Enters a new lexical state.
- yycharat(int) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Returns the character at the given position from the matched text.
- yyclose() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Closes the input reader.
- YYEOF - Static variable in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
This character denotes the end of file.
- YYINITIAL - Static variable in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
- yylength() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
How many characters were matched.
- yylex() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Resumes scanning until the next regular expression is matched, the end of input is encountered, or an I/O error occurs.
- yypushback(int) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Pushes the specified number of characters back into the input stream.
- yyreset(Reader) - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Resets the scanner to read from a new input stream.
- yystate() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Returns the current lexical state.
- yytext() - Method in class org.apache.tika.parser.microsoft.rtf.jflex.RTFTokenizer
-
Returns the text matched by the current regular expression.
- YYYY_MM_DD - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
- YYYY_MM_DD_HH_MM - Static variable in class org.apache.tika.parser.mailcommons.MailDateParser
Z
- ZERO_BYTE_EXTRACT_FILE - Enum constant in enum class org.apache.tika.eval.app.io.ExtractReaderException.TYPE
- ZeroByteFileException - Exception in org.apache.tika.exception
-
Exception thrown by the AutoDetectParser when a file contains zero bytes.
- ZeroByteFileException(String) - Constructor for exception org.apache.tika.exception.ZeroByteFileException
- ZeroByteFileException.IgnoreZeroByteFileException - Class in org.apache.tika.exception
- ZeroSizeFileDetector - Class in org.apache.tika.detect
-
Detector to identify zero-length files as application/x-zerovalue.
- ZeroSizeFileDetector() - Constructor for class org.apache.tika.detect.ZeroSizeFileDetector
- Zip - Interface in org.apache.tika.metadata
-
ZIP file properties collection.
- ZIP - Static variable in class org.apache.tika.detect.zip.PackageConstants
- ZIP_PREFIX - Static variable in interface org.apache.tika.metadata.Zip
- ZIP_SPECIALIZATIONS - Static variable in class org.apache.tika.parser.pkg.ZipParser
-
Set of media types that are specializations of ZIP (e.g., Office documents, EPUB, APK).
- ZipAlgorithm - Enum constant in enum class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingMethod
-
File data is passed to the Zip algorithm chunking method.
- ZipContainerDetector - Interface in org.apache.tika.detect.zip
-
Classes that implement this must be able to detect on a ZipFile and in streaming mode.
- zipFile() - Method in record class org.apache.tika.server.core.resource.PipesParsingHelper.UnpackResult
-
Returns the value of the zipFile record component.
- ZipFilesChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
-
This class is used to process zip file chunking.
- ZipFilesChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ZipFilesChunking
-
Initializes a new instance of the ZipFilesChunking class.
- ZipHeader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
- ZipListFiles - Class in org.apache.tika.example
-
Example code listing from Chapter 1.
- ZipListFiles() - Constructor for class org.apache.tika.example.ZipListFiles
- ZipParser - Class in org.apache.tika.parser.pkg
-
Parser for ZIP and JAR archives using file-based access for complete metadata extraction.
- ZipParser() - Constructor for class org.apache.tika.parser.pkg.ZipParser
- ZipParser(JsonConfig) - Constructor for class org.apache.tika.parser.pkg.ZipParser
-
Constructor for JSON-based configuration.
- ZipParser(EncodingDetector) - Constructor for class org.apache.tika.parser.pkg.ZipParser
- ZipParser(ZipParserConfig) - Constructor for class org.apache.tika.parser.pkg.ZipParser
- ZipParserConfig - Class in org.apache.tika.parser.pkg
-
Configuration for ZipParser.
- ZipParserConfig() - Constructor for class org.apache.tika.parser.pkg.ZipParserConfig
- ZIPPED - Enum constant in enum class org.apache.tika.pipes.core.extractor.UnpackConfig.OUTPUT_MODE
-
Package all files into a single zip archive.
- ZipSalvager - Class in org.apache.tika.zip.utils
- ZipSalvager() - Constructor for class org.apache.tika.zip.utils.ZipSalvager
- ZipWriter - Class in org.apache.tika.server.core.writer
- ZipWriter() - Constructor for class org.apache.tika.server.core.writer.ZipWriter
- ZLIB - Static variable in class org.apache.tika.detect.zip.CompressorConstants
- ZSTD - Static variable in class org.apache.tika.detect.zip.CompressorConstants
_
- _COLOR_MODE_CHOICES_INDEXED - Static variable in interface org.apache.tika.metadata.Photoshop