
A

ABOUT - Static variable in interface org.apache.tika.metadata.XMP
Unordered text strings of advisories.
ABS_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
The absolute path to the file's peak audio file.
AbstractChunking - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
Base class for file chunking.
AbstractChunking(byte[]) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
Initializes a new instance of the AbstractChunking class.
AbstractConsumersBuilder - Class in org.apache.tika.batch.builders
 
AbstractConsumersBuilder() - Constructor for class org.apache.tika.batch.builders.AbstractConsumersBuilder
 
AbstractConverter - Class in org.apache.tika.xmp.convert
Base class for Tika Metadata to XMP converter which provides some needed common functionality.
AbstractConverter() - Constructor for class org.apache.tika.xmp.convert.AbstractConverter
 
AbstractDBParser - Class in org.apache.tika.parser.jdbc
Abstract class that handles iterating through tables within a database.
AbstractDBParser() - Constructor for class org.apache.tika.parser.jdbc.AbstractDBParser
 
AbstractEmitter - Class in org.apache.tika.pipes.emitter
 
AbstractEmitter() - Constructor for class org.apache.tika.pipes.emitter.AbstractEmitter
 
AbstractEncodingDetectorParser - Class in org.apache.tika.parser
Abstract base class for parsers that use the AutoDetectReader and need to use the EncodingDetector configured by TikaConfig
AbstractEncodingDetectorParser() - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
 
AbstractEncodingDetectorParser(EncodingDetector) - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
 
AbstractExternalProcessParser - Class in org.apache.tika.parser
Abstract base class for parsers that call external processes.
AbstractExternalProcessParser() - Constructor for class org.apache.tika.parser.AbstractExternalProcessParser
 
AbstractFetcher - Class in org.apache.tika.pipes.fetcher
 
AbstractFetcher() - Constructor for class org.apache.tika.pipes.fetcher.AbstractFetcher
 
AbstractFetcher(String) - Constructor for class org.apache.tika.pipes.fetcher.AbstractFetcher
 
AbstractFSConsumer - Class in org.apache.tika.batch.fs
 
AbstractFSConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.fs.AbstractFSConsumer
 
AbstractImageParser - Class in org.apache.tika.parser.image
 
AbstractImageParser() - Constructor for class org.apache.tika.parser.image.AbstractImageParser
 
AbstractListManager - Class in org.apache.tika.parser.microsoft
 
AbstractListManager() - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager
 
AbstractListManager.LevelTuple - Class in org.apache.tika.parser.microsoft
 
AbstractListManager.ParagraphLevelCounter - Class in org.apache.tika.parser.microsoft
 
AbstractMultipleParser - Class in org.apache.tika.parser.multiple
Abstract base class for parser wrappers which may / will process a given stream multiple times, merging the results of the various parsers used.
AbstractMultipleParser(MediaTypeRegistry, Collection<? extends Parser>, Map<String, Param>) - Constructor for class org.apache.tika.parser.multiple.AbstractMultipleParser
 
AbstractMultipleParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Parser...) - Constructor for class org.apache.tika.parser.multiple.AbstractMultipleParser
 
AbstractMultipleParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Collection<? extends Parser>) - Constructor for class org.apache.tika.parser.multiple.AbstractMultipleParser
 
AbstractMultipleParser.MetadataPolicy - Enum in org.apache.tika.parser.multiple
The various strategies for handling metadata emitted by multiple parsers.
AbstractOfficeParser - Class in org.apache.tika.parser.microsoft
Intermediate layer to set OfficeParserConfig uniformly.
AbstractOfficeParser() - Constructor for class org.apache.tika.parser.microsoft.AbstractOfficeParser
 
AbstractOOXMLExtractor - Class in org.apache.tika.parser.microsoft.ooxml
Base class for all Tika OOXML extractors.
AbstractOOXMLExtractor(ParseContext, POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
AbstractParser - Class in org.apache.tika.parser
Abstract base class for new parsers.
AbstractParser() - Constructor for class org.apache.tika.parser.AbstractParser
 
AbstractProfiler - Class in org.apache.tika.eval.app
 
AbstractProfiler(ArrayBlockingQueue<FileResource>, IDBWriter) - Constructor for class org.apache.tika.eval.app.AbstractProfiler
 
AbstractProfiler.EXCEPTION_TYPE - Enum in org.apache.tika.eval.app
 
AbstractProfiler.PARSE_ERROR_TYPE - Enum in org.apache.tika.eval.app
If information was gathered from the log file about a parse error
AbstractRecursiveParserWrapperHandler - Class in org.apache.tika.sax
This is a special handler to be used only with the RecursiveParserWrapper.
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
 
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory, int) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
 
AbstractTranslator - Class in org.apache.tika.language.translate.impl
 
AbstractTranslator() - Constructor for class org.apache.tika.language.translate.impl.AbstractTranslator
 
AbstractXML2003Parser - Class in org.apache.tika.parser.microsoft.xml
 
AbstractXML2003Parser() - Constructor for class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
 
AccessChecker - Class in org.apache.tika.parser.pdf
Checks whether or not a document allows extraction generally or extraction for accessibility only.
AccessChecker() - Constructor for class org.apache.tika.parser.pdf.AccessChecker
This constructs an AccessChecker that will not perform any checking and will always return without throwing an exception.
AccessChecker(boolean) - Constructor for class org.apache.tika.parser.pdf.AccessChecker
This constructs an AccessChecker that will check for whether or not content should be extracted from a document.
AccessPermissionException - Exception in org.apache.tika.exception
Exception to be thrown when a document does not allow content extraction.
AccessPermissionException() - Constructor for exception org.apache.tika.exception.AccessPermissionException
 
AccessPermissionException(Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
 
AccessPermissionException(String) - Constructor for exception org.apache.tika.exception.AccessPermissionException
 
AccessPermissionException(String, Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
 
AccessPermissions - Interface in org.apache.tika.metadata
Until we can find a common standard, we'll use these options.
ACKNOWLEDGEMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
 
ACRONYM_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
ACTION_TRIGGER - Static variable in interface org.apache.tika.metadata.PDF
This specifies where an action or destination would be found/triggered in the document: on document open, before close, etc.
actionPerformed(ActionEvent) - Method in class org.apache.tika.gui.TikaGUI
 
Activator - Class in org.apache.tika.parser.internal
 
Activator() - Constructor for class org.apache.tika.parser.internal.Activator
 
AdapterHelper - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
AdapterHelper() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AdapterHelper
 
add(String, long) - Method in class org.apache.tika.eval.core.tokens.LangModel
 
add(String, String) - Method in class org.apache.tika.eval.core.tokens.TokenCounter
Deprecated.
 
add(String) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
Adds a single occurrence of the given ngram to this profile.
add(String, long) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
Adds multiple occurrences of the given ngram to this profile.
add(StringBuffer) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Adds ngrams from a single word to this profile.
add(String, String) - Method in class org.apache.tika.metadata.Metadata
Add a metadata name/value mapping.
add(String, String[]) - Method in class org.apache.tika.metadata.Metadata
Add a metadata name/value mapping.
add(Property, String) - Method in class org.apache.tika.metadata.Metadata
Add a metadata property/value mapping.
add(Property, int) - Method in class org.apache.tika.metadata.Metadata
Adds the integer value of the identified metadata property.
add(Metadata) - Method in class org.apache.tika.metadata.serialization.JsonStreamingSerializer
 
add(String, String, Map<String, String[]>) - Method in interface org.apache.tika.metadata.writefilter.MetadataWriteFilter
Based on the field and value, this filter modifies the field and/or the value to something that should be added to the Metadata object.
add(String, String, Map<String, String[]>) - Method in class org.apache.tika.metadata.writefilter.StandardWriteFilter
 
add(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
add(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
add(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
add(long) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
add(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
add(int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
add(String, String) - Method in class org.apache.tika.xmp.XMPMetadata
As this API could only possibly work for simple properties in XMP, it just calls the set method, which replaces any existing value.
addAlias(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
 
addAllCharacters(String, ContentHandler) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
 
addAlternative(GeoTag) - Method in class org.apache.tika.parser.geo.GeoTag
 
addCloseableResource(Closeable) - Method in class org.apache.tika.io.TikaInputStream
 
addData(byte[], int, int) - Method in class org.apache.tika.detect.TextStatistics
 
addDocument(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchClient
 
addDrawingHyperLinks(PackagePart) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
ADDED - Static variable in class org.apache.tika.batch.FileResourceCrawler
 
addErrorLogTablePair(Path, TableInfo) - Method in class org.apache.tika.eval.app.batch.DBConsumersManager
 
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.app.batch.EvalConsumerBuilder
 
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.app.batch.ExtractComparerBuilder
 
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.app.batch.ExtractProfilerBuilder
 
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.app.batch.FileProfilerBuilder
 
addEvenIfNull(Property, String, Metadata) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
 
addingService(ServiceReference) - Method in class org.apache.tika.config.TikaActivator
 
ADDITIONAL_MODEL_INFO - Static variable in interface org.apache.tika.metadata.IPTC
Information about the ethnicity and other facets of the model(s) in a model-released image.
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
 
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
 
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.OpenDocumentConverter
 
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.RTFConverter
 
addMetadata(Mp4Directory) - Method in class org.apache.tika.parser.mp4.boxes.TikaUserDataBox
 
addMetadata(String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
addMetadata(String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
addMetadata(String) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
addMulti(Metadata, Property, String) - Static method in class org.apache.tika.parser.microsoft.SummaryExtractor
 
addOtherTesseractConfig(String, String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Add a key-value pair to pass to Tesseract using its -c command line option.
addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
addPattern(MimeType, String, boolean) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
addPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mailcommons.MailUtil
This tries to split a "from" or "to" value into a person field and an email field.
addPrefix(String, String) - Method in class org.apache.tika.sax.xpath.XPathParser
 
addProfile(String, LanguageProfile) - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Adds a single language profile
addResource(Closeable) - Method in class org.apache.tika.io.TemporaryResources
Adds a new resource to the set of tracked resources that will all be closed when the TemporaryResources.close() method is called.
addSuperType(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
 
addText(char[], int, int) - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
 
addText(char[], int, int) - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
 
addText(char[], int, int) - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
This will buffer up to OpenNLPDetector.setMaxLength(int) and then ignore the rest of the text.
addText(char[], int, int) - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
 
addText(char[], int, int) - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 
addText(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageDetector
Add statistics about this text for the current document.
addText(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
Add to the statistics being accumulated for the current document.
addType(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
 
addXRefEntry(COSWriterXRefEntry) - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
Adds an entry to the xref table for later dump.
AdobeFontMetricParser - Class in org.apache.tika.parser.font
Parser for AFM Font Files
AdobeFontMetricParser() - Constructor for class org.apache.tika.parser.font.AdobeFontMetricParser
 
advance(int) - Method in class org.apache.tika.sax.SecureContentHandler
Records the given number of output characters (or more accurately UTF-16 code units).
AdvancedTypeDetector - Class in org.apache.tika.example
 
AdvancedTypeDetector() - Constructor for class org.apache.tika.example.AdvancedTypeDetector
 
ADVISORY - Static variable in interface org.apache.tika.metadata.XMP
Unordered text strings of advisories.
AES_ENV_VAR - Static variable in class org.apache.tika.client.HttpClientFactory
 
afterRead(int) - Method in class org.apache.tika.io.TikaInputStream
 
AgeRecogniser - Class in org.apache.tika.parser.recognition
Parser for extracting features from text.
AgeRecogniser() - Constructor for class org.apache.tika.parser.recognition.AgeRecogniser
 
AgeRecogniserConfig - Class in org.apache.tika.parser.recognition
Stores URL for AgePredictor
AgeRecogniserConfig(Map<String, Param>) - Constructor for class org.apache.tika.parser.recognition.AgeRecogniserConfig
 
ALBUM - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the album."
ALBUM_ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the album artist or group for compilation albums."
ALIAS_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
ALIAS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
ALIGNED_OFFSET - Static variable in class org.apache.tika.parser.microsoft.chm.ChmCommons
 
alignedLenTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
alignedTreeTable - Variable in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
allowedPolicies - Static variable in class org.apache.tika.parser.multiple.FallbackParser
The different Metadata Policies we support (all)
allowedPolicies - Static variable in class org.apache.tika.parser.multiple.SupplementingParser
The different Metadata Policies we support (not discard)
alpha - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
 
AlphaIdeographFilterFactory - Class in org.apache.tika.eval.core.tokens
Factory for a filter that only allows through tokens with characters that are "isAlphabetic" or "isIdeographic".
AlphaIdeographFilterFactory(Map<String, String>) - Constructor for class org.apache.tika.eval.core.tokens.AlphaIdeographFilterFactory
 
ALT_TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
An alternative tape name, set via the project window or timecode dialog in Premiere.
AlternativePackaging - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
AlternativePackaging() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
ALTITUDE - Static variable in interface org.apache.tika.metadata.Geographic
The WGS84 Altitude of the Point
ALTITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
ALWAYS_ADD_FIELDS - Static variable in class org.apache.tika.metadata.writefilter.StandardWriteFilter
 
ALWAYS_SET_FIELDS - Static variable in class org.apache.tika.metadata.writefilter.StandardWriteFilter
 
amazonTranscribe(Path, Path) - Static method in class org.apache.tika.example.TranscribeTranslateExample
Use AmazonTranscribe to execute transcription on input data.
AmazonTranscribe - Class in org.apache.tika.parser.transcribe.aws
Amazon Transcribe implementation.
AmazonTranscribe() - Constructor for class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
 
analyze(StringBuilder) - Method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Analyzes a piece of text
AnalyzerManager - Class in org.apache.tika.eval.core.tokens
 
analyzeStorageIndexDataElement(List<DataElement>, ExGuid, AtomicReference<ExGuid>, AtomicReference<HashMap<CellID, ExGuid>>, AtomicReference<HashMap<ExGuid, ExGuid>>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to analyze the storage index data element to get all the mappings.
ANNOTATION_SUBTYPES - Static variable in interface org.apache.tika.metadata.PDF
 
ANNOTATION_TYPES - Static variable in interface org.apache.tika.metadata.PDF
 
AnnotationUtils - Class in org.apache.tika.utils
This class contains utilities for dealing with tika annotations
AnnotationUtils() - Constructor for class org.apache.tika.utils.AnnotationUtils
 
apiBaseUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
apiUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
APP_VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
 
appendByteArrayToListOfByte(List<Byte>, byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
 
appendGUID(UUID) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Append a specified GUID value into the buffer.
appendInit32(int, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Appends the specified Int32 value to the buffer with the specified bit length.
appendUInit32(int, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Appends the specified UInt32 value to the buffer with the specified bit length.
appendUInt64(long, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Appends the specified UInt64 value to the buffer with the specified bit length.
AppleSingleFileParser - Class in org.apache.tika.parser.apple
Parser that strips the header off of AppleSingle and AppleDouble files.
AppleSingleFileParser() - Constructor for class org.apache.tika.parser.apple.AppleSingleFileParser
 
APPLICATION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
 
application(String) - Static method in class org.apache.tika.mime.MediaType
 
APPLICATION_XML - Static variable in class org.apache.tika.mime.MediaType
 
APPLICATION_ZIP - Static variable in class org.apache.tika.mime.MediaType
 
applyStyleAndValue(int, ResultSet, Cell) - Method in class org.apache.tika.eval.app.reports.XLSXHREFFormatter
 
AppParserFactoryBuilder - Class in org.apache.tika.batch.builders
 
AppParserFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.AppParserFactoryBuilder
 
AR - Static variable in class org.apache.tika.detect.zip.PackageConstants
 
ARCHITECTURE_BITS - Static variable in interface org.apache.tika.metadata.MachineMetadata
 
ARJ - Static variable in class org.apache.tika.detect.zip.PackageConstants
 
ARRAY_CLOSE - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The array close token.
ARRAY_OPEN - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The array open token.
ArrayNumber - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
The class is used to represent the number of the array.
ArrayNumber() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
 
ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the artist or artists."
ARTWORK_OR_OBJECT - Static variable in interface org.apache.tika.metadata.IPTC
A set of metadata about artwork or an object in the item
ARTWORK_OR_OBJECT_DETAIL_COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
Contains any necessary copyright notice for claiming the intellectual property for artwork or an object in the image and should identify the current owner of the copyright of this work with associated intellectual property rights.
ARTWORK_OR_OBJECT_DETAIL_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
Contains the name of the artist who has created artwork or an object in the image.
ARTWORK_OR_OBJECT_DETAIL_DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
Designates the date and optionally the time the artwork or object in the image was created.
ARTWORK_OR_OBJECT_DETAIL_SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
The organisation or body holding and registering the artwork or object in the image for inventory purposes.
ARTWORK_OR_OBJECT_DETAIL_SOURCE_INVENTORY_NUMBER - Static variable in interface org.apache.tika.metadata.IPTC
The inventory number issued by the organisation or body holding and registering the artwork or object in the image.
ARTWORK_OR_OBJECT_DETAIL_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
A reference for the artwork or object in the image.
asBytes(UUID) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
 
asInputSource() - Method in class org.apache.tika.detect.AutoDetectReader
 
ASSEMBLE_DOCUMENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user insert/rotate/delete pages.
assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks that the byte[] is not null.
assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
 
assertChmAccessorNotNull(ChmAccessor<?>) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks that the ChmAccessor is not null; throws an exception if it is.
assertChmAccessorParameters(byte[], ChmAccessor<?>, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks validity of ChmAccessor parameters
assertChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks the validity of the chmBlockSegment parameters.
assertCopyingDataIndex(int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
 
assertDirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks the validity of the DirectoryListingEntry's parameters; throws an exception if any parameter is invalid.
assertInputStreamNotNull(InputStream) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks that the InputStream is not null.
assertPositiveInt(int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmAssert
Checks that the int parameter is greater than zero; throws an exception if it is <= 0.
assignFieldParams(Object, Map<String, Param>) - Static method in class org.apache.tika.utils.AnnotationUtils
Assigns the param values to the bean.
assignValue(Object, Object) - Method in class org.apache.tika.config.ParamField
Sets the given value on the annotated field of the bean.
asUuid(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.UuidUtils
 
AsyncConfig - Class in org.apache.tika.pipes.async
 
AsyncConfig() - Constructor for class org.apache.tika.pipes.async.AsyncConfig
 
AsyncEmitter - Class in org.apache.tika.pipes.async
Worker thread that takes EmitData off the queue, batches it, and tries to emit it as a batch.
AsyncEmitter(AsyncConfig, ArrayBlockingQueue<EmitData>, EmitterManager) - Constructor for class org.apache.tika.pipes.async.AsyncEmitter
 
AsyncProcessor - Class in org.apache.tika.pipes.async
This is the main class for handling async requests.
AsyncProcessor(Path) - Constructor for class org.apache.tika.pipes.async.AsyncProcessor
 
AsyncRequest - Class in org.apache.tika.server.core.resource
 
AsyncRequest(List<FetchEmitTuple>) - Constructor for class org.apache.tika.server.core.resource.AsyncRequest
 
AsyncResource - Class in org.apache.tika.server.core.resource
 
AsyncResource(Path, Set<String>) - Constructor for class org.apache.tika.server.core.resource.AsyncResource
 
attachExternalParsers(TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
 
attachExternalParsers(List<ExternalParser>, TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
 
AttributeDependantMetadataHandler - Class in org.apache.tika.parser.xml
This adds a Metadata entry for a given node.
AttributeDependantMetadataHandler(Metadata, String, String) - Constructor for class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
AttributeMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of a .../@* XPath expression.
AttributeMatcher() - Constructor for class org.apache.tika.sax.xpath.AttributeMatcher
 
AttributeMetadataHandler - Class in org.apache.tika.parser.xml
SAX event handler that maps the contents of an XML attribute into a metadata field.
AttributeMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
 
AttributeMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
 
audio(String) - Static method in class org.apache.tika.mime.MediaType
 
AUDIO_CHANNEL_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
"The audio channel type."
AUDIO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
The audio compression used.
AUDIO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
"The date and time when the audio was last modified."
AUDIO_SAMPLE_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
The audio sample rate.
AUDIO_SAMPLE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
"The audio sample type."
AudioFrame - Class in org.apache.tika.parser.mp3
An Audio Frame in an MP3 file.
AudioFrame(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Deprecated.
Use the constructor which is passed all values directly.
AudioFrame(int, int, int, int, InputStream) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Deprecated.
Use the constructor which is passed all values directly.
AudioFrame(int, int, int, int, int, int, float) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Creates a new instance of AudioFrame and initializes all properties.
AudioParser - Class in org.apache.tika.parser.audio
 
AudioParser() - Constructor for class org.apache.tika.parser.audio.AudioParser
 
AUTHOR - Static variable in interface org.apache.tika.metadata.Office
Name of the principal author(s) of a document
AUTHORS_POSITION - Static variable in interface org.apache.tika.metadata.Photoshop
 
AutoDetectParser - Class in org.apache.tika.parser
 
AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
Creates an auto-detecting parser instance using the default Tika configuration.
AutoDetectParser(Detector) - Constructor for class org.apache.tika.parser.AutoDetectParser
 
AutoDetectParser(Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
Creates an auto-detecting parser instance using the specified set of parsers.
AutoDetectParser(Detector, Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
 
AutoDetectParser(TikaConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
 
AutoDetectParserConfig - Class in org.apache.tika.parser
This config object can be used to tune how conservative we want to be when parsing data that is extremely compressible and resembles a ZIP bomb.
AutoDetectParserConfig(Long, Long, Long, Integer, Integer) - Constructor for class org.apache.tika.parser.AutoDetectParserConfig
Creates an AutoDetectParserConfig using the passed-in parameters.
AutoDetectParserConfig() - Constructor for class org.apache.tika.parser.AutoDetectParserConfig
 
AutoDetectParserFactory - Class in org.apache.tika.batch
Simple class for AutoDetectParser
AutoDetectParserFactory() - Constructor for class org.apache.tika.batch.AutoDetectParserFactory
 
AutoDetectParserFactory - Class in org.apache.tika.parser
Factory for an AutoDetectParser
AutoDetectParserFactory(Map<String, String>) - Constructor for class org.apache.tika.parser.AutoDetectParserFactory
 
AutoDetectReader - Class in org.apache.tika.detect
An input stream reader that automatically detects the character encoding to be used for converting bytes to characters.
AutoDetectReader(InputStream, Metadata, EncodingDetector) - Constructor for class org.apache.tika.detect.AutoDetectReader
 
AutoDetectReader(InputStream, Metadata, ServiceLoader) - Constructor for class org.apache.tika.detect.AutoDetectReader
 
AutoDetectReader(InputStream, Metadata) - Constructor for class org.apache.tika.detect.AutoDetectReader
 
AutoDetectReader(InputStream) - Constructor for class org.apache.tika.detect.AutoDetectReader
 
AutoDetectTransformer - Class in org.apache.tika.fuzzing
 
AutoDetectTransformer() - Constructor for class org.apache.tika.fuzzing.AutoDetectTransformer
 
AutoDetectTransformer(List<Transformer>) - Constructor for class org.apache.tika.fuzzing.AutoDetectTransformer
 
autoTranslate(InputStream, String, String) - Method in class org.apache.tika.server.core.resource.TranslateResource
 
available() - Method in class org.apache.tika.io.LookaheadInputStream
 
available - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
AZBlobEmitter - Class in org.apache.tika.pipes.emitter.azblob
Emit files to Azure blob storage.
AZBlobEmitter() - Constructor for class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
 
AZBlobFetcher - Class in org.apache.tika.pipes.fetcher.azblob
Fetches files from Azure blob storage.
AZBlobFetcher() - Constructor for class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcher
 
AZBlobPipesIterator - Class in org.apache.tika.pipes.pipesiterator.azblob
 
AZBlobPipesIterator() - Constructor for class org.apache.tika.pipes.pipesiterator.azblob.AZBlobPipesIterator
 
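The AutoDetectParserConfig entry above mentions tuning how conservative parsing should be for highly compressible, ZIP-bomb-like data. As a rough illustration of that kind of check, the sketch below compares extracted output against consumed input. The class ExpansionGuard, its fields, and the threshold and ratio values are all hypothetical names chosen for this example. They are not Tika API, and this index does not document what the actual AutoDetectParserConfig parameters mean.

```java
// Hypothetical sketch only: ExpansionGuard is an illustrative name, not a Tika class.
class ExpansionGuard {
    private final long outputThreshold; // outputs at or below this size are never flagged
    private final long maxRatio;        // allowed output characters per input byte
    private long inputBytes;
    private long outputChars;

    ExpansionGuard(long outputThreshold, long maxRatio) {
        this.outputThreshold = outputThreshold;
        this.maxRatio = maxRatio;
    }

    // Record how many raw bytes have been consumed from the input so far.
    void recordInput(long bytes) {
        inputBytes += bytes;
    }

    // Record how many characters have been emitted to the output so far.
    void recordOutput(long chars) {
        outputChars += chars;
    }

    // Flags input whose output is both large and far bigger than the input consumed.
    boolean looksLikeBomb() {
        return outputChars > outputThreshold && outputChars > inputBytes * maxRatio;
    }
}
```

A caller would update the counters while parsing and abort once looksLikeBomb() returns true. Lower threshold and ratio values make the check more conservative.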

B

baseRevisionID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
 
BasicContentHandlerFactory - Class in org.apache.tika.sax
Basic factory for creating common types of ContentHandlers
BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE, int) - Constructor for class org.apache.tika.sax.BasicContentHandlerFactory
 
BasicContentHandlerFactory.HANDLER_TYPE - Enum in org.apache.tika.sax
Common handler types for content.
BasicObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
Base object for FSSHTTPB.
BasicObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
 
BasicTikaFSConsumer - Class in org.apache.tika.batch.fs
Basic FileResourceConsumer that reads files from an input directory and writes content to the output directory.
BasicTikaFSConsumer(ArrayBlockingQueue<FileResource>, ParserFactory, ContentHandlerFactory, OutputStreamFactory, TikaConfig) - Constructor for class org.apache.tika.batch.fs.BasicTikaFSConsumer
BasicTikaFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory) - Constructor for class org.apache.tika.batch.fs.BasicTikaFSConsumer
 
BasicTikaFSConsumersBuilder - Class in org.apache.tika.batch.fs.builders
 
BasicTikaFSConsumersBuilder() - Constructor for class org.apache.tika.batch.fs.builders.BasicTikaFSConsumersBuilder
 
BasicTokenCountStatsCalculator - Class in org.apache.tika.eval.core.textstats
 
BasicTokenCountStatsCalculator() - Constructor for class org.apache.tika.eval.core.textstats.BasicTokenCountStatsCalculator
 
batchInsert(PreparedStatement, TableInfo, Map<Cols, String>) - Static method in class org.apache.tika.eval.app.db.JDBCUtil
 
BatchNoRestartError - Error in org.apache.tika.batch
FileResourceConsumers should throw this if something catastrophic has happened and the BatchProcess should shut down and not be restarted.
BatchNoRestartError(Throwable) - Constructor for error org.apache.tika.batch.BatchNoRestartError
 
BatchNoRestartError(String) - Constructor for error org.apache.tika.batch.BatchNoRestartError
 
BatchNoRestartError(String, Throwable) - Constructor for error org.apache.tika.batch.BatchNoRestartError
 
BatchProcess - Class in org.apache.tika.batch
This is the main processor class for a single process.
BatchProcess(FileResourceCrawler, ConsumersManager, StatusReporter, Interrupter) - Constructor for class org.apache.tika.batch.BatchProcess
 
BatchProcess.BATCH_CONSTANTS - Enum in org.apache.tika.batch
 
BatchProcessBuilder - Class in org.apache.tika.batch.builders
Builds a BatchProcessor from a combination of runtime arguments and the config file.
BatchProcessBuilder() - Constructor for class org.apache.tika.batch.builders.BatchProcessBuilder
 
BatchProcessDriverCLI - Class in org.apache.tika.batch
 
BatchProcessDriverCLI(String[]) - Constructor for class org.apache.tika.batch.BatchProcessDriverCLI
 
BatchTopCommonTokenCounter - Class in org.apache.tika.eval.app.tools
Utility class that runs TopCommonTokenCounter against a directory of table files (named {lang}_table.gz or Leipzig-like afr_...-sentences.txt) and outputs common tokens files for each input table file in the output directory.
BatchTopCommonTokenCounter() - Constructor for class org.apache.tika.eval.app.tools.BatchTopCommonTokenCounter
 
BIG - Static variable in class org.apache.tika.metadata.MachineMetadata.Endian
 
BinaryItem - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
BinaryItem() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
Initializes a new instance of the BinaryItem class.
BinaryItem(Collection<Byte>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
Initializes a new instance of the BinaryItem class with the specified content.
BIND_EXCEPTION - Static variable in class org.apache.tika.server.core.TikaServerProcess
 
Bit - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
This class is used to read and set bit values in a byte array.
Bit() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
 
BitConverter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
BitConverter() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
BitReader - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
A class used to extract values across byte boundaries with arbitrary bit positions.
BitReader(byte[], int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Initializes a new instance of the BitReader class with the specified byte buffer and start position in bytes.
BITS_PER_SAMPLE - Static variable in interface org.apache.tika.metadata.TIFF
"Number of bits per component in each channel."
BITUNES - Static variable in class org.apache.tika.detect.apple.BPListDetector
 
BitWriter - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
BitWriter(int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Initializes a new instance of the BitWriter class with the specified buffer size in bytes.
blobExtendedGUID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
 
BMEMGRAPH - Static variable in class org.apache.tika.detect.apple.BPListDetector
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
 
body - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
 
BodyContentHandler - Class in org.apache.tika.sax
Content handler decorator that only passes everything inside the XHTML <body/> tag to the underlying handler.
BodyContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that passes all XHTML body events to the given underlying content handler.
BodyContentHandler(Writer) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to the given writer.
BodyContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to the given output stream using the default encoding.
BodyContentHandler(int) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to an internal string buffer.
BodyContentHandler() - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to an internal string buffer.
BoilerpipeContentHandler - Class in org.apache.tika.sax.boilerpipe
Uses the boilerpipe library to automatically extract the main content from a web page.
BoilerpipeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
Creates a new boilerpipe-based content extractor, using the DefaultExtractor extraction rules and "delegate" as the content handler.
BoilerpipeContentHandler(Writer) - Constructor for class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
Creates a content handler that writes XHTML body character events to the given writer.
BoilerpipeContentHandler(ContentHandler, BoilerpipeExtractor) - Constructor for class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
Creates a new boilerpipe-based content extractor, using the given extraction rules.
boolValue - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
 
BouncyCastleDigester - Class in org.apache.tika.parser.digestutils
Digester that relies on BouncyCastle for MessageDigest implementations.
BouncyCastleDigester(int, String) - Constructor for class org.apache.tika.parser.digestutils.BouncyCastleDigester
Include a string representing the comma-separated algorithms to run: e.g.
BoundedInputStream - Class in org.apache.tika.io
Very slight modification of Commons' BoundedInputStream so that we can figure out if this hit the bound or not.
BoundedInputStream(long, InputStream) - Constructor for class org.apache.tika.io.BoundedInputStream
 
BPGParser - Class in org.apache.tika.parser.image
Parser for the Better Portable Graphics (BPG) File Format.
BPGParser() - Constructor for class org.apache.tika.parser.image.BPGParser
 
BPLIST - Static variable in class org.apache.tika.detect.apple.BPListDetector
 
BPListDetector - Class in org.apache.tika.detect.apple
Detector for BPList with utility functions for PList.
BPListDetector() - Constructor for class org.apache.tika.detect.apple.BPListDetector
 
BROTLI - Static variable in class org.apache.tika.detect.zip.CompressorConstants
 
BufferUnderrunException() - Constructor for exception org.apache.tika.io.EndianUtils.BufferUnderrunException
 
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.batch.builders.AbstractConsumersBuilder
 
build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.AppParserFactoryBuilder
 
build(InputStream, Map<String, String>) - Method in class org.apache.tika.batch.builders.BatchProcessBuilder
Builds a BatchProcess from runtime arguments and an input stream of a configuration file.
build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.BatchProcessBuilder
Builds a FileResourceBatchProcessor from runtime arguments and a document node of a configuration file.
build(InputStream) - Method in class org.apache.tika.batch.builders.CommandLineParserBuilder
 
build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.DefaultContentHandlerFactoryBuilder
 
build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.IContentHandlerFactoryBuilder
 
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in interface org.apache.tika.batch.builders.ICrawlerBuilder
 
build(Node, long, Map<String, String>) - Method in class org.apache.tika.batch.builders.InterrupterBuilder
 
build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.IParserFactoryBuilder
 
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in interface org.apache.tika.batch.builders.ObjectFromDOMAndQueueBuilder
 
build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.ObjectFromDOMBuilder
 
build(Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.ParserFactoryBuilder
 
build(Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.ReporterBuilder
 
build(FileResourceCrawler, ConsumersManager, Node, Map<String, String>) - Method in class org.apache.tika.batch.builders.SimpleLogReporterBuilder
 
build(FileResourceCrawler, ConsumersManager, Node, Map<String, String>) - Method in interface org.apache.tika.batch.builders.StatusReporterBuilder
 
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.batch.fs.builders.BasicTikaFSConsumersBuilder
 
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.batch.fs.builders.FSCrawlerBuilder
 
build() - Method in class org.apache.tika.client.HttpClientFactory
 
build() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
 
build() - Method in class org.apache.tika.eval.app.batch.EvalConsumerBuilder
 
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.eval.app.batch.EvalConsumersBuilder
 
build() - Method in class org.apache.tika.eval.app.batch.ExtractComparerBuilder
 
build() - Method in class org.apache.tika.eval.app.batch.ExtractProfilerBuilder
 
build() - Method in class org.apache.tika.eval.app.batch.FileProfilerBuilder
 
build(Path) - Static method in class org.apache.tika.eval.app.reports.ResultsReporter
 
build() - Method in class org.apache.tika.fork.ParserFactoryFactory
 
BUILD - Static variable in interface org.apache.tika.metadata.QuattroPro
Build.
build() - Method in class org.apache.tika.parser.AutoDetectParserFactory
 
Build(byte[]) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject.RootNodeObjectBuilder
This method is used to build a root node object from a byte array
Build(List<ObjectGroupDataElementData>, ObjectGroupObjectData, ExGuid) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
This method is used to build an intermediate node object from a list of object group data elements
Build(byte[], SignatureObject) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject.IntermediateNodeObjectBuilder
This method is used to build an intermediate node object from a byte array with a signature
build(NodeObject) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData.Builder
This method is used to build a list of DataElement from a node object
build() - Method in class org.apache.tika.parser.ParserFactory
 
build(Path) - Static method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
build() - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
 
build2() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
Initialize the MimeTypes with this builder instance
buildClass(Class<T>, String) - Static method in class org.apache.tika.util.ClassLoaderUtil
 
buildComposite(String, Class<P>, String, Class<T>, InputStream) - Static method in class org.apache.tika.config.ConfigBase
Use this to build a list of components for a composite item (e.g.
buildComposite(String, Class<P>, String, Class<T>, Element) - Static method in class org.apache.tika.config.ConfigBase
 
buildDataElements(byte[], AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to build a list of data elements to represent a file.
buildDOM(InputStream, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
This checks context for a user specified DocumentBuilder.
buildDOM(Path) - Static method in class org.apache.tika.utils.XMLReaderUtils
Builds a Document with a DocumentBuilder from the pool
buildDOM(String) - Static method in class org.apache.tika.utils.XMLReaderUtils
Builds a Document with a DocumentBuilder from the pool
buildDOM(InputStream) - Static method in class org.apache.tika.utils.XMLReaderUtils
Builds a Document with a DocumentBuilder from the pool
Builder() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
 
Builder() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData.Builder
 
buildExtractReader(Map<String, String>) - Method in class org.apache.tika.eval.app.batch.EvalConsumerBuilder
 
buildParagraphTagAndStyle(String, boolean) - Static method in class org.apache.tika.parser.microsoft.WordExtractor
Given a style name, return what tag should be used, and what style should be applied to it.
buildSingle(String, Class<T>, InputStream) - Static method in class org.apache.tika.config.ConfigBase
Use this to build a single class, where the user specifies the instance class, e.g.
buildSingle(String, Class<T>, Element, T) - Static method in class org.apache.tika.config.ConfigBase
Use this to build a single class, where the user specifies the instance class, e.g.
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
 
BWEBARCHIVE - Static variable in class org.apache.tika.detect.apple.BPListDetector
 
BYTE_ARRAY_LENGHT - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
ByteDeleter - Class in org.apache.tika.fuzzing.general
 
ByteDeleter() - Constructor for class org.apache.tika.fuzzing.general.ByteDeleter
 
ByteFlipper - Class in org.apache.tika.fuzzing.general
 
ByteFlipper() - Constructor for class org.apache.tika.fuzzing.general.ByteFlipper
 
ByteInjector - Class in org.apache.tika.fuzzing.general
 
ByteInjector() - Constructor for class org.apache.tika.fuzzing.general.ByteInjector
 
BytesRefCalculator<T> - Interface in org.apache.tika.eval.core.textstats
Interface for calculators that require a string
BytesRefCalculator.BytesRefCalcInstance<T> - Interface in org.apache.tika.eval.core.textstats
 
ByteUtil - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
ByteUtil() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.ByteUtil
 
BZIP - Static variable in class org.apache.tika.detect.zip.CompressorConstants
 
BZIP2 - Static variable in class org.apache.tika.detect.zip.CompressorConstants
 
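The BoundedInputStream entry above describes a stream that can report whether it hit its bound. The sketch below shows one way such tracking could work, using only java.io. The class BoundTrackingStream and its hitBound() method are hypothetical names for this example. They are not the Tika class or its API, and the real implementation may differ.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch only: BoundTrackingStream is an illustrative name, not a Tika class.
class BoundTrackingStream extends InputStream {
    private final long max;       // maximum number of bytes to hand out
    private final InputStream in; // wrapped stream
    private long pos;             // bytes handed out so far

    BoundTrackingStream(long max, InputStream in) {
        this.max = max;
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (pos >= max) {
            return -1; // stop at the bound even if the wrapped stream has more data
        }
        int b = in.read();
        if (b != -1) {
            pos++;
        }
        return b;
    }

    // True once the number of bytes read has reached the bound.
    boolean hitBound() {
        return pos >= max;
    }
}
```

After draining the stream, a caller can check hitBound() to tell a stream that was cut off at the limit from one that ended before reaching it.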

C

CachedTranslator - Class in org.apache.tika.language.translate.impl
CachedTranslator.
CachedTranslator() - Constructor for class org.apache.tika.language.translate.impl.CachedTranslator
Create a new CachedTranslator (must set the Translator with CachedTranslator.setTranslator(Translator) before use!)
CachedTranslator(Translator) - Constructor for class org.apache.tika.language.translate.impl.CachedTranslator
Create a new CachedTranslator.
calcTextStats(ContentTags) - Method in class org.apache.tika.eval.app.AbstractProfiler
 
calculate(String) - Method in class org.apache.tika.eval.core.langid.LanguageIDWrapper
 
calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.BasicTokenCountStatsCalculator
 
calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokens
 
calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensBhattacharyya
 
calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensCosine
 
calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensHellinger
 
calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensKLDivergence
 
calculate(List<LanguageResult>, TokenCounts) - Method in class org.apache.tika.eval.core.textstats.CommonTokensKLDNormed
 
calculate(String) - Method in class org.apache.tika.eval.core.textstats.CompositeTextStatsCalculator
 
calculate(String) - Method in class org.apache.tika.eval.core.textstats.ContentLengthCalculator
 
calculate(List<LanguageResult>, TokenCounts) - Method in interface org.apache.tika.eval.core.textstats.LanguageAwareTokenCountStats
 
calculate(String) - Method in interface org.apache.tika.eval.core.textstats.StringStatsCalculator
 
calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TextProfileSignature
 
calculate(TokenCounts) - Method in interface org.apache.tika.eval.core.textstats.TokenCountStatsCalculator
 
calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TokenEntropy
 
calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TokenLengths
 
calculate(TokenCounts) - Method in class org.apache.tika.eval.core.textstats.TopNTokens
 
calculate(String) - Method in class org.apache.tika.eval.core.textstats.UnicodeBlockCounter
 
calculateContrastStatistics(TokenCounts, TokenCounts) - Method in class org.apache.tika.eval.core.tokens.TokenContraster
 
call() - Method in class org.apache.tika.batch.BatchProcess
Runs main execution loop.
call() - Method in class org.apache.tika.batch.FileResourceConsumer
 
call() - Method in class org.apache.tika.batch.FileResourceCrawler
 
call() - Method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
 
call() - Method in class org.apache.tika.batch.Interrupter
 
call() - Method in class org.apache.tika.batch.StatusReporter
Startup the reporter.
call() - Method in class org.apache.tika.pipes.async.AsyncEmitter
 
call() - Method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
call() - Method in class org.apache.tika.server.core.TikaServerWatchDog
 
CAN_MODIFY - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can any modifications be made to the document
CAN_MODIFY_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user modify annotations
CAN_PRINT - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user print the document
CAN_PRINT_DEGRADED - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user print an image-degraded version of the document.
canRun() - Static method in class org.apache.tika.langdetect.mitll.TextLangDetector
 
canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
 
CantFuzzException - Exception in org.apache.tika.fuzzing.exceptions
 
CantFuzzException(String) - Constructor for exception org.apache.tika.fuzzing.exceptions.CantFuzzException
 
CAPTION_WRITER - Static variable in interface org.apache.tika.metadata.Photoshop
 
CaptionObject - Class in org.apache.tika.parser.captioning
A model for caption objects from graphics and texts; it typically includes a human-readable sentence, the language of the sentence, and a confidence score.
CaptionObject(String, String, double) - Constructor for class org.apache.tika.parser.captioning.CaptionObject
 
cast(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
Returns the given stream cast to a TikaInputStream, or null if the stream is not a TikaInputStream.
CATEGORY - Static variable in interface org.apache.tika.metadata.IPTC
Deprecated. 
CATEGORY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
A categorization of the content of this package.
CATEGORY - Static variable in interface org.apache.tika.metadata.Photoshop
 
cb - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
 
Cell - Interface in org.apache.tika.parser.microsoft
Cell of content.
cell(String, String, XSSFComment) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
CellDecorator - Class in org.apache.tika.parser.microsoft
Cell decorator.
CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
 
CellID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
CellID(ExGuid, ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Initializes a new instance of the CellID class with specified ExGuids.
CellID(CellID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Initializes a new instance of the CellID class; this is the copy constructor.
CellID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Initializes a new instance of the CellID class; this is the default constructor.
cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
 
cellID - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
 
CellIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
CellIDArray(long, List<CellID>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
Initializes a new instance of the CellIDArray class.
CellIDArray(CellIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
Initializes a new instance of the CellIDArray class; this is the copy constructor.
CellIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
Initializes a new instance of the CellIDArray class; this is the default constructor.
cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
 
cellIDArray - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
 
CellManifestCurrentRevision - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
CellManifestCurrentRevision() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
Initializes a new instance of the CellManifestCurrentRevision class.
cellManifestCurrentRevision - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
 
cellManifestCurrentRevisionExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
 
CellManifestDataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Cell manifest data element
CellManifestDataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
Initializes a new instance of the CellManifestDataElementData class.
cellManifests - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
cellMappingExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
 
cellMappingSerialNumber - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
 
cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
 
cellReferencesCount - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
 
CellSecondExGuid - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
 
CERTIFICATE - Static variable in interface org.apache.tika.metadata.XMPRights
A Web URL for a rights management certificate.
ChannelTypePropertyConverter() - Constructor for class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
Deprecated.
 
CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.Office
The number of Characters in the document
CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.Office
The number of Characters in the document, including spaces
characters - Variable in class org.apache.tika.mime.MimeTypesReader
 
characters(char[], int, int) - Method in class org.apache.tika.mime.MimeTypesReader
 
characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
characters(char[], int, int) - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.tmx.TMXContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
characters(char[], int, int) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
characters(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
The characters method is called whenever a Parser wants to pass raw...
characters(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler.
characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
Writes the given characters to the given character stream.
characters(char[], int, int) - Method in class org.apache.tika.sax.ToXMLContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
Writes the given characters to the given character stream.
characters(char[], int, int) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
CHARACTERS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
 
CharsetContentHandlerFactory() - Constructor for class org.apache.tika.example.PickBestTextEncodingParser.CharsetContentHandlerFactory
Deprecated.
 
CharsetDetector - Class in org.apache.tika.parser.txt
CharsetDetector provides a facility for detecting the charset or encoding of character data in an unknown format.
CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
Constructor
CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
 
CharsetMatch - Class in org.apache.tika.parser.txt
This class represents a charset that has been identified by a CharsetDetector as a possible encoding for a set of input data.
CharsetTester() - Constructor for class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
Deprecated.
 
CharsetUtils - Class in org.apache.tika.utils
 
CharsetUtils() - Constructor for class org.apache.tika.utils.CharsetUtils
 
check(String, int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
Checks to see if the command can be run.
check(String[], int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
Checks to see if the command can be run.
check(String, int...) - Static method in class org.apache.tika.parser.external.ExternalParser
Checks to see if the command can be run.
check(String[], int...) - Static method in class org.apache.tika.parser.external.ExternalParser
 
check(Metadata) - Method in class org.apache.tika.parser.pdf.AccessChecker
Checks to see if a document's content should be extracted based on metadata values and the value of AccessChecker.allowAccessibility in the constructor.
CHECK_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
 
checkActive() - Method in class org.apache.tika.pipes.async.AsyncProcessor
 
checkAvail() - Method in class org.apache.tika.parser.geo.gazetteer.GeoGazetteerClient
Ping lucene-geo-gazetteer API
checkBit(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
 
checkCommand(String, int...) - Method in class org.apache.tika.language.translate.impl.ExternalTranslator
Checks to see if the command can be run.
checkForTimedOutMillis(long) - Method in class org.apache.tika.batch.FileResourceConsumer
Checks to see if the currentFile being processed (if there is one) should be timed out (still being worked on after staleThresholdMillis).
checkHasFile() - Static method in class org.apache.tika.detect.FileCommandDetector
 
checkHasFile(String) - Static method in class org.apache.tika.detect.FileCommandDetector
 
checkInitialization(InitializableProblemHandler) - Method in interface org.apache.tika.config.Initializable
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.external2.ExternalParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.RegexCaptureParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sqlite3.SQLite3Parser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.strings.StringsParser
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.transcribe.aws.AmazonTranscribe
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitter
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.emitter.s3.S3Emitter
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcher
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.fetcher.gcs.GCSFetcher
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.fetcher.s3.S3Fetcher
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.azblob.AZBlobPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.csv.CSVPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.filelist.FileListPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.gcs.GCSPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.jdbc.JDBCPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.s3.S3PipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.pipes.pipesiterator.solr.SolrPipesIterator
 
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.server.core.TlsConfig
 
checkIntegrity() - Method in class org.apache.tika.eval.app.tools.SlowCompositeReaderWrapper
 
checkIsOperating() - Static method in class org.apache.tika.server.core.resource.TikaResource
 
checkThisIsAncestorOfOrSameAsThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
Deprecated.
checkThisIsAncestorOfThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
Deprecated.
ChildMatcher - Class in org.apache.tika.sax.xpath
Intermediate evaluation state of a .../*... XPath expression.
ChildMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.ChildMatcher
 
CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_VER_1 - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_VER_2 - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_VER_3 - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
ChmAccessor<T> - Interface in org.apache.tika.parser.microsoft.chm
Defines an accessor interface
ChmAssert - Class in org.apache.tika.parser.microsoft.chm
Contains chm extractor assertions
ChmAssert() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmAssert
 
ChmBlockInfo - Class in org.apache.tika.parser.microsoft.chm
A container that contains chm block information such as: i.
ChmCommons - Class in org.apache.tika.parser.microsoft.chm
 
ChmCommons.EntryType - Enum in org.apache.tika.parser.microsoft.chm
Represents entry types: uncompressed, compressed
ChmCommons.IntelState - Enum in org.apache.tika.parser.microsoft.chm
Represents intel file states during decompression
ChmCommons.LzxState - Enum in org.apache.tika.parser.microsoft.chm
Represents lzx states: started decoding, not started decoding
ChmConstants - Class in org.apache.tika.parser.microsoft.chm
 
ChmDirectoryListingSet - Class in org.apache.tika.parser.microsoft.chm
Holds chm listing entries
ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
Constructs chm directory listing set
ChmExtractor - Class in org.apache.tika.parser.microsoft.chm
Extracts text from chm file.
ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmExtractor
 
ChmItsfHeader - Class in org.apache.tika.parser.microsoft.chm
The Header
    0000: char[4] 'ITSF'
    0004: DWORD 3 (Version number)
    0008: DWORD Total header length, including header section table and following data.
ChmItsfHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
 
ChmItspHeader - Class in org.apache.tika.parser.microsoft.chm
Directory header. The directory starts with a header; its format is as follows:
    0000: char[4] 'ITSP'
    0004: DWORD Version number 1
    0008: DWORD Length of the directory header
    000C: DWORD $0a (unknown)
    0010: DWORD $1000 Directory chunk size
    0014: DWORD "Density" of quickref section, usually 2
    0018: DWORD Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
    001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
    0020: DWORD Chunk number of first PMGL (listing) chunk
    0024: DWORD Chunk number of last PMGL (listing) chunk
    0028: DWORD -1 (unknown)
    002C: DWORD Number of directory chunks (total)
    0030: DWORD Windows language ID
    0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
    0044: DWORD $54 (This is the length again)
    0048: DWORD -1 (unknown)
    004C: DWORD -1 (unknown)
    0050: DWORD -1 (unknown)
ChmItspHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmItspHeader
 
ChmLzxBlock - Class in org.apache.tika.parser.microsoft.chm
Decompresses a chm block.
ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
 
ChmLzxcControlData - Class in org.apache.tika.parser.microsoft.chm
::DataSpace/Storage//ControlData This file contains $20 bytes of information on the compression.
ChmLzxcControlData() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxcControlData
 
ChmLzxcResetTable - Class in org.apache.tika.parser.microsoft.chm
LZXC reset table For ensuring a decompression.
ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
 
ChmLzxState - Class in org.apache.tika.parser.microsoft.chm
 
ChmLzxState(int) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
ChmParser - Class in org.apache.tika.parser.microsoft.chm
 
ChmParser() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmParser
 
ChmParsingException - Exception in org.apache.tika.parser.microsoft.chm
 
ChmParsingException(String) - Constructor for exception org.apache.tika.parser.microsoft.chm.ChmParsingException
 
ChmPmgiHeader - Class in org.apache.tika.parser.microsoft.chm
Description (note: an index chunk does not always exist). An index chunk has the following format:
    0000: char[4] 'PMGI'
    0004: DWORD Length of quickref/free area at end of directory chunk
    0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL. The format of a directory index entry is as follows:
    BYTE: length of name
    BYTEs: name (UTF-8 encoded)
    ENCINT: directory listing chunk which starts with name
Encoded Integers aka ENCINT: An ENCINT is a variable-length integer.
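The ENCINT named in the ChmPmgiHeader entry above can be decoded with a short loop. The following is a minimal sketch, not Tika's implementation: it assumes the layout given in the publicly circulated CHM format notes (seven payload bits per byte, most significant group first, high bit set on every byte except the last). The class name `EncIntSketch` and its `decode` method are hypothetical.

```java
/** Hypothetical helper, not part of Tika: decodes a CHM-style ENCINT. */
final class EncIntSketch {

    /**
     * Decodes one ENCINT starting at data[offset].
     * Assumption: each byte carries 7 payload bits, most significant group
     * first, and a set high bit means that another byte follows.
     */
    static long decode(byte[] data, int offset) {
        long value = 0;
        int i = offset;
        while (true) {
            if (i >= data.length) {
                throw new IllegalArgumentException("truncated ENCINT");
            }
            int b = data[i++] & 0xFF;          // read the byte as unsigned
            value = (value << 7) | (b & 0x7F); // append the 7 payload bits
            if ((b & 0x80) == 0) {             // high bit clear: last byte
                return value;
            }
        }
    }

    public static void main(String[] args) {
        byte[] sample = {(byte) 0xEA, 0x15};
        System.out.println(Long.toHexString(decode(sample, 0))); // prints 3515
    }
}
```

The sketch does not guard against values wider than 63 bits; a production decoder would likely cap the number of continuation bytes.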
ChmPmgiHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
 
ChmPmglHeader - Class in org.apache.tika.parser.microsoft.chm
Description There are two types of directory chunks -- index chunks, and listing chunks.
ChmPmglHeader() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
 
ChmSection - Class in org.apache.tika.parser.microsoft.chm
 
ChmSection(byte[]) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmSection
 
ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.microsoft.chm.ChmSection
 
ChmWrapper - Class in org.apache.tika.parser.microsoft.chm
 
ChmWrapper() - Constructor for class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
This method is used to chunk the file data.
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.RDCAnalysisChunking
This method is used to chunk the file data.
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.SimpleChunking
This method is used to chunk the file data.
chunking() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ZipFilesChunking
This method is used to chunk the file data.
ChunkingFactory - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
This class is used to create instance of AbstractChunking.
ChunkingMethod - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking
 
CITY - Static variable in interface org.apache.tika.metadata.IPTC
Name of the city the content is focussing on -- either the place shown in visual media or referenced by text or audio media.
CITY - Static variable in interface org.apache.tika.metadata.Photoshop
 
CJKBigramAwareLengthFilterFactory - Class in org.apache.tika.eval.core.tokens
Creates a very narrowly focused TokenFilter that limits tokens based on length _unless_ they've been identified as <DOUBLE> or <SINGLE> by the CJKBigramFilter.
CJKBigramAwareLengthFilterFactory(Map<String, String>) - Constructor for class org.apache.tika.eval.core.tokens.CJKBigramAwareLengthFilterFactory
 
ClassLoaderUtil - Class in org.apache.tika.util
 
ClassLoaderUtil() - Constructor for class org.apache.tika.util.ClassLoaderUtil
 
className - Variable in class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
 
ClassParser - Class in org.apache.tika.parser.asm
Parser for Java .class files.
ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
 
clean(String) - Static method in class org.apache.tika.sax.CleanPhoneText
 
clean(String) - Static method in class org.apache.tika.utils.CharsetUtils
Handle various common charset name errors, and return something that will be considered valid (and is normalized)
CleanPhoneText - Class in org.apache.tika.sax
Class to help de-obfuscate phone numbers in text.
CleanPhoneText() - Constructor for class org.apache.tika.sax.CleanPhoneText
 
cleanSubstitutions - Static variable in class org.apache.tika.sax.CleanPhoneText
 
clear(String) - Method in class org.apache.tika.eval.core.tokens.TokenCounter
Deprecated.
 
clearBit(byte[], long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.Bit
Set a bit value to "Off" in the specified byte array with the specified bit position.
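As an illustration of what clearing a bit at a given position in a byte array involves, here is a minimal sketch. It is not the Tika `Bit` class: the class name `BitSketch` is hypothetical, and the bit numbering (bit 0 is the least significant bit of the first byte) is an assumption that the index entry does not confirm.

```java
/** Hypothetical helper, not part of Tika: clears one bit in a byte array. */
final class BitSketch {

    /**
     * Sets the bit at bitPosition to 0.
     * Assumption: bit 0 is the least significant bit of array[0],
     * bit 8 is the least significant bit of array[1], and so on.
     */
    static void clearBit(byte[] array, long bitPosition) {
        int byteIndex = (int) (bitPosition / 8); // which byte holds the bit
        int bitInByte = (int) (bitPosition % 8); // which bit inside that byte
        array[byteIndex] &= (byte) ~(1 << bitInByte);
    }

    public static void main(String[] args) {
        byte[] data = {(byte) 0xFF, (byte) 0xFF};
        clearBit(data, 9);
        System.out.println(Integer.toHexString(data[1] & 0xFF)); // prints fd
    }
}
```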
ClearByMimeMetadataFilter - Class in org.apache.tika.metadata.filter
This class clears the entire metadata object if the mime matches the mime filter.
ClearByMimeMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.ClearByMimeMetadataFilter
 
ClearByMimeMetadataFilter(Set<String>) - Constructor for class org.apache.tika.metadata.filter.ClearByMimeMetadataFilter
 
clearProfiles() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Clears the current map of language profiles
CLIENT_UNAVAILABLE_WITHIN_MS - Static variable in class org.apache.tika.pipes.PipesResult
 
ClimateForcast - Interface in org.apache.tika.metadata
Met keys from NCAR CCSM files in the Climate Forecast Convention.
clone() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
cloneAndUpdate(TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
cloneAndUpdate(PDFParserConfig) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
cloneMetadata(Metadata) - Static method in class org.apache.tika.utils.ParserUtils
Does a deep clone of a Metadata object.
close(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
 
close() - Method in class org.apache.tika.eval.app.db.DBBuffer
 
close() - Method in class org.apache.tika.eval.app.db.MimeBuffer
 
close() - Method in class org.apache.tika.eval.app.io.DBWriter
This closes the writer by executing batch and committing changes.
close() - Method in interface org.apache.tika.eval.app.io.IDBWriter
 
close() - Method in class org.apache.tika.eval.core.tokens.CommonTokenCountManager
 
close() - Method in class org.apache.tika.fork.ForkParser
 
close() - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
This will close the stream.
close() - Method in class org.apache.tika.io.LookaheadInputStream
 
close() - Method in class org.apache.tika.io.TemporaryResources
Closes all tracked resources.
close() - Method in class org.apache.tika.io.TikaInputStream
 
close() - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
 
close() - Method in class org.apache.tika.language.detect.LanguageWriter
Ignored.
close() - Method in class org.apache.tika.language.translate.impl.MarianTranslator.MarianServerClient
Close the connection to the Marian Server.
close() - Method in class org.apache.tika.metadata.serialization.JsonStreamingSerializer
 
close() - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
Override this for any special handling of closing the connection.
close() - Method in class org.apache.tika.parser.microsoft.ooxml.OPCPackageWrapper
 
close() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
close() - Method in class org.apache.tika.parser.ParsingReader
Closes the read end of the pipe.
close() - Method in class org.apache.tika.pipes.async.AsyncProcessor
 
close() - Method in class org.apache.tika.pipes.PipesClient
 
close() - Method in class org.apache.tika.pipes.PipesParser
 
close() - Method in class org.apache.tika.pipes.PipesReporter
No-op implementation.
close() - Method in class org.apache.tika.server.core.resource.PipesResource
 
close() - Method in class org.apache.tika.utils.RereadableInputStream
Closes the input stream and removes the temporary file if one was created.
closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
Closes all formatting tags.
closeWriter() - Method in class org.apache.tika.eval.app.AbstractProfiler
 
ColInfo - Class in org.apache.tika.eval.app.db
 
ColInfo(Cols, int) - Constructor for class org.apache.tika.eval.app.db.ColInfo
 
ColInfo(Cols, int, String) - Constructor for class org.apache.tika.eval.app.db.ColInfo
 
ColInfo(Cols, int, Integer) - Constructor for class org.apache.tika.eval.app.db.ColInfo
 
ColInfo(Cols, int, Integer, String) - Constructor for class org.apache.tika.eval.app.db.ColInfo
 
COLOR_MODE - Static variable in interface org.apache.tika.metadata.Photoshop
 
Cols - Enum in org.apache.tika.eval.app.db
 
COLUMN_COUNT - Static variable in interface org.apache.tika.metadata.Database
 
COLUMN_NAME - Static variable in interface org.apache.tika.metadata.Database
 
COMMAND_LINE - Static variable in interface org.apache.tika.metadata.ClimateForcast
 
COMMAND_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
 
CommandLineParserBuilder - Class in org.apache.tika.batch.builders
Reads configurable options from a config file and returns org.apache.commons.cli.Options object to be used in commandline parser.
CommandLineParserBuilder() - Constructor for class org.apache.tika.batch.builders.CommandLineParserBuilder
 
COMMENT - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The start to a PDF comment.
COMMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
 
COMMENT_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
COMMENTS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
 
COMMENTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
CommonsDigester - Class in org.apache.tika.parser.digestutils
Implementation of DigestingParser.Digester that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
CommonsDigester(int, String) - Constructor for class org.apache.tika.parser.digestutils.CommonsDigester
Include a string representing the comma-separated algorithms to run: e.g.
CommonsDigester(int, CommonsDigester.DigestAlgorithm...) - Constructor for class org.apache.tika.parser.digestutils.CommonsDigester
CommonsDigester.DigestAlgorithm - Enum in org.apache.tika.parser.digestutils
 
CommonTokenCountManager - Class in org.apache.tika.eval.core.tokens
 
CommonTokenCountManager() - Constructor for class org.apache.tika.eval.core.tokens.CommonTokenCountManager
 
CommonTokenCountManager(Path, String) - Constructor for class org.apache.tika.eval.core.tokens.CommonTokenCountManager
 
CommonTokenOverlapCounter - Class in org.apache.tika.eval.app.tools
 
CommonTokenOverlapCounter() - Constructor for class org.apache.tika.eval.app.tools.CommonTokenOverlapCounter
 
CommonTokenResult - Class in org.apache.tika.eval.core.tokens
 
CommonTokenResult(String, int, int, int, int) - Constructor for class org.apache.tika.eval.core.tokens.CommonTokenResult
 
CommonTokens - Class in org.apache.tika.eval.core.textstats
 
CommonTokens() - Constructor for class org.apache.tika.eval.core.textstats.CommonTokens
 
CommonTokens(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokens
 
CommonTokensBhattacharyya - Class in org.apache.tika.eval.core.textstats
 
CommonTokensBhattacharyya(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensBhattacharyya
 
CommonTokensCosine - Class in org.apache.tika.eval.core.textstats
 
CommonTokensCosine(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensCosine
 
CommonTokensHellinger - Class in org.apache.tika.eval.core.textstats
 
CommonTokensHellinger(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensHellinger
 
CommonTokensKLDivergence - Class in org.apache.tika.eval.core.textstats
 
CommonTokensKLDivergence(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensKLDivergence
 
CommonTokensKLDNormed - Class in org.apache.tika.eval.core.textstats
 
CommonTokensKLDNormed(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.core.textstats.CommonTokensKLDNormed
 
COMP_OBJ - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
Some other kind of embedded document, in a CompObj container within another OLE2 document
Compact64bitInt - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
A 9-byte encoding of values in the range 0x0002000000000000 through 0xFFFFFFFFFFFFFFFF
Compact64bitInt(long) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Initializes a new instance of the Compact64bitInt class with specified value.
Compact64bitInt() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Initializes a new instance of the Compact64bitInt class, this is the default constructor.
CompactID - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
This class is used to represent the CompactID structure.
CompactID() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
 
CompactUint14bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 14 bits type value.
CompactUint21bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 21 bits type value.
CompactUint28bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 28 bits type value.
CompactUint35bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 35 bits type value.
CompactUint42bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 42 bits type value.
CompactUint49bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 49 bits type value.
CompactUint64bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 64 bits type value.
CompactUint7bitType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint 7 bits type value.
CompactUintNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
Specify the type value for compact uint zero type value.
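The CompactUint*Type constants above suggest that a value is stored in the narrowest type that can hold it. The sketch below only computes the encoded length that such a scheme would pick. It rests on assumptions inferred from the constant names and from the stated 9-byte range starting at 0x0002000000000000: zero takes one byte, each 7-bit step adds one byte up to 49 bits, and anything wider uses the 9-byte form. The class name `CompactWidthSketch` is hypothetical, and the actual byte layout is not shown.

```java
/** Hypothetical helper, not part of Tika: picks an encoded length for a value. */
final class CompactWidthSketch {

    /**
     * Returns the assumed encoded length in bytes for an unsigned 64-bit value
     * (the long is interpreted as unsigned).
     */
    static int encodedLength(long value) {
        if (value == 0) {
            return 1; // assumed: the zero ("null") type occupies a single byte
        }
        // Number of significant bits when the long is read as unsigned.
        int bits = 64 - Long.numberOfLeadingZeros(value);
        if (bits > 49) {
            return 9; // 64-bit type: values from 0x0002000000000000 upward
        }
        return (bits + 6) / 7; // 7, 14, ..., 49 bits map to 1..7 bytes
    }

    public static void main(String[] args) {
        System.out.println(encodedLength(127L));      // prints 1
        System.out.println(encodedLength(128L));      // prints 2
        System.out.println(encodedLength(1L << 49));  // prints 9
    }
}
```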
COMPANY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
 
compare(String, String) - Method in class org.apache.tika.metadata.serialization.PrettyMetadataKeyComparator
 
compare(long, long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
compare(ClassResourceInfo, ClassResourceInfo, Message) - Method in class org.apache.tika.server.core.ProduceTypeResourceComparator
Compares the class to handle.
compare(OperationResourceInfo, OperationResourceInfo, Message) - Method in class org.apache.tika.server.core.ProduceTypeResourceComparator
Compares the method to handle.
compare(InputStream) - Method in class org.apache.tika.server.eval.TikaEvalResource
 
compareClassName(Object, Object) - Static method in class org.apache.tika.utils.CompareUtils
Compare two classes by class names.
compareFiles(EvalFilePaths, EvalFilePaths) - Method in class org.apache.tika.eval.app.ExtractComparer
 
compareTo(TokenIntPair) - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
Descending by value, ascending by token
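The ordering described for TokenIntPair (descending by value, ascending by token) can be expressed with a chained comparator. This is a sketch of that ordering only; the `Pair` class, its fields, and `sortedTokens` are hypothetical stand-ins and do not reproduce Tika's TokenIntPair.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration, not part of Tika. */
final class TokenOrderSketch {

    /** Stand-in for a token and its count. */
    static final class Pair {
        final String token;
        final int value;

        Pair(String token, int value) {
            this.token = token;
            this.value = value;
        }
    }

    /** Higher values first; ties are broken by token in ascending order. */
    static final Comparator<Pair> ORDER =
            Comparator.comparingInt((Pair p) -> p.value).reversed()
                    .thenComparing(p -> p.token);

    /** Returns the tokens of the given pairs in the order defined by ORDER. */
    static List<String> sortedTokens(List<Pair> pairs) {
        List<Pair> copy = new ArrayList<>(pairs);
        copy.sort(ORDER);
        List<String> tokens = new ArrayList<>();
        for (Pair p : copy) {
            tokens.add(p.token);
        }
        return tokens;
    }
}
```

With the pairs (b, 2), (a, 2) and (c, 5), `sortedTokens` returns c, a, b: the highest count comes first, and the tie between a and b is resolved alphabetically.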
compareTo(Property) - Method in class org.apache.tika.metadata.Property
 
compareTo(MediaType) - Method in class org.apache.tika.mime.MediaType
 
compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
 
compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
Sorts in descending order of confidence
compareTo(ExtendedGUID) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
compareTo(UByte) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
compareTo(UInteger) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
compareTo(ULong) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
compareTo(UShort) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
compareTo(GUID) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compare to other CharsetMatch objects.
CompareUtils - Class in org.apache.tika.utils
 
CompareUtils() - Constructor for class org.apache.tika.utils.CompareUtils
 
COMPARISON_CONTAINERS - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
COMPILATION - Static variable in interface org.apache.tika.metadata.XMPDM
"An album created by various artists."
complete(long) - Method in class org.apache.tika.server.core.ServerStatus
Removes the task from the collection of currently running tasks.
COMPLETED_SEMAPHORE - Static variable in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
COMPOSER - Static variable in interface org.apache.tika.metadata.XMPDM
"The composer's name."
composite(Property, Property[]) - Static method in class org.apache.tika.metadata.Property
Constructs a new composite property from the given primary and array of secondary properties.
CompositeDetector - Class in org.apache.tika.detect
Content type detector that combines multiple different detection mechanisms.
CompositeDetector(MediaTypeRegistry, List<Detector>, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.CompositeDetector
 
CompositeDetector(MediaTypeRegistry, List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
 
CompositeDetector(List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
 
CompositeDetector(Detector...) - Constructor for class org.apache.tika.detect.CompositeDetector
 
CompositeDigester - Class in org.apache.tika.parser.digest
 
CompositeDigester(DigestingParser.Digester...) - Constructor for class org.apache.tika.parser.digest.CompositeDigester
 
CompositeEncodingDetector - Class in org.apache.tika.detect
 
CompositeEncodingDetector(List<EncodingDetector>, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
 
CompositeEncodingDetector(List<EncodingDetector>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
 
CompositeExternalParser - Class in org.apache.tika.parser.external
A Composite Parser that wraps up all the available External Parsers, and provides an easy way to access them.
CompositeExternalParser() - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
 
CompositeExternalParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
 
CompositeMatcher - Class in org.apache.tika.sax.xpath
Composite XPath evaluation state.
CompositeMatcher(Matcher, Matcher) - Constructor for class org.apache.tika.sax.xpath.CompositeMatcher
 
CompositeMetadataFilter - Class in org.apache.tika.metadata.filter
 
CompositeMetadataFilter(List<MetadataFilter>) - Constructor for class org.apache.tika.metadata.filter.CompositeMetadataFilter
 
CompositeParseContextConfig - Class in org.apache.tika.server.core
 
CompositeParseContextConfig() - Constructor for class org.apache.tika.server.core.CompositeParseContextConfig
 
CompositeParser - Class in org.apache.tika.parser
Composite parser that delegates parsing tasks to a component parser based on the declared content type of the incoming document.
CompositeParser(MediaTypeRegistry, List<Parser>, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.CompositeParser
 
CompositeParser(MediaTypeRegistry, List<Parser>) - Constructor for class org.apache.tika.parser.CompositeParser
 
CompositeParser(MediaTypeRegistry, Parser...) - Constructor for class org.apache.tika.parser.CompositeParser
 
CompositeParser() - Constructor for class org.apache.tika.parser.CompositeParser
 
CompositeTagHandler - Class in org.apache.tika.parser.mp3
Takes an array of ID3Tags in preference order, and when asked for a given tag, will return it from the first ID3Tags that has it.
CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
 
CompositeTextStatsCalculator - Class in org.apache.tika.eval.core.textstats
 
CompositeTextStatsCalculator(List<TextStatsCalculator>) - Constructor for class org.apache.tika.eval.core.textstats.CompositeTextStatsCalculator
 
CompositeTextStatsCalculator(List<TextStatsCalculator>, Analyzer, LanguageIDWrapper) - Constructor for class org.apache.tika.eval.core.textstats.CompositeTextStatsCalculator
 
compound - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart
Gets or sets a value that specifies whether a compound parse type is needed; if set, the stream object MUST be ended with either an 8-bit stream object header end or a 16-bit stream object header end.
COMPRESS - Static variable in class org.apache.tika.detect.zip.CompressorConstants
 
CompressorConstants - Class in org.apache.tika.detect.zip
 
CompressorConstants() - Constructor for class org.apache.tika.detect.zip.CompressorConstants
 
CompressorParser - Class in org.apache.tika.parser.pkg
Parser for various compression formats.
CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
 
CompressorParserOptions - Interface in org.apache.tika.parser.pkg
Interface for setting options for the CompressorParser by passing via the ParseContext.
ConcurrentUtils - Class in org.apache.tika.utils
Utility class for concurrency in Tika.
ConcurrentUtils() - Constructor for class org.apache.tika.utils.ConcurrentUtils
 
confidence - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Confidence score
config - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
ConfigBase - Class in org.apache.tika.config
 
ConfigBase() - Constructor for class org.apache.tika.config.ConfigBase
 
ConfigurableThreadPoolExecutor - Interface in org.apache.tika.concurrent
Allows the thread pool to be configurable.
configure(String, InputStream) - Method in class org.apache.tika.config.ConfigBase
Use this to configure a subclass of ConfigBase, a single known object.
configure(ParseContext) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
Checks to see if the user has specified an OfficeParserConfig.
configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Configures the given pdf2XHTML.
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in class org.apache.tika.server.core.CompositeParseContextConfig
 
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in class org.apache.tika.server.core.config.DocumentSelectorConfig
 
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in class org.apache.tika.server.core.config.PasswordProviderConfig
 
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in class org.apache.tika.server.core.config.TimeoutConfig
 
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in interface org.apache.tika.server.core.ParseContextConfig
Configures the parseContext with present headers.
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in class org.apache.tika.server.standard.config.PDFServerConfig
Configures the parseContext with present headers.
configure(MultivaluedMap<String, String>, Metadata, ParseContext) - Method in class org.apache.tika.server.standard.config.TesseractServerConfig
Configures the parseContext with present headers.
configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
consume(String) - Method in interface org.apache.tika.parser.external.ExternalParser.LineConsumer
Consume a line
ConsumersManager - Class in org.apache.tika.batch
Simple interface around a collection of consumers that allows for initializing and shutting down shared resources (e.g.
ConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.ConsumersManager
 
CONTACT - Static variable in interface org.apache.tika.metadata.ClimateForcast
 
CONTACT_INFO_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
The contact information address part.
CONTACT_INFO_CITY - Static variable in interface org.apache.tika.metadata.IPTC
The contact information city part.
CONTACT_INFO_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
The contact information country part.
CONTACT_INFO_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
The contact information email address part.
CONTACT_INFO_PHONE - Static variable in interface org.apache.tika.metadata.IPTC
The contact information phone number part.
CONTACT_INFO_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
The contact information part denoting the local postal code.
CONTACT_INFO_STATE_PROVINCE - Static variable in interface org.apache.tika.metadata.IPTC
The contact information part denoting regional information such as state or province.
CONTACT_INFO_WEB_URL - Static variable in interface org.apache.tika.metadata.IPTC
The contact information web address part.
CONTAINER_EXCEPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
CONTAINER_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
 
ContainerExtractor - Interface in org.apache.tika.extractor
Tika container extractor interface.
contains(String) - Method in class org.apache.tika.eval.core.tokens.LangModel
 
contains(String, String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
Check whether this CachedTranslator's cache contains a translation of the text from the source language to the target language.
contains(String, String) - Method in class org.apache.tika.language.translate.impl.CachedTranslator
Check whether this CachedTranslator's cache contains a translation of the text to the target language, attempting to auto-detect the source language.
contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
 
contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
 
containsColumn(Cols) - Method in class org.apache.tika.eval.app.db.TableInfo
 
containsEmail(String) - Static method in class org.apache.tika.parser.mailcommons.MailUtil
Checks whether the chunk looks like it contains an email.
containsTable(String) - Method in class org.apache.tika.eval.app.db.JDBCUtil
 
CONTENT - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
 
content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
 
content - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Gets or sets an extended GUID array
CONTENT_COMPARISONS - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
The status of the content.
CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_TYPE_HINT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
This is currently used to identify Content-Type that may be included within a document, such as in html documents (e.g.
CONTENT_TYPE_PARSER_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
This is used by parsers to override detection of embedded resources with the override detector.
CONTENT_TYPE_USER_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
This is used by users to override detection with the override detector.
ContentHandlerDecorator - Class in org.apache.tika.sax
Decorator base class for the ContentHandler interface.
ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
Creates a decorator for the given SAX event handler.
ContentHandlerDecorator() - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
ContentHandlerDecoratorFactory - Interface in org.apache.tika.sax
 
ContentHandlerExample - Class in org.apache.tika.example
Examples of using different Content Handlers to get different parts of the file's contents
ContentHandlerExample() - Constructor for class org.apache.tika.example.ContentHandlerExample
 
ContentHandlerFactory - Interface in org.apache.tika.sax
Interface to allow easier injection of code for getting a new ContentHandler
ContentLengthCalculator - Class in org.apache.tika.eval.core.textstats
 
ContentLengthCalculator() - Constructor for class org.apache.tika.eval.core.textstats.ContentLengthCalculator
 
CONTENTS_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
 
CONTENTS_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
CONTENTS_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
ContentTagParser - Class in org.apache.tika.eval.core.util
 
ContentTagParser() - Constructor for class org.apache.tika.eval.core.util.ContentTagParser
 
ContentTags - Class in org.apache.tika.eval.core.util
 
ContentTags(String) - Constructor for class org.apache.tika.eval.core.util.ContentTags
 
ContentTags(String, boolean) - Constructor for class org.apache.tika.eval.core.util.ContentTags
 
ContentTags(String, Map<String, Integer>) - Constructor for class org.apache.tika.eval.core.util.ContentTags
 
contextIDs - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
 
ContrastStatistics - Class in org.apache.tika.eval.core.tokens
 
ContrastStatistics() - Constructor for class org.apache.tika.eval.core.tokens.ContrastStatistics
 
CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity responsible for making contributions to the content of the resource.
CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
CONTROL_DATA - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
CONTROLLED_VOCABULARY_TERM - Static variable in interface org.apache.tika.metadata.IPTC
A term to describe the content of the image by a value from a Controlled Vocabulary.
CONVENTIONS - Static variable in interface org.apache.tika.metadata.ClimateForcast
 
convert(Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
Deprecated.
How a standalone converter might work
convert(Metadata) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
 
convert(Metadata, String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
Convert the given Tika metadata map to XMP object.
convertAndSet(Metadata, Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
Deprecated.
How convert+set might work
convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Converts JSON Object to JSON Array
convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Parses a JSON String and converts it to a JSON Object
copy() - Method in class org.apache.tika.client.HttpClientFactory
 
copy(DirectoryEntry, DirectoryEntry) - Method in class org.apache.tika.extractor.microsoft.MSEmbeddedStreamTranslator
 
copyAtMost(Reader, Writer, int) - Method in class org.apache.tika.langdetect.LanguageDetectorTest
 
copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
 
COPYRIGHT - Static variable in interface org.apache.tika.metadata.XMPDM
"The copyright information."
COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
Contains any necessary copyright notice for claiming the intellectual property for this item and should identify the current owner of the copyright for the item.
COPYRIGHT_OWNER - Static variable in interface org.apache.tika.metadata.IPTC
Owner or owners of the copyright in the licensed image.
COPYRIGHT_OWNER_ID - Static variable in interface org.apache.tika.metadata.IPTC
The ID of the owner or owners of the copyright in the licensed image.
COPYRIGHT_OWNER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
Deprecated.
COPYRIGHT_OWNER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
The name of the owner or owners of the copyright in the licensed image.
CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
This class offers an implementation of NERecogniser based on CRF classifiers from Stanford CoreNLP.
CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Creates a NERecogniser by loading the model from the given path.
CorruptedFileException - Exception in org.apache.tika.exception
This exception should be thrown when the parse absolutely, positively has to stop.
CorruptedFileException(String) - Constructor for exception org.apache.tika.exception.CorruptedFileException
 
CorruptedFileException(String, Throwable) - Constructor for exception org.apache.tika.exception.CorruptedFileException
 
count() - Method in class org.apache.tika.detect.TextStatistics
Returns the total number of bytes seen so far.
count(int) - Method in class org.apache.tika.detect.TextStatistics
Returns the number of occurrences of the given byte.
count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
 
count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
 
count - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
count - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
 
countControl() - Method in class org.apache.tika.detect.TextStatistics
Counts control characters (i.e.
countEightBit() - Method in class org.apache.tika.detect.TextStatistics
Counts eight bit characters, i.e.
COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
Full name of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
COUNTRY - Static variable in interface org.apache.tika.metadata.Photoshop
 
COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
Code of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
countSafeAscii() - Method in class org.apache.tika.detect.TextStatistics
Counts "safe" (i.e.
countTokenOverlaps(String, Map<String, MutableInt>) - Method in class org.apache.tika.eval.core.tokens.CommonTokenCountManager
Deprecated.
COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
The extent or scope of the content of the resource.
COVERAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
CPIO - Static variable in class org.apache.tika.detect.zip.PackageConstants
 
cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
 
cProperties - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
 
create(TokenStream) - Method in class org.apache.tika.eval.core.tokens.AlphaIdeographFilterFactory
 
create(TokenStream) - Method in class org.apache.tika.eval.core.tokens.CJKBigramAwareLengthFilterFactory
 
create(TokenStream) - Method in class org.apache.tika.eval.core.tokens.URLEmailNormalizingFilterFactory
 
create(String, InputStream, String) - Static method in class org.apache.tika.langdetect.tika.LanguageProfilerBuilder
Creates a new language profile from a text file (preferably quite large, 5-10k lines).
create() - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates an empty instance; same as calling new MimeTypes().
create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified document.
create(InputStream...) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified input stream.
create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
 
create(URL...) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the resource at the location specified by the URL.
create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
 
create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified file path, as interpreted by the class loader in getResource().
create(String, String) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance.
create(String, String, ClassLoader) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance.
create() - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
 
create(ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
 
create(String, ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
 
create(URL...) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
 
CREATE_DATE - Static variable in interface org.apache.tika.metadata.XMP
The date and time the resource was created.
createArrayProperty(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
 
createArrayProperty(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates an array property from a list of values.
createCellMainifestDataElement(ExGuid, Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the cell manifest data element.
createChunkingInstance(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
This method is used to create the instance of AbstractChunking.
createChunkingInstance(IntermediateNodeObject) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
This method is used to create the instance of AbstractChunking.
createChunkingInstance(byte[], ChunkingMethod) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.ChunkingFactory
This method is used to create the instance of AbstractChunking.
createCommaSeparatedArray(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
 
createCommaSeparatedArray(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates an array property from a comma separated list.
CREATED - Static variable in interface org.apache.tika.metadata.DublinCore
Date of creation of the resource.
CREATED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
createDecryptStream(InputStream, Key) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
 
createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the next ID3v2 Frame in the file, or null if the next batch of data doesn't correspond to an ID3v2 header.
createInstance(ObjectGroupDataElementData) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.HeaderCell
Create the instance of Header Cell.
createInstance(ExGuid, ObjectGroupDataElementData, boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
createLangAltProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
 
createLangAltProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates a language alternative property in the x-default language
createObjectGroupDataElement(byte[], AtomicReference<ExGuid>, List<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the object group data/blob element list.
createOneNoteDocumentFromDirectFileResource(OneNoteDirectFileResource) - Method in class org.apache.tika.parser.microsoft.onenote.OneNoteParser
Create a OneNoteDocument object.
createPageDrawer(PageDrawerParameters) - Method in class org.apache.tika.parser.pdf.NoTextPDFRenderer
Returns a new PageDrawer instance, using the given parameters.
createParser() - Static method in class org.apache.tika.server.core.resource.TikaResource
 
createProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
 
createProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates a simple property.
createRevisionManifestDataElement(ExGuid, ExGuid, List<ExGuid>, Map<ExGuid, ExGuid>, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the revision manifest data element.
createStorageIndexDataElement(ExGuid, Map<CellID, ExGuid>, Map<ExGuid, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the storage index data element.
createStorageManifestDataElement(Map<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to create the storage manifest data element.
createTables(List<TableInfo>, JDBCUtil.CREATE_TABLE) - Method in class org.apache.tika.eval.app.db.JDBCUtil
 
createTempFile() - Method in class org.apache.tika.io.TemporaryResources
Creates a temporary file that will automatically be deleted when the TemporaryResources.close() method is called, returning its path.
createTemporaryFile() - Method in class org.apache.tika.io.TemporaryResources
Creates and returns a temporary file that will automatically be deleted when the TemporaryResources.close() method is called.
CREATION_DATE - Static variable in interface org.apache.tika.metadata.Office
When was the document created?
CreativeCommons - Interface in org.apache.tika.metadata
A collection of Creative Commons property names.
CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity primarily responsible for making the content of the resource.
CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
Contains the name of the person who created the content of this item, a photographer for photos, a graphic artist for graphics, or a writer for textual news, but in cases where the photographer should not be identified the name of a company or organisation may be appropriate.
CREATOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.XMP
The name of the first known tool used to create the resource.
CREATORS_CONTACT_INFO - Static variable in interface org.apache.tika.metadata.IPTC
The creator's contact information provides all necessary information to get in contact with the creator of this item and comprises a set of sub-properties for proper addressing.
CREATORS_JOB_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
Contains the job title of the person who created the content of this item.
CREDIT - Static variable in interface org.apache.tika.metadata.Photoshop
 
CREDIT_LINE - Static variable in interface org.apache.tika.metadata.IPTC
The credit to person(s) and/or organisation(s) required by the supplier of the item to be used when published.
CryptoParser - Class in org.apache.tika.parser
Decrypts the incoming document stream and delegates further parsing to another parser instance.
CryptoParser(String, Provider, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
 
CryptoParser(String, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
 
CSVMessageBodyWriter - Class in org.apache.tika.server.core.writer
 
CSVMessageBodyWriter() - Constructor for class org.apache.tika.server.core.writer.CSVMessageBodyWriter
 
CSVParams - Class in org.apache.tika.parser.csv
 
CSVPipesIterator - Class in org.apache.tika.pipes.pipesiterator.csv
Iterates through a UTF-8 CSV file.
CSVPipesIterator() - Constructor for class org.apache.tika.pipes.pipesiterator.csv.CSVPipesIterator
 
CSVResult - Class in org.apache.tika.parser.csv
 
CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
 
CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
 
CTAKESAnnotationProperty - Enum in org.apache.tika.parser.ctakes
This enumeration includes the properties that an IdentifiedAnnotation object can provide.
CTAKESConfig - Class in org.apache.tika.parser.ctakes
Configuration for CTAKESContentHandler.
CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
Default constructor.
CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
Loads properties from InputStream and then tries to close InputStream.
CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
Class used to extract biomedical information while parsing.
CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Default constructor.
CTAKESParser - Class in org.apache.tika.parser.ctakes
CTAKESParser decorates a Parser and leverages CTAKESContentHandler to extract biomedical information from clinical text using Apache cTAKES.
CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the default Parser
CTAKESParser(TikaConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the default Parser for this Config
CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the specified Parser
CTAKESSerializer - Enum in org.apache.tika.parser.ctakes
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
CTAKESUtils - Class in org.apache.tika.parser.ctakes
This class provides methods to extract biomedical information from plain text using CTAKESContentHandler that relies on Apache cTAKES.
CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
 
CUSTOM_MIMES_SYS_PROP - Static variable in class org.apache.tika.mime.MimeTypesFactory
System property to set a path to an additional external custom mimetypes XML file to be loaded.
customCompositeDetector() - Static method in class org.apache.tika.example.CustomMimeInfo
 
CustomMimeInfo - Class in org.apache.tika.example
 
CustomMimeInfo() - Constructor for class org.apache.tika.example.CustomMimeInfo
 
customMimeInfo() - Static method in class org.apache.tika.example.CustomMimeInfo
 

D

d - Variable in class org.apache.tika.parser.ocr.tess4j.ImageDeskew.HoughLine
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Gets or sets a binary item as specified in [MS-FSSHTTPB] section 2.2.1.3 that specifies a value that is unique to the file data represented by this root node object.
data - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
 
data - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
 
Database - Interface in org.apache.tika.metadata
 
databaseExists(Path) - Static method in class org.apache.tika.eval.app.db.H2Util
 
DataElement - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
DataElement(DataElementType, DataElementData) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Initializes a new instance of the DataElement class.
DataElement() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Initializes a new instance of the DataElement class.
DataElementData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Base class of data element
DataElementData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
 
dataElementExGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
DataElementHash - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Specifies a data element hash stream object
DataElementHash() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
Initializes a new instance of the DataElementHash class.
dataElementHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
 
dataElementHashData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
 
dataElementHashScheme - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
 
dataElementPackage - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
 
DataElementPackage - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
DataElementPackage() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
Initializes a new instance of the DataElementPackage class.
DataElementParseErrorException - Exception in org.apache.tika.parser.microsoft.onenote.fsshttpb.exception
 
DataElementParseErrorException(int, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
 
DataElementParseErrorException(int, String, Exception) - Constructor for exception org.apache.tika.parser.microsoft.onenote.fsshttpb.exception.DataElementParseErrorException
 
dataElements - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
 
DataElementType - Enum in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
The enumeration of the data element type
dataElementType - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
 
DataElementUtils - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.util
 
DataElementUtils() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
 
dataHash - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
 
DataHashObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
 
DataHashObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Initializes a new instance of the DataHashObject class.
DataNodeObjectData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
Data Node Object data
DataNodeObjectData(byte[], int, int) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
Initializes a new instance of the DataNodeObjectData class.
dataNodeObjectData - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
 
dataRoot - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
 
dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
 
dataSize - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
 
DataSizeObject - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj
Data Size Object
DataSizeObject() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
Initializes a new instance of the DataSizeObject class.
DataURIScheme - Class in org.apache.tika.parser.html
 
DataURISchemeParseException - Exception in org.apache.tika.parser.html
 
DataURISchemeParseException(String) - Constructor for exception org.apache.tika.parser.html.DataURISchemeParseException
 
DataURISchemeUtil - Class in org.apache.tika.parser.html
Not thread safe.
DataURISchemeUtil() - Constructor for class org.apache.tika.parser.html.DataURISchemeUtil
 
DATE - Static variable in interface org.apache.tika.metadata.DublinCore
A date associated with an event in the life cycle of the resource.
DATE - Static variable in interface org.apache.tika.parser.ner.NERecogniser
 
DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
Designates the date and optionally the time the intellectual content was created rather than the date of the creation of the physical representation.
DATE_CREATED - Static variable in interface org.apache.tika.metadata.Photoshop
 
DATE_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
DateNormalizingMetadataFilter - Class in org.apache.tika.metadata.filter
Some dates in some file formats do not have a timezone.
DateNormalizingMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
 
DateUtils - Class in org.apache.tika.utils
Date-related utility methods and constants
DateUtils() - Constructor for class org.apache.tika.utils.DateUtils
 
DBBuffer - Class in org.apache.tika.eval.app.db
 
DBBuffer(Connection, String, String, String) - Constructor for class org.apache.tika.eval.app.db.DBBuffer
 
DBConsumersManager - Class in org.apache.tika.eval.app.batch
 
DBConsumersManager(JDBCUtil, MimeBuffer, List<FileResourceConsumer>) - Constructor for class org.apache.tika.eval.app.batch.DBConsumersManager
 
DBFParser - Class in org.apache.tika.parser.dbf
This is a Tika wrapper around the DBFReader.
DBFParser() - Constructor for class org.apache.tika.parser.dbf.DBFParser
 
DBWriter - Class in org.apache.tika.eval.app.io
This is still in its early stages.
DBWriter(Connection, List<TableInfo>, JDBCUtil, MimeBuffer) - Constructor for class org.apache.tika.eval.app.io.DBWriter
 
DcXMLParser - Class in org.apache.tika.parser.xml
Dublin Core metadata parser
DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
 
decode(String) - Static method in class org.apache.tika.mime.HexCoDec
Decode a hex string
decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars
decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars.
decompressConcatenated(Metadata) - Method in interface org.apache.tika.parser.pkg.CompressorParserOptions
 
decorate(ContentHandler, Metadata) - Method in interface org.apache.tika.sax.ContentHandlerDecoratorFactory
 
DEF_MODEL - Static variable in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
 
DEFAULT - Static variable in interface org.apache.tika.config.InitializableProblemHandler
 
DEFAULT - Static variable in class org.apache.tika.config.ParamField
 
DEFAULT - Static variable in class org.apache.tika.parser.AutoDetectParserConfig
 
DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
DEFAULT_EMBEDDED_FILE_FIELD_NAME - Static variable in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
 
DEFAULT_EMBEDDED_FILE_FIELD_NAME - Static variable in class org.apache.tika.pipes.emitter.solr.SolrEmitter
 
DEFAULT_FORKED_STARTUP_MILLIS - Static variable in class org.apache.tika.server.core.TikaServerConfig
Number of milliseconds to wait for the forked process to start up
DEFAULT_HANDLER_CONFIG - Static variable in class org.apache.tika.pipes.HandlerConfig
 
DEFAULT_HANDLER_TYPE - Static variable in class org.apache.tika.server.core.resource.RecursiveMetadataResource
 
DEFAULT_HOST - Static variable in class org.apache.tika.server.core.TikaServerConfig
 
DEFAULT_ID - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
 
DEFAULT_MAX_CHARS_FOR_DETECTION - Static variable in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
 
DEFAULT_MAX_CHARS_FOR_SHORT_DETECTION - Static variable in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
 
DEFAULT_MAX_ENTITY_EXPANSIONS - Static variable in class org.apache.tika.utils.XMLReaderUtils
 
DEFAULT_MAX_FIELD_SIZE - Static variable in class org.apache.tika.metadata.writefilter.StandardWriteFilterFactory
 
DEFAULT_MAX_FILES_PROCESSED_PER_PROCESS - Static variable in class org.apache.tika.pipes.PipesConfigBase
 
DEFAULT_MAX_FOR_EMIT_BATCH - Static variable in class org.apache.tika.pipes.PipesConfigBase
Default size to send back to the PipesClient for batch emitting.
DEFAULT_MAX_KEY_SIZE - Static variable in class org.apache.tika.metadata.writefilter.StandardWriteFilterFactory
 
DEFAULT_MAX_QUEUE_SIZE - Static variable in class org.apache.tika.batch.builders.BatchProcessBuilder
 
DEFAULT_MAX_VALUES_PER_FIELD - Static variable in class org.apache.tika.metadata.writefilter.StandardWriteFilterFactory
 
DEFAULT_MAX_WAIT_MS - Static variable in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
DEFAULT_MINIMUM_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.core.TikaServerConfig
Clients may not set a timeout less than this amount.
DEFAULT_MODEL_PATH - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Default model path
DEFAULT_MODELS - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
DEFAULT_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
 
DEFAULT_NGRAM_LENGTH - Static variable in class org.apache.tika.langdetect.tika.LanguageProfile
 
DEFAULT_NUM_CLIENTS - Static variable in class org.apache.tika.pipes.PipesConfigBase
 
DEFAULT_ON_PARSE_EXCEPTION - Static variable in class org.apache.tika.pipes.FetchEmitTuple
 
DEFAULT_POOL_SIZE - Static variable in class org.apache.tika.utils.XMLReaderUtils
Default size for the pool of SAX Parsers and the pool of DOM builders
DEFAULT_PORT - Static variable in class org.apache.tika.server.core.TikaServerConfig
 
DEFAULT_QUEUE_SIZE - Static variable in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
DEFAULT_SECRET - Static variable in class org.apache.tika.language.translate.impl.MicrosoftTranslator
 
DEFAULT_SHUTDOWN_CLIENT_AFTER_MILLS - Static variable in class org.apache.tika.pipes.PipesConfigBase
 
DEFAULT_STARTUP_TIMEOUT_MILLIS - Static variable in class org.apache.tika.pipes.PipesConfigBase
 
DEFAULT_TASK_PULSE_MILLIS - Static variable in class org.apache.tika.server.core.TikaServerConfig
How often to check to see that the task hasn't timed out
DEFAULT_TASK_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.core.TikaServerConfig
Number of milliseconds to wait per server task (parse, detect, unpack, translate, etc.) before timing out and shutting down the forked process.
DEFAULT_TIMEOUT_MILLIS - Static variable in class org.apache.tika.pipes.PipesConfigBase
 
DEFAULT_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.eval.TikaEvalResource
 
DEFAULT_TIMEOUT_MS - Static variable in class org.apache.tika.parser.external2.ExternalParser
 
DEFAULT_TOTAL_ESTIMATED_BYTES - Static variable in class org.apache.tika.metadata.writefilter.StandardWriteFilterFactory
 
DefaultContentHandlerFactoryBuilder - Class in org.apache.tika.batch.builders
Builds BasicContentHandler with type defined by attribute "basicHandlerType" with possible values: xml, html, text, body, ignore.
DefaultContentHandlerFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.DefaultContentHandlerFactoryBuilder
 
DefaultDetector - Class in org.apache.tika.detect
A composite detector based on all the Detector implementations available through the service provider mechanism.
DefaultDetector(MimeTypes, ServiceLoader, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.DefaultDetector
 
DefaultDetector(MimeTypes, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
 
DefaultDetector(MimeTypes, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
 
DefaultDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
 
DefaultDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultDetector
 
DefaultDetector() - Constructor for class org.apache.tika.detect.DefaultDetector
 
DefaultEmbeddedStreamTranslator - Class in org.apache.tika.extractor
Loads EmbeddedStreamTranslators via service loading.
DefaultEmbeddedStreamTranslator() - Constructor for class org.apache.tika.extractor.DefaultEmbeddedStreamTranslator
 
DefaultEncodingDetector - Class in org.apache.tika.detect
A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism.
DefaultEncodingDetector() - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
 
DefaultEncodingDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
 
DefaultEncodingDetector(ServiceLoader, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
 
DefaultHtmlMapper - Class in org.apache.tika.parser.html
The default HTML mapping rules in Tika.
DefaultHtmlMapper() - Constructor for class org.apache.tika.parser.html.DefaultHtmlMapper
 
DefaultInputStreamFactory - Class in org.apache.tika.server.core
Passthrough -- returns InputStream as is
DefaultInputStreamFactory() - Constructor for class org.apache.tika.server.core.DefaultInputStreamFactory
 
DefaultMetadataFilter - Class in org.apache.tika.metadata.filter
 
DefaultMetadataFilter(ServiceLoader) - Constructor for class org.apache.tika.metadata.filter.DefaultMetadataFilter
 
DefaultMetadataFilter(List<MetadataFilter>) - Constructor for class org.apache.tika.metadata.filter.DefaultMetadataFilter
 
DefaultMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.DefaultMetadataFilter
 
DefaultParser - Class in org.apache.tika.parser
A composite parser based on all the Parser implementations available through the service provider mechanism.
DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser(MediaTypeRegistry, ServiceLoader, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser(MediaTypeRegistry, ServiceLoader) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser(MediaTypeRegistry, ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser(ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultParser() - Constructor for class org.apache.tika.parser.DefaultParser
 
DefaultProbDetector - Class in org.apache.tika.detect
A version of DefaultDetector for probabilistic mime detectors, which use statistical techniques to blend the results of differing underlying detectors when attempting to detect the type of a given file.
DefaultProbDetector(ProbabilisticMimeDetectionSelector, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
 
DefaultProbDetector(ProbabilisticMimeDetectionSelector, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
 
DefaultProbDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
 
DefaultProbDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultProbDetector
 
DefaultProbDetector() - Constructor for class org.apache.tika.detect.DefaultProbDetector
 
DefaultTranslator - Class in org.apache.tika.language.translate
A translator which picks the first Translator implementation available through the service provider mechanism.
DefaultTranslator(ServiceLoader) - Constructor for class org.apache.tika.language.translate.DefaultTranslator
 
DefaultTranslator() - Constructor for class org.apache.tika.language.translate.DefaultTranslator
 
DefaultZipContainerDetector - Class in org.apache.tika.detect.zip
 
DefaultZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.DefaultZipContainerDetector
 
DefaultZipContainerDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.zip.DefaultZipContainerDetector
 
DefaultZipContainerDetector(List<ZipContainerDetector>) - Constructor for class org.apache.tika.detect.zip.DefaultZipContainerDetector
 
DEFLATE64 - Static variable in class org.apache.tika.detect.zip.CompressorConstants
 
DelegatingParser - Class in org.apache.tika.parser
Base class for parser implementations that want to delegate parts of the task of parsing an input document to another parser.
DelegatingParser() - Constructor for class org.apache.tika.parser.DelegatingParser
 
deleteNamespace(String) - Static method in class org.apache.tika.xmp.XMPMetadata
Deletes a namespace from the registry.
DELIMITER_PROPERTY - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
 
DeprecatedStreamingZipContainerDetector - Class in org.apache.tika.detect.zip
 
DeprecatedStreamingZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.DeprecatedStreamingZipContainerDetector
 
DeprecatedZipContainerDetector - Class in org.apache.tika.detect.zip
A detector that works on Zip documents and tries to figure out basic types -- epub, jar, ear, war, kmz and StarOffice
DeprecatedZipContainerDetector() - Constructor for class org.apache.tika.detect.zip.DeprecatedZipContainerDetector
 
DERIVED_FROM_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
Document id for the document that this document was derived from
DERIVED_FROM_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
Instance id for the document instance that this document was derived from
descend(String, String) - Method in class org.apache.tika.sax.xpath.ChildMatcher
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
Returns the XPath evaluation state that results from descending to a child element with the given name.
descend(String, String) - Method in class org.apache.tika.sax.xpath.NamedElementMatcher
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
 
describeMediaType() - Static method in class org.apache.tika.example.MediaTypeExample
 
DescribeMetadata - Class in org.apache.tika.example
Print the supported Tika Metadata models and their fields.
DescribeMetadata() - Constructor for class org.apache.tika.example.DescribeMetadata
 
DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
An account of the content of the resource.
DESCRIPTION - Static variable in interface org.apache.tika.metadata.IPTC
A textual description, including captions, of the item's content, particularly used where the object is not text.
DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
DESCRIPTION_WRITER - Static variable in interface org.apache.tika.metadata.IPTC
Identifier or the name of the person involved in writing, editing or correcting the description of the content.
deserialize(JsonParser, DeserializationContext) - Method in class org.apache.tika.metadata.serialization.JsonMetadataDeserializer
 
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestDataElementData
Used to return the length of this element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementData
De-serialize data element data from byte array.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDataElementData
Used to return the length of this element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestDataElementData
Used to return the length of this element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexDataElementData
Used to de-serialize the data element.
deserializeDataElementDataFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestDataElementData
Used to de-serialize the data element.
deserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
Used to return the length of this element.
deserializeFromByteArray(StreamObjectHeaderStart, byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Used to return the length of this element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.CellManifestCurrentRevision
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementHash
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElementPackage
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataSizeObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupData
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupDeclarations
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadata
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupMetadataDeclarations
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectBLOBDataDeclaration
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectData
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDataBLOBReference
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.ObjectGroupObjectDeclare
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifest
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestObjectGroupReferences
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionManifestRootDeclare
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.SignatureObject
Used to de-serialize the element.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexCellMapping
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexManifestMapping
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageIndexRevisionMapping
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestRootDeclare
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StorageManifestSchemaGUID
Used to de-serialize the items.
deserializeItemsFromByteArray(byte[], AtomicInteger, int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
De-serialize items from byte array.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.apple.BPListDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.apple.IWorkDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeEncodingDetector
 
detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.Detector
Detects the content type of the given input document.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.EmptyDetector
 
detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.EncodingDetector
Detects the character encoding of the given text document, returning null if the encoding cannot be detected.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.FileCommandDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.MagicDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
 
detect(Set<String>) - Static method in class org.apache.tika.detect.microsoft.POIFSContainerDetector
Deprecated.
Use POIFSContainerDetector.detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected as the second argument.
detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.detect.microsoft.POIFSContainerDetector
Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.microsoft.POIFSContainerDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NameDetector
Detects the content type of an input document based on the document name given in the input metadata.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
 
detect(Set<String>) - Static method in class org.apache.tika.detect.ole.MiscOLEDetector
Deprecated.
Use MiscOLEDetector.detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected as the second argument.
detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.detect.ole.MiscOLEDetector
Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.ole.MiscOLEDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.OverrideDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TextDetector
Looks at the beginning of the document input stream to determine whether the document is text or not.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TrainedModelDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TypeDetector
Detects the content type of an input document based on a type hint given in the input metadata.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.ZeroSizeFileDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.zip.DefaultZipContainerDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.zip.DeprecatedStreamingZipContainerDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.FrictionlessPackageDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.IPADetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.JarDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.KMZDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.OpenDocumentDetector
 
detect(ZipFile, TikaInputStream) - Method in class org.apache.tika.detect.zip.StarOfficeDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.zip.StreamingZipContainerDetector
 
detect(ZipFile, TikaInputStream) - Method in interface org.apache.tika.detect.zip.ZipContainerDetector
If detection is successful, the ZipDetector should set the zip file or OPCPackage in TikaInputStream.setOpenContainer(). Implementations should _not_ close the ZipFile.
detect(InputStream, Metadata) - Method in class org.apache.tika.example.EncryptedPrescriptionDetector
 
detect() - Method in class org.apache.tika.language.detect.LanguageDetector
 
detect(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.mime.MimeTypes
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
detect(InputStream, Metadata) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
 
detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
 
detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
 
detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return the charset that best matches the supplied input data.
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
 
detect(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.DetectorResource
 
detect(InputStream) - Method in class org.apache.tika.server.core.resource.LanguageResource
 
detect(String) - Method in class org.apache.tika.server.core.resource.LanguageResource
 
detect(InputStream, Metadata) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(InputStream, String) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(InputStream) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(byte[], String) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(byte[]) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(Path) - Method in class org.apache.tika.Tika
Detects the media type of the file at the given path.
detect(File) - Method in class org.apache.tika.Tika
Detects the media type of the given file.
detect(URL) - Method in class org.apache.tika.Tika
Detects the media type of the resource at the given URL.
detect(String) - Method in class org.apache.tika.Tika
Detects the media type of a document with the given file name.
DETECT_EXCEPTION - Static variable in class org.apache.tika.eval.app.FileProfiler
 
detectAll() - Method in class org.apache.tika.langdetect.lingo24.Lingo24LangDetector
 
detectAll() - Method in class org.apache.tika.langdetect.mitll.TextLangDetector
 
detectAll() - Method in class org.apache.tika.langdetect.opennlp.OpenNLPDetector
 
detectAll() - Method in class org.apache.tika.langdetect.optimaize.OptimaizeLangDetector
Detect languages based on previously submitted text (via addText calls).
detectAll() - Method in class org.apache.tika.langdetect.tika.TikaLanguageDetector
 
detectAll() - Method in class org.apache.tika.language.detect.LanguageDetector
Detect languages based on previously submitted text (via addText calls).
detectAll(String) - Method in class org.apache.tika.language.detect.LanguageDetector
Utility wrapper that detects the language of a given chunk of text.
detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return an array of all charsets that appear to be plausible matches with the input data.
detectFilename(MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
 
detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
 
detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
 
detectLanguage(String) - Method in class org.apache.tika.example.LanguageDetectorExample
 
detectLanguage(String) - Method in class org.apache.tika.language.translate.impl.AbstractTranslator
 
detectOfficeOpenXML(OPCPackage) - Static method in class org.apache.tika.detect.microsoft.ooxml.OPCPackageDetector
Detects the type of an OfficeOpenXML (OOXML) file from an opened Package.
detectOnKeys(Set<String>) - Static method in class org.apache.tika.detect.apple.BPListDetector
 
Detector - Interface in org.apache.tika.detect
Content type detector.
DetectorResource - Class in org.apache.tika.server.core.resource
 
DetectorResource(ServerStatus) - Constructor for class org.apache.tika.server.core.resource.DetectorResource
 
detectType(ZipArchiveEntry, ZipFile) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
detectType(ZipArchiveEntry, ZipArchiveInputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
detectType(InputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
 
detectType(POIFSFileSystem) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
detectType(DirectoryEntry) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
detectWithCustomConfig(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
 
detectWithCustomDetector(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
 
detectXMLOnKeys(Set<String>) - Static method in class org.apache.tika.detect.apple.BPListDetector
 
DGN8Parser - Class in org.apache.tika.parser.dgn
This is a VERY LIMITED parser.
DGN8Parser() - Constructor for class org.apache.tika.parser.dgn.DGN8Parser
 
DGN_8 - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
 
DICE - Static variable in class org.apache.tika.server.eval.TikaEvalResource
 
DICT_CLOSE - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The dictionary close token.
DICT_OPEN - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The dictionary open token.
DIFContentHandler - Class in org.apache.tika.parser.dif
 
DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.dif.DIFContentHandler
 
DIFContentHandler - Class in org.apache.tika.sax
 
DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.DIFContentHandler
 
DIFParser - Class in org.apache.tika.parser.dif
 
DIFParser() - Constructor for class org.apache.tika.parser.dif.DIFParser
 
digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.CompositeDigester
 
digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.InputStreamDigester
 
digest(InputStream, Metadata, ParseContext) - Method in interface org.apache.tika.parser.DigestingParser.Digester
Digests an InputStream and sets the appropriate value(s) in the metadata.
DigestingAutoDetectParserFactory - Class in org.apache.tika.batch
 
DigestingAutoDetectParserFactory() - Constructor for class org.apache.tika.batch.DigestingAutoDetectParserFactory
 
DigestingParser - Class in org.apache.tika.parser
 
DigestingParser(Parser, DigestingParser.Digester) - Constructor for class org.apache.tika.parser.DigestingParser
Creates a decorator for the given parser.
DigestingParser.Digester - Interface in org.apache.tika.parser
Interface for digester.
DigestingParser.Encoder - Interface in org.apache.tika.parser
Encodes byte array from a MessageDigest to String
DIGITAL_IMAGE_GUID - Static variable in interface org.apache.tika.metadata.IPTC
Globally unique identifier for the item.
DIGITAL_SOURCE_FILE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
Deprecated. 
DIGITAL_SOURCE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
The type of the source of this digital image
DirectoryListingEntry - Class in org.apache.tika.parser.microsoft.chm
The format of a directory listing entry is as follows: BYTE: length of name; BYTEs: name (UTF-8 encoded); ENCINT: content section; ENCINT: offset; ENCINT: length. The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).
DirectoryListingEntry() - Constructor for class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
 
DirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Constructor for class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
Constructs a DirectoryListingEntry.
DirListParser - Class in org.apache.tika.example
Parses the output of /bin/ls and counts the number of files and the number of executables using Tika.
DirListParser() - Constructor for class org.apache.tika.example.DirListParser
 
DISC_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
"The disc number for part of an album set."
DisplayMetInstance - Class in org.apache.tika.example
Grabs a PDF file from a URL and prints its Metadata
DisplayMetInstance() - Constructor for class org.apache.tika.example.DisplayMetInstance
 
dispose() - Method in class org.apache.tika.io.TemporaryResources
Calls the TemporaryResources.close() method and wraps the potential IOException into a TikaException for convenience when used within Tika.
dispose() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
Assign the internal read buffer to null.
distance(LanguageProfile) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
Calculates the geometric distance between this and the given other language profile.
DL4JInceptionV3Net - Class in org.apache.tika.dl.imagerec
DL4JInceptionV3Net is an implementation of ObjectRecogniser.
DL4JInceptionV3Net() - Constructor for class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
 
DL4JVGG16Net - Class in org.apache.tika.dl.imagerec
 
DL4JVGG16Net() - Constructor for class org.apache.tika.dl.imagerec.DL4JVGG16Net
 
DO_NOT_RESTART_EXIT_VALUE - Static variable in class org.apache.tika.server.core.TikaServerProcess
 
DOC - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
Microsoft Word
DOC_INFO_CREATED - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_CREATOR - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_KEY_WORDS - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_MODIFICATION_DATE - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_PRODUCER - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_SUBJECT - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_TITLE - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_INFO_TRAPPED - Static variable in interface org.apache.tika.metadata.PDF
 
DOC_SECURITY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
 
DOC_SECURITY_STRING - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
 
doClose() - Method in class org.apache.tika.eval.app.tools.SlowCompositeReaderWrapper
 
document(int, StoredFieldVisitor) - Method in class org.apache.tika.eval.app.tools.SlowCompositeReaderWrapper
 
DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
The common identifier for all versions and renditions of a resource.
DocumentSelector - Interface in org.apache.tika.extractor
Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
DocumentSelectorConfig - Class in org.apache.tika.server.core.config
 
DocumentSelectorConfig() - Constructor for class org.apache.tika.server.core.config.DocumentSelectorConfig
 
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.ArrayNumber
This method is used to deserialize the ArrayNumber from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
This method is used to deserialize the EightBytesOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
This method is used to deserialize the FourBytesOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in interface org.apache.tika.parser.microsoft.onenote.fsshttpb.property.IProperty
This method is used to deserialize the property from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.NoData
This method is used to deserialize the NoData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.OneByteOfData
This method is used to deserialize the OneByteOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtArrayOfPropertyValues
This method is used to deserialize the prtArrayOfPropertyValues from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.PrtFourBytesOfLengthFollowedByData
This method is used to deserialize the prtFourBytesOfLengthFollowedByData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.TwoBytesOfData
This method is used to deserialize the TwoBytesOfData from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.AlternativePackaging
This method is used to deserialize the Alternative Packaging object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BasicObject
Used to return the length of this element.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.BinaryItem
This method is used to deserialize the BinaryItem basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
This method is used to deserialize the CellID basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellIDArray
This method is used to deserialize the CellIDArray basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
This method is used to deserialize the Compact64bitInt basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CompactID
This method is used to deserialize the CompactID object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
This method is used to deserialize the ExGuid basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
This method is used to deserialize the ExGUIDArray basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.JCID
This method is used to deserialize the JCID object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyID
This method is used to deserialize the PropertyID object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.SerialNumber
This method is used to deserialize the SerialNumber basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.PropertySet
This method is used to deserialize the PropertySet from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectPropSet
This method is used to deserialize the ObjectSpaceObjectPropSet from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
This method is used to deserialize the ObjectSpaceObjectStreamHeader object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfContextIDs
This method is used to deserialize the ObjectSpaceObjectStreamOfContextIDs object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOIDs
This method is used to deserialize the ObjectSpaceObjectStreamOfOIDs object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamOfOSIDs
This method is used to deserialize the ObjectSpaceObjectStreamOfOSIDs object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd16bit
This method is used to deserialize the StreamObjectHeaderEnd16bit basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderEnd8bit
This method is used to deserialize the StreamObjectHeaderEnd8bit basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart16bit
This method is used to deserialize the StreamObjectHeaderStart16bit basic object from the specified byte array and start index.
doDeserializeFromByteArray(byte[], int) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectHeaderStart32bit
This method is used to deserialize the StreamObjectHeaderStart32bit basic object from the specified byte array and start index.
doubleByte - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
 
doubleToInt64Bits(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
doubleValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
doWriteBody(COSDocument) - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
This will write the body of the document.
doWriteHeader(COSDocument) - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
This will write the header to the PDF document.
doWriteObject(COSBase) - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
This will write a COS object.
doWriteTrailer(COSDocument) - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
This will write the trailer to the PDF document.
drawingHyperlinks - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
DRM_ENCRYPTED - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
TIKA-3666 MSOffice or other file encrypted with DRM in an OLE container
dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.app.db.H2Util
 
dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.app.db.JDBCUtil
 
DublinCore - Interface in org.apache.tika.metadata
A collection of Dublin Core metadata names.
DUMP - Static variable in class org.apache.tika.detect.zip.PackageConstants
 
DumpTikaConfigExample - Class in org.apache.tika.example
This class shows how to dump a TikaConfig object to a configuration file.
DumpTikaConfigExample() - Constructor for class org.apache.tika.example.DumpTikaConfigExample
 
DURATION - Static variable in interface org.apache.tika.metadata.XMPDM
"The duration of the media file." Value is in Seconds, unless xmpDM:scale is also set.
DurationFormatUtils - Class in org.apache.tika.util
Functionality and naming conventions (roughly) copied from org.apache.commons.lang3 so that we didn't have to add another dependency.
DurationFormatUtils() - Constructor for class org.apache.tika.util.DurationFormatUtils
 
DWG_CUSTOM_META_PREFIX - Static variable in class org.apache.tika.parser.dwg.DWGParser
 
DWGParser - Class in org.apache.tika.parser.dwg
DWG (CAD Drawing) parser.
DWGParser() - Constructor for class org.apache.tika.parser.dwg.DWGParser
 

E

EightBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent a property that contains 8 bytes of data in the PropertySet.rgData stream field.
EightBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.EightBytesOfData
 
ELAPSED_MILLIS - Static variable in class org.apache.tika.batch.FileResourceConsumer
 
element(String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
Emits an XHTML element with the given text content.
ElementMappingContentHandler - Class in org.apache.tika.sax
Content handler decorator that maps element QNames using a Map.
ElementMappingContentHandler(ContentHandler, Map<QName, ElementMappingContentHandler.TargetElement>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler
 
ElementMappingContentHandler.TargetElement - Class in org.apache.tika.sax
 
ElementMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of an XPath expression that targets an element.
ElementMatcher() - Constructor for class org.apache.tika.sax.xpath.ElementMatcher
 
ElementMetadataHandler - Class in org.apache.tika.parser.xml
SAX event handler that maps the contents of an XML element into a metadata field.
ElementMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for string metadata keys.
ElementMetadataHandler(String, String, Metadata, String, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for string metadata keys which allows change of behavior for duplicate and empty entry values.
ElementMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for Property metadata keys.
ElementMetadataHandler(String, String, Metadata, Property, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for Property metadata keys which allows change of behavior for duplicate and empty entry values.
EMAIL - Static variable in class org.apache.tika.eval.core.tokens.URLEmailNormalizingFilterFactory
 
EMB_APP_VERSION - Static variable in interface org.apache.tika.metadata.RTFMetadata
If an application and version are given as part of the embedded object, this is the literal string.
EMB_CLASS - Static variable in interface org.apache.tika.metadata.RTFMetadata
 
EMB_ITEM - Static variable in interface org.apache.tika.metadata.RTFMetadata
 
EMB_TOPIC - Static variable in interface org.apache.tika.metadata.RTFMetadata
 
embed(Metadata, InputStream, OutputStream, ParseContext) - Method in interface org.apache.tika.embedder.Embedder
Embeds related document metadata from the given metadata object into the given output stream.
embed(Metadata, InputStream, OutputStream, ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler.
EMBEDDED_DEPTH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
EMBEDDED_EXCEPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
EMBEDDED_FILE_DESCRIPTION - Static variable in interface org.apache.tika.metadata.PDF
 
EMBEDDED_FILE_PATH_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
 
EMBEDDED_FILE_PATH_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
EMBEDDED_FILE_PATH_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
EMBEDDED_PARSER - Static variable in class org.apache.tika.utils.ParserUtils
 
EMBEDDED_RELATIONSHIP_ID - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
EMBEDDED_RELATIONSHIPS - Static variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
 
EMBEDDED_RESOURCE_PATH - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
EMBEDDED_RESOURCE_TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
Embedded resource type property
EMBEDDED_RESOURCE_TYPE_KEY - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
EMBEDDED_STORAGE_CLASS_ID - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
EmbeddedContentHandler - Class in org.apache.tika.sax
Content handler decorator that prevents the EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events from reaching the decorated handler.
EmbeddedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EmbeddedContentHandler
Creates a decorator that prevents the given handler from receiving EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events.
EmbeddedDocumentExtractor - Interface in org.apache.tika.extractor
 
EmbeddedDocumentExtractorFactory - Interface in org.apache.tika.extractor
 
EmbeddedDocumentUtil - Class in org.apache.tika.extractor
Utility class to handle common issues with embedded documents.
EmbeddedDocumentUtil(ParseContext) - Constructor for class org.apache.tika.extractor.EmbeddedDocumentUtil
 
embeddedOLERef(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
embeddedOLERef(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
embeddedPicRef(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
embeddedPicRef(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
EmbeddedResourceHandler - Interface in org.apache.tika.extractor
Tika container extractor callback interface.
EmbeddedStreamTranslator - Interface in org.apache.tika.extractor
Interface for different filtering of embedded streams.
Embedder - Interface in org.apache.tika.embedder
Tika embedder interface
EMFParser - Class in org.apache.tika.parser.microsoft
Extracts files embedded in EMF and offers a very rough capability to extract text if there is text stored in the EMF.
EMFParser() - Constructor for class org.apache.tika.parser.microsoft.EMFParser
 
emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.AbstractEmitter
The default behavior is to call Emitter.emit(String, List) on each item.
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
emit(String, InputStream, Metadata) - Method in class org.apache.tika.pipes.emitter.azblob.AZBlobEmitter
 
emit(String, List<Metadata>) - Method in interface org.apache.tika.pipes.emitter.Emitter
 
emit(List<? extends EmitData>) - Method in interface org.apache.tika.pipes.emitter.Emitter
 
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.EmptyEmitter
 
emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.EmptyEmitter
 
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
 
emit(String, InputStream, Metadata) - Method in class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
 
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitter
Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
emit(String, InputStream, Metadata) - Method in class org.apache.tika.pipes.emitter.gcs.GCSEmitter
 
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
 
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.s3.S3Emitter
Requires the src-bucket/path/to/my/file.txt in the TikaCoreProperties.SOURCE_PATH.
emit(String, InputStream, Metadata) - Method in class org.apache.tika.pipes.emitter.s3.S3Emitter
 
emit(String, List<Metadata>) - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
 
emit(List<? extends EmitData>) - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
 
emit(String, InputStream, Metadata) - Method in interface org.apache.tika.pipes.emitter.StreamEmitter
 
EMIT_SUCCESS - Static variable in class org.apache.tika.pipes.PipesResult
 
EmitData - Class in org.apache.tika.pipes.emitter
 
EmitData(EmitKey, List<Metadata>) - Constructor for class org.apache.tika.pipes.emitter.EmitData
 
EMITKEY - Static variable in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
EmitKey - Class in org.apache.tika.pipes.emitter
 
EmitKey() - Constructor for class org.apache.tika.pipes.emitter.EmitKey
 
EmitKey(String, String) - Constructor for class org.apache.tika.pipes.emitter.EmitKey
 
EMITTER - Static variable in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
Emitter - Interface in org.apache.tika.pipes.emitter
 
EmitterManager - Class in org.apache.tika.pipes.emitter
Utility class that will apply the appropriate fetcher to the fetcherString based on the prefix.
EmitterManager(List<Emitter>) - Constructor for class org.apache.tika.pipes.emitter.EmitterManager
 
EMPTY - Static variable in class org.apache.tika.mime.MediaType
 
EMPTY - Static variable in class org.apache.tika.utils.StringUtils
The empty String "".
EMPTY_CONTENT_TAGS - Static variable in class org.apache.tika.eval.core.util.ContentTags
 
EMPTY_LIST - Static variable in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
Empty singleton to be used when there is no list manager.
EMPTY_MODEL - Static variable in class org.apache.tika.eval.core.tokens.LangModel
 
EMPTY_OUTPUT - Static variable in class org.apache.tika.pipes.PipesResult
 
EMPTY_STYLES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
Empty singleton to be used when there is no style info
EmptyDetector - Class in org.apache.tika.detect
Dummy detector that returns application/octet-stream for all documents.
EmptyDetector() - Constructor for class org.apache.tika.detect.EmptyDetector
 
EmptyEmitter - Class in org.apache.tika.pipes.emitter
 
EmptyEmitter() - Constructor for class org.apache.tika.pipes.emitter.EmptyEmitter
 
EmptyFetcher - Class in org.apache.tika.pipes.fetcher
 
EmptyFetcher() - Constructor for class org.apache.tika.pipes.fetcher.EmptyFetcher
 
emptyGuid() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.GuidUtil
 
EmptyParser - Class in org.apache.tika.parser
Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream.
EmptyParser() - Constructor for class org.apache.tika.parser.EmptyParser
 
EmptyTranslator - Class in org.apache.tika.language.translate
Dummy translator that always declines to give any text.
EmptyTranslator() - Constructor for class org.apache.tika.language.translate.EmptyTranslator
 
enableInputFilter(boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
Enable filtering of input text.
encode(byte[]) - Static method in class org.apache.tika.mime.HexCoDec
Hex-encodes an array of bytes.
encode(byte[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Hex-encodes an array of bytes.
encode(byte[]) - Method in interface org.apache.tika.parser.DigestingParser.Encoder
 
encoding - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
 
EncodingDetector - Interface in org.apache.tika.detect
Character encoding detector.
encodings - Static variable in class org.apache.tika.parser.mp3.ID3v2Frame
 
ENCRYPTED - Static variable in interface org.apache.tika.metadata.WordPerfect
Whether the document is encrypted.
EncryptedDocumentException - Exception in org.apache.tika.exception
 
EncryptedDocumentException() - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
 
EncryptedDocumentException(Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
 
EncryptedDocumentException(String) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
 
EncryptedDocumentException(String, Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
 
EncryptedPrescriptionDetector - Class in org.apache.tika.example
 
EncryptedPrescriptionDetector() - Constructor for class org.apache.tika.example.EncryptedPrescriptionDetector
 
EncryptedPrescriptionParser - Class in org.apache.tika.example
 
EncryptedPrescriptionParser() - Constructor for class org.apache.tika.example.EncryptedPrescriptionParser
 
encryptionObjects - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObjectGroup
 
endBookmark(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endBookmark(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endDescription() - Method in class org.apache.tika.sax.XMPContentHandler
 
endDocument() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
 
endDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
endDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
endDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
endDocument() - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
endDocument() - Method in class org.apache.tika.parser.tmx.TMXContentHandler
 
endDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
This is called after the full parse has completed.
endDocument() - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
 
endDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endDocument() - Method in class org.apache.tika.sax.DIFContentHandler
 
endDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
Ignored.
endDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
 
endDocument() - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
This method is called whenever the Parser is done parsing the file.
endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
 
endDocument() - Method in class org.apache.tika.sax.SafeContentHandler
 
endDocument() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
This method is called whenever the Parser is done parsing the file.
endDocument() - Method in class org.apache.tika.sax.TeeContentHandler
 
endDocument() - Method in class org.apache.tika.sax.TextContentHandler
 
endDocument() - Method in class org.apache.tika.sax.ToTextContentHandler
Flushes the character stream so that no characters are forgotten in internal buffers.
endDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the XHTML document by writing its footer and clearing the namespace mappings.
endDocument() - Method in class org.apache.tika.sax.XMPContentHandler
Ends the XMP document by writing its footer and clearing the namespace mappings.
EndDocumentShieldingContentHandler - Class in org.apache.tika.sax
A wrapper around a ContentHandler which will ignore normal SAX calls to EndDocumentShieldingContentHandler.endDocument(), and only fire them later.
EndDocumentShieldingContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EndDocumentShieldingContentHandler
Creates a decorator for the given SAX event handler.
endEditedSection() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endEditedSection() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endElement(String, String, String) - Method in class org.apache.tika.mime.MimeTypesReader
 
endElement(String, String, String) - Method in class org.apache.tika.parser.dif.DIFContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
endElement(String, String, String) - Method in class org.apache.tika.parser.mif.MIFContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.tmx.TMXContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.MetadataHandler
Deprecated.
 
endElement(String, String, String) - Method in class org.apache.tika.sax.boilerpipe.BoilerpipeContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endElement(String, String, String) - Method in class org.apache.tika.sax.DIFContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ElementMappingContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.LinkContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.SafeContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.SecureContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ToHTMLContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ToTextContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the given element.
endElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
This is called after parsing each embedded document.
endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
This is called after parsing an embedded document.
ENDIAN - Static variable in interface org.apache.tika.metadata.MachineMetadata
 
EndianUtils - Class in org.apache.tika.io
General endian-related utilities.
EndianUtils() - Constructor for class org.apache.tika.io.EndianUtils
 
EndianUtils.BufferUnderrunException - Exception in org.apache.tika.io
 
ENDLINE - Static variable in class org.apache.tika.sax.XHTMLContentHandler
The elements that get appended with the XHTMLContentHandler.NL character.
endnoteReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endnoteReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
ENDOBJ - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The end object token.
endParagraph() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endParagraph() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
Endpoint(Class<?>, Method, String, String, String[]) - Constructor for class org.apache.tika.server.core.resource.TikaWelcome.Endpoint
 
endPrefixMapping(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
 
endPrefixMapping(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
 
endPrefixMapping(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endPrefixMapping(String) - Method in class org.apache.tika.sax.TeeContentHandler
 
endRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
endSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
ENDSTREAM - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The close stream token.
endTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
endTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
endTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
ENGINEER - Static variable in interface org.apache.tika.metadata.XMPDM
"The engineer's name."
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.azblob.AZBlobPipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.csv.CSVPipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.filelist.FileListPipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.gcs.GCSPipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.jdbc.JDBCPipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.s3.S3PipesIterator
 
enqueue() - Method in class org.apache.tika.pipes.pipesiterator.solr.SolrPipesIterator
 
ensureFormattingState(XHTMLContentHandler, EnumSet<FormattingUtils.Tag>, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state.
ensureStreamReReadable(InputStream, TemporaryResources) - Static method in class org.apache.tika.utils.ParserUtils
Ensures that the stream can be re-read, buffering to a temporary file if required.
ENTITY_LOCAL_NAMES - Static variable in class org.apache.tika.parser.xml.XMLProfiler
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
 
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Some common entities identified by NLTK.
ENTITY_URIS - Static variable in class org.apache.tika.parser.xml.XMLProfiler
 
entityTypes - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
enumerateChm() - Method in class org.apache.tika.parser.microsoft.chm.ChmExtractor
Enumerates CHM entities.
ENVI_MIME_TYPE - Static variable in class org.apache.tika.parser.envi.EnviHeaderParser
 
EnviHeaderParser - Class in org.apache.tika.parser.envi
 
EnviHeaderParser() - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
 
EnviHeaderParser(EncodingDetector) - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
 
EOF - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
The EOF constant.
EpubContentParser - Class in org.apache.tika.parser.epub
Parser for EPUB OPS *.html files.
EpubContentParser() - Constructor for class org.apache.tika.parser.epub.EpubContentParser
 
EpubParser - Class in org.apache.tika.parser.epub
EPUB parser.
EpubParser() - Constructor for class org.apache.tika.parser.epub.EpubParser
 
equals(Object) - Method in class org.apache.tika.eval.app.db.ColInfo
 
equals(Object) - Method in class org.apache.tika.eval.core.tokens.TokenIntPair
 
equals(Object) - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
 
equals(String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
 
equals(Object) - Method in class org.apache.tika.metadata.Metadata
 
equals(Object) - Method in class org.apache.tika.metadata.Property
 
equals(Object) - Method in class org.apache.tika.mime.MediaType
 
equals(Object) - Method in class org.apache.tika.mime.MimeType
 
equals(Object) - Method in class org.apache.tika.parser.csv.CSVResult
 
equals(Object) - Method in class org.apache.tika.parser.html.DataURIScheme
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
Overrides the equals method.
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Overrides the equals method.
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataHashObject
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
equals(Object) - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
equals(Object) - Method in class org.apache.tika.parser.pdf.AccessChecker
 
equals(Object) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
equals(Object) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compares this CharsetMatch to another based on confidence value.
equals(Object) - Method in class org.apache.tika.pipes.emitter.EmitKey
 
equals(Object) - Method in class org.apache.tika.pipes.FetchEmitTuple
 
equals(Object) - Method in class org.apache.tika.pipes.fetcher.FetchKey
 
equals(Object) - Method in class org.apache.tika.pipes.HandlerConfig
 
equals(Object) - Method in class org.apache.tika.xmp.XMPMetadata
This method is not implemented yet.
EQUIPMENT_MAKE - Static variable in interface org.apache.tika.metadata.TIFF
"Manufacturer of the recording equipment."
EQUIPMENT_MODEL - Static variable in interface org.apache.tika.metadata.TIFF
"Model name or number of the recording equipment."
Error - Enum in org.apache.tika.parser.microsoft.onenote
 
ERROR_CODES_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
 
ErrorParser - Class in org.apache.tika.parser
Dummy parser that always throws a TikaException without even attempting to parse the given document stream.
ErrorParser() - Constructor for class org.apache.tika.parser.ErrorParser
 
escapeCommandLine(String) - Static method in class org.apache.tika.utils.ProcessUtils
Puts double quotes around an argument for cases where ProcessBuilder does not handle it correctly (for example, paths with spaces on Windows).
ESRI_LAYER - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
 
EvalConsumerBuilder - Class in org.apache.tika.eval.app.batch
 
EvalConsumerBuilder() - Constructor for class org.apache.tika.eval.app.batch.EvalConsumerBuilder
 
EvalConsumersBuilder - Class in org.apache.tika.eval.app.batch
 
EvalConsumersBuilder() - Constructor for class org.apache.tika.eval.app.batch.EvalConsumersBuilder
 
EvalExceptionUtils - Class in org.apache.tika.eval.core.util
 
EvalExceptionUtils() - Constructor for class org.apache.tika.eval.core.util.EvalExceptionUtils
 
EVENT - Static variable in interface org.apache.tika.metadata.IPTC
Names or describes the specific event the content relates to.
EvilCOSWriter - Class in org.apache.tika.fuzzing.pdf
 
EvilCOSWriter(OutputStream, PDFTransformerConfig) - Constructor for class org.apache.tika.fuzzing.pdf.EvilCOSWriter
COSWriter constructor.
ExcelExtractor - Class in org.apache.tika.parser.microsoft
Excel parser implementation which uses POI's Event API to handle the contents of a Workbook.
ExcelExtractor(ParseContext, Metadata) - Constructor for class org.apache.tika.parser.microsoft.ExcelExtractor
 
EXCEPTION_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
 
EXCEPTION_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
EXCEPTION_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
ExceptionUtils - Class in org.apache.tika.utils
 
ExceptionUtils() - Constructor for class org.apache.tika.utils.ExceptionUtils
 
ExcludeFieldMetadataFilter - Class in org.apache.tika.metadata.filter
 
ExcludeFieldMetadataFilter() - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
 
ExcludeFieldMetadataFilter(Set<String>) - Constructor for class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
 
ExecutableParser - Class in org.apache.tika.parser.executable
Parser for executable files.
ExecutableParser() - Constructor for class org.apache.tika.parser.executable.ExecutableParser
 
execute() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
 
execute(Connection, Path) - Method in class org.apache.tika.eval.app.reports.ResultsReporter
 
execute(ParseContext, Runnable) - Static method in class org.apache.tika.utils.ConcurrentUtils
Execute a runnable using an ExecutorService from the ParseContext if possible.
execute(ProcessBuilder, long, int, int) - Static method in class org.apache.tika.utils.ProcessUtils
This writes stdout and stderr to the FileProcessResult.
execute(ProcessBuilder, long, Path, int) - Static method in class org.apache.tika.utils.ProcessUtils
This redirects stdout to stdoutRedirect path.
exGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataNodeObjectData
 
ExGuid - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
ExGuid(int, UUID) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Initializes a new instance of the ExGuid class with specified value.
ExGuid(ExGuid) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Initializes a new instance of the ExGuid class; this is the copy constructor.
ExGuid() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Initializes a new instance of the ExGuid class; this is the default constructor.
exGuid - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
 
ExGUIDArray - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic
 
ExGUIDArray(List<ExGuid>) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Initializes a new instance of the ExGUIDArray class with specified value.
ExGUIDArray(ExGUIDArray) - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Initializes a new instance of the ExGUIDArray class; this is the copy constructor.
ExGUIDArray() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
Initializes a new instance of the ExGUIDArray class; this is the default constructor.
EXIF_PAGE_COUNT - Static variable in interface org.apache.tika.metadata.TIFF
 
EXIT_VALUE - Static variable in interface org.apache.tika.metadata.ExternalProcess
Exit value of the subprocess.
ExpandedTitleContentHandler - Class in org.apache.tika.sax
Content handler decorator which wraps a TransformerHandler so that the TITLE tag renders as <title></title> rather than <title/>. This is accomplished by calling the ContentHandler.characters(char[], int, int) method with a length of 1 but a zero-length char array.
ExpandedTitleContentHandler() - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
 
ExpandedTitleContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
 
EXPERIMENT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
 
EXPOSURE_TIME - Static variable in interface org.apache.tika.metadata.TIFF
"Exposure time in seconds."
ExtendedGUID - Class in org.apache.tika.parser.microsoft.onenote
 
ExtendedGUID() - Constructor for class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
ExtendedGUID(GUID, long) - Constructor for class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
ExtendedGUID10BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specifies the extended GUID 10-bit int type value.
ExtendedGUID17BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specifies the extended GUID 17-bit int type value.
ExtendedGUID32BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specifies the extended GUID 32-bit int type value.
ExtendedGUID5BitUintType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specifies the extended GUID 5-bit int type value.
ExtendedGUIDNullType - Static variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGuid
Specifies the extended GUID null type value.
extendedStreamsPresent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.space.ObjectSpaceObjectStreamHeader
 
extendGUID1 - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
 
extendGUID2 - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.CellID
 
extension_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
 
EXTENSION_TAG_EXIF - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTENSION_TAG_ICC_PROFILE - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTENSION_TAG_THUMBNAIL - Static variable in class org.apache.tika.parser.image.BPGParser
 
EXTENSION_TAG_XMP - Static variable in class org.apache.tika.parser.image.BPGParser
 
extension_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
 
EXTERNAL_PARSERS_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
 
externalBoolean(String) - Static method in class org.apache.tika.metadata.Property
 
externalBooleanSeq(String) - Static method in class org.apache.tika.metadata.Property
 
externalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
 
externalDate(String) - Static method in class org.apache.tika.metadata.Property
 
ExternalEmbedder - Class in org.apache.tika.embedder
Embedder that uses an external program (like sed or exiftool) to embed text content and metadata into a given document.
ExternalEmbedder() - Constructor for class org.apache.tika.embedder.ExternalEmbedder
 
externalInteger(String) - Static method in class org.apache.tika.metadata.Property
 
externalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
 
ExternalParser - Class in org.apache.tika.parser.external
Parser that uses an external program (like catdoc or pdf2txt) to extract text content and metadata from a given document.
ExternalParser() - Constructor for class org.apache.tika.parser.external.ExternalParser
 
ExternalParser - Class in org.apache.tika.parser.external2
This is a next-generation external parser that uses some of the more recent additions to Tika.
ExternalParser() - Constructor for class org.apache.tika.parser.external2.ExternalParser
 
ExternalParser.LineConsumer - Interface in org.apache.tika.parser.external
Consumer contract
ExternalParsersConfigReader - Class in org.apache.tika.parser.external
Builds up ExternalParser instances based on XML file(s) which define what to run, for what, and how to process any output metadata.
ExternalParsersConfigReader() - Constructor for class org.apache.tika.parser.external.ExternalParsersConfigReader
 
ExternalParsersConfigReaderMetKeys - Interface in org.apache.tika.parser.external
Met Keys used by the ExternalParsersConfigReader.
ExternalParsersFactory - Class in org.apache.tika.parser.external
Creates instances of ExternalParser based on XML configuration files.
ExternalParsersFactory() - Constructor for class org.apache.tika.parser.external.ExternalParsersFactory
 
ExternalProcess - Interface in org.apache.tika.metadata
 
externalReal(String) - Static method in class org.apache.tika.metadata.Property
 
externalRealSeq(String) - Static method in class org.apache.tika.metadata.Property
 
externalText(String) - Static method in class org.apache.tika.metadata.Property
 
externalTextBag(String) - Static method in class org.apache.tika.metadata.Property
 
ExternalTranslator - Class in org.apache.tika.language.translate.impl
Abstract class used to interact with command line/external Translators.
ExternalTranslator() - Constructor for class org.apache.tika.language.translate.impl.ExternalTranslator
 
EXTRA_BITS - Static variable in class org.apache.tika.parser.microsoft.chm.ChmConstants
 
extract(InputStream, Path) - Method in class org.apache.tika.example.ExtractEmbeddedFiles
 
extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler) - Method in interface org.apache.tika.extractor.ContainerExtractor
Processes a container file, and extracts all the embedded resources from within it.
extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler) - Method in class org.apache.tika.extractor.ParserContainerExtractor
 
extract(String) - Method in class org.apache.tika.parser.html.DataURISchemeUtil
Extracts DataURISchemes from free text, as in JavaScript.
extract(InputStream, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
Extracts text from an HWP stream.
extract(Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
 
EXTRACT_CONTENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
Should content be extracted, generally.
EXTRACT_EXCEPTION_TABLE - Static variable in class org.apache.tika.eval.app.ExtractProfiler
 
EXTRACT_EXCEPTION_TABLE_A - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
EXTRACT_EXCEPTION_TABLE_B - Static variable in class org.apache.tika.eval.app.ExtractComparer
 
EXTRACT_FOR_ACCESSIBILITY - Static variable in interface org.apache.tika.metadata.AccessPermissions
Should content be extracted for the purposes of accessibility.
extractChmEntry(DirectoryListingEntry) - Method in class org.apache.tika.parser.microsoft.chm.ChmExtractor
Decompresses a CHM entry.
ExtractComparer - Class in org.apache.tika.eval.app
 
ExtractComparer(ArrayBlockingQueue<FileResource>, Path, Path, Path, ExtractReader, IDBWriter) - Constructor for class org.apache.tika.eval.app.ExtractComparer
 
ExtractComparerBuilder - Class in org.apache.tika.eval.app.batch
 
ExtractComparerBuilder() - Constructor for class org.apache.tika.eval.app.batch.ExtractComparerBuilder
 
extractDublinCore(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
Tries to extract Dublin Core schema from XMP.
extractDublinCoreSchema(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
Extracts Dublin Core.
extractEmbeddedDocumentsExample(Path) - Method in class org.apache.tika.example.ParsingExample
 
ExtractEmbeddedFiles - Class in org.apache.tika.example
 
ExtractEmbeddedFiles() - Constructor for class org.apache.tika.example.ExtractEmbeddedFiles
 
extractGenre(String) - Static method in class org.apache.tika.parser.mp3.ID3v22Handler
 
extractHeaderFooter(String, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
 
extractHeaderFooter(String, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
extractHyperLinks(PackagePart, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
extractLinks(String) - Static method in class org.apache.tika.utils.RegexUtils
Extracts URLs from plain text.
extractMacros(POIFSFileSystem, ContentHandler, EmbeddedDocumentExtractor) - Static method in class org.apache.tika.parser.microsoft.OfficeParser
Helper to extract macros from an NPOIFS/vbaProject.bin.
extractor - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
extractPhoneNumbers(String) - Static method in class org.apache.tika.sax.CleanPhoneText
 
ExtractProfiler - Class in org.apache.tika.eval.app
 
ExtractProfiler(ArrayBlockingQueue<FileResource>, Path, Path, ExtractReader, IDBWriter) - Constructor for class org.apache.tika.eval.app.ExtractProfiler
 
ExtractProfilerBuilder - Class in org.apache.tika.eval.app.batch
 
ExtractProfilerBuilder() - Constructor for class org.apache.tika.eval.app.batch.ExtractProfilerBuilder
 
ExtractReader - Class in org.apache.tika.eval.app.io
 
ExtractReader() - Constructor for class org.apache.tika.eval.app.io.ExtractReader
Reads the full extract, with no modification of the metadata list and no min or max extract length checking.
ExtractReader(ExtractReader.ALTER_METADATA_LIST) - Constructor for class org.apache.tika.eval.app.io.ExtractReader
 
ExtractReader(ExtractReader.ALTER_METADATA_LIST, long, long) - Constructor for class org.apache.tika.eval.app.io.ExtractReader
 
ExtractReader.ALTER_METADATA_LIST - Enum in org.apache.tika.eval.app.io
 
ExtractReaderException - Exception in org.apache.tika.eval.app.io
Exception thrown when trying to read an extract.
ExtractReaderException(ExtractReaderException.TYPE) - Constructor for exception org.apache.tika.eval.app.io.ExtractReaderException
 
ExtractReaderException.TYPE - Enum in org.apache.tika.eval.app.io
 
extractRootElement(byte[]) - Method in class org.apache.tika.detect.XmlRootExtractor
 
extractRootElement(InputStream) - Method in class org.apache.tika.detect.XmlRootExtractor
 
extractStandardReferences(String, double) - Static method in class org.apache.tika.sax.StandardsText
Extracts the standard references found within the given text.
extractXMPBasicSchema(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.XMPMetadataExtractor
Extracts basic schema metadata from XMP.
extractXMPMM(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.xmp.JempboxExtractor
Extracts Media Management metadata from XMP.

F

F_NUMBER - Static variable in interface org.apache.tika.metadata.TIFF
"F-Number." The f-number is the focal length divided by the "effective" aperture diameter.
FAIL - Static variable in class org.apache.tika.sax.xpath.Matcher
State of a failed XPath evaluation, where nothing is matched.
FallbackParser - Class in org.apache.tika.parser.multiple
Tries multiple parsers in turn, until one succeeds.
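The "try each in turn until one succeeds" idea can be sketched in plain Java. This is only an illustration of the pattern, not FallbackParser's implementation: the class FallbackSketch, the method firstSuccess, and the integer-parsing attempts are all hypothetical names chosen for the example, and a real parser chain would also have to handle stream resets and metadata policy.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical sketch of the fallback pattern; none of these names are Tika APIs.
class FallbackSketch {

    // Applies each attempt in order and returns the first result produced
    // without an exception. A null result raises a NullPointerException in
    // Optional.of and is therefore treated as a failed attempt as well.
    static <I, O> Optional<O> firstSuccess(I input, List<Function<I, O>> attempts) {
        for (Function<I, O> attempt : attempts) {
            try {
                return Optional.of(attempt.apply(input));
            } catch (RuntimeException e) {
                // This attempt failed; move on to the next one.
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Function<String, Integer>> attempts = new ArrayList<>();
        attempts.add(s -> Integer.parseInt(s));          // strict decimal
        attempts.add(s -> Integer.parseInt(s.trim()));   // tolerate whitespace
        attempts.add(s -> Integer.decode(s.trim()));     // accept 0x... hex
        System.out.println(firstSuccess(" 0x1F ", attempts)); // prints Optional[31]
    }
}
```

Ordering the attempts from strictest to most lenient keeps the cheap, precise path first and only falls back when it fails.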
FallbackParser(MediaTypeRegistry, Collection<? extends Parser>, Map<String, Param>) - Constructor for class org.apache.tika.parser.multiple.FallbackParser
 
FallbackParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Collection<? extends Parser>) - Constructor for class org.apache.tika.parser.multiple.FallbackParser
 
FallbackParser(MediaTypeRegistry, AbstractMultipleParser.MetadataPolicy, Parser...) - Constructor for class org.apache.tika.parser.multiple.FallbackParser
 
FALSE - Static variable in class org.apache.tika.eval.app.AbstractProfiler
 
FeedParser - Class in org.apache.tika.parser.feed
Feed parser.
FeedParser() - Constructor for class org.apache.tika.parser.feed.FeedParser
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.azblob.AZBlobFetcher
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.EmptyFetcher
 
fetch(String, Metadata) - Method in interface org.apache.tika.pipes.fetcher.Fetcher
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.gcs.GCSFetcher
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
 
fetch(String, long, long, Metadata) - Method in class org.apache.tika.pipes.fetcher.http.HttpFetcher
 
fetch(String, long, long, Metadata) - Method in interface org.apache.tika.pipes.fetcher.RangeFetcher
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.s3.S3Fetcher
 
fetch(String, long, long, Metadata) - Method in class org.apache.tika.pipes.fetcher.s3.S3Fetcher
 
fetch(String, Metadata) - Method in class org.apache.tika.pipes.fetcher.url.UrlFetcher
 
FETCH_RANGE_END - Static variable in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
FETCH_RANGE_START - Static variable in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
FetchEmitTuple - Class in org.apache.tika.pipes
 
FetchEmitTuple(String, FetchKey, EmitKey) - Constructor for class org.apache.tika.pipes.FetchEmitTuple
 
FetchEmitTuple(String, FetchKey, EmitKey, FetchEmitTuple.ON_PARSE_EXCEPTION) - Constructor for class org.apache.tika.pipes.FetchEmitTuple
 
FetchEmitTuple(String, FetchKey, EmitKey, Metadata) - Constructor for class org.apache.tika.pipes.FetchEmitTuple
 
FetchEmitTuple(String, FetchKey, EmitKey, Metadata, HandlerConfig, FetchEmitTuple.ON_PARSE_EXCEPTION) - Constructor for class org.apache.tika.pipes.FetchEmitTuple
 
FetchEmitTuple.ON_PARSE_EXCEPTION - Enum in org.apache.tika.pipes
 
FETCHER - Static variable in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
Fetcher - Interface in org.apache.tika.pipes.fetcher
Interface for an object that will fetch an InputStream given a fetch string.
FetcherManager - Class in org.apache.tika.pipes.fetcher
Utility class to hold multiple fetchers.
FetcherManager(List<Fetcher>) - Constructor for class org.apache.tika.pipes.fetcher.FetcherManager
 
FetcherStreamFactory - Class in org.apache.tika.server.core
This class looks for "fileUrl" in the http header.
FetcherStreamFactory(FetcherManager) - Constructor for class org.apache.tika.server.core.FetcherStreamFactory
 
FetcherStringException - Exception in org.apache.tika.pipes.fetcher
Thrown if something goes wrong while parsing the fetcher string.
FetcherStringException(String) - Constructor for exception org.apache.tika.pipes.fetcher.FetcherStringException
 
FETCHKEY - Static variable in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
FetchKey - Class in org.apache.tika.pipes.fetcher
Pair of fetcherName (which fetcher to call) and the key to send to that fetcher to retrieve a specific file.
FetchKey() - Constructor for class org.apache.tika.pipes.fetcher.FetchKey
 
FetchKey(String, String) - Constructor for class org.apache.tika.pipes.fetcher.FetchKey
 
FetchKey(String, String, long, long) - Constructor for class org.apache.tika.pipes.fetcher.FetchKey
 
FictionBookParser - Class in org.apache.tika.parser.xml
 
FictionBookParser() - Constructor for class org.apache.tika.parser.xml.FictionBookParser
 
Field - Annotation Type in org.apache.tika.config
Field annotation is a contract for binding Param value from Tika Configuration to an object.
FieldNameMappingFilter - Class in org.apache.tika.metadata.filter
 
FieldNameMappingFilter() - Constructor for class org.apache.tika.metadata.filter.FieldNameMappingFilter
 
FILE_DATA_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
The file data rate in megabytes per second.
FILE_EXTENSION - Static variable in interface org.apache.tika.batch.FileResource
 
FILE_ID - Static variable in interface org.apache.tika.metadata.WordPerfect
File identifier.
FILE_MIME_TABLE - Static variable in class org.apache.tika.eval.app.FileProfiler
 
FILE_PROFILES - Static variable in class org.apache.tika.eval.app.FileProfiler
 
FILE_SIZE - Static variable in interface org.apache.tika.metadata.WordPerfect
File size as defined in document header.
FILE_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
File type.
FileCommandDetector - Class in org.apache.tika.detect
This runs the Linux 'file' command against a file.
FileCommandDetector() - Constructor for class org.apache.tika.detect.FileCommandDetector
 
fileContent - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.chunking.AbstractChunking
 
fileDataObject - Variable in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.RevisionStoreObject
 
FileListPipesIterator - Class in org.apache.tika.pipes.pipesiterator.filelist
Reads a list of file names/relative paths from a UTF-8 file.
FileListPipesIterator() - Constructor for class org.apache.tika.pipes.pipesiterator.filelist.FileListPipesIterator
 
FilenameUtils - Class in org.apache.tika.io
 
FilenameUtils() - Constructor for class org.apache.tika.io.FilenameUtils
 
FileProcessResult - Class in org.apache.tika.utils
 
FileProcessResult() - Constructor for class org.apache.tika.utils.FileProcessResult
 
FileProfiler - Class in org.apache.tika.eval.app
This class profiles actual files as opposed to extracts.
FileProfiler(ArrayBlockingQueue<FileResource>, Path, IDBWriter) - Constructor for class org.apache.tika.eval.app.FileProfiler
 
FileProfilerBuilder - Class in org.apache.tika.eval.app.batch
 
FileProfilerBuilder() - Constructor for class org.apache.tika.eval.app.batch.FileProfilerBuilder
 
FileResource - Interface in org.apache.tika.batch
This is a basic interface to handle a logical "file".
FileResourceConsumer - Class in org.apache.tika.batch
This is a base class for file consumers.
FileResourceConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.FileResourceConsumer
 
FileResourceCrawler - Class in org.apache.tika.batch
 
FileResourceCrawler(ArrayBlockingQueue<FileResource>, int) - Constructor for class org.apache.tika.batch.FileResourceCrawler
 
FileSystemEmitter - Class in org.apache.tika.pipes.emitter.fs
Emitter to write to a file system.
FileSystemEmitter() - Constructor for class org.apache.tika.pipes.emitter.fs.FileSystemEmitter
 
FileSystemFetcher - Class in org.apache.tika.pipes.fetcher.fs
 
FileSystemFetcher() - Constructor for class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
 
FileSystemPipesIterator - Class in org.apache.tika.pipes.pipesiterator.fs
 
FileSystemPipesIterator() - Constructor for class org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator
 
FileSystemPipesIterator(Path) - Constructor for class org.apache.tika.pipes.pipesiterator.fs.FileSystemPipesIterator
 
FILL_IN_FORM - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user fill in a form?
fillMetadata(Parser, Metadata, MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.core.resource.TikaResource
 
fillParseContext(MultivaluedMap<String, String>, Metadata, ParseContext) - Static method in class org.apache.tika.server.core.resource.TikaResource
 
filter(Metadata) - Method in class org.apache.tika.eval.core.metadata.TikaEvalMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.langdetect.opennlp.metadatafilter.OpenNLPMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.langdetect.optimaize.metadatafilter.OptimaizeMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.ClearByMimeMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.CompositeMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.DateNormalizingMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.ExcludeFieldMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.FieldNameMappingFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.IncludeFieldMetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.MetadataFilter
 
filter(Metadata) - Method in class org.apache.tika.metadata.filter.NoOpFilter
 
filter(ContainerRequestContext) - Method in class org.apache.tika.server.core.TikaLoggingFilter
 
filterExisting(Map<String, String[]>) - Method in interface org.apache.tika.metadata.writefilter.MetadataWriteFilter
 
filterExisting(Map<String, String[]>) - Method in class org.apache.tika.metadata.writefilter.StandardWriteFilter
 
findDuplicateParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support.
findInFile(String, Path) - Method in class org.apache.tika.example.InterruptableParsingExample
 
findMatches(String, Pattern) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
Finds matching subgroups in the text.
findNames(String[]) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
Finds names in the given array of tokens.
findServiceResources(String) - Method in class org.apache.tika.config.ServiceLoader
Returns all the available service resources matching the given pattern, such as all instances of tika-mimetypes.xml on the classpath, or all org.apache.tika.parser.Parser service files.
findStorageIndexCellMapping(CellID) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
This method is used to find the Storage Index Cell Mapping that matches the Cell ID.
findStorageIndexRevisionMapping(ExGuid) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.MSOneStorePackage
This method is used to find the Storage Index Revision Mapping that matches the Revision Mapping Extended GUID.
finish() - Method in interface org.apache.tika.eval.core.textstats.BytesRefCalculator.BytesRefCalcInstance
 
finished() - Method in class org.apache.tika.pipes.async.AsyncProcessor
 
FINISHED_STRING - Static variable in class org.apache.tika.batch.fs.FSBatchProcessCLI
 
flag - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
 
FLASH_FIRED - Static variable in interface org.apache.tika.metadata.TIFF
Did the Flash fire when taking this image?
FlatOpenDocumentParser - Class in org.apache.tika.parser.odf
 
FlatOpenDocumentParser() - Constructor for class org.apache.tika.parser.odf.FlatOpenDocumentParser
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UByte
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UInteger
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.ULong
 
floatValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.unsigned.UShort
 
flush() - Method in class org.apache.tika.langdetect.tika.ProfilingWriter
Ignored.
flush() - Method in class org.apache.tika.language.detect.LanguageWriter
Ignored.
flushAndClose(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
 
FLVParser - Class in org.apache.tika.parser.video
Parser for metadata contained in Flash Videos (.flv).
FLVParser() - Constructor for class org.apache.tika.parser.video.FLVParser
 
FOCAL_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
"Focal length of the lens, in millimeters."
Font - Interface in org.apache.tika.metadata
 
FONT_NAME - Static variable in interface org.apache.tika.metadata.Font
Basic name of a font used in a file
footers - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
 
footnoteReference(String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
 
footnoteReference(String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
 
ForkParser - Class in org.apache.tika.fork
 
ForkParser(Path, ParserFactoryFactory) - Constructor for class org.apache.tika.fork.ForkParser
If you have a directory with, say, tika-app.jar and you want the forked process/server to build a parser and run it from that directory, so that you can keep all of those dependencies out of your client code, use this initializer.
ForkParser(Path, ParserFactoryFactory, ClassLoader) - Constructor for class org.apache.tika.fork.ForkParser
EXPERT
ForkParser(ClassLoader, Parser) - Constructor for class org.apache.tika.fork.ForkParser
 
ForkParser(ClassLoader) - Constructor for class org.apache.tika.fork.ForkParser
 
ForkParser() - Constructor for class org.apache.tika.fork.ForkParser
 
ForkProxy - Interface in org.apache.tika.fork
 
ForkResource - Interface in org.apache.tika.fork
 
FORMAT - Static variable in interface org.apache.tika.metadata.DublinCore
Typically, Format may include the media-type or dimensions of the resource.
FORMAT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
 
format(Object, StringBuffer, FieldPosition) - Method in class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
 
formatDate(Date) - Static method in class org.apache.tika.utils.DateUtils
Returns an ISO 8601 representation of the given date.
formatDate(Calendar) - Static method in class org.apache.tika.utils.DateUtils
Returns an ISO 8601 representation of the given date.
formatDateUnknownTimezone(Date) - Static method in class org.apache.tika.utils.DateUtils
Returns an ISO 8601 representation of the given date, which is in an unknown timezone.
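For readers unfamiliar with ISO 8601 output, the following sketch formats a java.util.Date with the standard java.time API. It is not the DateUtils implementation: the class name Iso8601Sketch, the method toIsoUtc, and the choice of UTC with second precision and a trailing "Z" are assumptions made for illustration.

```java
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Date;

// Hypothetical sketch; the exact profile (UTC, seconds, 'Z' suffix) is an assumption.
class Iso8601Sketch {

    // Formatter fixed to UTC so an Instant can be rendered as calendar fields.
    private static final DateTimeFormatter UTC_SECONDS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'").withZone(ZoneOffset.UTC);

    // Converts the legacy Date to an Instant and renders it in UTC.
    static String toIsoUtc(Date date) {
        return UTC_SECONDS.format(date.toInstant());
    }

    public static void main(String[] args) {
        System.out.println(toIsoUtc(new Date(0L))); // prints 1970-01-01T00:00:00Z
    }
}
```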
formatMillis(long) - Static method in class org.apache.tika.util.DurationFormatUtils
 
formatRawCellContents(double, int, String, boolean) - Method in class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
 
formatter - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
FormattingUtils - Class in org.apache.tika.parser.microsoft
 
FormattingUtils.Tag - Enum in org.apache.tika.parser.microsoft
 
forName(String) - Method in class org.apache.tika.mime.MimeTypes
Returns the registered media type with the given name (or alias).
forName(String) - Static method in class org.apache.tika.utils.CharsetUtils
Returns Charset impl, if one exists.
FourBytesOfData - Class in org.apache.tika.parser.microsoft.onenote.fsshttpb.property
This class is used to represent the property that contains 4 bytes of data in the PropertySet.rgData stream field.
FourBytesOfData() - Constructor for class org.apache.tika.parser.microsoft.onenote.fsshttpb.property.FourBytesOfData
 
freeBuffer(ByteBuffer) - Static method in class org.apache.tika.io.MappedBufferCleaner
If a cleaner is available, this buffer will be cleaned.
FrictionlessPackageDetector - Class in org.apache.tika.detect.zip
 
FrictionlessPackageDetector() - Constructor for class org.apache.tika.detect.zip.FrictionlessPackageDetector
 
fromCurlyBraceUTF16Bytes(byte[]) - Static method in class org.apache.tika.parser.microsoft.onenote.GUID
Converts a GUID of format: {AAAAAAAA-BBBB-CCCC-DDDD-EEEEEEEEEEEE} (in bytes) to a GUID object.
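As a rough illustration of decoding such a byte sequence, the sketch below turns curly-brace GUID text into a java.util.UUID. It does not use the OneNote GUID class; the name CurlyBraceGuidSketch, the method parse, and the assumption that the bytes are little-endian UTF-16 are all hypothetical choices for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

// Hypothetical sketch; assumes UTF-16LE input and uses java.util.UUID, not Tika's GUID.
class CurlyBraceGuidSketch {

    // Decodes the bytes as UTF-16LE text, checks the {..} wrapper, and parses the rest.
    static UUID parse(byte[] utf16le) {
        String text = new String(utf16le, StandardCharsets.UTF_16LE);
        // 36 characters of 8-4-4-4-12 hex text plus the two braces.
        if (text.length() != 38 || text.charAt(0) != '{' || text.charAt(37) != '}') {
            throw new IllegalArgumentException("Expected {8-4-4-4-12} GUID text, got: " + text);
        }
        return UUID.fromString(text.substring(1, 37));
    }

    public static void main(String[] args) {
        byte[] raw = "{01234567-89AB-CDEF-0123-456789ABCDEF}".getBytes(StandardCharsets.UTF_16LE);
        System.out.println(parse(raw)); // prints 01234567-89ab-cdef-0123-456789abcdef
    }
}
```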
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.DataElementType
 
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.PropertyType
 
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderEnd
 
fromIntVal(int) - Static method in enum org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObjectTypeHeaderStart
 
fromJson(Reader) - Static method in class org.apache.tika.metadata.serialization.JsonFetchEmitTuple
 
fromJson(Reader) - Static method in class org.apache.tika.metadata.serialization.JsonFetchEmitTupleList
 
fromJson(Reader) - Static method in class org.apache.tika.metadata.serialization.JsonMetadata
Read metadata from reader.
fromJson(Reader) - Static method in class org.apache.tika.metadata.serialization.JsonMetadataList
Read metadata from reader.
FS_REL_PATH - Static variable in class org.apache.tika.batch.fs.FSProperties
File's relative path (including file name) from a given source root
FSBatchProcessCLI - Class in org.apache.tika.batch.fs
 
FSBatchProcessCLI(String[]) - Constructor for class org.apache.tika.batch.fs.FSBatchProcessCLI
 
FSConsumersManager - Class in org.apache.tika.batch.fs
 
FSConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.fs.FSConsumersManager
 
FSCrawlerBuilder - Class in org.apache.tika.batch.fs.builders
Builds either an FSDirectoryCrawler or an FSListCrawler.
FSCrawlerBuilder() - Constructor for class org.apache.tika.batch.fs.builders.FSCrawlerBuilder
 
FSDirectoryCrawler - Class in org.apache.tika.batch.fs
 
FSDirectoryCrawler(ArrayBlockingQueue<FileResource>, int, Path, FSDirectoryCrawler.CRAWL_ORDER) - Constructor for class org.apache.tika.batch.fs.FSDirectoryCrawler
 
FSDirectoryCrawler(ArrayBlockingQueue<FileResource>, int, Path, Path, FSDirectoryCrawler.CRAWL_ORDER) - Constructor for class org.apache.tika.batch.fs.FSDirectoryCrawler
 
FSDirectoryCrawler.CRAWL_ORDER - Enum in org.apache.tika.batch.fs
 
FSDocumentSelector - Class in org.apache.tika.batch.fs
Selector that chooses files based on their file name and their size, as determined by TikaCoreProperties.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.
FSDocumentSelector(Pattern, Pattern, long, long) - Constructor for class org.apache.tika.batch.fs.FSDocumentSelector
 
FSFileResource - Class in org.apache.tika.batch.fs
FileSystem(FS)Resource wraps a file name.
FSFileResource(File, File) - Constructor for class org.apache.tika.batch.fs.FSFileResource
Deprecated.
to be removed in Tika 2.0
FSFileResource(Path, Path) - Constructor for class org.apache.tika.batch.fs.FSFileResource
Constructor
FSListCrawler - Class in org.apache.tika.batch.fs
Class that "crawls" a list of files.
FSListCrawler(ArrayBlockingQueue<FileResource>, int, File, File, String) - Constructor for class org.apache.tika.batch.fs.FSListCrawler
Deprecated. 
FSListCrawler(ArrayBlockingQueue<FileResource>, int, Path, Path, Charset) - Constructor for class org.apache.tika.batch.fs.FSListCrawler
Constructor for a crawler that reads a list of files to process.
FSOutputStreamFactory - Class in org.apache.tika.batch.fs
 
FSOutputStreamFactory(File, FSUtil.HANDLE_EXISTING, FSOutputStreamFactory.COMPRESSION, String) - Constructor for class org.apache.tika.batch.fs.FSOutputStreamFactory
Deprecated.
FSOutputStreamFactory(Path, FSUtil.HANDLE_EXISTING, FSOutputStreamFactory.COMPRESSION, String) - Constructor for class org.apache.tika.batch.fs.FSOutputStreamFactory
 
FSOutputStreamFactory.COMPRESSION - Enum in org.apache.tika.batch.fs
 
FSProperties - Class in org.apache.tika.batch.fs
 
FSProperties() - Constructor for class org.apache.tika.batch.fs.FSProperties
 
FSUtil - Class in org.apache.tika.batch.fs
Utility class to handle some common issues when reading from and writing to a file system (FS).
FSUtil() - Constructor for class org.apache.tika.batch.fs.FSUtil
 
FSUtil.HANDLE_EXISTING - Enum in org.apache.tika.batch.fs
 
FuzzingCLI - Class in org.apache.tika.fuzzing.cli
 
FuzzingCLI() - Constructor for class org.apache.tika.fuzzing.cli.FuzzingCLI
 
FuzzingCLIConfig - Class in org.apache.tika.fuzzing.cli
 
FuzzingCLIConfig() - Constructor for class org.apache.tika.fuzzing.cli.FuzzingCLIConfig
 
FuzzOne - Class in org.apache.tika.fuzzing.cli
Forked process that runs against a single input file
FuzzOne() - Constructor for class org.apache.tika.fuzzing.cli.FuzzOne
 

G

GARBAGE - Static variable in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
Garbage bytes used to create the PDF header.
GCSEmitter - Class in org.apache.tika.pipes.emitter.gcs
 
GCSEmitter() - Constructor for class org.apache.tika.pipes.emitter.gcs.GCSEmitter
 
GCSFetcher - Class in org.apache.tika.pipes.fetcher.gcs
Fetches files from Google Cloud Storage.
GCSFetcher() - Constructor for class org.apache.tika.pipes.fetcher.gcs.GCSFetcher
 
GCSPipesIterator - Class in org.apache.tika.pipes.pipesiterator.gcs
 
GCSPipesIterator() - Constructor for class org.apache.tika.pipes.pipesiterator.gcs.GCSPipesIterator
 
GDALParser - Class in org.apache.tika.parser.gdal
Wraps execution of the Geospatial Data Abstraction Library (GDAL) gdalinfo tool used to extract geospatial information out of hundreds of geo file formats.
GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
 
GENERAL_EMBEDDED - Static variable in class org.apache.tika.detect.microsoft.POIFSContainerDetector
General embedded document type within an OLE2 container
GeneralTransformer - Class in org.apache.tika.fuzzing.general
 
GeneralTransformer() - Constructor for class org.apache.tika.fuzzing.general.GeneralTransformer
 
GeneralTransformer(Transformer...) - Constructor for class org.apache.tika.fuzzing.general.GeneralTransformer
 
GeneralTransformer(int, Transformer...) - Constructor for class org.apache.tika.fuzzing.general.GeneralTransformer
 
generateFooter(StringBuffer) - Method in class org.apache.tika.server.core.HTMLHelper
 
generateHeader(StringBuffer, String) - Method in class org.apache.tika.server.core.HTMLHelper
Generates the HTML Header for the user facing page, adding in the given title as required
generateRSS(Path) - Method in class org.apache.tika.example.RecentFiles
 
GenericConverter - Class in org.apache.tika.xmp.convert
Tries to convert as many of the properties in the Metadata map as possible to XMP namespaces.
GenericConverter() - Constructor for class org.apache.tika.xmp.convert.GenericConverter
 
GENRE - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the genre."
GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
List of predefined genres.
GeoGazetteerClient - Class in org.apache.tika.parser.geo.gazetteer
 
GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.gazetteer.GeoGazetteerClient
Pass the URL on which lucene-geo-gazetteer is available.
GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.gazetteer.GeoGazetteerClient
 
Geographic - Interface in org.apache.tika.metadata
Geographic schema.
GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
 
GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
 
geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
 
GeoParser - Class in org.apache.tika.parser.geo
 
GeoParser() - Constructor for class org.apache.tika.parser.geo.GeoParser
 
GeoParserConfig - Class in org.apache.tika.parser.geo
 
GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.GeoParserConfig
 
GeoTag - Class in org.apache.tika.parser.geo
 
GeoTag() - Constructor for class org.apache.tika.parser.geo.GeoTag
 
get(Class<T>) - Method in class org.apache.tika.detect.zip.StreamingDetectContext
Returns the object in this context that implements the given interface.
get(Class<T>, T) - Method in class org.apache.tika.detect.zip.StreamingDetectContext
Returns the object in this context that implements the given interface, or the given default value if such an object is not found.
get(InputStream, TemporaryResources) - Static method in class org.apache.tika.io.TikaInputStream
Casts or wraps the given stream to a TikaInputStream instance.
get(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
Casts or wraps the given stream to a TikaInputStream instance.
get(byte[]) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the given array of bytes.
get(byte[], Metadata) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the given array of bytes.
get(Path) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the file at the given path.
get(Path, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the file at the given path.
get(Path, Metadata, TemporaryResources) - Static method in class org.apache.tika.io.TikaInputStream
 
get(File) - Static method in class org.apache.tika.io.TikaInputStream
Deprecated.
use TikaInputStream.get(Path). In Tika 2.0, this will be removed or modified to throw an IOException.
get(File, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
Deprecated.
use TikaInputStream.get(Path, Metadata). In Tika 2.0, this will be removed or modified to throw an IOException.
get(InputStreamFactory) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from a Factory which can create fresh InputStreams for the same resource multiple times.
get(InputStreamFactory, TemporaryResources) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from a Factory which can create fresh InputStreams for the same resource multiple times.
get(Blob) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the given database BLOB.
get(Blob, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the given database BLOB.
get(URI) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the resource at the given URI.
get(URI, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the resource at the given URI.
get(URL) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the resource at the given URL.
get(URL, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
Creates a TikaInputStream from the resource at the given URL.
get(String) - Method in class org.apache.tika.metadata.Metadata
Gets the value associated with a metadata name.
get(Property) - Method in class org.apache.tika.metadata.Metadata
Returns the value (if any) of the identified metadata property.
get(String) - Static method in class org.apache.tika.metadata.Property
Retrieve the property object that corresponds to the given key
get(Class<T>) - Method in class org.apache.tika.parser.ParseContext
Returns the object in this context that implements the given interface.
get(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
Returns the object in this context that implements the given interface, or the given default value if such an object is not found.
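The lookup-by-interface behaviour described for these context classes resembles a type-keyed map. The sketch below shows that general idea only; TypedContextSketch and its methods are hypothetical and are not the ParseContext or StreamingDetectContext implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a type-keyed context; not Tika's ParseContext.
class TypedContextSketch {

    // Each entry is stored under the Class object it was registered with.
    private final Map<Class<?>, Object> entries = new HashMap<>();

    // Registers a value under the given key type, replacing any earlier value.
    <T> void set(Class<T> key, T value) {
        entries.put(key, value);
    }

    // Returns the registered value for the key type, or null if none exists.
    <T> T get(Class<T> key) {
        return key.cast(entries.get(key));
    }

    // Returns the registered value, or the supplied default if none exists.
    <T> T get(Class<T> key, T defaultValue) {
        T value = get(key);
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args) {
        TypedContextSketch context = new TypedContextSketch();
        context.set(String.class, "configured");
        System.out.println(context.get(String.class));      // prints configured
        System.out.println(context.get(Integer.class, 42)); // prints 42
    }
}
```

Keying on the Class object lets callers retrieve a value without casting, since Class.cast performs the checked conversion.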
get() - Method in enum org.apache.tika.parser.strings.StringsEncoding
 
get(TikaConfig, List<String>) - Static method in class org.apache.tika.server.client.TikaClient
 
get(String) - Method in class org.apache.tika.xmp.XMPMetadata
Returns the value of a simple property or the first one of an array.
get(Property) - Method in class org.apache.tika.xmp.XMPMetadata
 
get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
AKA a Synchsafe integer.
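A synchsafe integer, as used in ID3v2 tags, stores 28 bits of value across four bytes, with the high bit of every byte left at zero. The sketch below decodes one under that assumption; SynchsafeSketch and decode are hypothetical names, and this is not presented as the ID3v2Frame implementation.

```java
// Hypothetical sketch of synchsafe decoding; assumes four bytes, most significant first.
class SynchsafeSketch {

    // Takes the low 7 bits of each of four bytes and packs them into a 28-bit value.
    static int decode(byte[] data, int offset) {
        return ((data[offset] & 0x7F) << 21)
                | ((data[offset + 1] & 0x7F) << 14)
                | ((data[offset + 2] & 0x7F) << 7)
                | (data[offset + 3] & 0x7F);
    }

    public static void main(String[] args) {
        byte[] size = {0x00, 0x00, 0x02, 0x01};
        System.out.println(decode(size, 0)); // prints 257, i.e. (2 << 7) | 1
    }
}
```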
getAccessChecker() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getAcronym() - Method in class org.apache.tika.mime.MimeType
Returns an acronym for this mime type.
getAdded() - Method in class org.apache.tika.batch.FileResourceCrawler
 
getAdded() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
 
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.AbstractConverter
Every Converter has to provide information about namespaces that are used additionally to the core set of XMP namespaces.
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.GenericConverter
 
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
 
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
 
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.OpenDocumentConverter
 
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.RTFConverter
 
getAdmin1Code() - Method in class org.apache.tika.parser.geo.gazetteer.Location
 
getAdmin2Code() - Method in class org.apache.tika.parser.geo.gazetteer.Location
 
getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the path to XML descriptor for AnalysisEngine.
getAgePredictorClient() - Method in class org.apache.tika.parser.recognition.AgeRecogniser
 
getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The Artist for the overall album / compilation of albums
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have album-wide artists, so this returns null.
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getAliases(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
Returns the set of known aliases of the given canonical media type.
getAlignedLenTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getAlignedTreeTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getAllComponentParsers() - Method in class org.apache.tika.parser.CompositeParser
Returns all parsers registered with the Composite Parser, including ones which may not currently be active.
getAllComponentParsers() - Method in class org.apache.tika.parser.DefaultParser
 
getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
Get the names of all charsets supported by CharsetDetector class.
getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.NameEntityExtractor
 
getAllowableFilters() - Method in class org.apache.tika.fuzzing.pdf.PDFTransformerConfig
Which filters are allowed
getAllowedHostsForRedirect() - Method in class org.apache.tika.client.HttpClientFactory
 
getAllParsers() - Method in class org.apache.tika.parser.multiple.AbstractMultipleParser
 
getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers for each supported set of tags.
getAlpha(int) - Method in class org.apache.tika.parser.ocr.tess4j.ImageDeskew
 
getAlphabeticTokens() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
 
getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns a new UIMA Analysis Engine (AE).
getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns the annotation value based on the given annotation type.
getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an array of the CTAKESAnnotationProperty values that will be included in cTAKES metadata.
getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns a string containing a comma-separated list of CTAKESAnnotationProperty names that will be included in cTAKES metadata.
getApiKey() - Method in class org.apache.tika.language.translate.impl.YandexTranslator
Get the API Key in use for client authentication
getApiUri(Metadata) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
 
getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
 
getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
 
getArray() - Method in class org.apache.tika.eval.core.textstats.TokenCountPriorityQueue
 
getArray() - Method in class org.apache.tika.eval.core.tokens.TokenCountPriorityQueue
 
getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The Artist for the track
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getAttributesMapping() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
 
getAttrValue(String, Attributes) - Static method in class org.apache.tika.utils.XMLReaderUtils
 
getAuthScheme() - Method in class org.apache.tika.client.HttpClientFactory
 
getAutoDetectParserConfig() - Method in class org.apache.tika.config.TikaConfig
 
getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getBasePath() - Method in class org.apache.tika.pipes.fetcher.fs.FileSystemFetcher
 
getBaseType() - Method in class org.apache.tika.mime.MediaType
Returns the base form of the MediaType, excluding any parameters, such as "text/plain" for "text/plain; charset=utf-8"
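The parameter stripping described for getBaseType() can be illustrated with a minimal stand-alone sketch. The class BaseTypeSketch and its baseType helper are hypothetical names invented for this illustration; they work on plain strings and are not Tika's implementation.

```java
// Hypothetical stand-alone helper; illustrates the idea only, not Tika's code.
final class BaseTypeSketch {

    // Drops everything from the first ';' onward, i.e. all parameters.
    static String baseType(String mediaType) {
        int semicolon = mediaType.indexOf(';');
        String base = semicolon < 0 ? mediaType : mediaType.substring(0, semicolon);
        return base.trim();
    }

    public static void main(String[] args) {
        System.out.println(baseType("text/plain; charset=utf-8")); // prints "text/plain"
        System.out.println(baseType("application/pdf"));           // prints "application/pdf"
    }
}
```

With Tika on the classpath, the corresponding call would be along the lines of MediaType.parse("text/plain; charset=utf-8").getBaseType(), which returns a MediaType rather than a String.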
getBestNameEntity() - Method in class org.apache.tika.parser.geo.NameEntityExtractor
 
getBigInteger(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
 
getBinaryDocValues(String) - Method in class org.apache.tika.eval.app.tools.SlowCompositeReaderWrapper
 
getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the bit rate in bits per second.
getBlob(ResultSet, int, Metadata) - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
 
getBlock_len() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
Returns block's length
getBlockAddress() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
Returns block addresses
getBlockCount() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
Gets a block count
getBlockidx_intvl() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
Returns block index interval
getBlockLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
Gets a block length
getBlockLength() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getBlockNext() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
 
getBlockNumber() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
 
getBlockPrev() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
 
getBlockRemaining() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getBlockType() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getBody() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
getBoolean(String, Boolean) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getByte() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
 
getByteList() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
 
getBytes(boolean) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(char) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(double) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(short) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(float) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes(String) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitConverter
 
getBytes() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitWriter
Gets a copy of the byte array containing the bytes written so far.
getBytes(long) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns the specified 64-bit unsigned integer value as an array of bytes.
getBytes(int) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.LittleEndianBitConverter
Returns the specified 32-bit unsigned integer value as an array of bytes.
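As a rough illustration of converting an integer to a byte array, the sketch below uses java.nio.ByteBuffer. It assumes least-significant-byte-first ordering, which the class name LittleEndianBitConverter suggests but this index does not state. The class LittleEndianSketch and its toBytes methods are hypothetical names, not part of Tika.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

// Hypothetical stand-alone sketch; assumes little-endian output.
final class LittleEndianSketch {

    // Encodes a 32-bit value as 4 bytes, least significant byte first.
    static byte[] toBytes(int value) {
        return ByteBuffer.allocate(Integer.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putInt(value)
                .array();
    }

    // Encodes a 64-bit value as 8 bytes, least significant byte first.
    static byte[] toBytes(long value) {
        return ByteBuffer.allocate(Long.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toBytes(1)));  // prints "[1, 0, 0, 0]"
        System.out.println(toBytes(1L).length);           // prints "8"
    }
}
```

Java has no unsigned int or long types, so a caller holding an "unsigned" value passes the same bit pattern in a signed variable; the resulting bytes are identical.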
getCapacity() - Method in class org.apache.tika.pipes.async.AsyncProcessor
 
getCause() - Method in exception org.apache.tika.sax.TaggedSAXException
Returns the wrapped exception.
getCauseForTermination() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
 
getCellManifestDataElementData(List<DataElement>, StorageManifestDataElementData, HashMap<CellID, ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to get the cell manifest data element from a list of data elements.
getCenter() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
 
getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the number of channels (1=mono, 2=stereo)
getCharset() - Method in class org.apache.tika.detect.AutoDetectReader
 
getCharset() - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
 
getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
 
getChildTypes(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
Returns the set of known children of the given canonical media type
getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
Deprecated.
getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
 
getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.microsoft.chm.ChmCommons
 
getChmDirList() - Method in class org.apache.tika.parser.microsoft.chm.ChmExtractor
 
getChmDirList() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getChmItsfHeader() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getChmItspHeader() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getChmLzxcControlData() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getChmLzxcResetTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getChoices() - Method in class org.apache.tika.metadata.Property
Returns the (immutable) set of choices for the values of this property.
getClassName() - Method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
 
getColInfos() - Method in class org.apache.tika.eval.app.db.TableInfo
 
getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getCommand() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the command to be run.
getCommand() - Method in class org.apache.tika.parser.external.ExternalParser
 
getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
 
getCommandAppendOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the operator to append rather than replace a value for the command line tool, i.e.
getCommandAssignmentDelimeter() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the delimiter for multiple assignments for the command line tool, i.e.
getCommandAssignmentOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the assignment operator for the command line tool, i.e.
getCommandMetadataSegments(Metadata) - Method in class org.apache.tika.embedder.ExternalEmbedder
Constructs a collection of command line arguments responsible for setting individual metadata fields based on the given metadata.
getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Builds up the ID3 comment, by parsing and extracting the comment string parts from the given data.
getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
Retrieves the comments, if any.
getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getCommitWithin() - Method in class org.apache.tika.pipes.emitter.opensearch.OpenSearchEmitter
 
getCommitWithin() - Method in class org.apache.tika.pipes.emitter.solr.SolrEmitter
 
getCommonTokens() - Method in class org.apache.tika.eval.core.tokens.CommonTokenResult
 
getCommonTokensAnalyzer() - Method in class org.apache.tika.eval.core.tokens.AnalyzerManager
This analyzer should be used to generate common tokens lists from large corpora.
getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have compilations, so this returns null.
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
ID3v22 doesn't have compilations, so this returns null.
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have composers, so this returns null.
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getCompoundTypes() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Gets the StreamObjectTypeHeaderStart
getCompressedLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxcResetTable
Gets compressed length
getConfidence() - Method in class org.apache.tika.language.detect.LanguageResult
 
getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
 
getConfidence() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get an indication of the confidence in the charset detected.
getConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
Deprecated.
as of 1.17, use EmbeddedDocumentUtil.getTikaConfig() instead
getConfig() - Static method in class org.apache.tika.server.core.resource.TikaResource
 
getConfigPath() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getConnection() - Method in class org.apache.tika.eval.app.db.JDBCUtil
Override this to apply any optimizations you want to do on the db before writing/reading.
getConnection(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
Override this for special configuration of the connection, such as limiting the number of rows to be held in memory.
getConnectionString() - Method in class org.apache.tika.eval.app.db.H2Util
 
getConnectionString() - Method in class org.apache.tika.eval.app.db.JDBCUtil
 
getConnectionString(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.AbstractDBParser
Implement for db specific connection information, e.g.
getConnectTimeout() - Method in class org.apache.tika.client.HttpClientFactory
 
getConsidered() - Method in class org.apache.tika.batch.FileResourceCrawler
 
getConsidered() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
Returns the number of file resources considered.
getConstraints() - Method in class org.apache.tika.eval.app.db.ColInfo
 
getConsumed() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
 
getConsumers() - Method in class org.apache.tika.batch.ConsumersManager
Get the consumers
getConsumersManagerMaxMillis() - Method in class org.apache.tika.batch.ConsumersManager
BatchProcess will throw an exception if the ConsumersManager doesn't complete init() or shutdown() within this amount of time.
getContent(EvalFilePaths, Metadata) - Static method in class org.apache.tika.eval.app.AbstractProfiler
 
getContent() - Method in class org.apache.tika.eval.core.util.ContentTags
 
getContent() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
 
getContent(int, int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
 
getContent(int) - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
 
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.ExGUIDArray
 
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.IntermediateNodeObject
Get all the content which is represented by the root node object.
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.LeafNodeObject
Get all the content which is represented by the intermediate node object.
getContent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.NodeObject
Get all the content which is represented by the node object.
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
 
getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.mif.MIFParser
Get the content handler to use.
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.TextAndAttributeXMLParser
 
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
 
getContentHandlerDecoratorFactory() - Method in class org.apache.tika.parser.AutoDetectParserConfig
 
getContentHandlerFactory() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
 
getContentLanguage() - Method in class org.apache.tika.example.ImportContextImpl
 
getContentLength() - Method in class org.apache.tika.example.ImportContextImpl
 
getContentLength() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxBlock
 
getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
 
getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
getContextIDs() - Method in class org.apache.tika.parser.microsoft.onenote.ObjectSpaceObjectPropSet
 
getControlDataIndex() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
Returns the control data index that is located in the list
getConverter(String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
Retrieve a specific converter according to the mimetype
getCoreCacheHelper() - Method in class org.apache.tika.eval.app.tools.SlowCompositeReaderWrapper
 
getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getCors() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getCount(String) - Method in class org.apache.tika.eval.core.tokens.LangModel
 
getCount() - Method in class org.apache.tika.langdetect.tika.LanguageProfile
 
getCount(String) - Method in class org.apache.tika.langdetect.tika.LanguageProfile
 
getCountryCode() - Method in class org.apache.tika.parser.geo.gazetteer.Location
 
getCounts() - Method in class org.apache.tika.eval.core.tokens.LangModel
 
getCurrent(byte[], AtomicInteger, Class<T>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.StreamObject
Get current stream object.
getCurrent() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.BitReader
 
getCurrentCharset() - Method in class org.apache.tika.example.PickBestTextEncodingParser.CharsetTester
Deprecated.
 
getCurrentFile() - Method in class org.apache.tika.batch.FileResourceConsumer
Returns the name and start time of a file that is currently being processed.
getCurrentFSSHTTPBSubRequestID() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
This method is used to get the current sub request ID and atomically add 1 to the token.
GetCurrentSerialNumber() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
This method is used to get the current serial number and atomically add 1 to the token.
getCurrentToken() - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.SequenceNumberGenerator
This method is used to get the current token value and atomically add 1 to the token.
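The "get the current value and atomically add 1" pattern described for the three SequenceNumberGenerator methods can be sketched with java.util.concurrent.atomic.AtomicLong. The class SequenceSketch, its starting value of 0, and the choice to return the value from before the increment are all assumptions made for this illustration; the index does not say which value Tika's methods return.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-alone sketch of an atomic sequence counter; not Tika's code.
final class SequenceSketch {

    // Shared counter; the starting value 0 is an assumption.
    private static final AtomicLong TOKEN = new AtomicLong(0);

    // Returns the current value and adds 1 in a single atomic step,
    // so concurrent callers never receive the same number.
    static long next() {
        return TOKEN.getAndIncrement();
    }

    public static void main(String[] args) {
        long first = next();
        long second = next();
        System.out.println(second - first); // prints "1"
    }
}
```

Using getAndIncrement rather than a separate read followed by a write is what makes the operation safe when several threads request numbers at once.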
getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getData() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
 
getData() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getData(Class<T>) - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.DataElement
Used to get data.
getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getDataObjectDataElementData(List<DataElement>, ExGuid, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to get the list of object group data elements from a list of data elements.
getDataObjectDataElementData(List<DataElement>, RevisionManifestDataElementData, AtomicReference<ExGuid>) - Static method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.util.DataElementUtils
This method is used to get a list of object group data elements from a list of data elements.
getDataOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
Returns data offset
getDataOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
Returns data offset
getDataToSign() - Method in class org.apache.tika.fuzzing.pdf.EvilCOSWriter
Return the stream of PDF data to be signed.
getDate(Property) - Method in class org.apache.tika.metadata.Metadata
Returns the value of the identified Date based metadata property.
getDate(Property) - Method in class org.apache.tika.xmp.XMPMetadata
 
getDateFormatOverride() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
 
getDBWriter(List<TableInfo>) - Method in class org.apache.tika.eval.app.batch.EvalConsumerBuilder
 
getDecodedValue() - Method in class org.apache.tika.parser.microsoft.onenote.fsshttpb.streamobj.basic.Compact64bitInt
 
getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
 
getDecorationName() - Method in class org.apache.tika.parser.ParserDecorator
 
getDectorsHTML() - Method in class org.apache.tika.server.core.resource.TikaDetectors
 
getDefaultConfig() - Static method in class org.apache.tika.config.TikaConfig
Provides a default configuration (TikaConfig).
getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
getDefaultDetector(MimeTypes, ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
 
getDefaultEncodingDetector(ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
 
getDefaultLanguageDetector() - Static method in class org.apache.tika.language.detect.LanguageDetector
 
getDefaultMimeTypes() - Static method in class org.apache.tika.mime.MimeTypes
Get the default MimeTypes.
getDefaultMimeTypes(ClassLoader) - Static method in class org.apache.tika.mime.MimeTypes
Get the default MimeTypes.
getDefaultNumConsumers() - Static method in class org.apache.tika.batch.builders.AbstractConsumersBuilder
 
getDefaultRegistry() - Static method in class org.apache.tika.mime.MediaTypeRegistry
Returns the built-in media type registry included in Tika.
getDelegateParser(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
Returns the parser instance to which parsing tasks should be delegated.
getDelegatingParser() - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
 
getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
 
getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
 
getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getDescription() - Method in class org.apache.tika.mime.MimeType
Returns the description of this media type.
getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the description, if present
getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
Deprecated.
This API is ICU internal only.
getDetector() - Method in class org.apache.tika.config.TikaConfig
Returns the configured detector instance.
getDetector() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
 
getDetector() - Method in class org.apache.tika.language.detect.LanguageHandler
Returns the language detector used by this content handler.
getDetector() - Method in class org.apache.tika.language.detect.LanguageWriter
Returns the language detector used by this writer.
getDetector() - Method in class org.apache.tika.parser.AutoDetectParser
Returns the type detector used by this parser to auto-detect the type of a document.
getDetector() - Method in class org.apache.tika.Tika
Returns the detector instance used by this facade.
getDetectors() - Method in class org.apache.tika.detect.CompositeDetector
Returns the component detectors.
getDetectors() - Method in class org.apache.tika.detect.CompositeEncodingDetector
 
getDetectors() - Method in class org.apache.tika.detect.DefaultDetector
 
getDetectors() - Method in class org.apache.tika.detect.DefaultProbDetector
 
getDetectorsJSON() - Method in class org.apache.tika.server.core.resource.TikaDetectors
 
getDetectorsPlain() - Method in class org.apache.tika.server.core.resource.TikaDetectors
 
getDiceCoefficient() - Method in class org.apache.tika.eval.core.tokens.ContrastStatistics
 
getDigest() - Method in class org.apache.tika.server.core.TikaServerConfig
digest configuration string, e.g.
getDigestMarkLimit() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getDir_uuid() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
Returns directory uuid
getDirectoryListingEntryList() - Method in class org.apache.tika.parser.microsoft.chm.ChmDirectoryListingSet
Returns chm directory listing entry list
getDirLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
Returns directory length
getDirOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
Returns directory offset
getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The number of the disc this belongs to, within the set
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have disc numbers, so this returns null.
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getDocument() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
Returns the opened document.
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
 
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getDocumentBuilder() - Method in class org.apache.tika.parser.ParseContext
Returns the DOM builder specified in this parsing context.
getDocumentBuilder() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the DOM builder specified in this parsing context.
getDocumentBuilderFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the DOM builder factory specified in this parsing context.
getDropThreshold() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
 
getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
Returns the duration in milliseconds.
getEmbeddedDocumentExtractor(ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.
getEmbeddedDocumentExtractorFactory() - Method in class org.apache.tika.parser.AutoDetectParserConfig
 
getEmitData() - Method in class org.apache.tika.pipes.PipesResult
 
getEmitDataQueue(int) - Method in class org.apache.tika.server.core.resource.AsyncResource
 
getEmitKey() - Method in class org.apache.tika.pipes.emitter.EmitData
 
getEmitKey() - Method in class org.apache.tika.pipes.emitter.EmitKey
 
getEmitKey() - Method in class org.apache.tika.pipes.FetchEmitTuple
 
getEmitMaxEstimatedBytes() - Method in class org.apache.tika.pipes.async.AsyncConfig
When the emit queue hits this estimated size (sum of estimated extract sizes), emit the batch.
getEmitter(String) - Method in class org.apache.tika.pipes.emitter.EmitterManager
 
getEmitterName() - Method in class org.apache.tika.pipes.emitter.EmitKey
 
getEmitterName() - Method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
getEmitWithinMillis() - Method in class org.apache.tika.pipes.async.AsyncConfig
 
getEncint() - Method in class org.apache.tika.parser.microsoft.chm.ChmSection
 
getEncoding() - Method in class org.apache.tika.example.ImportContextImpl
 
getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the character encoding of the strings that are to be found.
getEncodingDetector() - Method in class org.apache.tika.config.TikaConfig
Returns the configured encoding detector instance
getEncodingDetector(ParseContext) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
Look for an EncodingDetector in the ParseContext.
getEncodingDetector() - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
 
getEndBlock() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
Returns the end block index
getEndOffset() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
Returns the end offset index
getEndpoints() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Gets the set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Gets the set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
Gets the set of entity types recognised by this recogniser
getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
gets a set of entity types whose names are recognisable by this
getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Gets the set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
 
getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
 
getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
getEntriesToCopy() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
getEntropy() - Method in class org.apache.tika.eval.core.tokens.TokenStatistics
 
getEntryType() - Method in class org.apache.tika.parser.microsoft.chm.DirectoryListingEntry
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
getErrors() - Static method in class org.apache.tika.langdetect.tika.LanguageIdentifier
Returns a string of error messages related to initializing language profiles
getEstimatedSizeBytes() - Method in class org.apache.tika.pipes.emitter.EmitData
 
getExecutorService() - Method in class org.apache.tika.config.TikaConfig
 
getExitStatus() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
 
getExitValue() - Method in class org.apache.tika.utils.FileProcessResult
 
getExtendedGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getExtension(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
 
getExtension() - Method in class org.apache.tika.mime.MimeType
Returns the preferred file extension of this type, or an empty string if no extensions are known.
getExtension() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
 
getExtensions() - Method in class org.apache.tika.mime.MimeType
Returns the list of all known file extensions of this media type.
getFallback() - Method in class org.apache.tika.parser.CompositeParser
Returns the fallback parser.
getFetchEmitQueue(int) - Method in class org.apache.tika.server.core.resource.AsyncResource
 
getFetcher(String) - Method in class org.apache.tika.pipes.fetcher.FetcherManager
 
getFetcherName() - Method in class org.apache.tika.pipes.fetcher.FetchKey
 
getFetcherName() - Method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
getFetchKey() - Method in class org.apache.tika.pipes.FetchEmitTuple
 
getFetchKey() - Method in class org.apache.tika.pipes.fetcher.FetchKey
 
getField() - Method in class org.apache.tika.config.ParamField
 
getFieldInfos() - Method in class org.apache.tika.eval.app.tools.SlowCompositeReaderWrapper
 
getFile() - Method in class org.apache.tika.io.TikaInputStream
 
getFile(String, File) - Static method in class org.apache.tika.util.PropsUtil
Deprecated.
getFileChannel() - Method in class org.apache.tika.io.TikaInputStream
 
getFileLength(Path) - Method in class org.apache.tika.eval.app.AbstractProfiler
 
getFilesProcessed() - Method in class org.apache.tika.pipes.PipesClient
 
getFilesProcessed() - Method in class org.apache.tika.server.core.ServerStatus
 
getFilesystem() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
 
getFilesystem() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
 
getFilesystem() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
 
getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
 
getFilteredStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
Simple utility to get a stack trace.
getFilters(COSBase) - Method in class org.apache.tika.fuzzing.pdf.PDFTransformerConfig
getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getForkedJvmArgs() - Method in class org.apache.tika.pipes.PipesConfigBase
 
getForkedJvmArgs() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getForkedProcessArgs(String, String) - Method in class org.apache.tika.server.core.TikaServerConfig
 
getForkedProcessArgs(int, String) - Method in class org.apache.tika.server.core.TikaServerConfig
 
getForkedStatusFile() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getFormat() - Method in class org.apache.tika.language.translate.impl.YandexTranslator
Retrieve the current text format setting.
getFormattedNumber(Paragraph) - Method in class org.apache.tika.parser.microsoft.ListManager
Get the formatted number for a given paragraph
getFormattedNumber(XWPFParagraph) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
 
getFormattedNumber(BigInteger, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
 
getFramesRead() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getFreeSpace() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmgiHeader
Returns the PMGI free space
getFreeSpace() - Method in class org.apache.tika.parser.microsoft.chm.ChmPmglHeader
 
getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.GeoParser
 
getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.GeoParserConfig
 
getGeneralAnalyzer() - Method in class org.apache.tika.eval.core.tokens.AnalyzerManager
This analyzer should be used to extract all tokens.
getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
 
getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
 
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
 
getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.ExtendedGUID
 
getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
getGuid() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
getGuidString() - Method in class org.apache.tika.parser.microsoft.onenote.GUID
 
getHadStarted() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getHandlerConfig() - Method in class org.apache.tika.pipes.FetchEmitTuple
 
getHandlerConfig() - Method in class org.apache.tika.pipes.pipesiterator.PipesIterator
 
getHeader_len() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
Returns the header length
getHeaderLen() - Method in class org.apache.tika.parser.microsoft.chm.ChmItsfHeader
Returns the ITSF header length
getHeaders() - Method in class org.apache.tika.parser.jdbc.JDBCTableReader
 
getHost() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getHTML(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.TikaResource
 
getHTMLFromMultipart(Attachment, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.resource.TikaResource
 
getId() - Method in class org.apache.tika.parser.recognition.RecognisedObject
 
getId() - Method in class org.apache.tika.pipes.FetchEmitTuple
 
getId() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getId() - Method in class org.apache.tika.server.core.WatchDogResult
 
getIdBase() - Method in class org.apache.tika.server.core.TikaServerConfig
 
getIdentifier() - Method in class org.apache.tika.sax.StandardReference
 
getIgnoredLineConsumer() - Method in class org.apache.tika.parser.external.ExternalParser
Gets the consumer for ignored lines
getIlvl() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
 
getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
getImageMagickProg() - Static method in class org.apache.tika.parser.ocr.TesseractOCRParser
 
getImportRoot() - Method in class org.apache.tika.example.ImportContextImpl
 
getIncludeFields() - Method in class org.apache.tika.metadata.writefilter.StandardWriteFilterFactory
 
getIndex() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntryFNDX
 
getIndex_depth() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
Returns the index depth
getIndex_head() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
Returns the index head
getIndex_root() - Method in class org.apache.tika.parser.microsoft.chm.ChmItspHeader
Returns the index root
getIndexCopyFromStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
getIndexCopyToStart() - Method in class org.apache.tika.parser.microsoft.onenote.GlobalIdTableEntry3FNDX
 
getIndexOfContent() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getIndexOfResetData() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getIndexOfResetTable() - Method in class org.apache.tika.parser.microsoft.chm.ChmWrapper
 
getIniBlock() - Method in class org.apache.tika.parser.microsoft.chm.ChmBlockInfo
Returns the initial block index
getInitializableProblemHandler() - Method in class org.apache.tika.config.ServiceLoader
Returns the handler for problems with initializables
getInlineBool(OneNotePropertyEnum) - Static method in enum org.apache.tika.parser.microsoft.onenote.OneNotePropertyEnum
 
getInputDirectory() - Method in class org.apache.tika.fuzzing.cli.FuzzingCLIConfig
 
getInputStream(FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
 
getInputStream() - Method in class org.apache.tika.example.ImportContextImpl
Returns a new InputStream to the temporary file created during instantiation, or null if this context does not provide a stream.
getInputStream() - Method in interface org.apache.tika.io.InputStreamFactory
 
getInputStream() - Method in class org.apache.tika.parser.html.DataURIScheme
 
getInputStream(InputStream, Metadata, HttpHeaders) - Method in class org.apache.tika.server.core.DefaultInputStreamFactory
 
getInputStream(InputStream, Metadata, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.DefaultInputStreamFactory
 
getInputStream(InputStream, Metadata, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.core.FetcherStreamFactory
 
getInputStream(InputStream, Metadata, HttpHeaders) - Method in class org.apache.tika.server.core.FetcherStreamFactory
 
getInputStream(InputStream, Metadata, HttpHeaders) - Method in interface org.apache.tika.server.core.InputStreamFactory
getInputStream(InputStream, Metadata, HttpHeaders, UriInfo) - Method in interface org.apache.tika.server.core.InputStreamFactory
 
getInputStream(InputStream, Metadata, HttpHeaders, UriInfo) - Static method in class org.apache.tika.server.core.resource.TikaResource
 
getInputStreamFactory() - Method in class org.apache.tika.io.TikaInputStream
If the stream was created from an InputStreamFactory, returns that factory; otherwise returns null.
getInstance() - Method in interface org.apache.tika.eval.core.textstats.BytesRefCalculator
 
getInstance() - Method in class org.apache.tika.eval.core.textstats.TextSha256Signature
 
getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
 
getInt(Property) - Method in class org.apache.tika.metadata.Metadata
Returns the value of the identified Integer based metadata property.
getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getInt(String, Integer) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getInt(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
Get an int value.
getInt(Property) - Method in class org.apache.tika.xmp.XMPMetadata
 
getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
 
getIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
Get a big-endian (BE) int value from the beginning of a byte array
getIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
Get a big-endian (BE) int value from a byte array
getIntelCurrentPossition() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getIntelFileSize() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getIntelState() - Method in class org.apache.tika.parser.microsoft.chm.ChmLzxState
 
getIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
Get a little-endian (LE) int value from the beginning of a byte array