

ABS_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
"The absolute path to the file's peak audio file."
AbstractConsumersBuilder - Class in
AbstractConsumersBuilder() - Constructor for class
AbstractConverter - Class in org.apache.tika.xmp.convert
Base class for Tika Metadata to XMP converter which provides some needed common functionality.
AbstractConverter() - Constructor for class org.apache.tika.xmp.convert.AbstractConverter
AbstractEncodingDetectorParser - Class in org.apache.tika.parser
Abstract base class for parsers that use the AutoDetectReader and need to use the EncodingDetector configured by TikaConfig
AbstractEncodingDetectorParser() - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
AbstractEncodingDetectorParser(EncodingDetector) - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
AbstractFSConsumer - Class in org.apache.tika.batch.fs
AbstractFSConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.fs.AbstractFSConsumer
AbstractListManager - Class in
AbstractListManager() - Constructor for class
AbstractListManager.LevelTuple - Class in
AbstractListManager.ParagraphLevelCounter - Class in
AbstractOfficeParser - Class in
Intermediate layer to set OfficeParserConfig uniformly.
AbstractOfficeParser() - Constructor for class
AbstractOOXMLExtractor - Class in
Base class for all Tika OOXML extractors.
AbstractOOXMLExtractor(ParseContext, POIXMLTextExtractor) - Constructor for class
AbstractParser - Class in org.apache.tika.parser
Abstract base class for new parsers.
AbstractParser() - Constructor for class org.apache.tika.parser.AbstractParser
AbstractProfiler - Class in org.apache.tika.eval
AbstractProfiler(ArrayBlockingQueue<FileResource>, IDBWriter) - Constructor for class org.apache.tika.eval.AbstractProfiler
AbstractProfiler.EXCEPTION_TYPE - Enum in org.apache.tika.eval
AbstractProfiler.PARSE_ERROR_TYPE - Enum in org.apache.tika.eval
If information was gathered from the log file about a parse error
AbstractRecursiveParserWrapperHandler - Class in org.apache.tika.sax
This is a special handler to be used only with the RecursiveParserWrapper.
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
AbstractRecursiveParserWrapperHandler(ContentHandlerFactory, int) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
AbstractTranslator - Class in org.apache.tika.language.translate
AbstractTranslator() - Constructor for class org.apache.tika.language.translate.AbstractTranslator
AbstractXML2003Parser - Class in
AbstractXML2003Parser() - Constructor for class
AccessChecker - Class in org.apache.tika.parser.pdf
Checks whether or not a document allows extraction generally or extraction for accessibility only.
AccessChecker() - Constructor for class org.apache.tika.parser.pdf.AccessChecker
This constructs an AccessChecker that will not perform any checking and will always return without throwing an exception.
AccessChecker(boolean) - Constructor for class org.apache.tika.parser.pdf.AccessChecker
This constructs an AccessChecker that will check for whether or not content should be extracted from a document.
AccessPermissionException - Exception in org.apache.tika.exception
Exception to be thrown when a document does not allow content extraction.
AccessPermissionException() - Constructor for exception org.apache.tika.exception.AccessPermissionException
AccessPermissionException(Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
AccessPermissionException(String) - Constructor for exception org.apache.tika.exception.AccessPermissionException
AccessPermissionException(String, Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
AccessPermissions - Interface in org.apache.tika.metadata
Until we can find a common standard, we'll use these options.
ACKNOWLEDGEMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
ACRONYM_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
ACTION_TRIGGER - Static variable in interface org.apache.tika.metadata.PDF
This specifies where an action or destination would be found/triggered in the document: on document open, before close, etc.
actionPerformed(ActionEvent) - Method in class org.apache.tika.gui.TikaGUI
Activator - Class in org.apache.tika.parser.internal
Activator() - Constructor for class org.apache.tika.parser.internal.Activator
add(String, long) - Method in class org.apache.tika.eval.tokens.LangModel
add(String, String) - Method in class org.apache.tika.eval.tokens.TokenCounter
add(String) - Method in class org.apache.tika.language.LanguageProfile
Adds a single occurrence of the given ngram to this profile.
add(String, long) - Method in class org.apache.tika.language.LanguageProfile
Adds multiple occurrences of the given ngram to this profile.
add(StringBuffer) - Method in class org.apache.tika.language.LanguageProfilerBuilder
Adds ngrams from a single word to this profile
add(String, String) - Method in class org.apache.tika.metadata.Metadata
Add a metadata name/value mapping.
add(Property, String) - Method in class org.apache.tika.metadata.Metadata
Add a metadata property/value mapping.
add(Property, int) - Method in class org.apache.tika.metadata.Metadata
Adds the integer value of the identified metadata property.
add(Metadata) - Method in class org.apache.tika.metadata.serialization.JsonStreamingSerializer
add(String, String) - Method in class org.apache.tika.xmp.XMPMetadata
As this API could only possibly work for simple properties in XMP, it just calls the set method, which replaces any existing value
addAlias(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
addAlternative(GeoTag) - Method in class org.apache.tika.parser.geo.topic.GeoTag
addData(byte[], int, int) - Method in class org.apache.tika.detect.TextStatistics
addDrawingHyperLinks(PackagePart) - Method in class
ADDED - Static variable in class org.apache.tika.batch.FileResourceCrawler
addErrorLogTablePair(Path, TableInfo) - Method in class org.apache.tika.eval.batch.DBConsumersManager
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
addEvenIfNull(Property, String, Metadata) - Static method in class
addingService(ServiceReference) - Method in class org.apache.tika.config.TikaActivator
ADDITIONAL_MODEL_INFO - Static variable in interface org.apache.tika.metadata.IPTC
Information about the ethnicity and other facets of the model(s) in a model-released image.
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.OpenDocumentConverter
ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.RTFConverter
addMetadata(String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
addMetadata(String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
addMetadata(String) - Method in class org.apache.tika.parser.xml.MetadataHandler
addMulti(Metadata, Property, String) - Static method in class
addOtherTesseractConfig(String, String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
Add a key-value pair to pass to Tesseract using its -c command line option.
addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
addPattern(MimeType, String, boolean) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
addPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
This tries to split a "from" or "to" value into a person field and an email field.
addPrefix(String, String) - Method in class org.apache.tika.sax.xpath.XPathParser
addProfile(String, LanguageProfile) - Static method in class org.apache.tika.language.LanguageIdentifier
Adds a single language profile
addResource(Closeable) - Method in class
Adds a new resource to the set of tracked resources that will all be closed when the TemporaryResources.close() method is called.
addSuperType(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
addText(char[], int, int) - Method in class org.apache.tika.langdetect.Lingo24LangDetector
addText(char[], int, int) - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
addText(char[], int, int) - Method in class org.apache.tika.langdetect.TextLangDetector
addText(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageDetector
Add statistics about this text for the current document.
addText(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
Add to the statistics being accumulated for the current document.
addType(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
AdobeFontMetricParser - Class in org.apache.tika.parser.font
Parser for AFM Font Files
AdobeFontMetricParser() - Constructor for class org.apache.tika.parser.font.AdobeFontMetricParser
AdvancedTypeDetector - Class in org.apache.tika.example
AdvancedTypeDetector() - Constructor for class org.apache.tika.example.AdvancedTypeDetector
afterRead(int) - Method in class
Invoked by the read methods after the proxied call has returned successfully.
afterRead(int) - Method in class
AgeRecogniser - Class in org.apache.tika.parser.recognition
Parser for extracting features from text.
AgeRecogniser() - Constructor for class org.apache.tika.parser.recognition.AgeRecogniser
AgeRecogniserConfig - Class in org.apache.tika.parser.recognition
Stores URL for AgePredictor
AgeRecogniserConfig(Map<String, Param>) - Constructor for class org.apache.tika.parser.recognition.AgeRecogniserConfig
ALBUM - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the album."
ALBUM_ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the album artist or group for compilation albums."
ALIAS_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
ALIAS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
ALIGNED_OFFSET - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
alignedLenTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
alignedTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
AlphaIdeographFilterFactory - Class in org.apache.tika.eval.tokens
Factory for filter that only allows tokens with characters that "isAlphabetic" or "isIdeographic" through.
AlphaIdeographFilterFactory(Map<String, String>) - Constructor for class org.apache.tika.eval.tokens.AlphaIdeographFilterFactory
ALT_TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
"An alternative tape name, set via the project window or timecode dialog in Premiere."
ALTITUDE - Static variable in interface org.apache.tika.metadata.Geographic
The WGS84 Altitude of the Point
ALTITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
analyze(StringBuilder) - Method in class org.apache.tika.language.LanguageProfilerBuilder
Analyzes a piece of text
AnalyzerManager - Class in org.apache.tika.eval.tokens
AnnotationUtils - Class in org.apache.tika.utils
This class contains utilities for dealing with tika annotations
AnnotationUtils() - Constructor for class org.apache.tika.utils.AnnotationUtils
apiBaseUri - Variable in class
apiUri - Variable in class
APP_VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
AppleSingleFileParser - Class in
Parser that strips the header off of AppleSingle and AppleDouble files.
AppleSingleFileParser() - Constructor for class
APPLICATION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
application(String) - Static method in class org.apache.tika.mime.MediaType
APPLICATION_NAME - Static variable in interface org.apache.tika.metadata.MSOffice
APPLICATION_VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
APPLICATION_XML - Static variable in class org.apache.tika.mime.MediaType
APPLICATION_ZIP - Static variable in class org.apache.tika.mime.MediaType
applyStyleAndValue(int, ResultSet, Cell) - Method in class org.apache.tika.eval.reports.XLSXHREFFormatter
AppParserFactoryBuilder - Class in
AppParserFactoryBuilder() - Constructor for class
ARCHITECTURE_BITS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the artist or artists."
ARTWORK_OR_OBJECT - Static variable in interface org.apache.tika.metadata.IPTC
A set of metadata about artwork or an object in the item
ARTWORK_OR_OBJECT_DETAIL_COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
Contains any necessary copyright notice for claiming the intellectual property for artwork or an object in the image and should identify the current owner of the copyright of this work with associated intellectual property rights.
ARTWORK_OR_OBJECT_DETAIL_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
Contains the name of the artist who has created artwork or an object in the image.
ARTWORK_OR_OBJECT_DETAIL_DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
Designates the date and optionally the time the artwork or object in the image was created.
ARTWORK_OR_OBJECT_DETAIL_SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
The organisation or body holding and registering the artwork or object in the image for inventory purposes.
ARTWORK_OR_OBJECT_DETAIL_SOURCE_INVENTORY_NUMBER - Static variable in interface org.apache.tika.metadata.IPTC
The inventory number issued by the organisation or body holding and registering the artwork or object in the image.
ARTWORK_OR_OBJECT_DETAIL_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
A reference for the artwork or object in the image.
asInputSource() - Method in class org.apache.tika.detect.AutoDetectReader
ASSEMBLE_DOCUMENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user insert/rotate/delete pages.
assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks if byte[] is not null
assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
assertChmAccessorNotNull(ChmAccessor<?>) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks that the ChmAccessor is not null; throws an exception if it is.
assertChmAccessorParameters(byte[], ChmAccessor<?>, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks validity of ChmAccessor parameters
assertChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks the validity of the chmBlockSegment parameters
assertCopyingDataIndex(int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
assertDirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks the validity of the DirectoryListingEntry's parameters; throws an exception if any parameter is invalid.
assertInputStreamNotNull(InputStream) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks if InputStream is not null
assertPositiveInt(int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
Checks that the int parameter is greater than zero; throws an exception if param <= 0.
assignFieldParams(Object, Map<String, Param>) - Static method in class org.apache.tika.utils.AnnotationUtils
Assigns the param values to bean
assignValue(Object, Object) - Method in class org.apache.tika.config.ParamField
Sets given value to the annotated field of bean
attachExternalParsers(TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
attachExternalParsers(List<ExternalParser>, TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
AttributeDependantMetadataHandler - Class in org.apache.tika.parser.xml
This adds a Metadata entry for a given node.
AttributeDependantMetadataHandler(Metadata, String, String) - Constructor for class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
AttributeMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of a .../@* XPath expression.
AttributeMatcher() - Constructor for class org.apache.tika.sax.xpath.AttributeMatcher
AttributeMetadataHandler - Class in org.apache.tika.parser.xml
SAX event handler that maps the contents of an XML attribute into a metadata field.
AttributeMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
AttributeMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
audio(String) - Static method in class org.apache.tika.mime.MediaType
AUDIO_CHANNEL_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
"The audio channel type."
AUDIO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
"The audio compression used."
AUDIO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
"The date and time when the audio was last modified."
AUDIO_SAMPLE_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
"The audio sample rate."
AUDIO_SAMPLE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
"The audio sample type."
AudioFrame - Class in org.apache.tika.parser.mp3
An Audio Frame in an MP3 file.
AudioFrame(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Use the constructor which is passed all values directly.
AudioFrame(int, int, int, int, InputStream) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Use the constructor which is passed all values directly.
AudioFrame(int, int, int, int, int, int, float) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
Creates a new instance of AudioFrame and initializes all properties.
AudioParser - Class in
AudioParser() - Constructor for class
AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
AUTHOR - Static variable in interface org.apache.tika.metadata.Office
Name of the principal author(s) of a document
AUTHORS_POSITION - Static variable in interface org.apache.tika.metadata.Photoshop
AutoDetectParser - Class in org.apache.tika.parser
AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
Creates an auto-detecting parser instance using the default Tika configuration.
AutoDetectParser(Detector) - Constructor for class org.apache.tika.parser.AutoDetectParser
AutoDetectParser(Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
Creates an auto-detecting parser instance using the specified set of parsers.
AutoDetectParser(Detector, Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
AutoDetectParser(TikaConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
AutoDetectParserFactory - Class in org.apache.tika.batch
Simple class for AutoDetectParser
AutoDetectParserFactory() - Constructor for class org.apache.tika.batch.AutoDetectParserFactory
AutoDetectParserFactory - Class in org.apache.tika.parser
Factory for an AutoDetectParser
AutoDetectParserFactory(Map<String, String>) - Constructor for class org.apache.tika.parser.AutoDetectParserFactory
AutoDetectReader - Class in org.apache.tika.detect
An input stream reader that automatically detects the character encoding to be used for converting bytes to characters.
AutoDetectReader(InputStream, Metadata, EncodingDetector) - Constructor for class org.apache.tika.detect.AutoDetectReader
AutoDetectReader(InputStream, Metadata, ServiceLoader) - Constructor for class org.apache.tika.detect.AutoDetectReader
AutoDetectReader(InputStream, Metadata) - Constructor for class org.apache.tika.detect.AutoDetectReader
AutoDetectReader(InputStream) - Constructor for class org.apache.tika.detect.AutoDetectReader
autoTranslate(InputStream, String, String) - Method in class org.apache.tika.server.resource.TranslateResource
available() - Method in class
available() - Method in class
Return the number of bytes that can be read.
available() - Method in class
Invokes the delegate's available() method.
available() - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Whether there is more data to read.
available - Variable in class
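
Note on the add entries above: Metadata.add(String, String) is listed as adding a name/value mapping, while XMPMetadata.add(String, String) is described as calling set, which replaces any existing value. The following is a minimal JDK-only sketch of that difference, not Tika's implementation. The class MultiValuedMetadataSketch and its getValues method are hypothetical names, and the sketch assumes that add appends to the values already stored under a name.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; it is not part of Apache Tika.
final class MultiValuedMetadataSketch {
    // Each name maps to an ordered list of values.
    private final Map<String, List<String>> values = new HashMap<>();

    // "add" semantics (assumed): append a value, keeping earlier ones.
    void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    // "set" semantics: discard earlier values and keep only the new one.
    void set(String name, String value) {
        List<String> single = new ArrayList<>();
        single.add(value);
        values.put(name, single);
    }

    // Returns a copy of the stored values, or an empty list if none exist.
    List<String> getValues(String name) {
        return new ArrayList<>(values.getOrDefault(name, Collections.emptyList()));
    }

    public static void main(String[] args) {
        MultiValuedMetadataSketch m = new MultiValuedMetadataSketch();
        m.add("creator", "Alice");
        m.add("creator", "Bob");
        System.out.println(m.getValues("creator")); // prints [Alice, Bob]
        m.set("creator", "Carol");
        System.out.println(m.getValues("creator")); // prints [Carol]
    }
}
```

An add that delegates to set, as the XMPMetadata entry describes, would leave only the last value in place, so callers that rely on multi-valued fields may want to check which behavior a given class provides.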


BasicContentHandlerFactory - Class in org.apache.tika.sax
Basic factory for creating common types of ContentHandlers
BasicContentHandlerFactory(BasicContentHandlerFactory.HANDLER_TYPE, int) - Constructor for class org.apache.tika.sax.BasicContentHandlerFactory
BasicContentHandlerFactory.HANDLER_TYPE - Enum in org.apache.tika.sax
Common handler types for content.
BasicTikaFSConsumer - Class in org.apache.tika.batch.fs
Basic FileResourceConsumer that reads files from an input directory and writes content to the output directory.
BasicTikaFSConsumer(ArrayBlockingQueue<FileResource>, ParserFactory, ContentHandlerFactory, OutputStreamFactory, TikaConfig) - Constructor for class org.apache.tika.batch.fs.BasicTikaFSConsumer
BasicTikaFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory) - Constructor for class org.apache.tika.batch.fs.BasicTikaFSConsumer
BasicTikaFSConsumersBuilder - Class in
BasicTikaFSConsumersBuilder() - Constructor for class
BasicTokenCountStatsCalculator - Class in org.apache.tika.eval.textstats
BasicTokenCountStatsCalculator() - Constructor for class org.apache.tika.eval.textstats.BasicTokenCountStatsCalculator
BatchNoRestartError - Error in org.apache.tika.batch
FileResourceConsumers should throw this if something catastrophic has happened and the BatchProcess should shutdown and not be restarted.
BatchNoRestartError(Throwable) - Constructor for error org.apache.tika.batch.BatchNoRestartError
BatchNoRestartError(String) - Constructor for error org.apache.tika.batch.BatchNoRestartError
BatchNoRestartError(String, Throwable) - Constructor for error org.apache.tika.batch.BatchNoRestartError
BatchProcess - Class in org.apache.tika.batch
This is the main processor class for a single process.
BatchProcess(FileResourceCrawler, ConsumersManager, StatusReporter, Interrupter) - Constructor for class org.apache.tika.batch.BatchProcess
BatchProcess.BATCH_CONSTANTS - Enum in org.apache.tika.batch
BatchProcessBuilder - Class in
Builds a BatchProcessor from a combination of runtime arguments and the config file.
BatchProcessBuilder() - Constructor for class
BatchProcessDriverCLI - Class in org.apache.tika.batch
BatchProcessDriverCLI(String[]) - Constructor for class org.apache.tika.batch.BatchProcessDriverCLI
BatchTopCommonTokenCounter - Class in
Utility class that runs TopCommonTokenCounter against a directory of table files (named {lang}_table.gz or Leipzig-like afr_...-sentences.txt) and outputs common tokens files for each input table file in the output directory.
BatchTopCommonTokenCounter() - Constructor for class
beforeRead(int) - Method in class
Invoked by the read methods before the call is proxied.
BIG - Static variable in class org.apache.tika.parser.executable.MachineMetadata.Endian
BITS_PER_SAMPLE - Static variable in interface org.apache.tika.metadata.TIFF
"Number of bits per component in each channel."
BodyContentHandler - Class in org.apache.tika.sax
Content handler decorator that only passes everything inside the XHTML <body/> tag to the underlying handler.
BodyContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that passes all XHTML body events to the given underlying content handler.
BodyContentHandler(Writer) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to the given writer.
BodyContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to the given output stream using the default encoding.
BodyContentHandler(int) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to an internal string buffer.
BodyContentHandler() - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to an internal string buffer.
BoilerpipeContentHandler - Class in org.apache.tika.parser.html
Uses the boilerpipe library to automatically extract the main content from a web page.
BoilerpipeContentHandler(ContentHandler) - Constructor for class org.apache.tika.parser.html.BoilerpipeContentHandler
Creates a new boilerpipe-based content extractor, using the DefaultExtractor extraction rules and "delegate" as the content handler.
BoilerpipeContentHandler(Writer) - Constructor for class org.apache.tika.parser.html.BoilerpipeContentHandler
Creates a content handler that writes XHTML body character events to the given writer.
BoilerpipeContentHandler(ContentHandler, BoilerpipeExtractor) - Constructor for class org.apache.tika.parser.html.BoilerpipeContentHandler
Creates a new boilerpipe-based content extractor, using the given extraction rules.
BouncyCastleDigester - Class in org.apache.tika.parser.utils
Digester that relies on BouncyCastle for MessageDigest implementations.
BouncyCastleDigester(int, String) - Constructor for class org.apache.tika.parser.utils.BouncyCastleDigester
Include a string representing the comma-separated algorithms to run: e.g.
BoundedInputStream - Class in
Very slight modification of Commons' BoundedInputStream so that we can figure out if this hit the bound or not.
BoundedInputStream(long, InputStream) - Constructor for class
BPGParser - Class in org.apache.tika.parser.image
Parser for the Better Portable Graphics (BPG) File Format.
BPGParser() - Constructor for class org.apache.tika.parser.image.BPGParser
BufferUnderrunException() - Constructor for exception
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class
build(Node, Map<String, String>) - Method in class
build(InputStream, Map<String, String>) - Method in class
Builds a BatchProcess from runtime arguments and an input stream of a configuration file.
build(Node, Map<String, String>) - Method in class
Builds a FileResourceBatchProcessor from runtime arguments and a document node of a configuration file.
build(InputStream) - Method in class
build(Node, Map<String, String>) - Method in class
build(Node, Map<String, String>) - Method in interface
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in interface
build(Node, long, Map<String, String>) - Method in class
build(Node, Map<String, String>) - Method in interface
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in interface
build(Node, Map<String, String>) - Method in interface
build(Node, Map<String, String>) - Method in class
build(Node, Map<String, String>) - Method in interface
build(FileResourceCrawler, ConsumersManager, Node, Map<String, String>) - Method in class
build(FileResourceCrawler, ConsumersManager, Node, Map<String, String>) - Method in interface
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class
build() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
build() - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
build(Node, Map<String, String>, ArrayBlockingQueue<FileResource>) - Method in class org.apache.tika.eval.batch.EvalConsumersBuilder
build() - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
build() - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
build(Path) - Static method in class org.apache.tika.eval.reports.ResultsReporter
build() - Method in class org.apache.tika.fork.ParserFactoryFactory
BUILD - Static variable in interface org.apache.tika.metadata.QuattroPro
build() - Method in class org.apache.tika.parser.AutoDetectParserFactory
build() - Method in class org.apache.tika.parser.ParserFactory
build() - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
build2() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
Initialize the MimeTypes with this builder instance
buildClass(Class<T>, String) - Static method in class org.apache.tika.util.ClassLoaderUtil
buildDOM(InputStream, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
This checks context for a user specified DocumentBuilder.
buildDOM(Path) - Static method in class org.apache.tika.utils.XMLReaderUtils
Builds a Document with a DocumentBuilder from the pool
buildDOM(String) - Static method in class org.apache.tika.utils.XMLReaderUtils
Builds a Document with a DocumentBuilder from the pool
buildDOM(InputStream) - Static method in class org.apache.tika.utils.XMLReaderUtils
Builds a Document with a DocumentBuilder from the pool
Builder() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
buildExtractReader(Map<String, String>) - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
buildParagraphTagAndStyle(String, boolean) - Static method in class
Given a style name, return what tag should be used, and what style should be applied to it.
buildXHTML(XHTMLContentHandler) - Method in class
Populates the XHTMLContentHandler object received as parameter.
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
buildXHTML(XHTMLContentHandler) - Method in class
BYTE_ARRAY_LENGHT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
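
Note on the BoundedInputStream entry above: it is described as a bounded stream that can report whether the bound was hit. The following is a minimal JDK-only sketch of that idea, not the class indexed here. The names BoundTrackingStreamSketch, hasHitBound and truncates are hypothetical, and the sketch assumes that "hit the bound" means at least max bytes were read, so a stream of exactly max bytes also counts.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration class; it is not the BoundedInputStream listed in this index.
final class BoundTrackingStreamSketch extends InputStream {
    private final long max;       // maximum number of bytes to hand out
    private final InputStream in; // wrapped stream
    private long pos = 0;         // bytes handed out so far

    BoundTrackingStreamSketch(long max, InputStream in) {
        this.max = max;
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        if (pos >= max) {
            return -1; // pretend end-of-stream once the bound is reached
        }
        int b = in.read();
        if (b != -1) {
            pos++;
        }
        return b;
    }

    @Override
    public void close() throws IOException {
        in.close();
    }

    // Assumed definition: true once max bytes have been read.
    boolean hasHitBound() {
        return pos >= max;
    }

    // Helper: drains the data through a bounded stream and reports the flag.
    static boolean truncates(byte[] data, long max) {
        try (BoundTrackingStreamSketch s =
                 new BoundTrackingStreamSketch(max, new ByteArrayInputStream(data))) {
            while (s.read() != -1) {
                // discard bytes; only the final flag matters here
            }
            return s.hasHitBound();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(truncates(new byte[10], 4)); // prints true
        System.out.println(truncates(new byte[3], 4));  // prints false
    }
}
```

A caller could use such a flag to tell a source that ended on its own apart from content that was cut off by the limit, for example to record a truncation warning.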


CachedTranslator - Class in org.apache.tika.language.translate
CachedTranslator() - Constructor for class org.apache.tika.language.translate.CachedTranslator
Create a new CachedTranslator (must set the Translator with CachedTranslator.setTranslator(Translator) before use!)
CachedTranslator(Translator) - Constructor for class org.apache.tika.language.translate.CachedTranslator
Create a new CachedTranslator.
calcTextStats(ContentTags) - Method in class org.apache.tika.eval.AbstractProfiler
calculate(String) - Method in class org.apache.tika.eval.langid.LanguageIDWrapper
calculate(TokenCounts) - Method in class org.apache.tika.eval.textstats.BasicTokenCountStatsCalculator
calculate(List<Language>, TokenCounts) - Method in class org.apache.tika.eval.textstats.CommonTokens
calculate(List<Language>, TokenCounts) - Method in class org.apache.tika.eval.textstats.CommonTokensBhattacharyya
calculate(List<Language>, TokenCounts) - Method in class org.apache.tika.eval.textstats.CommonTokensCosine
calculate(List<Language>, TokenCounts) - Method in class org.apache.tika.eval.textstats.CommonTokensHellinger
calculate(List<Language>, TokenCounts) - Method in class org.apache.tika.eval.textstats.CommonTokensKLDivergence
calculate(List<Language>, TokenCounts) - Method in class org.apache.tika.eval.textstats.CommonTokensKLDNormed
calculate(String) - Method in class org.apache.tika.eval.textstats.CompositeTextStatsCalculator
calculate(String) - Method in class org.apache.tika.eval.textstats.ContentLengthCalculator
calculate(List<Language>, TokenCounts) - Method in interface org.apache.tika.eval.textstats.LanguageAwareTokenCountStats
calculate(String) - Method in interface org.apache.tika.eval.textstats.StringStatsCalculator
calculate(TokenCounts) - Method in interface org.apache.tika.eval.textstats.TokenCountStatsCalculator
calculate(TokenCounts) - Method in class org.apache.tika.eval.textstats.TokenEntropy
calculate(TokenCounts) - Method in class org.apache.tika.eval.textstats.TokenLengths
calculate(TokenCounts) - Method in class org.apache.tika.eval.textstats.TopNTokens
calculate(String) - Method in class org.apache.tika.eval.textstats.UnicodeBlockCounter
calculateContrastStatistics(TokenCounts, TokenCounts) - Method in class org.apache.tika.eval.tokens.TokenContraster
call() - Method in class org.apache.tika.batch.BatchProcess
Runs main execution loop.
call() - Method in class org.apache.tika.batch.FileResourceConsumer
call() - Method in class org.apache.tika.batch.FileResourceCrawler
call() - Method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
call() - Method in class org.apache.tika.batch.Interrupter
call() - Method in class org.apache.tika.batch.StatusReporter
Starts up the reporter.
CAN_MODIFY - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can any modifications be made to the document
CAN_MODIFY_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user modify annotations
CAN_PRINT - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user print the document
CAN_PRINT_DEGRADED - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user print an image-degraded version of the document.
canRun() - Static method in class org.apache.tika.langdetect.TextLangDetector
canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
CAPTION_WRITER - Static variable in interface org.apache.tika.metadata.Photoshop
CaptionObject - Class in org.apache.tika.parser.captioning
A model for caption objects from graphics and texts; it typically includes a human-readable sentence, the language of the sentence, and a confidence score.
CaptionObject(String, String, double) - Constructor for class org.apache.tika.parser.captioning.CaptionObject
cast(InputStream) - Static method in class
Returns the given stream cast to a TikaInputStream, or null if the stream is not a TikaInputStream.
CATEGORY - Static variable in interface org.apache.tika.metadata.IPTC
CATEGORY - Static variable in interface org.apache.tika.metadata.MSOffice
CATEGORY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
A categorization of the content of this package.
CATEGORY - Static variable in interface org.apache.tika.metadata.Photoshop
Cell - Interface in
Cell of content.
cell(String, String, XSSFComment) - Method in class
CellDecorator - Class in
Cell decorator.
CellDecorator(Cell) - Constructor for class
CERTIFICATE - Static variable in interface org.apache.tika.metadata.XMPRights
A Web URL for a rights management certificate.
ChannelTypePropertyConverter() - Constructor for class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.Office
The number of Characters in the document
CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.MSOffice
CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.Office
The number of Characters in the document, including spaces
characters - Variable in class org.apache.tika.mime.MimeTypesReader
characters(char[], int, int) - Method in class org.apache.tika.mime.MimeTypesReader
characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
characters(char[], int, int) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
characters(char[], int, int) - Method in class
characters(char[], int, int) - Method in class
characters(char[], int, int) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
characters(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
The characters method is called whenever a Parser wants to pass raw...
characters(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
The characters method is called whenever a Parser wants to pass raw characters to the ContentHandler.
characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
Writes the given characters to the given character stream.
characters(char[], int, int) - Method in class org.apache.tika.sax.ToXMLContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
Writes the given characters to the given character stream.
characters(char[], int, int) - Method in class org.apache.tika.sax.XHTMLContentHandler
characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
characters(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
CHARACTERS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
CharsetDetector - Class in org.apache.tika.parser.txt
CharsetDetector provides a facility for detecting the charset or encoding of character data in an unknown format.
CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
CharsetMatch - Class in org.apache.tika.parser.txt
This class represents a charset that has been identified by a CharsetDetector as a possible encoding for a set of input data.
CharsetUtils - Class in org.apache.tika.utils
CharsetUtils() - Constructor for class org.apache.tika.utils.CharsetUtils
check(String, int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
Checks to see if the command can be run.
check(String[], int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
Checks to see if the command can be run.
check(String, int...) - Static method in class org.apache.tika.parser.external.ExternalParser
Checks to see if the command can be run.
check(String[], int...) - Static method in class org.apache.tika.parser.external.ExternalParser
check(Metadata) - Method in class org.apache.tika.parser.pdf.AccessChecker
Checks to see if a document's content should be extracted based on metadata values and the value of AccessChecker.allowAccessibility in the constructor.
CHECK_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
checkAvail() - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Ping lucene-geo-gazetteer API
checkBit(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
checkCommand(String, int...) - Method in class org.apache.tika.language.translate.ExternalTranslator
Checks to see if the command can be run.
checkForTimedOutMillis(long) - Method in class org.apache.tika.batch.FileResourceConsumer
Checks to see if the currentFile being processed (if there is one) should be timed out (still being worked on after staleThresholdMillis).
checkInitialization(InitializableProblemHandler) - Method in interface org.apache.tika.config.Initializable
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
checkInitialization(InitializableProblemHandler) - Method in class
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
checkInitialization(InitializableProblemHandler) - Method in class
checkInitialization(InitializableProblemHandler) - Method in class
checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
checkIntegrity() - Method in class
checkIsOperating() - Static method in class org.apache.tika.server.resource.TikaResource
checkThisIsAncestorOfOrSameAsThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
checkThisIsAncestorOfThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
ChildMatcher - Class in org.apache.tika.sax.xpath
Intermediate evaluation state of a .../*... XPath expression.
ChildMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.ChildMatcher
CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_VER_1 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_VER_2 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_VER_3 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
ChmAccessor<T> - Interface in org.apache.tika.parser.chm.accessor
Defines an accessor interface
ChmAssert - Class in org.apache.tika.parser.chm.assertion
Contains chm extractor assertions
ChmAssert() - Constructor for class org.apache.tika.parser.chm.assertion.ChmAssert
ChmBlockInfo - Class in org.apache.tika.parser.chm.lzx
A container that contains chm block information such as: i.
ChmCommons - Class in org.apache.tika.parser.chm.core
ChmCommons.EntryType - Enum in org.apache.tika.parser.chm.core
Represents entry types: uncompressed, compressed
ChmCommons.IntelState - Enum in org.apache.tika.parser.chm.core
Represents intel file states during decompression
ChmCommons.LzxState - Enum in org.apache.tika.parser.chm.core
Represents lzx states: started decoding, not started decoding
ChmConstants - Class in org.apache.tika.parser.chm.core
ChmDirectoryListingSet - Class in org.apache.tika.parser.chm.accessor
Holds chm listing entries
ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Constructs chm directory listing set
ChmExtractor - Class in org.apache.tika.parser.chm.core
Extracts text from chm file.
ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.chm.core.ChmExtractor
ChmItsfHeader - Class in org.apache.tika.parser.chm.accessor
The Header:
0000: char[4] 'ITSF'
0004: DWORD 3 (Version number)
0008: DWORD Total header length, including header section table and following data.
ChmItsfHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItsfHeader
ChmItspHeader - Class in org.apache.tika.parser.chm.accessor
Directory header. The directory starts with a header; its format is as follows:
0000: char[4] 'ITSP'
0004: DWORD Version number 1
0008: DWORD Length of the directory header
000C: DWORD $0a (unknown)
0010: DWORD $1000 Directory chunk size
0014: DWORD "Density" of quickref section, usually 2
0018: DWORD Depth of the index tree - 1 there is no index, 2 if there is one level of PMGI chunks
001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
0020: DWORD Chunk number of first PMGL (listing) chunk
0024: DWORD Chunk number of last PMGL (listing) chunk
0028: DWORD -1 (unknown)
002C: DWORD Number of directory chunks (total)
0030: DWORD Windows language ID
0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
0044: DWORD $54 (This is the length again)
0048: DWORD -1 (unknown)
004C: DWORD -1 (unknown)
0050: DWORD -1 (unknown)
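The field offsets listed for the ITSP directory header can be read with a little-endian ByteBuffer. The following is only a minimal sketch under stated assumptions: ItspHeaderSketch and its field names are hypothetical and are not the Tika ChmItspHeader API, and little-endian DWORD order is assumed because it is not stated in the entry above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical illustration only; not the Tika ChmItspHeader class. */
final class ItspHeaderSketch {
    /** Last listed field is a DWORD at 0x50, so the header spans 0x54 bytes. */
    static final int HEADER_LENGTH = 0x54;

    final String signature;   // 0000: char[4] 'ITSP'
    final int version;        // 0004: version number
    final int headerLength;   // 0008: length of the directory header
    final int chunkSize;      // 0010: directory chunk size
    final int indexDepth;     // 0018: depth of the index tree
    final int rootIndexChunk; // 001C: root index chunk, -1 if none
    final int firstPmglChunk; // 0020: first PMGL (listing) chunk
    final int lastPmglChunk;  // 0024: last PMGL (listing) chunk
    final int chunkCount;     // 002C: number of directory chunks
    final int languageId;     // 0030: Windows language ID

    private ItspHeaderSketch(ByteBuffer buf) {
        byte[] sig = new byte[4];
        buf.get(sig); // relative read of bytes 0..3
        signature = new String(sig, StandardCharsets.US_ASCII);
        // Absolute reads; they use the byte order set on the buffer.
        version = buf.getInt(0x04);
        headerLength = buf.getInt(0x08);
        chunkSize = buf.getInt(0x10);
        indexDepth = buf.getInt(0x18);
        rootIndexChunk = buf.getInt(0x1C);
        firstPmglChunk = buf.getInt(0x20);
        lastPmglChunk = buf.getInt(0x24);
        chunkCount = buf.getInt(0x2C);
        languageId = buf.getInt(0x30);
    }

    /** Parses the header from the start of the given array. */
    static ItspHeaderSketch parse(byte[] data) {
        if (data.length < HEADER_LENGTH) {
            throw new IllegalArgumentException("ITSP header needs at least 0x54 bytes");
        }
        // Assumption: DWORDs are stored little-endian.
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        ItspHeaderSketch header = new ItspHeaderSketch(buf);
        if (!"ITSP".equals(header.signature)) {
            throw new IllegalArgumentException("missing ITSP signature");
        }
        return header;
    }
}
```

A caller would pass the bytes starting at the directory header and then check fields such as rootIndexChunk against -1 before looking for PMGI chunks.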
ChmItspHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItspHeader
ChmLzxBlock - Class in org.apache.tika.parser.chm.lzx
Decompresses a chm block.
ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxBlock
ChmLzxcControlData - Class in org.apache.tika.parser.chm.accessor
::DataSpace/Storage//ControlData This file contains $20 bytes of information on the compression.
ChmLzxcControlData() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
ChmLzxcResetTable - Class in org.apache.tika.parser.chm.accessor
LZXC reset table For ensuring a decompression.
ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
ChmLzxState - Class in org.apache.tika.parser.chm.lzx
ChmLzxState(int) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxState
ChmParser - Class in org.apache.tika.parser.chm
ChmParser() - Constructor for class org.apache.tika.parser.chm.ChmParser
ChmParsingException - Exception in org.apache.tika.parser.chm.exception
ChmParsingException(String) - Constructor for exception org.apache.tika.parser.chm.exception.ChmParsingException
ChmPmgiHeader - Class in org.apache.tika.parser.chm.accessor
Description (note: does not always exist). An index chunk has the following format:
0000: char[4] 'PMGI'
0004: DWORD Length of quickref/free area at end of directory chunk
0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL. The format of a directory index entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: directory listing chunk which starts with name
Encoded Integers aka ENCINT: An ENCINT is a variable-length integer.
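The ChmPmgiHeader entry above only says that an ENCINT is a variable-length integer. A minimal decoding sketch follows, assuming the encoding commonly described for CHM files: seven payload bits per byte, most significant group first, with the high bit set on every byte except the last. EncIntSketch and Decoded are hypothetical names and not part of the Tika API.

```java
/** Hypothetical helper; not part of the Tika API. */
final class EncIntSketch {

    /** Decoded value plus the number of bytes it occupied. */
    static final class Decoded {
        final long value;
        final int length;

        Decoded(long value, int length) {
            this.value = value;
            this.length = length;
        }
    }

    /**
     * Decodes one ENCINT starting at the given offset.
     * Assumption: 7 data bits per byte, big-endian group order,
     * high bit set means another byte follows.
     */
    static Decoded decode(byte[] data, int offset) {
        long value = 0;
        int pos = offset;
        while (true) {
            if (pos >= data.length) {
                throw new IllegalArgumentException("truncated ENCINT");
            }
            int b = data[pos++] & 0xFF;
            // Shift earlier groups up and append the low 7 bits of this byte.
            value = (value << 7) | (b & 0x7F);
            if ((b & 0x80) == 0) {
                return new Decoded(value, pos - offset);
            }
        }
    }
}
```

The sketch does not guard against values wider than 63 bits; a real reader would probably cap the number of continuation bytes.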
ChmPmgiHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
ChmPmglHeader - Class in org.apache.tika.parser.chm.accessor
Description There are two types of directory chunks -- index chunks, and listing chunks.
ChmPmglHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmglHeader
ChmSection - Class in org.apache.tika.parser.chm.lzx
ChmSection(byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
ChmWrapper - Class in org.apache.tika.parser.chm.core
ChmWrapper() - Constructor for class org.apache.tika.parser.chm.core.ChmWrapper
CITY - Static variable in interface org.apache.tika.metadata.IPTC
Name of the city the content is focussing on -- either the place shown in visual media or referenced by text or audio media.
CITY - Static variable in interface org.apache.tika.metadata.Photoshop
CJKBigramAwareLengthFilterFactory - Class in org.apache.tika.eval.tokens
Creates a very narrowly focused TokenFilter that limits tokens based on length _unless_ they've been identified as <DOUBLE> or <SINGLE> by the CJKBigramFilter.
CJKBigramAwareLengthFilterFactory(Map<String, String>) - Constructor for class org.apache.tika.eval.tokens.CJKBigramAwareLengthFilterFactory
ClassLoaderUtil - Class in org.apache.tika.util
ClassLoaderUtil() - Constructor for class org.apache.tika.util.ClassLoaderUtil
className - Variable in class org.apache.tika.server.resource.TikaWelcome.Endpoint
ClassParser - Class in org.apache.tika.parser.asm
Parser for Java .class files.
ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
clean(String) - Static method in class org.apache.tika.sax.CleanPhoneText
clean(String) - Static method in class org.apache.tika.utils.CharsetUtils
Handle various common charset name errors, and return something that will be considered valid (and is normalized)
CleanPhoneText - Class in org.apache.tika.sax
Class to help de-obfuscate phone numbers in text.
CleanPhoneText() - Constructor for class org.apache.tika.sax.CleanPhoneText
cleanSubstitutions - Static variable in class org.apache.tika.sax.CleanPhoneText
clear(String) - Method in class org.apache.tika.eval.tokens.TokenCounter
clearProfiles() - Static method in class org.apache.tika.language.LanguageIdentifier
Clears the current map of language profiles
ClimateForcast - Interface in org.apache.tika.metadata
Met keys from NCAR CCSM files in the Climate Forecast Convention.
clone() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
cloneMetadata(Metadata) - Static method in class org.apache.tika.utils.ParserUtils
Does a deep clone of a Metadata object.
close(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
close() - Method in class org.apache.tika.eval.db.DBBuffer
close() - Method in class org.apache.tika.eval.db.MimeBuffer
close() - Method in class
close() - Method in interface
close() - Method in class org.apache.tika.eval.tokens.CommonTokenCountManager
close() - Method in class org.apache.tika.fork.ForkParser
close() - Method in class
Replaces the underlying input stream with a ClosedInputStream sentinel.
close() - Method in class
close() - Method in class
Close this input stream - resets the internal state to the initial values.
close() - Method in class
Invokes the delegate's close() method.
close() - Method in class
Closes all tracked resources.
close() - Method in class
close() - Method in class org.apache.tika.language.detect.LanguageWriter
close() - Method in class org.apache.tika.language.ProfilingWriter
close() - Method in class org.apache.tika.metadata.serialization.JsonStreamingSerializer
close() - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
close() - Method in class org.apache.tika.parser.ParsingReader
Closes the read end of the pipe.
close() - Method in class org.apache.tika.utils.RereadableInputStream
Closes the input stream and removes the temporary file if one was created.
ClosedInputStream - Class in
Closed input stream.
ClosedInputStream() - Constructor for class
closeQuietly(Reader) - Static method in class
Unconditionally close a Reader.
closeQuietly(Channel) - Static method in class
Unconditionally close a Channel.
closeQuietly(Writer) - Static method in class
Unconditionally close a Writer.
closeQuietly(InputStream) - Static method in class
Unconditionally close an InputStream.
closeQuietly(OutputStream) - Static method in class
Unconditionally close an OutputStream.
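The closeQuietly overloads above all share one idea: close the resource and ignore both a null argument and any IOException. A minimal sketch of that pattern, assuming a single Closeable-based helper (QuietCloseSketch is a hypothetical name, not the class listed in this index):

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper illustrating the "close quietly" pattern. */
final class QuietCloseSketch {

    /** Closes the resource if it is non-null and swallows any IOException. */
    static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return; // nothing to close
        }
        try {
            resource.close();
        } catch (IOException ignored) {
            // Deliberately ignored: the caller asked for an unconditional close.
        }
    }
}
```

In new code, try-with-resources is usually preferable because it does not hide failures on close.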
CloseShieldInputStream - Class in
Proxy stream that prevents the underlying input stream from being closed.
CloseShieldInputStream(InputStream) - Constructor for class
Creates a proxy that shields the given input stream from being closed.
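A close-shielding proxy can be sketched with FilterInputStream: close() swaps the delegate for an always-empty stream instead of closing it. This is only an illustration; CloseShieldSketch is a hypothetical name, and an empty ByteArrayInputStream stands in here for the ClosedInputStream sentinel mentioned in the close() entry above.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.InputStream;

/** Hypothetical illustration of a close-shielding proxy stream. */
final class CloseShieldSketch extends FilterInputStream {

    CloseShieldSketch(InputStream in) {
        super(in);
    }

    /**
     * Does not close the wrapped stream. The delegate is replaced with an
     * empty stream, so later reads through this proxy return -1 (end of stream).
     */
    @Override
    public void close() {
        in = new ByteArrayInputStream(new byte[0]);
    }
}
```

This is useful when a stream is handed to code that closes whatever it receives, while the owner still needs to keep reading from the original stream afterwards.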
closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class
Closes all formatting tags.
closeWriter() - Method in class org.apache.tika.eval.AbstractProfiler
ColInfo - Class in org.apache.tika.eval.db
ColInfo(Cols, int) - Constructor for class org.apache.tika.eval.db.ColInfo
ColInfo(Cols, int, String) - Constructor for class org.apache.tika.eval.db.ColInfo
ColInfo(Cols, int, Integer) - Constructor for class org.apache.tika.eval.db.ColInfo
ColInfo(Cols, int, Integer, String) - Constructor for class org.apache.tika.eval.db.ColInfo
COLOR_MODE - Static variable in interface org.apache.tika.metadata.Photoshop
Cols - Enum in org.apache.tika.eval.db
COLUMN_COUNT - Static variable in interface org.apache.tika.metadata.Database
COLUMN_NAME - Static variable in interface org.apache.tika.metadata.Database
COMMAND_LINE - Static variable in interface org.apache.tika.metadata.ClimateForcast
COMMAND_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
CommandLineParserBuilder - Class in
Reads configurable options from a config file and returns an org.apache.commons.cli.Options object to be used in a command-line parser.
CommandLineParserBuilder() - Constructor for class
COMMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
COMMENT_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
COMMENTS - Static variable in interface org.apache.tika.metadata.MSOffice
COMMENTS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
COMMENTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
CommonsDigester - Class in org.apache.tika.parser.utils
Implementation of DigestingParser.Digester that relies on commons.codec.digest.DigestUtils to calculate digest hashes.
CommonsDigester(int, String) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
Include a string representing the comma-separated algorithms to run: e.g.
CommonsDigester(int, CommonsDigester.DigestAlgorithm...) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
CommonsDigester.DigestAlgorithm - Enum in org.apache.tika.parser.utils
CommonTokenCountManager - Class in org.apache.tika.eval.tokens
CommonTokenCountManager() - Constructor for class org.apache.tika.eval.tokens.CommonTokenCountManager
CommonTokenCountManager(Path, String) - Constructor for class org.apache.tika.eval.tokens.CommonTokenCountManager
CommonTokenOverlapCounter - Class in
CommonTokenOverlapCounter() - Constructor for class
CommonTokenResult - Class in org.apache.tika.eval.tokens
CommonTokenResult(String, int, int, int, int) - Constructor for class org.apache.tika.eval.tokens.CommonTokenResult
CommonTokens - Class in org.apache.tika.eval.textstats
CommonTokens() - Constructor for class org.apache.tika.eval.textstats.CommonTokens
CommonTokens(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.textstats.CommonTokens
CommonTokensBhattacharyya - Class in org.apache.tika.eval.textstats
CommonTokensBhattacharyya(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.textstats.CommonTokensBhattacharyya
CommonTokensCosine - Class in org.apache.tika.eval.textstats
CommonTokensCosine(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.textstats.CommonTokensCosine
CommonTokensHellinger - Class in org.apache.tika.eval.textstats
CommonTokensHellinger(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.textstats.CommonTokensHellinger
CommonTokensKLDivergence - Class in org.apache.tika.eval.textstats
CommonTokensKLDivergence(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.textstats.CommonTokensKLDivergence
CommonTokensKLDNormed - Class in org.apache.tika.eval.textstats
CommonTokensKLDNormed(CommonTokenCountManager) - Constructor for class org.apache.tika.eval.textstats.CommonTokensKLDNormed
COMP_OBJ - Static variable in class
Some other kind of embedded document, in a CompObj container within another OLE2 document
COMPANY - Static variable in interface org.apache.tika.metadata.MSOffice
COMPANY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
compare(String, String) - Method in class org.apache.tika.metadata.serialization.PrettyMetadataKeyComparator
compareFiles(EvalFilePaths, EvalFilePaths) - Method in class org.apache.tika.eval.ExtractComparer
compareTo(TokenIntPair) - Method in class org.apache.tika.eval.tokens.TokenIntPair
Descending by value, ascending by token
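The ordering "descending by value, ascending by token" can be expressed as a two-step comparison. The class below is a hypothetical stand-in written for illustration; it is not the Tika TokenIntPair source.

```java
/** Hypothetical stand-in for a token/count pair; not the Tika class. */
final class TokenCountSketch implements Comparable<TokenCountSketch> {
    final String token;
    final int value;

    TokenCountSketch(String token, int value) {
        this.token = token;
        this.value = value;
    }

    @Override
    public int compareTo(TokenCountSketch other) {
        // Descending by value: compare the other value against this one.
        int byValue = Integer.compare(other.value, this.value);
        if (byValue != 0) {
            return byValue;
        }
        // Ties are broken ascending by token.
        return this.token.compareTo(other.token);
    }
}
```

Sorting a list of such pairs places the most frequent tokens first and orders equally frequent tokens alphabetically.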
compareTo(Property) - Method in class org.apache.tika.metadata.Property
compareTo(MediaType) - Method in class org.apache.tika.mime.MediaType
compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
Sorts in descending order of confidence
compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compare to other CharsetMatch objects.
COMPARISON_CONTAINERS - Static variable in class org.apache.tika.eval.ExtractComparer
COMPILATION - Static variable in interface org.apache.tika.metadata.XMPDM
"An album created by various artists."
complete(long) - Method in class org.apache.tika.server.ServerStatus
Removes the task from the collection of currently running tasks.
COMPOSER - Static variable in interface org.apache.tika.metadata.XMPDM
"The composer's name."
composite(Property, Property[]) - Static method in class org.apache.tika.metadata.Property
Constructs a new composite property from the given primary and array of secondary properties.
CompositeDetector - Class in org.apache.tika.detect
Content type detector that combines multiple different detection mechanisms.
CompositeDetector(MediaTypeRegistry, List<Detector>, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.CompositeDetector
CompositeDetector(MediaTypeRegistry, List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
CompositeDetector(List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
CompositeDetector(Detector...) - Constructor for class org.apache.tika.detect.CompositeDetector
CompositeDigester - Class in org.apache.tika.parser.digest
CompositeDigester(DigestingParser.Digester...) - Constructor for class org.apache.tika.parser.digest.CompositeDigester
CompositeEncodingDetector - Class in org.apache.tika.detect
CompositeEncodingDetector(List<EncodingDetector>, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
CompositeEncodingDetector(List<EncodingDetector>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
CompositeExternalParser - Class in org.apache.tika.parser.external
A Composite Parser that wraps up all the available External Parsers, and provides an easy way to access them.
CompositeExternalParser() - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
CompositeExternalParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
CompositeMatcher - Class in org.apache.tika.sax.xpath
Composite XPath evaluation state.
CompositeMatcher(Matcher, Matcher) - Constructor for class org.apache.tika.sax.xpath.CompositeMatcher
CompositeParser - Class in org.apache.tika.parser
Composite parser that delegates parsing tasks to a component parser based on the declared content type of the incoming document.
CompositeParser(MediaTypeRegistry, List<Parser>, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.CompositeParser
CompositeParser(MediaTypeRegistry, List<Parser>) - Constructor for class org.apache.tika.parser.CompositeParser
CompositeParser(MediaTypeRegistry, Parser...) - Constructor for class org.apache.tika.parser.CompositeParser
CompositeParser() - Constructor for class org.apache.tika.parser.CompositeParser
CompositeTagHandler - Class in org.apache.tika.parser.mp3
Takes an array of ID3Tags in preference order, and when asked for a given tag, will return it from the first ID3Tags that has it.
CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
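The lookup rule described for CompositeTagHandler (return the value from the first source, in preference order, that has it) can be sketched with plain maps. FirstMatchSketch and its use of Map<String, String> in place of ID3Tags are assumptions made for illustration only.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of a "first source that has it wins" lookup. */
final class FirstMatchSketch {

    /**
     * Returns the value for the key from the first source that contains it,
     * or null when no source has a value.
     */
    static String firstValue(List<Map<String, String>> sourcesInPreferenceOrder, String key) {
        for (Map<String, String> source : sourcesInPreferenceOrder) {
            String value = source.get(key);
            if (value != null) {
                return value; // earlier sources take precedence
            }
        }
        return null;
    }
}
```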
CompositeTextStatsCalculator - Class in org.apache.tika.eval.textstats
CompositeTextStatsCalculator(List<TextStatsCalculator>) - Constructor for class org.apache.tika.eval.textstats.CompositeTextStatsCalculator
CompositeTextStatsCalculator(List<TextStatsCalculator>, Analyzer, LanguageIDWrapper) - Constructor for class org.apache.tika.eval.textstats.CompositeTextStatsCalculator
CompressorParser - Class in org.apache.tika.parser.pkg
Parser for various compression formats.
CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
CompressorParserOptions - Interface in org.apache.tika.parser.pkg
Interface for setting options for the CompressorParser by passing via the ParseContext.
ConcurrentUtils - Class in org.apache.tika.utils
Utility Class for Concurrency in Tika
ConcurrentUtils() - Constructor for class org.apache.tika.utils.ConcurrentUtils
confidence - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Confidence score
config - Variable in class
ConfigurableThreadPoolExecutor - Interface in org.apache.tika.concurrent
Allows Thread Pool to be Configurable.
configure(ParseContext) - Method in class
Checks to see if the user has specified an OfficeParserConfig.
configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Configures the given pdf2XHTML.
configureExtractor(POIXMLTextExtractor, Locale) - Method in class
configureExtractor(POIXMLTextExtractor, Locale) - Method in class
consume(String) - Method in interface org.apache.tika.parser.external.ExternalParser.LineConsumer
Consume a line
ConsumersManager - Class in org.apache.tika.batch
Simple interface around a collection of consumers that allows for initializing and shutting shared resources (e.g.
ConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.ConsumersManager
CONTACT - Static variable in interface org.apache.tika.metadata.ClimateForcast
CONTACT_INFO_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
The contact information address part.
CONTACT_INFO_CITY - Static variable in interface org.apache.tika.metadata.IPTC
The contact information city part.
CONTACT_INFO_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
The contact information country part.
CONTACT_INFO_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
The contact information email address part.
CONTACT_INFO_PHONE - Static variable in interface org.apache.tika.metadata.IPTC
The contact information phone number part.
CONTACT_INFO_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
The contact information part denoting the local postal code.
CONTACT_INFO_STATE_PROVINCE - Static variable in interface org.apache.tika.metadata.IPTC
The contact information part denoting regional information such as state or province.
CONTACT_INFO_WEB_URL - Static variable in interface org.apache.tika.metadata.IPTC
The contact information web address part.
CONTAINER_EXCEPTION - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
CONTAINER_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
ContainerExtractor - Interface in org.apache.tika.extractor
Tika container extractor interface.
contains(String) - Method in class org.apache.tika.eval.tokens.LangModel
contains(String, String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
Check whether this CachedTranslator's cache contains a translation of the text from the source language to the target language.
contains(String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
Check whether this CachedTranslator's cache contains a translation of the text to the target language, attempting to auto-detect the source language.
contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
containsColumn(Cols) - Method in class org.apache.tika.eval.db.TableInfo
containsEmail(String) - Static method in class org.apache.tika.parser.mail.MailUtil
Checks whether the chunk looks like it contains an email.
containsTable(String) - Method in class org.apache.tika.eval.db.JDBCUtil
CONTENT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CONTENT_COMPARISONS - Static variable in class org.apache.tika.eval.ExtractComparer
CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.MSOffice
CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
The status of the content.
CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
CONTENT_TYPE_HINT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
This is currently used to identify Content-Type that may be included within a document, such as in html documents (e.g.
CONTENT_TYPE_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
contentEquals(InputStream, InputStream) - Static method in class
Compare the contents of two Streams to determine if they are equal or not.
contentEquals(Reader, Reader) - Static method in class
Compare the contents of two Readers to determine if they are equal or not.
ContentHandlerDecorator - Class in org.apache.tika.sax
Decorator base class for the ContentHandler interface.
ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
Creates a decorator for the given SAX event handler.
ContentHandlerDecorator() - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
Creates a decorator that by default forwards incoming SAX events to a dummy content handler that simply ignores all the events.
ContentHandlerExample - Class in org.apache.tika.example
Examples of using different Content Handlers to get different parts of the file's contents
ContentHandlerExample() - Constructor for class org.apache.tika.example.ContentHandlerExample
ContentHandlerFactory - Interface in org.apache.tika.sax
Interface to allow easier injection of code for getting a new ContentHandler
ContentLengthCalculator - Class in org.apache.tika.eval.textstats
ContentLengthCalculator() - Constructor for class org.apache.tika.eval.textstats.ContentLengthCalculator
CONTENTS_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
CONTENTS_TABLE_A - Static variable in class org.apache.tika.eval.ExtractComparer
CONTENTS_TABLE_B - Static variable in class org.apache.tika.eval.ExtractComparer
ContentTagParser - Class in org.apache.tika.eval.util
ContentTagParser() - Constructor for class org.apache.tika.eval.util.ContentTagParser
ContentTags - Class in org.apache.tika.eval.util
ContentTags(String) - Constructor for class org.apache.tika.eval.util.ContentTags
ContentTags(String, boolean) - Constructor for class org.apache.tika.eval.util.ContentTags
ContentTags(String, Map<String, Integer>) - Constructor for class org.apache.tika.eval.util.ContentTags
ContrastStatistics - Class in org.apache.tika.eval.tokens
ContrastStatistics() - Constructor for class org.apache.tika.eval.tokens.ContrastStatistics
CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity responsible for making contributions to the content of the resource.
CONTRIBUTOR - Static variable in class org.apache.tika.metadata.Metadata
use TikaCoreProperties#CONTRIBUTOR
CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
CONTROL_DATA - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
CONTROLLED_VOCABULARY_TERM - Static variable in interface org.apache.tika.metadata.IPTC
A term to describe the content of the image by a value from a Controlled Vocabulary.
CONVENTIONS - Static variable in interface org.apache.tika.metadata.ClimateForcast
convert(Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
How a standalone converter might work
convert(Metadata) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
convert(Metadata, String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
Convert the given Tika metadata map to XMP object.
convertAndSet(Metadata, Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
How convert+set might work
converttoInt(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Converts JSON Object to JSON Array
convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Parses a JSON String and converts it to a JSON Object
copy(InputStream, OutputStream) - Static method in class
Copy bytes from an InputStream to an OutputStream.
copy(InputStream, Writer) - Static method in class
Copy bytes from an InputStream to chars on a Writer using the default character encoding of the platform.
copy(InputStream, Writer, String) - Static method in class
Copy bytes from an InputStream to chars on a Writer using the specified character encoding.
copy(Reader, Writer) - Static method in class
Copy chars from a Reader to a Writer.
copy(Reader, OutputStream) - Static method in class
Copy chars from a Reader to bytes on an OutputStream using the default character encoding of the platform, and calling flush.
copy(Reader, OutputStream, String) - Static method in class
Copy chars from a Reader to bytes on an OutputStream using the specified character encoding, and calling flush.
copyLarge(InputStream, OutputStream) - Static method in class
Copy bytes from a large (over 2GB) InputStream to an OutputStream.
copyLarge(Reader, Writer) - Static method in class
Copy chars from a large (over 2GB) Reader to a Writer.
copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
COPYRIGHT - Static variable in interface org.apache.tika.metadata.XMPDM
"The copyright information."
COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
Contains any necessary copyright notice for claiming the intellectual property for this item and should identify the current owner of the copyright for the item.
COPYRIGHT_OWNER - Static variable in interface org.apache.tika.metadata.IPTC
Owner or owners of the copyright in the licensed image.
COPYRIGHT_OWNER_ID - Static variable in interface org.apache.tika.metadata.IPTC
The ID of the owner or owners of the copyright in the licensed image.
COPYRIGHT_OWNER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
COPYRIGHT_OWNER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
The name of the owner or owners of the copyright in the licensed image.
CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
This class offers an implementation of NERecogniser based on CRF classifiers from Stanford CoreNLP.
CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Creates a NERecogniser by loading the model from the given path.
CorruptedFileException - Exception in org.apache.tika.exception
This exception should be thrown when the parse absolutely, positively has to stop.
CorruptedFileException(String) - Constructor for exception org.apache.tika.exception.CorruptedFileException
CorruptedFileException(String, Throwable) - Constructor for exception org.apache.tika.exception.CorruptedFileException
count() - Method in class org.apache.tika.detect.TextStatistics
Returns the total number of bytes seen so far.
count(int) - Method in class org.apache.tika.detect.TextStatistics
Returns the number of occurrences of the given byte.
countControl() - Method in class org.apache.tika.detect.TextStatistics
Counts control characters (i.e.
countEightBit() - Method in class org.apache.tika.detect.TextStatistics
Counts eight bit characters, i.e.
CountingInputStream - Class in
A decorating input stream that counts the number of bytes that have passed through the stream so far.
CountingInputStream(InputStream) - Constructor for class
Constructs a new CountingInputStream.
COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
Full name of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
COUNTRY - Static variable in interface org.apache.tika.metadata.Photoshop
COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
Code of the country the content is focussing on -- either the country shown in visual media or referenced in text or audio media.
countSafeAscii() - Method in class org.apache.tika.detect.TextStatistics
Counts "safe" (i.e.
countTokenOverlaps(String, Map<String, MutableInt>) - Method in class org.apache.tika.eval.tokens.CommonTokenCountManager
COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
The extent or scope of the content of the resource.
COVERAGE - Static variable in class org.apache.tika.metadata.Metadata
use TikaCoreProperties#COVERAGE
COVERAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
create(TokenStream) - Method in class org.apache.tika.eval.tokens.AlphaIdeographFilterFactory
create(TokenStream) - Method in class org.apache.tika.eval.tokens.CJKBigramAwareLengthFilterFactory
create(TokenStream) - Method in class org.apache.tika.eval.tokens.URLEmailNormalizingFilterFactory
create(String, InputStream, String) - Static method in class org.apache.tika.language.LanguageProfilerBuilder
Creates a new language profile from a text file (preferably quite large: 5-10k lines).
create() - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates an empty instance; same as calling new MimeTypes().
create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified document.
create(InputStream...) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified input stream.
create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
create(URL...) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the resource at the location specified by the URL.
create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified file path, as interpreted by the class loader in getResource().
create(String, String) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance.
create(String, String, ClassLoader) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance.
create() - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
create(ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
create(String, ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
create(URL...) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
CREATE_DATE - Static variable in interface org.apache.tika.metadata.XMP
The date and time the resource was created.
createArrayProperty(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
createArrayProperty(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates an array property from a list of values.
createCommaSeparatedArray(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
createCommaSeparatedArray(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates an array property from a comma separated list.
CREATED - Static variable in interface org.apache.tika.metadata.DublinCore
Date of creation of the resource.
CREATED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
createDecryptStream(InputStream, Key) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the next ID3v2 Frame in the file, or null if the next batch of data doesn't correspond to an ID3v2 header.
createLangAltProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
createLangAltProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates a language alternative property in the x-default language
createOneNoteDocumentFromDirectFileResource(OneNoteDirectFileResource) - Method in class
Create a OneNoteDocument object.
createParser() - Static method in class org.apache.tika.server.resource.TikaResource
createProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
createProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
Creates a simple property.
createTables(List<TableInfo>, JDBCUtil.CREATE_TABLE) - Method in class org.apache.tika.eval.db.JDBCUtil
createTempFile() - Method in class
Creates a temporary file that will automatically be deleted when the TemporaryResources.close() method is called, returning its path.
createTemporaryFile() - Method in class
Creates and returns a temporary file that will automatically be deleted when the TemporaryResources.close() method is called.
CREATION_DATE - Static variable in interface org.apache.tika.metadata.MSOffice
CREATION_DATE - Static variable in interface org.apache.tika.metadata.Office
When was the document created?
CreativeCommons - Interface in org.apache.tika.metadata
A collection of Creative Commons properties names.
CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity primarily responsible for making the content of the resource.
CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
Contains the name of the person who created the content of this item, a photographer for photos, a graphic artist for graphics, or a writer for textual news, but in cases where the photographer should not be identified the name of a company or organisation may be appropriate.
CREATOR - Static variable in class org.apache.tika.metadata.Metadata
use TikaCoreProperties#CREATOR
CREATOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.XMP
The name of the first known tool used to create the resource.
CREATORS_CONTACT_INFO - Static variable in interface org.apache.tika.metadata.IPTC
The creator's contact information provides all necessary information to get in contact with the creator of this item and comprises a set of sub-properties for proper addressing.
CREATORS_JOB_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
Contains the job title of the person who created the content of this item.
CREDIT - Static variable in interface org.apache.tika.metadata.Photoshop
CREDIT_LINE - Static variable in interface org.apache.tika.metadata.IPTC
The credit to person(s) and/or organisation(s) required by the supplier of the item to be used when published.
CryptoParser - Class in org.apache.tika.parser
Decrypts the incoming document stream and delegates further parsing to another parser instance.
CryptoParser(String, Provider, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
CryptoParser(String, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
CSVMessageBodyWriter - Class in org.apache.tika.server.writer
CSVMessageBodyWriter() - Constructor for class org.apache.tika.server.writer.CSVMessageBodyWriter
CSVParams - Class in org.apache.tika.parser.csv
CSVResult - Class in org.apache.tika.parser.csv
CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
CTAKESAnnotationProperty - Enum in org.apache.tika.parser.ctakes
This enumeration includes the properties that an IdentifiedAnnotation object can provide.
CTAKESConfig - Class in org.apache.tika.parser.ctakes
Configuration for CTAKESContentHandler.
CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
Default constructor.
CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
Loads properties from InputStream and then tries to close InputStream.
CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
Class used to extract biomedical information while parsing.
CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Creates a new CTAKESContentHandler for the given ContentHandler and Metadata objects.
CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
Default constructor.
CTAKESParser - Class in org.apache.tika.parser.ctakes
CTAKESParser decorates a Parser and leverages on CTAKESContentHandler to extract biomedical information from clinical text using Apache cTAKES.
CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the default Parser
CTAKESParser(TikaConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the default Parser for this Config
CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
Wraps the specified Parser
CTAKESSerializer - Enum in org.apache.tika.parser.ctakes
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
CTAKESUtils - Class in org.apache.tika.parser.ctakes
This class provides methods to extract biomedical information from plain text using CTAKESContentHandler that relies on Apache cTAKES.
CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
CUSTOM_MIMES_SYS_PROP - Static variable in class org.apache.tika.mime.MimeTypesFactory
System property to set a path to an additional external custom mimetypes XML file to be loaded.
customCompositeDetector() - Static method in class org.apache.tika.example.CustomMimeInfo
CustomMimeInfo - Class in org.apache.tika.example
CustomMimeInfo() - Constructor for class org.apache.tika.example.CustomMimeInfo
customMimeInfo() - Static method in class org.apache.tika.example.CustomMimeInfo


data - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
Database - Interface in org.apache.tika.metadata
databaseExists(Path) - Static method in class org.apache.tika.eval.db.H2Util
DataURIScheme - Class in org.apache.tika.parser.utils
DataURISchemeParseException - Exception in org.apache.tika.parser.utils
DataURISchemeParseException(String) - Constructor for exception org.apache.tika.parser.utils.DataURISchemeParseException
DataURISchemeUtil - Class in org.apache.tika.parser.utils
Not thread safe.
DataURISchemeUtil() - Constructor for class org.apache.tika.parser.utils.DataURISchemeUtil
DATE - Static variable in interface org.apache.tika.metadata.DublinCore
A date associated with an event in the life cycle of the resource.
DATE - Static variable in class org.apache.tika.metadata.Metadata
use TikaCoreProperties#CREATED
DATE - Static variable in interface org.apache.tika.parser.ner.NERecogniser
DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
Designates the date and optionally the time the intellectual content was created rather than the date of the creation of the physical representation.
DATE_CREATED - Static variable in interface org.apache.tika.metadata.Photoshop
DATE_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
DateUtils - Class in org.apache.tika.utils
Date related utility methods and constants
DateUtils() - Constructor for class org.apache.tika.utils.DateUtils
DBBuffer - Class in org.apache.tika.eval.db
DBBuffer(Connection, String, String, String) - Constructor for class org.apache.tika.eval.db.DBBuffer
DBConsumersManager - Class in org.apache.tika.eval.batch
DBConsumersManager(JDBCUtil, MimeBuffer, List<FileResourceConsumer>) - Constructor for class org.apache.tika.eval.batch.DBConsumersManager
DBFParser - Class in org.apache.tika.parser.dbf
This is a Tika wrapper around the DBFReader.
DBFParser() - Constructor for class org.apache.tika.parser.dbf.DBFParser
DBWriter - Class in
This is still in its early stages.
DBWriter(Connection, List<TableInfo>, JDBCUtil, MimeBuffer) - Constructor for class
DcXMLParser - Class in org.apache.tika.parser.xml
Dublin Core metadata parser
DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
decode(String) - Static method in class org.apache.tika.mime.HexCoDec
Decode a hex string
decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars
decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars.
decompressConcatenated(Metadata) - Method in interface org.apache.tika.parser.pkg.CompressorParserOptions
DEF_MODEL - Static variable in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
DEFAULT - Static variable in interface org.apache.tika.config.InitializableProblemHandler
DEFAULT - Static variable in class org.apache.tika.config.ParamField
DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
DEFAULT_CHILD_STARTUP_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
Number of milliseconds to wait for the child process to start up.
DEFAULT_HOST - Static variable in class org.apache.tika.server.TikaServerCli
DEFAULT_ID - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
DEFAULT_MAX_ENTITY_EXPANSIONS - Static variable in class org.apache.tika.utils.XMLReaderUtils
DEFAULT_MAX_QUEUE_SIZE - Static variable in class
DEFAULT_MODEL_PATH - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Default model path.
DEFAULT_MODELS - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
DEFAULT_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
DEFAULT_NGRAM_LENGTH - Static variable in class org.apache.tika.language.LanguageProfile
DEFAULT_PING_PULSE_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
How often the parent should ping the child to check its status.
DEFAULT_PING_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
If the child doesn't receive a ping or the parent doesn't hear back from a ping in this amount of time, kill and restart the child.
DEFAULT_POOL_SIZE - Static variable in class org.apache.tika.utils.XMLReaderUtils
Default size for the pool of SAX Parsers and the pool of DOM builders
DEFAULT_PORT - Static variable in class org.apache.tika.server.TikaServerCli
DEFAULT_SECRET - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
DEFAULT_TASK_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
Number of milliseconds to wait per server task (parse, detect, unpack, translate, etc.) before timing out and shutting down the child process.
DefaultContentHandlerFactoryBuilder - Class in
Builds BasicContentHandler with type defined by attribute "basicHandlerType" with possible values: xml, html, text, body, ignore.
DefaultContentHandlerFactoryBuilder() - Constructor for class
DefaultDetector - Class in org.apache.tika.detect
A composite detector based on all the Detector implementations available through the service provider mechanism.
DefaultDetector(MimeTypes, ServiceLoader, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.DefaultDetector
DefaultDetector(MimeTypes, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
DefaultDetector(MimeTypes, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
DefaultDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
DefaultDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultDetector
DefaultDetector() - Constructor for class org.apache.tika.detect.DefaultDetector
DefaultEncodingDetector - Class in org.apache.tika.detect
A composite encoding detector based on all the EncodingDetector implementations available through the service provider mechanism.
DefaultEncodingDetector() - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
DefaultEncodingDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
DefaultEncodingDetector(ServiceLoader, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
DefaultHtmlMapper - Class in org.apache.tika.parser.html
The default HTML mapping rules in Tika.
DefaultHtmlMapper() - Constructor for class org.apache.tika.parser.html.DefaultHtmlMapper
DefaultInputStreamFactory - Class in org.apache.tika.server
Passthrough -- returns InputStream as is
DefaultInputStreamFactory() - Constructor for class org.apache.tika.server.DefaultInputStreamFactory
DefaultParser - Class in org.apache.tika.parser
A composite parser based on all the Parser implementations available through the service provider mechanism.
DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser(MediaTypeRegistry, ServiceLoader, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser(MediaTypeRegistry, ServiceLoader) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser(MediaTypeRegistry, ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser(ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.DefaultParser
DefaultParser() - Constructor for class org.apache.tika.parser.DefaultParser
DefaultProbDetector - Class in org.apache.tika.detect
A version of DefaultDetector for probabilistic mime detectors, which use statistical techniques to blend the results of differing underlying detectors when attempting to detect the type of a given file.
DefaultProbDetector(ProbabilisticMimeDetectionSelector, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
DefaultProbDetector(ProbabilisticMimeDetectionSelector, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
DefaultProbDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
DefaultProbDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultProbDetector
DefaultProbDetector() - Constructor for class org.apache.tika.detect.DefaultProbDetector
DefaultTranslator - Class in org.apache.tika.language.translate
A translator which picks the first Translator implementation available through the service provider mechanism.
DefaultTranslator(ServiceLoader) - Constructor for class org.apache.tika.language.translate.DefaultTranslator
DefaultTranslator() - Constructor for class org.apache.tika.language.translate.DefaultTranslator
DelegatingParser - Class in org.apache.tika.parser
Base class for parser implementations that want to delegate parts of the task of parsing an input document to another parser.
DelegatingParser() - Constructor for class org.apache.tika.parser.DelegatingParser
deleteNamespace(String) - Static method in class org.apache.tika.xmp.XMPMetadata
Deletes a namespace from the registry.
DELIMITER_PROPERTY - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
DERIVED_FROM_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
Document id for the document that this document was derived from
DERIVED_FROM_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
Instance id for the document instance that this document was derived from
descend(String, String) - Method in class org.apache.tika.sax.xpath.ChildMatcher
descend(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
descend(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
Returns the XPath evaluation state that results from descending to a child element with the given name.
descend(String, String) - Method in class org.apache.tika.sax.xpath.NamedElementMatcher
descend(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
describeMediaType() - Static method in class org.apache.tika.example.MediaTypeExample
DescribeMetadata - Class in org.apache.tika.example
Print the supported Tika Metadata models and their fields.
DescribeMetadata() - Constructor for class org.apache.tika.example.DescribeMetadata
DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
An account of the content of the resource.
DESCRIPTION - Static variable in interface org.apache.tika.metadata.IPTC
A textual description, including captions, of the item's content, particularly used where the object is not text.
DESCRIPTION - Static variable in class org.apache.tika.metadata.Metadata
use TikaCoreProperties#DESCRIPTION
DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
DESCRIPTION_WRITER - Static variable in interface org.apache.tika.metadata.IPTC
Identifier or the name of the person involved in writing, editing or correcting the description of the content.
deserialize(JsonElement, Type, JsonDeserializationContext) - Method in class org.apache.tika.metadata.serialization.JsonMetadataDeserializer
Deserializes a JSON object (equivalent to: Map) into a Metadata object.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeEncodingDetector
detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.Detector
Detects the content type of the given input document.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.EmptyDetector
detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.EncodingDetector
Detects the character encoding of the given text document, or returns null if the encoding of the document cannot be detected.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.MagicDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NameDetector
Detects the content type of an input document based on the document name given in the input metadata.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.OverrideDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TextDetector
Looks at the beginning of the document input stream to determine whether the document is text or not.
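A minimal sketch of the prefix-sniffing idea described for TextDetector, using only the JDK. It is not Tika's implementation: the class name TextSniffSketch, the 512-byte prefix, the set of tolerated control characters, and the treatment of an empty stream are all assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of Tika.
class TextSniffSketch {
    // Assumed prefix size; the index entry does not say how much is read.
    static final int PREFIX_LENGTH = 512;

    // True if the prefix is non-empty and contains no control bytes other
    // than tab, line feed, form feed, carriage return and escape.
    static boolean looksLikeText(byte[] prefix, int length) {
        if (length == 0) {
            return false; // sketch choice: an empty prefix is not called text
        }
        for (int i = 0; i < length; i++) {
            int b = prefix[i] & 0xFF;
            boolean allowedControl =
                    b == '\t' || b == '\n' || b == '\f' || b == '\r' || b == 0x1B;
            if (b < 0x20 && !allowedControl) {
                return false;
            }
        }
        return true;
    }

    // Reads up to PREFIX_LENGTH bytes, then rewinds the stream so that a
    // later consumer still sees the document from its first byte.
    static String sniff(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(PREFIX_LENGTH);
        try {
            byte[] buf = new byte[PREFIX_LENGTH];
            int total = 0;
            int n;
            while (total < buf.length
                    && (n = in.read(buf, total, buf.length - total)) != -1) {
                total += n;
            }
            in.reset();
            return looksLikeText(buf, total) ? "text/plain" : "application/octet-stream";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the stream is marked before reading and reset afterwards, the sketch can be called ahead of a parser without consuming any input.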
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TrainedModelDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TypeDetector
Detects the content type of an input document based on a type hint given in the input metadata.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.ZeroSizeFileDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.example.EncryptedPrescriptionDetector
detect() - Method in class org.apache.tika.language.detect.LanguageDetector
detect(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.mime.MimeTypes
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
detect(InputStream, Metadata) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
detect(Set<String>) - Static method in class
Use POIFSContainerDetector.detect(Set, DirectoryEntry) and pass the root entry of the filesystem whose type is to be detected, as a second argument.
detect(Set<String>, DirectoryEntry) - Static method in class
Internal detection of the specific kind of OLE2 document, based on the names of the top-level streams within the file.
detect(InputStream, Metadata) - Method in class
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.StreamingZipContainerDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return the charset that best matches the supplied input data.
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
detect(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.DetectorResource
detect(InputStream) - Method in class org.apache.tika.server.resource.LanguageResource
detect(String) - Method in class org.apache.tika.server.resource.LanguageResource
detect(InputStream, Metadata) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(InputStream, String) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(InputStream) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(byte[], String) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(byte[]) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(Path) - Method in class org.apache.tika.Tika
Detects the media type of the file at the given path.
detect(File) - Method in class org.apache.tika.Tika
Detects the media type of the given file.
detect(URL) - Method in class org.apache.tika.Tika
Detects the media type of the resource at the given URL.
detect(String) - Method in class org.apache.tika.Tika
Detects the media type of a document with the given file name.
detectAll() - Method in class org.apache.tika.langdetect.Lingo24LangDetector
detectAll() - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
Detect languages based on previously submitted text (via addText calls).
detectAll() - Method in class org.apache.tika.langdetect.TextLangDetector
detectAll() - Method in class org.apache.tika.language.detect.LanguageDetector
Detect languages based on previously submitted text (via addText calls).
detectAll(String) - Method in class org.apache.tika.language.detect.LanguageDetector
Utility wrapper that detects the language of a given chunk of text.
detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return an array of all charsets that appear to be plausible matches with the input data.
detectFilename(MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.resource.TikaResource
detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
detectLanguage(String) - Method in class org.apache.tika.example.LanguageDetectorExample
detectLanguage(String) - Method in class org.apache.tika.language.translate.AbstractTranslator
detectOfficeOpenXML(OPCPackage) - Static method in class org.apache.tika.parser.pkg.ZipContainerDetector
Detects the type of an OfficeOpenXML (OOXML) file from an opened Package.
Detector - Interface in org.apache.tika.detect
Content type detector.
DetectorResource - Class in org.apache.tika.server.resource
DetectorResource(ServerStatus) - Constructor for class org.apache.tika.server.resource.DetectorResource
detectType(ZipArchiveEntry, ZipFile) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
detectType(ZipArchiveEntry, ZipArchiveInputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
detectType(InputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
detectType(POIFSFileSystem) - Static method in enum
detectType(DirectoryEntry) - Static method in enum
detectWithCustomConfig(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
detectWithCustomDetector(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
DIFContentHandler - Class in org.apache.tika.parser.dif
DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.dif.DIFContentHandler
DIFContentHandler - Class in org.apache.tika.sax
DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.DIFContentHandler
DIFParser - Class in org.apache.tika.parser.dif
DIFParser() - Constructor for class org.apache.tika.parser.dif.DIFParser
digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.CompositeDigester
digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.InputStreamDigester
digest(InputStream, Metadata, ParseContext) - Method in interface org.apache.tika.parser.DigestingParser.Digester
Digests an InputStream and sets the appropriate value(s) in the metadata.
DigestingAutoDetectParserFactory - Class in org.apache.tika.batch
DigestingAutoDetectParserFactory() - Constructor for class org.apache.tika.batch.DigestingAutoDetectParserFactory
DigestingParser - Class in org.apache.tika.parser
DigestingParser(Parser, DigestingParser.Digester) - Constructor for class org.apache.tika.parser.DigestingParser
Creates a decorator for the given parser.
DigestingParser.Digester - Interface in org.apache.tika.parser
Interface for digester.
DigestingParser.Encoder - Interface in org.apache.tika.parser
Encodes a byte array from a MessageDigest to a String.
DIGITAL_IMAGE_GUID - Static variable in interface org.apache.tika.metadata.IPTC
Globally unique identifier for the item.
DIGITAL_SOURCE_FILE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
DIGITAL_SOURCE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
The type of the source of this digital image
DirectFileReadDataSource - Class in org.apache.tika.parser.mp4
A DataSource implementation that relies on direct reads from a RandomAccessFile.
DirectFileReadDataSource(File) - Constructor for class org.apache.tika.parser.mp4.DirectFileReadDataSource
DirectoryListingEntry - Class in org.apache.tika.parser.chm.accessor
The format of a directory listing entry is as follows: BYTE: length of name; BYTEs: name (UTF-8 encoded); ENCINT: content section; ENCINT: offset; ENCINT: length. The offset is from the beginning of the content section the file is in, after the section has been decompressed (if appropriate).
DirectoryListingEntry() - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
DirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Constructs a DirectoryListingEntry.
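The entry layout described for DirectoryListingEntry can be sketched with the JDK alone. This is not Tika's parser. The class and method names are hypothetical, and the ENCINT decoding is an assumption not stated in this index: a big-endian variable-length integer carrying 7 data bits per byte, with the high bit marking continuation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical holder for one directory listing entry, not part of Tika.
class DirectoryEntrySketch {
    final String name;
    final long section;
    final long offset;
    final long length;

    DirectoryEntrySketch(String name, long section, long offset, long length) {
        this.name = name;
        this.section = section;
        this.offset = offset;
        this.length = length;
    }

    // Assumed ENCINT decoding: most significant group first, 7 data bits
    // per byte, high bit set on every byte except the last.
    static long readEncInt(ByteBuffer buf) {
        long value = 0;
        int b;
        do {
            b = buf.get() & 0xFF;
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    // Follows the field order given above: name length, name, content
    // section, offset, length.
    static DirectoryEntrySketch parse(ByteBuffer buf) {
        int nameLength = buf.get() & 0xFF;
        byte[] nameBytes = new byte[nameLength];
        buf.get(nameBytes);
        String name = new String(nameBytes, StandardCharsets.UTF_8);
        long section = readEncInt(buf);
        long offset = readEncInt(buf);
        long length = readEncInt(buf);
        return new DirectoryEntrySketch(name, section, offset, length);
    }
}
```

Parsing advances the buffer position past the entry, so calling parse repeatedly on the same ByteBuffer walks consecutive entries.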
DirListParser - Class in org.apache.tika.example
Parses the output of /bin/ls and counts the number of files and the number of executables using Tika.
DirListParser() - Constructor for class org.apache.tika.example.DirListParser
DISC_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
"The disc number for part of an album set."
DisplayMetInstance - Class in org.apache.tika.example
Grabs a PDF file from a URL and prints its Metadata
DisplayMetInstance() - Constructor for class org.apache.tika.example.DisplayMetInstance
dispose() - Method in class
Calls the TemporaryResources.close() method and wraps the potential IOException into a TikaException for convenience when used within Tika.
distance(LanguageProfile) - Method in class org.apache.tika.language.LanguageProfile
Calculates the geometric distance between this and the given other language profile.
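One way to read "geometric distance" between two profiles is the Euclidean distance between their relative n-gram frequencies. The sketch below takes that reading as an assumption; the class name ProfileDistanceSketch and the use of plain count maps are illustrative and do not reflect LanguageProfile's internals.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Tika.
class ProfileDistanceSketch {
    // Euclidean distance between two n-gram count maps, after scaling each
    // map's counts to relative frequencies.
    static double distance(Map<String, Integer> a, Map<String, Integer> b) {
        double totalA = Math.max(1, total(a)); // avoid dividing by zero
        double totalB = Math.max(1, total(b));
        Set<String> ngrams = new HashSet<>(a.keySet());
        ngrams.addAll(b.keySet());
        double sumOfSquares = 0.0;
        for (String ngram : ngrams) {
            double fa = a.getOrDefault(ngram, 0) / totalA;
            double fb = b.getOrDefault(ngram, 0) / totalB;
            sumOfSquares += (fa - fb) * (fa - fb);
        }
        return Math.sqrt(sumOfSquares);
    }

    private static long total(Map<String, Integer> profile) {
        long sum = 0;
        for (int count : profile.values()) {
            sum += count;
        }
        return sum;
    }
}
```

Scaling to relative frequencies means two texts of different lengths with the same n-gram proportions come out at distance zero.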
DL4JInceptionV3Net - Class in org.apache.tika.dl.imagerec
DL4JInceptionV3Net is an implementation of ObjectRecogniser.
DL4JInceptionV3Net() - Constructor for class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
DL4JVGG16Net - Class in org.apache.tika.dl.imagerec
DL4JVGG16Net() - Constructor for class org.apache.tika.dl.imagerec.DL4JVGG16Net
DOC - Static variable in class
Microsoft Word
DOC_INFO_CREATED - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_CREATOR - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_KEY_WORDS - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_MODIFICATION_DATE - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_PRODUCER - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_SUBJECT - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_TITLE - Static variable in interface org.apache.tika.metadata.PDF
DOC_INFO_TRAPPED - Static variable in interface org.apache.tika.metadata.PDF
DOC_SECURITY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
doClose() - Method in class
document(int, StoredFieldVisitor) - Method in class
DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
The common identifier for all versions and renditions of a resource.
DocumentSelector - Interface in org.apache.tika.extractor
Interface for different document selection strategies for purposes like embedded document extraction by a ContainerExtractor instance.
doubleByte - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
DRAW_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
drawingHyperlinks - Variable in class
dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.db.H2Util
dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.db.JDBCUtil
DublinCore - Interface in org.apache.tika.metadata
A collection of Dublin Core metadata names.
DumpTikaConfigExample - Class in org.apache.tika.example
This class shows how to dump a TikaConfig object to a configuration file.
DumpTikaConfigExample() - Constructor for class org.apache.tika.example.DumpTikaConfigExample
DURATION - Static variable in interface org.apache.tika.metadata.XMPDM
"The duration of the media file."
DurationFormatUtils - Class in org.apache.tika.util
Functionality and naming conventions (roughly) copied from org.apache.commons.lang3 so that we didn't have to add another dependency.
DurationFormatUtils() - Constructor for class org.apache.tika.util.DurationFormatUtils
DWGParser - Class in org.apache.tika.parser.dwg
DWG (CAD Drawing) parser.
DWGParser() - Constructor for class org.apache.tika.parser.dwg.DWGParser


EDIT_TIME - Static variable in interface org.apache.tika.metadata.MSOffice
Total time spent editing the document.
ELAPSED_MILLIS - Static variable in class org.apache.tika.batch.FileResourceConsumer
element(String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
Emits an XHTML element with the given text content.
ElementMappingContentHandler - Class in org.apache.tika.sax
Content handler decorator that maps element QNames using a Map.
ElementMappingContentHandler(ContentHandler, Map<QName, ElementMappingContentHandler.TargetElement>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler
ElementMappingContentHandler.TargetElement - Class in org.apache.tika.sax
ElementMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of an XPath expression that targets an element.
ElementMatcher() - Constructor for class org.apache.tika.sax.xpath.ElementMatcher
ElementMetadataHandler - Class in org.apache.tika.parser.xml
SAX event handler that maps the contents of an XML element into a metadata field.
ElementMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for string metadata keys.
ElementMetadataHandler(String, String, Metadata, String, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for string metadata keys which allows change of behavior for duplicate and empty entry values.
ElementMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for Property metadata keys.
ElementMetadataHandler(String, String, Metadata, Property, boolean, boolean) - Constructor for class org.apache.tika.parser.xml.ElementMetadataHandler
Constructor for Property metadata keys which allows change of behavior for duplicate and empty entry values.
EMAIL - Static variable in class org.apache.tika.eval.tokens.URLEmailNormalizingFilterFactory
EMB_APP_VERSION - Static variable in interface org.apache.tika.metadata.RTFMetadata
If an application and version are given as part of the embedded object, this is the literal string.
EMB_CLASS - Static variable in interface org.apache.tika.metadata.RTFMetadata
EMB_ITEM - Static variable in interface org.apache.tika.metadata.RTFMetadata
EMB_TOPIC - Static variable in interface org.apache.tika.metadata.RTFMetadata
embed(Metadata, InputStream, OutputStream, ParseContext) - Method in interface org.apache.tika.embedder.Embedder
Embeds related document metadata from the given metadata object into the given output stream.
embed(Metadata, InputStream, OutputStream, ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler.
EMBEDDED_DEPTH - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
EMBEDDED_EXCEPTION - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
EMBEDDED_EXCEPTION - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
EMBEDDED_EXCEPTION - Static variable in class org.apache.tika.utils.ParserUtils
EMBEDDED_FILE_PATH_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
EMBEDDED_FILE_PATH_TABLE_A - Static variable in class org.apache.tika.eval.ExtractComparer
EMBEDDED_FILE_PATH_TABLE_B - Static variable in class org.apache.tika.eval.ExtractComparer
EMBEDDED_PARSER - Static variable in class org.apache.tika.utils.ParserUtils
EMBEDDED_RELATIONSHIP_ID - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
EMBEDDED_RELATIONSHIPS - Static variable in class
EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
EMBEDDED_RESOURCE_LIMIT_REACHED - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
EMBEDDED_RESOURCE_PATH - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
EMBEDDED_RESOURCE_PATH - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
EMBEDDED_RESOURCE_TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
Embedded resource type property
EMBEDDED_RESOURCE_TYPE - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
EMBEDDED_RESOURCE_TYPE_KEY - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
EMBEDDED_STORAGE_CLASS_ID - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
EmbeddedContentHandler - Class in org.apache.tika.sax
Content handler decorator that prevents the EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events from reaching the decorated handler.
EmbeddedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EmbeddedContentHandler
Creates a decorator that prevents the given handler from receiving EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events.
EmbeddedDocumentExtractor - Interface in org.apache.tika.extractor
EmbeddedDocumentUtil - Class in org.apache.tika.extractor
Utility class to handle common issues with embedded documents.
EmbeddedDocumentUtil(ParseContext) - Constructor for class org.apache.tika.extractor.EmbeddedDocumentUtil
embeddedOLERef(String) - Method in class
embeddedOLERef(String) - Method in interface
embeddedPicRef(String, String) - Method in class
embeddedPicRef(String, String) - Method in interface
EmbeddedResourceHandler - Interface in org.apache.tika.extractor
Tika container extractor callback interface.
Embedder - Interface in org.apache.tika.embedder
Tika embedder interface
EMFParser - Class in
Extracts files embedded in EMF and offers a very rough capability to extract text if there is text stored in the EMF.
EMFParser() - Constructor for class
EMPTY - Static variable in class org.apache.tika.mime.MediaType
EMPTY_CONTENT_TAGS - Static variable in class org.apache.tika.eval.util.ContentTags
EMPTY_LIST - Static variable in class
Empty singleton to be used when there is no list manager.
EMPTY_MODEL - Static variable in class org.apache.tika.eval.tokens.LangModel
EMPTY_STYLES - Static variable in class
Empty singleton to be used when there is no style info
EmptyDetector - Class in org.apache.tika.detect
Dummy detector that returns application/octet-stream for all documents.
EmptyDetector() - Constructor for class org.apache.tika.detect.EmptyDetector
EmptyParser - Class in org.apache.tika.parser
Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream.
EmptyParser() - Constructor for class org.apache.tika.parser.EmptyParser
EmptyTranslator - Class in org.apache.tika.language.translate
Dummy translator that always declines to give any text.
EmptyTranslator() - Constructor for class org.apache.tika.language.translate.EmptyTranslator
enableInputFilter(boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
Enable filtering of input text.
encode(byte[]) - Static method in class org.apache.tika.mime.HexCoDec
Hex-encodes an array of bytes.
encode(byte[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Hex-encodes an array of bytes.
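For readers unfamiliar with the technique, hex encoding maps each byte to two hexadecimal digits. The JDK-only sketch below mirrors the two overload shapes listed above but is not HexCoDec itself; the class name HexSketch and the choice of lower-case digits are assumptions.

```java
// Hypothetical helper, not part of Tika.
class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Encodes `length` bytes starting at `offset`, two characters per byte.
    static char[] encode(byte[] bytes, int offset, int length) {
        char[] out = new char[length * 2];
        for (int i = 0; i < length; i++) {
            int b = bytes[offset + i] & 0xFF; // treat the byte as unsigned
            out[2 * i] = DIGITS[b >>> 4];     // high nibble
            out[2 * i + 1] = DIGITS[b & 0x0F]; // low nibble
        }
        return out;
    }

    // Convenience overload covering the whole array.
    static char[] encode(byte[] bytes) {
        return encode(bytes, 0, bytes.length);
    }
}
```

Masking with 0xFF matters because Java bytes are signed; without it, values of 0x80 and above would index outside the digit table.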
encode(byte[]) - Method in interface org.apache.tika.parser.DigestingParser.Encoder
encoding - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
EncodingDetector - Interface in org.apache.tika.detect
Character encoding detector.
encodings - Static variable in class org.apache.tika.parser.mp3.ID3v2Frame
ENCRYPTED - Static variable in interface org.apache.tika.metadata.WordPerfect
Is encrypted?
EncryptedDocumentException - Exception in org.apache.tika.exception
EncryptedDocumentException() - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
EncryptedDocumentException(Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
EncryptedDocumentException(String) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
EncryptedDocumentException(String, Throwable) - Constructor for exception org.apache.tika.exception.EncryptedDocumentException
EncryptedPrescriptionDetector - Class in org.apache.tika.example
EncryptedPrescriptionDetector() - Constructor for class org.apache.tika.example.EncryptedPrescriptionDetector
EncryptedPrescriptionParser - Class in org.apache.tika.example
EncryptedPrescriptionParser() - Constructor for class org.apache.tika.example.EncryptedPrescriptionParser
endBookmark(String) - Method in class
endBookmark(String) - Method in interface
endDescription() - Method in class org.apache.tika.sax.XMPContentHandler
endDocument() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
endDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
endDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
endDocument() - Method in class
endDocument() - Method in class
endDocument() - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
This is called after the full parse has completed.
endDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
endDocument() - Method in class org.apache.tika.sax.DIFContentHandler
endDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
endDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
endDocument() - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
This method is called whenever the Parser is done parsing the file.
endDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
endDocument() - Method in class org.apache.tika.sax.SafeContentHandler
endDocument() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
This method is called whenever the Parser is done parsing the file.
endDocument() - Method in class org.apache.tika.sax.TeeContentHandler
endDocument() - Method in class org.apache.tika.sax.TextContentHandler
endDocument() - Method in class org.apache.tika.sax.ToTextContentHandler
Flushes the character stream so that no characters are forgotten in internal buffers.
endDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the XHTML document by writing the following footer and clearing the namespace mappings:
endDocument() - Method in class org.apache.tika.sax.XMPContentHandler
Ends the XMP document by writing the following footer and clearing the namespace mappings:
EndDocumentShieldingContentHandler - Class in org.apache.tika.sax
A wrapper around a ContentHandler which will ignore normal SAX calls to EndDocumentShieldingContentHandler.endDocument(), and only fire them later.
EndDocumentShieldingContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EndDocumentShieldingContentHandler
Creates a decorator for the given SAX event handler.
endEditedSection() - Method in class
endEditedSection() - Method in interface
endElement(String, String, String) - Method in class org.apache.tika.mime.MimeTypesReader
endElement(String, String, String) - Method in class org.apache.tika.parser.dif.DIFContentHandler
endElement(String, String, String) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
endElement(String, String, String) - Method in class
endElement(String, String, String) - Method in class
endElement(String, String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
endElement(String, String, String) - Method in class org.apache.tika.parser.xliff.XLIFF12ContentHandler
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.MetadataHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
endElement(String, String, String) - Method in class org.apache.tika.sax.DIFContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.ElementMappingContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.LinkContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.SafeContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.SecureContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.TeeContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.ToHTMLContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.ToTextContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the given element.
endElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
endElement(String, String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
This is called after parsing each embedded document.
endEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
This is called after parsing an embedded document.
ENDIAN - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
EndianUtils - Class in
General endian-related utilities.
EndianUtils() - Constructor for class
EndianUtils.BufferUnderrunException - Exception in
ENDLINE - Static variable in class org.apache.tika.sax.XHTMLContentHandler
The elements that get appended with the XHTMLContentHandler.NL character.
endnoteReference(String) - Method in class
endnoteReference(String) - Method in interface
endParagraph() - Method in class
endParagraph() - Method in interface
Endpoint(Class<?>, Method, String, String, String[]) - Constructor for class org.apache.tika.server.resource.TikaWelcome.Endpoint
endPrefixMapping(String) - Method in class
endPrefixMapping(String) - Method in class
endPrefixMapping(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
endPrefixMapping(String) - Method in class org.apache.tika.sax.TeeContentHandler
endRow(int) - Method in class
endSDT() - Method in class
endSDT() - Method in interface
endTable() - Method in class
endTable() - Method in interface
endTableCell() - Method in class
endTableCell() - Method in interface
endTableRow() - Method in class
endTableRow() - Method in interface
ENGINEER - Static variable in interface org.apache.tika.metadata.XMPDM
"The engineer's name."
ensureFormattingState(XHTMLContentHandler, EnumSet<FormattingUtils.Tag>, Deque<FormattingUtils.Tag>) - Static method in class
Closes all tags until currentState contains only tags from the desired set, then opens all required tags to reach the desired state.
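The close-then-open behaviour described for ensureFormattingState can be illustrated with a small stand-alone sketch. It writes to a StringBuilder rather than an XHTMLContentHandler and uses its own three-value Tag enum, so every name here is hypothetical and the real method may differ in detail.

```java
import java.util.Deque;
import java.util.EnumSet;
import java.util.Locale;

// Hypothetical helper, not part of Tika.
class FormattingStateSketch {
    enum Tag { B, I, U }

    // `current` is a stack of open tags with the innermost tag on top.
    static void ensureState(StringBuilder out, EnumSet<Tag> desired, Deque<Tag> current) {
        // Step 1: close innermost tags until only desired tags remain open.
        while (!desired.containsAll(current)) {
            Tag closed = current.pop();
            out.append("</").append(closed.name().toLowerCase(Locale.ROOT)).append('>');
        }
        // Step 2: open every desired tag that is not open yet.
        for (Tag tag : desired) {
            if (!current.contains(tag)) {
                current.push(tag);
                out.append('<').append(tag.name().toLowerCase(Locale.ROOT)).append('>');
            }
        }
    }
}
```

Tags are closed strictly from the top of the stack, so a still-wanted inner tag may be closed and reopened in order to keep the output properly nested.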
ensureSkip(long) - Method in class org.apache.tika.parser.hwp.HwpStreamReader
Ensures that n bytes are skipped.
ENTITY_LOCAL_NAMES - Static variable in class org.apache.tika.parser.xml.XMLProfiler
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
ENTITY_TYPES - Static variable in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Some common entities identified by NLTK.
ENTITY_URIS - Static variable in class org.apache.tika.parser.xml.XMLProfiler
entityTypes - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
enumerateChm() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
Enumerates CHM entities.
ENVI_MIME_TYPE - Static variable in class org.apache.tika.parser.envi.EnviHeaderParser
EnviHeaderParser - Class in org.apache.tika.parser.envi
EnviHeaderParser() - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
EnviHeaderParser(EncodingDetector) - Constructor for class org.apache.tika.parser.envi.EnviHeaderParser
EpubContentParser - Class in org.apache.tika.parser.epub
Parser for EPUB OPS *.html files.
EpubContentParser() - Constructor for class org.apache.tika.parser.epub.EpubContentParser
EpubParser - Class in org.apache.tika.parser.epub
Epub parser
EpubParser() - Constructor for class org.apache.tika.parser.epub.EpubParser
equals(Object) - Method in class org.apache.tika.eval.db.ColInfo
equals(Object) - Method in class org.apache.tika.eval.tokens.TokenIntPair
equals(Object) - Method in class org.apache.tika.eval.tokens.TokenStatistics
equals(String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
equals(Object) - Method in class org.apache.tika.metadata.Metadata
equals(Object) - Method in class org.apache.tika.metadata.Property
equals(Object) - Method in class org.apache.tika.mime.MediaType
equals(Object) - Method in class org.apache.tika.mime.MimeType
equals(Object) - Method in class org.apache.tika.parser.csv.CSVResult
equals(Object) - Method in class org.apache.tika.parser.pdf.AccessChecker
equals(Object) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
equals(Object) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compares this CharsetMatch to another based on confidence value.
equals(Object) - Method in class org.apache.tika.parser.utils.DataURIScheme
equals(Object) - Method in class org.apache.tika.xmp.XMPMetadata
This method is not implemented yet.
EQUIPMENT_MAKE - Static variable in interface org.apache.tika.metadata.TIFF
"Manufacturer of the recording equipment."
EQUIPMENT_MODEL - Static variable in interface org.apache.tika.metadata.TIFF
"Model name or number of the recording equipment."
Error - Enum in
ERROR_CODES_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
ErrorParser - Class in org.apache.tika.parser
Dummy parser that always throws a TikaException without even attempting to parse the given document stream.
ErrorParser() - Constructor for class org.apache.tika.parser.ErrorParser
escapeCommandLine(String) - Static method in class org.apache.tika.utils.ProcessUtils
This should correctly put double-quotes around an argument if ProcessBuilder doesn't seem to work (as it doesn't on paths with spaces on Windows)
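The quoting behaviour described above can be sketched as follows. This is only an illustration under stated assumptions, not the ProcessUtils implementation: the class name QuoteArgSketch and the method quoteIfNeeded are hypothetical, and the rule "quote when the argument contains a space and is not already quoted" is an assumption about what such a helper would do.

```java
final class QuoteArgSketch {

    // Hypothetical helper (not Tika's API): wraps an argument in double quotes
    // when it contains a space and is not already quoted.
    static String quoteIfNeeded(String arg) {
        if (arg == null) {
            return null;
        }
        boolean alreadyQuoted =
                arg.length() >= 2 && arg.startsWith("\"") && arg.endsWith("\"");
        if (arg.contains(" ") && !alreadyQuoted) {
            return "\"" + arg + "\"";
        }
        return arg;
    }

    public static void main(String[] args) {
        // A Windows-style path with a space is the case the description mentions.
        System.out.println(quoteIfNeeded("C:\\Program Files\\tool.exe"));
        System.out.println(quoteIfNeeded("plain"));
    }
}
```

The sketch leaves arguments without spaces untouched, so it is safe to apply to every element of a command line.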
EvalConsumerBuilder - Class in org.apache.tika.eval.batch
EvalConsumerBuilder() - Constructor for class org.apache.tika.eval.batch.EvalConsumerBuilder
EvalConsumersBuilder - Class in org.apache.tika.eval.batch
EvalConsumersBuilder() - Constructor for class org.apache.tika.eval.batch.EvalConsumersBuilder
EvalExceptionUtils - Class in org.apache.tika.eval.util
EvalExceptionUtils() - Constructor for class org.apache.tika.eval.util.EvalExceptionUtils
EVENT - Static variable in interface org.apache.tika.metadata.IPTC
Names or describes the specific event the content relates to.
ExcelExtractor - Class in
Excel parser implementation which uses POI's Event API to handle the contents of a Workbook.
ExcelExtractor(ParseContext, Metadata) - Constructor for class
EXCEPTION_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
EXCEPTION_TABLE_A - Static variable in class org.apache.tika.eval.ExtractComparer
EXCEPTION_TABLE_B - Static variable in class org.apache.tika.eval.ExtractComparer
ExceptionUtils - Class in org.apache.tika.utils
ExceptionUtils() - Constructor for class org.apache.tika.utils.ExceptionUtils
ExecutableParser - Class in org.apache.tika.parser.executable
Parser for executable files.
ExecutableParser() - Constructor for class org.apache.tika.parser.executable.ExecutableParser
execute() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
execute(Connection, Path) - Method in class org.apache.tika.eval.reports.ResultsReporter
execute(String[], ServerTimeouts) - Method in class org.apache.tika.server.TikaServerWatchDog
execute(ParseContext, Runnable) - Static method in class org.apache.tika.utils.ConcurrentUtils
Execute a runnable using an ExecutorService from the ParseContext if possible.
EXIF_PAGE_COUNT - Static variable in interface org.apache.tika.metadata.TIFF
ExpandedTitleContentHandler - Class in org.apache.tika.sax
Content handler decorator which wraps a TransformerHandler in order to allow the TITLE tag to render as <title></title> rather than <title/> which is accomplished by calling the ContentHandler.characters(char[], int, int) method with a length of 1 but a zero length char array.
ExpandedTitleContentHandler() - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
ExpandedTitleContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.ExpandedTitleContentHandler
EXPERIMENT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
EXPOSURE_TIME - Static variable in interface org.apache.tika.metadata.TIFF
"Exposure time in seconds."
extension_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
EXTENSION_TAG_EXIF - Static variable in class org.apache.tika.parser.image.BPGParser
EXTENSION_TAG_ICC_PROFILE - Static variable in class org.apache.tika.parser.image.BPGParser
EXTENSION_TAG_THUMBNAIL - Static variable in class org.apache.tika.parser.image.BPGParser
EXTENSION_TAG_XMP - Static variable in class org.apache.tika.parser.image.BPGParser
extension_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
EXTERNAL_PARSERS_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
externalBoolean(String) - Static method in class org.apache.tika.metadata.Property
externalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
externalDate(String) - Static method in class org.apache.tika.metadata.Property
ExternalEmbedder - Class in org.apache.tika.embedder
Embedder that uses an external program (like sed or exiftool) to embed text content and metadata into a given document.
ExternalEmbedder() - Constructor for class org.apache.tika.embedder.ExternalEmbedder
externalInteger(String) - Static method in class org.apache.tika.metadata.Property
externalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
ExternalParser - Class in org.apache.tika.parser.external
Parser that uses an external program (like catdoc or pdf2txt) to extract text content and metadata from a given document.
ExternalParser() - Constructor for class org.apache.tika.parser.external.ExternalParser
ExternalParser.LineConsumer - Interface in org.apache.tika.parser.external
Consumer contract
ExternalParsersConfigReader - Class in org.apache.tika.parser.external
Builds up ExternalParser instances based on XML file(s) which define what to run, for what, and how to process any output metadata.
ExternalParsersConfigReader() - Constructor for class org.apache.tika.parser.external.ExternalParsersConfigReader
ExternalParsersConfigReaderMetKeys - Interface in org.apache.tika.parser.external
Met Keys used by the ExternalParsersConfigReader.
ExternalParsersFactory - Class in org.apache.tika.parser.external
Creates instances of ExternalParser based on XML configuration files.
ExternalParsersFactory() - Constructor for class org.apache.tika.parser.external.ExternalParsersFactory
externalReal(String) - Static method in class org.apache.tika.metadata.Property
externalText(String) - Static method in class org.apache.tika.metadata.Property
externalTextBag(String) - Static method in class org.apache.tika.metadata.Property
ExternalTranslator - Class in org.apache.tika.language.translate
Abstract class used to interact with command line/external Translators.
ExternalTranslator() - Constructor for class org.apache.tika.language.translate.ExternalTranslator
EXTRA_BITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
extract(InputStream, Path) - Method in class org.apache.tika.example.ExtractEmbeddedFiles
extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler) - Method in interface org.apache.tika.extractor.ContainerExtractor
Processes a container file, and extracts all the embedded resources from within it.
extract(TikaInputStream, ContainerExtractor, EmbeddedResourceHandler) - Method in class org.apache.tika.extractor.ParserContainerExtractor
extract(InputStream, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.hwp.HwpTextExtractorV5
Extracts text from an HWP stream.
extract(Metadata) - Method in class
extract(String) - Method in class org.apache.tika.parser.utils.DataURISchemeUtil
Extracts DataURISchemes from free text, as in JavaScript.
EXTRACT_CONTENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
Should content be extracted, generally.
EXTRACT_EXCEPTION_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
EXTRACT_EXCEPTION_TABLE_A - Static variable in class org.apache.tika.eval.ExtractComparer
EXTRACT_EXCEPTION_TABLE_B - Static variable in class org.apache.tika.eval.ExtractComparer
EXTRACT_FOR_ACCESSIBILITY - Static variable in interface org.apache.tika.metadata.AccessPermissions
Should content be extracted for the purposes of accessibility.
extractChmEntry(DirectoryListingEntry) - Method in class org.apache.tika.parser.chm.core.ChmExtractor
Decompresses a chm entry
ExtractComparer - Class in org.apache.tika.eval
ExtractComparer(ArrayBlockingQueue<FileResource>, Path, Path, Path, ExtractReader, IDBWriter) - Constructor for class org.apache.tika.eval.ExtractComparer
ExtractComparerBuilder - Class in org.apache.tika.eval.batch
ExtractComparerBuilder() - Constructor for class org.apache.tika.eval.batch.ExtractComparerBuilder
extractDublinCore(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
Tries to extract Dublin Core schema from XMP.
extractEmbeddedDocumentsExample(Path) - Method in class org.apache.tika.example.ParsingExample
ExtractEmbeddedFiles - Class in org.apache.tika.example
ExtractEmbeddedFiles() - Constructor for class org.apache.tika.example.ExtractEmbeddedFiles
extractGenre(String) - Static method in class org.apache.tika.parser.mp3.ID3v22Handler
extractHeaderFooter(String, XHTMLContentHandler) - Method in class
extractHeaderFooter(String, XHTMLContentHandler) - Method in class
extractHyperLinks(PackagePart, XHTMLContentHandler) - Method in class
extractLinks(String) - Static method in class org.apache.tika.utils.RegexUtils
Extracts URLs from plain text.
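As a rough illustration of pulling URLs out of plain text, the sketch below uses java.util.regex. The pattern, the class name LinkSketch, and the restriction to http/https are assumptions made for the example; the pattern actually used by RegexUtils is not shown in this index and may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LinkSketch {

    // Assumed pattern: an http(s) scheme followed by characters that are not
    // whitespace, angle brackets, or double quotes.
    private static final Pattern URL =
            Pattern.compile("\\bhttps?://[^\\s<>\"]+", Pattern.CASE_INSENSITIVE);

    // Returns every match in order of appearance; an empty list for null input.
    static List<String> extractLinks(String text) {
        List<String> links = new ArrayList<>();
        if (text == null) {
            return links;
        }
        Matcher m = URL.matcher(text);
        while (m.find()) {
            links.add(m.group());
        }
        return links;
    }

    public static void main(String[] args) {
        System.out.println(extractLinks("see http://example.com/a and https://example.org"));
    }
}
```

Note that this simple pattern keeps trailing punctuation (for example a final full stop) as part of the match, so real text usually needs an extra trimming step.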
extractMacros(POIFSFileSystem, ContentHandler, EmbeddedDocumentExtractor) - Static method in class
Helper to extract macros from an NPOIFS/vbaProject.bin. As of POI-3.15-final, there are still some bugs in VBAMacroReader.
extractor - Variable in class
extractPhoneNumbers(String) - Static method in class org.apache.tika.sax.CleanPhoneText
ExtractProfiler - Class in org.apache.tika.eval
ExtractProfiler(ArrayBlockingQueue<FileResource>, Path, Path, ExtractReader, IDBWriter) - Constructor for class org.apache.tika.eval.ExtractProfiler
ExtractProfilerBuilder - Class in org.apache.tika.eval.batch
ExtractProfilerBuilder() - Constructor for class org.apache.tika.eval.batch.ExtractProfilerBuilder
ExtractReader - Class in
ExtractReader() - Constructor for class
Reads full extract, no modification of metadata list, no min or max extract length checking
ExtractReader(ExtractReader.ALTER_METADATA_LIST) - Constructor for class
ExtractReader(ExtractReader.ALTER_METADATA_LIST, long, long) - Constructor for class
ExtractReader.ALTER_METADATA_LIST - Enum in
ExtractReaderException - Exception in
Exception when trying to read extract
ExtractReaderException(ExtractReaderException.TYPE) - Constructor for exception
ExtractReaderException.TYPE - Enum in
extractRootElement(byte[]) - Method in class org.apache.tika.detect.XmlRootExtractor
extractRootElement(InputStream) - Method in class org.apache.tika.detect.XmlRootExtractor
extractStandardReferences(String, double) - Static method in class org.apache.tika.sax.StandardsText
Extracts the standard references found within the given text.
extractXMPMM(XMPMetadata, Metadata) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
Extracts Media Management metadata from XMP.


F_NUMBER - Static variable in interface org.apache.tika.metadata.TIFF
"F-Number." The f-number is the focal length divided by the "effective" aperture diameter.
FAIL - Static variable in class org.apache.tika.sax.xpath.Matcher
State of a failed XPath evaluation, where nothing is matched.
FALSE - Static variable in class org.apache.tika.eval.AbstractProfiler
FeedParser - Class in org.apache.tika.parser.feed
Feed parser.
FeedParser() - Constructor for class org.apache.tika.parser.feed.FeedParser
FictionBookParser - Class in org.apache.tika.parser.xml
FictionBookParser() - Constructor for class org.apache.tika.parser.xml.FictionBookParser
Field - Annotation Type in org.apache.tika.config
Field annotation is a contract for binding Param value from Tika Configuration to an object.
FILE_DATA_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
"The file data rate in megabytes per second.
FILE_EXTENSION - Static variable in interface org.apache.tika.batch.FileResource
FILE_ID - Static variable in interface org.apache.tika.metadata.WordPerfect
File identifier.
FILE_SIZE - Static variable in interface org.apache.tika.metadata.WordPerfect
File size as defined in document header.
FILE_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
File type.
FileConfig - Class in org.apache.tika.parser.strings
Configuration for the "file" (or file-alternative) command.
FileConfig() - Constructor for class org.apache.tika.parser.strings.FileConfig
Default constructor.
FilenameUtils - Class in
FilenameUtils() - Constructor for class
FileResource - Interface in org.apache.tika.batch
This is a basic interface to handle a logical "file".
FileResourceConsumer - Class in org.apache.tika.batch
This is a base class for file consumers.
FileResourceConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.FileResourceConsumer
FileResourceCrawler - Class in org.apache.tika.batch
FileResourceCrawler(ArrayBlockingQueue<FileResource>, int) - Constructor for class org.apache.tika.batch.FileResourceCrawler
FILL_IN_FORM - Static variable in interface org.apache.tika.metadata.AccessPermissions
Can the user fill in a form?
fillMetadata(Parser, Metadata, ParseContext, MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.resource.TikaResource
fillParseContext(ParseContext, MultivaluedMap<String, String>, Parser) - Static method in class org.apache.tika.server.resource.TikaResource
filter(ContainerRequestContext) - Method in class org.apache.tika.server.TikaLoggingFilter
findDuplicateParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
Utility method that goes through all the component parsers and finds all media types for which more than one parser declares support.
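The idea of finding media types claimed by more than one parser can be sketched with plain collections. Everything here is illustrative: parsers and media types are represented as strings, and the class DuplicateSupportSketch is hypothetical rather than part of CompositeParser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class DuplicateSupportSketch {

    // Input: parser name -> media types it declares support for.
    // Output: media type -> parsers that declare it, restricted to types
    // claimed by at least two parsers.
    static Map<String, List<String>> findDuplicates(Map<String, Set<String>> typesByParser) {
        Map<String, List<String>> parsersByType = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : typesByParser.entrySet()) {
            for (String type : entry.getValue()) {
                parsersByType.computeIfAbsent(type, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        // Drop every media type that has a single owner.
        parsersByType.values().removeIf(parsers -> parsers.size() < 2);
        return parsersByType;
    }
}
```

Inverting the map first and filtering afterwards keeps the work linear in the total number of declared types.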
findIconType(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
findInFile(String, Path) - Method in class org.apache.tika.example.InterruptableParsingExample
findMatches(String, Pattern) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
Finds matching subgroups in text.
findNames(String[]) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
Finds names in the given array of tokens.
findServiceResources(String) - Method in class org.apache.tika.config.ServiceLoader
Returns all the available service resources matching the given pattern, such as all instances of tika-mimetypes.xml on the classpath, or all org.apache.tika.parser.Parser service files.
FINISHED_STRING - Static variable in class org.apache.tika.batch.fs.FSBatchProcessCLI
flag - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
FLASH_FIRED - Static variable in interface org.apache.tika.metadata.TIFF
Did the Flash fire when taking this image?
flush() - Method in class org.apache.tika.language.detect.LanguageWriter
flush() - Method in class org.apache.tika.language.ProfilingWriter
flushAndClose(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
FLVParser - Class in
Parser for metadata contained in Flash Videos (.flv).
FLVParser() - Constructor for class
FOCAL_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
"Focal length of the lens, in millimeters."
Font - Interface in org.apache.tika.metadata
FONT_NAME - Static variable in interface org.apache.tika.metadata.Font
Basic name of a font used in a file
footers - Variable in class
footnoteReference(String) - Method in class
footnoteReference(String) - Method in interface
ForkParser - Class in org.apache.tika.fork
ForkParser(Path, ParserFactoryFactory) - Constructor for class org.apache.tika.fork.ForkParser
If you have a directory with, say, tika-app.jar, and you want the child process/server to build a parser and run it from that directory, so that you can keep all of those dependencies out of your client code, use this initializer.
ForkParser(Path, ParserFactoryFactory, ClassLoader) - Constructor for class org.apache.tika.fork.ForkParser
ForkParser(ClassLoader, Parser) - Constructor for class org.apache.tika.fork.ForkParser
ForkParser(ClassLoader) - Constructor for class org.apache.tika.fork.ForkParser
ForkParser() - Constructor for class org.apache.tika.fork.ForkParser
ForkProxy - Interface in org.apache.tika.fork
ForkResource - Interface in org.apache.tika.fork
FORMAT - Static variable in interface org.apache.tika.metadata.DublinCore
Typically, Format may include the media-type or dimensions of the resource.
FORMAT - Static variable in class org.apache.tika.metadata.Metadata
Use TikaCoreProperties#FORMAT instead.
FORMAT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
format(Object, StringBuffer, FieldPosition) - Method in class
formatDate(Date) - Static method in class org.apache.tika.utils.DateUtils
Returns an ISO 8601 representation of the given date.
formatDate(Calendar) - Static method in class org.apache.tika.utils.DateUtils
Returns an ISO 8601 representation of the given date.
formatDateUnknownTimezone(Date) - Static method in class org.apache.tika.utils.DateUtils
Returns an ISO 8601 representation of the given date, which is in an unknown timezone.
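For readers unfamiliar with the two flavours of ISO 8601 output mentioned in the formatDate entries, here is a minimal sketch using java.time. It does not call DateUtils; the class IsoDateSketch and its method names are hypothetical, and the exact string layout produced by Tika may differ from these JDK formatters.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

final class IsoDateSketch {

    // A point in time with a known zone: rendered in UTC with a trailing "Z".
    static String formatUtc(Instant instant) {
        return DateTimeFormatter.ISO_INSTANT.format(instant);
    }

    // A date-time whose zone is unknown: rendered without any zone designator.
    static String formatNoZone(LocalDateTime dateTime) {
        return DateTimeFormatter.ISO_LOCAL_DATE_TIME.format(dateTime);
    }

    public static void main(String[] args) {
        System.out.println(formatUtc(Instant.EPOCH));
        System.out.println(formatNoZone(LocalDateTime.of(2020, 1, 2, 3, 4, 5)));
    }
}
```

The difference that matters is the zone designator: its presence states that the value is an absolute instant, while its absence leaves interpretation to the reader.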
formatMillis(long) - Static method in class org.apache.tika.util.DurationFormatUtils
formatRawCellContents(double, int, String, boolean) - Method in class
formatter - Variable in class
FORMATTING_OBJECTS_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
FormattingUtils - Class in
FormattingUtils.Tag - Enum in
forName(String) - Method in class org.apache.tika.mime.MimeTypes
Returns the registered media type with the given name (or alias).
forName(String) - Static method in class org.apache.tika.utils.CharsetUtils
Returns Charset impl, if one exists.
freeBuffer(ByteBuffer) - Static method in class
If a cleaner is available, this buffer will be cleaned.
fromJson(Reader) - Static method in class org.apache.tika.metadata.serialization.JsonMetadata
Read metadata from reader.
fromJson(Reader) - Static method in class org.apache.tika.metadata.serialization.JsonMetadataList
Read metadata from reader.
FS_REL_PATH - Static variable in class org.apache.tika.batch.fs.FSProperties
File's relative path (including file name) from a given source root
FSBatchProcessCLI - Class in org.apache.tika.batch.fs
FSBatchProcessCLI(String[]) - Constructor for class org.apache.tika.batch.fs.FSBatchProcessCLI
FSConsumersManager - Class in org.apache.tika.batch.fs
FSConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.fs.FSConsumersManager
FSCrawlerBuilder - Class in
Builds either an FSDirectoryCrawler or an FSListCrawler.
FSCrawlerBuilder() - Constructor for class
FSDirectoryCrawler - Class in org.apache.tika.batch.fs
FSDirectoryCrawler(ArrayBlockingQueue<FileResource>, int, Path, FSDirectoryCrawler.CRAWL_ORDER) - Constructor for class org.apache.tika.batch.fs.FSDirectoryCrawler
FSDirectoryCrawler(ArrayBlockingQueue<FileResource>, int, Path, Path, FSDirectoryCrawler.CRAWL_ORDER) - Constructor for class org.apache.tika.batch.fs.FSDirectoryCrawler
FSDirectoryCrawler.CRAWL_ORDER - Enum in org.apache.tika.batch.fs
FSDocumentSelector - Class in org.apache.tika.batch.fs
Selector that chooses files based on their file name and their size, as determined by Metadata.RESOURCE_NAME_KEY and Metadata.CONTENT_LENGTH.
FSDocumentSelector(Pattern, Pattern, long, long) - Constructor for class org.apache.tika.batch.fs.FSDocumentSelector
FSFileResource - Class in org.apache.tika.batch.fs
FileSystem(FS)Resource wraps a file name.
FSFileResource(File, File) - Constructor for class org.apache.tika.batch.fs.FSFileResource
To be removed in Tika 2.0.
FSFileResource(Path, Path) - Constructor for class org.apache.tika.batch.fs.FSFileResource
FSListCrawler - Class in org.apache.tika.batch.fs
Class that "crawls" a list of files.
FSListCrawler(ArrayBlockingQueue<FileResource>, int, File, File, String) - Constructor for class org.apache.tika.batch.fs.FSListCrawler
FSListCrawler(ArrayBlockingQueue<FileResource>, int, Path, Path, Charset) - Constructor for class org.apache.tika.batch.fs.FSListCrawler
Constructor for a crawler that reads a list of files to process.
FSOutputStreamFactory - Class in org.apache.tika.batch.fs
FSOutputStreamFactory(File, FSUtil.HANDLE_EXISTING, FSOutputStreamFactory.COMPRESSION, String) - Constructor for class org.apache.tika.batch.fs.FSOutputStreamFactory
FSOutputStreamFactory(Path, FSUtil.HANDLE_EXISTING, FSOutputStreamFactory.COMPRESSION, String) - Constructor for class org.apache.tika.batch.fs.FSOutputStreamFactory
FSOutputStreamFactory.COMPRESSION - Enum in org.apache.tika.batch.fs
FSProperties - Class in org.apache.tika.batch.fs
FSProperties() - Constructor for class org.apache.tika.batch.fs.FSProperties
FSUtil - Class in org.apache.tika.batch.fs
Utility class to handle some common issues when reading from and writing to a file system (FS).
FSUtil() - Constructor for class org.apache.tika.batch.fs.FSUtil
FSUtil.HANDLE_EXISTING - Enum in org.apache.tika.batch.fs


GDALParser - Class in org.apache.tika.parser.gdal
Wraps execution of the Geospatial Data Abstraction Library (GDAL) gdalinfo tool used to extract geospatial information out of hundreds of geo file formats.
GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
GENERAL_EMBEDDED - Static variable in class
General embedded document type within an OLE2 container
generateFooter(StringBuffer) - Method in class org.apache.tika.server.HTMLHelper
generateHeader(StringBuffer, String) - Method in class org.apache.tika.server.HTMLHelper
Generates the HTML Header for the user facing page, adding in the given title as required
generateRSS(Path) - Method in class org.apache.tika.example.RecentFiles
GenericConverter - Class in org.apache.tika.xmp.convert
Tries to convert as many of the properties in the Metadata map as possible to XMP namespaces.
GenericConverter() - Constructor for class org.apache.tika.xmp.convert.GenericConverter
GENRE - Static variable in interface org.apache.tika.metadata.XMPDM
"The name of the genre."
GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
List of predefined genres.
GeoGazetteerClient - Class in org.apache.tika.parser.geo.topic.gazetteer
GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Pass URL on which lucene-geo-gazetteer is available - eg.
GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Geographic - Interface in org.apache.tika.metadata
Geographic schema.
GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
GeoParser - Class in org.apache.tika.parser.geo.topic
GeoParser() - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
GeoParserConfig - Class in org.apache.tika.parser.geo.topic
GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig
GeoTag - Class in org.apache.tika.parser.geo.topic
GeoTag() - Constructor for class org.apache.tika.parser.geo.topic.GeoTag
get(InputStream) - Static method in class
Casts or wraps the given stream to a TaggedInputStream instance.
get(InputStream, TemporaryResources) - Static method in class
Casts or wraps the given stream to a TikaInputStream instance.
get(InputStream) - Static method in class
Casts or wraps the given stream to a TikaInputStream instance.
get(byte[]) - Static method in class
Creates a TikaInputStream from the given array of bytes.
get(byte[], Metadata) - Static method in class
Creates a TikaInputStream from the given array of bytes.
get(Path) - Static method in class
Creates a TikaInputStream from the file at the given path.
get(Path, Metadata) - Static method in class
Creates a TikaInputStream from the file at the given path.
get(File) - Static method in class
Use TikaInputStream.get(Path) instead. In Tika 2.0, this will be removed or modified to throw an IOException.
get(File, Metadata) - Static method in class
Use TikaInputStream.get(Path, Metadata) instead. In Tika 2.0, this will be removed or modified to throw an IOException.
get(Blob) - Static method in class
Creates a TikaInputStream from the given database BLOB.
get(Blob, Metadata) - Static method in class
Creates a TikaInputStream from the given database BLOB.
get(URI) - Static method in class
Creates a TikaInputStream from the resource at the given URI.
get(URI, Metadata) - Static method in class
Creates a TikaInputStream from the resource at the given URI.
get(URL) - Static method in class
Creates a TikaInputStream from the resource at the given URL.
get(URL, Metadata) - Static method in class
Creates a TikaInputStream from the resource at the given URL.
get(String) - Method in class org.apache.tika.metadata.Metadata
Get the value associated to a metadata name.
get(Property) - Method in class org.apache.tika.metadata.Metadata
Returns the value (if any) of the identified metadata property.
get(String) - Static method in class org.apache.tika.metadata.Property
Retrieve the property object that corresponds to the given key
get(Class<T>) - Method in class org.apache.tika.parser.ParseContext
Returns the object in this context that implements the given interface.
get(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
Returns the object in this context that implements the given interface, or the given default value if such an object is not found.
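The two ParseContext lookups above follow a type-keyed container pattern, which can be sketched as below. This is a stand-alone illustration, not the ParseContext source: the class TypedContextSketch is hypothetical, and keying the map directly by Class object is an assumption made to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

final class TypedContextSketch {

    // One entry per interface or class used as a lookup key.
    private final Map<Class<?>, Object> entries = new HashMap<>();

    // Registers a value under the given key; a null value clears the entry.
    <T> void set(Class<T> key, T value) {
        if (value != null) {
            entries.put(key, value);
        } else {
            entries.remove(key);
        }
    }

    // Returns the registered object, or null if nothing is registered.
    <T> T get(Class<T> key) {
        return key.cast(entries.get(key));
    }

    // Returns the registered object, or the supplied default if none is found.
    <T> T get(Class<T> key, T defaultValue) {
        T value = get(key);
        return value != null ? value : defaultValue;
    }
}
```

The default-value overload lets a caller avoid null checks, for example `ctx.get(Integer.class, 7)` yields 7 when no Integer has been registered.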
get() - Method in enum org.apache.tika.parser.strings.StringsEncoding
get(String) - Method in class org.apache.tika.xmp.XMPMetadata
Returns the value of a simple property or the first one of an array.
get(Property) - Method in class org.apache.tika.xmp.XMPMetadata
get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
AKA a Synchsafe integer.
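A synchsafe integer, as used in ID3v2 tags, stores 7 payload bits in each byte and leaves the top bit of every byte clear. The sketch below decodes a 4-byte, big-endian synchsafe value. The class SynchsafeSketch is hypothetical, and the 4-byte width is an assumption for the example rather than something this index states about get7BitsInt.

```java
final class SynchsafeSketch {

    // Combines the low 7 bits of four consecutive bytes, most significant first.
    static int decode(byte[] buffer, int offset) {
        return ((buffer[offset] & 0x7F) << 21)
                | ((buffer[offset + 1] & 0x7F) << 14)
                | ((buffer[offset + 2] & 0x7F) << 7)
                | (buffer[offset + 3] & 0x7F);
    }

    public static void main(String[] args) {
        // 0x02 in the third byte contributes 2 << 7 = 256, plus 1 from the last byte.
        System.out.println(decode(new byte[] {0x00, 0x00, 0x02, 0x01}, 0)); // prints 257
    }
}
```

Four such bytes carry 28 bits, so the largest representable value is 268435455.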
getAccessChecker() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getAcronym() - Method in class org.apache.tika.mime.MimeType
Returns an acronym for this mime type.
getAdded() - Method in class org.apache.tika.batch.FileResourceCrawler
getAdded() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.AbstractConverter
Every Converter has to provide information about namespaces that are used in addition to the core set of XMP namespaces.
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.GenericConverter
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.OpenDocumentConverter
getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.RTFConverter
getAdmin1Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
getAdmin2Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the path to XML descriptor for AnalysisEngine.
getAgePredictorClient() - Method in class org.apache.tika.parser.recognition.AgeRecogniser
getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The Artist for the overall album / compilation of albums
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have album-wide artists, so this returns null.
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getAliases(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
Returns the set of known aliases of the given canonical media type.
getAlignedLenTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getAlignedTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getAllComponentParsers() - Method in class org.apache.tika.parser.CompositeParser
Returns all parsers registered with the Composite Parser, including ones which may not currently be active.
getAllComponentParsers() - Method in class org.apache.tika.parser.DefaultParser
getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
Get the names of all charsets supported by CharsetDetector class.
getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers for each supported set of tags.
getAlphabeticTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns a new UIMA Analysis Engine (AE).
getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns the annotation value based on the given annotation type.
getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an array of CTAKESAnnotationProperty values that will be included in cTAKES metadata.
getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns a string containing a comma-separated list of CTAKESAnnotationProperty names that will be included into cTAKES metadata.
getApiKey() - Method in class org.apache.tika.language.translate.YandexTranslator
Get the API Key in use for client authentication
getApiUri(Metadata) - Method in class
getApiUri(Metadata) - Method in class
getApiUri(Metadata) - Method in class
getApplyRotation() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getArray() - Method in class org.apache.tika.eval.textstats.TokenCountPriorityQueue
getArray() - Method in class org.apache.tika.eval.tokens.TokenCountPriorityQueue
getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The Artist for the track
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getAttributesMapping() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
getAttrValue(String, Attributes) - Static method in class org.apache.tika.utils.XMLReaderUtils
getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getBaseType() - Method in class org.apache.tika.mime.MediaType
Returns the base form of the MediaType, excluding any parameters, such as "text/plain" for "text/plain; charset=utf-8"
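The relationship between a full media type string and its base form can be shown with plain string handling. This sketch does not use the MediaType class; BaseTypeSketch and its baseType method are hypothetical, and lower-casing the result is an assumption made for the example.

```java
import java.util.Locale;

final class BaseTypeSketch {

    // Drops everything from the first ';' onwards, then normalises
    // whitespace and case.
    static String baseType(String mediaType) {
        int semicolon = mediaType.indexOf(';');
        String base = semicolon >= 0 ? mediaType.substring(0, semicolon) : mediaType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(baseType("text/plain; charset=utf-8"));
    }
}
```

A type that has no parameters is returned unchanged apart from the normalisation.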
getBestNameEntity() - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
getBigInteger(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getBinaryDocValues(String) - Method in class
getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
Gets the bit rate in bits per second.
getBitsPerPixel() - Method in class org.apache.tika.parser.image.ICNSType
getBlock_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns block's length
getBlockAddress() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Returns block addresses
getBlockCount() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets a block count
getBlockidx_intvl() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns block index interval
getBlockLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets a block length
getBlockLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getBlockNext() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
getBlockNumber() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
getBlockPrev() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
getBlockRemaining() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getBlockType() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getBoolean(String, Boolean) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getByte() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getByteCount() - Method in class
The number of bytes that have passed through this stream.
getCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getCause() - Method in exception
Returns the wrapped exception.
getCause() - Method in exception org.apache.tika.sax.TaggedSAXException
Returns the wrapped exception.
getCauseForTermination() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
getCenter() - Method in class
getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the number of channels (1=mono, 2=stereo)
getCharset() - Method in class org.apache.tika.detect.AutoDetectReader
getCharset() - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
getChildTypes(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
Returns the set of known children of the given canonical media type
getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getChmItsfHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getChmItspHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getChmLzxcControlData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getChmLzxcResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getChoices() - Method in class org.apache.tika.metadata.Property
Returns the (immutable) set of choices for the values of this property.
getClassName() - Method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
getColInfos() - Method in class org.apache.tika.eval.db.TableInfo
getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getCommand() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the command to be run.
getCommand() - Method in class org.apache.tika.parser.external.ExternalParser
getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
getCommandAppendOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the operator to append rather than replace a value for the command line tool, i.e.
getCommandAssignmentDelimeter() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the delimiter for multiple assignments for the command line tool, i.e.
getCommandAssignmentOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the assignment operator for the command line tool, i.e.
getCommandMetadataSegments(Metadata) - Method in class org.apache.tika.embedder.ExternalEmbedder
Constructs a collection of command line arguments responsible for setting individual metadata fields based on the given metadata.
getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Builds up the ID3 comment, by parsing and extracting the comment string parts from the given data.
getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
Retrieves the comments, if any.
getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getCommonTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
getCommonTokensAnalyzer() - Method in class org.apache.tika.eval.tokens.AnalyzerManager
This analyzer should be used to generate common tokens lists from large corpora.
getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have compilations, so returns null.
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
ID3v22 doesn't have compilations, so returns null.
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have composers, so returns null.
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getCompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets compressed length
getConcatenatePhoneticRuns() - Method in class
getConfidence() - Method in class org.apache.tika.eval.langid.Language
getConfidence() - Method in class org.apache.tika.language.detect.LanguageResult
getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
getConfidence() - Method in class org.apache.tika.parser.recognition.RecognisedObject
getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get an indication of the confidence in the charset detected.
getConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
As of 1.17, use EmbeddedDocumentUtil.getTikaConfig() instead.
getConfig() - Static method in class org.apache.tika.server.resource.TikaResource
getConnection() - Method in class org.apache.tika.eval.db.JDBCUtil
Override this to apply any optimizations you want on the db before writing/reading.
getConnectionString() - Method in class org.apache.tika.eval.db.H2Util
getConnectionString() - Method in class org.apache.tika.eval.db.JDBCUtil
getConsidered() - Method in class org.apache.tika.batch.FileResourceCrawler
getConsidered() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
Returns the number of file resources considered.
getConstraints() - Method in class org.apache.tika.eval.db.ColInfo
getConsumed() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
getConsumers() - Method in class org.apache.tika.batch.ConsumersManager
Get the consumers
getConsumersManagerMaxMillis() - Method in class org.apache.tika.batch.ConsumersManager
BatchProcess will throw an exception if the ConsumersManager doesn't complete init() or shutdown() within this amount of time.
getContent(EvalFilePaths, Metadata) - Static method in class org.apache.tika.eval.AbstractProfiler
getContent() - Method in class org.apache.tika.eval.util.ContentTags
getContent() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
getContent(int, int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
getContent(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
getContentHandlerFactory() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
getContentLanguage() - Method in class org.apache.tika.example.ImportContextImpl
getContentLength() - Method in class org.apache.tika.example.ImportContextImpl
getContentLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
getControlDataIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Returns the control data index located in the list
getConverter(String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
Retrieve a specific converter according to the mimetype
getCoreCacheHelper() - Method in class
getCoreProperties() - Method in class
getCoreProperties() - Method in class
getCoreProperties() - Method in class
getCount(String) - Method in class org.apache.tika.eval.tokens.LangModel
getCount() - Method in class
The number of bytes that have passed through this stream.
getCount() - Method in class org.apache.tika.language.LanguageProfile
getCount(String) - Method in class org.apache.tika.language.LanguageProfile
getCountryCode() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
getCounts() - Method in class org.apache.tika.eval.tokens.LangModel
getCurrentFile() - Method in class org.apache.tika.batch.FileResourceConsumer
Returns the name and start time of a file that is currently being processed.
getCustomProperties() - Method in class
getCustomProperties() - Method in class
getCustomProperties() - Method in class
getData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getData() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Returns data offset
getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns data offset
getDate(Property) - Method in class org.apache.tika.metadata.Metadata
Returns the value of the identified Date based metadata property.
getDate(Property) - Method in class org.apache.tika.xmp.XMPMetadata
getDateFormatOverride() - Method in class
getDBWriter(List<TableInfo>) - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
getDecorationName() - Method in class org.apache.tika.parser.ParserDecorator
getDectorsHTML() - Method in class org.apache.tika.server.resource.TikaDetectors
getDefaultConfig() - Static method in class org.apache.tika.config.TikaConfig
Provides a default configuration (TikaConfig).
getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
getDefaultDetector(MimeTypes, ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
getDefaultEncodingDetector(ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
getDefaultLanguageDetector() - Static method in class org.apache.tika.language.detect.LanguageDetector
getDefaultMimeTypes() - Static method in class org.apache.tika.mime.MimeTypes
Get the default MimeTypes.
getDefaultMimeTypes(ClassLoader) - Static method in class org.apache.tika.mime.MimeTypes
Get the default MimeTypes.
getDefaultNumConsumers() - Static method in class
getDefaultRegistry() - Static method in class org.apache.tika.mime.MediaTypeRegistry
Returns the built-in media type registry included in Tika.
getDelegateParser(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
Returns the parser instance to which parsing tasks should be delegated.
getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getDescription() - Method in class org.apache.tika.mime.MimeType
Returns the description of this media type.
getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the description, if present
getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
This API is ICU internal only.
getDetectAngles() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getDetector() - Method in class org.apache.tika.config.TikaConfig
Returns the configured detector instance.
getDetector() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
getDetector() - Method in class org.apache.tika.language.detect.LanguageHandler
Returns the language detector used by this content handler.
getDetector() - Method in class org.apache.tika.language.detect.LanguageWriter
Returns the language detector used by this writer.
getDetector() - Method in class org.apache.tika.parser.AutoDetectParser
Returns the type detector used by this parser to auto-detect the type of a document.
getDetector(Parser) - Static method in class org.apache.tika.server.resource.TikaResource
getDetector() - Method in class org.apache.tika.Tika
Returns the detector instance used by this facade.
getDetectors() - Method in class org.apache.tika.detect.CompositeDetector
Returns the component detectors.
getDetectors() - Method in class org.apache.tika.detect.CompositeEncodingDetector
getDetectors() - Method in class org.apache.tika.detect.DefaultDetector
getDetectors() - Method in class org.apache.tika.detect.DefaultProbDetector
getDetectorsJSON() - Method in class org.apache.tika.server.resource.TikaDetectors
getDetectorsPlain() - Method in class org.apache.tika.server.resource.TikaDetectors
getDiceCoefficient() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
getDir_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns directory uuid
getDirectoryListingEntryList() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Returns chm directory listing entry list
getDirLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns directory length
getDirOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns directory offset
getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The number of the disc this belongs to, within the set
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1 doesn't have disc numbers, so returns null.
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getDocument() - Method in class
getDocument() - Method in interface
Returns the opened document.
getDocument() - Method in class
getDocumentBuilder() - Method in class org.apache.tika.parser.ParseContext
Returns the DOM builder specified in this parsing context.
getDocumentBuilder() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the DOM builder specified in this parsing context.
getDocumentBuilderFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the DOM builder factory specified in this parsing context.
getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
Returns the duration in milliseconds.
getEmbeddedDocumentExtractor(ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.
getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParser
getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getEncint() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getEncoding() - Method in class org.apache.tika.example.ImportContextImpl
getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the character encoding of the strings that are to be found.
getEncodingDetector() - Method in class org.apache.tika.config.TikaConfig
Returns the configured encoding detector instance
getEncodingDetector(ParseContext) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
Look for an EncodingDetector in the ParseContext.
getEncodingDetector() - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
getEndBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the end block index
getEndDocumentWasCalled() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
getEndOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the end offset index
getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
Gets a set of entity types whose names are recognisable by this
getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
Gets set of entity types recognised by this recogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
getEntriesToCopy() - Method in class
getEntropy() - Method in class org.apache.tika.eval.tokens.TokenStatistics
getEntryType() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
getErrors() - Static method in class org.apache.tika.language.LanguageIdentifier
Returns a string of error messages related to initializing language profiles
getExecutorService() - Method in class org.apache.tika.config.TikaConfig
getExitStatus() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
getExtendedProperties() - Method in class
getExtendedProperties() - Method in class
getExtendedProperties() - Method in class
getExtension(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
getExtension() - Method in class org.apache.tika.mime.MimeType
Returns the preferred file extension of this type, or an empty string if no extensions are known.
getExtension() - Method in enum
getExtensions() - Method in class org.apache.tika.mime.MimeType
Returns the list of all known file extensions of this media type.
getExtractAcroFormContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractActions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractAllAlternativesFromMSG() - Method in class
getExtractAllAlternativesFromMSG() - Method in class
getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParser
getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractBookmarksText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractFontNames() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractInlineImages() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractMacros() - Method in class
getExtractMacros() - Method in class
getExtractMarkedContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getExtractScripts() - Method in class org.apache.tika.parser.html.HtmlParser
getExtractUniqueInlineImagesOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getFallback() - Method in class org.apache.tika.parser.CompositeParser
Returns the fallback parser.
getField() - Method in class org.apache.tika.config.ParamField
getFieldInfos() - Method in class
getFile() - Method in class
getFile(String, File) - Static method in class org.apache.tika.util.PropsUtil
getFileChannel() - Method in class
getFileLength(Path) - Method in class org.apache.tika.eval.AbstractProfiler
getFilePath() - Method in class org.apache.tika.parser.strings.FileConfig
Returns the "file" installation folder.
getFileProg() - Static method in class org.apache.tika.parser.strings.StringsParser
getFilesProcessed() - Method in class org.apache.tika.server.ServerStatus
getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getFilteredStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
Simple util to get stack trace.
getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
getFormat() - Method in class org.apache.tika.language.translate.YandexTranslator
Retrieve the current text format setting.
getFormattedNumber(Paragraph) - Method in class
Get the formatted number for a given paragraph
getFormattedNumber(XWPFParagraph) - Method in class
getFormattedNumber(BigInteger, int) - Method in class
getFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Returns pmgi free space
getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
getGeneralAnalyzer() - Method in class org.apache.tika.eval.tokens.AnalyzerManager
This analyzer should be used to extract all tokens.
getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getGuid() - Method in class
getHadStarted() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getHeader_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns header length
getHeaderLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns itsf header length
getHeight() - Method in class org.apache.tika.parser.image.ICNSType
getHTML(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getHTMLFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getId() - Method in class org.apache.tika.parser.recognition.RecognisedObject
getIdentifier() - Method in class org.apache.tika.sax.StandardReference
getIfXFAExtractOnlyXFA() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getIgnoredLineConsumer() - Method in class org.apache.tika.parser.external.ExternalParser
Gets lines consumer
getIlvl() - Method in class
getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getImportRoot() - Method in class org.apache.tika.example.ImportContextImpl
getIncludeDeletedContent() - Method in class
getIncludeDeletedContent() - Method in class
getIncludeDeletedText() - Method in class
getIncludeDeletedText() - Method in interface
getIncludeHeadersAndFooters() - Method in class
getIncludeMissingRows() - Method in class
getIncludeMoveFromContent() - Method in class
getIncludeMoveFromContent() - Method in class
getIncludeMoveFromText() - Method in class
getIncludeMoveFromText() - Method in interface
getIncludeShapeBasedContent() - Method in class
getIncludeSlideMasterContent() - Method in class
getIncludeSlideNotes() - Method in class
getIndex() - Method in class
getIndex_depth() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns an index depth
getIndex_head() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns an index head
getIndex_root() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns index root
getIndexCopyFromStart() - Method in class
getIndexCopyToStart() - Method in class
getIndexOfContent() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getIndexOfResetData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getIndexOfResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getIniBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns an initial block index
getInitializableProblemHandler() - Method in class org.apache.tika.config.ServiceLoader
Returns the handler for problems with initializables
getInputSteam(InputStream, HttpHeaders) - Method in class org.apache.tika.server.DefaultInputStreamFactory
getInputSteam(InputStream, Metadata, HttpHeaders) - Method in class org.apache.tika.server.DefaultInputStreamFactory
getInputSteam(InputStream, HttpHeaders) - Method in interface org.apache.tika.server.InputStreamFactory
getInputSteam(InputStream, Metadata, HttpHeaders) - Method in interface org.apache.tika.server.InputStreamFactory
getInputSteam(InputStream, HttpHeaders) - Method in class org.apache.tika.server.URLEnabledInputStreamFactory
getInputSteam(InputStream, Metadata, HttpHeaders) - Method in class org.apache.tika.server.URLEnabledInputStreamFactory
getInputStream(FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
getInputStream() - Method in class org.apache.tika.example.ImportContextImpl
Returns a new InputStream to the temporary file created during instantiation or null, if this context does not provide a stream.
getInputStream() - Method in class org.apache.tika.parser.utils.DataURIScheme
getInputStream(InputStream, Metadata, HttpHeaders) - Static method in class org.apache.tika.server.resource.TikaResource
getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
getInt(Property) - Method in class org.apache.tika.metadata.Metadata
Returns the value of the identified Integer based metadata property.
getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
getInt(String, Integer) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getInt(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
Get an int value.
getInt(Property) - Method in class org.apache.tika.xmp.XMPMetadata
getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
getIntBE(byte[]) - Static method in class
Get a BE int value from the beginning of a byte array
getIntBE(byte[], int) - Static method in class
Get a BE int value from a byte array
getIntelCurrentPossition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getIntelFileSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getIntelState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getIntLE(byte[]) - Static method in class
Get a LE int value from the beginning of a byte array
getIntLE(byte[], int) - Static method in class
Get a LE int value from a byte array
getIntValues(Property) - Method in class org.apache.tika.metadata.Metadata
Gets the array of ints of the identified "seq" integer metadata property.
getIOListener() - Method in class org.apache.tika.example.ImportContextImpl
getJavaCommand() - Method in class org.apache.tika.fork.ForkParser
since 1.8
getJavaCommandAsList() - Method in class org.apache.tika.fork.ForkParser
Returns the command used to start the forked server process.
getJCas(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
Returns a new JCas appropriate for the given Analysis Engine.
getJDBCDriverClass() - Method in class org.apache.tika.eval.db.H2Util
getJDBCDriverClass() - Method in class org.apache.tika.eval.db.JDBCUtil
JDBC driver class.
getJustFileName(String) - Method in class
getKey() - Static method in class org.apache.tika.example.Pharmacy
getLabel() - Method in class org.apache.tika.parser.recognition.RecognisedObject
getLabelLang() - Method in class org.apache.tika.parser.recognition.RecognisedObject
getLang_id() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns language id
getLangCode() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
getLangId() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns language ID
getLangTokens(String) - Method in class org.apache.tika.eval.tokens.CommonTokenCountManager
getLanguage() - Method in class org.apache.tika.eval.langid.Language
getLanguage() - Method in class org.apache.tika.language.detect.LanguageHandler
Returns the detected language based on text handled thus far.
getLanguage() - Method in class org.apache.tika.language.detect.LanguageResult
The ISO 639-1 language code (plus optional country code)
getLanguage() - Method in class org.apache.tika.language.detect.LanguageWriter
Returns the detected language based on text written thus far.
getLanguage() - Method in class org.apache.tika.language.LanguageIdentifier
Gets the identified language
getLanguage() - Method in class org.apache.tika.language.ProfilingHandler
Returns the language that best matches the current state of the language profile.
getLanguage() - Method in class org.apache.tika.language.ProfilingWriter
Returns the language that best matches the current state of the language profile.
getLanguage(long) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Returns textual representation of LangID
getLanguage() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the language, if present
getLanguage() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get the ISO code for the language of the detected charset.
getLanguageDetectors() - Static method in class org.apache.tika.language.detect.LanguageDetector
getLanguageDetectors(ServiceLoader) - Static method in class org.apache.tika.language.detect.LanguageDetector
getLastModified() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns last modified date of the chm file
getLatitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
getLayer() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the audio layer code.
getLeft() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getLeft() - Method in class
getLength() - Method in class org.apache.tika.detect.MagicDetector
getLength() - Method in class
Returns the length (in bytes) of this stream.
getLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
getLength() - Method in class org.apache.tika.parser.mp3.AudioFrame
Returns the frame length in bytes.
getLength() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
getLengthTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getLengthTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getLinearizedDictionary(PDDocument) - Static method in class org.apache.tika.parser.pdf.PDFPreflightParser
Copied verbatim from PDFBox. According to the PDF Reference, a linearized PDF contains a dictionary as its first object (the linearized dictionary), and only this one in the first section.
getLinks() - Method in class org.apache.tika.mime.MimeType
Get a list of links to help document this mime type
getLinks() - Method in class org.apache.tika.sax.LinkContentHandler
Returns the list of collected links.
getLiveDocs() - Method in class
getLoader() - Method in class org.apache.tika.config.ServiceLoader
getLoadErrorHandler() - Method in class org.apache.tika.config.ServiceLoader
Returns the load error handler used by this loader.
getLocations(List<String>) - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
Calls API of lucene-geo-gazetteer to search location name in gazetteer.
getLong(String, Long) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getLong(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
Get a long value.
getLongitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
getLongLE(byte[], int) - Static method in class
Get a LE long value from a byte array
getLzxBlockLength() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getLzxBlockOffset() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getLzxBlocksCache() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
If language is a specific variant of a macro language (e.g.
getMainDocumentParts() - Method in class
Return a list of the main parts of the document, used when searching for embedded resources.
getMainDocumentParts() - Method in class
getMainDocumentParts() - Method in class
In PowerPoint files, slides have things embedded in them, and slide drawings which have the images
getMainDocumentParts() - Method in class
This returns all items that might contain embedded objects: main document, headers, footers, comments, etc.
getMainDocumentParts() - Method in class
getMainDocumentParts() - Method in class
In PowerPoint files, slides have things embedded in them, and slide drawings which have the images
getMainDocumentParts() - Method in class
In Excel files, sheets have things embedded in them, and sheet drawings which have the images
getMainDocumentParts() - Method in class
Include main body and anything else that can have an attachment/embedded object
getMainOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
getMainTreeElements() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getMainTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getMainTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getMajorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
getMappedTagName() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
getMarkLimit() - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
getMarkLimit() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
getMarkLimit() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
getMarkLimit() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
getMaxBytesForEmbeddedObject() - Static method in class org.apache.tika.parser.rtf.RTFParser
getMaxChildStartupMillis() - Method in class org.apache.tika.server.ServerTimeouts
Maximum time in milliseconds to allow for the child process to start up or restart
getMaxEntityExpansions() - Static method in class org.apache.tika.utils.XMLReaderUtils
getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getMaximumCompressionRatio() - Method in class org.apache.tika.sax.SecureContentHandler
Returns the maximum compression ratio.
getMaximumDepth() - Method in class org.apache.tika.sax.SecureContentHandler
Returns the maximum XML element nesting level.
getMaximumPackageEntryDepth() - Method in class org.apache.tika.sax.SecureContentHandler
Returns the maximum package entry nesting level.
getMaxMainMemoryBytes() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
The maximum amount of memory to use when loading a pdf into a PDDocument.
getMaxRestarts() - Method in class org.apache.tika.server.ServerTimeouts
getMaxStringLength() - Method in class org.apache.tika.Tika
Returns the maximum length of strings returned by the parseToString methods.
getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
getMediaType() - Method in class org.apache.tika.parser.csv.CSVParams
getMediaType() - Method in class org.apache.tika.parser.csv.CSVResult
getMediaType() - Method in class org.apache.tika.parser.utils.DataURIScheme
getMediaTypeRegistry() - Method in class org.apache.tika.config.TikaConfig
getMediaTypeRegistry() - Method in class org.apache.tika.mime.MimeTypes
getMediaTypeRegistry() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
getMediaTypeRegistry() - Method in class org.apache.tika.parser.CompositeParser
Returns the media type registry used to infer type relationships.
getMediaTypes() - Method in class org.apache.tika.server.resource.TikaMimeTypes
getMessage() - Method in class org.apache.tika.server.resource.TikaResource
getMessageClass(String) - Static method in class
getMet(URL) - Static method in class org.apache.tika.example.DisplayMetInstance
getMetadata() - Method in interface org.apache.tika.batch.FileResource
This gets the metadata available before the parsing of the file.
getMetadata() - Method in class org.apache.tika.batch.fs.FSFileResource
getMetaData() - Method in class
getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an array of metadata whose values will be analyzed using cTAKES.
getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
Returns metadata that includes cTAKES annotations.
getMetadata() - Method in class org.apache.tika.parser.RecursiveParserWrapper
getMetadata() - Method in class org.apache.tika.server.MetadataList
getMetadata(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.MetadataResource
getMetadata(InputStream, HttpHeaders, UriInfo, String) - Method in class org.apache.tika.server.resource.RecursiveMetadataResource
Returns an InputStream that can be deserialized as a list of Metadata objects.
getMetadataAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns a string containing a comma-separated list of metadata whose values will be analyzed using cTAKES.
getMetadataCommandArguments() - Method in class org.apache.tika.embedder.ExternalEmbedder
Gets the map of Metadata keys to command line parameters.
getMetadataExtractionPatterns() - Method in class org.apache.tika.parser.external.ExternalParser
getMetadataExtractor() - Method in class
getMetadataExtractor() - Method in interface
POIXMLTextExtractor.getMetadataTextExtractor() not yet supported for OOXML by POI.
getMetadataField(InputStream, HttpHeaders, UriInfo, String) - Method in class org.apache.tika.server.resource.MetadataResource
Get a specific metadata field.
getMetadataFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.MetadataResource
getMetadataFromMultipart(Attachment, UriInfo, String) - Method in class org.apache.tika.server.resource.RecursiveMetadataResource
Returns an InputStream that can be deserialized as a list of Metadata objects.
getMetadataList() - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
getMimeId(String) - Method in class
getMimeId(String) - Method in interface
getMimeRepository() - Method in class org.apache.tika.config.TikaConfig
getMimeType() - Method in class org.apache.tika.example.ImportContextImpl
getMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
getMimeType(File) - Method in class org.apache.tika.mime.MimeTypes
Use Tika.detect(File) instead
getMimeTypes() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
getMimeTypesHTML() - Method in class org.apache.tika.server.resource.TikaMimeTypes
getMimeTypesJSON() - Method in class org.apache.tika.server.resource.TikaMimeTypes
getMimeTypesPlain() - Method in class org.apache.tika.server.resource.TikaMimeTypes
getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getMinLength() - Method in class org.apache.tika.detect.TrainedModelDetector
getMinLength() - Method in class org.apache.tika.mime.MimeTypes
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
getMinLength() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the minimum sequence length (characters) to print.
getMinorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
getMinSize() - Method in class org.apache.tika.parser.strings.Latin1StringsParser
Returns the minimum size of a character sequence to be extracted.
getModificationTime() - Method in class org.apache.tika.example.ImportContextImpl
getMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
getName() - Method in class org.apache.tika.config.Param
getName() - Method in class org.apache.tika.config.ParamField
getName() - Method in class org.apache.tika.eval.db.ColInfo
getName() - Method in class org.apache.tika.eval.db.TableInfo
getName(String) - Static method in class
This is a duplication of the algorithm and functionality available in commons io FilenameUtils.
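
The entry above only says that the behaviour mirrors commons io FilenameUtils. As a rough sketch of that idea (the class name FileNameSketch is hypothetical and this is not the Tika source), the name is whatever follows the last forward slash or backslash:

```java
/** Hypothetical illustration of the "text after the last separator" rule. */
final class FileNameSketch {

    /** Returns the part of the path after the last '/' or '\', or null for null input. */
    static String getName(String path) {
        if (path == null) {
            return null;
        }
        // Handle both Unix and Windows separators, whichever occurs last.
        int lastSeparator = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        // lastIndexOf returns -1 when no separator exists, so substring(0) keeps the whole string.
        return path.substring(lastSeparator + 1);
    }
}
```

A path that ends in a separator yields an empty string under this rule.
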
getName() - Method in class org.apache.tika.language.LanguageProfilerBuilder
getName() - Method in class org.apache.tika.metadata.Property
getName() - Method in class org.apache.tika.mime.MimeType
Returns the name of this media type.
getName() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Returns an entry name
getName() - Method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
getName() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
getName() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get the name of the detected charset.
getNameLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
Returns an entry name length
getNames(Metadata) - Method in class org.apache.tika.metadata.serialization.JsonMetadataSerializer
Override to get a custom sort order or to filter names.
getNamespace() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
getNamespacePrefix(String) - Static method in class org.apache.tika.xmp.XMPMetadata
Obtain the prefix for a registered namespace URI.
getNamespaces() - Static method in class org.apache.tika.xmp.XMPMetadata
getNamespaceURI(String) - Static method in class org.apache.tika.xmp.XMPMetadata
Obtain the URI for a registered namespace prefix.
getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
getNewContentHandler() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
getNewContentHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
getNewContentHandler() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
getNewContentHandler(OutputStream, String) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
getNewContentHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
getNewContentHandler() - Method in interface org.apache.tika.sax.ContentHandlerFactory
getNewContentHandler(OutputStream, String) - Method in interface org.apache.tika.sax.ContentHandlerFactory
getNewContentHandler(OutputStream, Charset) - Method in interface org.apache.tika.sax.ContentHandlerFactory
getNonRefTableInfos() - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
getNonRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
getNonRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
getNormValues(String) - Method in class
getNum_blocks() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns number of blocks
getNumberHandledExceptions() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
getNumberOfLevels() - Method in class
getNumConsumers(Map<String, String>) - Static method in class
numConsumers is needed by both the crawler and the consumers.
getNumericDocValues(String) - Method in class
getNumHandledExceptions() - Method in class org.apache.tika.batch.FileResourceConsumer
getNumId() - Method in class
getNumOfHidden() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
getNumOfInputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
getNumOfOutputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
getNumResourcesConsumed() - Method in class org.apache.tika.batch.FileResourceConsumer
getNumRestarts() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
getNumTranslationPairs() - Method in class org.apache.tika.language.translate.CachedTranslator
Get the number of different source/target translation pairs this CachedTranslator currently has in its cache.
getNumTranslationsFor(String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
Get the number of different translations from the source language to the target language this CachedTranslator has in its cache.
getOcrDPI() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Dots per inch used to render the page image for OCR
getOcrImageFormatName() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
String representation of the image format used to render the page image for OCR (examples: png, tiff, jpeg)
getOcrImageQuality() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image quality used to render the page image for OCR.
getOcrImageScale() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
As of Tika 1.23, this is no longer used in rendering page images; use PDFParserConfig.setOcrDPI(int) instead.
getOcrImageType() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
Image type used to render the page image for OCR.
getOcrStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getOffset() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
getOOV() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
getOOV(String) - Method in class org.apache.tika.example.TextStatsFromTikaEval
Use the default language id models and the default common tokens lists in tika-eval to calculate the out-of-vocabulary percentage for a given string.
getOpenContainer() - Method in class
Returns the open container object, such as a POIFS FileSystem in the event of an OLE2 document being detected and processed by the OLE2 detector.
getOrganizations() - Static method in class org.apache.tika.sax.StandardOrganizations
Returns the map containing the collection of the most important technical standard organizations.
getOrganzationsRegex() - Static method in class org.apache.tika.sax.StandardOrganizations
Returns the regular expression containing the most important technical standard organizations.
getOtherTesseractConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getOutputEncoding() - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
getOutputEncoding() - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
getOutputEncoding() - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
getOutputFile(File, String, FSUtil.HANDLE_EXISTING, String) - Static method in class org.apache.tika.batch.fs.FSUtil
getOutputPath(Path, String, FSUtil.HANDLE_EXISTING, String) - Static method in class org.apache.tika.batch.fs.FSUtil
Given an output root and an initial relative path, return the output file according to the HANDLE_EXISTING strategy

In the most basic use case, given a root directory "input", a file's relative path "dir1/dir2/fileA.docx", and an output directory "output", the output file would be "output/dir1/dir2/fileA.docx".

If HANDLE_EXISTING is set to OVERWRITE, this will not check whether the output already exists, and the returned file could overwrite an existing file!

If HANDLE_EXISTING is set to RENAME, this will try to increment a counter at the end of the file name (fileA(2).docx) until there is a file name that doesn't exist.
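
The OVERWRITE and RENAME behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the FSUtil implementation: the class OutputPathSketch and its HandleExisting enum are hypothetical stand-ins, only the two strategies named above are modelled, and the counter is assumed to start at 1.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical illustration of resolving an output path under an existing-file strategy. */
final class OutputPathSketch {

    /** Stand-in for the two HANDLE_EXISTING values described in the text. */
    enum HandleExisting { OVERWRITE, RENAME }

    static Path resolve(Path outputRoot, String relativePath, HandleExisting strategy) {
        Path candidate = outputRoot.resolve(relativePath).normalize();
        // OVERWRITE never checks for an existing file; RENAME only acts on a collision.
        if (strategy == HandleExisting.OVERWRITE || !Files.exists(candidate)) {
            return candidate;
        }
        String fileName = candidate.getFileName().toString();
        int dot = fileName.lastIndexOf('.');
        String base = dot > 0 ? fileName.substring(0, dot) : fileName;
        String extension = dot > 0 ? fileName.substring(dot) : "";
        // Increment the counter until a name is found that does not exist yet.
        // The starting value of 1 is an assumption of this sketch.
        for (int counter = 1; ; counter++) {
            Path renamed = candidate.resolveSibling(base + "(" + counter + ")" + extension);
            if (!Files.exists(renamed)) {
                return renamed;
            }
        }
    }
}
```

The sketch checks for existence and returns a path without creating it, so two concurrent callers could receive the same name; a real consumer would need to guard against that.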

getOutputStream(OutputStreamFactory, FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
Use this for consistent logging of exceptions.
getOutputStream(Metadata) - Method in class org.apache.tika.batch.fs.FSOutputStreamFactory
This tries to create a file based on the FSUtil.HANDLE_EXISTING value that was passed in during initialization.
getOutputStream(Metadata) - Method in interface org.apache.tika.batch.OutputStreamFactory
getOutputStream() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns an OutputStream object used to write the CAS.
getOutputThreshold() - Method in class org.apache.tika.sax.SecureContentHandler
Returns the configured output threshold.
getOutputType() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getOverlap() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
getPackage() - Method in class
getPackage() - Method in class
getPackage() - Method in class
getPageSegMode() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getPageSeparator() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getParameters() - Method in class org.apache.tika.mime.MediaType
Returns an immutable sorted map of the parameters of this media type.
getParams() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
getParseException() - Method in class org.apache.tika.eval.util.ContentTags
getParser(TikaConfig) - Method in class org.apache.tika.batch.AutoDetectParserFactory
getParser(TikaConfig) - Method in class org.apache.tika.batch.DigestingAutoDetectParserFactory
getParser(TikaConfig) - Method in class org.apache.tika.batch.ParserFactory
getParser(MediaType) - Method in class org.apache.tika.config.TikaConfig
Use the TikaConfig.getParser() method instead
getParser() - Method in class org.apache.tika.config.TikaConfig
Returns the configured parser instance.
getParser(Metadata) - Method in class org.apache.tika.parser.CompositeParser
Returns the parser that best matches the given metadata.
getParser(Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
getParser() - Method in class org.apache.tika.Tika
Returns the parser instance used by this facade.
getParserClassname(Parser) - Static method in class org.apache.tika.utils.ParserUtils
Identifies the real class name of the Parser, unwrapping any ParserDecorator decorations on top of it.
getParserDetailsHTML() - Method in class org.apache.tika.server.resource.TikaParsers
getParserDetailsJSON() - Method in class org.apache.tika.server.resource.TikaParsers
getParserDetailssPlain() - Method in class org.apache.tika.server.resource.TikaParsers
getParseRecursively() - Method in class org.apache.tika.batch.ParserFactory
getParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
getParsers() - Method in class org.apache.tika.parser.CompositeParser
Returns the component parsers.
getParsers(ParseContext) - Method in class org.apache.tika.parser.DefaultParser
getParsersHTML() - Method in class org.apache.tika.server.resource.TikaParsers
getParsersHTML(boolean) - Method in class org.apache.tika.server.resource.TikaParsers
getParsersJSON() - Method in class org.apache.tika.server.resource.TikaParsers
getParsersJSON(boolean) - Method in class org.apache.tika.server.resource.TikaParsers
getParsersPlain() - Method in class org.apache.tika.server.resource.TikaParsers
getParsersPlain(boolean) - Method in class org.apache.tika.server.resource.TikaParsers
getPart() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
getPassword(Metadata) - Method in interface org.apache.tika.parser.PasswordProvider
Looks up the password for a document with the given metadata, and returns it for the Parser.
getPasswordProvider() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
getPath(Map<String, String>, String) - Method in class org.apache.tika.eval.batch.EvalConsumersBuilder
getPath() - Method in class
If the user created this TikaInputStream with a file, the original file will be returned.
getPath(int) - Method in class
getPath(String, Path) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getPathClassifyModel() - Method in class org.apache.tika.parser.recognition.AgeRecogniserConfig
getPathClassifyRegression() - Method in class org.apache.tika.parser.recognition.AgeRecogniserConfig
getPathsFromExtractCrawl(Metadata, Path) - Method in class org.apache.tika.eval.AbstractProfiler
getPathsFromSrcCrawl(Metadata, Path, Path) - Method in class org.apache.tika.eval.AbstractProfiler
getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
getPDDocument(InputStream, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
getPDDocument(Path, String, MemoryUsageSetting, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFPreflightParser
getPDFParserConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
getPingPulseMillis() - Method in class org.apache.tika.server.ServerTimeouts
getPingTimeoutMillis() - Method in class org.apache.tika.server.ServerTimeouts
getPointValues(String) - Method in class
getPoolSize() - Method in class org.apache.tika.fork.ForkParser
Returns the size of the process pool.
getPoolSize() - Static method in class org.apache.tika.utils.XMLReaderUtils
getPosition() - Method in class
Return the current position.
getPosition() - Method in class
Returns the current position within the stream.
getPrecision() - Method in class org.apache.tika.eval.db.ColInfo
Gets the precision.
getPrefixes() - Static method in class org.apache.tika.xmp.XMPMetadata
getPreserveInterwordSpacing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getPrevContent() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getPrimaryProperty() - Method in class org.apache.tika.metadata.Property
Gets the primary property for a composite property
getProbabilities(String) - Method in class org.apache.tika.eval.langid.LanguageIDWrapper
getProbability(String) - Method in class org.apache.tika.eval.tokens.LangModel
getProfile() - Method in class org.apache.tika.language.ProfilingHandler
Returns the language profile being built by this content handler.
getProfile() - Method in class org.apache.tika.language.ProfilingWriter
Returns the language profile being built by this writer.
getProperties(String) - Static method in class org.apache.tika.metadata.Property
getProperty(Object) - Method in class org.apache.tika.example.ImportContextImpl
getPropertyType(String) - Static method in class org.apache.tika.metadata.Property
Get the type of a property
getPropertyType() - Method in class org.apache.tika.metadata.Property
getProvider() - Method in class org.apache.tika.parser.digest.InputStreamDigester
When subclassing this, be careful to ensure that your provider is thread-safe (not likely) or return a new provider with each call.
getQNameAsString(QName) - Static method in class org.apache.tika.sax.ElementMappingContentHandler
getR0() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getR1() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getR2() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getRawScore() - Method in class org.apache.tika.language.detect.LanguageResult
getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Autodetect the charset of an inputStream, and return a Java Reader to access the converted input data.
getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Reader for reading the Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getReaderCacheHelper() - Method in class
getRefTableInfos() - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
getRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
getRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
getRegisteredMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
Returns the registered, normalised media type with the given name (or alias).
getRel() - Method in class org.apache.tika.sax.Link
getResetInterval() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns reset interval
getResetTableIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
Return index of reset table
getResize() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getResource(Class<T>) - Method in class
Returns the latest of the tracked resources that implements or extends the given interface or class.
getResourceAsStream(String) - Method in class org.apache.tika.config.ServiceLoader
Returns an input stream for reading the specified resource from the configured class loader.
getResourceId() - Method in interface org.apache.tika.batch.FileResource
This is only used in logging to identify which file may have caused problems.
getResourceId() - Method in class org.apache.tika.batch.fs.FSFileResource
getRight() - Method in class
getRoughCountExceptions() - Method in class org.apache.tika.batch.StatusReporter
This returns a rough (unsynchronized) count of caught/handled exceptions.
getRSSFooters() - Method in class org.apache.tika.example.RecentFiles
getRSSHeaders() - Method in class org.apache.tika.example.RecentFiles
getRSSItem(Document) - Method in class org.apache.tika.example.RecentFiles
getSampleRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the sampling rate, in Hz
getSAXParser() - Method in class org.apache.tika.parser.ParseContext
Returns the SAX parser specified in this parsing context.
getSAXParser() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the SAX parser specified in this parsing context.
getSAXParserFactory() - Method in class org.apache.tika.parser.ParseContext
Returns the SAX parser factory specified in this parsing context.
getSAXParserFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the SAX parser factory specified in this parsing context.
getScore() - Method in class org.apache.tika.sax.StandardReference
getSecondaryExtractProperties() - Method in class org.apache.tika.metadata.Property
Gets the secondary properties for a composite property
getSecondOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
getSeparator() - Method in class org.apache.tika.sax.StandardReference
getSeparatorChar() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the separator character used for annotation properties.
getSerializerType() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the type of cTAKES (UIMA) serializer used to write the CAS.
getServiceClass(Class<T>, String) - Method in class org.apache.tika.config.ServiceLoader
Loads and returns the named service class that's expected to implement the given interface.
getServiceLoader() - Method in class org.apache.tika.config.TikaConfig
getSetKCMS() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getSetter() - Method in class org.apache.tika.config.ParamField
getShortBE(byte[]) - Static method in class
Get a BE short value from the beginning of a byte array
getShortBE(byte[], int) - Static method in class
Get a BE short value from a byte array
getShortLE(byte[]) - Static method in class
Get a LE short value from the beginning of a byte array
getShortLE(byte[], int) - Static method in class
Get a LE short value from a byte array
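
For readers unfamiliar with the BE/LE distinction in the entries above, here is a minimal sketch of reading a 16-bit value in each byte order. The class name EndianSketch is hypothetical, and returning the value as an unsigned int is an assumption of this sketch; the index does not state the return types of the listed methods.

```java
/** Hypothetical illustration of big-endian and little-endian 16-bit reads. */
final class EndianSketch {

    /** Big-endian: the most significant byte comes first. */
    static int getShortBE(byte[] data, int offset) {
        return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
    }

    /** Little-endian: the least significant byte comes first. */
    static int getShortLE(byte[] data, int offset) {
        return (data[offset] & 0xFF) | ((data[offset + 1] & 0xFF) << 8);
    }

    /** Convenience overloads that read from the beginning of the array. */
    static int getShortBE(byte[] data) {
        return getShortBE(data, 0);
    }

    static int getShortLE(byte[] data) {
        return getShortLE(data, 0);
    }
}
```

Masking each byte with 0xFF prevents sign extension, since Java bytes are signed.
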
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns a signature of itsf header
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns a signature of the header
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a signature of control data block
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
Returns the pmgi signature, if it exists
getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
getSimilarity(LanguageProfilerBuilder) - Method in class org.apache.tika.language.LanguageProfilerBuilder
Calculates a score for how well NGramProfiles match each other
getSize() - Method in class
Return the size this InputStream emulates.
getSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a size of control data
getSize() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.CSVMessageBodyWriter
getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.JSONMessageBodyWriter
getSize(MetadataList, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.MetadataListMessageBodyWriter
getSize(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.TarWriter
getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.TextMessageBodyWriter
getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.XMPMessageBodyWriter
getSize(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.ZipWriter
getSize() - Method in class org.apache.tika.utils.RereadableInputStream
Returns the number of bytes read from the original stream.
getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParser
getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getSorted() - Method in class org.apache.tika.language.LanguageProfilerBuilder
Returns a sorted list of ngrams (sort done by 1.
getSortedDocValues(String) - Method in class
getSortedNumericDocValues(String) - Method in class
getSortedSetDocValues(String) - Method in class
getSourceFileLength(EvalFilePaths, List<Metadata>) - Method in class org.apache.tika.eval.AbstractProfiler
getSpacingTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getSqlDef() - Method in class org.apache.tika.eval.db.ColInfo
getStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
Get the full stacktrace as a string
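A minimal sketch of how a full stack trace can be rendered into a string using only the JDK. The class and method names here are hypothetical and this is not claimed to be how ExceptionUtils is implemented; it only illustrates the general technique.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

final class StackTraceSketch {
    // Hypothetical helper (not Tika's ExceptionUtils): prints the throwable,
    // including any causes, into an in-memory buffer and returns the text.
    static String stackTraceToString(Throwable t) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter writer = new PrintWriter(buffer)) {
            t.printStackTrace(writer);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        System.out.println(stackTraceToString(new IllegalStateException("boom")));
    }
}
```

The first line of the result is the throwable's own toString() value, followed by one line per stack frame.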
getStartBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the start block index
getStartIndex() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
getStartOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
Returns the start offset index
getState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
getStatus() - Method in class org.apache.tika.server.ServerStatus
getStream_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns stream uuid
getString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the String at the given offset and length.
getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Autodetect the charset of an inputStream, and return a String containing the converted input data.
getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getString(String, String) - Static method in class org.apache.tika.util.PropsUtil
Parses v.
getStringsPath() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the "strings" installation folder.
getStringsProg() - Static method in class org.apache.tika.parser.strings.StringsParser
getStripMarkup() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
getStyleClass() - Method in class
getStyleID() - Method in class
getStyleName(String) - Method in class
getSubtype() - Method in class org.apache.tika.mime.MediaType
Return the Sub-Type of the MediaType, such as "plain" for "text/plain"
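As an illustration of the type/subtype split described above, the following standalone sketch separates a media type string without using Tika. The class and method names are hypothetical, and the lower-casing and parameter stripping are assumptions of this sketch rather than documented MediaType behaviour.

```java
import java.util.Locale;

final class MediaTypeSketch {
    // Hypothetical helper: returns {type, subtype} for strings such as
    // "text/plain" or "text/plain; charset=UTF-8" (parameters are ignored).
    static String[] splitTypeAndSubtype(String mediaType) {
        int semicolon = mediaType.indexOf(';');
        String base = (semicolon >= 0 ? mediaType.substring(0, semicolon) : mediaType).trim();
        int slash = base.indexOf('/');
        if (slash <= 0 || slash == base.length() - 1) {
            throw new IllegalArgumentException("Not a type/subtype string: " + mediaType);
        }
        return new String[] {
            base.substring(0, slash).toLowerCase(Locale.ROOT),
            base.substring(slash + 1).toLowerCase(Locale.ROOT)
        };
    }

    public static void main(String[] args) {
        String[] parts = splitTypeAndSubtype("text/plain; charset=UTF-8");
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```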
getSuffix(InputStream, int) - Static method in class org.apache.tika.parser.mp3.LyricsHandler
Reads and returns the last length bytes from the given stream.
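One way to keep only the last bytes of a stream without buffering the whole stream is a fixed-size circular buffer. The sketch below is a hypothetical standalone helper, not LyricsHandler's code; in particular, returning a shorter array when the stream has fewer bytes than requested is an assumption of this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class StreamTailSketch {
    // Hypothetical helper: consumes the stream and returns its final
    // `length` bytes (or fewer if the stream is shorter than that).
    static byte[] lastBytes(InputStream in, int length) throws IOException {
        if (length < 0) {
            throw new IllegalArgumentException("length must be non-negative");
        }
        byte[] ring = new byte[length];
        long total = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (length > 0) {
                ring[(int) (total % length)] = (byte) b; // overwrite oldest slot
            }
            total++;
        }
        int kept = (int) Math.min(total, length);
        byte[] out = new byte[kept];
        long start = total - kept; // stream position of the first kept byte
        for (int i = 0; i < kept; i++) {
            out[i] = ring[(int) ((start + i) % length)];
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "abcdefgh".getBytes(StandardCharsets.US_ASCII);
        byte[] tail = lastBytes(new ByteArrayInputStream(data), 3);
        System.out.println(new String(tail, StandardCharsets.US_ASCII));
    }
}
```

The sketch reads one byte at a time for clarity; wrapping the input in a BufferedInputStream would be the usual choice for unbuffered sources.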
getSummaryStatistics() - Method in class org.apache.tika.eval.tokens.TokenStatistics
getSupertype(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
Returns the supertype of the given type.
getSupportedEmbedTypes(ParseContext) - Method in interface org.apache.tika.embedder.Embedder
Returns the set of media types supported by this embedder when used with the given parse context.
getSupportedEmbedTypes(ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
getSupportedEmbedTypes() - Method in class org.apache.tika.embedder.ExternalEmbedder
getSupportedLanguages() - Method in class org.apache.tika.eval.langid.LanguageIDWrapper
getSupportedLanguages() - Static method in class org.apache.tika.language.LanguageIdentifier
Returns what languages are supported for language identification
getSupportedMimes() - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
getSupportedMimes() - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
getSupportedMimes() - Method in class
getSupportedMimes() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
The mimes supported by this recogniser
getSupportedMimes() - Method in class
getSupportedMimes() - Method in class
getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.DirListParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.EncryptedPrescriptionParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.fork.ForkParser
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CryptoParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.EmptyParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ErrorParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
getSupportedTypes() - Method in class org.apache.tika.parser.external.ExternalParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hwp.HwpV5Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork18PackageParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.NetworkParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
getSupportedTypes(ParseContext) - Method in interface org.apache.tika.parser.Parser
Returns the set of media types supported by this parser when used with the given parse context.
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
Delegates the method call to the decorated parser.
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
Returns the set of media types supported by this parser when used with the given parse context.
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
Returns the types supported
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
getSupportedTypes(ParseContext) - Method in class
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLIFF12Parser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xliff.XLZParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLProfiler
getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParser
getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
getSwath() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getSyncBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getSystem_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns system uuid
getSystemId() - Method in class org.apache.tika.example.ImportContextImpl
getTableOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets a table offset
getTables(Connection) - Method in class org.apache.tika.eval.db.H2Util
getTables(Connection) - Method in class org.apache.tika.eval.db.JDBCUtil
getTag() - Method in exception
Returns the object reference used as the tag for this exception.
getTag() - Method in class
getTag() - Method in exception org.apache.tika.sax.TaggedSAXException
Returns the object reference used as the tag for this exception.
getTags() - Method in class org.apache.tika.eval.util.ContentTags
getTagsPresent() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getTagsPresent() - Method in interface org.apache.tika.parser.mp3.ID3Tags
Does the file contain this kind of tag?
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getTagString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
Returns the (possibly null padded) String at the given offset and length.
getTail() - Method in class
Returns an array with the last data read from the underlying stream.
getTasks() - Method in class org.apache.tika.server.ServerStatus
getTaskTimeoutMillis() - Method in class org.apache.tika.server.ServerTimeouts
How long to wait for a task before shutting down the child server process and restarting it.
getTermVectors(int) - Method in class
getTessdataPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getTesseractPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getText() - Method in class
getText() - Method in class
getText() - Method in class
getText() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Gets the text, if present
getText() - Method in class org.apache.tika.sax.Link
getText(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getTextDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
Retrieves the built TextDocument
getTextFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getTextMain(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getTextMainFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getThreshold() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
Gets the threshold to be used for selecting the standard references found within the text based on their score.
getTikaConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
getTimeout() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
getTimeout() - Method in class org.apache.tika.parser.strings.StringsConfig
Returns the maximum time (in seconds) to wait for the "strings" command to terminate.
getTitle() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getTitle() - Method in interface org.apache.tika.parser.mp3.ID3Tags
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getTitle() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getTitle() - Method in class org.apache.tika.sax.Link
getToken() - Method in class org.apache.tika.eval.tokens.TokenIntPair
getTokens(String) - Method in class org.apache.tika.eval.tokens.CommonTokenCountManager
getTokens() - Method in class org.apache.tika.eval.tokens.LangModel
getTokens(String) - Method in class org.apache.tika.eval.tokens.TokenCounter
getTokens() - Method in class org.apache.tika.eval.tokens.TokenCounts
getTokenStatistics(String) - Method in class org.apache.tika.eval.tokens.TokenCounter
getTopN() - Method in class org.apache.tika.eval.tokens.TokenStatistics
getTopNMoreA() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
getTopNMoreB() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
getTopNUniqueA() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
getTopNUniqueB() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
getTotal() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
getTotalTokens() - Method in class org.apache.tika.eval.tokens.TokenCounts
getTotalTokens() - Method in class org.apache.tika.eval.tokens.TokenStatistics
getTotalUniqueTokens() - Method in class org.apache.tika.eval.tokens.TokenCounts
getTotalUniqueTokens() - Method in class org.apache.tika.eval.tokens.TokenStatistics
getTrackingMetadata() - Method in class org.apache.tika.parser.mbox.MboxParser
getTrackNumber() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getTrackNumber() - Method in interface org.apache.tika.parser.mp3.ID3Tags
The number of the track within the album / recording
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
getTransformer() - Method in class org.apache.tika.parser.ParseContext
Returns the transformer specified in this parsing context.
getTransformer() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns a new transformer
getTranslator() - Method in class org.apache.tika.config.TikaConfig
Returns the configured translator instance.
getTranslator() - Method in class org.apache.tika.language.translate.CachedTranslator
getTranslator() - Method in class org.apache.tika.language.translate.DefaultTranslator
Returns the current translator
getTranslator() - Method in class org.apache.tika.Tika
Returns the translator instance used by this facade.
getTranslators() - Method in class org.apache.tika.language.translate.DefaultTranslator
Returns all available translators
getType() - Method in class org.apache.tika.config.Param
getType() - Method in class org.apache.tika.config.ParamField
getType() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
getType() - Method in class org.apache.tika.eval.db.ColInfo
getType() - Method in exception
getType() - Method in class org.apache.tika.mime.MediaType
Return the Type of the MediaType, such as "text" for "text/plain"
getType() - Method in class org.apache.tika.mime.MimeType
Returns the normalized media type name.
getType() - Method in class org.apache.tika.parser.image.ICNSType
getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork18PackageParser.IWork18DocumentType
getType() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
getType() - Method in enum
getType() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
getType() - Method in class org.apache.tika.sax.Link
getTypeFromVal(int) - Static method in enum
getTypes() - Method in class org.apache.tika.mime.MediaTypeRegistry
Returns the set of all known canonical media types.
getTypeString() - Method in class org.apache.tika.config.Param
getUByte(byte[], int) - Static method in class
Get the unsigned value of a byte.
getUIntBE(byte[]) - Static method in class
Get a BE unsigned int value from a byte array
getUIntBE(byte[], int) - Static method in class
Get a BE unsigned int value from a byte array
getUIntLE(byte[]) - Static method in class
Get a LE unsigned int value from a byte array
getUIntLE(byte[], int) - Static method in class
Get a LE unsigned int value from a byte array
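Because Java has no unsigned integer primitives, unsigned reads are usually done by masking each byte and widening the result. The following is a hypothetical standalone sketch of big-endian (BE) and little-endian (LE) decoding; the class and method names are illustrative, and returning a long for the 32-bit case is an assumption of this sketch, not a statement about the indexed methods' signatures.

```java
final class UnsignedReadSketch {
    // Masks off sign extension so the byte is read as 0..255.
    static int uByte(byte[] data, int offset) {
        return data[offset] & 0xFF;
    }

    // Big-endian: the most significant byte comes first.
    static long uIntBE(byte[] data, int offset) {
        return ((long) uByte(data, offset) << 24)
                | (uByte(data, offset + 1) << 16)
                | (uByte(data, offset + 2) << 8)
                | uByte(data, offset + 3);
    }

    // Little-endian: the least significant byte comes first.
    static long uIntLE(byte[] data, int offset) {
        return ((long) uByte(data, offset + 3) << 24)
                | (uByte(data, offset + 2) << 16)
                | (uByte(data, offset + 1) << 8)
                | uByte(data, offset);
    }

    public static void main(String[] args) {
        byte[] data = {0x12, 0x34, 0x56, 0x78};
        System.out.println(Long.toHexString(uIntBE(data, 0)));
        System.out.println(Long.toHexString(uIntLE(data, 0)));
    }
}
```

Widening the top byte to long before shifting keeps values above 2^31 - 1 from turning negative.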
getUMLSPass() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the UMLS password.
getUMLSUser() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
Returns the UMLS username.
getUncompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets uncompressed length
getUnderline() - Method in class
getUniformTypeIdentifier() - Method in class org.apache.tika.mime.MimeType
Get the UTI for this mime type.
getUniqueAlphabeticTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
getUniqueCommonTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
getUnknown() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Gets unknown
getUnknown0008() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns the unknown_000c value
getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 000c unknown bytes
getUnknown_0024() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 0024 unknown bytes
getUnknown_002c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 002c unknown bytes
getUnknown_0044() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns 0044 unknown bytes
getUnknown_18() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns unknown 18 bytes
getUnknownLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns unknown length
getUnknownOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns unknown offset
getUnseenProbability() - Method in class org.apache.tika.eval.tokens.LangModel
getUri() - Method in class org.apache.tika.sax.Link
getUserInterrupted() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
getUseSAXDocxExtractor() - Method in class
getUseSAXDocxExtractor() - Method in class
getUseSAXPptxExtractor() - Method in class
getUShortBE(byte[]) - Static method in class
Get a BE unsigned short value from the beginning of a byte array
getUShortBE(byte[], int) - Static method in class
Get a BE unsigned short value from a byte array
getUShortLE(byte[]) - Static method in class
Get a LE unsigned short value from the beginning of a byte array
getUShortLE(byte[], int) - Static method in class
Get a LE unsigned short value from a byte array
getValue() - Method in class org.apache.tika.config.Param
getValue() - Method in class org.apache.tika.eval.tokens.TokenIntPair
getValues(Property) - Method in class org.apache.tika.metadata.Metadata
Get the values associated with a metadata name.
getValues(String) - Method in class org.apache.tika.metadata.Metadata
Get the values associated with a metadata name.
getValues(Property) - Method in class org.apache.tika.xmp.XMPMetadata
getValues(String) - Method in class org.apache.tika.xmp.XMPMetadata
Returns the value of a simple property or all if the property is an array and the elements are of simple type.
getValueType() - Method in class org.apache.tika.metadata.Property
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
Returns itsf header version
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
Returns version of itsp header
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns the version of the control data block
getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
Returns the version
getVersion() - Method in class org.apache.tika.parser.mp3.AudioFrame
getVersion() - Method in class org.apache.tika.server.resource.TikaVersion
getVersionCode() - Method in class org.apache.tika.parser.mp3.AudioFrame
Get the version code.
getWelcomeHTML() - Method in class org.apache.tika.server.resource.TikaWelcome
getWelcomePlain() - Method in class org.apache.tika.server.resource.TikaWelcome
getWidth() - Method in class org.apache.tika.parser.image.ICNSType
getWindow() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getWindowPosition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getWindowSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns a window size
getWindowSize(int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X, i.e. the exponent in 2^X.
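To make the 2^X relationship concrete, here is a hypothetical standalone sketch that maps a window size in bytes to its exponent. Treating the argument as a byte count that must be an exact power of two in the 2^15 to 2^21 range is an assumption of this sketch, not a description of how ChmCommons validates or interprets its input.

```java
final class LzxWindowSketch {
    static final int MIN_EXPONENT = 15; // 2^15 = 32 KB
    static final int MAX_EXPONENT = 21; // 2^21 = 2 MB

    // Hypothetical helper: returns X such that windowBytes == 2^X.
    static int windowExponent(int windowBytes) {
        if (windowBytes <= 0 || Integer.bitCount(windowBytes) != 1) {
            throw new IllegalArgumentException("Not a power of two: " + windowBytes);
        }
        int exponent = Integer.numberOfTrailingZeros(windowBytes);
        if (exponent < MIN_EXPONENT || exponent > MAX_EXPONENT) {
            throw new IllegalArgumentException("Outside the LZX range: 2^" + exponent);
        }
        return exponent;
    }

    public static void main(String[] args) {
        System.out.println(windowExponent(32 * 1024));
        System.out.println(windowExponent(2 * 1024 * 1024));
    }
}
```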
getWindowSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
getWindowsPerReset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
Returns windows per reset
getWrappedParser() - Method in class org.apache.tika.parser.ParserDecorator
Gets the parser wrapped by this ParserDecorator
getXHTML(ContentHandler, Metadata, ParseContext) - Method in class
getXHTML(ContentHandler, Metadata, ParseContext) - Method in interface
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
getXHTML(ContentHandler, Metadata, ParseContext) - Method in class
getXHTML(ContentHandler, Metadata, ParseContext) - Method in class
getXML(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getXMLFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
getXMLifiedLogMsg(String, String, String...) - Method in class org.apache.tika.batch.FileResourceConsumer
getXMLifiedLogMsg(String, String, Throwable, String...) - Method in class org.apache.tika.batch.FileResourceConsumer
Use this for structured output that captures resourceId and other attributes.
getXMLInputFactory() - Method in class org.apache.tika.parser.ParseContext
Returns the StAX input factory specified in this parsing context.
getXMLInputFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the StAX input factory specified in this parsing context.
getXMLReader() - Method in class org.apache.tika.parser.ParseContext
Returns the XMLReader specified in this parsing context.
getXMLReader() - Static method in class org.apache.tika.utils.XMLReaderUtils
Returns the XMLReader specified in this parsing context.
getXMPData() - Method in class org.apache.tika.xmp.XMPMetadata
Provides direct access to the XMP data model, in case a client prefers to work directly on it instead of using the Metadata API
getXMPMeta() - Method in class org.apache.tika.xmp.convert.AbstractConverter
getYear() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
getYear() - Method in interface org.apache.tika.parser.mp3.ID3Tags
getYear() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
getYear() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
getYear() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
getYear() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
GLOB_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
GlobalIdTableEntry3FNDX - Class in
GlobalIdTableEntry3FNDX() - Constructor for class
GlobalIdTableEntryFNDX - Class in
GlobalIdTableEntryFNDX() - Constructor for class
GoogleTranslator - Class in org.apache.tika.language.translate
An implementation of a REST client to the Google Translate v2 API.
GoogleTranslator() - Constructor for class org.apache.tika.language.translate.GoogleTranslator
GrabPhoneNumbersExample - Class in org.apache.tika.example
Class to demonstrate how to use the PhoneExtractingContentHandler to get a list of all of the phone numbers from every file in a directory.
GrabPhoneNumbersExample() - Constructor for class org.apache.tika.example.GrabPhoneNumbersExample
GREETING - Static variable in class org.apache.tika.server.resource.TikaResource
GRIB_MIME_TYPE - Static variable in class org.apache.tika.parser.grib.GribParser
GribParser - Class in org.apache.tika.parser.grib
GribParser() - Constructor for class org.apache.tika.parser.grib.GribParser
GrobidNERecogniser - Class in org.apache.tika.parser.ner.grobid
GrobidNERecogniser() - Constructor for class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
GrobidRESTParser - Class in org.apache.tika.parser.journal
GrobidRESTParser() - Constructor for class org.apache.tika.parser.journal.GrobidRESTParser


H2Util - Class in org.apache.tika.eval.db
H2Util(Path) - Constructor for class org.apache.tika.eval.db.H2Util
handle(String, MediaType, InputStream) - Method in interface org.apache.tika.extractor.EmbeddedResourceHandler
Called to process an embedded resource within the container.
handle(Metadata) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
Copies extracted tags to tika metadata using registered handlers.
handle(Iterator<Directory>) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
Copies extracted tags to tika metadata using registered handlers.
handleEmbeddedFile(PackagePart, ContentHandler, String) - Method in class
Handles an embedded file in the document
handleEntryMetadata(String, Date, Date, Long, XHTMLContentHandler) - Static method in class org.apache.tika.parser.pkg.PackageParser
handleException(SAXException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
Handle any exceptions thrown by methods in this class.
handleException(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
Tags any SAXExceptions thrown, wrapping and re-throwing.
handleFirstFileInDirectory(Path) - Method in class org.apache.tika.batch.fs.FSDirectoryCrawler
Override this if you have any special handling for the first actual file that the crawler comes across in a directory.
handleGlobError(MimeType, String, MimeTypeException, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
handleInitializableProblem(String, String) - Method in interface org.apache.tika.config.InitializableProblemHandler
handleIOException(IOException) - Method in class
Handle any IOExceptions thrown.
handleIOException(IOException) - Method in class
Tags any IOExceptions thrown, wrapping and re-throwing.
handleLoadError(String, Throwable) - Method in interface org.apache.tika.config.LoadErrorHandler
Handles a problem encountered when trying to load the specified service class.
handleMimeError(String, MimeTypeException, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
handleMsg(Level, String) - Method in interface
handleXMP(InputStream, int, ImageMetadataExtractor) - Method in class org.apache.tika.parser.image.BPGParser
HAS_ACROFORM_FIELDS - Static variable in interface org.apache.tika.metadata.PDF
Has > 0 AcroForm fields
HAS_MARKED_CONTENT - Static variable in interface org.apache.tika.metadata.PDF
HAS_SIGNATURE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
HAS_XFA - Static variable in interface org.apache.tika.metadata.PDF
HAS_XMP - Static variable in interface org.apache.tika.metadata.PDF
Has XMP, whether or not it is valid
hasEnoughText() - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
hasEnoughText() - Method in class org.apache.tika.language.detect.LanguageDetector
Tell the caller whether more text is required for the current document before the language can be reliably detected.
hasErrors() - Static method in class org.apache.tika.language.LanguageIdentifier
Tests whether there were errors initializing language config
hasFile() - Method in class
hashCode() - Method in class org.apache.tika.eval.db.ColInfo
hashCode() - Method in class org.apache.tika.eval.tokens.TokenIntPair
hashCode() - Method in class org.apache.tika.eval.tokens.TokenStatistics
hashCode() - Method in class org.apache.tika.metadata.Metadata
hashCode() - Method in class org.apache.tika.metadata.Property
hashCode() - Method in class org.apache.tika.mime.MediaType
hashCode() - Method in class org.apache.tika.mime.MimeType
hashCode() - Method in class org.apache.tika.parser.csv.CSVResult
hashCode() - Method in class org.apache.tika.parser.pdf.AccessChecker
hashCode() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
hashCode() - Method in class org.apache.tika.parser.txt.CharsetMatch
Generates a hashCode based on the confidence value.
hashCode() - Method in class org.apache.tika.parser.utils.DataURIScheme
hasHitBound() - Method in class
hasHitMaximumEmbeddedResources() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
hasID3v1() - Method in class org.apache.tika.parser.mp3.LyricsHandler
hasLength() - Method in class
hasLyrics() - Method in class org.apache.tika.parser.mp3.LyricsHandler
hasMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
hasMagic() - Method in class org.apache.tika.mime.MimeType
hasMask() - Method in class org.apache.tika.parser.image.ICNSType
hasModel(String) - Method in class org.apache.tika.langdetect.Lingo24LangDetector
hasModel(String) - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
hasModel(String) - Method in class org.apache.tika.langdetect.TextLangDetector
hasModel(String) - Method in class org.apache.tika.language.detect.LanguageDetector
Provide information about whether a model exists for a specific language.
hasNext() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
hasParameters() - Method in class org.apache.tika.mime.MediaType
Checks whether this media type contains parameters.
hasRetinaDisplay() - Method in class org.apache.tika.parser.image.ICNSType
hasSkip(DirectoryListingEntry) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
Checks skippable patterns
hasStream() - Method in class org.apache.tika.example.ImportContextImpl
hasTesseract(TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
hasWarned() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
HDFParser - Class in org.apache.tika.parser.hdf
Since the NetCDFParser depends on the NetCDF-Java API, we are able to use it to parse HDF files as well.
HDFParser() - Constructor for class org.apache.tika.parser.hdf.HDFParser
headerFooter(String, boolean, String) - Method in class
HeaderFooterFromString(String) - Constructor for class
headers - Variable in class
HEADLINE - Static variable in interface org.apache.tika.metadata.IPTC
A brief synopsis of the caption.
HEADLINE - Static variable in interface org.apache.tika.metadata.Photoshop
healthUri - Variable in class
HexCoDec - Class in org.apache.tika.mime
A set of Hex encoding and decoding utility methods.
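As an illustration of what hex encoding and decoding utilities of this kind typically do, here is a minimal standalone sketch. The class name HexSketch and its encode/decode methods are hypothetical and are not the HexCoDec API; consult the HexCoDec class page for the actual method signatures.

```java
// Hypothetical illustration only; this is NOT org.apache.tika.mime.HexCoDec.
final class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    /** Encodes each byte as two lowercase hex digits. */
    static String encode(byte[] data) {
        char[] out = new char[data.length * 2];
        for (int i = 0; i < data.length; i++) {
            int v = data[i] & 0xFF;          // treat the byte as unsigned
            out[2 * i] = DIGITS[v >>> 4];    // high nibble
            out[2 * i + 1] = DIGITS[v & 0x0F]; // low nibble
        }
        return new String(out);
    }

    /** Decodes a hex string (upper or lower case) back into bytes. */
    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("Hex string must have an even length");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex digit near index " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```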
HexCoDec() - Constructor for class org.apache.tika.mime.HexCoDec
hfHelper - Static variable in class
Allows access to headers/footers from raw xml strings
HIDDEN_SLIDES - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
HISTORY - Static variable in interface org.apache.tika.metadata.ClimateForcast
HISTORY_ACTION - Static variable in interface org.apache.tika.metadata.XMPMM
Action in the XMPMM's history section
HISTORY_EVENT_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
Instance id in the XMPMM's history section
HISTORY_SOFTWARE_AGENT - Static variable in interface org.apache.tika.metadata.XMPMM
Software agent that created the action in the XMPMM's history section
HISTORY_WHEN - Static variable in interface org.apache.tika.metadata.XMPMM
When the action occurred in the XMPMM's history section
HSLFExtractor - Class in
HSLFExtractor(ParseContext, Metadata) - Constructor for class
HTML - Interface in org.apache.tika.metadata
HtmlEncodingDetector - Class in org.apache.tika.parser.html
Character encoding detector for determining the character encoding of an HTML document based on the potential charset parameter found in a Content-Type http-equiv meta tag somewhere near the beginning.
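The general technique described above can be sketched in plain Java as follows. This is not Tika's implementation: the class name MetaCharsetSketch, the 8192-byte prefix limit, and the regular expression are all assumptions made for illustration, and the pattern only handles tags where http-equiv precedes the content attribute.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not org.apache.tika.parser.html.HtmlEncodingDetector.
final class MetaCharsetSketch {
    // Assumed limit: only the start of the document is inspected.
    private static final int PREFIX_LIMIT = 8192;

    // Matches <meta http-equiv="Content-Type" ... charset=NAME ...> and captures NAME.
    private static final Pattern HTTP_EQUIV_CHARSET = Pattern.compile(
            "<meta\\s+[^>]*http-equiv\\s*=\\s*[\"']?content-type[\"']?"
                    + "[^>]*charset\\s*=\\s*[\"']?([A-Za-z0-9_:.\\-]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the declared charset, or null if none is declared or it is unsupported. */
    static Charset detect(byte[] html) {
        int length = Math.min(html.length, PREFIX_LIMIT);
        // The meta tag itself is ASCII-compatible in common encodings,
        // so the prefix is decoded as ASCII just to search it.
        String head = new String(html, 0, length, StandardCharsets.US_ASCII);
        Matcher m = HTTP_EQUIV_CHARSET.matcher(head);
        if (!m.find()) {
            return null;
        }
        try {
            return Charset.forName(m.group(1));
        } catch (IllegalArgumentException e) {
            // Covers both illegal and unsupported charset names.
            return null;
        }
    }
}
```

Returning null for a missing or unknown charset leaves the fallback decision to the caller, for example trying another detector next.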
HtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.HtmlEncodingDetector
HTMLHelper - Class in org.apache.tika.server
Helps produce user-facing HTML output.
HTMLHelper() - Constructor for class org.apache.tika.server.HTMLHelper
HtmlMapper - Interface in org.apache.tika.parser.html
HTML mapper used to make incoming HTML documents easier to handle by Tika clients.
HtmlParser - Class in org.apache.tika.parser.html
HTML parser.
HtmlParser() - Constructor for class org.apache.tika.parser.html.HtmlParser
HtmlParser(EncodingDetector) - Constructor for class org.apache.tika.parser.html.HtmlParser
HttpHeaders - Interface in org.apache.tika.metadata
A collection of HTTP header names.
httpMethod - Variable in class org.apache.tika.server.resource.TikaWelcome.Endpoint
HWP - Static variable in class
Hangul Word Processor (Korean)
HWP_MIME_TYPE - Static variable in class org.apache.tika.parser.hwp.HwpV5Parser
HwpStreamReader - Class in org.apache.tika.parser.hwp
HwpStreamReader(InputStream) - Constructor for class org.apache.tika.parser.hwp.HwpStreamReader
HwpTextExtractorV5 - Class in org.apache.tika.parser.hwp
HwpTextExtractorV5() - Constructor for class org.apache.tika.parser.hwp.HwpTextExtractorV5
HwpV5Parser - Class in org.apache.tika.parser.hwp
HwpV5Parser() - Constructor for class org.apache.tika.parser.hwp.HwpV5Parser
hyperlinkEnd() - Method in class
hyperlinkEnd() - Method in interface
hyperlinkStart(String) - Method in class
hyperlinkStart(String) - Method in interface
hyperlinkUpdate(HyperlinkEvent) - Method in class org.apache.tika.gui.TikaGUI


ICNS_1024x1024_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_128x128_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_128x128_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_128x128_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_128x128_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x12_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x12_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x12_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_16x16_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_256x256_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_256x256_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_1BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_32x32_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_48x48_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_48x48_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_48x48_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_48x48_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_48x48_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_512x512_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_64x64_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
ICNS_MIME_TYPE - Static variable in class org.apache.tika.parser.image.ICNSParser
ICNSParser - Class in org.apache.tika.parser.image
A basic parser class for Apple ICNS icon files
ICNSParser() - Constructor for class org.apache.tika.parser.image.ICNSParser
ICNSType - Class in org.apache.tika.parser.image
Holds details on Apple ICNS icons
IContentHandlerFactoryBuilder - Interface in
ICrawlerBuilder - Interface in
Icu4jEncodingDetector - Class in org.apache.tika.parser.txt
Icu4jEncodingDetector() - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
ID - Static variable in class org.apache.tika.eval.AbstractProfiler
ID - Static variable in interface org.apache.tika.metadata.QuattroPro
id - Variable in class org.apache.tika.parser.recognition.RecognisedObject
Identifier for this object
id - Variable in class org.apache.tika.parser.rtf.ListDescriptor
ID3Comment(String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Creates an ID3 v1 style comment tag
ID3Comment(String, String, String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
Creates an ID3 v2 style comment tag
ID3Tags - Interface in org.apache.tika.parser.mp3
Interface that defines the common interface for ID3 tag parsers, such as ID3v1 and ID3v2.3.
ID3Tags.ID3Comment - Class in org.apache.tika.parser.mp3
Represents a comment in ID3 (especially ID3 v2), which is made up of several parts
ID3TagsAndAudio() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser.ID3TagsAndAudio
ID3v1Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 1 Tag information from an MP3 file, if available.
ID3v1Handler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
ID3v1Handler(byte[]) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
Creates from the last 128 bytes of a stream.
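For context, an ID3v1 tag occupies the final 128 bytes of an MP3 file: the marker "TAG", then fixed-width title (30 bytes), artist (30), album (30), year (4), comment (30) and genre (1) fields. The sketch below decodes such a trailer in plain Java. It is not the ID3v1Handler implementation; the class Id3v1Sketch and its fields are hypothetical, and ISO-8859-1 decoding is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not org.apache.tika.parser.mp3.ID3v1Handler.
final class Id3v1Sketch {
    static final int TAG_SIZE = 128;

    final String title;
    final String artist;
    final String album;
    final String year;

    private Id3v1Sketch(String title, String artist, String album, String year) {
        this.title = title;
        this.artist = artist;
        this.album = album;
        this.year = year;
    }

    /** Parses a 128-byte trailer; returns null if it does not start with "TAG". */
    static Id3v1Sketch parse(byte[] trailer) {
        if (trailer == null || trailer.length != TAG_SIZE) {
            return null;
        }
        if (trailer[0] != 'T' || trailer[1] != 'A' || trailer[2] != 'G') {
            return null;
        }
        return new Id3v1Sketch(
                field(trailer, 3, 30),   // title
                field(trailer, 33, 30),  // artist
                field(trailer, 63, 30),  // album
                field(trailer, 93, 4));  // year
    }

    /** Reads a fixed-width field, stopping at the first NUL and trimming padding. */
    private static String field(byte[] buf, int offset, int length) {
        int end = offset;
        while (end < offset + length && buf[end] != 0) {
            end++;
        }
        // Assumption: text is decoded as ISO-8859-1.
        return new String(buf, offset, end - offset, StandardCharsets.ISO_8859_1).trim();
    }
}
```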
ID3v22Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 2.2 Tag information from an MP3 file, if available.
ID3v22Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v22Handler
ID3v23Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 2.3 Tag information from an MP3 file, if available.
ID3v23Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v23Handler
ID3v24Handler - Class in org.apache.tika.parser.mp3
This is used to parse ID3 Version 2.4 Tag information from an MP3 file, if available.
ID3v24Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v24Handler
ID3v2Frame - Class in org.apache.tika.parser.mp3
A frame of ID3v2 data, which is then passed to a handler to be turned into useful data.
ID3v2Frame.RawTag - Class in org.apache.tika.parser.mp3
ID3v2Frame.RawTagIterator - Class in org.apache.tika.parser.mp3
Iterates over ID3v2 raw tags.
ID3v2Frame.TextEncoding - Class in org.apache.tika.parser.mp3
ID_PROPERTY - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
IDBWriter - Interface in
IDENTIFIER - Static variable in interface org.apache.tika.metadata.DublinCore
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
IDENTIFIER - Static variable in class org.apache.tika.metadata.Metadata
Use TikaCoreProperties#IDENTIFIER instead.
IDENTIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
IDENTIFIER - Static variable in interface org.apache.tika.metadata.XMP
An unordered array of text strings that unambiguously identify the resource within a given context.
identifyEndpoints() - Method in class org.apache.tika.server.resource.TikaWelcome
identifyStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
Returns the defined static service providers of the given type, without attempting to load them.
IdentityHtmlMapper - Class in org.apache.tika.parser.html
Alternative HTML mapping rules that pass the input HTML as-is without any modifications.
IdentityHtmlMapper() - Constructor for class org.apache.tika.parser.html.IdentityHtmlMapper
IFileProcessorFutureResult - Interface in org.apache.tika.batch
Stub interface to allow for different result types from different processors.
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
ignorableWhitespace(char[], int, int) - Method in class
ignorableWhitespace(char[], int, int) - Method in class
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
Writes the given ignorable characters to the given character stream.
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
IGNORE - Static variable in interface org.apache.tika.config.InitializableProblemHandler
Strategy that simply ignores all problems.
IGNORE - Static variable in interface org.apache.tika.config.LoadErrorHandler
Strategy that simply ignores all problems.
IGNORE_LENGTH - Static variable in class
image(String) - Static method in class org.apache.tika.mime.MediaType
IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
The number of Images in the document
IMAGE_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
Creator or creators of the image.
IMAGE_CREATOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
The ID of the creator or creators of the image.
IMAGE_CREATOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
IMAGE_CREATOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
The name of the creator or creators of the image.
IMAGE_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
"Image height in pixels."
IMAGE_REGISTRY_ENTRY - Static variable in interface org.apache.tika.metadata.IPTC
Both a Registry Item Id and a Registry Organisation Id to record any registration of this item with a registry.
IMAGE_SUPPLIER - Static variable in interface org.apache.tika.metadata.IPTC
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
IMAGE_SUPPLIER_ID - Static variable in interface org.apache.tika.metadata.IPTC
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
IMAGE_SUPPLIER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
IMAGE_SUPPLIER_IMAGE_ID - Static variable in interface org.apache.tika.metadata.IPTC
Optional identifier assigned by the Image Supplier to the image.
IMAGE_SUPPLIER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
Identifies the most recent supplier of the item, who is not necessarily its owner or creator.
IMAGE_WIDTH - Static variable in interface org.apache.tika.metadata.TIFF
"Image width in pixels."
ImageMetadataExtractor - Class in org.apache.tika.parser.image
Uses the Metadata Extractor library to read EXIF and IPTC image metadata and map to Tika fields.
ImageMetadataExtractor(Metadata) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
ImageMetadataExtractor(Metadata, ImageMetadataExtractor.DirectoryHandler...) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
ImageParser - Class in org.apache.tika.parser.image
ImageParser() - Constructor for class org.apache.tika.parser.image.ImageParser
ImportContextImpl - Class in org.apache.tika.example
ImportContextImpl(Item, String, InputContext, InputStream, IOListener, Detector) - Constructor for class org.apache.tika.example.ImportContextImpl
Creates a new item import context.
increaseFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
increment(String) - Method in class org.apache.tika.eval.tokens.TokenCounts
incrementHandledExceptions() - Method in class org.apache.tika.batch.FileResourceConsumer
Make sure to call this appropriately!