- ABS_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The absolute path to the file's peak audio file."
- AbstractConsumersBuilder - Class in org.apache.tika.batch.builders
-
- AbstractConsumersBuilder() - Constructor for class org.apache.tika.batch.builders.AbstractConsumersBuilder
-
- AbstractConverter - Class in org.apache.tika.xmp.convert
-
Base class for Tika Metadata to XMP converter which provides some needed common functionality.
- AbstractConverter() - Constructor for class org.apache.tika.xmp.convert.AbstractConverter
-
- AbstractEncodingDetectorParser - Class in org.apache.tika.parser
-
- AbstractEncodingDetectorParser() - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
-
- AbstractEncodingDetectorParser(EncodingDetector) - Constructor for class org.apache.tika.parser.AbstractEncodingDetectorParser
-
- AbstractFSConsumer - Class in org.apache.tika.batch.fs
-
- AbstractFSConsumer(ArrayBlockingQueue<FileResource>) - Constructor for class org.apache.tika.batch.fs.AbstractFSConsumer
-
- AbstractListManager - Class in org.apache.tika.parser.microsoft
-
- AbstractListManager() - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager
-
- AbstractListManager.LevelTuple - Class in org.apache.tika.parser.microsoft
-
- AbstractListManager.ParagraphLevelCounter - Class in org.apache.tika.parser.microsoft
-
- AbstractOfficeParser - Class in org.apache.tika.parser.microsoft
-
- AbstractOfficeParser() - Constructor for class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- AbstractOOXMLExtractor - Class in org.apache.tika.parser.microsoft.ooxml
-
Base class for all Tika OOXML extractors.
- AbstractOOXMLExtractor(ParseContext, POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- AbstractParser - Class in org.apache.tika.parser
-
Abstract base class for new parsers.
- AbstractParser() - Constructor for class org.apache.tika.parser.AbstractParser
-
- AbstractProfiler - Class in org.apache.tika.eval
-
- AbstractProfiler(ArrayBlockingQueue<FileResource>, IDBWriter) - Constructor for class org.apache.tika.eval.AbstractProfiler
-
- AbstractProfiler.EXCEPTION_TYPE - Enum in org.apache.tika.eval
-
- AbstractProfiler.PARSE_ERROR_TYPE - Enum in org.apache.tika.eval
-
If information was gathered from the log file about
a parse error
- AbstractRecursiveParserWrapperHandler - Class in org.apache.tika.sax
-
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- AbstractRecursiveParserWrapperHandler(ContentHandlerFactory, int) - Constructor for class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- AbstractTranslator - Class in org.apache.tika.language.translate
-
- AbstractTranslator() - Constructor for class org.apache.tika.language.translate.AbstractTranslator
-
- AbstractXML2003Parser - Class in org.apache.tika.parser.microsoft.xml
-
- AbstractXML2003Parser() - Constructor for class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- AccessChecker - Class in org.apache.tika.parser.pdf
-
Checks whether or not a document allows extraction generally
or extraction for accessibility only.
- AccessChecker() - Constructor for class org.apache.tika.parser.pdf.AccessChecker
-
This constructs an AccessChecker that will not perform any checking
and will always return without throwing an exception.
- AccessChecker(boolean) - Constructor for class org.apache.tika.parser.pdf.AccessChecker
-
This constructs an AccessChecker that will check
for whether or not content should be extracted from a document.
- AccessPermissionException - Exception in org.apache.tika.exception
-
Exception to be thrown when a document does not allow content extraction.
- AccessPermissionException() - Constructor for exception org.apache.tika.exception.AccessPermissionException
-
- AccessPermissionException(Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
-
- AccessPermissionException(String) - Constructor for exception org.apache.tika.exception.AccessPermissionException
-
- AccessPermissionException(String, Throwable) - Constructor for exception org.apache.tika.exception.AccessPermissionException
-
- AccessPermissions - Interface in org.apache.tika.metadata
-
Until we can find a common standard, we'll use these options.
- ACKNOWLEDGEMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- ACRONYM_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- ACTION_TRIGGER - Static variable in interface org.apache.tika.metadata.PDF
-
This specifies where an action or destination would be found/triggered
in the document: on document open, before close, etc.
- actionPerformed(ActionEvent) - Method in class org.apache.tika.gui.TikaGUI
-
- Activator - Class in org.apache.tika.parser.internal
-
- Activator() - Constructor for class org.apache.tika.parser.internal.Activator
-
- add(String, String) - Method in class org.apache.tika.eval.tokens.TokenCounter
-
- add(String) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
Adds a single occurrence of the given ngram to this profile.
- add(String, long) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
Adds multiple occurrences of the given ngram to this profile.
- add(StringBuffer) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Adds ngrams from a single word to this profile
- add(String, String) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata name/value mapping.
- add(Property, String) - Method in class org.apache.tika.metadata.Metadata
-
Add a metadata property/value mapping.
- add(Property, int) - Method in class org.apache.tika.metadata.Metadata
-
Adds the integer value of the identified metadata property.
- add(Metadata) - Method in class org.apache.tika.metadata.serialization.JsonStreamingSerializer
-
- add(String, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
As this API could only possibly work for simple properties in XMP, it just calls the set
method, which replaces any existing value
- addAlias(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
- addAlternative(GeoTag) - Method in class org.apache.tika.parser.geo.topic.GeoTag
-
- addData(byte[], int, int) - Method in class org.apache.tika.detect.TextStatistics
-
- addDrawingHyperLinks(PackagePart) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- ADDED - Static variable in class org.apache.tika.batch.FileResourceCrawler
-
- addErrorLogTablePair(Path, TableInfo) - Method in class org.apache.tika.eval.batch.DBConsumersManager
-
- addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
-
- addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
-
- addErrorLogTablePairs(DBConsumersManager) - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
-
- addEvenIfNull(Property, String, Metadata) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
- addingService(ServiceReference) - Method in class org.apache.tika.config.TikaActivator
-
- ADDITIONAL_MODEL_INFO - Static variable in interface org.apache.tika.metadata.IPTC
-
Information about the ethnicity and other facets of the model(s) in a
model-released image.
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
-
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
-
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.OpenDocumentConverter
-
- ADDITIONAL_NAMESPACES - Static variable in class org.apache.tika.xmp.convert.RTFConverter
-
- addMetadata(String) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- addMetadata(String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- addMetadata(String) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- addMulti(Metadata, Property, String) - Static method in class org.apache.tika.parser.microsoft.SummaryExtractor
-
- addOtherTesseractConfig(String, String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Add a key-value pair to pass to Tesseract using its -c command line option.
- addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
-
Adds a file name pattern for the given media type.
- addPattern(MimeType, String, boolean) - Method in class org.apache.tika.mime.MimeTypes
-
Adds a file name pattern for the given media type.
- addPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
-
This tries to split a "from" or "to" value into a person field and an email field.
- addPrefix(String, String) - Method in class org.apache.tika.sax.xpath.XPathParser
-
- addProfile(String, LanguageProfile) - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Adds a single language profile
- addResource(Closeable) - Method in class org.apache.tika.io.TemporaryResources
-
- addSuperType(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
-
- addText(char[], int, int) - Method in class org.apache.tika.langdetect.TextLangDetector
-
- addText(char[], int, int) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Add statistics about this text for the current document.
- addText(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Add to the statistics being accumulated for the current
document.
- addType(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
- AdobeFontMetricParser - Class in org.apache.tika.parser.font
-
Parser for AFM Font Files
- AdobeFontMetricParser() - Constructor for class org.apache.tika.parser.font.AdobeFontMetricParser
-
- AdvancedTypeDetector - Class in org.apache.tika.example
-
- AdvancedTypeDetector() - Constructor for class org.apache.tika.example.AdvancedTypeDetector
-
- afterRead(int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invoked by the read methods after the proxied call has returned
successfully.
- afterRead(int) - Method in class org.apache.tika.io.TikaInputStream
-
- AgeRecogniser - Class in org.apache.tika.parser.recognition
-
Parser for extracting features from text.
- AgeRecogniser() - Constructor for class org.apache.tika.parser.recognition.AgeRecogniser
-
- AgeRecogniserConfig - Class in org.apache.tika.parser.recognition
-
Stores URL for AgePredictor
- AgeRecogniserConfig(Map<String, Param>) - Constructor for class org.apache.tika.parser.recognition.AgeRecogniserConfig
-
- ALBUM - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the album."
- ALBUM_ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the album artist or group for compilation albums."
- ALIAS_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- ALIAS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- ALIGNED_OFFSET - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
-
- alignedLenTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- alignedTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- AlphaIdeographFilterFactory - Class in org.apache.tika.eval.tokens
-
Factory for filter that only allows tokens with characters that "isAlphabetic" or "isIdeographic" through.
- AlphaIdeographFilterFactory(Map<String, String>) - Constructor for class org.apache.tika.eval.tokens.AlphaIdeographFilterFactory
-
- ALT_TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"An alternative tape name, set via the project window or timecode
dialog in Premiere."
- ALTITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Altitude of the Point
- ALTITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- analyze(StringBuilder) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Analyzes a piece of text
- AnalyzerManager - Class in org.apache.tika.eval.tokens
-
- AnnotationUtils - Class in org.apache.tika.utils
-
This class contains utilities for dealing with Tika annotations
- AnnotationUtils() - Constructor for class org.apache.tika.utils.AnnotationUtils
-
- apiBaseUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- apiUri - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- APP_VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- AppleSingleFileParser - Class in org.apache.tika.parser.apple
-
Parser that strips the header off of AppleSingle and AppleDouble
files.
- AppleSingleFileParser() - Constructor for class org.apache.tika.parser.apple.AppleSingleFileParser
-
- APPLICATION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- application(String) - Static method in class org.apache.tika.mime.MediaType
-
- APPLICATION_NAME - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- APPLICATION_VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- APPLICATION_XML - Static variable in class org.apache.tika.mime.MediaType
-
- APPLICATION_ZIP - Static variable in class org.apache.tika.mime.MediaType
-
- applyStyleAndValue(int, ResultSet, Cell) - Method in class org.apache.tika.eval.reports.XLSXHREFFormatter
-
- AppParserFactoryBuilder - Class in org.apache.tika.batch.builders
-
- AppParserFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.AppParserFactoryBuilder
-
- ARCHITECTURE_BITS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- ARTIST - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the artist or artists."
- ARTWORK_OR_OBJECT - Static variable in interface org.apache.tika.metadata.IPTC
-
A set of metadata about artwork or an object in the item
- ARTWORK_OR_OBJECT_DETAIL_COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains any necessary copyright notice for claiming the intellectual
property for artwork or an object in the image and should identify the
current owner of the copyright of this work with associated intellectual
property rights.
- ARTWORK_OR_OBJECT_DETAIL_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the name of the artist who has created artwork or an object in the image.
- ARTWORK_OR_OBJECT_DETAIL_DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
Designates the date and optionally the time the artwork or object in the
image was created.
- ARTWORK_OR_OBJECT_DETAIL_SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
-
The organisation or body holding and registering the artwork or object in
the image for inventory purposes.
- ARTWORK_OR_OBJECT_DETAIL_SOURCE_INVENTORY_NUMBER - Static variable in interface org.apache.tika.metadata.IPTC
-
The inventory number issued by the organisation or body holding and
registering the artwork or object in the image.
- ARTWORK_OR_OBJECT_DETAIL_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
A reference for the artwork or object in the image.
- asInputSource() - Method in class org.apache.tika.detect.AutoDetectReader
-
- ASSEMBLE_DOCUMENT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user insert/rotate/delete pages.
- assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks if byte[] is not null
- assertByteArrayNotNull(byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- assertChmAccessorNotNull(ChmAccessor<?>) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks if ChmAccessor is not null. In case of null, throws an exception.
- assertChmAccessorParameters(byte[], ChmAccessor<?>, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks validity of ChmAccessor parameters
- assertChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks the validity of the chmBlockSegment parameters
- assertCopyingDataIndex(int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
- assertDirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks validity of the DirectoryListingEntry's parameters. In case of
invalid parameter(s), throws an exception.
- assertInputStreamNotNull(InputStream) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks if InputStream is not null
- assertPositiveInt(int) - Static method in class org.apache.tika.parser.chm.assertion.ChmAssert
-
Checks if int param is greater than zero. In case param <= 0, throws an
exception.
- assignFieldParams(Object, Map<String, Param>) - Static method in class org.apache.tika.utils.AnnotationUtils
-
Assigns the param values to bean
- assignValue(Object, Object) - Method in class org.apache.tika.config.ParamField
-
Sets given value to the annotated field of bean
- attachExternalParsers(TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
-
- attachExternalParsers(List<ExternalParser>, TikaConfig) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
-
- AttributeDependantMetadataHandler - Class in org.apache.tika.parser.xml
-
This adds a Metadata entry for a given node.
- AttributeDependantMetadataHandler(Metadata, String, String) - Constructor for class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- AttributeMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .../@* XPath expression.
- AttributeMatcher() - Constructor for class org.apache.tika.sax.xpath.AttributeMatcher
-
- AttributeMetadataHandler - Class in org.apache.tika.parser.xml
-
SAX event handler that maps the contents of an XML attribute into
a metadata field.
- AttributeMetadataHandler(String, String, Metadata, String) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
-
- AttributeMetadataHandler(String, String, Metadata, Property) - Constructor for class org.apache.tika.parser.xml.AttributeMetadataHandler
-
- audio(String) - Static method in class org.apache.tika.mime.MediaType
-
- AUDIO_CHANNEL_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio channel type."
- AUDIO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio compression used."
- AUDIO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the audio was last modified."
- AUDIO_SAMPLE_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio sample rate."
- AUDIO_SAMPLE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio sample type."
- AudioFrame - Class in org.apache.tika.parser.mp3
-
An Audio Frame in an MP3 file.
- AudioFrame(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
-
- AudioFrame(int, int, int, int, InputStream) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
-
- AudioFrame(int, int, int, int, int, int, float) - Constructor for class org.apache.tika.parser.mp3.AudioFrame
-
Creates a new instance of AudioFrame and initializes all properties.
- AudioParser - Class in org.apache.tika.parser.audio
-
- AudioParser() - Constructor for class org.apache.tika.parser.audio.AudioParser
-
- AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the principal author(s) of a document
- AUTHORS_POSITION - Static variable in interface org.apache.tika.metadata.Photoshop
-
- AutoDetectParser - Class in org.apache.tika.parser
-
- AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
-
Creates an auto-detecting parser instance using the default Tika
configuration.
- AutoDetectParser(Detector) - Constructor for class org.apache.tika.parser.AutoDetectParser
-
- AutoDetectParser(Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
-
Creates an auto-detecting parser instance using the specified set of parsers.
- AutoDetectParser(Detector, Parser...) - Constructor for class org.apache.tika.parser.AutoDetectParser
-
- AutoDetectParser(TikaConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
-
- AutoDetectParserFactory - Class in org.apache.tika.batch
-
Simple class for AutoDetectParser
- AutoDetectParserFactory() - Constructor for class org.apache.tika.batch.AutoDetectParserFactory
-
- AutoDetectParserFactory - Class in org.apache.tika.parser
-
Factory for an AutoDetectParser
- AutoDetectParserFactory(Map<String, String>) - Constructor for class org.apache.tika.parser.AutoDetectParserFactory
-
- AutoDetectReader - Class in org.apache.tika.detect
-
An input stream reader that automatically detects the character encoding
to be used for converting bytes to characters.
- AutoDetectReader(InputStream, Metadata, EncodingDetector) - Constructor for class org.apache.tika.detect.AutoDetectReader
-
- AutoDetectReader(InputStream, Metadata, ServiceLoader) - Constructor for class org.apache.tika.detect.AutoDetectReader
-
- AutoDetectReader(InputStream, Metadata) - Constructor for class org.apache.tika.detect.AutoDetectReader
-
- AutoDetectReader(InputStream) - Constructor for class org.apache.tika.detect.AutoDetectReader
-
- autoTranslate(InputStream, String, String) - Method in class org.apache.tika.server.resource.TranslateResource
-
- available() - Method in class org.apache.tika.io.LookaheadInputStream
-
- available() - Method in class org.apache.tika.io.NullInputStream
-
Return the number of bytes that can be read.
- available() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's available() method.
- available - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- CachedTranslator - Class in org.apache.tika.language.translate
-
CachedTranslator.
- CachedTranslator() - Constructor for class org.apache.tika.language.translate.CachedTranslator
-
- CachedTranslator(Translator) - Constructor for class org.apache.tika.language.translate.CachedTranslator
-
Create a new CachedTranslator.
- calculateContrastStatistics(Map<String, MutableInt>, TokenStatistics, Map<String, MutableInt>, TokenStatistics) - Method in class org.apache.tika.eval.tokens.TokenContraster
-
- call() - Method in class org.apache.tika.batch.BatchProcess
-
Runs main execution loop.
- call() - Method in class org.apache.tika.batch.FileResourceConsumer
-
- call() - Method in class org.apache.tika.batch.FileResourceCrawler
-
- call() - Method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
-
- call() - Method in class org.apache.tika.batch.Interrupter
-
- call() - Method in class org.apache.tika.batch.StatusReporter
-
Starts up the reporter.
- CAN_MODIFY - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can any modifications be made to the document
- CAN_MODIFY_ANNOTATIONS - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user modify annotations
- CAN_PRINT - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user print the document
- CAN_PRINT_DEGRADED - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
Can the user print an image-degraded version of the document.
- canRun() - Static method in class org.apache.tika.langdetect.TextLangDetector
-
- canRun() - Static method in class org.apache.tika.parser.journal.GrobidRESTParser
-
- CAPTION_WRITER - Static variable in interface org.apache.tika.metadata.Photoshop
-
- CaptionObject - Class in org.apache.tika.parser.captioning
-
A model for caption objects from graphics and texts; it typically includes a
human-readable sentence, the language of the sentence, and a confidence score.
- CaptionObject(String, String, double) - Constructor for class org.apache.tika.parser.captioning.CaptionObject
-
- cast(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
-
Returns the given stream cast to a TikaInputStream, or null
if the stream is not a TikaInputStream.
- CATEGORY - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- CATEGORY - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CATEGORY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
A categorization of the content of this package.
- CATEGORY - Static variable in interface org.apache.tika.metadata.Photoshop
-
- Cell - Interface in org.apache.tika.parser.microsoft
-
Cell of content.
- cell(String, String, XSSFComment) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- CellDecorator - Class in org.apache.tika.parser.microsoft
-
Cell decorator.
- CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
-
- CERTIFICATE - Static variable in interface org.apache.tika.metadata.XMPRights
-
A Web URL for a rights management certificate.
- ChannelTypePropertyConverter() - Constructor for class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated.
- CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Characters in the document
- CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.Office
-
The number of Characters in the document, including spaces
- characters - Variable in class org.apache.tika.mime.MimeTypesReader
-
- characters(char[], int, int) - Method in class org.apache.tika.mime.MimeTypesReader
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.PhoneExtractingContentHandler
-
The characters method is called whenever a Parser wants to pass raw...
- characters(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
The characters method is called whenever a Parser wants to pass raw
characters to the ContentHandler.
- characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
-
Writes the given characters to the given character stream.
- characters(char[], int, int) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
-
Writes the given characters to the given character stream.
- characters(char[], int, int) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
- characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
- characters(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
-
- CHARACTERS_PER_PAGE - Static variable in interface org.apache.tika.metadata.PDF
-
- CharsetDetector - Class in org.apache.tika.parser.txt
-
CharsetDetector provides a facility for detecting the
charset or encoding of character data in an unknown format.
- CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
Constructor
- CharsetDetector(int) - Constructor for class org.apache.tika.parser.txt.CharsetDetector
-
- CharsetMatch - Class in org.apache.tika.parser.txt
-
This class represents a charset that has been identified by a CharsetDetector
as a possible encoding for a set of input data.
- CharsetUtils - Class in org.apache.tika.utils
-
- CharsetUtils() - Constructor for class org.apache.tika.utils.CharsetUtils
-
- check(String, int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Checks to see if the command can be run.
- check(String[], int...) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Checks to see if the command can be run.
- check(String, int...) - Static method in class org.apache.tika.parser.external.ExternalParser
-
Checks to see if the command can be run.
- check(String[], int...) - Static method in class org.apache.tika.parser.external.ExternalParser
-
- check(Metadata) - Method in class org.apache.tika.parser.pdf.AccessChecker
-
- CHECK_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- checkAvail() - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Ping lucene-geo-gazetteer API
- checkBit(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- checkCommand(String, int...) - Method in class org.apache.tika.language.translate.ExternalTranslator
-
Checks to see if the command can be run.
- checkForTimedOutMillis(long) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Checks to see if the currentFile being processed (if there is one)
should be timed out (still being worked on after staleThresholdMillis).
- checkInitialization(InitializableProblemHandler) - Method in interface org.apache.tika.config.Initializable
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- checkInitialization(InitializableProblemHandler) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- checkIntegrity() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- checkIsOperating() - Static method in class org.apache.tika.server.resource.TikaResource
-
- checkThisIsAncestorOfOrSameAsThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Deprecated.
- checkThisIsAncestorOfThat(File, File) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Deprecated.
- ChildMatcher - Class in org.apache.tika.sax.xpath
-
Intermediate evaluation state of a .../*... XPath expression.
- ChildMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.ChildMatcher
-
- CHM_ITSF_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_ITSF_V3_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_ITSP_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_MIN_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_RESETTABLE_V1_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_LZXC_V2_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGI_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGI_MARKER - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_PMGL_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_SIGNATURE_LEN - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_1 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_2 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_VER_3 - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CHM_WINDOW_SIZE_BLOCK - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- ChmAccessor<T> - Interface in org.apache.tika.parser.chm.accessor
-
Defines an accessor interface
- ChmAssert - Class in org.apache.tika.parser.chm.assertion
-
Contains chm extractor assertions
- ChmAssert() - Constructor for class org.apache.tika.parser.chm.assertion.ChmAssert
-
- ChmBlockInfo - Class in org.apache.tika.parser.chm.lzx
-
A container that contains chm block information such as: i.
- ChmCommons - Class in org.apache.tika.parser.chm.core
-
- ChmCommons.EntryType - Enum in org.apache.tika.parser.chm.core
-
Represents entry types: uncompressed, compressed
- ChmCommons.IntelState - Enum in org.apache.tika.parser.chm.core
-
Represents intel file states during decompression
- ChmCommons.LzxState - Enum in org.apache.tika.parser.chm.core
-
Represents lzx states: started decoding, not started decoding
- ChmConstants - Class in org.apache.tika.parser.chm.core
-
- ChmDirectoryListingSet - Class in org.apache.tika.parser.chm.accessor
-
Holds chm listing entries
- ChmDirectoryListingSet(byte[], ChmItsfHeader, ChmItspHeader) - Constructor for class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Constructs chm directory listing set
- ChmExtractor - Class in org.apache.tika.parser.chm.core
-
Extracts text from chm file.
- ChmExtractor(InputStream) - Constructor for class org.apache.tika.parser.chm.core.ChmExtractor
-
- ChmItsfHeader - Class in org.apache.tika.parser.chm.accessor
-
The Header:
0000: char[4] 'ITSF'
0004: DWORD 3 (Version number)
0008: DWORD Total header length, including header section table and following data.
- ChmItsfHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
- ChmItspHeader - Class in org.apache.tika.parser.chm.accessor
-
Directory header. The directory starts with a header; its format is as follows:
0000: char[4] 'ITSP'
0004: DWORD Version number 1
0008: DWORD Length of the directory header
000C: DWORD $0a (unknown)
0010: DWORD $1000 Directory chunk size
0014: DWORD "Density" of quickref section, usually 2
0018: DWORD Depth of the index tree: 1 if there is no index, 2 if there is one level of PMGI chunks
001C: DWORD Chunk number of root index chunk, -1 if there is none (though at least one file has 0 despite there being no index chunk, probably a bug)
0020: DWORD Chunk number of first PMGL (listing) chunk
0024: DWORD Chunk number of last PMGL (listing) chunk
0028: DWORD -1 (unknown)
002C: DWORD Number of directory chunks (total)
0030: DWORD Windows language ID
0034: GUID {5D02926A-212E-11D0-9DF9-00A0C922E6EC}
0044: DWORD $54 (This is the length again)
0048: DWORD -1 (unknown)
004C: DWORD -1 (unknown)
0050: DWORD -1 (unknown)
- ChmItspHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
- ChmLzxBlock - Class in org.apache.tika.parser.chm.lzx
-
Decompresses a chm block.
- ChmLzxBlock(int, byte[], long, ChmLzxBlock) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- ChmLzxcControlData - Class in org.apache.tika.parser.chm.accessor
-
::DataSpace/Storage//ControlData This file contains $20 bytes of
information on the compression.
- ChmLzxcControlData() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
- ChmLzxcResetTable - Class in org.apache.tika.parser.chm.accessor
-
LZXC reset table. For ensuring a decompression.
- ChmLzxcResetTable() - Constructor for class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
- ChmLzxState - Class in org.apache.tika.parser.chm.lzx
-
- ChmLzxState(int) - Constructor for class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- ChmParser - Class in org.apache.tika.parser.chm
-
- ChmParser() - Constructor for class org.apache.tika.parser.chm.ChmParser
-
- ChmParsingException - Exception in org.apache.tika.parser.chm.exception
-
- ChmParsingException(String) - Constructor for exception org.apache.tika.parser.chm.exception.ChmParsingException
-
- ChmPmgiHeader - Class in org.apache.tika.parser.chm.accessor
-
Description. Note: does not always exist. An index chunk has the following format:
0000: char[4] 'PMGI'
0004: DWORD Length of quickref/free area at end of directory chunk
0008: Directory index entries (to quickref/free area)
The quickref area in a PMGI is the same as in a PMGL. The format of a directory index entry is as follows:
BYTE: length of name
BYTEs: name (UTF-8 encoded)
ENCINT: directory listing chunk which starts with name
Encoded Integers aka ENCINT: An ENCINT is a variable-length integer.
- ChmPmgiHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
- ChmPmglHeader - Class in org.apache.tika.parser.chm.accessor
-
Description: There are two types of directory chunks: index chunks and listing chunks.
- ChmPmglHeader() - Constructor for class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- ChmSection - Class in org.apache.tika.parser.chm.lzx
-
- ChmSection(byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
-
- ChmSection(byte[], byte[]) - Constructor for class org.apache.tika.parser.chm.lzx.ChmSection
-
- ChmWrapper - Class in org.apache.tika.parser.chm.core
-
- ChmWrapper() - Constructor for class org.apache.tika.parser.chm.core.ChmWrapper
-
- CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city the content is focussing on -- either the place shown
in visual media or referenced by text or audio media.
- CITY - Static variable in interface org.apache.tika.metadata.Photoshop
-
- CJKBigramAwareLengthFilterFactory - Class in org.apache.tika.eval.tokens
-
Creates a very narrowly focused TokenFilter that limits tokens based on length
_unless_ they've been identified as <DOUBLE> or <SINGLE>
by the CJKBigramFilter.
- CJKBigramAwareLengthFilterFactory(Map<String, String>) - Constructor for class org.apache.tika.eval.tokens.CJKBigramAwareLengthFilterFactory
-
- ClassLoaderUtil - Class in org.apache.tika.util
-
- ClassLoaderUtil() - Constructor for class org.apache.tika.util.ClassLoaderUtil
-
- className - Variable in class org.apache.tika.server.resource.TikaWelcome.Endpoint
-
- ClassParser - Class in org.apache.tika.parser.asm
-
Parser for Java .class files.
- ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
-
- clean(String) - Static method in class org.apache.tika.sax.CleanPhoneText
-
- clean(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Handle various common charset name errors, and return something
that will be considered valid (and is normalized)
- CleanPhoneText - Class in org.apache.tika.sax
-
Class to help de-obfuscate phone numbers in text.
- CleanPhoneText() - Constructor for class org.apache.tika.sax.CleanPhoneText
-
- cleanSubstitutions - Static variable in class org.apache.tika.sax.CleanPhoneText
-
- clear(String) - Method in class org.apache.tika.eval.tokens.TokenCounter
-
- clearProfiles() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Clears the current map of language profiles
- ClimateForcast - Interface in org.apache.tika.metadata
-
- clone() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- cloneMetadata(Metadata) - Static method in class org.apache.tika.utils.ParserUtils
-
Does a deep clone of a Metadata object.
- close(Closeable) - Method in class org.apache.tika.batch.FileResourceConsumer
-
- close() - Method in class org.apache.tika.eval.db.DBBuffer
-
- close() - Method in class org.apache.tika.eval.db.MimeBuffer
-
- close() - Method in class org.apache.tika.eval.io.DBWriter
-
- close() - Method in interface org.apache.tika.eval.io.IDBWriter
-
- close() - Method in class org.apache.tika.eval.tokens.CommonTokenCountManager
-
- close() - Method in class org.apache.tika.fork.ForkParser
-
- close() - Method in class org.apache.tika.io.CloseShieldInputStream
-
- close() - Method in class org.apache.tika.io.LookaheadInputStream
-
- close() - Method in class org.apache.tika.io.NullInputStream
-
Close this input stream - resets the internal state to
the initial values.
- close() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's close() method.
- close() - Method in class org.apache.tika.io.TemporaryResources
-
Closes all tracked resources.
- close() - Method in class org.apache.tika.io.TikaInputStream
-
- close() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Ignored.
- close() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- close() - Method in class org.apache.tika.metadata.serialization.JsonStreamingSerializer
-
- close() - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- close() - Method in class org.apache.tika.parser.ParsingReader
-
Closes the read end of the pipe.
- close() - Method in class org.apache.tika.utils.RereadableInputStream
-
Closes the input stream and removes the temporary file if one was
created.
- ClosedInputStream - Class in org.apache.tika.io
-
Closed input stream.
- ClosedInputStream() - Constructor for class org.apache.tika.io.ClosedInputStream
-
- closeQuietly(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close a Reader.
- closeQuietly(Channel) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close a Channel.
- closeQuietly(Writer) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close a Writer.
- closeQuietly(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close an InputStream.
- closeQuietly(OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Unconditionally close an OutputStream.
- CloseShieldInputStream - Class in org.apache.tika.io
-
Proxy stream that prevents the underlying input stream from being closed.
- CloseShieldInputStream(InputStream) - Constructor for class org.apache.tika.io.CloseShieldInputStream
-
Creates a proxy that shields the given input stream from being
closed.
- closeStyleTags(XHTMLContentHandler, Deque<FormattingUtils.Tag>) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
-
Closes all formatting tags.
- closeWriter() - Method in class org.apache.tika.eval.AbstractProfiler
-
- ColInfo - Class in org.apache.tika.eval.db
-
- ColInfo(Cols, int) - Constructor for class org.apache.tika.eval.db.ColInfo
-
- ColInfo(Cols, int, String) - Constructor for class org.apache.tika.eval.db.ColInfo
-
- ColInfo(Cols, int, Integer) - Constructor for class org.apache.tika.eval.db.ColInfo
-
- ColInfo(Cols, int, Integer, String) - Constructor for class org.apache.tika.eval.db.ColInfo
-
- COLOR_MODE - Static variable in interface org.apache.tika.metadata.Photoshop
-
- Cols - Enum in org.apache.tika.eval.db
-
- COLUMN_COUNT - Static variable in interface org.apache.tika.metadata.Database
-
- COLUMN_NAME - Static variable in interface org.apache.tika.metadata.Database
-
- COMMAND_LINE - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- COMMAND_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- CommandLineParserBuilder - Class in org.apache.tika.batch.builders
-
Reads configurable options from a config file and returns an org.apache.commons.cli.Options
object to be used in the command-line parser.
- CommandLineParserBuilder() - Constructor for class org.apache.tika.batch.builders.CommandLineParserBuilder
-
- COMMENT - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- COMMENT_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- COMMENTS - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- COMMENTS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- COMMENTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- CommonsDigester - Class in org.apache.tika.parser.utils
-
- CommonsDigester(int, String) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
-
Include a string representing the comma-separated algorithms to run: e.g.
- CommonsDigester(int, CommonsDigester.DigestAlgorithm...) - Constructor for class org.apache.tika.parser.utils.CommonsDigester
-
- CommonsDigester.DigestAlgorithm - Enum in org.apache.tika.parser.utils
-
- CommonTokenCountManager - Class in org.apache.tika.eval.tokens
-
- CommonTokenCountManager(Path, String) - Constructor for class org.apache.tika.eval.tokens.CommonTokenCountManager
-
- CommonTokenResult - Class in org.apache.tika.eval.tokens
-
- CommonTokenResult(String, int, int, int, int) - Constructor for class org.apache.tika.eval.tokens.CommonTokenResult
-
- COMP_OBJ - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Some other kind of embedded document, in a CompObj container within another OLE2 document
- COMPANY - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- COMPANY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- compare(String, String) - Method in class org.apache.tika.metadata.serialization.PrettyMetadataKeyComparator
-
- compareFiles(EvalFilePaths, EvalFilePaths) - Method in class org.apache.tika.eval.ExtractComparer
-
- compareTo(TokenIntPair) - Method in class org.apache.tika.eval.tokens.TokenIntPair
-
Descending by value, ascending by token
- compareTo(Property) - Method in class org.apache.tika.metadata.Property
-
- compareTo(MediaType) - Method in class org.apache.tika.mime.MediaType
-
- compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
-
- compareTo(CSVResult) - Method in class org.apache.tika.parser.csv.CSVResult
-
Sorts in descending order of confidence
- compareTo(CharsetMatch) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Compare to other CharsetMatch objects.
- COMPARISON_CONTAINERS - Static variable in class org.apache.tika.eval.ExtractComparer
-
- COMPILATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"An album created by various artists."
- complete(long) - Method in class org.apache.tika.server.ServerStatus
-
Removes the task from the collection of currently running tasks.
- COMPOSER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The composer's name."
- composite(Property, Property[]) - Static method in class org.apache.tika.metadata.Property
-
Constructs a new composite property from the given primary and array of secondary properties.
- CompositeDetector - Class in org.apache.tika.detect
-
Content type detector that combines multiple different detection mechanisms.
- CompositeDetector(MediaTypeRegistry, List<Detector>, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.CompositeDetector
-
- CompositeDetector(MediaTypeRegistry, List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
-
- CompositeDetector(List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
-
- CompositeDetector(Detector...) - Constructor for class org.apache.tika.detect.CompositeDetector
-
- CompositeDigester - Class in org.apache.tika.parser.digest
-
- CompositeDigester(DigestingParser.Digester...) - Constructor for class org.apache.tika.parser.digest.CompositeDigester
-
- CompositeEncodingDetector - Class in org.apache.tika.detect
-
- CompositeEncodingDetector(List<EncodingDetector>, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
-
- CompositeEncodingDetector(List<EncodingDetector>) - Constructor for class org.apache.tika.detect.CompositeEncodingDetector
-
- CompositeExternalParser - Class in org.apache.tika.parser.external
-
A Composite Parser that wraps up all the available External Parsers,
and provides an easy way to access them.
- CompositeExternalParser() - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
-
- CompositeExternalParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.external.CompositeExternalParser
-
- CompositeMatcher - Class in org.apache.tika.sax.xpath
-
Composite XPath evaluation state.
- CompositeMatcher(Matcher, Matcher) - Constructor for class org.apache.tika.sax.xpath.CompositeMatcher
-
- CompositeParser - Class in org.apache.tika.parser
-
Composite parser that delegates parsing tasks to a component parser
based on the declared content type of the incoming document.
- CompositeParser(MediaTypeRegistry, List<Parser>, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.CompositeParser
-
- CompositeParser(MediaTypeRegistry, List<Parser>) - Constructor for class org.apache.tika.parser.CompositeParser
-
- CompositeParser(MediaTypeRegistry, Parser...) - Constructor for class org.apache.tika.parser.CompositeParser
-
- CompositeParser() - Constructor for class org.apache.tika.parser.CompositeParser
-
- CompositeTagHandler - Class in org.apache.tika.parser.mp3
-
Takes an array of ID3Tags in preference order, and when asked for
a given tag, will return it from the first ID3Tags that has it.
- CompositeTagHandler(ID3Tags[]) - Constructor for class org.apache.tika.parser.mp3.CompositeTagHandler
-
- CompressorParser - Class in org.apache.tika.parser.pkg
-
Parser for various compression formats.
- CompressorParser() - Constructor for class org.apache.tika.parser.pkg.CompressorParser
-
- CompressorParserOptions - Interface in org.apache.tika.parser.pkg
-
- ConcurrentUtils - Class in org.apache.tika.utils
-
Utility Class for Concurrency in Tika
- ConcurrentUtils() - Constructor for class org.apache.tika.utils.ConcurrentUtils
-
- confidence - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Confidence score
- config - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- ConfigurableThreadPoolExecutor - Interface in org.apache.tika.concurrent
-
Allows Thread Pool to be Configurable.
- configure(ParseContext) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- configure(PDF2XHTML) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Configures the given pdf2XHTML.
- configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
-
- configureExtractor(POIXMLTextExtractor, Locale) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- consume(String) - Method in interface org.apache.tika.parser.external.ExternalParser.LineConsumer
-
Consume a line
- ConsumersManager - Class in org.apache.tika.batch
-
Simple interface around a collection of consumers that allows
for initializing and shutting shared resources (e.g.
- ConsumersManager(List<FileResourceConsumer>) - Constructor for class org.apache.tika.batch.ConsumersManager
-
- CONTACT - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- CONTACT_INFO_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information address part.
- CONTACT_INFO_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information city part.
- CONTACT_INFO_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information country part.
- CONTACT_INFO_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information email address part.
- CONTACT_INFO_PHONE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information phone number part.
- CONTACT_INFO_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information part denoting the local postal code.
- CONTACT_INFO_STATE_PROVINCE - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information part denoting regional information such as state or province.
- CONTACT_INFO_WEB_URL - Static variable in interface org.apache.tika.metadata.IPTC
-
The contact information web address part.
- CONTAINER_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
-
- ContainerExtractor - Interface in org.apache.tika.extractor
-
Tika container extractor interface.
- contains(String, String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
-
Check whether this CachedTranslator's cache contains a translation of the text from the
source language to the target language.
- contains(String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
-
Check whether this CachedTranslator's cache contains a translation of the text to the target language,
attempting to auto-detect the source language.
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
-
- contains(Charset) - Method in class org.apache.tika.parser.html.charsetdetector.charsets.XUserDefinedCharset
-
- containsColumn(Cols) - Method in class org.apache.tika.eval.db.TableInfo
-
- containsEmail(String) - Static method in class org.apache.tika.parser.mail.MailUtil
-
If the chunk looks like it contains an email
- containsTable(String) - Method in class org.apache.tika.eval.db.JDBCUtil
-
- CONTENT - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CONTENT_COMPARISONS - Static variable in class org.apache.tika.eval.ExtractComparer
-
- CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The status of the content.
- CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- CONTENT_TYPE_HINT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
This is currently used to identify Content-Type that may be
included within a document, such as in html documents
(e.g.
- CONTENT_TYPE_OVERRIDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- contentEquals(InputStream, InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Compare the contents of two Streams to determine if they are equal or
not.
- contentEquals(Reader, Reader) - Static method in class org.apache.tika.io.IOUtils
-
Compare the contents of two Readers to determine if they are equal or
not.
- ContentHandlerDecorator - Class in org.apache.tika.sax
-
- ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
-
Creates a decorator for the given SAX event handler.
- ContentHandlerDecorator() - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
-
Creates a decorator that by default forwards incoming SAX events to
a dummy content handler that simply ignores all the events.
- ContentHandlerExample - Class in org.apache.tika.example
-
Examples of using different Content Handlers to
get different parts of the file's contents
- ContentHandlerExample() - Constructor for class org.apache.tika.example.ContentHandlerExample
-
- ContentHandlerFactory - Interface in org.apache.tika.sax
-
Interface to allow easier injection of code for getting a new ContentHandler
- CONTENTS_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
-
- CONTENTS_TABLE_A - Static variable in class org.apache.tika.eval.ExtractComparer
-
- CONTENTS_TABLE_B - Static variable in class org.apache.tika.eval.ExtractComparer
-
- ContentTagParser - Class in org.apache.tika.eval.util
-
- ContentTagParser() - Constructor for class org.apache.tika.eval.util.ContentTagParser
-
- ContentTags - Class in org.apache.tika.eval.util
-
- ContentTags(String) - Constructor for class org.apache.tika.eval.util.ContentTags
-
- ContentTags(String, boolean) - Constructor for class org.apache.tika.eval.util.ContentTags
-
- ContentTags(String, Map<String, Integer>) - Constructor for class org.apache.tika.eval.util.ContentTags
-
- ContrastStatistics - Class in org.apache.tika.eval.tokens
-
- ContrastStatistics() - Constructor for class org.apache.tika.eval.tokens.ContrastStatistics
-
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity responsible for making contributions to the content of the
resource.
- CONTRIBUTOR - Static variable in class org.apache.tika.metadata.Metadata
-
- CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- CONTROL_DATA - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- CONTROLLED_VOCABULARY_TERM - Static variable in interface org.apache.tika.metadata.IPTC
-
A term to describe the content of the image by a value from a Controlled
Vocabulary.
- CONVENTIONS - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- convert(Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated.
How a standalone converter might work
- convert(Metadata) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
- convert(Metadata, String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
Convert the given Tika metadata map to an XMP object.
- convertAndSet(Metadata, Object) - Static method in class org.apache.tika.metadata.XMPDM.ChannelTypePropertyConverter
-
Deprecated.
How convert+set might work
- converttoInt(byte[]) - Static method in class org.apache.tika.parser.image.ICNSType
-
- convertToJSONArray(JSONObject, String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Converts JSON Object to JSON Array
- convertToJSONObject(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Parses a JSON String and converts it to a JSON Object
- copy(InputStream, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from an InputStream to an OutputStream.
- copy(InputStream, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from an InputStream to chars on a Writer
using the default character encoding of the platform.
- copy(InputStream, Writer, String) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from an InputStream to chars on a Writer
using the specified character encoding.
- copy(Reader, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a Reader to a Writer.
- copy(Reader, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a Reader to bytes on an OutputStream
using the default character encoding of the platform, and calling flush.
- copy(Reader, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a Reader to bytes on an OutputStream
using the specified character encoding, and calling flush.
- copyLarge(InputStream, OutputStream) - Static method in class org.apache.tika.io.IOUtils
-
Copy bytes from a large (over 2GB) InputStream to an OutputStream.
- copyLarge(Reader, Writer) - Static method in class org.apache.tika.io.IOUtils
-
Copy chars from a large (over 2GB) Reader to a Writer.
- copyOfRange(byte[], int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- COPYRIGHT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The copyright information."
- COPYRIGHT_NOTICE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains any necessary copyright notice for claiming the intellectual
property for this item and should identify the current owner of the
copyright for the item.
- COPYRIGHT_OWNER - Static variable in interface org.apache.tika.metadata.IPTC
-
Owner or owners of the copyright in the licensed image.
- COPYRIGHT_OWNER_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the owner or owners of the copyright in the licensed image.
- COPYRIGHT_OWNER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
- COPYRIGHT_OWNER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the owner or owners of the copyright in the licensed image.
- CoreNLPNERecogniser - Class in org.apache.tika.parser.ner.corenlp
-
This class offers an implementation of NERecogniser based on
CRF classifiers from Stanford CoreNLP.
- CoreNLPNERecogniser() - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- CoreNLPNERecogniser(String) - Constructor for class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Creates a NERecogniser by loading a model from the given path.
- CorruptedFileException - Exception in org.apache.tika.exception
-
This exception should be thrown when the parse absolutely, positively has to stop.
- CorruptedFileException(String) - Constructor for exception org.apache.tika.exception.CorruptedFileException
-
- CorruptedFileException(String, Throwable) - Constructor for exception org.apache.tika.exception.CorruptedFileException
-
- count() - Method in class org.apache.tika.detect.TextStatistics
-
Returns the total number of bytes seen so far.
- count(int) - Method in class org.apache.tika.detect.TextStatistics
-
Returns the number of occurrences of the given byte.
- countControl() - Method in class org.apache.tika.detect.TextStatistics
-
Counts control characters (i.e.
- countEightBit() - Method in class org.apache.tika.detect.TextStatistics
-
Counts eight bit characters, i.e.
- CountingInputStream - Class in org.apache.tika.io
-
A decorating input stream that counts the number of bytes that have passed
through the stream so far.
- CountingInputStream(InputStream) - Constructor for class org.apache.tika.io.CountingInputStream
-
Constructs a new CountingInputStream.
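The decorator idea behind a byte-counting stream can be sketched with the standard library alone. The class below is a hypothetical stand-in, not Tika's CountingInputStream; the name CountingStreamSketch and the getCount() accessor are assumptions, and skipped bytes are deliberately not counted in this sketch.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of a counting decorator; not Tika's class.
class CountingStreamSketch extends FilterInputStream {

    private long count = 0;

    CountingStreamSketch(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++; // one byte passed through
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n; // n bytes passed through
        }
        return n;
    }

    // Number of bytes read through this stream so far.
    long getCount() {
        return count;
    }

    public static void main(String[] args) throws IOException {
        CountingStreamSketch s =
                new CountingStreamSketch(new ByteArrayInputStream(new byte[] {1, 2, 3}));
        while (s.read() != -1) {
            // drain the stream
        }
        System.out.println("bytes seen: " + s.getCount());
    }
}
```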
- COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
Full name of the country the content is focussing on -- either the
country shown in visual media or referenced in text or audio media.
- COUNTRY - Static variable in interface org.apache.tika.metadata.Photoshop
-
- COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Code of the country the content is focussing on -- either the country
shown in visual media or referenced in text or audio media.
- countSafeAscii() - Method in class org.apache.tika.detect.TextStatistics
-
Counts "safe" (i.e.
- countTokenOverlaps(String, Map<String, MutableInt>) - Method in class org.apache.tika.eval.tokens.CommonTokenCountManager
-
- COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
-
The extent or scope of the content of the resource.
- COVERAGE - Static variable in class org.apache.tika.metadata.Metadata
-
- COVERAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- create(TokenStream) - Method in class org.apache.tika.eval.tokens.AlphaIdeographFilterFactory
-
- create(TokenStream) - Method in class org.apache.tika.eval.tokens.CJKBigramAwareLengthFilterFactory
-
- create(String, InputStream, String) - Static method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Creates a new language profile from a text file (preferably quite large,
5-10k lines).
- create() - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates an empty instance; same as calling new MimeTypes().
- create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified document.
- create(InputStream...) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified input stream.
- create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
- create(URL...) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the resource
at the location specified by the URL.
- create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
- create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance from the specified file path,
as interpreted by the class loader in getResource().
- create(String, String) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance.
- create(String, String, ClassLoader) - Static method in class org.apache.tika.mime.MimeTypesFactory
-
Creates and returns a MimeTypes instance.
- create() - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
-
- create(ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
-
- create(String, ServiceLoader) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
-
- create(URL...) - Static method in class org.apache.tika.parser.external.ExternalParsersFactory
-
- CREATE_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time the resource was created.
- createArrayProperty(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- createArrayProperty(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates an array property from a list of values.
- createCommaSeparatedArray(Property, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- createCommaSeparatedArray(String, String, String, int) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates an array property from a comma separated list.
- CREATED - Static variable in interface org.apache.tika.metadata.DublinCore
-
Date of creation of the resource.
- CREATED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- createFrameIfPresent(InputStream) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the next ID3v2 Frame in the file, or null if the next batch of
data doesn't correspond to an ID3v2 header.
- createLangAltProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- createLangAltProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates a language alternative property in the x-default language
- createParser() - Static method in class org.apache.tika.server.resource.TikaResource
-
- createProperty(Property, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- createProperty(String, String, String) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Creates a simple property.
- createTables(List<TableInfo>, JDBCUtil.CREATE_TABLE) - Method in class org.apache.tika.eval.db.JDBCUtil
-
- createTempFile() - Method in class org.apache.tika.io.TemporaryResources
-
Creates a temporary file that will automatically be deleted when
the TemporaryResources.close() method is called, returning its path.
- createTemporaryFile() - Method in class org.apache.tika.io.TemporaryResources
-
- CREATION_DATE - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- CREATION_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document created?
- CreativeCommons - Interface in org.apache.tika.metadata
-
A collection of Creative Commons properties names.
- CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity primarily responsible for making the content of the resource.
- CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the name of the person who created the content of this item, a
photographer for photos, a graphic artist for graphics, or a writer for
textual news, but in cases where the photographer should not be
identified the name of a company or organisation may be appropriate.
- CREATOR - Static variable in class org.apache.tika.metadata.Metadata
-
- CREATOR - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.XMP
-
The name of the first known tool used to create the resource.
- CREATORS_CONTACT_INFO - Static variable in interface org.apache.tika.metadata.IPTC
-
The creator's contact information provides all necessary information to
get in contact with the creator of this item and comprises a set of
sub-properties for proper addressing.
- CREATORS_JOB_TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
Contains the job title of the person who created the content of this
item.
- CREDIT - Static variable in interface org.apache.tika.metadata.Photoshop
-
- CREDIT_LINE - Static variable in interface org.apache.tika.metadata.IPTC
-
The credit to person(s) and/or organisation(s) required by the supplier
of the item to be used when published.
- CryptoParser - Class in org.apache.tika.parser
-
Decrypts the incoming document stream and delegates further parsing to
another parser instance.
- CryptoParser(String, Provider, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
-
- CryptoParser(String, Set<MediaType>) - Constructor for class org.apache.tika.parser.CryptoParser
-
- CSVMessageBodyWriter - Class in org.apache.tika.server.writer
-
- CSVMessageBodyWriter() - Constructor for class org.apache.tika.server.writer.CSVMessageBodyWriter
-
- CSVParams - Class in org.apache.tika.parser.csv
-
- CSVResult - Class in org.apache.tika.parser.csv
-
- CSVResult(double, MediaType, Character) - Constructor for class org.apache.tika.parser.csv.CSVResult
-
- CTAKES_META_PREFIX - Static variable in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESAnnotationProperty - Enum in org.apache.tika.parser.ctakes
-
This enumeration includes the properties that an IdentifiedAnnotation
object can provide.
- CTAKESConfig - Class in org.apache.tika.parser.ctakes
-
- CTAKESConfig() - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Default constructor.
- CTAKESConfig(InputStream) - Constructor for class org.apache.tika.parser.ctakes.CTAKESConfig
-
Loads properties from InputStream and then tries to close InputStream.
- CTAKESContentHandler - Class in org.apache.tika.parser.ctakes
-
Class used to extract biomedical information while parsing.
- CTAKESContentHandler(ContentHandler, Metadata, CTAKESConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
- CTAKESContentHandler() - Constructor for class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Default constructor.
- CTAKESParser - Class in org.apache.tika.parser.ctakes
-
CTAKESParser decorates a Parser and leverages CTAKESContentHandler
to extract biomedical information from clinical text using Apache cTAKES.
- CTAKESParser() - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser
- CTAKESParser(TikaConfig) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the default Parser for this Config
- CTAKESParser(Parser) - Constructor for class org.apache.tika.parser.ctakes.CTAKESParser
-
Wraps the specified Parser
- CTAKESSerializer - Enum in org.apache.tika.parser.ctakes
-
Enumeration for types of cTAKES (UIMA) CAS serializer supported by cTAKES.
- CTAKESUtils - Class in org.apache.tika.parser.ctakes
-
This class provides methods to extract biomedical information from plain text
using CTAKESContentHandler that relies on Apache cTAKES.
- CTAKESUtils() - Constructor for class org.apache.tika.parser.ctakes.CTAKESUtils
-
- CUSTOM_MIMES_SYS_PROP - Static variable in class org.apache.tika.mime.MimeTypesFactory
-
System property to set a path to an additional external custom mimetypes
XML file to be loaded.
- customCompositeDetector() - Static method in class org.apache.tika.example.CustomMimeInfo
-
- CustomMimeInfo - Class in org.apache.tika.example
-
- CustomMimeInfo() - Constructor for class org.apache.tika.example.CustomMimeInfo
-
- customMimeInfo() - Static method in class org.apache.tika.example.CustomMimeInfo
-
- data - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
-
- Database - Interface in org.apache.tika.metadata
-
- databaseExists(Path) - Static method in class org.apache.tika.eval.db.H2Util
-
- DataURIScheme - Class in org.apache.tika.parser.utils
-
- DataURISchemeParseException - Exception in org.apache.tika.parser.utils
-
- DataURISchemeParseException(String) - Constructor for exception org.apache.tika.parser.utils.DataURISchemeParseException
-
- DataURISchemeUtil - Class in org.apache.tika.parser.utils
-
Not thread safe.
- DataURISchemeUtil() - Constructor for class org.apache.tika.parser.utils.DataURISchemeUtil
-
- DATE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A date associated with an event in the life cycle of the resource.
- DATE - Static variable in class org.apache.tika.metadata.Metadata
-
- DATE - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- DATE_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
Designates the date and optionally the time the intellectual content was
created rather than the date of the creation of the physical
representation.
- DATE_CREATED - Static variable in interface org.apache.tika.metadata.Photoshop
-
- DATE_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- DateUtils - Class in org.apache.tika.utils
-
Date related utility methods and constants
- DateUtils() - Constructor for class org.apache.tika.utils.DateUtils
-
- DBBuffer - Class in org.apache.tika.eval.db
-
- DBBuffer(Connection, String, String, String) - Constructor for class org.apache.tika.eval.db.DBBuffer
-
- DBConsumersManager - Class in org.apache.tika.eval.batch
-
- DBConsumersManager(JDBCUtil, MimeBuffer, List<FileResourceConsumer>) - Constructor for class org.apache.tika.eval.batch.DBConsumersManager
-
- DBFParser - Class in org.apache.tika.parser.dbf
-
This is a Tika wrapper around the DBFReader.
- DBFParser() - Constructor for class org.apache.tika.parser.dbf.DBFParser
-
- DBWriter - Class in org.apache.tika.eval.io
-
This is still in its early stages.
- DBWriter(Connection, List<TableInfo>, JDBCUtil, MimeBuffer) - Constructor for class org.apache.tika.eval.io.DBWriter
-
- DcXMLParser - Class in org.apache.tika.parser.xml
-
Dublin Core metadata parser
- DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
-
- decode(String) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode a hex string
- decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode an array of hex chars
- decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
-
Decode an array of hex chars.
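As a rough illustration of what decoding a range of hex chars involves, the sketch below pairs up characters and combines each pair into one byte. It is not the HexCoDec source: the class name HexDecodeSketch is hypothetical, and reading the two int parameters as a start offset and a char count is an assumption, since the index does not document them.

```java
// Hypothetical illustration; not org.apache.tika.mime.HexCoDec.
class HexDecodeSketch {

    // Decodes `length` hex chars starting at `start` into length / 2 bytes.
    // The (start, length) interpretation of the parameters is an assumption.
    static byte[] decode(char[] hex, int start, int length) {
        if ((length & 1) != 0) {
            throw new IllegalArgumentException("Length must be even");
        }
        byte[] out = new byte[length / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex[start + 2 * i], 16);
            int lo = Character.digit(hex[start + 2 * i + 1], 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex digit");
            }
            // high nibble first, then low nibble
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] bytes = decode("cafe".toCharArray(), 0, 4);
        System.out.println(bytes.length + " bytes decoded");
    }
}
```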
- decompressConcatenated(Metadata) - Method in interface org.apache.tika.parser.pkg.CompressorParserOptions
-
- DEF_MODEL - Static variable in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- DEFAULT - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
- DEFAULT - Static variable in class org.apache.tika.config.ParamField
-
- DEFAULT_CHARSET - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- DEFAULT_CHILD_STARTUP_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
-
Number of milliseconds to wait for child process to startup
- DEFAULT_HOST - Static variable in class org.apache.tika.server.TikaServerCli
-
- DEFAULT_ID - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
-
- DEFAULT_MAX_ENTITY_EXPANSIONS - Static variable in class org.apache.tika.utils.XMLReaderUtils
-
- DEFAULT_MAX_QUEUE_SIZE - Static variable in class org.apache.tika.batch.builders.BatchProcessBuilder
-
- DEFAULT_MODEL_PATH - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
default Model path
- DEFAULT_MODELS - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- DEFAULT_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- DEFAULT_NGRAM_LENGTH - Static variable in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- DEFAULT_PING_PULSE_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
-
How often should the parent try to ping the child to check status
- DEFAULT_PING_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
-
If the child doesn't receive a ping or the parent doesn't
hear back from a ping in this amount of time, kill and restart the child.
- DEFAULT_POOL_SIZE - Static variable in class org.apache.tika.utils.XMLReaderUtils
-
Default size for the pool of SAX Parsers
and the pool of DOM builders
- DEFAULT_PORT - Static variable in class org.apache.tika.server.TikaServerCli
-
- DEFAULT_SECRET - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
-
- DEFAULT_TASK_TIMEOUT_MILLIS - Static variable in class org.apache.tika.server.ServerTimeouts
-
Number of milliseconds to wait per server task (parse, detect, unpack, translate,
etc.) before timing out and shutting down the child process.
- DefaultContentHandlerFactoryBuilder - Class in org.apache.tika.batch.builders
-
Builds BasicContentHandler with type defined by attribute "basicHandlerType"
with possible values: xml, html, text, body, ignore.
- DefaultContentHandlerFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.DefaultContentHandlerFactoryBuilder
-
- DefaultDetector - Class in org.apache.tika.detect
-
- DefaultDetector(MimeTypes, ServiceLoader, Collection<Class<? extends Detector>>) - Constructor for class org.apache.tika.detect.DefaultDetector
-
- DefaultDetector(MimeTypes, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
-
- DefaultDetector(MimeTypes, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
-
- DefaultDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultDetector
-
- DefaultDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultDetector
-
- DefaultDetector() - Constructor for class org.apache.tika.detect.DefaultDetector
-
- DefaultEncodingDetector - Class in org.apache.tika.detect
-
- DefaultEncodingDetector() - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
-
- DefaultEncodingDetector(ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
-
- DefaultEncodingDetector(ServiceLoader, Collection<Class<? extends EncodingDetector>>) - Constructor for class org.apache.tika.detect.DefaultEncodingDetector
-
- DefaultHtmlMapper - Class in org.apache.tika.parser.html
-
The default HTML mapping rules in Tika.
- DefaultHtmlMapper() - Constructor for class org.apache.tika.parser.html.DefaultHtmlMapper
-
- DefaultInputStreamFactory - Class in org.apache.tika.server
-
Passthrough -- returns InputStream as is
- DefaultInputStreamFactory() - Constructor for class org.apache.tika.server.DefaultInputStreamFactory
-
- DefaultParser - Class in org.apache.tika.parser
-
- DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser(MediaTypeRegistry, ServiceLoader, Collection<Class<? extends Parser>>) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser(MediaTypeRegistry, ServiceLoader, EncodingDetector) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser(MediaTypeRegistry, ServiceLoader) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser(MediaTypeRegistry, ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser(ClassLoader) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser(MediaTypeRegistry) - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultParser() - Constructor for class org.apache.tika.parser.DefaultParser
-
- DefaultProbDetector - Class in org.apache.tika.detect
-
A version of DefaultDetector for probabilistic mime
detectors, which use statistical techniques to blend the
results of differing underlying detectors when attempting
to detect the type of a given file.
- DefaultProbDetector(ProbabilisticMimeDetectionSelector, ServiceLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
-
- DefaultProbDetector(ProbabilisticMimeDetectionSelector, ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
-
- DefaultProbDetector(ClassLoader) - Constructor for class org.apache.tika.detect.DefaultProbDetector
-
- DefaultProbDetector(MimeTypes) - Constructor for class org.apache.tika.detect.DefaultProbDetector
-
- DefaultProbDetector() - Constructor for class org.apache.tika.detect.DefaultProbDetector
-
- DefaultTranslator - Class in org.apache.tika.language.translate
-
- DefaultTranslator(ServiceLoader) - Constructor for class org.apache.tika.language.translate.DefaultTranslator
-
- DefaultTranslator() - Constructor for class org.apache.tika.language.translate.DefaultTranslator
-
- DelegatingParser - Class in org.apache.tika.parser
-
Base class for parser implementations that want to delegate parts of the
task of parsing an input document to another parser.
- DelegatingParser() - Constructor for class org.apache.tika.parser.DelegatingParser
-
- deleteNamespace(String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Deletes a namespace from the registry.
- DELIMITER_PROPERTY - Static variable in class org.apache.tika.parser.csv.TextAndCSVParser
-
- DERIVED_FROM_DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Document id for the document that this document
was derived from
- DERIVED_FROM_INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
Instance id for the document instance that this
document was derived from
- descend(String, String) - Method in class org.apache.tika.sax.xpath.ChildMatcher
-
- descend(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
-
- descend(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns the XPath evaluation state that results from descending
to a child element with the given name.
- descend(String, String) - Method in class org.apache.tika.sax.xpath.NamedElementMatcher
-
- descend(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
-
- describeMediaType() - Static method in class org.apache.tika.example.MediaTypeExample
-
- DescribeMetadata - Class in org.apache.tika.example
-
Print the supported Tika Metadata models and their fields.
- DescribeMetadata() - Constructor for class org.apache.tika.example.DescribeMetadata
-
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
-
An account of the content of the resource.
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.IPTC
-
A textual description, including captions, of the item's content,
particularly used where the object is not text.
- DESCRIPTION - Static variable in class org.apache.tika.metadata.Metadata
-
- DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- DESCRIPTION_WRITER - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifier or the name of the person involved in writing, editing or
correcting the description of the content.
- deserialize(JsonElement, Type, JsonDeserializationContext) - Method in class org.apache.tika.metadata.serialization.JsonMetadataDeserializer
-
Deserializes a json object (equivalent to: Map)
into a Metadata object.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeEncodingDetector
-
- detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.Detector
-
Detects the content type of the given input document.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.EmptyDetector
-
- detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.EncodingDetector
-
Detects the character encoding of the given text document, or
null if the encoding of the document cannot be detected.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.MagicDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NameDetector
-
Detects the content type of an input document based on the document
name given in the input metadata.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.OverrideDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TextDetector
-
Looks at the beginning of the document input stream to determine
whether the document is text or not.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TrainedModelDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TypeDetector
-
Detects the content type of an input document based on a type hint
given in the input metadata.
- detect(InputStream, Metadata) - Method in class org.apache.tika.detect.ZeroSizeFileDetector
-
- detect(String) - Static method in class org.apache.tika.eval.util.LanguageIDWrapper
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.example.EncryptedPrescriptionDetector
-
- detect() - Method in class org.apache.tika.language.detect.LanguageDetector
-
- detect(CharSequence) - Method in class org.apache.tika.language.detect.LanguageDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.mime.MimeTypes
-
Automatically detects the MIME type of a document based on magic
markers in the stream prefix and any given metadata hints.
- detect(InputStream, Metadata) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
- detect(ZipFile) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- detect(Set<String>) - Static method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- detect(Set<String>, DirectoryEntry) - Static method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Internal detection of the specific kind of OLE2 document, based on the
names of the top-level streams within the file.
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
- detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Return the charset that best matches the supplied input data.
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
- detect(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.DetectorResource
-
- detect(InputStream) - Method in class org.apache.tika.server.resource.LanguageResource
-
- detect(String) - Method in class org.apache.tika.server.resource.LanguageResource
-
- detect(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(InputStream, String) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(InputStream) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(byte[], String) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(byte[]) - Method in class org.apache.tika.Tika
-
Detects the media type of the given document.
- detect(Path) - Method in class org.apache.tika.Tika
-
Detects the media type of the file at the given path.
- detect(File) - Method in class org.apache.tika.Tika
-
Detects the media type of the given file.
- detect(URL) - Method in class org.apache.tika.Tika
-
Detects the media type of the resource at the given URL.
- detect(String) - Method in class org.apache.tika.Tika
-
Detects the media type of a document with the given file name.
- detectAll() - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- detectAll() - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
-
Detect languages based on previously submitted text (via addText calls).
- detectAll() - Method in class org.apache.tika.langdetect.TextLangDetector
-
- detectAll() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Detect languages based on previously submitted text (via addText calls).
- detectAll(String) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Utility wrapper that detects the language of a given chunk of text.
- detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Return an array of all charsets that appear to be plausible
matches with the input data.
- detectFilename(MultivaluedMap<String, String>) - Static method in class org.apache.tika.server.resource.TikaResource
-
- detectIfPossible(ZipEntry) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- detectLanguage(String) - Method in class org.apache.tika.example.LanguageDetectorExample
-
- detectLanguage(String) - Method in class org.apache.tika.language.translate.AbstractTranslator
-
- detectOfficeOpenXML(OPCPackage) - Static method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
Detects the type of an OfficeOpenXML (OOXML) file from
an opened Package
- Detector - Interface in org.apache.tika.detect
-
Content type detector.
- DetectorResource - Class in org.apache.tika.server.resource
-
- DetectorResource(ServerStatus) - Constructor for class org.apache.tika.server.resource.DetectorResource
-
- detectType(ZipArchiveEntry, ZipFile) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- detectType(ZipArchiveEntry, ZipArchiveInputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- detectType(InputStream) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- detectType(POIFSFileSystem) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- detectType(DirectoryEntry) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- detectWithCustomConfig(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
-
- detectWithCustomDetector(String) - Static method in class org.apache.tika.example.AdvancedTypeDetector
-
- DIFContentHandler - Class in org.apache.tika.parser.dif
-
- DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.parser.dif.DIFContentHandler
-
- DIFContentHandler - Class in org.apache.tika.sax
-
- DIFContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.DIFContentHandler
-
- DIFParser - Class in org.apache.tika.parser.dif
-
- DIFParser() - Constructor for class org.apache.tika.parser.dif.DIFParser
-
- digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.CompositeDigester
-
- digest(InputStream, Metadata, ParseContext) - Method in class org.apache.tika.parser.digest.InputStreamDigester
-
- digest(InputStream, Metadata, ParseContext) - Method in interface org.apache.tika.parser.DigestingParser.Digester
-
Digests an InputStream and sets the appropriate value(s) in the metadata.
- DigestingAutoDetectParserFactory - Class in org.apache.tika.batch
-
- DigestingAutoDetectParserFactory() - Constructor for class org.apache.tika.batch.DigestingAutoDetectParserFactory
-
- DigestingParser - Class in org.apache.tika.parser
-
- DigestingParser(Parser, DigestingParser.Digester) - Constructor for class org.apache.tika.parser.DigestingParser
-
Creates a decorator for the given parser.
- DigestingParser.Digester - Interface in org.apache.tika.parser
-
Interface for digester.
- DigestingParser.Encoder - Interface in org.apache.tika.parser
-
Encodes byte array from a MessageDigest to String
- DIGITAL_IMAGE_GUID - Static variable in interface org.apache.tika.metadata.IPTC
-
Globally unique identifier for the item.
- DIGITAL_SOURCE_FILE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- DIGITAL_SOURCE_TYPE - Static variable in interface org.apache.tika.metadata.IPTC
-
The type of the source of this digital image
- DirectFileReadDataSource - Class in org.apache.tika.parser.mp4
-
A DataSource implementation that relies on direct reads from a RandomAccessFile.
- DirectFileReadDataSource(File) - Constructor for class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- DirectoryListingEntry - Class in org.apache.tika.parser.chm.accessor
-
The format of a directory listing entry is as follows: BYTE: length of name;
BYTEs: name (UTF-8 encoded); ENCINT: content section; ENCINT: offset; ENCINT:
length. The offset is from the beginning of the content section the file is
in, after the section has been decompressed (if appropriate).
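The entry layout described above can be sketched as a small JDK-only parser. The class `DirectoryEntrySketch` and its fields are hypothetical names, not Tika's. The listing does not define ENCINT; the sketch assumes it is a variable-length integer with seven data bits per byte, most significant group first, where a set high bit means another byte follows.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the layout: BYTE name length, name bytes,
// then three ENCINT values (content section, offset, length).
final class DirectoryEntrySketch {
    final String name;
    final long section;
    final long offset;
    final long length;

    private DirectoryEntrySketch(String name, long section, long offset, long length) {
        this.name = name;
        this.section = section;
        this.offset = offset;
        this.length = length;
    }

    // Assumed ENCINT encoding: 7 data bits per byte, high bit = continuation.
    // pos[0] is a cursor that is advanced past the bytes consumed.
    private static long readEncInt(byte[] data, int[] pos) {
        long value = 0;
        int b;
        do {
            b = data[pos[0]++] & 0xFF;
            value = (value << 7) | (b & 0x7F);
        } while ((b & 0x80) != 0);
        return value;
    }

    // Parses one entry starting at index 'start'.
    static DirectoryEntrySketch parse(byte[] data, int start) {
        int[] pos = {start};
        int nameLength = data[pos[0]++] & 0xFF;
        String name = new String(data, pos[0], nameLength, StandardCharsets.UTF_8);
        pos[0] += nameLength;
        long section = readEncInt(data, pos);
        long offset = readEncInt(data, pos);
        long length = readEncInt(data, pos);
        return new DirectoryEntrySketch(name, section, offset, length);
    }
}
```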
- DirectoryListingEntry() - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- DirectoryListingEntry(int, String, ChmCommons.EntryType, int, int) - Constructor for class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Constructs directoryListingEntry
- DirListParser - Class in org.apache.tika.example
-
Parses the output of /bin/ls and counts the number of files and the number of
executables using Tika.
- DirListParser() - Constructor for class org.apache.tika.example.DirListParser
-
- DISC_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The disc number for part of an album set."
- DisplayMetInstance - Class in org.apache.tika.example
-
Grabs a PDF file from a URL and prints its Metadata.
- DisplayMetInstance() - Constructor for class org.apache.tika.example.DisplayMetInstance
-
- dispose() - Method in class org.apache.tika.io.TemporaryResources
-
- distance(LanguageProfile) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
Calculates the geometric distance between this and the given
other language profile.
- DL4JInceptionV3Net - Class in org.apache.tika.dl.imagerec
-
- DL4JInceptionV3Net() - Constructor for class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
- DL4JVGG16Net - Class in org.apache.tika.dl.imagerec
-
- DL4JVGG16Net() - Constructor for class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- DOC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Word
- DOC_INFO_CREATED - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_CREATOR - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_CREATOR_TOOL - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_KEY_WORDS - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_MODIFICATION_DATE - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_PRODUCER - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_SUBJECT - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_TITLE - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_INFO_TRAPPED - Static variable in interface org.apache.tika.metadata.PDF
-
- DOC_SECURITY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- doClose() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- document(int, StoredFieldVisitor) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- DOCUMENTID - Static variable in interface org.apache.tika.metadata.XMPMM
-
The common identifier for all versions and renditions of a resource.
- DocumentSelector - Interface in org.apache.tika.extractor
-
Interface for different document selection strategies for purposes like
embedded document extraction by a ContainerExtractor instance.
- doubleByte - Variable in class org.apache.tika.parser.mp3.ID3v2Frame.TextEncoding
-
- DRAW_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- drawingHyperlinks - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.db.H2Util
-
- dropTableIfExists(Connection, String) - Method in class org.apache.tika.eval.db.JDBCUtil
-
- DublinCore - Interface in org.apache.tika.metadata
-
A collection of Dublin Core metadata names.
- DumpTikaConfigExample - Class in org.apache.tika.example
-
This class shows how to dump a TikaConfig object to a configuration file.
- DumpTikaConfigExample() - Constructor for class org.apache.tika.example.DumpTikaConfigExample
-
- DURATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The duration of the media file."
- DurationFormatUtils - Class in org.apache.tika.util
-
Functionality and naming conventions (roughly) copied from org.apache.commons.lang3
so that we didn't have to add another dependency.
- DurationFormatUtils() - Constructor for class org.apache.tika.util.DurationFormatUtils
-
- DWGParser - Class in org.apache.tika.parser.dwg
-
DWG (CAD Drawing) parser.
- DWGParser() - Constructor for class org.apache.tika.parser.dwg.DWGParser
-
- GDALParser - Class in org.apache.tika.parser.gdal
-
- GDALParser() - Constructor for class org.apache.tika.parser.gdal.GDALParser
-
- GENERAL_EMBEDDED - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
General embedded document type within an OLE2 container
- generateFooter(StringBuffer) - Method in class org.apache.tika.server.HTMLHelper
-
- generateHeader(StringBuffer, String) - Method in class org.apache.tika.server.HTMLHelper
-
Generates the HTML Header for the user facing page, adding
in the given title as required
- generateRSS(Path) - Method in class org.apache.tika.example.RecentFiles
-
- GenericConverter - Class in org.apache.tika.xmp.convert
-
Tries to convert as many of the properties in the Metadata
map as possible to XMP namespaces.
- GenericConverter() - Constructor for class org.apache.tika.xmp.convert.GenericConverter
-
- GENRE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the genre."
- GENRES - Static variable in interface org.apache.tika.parser.mp3.ID3Tags
-
List of predefined genres.
- GeoGazetteerClient - Class in org.apache.tika.parser.geo.topic.gazetteer
-
- GeoGazetteerClient(String) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Pass URL on which lucene-geo-gazetteer is available - eg.
- GeoGazetteerClient(GeoParserConfig) - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
- Geographic - Interface in org.apache.tika.metadata
-
Geographic schema.
- GeographicInformationParser - Class in org.apache.tika.parser.geoinfo
-
- GeographicInformationParser() - Constructor for class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- geoInfoType - Static variable in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- GeoParser - Class in org.apache.tika.parser.geo.topic
-
- GeoParser() - Constructor for class org.apache.tika.parser.geo.topic.GeoParser
-
- GeoParserConfig - Class in org.apache.tika.parser.geo.topic
-
- GeoParserConfig() - Constructor for class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- GeoTag - Class in org.apache.tika.parser.geo.topic
-
- GeoTag() - Constructor for class org.apache.tika.parser.geo.topic.GeoTag
-
- get(InputStream) - Static method in class org.apache.tika.io.TaggedInputStream
-
Casts or wraps the given stream to a TaggedInputStream instance.
- get(InputStream, TemporaryResources) - Static method in class org.apache.tika.io.TikaInputStream
-
Casts or wraps the given stream to a TikaInputStream instance.
- get(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
-
Casts or wraps the given stream to a TikaInputStream instance.
- get(byte[]) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given array of bytes.
- get(byte[], Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given array of bytes.
- get(Path) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the file at the given path.
- get(Path, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the file at the given path.
- get(File) - Static method in class org.apache.tika.io.TikaInputStream
-
- get(File, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
- get(Blob) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given database BLOB.
- get(Blob, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the given database BLOB.
- get(URI) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URI.
- get(URI, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URI.
- get(URL) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URL.
- get(URL, Metadata) - Static method in class org.apache.tika.io.TikaInputStream
-
Creates a TikaInputStream from the resource at the given URL.
- get(String) - Method in class org.apache.tika.metadata.Metadata
-
Get the value associated to a metadata name.
- get(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value (if any) of the identified metadata property.
- get(String) - Static method in class org.apache.tika.metadata.Property
-
Retrieve the property object that corresponds to the given key
- get(Class<T>) - Method in class org.apache.tika.parser.ParseContext
-
Returns the object in this context that implements the given interface.
- get(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
-
Returns the object in this context that implements the given interface,
or the given default value if such an object is not found.
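The two lookups described above (by interface, with and without a default) follow a common type-keyed context pattern. The following is a minimal sketch of that pattern using only the JDK; `TypedContextSketch` is a hypothetical class, and keying the map by class name is an assumption made for illustration rather than a statement about how ParseContext stores its entries.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of a context keyed by interface type.
final class TypedContextSketch {
    private final Map<String, Object> entries = new HashMap<>();

    // Registers the object to use for the given interface.
    <T> void set(Class<T> key, T value) {
        entries.put(key.getName(), value);
    }

    // Returns the registered object, or null if none was set.
    <T> T get(Class<T> key) {
        return key.cast(entries.get(key.getName()));
    }

    // Returns the registered object, or the given default if none was set.
    <T> T get(Class<T> key, T defaultValue) {
        T value = get(key);
        return value != null ? value : defaultValue;
    }
}
```

The default-value overload lets a caller avoid a null check when a sensible fallback exists.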
- get() - Method in enum org.apache.tika.parser.strings.StringsEncoding
-
- get(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Returns the value of a simple property or the first one of an array.
- get(Property) - Method in class org.apache.tika.xmp.XMPMetadata
-
- get7BitsInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
AKA a Synchsafe integer.
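A synchsafe integer, as used in ID3v2 headers, stores seven bits of data in each byte and keeps the high bit clear. The sketch below decodes one under the assumption that four bytes are read starting at the given offset; `SynchsafeSketch` is a hypothetical class and may not match the behavior of get7BitsInt exactly.

```java
// Hypothetical illustration; assumes a four-byte synchsafe value.
final class SynchsafeSketch {

    // Combines the low 7 bits of four consecutive bytes, most significant first.
    static int decode(byte[] data, int offset) {
        return ((data[offset] & 0x7F) << 21)
                | ((data[offset + 1] & 0x7F) << 14)
                | ((data[offset + 2] & 0x7F) << 7)
                | (data[offset + 3] & 0x7F);
    }
}
```

With four bytes the largest representable value is 2^28 - 1.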
- getAccessChecker() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getAcronym() - Method in class org.apache.tika.mime.MimeType
-
Returns an acronym for this mime type.
- getAdded() - Method in class org.apache.tika.batch.FileResourceCrawler
-
- getAdded() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Every Converter has to provide information about namespaces that are used in addition to the
core set of XMP namespaces.
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.GenericConverter
-
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
-
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
-
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.OpenDocumentConverter
-
- getAdditionalNamespaces() - Method in class org.apache.tika.xmp.convert.RTFConverter
-
- getAdmin1Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getAdmin2Code() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getAeDescriptorPath() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the path to XML descriptor for AnalysisEngine.
- getAgePredictorClient() - Method in class org.apache.tika.parser.recognition.AgeRecogniser
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getAlbum() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getAlbum() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getAlbumArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the overall album / compilation of albums
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have album-wide artists,
so returns null.
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getAlbumArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAliases(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of known aliases of the given canonical media type.
- getAlignedLenTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getAlignedTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getAllComponentParsers() - Method in class org.apache.tika.parser.CompositeParser
-
Returns all parsers registered with the Composite Parser,
including ones which may not currently be active.
- getAllComponentParsers() - Method in class org.apache.tika.parser.DefaultParser
-
- getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
-
Get the names of all charsets supported by the CharsetDetector class.
- getAllNameEntitiesfromInput(InputStream) - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
-
- getAllTagHandlers(InputStream, ContentHandler) - Static method in class org.apache.tika.parser.mp3.Mp3Parser
-
Scans the MP3 frames for ID3 tags, and creates ID3Tag Handlers
for each supported set of tags.
- getAlphabeticTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
-
- getAnalysisEngine(String, String, String) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new UIMA Analysis Engine (AE).
- getAnnotationProperty(IdentifiedAnnotation, CTAKESAnnotationProperty) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns the annotation value based on the given annotation type.
- getAnnotationProps() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- getAnnotationPropsAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of CTAKESAnnotationProperty
names that will be included in cTAKES metadata.
- getApiKey() - Method in class org.apache.tika.language.translate.YandexTranslator
-
Get the API Key in use for client authentication
- getApiUri(Metadata) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- getApiUri(Metadata) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
-
- getApplyRotation() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getArray() - Method in class org.apache.tika.eval.tokens.TokenCountPriorityQueue
-
- getArtist() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getArtist() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The Artist for the track
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getArtist() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getAttributesMapping() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
- getAttrValue(String, Attributes) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
- getAverageCharTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getBaseType() - Method in class org.apache.tika.mime.MediaType
-
Returns the base form of the MediaType, excluding
any parameters, such as "text/plain" for
"text/plain; charset=utf-8"
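The base-type idea above amounts to dropping everything from the first parameter separator onward. A minimal string-based sketch follows; `BaseTypeSketch` is a hypothetical helper working on plain strings, whereas the listed method operates on MediaType objects, and the lowercasing step is an assumption.

```java
import java.util.Locale;

// Hypothetical illustration on plain strings; not the MediaType implementation.
final class BaseTypeSketch {

    // Strips parameters such as "; charset=utf-8" and normalizes case.
    static String baseType(String mediaType) {
        int semicolon = mediaType.indexOf(';');
        String base = semicolon < 0 ? mediaType : mediaType.substring(0, semicolon);
        return base.trim().toLowerCase(Locale.ROOT);
    }
}
```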
- getBestNameEntity() - Method in class org.apache.tika.parser.geo.topic.NameEntityExtractor
-
- getBigInteger(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getBinaryDocValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getBitRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the bit rate in bits per second.
- getBitsPerPixel() - Method in class org.apache.tika.parser.image.ICNSType
-
- getBlock_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns block's length
- getBlockAddress() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Returns block addresses
- getBlockCount() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a block count
- getBlockidx_intvl() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns block index interval
- getBlockLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a block length
- getBlockLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBlockNext() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getBlockNumber() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getBlockPrev() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getBlockRemaining() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBlockType() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getBoolean(String, Boolean) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getByte() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getByteCount() - Method in class org.apache.tika.io.CountingInputStream
-
The number of bytes that have passed through this stream.
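Counting the bytes that pass through a stream is typically done with a filtering wrapper. The sketch below shows that pattern with the JDK's FilterInputStream; `CountingStreamSketch` and its `drain` helper are hypothetical and are not the Tika class. Bytes skipped via skip() are not counted in this sketch.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical illustration of a byte-counting stream wrapper.
final class CountingStreamSketch extends FilterInputStream {
    private long count;

    CountingStreamSketch(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = super.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    // The number of bytes read through this wrapper so far.
    long getByteCount() {
        return count;
    }

    // Convenience helper: reads the stream to its end and returns the count.
    static long drain(InputStream in) {
        try (CountingStreamSketch counting = new CountingStreamSketch(in)) {
            byte[] buffer = new byte[4];
            while (counting.read(buffer) != -1) {
                // Discard the data; only the count matters here.
            }
            return counting.getByteCount();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```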
- getCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getCause() - Method in exception org.apache.tika.io.TaggedIOException
-
Returns the wrapped exception.
- getCause() - Method in exception org.apache.tika.sax.TaggedSAXException
-
Returns the wrapped exception.
- getCauseForTermination() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- getCenter() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getChannels() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the number of channels (1=mono, 2=stereo)
- getCharset() - Method in class org.apache.tika.detect.AutoDetectReader
-
- getCharset() - Method in class org.apache.tika.detect.NonDetectingEncodingDetector
-
- getCharset() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getChildTypes(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of known children of the given canonical media type
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Deprecated.
- getChmBlockInfoInstance(DirectoryListingEntry, int, ChmLzxcControlData, ChmBlockInfo) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
- getChmBlockSegment(byte[], ChmLzxcResetTable, int, int, int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmExtractor
-
- getChmDirList() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmItsfHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmItspHeader() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmLzxcControlData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChmLzxcResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getChoices() - Method in class org.apache.tika.metadata.Property
-
Returns the (immutable) set of choices for the values of this property.
- getClassName() - Method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
- getColInfos() - Method in class org.apache.tika.eval.db.TableInfo
-
- getColorspace() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getCommand() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the command to be run.
- getCommand() - Method in class org.apache.tika.parser.external.ExternalParser
-
- getCommand() - Method in class org.apache.tika.parser.gdal.GDALParser
-
- getCommandAppendOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the operator to append rather than replace a value for the command
line tool, i.e.
- getCommandAssignmentDelimeter() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the delimiter for multiple assignments for the command line tool,
i.e.
- getCommandAssignmentOperator() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the assignment operator for the command line tool, i.e.
- getCommandMetadataSegments(Metadata) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Constructs a collection of command line arguments responsible for setting
individual metadata fields based on the given metadata.
- getComment(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Builds up the ID3 comment, by parsing and extracting
the comment string parts from the given data.
- getComments() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getComments() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Retrieves the comments, if any.
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getComments() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getCommonTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
-
- getCommonTokensAnalyzer() - Method in class org.apache.tika.eval.tokens.AnalyzerManager
-
This analyzer should be used to generate common tokens lists from
large corpora.
- getCompilation() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getCompilation() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have compilations,
so returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
ID3v22 doesn't have compilations,
so returns null.
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getCompilation() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getComposer() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have composers,
so returns null.
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getComposer() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getCompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets compressed length
- getConcatenatePhoneticRuns() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getConfidence() - Method in class org.apache.tika.language.detect.LanguageResult
-
- getConfidence() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getConfidence() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get an indication of the confidence in the charset detected.
- getConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- getConfig() - Static method in class org.apache.tika.server.resource.TikaResource
-
- getConnection() - Method in class org.apache.tika.eval.db.JDBCUtil
-
Override this for any optimizations you want to do on the db
before writing/reading.
- getConnectionString() - Method in class org.apache.tika.eval.db.H2Util
-
- getConnectionString() - Method in class org.apache.tika.eval.db.JDBCUtil
-
- getConsidered() - Method in class org.apache.tika.batch.FileResourceCrawler
-
- getConsidered() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
Returns the number of file resources considered.
- getConstraints() - Method in class org.apache.tika.eval.db.ColInfo
-
- getConsumed() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- getConsumers() - Method in class org.apache.tika.batch.ConsumersManager
-
Get the consumers
- getConsumersManagerMaxMillis() - Method in class org.apache.tika.batch.ConsumersManager
-
BatchProcess will throw an exception
if the ConsumersManager doesn't complete init() or shutdown()
within this amount of time.
- getContent(EvalFilePaths, Metadata) - Static method in class org.apache.tika.eval.AbstractProfiler
-
- getContent() - Method in class org.apache.tika.eval.util.ContentTags
-
- getContent() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent(int, int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContent(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.DcXMLParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
-
- getContentHandler(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- getContentHandlerFactory() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- getContentLanguage() - Method in class org.apache.tika.example.ImportContextImpl
-
- getContentLength() - Method in class org.apache.tika.example.ImportContextImpl
-
- getContentLength() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
-
- getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getControlDataIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns the control data index that is located in List
- getConverter(String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
Retrieve a specific converter according to the mimetype
- getCoreCacheHelper() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getCoreProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getCount() - Method in class org.apache.tika.io.CountingInputStream
-
The number of bytes that have passed through this stream.
- getCount() - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- getCount(String) - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- getCountryCode() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getCurrentFile() - Method in class org.apache.tika.batch.FileResourceConsumer
-
Returns the name and start time of a file that is currently being processed.
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getCustomProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getData() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getData() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns data offset
- getDataOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns data offset
- getDate(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value of the identified Date based metadata property.
- getDate(Property) - Method in class org.apache.tika.xmp.XMPMetadata
-
- getDBWriter(List<TableInfo>) - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
-
- getDecorationName() - Method in class org.apache.tika.parser.ctakes.CTAKESParser
-
- getDecorationName() - Method in class org.apache.tika.parser.ParserDecorator
-
- getDectorsHTML() - Method in class org.apache.tika.server.resource.TikaDetectors
-
- getDefaultConfig() - Static method in class org.apache.tika.config.TikaConfig
-
Provides a default configuration (TikaConfig).
- getDefaultConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- getDefaultDetector(MimeTypes, ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
-
- getDefaultEncodingDetector(ServiceLoader) - Static method in class org.apache.tika.config.TikaConfig
-
- getDefaultLanguageDetector() - Static method in class org.apache.tika.language.detect.LanguageDetector
-
- getDefaultMimeTypes() - Static method in class org.apache.tika.mime.MimeTypes
-
Get the default MimeTypes.
- getDefaultMimeTypes(ClassLoader) - Static method in class org.apache.tika.mime.MimeTypes
-
Get the default MimeTypes.
- getDefaultNumConsumers() - Static method in class org.apache.tika.batch.builders.AbstractConsumersBuilder
-
- getDefaultRegistry() - Static method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the built-in media type registry included in Tika.
- getDelegateParser(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
Returns the parser instance to which parsing tasks should be delegated.
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getDelimiter() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getDensity() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getDepth() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getDescription() - Method in class org.apache.tika.mime.MimeType
-
Returns the description of this media type.
- getDescription() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the description, if present
- getDetectableCharsets() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
- getDetectAngles() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getDetector() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured detector instance.
- getDetector() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- getDetector() - Method in class org.apache.tika.language.detect.LanguageHandler
-
Returns the language detector used by this content handler.
- getDetector() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Returns the language detector used by this writer.
- getDetector() - Method in class org.apache.tika.parser.AutoDetectParser
-
Returns the type detector used by this parser to auto-detect the type
of a document.
- getDetector(Parser) - Static method in class org.apache.tika.server.resource.TikaResource
-
- getDetector() - Method in class org.apache.tika.Tika
-
Returns the detector instance used by this facade.
- getDetectors() - Method in class org.apache.tika.detect.CompositeDetector
-
Returns the component detectors.
- getDetectors() - Method in class org.apache.tika.detect.CompositeEncodingDetector
-
- getDetectors() - Method in class org.apache.tika.detect.DefaultDetector
-
- getDetectors() - Method in class org.apache.tika.detect.DefaultProbDetector
-
- getDetectorsJSON() - Method in class org.apache.tika.server.resource.TikaDetectors
-
- getDetectorsPlain() - Method in class org.apache.tika.server.resource.TikaDetectors
-
- getDiceCoefficient() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
-
- getDir_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory uuid
- getDirectoryListingEntryList() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Returns chm directory listing entry list
- getDirLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory length
- getDirOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns directory offset
- getDisc() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getDisc() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the disc this belongs to, within the set
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
ID3v1 doesn't have disc numbers,
so returns null.
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getDisc() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getDocument() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Returns the opened document.
- getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
-
- getDocumentBuilder() - Method in class org.apache.tika.parser.ParseContext
-
Returns the DOM builder specified in this parsing context.
- getDocumentBuilder() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder specified in this parsing context.
- getDocumentBuilderFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the DOM builder factory specified in this parsing context.
- getDuration() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the duration in milliseconds.
- getEmbeddedDocumentExtractor(ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
This offers a uniform way to get an EmbeddedDocumentExtractor from a ParseContext.
- getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getEnableAutoSpace() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getEncint() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getEncoding() - Method in class org.apache.tika.example.ImportContextImpl
-
- getEncoding() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the character encoding of the strings that are to be found.
- getEncodingDetector() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured encoding detector instance
- getEncodingDetector(ParseContext) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
-
Look for an EncodingDetector in the ParseContext.
- getEncodingDetector() - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
-
- getEndBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the end block index
- getEndDocumentWasCalled() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
-
- getEndOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the end offset index
- getEntityTypes() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Gets a set of entity types whose names are recognisable by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
Gets set of entity types recognised by this recogniser
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- getEntityTypes() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- getEntityTypes() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- getEntropy() - Method in class org.apache.tika.eval.tokens.TokenStatistics
-
- getEntryType() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns ChmCommons.EntryType (COMPRESSED or UNCOMPRESSED)
- getErrors() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Returns a string of error messages related to initializing language profiles
- getExecutorService() - Method in class org.apache.tika.config.TikaConfig
-
- getExitStatus() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- getExtendedHeader() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getExtendedProperties() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getExtension(TikaInputStream, Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- getExtension() - Method in class org.apache.tika.mime.MimeType
-
Returns the preferred file extension of this type, or an empty string
if no extensions are known.
- getExtension() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- getExtensions() - Method in class org.apache.tika.mime.MimeType
-
Returns the list of all known file extensions of this media type.
- getExtractAcroFormContent() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractActions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getExtractAllAlternativesFromMSG() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getExtractAnnotationText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractBookmarksText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractInlineImages() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getExtractMacros() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getExtractMacros() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getExtractScripts() - Method in class org.apache.tika.parser.html.HtmlParser
-
- getExtractUniqueInlineImagesOnly() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getFallback() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the fallback parser.
- getField() - Method in class org.apache.tika.config.ParamField
-
- getFieldInfos() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getFile() - Method in class org.apache.tika.io.TikaInputStream
-
- getFile(String, File) - Static method in class org.apache.tika.util.PropsUtil
-
Deprecated.
- getFileChannel() - Method in class org.apache.tika.io.TikaInputStream
-
- getFileLength(Path) - Method in class org.apache.tika.eval.AbstractProfiler
-
- getFilePath() - Method in class org.apache.tika.parser.strings.FileConfig
-
Returns the "file" installation folder.
- getFileProg() - Static method in class org.apache.tika.parser.strings.StringsParser
-
- getFilesProcessed() - Method in class org.apache.tika.server.ServerStatus
-
- getFilter() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getFilteredStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Simple util to get stack trace.
- getFlags() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getFormat() - Method in class org.apache.tika.language.translate.YandexTranslator
-
Retrieve the current text format setting.
- getFormattedNumber(Paragraph) - Method in class org.apache.tika.parser.microsoft.ListManager
-
Get the formatted number for a given paragraph
- getFormattedNumber(XWPFParagraph) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
- getFormattedNumber(BigInteger, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFListManager
-
- getFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns pmgi free space
- getFreeSpace() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getGazetteerRestEndpoint() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- getGeneralAnalyzer() - Method in class org.apache.tika.eval.tokens.AnalyzerManager
-
This analyzer should be used to extract all tokens.
- getGenre() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getGenre() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getGenre() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getHadStarted() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getHeader_len() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns header length
- getHeaderLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns itsf header length
- getHeight() - Method in class org.apache.tika.parser.image.ICNSType
-
- getHTML(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getHTMLFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getId() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getIdentifier() - Method in class org.apache.tika.sax.StandardReference
-
- getIfXFAExtractOnlyXFA() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getIgnoredLineConsumer() - Method in class org.apache.tika.parser.external.ExternalParser
-
Gets lines consumer
- getIlvl() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getImageMagickPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getImportRoot() - Method in class org.apache.tika.example.ImportContextImpl
-
- getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getIncludeDeletedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeDeletedText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- getIncludeDeletedText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- getIncludeHeadersAndFooters() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMissingRows() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getIncludeMoveFromContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeMoveFromText() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- getIncludeMoveFromText() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- getIncludeShapeBasedContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeSlideMasterContent() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIncludeSlideNotes() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getIndex_depth() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns an index depth
- getIndex_head() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns an index head
- getIndex_root() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns index root
- getIndexOfContent() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIndexOfResetData() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIndexOfResetTable() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getIniBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns an initial block index
- getInitializableProblemHandler() - Method in class org.apache.tika.config.ServiceLoader
-
Returns the handler for problems with initializables
- getInputSteam(InputStream, HttpHeaders) - Method in class org.apache.tika.server.DefaultInputStreamFactory
-
- getInputSteam(InputStream, HttpHeaders) - Method in interface org.apache.tika.server.InputStreamFactory
-
- getInputSteam(InputStream, HttpHeaders) - Method in class org.apache.tika.server.URLEnabledInputStreamFactory
-
- getInputStream(FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
-
- getInputStream() - Method in class org.apache.tika.example.ImportContextImpl
-
Returns a new InputStream to the temporary file created during instantiation,
or null if this context does not provide a stream.
- getInputStream() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- getInputStream(InputStream, HttpHeaders) - Static method in class org.apache.tika.server.resource.TikaResource
-
- getInstance() - Static method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- getInt(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns the value of the identified Integer based metadata property.
- getInt(byte[]) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt(String, Integer) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getInt(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
-
Get an int value.
- getInt(Property) - Method in class org.apache.tika.xmp.XMPMetadata
-
- getInt2(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getInt3(byte[], int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from the beginning of a byte array
- getIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from a byte array
- getIntelCurrentPossition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntelFileSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntelState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from the beginning of a byte array
- getIntLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from a byte array
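
The BE and LE entries above differ only in byte order. The following is a minimal standalone sketch of that difference, assuming a 32-bit value spanning four bytes. The class `EndianSketch` and its methods `intLE` and `intBE` are hypothetical names chosen for illustration; this is not Tika's EndianUtils implementation.

```java
/** Hypothetical helper, not part of Tika: decodes 32-bit ints from byte arrays. */
public class EndianSketch {

    /** Reads four bytes starting at offset, least significant byte first. */
    public static int intLE(byte[] data, int offset) {
        return (data[offset] & 0xFF)
                | (data[offset + 1] & 0xFF) << 8
                | (data[offset + 2] & 0xFF) << 16
                | (data[offset + 3] & 0xFF) << 24;
    }

    /** Reads four bytes starting at offset, most significant byte first. */
    public static int intBE(byte[] data, int offset) {
        return (data[offset] & 0xFF) << 24
                | (data[offset + 1] & 0xFF) << 16
                | (data[offset + 2] & 0xFF) << 8
                | (data[offset + 3] & 0xFF);
    }

    public static void main(String[] args) {
        byte[] bytes = {0x01, 0x02, 0x03, 0x04};
        // The same four bytes decode to different values depending on byte order.
        System.out.println(Integer.toHexString(intLE(bytes, 0))); // prints 4030201
        System.out.println(Integer.toHexString(intBE(bytes, 0))); // prints 1020304
    }
}
```

Masking each byte with `0xFF` matters because Java bytes are signed; without it, a byte such as `(byte) 0x80` would sign-extend and corrupt the higher bits of the result.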
- getIntValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Gets the array of ints of the identified "seq" integer metadata property.
- getIOListener() - Method in class org.apache.tika.example.ImportContextImpl
-
- getJavaCommand() - Method in class org.apache.tika.fork.ForkParser
-
- getJavaCommandAsList() - Method in class org.apache.tika.fork.ForkParser
-
Returns the command used to start the forked server process.
- getJCas(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Returns a new JCas appropriate for the given Analysis Engine.
- getJDBCDriverClass() - Method in class org.apache.tika.eval.db.H2Util
-
- getJDBCDriverClass() - Method in class org.apache.tika.eval.db.JDBCUtil
-
JDBC driver class.
- getJustFileName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getKey() - Static method in class org.apache.tika.example.Pharmacy
-
- getLabel() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getLabelLang() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- getLang_id() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns language id
- getLangCode() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
-
- getLangId() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns language ID
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageHandler
-
Returns the detected language based on text handled thus far.
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageResult
-
The ISO 639-1 language code (plus optional country code)
- getLanguage() - Method in class org.apache.tika.language.detect.LanguageWriter
-
Returns the detected language based on text written thus far.
- getLanguage() - Method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Gets the identified language
- getLanguage() - Method in class org.apache.tika.language.ProfilingHandler
-
Deprecated.
Returns the language that best matches the current state of the
language profile.
- getLanguage() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated.
Returns the language that best matches the current state of the
language profile.
- getLanguage(long) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Returns textual representation of LangID
- getLanguage() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the language, if present
- getLanguage() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the ISO code for the language of the detected charset.
- getLanguageDetectors() - Static method in class org.apache.tika.language.detect.LanguageDetector
-
- getLanguageDetectors(ServiceLoader) - Static method in class org.apache.tika.language.detect.LanguageDetector
-
- getLastModified() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns last modified date of the chm file
- getLatitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getLayer() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the audio layer code.
- getLeft() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getLeft() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getLength() - Method in class org.apache.tika.detect.MagicDetector
-
- getLength() - Method in class org.apache.tika.io.TikaInputStream
-
Returns the length (in bytes) of this stream.
- getLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- getLength() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Returns the frame length in bytes.
- getLength() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getLengthTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getLengthTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getLinks() - Method in class org.apache.tika.mime.MimeType
-
Get a list of links to help document this mime type
- getLinks() - Method in class org.apache.tika.sax.LinkContentHandler
-
Returns the list of collected links.
- getLiveDocs() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getLoader() - Method in class org.apache.tika.config.ServiceLoader
-
- getLoadErrorHandler() - Method in class org.apache.tika.config.ServiceLoader
-
Returns the load error handler used by this loader.
- getLocations(List<String>) - Method in class org.apache.tika.parser.geo.topic.gazetteer.GeoGazetteerClient
-
Calls API of lucene-geo-gazetteer to search location name in gazetteer.
- getLong(String, Long) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getLong(String, Map<String, String>, Node) - Static method in class org.apache.tika.util.XMLDOMUtil
-
Get a long value.
- getLongitude() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getLongLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE long value from a byte array
- getLzxBlockLength() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getLzxBlockOffset() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getLzxBlocksCache() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
-
If language is a specific variant of a macro language (e.g.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
Return a list of the main parts of the document, used
when searching for embedded resources.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them,
and slide drawings which have the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
This returns all items that might contain embedded objects:
main document, headers, footers, comments, etc.
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSExtractorDecorator
-
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
-
In PowerPoint files, slides have things embedded in them,
and slide drawings which have the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
In Excel files, sheets have things embedded in them,
and sheet drawings which have the images
- getMainDocumentParts() - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
-
Include main body and anything else that can
have an attachment/embedded object
- getMainOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
-
- getMainTreeElements() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMainTreeLengtsTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMainTreeTable() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getMajorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getMappedTagName() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
- getMarkLimit() - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- getMarkLimit() - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
- getMaxBytesForEmbeddedObject() - Static method in class org.apache.tika.parser.rtf.RTFParser
-
Deprecated.
- getMaxChildStartupMillis() - Method in class org.apache.tika.server.ServerTimeouts
-
Maximum time in millis to allow for the child process to startup
or restart
- getMaxEntityExpansions() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
- getMaxFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getMaximumCompressionRatio() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum compression ratio.
- getMaximumDepth() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum XML element nesting level.
- getMaximumPackageEntryDepth() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the maximum package entry nesting level.
- getMaxMainMemoryBytes() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The maximum amount of memory to use when loading a pdf into a PDDocument.
- getMaxRestarts() - Method in class org.apache.tika.server.ServerTimeouts
-
- getMaxStringLength() - Method in class org.apache.tika.Tika
-
Returns the maximum length of strings returned by the
parseToString methods.
- getMaxXMPMMHistory() - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVParams
-
- getMediaType() - Method in class org.apache.tika.parser.csv.CSVResult
-
- getMediaType() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- getMediaTypeRegistry() - Method in class org.apache.tika.config.TikaConfig
-
- getMediaTypeRegistry() - Method in class org.apache.tika.mime.MimeTypes
-
- getMediaTypeRegistry() - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
-
- getMediaTypeRegistry() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the media type registry used to infer type relationships.
- getMediaTypes() - Method in class org.apache.tika.server.resource.TikaMimeTypes
-
- getMessage() - Method in class org.apache.tika.server.resource.TikaResource
-
- getMessageClass(String) - Static method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
- getMet(URL) - Static method in class org.apache.tika.example.DisplayMetInstance
-
- getMetadata() - Method in interface org.apache.tika.batch.FileResource
-
This gets the metadata available before the parsing of the file.
- getMetadata() - Method in class org.apache.tika.batch.fs.FSFileResource
-
- getMetaData() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns an array of metadata whose values will be analyzed using cTAKES.
- getMetadata() - Method in class org.apache.tika.parser.ctakes.CTAKESContentHandler
-
Returns metadata that includes cTAKES annotations.
- getMetadata() - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
- getMetadata() - Method in class org.apache.tika.server.MetadataList
-
- getMetadata(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.MetadataResource
-
- getMetadata(InputStream, HttpHeaders, UriInfo, String) - Method in class org.apache.tika.server.resource.RecursiveMetadataResource
-
Returns an InputStream that can be deserialized as a list of Metadata objects.
- getMetadataAsString() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns a string containing a comma-separated list of metadata whose values will be analyzed using cTAKES.
- getMetadataCommandArguments() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets the map of Metadata keys to command line parameters.
- getMetadataExtractionPatterns() - Method in class org.apache.tika.parser.external.ExternalParser
-
- getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getMetadataExtractor() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
POIXMLTextExtractor.getMetadataTextExtractor() is not yet supported for OOXML by POI.
- getMetadataField(InputStream, HttpHeaders, UriInfo, String) - Method in class org.apache.tika.server.resource.MetadataResource
-
Get a specific metadata field.
- getMetadataFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.MetadataResource
-
- getMetadataFromMultipart(Attachment, UriInfo, String) - Method in class org.apache.tika.server.resource.RecursiveMetadataResource
-
Returns an InputStream that can be deserialized as a list of Metadata objects.
- getMetadataList() - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
- getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
-
- getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getMimeId(String) - Method in class org.apache.tika.eval.io.DBWriter
-
- getMimeId(String) - Method in interface org.apache.tika.eval.io.IDBWriter
-
- getMimeRepository() - Method in class org.apache.tika.config.TikaConfig
-
- getMimeType() - Method in class org.apache.tika.example.ImportContextImpl
-
- getMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
-
- getMimeType(File) - Method in class org.apache.tika.mime.MimeTypes
-
- getMimeTypes() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- getMimeTypesHTML() - Method in class org.apache.tika.server.resource.TikaMimeTypes
-
- getMimeTypesJSON() - Method in class org.apache.tika.server.resource.TikaMimeTypes
-
- getMimeTypesPlain() - Method in class org.apache.tika.server.resource.TikaMimeTypes
-
- getMinFileSizeToOcr() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getMinLength() - Method in class org.apache.tika.detect.TrainedModelDetector
-
- getMinLength() - Method in class org.apache.tika.mime.MimeTypes
-
Return the minimum length of data to provide to analyzing methods based
on the document's content in order to check all the known MimeTypes.
- getMinLength() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the minimum sequence length (characters) to print.
- getMinorVersion() - Method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- getMinSize() - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Returns the minimum size of a character sequence to be extracted.
- getModificationTime() - Method in class org.apache.tika.example.ImportContextImpl
-
- getMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- getName() - Method in class org.apache.tika.config.Param
-
- getName() - Method in class org.apache.tika.config.ParamField
-
- getName() - Method in class org.apache.tika.eval.db.ColInfo
-
- getName() - Method in class org.apache.tika.eval.db.TableInfo
-
- getName(String) - Static method in class org.apache.tika.io.FilenameUtils
-
This is a duplication of the algorithm and functionality
available in commons io FilenameUtils.
- getName() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
- getName() - Method in class org.apache.tika.metadata.Property
-
- getName() - Method in class org.apache.tika.mime.MimeType
-
Returns the name of this media type.
- getName() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns an entry name
- getName() - Method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
- getName() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- getName() - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Get the name of the detected charset.
- getNameLength() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Returns an entry name length
- getNames(Metadata) - Method in class org.apache.tika.metadata.serialization.JsonMetadataSerializer
-
Override to get a custom sort order
or to filter names.
- getNamespace() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getNamespacePrefix(String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Obtain the prefix for a registered namespace URI.
- getNamespaces() - Static method in class org.apache.tika.xmp.XMPMetadata
-
- getNamespaceURI(String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Obtain the URI for a registered namespace prefix.
- getNerModelUrl() - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- getNewContentHandler() - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- getNewContentHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- getNewContentHandler() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
- getNewContentHandler(OutputStream, String) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
- getNewContentHandler(OutputStream, Charset) - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
- getNewContentHandler() - Method in interface org.apache.tika.sax.ContentHandlerFactory
-
- getNewContentHandler(OutputStream, String) - Method in interface org.apache.tika.sax.ContentHandlerFactory
-
- getNewContentHandler(OutputStream, Charset) - Method in interface org.apache.tika.sax.ContentHandlerFactory
-
- getNonRefTableInfos() - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
-
- getNonRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
-
- getNonRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
-
- getNormValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getNum_blocks() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns number of blocks
- getNumberHandledExceptions() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- getNumberOfLevels() - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
- getNumConsumers(Map<String, String>) - Static method in class org.apache.tika.batch.builders.BatchProcessBuilder
-
numConsumers is needed by both the crawler and the consumers.
- getNumericDocValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getNumHandledExceptions() - Method in class org.apache.tika.batch.FileResourceConsumer
-
- getNumId() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getNumOfHidden() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- getNumOfInputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- getNumOfOutputs() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- getNumResourcesConsumed() - Method in class org.apache.tika.batch.FileResourceConsumer
-
- getNumRestarts() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
-
- getNumTranslationPairs() - Method in class org.apache.tika.language.translate.CachedTranslator
-
Get the number of different source/target translation pairs this CachedTranslator
currently has in its cache.
- getNumTranslationsFor(String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
-
Get the number of different translations from the source language to the target language
this CachedTranslator has in its cache.
- getOcrDPI() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR
- getOcrImageFormatName() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
String representation of the image format used to render
the page image for OCR (examples: png, tiff, jpeg)
- getOcrImageQuality() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- getOcrImageScale() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Scale to use if rendering a page and then running OCR on that rendered image.
- getOcrImageType() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- getOcrStrategy() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getOffset() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- getOpenContainer() - Method in class org.apache.tika.io.TikaInputStream
-
Returns the open container object, such as a
POIFS FileSystem in the event of an OLE2
document being detected and processed by
the OLE2 detector.
- getOrganizations() - Static method in class org.apache.tika.sax.StandardOrganizations
-
Returns the map containing the collection of the most important technical standard organizations.
- getOrganzationsRegex() - Static method in class org.apache.tika.sax.StandardOrganizations
-
Returns the regular expression containing the most important technical standard organizations.
- getOtherTesseractConfig() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getOutputEncoding() - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
-
- getOutputEncoding() - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
-
- getOutputEncoding() - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
-
- getOutputFile(File, String, FSUtil.HANDLE_EXISTING, String) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Deprecated.
- getOutputPath(Path, String, FSUtil.HANDLE_EXISTING, String) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Given an output root and an initial relative path,
return the output file according to the HANDLE_EXISTING strategy.
In the most basic use case, given a root directory "input",
a file's relative path "dir1/dir2/fileA.docx", and an output directory
"output", the output file would be "output/dir1/dir2/fileA.docx".
If HANDLE_EXISTING is set to OVERWRITE, this will not check to see if the output already exists,
and the returned file could overwrite an existing file.
If HANDLE_EXISTING is set to RENAME, this will try to increment a counter at the end of
the file name (fileA(2).docx) until there is a file name that doesn't exist.
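The RENAME behaviour described above can be approximated with plain `java.nio.file` calls. The following is a sketch under stated assumptions, not the FSUtil implementation: the class `RenameSketch`, the method `nextFreePath`, and the choice to start the counter at 1 are all hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class RenameSketch {
    /**
     * Returns the candidate path if nothing exists there yet; otherwise
     * inserts "(n)" before the extension and increments n until the name is free.
     * Assumption for this sketch: the counter starts at 1.
     */
    static Path nextFreePath(Path candidate) {
        if (!Files.exists(candidate)) {
            return candidate;
        }
        String name = candidate.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot < 0 ? name : name.substring(0, dot);
        String ext = dot < 0 ? "" : name.substring(dot);
        for (int i = 1; ; i++) {
            Path next = candidate.resolveSibling(base + "(" + i + ")" + ext);
            if (!Files.exists(next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("rename-sketch");
        Path existing = Files.createFile(dir.resolve("fileA.docx"));
        // The original name is taken, so a counter-suffixed sibling is proposed.
        System.out.println(nextFreePath(existing).getFileName());
    }
}
```

Note that a check-then-create sequence like this is not atomic; two concurrent writers could still pick the same name, so callers that need a guarantee should create the file with an exclusive-create option and retry on failure.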
- getOutputStream(OutputStreamFactory, FileResource) - Method in class org.apache.tika.batch.fs.AbstractFSConsumer
-
Use this for consistent logging of exceptions.
- getOutputStream(Metadata) - Method in class org.apache.tika.batch.fs.FSOutputStreamFactory
-
This tries to create a file based on the FSUtil.HANDLE_EXISTING
value that was passed in during initialization.
- getOutputStream(Metadata) - Method in interface org.apache.tika.batch.OutputStreamFactory
-
- getOutputStream() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- getOutputThreshold() - Method in class org.apache.tika.sax.SecureContentHandler
-
Returns the configured output threshold.
- getOutputType() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getOverlap() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getPackage() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getPageSegMode() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPageSeparator() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getParameters() - Method in class org.apache.tika.mime.MediaType
-
Returns an immutable sorted map of the parameters of this media type.
- getParams() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- getParseException() - Method in class org.apache.tika.eval.util.ContentTags
-
- getParser(TikaConfig) - Method in class org.apache.tika.batch.AutoDetectParserFactory
-
- getParser(TikaConfig) - Method in class org.apache.tika.batch.DigestingAutoDetectParserFactory
-
- getParser(TikaConfig) - Method in class org.apache.tika.batch.ParserFactory
-
- getParser(MediaType) - Method in class org.apache.tika.config.TikaConfig
-
- getParser() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured parser instance.
- getParser(Metadata) - Method in class org.apache.tika.parser.CompositeParser
-
Returns the parser that best matches the given metadata.
- getParser(Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
- getParser() - Method in class org.apache.tika.Tika
-
Returns the parser instance used by this facade.
- getParserClassname(Parser) - Static method in class org.apache.tika.utils.ParserUtils
-
- getParserDetailsHTML() - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParserDetailsJSON() - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParserDetailssPlain() - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParseRecursively() - Method in class org.apache.tika.batch.ParserFactory
-
- getParsers(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
- getParsers() - Method in class org.apache.tika.parser.CompositeParser
-
Returns the component parsers.
- getParsers(ParseContext) - Method in class org.apache.tika.parser.DefaultParser
-
- getParsersHTML() - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParsersHTML(boolean) - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParsersJSON() - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParsersJSON(boolean) - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParsersPlain() - Method in class org.apache.tika.server.resource.TikaParsers
-
- getParsersPlain(boolean) - Method in class org.apache.tika.server.resource.TikaParsers
-
- getPart() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getPassword(Metadata) - Method in interface org.apache.tika.parser.PasswordProvider
-
Looks up the password for a document with the given metadata,
and returns it for the Parser.
- getPasswordProvider() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- getPath(Map<String, String>, String) - Method in class org.apache.tika.eval.batch.EvalConsumersBuilder
-
- getPath() - Method in class org.apache.tika.io.TikaInputStream
-
If the user created this TikaInputStream with a file,
the original file will be returned.
- getPath(int) - Method in class org.apache.tika.io.TikaInputStream
-
- getPath(String, Path) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getPathClassifyModel() - Method in class org.apache.tika.parser.recognition.AgeRecogniserConfig
-
- getPathClassifyRegression() - Method in class org.apache.tika.parser.recognition.AgeRecogniserConfig
-
- getPathsFromExtractCrawl(Metadata, Path) - Method in class org.apache.tika.eval.AbstractProfiler
-
- getPathsFromSrcCrawl(Metadata, Path, Path) - Method in class org.apache.tika.eval.AbstractProfiler
-
- getPDFParserConfig() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getPingPulseMillis() - Method in class org.apache.tika.server.ServerTimeouts
-
- getPingTimeoutMillis() - Method in class org.apache.tika.server.ServerTimeouts
-
- getPointValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getPoolSize() - Method in class org.apache.tika.fork.ForkParser
-
Returns the size of the process pool.
- getPoolSize() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
- getPosition() - Method in class org.apache.tika.io.NullInputStream
-
Return the current position.
- getPosition() - Method in class org.apache.tika.io.TikaInputStream
-
Returns the current position within the stream.
- getPrecision() - Method in class org.apache.tika.eval.db.ColInfo
-
Gets the precision.
- getPrefixes() - Static method in class org.apache.tika.xmp.XMPMetadata
-
- getPreserveInterwordSpacing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getPrevContent() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getPrimaryProperty() - Method in class org.apache.tika.metadata.Property
-
Gets the primary property for a composite property
- getProbabilities(String) - Static method in class org.apache.tika.eval.util.LanguageIDWrapper
-
- getProfile() - Method in class org.apache.tika.language.ProfilingHandler
-
Deprecated.
Returns the language profile being built by this content handler.
- getProfile() - Method in class org.apache.tika.language.ProfilingWriter
-
Deprecated.
Returns the language profile being built by this writer.
- getProperties(String) - Static method in class org.apache.tika.metadata.Property
-
- getProperty(Object) - Method in class org.apache.tika.example.ImportContextImpl
-
- getPropertyType(String) - Static method in class org.apache.tika.metadata.Property
-
Get the type of a property
- getPropertyType() - Method in class org.apache.tika.metadata.Property
-
- getProvider() - Method in class org.apache.tika.parser.digest.InputStreamDigester
-
When subclassing this, be careful to ensure that your provider is
thread-safe (not likely) or return a new provider with each call.
- getQNameAsString(QName) - Static method in class org.apache.tika.sax.ElementMappingContentHandler
-
- getR0() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getR1() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getR2() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getRawScore() - Method in class org.apache.tika.language.detect.LanguageResult
-
- getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of an inputStream, and return a Java Reader
to access the converted input data.
- getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a java.io.Reader for reading the Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getReaderCacheHelper() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getRefTableInfos() - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
-
- getRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractComparerBuilder
-
- getRefTableInfos() - Method in class org.apache.tika.eval.batch.ExtractProfilerBuilder
-
- getRegisteredMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
-
Returns the registered, normalised media type with the given name (or alias).
- getRel() - Method in class org.apache.tika.sax.Link
-
- getResetInterval() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns reset interval
- getResetTableIndex() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Return index of reset table
- getResize() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getResource(Class<T>) - Method in class org.apache.tika.io.TemporaryResources
-
Returns the latest of the tracked resources that implements or
extends the given interface or class.
- getResourceAsStream(String) - Method in class org.apache.tika.config.ServiceLoader
-
Returns an input stream for reading the specified resource from the
configured class loader.
- getResourceId() - Method in interface org.apache.tika.batch.FileResource
-
This is only used in logging to identify which file
may have caused problems.
- getResourceId() - Method in class org.apache.tika.batch.fs.FSFileResource
-
- getRight() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- getRoughCountExceptions() - Method in class org.apache.tika.batch.StatusReporter
-
This returns a rough (unsynchronized) count of caught/handled exceptions.
- getRSSFooters() - Method in class org.apache.tika.example.RecentFiles
-
- getRSSHeaders() - Method in class org.apache.tika.example.RecentFiles
-
- getRSSItem(Document) - Method in class org.apache.tika.example.RecentFiles
-
- getSampleRate() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the sampling rate, in Hz
- getSAXParser() - Method in class org.apache.tika.parser.ParseContext
-
Returns the SAX parser specified in this parsing context.
- getSAXParser() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the SAX parser specified in this parsing context.
- getSAXParserFactory() - Method in class org.apache.tika.parser.ParseContext
-
Returns the SAX parser factory specified in this parsing context.
- getSAXParserFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the SAX parser factory specified in this parsing context.
- getScore() - Method in class org.apache.tika.sax.StandardReference
-
- getSecondaryExtractProperties() - Method in class org.apache.tika.metadata.Property
-
Gets the secondary properties for a composite property
- getSecondOrganizationAcronym() - Method in class org.apache.tika.sax.StandardReference
-
- getSeparator() - Method in class org.apache.tika.sax.StandardReference
-
- getSeparatorChar() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the separator character used for annotation properties.
- getSerializerType() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the type of cTAKES (UIMA) serializer used to write the CAS.
- getServiceClass(Class<T>, String) - Method in class org.apache.tika.config.ServiceLoader
-
Loads and returns the named service class that's expected to implement
the given interface.
- getServiceLoader() - Method in class org.apache.tika.config.TikaConfig
-
- getSetKCMS() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSetter() - Method in class org.apache.tika.config.ParamField
-
- getShortBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from the beginning of a byte array
- getShortBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from a byte array
- getShortLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from the beginning of a byte array
- getShortLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from a byte array
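The difference between the BE (big-endian) and LE (little-endian) variants listed above comes down to which of the two bytes is treated as the high-order byte. A minimal standalone sketch follows; the class `EndianSketch` and its methods are hypothetical helpers written for this example and are not the EndianUtils source.

```java
final class EndianSketch {
    /** Big-endian: the byte at offset is the high-order byte. */
    static short shortBE(byte[] data, int offset) {
        return (short) (((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF));
    }

    /** Little-endian: the byte at offset is the low-order byte. */
    static short shortLE(byte[] data, int offset) {
        return (short) (((data[offset + 1] & 0xFF) << 8) | (data[offset] & 0xFF));
    }

    public static void main(String[] args) {
        byte[] bytes = {0x01, 0x02};
        System.out.println(shortBE(bytes, 0)); // 0x0102 = 258
        System.out.println(shortLE(bytes, 0)); // 0x0201 = 513
    }
}
```

The `& 0xFF` masks matter because Java bytes are signed; without them a byte such as `0xFE` would sign-extend and corrupt the upper bits of the result.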
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns a signature of itsf header
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns a signature of the header
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a signature of control data block
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns pmgi signature if exists
- getSignature() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getSimilarity(LanguageProfilerBuilder) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Calculates a score for how well NGramProfiles match each other
- getSize() - Method in class org.apache.tika.io.NullInputStream
-
- getSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a size of control data
- getSize() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTag
-
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.CSVMessageBodyWriter
-
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.JSONMessageBodyWriter
-
- getSize(MetadataList, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.MetadataListMessageBodyWriter
-
- getSize(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.TarWriter
-
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.TextMessageBodyWriter
-
- getSize(Metadata, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.XMPMessageBodyWriter
-
- getSize(Map<String, byte[]>, Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.ZipWriter
-
- getSize() - Method in class org.apache.tika.utils.RereadableInputStream
-
Returns the number of bytes read from the original stream.
- getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSortByPosition() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSorted() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Returns a sorted list of ngrams (sort done by 1.
- getSortedDocValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getSortedNumericDocValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getSortedSetDocValues(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getSourceFileLength(EvalFilePaths, List<Metadata>) - Method in class org.apache.tika.eval.AbstractProfiler
-
- getSpacingTolerance() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSqlDef() - Method in class org.apache.tika.eval.db.ColInfo
-
- getStackTrace(Throwable) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Get the full stacktrace as a string
- getStartBlock() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the start block index
- getStartIndex() - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- getStartOffset() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns the start offset index
- getState() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- getStatus() - Method in class org.apache.tika.server.ServerStatus
-
- getStream_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns stream uuid
- getString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the String at the given
offset and length.
- getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Autodetect the charset of an inputStream, and return a String
containing the converted input data.
- getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
-
Create a Java String from Unicode character data corresponding
to the original byte data supplied to the Charset detect operation.
- getString(String, String) - Static method in class org.apache.tika.util.PropsUtil
-
Parses v.
- getStringsPath() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the "strings" installation folder.
- getStringsProg() - Static method in class org.apache.tika.parser.strings.StringsParser
-
- getStripMarkup() - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- getStyleClass() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- getStyleID() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- getStyleName(String) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFStylesShim
-
- getSubtype() - Method in class org.apache.tika.mime.MediaType
-
Return the Sub-Type of the MediaType,
such as "plain" for "text/plain"
- getSuffix(InputStream, int) - Static method in class org.apache.tika.parser.mp3.LyricsHandler
-
Reads and returns the last length bytes from the given stream.
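One way to return the trailing bytes of a stream without buffering the whole stream is a fixed-size circular buffer. The sketch below illustrates that idea only; the class `SuffixSketch` and the method `lastBytes` are hypothetical, and nothing here is taken from the LyricsHandler source. It reads one byte at a time for clarity, so a real caller would likely wrap the input in a `BufferedInputStream`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

final class SuffixSketch {
    /**
     * Returns up to the last `length` bytes of the stream, in stream order.
     * If the stream is shorter than `length`, all of its bytes are returned.
     */
    static byte[] lastBytes(InputStream in, int length) throws IOException {
        byte[] ring = new byte[length]; // circular buffer of the most recent bytes
        long total = 0;                 // number of bytes read so far
        int b;
        while (length > 0 && (b = in.read()) != -1) {
            ring[(int) (total % length)] = (byte) b;
            total++;
        }
        int kept = (int) Math.min(total, length);
        byte[] out = new byte[kept];
        // Once the buffer has wrapped, the oldest retained byte sits at total % length.
        int start = total <= length ? 0 : (int) (total % length);
        for (int i = 0; i < kept; i++) {
            out[i] = ring[(start + i) % length];
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] {1, 2, 3, 4, 5});
        System.out.println(Arrays.toString(lastBytes(in, 3))); // [3, 4, 5]
    }
}
```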
- getSummaryStatistics() - Method in class org.apache.tika.eval.tokens.TokenStatistics
-
- getSupertype(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the supertype of the given type.
- getSupportedEmbedTypes(ParseContext) - Method in interface org.apache.tika.embedder.Embedder
-
Returns the set of media types supported by this embedder when used with
the given parse context.
- getSupportedEmbedTypes(ParseContext) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
- getSupportedEmbedTypes() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
- getSupportedLanguages() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Returns what languages are supported for language identification
- getSupportedMimes() - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
- getSupportedMimes() - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- getSupportedMimes() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- getSupportedMimes() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
The mimes supported by this recogniser
- getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- getSupportedMimes() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.DirListParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.EncryptedPrescriptionParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.example.PrescriptionParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.fork.ForkParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.CryptoParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.EmptyParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ErrorParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
-
- getSupportedTypes() - Method in class org.apache.tika.parser.external.ExternalParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.NetworkParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- getSupportedTypes(ParseContext) - Method in interface org.apache.tika.parser.Parser
-
Returns the set of media types supported by this parser when used
with the given parse context.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
-
Delegates the method call to the decorated parser.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
-
Returns the set of media types supported by this parser when used with the
given parse context.
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
Returns the types supported
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.FictionBookParser
-
- getSupportedTypes(ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParser
-
- getSuppressDuplicateOverlappingText() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- getSwath() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getSyncBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getSystem_uuid() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns system uuid
- getSystemId() - Method in class org.apache.tika.example.ImportContextImpl
-
- getTableOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets a table offset
- getTables(Connection) - Method in class org.apache.tika.eval.db.H2Util
-
- getTables(Connection) - Method in class org.apache.tika.eval.db.JDBCUtil
-
- getTag() - Method in exception org.apache.tika.io.TaggedIOException
-
Returns the object reference used as the tag of this exception.
- getTag() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- getTag() - Method in exception org.apache.tika.sax.TaggedSAXException
-
Returns the object reference used as the tag of this exception.
- getTags() - Method in class org.apache.tika.eval.util.ContentTags
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTagsPresent() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
Does the file contain this kind of tag?
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTagsPresent() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTagString(byte[], int, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
Returns the (possibly null padded) String at the given offset and
length.
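A minimal sketch of reading a possibly null-padded string from a byte array follows. It is not the ID3v2Frame implementation: the class and method names are hypothetical, and it assumes ISO-8859-1 text that ends at the first NUL byte, whereas real ID3v2 frames may use other encodings.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Tika.
final class NullPaddedStringSketch {
    private NullPaddedStringSketch() {}

    /** Decodes up to length bytes starting at offset, stopping at the first NUL byte. */
    static String readPadded(byte[] data, int offset, int length) {
        if (offset < 0 || offset >= data.length || length <= 0) {
            return "";
        }
        // Never read past the end of the array.
        int end = Math.min(data.length, offset + length);
        int stop = offset;
        while (stop < end && data[stop] != 0) {
            stop++;
        }
        // Assumption: single-byte Latin-1 text.
        return new String(data, offset, stop - offset, StandardCharsets.ISO_8859_1);
    }
}
```

For example, `readPadded(new byte[] {97, 98, 0, 0}, 0, 4)` yields the two-character string `ab`, with the padding dropped.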
- getTail() - Method in class org.apache.tika.io.TailStream
-
Returns an array with the last data read from the underlying stream.
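The idea of keeping "the last data read" can be illustrated with a fixed-size ring buffer. This is a sketch under assumptions, not TailStream's actual code: the class name, the `write` method, and the one-byte-at-a-time interface are hypothetical.

```java
// Hypothetical illustration of a tail buffer, not part of Tika.
final class TailBufferSketch {
    private final byte[] ring;
    private long total; // number of bytes seen so far

    TailBufferSketch(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        ring = new byte[size];
    }

    /** Records one byte, overwriting the oldest byte once the buffer is full. */
    void write(byte b) {
        ring[(int) (total % ring.length)] = b;
        total++;
    }

    /** Returns the most recent bytes in the order they were written. */
    byte[] tail() {
        int n = (int) Math.min(total, ring.length);
        byte[] out = new byte[n];
        long start = total - n;
        for (int i = 0; i < n; i++) {
            out[i] = ring[(int) ((start + i) % ring.length)];
        }
        return out;
    }
}
```

A buffer like this is what makes it possible to inspect a trailer, such as the final 128 bytes of an MP3 file, after streaming through the whole input once.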
- getTasks() - Method in class org.apache.tika.server.ServerStatus
-
- getTaskTimeoutMillis() - Method in class org.apache.tika.server.ServerTimeouts
-
How long to wait for a task before shutting down the child server process
and restarting it.
- getTermVectors(int) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- getTessdataPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getTesseractPath() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xps.XPSTextExtractor
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- getText() - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- getText() - Method in class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Gets the text, if present
- getText() - Method in class org.apache.tika.sax.Link
-
- getText(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getTextDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
Retrieves the built TextDocument
- getTextFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getTextMain(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getTextMainFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getThreshold() - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
Gets the threshold to be used for selecting the standard references found
within the text based on their score.
- getTikaConfig() - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- getTimeout() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- getTimeout() - Method in class org.apache.tika.parser.strings.StringsConfig
-
Returns the maximum time (in seconds) to wait for the "strings" command
to terminate.
- getTitle() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTitle() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTitle() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTitle() - Method in class org.apache.tika.sax.Link
-
- getToken() - Method in class org.apache.tika.eval.tokens.TokenIntPair
-
- getTokens(String) - Method in class org.apache.tika.eval.tokens.TokenCounter
-
- getTokenStatistics(String) - Method in class org.apache.tika.eval.tokens.TokenCounter
-
- getTopN() - Method in class org.apache.tika.eval.tokens.TokenStatistics
-
- getTopNMoreA() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
-
- getTopNMoreB() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
-
- getTopNUniqueA() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
-
- getTopNUniqueB() - Method in class org.apache.tika.eval.tokens.ContrastStatistics
-
- getTotal() - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- getTotalTokens() - Method in class org.apache.tika.eval.tokens.TokenStatistics
-
- getTotalUniqueTokens() - Method in class org.apache.tika.eval.tokens.TokenStatistics
-
- getTrackingMetadata() - Method in class org.apache.tika.parser.mbox.MboxParser
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getTrackNumber() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
The number of the track within the album / recording
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getTrackNumber() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- getTransformer() - Method in class org.apache.tika.parser.ParseContext
-
Returns the transformer specified in this parsing context.
- getTransformer() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns a new transformer
- getTranslator() - Method in class org.apache.tika.config.TikaConfig
-
Returns the configured translator instance.
- getTranslator() - Method in class org.apache.tika.language.translate.CachedTranslator
-
- getTranslator() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Returns the current translator
- getTranslator() - Method in class org.apache.tika.Tika
-
Returns the translator instance used by this facade.
- getTranslators() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Returns all available translators
- getType() - Method in class org.apache.tika.config.Param
-
- getType() - Method in class org.apache.tika.config.ParamField
-
- getType() - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- getType() - Method in class org.apache.tika.eval.db.ColInfo
-
- getType() - Method in exception org.apache.tika.eval.io.ExtractReaderException
-
- getType() - Method in class org.apache.tika.mime.MediaType
-
Returns the type of the MediaType, such as
"text" for "text/plain".
- getType() - Method in class org.apache.tika.mime.MimeType
-
Returns the normalized media type name.
- getType() - Method in class org.apache.tika.parser.image.ICNSType
-
- getType() - Method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
- getType() - Method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
- getType() - Method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
- getType() - Method in class org.apache.tika.sax.BasicContentHandlerFactory
-
- getType() - Method in class org.apache.tika.sax.Link
-
- getTypeFromVal(int) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
- getTypes() - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Returns the set of all known canonical media types.
- getTypeString() - Method in class org.apache.tika.config.Param
-
- getUByte(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Gets the unsigned value of a byte.
- getUIntBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from a byte array
- getUIntBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from a byte array
- getUIntLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from a byte array
- getUIntLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from a byte array
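The difference between the BE and LE variants is only the order in which the four bytes are combined. The sketch below shows this with plain Java; it is not the EndianUtils source, and the class and method names are hypothetical. It returns `long` because Java has no unsigned 32-bit `int`.

```java
// Hypothetical illustration of unsigned reads, not part of Tika.
final class EndianSketch {
    private EndianSketch() {}

    /** Unsigned value (0..255) of the byte at offset. */
    static int uByte(byte[] data, int offset) {
        return data[offset] & 0xFF;
    }

    /** Big-endian: the first byte is the most significant. */
    static long uIntBE(byte[] data, int offset) {
        return ((long) uByte(data, offset) << 24)
                | ((long) uByte(data, offset + 1) << 16)
                | ((long) uByte(data, offset + 2) << 8)
                | uByte(data, offset + 3);
    }

    /** Little-endian: the first byte is the least significant. */
    static long uIntLE(byte[] data, int offset) {
        return ((long) uByte(data, offset + 3) << 24)
                | ((long) uByte(data, offset + 2) << 16)
                | ((long) uByte(data, offset + 1) << 8)
                | uByte(data, offset);
    }
}
```

For the bytes `FF 00 00 01`, the big-endian read gives `0xFF000001` and the little-endian read gives `0x010000FF`.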
- getUMLSPass() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS password.
- getUMLSUser() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns the UMLS username.
- getUncompressedLen() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets uncompressed length
- getUnderline() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- getUniformTypeIdentifier() - Method in class org.apache.tika.mime.MimeType
-
Get the UTI for this mime type.
- getUniqueAlphabeticTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
-
- getUniqueCommonTokens() - Method in class org.apache.tika.eval.tokens.CommonTokenResult
-
- getUnknown() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Gets unknown
- getUnknown0008() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown_000c value
- getUnknown_000c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 000c unknown bytes
- getUnknown_0024() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 0024 unknown bytes
- getUnknown_002c() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 002c unknown bytes
- getUnknown_0044() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns 0044 unknown bytes
- getUnknown_18() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns unknown 18 bytes
- getUnknownLen() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown length
- getUnknownOffset() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns unknown offset
- getUri() - Method in class org.apache.tika.sax.Link
-
- getUserInterrupted() - Method in class org.apache.tika.batch.BatchProcessDriverCLI
-
- getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- getUseSAXDocxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getUseSAXPptxExtractor() - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
- getUShortBE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned short value from the beginning of a byte array
- getUShortBE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned short value from a byte array
- getUShortLE(byte[]) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned short value from the beginning of a byte array
- getUShortLE(byte[], int) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned short value from a byte array
- getValue() - Method in class org.apache.tika.config.Param
-
- getValue() - Method in class org.apache.tika.eval.tokens.TokenIntPair
-
- getValues(Property) - Method in class org.apache.tika.metadata.Metadata
-
Get the values associated to a metadata name.
- getValues(String) - Method in class org.apache.tika.metadata.Metadata
-
Get the values associated to a metadata name.
- getValues(Property) - Method in class org.apache.tika.xmp.XMPMetadata
-
- getValues(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Returns the value of a simple property, or all values if the property is an array whose
elements are of simple type.
- getValueType() - Method in class org.apache.tika.metadata.Property
-
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns itsf header version
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Returns version of itsp header
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns the version of the control data block
- getVersion() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Returns the version
- getVersion() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
- getVersion() - Method in class org.apache.tika.server.resource.TikaVersion
-
- getVersionCode() - Method in class org.apache.tika.parser.mp3.AudioFrame
-
Get the version code.
- getWelcomeHTML() - Method in class org.apache.tika.server.resource.TikaWelcome
-
- getWelcomePlain() - Method in class org.apache.tika.server.resource.TikaWelcome
-
- getWidth() - Method in class org.apache.tika.parser.image.ICNSType
-
- getWindow() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowPosition() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowSize() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a window size
- getWindowSize(int) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
LZX supports window sizes of 2^15 (32 KB) through 2^21 (2 MB). Returns X,
i.e. 2^X
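The mapping from a window size to its exponent X can be sketched as below. This is not the ChmCommons implementation: the class and method names are hypothetical, and it assumes the argument is the window size in bytes and that sizes outside the stated range should be rejected.

```java
// Hypothetical illustration of the 2^X mapping, not part of Tika.
final class LzxWindowSketch {
    static final int MIN_BITS = 15; // 2^15 = 32 KB
    static final int MAX_BITS = 21; // 2^21 = 2 MB

    private LzxWindowSketch() {}

    /** Returns X such that windowSizeBytes == 2^X, for X in [15, 21]. */
    static int windowBits(int windowSizeBytes) {
        // A power of two has exactly one bit set.
        if (windowSizeBytes <= 0 || Integer.bitCount(windowSizeBytes) != 1) {
            throw new IllegalArgumentException("not a power of two: " + windowSizeBytes);
        }
        int bits = Integer.numberOfTrailingZeros(windowSizeBytes);
        if (bits < MIN_BITS || bits > MAX_BITS) {
            throw new IllegalArgumentException("window size out of LZX range: 2^" + bits);
        }
        return bits;
    }
}
```

So a 32 KB window (32768 bytes) maps to 15 and a 2 MB window maps to 21.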
- getWindowSize() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- getWindowsPerReset() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns windows per reset
- getWrappedParser() - Method in class org.apache.tika.parser.ParserDecorator
-
Gets the parser wrapped by this ParserDecorator
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
-
Parses the document into a sequence of XHTML SAX events sent to the
given content handler.
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFBExcelExtractorDecorator
-
- getXHTML(ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- getXML(InputStream, HttpHeaders, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getXMLFromMultipart(Attachment, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- getXMLifiedLogMsg(String, String, String...) - Method in class org.apache.tika.batch.FileResourceConsumer
-
- getXMLifiedLogMsg(String, String, Throwable, String...) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Use this for structured output that captures resourceId and other attributes.
- getXMLInputFactory() - Method in class org.apache.tika.parser.ParseContext
-
Returns the StAX input factory specified in this parsing context.
- getXMLInputFactory() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the StAX input factory specified in this parsing context.
- getXMLReader() - Method in class org.apache.tika.parser.ParseContext
-
Returns the XMLReader specified in this parsing context.
- getXMLReader() - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Returns the XMLReader specified in this parsing context.
- getXMPData() - Method in class org.apache.tika.xmp.XMPMetadata
-
Provides direct access to the XMP data model, in case a client prefers to work directly on it
instead of using the Metadata API
- getXMPMeta() - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- getYear() - Method in class org.apache.tika.parser.mp3.CompositeTagHandler
-
- getYear() - Method in interface org.apache.tika.parser.mp3.ID3Tags
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v1Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v22Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v23Handler
-
- getYear() - Method in class org.apache.tika.parser.mp3.ID3v24Handler
-
- GLOB_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- GoogleTranslator - Class in org.apache.tika.language.translate
-
- GoogleTranslator() - Constructor for class org.apache.tika.language.translate.GoogleTranslator
-
- GrabPhoneNumbersExample - Class in org.apache.tika.example
-
- GrabPhoneNumbersExample() - Constructor for class org.apache.tika.example.GrabPhoneNumbersExample
-
- GREETING - Static variable in class org.apache.tika.server.resource.TikaResource
-
- GRIB_MIME_TYPE - Static variable in class org.apache.tika.parser.grib.GribParser
-
- GribParser - Class in org.apache.tika.parser.grib
-
- GribParser() - Constructor for class org.apache.tika.parser.grib.GribParser
-
- GrobidNERecogniser - Class in org.apache.tika.parser.ner.grobid
-
- GrobidNERecogniser() - Constructor for class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
- GrobidRESTParser - Class in org.apache.tika.parser.journal
-
- GrobidRESTParser() - Constructor for class org.apache.tika.parser.journal.GrobidRESTParser
-
- ICNS_1024x1024_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_128x128_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x12_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x12_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x12_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_16x16_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_256x256_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_256x256_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_1BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_2X_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_32x32_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_1BIT_IMAGE_AND_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_24BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_4BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_8BIT_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_48x48_8BIT_MASK - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_512x512_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_64x64_JPEG_PNG_IMAGE - Static variable in class org.apache.tika.parser.image.ICNSType
-
- ICNS_MIME_TYPE - Static variable in class org.apache.tika.parser.image.ICNSParser
-
- ICNSParser - Class in org.apache.tika.parser.image
-
A basic parser class for Apple ICNS icon files
- ICNSParser() - Constructor for class org.apache.tika.parser.image.ICNSParser
-
- ICNSType - Class in org.apache.tika.parser.image
-
Holds details on Apple ICNS icons
- IContentHandlerFactoryBuilder - Interface in org.apache.tika.batch.builders
-
- ICrawlerBuilder - Interface in org.apache.tika.batch.builders
-
- Icu4jEncodingDetector - Class in org.apache.tika.parser.txt
-
- Icu4jEncodingDetector() - Constructor for class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
- ID - Static variable in class org.apache.tika.eval.AbstractProfiler
-
- ID - Static variable in interface org.apache.tika.metadata.QuattroPro
-
ID.
- id - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Identifier for this object
- id - Variable in class org.apache.tika.parser.rtf.ListDescriptor
-
- ID3Comment(String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Creates an ID3 v1 style comment tag
- ID3Comment(String, String, String) - Constructor for class org.apache.tika.parser.mp3.ID3Tags.ID3Comment
-
Creates an ID3 v2 style comment tag
- ID3Tags - Interface in org.apache.tika.parser.mp3
-
Interface that defines the common interface for ID3 tag parsers,
such as ID3v1 and ID3v2.3.
- ID3Tags.ID3Comment - Class in org.apache.tika.parser.mp3
-
Represents a comment in ID3 (especially ID3 v2), which is
made up of several parts
- ID3TagsAndAudio() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser.ID3TagsAndAudio
-
- ID3v1Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 1 Tag information from an MP3 file,
if available.
- ID3v1Handler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
-
- ID3v1Handler(byte[]) - Constructor for class org.apache.tika.parser.mp3.ID3v1Handler
-
Creates from the last 128 bytes of a stream.
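The last 128 bytes matter because an ID3v1 tag is a fixed-size trailer: the marker `TAG`, then a 30-byte title, 30-byte artist, 30-byte album, 4-byte year, 30-byte comment and 1-byte genre. The sketch below reads two of those fields; it is not ID3v1Handler's code, the names are hypothetical, and ISO-8859-1 text is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of the ID3v1 trailer layout, not part of Tika.
final class Id3v1Sketch {
    static final int TAG_LENGTH = 128;

    private Id3v1Sketch() {}

    /** True if the buffer is a 128-byte block starting with the "TAG" marker. */
    static boolean hasTag(byte[] tail) {
        return tail.length == TAG_LENGTH
                && tail[0] == 'T' && tail[1] == 'A' && tail[2] == 'G';
    }

    /** Title occupies bytes 3..32. */
    static String title(byte[] tail) {
        return field(tail, 3, 30);
    }

    /** Year occupies bytes 93..96. */
    static String year(byte[] tail) {
        return field(tail, 93, 4);
    }

    // Fields are padded with NUL bytes or spaces; cut at the first NUL and trim.
    private static String field(byte[] tail, int offset, int length) {
        String raw = new String(tail, offset, length, StandardCharsets.ISO_8859_1);
        int nul = raw.indexOf('\0');
        return (nul >= 0 ? raw.substring(0, nul) : raw).trim();
    }
}
```

Callers would check `hasTag` first, since a file without an ID3v1 tag simply ends in audio data.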
- ID3v22Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.2 Tag information from an MP3 file,
if available.
- ID3v22Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v22Handler
-
- ID3v23Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.3 Tag information from an MP3 file,
if available.
- ID3v23Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v23Handler
-
- ID3v24Handler - Class in org.apache.tika.parser.mp3
-
This is used to parse ID3 Version 2.4 Tag information from an MP3 file,
if available.
- ID3v24Handler(ID3v2Frame) - Constructor for class org.apache.tika.parser.mp3.ID3v24Handler
-
- ID3v2Frame - Class in org.apache.tika.parser.mp3
-
A frame of ID3v2 data, which is then passed to a handler to
be turned into useful data.
- ID3v2Frame.RawTag - Class in org.apache.tika.parser.mp3
-
- ID3v2Frame.RawTagIterator - Class in org.apache.tika.parser.mp3
-
Iterates over id3v2 raw tags.
- ID3v2Frame.TextEncoding - Class in org.apache.tika.parser.mp3
-
- ID_PROPERTY - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
-
- IDBWriter - Interface in org.apache.tika.eval.io
-
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.DublinCore
-
Recommended best practice is to identify the resource by means of
a string or number conforming to a formal identification system.
- IDENTIFIER - Static variable in class org.apache.tika.metadata.Metadata
-
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- IDENTIFIER - Static variable in interface org.apache.tika.metadata.XMP
-
An unordered array of text strings that unambiguously identify the resource
within a given context.
- identifyEndpoints() - Method in class org.apache.tika.server.resource.TikaWelcome
-
- identifyStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the defined static service providers of the given type, without
attempting to load them.
- IdentityHtmlMapper - Class in org.apache.tika.parser.html
-
Alternative HTML mapping rules that pass the input HTML as-is without any
modifications.
- IdentityHtmlMapper() - Constructor for class org.apache.tika.parser.html.IdentityHtmlMapper
-
- IFileProcessorFutureResult - Interface in org.apache.tika.batch
-
Stub interface to allow for different result types from different processors.
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.DIFContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.LinkContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ToTextContentHandler
-
Writes the given ignorable characters to the given character stream.
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
-
- ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
-
- IGNORE - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
Strategy that simply ignores all problems.
- IGNORE - Static variable in interface org.apache.tika.config.LoadErrorHandler
-
Strategy that simply ignores all problems.
- IGNORE_LENGTH - Static variable in class org.apache.tika.eval.io.ExtractReader
-
- image(String) - Static method in class org.apache.tika.mime.MediaType
-
- IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- IMAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Images in the document
- IMAGE_CREATOR - Static variable in interface org.apache.tika.metadata.IPTC
-
Creator or creators of the image.
- IMAGE_CREATOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the creator or creators of the image.
- IMAGE_CREATOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
- IMAGE_CREATOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the creator or creators of the image.
- IMAGE_LENGTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Image height in pixels."
- IMAGE_REGISTRY_ENTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
Both a Registry Item Id and a Registry Organisation Id to record any
registration of this item with a registry.
- IMAGE_SUPPLIER - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily
its owner or creator.
- IMAGE_SUPPLIER_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily
its owner or creator.
- IMAGE_SUPPLIER_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
- IMAGE_SUPPLIER_IMAGE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier assigned by the Image Supplier to the image.
- IMAGE_SUPPLIER_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the most recent supplier of the item, who is not necessarily
its owner or creator.
- IMAGE_WIDTH - Static variable in interface org.apache.tika.metadata.TIFF
-
"Image width in pixels."
- ImageMetadataExtractor - Class in org.apache.tika.parser.image
-
Uses the Metadata Extractor library to read EXIF and IPTC image metadata
and map to Tika fields.
- ImageMetadataExtractor(Metadata) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
-
- ImageMetadataExtractor(Metadata, ImageMetadataExtractor.DirectoryHandler...) - Constructor for class org.apache.tika.parser.image.ImageMetadataExtractor
-
- ImageParser - Class in org.apache.tika.parser.image
-
- ImageParser() - Constructor for class org.apache.tika.parser.image.ImageParser
-
- ImportContextImpl - Class in org.apache.tika.example
-
ImportContextImpl
...
- ImportContextImpl(Item, String, InputContext, InputStream, IOListener, Detector) - Constructor for class org.apache.tika.example.ImportContextImpl
-
Creates a new item import context.
- increaseFramesRead() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- incrementHandledExceptions() - Method in class org.apache.tika.batch.FileResourceConsumer
-
Make sure to call this appropriately!
- incrementLevel(int, AbstractListManager.LevelTuple[]) - Method in class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
Apply this to every numbered paragraph in order.
- indexContentSpecificMet(File) - Method in class org.apache.tika.example.MetadataAwareLuceneIndexer
-
- indexDocument(File) - Method in class org.apache.tika.example.LuceneIndexer
-
- indexDocument(File) - Method in class org.apache.tika.example.LuceneIndexerExtended
-
- indexOf(byte[], byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Searches for some pattern in byte[]
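
A byte-pattern search of this kind can be sketched with a plain left-to-right scan. This is only an illustration: the class name BytePatternSearch is hypothetical, the choice to return -1 for null or empty input is an assumption, and ChmCommons may well use a different algorithm.

```java
/** Illustrative sketch only; not the ChmCommons implementation. */
final class BytePatternSearch {

    private BytePatternSearch() {
    }

    /**
     * Returns the index of the first occurrence of pattern in text,
     * or -1 if there is none. Null or empty input yields -1 (an
     * assumption made for this sketch).
     */
    static int indexOf(byte[] text, byte[] pattern) {
        if (text == null || pattern == null
                || pattern.length == 0 || pattern.length > text.length) {
            return -1;
        }
        outer:
        for (int i = 0; i <= text.length - pattern.length; i++) {
            for (int j = 0; j < pattern.length; j++) {
                if (text[i + j] != pattern[j]) {
                    continue outer; // mismatch, try the next start offset
                }
            }
            return i; // every byte of the pattern matched at offset i
        }
        return -1;
    }
}
```

The scan is O(n*m) in the worst case, which is usually acceptable for short signatures; a Knuth-Morris-Pratt table would be the natural upgrade for long patterns.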
- indexOf(List<DirectoryListingEntry>, String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Searches for some pattern in the directory listing entry list
- indexOfResetTableBlock(byte[], byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Returns an index of the reset table
- indexWithDublinCore(File) - Method in class org.apache.tika.example.MetadataAwareLuceneIndexer
-
- INFO - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
Strategy that logs warnings of all problems using a Logger
created using the given class name.
- informCompleted(boolean) - Method in class org.apache.tika.example.ImportContextImpl
-
- init() - Method in class org.apache.tika.batch.ConsumersManager
-
This is called by BatchProcess before submitting the threads
- init() - Method in class org.apache.tika.batch.fs.FSConsumersManager
-
- init(ArrayBlockingQueue<FileResource>, Map<String, String>, JDBCUtil, boolean) - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
-
- init(DataInputStream, DataOutputStream) - Method in interface org.apache.tika.fork.ForkProxy
-
- init(TikaConfig, DigestingParser.Digester, InputStreamFactory, ServerStatus) - Static method in class org.apache.tika.server.resource.TikaResource
-
- INITIAL_AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the initial creator/author of a document
- Initializable - Interface in org.apache.tika.config
-
Components that must do special processing across multiple fields
at initialization time should implement this interface.
- InitializableProblemHandler - Interface in org.apache.tika.config
-
This is to be used to handle potential recoverable problems that
might arise during initialization.
- initialize(Map<String, Param>) - Method in interface org.apache.tika.config.Initializable
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- initialize(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
Initializes this parser
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
No-op
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
no-op
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.pdf.PDFParser
-
This is a no-op.
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
-
- initialize(Map<String, Param>) - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
This is the hook for configuring the recogniser
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
-
- initialize(Map<String, Param>) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- initProfiles() - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Builds the language profiles.
- initProfiles(Map<String, LanguageProfile>) - Static method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Initializes the language profiles from a user supplied initialized Map.
- INPUT_FILE_TOKEN - Static variable in class org.apache.tika.parser.external.ExternalParser
-
The token, which if present in the Command string, will
be replaced with the input filename.
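
The token substitution described above could look roughly like the following. The class name CommandTokenSketch, the token value "${INPUT}" and the sample command are assumptions made for illustration; they are not taken from ExternalParser.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; not the ExternalParser implementation. */
final class CommandTokenSketch {

    /** Hypothetical token value chosen for this sketch. */
    static final String INPUT_TOKEN = "${INPUT}";

    private CommandTokenSketch() {
    }

    /**
     * Returns a copy of the command in which every occurrence of the
     * token is replaced by the input path. The original list is left
     * untouched so the template can be reused for the next file.
     */
    static List<String> substitute(List<String> command, String inputPath) {
        List<String> result = new ArrayList<>(command.size());
        for (String arg : command) {
            // String.replace does literal (non-regex) replacement
            result.add(arg.replace(INPUT_TOKEN, inputPath));
        }
        return result;
    }
}
```

Keeping the command as a list of arguments rather than one string avoids quoting problems when the input filename contains spaces.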
- inputFilterEnabled() - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Test whether or not input filtering is enabled.
- InputStreamDigester - Class in org.apache.tika.parser.digest
-
- InputStreamDigester(int, String, DigestingParser.Encoder) - Constructor for class org.apache.tika.parser.digest.InputStreamDigester
-
- InputStreamDigester(int, String, String, DigestingParser.Encoder) - Constructor for class org.apache.tika.parser.digest.InputStreamDigester
-
- InputStreamFactory - Interface in org.apache.tika.server
-
Interface to allow for custom/consistent creation of InputStream
- insert(PreparedStatement, TableInfo, Map<Cols, String>) - Static method in class org.apache.tika.eval.db.JDBCUtil
-
- INSTANCE - Static variable in class org.apache.tika.detect.EmptyDetector
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.EmptyParser
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.ErrorParser
-
Singleton instance of this class.
- INSTANCE - Static variable in class org.apache.tika.parser.html.DefaultHtmlMapper
-
- INSTANCE - Static variable in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.AttributeMatcher
-
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.ElementMatcher
-
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.NodeMatcher
-
- INSTANCE - Static variable in class org.apache.tika.sax.xpath.TextMatcher
-
- INSTANCEID - Static variable in interface org.apache.tika.metadata.XMPMM
-
An identifier for a specific incarnation of a resource, updated
each time a file is saved.
- inStartElement - Variable in class org.apache.tika.sax.ToXMLContentHandler
-
- INSTITUTION - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- INSTRUCTIONS - Static variable in interface org.apache.tika.metadata.IPTC
-
Any of a number of instructions from the provider or creator to the
receiver of the item.
- INSTRUCTIONS - Static variable in interface org.apache.tika.metadata.Photoshop
-
- INSTRUMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The musical instrument."
- intelE8Decoding() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- INTELLECTUAL_GENRE - Static variable in interface org.apache.tika.metadata.IPTC
-
Describes the nature, intellectual, artistic or journalistic
characteristic of an item, not specifically its content.
- internalBoolean(String) - Static method in class org.apache.tika.metadata.Property
-
- internalClosedChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
-
- internalDate(String) - Static method in class org.apache.tika.metadata.Property
-
- internalInteger(String) - Static method in class org.apache.tika.metadata.Property
-
- internalIntegerSequence(String) - Static method in class org.apache.tika.metadata.Property
-
- internalOpenChoise(String, String...) - Static method in class org.apache.tika.metadata.Property
-
- internalRational(String) - Static method in class org.apache.tika.metadata.Property
-
- internalReal(String) - Static method in class org.apache.tika.metadata.Property
-
- internalText(String) - Static method in class org.apache.tika.metadata.Property
-
- internalTextBag(String) - Static method in class org.apache.tika.metadata.Property
-
- internalURI(String) - Static method in class org.apache.tika.metadata.Property
-
- INTERPRETED_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- InterruptableParsingExample - Class in org.apache.tika.example
-
This example demonstrates how to interrupt document parsing if
some condition is met.
- InterruptableParsingExample() - Constructor for class org.apache.tika.example.InterruptableParsingExample
-
- Interrupter - Class in org.apache.tika.batch
-
Class that waits for input on System.in.
- Interrupter(long) - Constructor for class org.apache.tika.batch.Interrupter
-
- InterrupterBuilder - Class in org.apache.tika.batch.builders
-
Builds an Interrupter
- InterrupterBuilder() - Constructor for class org.apache.tika.batch.builders.InterrupterBuilder
-
- InterrupterFutureResult - Class in org.apache.tika.batch
-
- InterrupterFutureResult() - Constructor for class org.apache.tika.batch.InterrupterFutureResult
-
- IO_IS - Static variable in class org.apache.tika.batch.FileResourceConsumer
-
- IO_OS - Static variable in class org.apache.tika.batch.FileResourceConsumer
-
- IOExceptionWithCause - Exception in org.apache.tika.io
-
Subclasses IOException with the Throwable constructors missing before Java 6.
- IOExceptionWithCause(String, Throwable) - Constructor for exception org.apache.tika.io.IOExceptionWithCause
-
Constructs a new instance with the given message and cause.
- IOExceptionWithCause(Throwable) - Constructor for exception org.apache.tika.io.IOExceptionWithCause
-
Constructs a new instance with the given cause.
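
An exception class of this shape can be sketched with Throwable.initCause, which is how a cause was attached before IOException gained cause-taking constructors. The class name CausedIOException is hypothetical, and using the cause's toString() as the message in the single-argument constructor is an assumption that mirrors the usual JDK convention.

```java
import java.io.IOException;

/** Illustrative sketch only; not the Tika IOExceptionWithCause source. */
final class CausedIOException extends IOException {

    private static final long serialVersionUID = 1L;

    /** Constructs a new instance with the given message and cause. */
    CausedIOException(String message, Throwable cause) {
        super(message);
        initCause(cause); // allowed once, because no cause was set yet
    }

    /** Constructs a new instance with the given cause. */
    CausedIOException(Throwable cause) {
        // Assumption: reuse the cause's description as the message
        super(cause == null ? null : cause.toString());
        initCause(cause);
    }
}
```

On Java 6 and later, new IOException(message, cause) does the same job directly, so a class like this is mainly of historical interest.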
- IOUtils - Class in org.apache.tika.io
-
General IO stream manipulation utilities.
- IOUtils() - Constructor for class org.apache.tika.io.IOUtils
-
Instances should NOT be constructed in standard programming.
- IParserFactoryBuilder - Interface in org.apache.tika.batch.builders
-
- IPTC - Interface in org.apache.tika.metadata
-
IPTC photo metadata schema.
- IPTC_LAST_EDITED - Static variable in interface org.apache.tika.metadata.IPTC
-
The date and optionally time when any of the IPTC photo metadata fields
was last edited
- IptcAnpaParser - Class in org.apache.tika.parser.iptc
-
Parser for IPTC ANPA News Wire Feeds
- IptcAnpaParser() - Constructor for class org.apache.tika.parser.iptc.IptcAnpaParser
-
- IS_ENCRYPTED - Static variable in interface org.apache.tika.metadata.PDF
-
- IS_OS_AIX - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_HP_UX - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_IRIX - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_LINUX - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_MAC - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_MAC_OSX - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_OS2 - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_SOLARIS - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_SUN_OS - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_UNIX - Static variable in class org.apache.tika.utils.SystemUtils
-
- IS_OS_WINDOWS - Static variable in class org.apache.tika.utils.SystemUtils
-
- isActive() - Method in class org.apache.tika.batch.FileResourceCrawler
-
If the crawler stops for any reason, it is no longer active.
- isAlphabetic(char[], int) - Static method in class org.apache.tika.eval.tokens.AlphaIdeographFilterFactory
-
- isAnchor() - Method in class org.apache.tika.sax.Link
-
- ISArchiveParser - Class in org.apache.tika.parser.isatab
-
- ISArchiveParser() - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
-
Default constructor.
- ISArchiveParser(String) - Constructor for class org.apache.tika.parser.isatab.ISArchiveParser
-
Constructor that accepts the pathname of ISArchive folder.
- ISATabUtils - Class in org.apache.tika.parser.isatab
-
- ISATabUtils() - Constructor for class org.apache.tika.parser.isatab.ISATabUtils
-
- isAudioHeader(int, int, int, int) - Static method in class org.apache.tika.parser.mp3.AudioFrame
-
Does this appear to be a 4 byte audio frame header?
- isAvailable() - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
- isAvailable() - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- isAvailable() - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- isAvailable() - Method in class org.apache.tika.language.translate.CachedTranslator
-
- isAvailable() - Method in class org.apache.tika.language.translate.DefaultTranslator
-
- isAvailable() - Method in class org.apache.tika.language.translate.EmptyTranslator
-
- isAvailable() - Method in class org.apache.tika.language.translate.GoogleTranslator
-
- isAvailable() - Method in class org.apache.tika.language.translate.JoshuaNetworkTranslator
-
- isAvailable() - Method in class org.apache.tika.language.translate.Lingo24Translator
-
- isAvailable() - Method in class org.apache.tika.language.translate.MicrosoftTranslator
-
Check whether this instance has a working property file and its keys are not the defaults.
- isAvailable() - Method in class org.apache.tika.language.translate.MosesTranslator
-
- isAvailable() - Method in interface org.apache.tika.language.translate.Translator
-
- isAvailable() - Method in class org.apache.tika.language.translate.YandexTranslator
-
- isAvailable() - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- isAvailable() - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
- isAvailable() - Method in interface org.apache.tika.parser.ner.NERecogniser
-
Checks if this named entity recogniser is available for service
- isAvailable() - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- isAvailable() - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- isAvailable() - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- isAvailable() - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
Is this service available?
- isAvailable() - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- isAvailable() - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- isBase64() - Method in class org.apache.tika.parser.utils.DataURIScheme
-
- isBold() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- isCatchIntermediateIOExceptions() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- isCauseOf(IOException) - Method in class org.apache.tika.io.TaggedInputStream
-
Tests if the given exception was caused by this stream.
- isCauseOf(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Tests if the given exception was caused by this handler.
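
The idea behind the two isCauseOf methods above, telling apart exceptions raised by one particular stream or handler from all others, can be sketched by tagging each exception with the object that threw it. The names TaggingInputStream and TaggedException are hypothetical, and only read() is wrapped here to keep the sketch short; this is not the Tika source.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

/** Illustrative sketch only; tags exceptions from the single-byte read(). */
final class TaggingInputStream extends FilterInputStream {

    /** IOException that remembers which wrapper rethrew it. */
    static final class TaggedException extends IOException {
        private static final long serialVersionUID = 1L;
        final transient Object tag;

        TaggedException(IOException cause, Object tag) {
            super(cause.getMessage(), cause);
            this.tag = tag;
        }
    }

    TaggingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        try {
            return super.read();
        } catch (IOException e) {
            throw new TaggedException(e, this); // tag with this stream
        }
    }

    /** Tests if the given exception was rethrown by this exact stream. */
    boolean isCauseOf(IOException e) {
        return e instanceof TaggedException
                && ((TaggedException) e).tag == this;
    }
}
```

A caller that hands the stream to third-party code can then catch IOException and use isCauseOf to decide whether the input itself failed or the failure came from somewhere else. A complete version would wrap the bulk read, skip, available and close methods in the same way.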
- isComplete() - Method in class org.apache.tika.parser.csv.CSVParams
-
- isCompleted() - Method in class org.apache.tika.example.ImportContextImpl
-
- isConverterAvailable(String) - Static method in class org.apache.tika.xmp.convert.TikaToXMP
-
Checks whether a converter is available that can convert the Tika metadata to XMP
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
- isDiscardElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Checks whether all content within the given HTML element should be
discarded instead of including it in the parse output.
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
-
- isDiscardElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- isDynamic() - Method in class org.apache.tika.config.ServiceLoader
-
Returns whether the service loader is static or dynamic
- isEmpty(String) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
- isEmpty() - Method in class org.apache.tika.parser.csv.CSVParams
-
- isEnableImageProcessing() - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- isExternal() - Method in class org.apache.tika.metadata.Property
-
- isHeading() - Method in class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- isIframe() - Method in class org.apache.tika.sax.Link
-
- isImage() - Method in class org.apache.tika.sax.Link
-
- isIncludeMarkup() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- isInstanceOf(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Checks whether the given media type equals the given base type or
is a specialization of it.
- isInstanceOf(String, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Parses and normalises the given media type string and checks whether
the result equals the given base type or is a specialization of it.
- isInternal() - Method in class org.apache.tika.metadata.Property
-
- isInvalid(int) - Method in class org.apache.tika.sax.SafeContentHandler
-
Checks whether the given Unicode character is an invalid XML character
and should be replaced for output.
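
As a rough illustration of such a check, the sketch below tests a code point against the Char production of XML 1.0 (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The class name XmlCharSketch and the choice of U+FFFD as the replacement character are assumptions; SafeContentHandler may apply slightly different rules.

```java
/** Illustrative sketch only; not the SafeContentHandler implementation. */
final class XmlCharSketch {

    private XmlCharSketch() {
    }

    /** Returns true if the code point is not a legal XML 1.0 character. */
    static boolean isInvalid(int ch) {
        if (ch == 0x9 || ch == 0xA || ch == 0xD) {
            return false; // tab, line feed, carriage return
        }
        if (ch >= 0x20 && ch <= 0xD7FF) {
            return false; // below the surrogate block
        }
        if (ch >= 0xE000 && ch <= 0xFFFD) {
            return false; // rest of the BMP, minus U+FFFE and U+FFFF
        }
        return !(ch >= 0x10000 && ch <= 0x10FFFF); // supplementary planes
    }

    /** Replaces every invalid code point with U+FFFD (assumed choice). */
    static String sanitize(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i); // unpaired surrogates come back as-is
            sb.appendCodePoint(isInvalid(cp) ? 0xFFFD : cp);
            i += Character.charCount(cp);
        }
        return sb.toString();
    }
}
```

Iterating by code point rather than by char means a valid surrogate pair is kept intact, while an unpaired surrogate falls into the invalid range and is replaced.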
- isItalics() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- isLanguage(String) - Method in class org.apache.tika.language.detect.LanguageResult
-
Return true if the target language matches the detected language.
- isLink() - Method in class org.apache.tika.sax.Link
-
- isListenForAllRecords() - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Returns true if this parser is configured to listen
for all records instead of just the specified few.
- isMacroLanguage(String) - Static method in class org.apache.tika.language.detect.LanguageNames
-
- isMatchingElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- isMatchingParentElement(String, String) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- isMetadataField(String) - Static method in class org.apache.tika.parser.image.MetadataFields
-
- isMetadataField(Property) - Static method in class org.apache.tika.parser.image.MetadataFields
-
- isMimetype() - Method in class org.apache.tika.parser.strings.FileConfig
-
Returns true if the mime option is enabled.
- isMixedLanguages() - Method in class org.apache.tika.language.detect.LanguageDetector
-
- isMostlyAscii() - Method in class org.apache.tika.detect.TextStatistics
-
Checks whether at least one byte was seen and that the bytes that
were seen were mostly plain text (i.e.
- isMSB() - Method in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- isMultiValued(Property) - Method in class org.apache.tika.metadata.Metadata
-
Returns true if the named value is multivalued.
- isMultiValued(String) - Method in class org.apache.tika.metadata.Metadata
-
Returns true if the named value is multivalued.
- isMultiValued(Property) - Method in class org.apache.tika.xmp.XMPMetadata
-
- isMultiValued(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Checks if the named property is an array.
- isMultiValuePermitted() - Method in class org.apache.tika.metadata.Property
-
Is the PropertyType one which accepts multiple values?
- ISO_SPEED_RATINGS - Static variable in interface org.apache.tika.metadata.TIFF
-
"ISO Speed and ISO Latitude of the input device as specified in ISO 12232"
- isOperating() - Method in class org.apache.tika.server.ServerStatus
-
- isPrettyPrint() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if formatted output is enabled, false otherwise.
- isQueueEmpty() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Use sparingly.
- isQuoteAssignmentValues() - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Gets whether or not to quote assignment values, i.e.
- isReasonablyCertain() - Method in class org.apache.tika.language.detect.LanguageResult
-
- isReasonablyCertain() - Method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Tries to judge whether the identification is certain enough
to be trusted.
- ISREGEX_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- isRequired() - Method in class org.apache.tika.config.ParamField
-
- isScript() - Method in class org.apache.tika.sax.Link
-
- isSerialize() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if CAS serialization is enabled, false otherwise.
- isShortText() - Method in class org.apache.tika.language.detect.LanguageDetector
-
- isSpecializationOf(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
-
Checks whether the given media type a is a specialization of a more
generic type b.
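
The relationship between isInstanceOf and isSpecializationOf can be sketched as a walk up a child-to-parent map. Everything here is hypothetical: the class TypeHierarchySketch, the use of plain strings instead of MediaType objects, and the sample registrations in the usage note; the real MediaTypeRegistry also handles aliases and suffix rules that this sketch ignores.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the MediaTypeRegistry implementation. */
final class TypeHierarchySketch {

    /** Maps each type to its direct supertype. */
    private final Map<String, String> parentOf = new HashMap<>();

    void addSuperType(String type, String superType) {
        parentOf.put(type, superType);
    }

    /** True if b is a strict ancestor of a in the registered hierarchy. */
    boolean isSpecializationOf(String a, String b) {
        String current = parentOf.get(a);
        int hops = 0;
        // The hop counter bounds the walk in case of an accidental cycle
        while (current != null && hops++ <= parentOf.size()) {
            if (current.equals(b)) {
                return true;
            }
            current = parentOf.get(current);
        }
        return false;
    }

    /** True if a equals b or is a specialization of it. */
    boolean isInstanceOf(String a, String b) {
        return a.equals(b) || isSpecializationOf(a, b);
    }
}
```

With the sample registrations application/xhtml+xml -> application/xml -> text/plain, isSpecializationOf("application/xhtml+xml", "text/plain") holds, while a type is an instance of itself but never a specialization of itself.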
- isStillActive() - Method in class org.apache.tika.batch.FileResourceConsumer
-
Returns whether the consumer could still process a file or is still
processing a file (ACTIVELY_CONSUMING or ASKED_TO_SHUTDOWN)
- isStrikeThrough() - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- isStyle - Variable in class org.apache.tika.parser.rtf.ListDescriptor
-
- isSupported(TikaInputStream) - Method in interface org.apache.tika.extractor.ContainerExtractor
-
Is this Container Extractor able to process the
supplied container?
- isSupported(TikaInputStream) - Method in class org.apache.tika.extractor.ParserContainerExtractor
-
- isSupported(String) - Static method in class org.apache.tika.utils.CharsetUtils
-
Safely returns whether the given charset is supported, without throwing exceptions
- isText() - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Returns true if content text analysis is enabled, false otherwise.
- isTikaInputStream(InputStream) - Static method in class org.apache.tika.io.TikaInputStream
-
Checks whether the given stream is a TikaInputStream instance.
- isTracking() - Method in class org.apache.tika.parser.mbox.MboxParser
-
- isUnknown() - Method in class org.apache.tika.language.detect.LanguageResult
-
- isUnordered(int) - Method in class org.apache.tika.parser.rtf.ListDescriptor
-
- isValid(String) - Static method in class org.apache.tika.mime.MimeType
-
Checks that the given string is a valid Internet media type name
based on rules from RFC 2054 section 5.3.
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.CSVMessageBodyWriter
-
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.JSONMessageBodyWriter
-
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.MetadataListMessageBodyWriter
-
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.TarWriter
-
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.TextMessageBodyWriter
-
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.XMPMessageBodyWriter
-
- isWriteable(Class<?>, Type, Annotation[], MediaType) - Method in class org.apache.tika.server.writer.ZipWriter
-
- isWriteLimitReached(Throwable) - Method in class org.apache.tika.sax.WriteOutContentHandler
-
Checks whether the given exception (or any of its root causes) was
thrown by this handler as a signal of reaching the write limit.
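
The root-cause check described above can be sketched as a walk along the getCause() chain looking for a marker exception that this particular handler threw. The names WriteLimitSketch and LimitReached are hypothetical, and the marker extends RuntimeException here only to keep the example free of checked exceptions; WriteOutContentHandler itself works with SAX exceptions.

```java
/** Illustrative sketch only; not the WriteOutContentHandler source. */
final class WriteLimitSketch {

    /** Marker exception that remembers which handler threw it. */
    static final class LimitReached extends RuntimeException {
        private static final long serialVersionUID = 1L;
        final transient Object source;

        LimitReached(Object source) {
            super("write limit reached");
            this.source = source;
        }
    }

    private final int limit;
    private int written;

    WriteLimitSketch(int limit) {
        this.limit = limit;
    }

    /** Counts characters and signals once the limit would be exceeded. */
    void write(String s) {
        if (written + s.length() > limit) {
            written = limit;
            throw new LimitReached(this);
        }
        written += s.length();
    }

    /** True if t, or any exception in its cause chain, came from this handler. */
    boolean isWriteLimitReached(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof LimitReached && ((LimitReached) c).source == this) {
                return true;
            }
        }
        return false;
    }
}
```

Walking the chain matters because intermediate layers often wrap the original exception before it reaches the caller; comparing the stored source by identity keeps one handler from claiming another handler's signal.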
- ITikaToXMPConverter - Interface in org.apache.tika.xmp.convert
-
Interface for the specific Metadata to XMP converters
- ITSF - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- ITSP - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- IWORK13_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
All iWork 13 files contain this, so we can detect based on it
- IWork13PackageParser - Class in org.apache.tika.parser.iwork.iwana
-
- IWork13PackageParser() - Constructor for class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
- IWork13PackageParser.IWork13DocumentType - Enum in org.apache.tika.parser.iwork.iwana
-
- IWORK_COMMON_ENTRY - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
-
All iWork files contain one of these, so we can detect based on it
- IWORK_CONTENT_ENTRIES - Static variable in class org.apache.tika.parser.iwork.IWorkPackageParser
-
Which files within an iWork file contain the actual content?
- IWorkPackageParser - Class in org.apache.tika.parser.iwork
-
A parser for the IWork container files.
- IWorkPackageParser() - Constructor for class org.apache.tika.parser.iwork.IWorkPackageParser
-
- IWorkPackageParser.IWORKDocumentType - Enum in org.apache.tika.parser.iwork
-
- LABEL - Static variable in interface org.apache.tika.metadata.XMP
-
A word or short phrase that identifies a resource as a member of a user-defined collection.
- label - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Label of this object.
- LABEL_LANG - Static variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- labelLang - Variable in class org.apache.tika.parser.recognition.RecognisedObject
-
Language of the label, for example: english
- Language - Class in org.apache.tika.example
-
- Language() - Constructor for class org.apache.tika.example.Language
-
- LANGUAGE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A language of the intellectual content of the resource.
- LANGUAGE - Static variable in class org.apache.tika.metadata.Metadata
-
- LANGUAGE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- LanguageConfidence - Enum in org.apache.tika.language.detect
-
- LanguageDetectingParser - Class in org.apache.tika.example
-
- LanguageDetectingParser() - Constructor for class org.apache.tika.example.LanguageDetectingParser
-
- languageDetection() - Static method in class org.apache.tika.example.Language
-
- languageDetectionWithHandler() - Static method in class org.apache.tika.example.Language
-
- languageDetectionWithWriter() - Static method in class org.apache.tika.example.Language
-
- LanguageDetector - Class in org.apache.tika.language.detect
-
- LanguageDetector() - Constructor for class org.apache.tika.language.detect.LanguageDetector
-
- LanguageDetectorExample - Class in org.apache.tika.example
-
- LanguageDetectorExample() - Constructor for class org.apache.tika.example.LanguageDetectorExample
-
- LanguageHandler - Class in org.apache.tika.language.detect
-
SAX content handler that updates a language detector based on all the
received character content.
- LanguageHandler() - Constructor for class org.apache.tika.language.detect.LanguageHandler
-
- LanguageHandler(LanguageWriter) - Constructor for class org.apache.tika.language.detect.LanguageHandler
-
- LanguageHandler(LanguageDetector) - Constructor for class org.apache.tika.language.detect.LanguageHandler
-
- LanguageIdentifier - Class in org.apache.tika.language
-
- LanguageIdentifier(LanguageProfile) - Constructor for class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Constructs a language identifier based on a LanguageProfile
- LanguageIdentifier(String) - Constructor for class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
Constructs a language identifier based on a String of text content
- LanguageIDWrapper - Class in org.apache.tika.eval.util
-
- LanguageIDWrapper() - Constructor for class org.apache.tika.eval.util.LanguageIDWrapper
-
- LanguageNames - Class in org.apache.tika.language.detect
-
Support for language tags (as defined by https://tools.ietf.org/html/bcp47).
See https://en.wikipedia.org/wiki/List_of_ISO_639-3_codes for a list of
three character language codes.
- LanguageNames() - Constructor for class org.apache.tika.language.detect.LanguageNames
-
- LanguageProfile - Class in org.apache.tika.language
-
Deprecated.
- LanguageProfile(int) - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfile() - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfile(String, int) - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfile(String) - Constructor for class org.apache.tika.language.LanguageProfile
-
Deprecated.
- LanguageProfilerBuilder - Class in org.apache.tika.language
-
Deprecated.
- LanguageProfilerBuilder(String, int, int) - Constructor for class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Constructs a new ngram profile
- LanguageProfilerBuilder(String) - Constructor for class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Constructs a new ngram profile where minlen=3, maxlen=3
- LanguageResource - Class in org.apache.tika.server.resource
-
- LanguageResource() - Constructor for class org.apache.tika.server.resource.LanguageResource
-
- LanguageResult - Class in org.apache.tika.language.detect
-
- LanguageResult(String, LanguageConfidence, float) - Constructor for class org.apache.tika.language.detect.LanguageResult
-
- LanguageWriter - Class in org.apache.tika.language.detect
-
Writer that builds a language profile based on all the written content.
- LanguageWriter(LanguageDetector) - Constructor for class org.apache.tika.language.detect.LanguageWriter
-
- LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.Office
-
Name of the last (most recent) author of a document
- LAST_MODIFIED - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- LAST_MODIFIED_BY - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The user who performed the last modification.
- LAST_PRINTED - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LAST_PRINTED - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The date and time of the last printing.
- LAST_SAVED - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- Latin1StringsParser - Class in org.apache.tika.parser.strings
-
Parser to extract printable Latin1 strings from arbitrary files in pure Java,
without running any external process.
- Latin1StringsParser() - Constructor for class org.apache.tika.parser.strings.Latin1StringsParser
-
- LATITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Latitude of the Point
- LATITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- LAYER_1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for audio layer 1.
- LAYER_2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for audio layer 2.
- LAYER_3 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for audio layer 3.
- lengthTreeLengtsTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- lengthTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- lessThan(TokenIntPair, TokenIntPair) - Method in class org.apache.tika.eval.tokens.TokenCountPriorityQueue
-
- LevelTuple(String) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.LevelTuple
-
- LevelTuple(int, int, String, String, boolean) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.LevelTuple
-
- LICENSE_LOCATION - Static variable in interface org.apache.tika.metadata.CreativeCommons
-
- LICENSE_URL - Static variable in interface org.apache.tika.metadata.CreativeCommons
-
- LICENSOR - Static variable in interface org.apache.tika.metadata.IPTC
-
A person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
The city of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_COUNTRY - Static variable in interface org.apache.tika.metadata.IPTC
-
The country of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_EMAIL - Static variable in interface org.apache.tika.metadata.IPTC
-
The email of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_EXTENDED_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The extended address of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
The ID of the person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_ID_WRONG_CASE - Static variable in interface org.apache.tika.metadata.IPTC
-
- LICENSOR_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of the person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_POSTAL_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The postal code of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The region of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_STREET_ADDRESS - Static variable in interface org.apache.tika.metadata.IPTC
-
The street address of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_TELEPHONE_1 - Static variable in interface org.apache.tika.metadata.IPTC
-
The phone number of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_TELEPHONE_2 - Static variable in interface org.apache.tika.metadata.IPTC
-
The phone number of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LICENSOR_URL - Static variable in interface org.apache.tika.metadata.IPTC
-
The URL of a person or company that should be contacted to obtain a licence for
using the item or who has licensed the item.
- LINE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- LINE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of lines in the document
- Lingo24LangDetector - Class in org.apache.tika.langdetect
-
- Lingo24LangDetector() - Constructor for class org.apache.tika.langdetect.Lingo24LangDetector
-
Default constructor which first checks for the presence of the
langdetect.lingo24.properties file to set the API Key.
- Lingo24Translator - Class in org.apache.tika.language.translate
-
- Lingo24Translator() - Constructor for class org.apache.tika.language.translate.Lingo24Translator
-
- Link - Class in org.apache.tika.sax
-
- Link(String, String, String, String) - Constructor for class org.apache.tika.sax.Link
-
- Link(String, String, String, String, String) - Constructor for class org.apache.tika.sax.Link
-
- LinkContentHandler - Class in org.apache.tika.sax
-
Content handler that collects links from an XHTML document.
- LinkContentHandler() - Constructor for class org.apache.tika.sax.LinkContentHandler
-
Default constructor
- LinkContentHandler(boolean) - Constructor for class org.apache.tika.sax.LinkContentHandler
-
Default constructor
- LinkedCell - Class in org.apache.tika.parser.microsoft
-
Linked cell.
- LinkedCell(Cell, String) - Constructor for class org.apache.tika.parser.microsoft.LinkedCell
-
- listAllTypes() - Static method in class org.apache.tika.example.MediaTypeExample
-
- ListDescriptor - Class in org.apache.tika.parser.rtf
-
Contains the information for a single list in the list or list override tables.
- ListDescriptor() - Constructor for class org.apache.tika.parser.rtf.ListDescriptor
-
- listLevelMap - Variable in class org.apache.tika.parser.microsoft.AbstractListManager
-
- ListManager - Class in org.apache.tika.parser.microsoft
-
Computes the number text which goes at the beginning of each list paragraph
- ListManager(HWPFDocument) - Constructor for class org.apache.tika.parser.microsoft.ListManager
-
Ordinary constructor for a new list reader
- listZipEntries(String) - Static method in class org.apache.tika.example.ZipListFiles
-
- LITTLE - Static variable in class org.apache.tika.parser.executable.MachineMetadata.Endian
-
- load(InputStream) - Static method in class org.apache.tika.config.Param
-
- load(Node) - Static method in class org.apache.tika.config.Param
-
- load(InputStream) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Loads an ngram profile from an InputStream (assumes UTF-8 encoded content)
- loadBuiltInModels() - Static method in class org.apache.tika.eval.util.LanguageIDWrapper
-
- loadClassIndex(InputStream) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
Loads the class to
- loadCommonTokens(Path, String) - Static method in class org.apache.tika.eval.AbstractProfiler
-
- loadDefaultModels(InputStream) - Method in class org.apache.tika.detect.NNExampleModelDetector
-
- loadDefaultModels(ClassLoader) - Method in class org.apache.tika.detect.NNExampleModelDetector
-
This method is overridden to register and load neural network models.
- loadDefaultModels(Path) - Method in class org.apache.tika.detect.TrainedModelDetector
-
- loadDefaultModels(File) - Method in class org.apache.tika.detect.TrainedModelDetector
-
- loadDefaultModels(InputStream) - Method in class org.apache.tika.detect.TrainedModelDetector
-
- loadDefaultModels(ClassLoader) - Method in class org.apache.tika.detect.TrainedModelDetector
-
- loadDynamicServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the available dynamic service providers of the given type.
- LoadErrorHandler - Interface in org.apache.tika.config
-
Interface for error handling strategies in service class loading.
- loadExtract(Path) - Method in class org.apache.tika.eval.io.ExtractReader
-
- loadLinkedRelationships(PackagePart, boolean, Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
-
This is used by the SAX docx and pptx decorators to load hyperlinks and
other linked objects
- loadModels(Path) - Static method in class org.apache.tika.eval.util.LanguageIDWrapper
-
- loadModels() - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- loadModels() - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
-
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
-
- loadModels() - Method in class org.apache.tika.langdetect.TextLangDetector
-
- loadModels(Set<String>) - Method in class org.apache.tika.langdetect.TextLangDetector
-
- loadModels() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Load (or re-load) all available language models.
- loadModels(Set<String>) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Load (or re-load) the models specified in the given set.
- loadServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns all the available service providers of the given type.
- loadStaticServiceProviders(Class<T>) - Method in class org.apache.tika.config.ServiceLoader
-
Returns the available static service providers of the given type.
- LOCAL_NAME_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
-
- Location - Class in org.apache.tika.parser.geo.topic.gazetteer
-
- Location() - Constructor for class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- LOCATION - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- LOCATION_CREATED - Static variable in interface org.apache.tika.metadata.IPTC
-
The location where the content of the item was created.
- LOCATION_CREATED_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city of a location.
- LOCATION_CREATED_COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The ISO code of a country of a location.
- LOCATION_CREATED_COUNTRY_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a country of a location.
- LOCATION_CREATED_PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a subregion of a country - a province or state - of a
location.
- LOCATION_CREATED_SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation.
- LOCATION_CREATED_WORLD_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a world region of a location.
- LOCATION_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- LOCATION_SHOWN - Static variable in interface org.apache.tika.metadata.IPTC
-
A location the content of the item is about.
- LOCATION_SHOWN_CITY - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the city of a location.
- LOCATION_SHOWN_COUNTRY_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
The ISO code of a country of a location.
- LOCATION_SHOWN_COUNTRY_NAME - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a country of a location.
- LOCATION_SHOWN_PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a subregion of a country - a province or state - of a
location.
- LOCATION_SHOWN_SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation.
- LOCATION_SHOWN_WORLD_REGION - Static variable in interface org.apache.tika.metadata.IPTC
-
The name of a world region of a location.
- LOG - Static variable in class org.apache.tika.batch.FileResourceConsumer
-
- LOG - Static variable in class org.apache.tika.batch.FileResourceCrawler
-
- LOG - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- LOG_COMMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"User's log comments."
- LOG_LEVELS - Static variable in class org.apache.tika.server.TikaServerCli
-
- logRequest(Logger, UriInfo, Metadata) - Static method in class org.apache.tika.server.resource.TikaResource
-
- LONGITUDE - Static variable in interface org.apache.tika.metadata.Geographic
-
The WGS84 Longitude of the Point
- LONGITUDE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- LookaheadInputStream - Class in org.apache.tika.io
-
Stream wrapper that makes it easy to read up to n bytes ahead from
a stream that supports the mark feature.
- LookaheadInputStream(InputStream, int) - Constructor for class org.apache.tika.io.LookaheadInputStream
-
Creates a lookahead wrapper for the given input stream.
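The read-ahead idea behind such a wrapper can be sketched with the JDK's own mark/reset support. This is a minimal illustration, not Tika's implementation: the class name LookaheadSketch and the helper peek are hypothetical, and the sketch assumes the underlying stream supports mark.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

public class LookaheadSketch {

    /**
     * Reads up to n bytes from the stream and then rewinds it, so the
     * caller can inspect the head of the stream without consuming it.
     * Hypothetical helper; requires a stream that supports mark/reset.
     */
    static byte[] peek(InputStream in, int n) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support mark");
        }
        try {
            in.mark(n);                       // remember the current position
            byte[] buf = new byte[n];
            int total = 0;
            while (total < n) {
                int read = in.read(buf, total, n - total);
                if (read < 0) {
                    break;                    // end of stream before n bytes
                }
                total += read;
            }
            in.reset();                       // rewind to the marked position
            return Arrays.copyOf(buf, total); // only the bytes actually read
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the stream is reset after the read, a later call sees the same bytes again from the start of the marked region.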
- looksLikeUTF8() - Method in class org.apache.tika.detect.TextStatistics
-
Checks whether the observed byte stream looks like UTF-8 encoded text.
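One simple way to approximate such a check with the JDK alone is strict decoding. Note that this is a different technique from a statistics-based heuristic and is not taken from TextStatistics: it accepts only byte sequences that are fully well-formed UTF-8. The class name Utf8CheckSketch and the method isWellFormedUtf8 are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class Utf8CheckSketch {

    /**
     * Returns true if the bytes decode as UTF-8 without any malformed or
     * truncated sequence. Hypothetical helper for illustration only.
     */
    static boolean isWellFormedUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)       // fail instead of replacing
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

A strict decoder rejects input that is cut off in the middle of a multi-byte sequence, so a caller working on a fixed-size prefix of a stream would need to tolerate a truncated tail.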
- LOOP - Static variable in interface org.apache.tika.metadata.XMPDM
-
"When true, the clip can be looped seamlessly."
- LOWEST_VERSION - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Lowest version.
- LuceneIndexer - Class in org.apache.tika.example
-
- LuceneIndexer(Tika, IndexWriter) - Constructor for class org.apache.tika.example.LuceneIndexer
-
- LuceneIndexerExtended - Class in org.apache.tika.example
-
- LuceneIndexerExtended(IndexWriter, Tika) - Constructor for class org.apache.tika.example.LuceneIndexerExtended
-
- LyricsHandler - Class in org.apache.tika.parser.mp3
-
This is used to parse Lyrics3 tag information
from an MP3 file, if available.
- LyricsHandler(InputStream, ContentHandler) - Constructor for class org.apache.tika.parser.mp3.LyricsHandler
-
- LyricsHandler(byte[]) - Constructor for class org.apache.tika.parser.mp3.LyricsHandler
-
Looks for the Lyrics data, which will be just before the ID3v1 data
(if present), and processes it.
- LZX_ALIGNED_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_ALIGNED_NUM_ELEMENTS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_ALIGNED_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_BLOCKTYPE_ALIGNED - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_BLOCKTYPE_INVALID - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_BLOCKTYPE_UNCOMPRESSED - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_BLOCKTYPE_VERBATIM - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_LENGTH_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_LENGTH_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_LENTABLE_SAFETY - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_MAIN_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_MAINTREE_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_MAINTREE_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_MAX_MATCH - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_MIN_MATCH - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_NUM_CHARS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_NUM_PRIMARY_LENGTHS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_NUM_SECONDARY_LENGTHS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_PRETREE_MAXSYMBOLS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_PRETREE_NUM_ELEMENTS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_PRETREE_NUM_ELEMENTS_BITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZX_PRETREE_TABLEBITS - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- LZXC - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- MACHINE_ALPHA - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_ARM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_EFI - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_IA_64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_M32R - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_M68K - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_M88K - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_MIPS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_PPC - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_S370 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_S390 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SH3 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SH4 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SH5 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_SPARC - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_TYPE - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_UNKNOWN - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_VAX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_x86_32 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MACHINE_x86_64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- MachineMetadata - Interface in org.apache.tika.parser.executable
-
Metadata for describing machines, such as their
architecture, type and endian-ness
- MachineMetadata.Endian - Class in org.apache.tika.parser.executable
-
- magic_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- MAGIC_PRIORITY_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MAGIC_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- magic_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- MagicDetector - Class in org.apache.tika.detect
-
Content type detection based on magic bytes.
- MagicDetector(MediaType, byte[]) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that have the exact given byte
pattern at the beginning of the document stream.
- MagicDetector(MediaType, byte[], int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that have the exact given byte
pattern at the given offset of the document stream.
- MagicDetector(MediaType, byte[], byte[], int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified magic
match.
- MagicDetector(MediaType, byte[], byte[], boolean, int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified
magic match.
- MagicDetector(MediaType, byte[], byte[], boolean, boolean, int, int) - Constructor for class org.apache.tika.detect.MagicDetector
-
Creates a detector for input documents that meet the specified
magic match.
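The matching described by the constructors above (a byte pattern, an optional mask, and an offset or offset range) can be sketched in plain Java. This is an illustration under assumptions, not MagicDetector's code: the class MagicMatchSketch, the method matches, and the reading of the two int arguments as an inclusive offset range are all hypothetical.

```java
public class MagicMatchSketch {

    /**
     * Returns true if the masked pattern occurs in data at some offset in
     * the inclusive range [offsetStart, offsetEnd]. A null mask means
     * "compare all bits". Hypothetical helper for illustration only.
     */
    static boolean matches(byte[] data, byte[] pattern, byte[] mask,
                           int offsetStart, int offsetEnd) {
        for (int offset = offsetStart; offset <= offsetEnd; offset++) {
            if (offset + pattern.length > data.length) {
                return false;                 // not enough bytes left to match
            }
            boolean match = true;
            for (int i = 0; i < pattern.length; i++) {
                int m = (mask == null) ? 0xFF : (mask[i] & 0xFF);
                // compare only the bits selected by the mask
                if ((data[offset + i] & m) != (pattern[i] & m)) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return true;
            }
        }
        return false;
    }
}
```

With offsetStart and offsetEnd both 0 this reduces to the "exact pattern at the beginning of the stream" case; a single non-zero value for both gives the "pattern at the given offset" case.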
- MAIL_MAX_SIZE - Static variable in class org.apache.tika.parser.mbox.MboxParser
-
- MailUtil - Class in org.apache.tika.parser.mail
-
- MailUtil() - Constructor for class org.apache.tika.parser.mail.MailUtil
-
- main(String[]) - Static method in class org.apache.tika.batch.BatchProcessDriverCLI
-
- main(String[]) - Static method in class org.apache.tika.batch.fs.FSBatchProcessCLI
-
- main(String[]) - Static method in class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
-
- main(String[]) - Static method in class org.apache.tika.cli.TikaCLI
-
- main(String[]) - Static method in class org.apache.tika.eval.reports.ResultsReporter
-
- main(String[]) - Static method in class org.apache.tika.eval.TikaEvalCLI
-
- main(String[]) - Static method in class org.apache.tika.eval.tools.BatchTopCommonTokenCounter
-
- main(String[]) - Static method in class org.apache.tika.eval.tools.TopCommonTokenCounter
-
- main(String[]) - Static method in class org.apache.tika.eval.XMLErrorLogUpdater
-
- main(String[]) - Static method in class org.apache.tika.example.CustomMimeInfo
-
- main(String[]) - Static method in class org.apache.tika.example.DescribeMetadata
-
- main(String[]) - Static method in class org.apache.tika.example.DirListParser
-
- main(String[]) - Static method in class org.apache.tika.example.DisplayMetInstance
-
- main(String[]) - Static method in class org.apache.tika.example.DumpTikaConfigExample
-
- main(String[]) - Static method in class org.apache.tika.example.GrabPhoneNumbersExample
-
- main(String[]) - Static method in class org.apache.tika.example.LuceneIndexerExtended
-
- main(String[]) - Static method in class org.apache.tika.example.MediaTypeExample
-
- main(String[]) - Static method in class org.apache.tika.example.MyFirstTika
-
- main(String[]) - Static method in class org.apache.tika.example.RollbackSoftware
-
- main(String[]) - Static method in class org.apache.tika.example.SimpleTextExtractor
-
- main(String[]) - Static method in class org.apache.tika.example.SimpleTypeDetector
-
- main(String[]) - Static method in class org.apache.tika.example.SpringExample
-
- main(String[]) - Static method in class org.apache.tika.example.ZipListFiles
-
- main(String[]) - Static method in class org.apache.tika.gui.TikaGUI
-
Main method.
- main(String[]) - Static method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
main method used for testing only
- main(String[]) - Static method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
- main(String[]) - Static method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.ooxml.xslf.XSLFEventBasedPowerPointExtractor
-
- main(String[]) - Static method in class org.apache.tika.parser.microsoft.ooxml.xwpf.XWPFEventBasedWordExtractor
-
- main(String[]) - Static method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- main(String[]) - Static method in class org.apache.tika.sax.StandardsExtractionExample
-
- main(String[]) - Static method in class org.apache.tika.server.TikaServerCli
-
- mainTreeLengtsTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- mainTreeTable - Variable in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- MAJOR_VERSION - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Major version.
- makeName(String, String, String) - Static method in class org.apache.tika.language.detect.LanguageNames
-
- MANAGER - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- MANAGER - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- map(long, long) - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- mapAttributes(Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
- MAPI_FROM_REPRESENTING_EMAIL - Static variable in interface org.apache.tika.metadata.Office
-
- MAPI_FROM_REPRESENTING_NAME - Static variable in interface org.apache.tika.metadata.Office
-
- MAPI_MESSAGE_CLASS - Static variable in interface org.apache.tika.metadata.Office
-
MAPI message class.
- MAPI_SENT_BY_SERVER_TYPE - Static variable in interface org.apache.tika.metadata.Office
-
- mapifyAttrs(Node, Map<String, String>) - Static method in class org.apache.tika.util.XMLDOMUtil
-
Grabs the attributes from a DOM node and overwrites their values with those
specified by the overwrite map.
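The behaviour described above, collecting a node's attributes and letting an overwrite map take precedence, can be sketched with the JDK's DOM API. This is not XMLDOMUtil's code: the class AttrMapSketch, its two methods, and the choice to return a new map are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;

public class AttrMapSketch {

    /**
     * Copies the node's attributes into a map, then applies the overrides
     * so that any key present in both ends up with the override value.
     */
    static Map<String, String> attributesWithOverrides(Node node, Map<String, String> overrides) {
        Map<String, String> result = new LinkedHashMap<>();
        NamedNodeMap attrs = node.getAttributes(); // null for non-element nodes
        if (attrs != null) {
            for (int i = 0; i < attrs.getLength(); i++) {
                Node attr = attrs.item(i);
                result.put(attr.getNodeName(), attr.getNodeValue());
            }
        }
        if (overrides != null) {
            result.putAll(overrides);              // overrides win on conflict
        }
        return result;
    }

    /** Parses an XML string and returns its root element (demo helper). */
    static Node parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }
}
```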
- MappedBufferCleaner - Class in org.apache.tika.io
-
Copied/pasted from the Apache Lucene/Solr project.
- MappedBufferCleaner() - Constructor for class org.apache.tika.io.MappedBufferCleaner
-
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
Normalizes an attribute name.
- mapSafeAttribute(String, String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Maps "safe" HTML attribute names to semantic XHTML equivalents.
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.HtmlParser
-
- mapSafeAttribute(String, String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.DefaultHtmlMapper
-
- mapSafeElement(String) - Method in interface org.apache.tika.parser.html.HtmlMapper
-
Maps "safe" HTML element names to semantic XHTML equivalents.
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
-
- mapSafeElement(String) - Method in class org.apache.tika.parser.html.IdentityHtmlMapper
-
- mark(int) - Method in class org.apache.tika.io.BoundedInputStream
-
- mark(int) - Method in class org.apache.tika.io.LookaheadInputStream
-
- mark(int) - Method in class org.apache.tika.io.NullInputStream
-
Mark the current position.
- mark(int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's mark(int) method.
- mark(int) - Method in class org.apache.tika.io.TailStream
-
This implementation saves the internal state including the
content of the tail buffer so that it can be restored when reset() is
called later.
- mark(int) - Method in class org.apache.tika.io.TikaInputStream
-
- MARKED - Static variable in interface org.apache.tika.metadata.XMPRights
-
When true, indicates that this is a rights-managed resource.
- markSupported() - Method in class org.apache.tika.io.LookaheadInputStream
-
- markSupported() - Method in class org.apache.tika.io.NullInputStream
-
Indicates whether mark is supported.
- markSupported() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's markSupported() method.
- markSupported() - Method in class org.apache.tika.io.TikaInputStream
-
- MATCH_MASK_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MATCH_OFFSET_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MATCH_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MATCH_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MATCH_VALUE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- Matcher - Class in org.apache.tika.sax.xpath
-
XPath element matcher.
- Matcher() - Constructor for class org.apache.tika.sax.xpath.Matcher
-
- matches(byte[]) - Method in class org.apache.tika.mime.MimeType
-
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.AttributeMatcher
-
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
-
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches the named
attribute of the element associated with this evaluation state.
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NamedAttributeMatcher
-
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NodeMatcher
-
- matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
-
- matchesElement() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
-
- matchesElement() - Method in class org.apache.tika.sax.xpath.ElementMatcher
-
- matchesElement() - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches
the element associated with this evaluation state.
- matchesElement() - Method in class org.apache.tika.sax.xpath.NodeMatcher
-
- matchesElement() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
-
- matchesMagic(byte[]) - Method in class org.apache.tika.mime.MimeType
-
- matchesText() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
-
- matchesText() - Method in class org.apache.tika.sax.xpath.Matcher
-
Returns true if the XPath expression matches all text
nodes whose parent is the element associated with this evaluation
state.
- matchesText() - Method in class org.apache.tika.sax.xpath.NodeMatcher
-
- matchesText() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
-
- matchesText() - Method in class org.apache.tika.sax.xpath.TextMatcher
-
- MatchingContentHandler - Class in org.apache.tika.sax.xpath
-
Content handler decorator that only passes the elements, attributes,
and text nodes that match the given XPath expression.
- MatchingContentHandler(ContentHandler, Matcher) - Constructor for class org.apache.tika.sax.xpath.MatchingContentHandler
-
- MATLAB_MIME_TYPE - Static variable in class org.apache.tika.parser.mat.MatParser
-
- MatParser - Class in org.apache.tika.parser.mat
-
- MatParser() - Constructor for class org.apache.tika.parser.mat.MatParser
-
- MAX_AVAIL_HEIGHT - Static variable in interface org.apache.tika.metadata.IPTC
-
The maximum available height in pixels of the original photo from which
this photo has been derived by downsizing.
- MAX_AVAIL_WIDTH - Static variable in interface org.apache.tika.metadata.IPTC
-
The maximum available width in pixels of the original photo from which
this photo has been derived by downsizing.
- MAX_QUEUE_SIZE_KEY - Static variable in class org.apache.tika.batch.builders.BatchProcessBuilder
-
- maxDoc() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- MAXIMUM_TEXT_CHUNK_SIZE - Variable in class org.apache.tika.example.ContentHandlerExample
-
- MBOX_MIME_TYPE - Static variable in class org.apache.tika.parser.mbox.MboxParser
-
- MBOX_RECORD_DIVIDER - Static variable in class org.apache.tika.parser.mbox.MboxParser
-
- MboxParser - Class in org.apache.tika.parser.mbox
-
Mbox (mailbox) parser.
- MboxParser() - Constructor for class org.apache.tika.parser.mbox.MboxParser
-
- MD_KEY_ESTIMATED_AGE - Static variable in class org.apache.tika.parser.recognition.AgeRecogniser
-
- MD_KEY_ESTIMATED_AGE_RANGE - Static variable in class org.apache.tika.parser.recognition.AgeRecogniser
-
- MD_KEY_IMG_CAP - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- MD_KEY_OBJ_REC - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- MD_KEY_PREFIX - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- MD_REC_IMPL_KEY - Static variable in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- MDB_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- MDB_PW - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- MEDIA_TYPES - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- MediaType - Class in org.apache.tika.mime
-
Internet media type.
- MediaType(String, String, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
-
- MediaType(String, String) - Constructor for class org.apache.tika.mime.MediaType
-
- MediaType(MediaType, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
-
- MediaType(MediaType, String, String) - Constructor for class org.apache.tika.mime.MediaType
-
Creates a media type by adding a parameter to a base type.
- MediaType(MediaType, Charset) - Constructor for class org.apache.tika.mime.MediaType
-
Creates a media type by adding the "charset" parameter to a base type.
- MediaTypeExample - Class in org.apache.tika.example
-
- MediaTypeExample() - Constructor for class org.apache.tika.example.MediaTypeExample
-
- MediaTypeRegistry - Class in org.apache.tika.mime
-
Registry of known Internet media types.
- MediaTypeRegistry() - Constructor for class org.apache.tika.mime.MediaTypeRegistry
-
- Message - Interface in org.apache.tika.metadata
-
A collection of Message related property names.
- MESSAGE_BCC - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_BCC_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_BCC_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the bcc field.
- MESSAGE_BCC_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "bcc-name" and
"bcc-display-name".
- MESSAGE_CC - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_CC_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_CC_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the cc field.
- MESSAGE_CC_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "cc-name" and
"cc-display-name".
- MESSAGE_FROM - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_FROM_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the from field.
- MESSAGE_FROM_NAME - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the value from the name field.
- MESSAGE_PREFIX - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_RAW_HEADER_PREFIX - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_RECIPIENT_ADDRESS - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_TO - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_TO_DISPLAY_NAME - Static variable in interface org.apache.tika.metadata.Message
-
- MESSAGE_TO_EMAIL - Static variable in interface org.apache.tika.metadata.Message
-
Where possible, this records the email value in the to field.
- MESSAGE_TO_NAME - Static variable in interface org.apache.tika.metadata.Message
-
In Outlook messages, there are sometimes separate fields for "to-name" and
"to-display-name".
- meta - Variable in class org.apache.tika.xmp.convert.AbstractConverter
-
- meta_neg(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- meta_trust(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- Metadata - Class in org.apache.tika.metadata
-
A multi-valued metadata container.
- Metadata() - Constructor for class org.apache.tika.metadata.Metadata
-
Constructs a new, empty metadata.
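The idea of a multi-valued container can be illustrated with a small standard-library sketch. This is not Tika's implementation: the class name MultiValuedSketch and the "dc:creator" key are hypothetical, and the sketch only shows how one name can map to several ordered values.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a multi-valued metadata container (not Tika code).
public class MultiValuedSketch {
    private final Map<String, List<String>> values = new LinkedHashMap<>();

    /** Appends a value under the name, keeping any earlier values. */
    public void add(String name, String value) {
        values.computeIfAbsent(name, k -> new ArrayList<>()).add(value);
    }

    /** Returns the first value stored under the name, or null if there is none. */
    public String get(String name) {
        List<String> list = values.get(name);
        return list == null ? null : list.get(0);
    }

    /** Returns a copy of every value stored under the name, in insertion order. */
    public List<String> getValues(String name) {
        List<String> list = values.get(name);
        return list == null ? Collections.<String>emptyList() : new ArrayList<>(list);
    }

    public static void main(String[] args) {
        MultiValuedSketch md = new MultiValuedSketch();
        md.add("dc:creator", "Alice");
        md.add("dc:creator", "Bob");
        System.out.println(md.get("dc:creator"));
        System.out.println(md.getValues("dc:creator"));
    }
}
```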
- metadata - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- metadata(Metadata) - Method in class org.apache.tika.sax.XMPContentHandler
-
- METADATA_COMMAND_ARGUMENTS_SERIALIZED_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
-
Token to be replaced with a String array of metadata assignment command
arguments
- METADATA_COMMAND_ARGUMENTS_TOKEN - Static variable in class org.apache.tika.embedder.ExternalEmbedder
-
Token to be replaced with a String array of metadata assignment command
arguments
- METADATA_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- METADATA_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time that any metadata for this resource was last changed.
- METADATA_KEY_ATTR - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- METADATA_MATCH_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- METADATA_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the metadata was last modified."
- METADATA_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- MetadataAwareLuceneIndexer - Class in org.apache.tika.example
-
Builds on the LuceneIndexer from Chapter 5 and adds indexing of Metadata.
- MetadataAwareLuceneIndexer(IndexWriter, Tika) - Constructor for class org.apache.tika.example.MetadataAwareLuceneIndexer
-
- MetadataExtractor - Class in org.apache.tika.parser.microsoft.ooxml
-
OOXML metadata extractor.
- MetadataExtractor(POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
-
- MetadataFields - Class in org.apache.tika.parser.image
-
Knows about all declared Metadata fields.
- MetadataFields() - Constructor for class org.apache.tika.parser.image.MetadataFields
-
- MetadataHandler - Class in org.apache.tika.parser.xml
-
- MetadataHandler(Metadata, String) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- MetadataHandler(Metadata, Property) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- metadataList - Variable in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
- MetadataList - Class in org.apache.tika.server
-
wrapper class to make isWriteable in MetadataListMBW simpler
- MetadataList(List<Metadata>) - Constructor for class org.apache.tika.server.MetadataList
-
- MetadataListMessageBodyWriter - Class in org.apache.tika.server.writer
-
- MetadataListMessageBodyWriter() - Constructor for class org.apache.tika.server.writer.MetadataListMessageBodyWriter
-
- MetadataResource - Class in org.apache.tika.server.resource
-
- MetadataResource() - Constructor for class org.apache.tika.server.resource.MetadataResource
-
- metadataToCsv(Metadata, OutputStream) - Static method in class org.apache.tika.server.resource.UnpackerResource
-
- methodName - Variable in class org.apache.tika.server.resource.TikaWelcome.Endpoint
-
- microsoftTranslateToFrench(String) - Method in class org.apache.tika.example.TranslatorExample
-
- MicrosoftTranslator - Class in org.apache.tika.language.translate
-
Wrapper class to access the Windows translation service.
- MicrosoftTranslator() - Constructor for class org.apache.tika.language.translate.MicrosoftTranslator
-
Create a new MicrosoftTranslator with the client keys specified in
resources/org/apache/tika/language/translate/translator.microsoft.properties.
- MIDDAY - Static variable in class org.apache.tika.utils.DateUtils
-
Custom time zone used to interpret date values without a time
component in a way that most likely falls within the same day
regardless of in which time zone it is later interpreted.
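The effect described for MIDDAY can be sketched with java.time alone. The fixed UTC-12 offset used here is an assumption made for illustration, as are the class and method names; the sketch shows why a date-only value anchored twelve hours behind UTC usually keeps its calendar day when viewed from other zones.

```java
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

// Illustrative sketch only; names and the UTC-12 offset are assumptions.
public class MiddayDemo {
    // Assumed stand-in for the MIDDAY zone: a fixed offset of UTC-12.
    public static final ZoneOffset MIDDAY = ZoneOffset.ofHours(-12);

    /** Interprets a date-only value as local midnight at the given offset. */
    public static ZonedDateTime interpret(LocalDate date, ZoneOffset zone) {
        return date.atStartOfDay(zone);
    }

    /** Returns the calendar day the instant falls on when viewed at the given offset. */
    public static LocalDate dayAt(ZonedDateTime instant, ZoneOffset viewer) {
        return instant.withZoneSameInstant(viewer).toLocalDate();
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2012, 2, 17);
        ZonedDateTime midday = interpret(date, MIDDAY);
        ZonedDateTime utcMidnight = interpret(date, ZoneOffset.UTC);

        // Midnight at UTC-12 is noon in UTC, so most viewers see the same day.
        System.out.println(dayAt(midday, ZoneOffset.ofHours(-8)));
        System.out.println(dayAt(midday, ZoneOffset.ofHours(9)));

        // Midnight in UTC slips to the previous day for viewers west of UTC.
        System.out.println(dayAt(utcMidnight, ZoneOffset.ofHours(-8)));
    }
}
```

Viewers at UTC+12 or further east would still see the following day, which matches the "most likely" wording of the description.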
- MidiParser - Class in org.apache.tika.parser.audio
-
- MidiParser() - Constructor for class org.apache.tika.parser.audio.MidiParser
-
- MIME_INFO_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MIME_TABLE - Static variable in class org.apache.tika.eval.AbstractProfiler
-
- MIME_TYPE_MAGIC - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
-
- MIME_TYPE_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MIME_TYPE_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- MimeBuffer - Class in org.apache.tika.eval.db
-
- MimeBuffer(Connection, TikaConfig) - Constructor for class org.apache.tika.eval.db.MimeBuffer
-
- MimeType - Class in org.apache.tika.mime
-
Internet media type.
- MIMETYPE_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- MimeTypeException - Exception in org.apache.tika.mime
-
A class to encapsulate MimeType related exceptions.
- MimeTypeException(String) - Constructor for exception org.apache.tika.mime.MimeTypeException
-
Constructs a MimeTypeException with the specified detail message.
- MimeTypeException(String, Throwable) - Constructor for exception org.apache.tika.mime.MimeTypeException
-
Constructs a MimeTypeException with the specified detail message
and root cause.
- MimeTypes - Class in org.apache.tika.mime
-
This class is a MimeType repository.
- MimeTypes() - Constructor for class org.apache.tika.mime.MimeTypes
-
- MIMETYPES_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- MimeTypesFactory - Class in org.apache.tika.mime
-
Creates instances of MimeTypes.
- MimeTypesFactory() - Constructor for class org.apache.tika.mime.MimeTypesFactory
-
- MimeTypesReader - Class in org.apache.tika.mime
-
A reader for XML files compliant with the freedesktop MIME-info DTD.
- MimeTypesReader(MimeTypes) - Constructor for class org.apache.tika.mime.MimeTypesReader
-
- MimeTypesReaderMetKeys - Interface in org.apache.tika.mime
-
- minConfidence - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- MINOR_MODEL_AGE_DISCLOSURE - Static variable in interface org.apache.tika.metadata.IPTC
-
Age of the youngest model pictured in the image, at the time that the
image was made.
- MINOR_VERSION - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Minor version.
- MISCELLANEOUS - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- MITIENERecogniser - Class in org.apache.tika.parser.ner.mitie
-
This class offers an implementation of NERecogniser based on trained
models using state-of-the-art information extraction tools.
- MITIENERecogniser() - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
- MITIENERecogniser(String) - Constructor for class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
Creates a NERecogniser by loading a model from the given path.
- mixedLanguages - Variable in class org.apache.tika.language.detect.LanguageDetector
-
- MODEL_AGE - Static variable in interface org.apache.tika.metadata.IPTC
-
Age of the human model(s) at the time this image was taken in a model
released image.
- MODEL_NAME_ENGLISH - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
- MODEL_PROP_NAME - Static variable in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
- MODEL_RELEASE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier associated with each Model Release.
- MODEL_RELEASE_STATUS - Static variable in interface org.apache.tika.metadata.IPTC
-
Summarizes the availability and scope of model releases authorizing usage
of the likenesses of persons appearing in the photograph.
- MODELS_DIR - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- MODIFIED - Static variable in interface org.apache.tika.metadata.DublinCore
-
Date on which the resource was changed.
- MODIFIED - Static variable in class org.apache.tika.metadata.Metadata
-
- MODIFIED - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- modifiedService(ServiceReference, Object) - Method in class org.apache.tika.config.TikaActivator
-
- MODIFIER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- MODIFY_DATE - Static variable in interface org.apache.tika.metadata.XMP
-
The date and time the resource was last modified.
- MONEY - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- MONEY_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- MosesTranslator - Class in org.apache.tika.language.translate
-
Translator that uses the Moses decoder for translation.
- MosesTranslator() - Constructor for class org.apache.tika.language.translate.MosesTranslator
-
Default constructor that attempts to read the smt jar and script paths from the
translator.moses.properties file.
- MosesTranslator(String, String) - Constructor for class org.apache.tika.language.translate.MosesTranslator
-
Create a Moses Translator with the specified smt jar and script paths.
- MP3Frame - Interface in org.apache.tika.parser.mp3
-
A frame in an MP3 file, such as ID3v2 Tags or some
audio.
- Mp3Parser - Class in org.apache.tika.parser.mp3
-
The Mp3Parser is used to parse ID3 Version 1 Tag information
from an MP3 file, if available.
- Mp3Parser() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser
-
- Mp3Parser.ID3TagsAndAudio - Class in org.apache.tika.parser.mp3
-
- MP4Parser - Class in org.apache.tika.parser.mp4
-
Parser for the MP4 media container format, as well as the older
QuickTime format that MP4 is based on.
- MP4Parser() - Constructor for class org.apache.tika.parser.mp4.MP4Parser
-
- MPEG_V1 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 1.
- MPEG_V2 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 2.
- MPEG_V2_5 - Static variable in class org.apache.tika.parser.mp3.AudioFrame
-
Constant for the MPEG version 2.5.
- MPP - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Project
- MS_EQUATION - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Equation embedded in Office docs
- MS_GRAPH_CHART - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Graph/Charts embedded in PowerPoint and Excel
- MS_OUTLOOK_PST_MIMETYPE - Static variable in class org.apache.tika.parser.mbox.OutlookPSTParser
-
- MSG - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Outlook
- MSOffice - Interface in org.apache.tika.metadata
-
A collection of Microsoft Office and Open Document property names.
- MSOfficeBinaryConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the binary MS formats Word (.doc), Excel (.xls) and PowerPoint (.ppt).
- MSOfficeBinaryConverter() - Constructor for class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
-
- MSOfficeXMLConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the Office Open XML formats Word (.docx), Excel (.xlsx) and PowerPoint
(.pptx).
- MSOfficeXMLConverter() - Constructor for class org.apache.tika.xmp.convert.MSOfficeXMLConverter
-
- MSOwnerFileParser - Class in org.apache.tika.parser.microsoft
-
Parser for temporary MSOffice files.
- MSOwnerFileParser() - Constructor for class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
- MULTIPART_BOUNDARY - Static variable in interface org.apache.tika.metadata.Message
-
- MULTIPART_SUBTYPE - Static variable in interface org.apache.tika.metadata.Message
-
- MyFirstTika - Class in org.apache.tika.example
-
Demonstrates how to call the different components within Tika: its
Detector framework (aka MIME identification and repository), its
Parser interface, its LanguageIdentifier and other goodies.
- MyFirstTika() - Constructor for class org.apache.tika.example.MyFirstTika
-
- PackageParser - Class in org.apache.tika.parser.pkg
-
Parser for various packaging formats.
- PackageParser() - Constructor for class org.apache.tika.parser.pkg.PackageParser
-
- PAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- PAGE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of pages in the (paged) document
- PagedText - Interface in org.apache.tika.metadata
-
XMP Paged-text schema.
- PARAGRAPH_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- PARAGRAPH_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of individual Paragraphs in the document
- ParagraphLevelCounter(AbstractListManager.LevelTuple[]) - Constructor for class org.apache.tika.parser.microsoft.AbstractListManager.ParagraphLevelCounter
-
- ParagraphProperties - Class in org.apache.tika.parser.microsoft.ooxml
-
- ParagraphProperties() - Constructor for class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- ParallelFileProcessingResult - Class in org.apache.tika.batch
-
- ParallelFileProcessingResult(int, int, int, int, double, int, String) - Constructor for class org.apache.tika.batch.ParallelFileProcessingResult
-
- Param<T> - Class in org.apache.tika.config
-
This is a serializable model class for parameters from configuration file.
- Param() - Constructor for class org.apache.tika.config.Param
-
- Param(String, Class<T>, T) - Constructor for class org.apache.tika.config.Param
-
- Param(String, T) - Constructor for class org.apache.tika.config.Param
-
- ParamField - Class in org.apache.tika.config
-
This class stores metadata for Field annotations, which are used to map
them to Param at runtime.
- ParamField(AccessibleObject) - Constructor for class org.apache.tika.config.ParamField
-
Creates a ParamField object
- parse(String, Parser, InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Utility method to handle logging equivalently among all
implementing classes.
- parse(MediaType, String, String, String, String) - Static method in class org.apache.tika.detect.MagicDetector
-
- parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.example.DirListParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.DirListParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.EncryptedPrescriptionParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.example.LanguageDetectingParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.fork.ForkParser
-
This sends the objects to the server for parsing, and the server via
the proxies acts on the handler as if it were updating it directly.
- parse(String) - Static method in class org.apache.tika.mime.MediaType
-
Parses the given string to a media type.
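What parsing a string into a media type involves can be sketched with the standard library. This is a simplified illustration, not Tika's parser: the class MediaTypeSketch, its fields, and the choice to return null for malformed input are all assumptions of the sketch, and quoted parameter values are not handled.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical, simplified media type parser (not Tika code).
public class MediaTypeSketch {
    public final String type;
    public final String subtype;
    public final Map<String, String> parameters;

    private MediaTypeSketch(String type, String subtype, Map<String, String> parameters) {
        this.type = type;
        this.subtype = subtype;
        this.parameters = parameters;
    }

    /** Parses "type/subtype; name=value; ..." and returns null if the input is malformed. */
    public static MediaTypeSketch parse(String text) {
        if (text == null) {
            return null;
        }
        String[] parts = text.split(";");
        if (parts.length == 0) {
            return null;
        }
        // The first segment must be exactly "type/subtype" with both halves present.
        String[] names = parts[0].trim().split("/");
        if (names.length != 2 || names[0].isEmpty() || names[1].isEmpty()) {
            return null;
        }
        // Remaining segments are name=value parameters; segments without '=' are skipped.
        Map<String, String> params = new LinkedHashMap<>();
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq > 0) {
                String name = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
                String value = parts[i].substring(eq + 1).trim();
                if (!name.isEmpty()) {
                    params.put(name, value);
                }
            }
        }
        return new MediaTypeSketch(
                names[0].toLowerCase(Locale.ROOT), names[1].toLowerCase(Locale.ROOT), params);
    }

    public static void main(String[] args) {
        MediaTypeSketch mt = parse("text/html; charset=UTF-8");
        System.out.println(mt.type + "/" + mt.subtype + " " + mt.parameters);
    }
}
```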
- parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AbstractParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.apple.AppleSingleFileParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.AutoDetectParser
-
- parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AutoDetectParser
-
- parse(byte[], T) - Method in interface org.apache.tika.parser.chm.accessor.ChmAccessor
-
Parses chm accessor
- parse(byte[], ChmItsfHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
- parse(byte[], ChmItspHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
- parse(byte[], ChmLzxcControlData) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
- parse(byte[], ChmLzxcResetTable) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
- parse(byte[], ChmPmgiHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
- parse(byte[], ChmPmglHeader) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.chm.ChmParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.code.SourceCodeParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
-
Delegates the call to the matching component parser.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.crypto.Pkcs7Parser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.crypto.TSDParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CryptoParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.csv.TextAndCSVParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ctakes.CTAKESParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dbf.DBFParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
-
Looks up the delegate parser from the parsing context and
delegates the parse operation to it.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dif.DIFParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.DigestingParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.dwg.DWGParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.EmptyParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.envi.EnviHeaderParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ErrorParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.external.ExternalParser
-
Executes the configured external command and passes the given document
stream as a simple XHTML document to the given SAX content handler.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.feed.FeedParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.font.AdobeFontMetricParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.font.TrueTypeParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.geoinfo.GeographicInformationParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.grib.GribParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.hdf.HDFParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.BPGParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.ICNSParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.PSDParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.TiffParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.WebPParser
-
- parse(InputStream) - Method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
- parse(InputStream, OutputStream) - Method in class org.apache.tika.parser.image.xmp.XMPPacketScanner
-
Locates an XMP packet in a stream, parses it and returns the XMP metadata.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
-
- parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.iptc.IptcAnpaParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.isatab.ISArchiveParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.iwana.IWork13PackageParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.iwork.IWorkPackageParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.jdbc.SQLite3Parser
-
- parse(String, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.journal.GrobidRESTParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.journal.JournalParser
-
- parse(String, ParseContext) - Method in class org.apache.tika.parser.journal.TEIDOMParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mat.MatParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mbox.OutlookPSTParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.EMFParser
-
- parse(POIFSFileSystem, XHTMLContentHandler, Locale) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Extracts text from an Excel Workbook writing the extracted content
to the specified Appendable.
- parse(DirectoryNode, XHTMLContentHandler, Locale) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
- parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.HSLFExtractor
-
- parse(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.HSLFExtractor
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.JackcessParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.MSOwnerFileParser
-
Extracts owner from MS temp file
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
-
Extracts properties and text from an MS Document input stream
- parse(DirectoryNode, ParseContext, Metadata, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.OfficeParser
-
- parse(OldExcelExtractor, XHTMLContentHandler) - Static method in class org.apache.tika.parser.microsoft.OldExcelParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OldExcelParser
-
Extracts properties and text from an MS Document input stream
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- parse(XHTMLContentHandler, Metadata) - Method in class org.apache.tika.parser.microsoft.OutlookExtractor
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.TNEFParser
-
Extracts properties and text from an MS Document input stream
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.WMFParser
-
- parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
-
- parse(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp4.MP4Parser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ner.NamedEntityParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.netcdf.NetCDFParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.NetworkParser
-
- parse(Image, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.Parser
-
Parses a document stream into a sequence of XHTML SAX events.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
-
Delegates the method call to the decorated parser.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserPostProcessor
-
Forwards the call to the delegated parser and post-processes the
results as described above.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.RarParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pot.PooledTimeSeriesParser
-
Parses a document stream into a sequence of XHTML SAX events.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.prt.PRTParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.AgeRecogniser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
Acts like a regular parser except it ignores the ContentHandler
and it automatically sets/overwrites the embedded Parser in the
ParseContext object.
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.sas.SAS7BDATParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
Performs the parse
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.strings.StringsParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
-
- parse(String) - Static method in class org.apache.tika.parser.utils.CommonsDigester
-
- parse(String) - Method in class org.apache.tika.parser.utils.DataURISchemeUtil
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.video.FLVParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wordperfect.QuattroProParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
- parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
-
- parse(String) - Method in class org.apache.tika.sax.xpath.XPathParser
-
Parses the given simple XPath expression to an evaluation state
initialized at the document node.
- parse(Parser, Logger, String, InputStream, ContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.server.resource.TikaResource
-
Use this to call a parser and unify exception handling.
- parse(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parse(InputStream) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parse(Path, Metadata) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parse(Path) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parse(File, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parse(File) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parse(URL) - Method in class org.apache.tika.Tika
-
Parses the resource at the given URL and returns the extracted
text content.
- PARSE_ERR - Static variable in class org.apache.tika.batch.FileResourceConsumer
-
- PARSE_EX - Static variable in class org.apache.tika.batch.FileResourceConsumer
-
- PARSE_TIME_MILLIS - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
- PARSE_TIME_MILLIS - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- parseAssay(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
-
- parseBodyToHTML() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting just the body as HTML, without the
head part, as a string
- parseContext - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- ParseContext - Class in org.apache.tika.parser
-
Parse context.
- ParseContext() - Constructor for class org.apache.tika.parser.ParseContext
-
- parseDate(String) - Static method in class org.apache.tika.parser.mbox.MboxParser
-
- parseELF(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
Parses a Unix ELF file
- parseEmbedded(InputStream, ContentHandler, Metadata, boolean) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractor
-
Processes the supplied embedded resource, calling the delegating
parser with the appropriate details.
- parseEmbedded(InputStream, ContentHandler, Metadata, boolean) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- parseEmbedded(InputStream, ContentHandler, Metadata, boolean) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
-
- parseEmbeddedExample() - Method in class org.apache.tika.example.ParsingExample
-
This example shows how to extract content from the outer document and all
embedded documents.
- parseExample() - Method in class org.apache.tika.example.ParsingExample
-
Example of how to use Tika to parse a file when you do not know its file type
ahead of time.
- parseFileInputStream(String) - Static method in class org.apache.tika.example.TIAParsingExample
-
- parseHandlerType(String, BasicContentHandlerFactory.HANDLER_TYPE) - Static method in class org.apache.tika.sax.BasicContentHandlerFactory
-
Tries to parse string into handler type.
- parseHTML(String, Set<String>) - Static method in class org.apache.tika.eval.util.ContentTagParser
-
- parseInline(InputStream, XHTMLContentHandler, TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- parseInline(InputStream, XHTMLContentHandler, ParseContext, TesseractOCRConfig) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
Use this to parse content without starting a new document.
- parseInvestigation(InputStream, XHTMLContentHandler, Metadata, ParseContext, String) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
-
- parseInvestigation(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
-
- parseJpeg(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
- parseNoEmbeddedExample() - Method in class org.apache.tika.example.ParsingExample
-
- parseObject(String, ParsePosition) - Method in class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
-
- parseOnePartToHTML() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting just one part of the document's body
as an HTML string, excluding the rest
- parsePE(XHTMLContentHandler, Metadata, InputStream, byte[]) - Method in class org.apache.tika.parser.executable.ExecutableParser
-
Parses a DOS or Windows PE file
- Parser - Interface in org.apache.tika.parser
-
Tika parser interface.
- PARSER_TAG - Static variable in interface org.apache.tika.parser.external.ExternalParsersConfigReaderMetKeys
-
- parseRawExif(InputStream, int, boolean) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
- parseRawExif(byte[]) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
- parseRawXMP(byte[]) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
- ParserContainerExtractor - Class in org.apache.tika.extractor
-
- ParserContainerExtractor() - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
-
- ParserContainerExtractor(TikaConfig) - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
-
- ParserContainerExtractor(Parser, Detector) - Constructor for class org.apache.tika.extractor.ParserContainerExtractor
-
- ParserDecorator - Class in org.apache.tika.parser
-
Decorator base class for the Parser interface.
- ParserDecorator(Parser) - Constructor for class org.apache.tika.parser.ParserDecorator
-
Creates a decorator for the given parser.
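The delegation idea behind a decorator like this can be sketched with plain JDK types. The `TextParser` interface and both classes below are hypothetical stand-ins invented for illustration; they are not Tika's `Parser` or `ParserDecorator`, whose `parse` method takes an `InputStream`, `ContentHandler`, `Metadata`, and `ParseContext`.

```java
import java.util.Locale;

public class DecoratorSketch {
    /** Hypothetical, heavily simplified stand-in for a parser interface. */
    interface TextParser {
        String parse(String input);
    }

    /** Base decorator: forwards every call to the wrapped parser unchanged. */
    static class TextParserDecorator implements TextParser {
        private final TextParser delegate;

        TextParserDecorator(TextParser delegate) {
            this.delegate = delegate;
        }

        @Override
        public String parse(String input) {
            return delegate.parse(input);
        }
    }

    /** Subclass that post-processes the result of the decorated parser. */
    static class UpperCasingParser extends TextParserDecorator {
        UpperCasingParser(TextParser delegate) {
            super(delegate);
        }

        @Override
        public String parse(String input) {
            return super.parse(input).toUpperCase(Locale.ROOT);
        }
    }

    public static void main(String[] args) {
        TextParser trimming = String::trim; // innermost "real" parser
        TextParser decorated = new UpperCasingParser(trimming);
        System.out.println(decorated.parse("  hello tika  "));
    }
}
```

Because the base decorator only forwards, a subclass needs to override just the behaviour it wants to change, which is roughly the relationship the index describes between ParserDecorator and ParserPostProcessor.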
- ParserFactory - Class in org.apache.tika.batch
-
- ParserFactory() - Constructor for class org.apache.tika.batch.ParserFactory
-
- ParserFactory - Class in org.apache.tika.parser
-
- ParserFactory(Map<String, String>) - Constructor for class org.apache.tika.parser.ParserFactory
-
- ParserFactoryBuilder - Class in org.apache.tika.batch.builders
-
- ParserFactoryBuilder() - Constructor for class org.apache.tika.batch.builders.ParserFactoryBuilder
-
- ParserFactoryFactory - Class in org.apache.tika.fork
-
Lightweight, easily serializable class that contains enough information
to build a ParserFactory
- ParserFactoryFactory(String, Map<String, String>) - Constructor for class org.apache.tika.fork.ParserFactoryFactory
-
- ParserPostProcessor - Class in org.apache.tika.parser
-
Parser decorator that post-processes the results from a decorated parser.
- ParserPostProcessor(Parser) - Constructor for class org.apache.tika.parser.ParserPostProcessor
-
Creates a post-processing decorator for the given parser.
- ParserUtils - Class in org.apache.tika.utils
-
Helper util methods for Parsers themselves.
- ParserUtils() - Constructor for class org.apache.tika.utils.ParserUtils
-
- parseSAX(InputStream, DefaultHandler, ParseContext) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
This checks context for a user specified SAXParser.
- parseStudy(InputStream, XHTMLContentHandler, Metadata, ParseContext) - Static method in class org.apache.tika.parser.isatab.ISATabUtils
-
- parseSuffixes(String) - Static method in class org.apache.tika.eval.io.ExtractReader
-
- parseSummaries(POIFSFileSystem) - Method in class org.apache.tika.parser.microsoft.SummaryExtractor
-
- parseSummaries(DirectoryNode) - Method in class org.apache.tika.parser.microsoft.SummaryExtractor
-
- parseTiff(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
- parseTikaInputStream(String) - Static method in class org.apache.tika.example.TIAParsingExample
-
- parseToHTML() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting the contents as HTML, as a string.
- parseToPlainText() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting the plain text of the contents.
- parseToPlainTextChunks() - Method in class org.apache.tika.example.ContentHandlerExample
-
Example of extracting the plain text in chunks, with each chunk
of no more than a certain maximum size
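Chunked text collection of this kind can be sketched with JDK types only. The `ChunkCollector` class below is a hypothetical illustration, not the code of `ContentHandlerExample`; its `characters` method merely mirrors the shape of the SAX `characters(char[], int, int)` callback, and splitting text exactly at the size limit is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkSketch {
    /** Collects incoming characters into chunks of at most maxChunkSize chars. */
    static class ChunkCollector {
        private final int maxChunkSize;
        private final List<String> chunks = new ArrayList<>();
        private final StringBuilder current = new StringBuilder();

        ChunkCollector(int maxChunkSize) {
            if (maxChunkSize <= 0) {
                throw new IllegalArgumentException("maxChunkSize must be positive");
            }
            this.maxChunkSize = maxChunkSize;
        }

        /** Same shape as a SAX characters callback; may be called many times. */
        void characters(char[] ch, int start, int length) {
            int offset = start;
            int remaining = length;
            while (remaining > 0) {
                int room = maxChunkSize - current.length();
                int n = Math.min(room, remaining);
                current.append(ch, offset, n);
                offset += n;
                remaining -= n;
                if (current.length() == maxChunkSize) {
                    flush(); // chunk is full, start a new one
                }
            }
        }

        /** Returns all chunks, including a final partially filled one. */
        List<String> finish() {
            flush();
            return chunks;
        }

        private void flush() {
            if (current.length() > 0) {
                chunks.add(current.toString());
                current.setLength(0);
            }
        }
    }

    public static void main(String[] args) {
        ChunkCollector collector = new ChunkCollector(3);
        collector.characters("abcd".toCharArray(), 0, 4);
        collector.characters("efgh".toCharArray(), 0, 4);
        System.out.println(collector.finish());
    }
}
```

A real handler would probably prefer to break chunks at whitespace or element boundaries rather than mid-word; the fixed-size split here keeps the sketch short.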
- parseToReaderExample() - Static method in class org.apache.tika.example.TIAParsingExample
-
- parseToString(InputStream, Metadata) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(InputStream, Metadata, int) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(InputStream) - Method in class org.apache.tika.Tika
-
Parses the given document and returns the extracted text content.
- parseToString(Path) - Method in class org.apache.tika.Tika
-
Parses the file at the given path and returns the extracted text content.
- parseToString(File) - Method in class org.apache.tika.Tika
-
Parses the given file and returns the extracted text content.
- parseToString(URL) - Method in class org.apache.tika.Tika
-
Parses the resource at the given URL and returns the extracted
text content.
- parseToStringExample() - Method in class org.apache.tika.example.ParsingExample
-
Example of how to use Tika's parseToString method to parse the content of a file,
and return any text found.
- parseToStringExample() - Static method in class org.apache.tika.example.TIAParsingExample
-
- parseURLStream(String) - Static method in class org.apache.tika.example.TIAParsingExample
-
- parseUsingAutoDetect(String, TikaConfig, Metadata) - Static method in class org.apache.tika.example.MyFirstTika
-
- parseUsingComponents(String, TikaConfig, Metadata) - Static method in class org.apache.tika.example.MyFirstTika
-
- parseWebP(File) - Method in class org.apache.tika.parser.image.ImageMetadataExtractor
-
- parseWord6(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
-
- parseWord6(DirectoryNode, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.WordExtractor
-
- parseXML(String, Set<String>) - Static method in class org.apache.tika.eval.util.ContentTagParser
-
- ParsingEmbeddedDocumentExtractor - Class in org.apache.tika.extractor
-
Helper class for parsers of package archives or other compound document
formats that support embedded or attached component documents.
- ParsingEmbeddedDocumentExtractor(ParseContext) - Constructor for class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
-
- ParsingExample - Class in org.apache.tika.example
-
- ParsingExample() - Constructor for class org.apache.tika.example.ParsingExample
-
- ParsingReader - Class in org.apache.tika.parser
-
Reader for the text content from a given binary stream.
- ParsingReader(InputStream) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream.
- ParsingReader(InputStream, String) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream
with the given name.
- ParsingReader(Path) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the file at the given path.
- ParsingReader(File) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given file.
- ParsingReader(Parser, InputStream, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream
with the given document metadata.
- ParsingReader(Parser, InputStream, Metadata, ParseContext, Executor) - Constructor for class org.apache.tika.parser.ParsingReader
-
Creates a reader for the text content of the given binary stream
with the given document metadata.
- PASSWORD - Static variable in class org.apache.tika.parser.pdf.PDFParser
-
- PASSWORD - Static variable in class org.apache.tika.server.resource.TikaResource
-
- PASSWORD_BASE64_UTF8 - Static variable in class org.apache.tika.server.resource.TikaResource
-
- PasswordProvider - Interface in org.apache.tika.parser
-
Interface for providing a password to a Parser for handling Encrypted
and Password Protected Documents.
- path - Variable in class org.apache.tika.server.resource.TikaWelcome.Endpoint
-
- PATTERN_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- patterns - Variable in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- PDF - Interface in org.apache.tika.metadata
-
PDF properties collection.
- PDF_DOC_INFO_CUSTOM_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
- PDF_DOC_INFO_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
Prefix to be used for properties that record what was stored
in the docinfo section (as opposed to XMP)
- PDF_EXTENSION_VERSION - Static variable in interface org.apache.tika.metadata.PDF
-
- PDF_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
- PDF_VERSION - Static variable in interface org.apache.tika.metadata.PDF
-
- PDFA_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
- PDFA_VERSION - Static variable in interface org.apache.tika.metadata.PDF
-
- PDFAID_CONFORMANCE - Static variable in interface org.apache.tika.metadata.PDF
-
- PDFAID_PART - Static variable in interface org.apache.tika.metadata.PDF
-
- PDFAID_PREFIX - Static variable in interface org.apache.tika.metadata.PDF
-
- PDFParser - Class in org.apache.tika.parser.pdf
-
PDF parser.
- PDFParser() - Constructor for class org.apache.tika.parser.pdf.PDFParser
-
- PDFParserConfig - Class in org.apache.tika.parser.pdf
-
Config for PDFParser.
- PDFParserConfig() - Constructor for class org.apache.tika.parser.pdf.PDFParserConfig
-
- PDFParserConfig(InputStream) - Constructor for class org.apache.tika.parser.pdf.PDFParserConfig
-
Loads properties from InputStream and then tries to close InputStream.
- PDFParserConfig.OCR_STRATEGY - Enum in org.apache.tika.parser.pdf
-
- peek(byte[]) - Method in class org.apache.tika.io.TikaInputStream
-
Fills the given buffer with upcoming bytes from this stream without
advancing the current stream position.
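Peeking without advancing the stream position can be illustrated with the JDK's own mark/reset support. The `peek` helper below is a hypothetical sketch of the general technique, not the implementation used by `TikaInputStream`, and it assumes the supplied stream supports `mark`.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

public class PeekSketch {
    /**
     * Fills the buffer with upcoming bytes, then rewinds the stream so the
     * same bytes are returned again by the next read. Returns the number of
     * bytes actually peeked, which is less than buffer.length near the end
     * of the stream.
     */
    static int peek(InputStream in, byte[] buffer) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        in.mark(buffer.length);
        int total = 0;
        try {
            while (total < buffer.length) {
                int n = in.read(buffer, total, buffer.length - total);
                if (n == -1) {
                    break; // end of stream reached before the buffer was full
                }
                total += n;
            }
        } finally {
            in.reset(); // rewind to the marked position
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "hello world".getBytes(StandardCharsets.US_ASCII);
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(data));
        byte[] head = new byte[5];
        int n = peek(in, head);
        System.out.println(new String(head, 0, n, StandardCharsets.US_ASCII));
        System.out.println((char) in.read()); // position is unchanged by peek
    }
}
```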
- peekBits(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- PERCENT - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- PERCENT_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- PERSON - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a person the content of the item is about.
- PERSON - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- PERSON_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- Pharmacy - Class in org.apache.tika.example
-
- Pharmacy() - Constructor for class org.apache.tika.example.Pharmacy
-
- PhoneExtractingContentHandler - Class in org.apache.tika.sax
-
Class used to extract phone numbers while parsing.
- PhoneExtractingContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.PhoneExtractingContentHandler
-
Creates a decorator for the given SAX event handler and Metadata object.
- PhoneExtractingContentHandler() - Constructor for class org.apache.tika.sax.PhoneExtractingContentHandler
-
Creates a decorator that by default forwards incoming SAX events to
a dummy content handler that simply ignores all the events.
- Photoshop - Interface in org.apache.tika.metadata
-
XMP Photoshop metadata schema.
- Pkcs7Parser - Class in org.apache.tika.parser.crypto
-
Basic parser for PKCS7 data.
- Pkcs7Parser() - Constructor for class org.apache.tika.parser.crypto.Pkcs7Parser
-
- PLAIN_TEXT - Static variable in class org.apache.tika.mime.MimeTypes
-
Name of the text type, text/plain.
- PLATFORM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_AIX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_ARM - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_EMBEDDED - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_FREEBSD - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_HPUX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_IRIX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_LINUX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_NETBSD - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_SOLARIS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_SYSV - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_TRU64 - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- PLATFORM_WINDOWS - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- pleaseShutdown() - Method in class org.apache.tika.batch.FileResourceConsumer
-
This politely asks the consumer to shut down.
- PLUS_VERSION - Static variable in interface org.apache.tika.metadata.IPTC
-
The version number of the PLUS standards in place at the time of the
transaction.
- PMGL - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- POIFSContainerDetector - Class in org.apache.tika.parser.microsoft
-
A detector that works on a POIFS OLE2 document
to figure out exactly what the file is.
- POIFSContainerDetector() - Constructor for class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- POIXMLTextExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
- POIXMLTextExtractorDecorator(ParseContext, POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
-
- PooledTimeSeriesParser - Class in org.apache.tika.parser.pot
-
Uses the Pooled Time Series algorithm + command line tool, to
generate a numeric representation of the video suitable for
similarity searches.
- PooledTimeSeriesParser() - Constructor for class org.apache.tika.parser.pot.PooledTimeSeriesParser
-
- populateRefTables() - Method in class org.apache.tika.eval.batch.EvalConsumerBuilder
-
- position() - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- position(long) - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- POSITION_BASE - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- PPT - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft PowerPoint
- predict(double[]) - Method in class org.apache.tika.detect.NNTrainedModel
-
- predict(float[]) - Method in class org.apache.tika.detect.NNTrainedModel
-
Given an unseen input vector of size m = (256 + 1) by n = 1, this returns a
prediction probability
- predict(double[]) - Method in class org.apache.tika.detect.TrainedModel
-
- predict(float[]) - Method in class org.apache.tika.detect.TrainedModel
-
- PREFIX - Static variable in interface org.apache.tika.metadata.AccessPermissions
-
- PREFIX - Static variable in interface org.apache.tika.metadata.Database
-
- PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
- PREFIX - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- PREFIX - Static variable in interface org.apache.tika.metadata.XMP
-
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPIdq
-
- PREFIX - Static variable in interface org.apache.tika.metadata.XMPMM
-
- PREFIX - Static variable in interface org.apache.tika.parser.executable.MachineMetadata
-
- prefix - Variable in class org.apache.tika.xmp.convert.Namespace
-
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMP
-
The xmp prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPIdq
-
The xmpidq prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPMM
-
The xmpMM prefix followed by the colon delimiter
- PREFIX_ - Static variable in interface org.apache.tika.metadata.XMPRights
-
The xmpRights prefix followed by the colon delimiter
- PREFIX_DC - Static variable in interface org.apache.tika.metadata.DublinCore
-
- PREFIX_DC_TERMS - Static variable in interface org.apache.tika.metadata.DublinCore
-
- PREFIX_DOC_META - Static variable in interface org.apache.tika.metadata.Office
-
- PREFIX_HTML_META - Static variable in interface org.apache.tika.metadata.HTML
-
- PREFIX_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
-
- PREFIX_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
-
- PREFIX_PHOTOSHOP - Static variable in interface org.apache.tika.metadata.Photoshop
-
- PREFIX_PLUS - Static variable in interface org.apache.tika.metadata.IPTC
-
- PREFIX_RTF_META - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
- PREFIX_XMP_RIGHTS - Static variable in interface org.apache.tika.metadata.XMPRights
-
- preProcessImage(INDArray) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
Preprocesses the image so that it can be fed to the Inception network
- PrescriptionParser - Class in org.apache.tika.example
-
- PrescriptionParser() - Constructor for class org.apache.tika.example.PrescriptionParser
-
- PRESENTATION_FORMAT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- PRESENTATION_FORMAT - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- PRESENTATION_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- PrettyMetadataKeyComparator - Class in org.apache.tika.metadata.serialization
-
- PrettyMetadataKeyComparator() - Constructor for class org.apache.tika.metadata.serialization.PrettyMetadataKeyComparator
-
- PRINT_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document last printed?
- PRINT_DATE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- priorExtensionFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- priority - Variable in class org.apache.tika.mime.MimeTypesReader
-
- priorMagicFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- priorMetaFileType(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- ProbabilisticMimeDetectionSelector - Class in org.apache.tika.mime
-
Selector for combining different mime detection results
based on probability
- ProbabilisticMimeDetectionSelector() - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
-
- ProbabilisticMimeDetectionSelector(ProbabilisticMimeDetectionSelector.Builder) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
-
- ProbabilisticMimeDetectionSelector(MimeTypes) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
-
- ProbabilisticMimeDetectionSelector(MimeTypes, ProbabilisticMimeDetectionSelector.Builder) - Constructor for class org.apache.tika.mime.ProbabilisticMimeDetectionSelector
-
- ProbabilisticMimeDetectionSelector.Builder - Class in org.apache.tika.mime
-
Builder class for setting the probability parameters
- probeContentType(Path) - Method in class org.apache.tika.filetypedetector.TikaFileTypeDetector
-
- process(String) - Method in class org.apache.tika.cli.TikaCLI
-
- process(Path) - Static method in class org.apache.tika.example.GrabPhoneNumbersExample
-
- process(DataInputStream, DataOutputStream) - Method in interface org.apache.tika.fork.ForkResource
-
- process(Path) - Static method in class org.apache.tika.sax.StandardsExtractionExample
-
- process(Metadata) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- process(Metadata) - Method in class org.apache.tika.xmp.convert.GenericConverter
-
- process(Metadata) - Method in interface org.apache.tika.xmp.convert.ITikaToXMPConverter
-
Converts a Tika Metadata-object into an XMPMeta containing the useful
properties.
- process(Metadata) - Method in class org.apache.tika.xmp.convert.MSOfficeBinaryConverter
-
- process(Metadata) - Method in class org.apache.tika.xmp.convert.MSOfficeXMLConverter
-
- process(Metadata) - Method in class org.apache.tika.xmp.convert.OpenDocumentConverter
-
- process(Metadata) - Method in class org.apache.tika.xmp.convert.RTFConverter
-
- process(Metadata) - Method in class org.apache.tika.xmp.XMPMetadata
-
- process(Metadata, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Converts the Metadata information to XMP.
- PROCESS_COMPLETED_SUCCESSFULLY - Static variable in class org.apache.tika.batch.BatchProcessDriverCLI
-
- PROCESS_NO_RESTART_EXIT_CODE - Static variable in class org.apache.tika.batch.BatchProcessDriverCLI
-
- PROCESS_RESTART_EXIT_CODE - Static variable in class org.apache.tika.batch.BatchProcessDriverCLI
-
This relies on special exit values: 254 (do not restart),
0 (ended correctly), and 253 (ended with exception; do restart)
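How a driver might act on these exit values can be sketched as follows. The numeric values come from the description above, and pairing each value with a constant name is inferred from the names listed in this index. The `Action` enum and `actionFor` method are hypothetical, and stopping on any other exit code is an assumption of this sketch, not documented behaviour of `BatchProcessDriverCLI`.

```java
public class ExitCodeSketch {
    // Values taken from the description above; pairing with names is inferred.
    static final int PROCESS_COMPLETED_SUCCESSFULLY = 0;
    static final int PROCESS_RESTART_EXIT_CODE = 253;
    static final int PROCESS_NO_RESTART_EXIT_CODE = 254;

    /** Hypothetical decision a driver could take after the child process exits. */
    enum Action { DONE, RESTART, STOP }

    static Action actionFor(int exitCode) {
        switch (exitCode) {
            case PROCESS_COMPLETED_SUCCESSFULLY:
                return Action.DONE;
            case PROCESS_RESTART_EXIT_CODE:
                return Action.RESTART; // ended with exception, try again
            case PROCESS_NO_RESTART_EXIT_CODE:
                return Action.STOP;
            default:
                // Assumption of this sketch: unknown codes are not retried.
                return Action.STOP;
        }
    }

    public static void main(String[] args) {
        for (int code : new int[] {0, 253, 254, 1}) {
            System.out.println(code + " -> " + actionFor(code));
        }
    }
}
```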
- processByte() - Method in class org.apache.tika.io.NullInputStream
-
Return a byte value for the read() method.
- processBytes(byte[], int, int) - Method in class org.apache.tika.io.NullInputStream
-
Process the bytes for the read(byte[], offset, length) method.
- processCommand(InputStream) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- processFileResource(FileResource) - Method in class org.apache.tika.batch.FileResourceConsumer
-
Main piece of code that needs to be implemented.
- processFileResource(FileResource) - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
-
- processFileResource(FileResource) - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
-
- processFileResource(FileResource) - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
-
- processFileResource(FileResource) - Method in class org.apache.tika.eval.ExtractComparer
-
- processFileResource(FileResource) - Method in class org.apache.tika.eval.ExtractProfiler
-
- processFolder(Path) - Static method in class org.apache.tika.example.GrabPhoneNumbersExample
-
- processFolder(Path) - Static method in class org.apache.tika.sax.StandardsExtractionExample
-
- processingInstruction(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- processingInstruction(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- processingInstruction(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
-
- processingInstruction(String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
-
- processShapes(List<XSSFShape>, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- processSheet(XSSFSheetXMLHandler.SheetContentsHandler, CommentsTable, StylesTable, ReadOnlySharedStringsTable, InputStream) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- ProcessUtils - Class in org.apache.tika.utils
-
- ProcessUtils() - Constructor for class org.apache.tika.utils.ProcessUtils
-
- produces - Variable in class org.apache.tika.server.resource.TikaWelcome.Endpoint
-
- produceText(InputStream, MultivaluedMap<String, String>, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- produceTextMain(InputStream, MultivaluedMap<String, String>, UriInfo) - Method in class org.apache.tika.server.resource.TikaResource
-
- PRODUCT_TYPE - Static variable in interface org.apache.tika.metadata.WordPerfect
-
Product type.
- PROFILE_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
-
- PROFILES_A - Static variable in class org.apache.tika.eval.ExtractComparer
-
- PROFILES_B - Static variable in class org.apache.tika.eval.ExtractComparer
-
- ProfilingHandler - Class in org.apache.tika.language
-
- ProfilingHandler(ProfilingWriter) - Constructor for class org.apache.tika.language.ProfilingHandler
-
Deprecated.
- ProfilingHandler(LanguageProfile) - Constructor for class org.apache.tika.language.ProfilingHandler
-
Deprecated.
- ProfilingHandler() - Constructor for class org.apache.tika.language.ProfilingHandler
-
Deprecated.
- ProfilingWriter - Class in org.apache.tika.language
-
- ProfilingWriter(LanguageProfile) - Constructor for class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- ProfilingWriter() - Constructor for class org.apache.tika.language.ProfilingWriter
-
Deprecated.
- PROGRAM_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- PROJECT_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- PROPERTIES_FILE - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
-
- Property - Class in org.apache.tika.metadata
-
XMP property definition.
- property(String, String) - Method in class org.apache.tika.sax.XMPContentHandler
-
- Property.PropertyType - Enum in org.apache.tika.metadata
-
- Property.ValueType - Enum in org.apache.tika.metadata
-
- PROPERTY_GROUP_IPTC_CORE - Static variable in interface org.apache.tika.metadata.IPTC
-
- PROPERTY_GROUP_IPTC_EXT - Static variable in interface org.apache.tika.metadata.IPTC
-
- PROPERTY_RELEASE_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
Optional identifier associated with each Property Release.
- PROPERTY_RELEASE_STATUS - Static variable in interface org.apache.tika.metadata.IPTC
-
Summarises the availability and scope of property releases authorizing
usage of the properties appearing in the photograph.
- PropertyTypeException - Exception in org.apache.tika.metadata
-
XMP property definition violation exception.
- PropertyTypeException(String) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
-
- PropertyTypeException(Property.PropertyType, Property.PropertyType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
-
- PropertyTypeException(Property.ValueType, Property.ValueType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
-
- PropertyTypeException(Property.PropertyType) - Constructor for exception org.apache.tika.metadata.PropertyTypeException
-
- PropsUtil - Class in org.apache.tika.util
-
Utility class to handle properties.
- PropsUtil() - Constructor for class org.apache.tika.util.PropsUtil
-
- PROTECTED - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
-
- PROVINCE_OR_STATE - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of the subregion of a country -- either called province or state or
anything else -- the content is focussing on -- either the subregion
shown in visual media or referenced by text or audio media.
- ProxyInputStream - Class in org.apache.tika.io
-
A Proxy stream which acts as expected, that is it passes the method
calls on to the proxied stream and doesn't change which methods are
being called.
- ProxyInputStream(InputStream) - Constructor for class org.apache.tika.io.ProxyInputStream
-
Constructs a new ProxyInputStream.
- PRT_MIME_TYPE - Static variable in class org.apache.tika.parser.prt.PRTParser
-
- PRTParser - Class in org.apache.tika.parser.prt
-
A basic text extracting parser for the CADKey PRT (CAD Drawing)
format.
- PRTParser() - Constructor for class org.apache.tika.parser.prt.PRTParser
-
- PSDParser - Class in org.apache.tika.parser.image
-
Parser for the Adobe Photoshop PSD File Format.
- PSDParser() - Constructor for class org.apache.tika.parser.image.PSDParser
-
- PUB - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Publisher
- PUBLISHER - Static variable in interface org.apache.tika.metadata.DublinCore
-
An entity responsible for making the resource available.
- PUBLISHER - Static variable in class org.apache.tika.metadata.Metadata
-
- PUBLISHER - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- PULL_DOWN - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The sampling phase of film to be converted to video (pull-down)."
- RarParser - Class in org.apache.tika.parser.pkg
-
Parser for Rar files.
- RarParser() - Constructor for class org.apache.tika.parser.pkg.RarParser
-
- RATING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- RATING - Static variable in interface org.apache.tika.metadata.XMP
-
A user-assigned rating for this file.
- RawTagIterator(int, int, int, int) - Constructor for class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
-
- RDF - Static variable in class org.apache.tika.sax.XMPContentHandler
-
The RDF namespace URI
- read(InputStream, XMLLogMsgHandler) - Method in class org.apache.tika.eval.io.XMLLogReader
-
- read() - Method in class org.apache.tika.io.BoundedInputStream
-
- read(byte[]) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's read(byte[]) method.
- read(byte[], int, int) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's read(byte[], int, int) method.
- read() - Method in class org.apache.tika.io.ClosedInputStream
-
Returns -1 to indicate that the stream is closed.
- read(byte[]) - Method in class org.apache.tika.io.CountingInputStream
-
Reads a number of bytes into the byte array, keeping count of the
number read.
- read(byte[], int, int) - Method in class org.apache.tika.io.CountingInputStream
-
Reads a number of bytes into the byte array at a specific offset,
keeping count of the number read.
- read() - Method in class org.apache.tika.io.CountingInputStream
-
Reads the next byte of data adding to the count of bytes received
if a byte is successfully read.
- read(InputStream, byte[], int, int) - Static method in class org.apache.tika.io.IOUtils
-
Reads bytes from an input stream.
- read() - Method in class org.apache.tika.io.LookaheadInputStream
-
- read(byte[], int, int) - Method in class org.apache.tika.io.LookaheadInputStream
-
- read() - Method in class org.apache.tika.io.NullInputStream
-
Read a byte.
- read(byte[]) - Method in class org.apache.tika.io.NullInputStream
-
Read some bytes into the specified array.
- read(byte[], int, int) - Method in class org.apache.tika.io.NullInputStream
-
Read the specified number of bytes into an array.
- read() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's read() method.
- read(byte[]) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's read(byte[]) method.
- read(byte[], int, int) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's read(byte[], int, int) method.
- read() - Method in class org.apache.tika.io.TailStream
-
This implementation adds the read byte to the internal tail
buffer.
- read(byte[]) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the underlying stream and
then adds the correct portion of the read buffer to the internal tail
buffer.
- read(byte[], int, int) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the underlying stream and
then adds the correct portion of the read buffer to the internal tail
buffer.
- read(InputStream) - Method in class org.apache.tika.mime.MimeTypesReader
-
- read(Document) - Method in class org.apache.tika.mime.MimeTypesReader
-
- read(InputStream) - Static method in class org.apache.tika.parser.external.ExternalParsersConfigReader
-
- read(Document) - Static method in class org.apache.tika.parser.external.ExternalParsersConfigReader
-
- read(Element) - Static method in class org.apache.tika.parser.external.ExternalParsersConfigReader
-
- read(ByteBuffer) - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- read(char[], int, int) - Method in class org.apache.tika.parser.ParsingReader
-
Reads parsed text from the pipe connected to the parsing thread.
- read() - Method in class org.apache.tika.utils.RereadableInputStream
-
Reads a byte from the stream, saving it in the store if it is being
read from the original stream.
- readAllInOnce(ByteBuffer) - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- readByteFrequencies(InputStream) - Method in class org.apache.tika.detect.TrainedModelDetector
-
Read the input stream and build a byte frequency histogram.
- readFully(InputStream, int) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- readFully(InputStream, int, boolean) - Static method in class org.apache.tika.parser.mp3.ID3v2Frame
-
- readIntBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE int value from an InputStream
- readIntLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE int value from an InputStream
- readLines(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a list of Strings, one entry per
line, using the default character encoding of the platform.
- readLines(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a list of Strings, one entry per
line, using the specified character encoding.
- readLines(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a list of Strings, one entry per line.
- readLongBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE long value from an InputStream
- readLongLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE long value from an InputStream
- readShortBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE short value from an InputStream
- readShortLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE short value from an InputStream
- readUE7(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Gets the integer value that is stored in a UTF-8-like fashion: big-endian,
with the high bit of each byte indicating whether the value continues.
- readUIntBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a BE unsigned int value from an InputStream
- readUIntLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
Get a LE unsigned int value from an InputStream
- readUShortBE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
- readUShortLE(InputStream) - Static method in class org.apache.tika.io.EndianUtils
-
- REALIZATION - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- reallyEndDocument() - Method in class org.apache.tika.sax.EndDocumentShieldingContentHandler
-
- RecentFiles - Class in org.apache.tika.example
-
Builds on top of the LuceneIndexer and the Metadata discussions in Chapter 6
to output an RSS (or RDF) feed of files crawled by the LuceneIndexer within
the last N minutes.
- RecentFiles() - Constructor for class org.apache.tika.example.RecentFiles
-
- recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.dl.imagerec.DL4JInceptionV3Net
-
- recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- recognise(String) - Method in class org.apache.tika.parser.ner.corenlp.CoreNLPNERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in class org.apache.tika.parser.ner.grobid.GrobidNERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in class org.apache.tika.parser.ner.mitie.MITIENERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in interface org.apache.tika.parser.ner.NERecogniser
-
call for name recognition action from text
- recognise(String) - Method in class org.apache.tika.parser.ner.nltk.NLTKNERecogniser
-
recognises names of entities in the text
- recognise(String) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- recognise(String) - Method in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- recognise(String) - Method in class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.recognition.ObjectRecogniser
-
Recognise the objects in the stream
- recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- recognise(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- RecognisedObject - Class in org.apache.tika.parser.recognition
-
A model for objects recognised in graphics and text. It typically includes a
human-readable label for the object, the language of the label, an id, and a confidence score.
- RecognisedObject(String, String, String, double) - Constructor for class org.apache.tika.parser.recognition.RecognisedObject
-
- recordEmbeddedStreamException(Throwable, Metadata) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- recordException(Throwable, Metadata) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- recordParserDetails(Parser, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
-
Records details of the Parser used to the Metadata, typically wanted
where multiple parsers could be picked between or used.
- recordParserFailure(Parser, Throwable, Metadata) - Static method in class org.apache.tika.utils.ParserUtils
-
Records details of a Parser's failure to the Metadata, so you can check
what went wrong even if the Exception wasn't immediately thrown (e.g. when
several different Parsers are used).
- RecursiveMetadataResource - Class in org.apache.tika.server.resource
-
- RecursiveMetadataResource() - Constructor for class org.apache.tika.server.resource.RecursiveMetadataResource
-
- RecursiveParserWrapper - Class in org.apache.tika.parser
-
This is a helper class that wraps a parser in a recursive handler.
- RecursiveParserWrapper(Parser) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
- RecursiveParserWrapper(Parser, boolean) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
- RecursiveParserWrapper(Parser, ContentHandlerFactory) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
- RecursiveParserWrapper(Parser, ContentHandlerFactory, boolean) - Constructor for class org.apache.tika.parser.RecursiveParserWrapper
-
- recursiveParserWrapperExample() - Method in class org.apache.tika.example.ParsingExample
-
For documents that may contain embedded documents, it might be helpful
to create a list of metadata objects, one for the container document and
one for each embedded document.
- RecursiveParserWrapperFSConsumer - Class in org.apache.tika.batch.fs
-
This runs a RecursiveParserWrapper against an input file
and outputs the json metadata to an output file.
- RecursiveParserWrapperFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory) - Constructor for class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
-
- RecursiveParserWrapperHandler - Class in org.apache.tika.sax
-
- RecursiveParserWrapperHandler(ContentHandlerFactory) - Constructor for class org.apache.tika.sax.RecursiveParserWrapperHandler
-
Create a handler with no limit on the number of embedded resources
- RecursiveParserWrapperHandler(ContentHandlerFactory, int) - Constructor for class org.apache.tika.sax.RecursiveParserWrapperHandler
-
Create a handler that limits the number of embedded resources that will be
parsed
- REF_EXTRACT_EXCEPTION_TYPES - Static variable in class org.apache.tika.eval.AbstractProfiler
-
- REF_PAIR_NAMES - Static variable in class org.apache.tika.eval.ExtractComparer
-
- REF_PARSE_ERROR_TYPES - Static variable in class org.apache.tika.eval.AbstractProfiler
-
- REF_PARSE_EXCEPTION_TYPES - Static variable in class org.apache.tika.eval.AbstractProfiler
-
- REFERENCES - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- RegexNERecogniser - Class in org.apache.tika.parser.ner.regex
-
This class offers an implementation of NERecogniser based on
Regular Expressions.
- RegexNERecogniser() - Constructor for class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- RegexNERecogniser(InputStream) - Constructor for class org.apache.tika.parser.ner.regex.RegexNERecogniser
-
- RegexUtils - Class in org.apache.tika.utils
-
Inspired by the Nutch class OutlinkExtractor.
- RegexUtils() - Constructor for class org.apache.tika.utils.RegexUtils
-
- registerModels(MediaType, TrainedModel) - Method in class org.apache.tika.detect.TrainedModelDetector
-
- registerNamespace(String, String) - Static method in class org.apache.tika.xmp.XMPMetadata
-
Register a namespace URI with a suggested prefix.
- registerNamespaces(Set<Namespace>) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
Registers a number of Namespace entries with XMPCore.
- REGISTRY_ENTRY_CREATED_ITEM_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
A unique identifier created by a registry and applied by the creator of
the item.
- REGISTRY_ENTRY_CREATED_ORGANISATION_ID - Static variable in interface org.apache.tika.metadata.IPTC
-
An identifier for the registry which issued the corresponding Registry Image Id.
- RELATION - Static variable in interface org.apache.tika.metadata.DublinCore
-
A reference to a related resource.
- RELATION - Static variable in class org.apache.tika.metadata.Metadata
-
- RELATION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- RELATIVE_PEAK_AUDIO_FILE_PATH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The relative path to the file's peak audio file.
- RELEASE_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date the title was released."
- remove(String) - Method in class org.apache.tika.metadata.Metadata
-
Remove a metadata and all its associated values.
- remove() - Method in class org.apache.tika.parser.mp3.ID3v2Frame.RawTagIterator
-
- remove(Property) - Method in class org.apache.tika.xmp.XMPMetadata
-
- remove(String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Removes the given property from the XMP data.
- removedService(ServiceReference, Object) - Method in class org.apache.tika.config.TikaActivator
-
- render(XHTMLContentHandler) - Method in interface org.apache.tika.parser.microsoft.Cell
-
Renders the content to the given XHTML SAX event stream.
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.CellDecorator
-
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.LinkedCell
-
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.NumberCell
-
- render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.TextCell
-
- RENDITION_CLASS - Static variable in interface org.apache.tika.metadata.XMPMM
-
The rendition class name for this resource.
- RENDITION_PARAMS - Static variable in interface org.apache.tika.metadata.XMPMM
-
Can be used to provide additional rendition parameters that
are too complex or verbose to encode in xmpMM:RenditionClass
- ReplacementCharset - Class in org.apache.tika.parser.html.charsetdetector.charsets
-
An implementation of the standard "replacement" charset defined by the W3C.
- ReplacementCharset() - Constructor for class org.apache.tika.parser.html.charsetdetector.charsets.ReplacementCharset
-
- report(String) - Method in class org.apache.tika.batch.StatusReporter
-
Override for different behavior.
- Report - Class in org.apache.tika.eval.reports
-
This class represents a single report.
- Report() - Constructor for class org.apache.tika.eval.reports.Report
-
- ReporterBuilder - Interface in org.apache.tika.batch.builders
-
Interface for reporter builders
- RereadableInputStream - Class in org.apache.tika.utils
-
Wraps an input stream, reading it only once, but making it available
for rereading an arbitrary number of times.
- RereadableInputStream(InputStream, int, boolean, boolean) - Constructor for class org.apache.tika.utils.RereadableInputStream
-
Creates a rereadable input stream.
- RESERVED_FILENAME_CHARACTERS - Static variable in class org.apache.tika.io.FilenameUtils
-
Reserved characters
- reset(XSSFWorkbook) - Method in class org.apache.tika.eval.reports.XLSXHREFFormatter
-
- reset() - Method in class org.apache.tika.io.BoundedInputStream
-
- reset() - Method in class org.apache.tika.io.LookaheadInputStream
-
- reset() - Method in class org.apache.tika.io.NullInputStream
-
Reset the stream to the point when mark was last called.
- reset() - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's reset() method.
- reset() - Method in class org.apache.tika.io.TailStream
-
This implementation restores this stream's state to the
state when mark() was last called.
- reset() - Method in class org.apache.tika.io.TikaInputStream
-
- reset() - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- reset() - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
-
- reset() - Method in class org.apache.tika.langdetect.TextLangDetector
-
- reset() - Method in class org.apache.tika.language.detect.LanguageDetector
-
Reset statistics about the current document being processed
- reset() - Method in class org.apache.tika.language.detect.LanguageWriter
-
- reset(AnalysisEngine, JCas) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Resets cTAKES objects, if created.
- reset() - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- reset() - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
- RESET_TABLE - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- resetAE(AnalysisEngine) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Resets the AE (AnalysisEngine), releasing all resources held by the
current AE.
- resetByteCount() - Method in class org.apache.tika.io.CountingInputStream
-
Set the byte count back to 0.
- resetCAS(JCas) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Resets the CAS (Common Analysis System), emptying it of all content.
- resetCount() - Method in class org.apache.tika.io.CountingInputStream
-
Set the byte count back to 0.
- RESOLUTION_HORIZONTAL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Horizontal resolution in pixels per unit."
- RESOLUTION_UNIT - Static variable in interface org.apache.tika.metadata.TIFF
-
"Units used for Horizontal and Vertical Resolutions."
One of "Inch" or "cm"
- RESOLUTION_VERTICAL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Vertical resolution in pixels per unit."
- resolveEntity(String, String) - Method in class org.apache.tika.mime.MimeTypesReader
-
- resolveEntity(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
Does not load any DTDs (they may be requested by the parser).
- resolveEntity(String, String) - Method in class org.apache.tika.sax.OfflineContentHandler
-
Returns an empty stream.
- resolveRelative(Path, String) - Static method in class org.apache.tika.batch.fs.FSUtil
-
Convenience method to ensure that "other" is not an absolute path.
- RESOURCE_NAME_KEY - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
-
- ResultsReporter - Class in org.apache.tika.eval.reports
-
- ResultsReporter() - Constructor for class org.apache.tika.eval.reports.ResultsReporter
-
- reverse(byte[]) - Static method in class org.apache.tika.parser.chm.core.ChmCommons
-
Reverses the order of the given array.
- reverseByteOrder(byte[]) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- REVISION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The revision number.
- REVISION_NUMBER - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- rewind() - Method in class org.apache.tika.utils.RereadableInputStream
-
"Rewinds" the stream to the beginning for rereading.
- RFC822Parser - Class in org.apache.tika.parser.mail
-
Uses apache-mime4j to parse emails.
- RFC822Parser() - Constructor for class org.apache.tika.parser.mail.RFC822Parser
-
- RichTextContentHandler - Class in org.apache.tika.sax
-
Content handler for Rich Text. It extracts the alt attribute of XHTML
<img/> tags and the name attribute of XHTML <a/> tags
into the output.
- RichTextContentHandler(Writer) - Constructor for class org.apache.tika.sax.RichTextContentHandler
-
Creates a content handler that writes XHTML body character events to
the given writer.
- RIGHTS - Static variable in interface org.apache.tika.metadata.DublinCore
-
Information about rights held in and over the resource.
- RIGHTS - Static variable in class org.apache.tika.metadata.Metadata
-
- RIGHTS - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- RIGHTS_USAGE_TERMS - Static variable in interface org.apache.tika.metadata.IPTC
-
The licensing parameters of the item expressed in free-text.
- rollback(File) - Method in class org.apache.tika.example.RollbackSoftware
-
- RollbackSoftware - Class in org.apache.tika.example
-
Demonstrates Tika and its ability to sense symlinks.
- RollbackSoftware() - Constructor for class org.apache.tika.example.RollbackSoftware
-
- ROOT_XML_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- ROW_COUNT - Static variable in interface org.apache.tika.metadata.Database
-
- RTF_PICT_META_PREFIX - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
- RTFConverter - Class in org.apache.tika.xmp.convert
-
Tika to XMP mapping for the RTF format.
- RTFConverter() - Constructor for class org.apache.tika.xmp.convert.RTFConverter
-
- RTFMetadata - Interface in org.apache.tika.metadata
-
- RTFParser - Class in org.apache.tika.parser.rtf
-
RTF parser
- RTFParser() - Constructor for class org.apache.tika.parser.rtf.RTFParser
-
- run(RunProperties, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- run(RunProperties, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- run() - Method in class org.apache.tika.server.ServerStatusWatcher
-
- runAndGetOutput(String, String[], File) - Method in class org.apache.tika.language.translate.ExternalTranslator
-
Run the given command and return the output written to standard out.
- RunProperties - Class in org.apache.tika.parser.microsoft.ooxml
-
WARNING: This class is mutable.
- RunProperties() - Constructor for class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- SafeContentHandler - Class in org.apache.tika.sax
-
- SafeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.SafeContentHandler
-
- SafeContentHandler.Output - Interface in org.apache.tika.sax
-
Internal interface that allows both character and
ignorable whitespace content to be filtered the same way.
- salvageCopy(InputStream, File) - Static method in class org.apache.tika.parser.utils.ZipSalvager
-
This streams the broken zip and rebuilds a new zip that
is at least a valid zip file.
- salvageCopy(File, File) - Static method in class org.apache.tika.parser.utils.ZipSalvager
-
- SAMPLES_PER_PIXEL - Static variable in interface org.apache.tika.metadata.TIFF
-
"Number of components per pixel."
- SAS7BDATParser - Class in org.apache.tika.parser.sas
-
Processes the SAS7BDAT data columnar database file used by SAS and
other similar languages.
- SAS7BDATParser() - Constructor for class org.apache.tika.parser.sas.SAS7BDATParser
-
- save(OutputStream) - Method in class org.apache.tika.config.Param
-
- save(Node) - Method in class org.apache.tika.config.Param
-
- save(OutputStream) - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
Writes the NGramProfile content to the OutputStream using
UTF-8 encoding.
- SAVE_DATE - Static variable in interface org.apache.tika.metadata.Office
-
When was the document last saved?
- SCALE_TYPE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The musical scale used in the music.
- SCENE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the scene."
- SCENE_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Describes the scene of a news content.
- SCHEME - Static variable in interface org.apache.tika.metadata.XMPIdq
-
A qualifier providing the name of the formal identification
scheme used for an item in the xmp:Identifier array.
- SCRIPT_SOURCE - Static variable in interface org.apache.tika.metadata.HTML
-
If a script element contains a src value, this value
is set in the embedded document's metadata
- SDA - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Draw
- SDC - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Calc
- SDD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Impress
- SDW - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
StarOffice Writer
- searchGeoNames(ArrayList<String>) - Method in class org.apache.tika.parser.geo.topic.GeoParser
-
- secondaryParser - Variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- secondaryParser - Variable in class org.apache.tika.parser.recognition.AgeRecogniser
-
- secondsElapsed() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- SECRET_PROPERTY - Static variable in class org.apache.tika.language.translate.MicrosoftTranslator
-
- SecureContentHandler - Class in org.apache.tika.sax
-
Content handler decorator that attempts to prevent denial of service
attacks against Tika parsers.
- SecureContentHandler(ContentHandler, TikaInputStream) - Constructor for class org.apache.tika.sax.SecureContentHandler
-
Decorates the given content handler with zip bomb prevention based
on the count of bytes read from the given counting input stream.
- SECURITY - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- select(Metadata) - Method in class org.apache.tika.batch.FileResourceCrawler
-
- select(Metadata) - Method in class org.apache.tika.batch.fs.FSDocumentSelector
-
- select(Metadata) - Method in interface org.apache.tika.extractor.DocumentSelector
-
Checks if a document with the given metadata matches the specified
selection criteria.
- SentimentAnalysisParser - Class in org.apache.tika.parser.sentiment
-
This parser classifies documents based on the sentiment of the document.
- SentimentAnalysisParser() - Constructor for class org.apache.tika.parser.sentiment.SentimentAnalysisParser
-
- serialize(TikaConfig, TikaConfigSerializer.Mode, Writer, Charset) - Static method in class org.apache.tika.config.TikaConfigSerializer
-
- serialize(Metadata, Type, JsonSerializationContext) - Method in class org.apache.tika.metadata.serialization.JsonMetadataSerializer
-
Serializes a Metadata object into what is effectively a Map.
- serialize(JCas, CTAKESSerializer, boolean, OutputStream) - Static method in class org.apache.tika.parser.ctakes.CTAKESUtils
-
Serializes a CAS in the given format.
- serializedRecursiveParserWrapperExample() - Method in class org.apache.tika.example.ParsingExample
-
We include a simple JSON serializer for a list of metadata with
JsonMetadataList.
- serializeMetadata(List<String>) - Static method in class org.apache.tika.embedder.ExternalEmbedder
-
Serializes a collection of metadata command line arguments into a single
string.
- ServerStatus - Class in org.apache.tika.server
-
- ServerStatus() - Constructor for class org.apache.tika.server.ServerStatus
-
- ServerStatus(boolean) - Constructor for class org.apache.tika.server.ServerStatus
-
- ServerStatus.STATUS - Enum in org.apache.tika.server
-
- ServerStatus.TASK - Enum in org.apache.tika.server
-
- ServerStatusWatcher - Class in org.apache.tika.server
-
- ServerStatusWatcher(ServerStatus, InputStream, Path, long, ServerTimeouts) - Constructor for class org.apache.tika.server.ServerStatusWatcher
-
- ServerTimeouts - Class in org.apache.tika.server
-
- ServerTimeouts() - Constructor for class org.apache.tika.server.ServerTimeouts
-
- ServiceLoader - Class in org.apache.tika.config
-
Internal utility class that Tika uses to look up service providers.
- ServiceLoader(ClassLoader, LoadErrorHandler, InitializableProblemHandler, boolean) - Constructor for class org.apache.tika.config.ServiceLoader
-
- ServiceLoader(ClassLoader, LoadErrorHandler, boolean) - Constructor for class org.apache.tika.config.ServiceLoader
-
- ServiceLoader(ClassLoader, LoadErrorHandler) - Constructor for class org.apache.tika.config.ServiceLoader
-
- ServiceLoader(ClassLoader) - Constructor for class org.apache.tika.config.ServiceLoader
-
- ServiceLoader() - Constructor for class org.apache.tika.config.ServiceLoader
-
- ServiceLoaderUtils - Class in org.apache.tika.utils
-
Utilities related to service loading and ordering.
- ServiceLoaderUtils() - Constructor for class org.apache.tika.utils.ServiceLoaderUtils
-
- set(String, String) - Method in class org.apache.tika.metadata.Metadata
-
Set metadata name/value.
- set(Property, String) - Method in class org.apache.tika.metadata.Metadata
-
Sets the value of the identified metadata property.
- set(Property, String[]) - Method in class org.apache.tika.metadata.Metadata
-
Sets the values of the identified metadata property.
- set(Property, int) - Method in class org.apache.tika.metadata.Metadata
-
Sets the integer value of the identified metadata property.
- set(Property, double) - Method in class org.apache.tika.metadata.Metadata
-
Sets the real or rational value of the identified metadata property.
- set(Property, Date) - Method in class org.apache.tika.metadata.Metadata
-
Sets the date value of the identified metadata property.
- set(Property, Calendar) - Method in class org.apache.tika.metadata.Metadata
-
Sets the date value of the identified metadata property.
- set(MediaType...) - Static method in class org.apache.tika.mime.MediaType
-
Convenience method that returns an unmodifiable set that contains
all the given media types.
- set(String...) - Static method in class org.apache.tika.mime.MediaType
-
Convenience method that parses the given media type strings and
returns an unmodifiable set that contains all the parsed types.
- set(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
-
Adds the given value to the context as an implementation of the given
interface.
- set(String, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
Sets the given property.
- set(Property, String) - Method in class org.apache.tika.xmp.XMPMetadata
-
- set(Property, int) - Method in class org.apache.tika.xmp.XMPMetadata
-
- set(Property, double) - Method in class org.apache.tika.xmp.XMPMetadata
-
- set(Property, Date) - Method in class org.apache.tika.xmp.XMPMetadata
-
- set(Property, String[]) - Method in class org.apache.tika.xmp.XMPMetadata
-
Sets array properties.
- setAccessChecker(AccessChecker) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setAdmin1Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setAdmin2Code(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setAeDescriptorPath(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the path to XML descriptor for AnalysisEngine.
- setAgePredictorClient(AgePredicterLocal) - Static method in class org.apache.tika.parser.recognition.AgeRecogniser
-
Used in test cases to mock the response of AgeClassifier.
- setAlignedLenTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setAlignedTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setAll(Properties) - Method in class org.apache.tika.metadata.Metadata
-
Copies all key-value pairs from the given properties.
- setAll(Properties) - Method in class org.apache.tika.xmp.XMPMetadata
-
It will set all simple and array properties that have QName keys in registered namespaces.
- setAnnotationProps(CTAKESAnnotationProperty[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setAnnotationProps(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setApiKey(String) - Method in class org.apache.tika.language.translate.YandexTranslator
-
Set the API Key for client authentication
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Sets whether or not a rotation value should be calculated and passed to ImageMagick.
- setApplyRotation(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setAverageCharTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setAverageCharTolerance(float)
- setBlock_len(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets block length
- setBlockAddress(long[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets block addresses
- setBlockCount(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a block count
- setBlockidx_intvl(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets block index interval
- setBlockLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBlockLlen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a block length
- setBlockNext(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setBlockPrev(int) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setBlockRemaining(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBlockType(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setBold(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setCatchIntermediateIOExceptions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
The PDFBox parser will throw an IOException if there is
a problem with a stream.
- setCenter(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setCharset(Charset) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setChmDirList(ChmDirectoryListingSet) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmItsfHeader(ChmItsfHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmItspHeader(ChmItspHeader) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmLzxcControlData(ChmLzxcControlData) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setChmLzxcResetTable(ChmLzxcResetTable) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setColorspace(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setCommand(String...) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the command to be run.
- setCommand(String...) - Method in class org.apache.tika.parser.external.ExternalParser
-
Sets the command to be run.
- setCommand(String) - Method in class org.apache.tika.parser.gdal.GDALParser
-
- setCommandAppendOperator(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the operator to append rather than replace a value for the command
line tool.
- setCommandAssignmentDelimeter(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the delimiter for multiple assignments for the command line tool.
- setCommandAssignmentOperator(String) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the assignment operator for the command line tool.
- setCompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets compressed length
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setConcatenatePhoneticRuns(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Microsoft Excel files can sometimes contain phonetic (furigana) strings.
- setConfidence(double) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setConsumersManagerMaxMillis(long) - Method in class org.apache.tika.batch.ConsumersManager
-
- setContentHandler(ContentHandler) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
Sets the underlying content handler.
- setContentLength(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxBlock
-
- setContentParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
-
- setContentParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.AbstractXML2003Parser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- setContentType(Metadata) - Method in class org.apache.tika.parser.microsoft.xml.WordMLParser
-
- setContextClassLoader(ClassLoader) - Static method in class org.apache.tika.config.ServiceLoader
-
Sets the context class loader to use for all threads that access
this class.
- setControlDataIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets control data index
- setCorePoolSize(int) - Method in interface org.apache.tika.concurrent.ConfigurableThreadPoolExecutor
-
- setCountryCode(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setData(byte[]) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setDataOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets data offset
- setDeclaredEncoding(String) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the declared encoding for charset detection.
- setDelimiter(Character) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setDensity(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setDepth(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setDescription(String) - Method in class org.apache.tika.mime.MimeType
-
Set the description of this media type.
- setDetectableCharset(String, boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
- setDetectAngles(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setDetector(Detector) - Method in class org.apache.tika.parser.AutoDetectParser
-
Sets the type detector used by this parser to auto-detect the type
of a document.
- setDetector(Parser, Detector) - Static method in class org.apache.tika.server.resource.TikaResource
-
- setDigester(DigestingParser.Digester) - Method in class org.apache.tika.batch.DigestingAutoDetectParserFactory
-
- setDir_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory uuid
- setDirectoryListingEntryList(List<DirectoryListingEntry>) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets chm directory listing entry list
- setDirLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory length
- setDirOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets directory offset
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.DIFContentHandler
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TeeContentHandler
-
- setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TextContentHandler
-
- setDocumentSelector(DocumentSelector) - Method in class org.apache.tika.batch.FileResourceCrawler
-
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setEnableAutoSpace(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), the parser should estimate
where spaces should be inserted between words.
- setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the value to true if processing is to be enabled.
- setEnableImageProcessing(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setEncoding(StringsEncoding) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the character encoding of the strings that are to be found.
- setEncodingDetector(EncodingDetector) - Method in class org.apache.tika.parser.AbstractEncodingDetectorParser
-
- setEntryType(ChmCommons.EntryType) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setExtractAcroFormContent(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), extract content from AcroForms
at the end of the document.
- setExtractActions(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether or not to extract PDActions from the file.
- setExtractAllAlternatives(boolean) - Method in class org.apache.tika.parser.mail.RFC822Parser
-
Until version 1.17, Tika handled all body parts as embedded objects (see TIKA-2478).
- setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
Some .msg files can contain body content in html, rtf and/or text.
- setExtractAllAlternativesFromMSG(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Some .msg files can contain body content in html, rtf and/or text.
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setExtractAnnotationText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true (the default), text in annotations will be
extracted.
- setExtractBookmarksText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract bookmarks (document outline) text.
- setExtractInlineImages(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, extract inline embedded OBXImages.
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setExtractMacros(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not MSOffice parsers should extract macros.
- setExtractScripts(boolean) - Method in class org.apache.tika.parser.html.HtmlParser
-
Whether or not to extract contents in script entities.
- setExtractUniqueInlineImagesOnly(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Multiple pages within a PDF file might refer to the same underlying image.
- setFallback(Parser) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the fallback parser.
- setFilePath(String) - Method in class org.apache.tika.parser.strings.FileConfig
-
Sets the "file" installation folder.
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setFilter(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setFormat(String) - Method in class org.apache.tika.language.translate.YandexTranslator
-
Set the text format to use (plain/html)
- setFramesRead(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Sets pmgi free space
- setFreeSpace(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setGazetteerRestEndpoint(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
Configure REST endpoint for lucene-geo-gazetteer
- setGson(Gson) - Static method in class org.apache.tika.metadata.serialization.JsonMetadata
-
Enables setting custom configurations on Gson.
- setGson(Gson) - Static method in class org.apache.tika.metadata.serialization.JsonMetadataList
-
Enables setting custom configurations on Gson.
- setHadStarted(ChmCommons.LzxState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setHeader_len(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets itsp header length
- setHeaderLen(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf header length
- setId(String) - Method in class org.apache.tika.language.translate.MicrosoftTranslator
-
Sets the client Id for the translator API.
- setId(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setIdentifier(String) - Method in class org.apache.tika.sax.StandardReference
-
- setIfXFAExtractOnlyXFA(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If false (the default), extract content from the full PDF
as well as the XFA form.
- setIgnoredLineConsumer(ExternalParser.LineConsumer) - Method in class org.apache.tika.parser.external.ExternalParser
-
Set a consumer for the lines ignored by the parse functions
- setIlvl(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the ImageMagick executable directory, needed if it is not on system path.
- setImageMagickPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Sets whether or not the parser should include deleted content.
- setIncludeDeletedContent(boolean) - Method in class org.apache.tika.parser.wordperfect.WordPerfectParser
-
Whether or not to include deleted content.
- setIncludeHeadersAndFooters(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include headers and footers.
- setIncludeMarkup(boolean) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- setIncludeMissingRows(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
For table-like formats, and tables within other formats, should
missing rows in sparse tables be output where detected?
The default is to only output rows defined within the file, which
avoids lots of blank lines, but means layout isn't preserved.
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeMoveFromContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
With track changes on, when a section is moved, the content
is stored in both the "moveFrom" section and in the "moveTo" section.
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setIncludeShapeBasedContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
In Excel and Word, there can be text stored within drawing shapes.
- setIncludeSlideMasterContent(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to include contents from any of the three
types of masters -- slide, notes, handout -- in a .ppt or ppt[xm] file.
- setIncludeSlideNotes(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Whether or not to process slide notes content.
- setIndex_depth(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index depth
- setIndex_head(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index head
- setIndex_root(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets an index root
- setIndexOfContent(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setIndexOfResetData(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setIndexOfResetTable(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setInitializableProblemHandler(InitializableProblemHandler) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setIntelCurrentPossition(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIntelFileSize(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIntelState(ChmCommons.IntelState) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setIsShuttingDown(boolean) - Method in class org.apache.tika.batch.StatusReporter
-
Set whether the main process is in the process of shutting down.
- setItalics(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setJavaCommand(List<String>) - Method in class org.apache.tika.fork.ForkParser
-
Sets the command used to start the forked server process.
- setJavaCommand(String) - Method in class org.apache.tika.fork.ForkParser
-
- setKey(Key) - Static method in class org.apache.tika.example.Pharmacy
-
- setLabel(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setLabelLang(String) - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- setLang_id(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets language id
- setLangId(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets language id
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract language dictionary to be used.
- setLanguage(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setLastModified(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets last modified date of the chm file
- setLatitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setLeft(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setLengthTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setLengthTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setListenForAllRecords(boolean) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
-
Specifies whether this parser should listen for all
records or just for the specified few.
- setLongitude(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setLzxBlockLength(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setLzxBlockOffset(long) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setLzxBlocksCache(List<ChmLzxBlock>) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setMain(String, String, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
-
- setMainOrganizationAcronym(String) - Method in class org.apache.tika.sax.StandardReference
-
- setMainTreeElements(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMainTreeLengtsTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMainTreeTable(short[]) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.html.HtmlEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
- setMarkLimit(int) - Method in class org.apache.tika.parser.pkg.ZipContainerDetector
-
If this is less than 0, the file will be spooled to disk,
and detection will run on the full file.
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
How far into the stream to read for charset detection.
- setMarkLimit(int) - Method in class org.apache.tika.parser.txt.UniversalEncodingDetector
-
How far into the stream to read for charset detection.
- setMaxAliveTimeSeconds(int) - Method in class org.apache.tika.batch.BatchProcess
-
The maximum amount of time that this process can be alive.
- setMaxBytesForEmbeddedObject(int) - Static method in class org.apache.tika.parser.rtf.RTFParser
-
- setMaxChildStartupMillis(long) - Method in class org.apache.tika.server.ServerTimeouts
-
- setMaxConsecWaitInMillis(long) - Method in class org.apache.tika.batch.FileResourceCrawler
-
- setMaxContentLength(int) - Method in class org.apache.tika.eval.AbstractProfiler
-
Truncates the content string to this length if it is longer.
- setMaxContentLengthForLangId(int) - Method in class org.apache.tika.eval.AbstractProfiler
-
For language identification, truncates the content string to this length if it is longer.
- setMaxEmbeddedResources(int) - Method in class org.apache.tika.parser.RecursiveParserWrapper
-
- setMaxEntityExpansions(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Set the maximum number of entity expansions allowable in SAX/DOM/StAX parsing.
- setMaxFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Sets the maximum size of a file that will be submitted to OCR.
- setMaxFilesProcessedPerServer(int) - Method in class org.apache.tika.fork.ForkParser
-
If there is a slowly building memory leak in one of the parsers,
it is useful to set a limit on the number of files processed
by a server before it is shut down and restarted.
- setMaxFilesToAdd(int) - Method in class org.apache.tika.batch.FileResourceCrawler
-
Maximum number of files to add.
- setMaxFilesToConsider(int) - Method in class org.apache.tika.batch.FileResourceCrawler
-
Maximum number of files to consider.
- setMaximumCompressionRatio(long) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the ratio between output characters and input bytes.
- setMaximumDepth(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the maximum XML element nesting level.
- setMaximumPackageEntryDepth(int) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the maximum package entry nesting level.
- setMaximumPoolSize(int) - Method in interface org.apache.tika.concurrent.ConfigurableThreadPoolExecutor
-
- setMaxMainMemoryBytes(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setMaxRestarts(int) - Method in class org.apache.tika.server.ServerTimeouts
-
- setMaxStringLength(int) - Method in class org.apache.tika.Tika
-
Sets the maximum length of strings returned by the parseToString
methods.
- setMaxTextLength(int) - Static method in class org.apache.tika.eval.util.LanguageIDWrapper
-
- setMaxTokens(int) - Method in class org.apache.tika.eval.AbstractProfiler
-
Adds a LimitTokenCountFilterFactory if the value is greater than -1.
- setMaxXMPMMHistory(int) - Static method in class org.apache.tika.parser.image.xmp.JempboxExtractor
-
Maximum number of events to extract from the
event history in the XMP Media Management (XMPMM) section.
- setMediaType(MediaType) - Method in class org.apache.tika.parser.csv.CSVParams
-
- setMediaTypeRegistry(MediaTypeRegistry) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the media type registry used to infer type relationships.
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.pkg.CompressorParser
-
- setMemoryLimitInKb(int) - Method in class org.apache.tika.parser.rtf.RTFParser
-
- setMetadata(String[]) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the metadata whose values will be analyzed using cTAKES.
- setMetadata(Metadata) - Method in class org.apache.tika.xmp.convert.AbstractConverter
-
- setMetadataCommandArguments(Map<Property, String[]>) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets the map of Metadata keys to command line parameters.
- setMetadataExtractionPatterns(Map<Pattern, String>) - Method in class org.apache.tika.parser.external.ExternalParser
-
Sets the map of regular expression patterns and Metadata
keys.
- setMetaParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
-
- setMetaParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
-
- setMimetype(boolean) - Method in class org.apache.tika.parser.strings.FileConfig
-
Sets the mime option.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Sets the minimum size of a file that will be submitted to OCR.
- setMinFileSizeToOcr(long) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setMinLength(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the minimum sequence length (characters) to print.
- setMinSize(int) - Method in class org.apache.tika.parser.strings.Latin1StringsParser
-
Sets the minimum size of a character sequence to be extracted.
- setMixedLanguages(boolean) - Method in class org.apache.tika.language.detect.LanguageDetector
-
- setName(String) - Method in class org.apache.tika.config.Param
-
- setName(String) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Sets entry name
- setName(String) - Method in class org.apache.tika.parser.geo.topic.gazetteer.Location
-
- setNameLength(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
Sets an entry name length
- setNamePrefix(String) - Method in class org.apache.tika.eval.db.TableInfo
-
- setNERModelPath(String) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- setNerModelUrl(URL) - Method in class org.apache.tika.parser.geo.topic.GeoParserConfig
-
- setNum_blocks(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets the number of blocks contained in the chm file
- setNumId(int) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setNumOfHidden(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- setNumOfInputs(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- setNumOfOutputs(int) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- setOcrDPI(int) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Dots per inch used to render the page image for OCR.
- setOcrImageFormatName(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageQuality(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image quality used to render the page image for OCR.
- setOcrImageScale(float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrImageType(ImageType) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrImageType(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Image type used to render the page image for OCR.
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setOcrStrategy(PDFParserConfig.OCR_STRATEGY) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOcrStrategy(String) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Which strategy to use for OCR
- setOffset(int) - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- setOpenContainer(Object) - Method in class org.apache.tika.io.TikaInputStream
-
Stores the open container object against
the stream, e.g. after a Zip contents
detector has loaded the file to decide
what it contains.
- setOutputEncoding(Charset) - Method in class org.apache.tika.batch.fs.BasicTikaFSConsumer
-
- setOutputEncoding(String) - Method in class org.apache.tika.batch.fs.RecursiveParserWrapperFSConsumer
-
- setOutputEncoding(String) - Method in class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
-
- setOutputStream(OutputStream) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
- setOutputThreshold(long) - Method in class org.apache.tika.sax.SecureContentHandler
-
Sets the threshold for output characters before the zip bomb prevention
is activated.
- setOutputType(TesseractOCRConfig.OUTPUT_TYPE) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the output type of the OCR process.
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setOutputType(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set tesseract page segmentation mode.
- setPageSegMode(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
The page separator to use in plain text output.
- setParams(float[]) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- setParseException(boolean) - Method in class org.apache.tika.eval.util.ContentTags
-
- setParseRecursively(boolean) - Method in class org.apache.tika.batch.ParserFactory
-
- setParsers(Map<MediaType, Parser>) - Method in class org.apache.tika.parser.CompositeParser
-
Sets the component parsers.
- setPathClassifyModel(String) - Method in class org.apache.tika.parser.recognition.AgeRecogniserConfig
-
- setPathClassifyRegression(String) - Method in class org.apache.tika.parser.recognition.AgeRecogniserConfig
-
- setPauseOnEarlyTerminationMillis(long) - Method in class org.apache.tika.batch.BatchProcess
-
If there is an early termination via an interrupt or too many timed out consumers
or because a consumer or other Runnable threw a Throwable, pause this long
before killing the consumers and other threads.
- setPDFParserConfig(PDFParserConfig) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setPersonAndEmail(String, Property, Property, Metadata) - Static method in class org.apache.tika.parser.mail.MailUtil
-
This tries to split a "from" or "to" value into a person field and an email field.
- setPingPulseMillis(long) - Method in class org.apache.tika.server.ServerTimeouts
-
- setPingTimeoutMillis(long) - Method in class org.apache.tika.server.ServerTimeouts
-
- setPoolSize(int) - Method in class org.apache.tika.fork.ForkParser
-
Sets the size of the process pool.
- setPoolSize(int) - Static method in class org.apache.tika.mime.MimeTypesReader
-
Set the pool size for cached XML parsers.
- setPoolSize(int) - Static method in class org.apache.tika.utils.XMLReaderUtils
-
Set the pool size for cached XML parsers.
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Whether or not to maintain interword spacing.
- setPreserveInterwordSpacing(boolean) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setPrettyPrint(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables the formatted output for serializer.
- setPrettyPrinting(boolean) - Static method in class org.apache.tika.metadata.serialization.JsonMetadata
-
- setPrettyPrinting(boolean) - Static method in class org.apache.tika.metadata.serialization.JsonMetadataList
-
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.Lingo24LangDetector
-
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.OptimaizeLangDetector
-
- setPriors(Map<String, Float>) - Method in class org.apache.tika.langdetect.TextLangDetector
-
- setPriors(Map<String, Float>) - Method in class org.apache.tika.language.detect.LanguageDetector
-
Set the a-priori probabilities for these languages.
- setQuoteAssignmentValues(boolean) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
Sets whether or not to quote assignment values, i.e.
- setR0(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setR1(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setR2(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setRecogniser(String) - Method in class org.apache.tika.parser.recognition.ObjectRecognitionParser
-
- setRedirectChildProcessToStdOut(boolean) - Method in class org.apache.tika.batch.BatchProcessDriverCLI
-
Typically only used for testing.
- setResetInterval(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a reset interval
- setResetTableIndex(int) - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
Sets reset table index
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setResize(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setRight(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.HeaderFooterFromString
-
- setScore(double) - Method in class org.apache.tika.sax.StandardReference
-
- setScore(double) - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
-
- setSecondOrganization(String, String) - Method in class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
-
- setSecondOrganizationAcronym(String) - Method in class org.apache.tika.sax.StandardReference
-
- setSecret(String) - Method in class org.apache.tika.language.translate.MicrosoftTranslator
-
Sets the client secret for the translator API.
- setSeparator(String) - Method in class org.apache.tika.sax.StandardReference
-
- setSeparatorChar(char) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the separator character used for annotation properties.
- setSerialize(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables CAS serialization.
- setSerializerType(CTAKESSerializer) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the type of cTAKES (UIMA) serializer used to write CAS.
- setServerParseTimeoutMillis(long) - Method in class org.apache.tika.fork.ForkParser
-
The maximum amount of time allowed for the server to try to parse a file.
- setServerPulseMillis(long) - Method in class org.apache.tika.fork.ForkParser
-
The amount of time in milliseconds that the server
should wait before checking to see if the parse has timed out
or if the wait has timed out.
The default is 5 seconds.
- setServerWaitTimeoutMillis(long) - Method in class org.apache.tika.fork.ForkParser
-
The maximum amount of time allowed for the server to wait for a new request to parse
a file.
- setSetKCMS(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
Whether to call System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider").
- setShortText(boolean) - Method in class org.apache.tika.language.detect.LanguageDetector
-
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf header signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets itsp signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a signature of control data block
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Sets pmgi signature
- setSignature(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a size of control data
- setSleepMillis(long) - Method in class org.apache.tika.batch.StatusReporter
-
Set the amount of time to sleep between reports.
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setSortByPosition(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, sort text tokens by their x/y position
before extracting text.
- setSpacingTolerance(Float) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
See PDFTextStripper.setSpacingTolerance(float)
- setStaleThresholdMillis(long) - Method in class org.apache.tika.batch.StatusReporter
-
Set the amount of time in milliseconds to use as the threshold for determining
a stale parse.
- setStartIndex(int) - Method in class org.apache.tika.parser.chm.core.ChmWrapper
-
- setStatus(ServerStatus.STATUS) - Method in class org.apache.tika.server.ServerStatus
-
- setStream_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets stream uuid
- setStrike(boolean) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setStringsPath(String) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the "strings" installation folder.
- setStripMarkup(boolean) - Method in class org.apache.tika.parser.txt.Icu4jEncodingDetector
-
Whether or not to attempt to strip html-ish markup
from the stream before sending it to the underlying
detector.
- setStyleID(String) - Method in class org.apache.tika.parser.microsoft.ooxml.ParagraphProperties
-
- setSuperType(MimeType, MediaType) - Method in class org.apache.tika.mime.MimeTypes
-
- setSupportedEmbedTypes(Set<MediaType>) - Method in class org.apache.tika.embedder.ExternalEmbedder
-
- setSupportedTypes(Set<MediaType>) - Method in class org.apache.tika.parser.external.ExternalParser
-
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParser
-
- setSuppressDuplicateOverlappingText(boolean) - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
If true, the parser should try to remove duplicated
text over the same region.
- setSwath(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- setSystem_uuid(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets system uuid
- setTableOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets a table offset
- setTaskTimeoutMillis(long) - Method in class org.apache.tika.server.ServerTimeouts
-
- setTemporaryFileDirectory(Path) - Method in class org.apache.tika.io.TemporaryResources
-
- setTemporaryFileDirectory(File) - Method in class org.apache.tika.io.TemporaryResources
-
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the 'tessdata' folder, which contains language files and config files.
- setTessdataPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the path to the Tesseract executable's directory, needed if it is not on system path.
- setTesseractPath(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setText(boolean) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Enables content text analysis using cTAKES.
- setText(byte[]) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setText(InputStream) - Method in class org.apache.tika.parser.txt.CharsetDetector
-
Set the input text (byte) data whose charset is to be detected.
- setThreshold(double) - Method in class org.apache.tika.sax.StandardsExtractingContentHandler
-
Sets the score to be used as threshold.
- setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Set the maximum time (in seconds) to wait for the OCR process to terminate.
- setTimeout(int) - Method in class org.apache.tika.parser.ocr.TesseractOCRParser
-
- setTimeout(int) - Method in class org.apache.tika.parser.strings.StringsConfig
-
Sets the maximum time (in seconds) to wait for the "strings" command to
terminate.
- setTimeoutCheckPulseMillis(long) - Method in class org.apache.tika.batch.BatchProcess
-
- setTimeoutThresholdMillis(long) - Method in class org.apache.tika.batch.BatchProcess
-
The amount of time allowed before a consumer should be timed out.
- setTopN(int) - Method in class org.apache.tika.eval.tokens.TokenCounter
-
- setTotal(int) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- setTracking(boolean) - Method in class org.apache.tika.parser.mbox.MboxParser
-
- setTranslator(Translator) - Method in class org.apache.tika.language.translate.CachedTranslator
-
- setTrustedPageSeparator(String) - Method in class org.apache.tika.parser.ocr.TesseractOCRConfig
-
- setType(Class<T>) - Method in class org.apache.tika.config.Param
-
- setType(MediaType) - Method in class org.apache.tika.detect.NNTrainedModelBuilder
-
- setTypeString(String) - Method in class org.apache.tika.config.Param
-
- setUMLSPass(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS password.
- setUMLSUser(String) - Method in class org.apache.tika.parser.ctakes.CTAKESConfig
-
Sets the UMLS username.
- setUncompressedLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets uncompressed length
- setUnderline(String) - Method in class org.apache.tika.parser.microsoft.ooxml.RunProperties
-
- setUnknown(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets an unknown
- setUnknown0008(long) - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown_000c
- setUnknown_000c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets the 000c unknown bytes. "Unknown" here means that those who
reverse-engineered the CHM format do not know what its purpose is.
- setUnknown_0024(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 0024 unknown bytes
- setUnknown_002c(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 002c unknown bytes
- setUnknown_0044(byte[]) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets 0044 unknown bytes
- setUnknown_18(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets unknown 18 bytes
- setUnknownLen(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown length
- setUnknownOffset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets unknown offset
- setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setUseSAXDocxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Use the experimental SAX-based streaming DOCX parser?
If set to false, the classic parser will be used; if true,
the new experimental parser will be used.
- setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.AbstractOfficeParser
-
- setUseSAXPptxExtractor(boolean) - Method in class org.apache.tika.parser.microsoft.OfficeParserConfig
-
Use the experimental SAX-based streaming PPTX parser?
If set to false, the classic parser will be used; if true,
the new experimental parser will be used.
- setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Sets itsf version
- setVersion(int) - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
Sets a version of itsp header
- setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets version of control data block
- setVersion(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
Sets the version
- setWindow(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowPosition(int) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowSize(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets a window size
- setWindowSize(long) - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
- setWindowsPerReset(long) - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Sets windows per reset
- sheetParts - Variable in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
-
- SheetTextAsHTML(OfficeParserConfig, XHTMLContentHandler) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- shortText - Variable in class org.apache.tika.language.detect.LanguageDetector
-
- SHOT_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the video was shot."
- SHOT_LOCATION - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the location where the video was shot.
- SHOT_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the shot or take."
- shouldParseEmbedded(Metadata) - Method in interface org.apache.tika.extractor.EmbeddedDocumentExtractor
-
- shouldParseEmbedded(Metadata) - Method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
- shouldParseEmbedded(Metadata) - Method in class org.apache.tika.extractor.ParsingEmbeddedDocumentExtractor
-
- shutdown() - Method in class org.apache.tika.batch.ConsumersManager
-
This is called by BatchProcess immediately before closing.
- shutdown() - Method in class org.apache.tika.batch.fs.FSConsumersManager
-
- shutdown() - Method in class org.apache.tika.eval.batch.DBConsumersManager
-
- shutDownNoPoison() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Set to true to shut down the FileResourceCrawler without
adding poison.
- SimpleLogReporterBuilder - Class in org.apache.tika.batch.builders
-
- SimpleLogReporterBuilder() - Constructor for class org.apache.tika.batch.builders.SimpleLogReporterBuilder
-
- SimpleTextExtractor - Class in org.apache.tika.example
-
- SimpleTextExtractor() - Constructor for class org.apache.tika.example.SimpleTextExtractor
-
- SimpleThreadPoolExecutor - Class in org.apache.tika.concurrent
-
Simple Thread Pool Executor
- SimpleThreadPoolExecutor() - Constructor for class org.apache.tika.concurrent.SimpleThreadPoolExecutor
-
- SimpleTypeDetector - Class in org.apache.tika.example
-
- SimpleTypeDetector() - Constructor for class org.apache.tika.example.SimpleTypeDetector
-
- size() - Method in class org.apache.tika.metadata.Metadata
-
Returns the number of metadata names in this metadata.
- size() - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- size() - Method in class org.apache.tika.xmp.XMPMetadata
-
Returns the number of top-level namespaces
- skip(long) - Method in class org.apache.tika.io.BoundedInputStream
-
Invokes the delegate's skip(long) method.
- skip(long) - Method in class org.apache.tika.io.CountingInputStream
-
Skips the stream over the specified number of bytes, adding the skipped
amount to the count.
- skip(long) - Method in class org.apache.tika.io.LookaheadInputStream
-
- skip(long) - Method in class org.apache.tika.io.NullInputStream
-
Skip a specified number of bytes.
- skip(long) - Method in class org.apache.tika.io.ProxyInputStream
-
Invokes the delegate's skip(long) method.
- skip(long) - Method in class org.apache.tika.io.TailStream
-
This implementation delegates to the read() method
to ensure that the tail buffer is also filled if data is skipped.
- skip(long) - Method in class org.apache.tika.io.TikaInputStream
-
- SKIPPED - Static variable in class org.apache.tika.batch.FileResourceCrawler
-
- skippedEntity(String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- skippedEntity(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- skippedEntity(String) - Method in class org.apache.tika.sax.TeeContentHandler
-
- skippedEntity(String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
-
- SLDWORKS - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
SolidWorks CAD file
- SLIDE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- SLIDE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of slides in the (presentation) document
- SlowCompositeReaderWrapper - Class in org.apache.tika.eval.tools
-
COPIED VERBATIM FROM LUCENE.
This class forces a composite reader (e.g. a MultiReader or DirectoryReader)
to emulate a LeafReader.
- SOFTWARE - Static variable in interface org.apache.tika.metadata.TIFF
-
"Software or firmware used to generate the image."
- sortLoadedClasses(List<T>) - Static method in class org.apache.tika.utils.ServiceLoaderUtils
-
Sorts a list of loaded classes, so that non-Tika ones come
before Tika ones, and otherwise in reverse alphabetical order
- SOURCE - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- SOURCE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A reference to a resource from which the present resource is derived.
- SOURCE - Static variable in interface org.apache.tika.metadata.IPTC
-
Identifies the original owner of the copyright for the intellectual
content of the item.
- SOURCE - Static variable in class org.apache.tika.metadata.Metadata
-
- SOURCE - Static variable in interface org.apache.tika.metadata.Photoshop
-
- SOURCE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- SourceCodeParser - Class in org.apache.tika.parser.code
-
Generic Source code parser for Java, Groovy, C++.
- SourceCodeParser() - Constructor for class org.apache.tika.parser.code.SourceCodeParser
-
- SourceCodeParser(EncodingDetector) - Constructor for class org.apache.tika.parser.code.SourceCodeParser
-
- SPEAKER_PLACEMENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"A description of the speaker angles from center front in degrees.
- SpreadsheetMLParser - Class in org.apache.tika.parser.microsoft.xml
-
Parses SpreadsheetML 2003 format Excel files.
- SpreadsheetMLParser() - Constructor for class org.apache.tika.parser.microsoft.xml.SpreadsheetMLParser
-
- SpringExample - Class in org.apache.tika.example
-
- SpringExample() - Constructor for class org.apache.tika.example.SpringExample
-
- SQLite3Parser - Class in org.apache.tika.parser.jdbc
-
This is the main class for parsing SQLite3 files.
- SQLite3Parser() - Constructor for class org.apache.tika.parser.jdbc.SQLite3Parser
-
Checks to see if class is available for org.sqlite.JDBC.
- STANDARD_REFERENCES - Static variable in class org.apache.tika.sax.StandardsExtractingContentHandler
-
- StandardHtmlEncodingDetector - Class in org.apache.tika.parser.html.charsetdetector
-
An encoding detector that tries to respect the spirit of the HTML spec
part 12.2.3 "The input byte stream", or at least the part that is compatible with
the implementation of Tika.
- StandardHtmlEncodingDetector() - Constructor for class org.apache.tika.parser.html.charsetdetector.StandardHtmlEncodingDetector
-
- StandardOrganizations - Class in org.apache.tika.sax
-
This class provides a collection of the most important technical standard organizations.
- StandardOrganizations() - Constructor for class org.apache.tika.sax.StandardOrganizations
-
- StandardReference - Class in org.apache.tika.sax
-
Class that represents a standard reference.
- StandardReference.StandardReferenceBuilder - Class in org.apache.tika.sax
-
- StandardReferenceBuilder(String, String) - Constructor for class org.apache.tika.sax.StandardReference.StandardReferenceBuilder
-
- StandardsExtractingContentHandler - Class in org.apache.tika.sax
-
StandardsExtractingContentHandler is a Content Handler used to extract
standard references while parsing.
- StandardsExtractingContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.StandardsExtractingContentHandler
-
Creates a decorator for the given SAX event handler and Metadata object.
- StandardsExtractingContentHandler() - Constructor for class org.apache.tika.sax.StandardsExtractingContentHandler
-
Creates a decorator that by default forwards incoming SAX events to a
dummy content handler that simply ignores all the events.
- StandardsExtractionExample - Class in org.apache.tika.sax
-
- StandardsExtractionExample() - Constructor for class org.apache.tika.sax.StandardsExtractionExample
-
- StandardsText - Class in org.apache.tika.sax
-
StandardsText relies on regular expressions to extract standard references
from text.
- StandardsText() - Constructor for class org.apache.tika.sax.StandardsText
-
- start() - Method in class org.apache.tika.batch.FileResourceCrawler
-
Implement this to control the addition of FileResources.
- start() - Method in class org.apache.tika.batch.fs.FSDirectoryCrawler
-
- start() - Method in class org.apache.tika.batch.fs.FSListCrawler
-
- start(BundleContext) - Method in class org.apache.tika.config.TikaActivator
-
- start(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
-
- start(ServerStatus.TASK, String) - Method in class org.apache.tika.server.ServerStatus
-
- START_PMGL - Static variable in class org.apache.tika.parser.chm.core.ChmConstants
-
- startBookmark(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startBookmark(String, String) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startDescription(String, String, String) - Method in class org.apache.tika.sax.XMPContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- startDocument() - Method in class org.apache.tika.sax.DIFContentHandler
-
- startDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
-
Ignored.
- startDocument() - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
-
- startDocument() - Method in class org.apache.tika.sax.TeeContentHandler
-
- startDocument() - Method in class org.apache.tika.sax.TextContentHandler
-
- startDocument() - Method in class org.apache.tika.sax.ToHTMLContentHandler
-
- startDocument() - Method in class org.apache.tika.sax.ToXMLContentHandler
-
Writes the XML prefix.
- startDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Starts an XHTML document by setting up the namespace mappings
when called for the first time.
- startDocument() - Method in class org.apache.tika.sax.XMPContentHandler
-
Starts an XMP document by setting up the namespace mappings and
writing out the following header:
- startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startEditedSection(String, Date, OOXMLWordAndPowerPointTextHandler.EditType) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.mime.MimeTypesReader
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeDependantMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.AttributeMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.ElementMetadataHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.MetadataHandler
-
Deprecated.
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.DIFContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ExpandedTitleContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.LinkContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.RichTextContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.SafeContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.SecureContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TeeContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TextContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToTextContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
Starts the given element.
- startElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
- startElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
- startElement(String, AttributesImpl) - Method in class org.apache.tika.sax.XHTMLContentHandler
-
- startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
-
- startEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
This is called before parsing each embedded document.
- startEmbeddedDocument(ContentHandler, Metadata) - Method in class org.apache.tika.sax.RecursiveParserWrapperHandler
-
This is called before parsing an embedded document
- startParagraph(ParagraphProperties) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startParagraph(ParagraphProperties) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.html.BoilerpipeContentHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.XSSFSheetInterestingPartsCapturer
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
-
- startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ToXMLContentHandler
-
- startRow(int) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator.SheetTextAsHTML
-
- startSDT() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startSDT() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startsWith(byte[], String) - Static method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
- startTable() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTable() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startTableCell() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTableCell() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- startTableRow() - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLTikaBodyPartHandler
-
- startTableRow() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.XWPFBodyContentsHandler
-
- STATE - Static variable in interface org.apache.tika.metadata.Photoshop
-
- StatusReporter - Class in org.apache.tika.batch
-
Basic class to use for reporting status from both the crawler and the consumers.
- StatusReporter(FileResourceCrawler, ConsumersManager) - Constructor for class org.apache.tika.batch.StatusReporter
-
Initialize with the crawler and consumers
- StatusReporterBuilder - Interface in org.apache.tika.batch.builders
-
- StatusReporterFutureResult - Class in org.apache.tika.batch
-
Empty class for what a StatusReporter returns when it finishes.
- StatusReporterFutureResult() - Constructor for class org.apache.tika.batch.StatusReporterFutureResult
-
- stop(BundleContext) - Method in class org.apache.tika.config.TikaActivator
-
- stop(BundleContext) - Method in class org.apache.tika.parser.internal.Activator
-
- STOP_NOW - Static variable in class org.apache.tika.batch.FileResourceCrawler
-
- StrawManTikaAppDriver - Class in org.apache.tika.batch.fs.strawman
-
Simple single-threaded class that calls tika-app against every file in a directory.
- StrawManTikaAppDriver(Path, Path, int, Path, String[]) - Constructor for class org.apache.tika.batch.fs.strawman.StrawManTikaAppDriver
-
- StreamOutRPWFSConsumer - Class in org.apache.tika.batch.fs
-
- StreamOutRPWFSConsumer(ArrayBlockingQueue<FileResource>, Parser, ContentHandlerFactory, OutputStreamFactory) - Constructor for class org.apache.tika.batch.fs.StreamOutRPWFSConsumer
-
- STRETCH_MODE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio stretch mode."
- StringsConfig - Class in org.apache.tika.parser.strings
-
Configuration for the "strings" (or strings-alternative) command.
- StringsConfig() - Constructor for class org.apache.tika.parser.strings.StringsConfig
-
Default constructor.
- StringsConfig(InputStream) - Constructor for class org.apache.tika.parser.strings.StringsConfig
-
Loads properties from InputStream and then tries to close InputStream.
- StringsEncoding - Enum in org.apache.tika.parser.strings
-
Character encoding of the strings that are to be found using the "strings" command.
- StringsParser - Class in org.apache.tika.parser.strings
-
Parser that uses the "strings" (or strings-alternative) command to find the
printable strings in an object, or other binary, file
(application/octet-stream).
- StringsParser() - Constructor for class org.apache.tika.parser.strings.StringsParser
-
- stringToAsciiBytes(String) - Method in class org.apache.tika.parser.chm.lzx.ChmSection
-
- STYLE_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- SUB_CLASS_OF_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- SUB_CLASS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- SUBJECT - Static variable in interface org.apache.tika.metadata.DublinCore
-
The topic of the content of the resource.
- SUBJECT - Static variable in class org.apache.tika.metadata.Metadata
-
- SUBJECT - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The document's subject.
- SUBJECT_CODE - Static variable in interface org.apache.tika.metadata.IPTC
-
Specifies one or more Subjects from the IPTC Subject-NewsCodes taxonomy
to categorise the content.
- SUBLOCATION - Static variable in interface org.apache.tika.metadata.IPTC
-
Name of a sublocation the content is focussing on -- either the
location shown in visual media or referenced by text or audio media.
- SubtreeMatcher - Class in org.apache.tika.sax.xpath
-
Evaluation state of a ...//... XPath expression.
- SubtreeMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.SubtreeMatcher
-
- summarize(File) - Method in class org.apache.tika.example.TrecDocumentGenerator
-
- SUMMARY_PROPERTY_PREFIX - Static variable in class org.apache.tika.parser.microsoft.JackcessParser
-
- SummaryExtractor - Class in org.apache.tika.parser.microsoft
-
Extractor for Common OLE2 (HPSF) metadata
- SummaryExtractor(Metadata) - Constructor for class org.apache.tika.parser.microsoft.SummaryExtractor
-
- SUPPLEMENTAL_CATEGORIES - Static variable in interface org.apache.tika.metadata.IPTC
-
Deprecated.
- SUPPLEMENTAL_CATEGORIES - Static variable in interface org.apache.tika.metadata.Photoshop
-
- SUPPORTED_MIMES - Static variable in class org.apache.tika.dl.imagerec.DL4JVGG16Net
-
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
-
- SUPPORTED_TYPES - Static variable in class org.apache.tika.parser.microsoft.ooxml.xwpf.ml2006.Word2006MLParser
-
- SVG_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- SXSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
SAX/Streaming pptx extractor
- SXSLFPowerPointExtractorDecorator(Metadata, ParseContext, XSLFEventBasedPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXSLFPowerPointExtractorDecorator
-
- SXWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
-
This is an experimental, alternative extractor for docx files.
- SXWPFWordExtractorDecorator(Metadata, ParseContext, XWPFEventBasedWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.SXWPFWordExtractorDecorator
-
- SYS_PROP_NER_IMPL - Static variable in class org.apache.tika.parser.ner.NamedEntityParser
-
- SystemUtils - Class in org.apache.tika.utils
-
Copied from commons-lang to avoid requiring the dependency
- SystemUtils() - Constructor for class org.apache.tika.utils.SystemUtils
-
- TAB - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- TABLE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- TABLE_COUNT - Static variable in interface org.apache.tika.metadata.Office
-
The number of Tables in the document
- TABLE_ID - Static variable in interface org.apache.tika.metadata.ClimateForcast
-
- TABLE_NAME - Static variable in interface org.apache.tika.metadata.Database
-
- TABLE_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- TABLE_PREFIX_A_KEY - Static variable in class org.apache.tika.eval.batch.ExtractComparerBuilder
-
- TABLE_PREFIX_B_KEY - Static variable in class org.apache.tika.eval.batch.ExtractComparerBuilder
-
- TABLE_PREFIX_KEY - Static variable in class org.apache.tika.eval.batch.ExtractProfilerBuilder
-
- TableInfo - Class in org.apache.tika.eval.db
-
- TableInfo(String, ColInfo...) - Constructor for class org.apache.tika.eval.db.TableInfo
-
- TableInfo(String, List<ColInfo>) - Constructor for class org.apache.tika.eval.db.TableInfo
-
- TagAndStyle(String, String) - Constructor for class org.apache.tika.parser.microsoft.WordExtractor.TagAndStyle
-
- TaggedContentHandler - Class in org.apache.tika.sax
-
A content handler decorator that tags potential exceptions so that the
handler that caused the exception can easily be identified.
- TaggedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TaggedContentHandler
-
Creates a tagging decorator for the given content handler.
- TaggedInputStream - Class in org.apache.tika.io
-
An input stream decorator that tags potential exceptions so that the
stream that caused the exception can easily be identified.
- TaggedInputStream(InputStream) - Constructor for class org.apache.tika.io.TaggedInputStream
-
Creates a tagging decorator for the given input stream.
- TaggedIOException - Exception in org.apache.tika.io
-
An IOException wrapper that tags the wrapped exception with
a given object reference.
- TaggedIOException(IOException, Object) - Constructor for exception org.apache.tika.io.TaggedIOException
-
Creates a tagged wrapper for the given exception.
- TaggedSAXException - Exception in org.apache.tika.sax
-
A SAXException wrapper that tags the wrapped exception with
a given object reference.
- TaggedSAXException(SAXException, Object) - Constructor for exception org.apache.tika.sax.TaggedSAXException
-
Creates a tagged wrapper for the given exception.
- tagName() - Method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
- TAGS_TABLE - Static variable in class org.apache.tika.eval.ExtractProfiler
-
- TAGS_TABLE_A - Static variable in class org.apache.tika.eval.ExtractComparer
-
- TAGS_TABLE_B - Static variable in class org.apache.tika.eval.ExtractComparer
-
- TailStream - Class in org.apache.tika.io
-
A specialized input stream implementation which records the last portion read
from an underlying stream.
- TailStream(InputStream, int) - Constructor for class org.apache.tika.io.TailStream
-
Creates a new instance of TailStream.
- TAPE_NAME - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The name of the tape from which the clip was captured, as set during
the capture process."
- TargetElement(QName, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
Creates a TargetElement; attributes of this element will
be mapped as specified.
- TargetElement(String, String, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
A shortcut that automatically creates the QName object
- TargetElement(QName) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
Creates a TargetElement with no attributes; all attributes
will be deleted from the SAX stream.
- TargetElement(String, String) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
-
A shortcut that automatically creates the QName object
- TarWriter - Class in org.apache.tika.server.writer
-
- TarWriter() - Constructor for class org.apache.tika.server.writer.TarWriter
-
- TaskStatus - Class in org.apache.tika.server
-
- TeeContentHandler - Class in org.apache.tika.sax
-
Content handler proxy that forwards the received SAX events to zero or
more underlying content handlers.
- TeeContentHandler(ContentHandler...) - Constructor for class org.apache.tika.sax.TeeContentHandler
-
- TEIDOMParser - Class in org.apache.tika.parser.journal
-
- TEIDOMParser() - Constructor for class org.apache.tika.parser.journal.TEIDOMParser
-
- TEMPLATE - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- TEMPLATE - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- templateID - Variable in class org.apache.tika.parser.rtf.ListDescriptor
-
- TEMPO - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The audio's tempo."
- TemporaryResources - Class in org.apache.tika.io
-
Utility class for tracking and ultimately closing or otherwise disposing
a collection of temporary resources.
- TemporaryResources() - Constructor for class org.apache.tika.io.TemporaryResources
-
- TensorflowImageRecParser - Class in org.apache.tika.parser.recognition.tf
-
- TensorflowImageRecParser() - Constructor for class org.apache.tika.parser.recognition.tf.TensorflowImageRecParser
-
- TensorflowRESTCaptioner - Class in org.apache.tika.parser.captioning.tf
-
Tensorflow image captioner.
- TensorflowRESTCaptioner() - Constructor for class org.apache.tika.parser.captioning.tf.TensorflowRESTCaptioner
-
- TensorflowRESTRecogniser - Class in org.apache.tika.parser.recognition.tf
-
Tensor Flow image recogniser which has high performance.
- TensorflowRESTRecogniser() - Constructor for class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- TensorflowRESTVideoRecogniser - Class in org.apache.tika.parser.recognition.tf
-
Tensor Flow video recogniser which has high performance.
- TensorflowRESTVideoRecogniser() - Constructor for class org.apache.tika.parser.recognition.tf.TensorflowRESTVideoRecogniser
-
- terms(String) - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- TesseractOCRConfig - Class in org.apache.tika.parser.ocr
-
Configuration for TesseractOCRParser.
- TesseractOCRConfig() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Default constructor.
- TesseractOCRConfig(InputStream) - Constructor for class org.apache.tika.parser.ocr.TesseractOCRConfig
-
Loads properties from InputStream and then tries to close InputStream.
- TesseractOCRConfig.OUTPUT_TYPE - Enum in org.apache.tika.parser.ocr
-
- TesseractOCRParser - Class in org.apache.tika.parser.ocr
-
TesseractOCRParser powered by tesseract-ocr engine.
- TesseractOCRParser() - Constructor for class org.apache.tika.parser.ocr.TesseractOCRParser
-
- testCompositeDocument() - Static method in class org.apache.tika.example.TIAParsingExample
-
- testHtmlMapper() - Static method in class org.apache.tika.example.TIAParsingExample
-
- testLocale() - Static method in class org.apache.tika.example.TIAParsingExample
-
- testTeeContentHandler(String) - Static method in class org.apache.tika.example.TIAParsingExample
-
- text(String) - Static method in class org.apache.tika.mime.MediaType
-
- TEXT_FILENAME - Static variable in class org.apache.tika.server.resource.UnpackerResource
-
- TEXT_HTML - Static variable in class org.apache.tika.mime.MediaType
-
- TEXT_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
-
- TEXT_PLAIN - Static variable in class org.apache.tika.mime.MediaType
-
- TextAndCSVParser - Class in org.apache.tika.parser.csv
-
- TextAndCSVParser() - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
-
- TextAndCSVParser(EncodingDetector) - Constructor for class org.apache.tika.parser.csv.TextAndCSVParser
-
- TextCell - Class in org.apache.tika.parser.microsoft
-
Text cell.
- TextCell(String) - Constructor for class org.apache.tika.parser.microsoft.TextCell
-
- TextContentHandler - Class in org.apache.tika.sax
-
- TextContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TextContentHandler
-
- TextContentHandler(ContentHandler, boolean) - Constructor for class org.apache.tika.sax.TextContentHandler
-
- TextDetector - Class in org.apache.tika.detect
-
Content type detection of plain text documents.
- TextDetector() - Constructor for class org.apache.tika.detect.TextDetector
-
Constructs a TextDetector which will look at the default number
of bytes from the beginning of the document.
- TextDetector(int) - Constructor for class org.apache.tika.detect.TextDetector
-
Constructs a TextDetector which will look at a given number of
bytes from the beginning of the document.
- TextLangDetector - Class in org.apache.tika.langdetect
-
Language Detection using MIT Lincoln Lab’s Text.jl library
https://github.com/trevorlewis/TextREST.jl
Please run the TextREST.jl server before using this.
- TextLangDetector() - Constructor for class org.apache.tika.langdetect.TextLangDetector
-
- TextMatcher - Class in org.apache.tika.sax.xpath
-
Final evaluation state of a .../text() XPath expression.
- TextMatcher() - Constructor for class org.apache.tika.sax.xpath.TextMatcher
-
- TextMessageBodyWriter - Class in org.apache.tika.server.writer
-
Returns simple text string for a particular metadata value.
- TextMessageBodyWriter() - Constructor for class org.apache.tika.server.writer.TextMessageBodyWriter
-
- TextStatistics - Class in org.apache.tika.detect
-
Utility class for computing a histogram of the bytes seen in a stream.
- TextStatistics() - Constructor for class org.apache.tika.detect.TextStatistics
-
- threshold(float) - Method in class org.apache.tika.mime.ProbabilisticMimeDetectionSelector.Builder
-
- THROW - Static variable in interface org.apache.tika.config.InitializableProblemHandler
-
- THROW - Static variable in interface org.apache.tika.config.LoadErrorHandler
-
Strategy that throws a RuntimeException with the given
throwable as the root cause, thus interrupting the entire service
loading operation.
- throwIfCauseOf(Exception) - Method in class org.apache.tika.io.TaggedInputStream
-
Re-throws the original exception thrown by this stream.
- throwIfCauseOf(SAXException) - Method in class org.apache.tika.sax.SecureContentHandler
-
- throwIfCauseOf(Exception) - Method in class org.apache.tika.sax.TaggedContentHandler
-
Re-throws the original exception thrown by this handler.
- THUMBNAIL - Static variable in interface org.apache.tika.metadata.RTFMetadata
-
If set to true, this means that an image file is probably a "thumbnail"
any time a pict/emf/wmf is in an object.
- TIAParsingExample - Class in org.apache.tika.example
-
- TIAParsingExample() - Constructor for class org.apache.tika.example.TIAParsingExample
-
- TIFF - Interface in org.apache.tika.metadata
-
XMP Exif TIFF schema.
- TiffParser - Class in org.apache.tika.parser.image
-
- TiffParser() - Constructor for class org.apache.tika.parser.image.TiffParser
-
- Tika - Class in org.apache.tika
-
Facade class for accessing Tika functionality.
- Tika(Detector, Parser) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector and parser instances, but the default Translator.
- Tika(Detector, Parser, Translator) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector, parser, and translator instances.
- Tika(TikaConfig) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given configuration.
- Tika() - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the default configuration.
- Tika(Detector) - Constructor for class org.apache.tika.Tika
-
Creates a Tika facade using the given detector instance, the
default parser configuration, and the default Translator.
- TIKA_CONFIG_PATH - Static variable in class org.apache.tika.parser.AutoDetectParserFactory
-
Path to a tika-config file.
- TIKA_CONTENT - Static variable in class org.apache.tika.parser.RecursiveParserWrapper
-
- TIKA_CONTENT - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
- TIKA_CONTENT_HANDLER - Static variable in class org.apache.tika.sax.AbstractRecursiveParserWrapperHandler
-
Simple class name of the content handler
- TIKA_LINK_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- TIKA_META_EXCEPTION_EMBEDDED_STREAM - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store exceptions caught while trying to read the
stream of an embedded resource.
- TIKA_META_EXCEPTION_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store parse exception information in the Metadata object.
- TIKA_META_EXCEPTION_WARNING - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to store exceptions caught during a parse that are
non-fatal, e.g.
- TIKA_META_PREFIX - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
Use this to prefix metadata properties that store information
about the parsing process.
- TIKA_MIME_FILE - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
-
- TIKA_UTI_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
-
- TikaActivator - Class in org.apache.tika.config
-
Bundle activator that adjusts the class loading mechanism of the
ServiceLoader class to work correctly in an OSGi environment.
- TikaActivator() - Constructor for class org.apache.tika.config.TikaActivator
-
- TikaCLI - Class in org.apache.tika.cli
-
Simple command line interface for Apache Tika.
- TikaCLI() - Constructor for class org.apache.tika.cli.TikaCLI
-
- TikaConfig - Class in org.apache.tika.config
-
Parse XML config file.
- TikaConfig(String) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(Path) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(Path, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(File) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(File, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(URL) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(URL, ClassLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(URL, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(InputStream) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(Document) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(Document, ServiceLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(Element) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(Element, ClassLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
- TikaConfig(ClassLoader) - Constructor for class org.apache.tika.config.TikaConfig
-
Creates a Tika configuration from the built-in media type rules
and all the Parser implementations available through the
service provider mechanism in the given class loader.
- TikaConfig() - Constructor for class org.apache.tika.config.TikaConfig
-
Creates a default Tika configuration.
- TikaConfigException - Exception in org.apache.tika.exception
-
Tika Config Exception is an exception that occurs when there is an error
in the Tika config file and/or one or more of the parsers failed to initialize
from that erroneous config.
- TikaConfigException(String) - Constructor for exception org.apache.tika.exception.TikaConfigException
-
Creates an instance of the exception.
- TikaConfigException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaConfigException
-
- TikaConfigSerializer - Class in org.apache.tika.config
-
- TikaConfigSerializer() - Constructor for class org.apache.tika.config.TikaConfigSerializer
-
- TikaConfigSerializer.Mode - Enum in org.apache.tika.config
-
- TikaCoreProperties - Interface in org.apache.tika.metadata
-
Contains a core set of basic Tika metadata properties, which all parsers
will attempt to supply (where the file format permits).
- TikaCoreProperties.EmbeddedResourceType - Enum in org.apache.tika.metadata
-
A file might contain different types of embedded documents.
- TikaDetectors - Class in org.apache.tika.server.resource
-
Provides details of all the Detectors registered with
Apache Tika, similar to --list-detectors with the Tika CLI.
- TikaDetectors() - Constructor for class org.apache.tika.server.resource.TikaDetectors
-
- TikaEvalCLI - Class in org.apache.tika.eval
-
- TikaEvalCLI() - Constructor for class org.apache.tika.eval.TikaEvalCLI
-
- TikaExcelDataFormatter - Class in org.apache.tika.parser.microsoft
-
Overrides Excel's General format to include more
significant digits than the MS Spec allows.
- TikaExcelDataFormatter() - Constructor for class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
-
- TikaExcelDataFormatter(Locale) - Constructor for class org.apache.tika.parser.microsoft.TikaExcelDataFormatter
-
- TikaExcelGeneralFormat - Class in org.apache.tika.parser.microsoft
-
A Format that allows up to 15 significant digits for integers.
- TikaExcelGeneralFormat(Locale) - Constructor for class org.apache.tika.parser.microsoft.TikaExcelGeneralFormat
-
- TikaException - Exception in org.apache.tika.exception
-
Tika exception
- TikaException(String) - Constructor for exception org.apache.tika.exception.TikaException
-
- TikaException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaException
-
- TikaFileTypeDetector - Class in org.apache.tika.filetypedetector
-
- TikaFileTypeDetector() - Constructor for class org.apache.tika.filetypedetector.TikaFileTypeDetector
-
- TikaGUI - Class in org.apache.tika.gui
-
Simple Swing GUI for Apache Tika.
- TikaGUI(Parser) - Constructor for class org.apache.tika.gui.TikaGUI
-
- TikaInputStream - Class in org.apache.tika.io
-
Input stream with extended capabilities.
- tikaInputStreamGetFile(String) - Static method in class org.apache.tika.example.TIAParsingExample
-
- TikaLoggingFilter - Class in org.apache.tika.server
-
- TikaLoggingFilter(boolean) - Constructor for class org.apache.tika.server.TikaLoggingFilter
-
- TikaMemoryLimitException - Exception in org.apache.tika.exception
-
- TikaMemoryLimitException(String) - Constructor for exception org.apache.tika.exception.TikaMemoryLimitException
-
- TikaMetadataKeys - Interface in org.apache.tika.metadata
-
Contains keys to properties in Metadata instances.
- TikaMimeKeys - Interface in org.apache.tika.metadata
-
A collection of Tika metadata keys used in Mime Type resolution
- TikaMimeTypes - Class in org.apache.tika.server.resource
-
Provides details of all the mimetypes known to Apache Tika,
similar to --list-supported-types with the Tika CLI.
- TikaMimeTypes() - Constructor for class org.apache.tika.server.resource.TikaMimeTypes
-
- TikaParsers - Class in org.apache.tika.server.resource
-
Provides details of all the Parsers registered with
Apache Tika, similar to --list-parsers and
--list-parser-details within the Tika CLI.
- TikaParsers() - Constructor for class org.apache.tika.server.resource.TikaParsers
-
- TikaResource - Class in org.apache.tika.server.resource
-
- TikaResource() - Constructor for class org.apache.tika.server.resource.TikaResource
-
- TikaServerCli - Class in org.apache.tika.server
-
- TikaServerCli() - Constructor for class org.apache.tika.server.TikaServerCli
-
- TikaServerParseException - Exception in org.apache.tika.server
-
Simple wrapper exception to be thrown for consistent handling
of exceptions that can happen during a parse.
- TikaServerParseException(String) - Constructor for exception org.apache.tika.server.TikaServerParseException
-
- TikaServerParseException(Exception) - Constructor for exception org.apache.tika.server.TikaServerParseException
-
- TikaServerParseExceptionMapper - Class in org.apache.tika.server
-
- TikaServerParseExceptionMapper(boolean) - Constructor for class org.apache.tika.server.TikaServerParseExceptionMapper
-
- TikaServerWatchDog - Class in org.apache.tika.server
-
- TikaServerWatchDog() - Constructor for class org.apache.tika.server.TikaServerWatchDog
-
- TikaToXMP - Class in org.apache.tika.xmp.convert
-
- TikaToXMP() - Constructor for class org.apache.tika.xmp.convert.TikaToXMP
-
- TikaVersion - Class in org.apache.tika.server.resource
-
- TikaVersion() - Constructor for class org.apache.tika.server.resource.TikaVersion
-
- TikaWelcome - Class in org.apache.tika.server.resource
-
Provides a basic welcome to the Apache Tika Server.
- TikaWelcome(List<ResourceProvider>) - Constructor for class org.apache.tika.server.resource.TikaWelcome
-
- TikaWelcome.Endpoint - Class in org.apache.tika.server.resource
-
- TIME - Static variable in interface org.apache.tika.parser.ner.NERecogniser
-
- TIME_FILE - Static variable in class org.apache.tika.parser.ner.opennlp.OpenNLPNERecogniser
-
- TIME_SIGNATURE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The time signature of the music."
- TIMED_OUT - Static variable in class org.apache.tika.batch.FileResourceConsumer
-
- TIMES_INSTANTIATED - Static variable in class org.apache.tika.config.TikaConfig
-
- TITLE - Static variable in interface org.apache.tika.metadata.DublinCore
-
A name given to the resource.
- TITLE - Static variable in interface org.apache.tika.metadata.IPTC
-
A shorthand reference for the item.
- TITLE - Static variable in class org.apache.tika.metadata.Metadata
-
- TITLE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- TNEFParser - Class in org.apache.tika.parser.microsoft
-
A POI-powered Tika Parser for TNEF (Transport Neutral
Encoding Format) messages, aka winmail.dat
- TNEFParser() - Constructor for class org.apache.tika.parser.microsoft.TNEFParser
-
- toByteArray(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a byte[].
- toByteArray(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a byte[]
using the default character encoding of the platform.
- toByteArray(Reader, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a byte[]
using the specified character encoding.
- toByteArray(String) - Static method in class org.apache.tika.io.IOUtils
-
- toCharArray(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a character array
using the default character encoding of the platform.
- toCharArray(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a character array
using the specified character encoding.
- toCharArray(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a character array.
- toGeoTag(Map<String, List<Location>>, String) - Method in class org.apache.tika.parser.geo.topic.GeoTag
-
- ToHTMLContentHandler - Class in org.apache.tika.sax
-
SAX event handler that serializes the HTML document to a character stream.
- ToHTMLContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToHTMLContentHandler
-
- ToHTMLContentHandler() - Constructor for class org.apache.tika.sax.ToHTMLContentHandler
-
- toInputStream(CharSequence) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified CharSequence to an input stream, encoded as bytes
using the default character encoding of the platform.
- toInputStream(CharSequence, String) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified CharSequence to an input stream, encoded as bytes
using the specified character encoding.
- toInputStream(String) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified string to an input stream, encoded as bytes
using the default character encoding of the platform.
- toInputStream(String, String) - Static method in class org.apache.tika.io.IOUtils
-
Convert the specified string to an input stream, encoded as bytes
using the specified character encoding.
- toJson(Metadata, Writer) - Static method in class org.apache.tika.metadata.serialization.JsonMetadata
-
Serializes a Metadata object to Json.
- toJson(List<Metadata>, Writer) - Static method in class org.apache.tika.metadata.serialization.JsonMetadataList
-
Serializes a list of Metadata objects to Json.
- TokenContraster - Class in org.apache.tika.eval.tokens
-
Computes some corpus contrast statistics.
- TokenContraster() - Constructor for class org.apache.tika.eval.tokens.TokenContraster
-
- TokenCounter - Class in org.apache.tika.eval.tokens
-
- TokenCounter(Analyzer) - Constructor for class org.apache.tika.eval.tokens.TokenCounter
-
- TokenCountPriorityQueue - Class in org.apache.tika.eval.tokens
-
- TokenIntPair - Class in org.apache.tika.eval.tokens
-
- TokenIntPair(String, int) - Constructor for class org.apache.tika.eval.tokens.TokenIntPair
-
- tokenize(String) - Static method in class org.apache.tika.parser.ner.opennlp.OpenNLPNameFinder
-
- TokenStatistics - Class in org.apache.tika.eval.tokens
-
- TokenStatistics(int, int, TokenIntPair[], double, SummaryStatistics) - Constructor for class org.apache.tika.eval.tokens.TokenStatistics
-
- TopCommonTokenCounter - Class in org.apache.tika.eval.tools
-
Utility class that reads in a UTF-8 input file with one document per row
and outputs the 20000 tokens with the highest document frequencies.
- TopCommonTokenCounter() - Constructor for class org.apache.tika.eval.tools.TopCommonTokenCounter
-
- topN - Variable in class org.apache.tika.parser.recognition.tf.TensorflowRESTRecogniser
-
- toResponse(TikaServerParseException) - Method in class org.apache.tika.server.TikaServerParseExceptionMapper
-
- toString() - Method in class org.apache.tika.batch.ParallelFileProcessingResult
-
- toString() - Method in class org.apache.tika.config.Param
-
- toString() - Method in class org.apache.tika.config.ParamField
-
- toString() - Method in class org.apache.tika.detect.MagicDetector
-
Returns a string representation of the Detection Rule.
- toString() - Method in class org.apache.tika.eval.tokens.TokenIntPair
-
- toString() - Method in class org.apache.tika.eval.tokens.TokenStatistics
-
- toString() - Method in class org.apache.tika.eval.tools.SlowCompositeReaderWrapper
-
- toString() - Method in class org.apache.tika.io.CountingInputStream
-
- toString(InputStream) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a String
using the default character encoding of the platform.
- toString(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of an InputStream as a String
using the specified character encoding.
- toString(Reader) - Static method in class org.apache.tika.io.IOUtils
-
Get the contents of a Reader as a String.
- toString(byte[]) - Static method in class org.apache.tika.io.IOUtils
-
- toString(byte[], String) - Static method in class org.apache.tika.io.IOUtils
-
- toString() - Method in class org.apache.tika.io.TaggedInputStream
-
- toString() - Method in class org.apache.tika.io.TikaInputStream
-
- toString() - Method in class org.apache.tika.language.detect.LanguageResult
-
- toString() - Method in class org.apache.tika.language.LanguageIdentifier
-
Deprecated.
- toString() - Method in class org.apache.tika.language.LanguageProfile
-
Deprecated.
- toString() - Method in class org.apache.tika.language.LanguageProfilerBuilder
-
Deprecated.
- toString() - Method in class org.apache.tika.metadata.Metadata
-
- toString() - Method in class org.apache.tika.mime.MediaType
-
- toString() - Method in class org.apache.tika.mime.MimeType
-
Returns the name of this media type.
- toString() - Method in class org.apache.tika.parser.captioning.CaptionObject
-
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmDirectoryListingSet
-
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmItsfHeader
-
Returns the values of the ChmItsfHeader as a string.
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmItspHeader
-
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcControlData
-
Returns a textual representation of ChmLzxcControlData.
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmLzxcResetTable
-
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmPmgiHeader
-
Returns a textual representation of the pmgi header.
- toString() - Method in class org.apache.tika.parser.chm.accessor.ChmPmglHeader
-
- toString() - Method in class org.apache.tika.parser.chm.accessor.DirectoryListingEntry
-
- toString() - Method in class org.apache.tika.parser.chm.lzx.ChmBlockInfo
-
Returns a textual representation of ChmBlockInfo.
- toString() - Method in class org.apache.tika.parser.chm.lzx.ChmLzxState
-
Returns a textual representation intended for informational output.
- toString() - Method in class org.apache.tika.parser.csv.CSVResult
-
- toString() - Method in class org.apache.tika.parser.dif.DIFContentHandler
-
- toString() - Method in class org.apache.tika.parser.microsoft.NumberCell
-
- toString() - Method in class org.apache.tika.parser.microsoft.TextCell
-
- toString() - Method in class org.apache.tika.parser.pdf.PDFParserConfig
-
- toString() - Method in class org.apache.tika.parser.recognition.RecognisedObject
-
- toString() - Method in enum org.apache.tika.parser.strings.StringsEncoding
-
- toString() - Method in class org.apache.tika.parser.txt.CharsetMatch
-
- toString() - Method in class org.apache.tika.sax.ContentHandlerDecorator
-
- toString() - Method in class org.apache.tika.sax.DIFContentHandler
-
- toString() - Method in class org.apache.tika.sax.Link
-
- toString() - Method in class org.apache.tika.sax.StandardReference
-
- toString() - Method in class org.apache.tika.sax.TextContentHandler
-
- toString() - Method in class org.apache.tika.sax.ToTextContentHandler
-
Returns the contents of the internal string buffer where
all the received characters have been collected.
- toString() - Method in class org.apache.tika.server.TaskStatus
-
- toString() - Method in class org.apache.tika.Tika
-
- toString() - Method in class org.apache.tika.xmp.XMPMetadata
-
Serializes the XMP data in compact form, without a packet wrapper.
- toTags(CharacterRun) - Static method in class org.apache.tika.parser.microsoft.FormattingUtils
-
- TOTAL_TIME - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- TOTAL_TIME - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLExtended
-
- ToTextContentHandler - Class in org.apache.tika.sax
-
SAX event handler that writes all character content out to a character
stream.
- ToTextContentHandler(Writer) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to
the given writer.
- ToTextContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to
the given output stream using the platform default encoding.
- ToTextContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events to
the given output stream using the given encoding.
- ToTextContentHandler() - Constructor for class org.apache.tika.sax.ToTextContentHandler
-
Creates a content handler that writes character events
to an internal string buffer.
- ToXMLContentHandler - Class in org.apache.tika.sax
-
SAX event handler that serializes the XML document to a character stream.
- ToXMLContentHandler(OutputStream, String) - Constructor for class org.apache.tika.sax.ToXMLContentHandler
-
Creates an XML serializer that writes to the given byte stream
using the given character encoding.
- ToXMLContentHandler(String) - Constructor for class org.apache.tika.sax.ToXMLContentHandler
-
- ToXMLContentHandler() - Constructor for class org.apache.tika.sax.ToXMLContentHandler
-
- TRACK_NUMBER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"A numeric value indicating the order of the audio file within its
original recording."
- TrainedModel - Class in org.apache.tika.detect
-
- TrainedModel() - Constructor for class org.apache.tika.detect.TrainedModel
-
- TrainedModelDetector - Class in org.apache.tika.detect
-
- TrainedModelDetector() - Constructor for class org.apache.tika.detect.TrainedModelDetector
-
- transferTo(long, long, WritableByteChannel) - Method in class org.apache.tika.parser.mp4.DirectFileReadDataSource
-
- TRANSITION_KEYWORDS_TO_DC_SUBJECT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- TRANSITION_SUBJECT_TO_DC_DESCRIPTION - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- TRANSITION_SUBJECT_TO_DC_TITLE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- TRANSITION_SUBJECT_TO_OO_SUBJECT - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- translate(String, String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
-
- translate(String, String) - Method in class org.apache.tika.language.translate.CachedTranslator
-
- translate(String, String, String) - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Translate, using the first available service-loaded translator
- translate(String, String) - Method in class org.apache.tika.language.translate.DefaultTranslator
-
Translate, using the first available service-loaded translator
- translate(String, String, String) - Method in class org.apache.tika.language.translate.EmptyTranslator
-
- translate(String, String) - Method in class org.apache.tika.language.translate.EmptyTranslator
-
- translate(String, String) - Method in class org.apache.tika.language.translate.ExternalTranslator
-
Default translate method, which uses Tika's built-in language identification.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.GoogleTranslator
-
- translate(String, String) - Method in class org.apache.tika.language.translate.GoogleTranslator
-
- translate(String, String, String) - Method in class org.apache.tika.language.translate.JoshuaNetworkTranslator
-
First checks whether the source language has been provided.
- translate(String, String) - Method in class org.apache.tika.language.translate.JoshuaNetworkTranslator
-
- translate(String, String, String) - Method in class org.apache.tika.language.translate.Lingo24Translator
-
- translate(String, String) - Method in class org.apache.tika.language.translate.Lingo24Translator
-
- translate(String, String, String) - Method in class org.apache.tika.language.translate.MicrosoftTranslator
-
Use the Microsoft service to translate the given text from the given source language to the given target.
- translate(String, String) - Method in class org.apache.tika.language.translate.MicrosoftTranslator
-
Use the Microsoft service to translate the given text to the given target language.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.MosesTranslator
-
- translate(String, String, String) - Method in interface org.apache.tika.language.translate.Translator
-
Translate text between given languages.
- translate(String, String) - Method in interface org.apache.tika.language.translate.Translator
-
Translate text to the given language.
This method attempts to auto-detect the source language of the text.
- translate(String, String, String) - Method in class org.apache.tika.language.translate.YandexTranslator
-
- translate(String, String) - Method in class org.apache.tika.language.translate.YandexTranslator
-
- translate(InputStream, String, String, String) - Method in class org.apache.tika.server.resource.TranslateResource
-
- translate(String, String, String) - Method in class org.apache.tika.Tika
-
Translate the given text String from the given source language to the given target language.
- translate(String, String) - Method in class org.apache.tika.Tika
-
Translate the given text String to the given language, attempting to auto-detect the source language.
- translate(InputStream, String, String) - Method in class org.apache.tika.Tika
-
Translate the given text InputStream from the given source language to the given target language.
- translate(InputStream, String) - Method in class org.apache.tika.Tika
-
Translate the given text InputStream to the given language, attempting to auto-detect the source language.
- TranslateResource - Class in org.apache.tika.server.resource
-
- TranslateResource(ServerStatus) - Constructor for class org.apache.tika.server.resource.TranslateResource
-
- Translator - Interface in org.apache.tika.language.translate
-
Interface for Translator services.
- TranslatorExample - Class in org.apache.tika.example
-
- TranslatorExample() - Constructor for class org.apache.tika.example.TranslatorExample
-
- TRANSMISSION_REFERENCE - Static variable in interface org.apache.tika.metadata.Photoshop
-
- TrecDocumentGenerator - Class in org.apache.tika.example
-
Generates document summaries for corpus analysis in the Open Relevance
project.
- TrecDocumentGenerator() - Constructor for class org.apache.tika.example.TrecDocumentGenerator
-
- trimMessage(String) - Static method in class org.apache.tika.utils.ExceptionUtils
-
Utility method to trim the message from a stack trace
string.
- TRUE - Static variable in class org.apache.tika.eval.AbstractProfiler
-
- TrueTypeParser - Class in org.apache.tika.parser.font
-
Parser for TrueType font files (TTF).
- TrueTypeParser() - Constructor for class org.apache.tika.parser.font.TrueTypeParser
-
- truncateContent(ContentTags, int, Map<Cols, String>) - Static method in class org.apache.tika.eval.AbstractProfiler
-
- tryToAdd(FileResource) - Method in class org.apache.tika.batch.FileResourceCrawler
-
- tryToFindExistingLeafParser(Class, ParseContext) - Static method in class org.apache.tika.extractor.EmbeddedDocumentUtil
-
Tries to find an existing parser within the ParseContext.
- tryToParse(String) - Method in class org.apache.tika.utils.DateUtils
-
Tries to parse the date string; returns null if no parse was possible.
- TSD_MIME_TYPE - Static variable in class org.apache.tika.parser.crypto.TSDParser
-
- TSDParser - Class in org.apache.tika.parser.crypto
-
Tika parser for Time Stamped Data Envelope (application/timestamped-data)
- TSDParser() - Constructor for class org.apache.tika.parser.crypto.TSDParser
-
- TXTParser - Class in org.apache.tika.parser.txt
-
Plain text parser.
- TXTParser() - Constructor for class org.apache.tika.parser.txt.TXTParser
-
- TXTParser(EncodingDetector) - Constructor for class org.apache.tika.parser.txt.TXTParser
-
- TYPE - Static variable in interface org.apache.tika.metadata.DublinCore
-
The nature or genre of the content of the resource.
- TYPE - Static variable in class org.apache.tika.metadata.Metadata
-
- TYPE - Static variable in interface org.apache.tika.metadata.TikaCoreProperties
-
- type - Variable in class org.apache.tika.mime.MimeTypesReader
-
Current type
- TypeDetector - Class in org.apache.tika.detect
-
Content type detection based on a content type hint.
- TypeDetector() - Constructor for class org.apache.tika.detect.TypeDetector
-
- types - Variable in class org.apache.tika.mime.MimeTypesReader
-
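The IOUtils entries toInputStream(String, String) and toString(InputStream, String) listed above describe a round trip between text and bytes under a named character encoding. The following is a minimal sketch of that idea using only JDK classes. It is not Tika's implementation, and the class and method names (StreamRoundTripSketch, toStream, readString) are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;

public class StreamRoundTripSketch {

    // Encode the text with the named charset and expose the bytes as a stream.
    // (Hypothetical helper; stands in for the idea behind toInputStream(String, String).)
    static InputStream toStream(String text, String encoding) {
        return new ByteArrayInputStream(text.getBytes(Charset.forName(encoding)));
    }

    // Drain the stream, then decode the collected bytes with the named charset.
    // (Hypothetical helper; stands in for the idea behind toString(InputStream, String).)
    static String readString(InputStream in, String encoding) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int read;
        while ((read = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, read);
        }
        return new String(buffer.toByteArray(), Charset.forName(encoding));
    }

    public static void main(String[] args) throws IOException {
        String original = "na\u00EFve caf\u00E9";
        try (InputStream in = toStream(original, "UTF-8")) {
            // The text survives the round trip when both sides use the same encoding.
            System.out.println(original.equals(readString(in, "UTF-8")));
        }
    }
}
```

Naming the encoding on both sides is the design point here: the single-argument variants fall back to the platform default encoding, so their results can differ from one machine to another.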
- valueOf(String) - Static method in enum org.apache.tika.batch.BatchProcess.BATCH_CONSTANTS
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.config.TikaConfigSerializer.Mode
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.eval.AbstractProfiler.EXCEPTION_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.eval.AbstractProfiler.PARSE_ERROR_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.eval.db.Cols
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.eval.db.JDBCUtil.CREATE_TABLE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.eval.io.ExtractReader.ALTER_METADATA_LIST
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.eval.io.ExtractReaderException.TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.language.detect.LanguageConfidence
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.metadata.Property.PropertyType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.metadata.Property.ValueType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.strings.StringsEncoding
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.server.ServerStatus.STATUS
-
Returns the enum constant of this type with the specified name.
- valueOf(String) - Static method in enum org.apache.tika.server.ServerStatus.TASK
-
Returns the enum constant of this type with the specified name.
- values() - Static method in enum org.apache.tika.batch.BatchProcess.BATCH_CONSTANTS
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.batch.fs.FSDirectoryCrawler.CRAWL_ORDER
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.batch.fs.FSOutputStreamFactory.COMPRESSION
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.batch.fs.FSUtil.HANDLE_EXISTING
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.config.TikaConfigSerializer.Mode
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.eval.AbstractProfiler.EXCEPTION_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.eval.AbstractProfiler.PARSE_ERROR_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.eval.db.Cols
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.eval.db.JDBCUtil.CREATE_TABLE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.eval.io.ExtractReader.ALTER_METADATA_LIST
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.eval.io.ExtractReaderException.TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.language.detect.LanguageConfidence
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.metadata.Property.PropertyType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.metadata.Property.ValueType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.metadata.TikaCoreProperties.EmbeddedResourceType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.EntryType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.IntelState
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.chm.core.ChmCommons.LzxState
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESAnnotationProperty
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ctakes.CTAKESSerializer
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.iwana.IWork13PackageParser.IWork13DocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.iwork.IWorkPackageParser.IWORKDocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.FormattingUtils.Tag
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.OfficeParser.POIFSDocumentType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.ooxml.OOXMLWordAndPowerPointTextHandler.EditType
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.microsoft.OutlookExtractor.RECIPIENT_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.ocr.TesseractOCRConfig.OUTPUT_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.pdf.PDFParserConfig.OCR_STRATEGY
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.strings.StringsEncoding
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.parser.utils.CommonsDigester.DigestAlgorithm
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.sax.BasicContentHandlerFactory.HANDLER_TYPE
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.server.ServerStatus.STATUS
-
Returns an array containing the constants of this enum type, in
the order they are declared.
- values() - Static method in enum org.apache.tika.server.ServerStatus.TASK
-
Returns an array containing the constants of this enum type, in
the order they are declared.
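The valueOf(String) and values() entries above are the two static methods the Java compiler generates for every enum type. The sketch below shows how they behave, using a hypothetical stand-in enum (Compression) rather than any of the Tika enums listed. The parse helper is likewise an assumption for illustration, not part of Tika.

```java
import java.util.Arrays;

public class EnumLookupSketch {

    // Hypothetical stand-in enum; not one of the Tika types listed above.
    enum Compression { NONE, GZIP, BZIP2 }

    // Resolve a constant by its exact name, returning the fallback when no
    // constant matches. valueOf is case-sensitive and throws
    // IllegalArgumentException for an unknown name.
    static Compression parse(String name, Compression fallback) {
        try {
            return Compression.valueOf(name);
        } catch (IllegalArgumentException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        // values() returns the constants in declaration order.
        System.out.println(Arrays.toString(Compression.values()));
        System.out.println(parse("GZIP", Compression.NONE));
        // Lower-case input does not match any constant, so the fallback is used.
        System.out.println(parse("gzip", Compression.NONE));
    }
}
```

Because valueOf matches names exactly, callers that accept user-supplied strings (configuration values, for example) usually normalise case or catch the exception, as the helper above does.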
- VERBATIM - Static variable in class org.apache.tika.parser.chm.core.ChmCommons
-
- VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
-
Deprecated.
- VERSION - Static variable in interface org.apache.tika.metadata.OfficeOpenXMLCore
-
The version number.
- VERSION - Static variable in interface org.apache.tika.metadata.QuattroPro
-
Version.
- video(String) - Static method in class org.apache.tika.mime.MediaType
-
- VIDEO_ALPHA_MODE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The alpha mode."
- VIDEO_ALPHA_UNITY_IS_TRANSPARENT - Static variable in interface org.apache.tika.metadata.XMPDM
-
"When true, unity is clear, when false, it is opaque."
- VIDEO_COLOR_SPACE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The color space."
- VIDEO_COMPRESSOR - Static variable in interface org.apache.tika.metadata.XMPDM
-
"Video compression used.
- VIDEO_FIELD_ORDER - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The field order for video."
- VIDEO_FRAME_RATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The video frame rate."
- VIDEO_MOD_DATE - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The date and time when the video was last modified."
- VIDEO_PIXEL_ASPECT_RATIO - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The aspect ratio, expressed as wd/ht.
- VIDEO_PIXEL_DEPTH - Static variable in interface org.apache.tika.metadata.XMPDM
-
"The size in bits of each color component of a pixel.
- VSD - Static variable in class org.apache.tika.parser.microsoft.POIFSContainerDetector
-
Microsoft Visio