A B C D E F G H I J K L M N O P R S T U V W X Z

A

AbstractOOXMLExtractor - Class in org.apache.tika.parser.microsoft.ooxml
Base class for all Tika OOXML extractors.
AbstractOOXMLExtractor(POIXMLTextExtractor, String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
add(String) - Method in class org.apache.tika.language.LanguageProfile
Adds a single occurrence of the given ngram to this profile.
add(String, long) - Method in class org.apache.tika.language.LanguageProfile
Adds multiple occurrences of the given ngram to this profile.
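
The two LanguageProfile.add entries above describe counting n-gram occurrences. The sketch below illustrates that idea with a plain map. NgramProfileSketch and its method bodies are hypothetical stand-ins written for this example, not the Tika implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an n-gram profile; not the Tika class itself.
class NgramProfileSketch {
    private final Map<String, Long> counts = new HashMap<>();
    private long total = 0;

    // Adds a single occurrence of the given n-gram.
    void add(String ngram) {
        add(ngram, 1);
    }

    // Adds multiple occurrences of the given n-gram.
    void add(String ngram, long count) {
        counts.merge(ngram, count, Long::sum);
        total += count;
    }

    // Number of occurrences recorded for one n-gram (0 if never added).
    long getCount(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }

    // Total number of occurrences recorded across all n-grams.
    long getCount() {
        return total;
    }
}
```

Keeping a running total next to the per-n-gram counts makes it cheap to turn counts into relative frequencies later, which is what a profile comparison would typically need.
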
add(String, String) - Method in class org.apache.tika.metadata.Metadata
Add a metadata name/value mapping.
addAlias(MediaType, MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
 
addAlias(String) - Method in class org.apache.tika.mime.MimeType
Adds an alias name for this media type.
addMetadata(String) - Method in class org.apache.tika.parser.xml.MetadataHandler
 
addPattern(MimeType, String) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
addPattern(MimeType, String, boolean) - Method in class org.apache.tika.mime.MimeTypes
Adds a file name pattern for the given media type.
addPrefix(String, String) - Method in class org.apache.tika.sax.xpath.XPathParser
 
ALIAS_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
ALIAS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
APPLICATION_NAME - Static variable in interface org.apache.tika.metadata.MSOffice
 
APPLICATION_VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
 
APPLICATION_XML - Static variable in class org.apache.tika.mime.MediaType
 
ArParser - Class in org.apache.tika.parser.pkg
Ar archive parser.
ArParser() - Constructor for class org.apache.tika.parser.pkg.ArParser
 
AttributeMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of a .../@* XPath expression.
AttributeMatcher() - Constructor for class org.apache.tika.sax.xpath.AttributeMatcher
 
AudioParser - Class in org.apache.tika.parser.audio
 
AudioParser() - Constructor for class org.apache.tika.parser.audio.AudioParser
 
AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
 
AutoDetectParser - Class in org.apache.tika.parser
 
AutoDetectParser() - Constructor for class org.apache.tika.parser.AutoDetectParser
Creates an auto-detecting parser instance using the default Tika configuration.
AutoDetectParser(TikaConfig) - Constructor for class org.apache.tika.parser.AutoDetectParser
 
available() - Method in class org.apache.tika.io.NullInputStream
Return the number of bytes that can be read.
available() - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's available() method.

B

BodyContentHandler - Class in org.apache.tika.sax
Content handler decorator that only passes everything inside the XHTML <body/> tag to the underlying handler.
BodyContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that passes all XHTML body events to the given underlying content handler.
BodyContentHandler(Writer) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to the given writer.
BodyContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to the given output stream using the default encoding.
BodyContentHandler() - Constructor for class org.apache.tika.sax.BodyContentHandler
Creates a content handler that writes XHTML body character events to an internal string buffer.
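
The BodyContentHandler entries above describe a handler that forwards only what lies inside the XHTML body element. The sketch below shows the same filtering idea using only the JDK's SAX classes. BodyTextSketch is a hypothetical class for illustration, it collects text into an internal buffer like the no-argument constructor describes, and it assumes a parser that is not namespace-aware (the JDK default), so the element name arrives in qName.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration: keeps character events only while inside <body>.
class BodyTextSketch extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    private int bodyDepth = 0; // greater than 0 while inside the body element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (bodyDepth > 0 || "body".equals(qName)) {
            bodyDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (bodyDepth > 0) {
            bodyDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (bodyDepth > 0) {
            text.append(ch, start, length);
        }
    }

    @Override
    public String toString() {
        return text.toString();
    }

    // Parses a well-formed XHTML string and returns the text found inside <body>.
    static String extract(String xhtml) throws Exception {
        BodyTextSketch handler = new BodyTextSketch();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xhtml)), handler);
        return handler.toString();
    }
}
```

With this sketch, text in the head element (such as a title) is dropped and only body text reaches the buffer.
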
BOM - Static variable in class org.apache.tika.parser.txt.CharsetMatch
Bit flag indicating the match is based on the presence of a BOM.
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
Populates the XHTMLContentHandler object received as parameter.
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
buildXHTML(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
 
ByteArrayOutputStream - Class in org.apache.tika.io
This class implements an output stream in which the data is written into a byte array.
ByteArrayOutputStream() - Constructor for class org.apache.tika.io.ByteArrayOutputStream
Creates a new byte array output stream.
ByteArrayOutputStream(int) - Constructor for class org.apache.tika.io.ByteArrayOutputStream
Creates a new byte array output stream, with a buffer capacity of the specified size, in bytes.
Bzip2Parser - Class in org.apache.tika.parser.pkg
Bzip2 parser.
Bzip2Parser() - Constructor for class org.apache.tika.parser.pkg.Bzip2Parser
 

C

CATEGORY - Static variable in interface org.apache.tika.metadata.MSOffice
 
Cell - Interface in org.apache.tika.parser.microsoft
Cell of content.
CellDecorator - Class in org.apache.tika.parser.microsoft
Cell decorator.
CellDecorator(Cell) - Constructor for class org.apache.tika.parser.microsoft.CellDecorator
 
CHARACTER_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
CHARACTER_COUNT_WITH_SPACES - Static variable in interface org.apache.tika.metadata.MSOffice
 
characters(char[], int, int) - Method in class org.apache.tika.parser.xml.MetadataHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
characters(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
Writes the given characters to the given character stream.
characters(char[], int, int) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
characters(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
characters(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
CharsetDetector - Class in org.apache.tika.parser.txt
CharsetDetector provides a facility for detecting the charset or encoding of character data in an unknown format.
CharsetDetector() - Constructor for class org.apache.tika.parser.txt.CharsetDetector
Constructor
CharsetMatch - Class in org.apache.tika.parser.txt
This class represents a charset that has been identified by a CharsetDetector as a possible encoding for a set of input data.
ChildMatcher - Class in org.apache.tika.sax.xpath
Intermediate evaluation state of a .../*... XPath expression.
ChildMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.ChildMatcher
 
ClassParser - Class in org.apache.tika.parser.asm
Parser for Java .class files.
ClassParser() - Constructor for class org.apache.tika.parser.asm.ClassParser
 
close() - Method in class org.apache.tika.io.ByteArrayOutputStream
Closing a ByteArrayOutputStream has no effect.
close() - Method in class org.apache.tika.io.CloseShieldInputStream
Replaces the underlying input stream with a ClosedInputStream sentinel.
close() - Method in class org.apache.tika.io.NullInputStream
Close this input stream - resets the internal state to the initial values.
close() - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's close() method.
close() - Method in class org.apache.tika.language.ProfilingWriter
 
close() - Method in class org.apache.tika.parser.ParsingReader
Closes the read end of the pipe.
close() - Method in class org.apache.tika.utils.RereadableInputStream
Closes the input stream and removes the temporary file if one was created.
ClosedInputStream - Class in org.apache.tika.io
Closed input stream.
ClosedInputStream() - Constructor for class org.apache.tika.io.ClosedInputStream
 
closeQuietly(Reader) - Static method in class org.apache.tika.io.IOUtils
Unconditionally close a Reader.
closeQuietly(Channel) - Static method in class org.apache.tika.io.IOUtils
Unconditionally close a Channel.
closeQuietly(Writer) - Static method in class org.apache.tika.io.IOUtils
Unconditionally close a Writer.
closeQuietly(InputStream) - Static method in class org.apache.tika.io.IOUtils
Unconditionally close an InputStream.
closeQuietly(OutputStream) - Static method in class org.apache.tika.io.IOUtils
Unconditionally close an OutputStream.
CloseShieldInputStream - Class in org.apache.tika.io
Proxy stream that prevents the underlying input stream from being closed.
CloseShieldInputStream(InputStream) - Constructor for class org.apache.tika.io.CloseShieldInputStream
Creates a proxy that shields the given input stream from being closed.
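
The CloseShieldInputStream entries describe a proxy whose close() swaps in a closed-stream sentinel instead of closing the wrapped stream. A minimal sketch of that pattern, assuming nothing beyond java.io, could look like this. CloseShieldSketch is a hypothetical name and the anonymous sentinel stands in for ClosedInputStream.

```java
import java.io.FilterInputStream;
import java.io.InputStream;

// Hypothetical illustration of the close-shield pattern; not the Tika class.
class CloseShieldSketch extends FilterInputStream {
    CloseShieldSketch(InputStream in) {
        super(in);
    }

    // Detaches from the underlying stream instead of closing it.
    // After this call, reads on the shield report end of stream.
    @Override
    public void close() {
        in = new InputStream() {
            @Override
            public int read() {
                return -1;
            }
        };
    }
}
```

This is useful when a stream must be handed to code that closes its input, while the caller still needs to keep reading from the original stream afterwards.
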
COMMENT_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
COMMENTS - Static variable in interface org.apache.tika.metadata.MSOffice
 
COMPANY - Static variable in interface org.apache.tika.metadata.MSOffice
 
compareTo(MimeType) - Method in class org.apache.tika.mime.MimeType
 
compareTo(Object) - Method in class org.apache.tika.parser.txt.CharsetMatch
Compare to other CharsetMatch objects.
CompositeDetector - Class in org.apache.tika.detect
Content type detector that combines multiple different detection mechanisms.
CompositeDetector(List<Detector>) - Constructor for class org.apache.tika.detect.CompositeDetector
 
CompositeMatcher - Class in org.apache.tika.sax.xpath
Composite XPath evaluation state.
CompositeMatcher(Matcher, Matcher) - Constructor for class org.apache.tika.sax.xpath.CompositeMatcher
 
CompositeParser - Class in org.apache.tika.parser
Composite parser that delegates parsing tasks to a component parser based on the declared content type of the incoming document.
CompositeParser() - Constructor for class org.apache.tika.parser.CompositeParser
 
CONTENT_DISPOSITION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_ENCODING - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LANGUAGE - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LENGTH - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_MD5 - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
CONTENT_STATUS - Static variable in interface org.apache.tika.metadata.MSOffice
 
CONTENT_TYPE - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
contentEquals(InputStream, InputStream) - Static method in class org.apache.tika.io.IOUtils
Compare the contents of two Streams to determine if they are equal or not.
contentEquals(Reader, Reader) - Static method in class org.apache.tika.io.IOUtils
Compare the contents of two Readers to determine if they are equal or not.
ContentHandlerDecorator - Class in org.apache.tika.sax
Decorator base class for the ContentHandler interface.
ContentHandlerDecorator(ContentHandler) - Constructor for class org.apache.tika.sax.ContentHandlerDecorator
Creates a decorator for the given SAX event handler.
CONTRIBUTOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity responsible for making contributions to the content of the resource.
copy(InputStream, OutputStream) - Static method in class org.apache.tika.io.IOUtils
Copy bytes from an InputStream to an OutputStream.
copy(InputStream, Writer) - Static method in class org.apache.tika.io.IOUtils
Copy bytes from an InputStream to chars on a Writer using the default character encoding of the platform.
copy(InputStream, Writer, String) - Static method in class org.apache.tika.io.IOUtils
Copy bytes from an InputStream to chars on a Writer using the specified character encoding.
copy(Reader, Writer) - Static method in class org.apache.tika.io.IOUtils
Copy chars from a Reader to a Writer.
copy(Reader, OutputStream) - Static method in class org.apache.tika.io.IOUtils
Copy chars from a Reader to bytes on an OutputStream using the default character encoding of the platform, and calling flush.
copy(Reader, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
Copy chars from a Reader to bytes on an OutputStream using the specified character encoding, and calling flush.
copyLarge(InputStream, OutputStream) - Static method in class org.apache.tika.io.IOUtils
Copy bytes from a large (over 2GB) InputStream to an OutputStream.
copyLarge(Reader, Writer) - Static method in class org.apache.tika.io.IOUtils
Copy chars from a large (over 2GB) Reader to a Writer.
CountingInputStream - Class in org.apache.tika.io
A decorating input stream that counts the number of bytes that have passed through the stream so far.
CountingInputStream(InputStream) - Constructor for class org.apache.tika.io.CountingInputStream
Constructs a new CountingInputStream.
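
The CountingInputStream entries describe a decorator that tracks how many bytes have passed through it. The sketch below shows one way to do that on top of java.io.FilterInputStream. CountingSketch is a hypothetical class, and counting skipped bytes is an assumption of this example rather than documented behaviour.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of a byte-counting decorator; not the Tika class.
class CountingSketch extends FilterInputStream {
    private long count = 0;

    CountingSketch(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        int b = in.read();
        if (b != -1) {
            count++;
        }
        return b;
    }

    // read(byte[]) in FilterInputStream delegates here, so it is counted too.
    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
        int n = in.read(buf, off, len);
        if (n > 0) {
            count += n;
        }
        return n;
    }

    // Assumption: skipped bytes are treated as having passed through the stream.
    @Override
    public long skip(long n) throws IOException {
        long skipped = in.skip(n);
        count += skipped;
        return skipped;
    }

    long getByteCount() {
        return count;
    }
}
```
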
COVERAGE - Static variable in interface org.apache.tika.metadata.DublinCore
The extent or scope of the content of the resource.
CpioParser - Class in org.apache.tika.parser.pkg
CPIO parser.
CpioParser() - Constructor for class org.apache.tika.parser.pkg.CpioParser
 
create() - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates an empty instance; same as calling new MimeTypes().
create(Document) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified document.
create(InputStream) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified input stream.
create(URL) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the resource at the location specified by the URL.
create(String) - Static method in class org.apache.tika.mime.MimeTypesFactory
Creates and returns a MimeTypes instance from the specified file path, as interpreted by the class loader in getResource().
createExtractor(POIXMLTextExtractor) - Static method in class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
 
CREATION_DATE - Static variable in interface org.apache.tika.metadata.MSOffice
 
CreativeCommons - Interface in org.apache.tika.metadata
A collection of Creative Commons property names.
CREATOR - Static variable in interface org.apache.tika.metadata.DublinCore
An entity primarily responsible for making the content of the resource.

D

DATE - Static variable in interface org.apache.tika.metadata.DublinCore
A date associated with an event in the life cycle of the resource.
DcXMLParser - Class in org.apache.tika.parser.xml
Dublin Core metadata parser
DcXMLParser() - Constructor for class org.apache.tika.parser.xml.DcXMLParser
 
DECLARED_ENCODING - Static variable in class org.apache.tika.parser.txt.CharsetMatch
Bit flag indicating the match is based on the declared encoding.
decode(String) - Static method in class org.apache.tika.mime.HexCoDec
Decode a hex string.
decode(char[]) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars.
decode(char[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Decode an array of hex chars.
DEFAULT_CONFIG_LOCATION - Static variable in class org.apache.tika.config.TikaConfig
 
DEFAULT_NGRAM_LENGTH - Static variable in class org.apache.tika.language.LanguageProfile
 
DelegatingParser - Class in org.apache.tika.parser
Base class for parser implementations that want to delegate parts of the task of parsing an input document to another parser.
DelegatingParser() - Constructor for class org.apache.tika.parser.DelegatingParser
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.ChildMatcher
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
Returns the XPath evaluation state that results from descending to a child element with the given name.
descend(String, String) - Method in class org.apache.tika.sax.xpath.NamedElementMatcher
 
descend(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
 
DESCRIPTION - Static variable in interface org.apache.tika.metadata.DublinCore
An account of the content of the resource.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.CompositeDetector
 
detect(InputStream, Metadata) - Method in interface org.apache.tika.detect.Detector
Detects the content type of the given input document.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.MagicDetector
 
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.NameDetector
Detects the content type of an input document based on the document name given in the input metadata.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TextDetector
Looks at the beginning of the document input stream to determine whether the document is text or not.
detect(InputStream, Metadata) - Method in class org.apache.tika.detect.TypeDetector
Detects the content type of an input document based on a type hint given in the input metadata.
detect(InputStream, Metadata) - Method in class org.apache.tika.mime.MimeTypes
Automatically detects the MIME type of a document based on magic markers in the stream prefix and any given metadata hints.
detect() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return the charset that best matches the supplied input data.
detect(InputStream, Metadata) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(InputStream) - Method in class org.apache.tika.Tika
Detects the media type of the given document.
detect(File) - Method in class org.apache.tika.Tika
Detects the media type of the given file.
detect(URL) - Method in class org.apache.tika.Tika
Detects the media type of the resource at the given URL.
detect(String) - Method in class org.apache.tika.Tika
Detects the media type of a document with the given file name.
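
Several detect entries above mention looking at magic markers in the stream prefix. The sketch below illustrates that technique with mark/reset so the caller can still read the stream from the start afterwards. MagicSketch, its two-entry signature table, and the eight-byte prefix length are assumptions made for this example, not Tika's detector or its type database.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical illustration of prefix-based type detection; not a Tika class.
class MagicSketch {
    private static final byte[] PDF = {'%', 'P', 'D', 'F', '-'};
    private static final byte[] ZIP = {'P', 'K', 3, 4};

    // Peeks at the first bytes of the stream and resets it before returning.
    static String detect(InputStream in) throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        byte[] prefix = new byte[8];
        int length = 0;
        in.mark(prefix.length);
        try {
            int n;
            while (length < prefix.length
                    && (n = in.read(prefix, length, prefix.length - length)) != -1) {
                length += n;
            }
        } finally {
            in.reset();
        }
        if (startsWith(prefix, length, PDF)) {
            return "application/pdf";
        }
        if (startsWith(prefix, length, ZIP)) {
            return "application/zip";
        }
        return "application/octet-stream"; // generic fallback for unknown content
    }

    private static boolean startsWith(byte[] data, int length, byte[] magic) {
        if (length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if (data[i] != magic[i]) {
                return false;
            }
        }
        return true;
    }
}
```

Streams without mark support, such as a raw FileInputStream, can be wrapped in a java.io.BufferedInputStream before being passed to a method like this.
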
detectAll() - Method in class org.apache.tika.parser.txt.CharsetDetector
Return an array of all charsets that appear to be plausible matches with the input data.
Detector - Interface in org.apache.tika.detect
Content type detector.
distance(LanguageProfile) - Method in class org.apache.tika.language.LanguageProfile
Calculates the geometric distance between this and the given other language profile.
DRAW_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
DublinCore - Interface in org.apache.tika.metadata
A collection of Dublin Core metadata names.

E

EDIT_TIME - Static variable in interface org.apache.tika.metadata.MSOffice
 
element(String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
ElementMappingContentHandler - Class in org.apache.tika.sax
Content handler decorator that maps element QNames using a Map.
ElementMappingContentHandler(ContentHandler, Map<QName, ElementMappingContentHandler.TargetElement>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler
 
ElementMappingContentHandler.TargetElement - Class in org.apache.tika.sax
 
ElementMappingContentHandler.TargetElement(QName, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
Creates a TargetElement; attributes of this element will be mapped as specified.
ElementMappingContentHandler.TargetElement(String, String, Map<QName, QName>) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
A shortcut that automatically creates the QName object
ElementMappingContentHandler.TargetElement(QName) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
Creates a TargetElement with no attributes; all attributes will be deleted from the SAX stream.
ElementMappingContentHandler.TargetElement(String, String) - Constructor for class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
A shortcut that automatically creates the QName object
ElementMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of an XPath expression that targets an element.
ElementMatcher() - Constructor for class org.apache.tika.sax.xpath.ElementMatcher
 
EmbeddedContentHandler - Class in org.apache.tika.sax
Content handler decorator that prevents the EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events from reaching the decorated handler.
EmbeddedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.EmbeddedContentHandler
Creates a decorator that prevents the given handler from receiving EmbeddedContentHandler.startDocument() and EmbeddedContentHandler.endDocument() events.
EmptyParser - Class in org.apache.tika.parser
Dummy parser that always produces an empty XHTML document without even attempting to parse the given document stream.
EmptyParser() - Constructor for class org.apache.tika.parser.EmptyParser
 
enableInputFilter(boolean) - Method in class org.apache.tika.parser.txt.CharsetDetector
Enable filtering of input text.
encode(byte[]) - Static method in class org.apache.tika.mime.HexCoDec
Hex encode an array of bytes.
encode(byte[], int, int) - Static method in class org.apache.tika.mime.HexCoDec
Hex encode an array of bytes.
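
The HexCoDec encode and decode entries describe converting between bytes and hex characters. The sketch below shows the underlying arithmetic. HexSketch is a hypothetical class, and the lowercase output and the IllegalArgumentException on bad input are choices made for this example, not documented HexCoDec behaviour.

```java
// Hypothetical illustration of hex encoding and decoding; not the Tika class.
class HexSketch {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Encodes each byte as two hex characters, high nibble first.
    static char[] encode(byte[] bytes) {
        char[] out = new char[bytes.length * 2];
        for (int i = 0; i < bytes.length; i++) {
            int v = bytes[i] & 0xff;
            out[2 * i] = DIGITS[v >>> 4];
            out[2 * i + 1] = DIGITS[v & 0x0f];
        }
        return out;
    }

    // Decodes a hex string (upper or lower case) back into bytes.
    static byte[] decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit at pair " + i);
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }
}
```
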
ENCODING_SCHEME - Static variable in class org.apache.tika.parser.txt.CharsetMatch
Bit flag indicating the match is based on the encoding scheme.
endDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
Ignored.
endDocument() - Method in class org.apache.tika.sax.TeeContentHandler
 
endDocument() - Method in class org.apache.tika.sax.TextContentHandler
 
endDocument() - Method in class org.apache.tika.sax.WriteOutContentHandler
Flushes the character stream so that no characters are forgotten in internal buffers.
endDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the XHTML document by writing the document footer and clearing the namespace mappings.
endElement(String, String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.parser.xml.MetadataHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endElement(String, String, String) - Method in class org.apache.tika.sax.ElementMappingContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
Ends the given element.
endElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
endElement(String, String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
endPrefixMapping(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
endPrefixMapping(String) - Method in class org.apache.tika.sax.TeeContentHandler
 
EpubContentParser - Class in org.apache.tika.parser.epub
Parser for EPUB OPS *.html files.
EpubContentParser() - Constructor for class org.apache.tika.parser.epub.EpubContentParser
 
EpubParser - Class in org.apache.tika.parser.epub
Epub parser
EpubParser() - Constructor for class org.apache.tika.parser.epub.EpubParser
 
equals(Object) - Method in class org.apache.tika.metadata.Metadata
 
equals(Object) - Method in class org.apache.tika.mime.MediaType
 
ErrorParser - Class in org.apache.tika.parser
Dummy parser that always throws a TikaException without even attempting to parse the given document stream.
ErrorParser() - Constructor for class org.apache.tika.parser.ErrorParser
 
ExcelExtractor - Class in org.apache.tika.parser.microsoft
Excel parser implementation which uses POI's Event API to handle the contents of a Workbook.
ExcelExtractor() - Constructor for class org.apache.tika.parser.microsoft.ExcelExtractor
 
ExternalParser - Class in org.apache.tika.parser
Parser that uses an external program (like catdoc or pdf2txt) to extract text content from a given document.
ExternalParser() - Constructor for class org.apache.tika.parser.ExternalParser
 
extract(Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
 
extractLinks(String) - Static method in class org.apache.tika.utils.RegexUtils
Extract URLs from plain text.
extractor - Variable in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
extractRootElement(byte[]) - Method in class org.apache.tika.detect.XmlRootExtractor
 

F

FAIL - Static variable in class org.apache.tika.sax.xpath.Matcher
State of a failed XPath evaluation, where nothing is matched.
flush() - Method in class org.apache.tika.language.ProfilingWriter
Ignored.
FORMAT - Static variable in interface org.apache.tika.metadata.DublinCore
Typically, Format may include the media-type or dimensions of the resource.
forName(String) - Method in class org.apache.tika.mime.MimeTypes
Returns the registered media type with the given name (or alias).
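
The forName entry above, together with the addAlias entries earlier in this index, describes looking up a media type by its name or by an alias. The sketch below shows one simple way to model that lookup. TypeRegistrySketch, its register method, and returning null for unknown names are assumptions of this example, and the type names in the usage note are only sample values.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of name-or-alias lookup; not the Tika registry.
class TypeRegistrySketch {
    // Maps every known name, canonical or alias, to its canonical name.
    private final Map<String, String> canonical = new HashMap<>();

    // Registers a canonical media type name.
    void register(String name) {
        canonical.put(name, name);
    }

    // Registers an alias for an already registered canonical name.
    void addAlias(String name, String alias) {
        if (!name.equals(canonical.get(name))) {
            throw new IllegalArgumentException("unknown canonical type: " + name);
        }
        canonical.put(alias, name);
    }

    // Returns the canonical name for a name or alias, or null if unknown.
    String forName(String nameOrAlias) {
        return canonical.get(nameOrAlias);
    }
}
```

For example, after register("application/xml") and addAlias("application/xml", "text/xml"), both forName("application/xml") and forName("text/xml") resolve to the same canonical name.
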

G

get(String) - Method in class org.apache.tika.metadata.Metadata
Get the value associated with a metadata name.
get(Class<T>) - Method in class org.apache.tika.parser.ParseContext
 
get(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
 
getAliases() - Method in class org.apache.tika.mime.MimeType
Returns the aliases of this media type.
getAllDetectableCharsets() - Static method in class org.apache.tika.parser.txt.CharsetDetector
Get the names of all char sets that can be recognized by the char set detector.
getAttributesMapping() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
 
getBaseType() - Method in class org.apache.tika.mime.MediaType
 
getByteCount() - Method in class org.apache.tika.io.CountingInputStream
The number of bytes that have passed through this stream.
getCause() - Method in exception org.apache.tika.io.TaggedIOException
Returns the wrapped exception.
getCause() - Method in exception org.apache.tika.sax.TaggedSAXException
Returns the wrapped exception.
getConfidence() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get an indication of the confidence in the charset detected.
getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.odf.OpenDocumentMetaParser
 
getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.xml.DcXMLParser
 
getContentHandler(ContentHandler, Metadata) - Method in class org.apache.tika.parser.xml.XMLParser
 
getContentParser() - Method in class org.apache.tika.parser.epub.EpubParser
 
getContentParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
getCount() - Method in class org.apache.tika.io.CountingInputStream
The number of bytes that have passed through this stream.
getCount() - Method in class org.apache.tika.language.LanguageProfile
 
getCount(String) - Method in class org.apache.tika.language.LanguageProfile
 
getDefaultConfig() - Static method in class org.apache.tika.config.TikaConfig
Provides a default configuration (TikaConfig).
getDefaultConfig(Parser) - Static method in class org.apache.tika.config.TikaConfig
Deprecated. This method will be removed in Apache Tika 1.0
getDescription() - Method in class org.apache.tika.mime.MimeType
Returns the description of this media type.
getDetector() - Method in class org.apache.tika.parser.AutoDetectParser
Returns the type detector used by this parser to auto-detect the type of a document.
getDocument() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getDocument() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
Returns the opened document.
getFallback() - Method in class org.apache.tika.parser.CompositeParser
Returns the fallback parser.
getLanguage() - Method in class org.apache.tika.language.LanguageIdentifier
 
getLanguage() - Method in class org.apache.tika.language.ProfilingHandler
Returns the language that best matches the current state of the language profile.
getLanguage() - Method in class org.apache.tika.language.ProfilingWriter
Returns the language that best matches the current state of the language profile.
getLanguage() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get the ISO code for the language of the detected charset.
getMappedTagName() - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
 
getMatchType() - Method in class org.apache.tika.parser.txt.CharsetMatch
Return flags indicating what it was about the input data that caused this charset to be considered as a possible match.
getMaximumCompressionRatio() - Method in class org.apache.tika.sax.SecureContentHandler
Returns the maximum compression ratio.
getMetadataExtractor() - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getMetadataExtractor() - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
POIXMLTextExtractor.getMetadataTextExtractor() not yet supported for OOXML by POI.
getMetaParser() - Method in class org.apache.tika.parser.epub.EpubParser
 
getMetaParser() - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
getMimeRepository() - Method in class org.apache.tika.config.TikaConfig
 
getMimeType(File) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a file.
getMimeType(URL) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a document from its URL.
getMimeType(String) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a document from its name.
getMimeType(byte[]) - Method in class org.apache.tika.mime.MimeTypes
Returns the MIME type that best matches the given first few bytes of a document stream.
getMimeType(InputStream) - Method in class org.apache.tika.mime.MimeTypes
Returns the MIME type that best matches the first few bytes of the given document stream.
getMimeType(String, byte[]) - Method in class org.apache.tika.mime.MimeTypes
Find the Mime Content Type of a document from its name and its content.
getMimeType(String, InputStream) - Method in class org.apache.tika.mime.MimeTypes
Returns the MIME type that best matches the given document name and the first few bytes of the given document stream.
getMinLength() - Method in class org.apache.tika.mime.MimeTypes
Return the minimum length of data to provide to analyzing methods based on the document's content in order to check all the known MimeTypes.
getName() - Method in class org.apache.tika.mime.MimeType
Returns the name of this media type.
getName() - Method in class org.apache.tika.parser.txt.CharsetMatch
Get the name of the detected charset.
getOutputThreshold() - Method in class org.apache.tika.sax.SecureContentHandler
Returns the configured output threshold.
getParameters() - Method in class org.apache.tika.mime.MediaType
 
getParser(String) - Method in class org.apache.tika.config.TikaConfig
Returns the parser instance configured for the given MIME type.
getParser(Metadata) - Method in class org.apache.tika.parser.CompositeParser
Returns the parser that best matches the given metadata.
getParser(String, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a parser that can handle the specified MIME type, and is set to receive input from a stream opened from the specified URL.
getParser(URL, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a parser that can handle the specified MIME type, and is set to receive input from a stream opened from the specified URL.
getParser(File, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Returns a parser that can handle the specified MIME type, and is set to receive input from a stream opened from the specified URL.
getParsers() - Method in class org.apache.tika.config.TikaConfig
 
getParsers() - Method in class org.apache.tika.parser.CompositeParser
Returns the component parsers.
getPosition() - Method in class org.apache.tika.io.NullInputStream
Return the current position.
getProfile() - Method in class org.apache.tika.language.ProfilingHandler
Returns the language profile being built by this content handler.
getProfile() - Method in class org.apache.tika.language.ProfilingWriter
Returns the language profile being built by this writer.
getQNameAsString(QName) - Static method in class org.apache.tika.sax.ElementMappingContentHandler
 
getReader(InputStream, String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Autodetect the charset of an inputStream, and return a Java Reader to access the converted input data.
getReader() - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a java.io.Reader for reading the Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getSize() - Method in class org.apache.tika.io.NullInputStream
Return the size this InputStream emulates.
getSize() - Method in class org.apache.tika.utils.RereadableInputStream
Returns the number of bytes read from the original stream.
getString(byte[], String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Autodetect the charset of an inputStream, and return a String containing the converted input data.
getString() - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getString(int) - Method in class org.apache.tika.parser.txt.CharsetMatch
Create a Java String from Unicode character data corresponding to the original byte data supplied to the Charset detect operation.
getStringContent(InputStream, TikaConfig, String) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(URL, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(URL, TikaConfig, String) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(File, TikaConfig, String) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getStringContent(File, TikaConfig) - Static method in class org.apache.tika.utils.ParseUtils
Gets the string content of a document read from an input stream.
getSubtype() - Method in class org.apache.tika.mime.MediaType
 
getSubTypes() - Method in class org.apache.tika.mime.MimeType
 
getSuperType() - Method in class org.apache.tika.mime.MimeType
Returns the parent of this media type.
getTag() - Method in exception org.apache.tika.io.TaggedIOException
Returns the object reference used as the tag of this exception.
getTag() - Method in exception org.apache.tika.sax.TaggedSAXException
Returns the object reference used as the tag of this exception.
getType() - Method in class org.apache.tika.mime.MediaType
 
getType(String, String, byte[]) - Method in class org.apache.tika.mime.MimeTypes
 
getType(URL) - Method in class org.apache.tika.mime.MimeTypes
Determines the MIME type of the resource pointed to by the specified URL.
getValues(String) - Method in class org.apache.tika.metadata.Metadata
Get the values associated with a metadata name.
getXHTML(ContentHandler, Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.AbstractOOXMLExtractor
 
getXHTML(ContentHandler, Metadata) - Method in interface org.apache.tika.parser.microsoft.ooxml.OOXMLExtractor
Parses the document into a sequence of XHTML SAX events sent to the given content handler.
GLOB_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
GzipParser - Class in org.apache.tika.parser.pkg
Gzip parser.
GzipParser() - Constructor for class org.apache.tika.parser.pkg.GzipParser
 

H

handleException(SAXException) - Method in class org.apache.tika.sax.ContentHandlerDecorator
Handle any exceptions thrown by methods in this class.
handleException(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
Tags any SAXExceptions thrown, wrapping and re-throwing.
handleIOException(IOException) - Method in class org.apache.tika.io.ProxyInputStream
Handle any IOExceptions thrown.
handleIOException(IOException) - Method in class org.apache.tika.io.TaggedInputStream
Tags any IOExceptions thrown, wrapping and re-throwing.
hashCode() - Method in class org.apache.tika.mime.MediaType
 
hasMagic() - Method in class org.apache.tika.mime.MimeType
 
HexCoDec - Class in org.apache.tika.mime
A set of Hex encoding and decoding utility methods.
HexCoDec() - Constructor for class org.apache.tika.mime.HexCoDec
 
HtmlParser - Class in org.apache.tika.parser.html
HTML parser.
HtmlParser() - Constructor for class org.apache.tika.parser.html.HtmlParser
 
HttpHeaders - Interface in org.apache.tika.metadata
A collection of HTTP header names.

I

IDENTIFIER - Static variable in interface org.apache.tika.metadata.DublinCore
Recommended best practice is to identify the resource by means of a string or number conforming to a formal identification system.
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SafeContentHandler
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.SecureContentHandler
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TeeContentHandler
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.TextContentHandler
 
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.WriteOutContentHandler
Writes the given ignorable characters to the given character stream.
ignorableWhitespace(char[], int, int) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
ImageParser - Class in org.apache.tika.parser.image
 
ImageParser() - Constructor for class org.apache.tika.parser.image.ImageParser
 
importStream(InputStream) - Method in class org.apache.tika.gui.TikaGUI
 
inputFilterEnabled() - Method in class org.apache.tika.parser.txt.CharsetDetector
Test whether or not input filtering is enabled.
INSTANCE - Static variable in class org.apache.tika.parser.EmptyParser
Singleton instance of this class.
INSTANCE - Static variable in class org.apache.tika.parser.ErrorParser
Singleton instance of this class.
INSTANCE - Static variable in class org.apache.tika.sax.xpath.AttributeMatcher
 
INSTANCE - Static variable in class org.apache.tika.sax.xpath.ElementMatcher
 
INSTANCE - Static variable in class org.apache.tika.sax.xpath.NodeMatcher
 
INSTANCE - Static variable in class org.apache.tika.sax.xpath.TextMatcher
 
IOExceptionWithCause - Exception in org.apache.tika.io
Subclasses IOException with the Throwable constructors missing before Java 6.
IOExceptionWithCause(String, Throwable) - Constructor for exception org.apache.tika.io.IOExceptionWithCause
Constructs a new instance with the given message and cause.
IOExceptionWithCause(Throwable) - Constructor for exception org.apache.tika.io.IOExceptionWithCause
Constructs a new instance with the given cause.
IOUtils - Class in org.apache.tika.io
General IO stream manipulation utilities.
IOUtils() - Constructor for class org.apache.tika.io.IOUtils
Instances should NOT be constructed in standard programming.
isCauseOf(IOException) - Method in class org.apache.tika.io.TaggedInputStream
Tests if the given exception was caused by this stream.
isCauseOf(SAXException) - Method in class org.apache.tika.sax.TaggedContentHandler
Tests if the given exception was caused by this handler.
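The tagging entries above (handleIOException, isCauseOf, getTag) describe a pattern in which a wrapper marks the exceptions it throws so that a caller can later tell which component failed. The following is a minimal standalone sketch of that pattern. The names TaggedIoSketch, TaggedException, Source and failingRead are hypothetical and are not the Tika classes; the identity comparison on the tag is an assumption made for this example.

```java
import java.io.IOException;

final class TaggedIoSketch {
    /** IOException that remembers which object tagged it (hypothetical class). */
    static final class TaggedException extends IOException {
        private final Object tag;

        TaggedException(IOException cause, Object tag) {
            super(cause.getMessage(), cause);
            this.tag = tag;
        }

        Object getTag() {
            return tag;
        }
    }

    /** Stand-in for a stream wrapper that tags every failure with itself. */
    static final class Source {
        void read(boolean fail) throws IOException {
            try {
                if (fail) {
                    throw new IOException("simulated read failure");
                }
            } catch (IOException e) {
                // Wrap and re-throw, using this object as the tag.
                throw new TaggedException(e, this);
            }
        }

        /** True only if the exception was tagged by this very instance. */
        boolean isCauseOf(IOException e) {
            return e instanceof TaggedException
                    && ((TaggedException) e).getTag() == this;
        }
    }

    /** Runs a failing read and returns the exception it produced. */
    static IOException failingRead(Source source) {
        try {
            source.read(true);
            throw new IllegalStateException("read was expected to fail");
        } catch (IOException e) {
            return e;
        }
    }
}
```

A caller that holds several such sources can catch one IOException and ask each source whether it was the origin, instead of guessing from the message text.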
isDescendantOf(MimeType) - Method in class org.apache.tika.mime.MimeType
 
isDiscardElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
Checks whether all content within the given HTML element should be discarded instead of including it in the parse output.
isInvalid(char) - Method in class org.apache.tika.sax.SafeContentHandler
Checks whether the given character (more accurately a UTF-16 code unit) is an invalid XML character and should be replaced for output.
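As an illustration of the kind of check isInvalid(char) describes, here is a minimal standalone sketch based on the Char production of XML 1.0. The class XmlCharCheck and the sanitize helper are hypothetical, and working on code points rather than UTF-16 code units is an assumption of this example, not the SafeContentHandler source.

```java
final class XmlCharCheck {
    /** Returns true if the code point is outside the XML 1.0 Char production. */
    static boolean isInvalid(int cp) {
        if (cp < 0x20) {
            // Only tab, line feed and carriage return are allowed below U+0020.
            return cp != 0x09 && cp != 0x0A && cp != 0x0D;
        } else if (cp <= 0xD7FF) {
            return false;
        } else if (cp < 0xE000) {
            return true; // unpaired surrogate
        } else if (cp <= 0xFFFD) {
            return false;
        } else if (cp < 0x10000) {
            return true; // U+FFFE and U+FFFF
        }
        return cp > 0x10FFFF;
    }

    /** Replaces every invalid code point with the given replacement character. */
    static String sanitize(String text, char replacement) {
        StringBuilder out = new StringBuilder(text.length());
        text.codePoints().forEach(cp -> {
            if (isInvalid(cp)) {
                out.append(replacement);
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }
}
```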
isListenForAllRecords() - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
Returns true if this parser is configured to listen for all records instead of just the specified few.
isMultiValued(String) - Method in class org.apache.tika.metadata.Metadata
Returns true if named value is multivalued.
isReasonablyCertain() - Method in class org.apache.tika.language.LanguageIdentifier
 
ISREGEX_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
isSpecializationOf(MediaType) - Method in class org.apache.tika.mime.MediaType
 
isValid(String) - Static method in class org.apache.tika.mime.MimeType
Checks that the given string is a valid Internet media type name based on rules from RFC 2045 section 5.1.
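A name check of this kind can be sketched with the token grammar for media types (RFC 2045, section 5.1): two non-empty tokens separated by a single slash, where a token contains no spaces, control characters, non-ASCII characters or tspecials. The class MediaTypeNameCheck below is hypothetical and is only one way to write such a check; it is not taken from the MimeType source.

```java
final class MediaTypeNameCheck {
    // The tspecials of RFC 2045; the slash is handled separately below.
    private static final String TSPECIALS = "()<>@,;:\\\"/[]?=";

    /** Returns true if name has the form token "/" token. */
    static boolean isValid(String name) {
        if (name == null) {
            return false;
        }
        boolean slashSeen = false;
        int tokenLength = 0;
        for (int i = 0; i < name.length(); i++) {
            char ch = name.charAt(i);
            if (ch == '/') {
                // Reject a second slash or an empty type part.
                if (slashSeen || tokenLength == 0) {
                    return false;
                }
                slashSeen = true;
                tokenLength = 0;
            } else if (ch <= ' ' || ch >= 127 || TSPECIALS.indexOf(ch) >= 0) {
                return false;
            } else {
                tokenLength++;
            }
        }
        // A slash must have been seen and the subtype must be non-empty.
        return slashSeen && tokenLength > 0;
    }
}
```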

J

JpegParser - Class in org.apache.tika.parser.jpeg
 
JpegParser() - Constructor for class org.apache.tika.parser.jpeg.JpegParser
 

K

KEYWORDS - Static variable in interface org.apache.tika.metadata.MSOffice
 

L

LANG_STATISTICS - Static variable in class org.apache.tika.parser.txt.CharsetMatch
Bit flag indicating the match is based on language statistics.
LANGUAGE - Static variable in interface org.apache.tika.metadata.DublinCore
A language of the intellectual content of the resource.
LanguageIdentifier - Class in org.apache.tika.language
Identifier of the language that best matches a given content profile.
LanguageIdentifier(LanguageProfile) - Constructor for class org.apache.tika.language.LanguageIdentifier
 
LanguageIdentifier(String) - Constructor for class org.apache.tika.language.LanguageIdentifier
 
LanguageProfile - Class in org.apache.tika.language
Language profile based on ngram counts.
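To make the idea of a profile "based on ngram counts" concrete, the sketch below counts fixed-length character ngrams in a piece of text. The class NgramProfile and its addText, getCount and getTotal methods are hypothetical names for this illustration; only the add(String) and add(String, long) shapes mirror entries listed in this index, and the real LanguageProfile may normalize text differently.

```java
import java.util.HashMap;
import java.util.Map;

final class NgramProfile {
    private final int length;
    private final Map<String, Long> counts = new HashMap<>();
    private long total = 0;

    NgramProfile(int length) {
        this.length = length;
    }

    /** Adds a single occurrence of the given ngram. */
    void add(String ngram) {
        add(ngram, 1);
    }

    /** Adds multiple occurrences of the given ngram. */
    void add(String ngram, long count) {
        if (ngram.length() != length) {
            throw new IllegalArgumentException(
                    "Expected an ngram of length " + length + ": " + ngram);
        }
        counts.merge(ngram, count, Long::sum);
        total += count;
    }

    /** Slides a window of the configured length over the text. */
    void addText(String text) {
        for (int i = 0; i + length <= text.length(); i++) {
            add(text.substring(i, i + length));
        }
    }

    long getCount(String ngram) {
        return counts.getOrDefault(ngram, 0L);
    }

    long getTotal() {
        return total;
    }
}
```

Two such profiles could then be compared by the relative frequency of each ngram, which is the usual basis for picking the best-matching language.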
LanguageProfile(int) - Constructor for class org.apache.tika.language.LanguageProfile
 
LanguageProfile() - Constructor for class org.apache.tika.language.LanguageProfile
 
LanguageProfile(String, int) - Constructor for class org.apache.tika.language.LanguageProfile
 
LanguageProfile(String) - Constructor for class org.apache.tika.language.LanguageProfile
 
LAST_AUTHOR - Static variable in interface org.apache.tika.metadata.MSOffice
 
LAST_MODIFIED - Static variable in interface org.apache.tika.metadata.HttpHeaders
 
LAST_PRINTED - Static variable in interface org.apache.tika.metadata.MSOffice
 
LAST_SAVED - Static variable in interface org.apache.tika.metadata.MSOffice
 
LICENSE_LOCATION - Static variable in interface org.apache.tika.metadata.CreativeCommons
 
LICENSE_URL - Static variable in interface org.apache.tika.metadata.CreativeCommons
 
LINE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
LinkedCell - Class in org.apache.tika.parser.microsoft
Linked cell.
LinkedCell(Cell, String) - Constructor for class org.apache.tika.parser.microsoft.LinkedCell
 
LOCAL_NAME_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
LOCATION - Static variable in interface org.apache.tika.metadata.HttpHeaders
 

M

MAGIC_PRIORITY_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MAGIC_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MagicDetector - Class in org.apache.tika.detect
Content type detection based on magic bytes.
MagicDetector(MediaType, byte[]) - Constructor for class org.apache.tika.detect.MagicDetector
Creates a detector for input documents that have the exact given byte pattern at the beginning of the document stream.
MagicDetector(MediaType, byte[], long) - Constructor for class org.apache.tika.detect.MagicDetector
Creates a detector for input documents that have the exact given byte pattern at the given offset of the document stream.
MagicDetector(MediaType, byte[], byte[], long, long) - Constructor for class org.apache.tika.detect.MagicDetector
Creates a detector for input documents that meet the specified magic match.
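The three constructors above suggest a match defined by a byte pattern, an optional mask and an offset range. The sketch below shows how such a match could be evaluated against the first bytes of a document. The class MagicMatch and its parameter names are hypothetical, and the reading of the five-argument constructor as (type, pattern, mask, offset range start, offset range end) is an assumption taken from the signature alone.

```java
final class MagicMatch {
    /**
     * Returns true if pattern occurs in data at some offset within
     * [offsetStart, offsetEnd], comparing only the bits selected by mask.
     * A null mask means that all bits are compared.
     */
    static boolean matches(byte[] data, byte[] pattern, byte[] mask,
                           int offsetStart, int offsetEnd) {
        for (int offset = offsetStart; offset <= offsetEnd; offset++) {
            if (offset + pattern.length > data.length) {
                return false; // not enough bytes left for a match
            }
            boolean match = true;
            for (int i = 0; match && i < pattern.length; i++) {
                int m = (mask == null) ? 0xFF : (mask[i] & 0xFF);
                match = (data[offset + i] & m) == (pattern[i] & m);
            }
            if (match) {
                return true;
            }
        }
        return false;
    }
}
```

With offsetStart and offsetEnd both 0 this reduces to the "exact pattern at the beginning of the stream" case; equal non-zero values give the "pattern at the given offset" case.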
main(String[]) - Static method in class org.apache.tika.cli.TikaCLI
 
main(String[]) - Static method in class org.apache.tika.gui.TikaGUI
Main method.
MANAGER - Static variable in interface org.apache.tika.metadata.MSOffice
 
mapAttributes(Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler.TargetElement
 
mapSafeElement(String) - Method in class org.apache.tika.parser.html.HtmlParser
Maps "safe" HTML element names to semantic XHTML equivalents.
mark(int) - Method in class org.apache.tika.io.NullInputStream
Mark the current position.
mark(int) - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's mark(int) method.
markSupported() - Method in class org.apache.tika.io.NullInputStream
Indicates whether mark is supported.
markSupported() - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's markSupported() method.
MATCH_MASK_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MATCH_OFFSET_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MATCH_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MATCH_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MATCH_VALUE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
Matcher - Class in org.apache.tika.sax.xpath
XPath element matcher.
Matcher() - Constructor for class org.apache.tika.sax.xpath.Matcher
 
matches(byte[]) - Method in class org.apache.tika.mime.MimeType
 
matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.AttributeMatcher
 
matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.CompositeMatcher
 
matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.Matcher
Returns true if the XPath expression matches the named attribute of the element associated with this evaluation state.
matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NamedAttributeMatcher
 
matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.NodeMatcher
 
matchesAttribute(String, String) - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
 
matchesElement() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
 
matchesElement() - Method in class org.apache.tika.sax.xpath.ElementMatcher
 
matchesElement() - Method in class org.apache.tika.sax.xpath.Matcher
Returns true if the XPath expression matches the element associated with this evaluation state.
matchesElement() - Method in class org.apache.tika.sax.xpath.NodeMatcher
 
matchesElement() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
 
matchesMagic(byte[]) - Method in class org.apache.tika.mime.MimeType
 
matchesText() - Method in class org.apache.tika.sax.xpath.CompositeMatcher
 
matchesText() - Method in class org.apache.tika.sax.xpath.Matcher
Returns true if the XPath expression matches all text nodes whose parent is the element associated with this evaluation state.
matchesText() - Method in class org.apache.tika.sax.xpath.NodeMatcher
 
matchesText() - Method in class org.apache.tika.sax.xpath.SubtreeMatcher
 
matchesText() - Method in class org.apache.tika.sax.xpath.TextMatcher
 
MatchingContentHandler - Class in org.apache.tika.sax.xpath
Content handler decorator that only passes the elements, attributes, and text nodes that match the given XPath expression.
MatchingContentHandler(ContentHandler, Matcher) - Constructor for class org.apache.tika.sax.xpath.MatchingContentHandler
 
MBOX_MIME_TYPE - Static variable in class org.apache.tika.parser.mbox.MboxParser
 
MBOX_RECORD_DIVIDER - Static variable in class org.apache.tika.parser.mbox.MboxParser
 
MboxParser - Class in org.apache.tika.parser.mbox
Mbox (mailbox) parser.
MboxParser() - Constructor for class org.apache.tika.parser.mbox.MboxParser
 
MediaType - Class in org.apache.tika.mime
Internet media type.
MediaType(String, String, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
 
MediaType(String, String) - Constructor for class org.apache.tika.mime.MediaType
 
MediaType(MediaType, Map<String, String>) - Constructor for class org.apache.tika.mime.MediaType
 
MediaTypeRegistry - Class in org.apache.tika.mime
Registry of Internet media types.
MediaTypeRegistry() - Constructor for class org.apache.tika.mime.MediaTypeRegistry
 
Metadata - Class in org.apache.tika.metadata
A multi-valued metadata container.
Metadata() - Constructor for class org.apache.tika.metadata.Metadata
Constructs a new, empty metadata.
MetadataExtractor - Class in org.apache.tika.parser.microsoft.ooxml
OOXML metadata extractor.
MetadataExtractor(POIXMLTextExtractor, String) - Constructor for class org.apache.tika.parser.microsoft.ooxml.MetadataExtractor
 
MetadataHandler - Class in org.apache.tika.parser.xml
 
MetadataHandler(Metadata, String) - Constructor for class org.apache.tika.parser.xml.MetadataHandler
 
MidiParser - Class in org.apache.tika.parser.audio
 
MidiParser() - Constructor for class org.apache.tika.parser.audio.MidiParser
 
MIME_INFO_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MIME_TYPE_MAGIC - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
 
MIME_TYPE_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MIME_TYPE_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
MimeType - Class in org.apache.tika.mime
Internet media type.
MimeTypeException - Exception in org.apache.tika.mime
A class to encapsulate MimeType related exceptions.
MimeTypeException(String) - Constructor for exception org.apache.tika.mime.MimeTypeException
Constructs a MimeTypeException with the specified detail message.
MimeTypeException(String, Throwable) - Constructor for exception org.apache.tika.mime.MimeTypeException
Constructs a MimeTypeException with the specified detail message and root cause.
MimeTypes - Class in org.apache.tika.mime
This class is a MimeType repository.
MimeTypes() - Constructor for class org.apache.tika.mime.MimeTypes
 
MimeTypesFactory - Class in org.apache.tika.mime
Creates instances of MimeTypes.
MimeTypesFactory() - Constructor for class org.apache.tika.mime.MimeTypesFactory
 
MimeTypesReaderMetKeys - Interface in org.apache.tika.mime
Met Keys used by the MimeTypesReader.
MODIFIED - Static variable in interface org.apache.tika.metadata.DublinCore
Date on which the resource was changed.
Mp3Parser - Class in org.apache.tika.parser.mp3
The Mp3Parser is used to parse ID3 Version 1 Tag information from an MP3 file, if available.
Mp3Parser() - Constructor for class org.apache.tika.parser.mp3.Mp3Parser
 
MSOffice - Interface in org.apache.tika.metadata
A collection of Microsoft Office documents property names.

N

NamedAttributeMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of a .../@name XPath expression.
NamedAttributeMatcher(String, String) - Constructor for class org.apache.tika.sax.xpath.NamedAttributeMatcher
 
NamedElementMatcher - Class in org.apache.tika.sax.xpath
Intermediate evaluation state of a .../name... XPath expression.
NamedElementMatcher(String, String, Matcher) - Constructor for class org.apache.tika.sax.xpath.NamedElementMatcher
 
NameDetector - Class in org.apache.tika.detect
Content type detection based on the resource name.
NameDetector(Map<Pattern, MediaType>) - Constructor for class org.apache.tika.detect.NameDetector
Creates a new content type detector based on the given name patterns.
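Name-based detection from a map of patterns can be sketched as a lookup that returns the type of the first pattern matching the resource name. The class NamePatternDetector is hypothetical, media types are represented as plain strings to keep the example self-contained, and falling back to application/octet-stream when nothing matches is an assumption of this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

final class NamePatternDetector {
    private final Map<Pattern, String> patterns;

    NamePatternDetector(Map<Pattern, String> patterns) {
        // Copy into a LinkedHashMap so the iteration order is predictable.
        this.patterns = new LinkedHashMap<>(patterns);
    }

    /** Returns the type of the first matching pattern, or a generic fallback. */
    String detect(String resourceName) {
        if (resourceName != null) {
            for (Map.Entry<Pattern, String> entry : patterns.entrySet()) {
                if (entry.getKey().matcher(resourceName).matches()) {
                    return entry.getValue();
                }
            }
        }
        return "application/octet-stream";
    }
}
```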
names() - Method in class org.apache.tika.metadata.Metadata
Returns an array of the names contained in the metadata.
NodeMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of a .../node() XPath expression.
NodeMatcher() - Constructor for class org.apache.tika.sax.xpath.NodeMatcher
 
NOTES - Static variable in interface org.apache.tika.metadata.MSOffice
 
NS_URI_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
NSNormalizerContentHandler - Class in org.apache.tika.parser.odf
Content handler decorator that maps old OpenOffice 1.0 namespaces to the OpenDocument ones and returns a fake DTD when the parser requests the OpenOffice DTD.
NSNormalizerContentHandler(ContentHandler) - Constructor for class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
NULL_OUTPUT_STREAM - Static variable in class org.apache.tika.io.NullOutputStream
A singleton.
NullInputStream - Class in org.apache.tika.io
A functional, lightweight InputStream that emulates a stream of a specified size.
NullInputStream(long) - Constructor for class org.apache.tika.io.NullInputStream
Create an InputStream that emulates a specified size which supports marking and does not throw EOFException.
NullInputStream(long, boolean, boolean) - Constructor for class org.apache.tika.io.NullInputStream
Create an InputStream that emulates a specified size with option settings.
NullOutputStream - Class in org.apache.tika.io
This OutputStream writes all data to the famous /dev/null.
NullOutputStream() - Constructor for class org.apache.tika.io.NullOutputStream
 
NumberCell - Class in org.apache.tika.parser.microsoft
Number cell.
NumberCell(double, NumberFormat) - Constructor for class org.apache.tika.parser.microsoft.NumberCell
 

O

OCTET_STREAM - Static variable in class org.apache.tika.mime.MediaType
 
OCTET_STREAM - Static variable in class org.apache.tika.mime.MimeTypes
Name of the root type, application/octet-stream.
OFFICE_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
OfficeParser - Class in org.apache.tika.parser.microsoft
Defines a Microsoft document content extractor.
OfficeParser() - Constructor for class org.apache.tika.parser.microsoft.OfficeParser
 
OfflineContentHandler - Class in org.apache.tika.sax
Content handler decorator that always returns an empty stream from the OfflineContentHandler.resolveEntity(String, String) method to prevent potential network or other external resources from being accessed by an XML parser.
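The technique of answering every entity lookup with an empty stream can be shown with the SAX classes in the JDK alone. In this minimal sketch the classes OfflineParsing and OfflineHandler are hypothetical and stand in for the decorator; the sample DTD URL used with it is likewise made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class OfflineParsing {
    /** Handler that answers every external entity request with an empty stream. */
    static final class OfflineHandler extends DefaultHandler {
        final StringBuilder text = new StringBuilder();
        int resolveCalls = 0;

        @Override
        public InputSource resolveEntity(String publicId, String systemId) {
            resolveCalls++;
            // An empty source keeps the parser from opening the system id.
            return new InputSource(new StringReader(""));
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    /** Parses the XML string and returns the handler holding the collected text. */
    static OfflineHandler parse(String xml) {
        OfflineHandler handler = new OfflineHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                    handler);
        } catch (Exception e) {
            throw new IllegalStateException("Parsing failed", e);
        }
        return handler;
    }
}
```

Parsing a document whose DOCTYPE points at a remote DTD then completes without any network access, because the parser receives the empty source instead of opening the URL.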
OfflineContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.OfflineContentHandler
 
OOXMLExtractor - Interface in org.apache.tika.parser.microsoft.ooxml
Interface implemented by all Tika OOXML extractors.
OOXMLExtractorFactory - Class in org.apache.tika.parser.microsoft.ooxml
Figures out the correct OOXMLExtractor for the supplied document and returns it.
OOXMLExtractorFactory() - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLExtractorFactory
 
OOXMLParser - Class in org.apache.tika.parser.microsoft.ooxml
Office Open XML (OOXML) parser.
OOXMLParser() - Constructor for class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
OpenDocumentContentParser - Class in org.apache.tika.parser.odf
Parser for ODF content.xml files.
OpenDocumentContentParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentContentParser
 
OpenDocumentMetaParser - Class in org.apache.tika.parser.odf
Parser for OpenDocument meta.xml files.
OpenDocumentMetaParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentMetaParser
 
OpenDocumentParser - Class in org.apache.tika.parser.odf
OpenOffice parser.
OpenDocumentParser() - Constructor for class org.apache.tika.parser.odf.OpenDocumentParser
 
OpenOfficeParser - Class in org.apache.tika.parser.opendocument
Deprecated. Use the OpenDocumentParser class instead. This class will be removed in Apache Tika 1.0.
OpenOfficeParser() - Constructor for class org.apache.tika.parser.opendocument.OpenOfficeParser
Deprecated.  
org.apache.tika - package org.apache.tika
 
org.apache.tika.cli - package org.apache.tika.cli
 
org.apache.tika.config - package org.apache.tika.config
 
org.apache.tika.detect - package org.apache.tika.detect
 
org.apache.tika.exception - package org.apache.tika.exception
 
org.apache.tika.gui - package org.apache.tika.gui
 
org.apache.tika.io - package org.apache.tika.io
 
org.apache.tika.language - package org.apache.tika.language
 
org.apache.tika.metadata - package org.apache.tika.metadata
A Multi-valued Metadata container, and set of constant fields for Tika Metadata.
org.apache.tika.mime - package org.apache.tika.mime
 
org.apache.tika.parser - package org.apache.tika.parser
 
org.apache.tika.parser.asm - package org.apache.tika.parser.asm
 
org.apache.tika.parser.audio - package org.apache.tika.parser.audio
 
org.apache.tika.parser.epub - package org.apache.tika.parser.epub
 
org.apache.tika.parser.html - package org.apache.tika.parser.html
 
org.apache.tika.parser.image - package org.apache.tika.parser.image
 
org.apache.tika.parser.jpeg - package org.apache.tika.parser.jpeg
 
org.apache.tika.parser.mbox - package org.apache.tika.parser.mbox
 
org.apache.tika.parser.microsoft - package org.apache.tika.parser.microsoft
 
org.apache.tika.parser.microsoft.ooxml - package org.apache.tika.parser.microsoft.ooxml
 
org.apache.tika.parser.mp3 - package org.apache.tika.parser.mp3
 
org.apache.tika.parser.odf - package org.apache.tika.parser.odf
 
org.apache.tika.parser.opendocument - package org.apache.tika.parser.opendocument
 
org.apache.tika.parser.pdf - package org.apache.tika.parser.pdf
 
org.apache.tika.parser.pkg - package org.apache.tika.parser.pkg
 
org.apache.tika.parser.rtf - package org.apache.tika.parser.rtf
 
org.apache.tika.parser.txt - package org.apache.tika.parser.txt
 
org.apache.tika.parser.xml - package org.apache.tika.parser.xml
 
org.apache.tika.sax - package org.apache.tika.sax
 
org.apache.tika.sax.xpath - package org.apache.tika.sax.xpath
 
org.apache.tika.utils - package org.apache.tika.utils
 

P

PackageParser - Class in org.apache.tika.parser.pkg
Abstract base class for parsers that deal with package formats.
PackageParser() - Constructor for class org.apache.tika.parser.pkg.PackageParser
 
PAGE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
PARAGRAPH_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
parse(String) - Static method in class org.apache.tika.mime.MediaType
Parses the given string to a media type.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.asm.ClassParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.asm.ClassParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.AudioParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.audio.AudioParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.audio.MidiParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.audio.MidiParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.AutoDetectParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.AutoDetectParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.CompositeParser
Delegates the call to the matching component parser.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.CompositeParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.DelegatingParser
Looks up the delegate parser from the parsing context and delegates the parse operation to it.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.DelegatingParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.EmptyParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.EmptyParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubContentParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.epub.EpubContentParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.epub.EpubParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.epub.EpubParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ErrorParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.ErrorParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ExternalParser
Executes the configured external command and passes the given document stream as a simple XHTML document to the given SAX content handler.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.ExternalParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.html.HtmlParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.html.HtmlParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.image.ImageParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.image.ImageParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.jpeg.JpegParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.jpeg.JpegParser
 
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mbox.MboxParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.mbox.MboxParser
 
parse(POIFSFileSystem, XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
Extracts text from an Excel workbook, writing the extracted content to the specified XHTML content handler.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.OfficeParser
Extracts properties and text from an MS Document input stream.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.microsoft.OfficeParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.microsoft.ooxml.OOXMLParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.mp3.Mp3Parser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.mp3.Mp3Parser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.odf.OpenDocumentContentParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in interface org.apache.tika.parser.Parser
Parses a document stream into a sequence of XHTML SAX events.
parse(InputStream, ContentHandler, Metadata) - Method in interface org.apache.tika.parser.Parser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserDecorator
Delegates the method call to the decorated parser.
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.ParserDecorator
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.ParserPostProcessor
Forwards the call to the delegated parser and post-processes the results as described above.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pdf.PDFParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.pdf.PDFParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.ArParser
Parses the given stream as an ar archive.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.Bzip2Parser
Parses the given stream as a bzip2 file.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.CpioParser
Parses the given stream as a cpio file.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.GzipParser
Parses the given stream as a gzip file.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.TarParser
Parses the given stream as a tar file.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.ZipParser
Parses the given stream as a Zip file.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.rtf.RTFParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.rtf.RTFParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.txt.TXTParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.txt.TXTParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(InputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.xml.XMLParser
 
parse(InputStream, ContentHandler, Metadata) - Method in class org.apache.tika.parser.xml.XMLParser
Deprecated. This method will be removed in Apache Tika 1.0.
parse(String) - Method in class org.apache.tika.sax.xpath.XPathParser
Parses the given simple XPath expression to an evaluation state initialized at the document node.
parse(InputStream, Metadata) - Method in class org.apache.tika.Tika
Parses the given document and returns the extracted text content.
parse(InputStream) - Method in class org.apache.tika.Tika
Parses the given document and returns the extracted text content.
parse(File) - Method in class org.apache.tika.Tika
Parses the given file and returns the extracted text content.
parse(URL) - Method in class org.apache.tika.Tika
Parses the resource at the given URL and returns the extracted text content.
parseArchive(ArchiveInputStream, ContentHandler, Metadata, ParseContext) - Method in class org.apache.tika.parser.pkg.PackageParser
Parses the given stream as a package of multiple underlying files.
ParseContext - Class in org.apache.tika.parser
Parse context.
ParseContext() - Constructor for class org.apache.tika.parser.ParseContext
 
Parser - Interface in org.apache.tika.parser
Tika parser interface.
ParserDecorator - Class in org.apache.tika.parser
Decorator base class for the Parser interface.
ParserDecorator(Parser) - Constructor for class org.apache.tika.parser.ParserDecorator
Creates a decorator for the given parser.
ParserPostProcessor - Class in org.apache.tika.parser
Parser decorator that post-processes the results from a decorated parser.
ParserPostProcessor(Parser) - Constructor for class org.apache.tika.parser.ParserPostProcessor
Creates a post-processing decorator for the given parser.
parseToString(InputStream, Metadata) - Method in class org.apache.tika.Tika
Parses the given document and returns the extracted text content.
parseToString(InputStream) - Method in class org.apache.tika.Tika
Parses the given document and returns the extracted text content.
parseToString(File) - Method in class org.apache.tika.Tika
Parses the given file and returns the extracted text content.
parseToString(URL) - Method in class org.apache.tika.Tika
Parses the resource at the given URL and returns the extracted text content.
ParseUtils - Class in org.apache.tika.utils
Contains utility methods for parsing documents.
ParseUtils() - Constructor for class org.apache.tika.utils.ParseUtils
 
ParsingReader - Class in org.apache.tika.parser
Reader for the text content from a given binary stream.
ParsingReader(InputStream) - Constructor for class org.apache.tika.parser.ParsingReader
Creates a reader for the text content of the given binary stream.
ParsingReader(InputStream, String) - Constructor for class org.apache.tika.parser.ParsingReader
Creates a reader for the text content of the given binary stream with the given name.
ParsingReader(File) - Constructor for class org.apache.tika.parser.ParsingReader
Creates a reader for the text content of the given file.
ParsingReader(Parser, InputStream, Metadata, ParseContext) - Constructor for class org.apache.tika.parser.ParsingReader
Creates a reader for the text content of the given binary stream with the given document metadata.
ParsingReader(Parser, InputStream, Metadata, ParseContext, Executor) - Constructor for class org.apache.tika.parser.ParsingReader
Creates a reader for the text content of the given binary stream with the given document metadata.
ParsingReader(Parser, InputStream, Metadata) - Constructor for class org.apache.tika.parser.ParsingReader
Deprecated. This constructor will be removed in Apache Tika 1.0.
ParsingReader(Parser, InputStream, Metadata, Executor) - Constructor for class org.apache.tika.parser.ParsingReader
Deprecated. This constructor will be removed in Apache Tika 1.0.
PASSWORD - Static variable in class org.apache.tika.parser.pdf.PDFParser
Metadata key for giving the document password to the parser.
PATTERN_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
PDFParser - Class in org.apache.tika.parser.pdf
PDF parser.
PDFParser() - Constructor for class org.apache.tika.parser.pdf.PDFParser
 
PLAIN_TEXT - Static variable in class org.apache.tika.mime.MimeTypes
Name of the text type, text/plain.
POIXMLTextExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
POIXMLTextExtractorDecorator(POIXMLTextExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.POIXMLTextExtractorDecorator
 
PRESENTATION_FORMAT - Static variable in interface org.apache.tika.metadata.MSOffice
 
PRESENTATION_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
process(String) - Method in class org.apache.tika.cli.TikaCLI
 
processByte() - Method in class org.apache.tika.io.NullInputStream
Return a byte value for the read() method.
processBytes(byte[], int, int) - Method in class org.apache.tika.io.NullInputStream
Process the bytes for the read(byte[], offset, length) method.
processingInstruction(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
processingInstruction(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
processingInstruction(String, String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
ProfilingHandler - Class in org.apache.tika.language
SAX content handler that builds a language profile based on all the received character content.
ProfilingHandler(ProfilingWriter) - Constructor for class org.apache.tika.language.ProfilingHandler
 
ProfilingHandler(LanguageProfile) - Constructor for class org.apache.tika.language.ProfilingHandler
 
ProfilingHandler() - Constructor for class org.apache.tika.language.ProfilingHandler
 
ProfilingWriter - Class in org.apache.tika.language
Writer that builds a language profile based on all the written content.
ProfilingWriter(LanguageProfile) - Constructor for class org.apache.tika.language.ProfilingWriter
 
ProfilingWriter() - Constructor for class org.apache.tika.language.ProfilingWriter
 
ProxyInputStream - Class in org.apache.tika.io
A proxy stream that acts as expected: it passes the method calls on to the proxied stream and doesn't change which methods are being called.
ProxyInputStream(InputStream) - Constructor for class org.apache.tika.io.ProxyInputStream
Constructs a new ProxyInputStream.
PUBLISHER - Static variable in interface org.apache.tika.metadata.DublinCore
An entity responsible for making the resource available.
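
The ProxyInputStream entry above describes a stream that forwards each call to the same method on the proxied stream. A minimal sketch of that idea using only JDK classes follows; the class name SketchProxyInputStream is hypothetical and this is not the Tika implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical sketch, not the Tika class: every call is forwarded to the
// identically named method on the wrapped stream, so read(byte[]) reaches
// delegate.read(byte[]) rather than being rerouted through another overload.
class SketchProxyInputStream extends InputStream {
    private final InputStream delegate;

    SketchProxyInputStream(InputStream delegate) {
        this.delegate = delegate;
    }

    @Override public int read() throws IOException { return delegate.read(); }
    @Override public int read(byte[] b) throws IOException { return delegate.read(b); }
    @Override public int read(byte[] b, int off, int len) throws IOException {
        return delegate.read(b, off, len);
    }
    @Override public long skip(long n) throws IOException { return delegate.skip(n); }
    @Override public int available() throws IOException { return delegate.available(); }
    @Override public void close() throws IOException { delegate.close(); }
    @Override public synchronized void mark(int readlimit) { delegate.mark(readlimit); }
    @Override public synchronized void reset() throws IOException { delegate.reset(); }
    @Override public boolean markSupported() { return delegate.markSupported(); }

    public static void main(String[] args) throws IOException {
        InputStream in = new SketchProxyInputStream(
                new ByteArrayInputStream(new byte[] {1, 2, 3}));
        byte[] buffer = new byte[2];
        System.out.println(in.read(buffer)); // prints 2 (bytes copied into buffer)
        System.out.println(in.read());       // prints 3 (the remaining byte)
        System.out.println(in.read());       // prints -1 (end of stream)
    }
}
```

Subclasses of such a proxy can override a single method to add behaviour while leaving the remaining calls untouched.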

R

read() - Method in class org.apache.tika.io.ClosedInputStream
Returns -1 to indicate that the stream is closed.
read(byte[]) - Method in class org.apache.tika.io.CountingInputStream
Reads a number of bytes into the byte array, keeping count of the number read.
read(byte[], int, int) - Method in class org.apache.tika.io.CountingInputStream
Reads a number of bytes into the byte array at a specific offset, keeping count of the number read.
read() - Method in class org.apache.tika.io.CountingInputStream
Reads the next byte of data adding to the count of bytes received if a byte is successfully read.
read() - Method in class org.apache.tika.io.NullInputStream
Read a byte.
read(byte[]) - Method in class org.apache.tika.io.NullInputStream
Read some bytes into the specified array.
read(byte[], int, int) - Method in class org.apache.tika.io.NullInputStream
Read the specified number of bytes into an array.
read() - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's read() method.
read(byte[]) - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's read(byte[]) method.
read(byte[], int, int) - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's read(byte[], int, int) method.
read(char[], int, int) - Method in class org.apache.tika.parser.ParsingReader
Reads parsed text from the pipe connected to the parsing thread.
read() - Method in class org.apache.tika.utils.RereadableInputStream
Reads a byte from the stream, saving it in the store if it is being read from the original stream.
readLines(InputStream) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a list of Strings, one entry per line, using the default character encoding of the platform.
readLines(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a list of Strings, one entry per line, using the specified character encoding.
readLines(Reader) - Static method in class org.apache.tika.io.IOUtils
Get the contents of a Reader as a list of Strings, one entry per line.
RegexUtils - Class in org.apache.tika.utils
Inspired by the Nutch class OutlinkExtractor.
RegexUtils() - Constructor for class org.apache.tika.utils.RegexUtils
 
RELATION - Static variable in interface org.apache.tika.metadata.DublinCore
A reference to a related resource.
remove(String) - Method in class org.apache.tika.metadata.Metadata
Remove a metadata entry and all its associated values.
render(XHTMLContentHandler) - Method in interface org.apache.tika.parser.microsoft.Cell
Renders the content to the given XHTML SAX event stream.
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.CellDecorator
 
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.LinkedCell
 
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.NumberCell
 
render(XHTMLContentHandler) - Method in class org.apache.tika.parser.microsoft.TextCell
 
RereadableInputStream - Class in org.apache.tika.utils
Wraps an input stream, reading it only once, but making it available for rereading an arbitrary number of times.
RereadableInputStream(InputStream, int, boolean, boolean) - Constructor for class org.apache.tika.utils.RereadableInputStream
Creates a rereadable input stream.
reset() - Method in class org.apache.tika.io.ByteArrayOutputStream
 
reset() - Method in class org.apache.tika.io.NullInputStream
Reset the stream to the point when mark was last called.
reset() - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's reset() method.
resetByteCount() - Method in class org.apache.tika.io.CountingInputStream
Set the byte count back to 0.
resetCount() - Method in class org.apache.tika.io.CountingInputStream
Set the byte count back to 0.
resolveEntity(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
Does not load any DTDs (may be requested by parser).
resolveEntity(String, String) - Method in class org.apache.tika.sax.OfflineContentHandler
Returns an empty stream.
RESOURCE_NAME_KEY - Static variable in interface org.apache.tika.metadata.TikaMetadataKeys
 
REVISION_NUMBER - Static variable in interface org.apache.tika.metadata.MSOffice
 
rewind() - Method in class org.apache.tika.utils.RereadableInputStream
"Rewinds" the stream to the beginning for rereading.
RIGHTS - Static variable in interface org.apache.tika.metadata.DublinCore
Information about rights held in and over the resource.
ROOT_XML_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
RTFParser - Class in org.apache.tika.parser.rtf
RTF parser.
RTFParser() - Constructor for class org.apache.tika.parser.rtf.RTFParser
 

S

SafeContentHandler - Class in org.apache.tika.sax
Content handler decorator that makes sure that the character events (SafeContentHandler.characters(char[], int, int) or SafeContentHandler.ignorableWhitespace(char[], int, int)) passed to the decorated content handler contain only valid XML characters.
SafeContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.SafeContentHandler
 
SafeContentHandler.Output - Interface in org.apache.tika.sax
Internal interface that allows both character and ignorable whitespace content to be filtered the same way.
SecureContentHandler - Class in org.apache.tika.sax
Content handler decorator that attempts to prevent denial of service attacks against Tika parsers.
SecureContentHandler(ContentHandler, CountingInputStream) - Constructor for class org.apache.tika.sax.SecureContentHandler
Decorates the given content handler with zip bomb prevention based on the count of bytes read from the given counting input stream.
SECURITY - Static variable in interface org.apache.tika.metadata.MSOffice
 
set(String, String) - Method in class org.apache.tika.metadata.Metadata
Set metadata name/value.
set(Class<T>, T) - Method in class org.apache.tika.parser.ParseContext
 
setAll(Properties) - Method in class org.apache.tika.metadata.Metadata
Copy all key-value pairs from properties.
setConfig(TikaConfig) - Method in class org.apache.tika.parser.AutoDetectParser
 
setContentParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
 
setContentParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
setDeclaredEncoding(String) - Method in class org.apache.tika.parser.txt.CharsetDetector
Set the declared encoding for charset detection.
setDescription(String) - Method in class org.apache.tika.mime.MimeType
Set the description of this media type.
setDetector(Detector) - Method in class org.apache.tika.parser.AutoDetectParser
Sets the type detector used by this parser to auto-detect the type of a document.
setDocumentLocator(Locator) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
setDocumentLocator(Locator) - Method in class org.apache.tika.sax.TeeContentHandler
 
setFallback(Parser) - Method in class org.apache.tika.parser.CompositeParser
Sets the fallback parser.
setListenForAllRecords(boolean) - Method in class org.apache.tika.parser.microsoft.ExcelExtractor
Specifies whether this parser should listen for all records or just for the specified few.
setMaximumCompressionRatio(long) - Method in class org.apache.tika.sax.SecureContentHandler
Sets the ratio between output characters and input bytes.
setMetaParser(Parser) - Method in class org.apache.tika.parser.epub.EpubParser
 
setMetaParser(Parser) - Method in class org.apache.tika.parser.odf.OpenDocumentParser
 
setOutputThreshold(long) - Method in class org.apache.tika.sax.SecureContentHandler
Sets the threshold for output characters before the zip bomb prevention is activated.
setParsers(Map<String, Parser>) - Method in class org.apache.tika.parser.CompositeParser
Sets the component parsers.
setSuperType(MimeType) - Method in class org.apache.tika.mime.MimeType
 
setText(byte[]) - Method in class org.apache.tika.parser.txt.CharsetDetector
Set the input text (byte) data whose charset is to be detected.
setText(InputStream) - Method in class org.apache.tika.parser.txt.CharsetDetector
Set the input text (byte) data whose charset is to be detected.
size() - Method in class org.apache.tika.io.ByteArrayOutputStream
Return the current size of the byte array.
size() - Method in class org.apache.tika.metadata.Metadata
Returns the number of metadata names in this metadata.
skip(long) - Method in class org.apache.tika.io.CountingInputStream
Skips the stream over the specified number of bytes, adding the skipped amount to the count.
skip(long) - Method in class org.apache.tika.io.NullInputStream
Skip a specified number of bytes.
skip(long) - Method in class org.apache.tika.io.ProxyInputStream
Invokes the delegate's skip(long) method.
skippedEntity(String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
skippedEntity(String) - Method in class org.apache.tika.sax.TeeContentHandler
 
skippedEntity(String) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
SLIDE_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
SOURCE - Static variable in interface org.apache.tika.metadata.DublinCore
A reference to a resource from which the present resource is derived.
startDocument() - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
startDocument() - Method in class org.apache.tika.sax.EmbeddedContentHandler
Ignored.
startDocument() - Method in class org.apache.tika.sax.TeeContentHandler
 
startDocument() - Method in class org.apache.tika.sax.TextContentHandler
 
startDocument() - Method in class org.apache.tika.sax.XHTMLContentHandler
Starts an XHTML document by setting up the namespace mappings.
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.parser.xml.MetadataHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.ElementMappingContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.TeeContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.XHTMLContentHandler
Starts the given element.
startElement(String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
startElement(String, String, String) - Method in class org.apache.tika.sax.XHTMLContentHandler
 
startElement(String, String, String, Attributes) - Method in class org.apache.tika.sax.xpath.MatchingContentHandler
 
startPrefixMapping(String, String) - Method in class org.apache.tika.parser.odf.NSNormalizerContentHandler
 
startPrefixMapping(String, String) - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
startPrefixMapping(String, String) - Method in class org.apache.tika.sax.TeeContentHandler
 
SUB_CLASS_OF_TAG - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
SUB_CLASS_TYPE_ATTR - Static variable in interface org.apache.tika.mime.MimeTypesReaderMetKeys
 
SUBJECT - Static variable in interface org.apache.tika.metadata.DublinCore
The topic of the content of the resource.
SubtreeMatcher - Class in org.apache.tika.sax.xpath
Evaluation state of a ...//... XPath expression.
SubtreeMatcher(Matcher) - Constructor for class org.apache.tika.sax.xpath.SubtreeMatcher
 
SVG_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 

T

TAB - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
TABLE_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
TaggedContentHandler - Class in org.apache.tika.sax
A content handler decorator that tags potential exceptions so that the handler that caused the exception can easily be identified.
TaggedContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TaggedContentHandler
Creates a tagging decorator for the given content handler.
TaggedInputStream - Class in org.apache.tika.io
An input stream decorator that tags potential exceptions so that the stream that caused the exception can easily be identified.
TaggedInputStream(InputStream) - Constructor for class org.apache.tika.io.TaggedInputStream
Creates a tagging decorator for the given input stream.
TaggedIOException - Exception in org.apache.tika.io
An IOException wrapper that tags the wrapped exception with a given object reference.
TaggedIOException(IOException, Object) - Constructor for exception org.apache.tika.io.TaggedIOException
Creates a tagged wrapper for the given exception.
TaggedSAXException - Exception in org.apache.tika.sax
A SAXException wrapper that tags the wrapped exception with a given object reference.
TaggedSAXException(SAXException, Object) - Constructor for exception org.apache.tika.sax.TaggedSAXException
Creates a tagged wrapper for the given exception.
TarParser - Class in org.apache.tika.parser.pkg
Tar parser.
TarParser() - Constructor for class org.apache.tika.parser.pkg.TarParser
 
TeeContentHandler - Class in org.apache.tika.sax
Content handler proxy that forwards the received SAX events to zero or more underlying content handlers.
TeeContentHandler(ContentHandler...) - Constructor for class org.apache.tika.sax.TeeContentHandler
 
TEMPLATE - Static variable in interface org.apache.tika.metadata.MSOffice
 
TEXT_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
TEXT_PLAIN - Static variable in class org.apache.tika.mime.MediaType
 
TextCell - Class in org.apache.tika.parser.microsoft
Text cell.
TextCell(String) - Constructor for class org.apache.tika.parser.microsoft.TextCell
 
TextContentHandler - Class in org.apache.tika.sax
Content handler decorator that only passes the TextContentHandler.characters(char[], int, int) and TextContentHandler.ignorableWhitespace(char[], int, int) (plus TextContentHandler.startDocument() and TextContentHandler.endDocument()) events to the decorated content handler.
TextContentHandler(ContentHandler) - Constructor for class org.apache.tika.sax.TextContentHandler
 
TextDetector - Class in org.apache.tika.detect
Content type detection of plain text documents.
TextDetector() - Constructor for class org.apache.tika.detect.TextDetector
 
TextMatcher - Class in org.apache.tika.sax.xpath
Final evaluation state of a .../text() XPath expression.
TextMatcher() - Constructor for class org.apache.tika.sax.xpath.TextMatcher
 
throwIfCauseOf(Exception) - Method in class org.apache.tika.io.TaggedInputStream
Re-throws the original exception thrown by this stream.
throwIfCauseOf(SAXException) - Method in class org.apache.tika.sax.SecureContentHandler
Converts the given SAXException to a corresponding TikaException if it's caused by this instance detecting a zip bomb.
throwIfCauseOf(Exception) - Method in class org.apache.tika.sax.TaggedContentHandler
Re-throws the original exception thrown by this handler.
Tika - Class in org.apache.tika
Facade class for accessing Tika functionality.
Tika(TikaConfig) - Constructor for class org.apache.tika.Tika
Creates a Tika facade using the given configuration.
Tika() - Constructor for class org.apache.tika.Tika
Creates a Tika facade using the default configuration.
TIKA_MIME_FILE - Static variable in interface org.apache.tika.metadata.TikaMimeKeys
 
TikaCLI - Class in org.apache.tika.cli
Simple command line interface for Apache Tika.
TikaCLI() - Constructor for class org.apache.tika.cli.TikaCLI
 
TikaConfig - Class in org.apache.tika.config
Parse XML config file.
TikaConfig(String) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(File) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(URL) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(InputStream) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(InputStream, Parser) - Constructor for class org.apache.tika.config.TikaConfig
Deprecated. This constructor will be removed in Apache Tika 1.0.
TikaConfig(Document) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(Document, Parser) - Constructor for class org.apache.tika.config.TikaConfig
Deprecated. This constructor will be removed in Apache Tika 1.0.
TikaConfig(Element) - Constructor for class org.apache.tika.config.TikaConfig
 
TikaConfig(Element, Parser) - Constructor for class org.apache.tika.config.TikaConfig
Deprecated. This constructor will be removed in Apache Tika 1.0.
TikaException - Exception in org.apache.tika.exception
Tika exception.
TikaException(String) - Constructor for exception org.apache.tika.exception.TikaException
 
TikaException(String, Throwable) - Constructor for exception org.apache.tika.exception.TikaException
 
TikaGUI - Class in org.apache.tika.gui
Simple Swing GUI for Apache Tika.
TikaGUI(Parser) - Constructor for class org.apache.tika.gui.TikaGUI
 
TikaMetadataKeys - Interface in org.apache.tika.metadata
Contains keys to properties in Metadata instances.
TikaMimeKeys - Interface in org.apache.tika.metadata
A collection of Tika metadata keys used in MIME type resolution.
TITLE - Static variable in interface org.apache.tika.metadata.DublinCore
A name given to the resource.
toBufferedInputStream(InputStream) - Static method in class org.apache.tika.io.ByteArrayOutputStream
Fetches the entire contents of an InputStream and represents the same data as the resulting InputStream.
toBufferedInputStream(InputStream) - Static method in class org.apache.tika.io.IOUtils
Fetches the entire contents of an InputStream and represents the same data as the resulting InputStream.
toByteArray() - Method in class org.apache.tika.io.ByteArrayOutputStream
Gets the current contents of this byte stream as a byte array.
toByteArray(InputStream) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a byte[].
toByteArray(Reader) - Static method in class org.apache.tika.io.IOUtils
Get the contents of a Reader as a byte[] using the default character encoding of the platform.
toByteArray(Reader, String) - Static method in class org.apache.tika.io.IOUtils
Get the contents of a Reader as a byte[] using the specified character encoding.
toByteArray(String) - Static method in class org.apache.tika.io.IOUtils
Deprecated. Use String.getBytes()
toCharArray(InputStream) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a character array using the default character encoding of the platform.
toCharArray(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a character array using the specified character encoding.
toCharArray(Reader) - Static method in class org.apache.tika.io.IOUtils
Get the contents of a Reader as a character array.
toInputStream(CharSequence) - Static method in class org.apache.tika.io.IOUtils
Convert the specified CharSequence to an input stream, encoded as bytes using the default character encoding of the platform.
toInputStream(CharSequence, String) - Static method in class org.apache.tika.io.IOUtils
Convert the specified CharSequence to an input stream, encoded as bytes using the specified character encoding.
toInputStream(String) - Static method in class org.apache.tika.io.IOUtils
Convert the specified string to an input stream, encoded as bytes using the default character encoding of the platform.
toInputStream(String, String) - Static method in class org.apache.tika.io.IOUtils
Convert the specified string to an input stream, encoded as bytes using the specified character encoding.
toString() - Method in class org.apache.tika.io.ByteArrayOutputStream
Gets the current contents of this byte stream as a string.
toString(String) - Method in class org.apache.tika.io.ByteArrayOutputStream
Gets the current contents of this byte stream as a string using the specified encoding.
toString(InputStream) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a String using the default character encoding of the platform.
toString(InputStream, String) - Static method in class org.apache.tika.io.IOUtils
Get the contents of an InputStream as a String using the specified character encoding.
toString(Reader) - Static method in class org.apache.tika.io.IOUtils
Get the contents of a Reader as a String.
toString(byte[]) - Static method in class org.apache.tika.io.IOUtils
Deprecated. Use String.String(byte[])
toString(byte[], String) - Static method in class org.apache.tika.io.IOUtils
Deprecated. Use String.String(byte[],String)
toString() - Method in class org.apache.tika.language.LanguageIdentifier
 
toString() - Method in class org.apache.tika.language.LanguageProfile
 
toString() - Method in class org.apache.tika.metadata.Metadata
 
toString() - Method in class org.apache.tika.mime.MediaType
 
toString() - Method in class org.apache.tika.mime.MimeType
Returns the name of this media type.
toString() - Method in class org.apache.tika.sax.ContentHandlerDecorator
 
toString() - Method in class org.apache.tika.sax.TextContentHandler
 
toString() - Method in class org.apache.tika.sax.WriteOutContentHandler
Returns the contents of the internal string buffer where all the received characters have been collected.
TOTAL_TIME - Static variable in interface org.apache.tika.metadata.MSOffice
 
TXTParser - Class in org.apache.tika.parser.txt
Plain text parser.
TXTParser() - Constructor for class org.apache.tika.parser.txt.TXTParser
 
TYPE - Static variable in interface org.apache.tika.metadata.DublinCore
The nature or genre of the content of the resource.
TypeDetector - Class in org.apache.tika.detect
Content type detection based on a content type hint.
TypeDetector() - Constructor for class org.apache.tika.detect.TypeDetector
 

U

unalias(MediaType) - Method in class org.apache.tika.mime.MediaTypeRegistry
 

V

VERSION - Static variable in interface org.apache.tika.metadata.MSOffice
 

W

WORD_COUNT - Static variable in interface org.apache.tika.metadata.MSOffice
 
WORK_TYPE - Static variable in interface org.apache.tika.metadata.CreativeCommons
 
write(byte[], int, int) - Method in class org.apache.tika.io.ByteArrayOutputStream
Writes the given bytes to the byte array.
write(int) - Method in class org.apache.tika.io.ByteArrayOutputStream
Writes a single byte to the byte array.
write(InputStream) - Method in class org.apache.tika.io.ByteArrayOutputStream
Writes the entire contents of the specified input stream to this byte stream.
write(byte[], OutputStream) - Static method in class org.apache.tika.io.IOUtils
Writes bytes from a byte[] to an OutputStream.
write(byte[], Writer) - Static method in class org.apache.tika.io.IOUtils
Writes bytes from a byte[] to chars on a Writer using the default character encoding of the platform.
write(byte[], Writer, String) - Static method in class org.apache.tika.io.IOUtils
Writes bytes from a byte[] to chars on a Writer using the specified character encoding.
write(char[], Writer) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a char[] to a Writer.
write(char[], OutputStream) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a char[] to bytes on an OutputStream using the default character encoding of the platform.
write(char[], OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a char[] to bytes on an OutputStream using the specified character encoding.
write(CharSequence, Writer) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a CharSequence to a Writer.
write(CharSequence, OutputStream) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a CharSequence to bytes on an OutputStream using the default character encoding of the platform.
write(CharSequence, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a CharSequence to bytes on an OutputStream using the specified character encoding.
write(String, Writer) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a String to a Writer.
write(String, OutputStream) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a String to bytes on an OutputStream using the default character encoding of the platform.
write(String, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
Writes chars from a String to bytes on an OutputStream using the specified character encoding.
write(StringBuffer, Writer) - Static method in class org.apache.tika.io.IOUtils
Deprecated. Replaced by write(CharSequence, Writer).
write(StringBuffer, OutputStream) - Static method in class org.apache.tika.io.IOUtils
Deprecated. Replaced by write(CharSequence, OutputStream).
write(StringBuffer, OutputStream, String) - Static method in class org.apache.tika.io.IOUtils
Deprecated. Replaced by write(CharSequence, OutputStream, String).
write(byte[], int, int) - Method in class org.apache.tika.io.NullOutputStream
Does nothing - output to /dev/null.
write(int) - Method in class org.apache.tika.io.NullOutputStream
Does nothing - output to /dev/null.
write(byte[]) - Method in class org.apache.tika.io.NullOutputStream
Does nothing - output to /dev/null.
write(char[], int, int) - Method in class org.apache.tika.language.ProfilingWriter
 
write(char[], int, int) - Method in interface org.apache.tika.sax.SafeContentHandler.Output
 
WriteOutContentHandler - Class in org.apache.tika.sax
SAX event handler that writes all character content out to a Writer character stream.
WriteOutContentHandler(Writer) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
Creates a content handler that writes character events to the given writer.
WriteOutContentHandler(OutputStream) - Constructor for class org.apache.tika.sax.WriteOutContentHandler
Creates a content handler that writes character events to the given output stream using the default encoding.
WriteOutContentHandler() - Constructor for class org.apache.tika.sax.WriteOutContentHandler
Creates a content handler that writes character events to an internal string buffer.
writeReplacement(SafeContentHandler.Output) - Method in class org.apache.tika.sax.SafeContentHandler
Outputs the replacement for an invalid character.
writeTo(OutputStream) - Method in class org.apache.tika.io.ByteArrayOutputStream
Writes the entire contents of this byte stream to the specified output stream.

X

XHTML - Static variable in class org.apache.tika.sax.XHTMLContentHandler
The XHTML namespace URI.
XHTMLContentHandler - Class in org.apache.tika.sax
Content handler decorator that simplifies the task of producing XHTML events for Tika content parsers.
XHTMLContentHandler(ContentHandler, Metadata) - Constructor for class org.apache.tika.sax.XHTMLContentHandler
 
XLINK_NS - Static variable in class org.apache.tika.parser.odf.OpenDocumentContentParser
 
XML - Static variable in class org.apache.tika.mime.MimeTypes
Name of the xml type, application/xml.
XMLParser - Class in org.apache.tika.parser.xml
XML parser.
XMLParser() - Constructor for class org.apache.tika.parser.xml.XMLParser
 
XmlRootExtractor - Class in org.apache.tika.detect
Utility class that uses a SAXParser to determine the namespace URI and local name of the root element of an XML file.
XmlRootExtractor() - Constructor for class org.apache.tika.detect.XmlRootExtractor
 
XPathParser - Class in org.apache.tika.sax.xpath
Parser for a very simple XPath subset.
XPathParser() - Constructor for class org.apache.tika.sax.xpath.XPathParser
 
XPathParser(String, String) - Constructor for class org.apache.tika.sax.xpath.XPathParser
 
XSLFPowerPointExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XSLFPowerPointExtractorDecorator(XSLFPowerPointExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSLFPowerPointExtractorDecorator
 
XSSFExcelExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XSSFExcelExtractorDecorator(XSSFExcelExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XSSFExcelExtractorDecorator
 
XWPFWordExtractorDecorator - Class in org.apache.tika.parser.microsoft.ooxml
 
XWPFWordExtractorDecorator(XWPFWordExtractor) - Constructor for class org.apache.tika.parser.microsoft.ooxml.XWPFWordExtractorDecorator
 
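The XmlRootExtractor entry above describes using a SAXParser to determine the namespace URI and local name of an XML document's root element. The sketch below illustrates that technique with the JDK's SAX API only. It is not Tika's code: the class RootElementSketch and the method rootOf are hypothetical, and stopping the parse by throwing a SAXException from the handler is one possible approach, assumed here for illustration.

```java
import java.io.ByteArrayInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class RootElementSketch {

    // Returns {namespaceURI, localName} of the root element,
    // or null if no root element could be read.
    static String[] rootOf(byte[] xml) {
        final String[] result = new String[2];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed to get URI and local name
            factory.newSAXParser().parse(new ByteArrayInputStream(xml),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                String qName, Attributes attributes)
                                throws SAXException {
                            // The first start tag is the root: record it and
                            // abort the parse, since nothing else is needed.
                            result[0] = uri;
                            result[1] = localName;
                            throw new SAXException("root element found");
                        }
                    });
        } catch (Exception e) {
            // Either the deliberate abort above or a genuine parse failure;
            // result[1] tells the two cases apart.
        }
        return result[1] == null ? null : result;
    }

    public static void main(String[] args) throws Exception {
        String[] root = rootOf("<a:doc xmlns:a='urn:x'><b/></a:doc>".getBytes("UTF-8"));
        System.out.println(root[0] + " " + root[1]); // prints "urn:x doc"
    }
}
```

Aborting after the first start tag means only the beginning of the stream has to be read, which suits content type detection on large documents.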

Z

ZipParser - Class in org.apache.tika.parser.pkg
Zip file parser.
ZipParser() - Constructor for class org.apache.tika.parser.pkg.ZipParser
 


Copyright © 2010 The Apache Software Foundation. All Rights Reserved.