Package org.apache.tika.example
Class ContentHandlerExample
java.lang.Object
org.apache.tika.example.ContentHandlerExample
Examples of using different content handlers to
get different parts of a file's contents.
-
Field Summary
-
Constructor Summary
-
Method Summary
Method                   Description
parseBodyToHTML          Example of extracting just the body as HTML, without the head part, as a string.
parseOnePartToHTML       Example of extracting just one part of the document's body, as HTML as a string, excluding the rest.
parseToHTML              Example of extracting the contents as HTML, as a string.
parseToPlainText         Example of extracting the plain text of the contents.
parseToPlainTextChunks   Example of extracting the plain text in chunks, with each chunk of no more than a certain maximum size.
-
Field Details
-
MAXIMUM_TEXT_CHUNK_SIZE
protected final int MAXIMUM_TEXT_CHUNK_SIZE
-
-
Constructor Details
-
ContentHandlerExample
public ContentHandlerExample()
-
-
Method Details
-
parseToPlainText
Example of extracting the plain text of the contents. Returns only the "body" part of the document.
Throws:
IOException
SAXException
TikaException
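The idea of returning only the "body" text can be sketched with a plain SAX handler. This is a minimal illustration that uses only the JDK's org.xml.sax types; the class name BodyTextHandler is hypothetical, and this is not the code behind parseToPlainText. It assumes the parser reports elements with unprefixed qualified names such as "body".

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: collects character data found inside a body element only. */
class BodyTextHandler extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    // 0 while outside <body>; counts nesting depth once <body> has been entered.
    private int bodyDepth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (bodyDepth > 0 || "body".equals(qName)) {
            bodyDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (bodyDepth > 0) {
            bodyDepth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Text in the head (for example the title) is ignored because bodyDepth is 0 there.
        if (bodyDepth > 0) {
            text.append(ch, start, length);
        }
    }

    /** Returns the text collected so far. */
    @Override
    public String toString() {
        return text.toString();
    }
}
```

A handler like this could be passed to any SAX-style parser; after parsing, toString() yields the collected body text.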
-
parseToHTML
Example of extracting the contents as HTML, as a string.
Throws:
IOException
SAXException
TikaException
-
parseBodyToHTML
Example of extracting just the body as HTML, without the head part, as a string.
Throws:
IOException
SAXException
TikaException
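As a rough illustration of "body as HTML, without the head", the sketch below re-serializes only the descendants of the body element as markup. It relies on the JDK's org.xml.sax types alone; BodyMarkupHandler and its escape helper are hypothetical names, and the output format is an assumption rather than what parseBodyToHTML actually produces.

```java
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: writes the markup found inside a body element to a string. */
class BodyMarkupHandler extends DefaultHandler {
    private final StringBuilder out = new StringBuilder();
    // 0 outside <body>, 1 directly at <body>, greater than 1 inside its descendants.
    private int depth = 0;

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (depth > 0) {
            out.append('<').append(qName);
            for (int i = 0; atts != null && i < atts.getLength(); i++) {
                out.append(' ').append(atts.getQName(i)).append("=\"")
                   .append(escape(atts.getValue(i))).append('"');
            }
            out.append('>');
            depth++;
        } else if ("body".equals(qName)) {
            depth = 1; // enter the body; the <body> tag itself is not emitted
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (depth > 1) {
            out.append("</").append(qName).append('>');
        }
        if (depth > 0) {
            depth--;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (depth > 0) {
            out.append(escape(new String(ch, start, length)));
        }
    }

    /** Escapes the characters that are significant in markup. */
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    @Override
    public String toString() {
        return out.toString();
    }
}
```

Everything reported before the body element opens, such as the title, is dropped, while elements and text inside the body are written back out with escaping applied.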
-
parseOnePartToHTML
Example of extracting just one part of the document's body as an HTML string, excluding the rest.
Throws:
IOException
SAXException
TikaException
-
parseToPlainTextChunks
Example of extracting the plain text in chunks, with each chunk of no more than a certain maximum size.
Throws:
IOException
SAXException
TikaException
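One way to keep each chunk within a maximum size is to split incoming character data as it arrives. The following is a minimal sketch built on the JDK's org.xml.sax.helpers.DefaultHandler; ChunkingTextHandler and its maxChunkSize parameter are hypothetical stand-ins (the latter for a limit such as MAXIMUM_TEXT_CHUNK_SIZE), and the splitting policy is an assumption, not necessarily the one parseToPlainTextChunks uses.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical handler: gathers text into chunks of at most maxChunkSize chars. */
class ChunkingTextHandler extends DefaultHandler {
    private final int maxChunkSize;
    private final List<String> chunks = new ArrayList<>();
    private final StringBuilder current = new StringBuilder();

    ChunkingTextHandler(int maxChunkSize) {
        if (maxChunkSize <= 0) {
            throw new IllegalArgumentException("maxChunkSize must be positive");
        }
        this.maxChunkSize = maxChunkSize;
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        int offset = start;
        int remaining = length;
        while (remaining > 0) {
            int room = maxChunkSize - current.length();
            if (room == 0) {
                // The current chunk is full: store it and start a new one.
                chunks.add(current.toString());
                current.setLength(0);
                room = maxChunkSize;
            }
            int n = Math.min(room, remaining);
            current.append(ch, offset, n);
            offset += n;
            remaining -= n;
        }
    }

    /** Returns all completed chunks plus the partially filled last one, if any. */
    List<String> getChunks() {
        List<String> result = new ArrayList<>(chunks);
        if (current.length() > 0) {
            result.add(current.toString());
        }
        return result;
    }
}
```

This sketch cuts purely by char count, so a chunk boundary can fall in the middle of a word or between the two halves of a surrogate pair; a production version would likely prefer to break on whitespace.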
-