Package org.apache.tika.example
Class descriptions

Examples of using different Content Handlers to get different parts of the file's contents.
Print the supported Tika Metadata models and their fields.
Parses the output of /bin/ls and counts the number of files and the number of executables using Tika.
Grabs a PDF file from a URL and prints its Metadata.
This class shows how to dump a TikaConfig object to a configuration file.
Class to demonstrate how to use the PhoneExtractingContentHandler to get a list of all of the phone numbers from every file in a directory.
ImportContextImpl ...
This example demonstrates how to interrupt document parsing if some condition is met.
Builds on the LuceneIndexer from Chapter 5 and adds indexing of Metadata.
Deprecated. Currently not suitable for real use, more a demo / prototype!
Builds on top of the LuceneIndexer and the Metadata discussions in Chapter 6 to output an RSS (or RDF) feed of files crawled by the LuceneIndexer within the last N minutes.
Demonstrates Tika and its ability to sense symlinks.
Class to demonstrate how to use the StandardsExtractingContentHandler to get a list of the standard references from every file in a directory.
These examples create a new CompositeTextStatsCalculator for each call.
This example demonstrates primitive logic for chaining Tika API calls.
Generates document summaries for corpus analysis in the Open Relevance project.
Example code listing from Chapter 1.
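
One of the descriptions above mentions interrupting document parsing when some condition is met. The sketch below illustrates that general idea with the JDK's built-in SAX parser only, not with Tika itself: a content handler throws a dedicated exception once a limit is reached, and the caller treats that exception as a deliberate stop. All names here (InterruptingHandlerSketch, StopParsingException, LimitedHandler, countUpTo) are hypothetical and are not taken from the package; the actual example class may work differently.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class InterruptingHandlerSketch {

    /** Hypothetical marker exception: a deliberate stop, not a real parse error. */
    static class StopParsingException extends SAXException {
        StopParsingException(String message) {
            super(message);
        }
    }

    /** Counts start-element events and aborts once the limit has been reached. */
    static class LimitedHandler extends DefaultHandler {
        private final int maxElements;
        private int seen = 0;

        LimitedHandler(int maxElements) {
            this.maxElements = maxElements;
        }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            if (seen >= maxElements) {
                // The condition is met: interrupt parsing from inside the handler.
                throw new StopParsingException("element limit reached");
            }
            seen++;
        }

        int seen() {
            return seen;
        }
    }

    /** Returns how many elements were seen before parsing finished or was interrupted. */
    static int countUpTo(String xml, int maxElements) throws Exception {
        LimitedHandler handler = new LimitedHandler(maxElements);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (StopParsingException e) {
            // Expected when the limit is hit; anything else propagates to the caller.
        }
        return handler.seen();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a><b/><c/><d/></a>";
        System.out.println(countUpTo(xml, 2));   // prints 2 (stopped early)
        System.out.println(countUpTo(xml, 10));  // prints 4 (parsed to the end)
    }
}
```

Using a dedicated exception type keeps the intentional interruption distinguishable from genuine parse failures, which still reach the caller as ordinary exceptions.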