Class InputStreamDigester

java.lang.Object
org.apache.tika.digest.InputStreamDigester
All Implemented Interfaces:
Digester

public class InputStreamDigester extends Object implements Digester
Digester that uses TikaInputStream.enableRewind() and TikaInputStream.rewind() to read the entire stream for digesting, then rewind for subsequent processing.
  • Constructor Details

    • InputStreamDigester

      public InputStreamDigester(String algorithm, String metadataKey, Encoder encoder)
      Parameters:
      algorithm - name of the digest algorithm to retrieve from the Provider
      metadataKey - the full metadata key to use when storing the digest (e.g., "X-TIKA:digest:MD5" or "X-TIKA:digest:SHA256:BASE32")
      encoder - encoder to convert the byte array returned from the digester to a string
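
      The relationship between the three arguments can be sketched with the JDK alone. This is a minimal illustration, not Tika code: the class ConstructorArgsSketch, the methods toHex and digestInto, and the use of a plain Map in place of Tika's Metadata are all assumptions made for the example. The algorithm name selects the MessageDigest, the encoder turns the raw bytes into a string, and the metadata key decides where that string is stored.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in, not part of Tika.
final class ConstructorArgsSketch {

    // Plays the role of the "encoder" argument: digest bytes to lowercase hex.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Digests content with the named algorithm and stores the encoded
    // result under metadataKey. A Map stands in for Tika's Metadata.
    static void digestInto(String algorithm,
                           String metadataKey,
                           Function<byte[], String> encoder,
                           byte[] content,
                           Map<String, String> metadata) {
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
        metadata.put(metadataKey, encoder.apply(md.digest(content)));
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        digestInto("MD5", "X-TIKA:digest:MD5", ConstructorArgsSketch::toHex,
                "abc".getBytes(StandardCharsets.UTF_8), metadata);
        System.out.println(metadata);
    }
}
```

      Running main prints {X-TIKA:digest:MD5=900150983cd24fb0d6963f7d28e17f72}, the standard MD5 test vector for "abc".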
  • Method Details

    • getProvider

      protected Provider getProvider()
      When subclassing, take care that your provider is thread-safe (unlikely); otherwise, return a new provider with each call.
      Returns:
      the provider used to obtain the MessageDigest for the algorithm name. The default implementation returns null.
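
      One plausible way a null provider can be handled is to fall back to the JDK's default provider lookup. The sketch below assumes that behavior; this page does not confirm how InputStreamDigester consumes the return value, and ProviderLookupSketch and newDigest are hypothetical names. Only the two MessageDigest.getInstance overloads are real JDK API.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.Provider;

// Hypothetical helper, not part of Tika.
final class ProviderLookupSketch {

    // Returns a fresh MessageDigest. A null provider means "let the JDK
    // search its installed providers in preference order".
    static MessageDigest newDigest(String algorithm, Provider provider) {
        try {
            return provider == null
                    ? MessageDigest.getInstance(algorithm)
                    : MessageDigest.getInstance(algorithm, provider);
        } catch (NoSuchAlgorithmException e) {
            // Assumption: surface an unknown algorithm as an unchecked exception.
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
    }
}
```

      A MessageDigest instance is stateful and should not be shared between threads, which is why the sketch creates a new one per call.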
    • digest

      public void digest(TikaInputStream tis, Metadata metadata, ParseContext parseContext) throws IOException
      Digests the TikaInputStream and stores the result in metadata.

      Uses TikaInputStream.enableRewind() to ensure the stream can be rewound after digesting, then calls TikaInputStream.rewind() to reset the stream for subsequent processing.
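
      The read-everything-then-rewind pattern described above can be illustrated with the standard mark/reset mechanism of java.io.InputStream. This is an analogy only: it does not call TikaInputStream.enableRewind() or rewind(), and DigestThenRewindSketch, digestAndRewind, and the readLimit parameter are hypothetical names introduced for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical analogue using JDK mark/reset, not Tika's implementation.
final class DigestThenRewindSketch {

    // Reads the stream to the end while updating the digest, then resets the
    // stream to where it started so a later consumer sees the same bytes.
    // readLimit is passed to mark(); for streams such as BufferedInputStream,
    // reading more than readLimit bytes may invalidate the mark.
    static byte[] digestAndRewind(InputStream in, String algorithm, int readLimit)
            throws IOException {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream must support mark/reset");
        }
        MessageDigest md;
        try {
            md = MessageDigest.getInstance(algorithm);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("Unknown algorithm: " + algorithm, e);
        }
        in.mark(readLimit);                 // remember the current position
        byte[] buffer = new byte[8192];
        int n;
        while ((n = in.read(buffer)) != -1) {
            md.update(buffer, 0, n);        // feed each chunk to the digest
        }
        in.reset();                         // rewind for subsequent processing
        return md.digest();
    }
}
```

      After digestAndRewind returns, a caller can read the stream again from its original position, which mirrors the contract this method documents for the TikaInputStream.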

      Specified by:
      digest in interface Digester
      Parameters:
      tis - TikaInputStream to digest
      metadata - metadata in which to store the digest information
      parseContext - the ParseContext; currently unused, reserved for future expansion
      Throws:
      IOException - on an I/O problem
      IllegalArgumentException - if the algorithm cannot be found