Interface DigesterFactory

All Known Implementing Classes:
BouncyCastleDigesterFactory, CommonsDigesterFactory

public interface DigesterFactory
Factory interface for creating Digester instances. Implementations should be annotated with @TikaComponent and provide bean properties for configuration (e.g., digests).

Configure this factory in the "parse-context" section of tika-config.json. The factory is loaded into the ParseContext and used by AutoDetectParser during parsing to compute digests.

Example JSON configuration:

 {
   "parse-context": {
     "commons-digester-factory": {
       "digests": [
         { "algorithm": "MD5" },
         { "algorithm": "SHA256", "encoding": "BASE32" }
       ],
       "skipContainerDocumentDigest": true
     }
   }
 }
 

When using TikaLoader, call loader.loadParseContext() to get a ParseContext with the DigesterFactory already set.
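As a rough illustration of the factory contract described above, the sketch below shows a factory that exposes one bean property and reads it in build(). It is not Tika code. SketchDigester, SketchDigesterFactory, MessageDigestFactory, and the "algorithm" property are hypothetical stand-ins so the example compiles on its own, and the digest(InputStream) signature is a simplification that should not be taken as Tika's real Digester method. A real implementation would also carry the @TikaComponent annotation, which is omitted here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical stand-in; Tika's real Digester has a different signature.
interface SketchDigester {
    String digest(InputStream in);
}

// Hypothetical stand-in mirroring the two methods documented on this page.
interface SketchDigesterFactory {
    SketchDigester build();

    default boolean isSkipContainerDocumentDigest() {
        return false;
    }
}

// Factory with one bean property ("algorithm", an assumed name) that build() reads.
class MessageDigestFactory implements SketchDigesterFactory {
    private String algorithm = "MD5";

    public String getAlgorithm() {
        return algorithm;
    }

    public void setAlgorithm(String algorithm) {
        this.algorithm = algorithm;
    }

    @Override
    public SketchDigester build() {
        // Snapshot the property so later setter calls do not affect this digester.
        final String configured = algorithm;
        return in -> {
            try {
                MessageDigest md = MessageDigest.getInstance(configured);
                byte[] buffer = new byte[8192];
                int read;
                while ((read = in.read(buffer)) != -1) {
                    md.update(buffer, 0, read);
                }
                // Encode the digest bytes as lowercase hex.
                StringBuilder hex = new StringBuilder();
                for (byte b : md.digest()) {
                    hex.append(String.format("%02x", b));
                }
                return hex.toString();
            } catch (NoSuchAlgorithmException e) {
                throw new IllegalStateException("Unknown algorithm: " + configured, e);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }
}

class DigesterFactorySketch {
    public static void main(String[] args) {
        MessageDigestFactory factory = new MessageDigestFactory();
        factory.setAlgorithm("SHA-256");
        SketchDigester digester = factory.build();
        byte[] data = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(digester.digest(new ByteArrayInputStream(data)));
    }
}
```

Each call to build() returns a fresh digester, which matches the "a new Digester instance" wording in the method documentation; whether real implementations share any state between instances is not stated on this page.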

  • Method Summary

    Modifier and Type  Method                           Description
    Digester           build()                          Build a new Digester instance using the factory's configured properties.
    default boolean    isSkipContainerDocumentDigest()  Returns whether to skip digesting for container (top-level) documents.
  • Method Details

    • build

      Digester build()
      Build a new Digester instance using the factory's configured properties.
      Returns:
      a new Digester instance
    • isSkipContainerDocumentDigest

      default boolean isSkipContainerDocumentDigest()
      Returns whether to skip digesting for container (top-level) documents. When true, only embedded documents (depth > 0) will be digested.

      The default implementation returns false, so every document, including the container, is digested.

      Returns:
      true if container documents should be skipped, false otherwise
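The effect of the flag can be sketched as a simple depth check. This page does not show how the parser consults the flag, so the shouldDigest helper below is an assumption about how a caller might apply it, and SkipAwareFactory, DigestEverythingFactory, and EmbeddedOnlyFactory are hypothetical stand-ins rather than Tika classes. Depth 0 denotes the container document and any depth above 0 an embedded document, as described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the default method documented above.
interface SkipAwareFactory {
    default boolean isSkipContainerDocumentDigest() {
        return false; // default: digest everything
    }
}

// Inherits the default, so the container document is digested too.
class DigestEverythingFactory implements SkipAwareFactory {
}

// Overrides the default so only embedded documents are digested.
class EmbeddedOnlyFactory implements SkipAwareFactory {
    @Override
    public boolean isSkipContainerDocumentDigest() {
        return true;
    }
}

class SkipContainerSketch {
    // Assumed caller-side rule: embedded documents (depth > 0) are always
    // digested; the container (depth 0) is digested unless the flag is set.
    static boolean shouldDigest(SkipAwareFactory factory, int depth) {
        return depth > 0 || !factory.isSkipContainerDocumentDigest();
    }

    // Lists the depths from 0 to maxDepth that would be digested.
    static List<Integer> digestedDepths(SkipAwareFactory factory, int maxDepth) {
        List<Integer> depths = new ArrayList<>();
        for (int depth = 0; depth <= maxDepth; depth++) {
            if (shouldDigest(factory, depth)) {
                depths.add(depth);
            }
        }
        return depths;
    }

    public static void main(String[] args) {
        System.out.println(digestedDepths(new DigestEverythingFactory(), 2));
        System.out.println(digestedDepths(new EmbeddedOnlyFactory(), 2));
    }
}
```

With the default factory the container and both embedded levels are selected; with the overriding factory only the embedded levels remain.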