Class ConfigDeserializer

java.lang.Object
org.apache.tika.serialization.ConfigDeserializer

public class ConfigDeserializer extends Object
Helper utility that lets SelfConfiguring components deserialize their configuration from ParseContext at runtime.

Note for Parser Developers: Instead of calling this class directly, use ParseContextConfig, which provides the same functionality and throws a clear exception if tika-serialization is not on the classpath.

This allows parsers to retrieve their configuration using the same friendly names as in tika-config.json (e.g., "pdf-parser", "html-parser") from per-request configurations sent via FetchEmitTuple or other serialization mechanisms.

The helper automatically merges user configuration with parser defaults, eliminating the need for config-specific cloneAndUpdate methods.

Example usage in a parser:

 // Recommended: Use ParseContextConfig wrapper (in tika-core)
 PDFParserConfig localConfig = ParseContextConfig.getConfig(
     context, "pdf-parser", PDFParserConfig.class, defaultConfig);
 
  • Constructor Details

    • ConfigDeserializer

      public ConfigDeserializer()
  • Method Details

    • getConfig

      public static <T> T getConfig(ParseContext context, String configKey, Class<T> configClass, T defaultConfig) throws IOException
      Retrieves and deserializes a configuration from ParseContext.

      Resolution order:

      1. Check resolved configs cache (already deserialized)
      2. Check JSON configs (deserialize, merge with default, cache)
      3. Return default config if nothing found

      The resolved config is cached in ParseContext's resolvedConfigs map and also set in the main context map so components can find it via parseContext.get(configClass).
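The resolution order above can be sketched in plain Java. This is only an illustration under stated assumptions, not the Tika implementation: FakeContext, rawConfigs, and the setting names "ocr" and "dpi" are hypothetical, configurations are modeled as string maps instead of typed config classes, and "deserialize and merge" is reduced to copying the defaults and overlaying the user's values.

```java
import java.util.HashMap;
import java.util.Map;

class ResolutionOrderSketch {

    // Hypothetical stand-in for ParseContext: raw user settings play the role
    // of the JSON configs, and resolvedConfigs plays the role of the cache.
    public static class FakeContext {
        public final Map<String, Map<String, String>> rawConfigs = new HashMap<>();
        public final Map<String, Map<String, String>> resolvedConfigs = new HashMap<>();
    }

    // Mirrors the three documented steps for a single config key.
    public static Map<String, String> resolve(FakeContext ctx, String configKey,
                                              Map<String, String> defaults) {
        // 1. Already resolved: return the cached instance.
        Map<String, String> cached = ctx.resolvedConfigs.get(configKey);
        if (cached != null) {
            return cached;
        }

        // 2. User config present: start from the defaults (if any), overlay
        //    the user's values, then cache the merged result.
        Map<String, String> raw = ctx.rawConfigs.get(configKey);
        if (raw != null) {
            Map<String, String> merged = new HashMap<>();
            if (defaults != null) {
                merged.putAll(defaults);
            }
            merged.putAll(raw);
            ctx.resolvedConfigs.put(configKey, merged);
            return merged;
        }

        // 3. Nothing found: fall back to the defaults, which may be null.
        return defaults;
    }

    public static void main(String[] args) {
        FakeContext ctx = new FakeContext();
        Map<String, String> defaults = Map.of("ocr", "off", "dpi", "300");

        // No user config yet, so the defaults come back unchanged.
        System.out.println(resolve(ctx, "pdf-parser", defaults));

        // A per-request override changes one setting and keeps the other default.
        ctx.rawConfigs.put("pdf-parser", Map.of("ocr", "on"));
        System.out.println(resolve(ctx, "pdf-parser", defaults));
    }
}
```

In this sketch a second call with the same key returns the cached merged map, so the merge runs at most once per context, which is the point of step 1.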

      Type Parameters:
      T - the configuration type
      Parameters:
      context - the parse context
      configKey - the configuration key (e.g., "pdf-parser", "handler-config")
      configClass - the configuration class to deserialize into
      defaultConfig - optional default config to merge with user config (can be null)
      Returns:
      the merged configuration, the default config if no user config is found, or null if neither exists
      Throws:
      IOException - if deserialization fails
    • getConfig

      public static <T> T getConfig(ParseContext context, String configKey, Class<T> configClass) throws IOException
      Retrieves and deserializes a configuration from ParseContext. This version does not merge with any default config.
      Type Parameters:
      T - the configuration type
      Parameters:
      context - the parse context
      configKey - the configuration key (e.g., "pdf-parser", "handler-config")
      configClass - the configuration class to deserialize into
      Returns:
      the deserialized configuration, or null if not found
      Throws:
      IOException - if deserialization fails
    • hasConfig

      public static boolean hasConfig(ParseContext context, String configKey)
      Checks if a configuration exists in the ParseContext.
      Parameters:
      context - the parse context
      configKey - the configuration key to check
      Returns:
      true if the configuration exists (either as JSON or already resolved), false otherwise