Class EmbeddedLimits

java.lang.Object
org.apache.tika.config.EmbeddedLimits
All Implemented Interfaces:
Serializable

public class EmbeddedLimits extends Object implements Serializable
Configuration for limits on embedded document processing.

These settings control how deep the parser descends into nested embedded documents and how many embedded documents it processes:

  • maxDepth - maximum nesting depth for embedded documents (-1 = unlimited)
  • throwOnMaxDepth - whether to throw an exception when maxDepth is reached
  • maxCount - maximum number of embedded documents to process (-1 = unlimited)
  • throwOnMaxCount - whether to throw an exception when maxCount is reached

maxDepth behavior: When the depth limit is reached, recursion stops but siblings at the current level continue to be processed. For example, with maxDepth=1:

 container.zip (depth 0)
 ├── doc1.docx (depth 1) ✓ PARSED
 │   ├── image1.png (depth 2) ✗ NOT PARSED (exceeds maxDepth)
 │   └── embed.xlsx (depth 2) ✗ NOT PARSED (exceeds maxDepth)
 ├── doc2.pdf (depth 1) ✓ PARSED (sibling at same level)
 └── doc3.txt (depth 1) ✓ PARSED (sibling at same level)
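
The depth rule shown in the tree above can be sketched as a small standalone simulation. This is an illustration only, not Tika's implementation: `DepthLimitSketch`, `Node`, `parsedEmbedded`, and `exampleTree` are hypothetical names, and the sketch assumes the container is depth 0 and its direct children are depth 1, as in the example.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: DepthLimitSketch and Node are hypothetical, not Tika types.
final class DepthLimitSketch {

    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    // Returns the names of the embedded documents that get parsed.
    // The container is depth 0; its direct children are depth 1.
    static List<String> parsedEmbedded(Node container, int maxDepth) {
        List<String> parsed = new ArrayList<>();
        visit(container, 0, maxDepth, parsed);
        return parsed;
    }

    private static void visit(Node parent, int parentDepth, int maxDepth, List<String> parsed) {
        int childDepth = parentDepth + 1;
        for (Node child : parent.children) {
            // -1 means unlimited; otherwise children deeper than maxDepth are skipped.
            if (maxDepth >= 0 && childDepth > maxDepth) {
                continue; // do not recurse into this child
            }
            parsed.add(child.name);
            visit(child, childDepth, maxDepth, parsed);
        }
    }

    // The tree from the maxDepth example.
    static Node exampleTree() {
        return new Node("container.zip",
                new Node("doc1.docx", new Node("image1.png"), new Node("embed.xlsx")),
                new Node("doc2.pdf"),
                new Node("doc3.txt"));
    }

    public static void main(String[] args) {
        System.out.println(parsedEmbedded(exampleTree(), 1));
        // prints [doc1.docx, doc2.pdf, doc3.txt]
    }
}
```

Because the check happens per child, returning from doc1.docx does not affect doc2.pdf or doc3.txt, which matches the "siblings continue" behavior described above.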
 

maxCount behavior: When the count limit is reached, processing stops immediately. No more embedded documents are processed, including siblings.
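
To contrast with the depth limit, the following standalone sketch simulates a count limit that halts the whole traversal. It is not Tika's code: `CountLimitSketch` and `Node` are hypothetical names, and the sketch assumes a depth-first visiting order and that the limit counts as reached when one more document would exceed maxCount. The real ordering and off-by-one semantics may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: CountLimitSketch and Node are hypothetical, not Tika types.
final class CountLimitSketch {

    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = List.of(children);
        }
    }

    private final int maxCount;
    private final List<String> parsed = new ArrayList<>();
    private boolean limitReached;

    CountLimitSketch(int maxCount) {
        this.maxCount = maxCount;
    }

    private void visit(Node parent) {
        for (Node child : parent.children) {
            if (limitReached) {
                return; // limit was hit deeper in the tree; remaining siblings are skipped too
            }
            // -1 means unlimited; otherwise stop once maxCount documents were parsed.
            if (maxCount >= 0 && parsed.size() >= maxCount) {
                limitReached = true;
                return;
            }
            parsed.add(child.name);
            visit(child);
        }
    }

    // Returns the names of the embedded documents parsed before the limit stops processing.
    static List<String> parsedEmbedded(Node container, int maxCount) {
        CountLimitSketch walker = new CountLimitSketch(maxCount);
        walker.visit(container);
        return walker.parsed;
    }

    static Node exampleTree() {
        return new Node("container.zip",
                new Node("doc1.docx", new Node("image1.png"), new Node("embed.xlsx")),
                new Node("doc2.pdf"),
                new Node("doc3.txt"));
    }

    public static void main(String[] args) {
        System.out.println(parsedEmbedded(exampleTree(), 2));
        // prints [doc1.docx, image1.png]
    }
}
```

With maxCount=2 in this sketch, embed.xlsx, doc2.pdf, and doc3.txt are all skipped, including the siblings at depth 1. The shared `limitReached` flag is what propagates the stop across every level of the recursion.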

When a limit is hit and throwing is disabled, a flag is set instead of an exception being thrown:

  • X-TIKA-maxDepthReached=true is set when maxDepth is hit
  • X-TIKA-maxEmbeddedCountReached=true is set when maxCount is hit
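
The choice between throwing and flagging can be pictured with the minimal sketch below. Everything except the two key strings is an assumption made for illustration: `LimitSignalSketch`, `LimitReachedException`, and `signal` are hypothetical, the documentation above does not name the exception type that Tika throws, and a plain `Map` stands in for wherever the flags are actually recorded.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: none of these types are part of Tika.
final class LimitSignalSketch {

    // Key strings taken from the list above.
    static final String MAX_DEPTH_REACHED = "X-TIKA-maxDepthReached";
    static final String MAX_COUNT_REACHED = "X-TIKA-maxEmbeddedCountReached";

    // Hypothetical exception; the real exception type is not stated in this page.
    static final class LimitReachedException extends RuntimeException {
        LimitReachedException(String message) {
            super(message);
        }
    }

    // Either throws or records the flag, depending on the throwOn... setting.
    static void signal(String flagKey, boolean throwOnLimit, Map<String, String> flags) {
        if (throwOnLimit) {
            throw new LimitReachedException(flagKey + ": limit reached");
        }
        flags.put(flagKey, "true");
    }

    public static void main(String[] args) {
        Map<String, String> flags = new LinkedHashMap<>();

        // throwOnMaxDepth = false: the flag is recorded and processing can continue.
        signal(MAX_DEPTH_REACHED, false, flags);
        System.out.println(flags); // prints {X-TIKA-maxDepthReached=true}

        // throwOnMaxCount = true: an exception is raised and no flag is recorded.
        try {
            signal(MAX_COUNT_REACHED, true, flags);
        } catch (LimitReachedException e) {
            System.out.println("threw: " + e.getMessage());
        }
    }
}
```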

Example configuration:

 {
   "parse-context": {
     "embedded-limits": {
       "maxDepth": 10,
       "throwOnMaxDepth": false,
       "maxCount": 1000,
       "throwOnMaxCount": false
     }
   }
 }
 
Since:
Apache Tika 4.0
See Also:
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
     
  • Constructor Summary

    Constructors
    Constructor
    Description
    EmbeddedLimits()
    No-arg constructor for Jackson deserialization.
    EmbeddedLimits(int maxDepth, boolean throwOnMaxDepth, int maxCount, boolean throwOnMaxCount)
    Constructor with all parameters.
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    equals(Object o)
     
    static EmbeddedLimits
    get(ParseContext context)
    Helper method to get EmbeddedLimits from ParseContext with defaults.
    int
    getMaxCount()
    Gets the maximum number of embedded documents to process.
    int
    getMaxDepth()
    Gets the maximum nesting depth for embedded documents.
    int
    hashCode()
     
    boolean
    isThrowOnMaxCount()
    Gets whether to throw an exception when maxCount is reached.
    boolean
    isThrowOnMaxDepth()
    Gets whether to throw an exception when maxDepth is reached.
    void
    setMaxCount(int maxCount)
    Sets the maximum number of embedded documents to process.
    void
    setMaxDepth(int maxDepth)
    Sets the maximum nesting depth for embedded documents.
    void
    setThrowOnMaxCount(boolean throwOnMaxCount)
    Sets whether to throw an exception when maxCount is reached.
    void
    setThrowOnMaxDepth(boolean throwOnMaxDepth)
    Sets whether to throw an exception when maxDepth is reached.
    String
    toString()
     

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Field Details

  • Constructor Details

    • EmbeddedLimits

      public EmbeddedLimits()
      No-arg constructor for Jackson deserialization.
    • EmbeddedLimits

      public EmbeddedLimits(int maxDepth, boolean throwOnMaxDepth, int maxCount, boolean throwOnMaxCount)
      Constructor with all parameters.
      Parameters:
      maxDepth - maximum nesting depth (-1 = unlimited)
      throwOnMaxDepth - whether to throw when depth limit is reached
      maxCount - maximum number of embedded documents (-1 = unlimited)
      throwOnMaxCount - whether to throw when count limit is reached
  • Method Details

    • getMaxDepth

      public int getMaxDepth()
      Gets the maximum nesting depth for embedded documents.
      Returns:
      maximum depth, or -1 for unlimited
    • setMaxDepth

      public void setMaxDepth(int maxDepth)
      Sets the maximum nesting depth for embedded documents.
      Parameters:
      maxDepth - maximum depth, or -1 for unlimited
    • isThrowOnMaxDepth

      public boolean isThrowOnMaxDepth()
      Gets whether to throw an exception when maxDepth is reached.
      Returns:
      true if an exception should be thrown
    • setThrowOnMaxDepth

      public void setThrowOnMaxDepth(boolean throwOnMaxDepth)
      Sets whether to throw an exception when maxDepth is reached.
      Parameters:
      throwOnMaxDepth - true to throw an exception
    • getMaxCount

      public int getMaxCount()
      Gets the maximum number of embedded documents to process.
      Returns:
      maximum count, or -1 for unlimited
    • setMaxCount

      public void setMaxCount(int maxCount)
      Sets the maximum number of embedded documents to process.
      Parameters:
      maxCount - maximum count, or -1 for unlimited
    • isThrowOnMaxCount

      public boolean isThrowOnMaxCount()
      Gets whether to throw an exception when maxCount is reached.
      Returns:
      true if an exception should be thrown
    • setThrowOnMaxCount

      public void setThrowOnMaxCount(boolean throwOnMaxCount)
      Sets whether to throw an exception when maxCount is reached.
      Parameters:
      throwOnMaxCount - true to throw an exception
    • get

      public static EmbeddedLimits get(ParseContext context)
      Helper method to get EmbeddedLimits from ParseContext with defaults.
      Parameters:
      context - the ParseContext (may be null)
      Returns:
      the EmbeddedLimits from context, or a new instance with defaults if not found
    • toString

      public String toString()
      Overrides:
      toString in class Object
    • equals

      public boolean equals(Object o)
      Overrides:
      equals in class Object
    • hashCode

      public int hashCode()
      Overrides:
      hashCode in class Object