Package org.apache.tika.config
Class EmbeddedLimits
java.lang.Object
org.apache.tika.config.EmbeddedLimits
- All Implemented Interfaces:
Serializable
Configuration for limits on embedded document processing.
This class controls how deeply the parser recurses into embedded documents and how many embedded documents it processes:
maxDepth - maximum nesting depth for embedded documents (-1 = unlimited)
throwOnMaxDepth - whether to throw an exception when maxDepth is reached
maxCount - maximum number of embedded documents to process (-1 = unlimited)
throwOnMaxCount - whether to throw an exception when maxCount is reached
maxDepth behavior: When the depth limit is reached, recursion stops but siblings at the current level continue to be processed. For example, with maxDepth=1:
container.zip (depth 0)
├── doc1.docx (depth 1) ✓ PARSED
│   ├── image1.png (depth 2) ✗ NOT PARSED (exceeds maxDepth)
│   └── embed.xlsx (depth 2) ✗ NOT PARSED (exceeds maxDepth)
├── doc2.pdf (depth 1) ✓ PARSED (sibling at same level)
└── doc3.txt (depth 1) ✓ PARSED (sibling at same level)
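The maxDepth rule above can be sketched in plain Java. This is an illustration only, not Tika's implementation: DepthLimitSketch, Node, and parsedNames are hypothetical names, and the tree is modeled with a simple in-memory structure rather than real parsers.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only; this is not Tika's implementation.
class DepthLimitSketch {

    // Hypothetical node: a document plus the documents embedded in it.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    // Returns the names of the embedded documents that would be parsed.
    // A negative maxDepth is treated as unlimited, mirroring the -1 convention.
    static List<String> parsedNames(Node container, int maxDepth) {
        List<String> out = new ArrayList<>();
        walk(container, 0, maxDepth, out);
        return out;
    }

    private static void walk(Node node, int depth, int maxDepth, List<String> out) {
        for (Node child : node.children) {
            int childDepth = depth + 1;
            if (maxDepth >= 0 && childDepth > maxDepth) {
                // Too deep: skip this child but keep visiting its siblings.
                continue;
            }
            out.add(child.name);
            walk(child, childDepth, maxDepth, out);
        }
    }

    public static void main(String[] args) {
        Node zip = new Node("container.zip",
                new Node("doc1.docx", new Node("image1.png"), new Node("embed.xlsx")),
                new Node("doc2.pdf"),
                new Node("doc3.txt"));
        System.out.println(parsedNames(zip, 1)); // [doc1.docx, doc2.pdf, doc3.txt]
    }
}
```

The key point is the `continue`: exceeding the depth limit prunes one branch only, so doc2.pdf and doc3.txt are still visited.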
maxCount behavior: When the count limit is reached, processing stops immediately. No more embedded documents are processed, including siblings.
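By contrast, the count limit is global. The following sketch is a plain-Java illustration under stated assumptions, not Tika's code: CountLimitSketch, Node, and parsedNames are hypothetical names, traversal is assumed to be depth-first, and the limit is assumed to mean "at most maxCount embedded documents are processed".

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in only; this is not Tika's implementation.
class CountLimitSketch {

    // Hypothetical node: a document plus the documents embedded in it.
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node k : kids) {
                children.add(k);
            }
        }
    }

    private final int maxCount;      // negative means unlimited
    private int count;               // embedded documents processed so far
    private boolean limitReached;    // once true, nothing else is processed
    private final List<String> names = new ArrayList<>();

    private CountLimitSketch(int maxCount) {
        this.maxCount = maxCount;
    }

    // Returns the names of the embedded documents that would be processed.
    static List<String> parsedNames(Node container, int maxCount) {
        CountLimitSketch sketch = new CountLimitSketch(maxCount);
        sketch.walk(container);
        return sketch.names;
    }

    private void walk(Node node) {
        for (Node child : node.children) {
            if (limitReached) {
                return; // the stop propagates up, so siblings are skipped too
            }
            if (maxCount >= 0 && count >= maxCount) {
                limitReached = true;
                return;
            }
            count++;
            names.add(child.name);
            walk(child);
        }
    }
}
```

With the container.zip tree from the maxDepth example and maxCount=2, this sketch processes doc1.docx and image1.png and then stops; embed.xlsx, doc2.pdf, and doc3.txt are never visited.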
When a limit is hit and throwing is disabled:
X-TIKA-maxDepthReached=true is set when maxDepth is hit
X-TIKA-maxEmbeddedCountReached=true is set when maxCount is hit
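The choice between throwing and flagging might look roughly like the sketch below. It is an assumption-laden illustration: LimitSignalSketch, onLimitHit, and wasTruncated are hypothetical names, a plain Map stands in for Tika's metadata object, and IllegalStateException is a placeholder because this page does not name the exception type that is actually thrown. Only the two key names come from the list above.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative stand-in only; this is not Tika's implementation.
class LimitSignalSketch {

    // Key names as documented above.
    static final String MAX_DEPTH_REACHED = "X-TIKA-maxDepthReached";
    static final String MAX_COUNT_REACHED = "X-TIKA-maxEmbeddedCountReached";

    // Called at the moment a limit is hit. Either throws or records a flag.
    // IllegalStateException is a placeholder for the real exception type.
    static void onLimitHit(String key, boolean throwOnLimit, Map<String, String> metadata) {
        if (throwOnLimit) {
            throw new IllegalStateException("embedded limit reached: " + key);
        }
        metadata.put(key, "true");
    }

    // Lets a caller detect that output may be incomplete because a limit was hit.
    static boolean wasTruncated(Map<String, String> metadata) {
        return "true".equals(metadata.get(MAX_DEPTH_REACHED))
                || "true".equals(metadata.get(MAX_COUNT_REACHED));
    }

    public static void main(String[] args) {
        Map<String, String> metadata = new HashMap<>();
        onLimitHit(MAX_COUNT_REACHED, false, metadata);
        System.out.println(wasTruncated(metadata)); // true
    }
}
```

With throwing disabled, callers that care about completeness need to check these flags explicitly, since parsing otherwise finishes without error.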
Example configuration:
{
"parse-context": {
"embedded-limits": {
"maxDepth": 10,
"throwOnMaxDepth": false,
"maxCount": 1000,
"throwOnMaxCount": false
}
}
}
- Since:
- Apache Tika 4.0
- See Also:
-
Field Summary
Fields -
Constructor Summary
Constructors
Constructor | Description
EmbeddedLimits() | No-arg constructor for Jackson deserialization.
EmbeddedLimits(int maxDepth, boolean throwOnMaxDepth, int maxCount, boolean throwOnMaxCount) | Constructor with all parameters.
-
Method Summary
Modifier and Type | Method | Description
boolean | equals(Object) |
static EmbeddedLimits | get(ParseContext context) | Helper method to get EmbeddedLimits from ParseContext with defaults.
int | getMaxCount() | Gets the maximum number of embedded documents to process.
int | getMaxDepth() | Gets the maximum nesting depth for embedded documents.
int | hashCode() |
boolean | isThrowOnMaxCount() | Gets whether to throw an exception when maxCount is reached.
boolean | isThrowOnMaxDepth() | Gets whether to throw an exception when maxDepth is reached.
void | setMaxCount(int maxCount) | Sets the maximum number of embedded documents to process.
void | setMaxDepth(int maxDepth) | Sets the maximum nesting depth for embedded documents.
void | setThrowOnMaxCount(boolean throwOnMaxCount) | Sets whether to throw an exception when maxCount is reached.
void | setThrowOnMaxDepth(boolean throwOnMaxDepth) | Sets whether to throw an exception when maxDepth is reached.
String | toString() |
-
Field Details
-
UNLIMITED
public static final int UNLIMITED
- See Also:
-
-
Constructor Details
-
EmbeddedLimits
public EmbeddedLimits()
No-arg constructor for Jackson deserialization.
-
EmbeddedLimits
public EmbeddedLimits(int maxDepth, boolean throwOnMaxDepth, int maxCount, boolean throwOnMaxCount)
Constructor with all parameters.
- Parameters:
maxDepth - maximum nesting depth (-1 = unlimited)
throwOnMaxDepth - whether to throw when depth limit is reached
maxCount - maximum number of embedded documents (-1 = unlimited)
throwOnMaxCount - whether to throw when count limit is reached
-
-
Method Details
-
getMaxDepth
public int getMaxDepth()
Gets the maximum nesting depth for embedded documents.
- Returns:
- maximum depth, or -1 for unlimited
-
setMaxDepth
public void setMaxDepth(int maxDepth)
Sets the maximum nesting depth for embedded documents.
- Parameters:
maxDepth - maximum depth, or -1 for unlimited
-
isThrowOnMaxDepth
public boolean isThrowOnMaxDepth()
Gets whether to throw an exception when maxDepth is reached.
- Returns:
- true if an exception should be thrown
-
setThrowOnMaxDepth
public void setThrowOnMaxDepth(boolean throwOnMaxDepth)
Sets whether to throw an exception when maxDepth is reached.
- Parameters:
throwOnMaxDepth - true to throw an exception
-
getMaxCount
public int getMaxCount()
Gets the maximum number of embedded documents to process.
- Returns:
- maximum count, or -1 for unlimited
-
setMaxCount
public void setMaxCount(int maxCount)
Sets the maximum number of embedded documents to process.
- Parameters:
maxCount - maximum count, or -1 for unlimited
-
isThrowOnMaxCount
public boolean isThrowOnMaxCount()
Gets whether to throw an exception when maxCount is reached.
- Returns:
- true if an exception should be thrown
-
setThrowOnMaxCount
public void setThrowOnMaxCount(boolean throwOnMaxCount)
Sets whether to throw an exception when maxCount is reached.
- Parameters:
throwOnMaxCount - true to throw an exception
-
get
public static EmbeddedLimits get(ParseContext context)
Helper method to get EmbeddedLimits from ParseContext with defaults.
- Parameters:
context - the ParseContext (may be null)
- Returns:
- the EmbeddedLimits from context, or a new instance with defaults if not found
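The null-safe "look up, else fall back to defaults" contract described for get can be sketched generically. Everything here is hypothetical: ContextLookupSketch, Context, and getOrDefault are illustration names, Context is a minimal stand-in for a class-keyed context rather than Tika's ParseContext, and the default values themselves are left to the caller because this page does not state them.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Illustrative stand-in only; this is not Tika's implementation.
class ContextLookupSketch {

    // Hypothetical class-keyed context: holds at most one instance per type.
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast returns null when the stored value is null or absent.
            return key.cast(entries.get(key));
        }
    }

    // Returns the stored instance, or a fresh default when the context is
    // null or holds no entry for the requested type.
    static <T> T getOrDefault(Context context, Class<T> key, Supplier<T> defaults) {
        if (context == null) {
            return defaults.get();
        }
        T found = context.get(key);
        return found != null ? found : defaults.get();
    }
}
```

A helper of this shape means callers never need to null-check the context or the looked-up configuration before reading a limit.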
-
toString
-
equals
-
hashCode
public int hashCode()
-