Package org.apache.tika.parser
Class ParseRecord
java.lang.Object
org.apache.tika.parser.ParseRecord
Use this class to store exceptions, warnings, and other information recorded
during the parse. After the parse, the CompositeParser adds this information
to the parent's metadata.
This class also tracks embedded document processing limits (depth and count),
which can be configured via setMaxEmbeddedDepth(int) and setMaxEmbeddedCount(int).
-
Constructor Summary
Constructors
ParseRecord()
-
Method Summary
Modifier and Type     Method and Description
void                  addException
void                  addMetadata(Metadata metadata)
void                  addWarning(String msg)
int                   getDepth()
int                   getEmbeddedCount()
                          Gets the current count of embedded documents processed.
                      getExceptions
int                   getMaxEmbeddedCount()
                          Gets the maximum number of embedded documents to parse.
int                   getMaxEmbeddedDepth()
                          Gets the maximum depth for parsing embedded documents.
                      getMetadataList
String[]              getParsers
                      getWarnings
void                  incrementEmbeddedCount()
                          Increments the embedded document count.
boolean               isEmbeddedCountLimitReached()
                          Returns whether the embedded count limit was reached during parsing.
boolean               isEmbeddedDepthLimitReached()
                          Returns whether the embedded depth limit was reached during parsing.
boolean               isThrowOnMaxCount()
                          Returns whether throwing is configured when max count is reached.
boolean               isThrowOnMaxDepth()
                          Returns whether throwing is configured when max depth is reached.
boolean               isWriteLimitReached()
static ParseRecord    newInstance(ParseContext context)
                          Creates a new ParseRecord configured from EmbeddedLimits in the ParseContext.
void                  setEmbeddedCountLimitReached(boolean embeddedCountLimitReached)
                          Sets the flag indicating the embedded count limit was reached.
void                  setEmbeddedDepthLimitReached(boolean embeddedDepthLimitReached)
                          Sets the flag indicating the embedded depth limit was reached.
void                  setMaxEmbeddedCount(int maxEmbeddedCount)
                          Sets the maximum number of embedded documents to parse.
void                  setMaxEmbeddedDepth(int maxEmbeddedDepth)
                          Sets the maximum depth for parsing embedded documents.
void                  setWriteLimitReached(boolean writeLimitReached)
-
Constructor Details
-
ParseRecord
public ParseRecord()
-
-
Method Details
-
newInstance
public static ParseRecord newInstance(ParseContext context)
Creates a new ParseRecord configured from EmbeddedLimits in the ParseContext. If EmbeddedLimits is present in the context, the ParseRecord is configured with those limits. Otherwise, default unlimited values are used.
Parameters:
context - the ParseContext (may be null)
Returns:
a new ParseRecord configured from the context
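The fallback described for newInstance (use the limits when the context carries them, otherwise unlimited defaults) can be illustrated without Tika on the classpath. The sketch below is only that: Context, Limits, and SketchRecord are hypothetical stand-ins, not Tika's ParseContext, EmbeddedLimits, or ParseRecord. Only the null-tolerant context and the -1 "unlimited" default are taken from the documentation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for illustration only; these are NOT Tika classes.
class NewInstanceSketch {

    // Stand-in for a limits object a caller may place in the context.
    static final class Limits {
        final int maxDepth;
        final int maxCount;

        Limits(int maxDepth, int maxCount) {
            this.maxDepth = maxDepth;
            this.maxCount = maxCount;
        }
    }

    // Stand-in for a type-keyed context object.
    static final class Context {
        private final Map<Class<?>, Object> entries = new HashMap<>();

        <T> void set(Class<T> key, T value) {
            entries.put(key, value);
        }

        <T> T get(Class<T> key) {
            // Class.cast returns null when the entry is absent.
            return key.cast(entries.get(key));
        }
    }

    // Stand-in for the record; -1 means unlimited, as documented.
    static final class SketchRecord {
        int maxDepth = -1;
        int maxCount = -1;
    }

    // A null context, or a context without limits, yields unlimited defaults.
    static SketchRecord newInstance(Context context) {
        SketchRecord record = new SketchRecord();
        Limits limits = (context == null) ? null : context.get(Limits.class);
        if (limits != null) {
            record.maxDepth = limits.maxDepth;
            record.maxCount = limits.maxCount;
        }
        return record;
    }

    public static void main(String[] args) {
        Context context = new Context();
        context.set(Limits.class, new Limits(2, 100));
        SketchRecord configured = newInstance(context);
        SketchRecord fallback = newInstance(null);
        System.out.println(configured.maxDepth + " / " + configured.maxCount);
        System.out.println(fallback.maxDepth + " / " + fallback.maxCount);
    }
}
```

Keeping the null check inside the factory means callers do not need to guard against a missing context themselves, which matches the "may be null" note on the parameter.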
-
getDepth
public int getDepth()
-
getParsers
-
addException
-
addWarning
-
addMetadata
-
setWriteLimitReached
public void setWriteLimitReached(boolean writeLimitReached)
-
getExceptions
-
getWarnings
-
isWriteLimitReached
public boolean isWriteLimitReached()
-
getMetadataList
-
isThrowOnMaxDepth
public boolean isThrowOnMaxDepth()
Returns whether throwing is configured when max depth is reached.
Returns:
true if an exception should be thrown on max depth
-
isThrowOnMaxCount
public boolean isThrowOnMaxCount()
Returns whether throwing is configured when max count is reached.
Returns:
true if an exception should be thrown on max count
-
setEmbeddedDepthLimitReached
public void setEmbeddedDepthLimitReached(boolean embeddedDepthLimitReached)
Sets the flag indicating the embedded depth limit was reached.
Parameters:
embeddedDepthLimitReached - true if depth limit was reached
-
setEmbeddedCountLimitReached
public void setEmbeddedCountLimitReached(boolean embeddedCountLimitReached)
Sets the flag indicating the embedded count limit was reached.
Parameters:
embeddedCountLimitReached - true if count limit was reached
-
incrementEmbeddedCount
public void incrementEmbeddedCount()
Increments the embedded document count. Should be called when an embedded document is about to be parsed.
-
getEmbeddedCount
public int getEmbeddedCount()
Gets the current count of embedded documents processed.
Returns:
the embedded document count
-
setMaxEmbeddedDepth
public void setMaxEmbeddedDepth(int maxEmbeddedDepth)
Sets the maximum depth for parsing embedded documents. A value of -1 means unlimited (the default). A value of 0 means no embedded documents will be parsed. A value of 1 means only first-level embedded documents will be parsed, etc.
Parameters:
maxEmbeddedDepth - the maximum embedded depth, or -1 for unlimited
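The -1 / 0 / 1 semantics above can be made concrete with a small check. This is a minimal sketch, not Tika's implementation: DepthLimitSketch and mayParseEmbedded are hypothetical names, and the sketch assumes the container document sits at depth 0 and a first-level embedded document at depth 1. Negative values other than -1 are not covered by the documentation and are not treated as unlimited here.

```java
// Hypothetical helper for illustration; not part of Tika.
class DepthLimitSketch {

    static final int UNLIMITED = -1;

    // Assumption: depth 0 is the container document, depth 1 is a
    // first-level embedded document, depth 2 is a document embedded in that.
    static boolean mayParseEmbedded(int maxEmbeddedDepth, int embeddedDepth) {
        if (maxEmbeddedDepth == UNLIMITED) {
            return true; // -1: no depth limit
        }
        // 0 rejects every embedded document (depth >= 1);
        // 1 admits only first-level embedded documents, and so on.
        return embeddedDepth <= maxEmbeddedDepth;
    }

    public static void main(String[] args) {
        int[] limits = {UNLIMITED, 0, 1};
        for (int limit : limits) {
            for (int depth = 1; depth <= 2; depth++) {
                System.out.println("max=" + limit + " depth=" + depth
                        + " parse=" + mayParseEmbedded(limit, depth));
            }
        }
    }
}
```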
-
getMaxEmbeddedDepth
public int getMaxEmbeddedDepth()
Gets the maximum depth for parsing embedded documents.
Returns:
the maximum embedded depth, or -1 if unlimited
-
setMaxEmbeddedCount
public void setMaxEmbeddedCount(int maxEmbeddedCount)
Sets the maximum number of embedded documents to parse. A value of -1 means unlimited (the default).
Parameters:
maxEmbeddedCount - the maximum embedded count, or -1 for unlimited
-
getMaxEmbeddedCount
public int getMaxEmbeddedCount()
Gets the maximum number of embedded documents to parse.
Returns:
the maximum embedded count, or -1 if unlimited
-
isEmbeddedDepthLimitReached
public boolean isEmbeddedDepthLimitReached()
Returns whether the embedded depth limit was reached during parsing.
Returns:
true if the depth limit was reached
-
isEmbeddedCountLimitReached
public boolean isEmbeddedCountLimitReached()
Returns whether the embedded count limit was reached during parsing.
Returns:
true if the count limit was reached
-
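The count-related methods (incrementEmbeddedCount, getEmbeddedCount, setMaxEmbeddedCount, isThrowOnMaxCount, isEmbeddedCountLimitReached) suggest a bookkeeping pattern that a caller might follow before parsing each embedded document. The sketch below is one possible reading, not Tika's code: EmbeddedCountSketch and tryStartEmbedded are hypothetical, IllegalStateException is a placeholder for whatever exception Tika actually throws, and the sketch assumes a limit of N admits exactly N embedded documents.

```java
// Hypothetical stand-in for illustration; not Tika's ParseRecord.
class EmbeddedCountSketch {

    private final int maxEmbeddedCount;      // -1 means unlimited
    private final boolean throwOnMaxCount;
    private int embeddedCount = 0;
    private boolean embeddedCountLimitReached = false;

    EmbeddedCountSketch(int maxEmbeddedCount, boolean throwOnMaxCount) {
        this.maxEmbeddedCount = maxEmbeddedCount;
        this.throwOnMaxCount = throwOnMaxCount;
    }

    // Returns true if the next embedded document may be parsed.
    // Assumption: a limit of N admits N documents and rejects the next one.
    boolean tryStartEmbedded() {
        if (maxEmbeddedCount >= 0 && embeddedCount >= maxEmbeddedCount) {
            embeddedCountLimitReached = true;
            if (throwOnMaxCount) {
                // Placeholder exception type, chosen for the sketch only.
                throw new IllegalStateException(
                        "max embedded count reached: " + maxEmbeddedCount);
            }
            return false;
        }
        embeddedCount++; // counted just before the embedded document is parsed
        return true;
    }

    int getEmbeddedCount() {
        return embeddedCount;
    }

    boolean isEmbeddedCountLimitReached() {
        return embeddedCountLimitReached;
    }

    public static void main(String[] args) {
        EmbeddedCountSketch counter = new EmbeddedCountSketch(2, false);
        for (int i = 0; i < 3; i++) {
            System.out.println("attempt " + i + ": " + counter.tryStartEmbedded());
        }
        System.out.println("limit reached: " + counter.isEmbeddedCountLimitReached());
    }
}
```

Recording the flag before deciding whether to throw keeps the "limit reached" state observable in both configurations, which is what makes a silent skip distinguishable from a clean parse afterwards.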