Class StandardWriteFilter
java.lang.Object
  org.apache.tika.metadata.writefilter.StandardWriteFilter

All Implemented Interfaces:
Serializable, MetadataWriteFilter
public class StandardWriteFilter extends Object implements MetadataWriteFilter, Serializable
This filter limits the amount of metadata that a parser can add, based on maxTotalEstimatedSize, maxFieldSize, maxValuesPerField, and maxKeySize. It can also limit which fields are stored in the metadata object at write time via includeFields.

All sizes are measured in UTF-16 bytes. The size is a rough order-of-magnitude estimate of what is required to store the string in memory in Java. We recognize that Java uses more bytes to store length, offset, etc. for strings, but that extra overhead varies by Java version and implementation, and we just need a basic estimate. We also recognize that actual memory usage is affected by string interning, etc. Please forgive us ... or consider writing your own write filter. :)

NOTE: Fields in ALWAYS_SET_FIELDS are always set no matter the current state of maxTotalEstimatedSize. Except for TikaCoreProperties.TIKA_CONTENT, they are truncated at maxFieldSize, and their sizes contribute to the maxTotalEstimatedSize.

NOTE: Fields in ALWAYS_ADD_FIELDS are always added no matter the current state of maxTotalEstimatedSize. Except for TikaCoreProperties.TIKA_CONTENT, each addition is truncated at maxFieldSize, and their sizes contribute to the maxTotalEstimatedSize.

This class uses minimumMaxFieldSizeInAlwaysFields to protect the ALWAYS_ADD_FIELDS and ALWAYS_SET_FIELDS. If we didn't have this and a user set the maxFieldSize to, say, 10 bytes, the internal parser behavior would be broken because parsers rely on HttpHeaders.CONTENT_TYPE to determine which parser to call.

NOTE: As with Metadata, this object is not thread safe.

See Also:
Serialized Form
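The size estimate described above can be illustrated with a minimal, self-contained sketch. It is not the filter's actual code: the class name Utf16SizeSketch and both helper methods are hypothetical, and treating a negative limit as "no limit" for values is an assumption borrowed from the documented behavior of maxKeySize. The sketch only shows what "measured in UTF-16 bytes" amounts to (two bytes per Java char) and how a value might be truncated to fit a byte budget.

```java
// Hypothetical illustration only; not part of Apache Tika.
final class Utf16SizeSketch {

    // Rough estimate: a Java String is a sequence of UTF-16 code units,
    // each of which takes 2 bytes. Object overhead is deliberately ignored.
    static int estimateSize(String s) {
        return s == null ? 0 : 2 * s.length();
    }

    // Truncate s so that its estimated size does not exceed maxBytes.
    // Assumption: a negative maxBytes means "do not truncate".
    static String truncate(String s, int maxBytes) {
        if (s == null || maxBytes < 0 || estimateSize(s) <= maxBytes) {
            return s;
        }
        int maxChars = maxBytes / 2;
        // Do not cut a surrogate pair in half.
        if (maxChars > 0 && Character.isHighSurrogate(s.charAt(maxChars - 1))) {
            maxChars--;
        }
        return s.substring(0, maxChars);
    }

    public static void main(String[] args) {
        System.out.println(estimateSize("hello"));        // prints 10
        System.out.println(truncate("hello world", 10));  // prints hello
    }
}
```

Under this estimate, a maxFieldSize of 10 bytes leaves room for only five characters, which is why a very small limit would be too short for a typical content type value.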
-
-
Field Summary
Fields
Modifier and Type      Field                 Description
static Set<String>     ALWAYS_ADD_FIELDS
static Set<String>     ALWAYS_SET_FIELDS
-
Constructor Summary
Constructors
Modifier      Constructor
protected     StandardWriteFilter(int maxKeySize, int maxFieldSize, int maxEstimatedSize, int maxValuesPerField, Set<String> includeFields, boolean includeEmpty)
-
Method Summary
All Methods / Instance Methods / Concrete Methods
Modifier and Type, Method, Description

void add(String field, String value, Map<String,String[]> data)
Based on the field and value, this filter modifies the field and/or the value to something that should be added to the Metadata object.

void filterExisting(Map<String,String[]> data)

void set(String field, String value, Map<String,String[]> data)
Based on the field and the value, this filter modifies the field and/or the value to something that should be set in the Metadata object.
-
-
-
Constructor Detail
-
StandardWriteFilter
protected StandardWriteFilter(int maxKeySize, int maxFieldSize, int maxEstimatedSize, int maxValuesPerField, Set<String> includeFields, boolean includeEmpty)
Parameters:
maxKeySize - maximum key size in UTF-16 bytes; keys will be truncated to this length. If less than 0, keys will not be truncated.
maxEstimatedSize -
includeFields - if null or empty, all fields are included; otherwise, the set of fields to add to the metadata object.
includeEmpty - if true, this will set or add an empty value to the metadata object.
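The includeFields and includeEmpty semantics documented above can be sketched as a single predicate. This is an illustration under stated assumptions, not the class's implementation: IncludeFieldsSketch and shouldWrite are hypothetical names, and the special handling of ALWAYS_SET_FIELDS and ALWAYS_ADD_FIELDS is left out entirely.

```java
import java.util.Set;

// Hypothetical illustration only; not part of Apache Tika.
final class IncludeFieldsSketch {

    // Decide whether a field/value pair would pass the include rules.
    static boolean shouldWrite(String field, String value,
                               Set<String> includeFields, boolean includeEmpty) {
        // A null value is never set or added.
        if (value == null) {
            return false;
        }
        // Empty values are written only when includeEmpty is true.
        if (value.isEmpty() && !includeEmpty) {
            return false;
        }
        // A null or empty includeFields means every field is included.
        if (includeFields == null || includeFields.isEmpty()) {
            return true;
        }
        return includeFields.contains(field);
    }

    public static void main(String[] args) {
        Set<String> only = Set.of("dc:title");
        System.out.println(shouldWrite("dc:title", "Report", only, false));  // prints true
        System.out.println(shouldWrite("dc:creator", "Ann", only, false));   // prints false
        System.out.println(shouldWrite("dc:title", "", null, false));        // prints false
    }
}
```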
-
-
Method Detail
-
filterExisting
public void filterExisting(Map<String,String[]> data)
Specified by:
filterExisting in interface MetadataWriteFilter
-
set
public void set(String field, String value, Map<String,String[]> data)
Description copied from interface: MetadataWriteFilter
Based on the field and the value, this filter modifies the field and/or the value to something that should be set in the Metadata object.
Specified by:
set in interface MetadataWriteFilter
-
add
public void add(String field, String value, Map<String,String[]> data)
Description copied from interface: MetadataWriteFilter
Based on the field and value, this filter modifies the field and/or the value to something that should be added to the Metadata object. If the value is null, no value is set or added. Status updates (e.g. write limit reached) can be added directly to the underlying metadata.
Specified by:
add in interface MetadataWriteFilter
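To make the difference between set and add concrete, here is a minimal sketch that operates on the same Map<String,String[]> shape the methods receive. It is an assumption-laden illustration, not the filter's code: SetVsAddSketch is a hypothetical class, the extra maxValuesPerField argument stands in for the filter's configured limit, and silently dropping values past that limit is an assumed policy. Size limits, key truncation, and the always-set/always-add fields are omitted.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of Apache Tika.
final class SetVsAddSketch {

    // set: replace whatever is stored for the field with a single value.
    static void set(String field, String value, Map<String, String[]> data) {
        if (value == null) {
            return; // a null value is never set
        }
        data.put(field, new String[] {value});
    }

    // add: append a value, up to maxValuesPerField values per field.
    // Assumption: values beyond the limit are silently dropped.
    static void add(String field, String value, Map<String, String[]> data,
                    int maxValuesPerField) {
        if (value == null) {
            return; // a null value is never added
        }
        String[] existing = data.get(field);
        int count = existing == null ? 0 : existing.length;
        if (count >= maxValuesPerField) {
            return;
        }
        String[] updated = existing == null
                ? new String[1]
                : Arrays.copyOf(existing, count + 1);
        updated[count] = value;
        data.put(field, updated);
    }

    public static void main(String[] args) {
        Map<String, String[]> data = new HashMap<>();
        add("k", "a", data, 2);
        add("k", "b", data, 2);
        add("k", "c", data, 2); // dropped: limit of 2 reached
        System.out.println(Arrays.toString(data.get("k"))); // prints [a, b]
        set("k", "z", data);
        System.out.println(Arrays.toString(data.get("k"))); // prints [z]
    }
}
```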
-
-