public class StandardWriteFilter extends Object implements MetadataWriteFilter, Serializable
This filter limits the amount of metadata that a parser can add, based on maxTotalEstimatedSize, maxFieldSize, maxValuesPerField, and maxKeySize. It can also be used to limit which fields are stored in the metadata object at write-time with includeFields.
All sizes are measured in UTF-16 bytes. Each size is a rough, order-of-magnitude estimate of the memory required to store the string in Java. We recognize that Java uses more bytes to store length, offset, etc. for strings, but that extra overhead varies by Java version and implementation, and we just need a basic estimate. We also recognize that actual memory usage is affected by string interning, etc. Please forgive us ... or consider writing your own write filter. :)
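As an illustration of the estimate described above, here is a minimal sketch that assumes two bytes per UTF-16 code unit and ignores all per-object overhead. The class and method names (SizeEstimateSketch, estimateSize, truncateToBytes) are hypothetical and are not part of this class's API.

```java
final class SizeEstimateSketch {
    // Assumption: 2 bytes per Java char (UTF-16 code unit); headers, length
    // fields and interning are deliberately ignored, as the text describes.
    static int estimateSize(String s) {
        return 2 * s.length();
    }

    // Truncates a value so that its estimated size fits within maxBytes.
    // Note: a plain substring may split a surrogate pair at the cut point.
    static String truncateToBytes(String s, int maxBytes) {
        int maxChars = Math.max(0, maxBytes / 2);
        return s.length() <= maxChars ? s : s.substring(0, maxChars);
    }

    public static void main(String[] args) {
        System.out.println(estimateSize("title"));            // prints 10
        System.out.println(truncateToBytes("abcdefgh", 10));  // prints abcde
    }
}
```

Under this assumption a limit of 10 bytes corresponds to 5 characters, which is worth keeping in mind when choosing limits.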
NOTE: Fields in ALWAYS_SET_FIELDS are always set no matter the current state of maxTotalEstimatedSize. Except for TikaCoreProperties.TIKA_CONTENT, they are truncated at maxFieldSize, and their sizes contribute to the maxTotalEstimatedSize.
NOTE: Fields in ALWAYS_ADD_FIELDS are always added no matter the current state of maxTotalEstimatedSize. Except for TikaCoreProperties.TIKA_CONTENT, each addition is truncated at maxFieldSize, and their sizes contribute to the maxTotalEstimatedSize.
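The two notes above can be pictured with a simplified sketch. It is not the actual implementation: the set contents, the class name AlwaysFieldsSketch, the two-bytes-per-char accounting of key plus value, and the choice to drop (rather than shorten) an ordinary field that does not fit are all assumptions made for illustration. The TikaCoreProperties.TIKA_CONTENT exception is not modeled.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

final class AlwaysFieldsSketch {
    // Hypothetical stand-in for ALWAYS_SET_FIELDS; the real contents are not listed on this page.
    static final Set<String> ALWAYS_SET = Set.of("Content-Type");

    final int maxFieldSize;           // per-value limit, in estimated bytes
    final int maxTotalEstimatedSize;  // overall budget, in estimated bytes
    int estimatedSize = 0;
    final Map<String, String> data = new HashMap<>();

    AlwaysFieldsSketch(int maxFieldSize, int maxTotalEstimatedSize) {
        this.maxFieldSize = maxFieldSize;
        this.maxTotalEstimatedSize = maxTotalEstimatedSize;
    }

    // Assumption: two bytes per char, counting both key and value.
    static int estimate(String field, String value) {
        return 2 * (field.length() + value.length());
    }

    void set(String field, String value) {
        // Every value, including "always" fields, is truncated at maxFieldSize.
        int maxChars = Math.max(0, maxFieldSize / 2);
        String v = value.length() <= maxChars ? value : value.substring(0, maxChars);
        String old = data.get(field);
        int oldSize = old == null ? 0 : estimate(field, old);
        int newTotal = estimatedSize - oldSize + estimate(field, v);
        if (!ALWAYS_SET.contains(field) && newTotal > maxTotalEstimatedSize) {
            return; // simplification: an ordinary field that does not fit is dropped
        }
        data.put(field, v);       // "always" fields bypass the total budget...
        estimatedSize = newTotal; // ...but their size still counts toward it
    }
}
```

With maxFieldSize 20 and maxTotalEstimatedSize 30, an ordinary field is rejected once the budget is used up, while the hypothetical "Content-Type" entry is still stored, truncated to 10 characters.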
This class uses minimumMaxFieldSizeInAlwaysFields to protect the ALWAYS_ADD_FIELDS and ALWAYS_SET_FIELDS. Without it, if a user set the maxFieldSize to, say, 10 bytes, the internal parser behavior would be broken, because parsers rely on HttpHeaders.CONTENT_TYPE to determine which parser to call.
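One way to picture this protection is as a floor on the field-size limit for the protected fields. The sketch below is an assumption about the shape of that rule, not the class's actual code; the constant value 300 and the names are hypothetical.

```java
final class MinFieldSizeSketch {
    // Hypothetical floor, in estimated bytes; the real value of
    // minimumMaxFieldSizeInAlwaysFields is not stated on this page.
    static final int MIN_MAX_FIELD_SIZE_IN_ALWAYS_FIELDS = 300;

    // For protected fields the user's limit cannot go below the floor;
    // all other fields use the configured limit unchanged.
    static int effectiveMaxFieldSize(boolean alwaysField, int maxFieldSize) {
        return alwaysField
                ? Math.max(MIN_MAX_FIELD_SIZE_IN_ALWAYS_FIELDS, maxFieldSize)
                : maxFieldSize;
    }

    public static void main(String[] args) {
        System.out.println(effectiveMaxFieldSize(true, 10));   // prints 300
        System.out.println(effectiveMaxFieldSize(false, 10));  // prints 10
    }
}
```

With such a floor, a 10-byte maxFieldSize would still leave room for a complete content type value in the protected fields.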
NOTE: as with Metadata, this object is not thread safe.

Modifier and Type | Field and Description |
---|---|
static Set<String> | ALWAYS_ADD_FIELDS |
static Set<String> | ALWAYS_SET_FIELDS |

Modifier | Constructor and Description |
---|---|
protected | StandardWriteFilter(int maxKeySize, int maxFieldSize, int maxEstimatedSize, int maxValuesPerField, Set<String> includeFields, boolean includeEmpty) |

Modifier and Type | Method and Description |
---|---|
void | add(String field, String value, Map<String,String[]> data) Based on the field and value, this filter modifies the field and/or the value to something that should be added to the Metadata object. |
void | filterExisting(Map<String,String[]> data) |
void | set(String field, String value, Map<String,String[]> data) Based on the field and the value, this filter modifies the field and/or the value to something that should be set in the Metadata object. |

protected StandardWriteFilter(int maxKeySize, int maxFieldSize, int maxEstimatedSize, int maxValuesPerField, Set<String> includeFields, boolean includeEmpty)
Parameters:
maxKeySize - maximum key size in UTF-16 bytes; keys will be truncated to this length; if less than 0, keys will not be truncated
maxEstimatedSize -
includeFields - if null or empty, all fields are included; otherwise, the fields to add to the metadata object
includeEmpty - if true, this will set or add an empty value to the metadata object

public void filterExisting(Map<String,String[]> data)
Specified by: filterExisting in interface MetadataWriteFilter
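The documented rules for the maxKeySize, includeFields and includeEmpty constructor parameters can be sketched as three small checks. This is an illustrative reading of the parameter descriptions, not the class's actual code: the names are hypothetical, sizes assume two bytes per char, and "empty" is taken to mean a zero-length string.

```java
import java.util.Set;

final class IncludeRulesSketch {
    // includeFields: null or empty means every field is included.
    static boolean includeField(String field, Set<String> includeFields) {
        return includeFields == null || includeFields.isEmpty()
                || includeFields.contains(field);
    }

    // includeEmpty: when false, empty values are skipped.
    // Assumption: "empty" means length zero; null values are never included.
    static boolean includeValue(String value, boolean includeEmpty) {
        if (value == null) {
            return false;
        }
        return includeEmpty || !value.isEmpty();
    }

    // maxKeySize: keys are truncated to this many estimated bytes;
    // a negative value disables key truncation.
    static String truncateKey(String key, int maxKeySize) {
        if (maxKeySize < 0) {
            return key;
        }
        int maxChars = maxKeySize / 2; // assumption: 2 bytes per char
        return key.length() <= maxChars ? key : key.substring(0, maxChars);
    }
}
```

For example, under these assumptions truncateKey("dc:description", 6) yields "dc:", and truncateKey("dc:description", -1) returns the key unchanged.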
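
public void set(String field, String value, Map<String,String[]> data)
Description copied from interface: MetadataWriteFilter
Based on the field and the value, this filter modifies the field and/or the value to something that should be set in the Metadata object.
Specified by: set in interface MetadataWriteFilter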
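
public void add(String field, String value, Map<String,String[]> data)
Description copied from interface: MetadataWriteFilter
Based on the field and value, this filter modifies the field and/or the value to something that should be added to the Metadata object. If the value is null, no value is set or added. Status updates (e.g. write limit reached) can be added directly to the underlying metadata.
Specified by: add in interface MetadataWriteFilter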
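The add contract described above, combined with a maxValuesPerField limit, might look roughly like the following. This is a sketch under stated assumptions rather than the real method: the status key "sketch:limit-reached" is hypothetical, and size limits are left out to keep the example short.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class AddValuesSketch {
    // Hypothetical status key; the real property used for status updates is not named on this page.
    static final String LIMIT_REACHED = "sketch:limit-reached";

    static void add(String field, String value, Map<String, String[]> data,
                    int maxValuesPerField) {
        if (value == null) {
            return; // a null value is neither set nor added
        }
        String[] existing = data.get(field);
        int count = existing == null ? 0 : existing.length;
        if (count >= maxValuesPerField) {
            // The status update is written directly into the underlying map.
            data.put(LIMIT_REACHED, new String[] {"true"});
            return;
        }
        String[] grown = existing == null
                ? new String[1]
                : Arrays.copyOf(existing, count + 1);
        grown[count] = value; // append the new value after the existing ones
        data.put(field, grown);
    }

    public static void main(String[] args) {
        Map<String, String[]> data = new HashMap<>();
        add("dc:subject", "a", data, 2);
        add("dc:subject", null, data, 2);
        add("dc:subject", "b", data, 2);
        add("dc:subject", "c", data, 2);
        System.out.println(Arrays.toString(data.get("dc:subject"))); // prints [a, b]
    }
}
```

The third non-null value is rejected because the field already holds two values, and the hypothetical status key records that a limit was hit.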
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.