public interface TikaCoreProperties
Users of Tika who want consistent metadata across file formats can rely on these properties: where a property is present, it has the same semantic meaning regardless of the source file format. For example, whether one file format calls a field Title, another Long-Title and another Long-Name, if they all mean the same thing as defined by DublinCore.TITLE, then they are all exposed as that property.
For now, most of these properties are composite ones that include the deprecated non-prefixed String properties from the Metadata class. In Tika 2.0, most of these will revert to simple assignments.
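To illustrate the idea of one canonical key per concept, here is a minimal sketch in plain Java. It does not use Tika itself: a `Map` stands in for Tika's `Metadata` object, and the class name `TitleNormalizer`, the alias set, and the canonical key string `dc:title` are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the normalization idea; not part of the Tika API.
final class TitleNormalizer {

    // Assumed canonical key for the title concept.
    static final String CANONICAL_TITLE = "dc:title";

    // Format-specific labels (taken from the prose above) that all mean "title".
    private static final Set<String> TITLE_ALIASES =
            Set.of("Title", "Long-Title", "Long-Name");

    // Returns a copy of raw in which every title alias is stored under the canonical key.
    static Map<String, String> normalize(Map<String, String> raw) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : raw.entrySet()) {
            String key = TITLE_ALIASES.contains(e.getKey()) ? CANONICAL_TITLE : e.getKey();
            out.put(key, e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two "formats" that label the same concept differently.
        Map<String, String> a = normalize(Map.of("Long-Title", "Annual Report"));
        Map<String, String> b = normalize(Map.of("Long-Name", "Annual Report"));
        // After normalization both can be read through the same key.
        System.out.println(a.get(CANONICAL_TITLE).equals(b.get(CANONICAL_TITLE)));
    }
}
```

A caller that only ever reads the canonical key does not need to know which label the original format used, which is the guarantee these properties aim to provide.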
Modifier and Type | Interface and Description |
---|---|
static class | TikaCoreProperties.EmbeddedResourceType: A file might contain different types of embedded documents. |
Modifier and Type | Field and Description |
---|---|
static Property | ALTITUDE |
static Property | COMMENTS |
static Property | CONTAINER_EXCEPTION |
static Property | CONTENT_TYPE_HINT: This is currently used to identify Content-Type that may be included within a document, such as in HTML documents. |
static Property | CONTENT_TYPE_PARSER_OVERRIDE: This is used by parsers to override detection of embedded resources with the override detector. |
static Property | CONTENT_TYPE_USER_OVERRIDE: This is used by users to override detection with the override detector. |
static Property | CONTRIBUTOR |
static Property | COVERAGE |
static Property | CREATED |
static Property | CREATOR |
static Property | CREATOR_TOOL |
static Property | DESCRIPTION |
static Property | EMBEDDED_DEPTH |
static Property | EMBEDDED_EXCEPTION |
static Property | EMBEDDED_ID: This is a 1-based counter for embedded files, used by the RecursiveParserWrapper. |
static Property | EMBEDDED_ID_PATH: This tracks the embedded file paths based on the embedded file's EMBEDDED_ID. |
static String | EMBEDDED_RELATIONSHIP_ID |
static Property | EMBEDDED_RESOURCE_PATH: This tracks the embedded file paths based on the name of embedded files where available. |
static Property | EMBEDDED_RESOURCE_TYPE: Embedded resource type property. |
static String | EMBEDDED_RESOURCE_TYPE_KEY |
static String | EMBEDDED_STORAGE_CLASS_ID |
static Property | EMBEDDED_WARNING |
static Property | FORMAT |
static Property | HAS_SIGNATURE |
static Property | IDENTIFIER |
static Property | IS_ENCRYPTED |
static Property | LANGUAGE |
static Property | LATITUDE |
static Property | LONGITUDE |
static Property | METADATA_DATE |
static Property | MODIFIED |
static Property | MODIFIER |
static String | NAMESPACE_PREFIX_DELIMITER: The common delimiter used between the namespace abbreviation and the property name. |
static Property | ORIGINAL_RESOURCE_NAME: Some file formats can store information about their original file name/location or about their attachment's original file name/location within the file. |
static Property | PARSE_TIME_MILLIS |
static Property | PRINT_DATE |
static String | PROTECTED |
static Property | PUBLISHER |
static Property | RATING |
static Property | RELATION |
static String | RESOURCE_NAME_KEY |
static Property | RIGHTS |
static Property | SIGNATURE_CONTACT_INFO |
static Property | SIGNATURE_DATE |
static Property | SIGNATURE_FILTER |
static Property | SIGNATURE_LOCATION |
static Property | SIGNATURE_NAME |
static Property | SIGNATURE_REASON |
static Property | SOURCE |
static Property | SOURCE_PATH: This should be used to store the path (relative or full) of the source file, including the file name. |
static Property | SUBJECT: DublinCore.SUBJECT; should include both subject and keywords if a document format has both. |
static Property | TIKA_CONTENT |
static Property | TIKA_CONTENT_HANDLER: Simple class name of the content handler. |
static Property | TIKA_DETECTED_LANGUAGE |
static Property | TIKA_DETECTED_LANGUAGE_CONFIDENCE |
static Property | TIKA_DETECTED_LANGUAGE_CONFIDENCE_RAW |
static Property | TIKA_META_EXCEPTION_EMBEDDED_STREAM: Use this to store exceptions caught while trying to read the stream of an embedded resource. |
static String | TIKA_META_EXCEPTION_PREFIX: Use this to store parse exception information in the Metadata object. |
static Property | TIKA_META_EXCEPTION_WARNING: Use this to store exceptions caught during a parse that are non-fatal. |
static String | TIKA_META_PREFIX: Use this to prefix metadata properties that store information about the parsing process. |
static String | TIKA_META_WARN_PREFIX: Use this to store warnings that happened during the parse. |
static Property | TIKA_PARSED_BY |
static Property | TIKA_PARSED_BY_FULL_SET: Use this to store a record of all parsers that touched a given file in the container file's metadata. |
static Property | TITLE |
static Property | TRUNCATED_METADATA: This means that metadata keys or metadata values were truncated. |
static Property | TYPE |
static Property | WRITE_LIMIT_REACHED |
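The relationship between `EMBEDDED_ID` and `EMBEDDED_ID_PATH` described in the field summary can be sketched without Tika. The example below assumes a depth-first walk that numbers embedded files from 1 and joins the ids of a file's ancestors with `/`. The `Node` type, the method names, and the exact path format are assumptions for illustration, not the behaviour of `RecursiveParserWrapper` itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model of a container file and its embedded files; not Tika code.
final class EmbeddedIdSketch {

    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Last id handed out; the first embedded file receives 1.
    private int counter = 0;

    // Maps each embedded file name to its id path. The container itself gets no entry.
    static Map<String, String> idPaths(Node container) {
        Map<String, String> paths = new LinkedHashMap<>();
        new EmbeddedIdSketch().walk(container, "", paths);
        return paths;
    }

    private void walk(Node parent, String parentPath, Map<String, String> paths) {
        for (Node child : parent.children) {
            int id = ++counter;                  // 1-based, unique across the whole tree
            String path = parentPath + "/" + id; // ids of all ancestors, then this file
            paths.put(child.name, path);
            walk(child, path, paths);            // descend before visiting siblings
        }
    }

    public static void main(String[] args) {
        Node zip = new Node("archive.zip")
                .add(new Node("report.docx").add(new Node("image1.png")))
                .add(new Node("notes.txt"));
        System.out.println(idPaths(zip));
    }
}
```

Because the ids are unique within the container, a path built from them cannot collide, whereas a path built from file names (the approach `EMBEDDED_RESOURCE_PATH` takes) depends on names being available.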
static final String NAMESPACE_PREFIX_DELIMITER
static final String TIKA_META_PREFIX
static final Property EMBEDDED_DEPTH
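static final Property EMBEDDED_RESOURCE_PATH
This tracks the embedded file paths based on the name of embedded files where available. See also EMBEDDED_ID_PATH.
static final Property EMBEDDED_ID_PATH
This tracks the embedded file paths based on the embedded file's EMBEDDED_ID.
static final Property EMBEDDED_ID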
static final Property PARSE_TIME_MILLIS
static final Property TIKA_CONTENT_HANDLER
static final Property TIKA_CONTENT
static final String TIKA_META_EXCEPTION_PREFIX
static final String TIKA_META_WARN_PREFIX
static final Property CONTAINER_EXCEPTION
static final Property EMBEDDED_EXCEPTION
static final Property EMBEDDED_WARNING
static final Property WRITE_LIMIT_REACHED
static final Property TIKA_META_EXCEPTION_WARNING
static final Property TRUNCATED_METADATA
static final Property TIKA_META_EXCEPTION_EMBEDDED_STREAM
static final Property TIKA_PARSED_BY
static final Property TIKA_PARSED_BY_FULL_SET
static final Property TIKA_DETECTED_LANGUAGE
static final Property TIKA_DETECTED_LANGUAGE_CONFIDENCE
static final Property TIKA_DETECTED_LANGUAGE_CONFIDENCE_RAW
static final String RESOURCE_NAME_KEY
static final String PROTECTED
static final String EMBEDDED_RELATIONSHIP_ID
static final String EMBEDDED_STORAGE_CLASS_ID
static final String EMBEDDED_RESOURCE_TYPE_KEY
static final Property ORIGINAL_RESOURCE_NAME
static final Property SOURCE_PATH
This can also be used for a primary key within a database.
static final Property CONTENT_TYPE_HINT
static final Property CONTENT_TYPE_USER_OVERRIDE
static final Property CONTENT_TYPE_PARSER_OVERRIDE
static final Property FORMAT
DublinCore.FORMAT
static final Property IDENTIFIER
DublinCore.IDENTIFIER
static final Property CONTRIBUTOR
DublinCore.CONTRIBUTOR
static final Property COVERAGE
DublinCore.COVERAGE
static final Property CREATOR
DublinCore.CREATOR
static final Property MODIFIER
Office.LAST_AUTHOR
static final Property CREATOR_TOOL
XMP.CREATOR_TOOL
static final Property LANGUAGE
DublinCore.LANGUAGE
static final Property PUBLISHER
DublinCore.PUBLISHER
static final Property RELATION
DublinCore.RELATION
static final Property RIGHTS
DublinCore.RIGHTS
static final Property SOURCE
DublinCore.SOURCE
static final Property TYPE
DublinCore.TYPE
static final Property TITLE
DublinCore.TITLE
static final Property DESCRIPTION
DublinCore.DESCRIPTION
static final Property SUBJECT
DublinCore.SUBJECT; should include both subject and keywords if a document format has both. See also Office.KEYWORDS and OfficeOpenXMLCore.SUBJECT.
static final Property CREATED
DublinCore.DATE
static final Property MODIFIED
DublinCore.MODIFIED, Office.SAVE_DATE
static final Property PRINT_DATE
Office.PRINT_DATE
static final Property METADATA_DATE
XMP.METADATA_DATE
static final Property LATITUDE
Geographic.LATITUDE
static final Property LONGITUDE
Geographic.LONGITUDE
static final Property ALTITUDE
Geographic.ALTITUDE
static final Property RATING
XMP.RATING
static final Property COMMENTS
OfficeOpenXMLExtended.COMMENTS
static final Property EMBEDDED_RESOURCE_TYPE
static final Property HAS_SIGNATURE
static final Property SIGNATURE_NAME
static final Property SIGNATURE_DATE
static final Property SIGNATURE_LOCATION
static final Property SIGNATURE_REASON
static final Property SIGNATURE_FILTER
static final Property SIGNATURE_CONTACT_INFO
static final Property IS_ENCRYPTED
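The string constants `NAMESPACE_PREFIX_DELIMITER` and `TIKA_META_PREFIX` listed above are building blocks for metadata key names. As a rough sketch of how such keys compose, the following plain-Java example assumes the delimiter is `:` and the Tika prefix is `X-TIKA:`. This page does not state those values, so treat them, the class `MetaKeys`, and its methods as illustrative assumptions.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the constant values below are assumptions, not taken from this page.
final class MetaKeys {

    static final String DELIMITER = ":";                       // assumed NAMESPACE_PREFIX_DELIMITER
    static final String PROCESS_PREFIX = "X-TIKA" + DELIMITER; // assumed TIKA_META_PREFIX

    // Joins a namespace abbreviation and a property name with the delimiter.
    static String key(String namespace, String property) {
        return namespace + DELIMITER + property;
    }

    // Keeps only the entries whose key marks them as information about the parsing process.
    static Map<String, String> processEntries(Map<String, String> metadata) {
        Map<String, String> out = new TreeMap<>();
        metadata.forEach((k, v) -> {
            if (k.startsWith(PROCESS_PREFIX)) {
                out.put(k, v);
            }
        });
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> metadata = Map.of(
                key("dc", "title"), "Annual Report",
                PROCESS_PREFIX + "parse_time_millis", "42");
        System.out.println(processEntries(metadata).keySet());
    }
}
```

Separating document metadata from process metadata by prefix lets a consumer drop or log the parser bookkeeping in one pass without enumerating individual keys.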
Copyright © 2007–2023 The Apache Software Foundation. All rights reserved.