Package org.apache.tika.batch
Interface FileResource
- All Known Implementing Classes:
FSFileResource
public interface FileResource
This is a basic interface to handle a logical "file".
This should enable code-agnostic handling of files from different
sources: file system, database, etc.
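As an illustration of a non-file-system source, the following is a minimal sketch of a resource backed by bytes that were loaded from a database row. To keep it compilable without Tika on the classpath, `SketchFileResource` is a simplified stand-in for this interface, and a `Map<String, String>` is substituted for Tika's `Metadata`. The class name `DbRowFileResource`, the `db:` id prefix, and the metadata keys are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified stand-in for org.apache.tika.batch.FileResource (assumption:
// the real getMetadata() returns a Tika Metadata object, not a Map).
interface SketchFileResource {
    String getResourceId();
    Map<String, String> getMetadata();
    InputStream openInputStream() throws IOException;
}

// Hypothetical resource whose content came from a database row.
final class DbRowFileResource implements SketchFileResource {
    private final String rowId;
    private final String fileName;
    private final byte[] content;

    DbRowFileResource(String rowId, String fileName, byte[] content) {
        this.rowId = rowId;
        this.fileName = fileName;
        this.content = content.clone(); // defensive copy
    }

    @Override
    public String getResourceId() {
        // Used for logging only, so a readable id is enough.
        return "db:" + rowId;
    }

    @Override
    public Map<String, String> getMetadata() {
        // Only "external" facts, known without parsing the content.
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("fileName", fileName);
        metadata.put("fileSize", Integer.toString(content.length));
        return metadata;
    }

    @Override
    public InputStream openInputStream() {
        // Each call returns a fresh stream over the same bytes.
        return new ByteArrayInputStream(content);
    }
}
```

A caller written against the interface would not need to know whether the bytes came from disk or from a database.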
-
Field Summary
-
Method Summary
Modifier and Type | Method | Description
Metadata | getMetadata() | This gets the metadata available before the parsing of the file.
String | getResourceId() | This is only used in logging to identify which file may have caused problems.
InputStream | openInputStream() |
-
Field Details
-
FILE_EXTENSION
-
-
Method Details
-
getResourceId
String getResourceId()
This is only used in logging to identify which file may have caused problems. While it is probably best to use unique ids for the sake of debugging, the ids do not have to be unique. This id is never used as a hash key by the batch processors, for example.
- Returns:
- an id for a FileResource
-
getMetadata
Metadata getMetadata()
This gets the metadata available before the parsing of the file. This will typically be "external" metadata: file name, file size, file location, data stream, etc. That is, things that are known about the file from outside information, not file-internal metadata.
- Returns:
- Metadata
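To make the distinction between external and file-internal metadata concrete, here is a small sketch that gathers only facts available without reading the file's content. It uses a plain `Map<String, String>` in place of Tika's `Metadata`, and the helper `ExternalMetadata.describe` and its key names are hypothetical, chosen only for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

final class ExternalMetadata {
    // Collects facts known from outside the file: name, size, location.
    // Nothing here opens or parses the file's content.
    static Map<String, String> describe(Path file) throws IOException {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("fileName", file.getFileName().toString());
        metadata.put("fileSize", Long.toString(Files.size(file)));
        metadata.put("fileLocation", file.toAbsolutePath().toString());
        return metadata;
    }
}
```

Values such as a document's title or author would be file-internal metadata and would only become available after parsing.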
-
openInputStream
InputStream openInputStream() throws IOException
- Returns:
- an InputStream for the FileResource
- Throws:
IOException
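A consumer would typically open the stream in a try-with-resources block and fall back on the resource id when reporting a failure. The sketch below assumes a reduced stand-in interface, `StreamSource`, that carries only the two methods it needs; `ByteCounter` and its `-1` failure value are hypothetical and not part of the batch package.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Reduced stand-in for the two FileResource methods this sketch uses.
interface StreamSource {
    String getResourceId();
    InputStream openInputStream() throws IOException;
}

final class ByteCounter {
    private static final Logger LOG = Logger.getLogger(ByteCounter.class.getName());

    // Returns the number of bytes read, or -1 if opening or reading failed.
    static long countBytes(StreamSource resource) {
        // try-with-resources closes the stream on both success and failure.
        try (InputStream in = resource.openInputStream()) {
            long total = 0;
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            // The resource id is used here only to identify the problem file.
            LOG.log(Level.WARNING, "Could not read resource " + resource.getResourceId(), e);
            return -1;
        }
    }
}
```

Because the id appears only in the log message, two resources sharing an id would make debugging harder but would not change the result.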
-