Package org.apache.tika.batch
Interface FileResource
-
All Known Implementing Classes: FSFileResource
public interface FileResource
A basic interface for handling a logical "file". It lets calling code handle files uniformly regardless of where they come from: file system, database, etc.
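As a rough illustration of a non-file-system source, the sketch below backs a resource with a byte array, such as a BLOB read from a database row. To stay compilable without Tika on the classpath it declares a local stand-in interface, FileResourceLike, that mirrors the three documented methods; real code would implement org.apache.tika.batch.FileResource instead. The Map returned by getMetadata() is an assumed substitute for Tika's Metadata type, and the class names, key names, and sample id are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Local stand-in mirroring the documented methods (assumption: a plain Map
// replaces Tika's Metadata so the sketch compiles with the JDK alone).
interface FileResourceLike {
    String getResourceId();
    Map<String, String> getMetadata();
    InputStream openInputStream() throws IOException;
}

// Hypothetical implementation backed by bytes already held in memory.
final class InMemoryFileResource implements FileResourceLike {
    private final String resourceId;
    private final byte[] content;
    private final Map<String, String> metadata = new LinkedHashMap<>();

    InMemoryFileResource(String resourceId, String fileName, byte[] content) {
        this.resourceId = resourceId;
        this.content = content.clone();
        // "External" metadata: known before any parsing takes place.
        metadata.put("fileName", fileName);
        metadata.put("fileSize", Integer.toString(content.length));
    }

    @Override public String getResourceId() { return resourceId; }
    @Override public Map<String, String> getMetadata() { return metadata; }
    // Each call returns a fresh stream positioned at the start of the content.
    @Override public InputStream openInputStream() { return new ByteArrayInputStream(content); }
}

class InMemoryFileResourceDemo {
    public static void main(String[] args) throws IOException {
        FileResourceLike resource = new InMemoryFileResource(
                "db-row-17", "notes.txt", "hello".getBytes(StandardCharsets.UTF_8));
        try (InputStream in = resource.openInputStream()) {
            System.out.println(resource.getResourceId() + ": " + in.readAllBytes().length + " bytes");
        }
    }
}
```

Code that consumes the resource sees only the interface, so a file-system-backed implementation could be swapped in without changing the caller.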
-
Field Summary
Modifier and Type    Field             Description
static Property      FILE_EXTENSION
-
Method Summary
Modifier and Type    Method               Description
Metadata             getMetadata()        This gets the metadata available before the parsing of the file.
String               getResourceId()      This is only used in logging to identify which file may have caused problems.
InputStream          openInputStream()
-
Field Detail
-
FILE_EXTENSION
static final Property FILE_EXTENSION
-
Method Detail
-
getResourceId
String getResourceId()
This id is used only in logging, to identify which file may have caused a problem. Unique ids are best for debugging, but uniqueness is not required; for example, the batch processors never use this id as a hash key.
- Returns:
- an id for a FileResource
-
getMetadata
Metadata getMetadata()
Gets the metadata available before the file is parsed. This is typically "external" metadata: file name, file size, file location, data stream, etc. That is, information known about the file from outside sources, not file-internal metadata.
- Returns:
- Metadata
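A minimal sketch of what collecting "external" metadata could look like for a file on disk: everything is taken from the file system and the file's content is never read. The Map and its key names (fileName, fileSize, fileLocation) are assumptions standing in for Tika's Metadata object and its keys; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Map;

class ExternalMetadataSketch {
    // Gathers only facts the file system already knows about the file.
    // Assumption: a Map with made-up keys stands in for Tika's Metadata.
    static Map<String, String> externalMetadata(Path file) throws IOException {
        Map<String, String> metadata = new LinkedHashMap<>();
        metadata.put("fileName", file.getFileName().toString());
        metadata.put("fileSize", Long.toString(Files.size(file)));
        metadata.put("fileLocation", file.toAbsolutePath().toString());
        return metadata;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("sample", ".txt");
        try {
            Files.write(tmp, "abc".getBytes(StandardCharsets.UTF_8));
            externalMetadata(tmp).forEach((key, value) -> System.out.println(key + " = " + value));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```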
-
openInputStream
InputStream openInputStream() throws IOException
- Returns:
- an InputStream for the FileResource
- Throws:
IOException
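One way a caller might combine openInputStream() with getResourceId(): open the stream in a try-with-resources block so it is always closed, and include the resource id in the log message if an IOException occurs. This is a sketch under assumptions, not how Tika's batch processors are implemented. StreamSource is a local stand-in exposing only the two documented methods used here, and countBytes, fixed, and broken are hypothetical helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.logging.Level;
import java.util.logging.Logger;

// Stand-in with just the two documented methods this sketch needs.
interface StreamSource {
    String getResourceId();
    InputStream openInputStream() throws IOException;
}

class StreamConsumerSketch {
    private static final Logger LOG = Logger.getLogger(StreamConsumerSketch.class.getName());

    // Returns the number of bytes read, or -1 if the resource could not be read.
    static long countBytes(StreamSource resource) {
        try (InputStream in = resource.openInputStream()) {
            long total = 0;
            byte[] buffer = new byte[8192];
            int n;
            while ((n = in.read(buffer)) != -1) {
                total += n;
            }
            return total;
        } catch (IOException e) {
            // The id appears only in the log message; it is never used as a key.
            LOG.log(Level.WARNING, "Could not read resource " + resource.getResourceId(), e);
            return -1;
        }
    }

    // Hypothetical source that serves a fixed byte array.
    static StreamSource fixed(String id, byte[] data) {
        return new StreamSource() {
            @Override public String getResourceId() { return id; }
            @Override public InputStream openInputStream() { return new ByteArrayInputStream(data); }
        };
    }

    // Hypothetical source that always fails to open, to exercise the logging path.
    static StreamSource broken(String id) {
        return new StreamSource() {
            @Override public String getResourceId() { return id; }
            @Override public InputStream openInputStream() throws IOException {
                throw new IOException("simulated failure");
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(countBytes(fixed("ok-1", new byte[10])));
        System.out.println(countBytes(broken("bad-1")));
    }
}
```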