Class FSFileResource

java.lang.Object
org.apache.tika.batch.fs.FSFileResource
All Implemented Interfaces:
FileResource

public class FSFileResource extends Object implements FileResource
A file system (FS) resource that wraps a file name.

This class automatically sets the following keys in Metadata:

  • TikaCoreProperties.RESOURCE_NAME_KEY (file name)
  • Metadata.CONTENT_LENGTH
  • FSProperties.FS_REL_PATH
  • FileResource.FILE_EXTENSION
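
As a rough illustration of the kind of "external" values listed above, the sketch below derives a file name, length, relative path and extension from a pair of paths using only java.nio. It is not the Tika implementation: the class name, the map keys and the extension rule (text after the last dot, lower-cased) are assumptions made for the example, and the keys are placeholders rather than the values of the Tika constants.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class ExternalMetadataSketch {

    // Placeholder key names for this example only; NOT the values of the Tika constants.
    static final String NAME = "name";
    static final String LENGTH = "length";
    static final String REL_PATH = "relPath";
    static final String EXTENSION = "extension";

    // Collects values that are known about a file without parsing its content.
    static Map<String, String> describe(Path inputRoot, Path fullPath) throws IOException {
        Map<String, String> values = new LinkedHashMap<>();
        String fileName = fullPath.getFileName().toString();
        values.put(NAME, fileName);
        values.put(LENGTH, Long.toString(Files.size(fullPath)));
        values.put(REL_PATH, inputRoot.relativize(fullPath).toString());
        values.put(EXTENSION, extensionOf(fileName));
        return values;
    }

    // Assumed rule: text after the last dot, lower-cased; empty when there is
    // no dot, only a leading dot (".bashrc"), or a trailing dot.
    static String extensionOf(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot <= 0 || dot == fileName.length() - 1) {
            return "";
        }
        return fileName.substring(dot + 1).toLowerCase(Locale.ROOT);
    }
}
```

All four values come from the file system alone, which matches the distinction between external and file-internal metadata drawn in getMetadata.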
  • Constructor Details

    • FSFileResource

      public FSFileResource(Path inputRoot, Path fullPath)
      Constructor
      Parameters:
      inputRoot - the input root for the file
      fullPath - the full path to the file
      Throws:
      IllegalArgumentException - if the fullPath is not a child of inputRoot
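
The child-of-root requirement can be sketched with plain java.nio.file calls. This is a minimal illustration, not the Tika source: the class and method names are hypothetical, and both normalizing the paths before comparing them and rejecting a fullPath equal to inputRoot are assumptions of the example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class ChildPathCheckSketch {

    // Returns fullPath relative to inputRoot, or throws if fullPath is not below inputRoot.
    static Path relativeTo(Path inputRoot, Path fullPath) {
        // Assumption: normalize first so that ".." segments cannot escape the root.
        Path root = inputRoot.toAbsolutePath().normalize();
        Path full = fullPath.toAbsolutePath().normalize();
        // Path.startsWith compares whole name elements, so "/data/input2"
        // does not start with "/data/in".
        if (!full.startsWith(root) || full.equals(root)) {
            throw new IllegalArgumentException(
                    fullPath + " is not a child of " + inputRoot);
        }
        return root.relativize(full);
    }

    public static void main(String[] args) {
        Path root = Paths.get("data", "in");
        System.out.println(relativeTo(root, Paths.get("data", "in", "a", "b.txt")));
    }
}
```

A relative path computed this way is the sort of value getResourceId could return, since that method is documented as returning the file's relative path.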
  • Method Details

    • getResourceId

      public String getResourceId()
      Description copied from interface: FileResource
      This is only used in logging to identify which file may have caused problems. While it is probably best to use unique ids for the sake of debugging, it is not necessary that the ids be unique. This id is never used as a hash key by the batch processors, for example.
      Specified by:
      getResourceId in interface FileResource
      Returns:
      file's relativePath
    • getMetadata

      public Metadata getMetadata()
      Description copied from interface: FileResource
      This gets the metadata available before the parsing of the file. This will typically be "external" metadata: file name, file size, file location, data stream, etc. That is, things that are known about the file from outside information, not file-internal metadata.
      Specified by:
      getMetadata in interface FileResource
      Returns:
      Metadata
    • openInputStream

      public InputStream openInputStream() throws IOException
      Specified by:
      openInputStream in interface FileResource
      Returns:
      an InputStream for the FileResource
      Throws:
      IOException
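
The entry above does not say who closes the returned stream. A common convention, assumed here, is that the caller does, typically with try-with-resources. The sketch uses a hypothetical StreamSource interface as a stand-in for the single method discussed and backs it with Files.newInputStream; it does not reproduce the Tika classes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

final class OpenStreamSketch {

    // Hypothetical stand-in exposing only the method discussed; not the Tika interface.
    interface StreamSource {
        InputStream openInputStream() throws IOException;
    }

    // A file-backed source: each call opens a fresh stream on the same path.
    static StreamSource forFile(Path fullPath) {
        return () -> Files.newInputStream(fullPath);
    }

    // Reads the stream to the end and returns the number of bytes seen.
    static long countBytes(StreamSource source) throws IOException {
        // try-with-resources closes the stream even if read() throws.
        try (InputStream in = source.openInputStream()) {
            long total = 0;
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                total += read;
            }
            return total;
        }
    }
}
```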