Class PipesParser

java.lang.Object
org.apache.tika.pipes.core.PipesParser
All Implemented Interfaces:
Closeable, AutoCloseable

public class PipesParser extends Object implements Closeable
  • Method Details

    • load

      public static PipesParser load(Path tikaConfigPath) throws IOException, TikaConfigException
      Loads a PipesParser from a configuration file path.

      This method:

      1. Loads the JSON configuration
      2. Pre-extracts plugins before spawning child processes
      3. Creates the PipesParser with the loaded configuration
      Parameters:
      tikaConfigPath - path to the tika-config.json file
      Returns:
      a new PipesParser instance
      Throws:
      IOException - if reading the configuration or extracting plugins fails
      TikaConfigException - if configuration is invalid
    • load

      public static PipesParser load(TikaJsonConfig tikaJsonConfig, PipesConfig pipesConfig, Path tikaConfigPath) throws IOException
      Loads a PipesParser from pre-loaded configuration objects.

      Use this method when you need to modify the PipesConfig before creating the parser (e.g., to override the emit strategy).

      Parameters:
      tikaJsonConfig - the pre-loaded JSON configuration
      pipesConfig - the pipes configuration (may be modified by caller)
      tikaConfigPath - path to the config file (passed to child processes)
      Returns:
      a new PipesParser instance
      Throws:
      IOException - if plugin extraction fails
    • parse

      Throws:
      InterruptedException
      PipesException
      IOException
    • close

      public void close() throws IOException
      Specified by:
      close in interface AutoCloseable
      Specified by:
      close in interface Closeable
      Throws:
      IOException
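
      Because PipesParser implements Closeable, a try-with-resources block is one way to make sure close() runs even when the work inside fails. The sketch below is illustrative only: FakeParser is a hypothetical stand-in so the example runs without Tika on the classpath, and it models nothing beyond the Closeable contract. In real code the resource would come from PipesParser.load(tikaConfigPath).

```java
import java.io.Closeable;
import java.io.IOException;

public class CloseSketch {

    /** Hypothetical stand-in for PipesParser; only close() is modeled. */
    static final class FakeParser implements Closeable {
        private boolean closed = false;

        boolean isClosed() {
            return closed;
        }

        @Override
        public void close() throws IOException {
            closed = true;
        }
    }

    /**
     * Uses a parser inside try-with-resources and reports whether it was
     * closed afterwards, optionally simulating a failure during the work.
     */
    public static boolean closedAfterUse(boolean failDuringWork) {
        FakeParser parser = new FakeParser();
        // Real code (assumption: Tika on the classpath):
        // try (PipesParser p = PipesParser.load(tikaConfigPath)) { ... }
        try (FakeParser p = parser) {
            if (failDuringWork) {
                throw new IllegalStateException("simulated failure");
            }
        } catch (IllegalStateException | IOException e) {
            // close() has already run by the time this handler executes.
        }
        return parser.isClosed();
    }

    public static void main(String[] args) {
        System.out.println(closedAfterUse(false));
        System.out.println(closedAfterUse(true));
    }
}
```

      The same pattern applies to either load overload, since both return a PipesParser.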
    • isSharedMode

      public boolean isSharedMode()
      Returns whether this parser is using shared server mode.
      Returns:
      true if using shared server mode
    • getCurrentServerPort

      public int getCurrentServerPort()
      Returns the current server port. For testing purposes only. In shared mode, returns the port of the shared server. In per-client mode, returns the port of the first client's server.
      Returns:
      the current server port, or -1 if no server is running
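
      The -1 return value is a sentinel rather than a usable port, so test code may want to convert it before use. The helper below is a minimal sketch of that idea. The class and method names are hypothetical and not part of Tika, and the port 50123 is an arbitrary example value. The two arguments correspond to the results of isSharedMode() and getCurrentServerPort().

```java
import java.util.OptionalInt;

public class ServerPortSketch {

    /** Maps the documented sentinel (-1, no server running) to an empty OptionalInt. */
    public static OptionalInt toOptionalPort(int reportedPort) {
        return reportedPort == -1 ? OptionalInt.empty() : OptionalInt.of(reportedPort);
    }

    /** Builds a log-friendly description of which server the port belongs to. */
    public static String describe(boolean sharedMode, int reportedPort) {
        OptionalInt port = toOptionalPort(reportedPort);
        if (!port.isPresent()) {
            return "no server running";
        }
        String owner = sharedMode ? "shared server" : "first client's server";
        return owner + " on port " + port.getAsInt();
    }

    public static void main(String[] args) {
        // In a test (assumption: a live PipesParser named parser):
        // describe(parser.isSharedMode(), parser.getCurrentServerPort())
        System.out.println(describe(true, 50123));
        System.out.println(describe(false, -1));
    }
}
```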