Class PopplerRenderer

java.lang.Object
org.apache.tika.renderer.pdf.poppler.PopplerRenderer
All Implemented Interfaces:
Serializable, SelfConfiguring, Renderer

public class PopplerRenderer extends Object implements Renderer
Renderer that uses Poppler's pdftoppm command to convert PDF pages to PNG images.

Poppler is pre-installed on most Linux distributions and is the fastest widely available PDF renderer. On macOS it can be installed with brew install poppler; on Windows, via MSYS2 or Chocolatey.

Configuration key: "poppler-renderer"

Since:
Apache Tika 4.0
  • Constructor Details

    • PopplerRenderer

      public PopplerRenderer()
  • Method Details

    • getSupportedTypes

      public Set<MediaType> getSupportedTypes(ParseContext context)
      Description copied from interface: Renderer
      Returns the set of media types supported by this renderer when used with the given parse context.
      Specified by:
      getSupportedTypes in interface Renderer
      Parameters:
      context - parse context
      Returns:
      immutable set of media types
    • render

      public RenderResults render(TikaInputStream tis, Metadata metadata, ParseContext parseContext, RenderRequest... requests) throws IOException, TikaException
      Specified by:
      render in interface Renderer
      Throws:
      IOException
      TikaException
    • getPdftoppmPath

      public String getPdftoppmPath()
    • setPdftoppmPath

      public void setPdftoppmPath(String pdftoppmPath)
      Set the path to the pdftoppm executable. Defaults to "pdftoppm", which assumes the executable is on the system PATH.
    • getDpi

      public int getDpi()
    • setDpi

      public void setDpi(int dpi)
      Set the rendering resolution in DPI. Defaults to 300.
    • isGray

      public boolean isGray()
    • setGray

      public void setGray(boolean gray)
      If true (the default), render in grayscale. Set to false for full-color rendering.
    • getTimeoutMs

      public int getTimeoutMs()
    • setTimeoutMs

      public void setTimeoutMs(int timeoutMs)
      Set the timeout in milliseconds for the pdftoppm process. Defaults to 120000 (2 minutes).
    • getMaxScaleTo

      public int getMaxScaleTo()
    • setMaxScaleTo

      public void setMaxScaleTo(int maxScaleTo)
      Set the maximum length, in pixels, of the longest edge of rendered page images. Maps to pdftoppm's -scale-to flag. Pages that would render smaller than this are not enlarged.

      Default is 4096 pixels. Set to -1 to disable (not recommended).
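
The settings documented above correspond to pdftoppm command-line options. The following is a minimal, self-contained sketch of how such a mapping might look; it is not PopplerRenderer's actual implementation. The class name PdftoppmCommandSketch and the methods buildCommand and run are hypothetical, and only the pdftoppm flags -png, -r, -gray and -scale-to are taken from the tool itself. The sketch passes -scale-to whenever the value is positive, so it does not reproduce the "not enlarged" behavior described for setMaxScaleTo.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of Apache Tika.
public class PdftoppmCommandSketch {

    // Builds a pdftoppm argument list from settings that mirror the
    // documented properties (pdftoppmPath, dpi, gray, maxScaleTo).
    public static List<String> buildCommand(String pdftoppmPath, int dpi, boolean gray,
                                            int maxScaleTo, String inputPdf,
                                            String outputPrefix) {
        List<String> cmd = new ArrayList<>();
        cmd.add(pdftoppmPath);
        cmd.add("-png");                      // write PNG files
        cmd.add("-r");                        // resolution in DPI
        cmd.add(Integer.toString(dpi));
        if (gray) {
            cmd.add("-gray");                 // grayscale output
        }
        if (maxScaleTo > 0) {                 // -1 disables the limit
            cmd.add("-scale-to");
            cmd.add(Integer.toString(maxScaleTo));
        }
        cmd.add(inputPdf);
        cmd.add(outputPrefix);                // pdftoppm appends page numbers
        return cmd;
    }

    // Runs the command and kills the process if it exceeds timeoutMs.
    public static int run(List<String> cmd, long timeoutMs)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        if (!process.waitFor(timeoutMs, TimeUnit.MILLISECONDS)) {
            process.destroyForcibly();
            throw new IOException("pdftoppm timed out after " + timeoutMs + " ms");
        }
        return process.exitValue();
    }

    public static void main(String[] args) {
        // Uses the documented defaults: 300 DPI, grayscale, 4096-pixel limit.
        List<String> cmd = buildCommand("pdftoppm", 300, true, 4096, "input.pdf", "page");
        System.out.println(String.join(" ", cmd));
        // prints: pdftoppm -png -r 300 -gray -scale-to 4096 input.pdf page
    }
}
```

Keeping command construction separate from process execution makes the flag mapping easy to check without Poppler installed; run would then be called with the documented default timeout of 120000 ms.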