Class OpenAIVLMParser

java.lang.Object
org.apache.tika.parser.vlm.AbstractVLMParser
org.apache.tika.parser.vlm.OpenAIVLMParser
All Implemented Interfaces:
Serializable, Initializable, SelfConfiguring, Parser

public class OpenAIVLMParser extends AbstractVLMParser
VLM parser for OpenAI-compatible chat completions endpoints (OpenAI, Azure OpenAI, OpenRouter, vLLM, Ollama, LiteLLM, Together AI, Groq, Fireworks, Mistral, NVIDIA NIM, Jina, local FastAPI wrappers, etc.).

Images are base64-encoded and sent as image_url content parts.
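
As an illustration of the sentence above, here is a minimal sketch of what one image_url content part could look like on the wire. It assumes the usual OpenAI convention of embedding the image as a data: URL; the class and method names (ImageUrlPartSketch, toDataUrl, imageUrlPart) are hypothetical and are not part of Tika.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class ImageUrlPartSketch {

    // Build a data URL of the form data:<mimeType>;base64,<payload>.
    // Assumption: the parser embeds images this way, as OpenAI-style APIs expect.
    static String toDataUrl(byte[] bytes, String mimeType) {
        String base64Data = Base64.getEncoder().encodeToString(bytes);
        return "data:" + mimeType + ";base64," + base64Data;
    }

    // Build a single image_url content part as a JSON string.
    // Base64 text and ordinary MIME types contain no characters that need JSON escaping.
    static String imageUrlPart(String dataUrl) {
        return "{\"type\":\"image_url\",\"image_url\":{\"url\":\"" + dataUrl + "\"}}";
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real call would pass the image file's contents.
        byte[] fakeImage = "hello".getBytes(StandardCharsets.US_ASCII);
        System.out.println(imageUrlPart(toDataUrl(fakeImage, "image/png")));
    }
}
```

In a full request, a part like this would sit in the content array of a user message, typically next to a text part carrying the prompt.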

Azure OpenAI is supported by configuring:

  • baseUrl: https://{resource}.openai.azure.com
  • completionsPath: /openai/deployments/{deployment}/chat/completions?api-version=2024-02-01
  • apiKeyHeaderName: api-key
  • apiKeyPrefix: "" (empty string)
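
To make the Azure settings above concrete, the following sketch shows how a base URL and a completions path might combine into the final request URL. The joining logic is an assumption about how such a parser could behave, not a description of Tika's implementation. AzureUrlSketch and joinUrl are hypothetical names, and my-resource and my-gpt4o are placeholder values.

```java
public class AzureUrlSketch {

    // Join a base URL and a path so that exactly one slash separates them.
    static String joinUrl(String baseUrl, String completionsPath) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        String path = completionsPath.startsWith("/")
                ? completionsPath
                : "/" + completionsPath;
        return base + path;
    }

    public static void main(String[] args) {
        // Placeholder resource and deployment names, for illustration only.
        String baseUrl = "https://my-resource.openai.azure.com";
        String completionsPath =
                "/openai/deployments/my-gpt4o/chat/completions?api-version=2024-02-01";
        System.out.println(joinUrl(baseUrl, completionsPath));
    }
}
```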

Configuration key: "openai-vlm-parser"

Since:
Apache Tika 4.0
  • Constructor Details

    • OpenAIVLMParser

      public OpenAIVLMParser()
    • OpenAIVLMParser

      public OpenAIVLMParser(VLMOCRConfig config)
    • OpenAIVLMParser

      public OpenAIVLMParser(JsonConfig jsonConfig)
  • Method Details

    • buildHttpCall

      protected AbstractVLMParser.HttpCall buildHttpCall(VLMOCRConfig config, String base64Data, String mimeType)
      Description copied from class: AbstractVLMParser
      Build a fully formed AbstractVLMParser.HttpCall for the target API.
      Specified by:
      buildHttpCall in class AbstractVLMParser
      Parameters:
      config - resolved config for this parse
      base64Data - base64-encoded version of the file bytes
      mimeType - the MIME type of the input (e.g. image/png)
      Returns:
      a ready-to-execute AbstractVLMParser.HttpCall
    • extractResponseText

      protected String extractResponseText(String responseBody, Metadata metadata) throws TikaException
      Description copied from class: AbstractVLMParser
      Parse the API response body and extract the model's text output. Implementations should also populate AbstractVLMParser.VLM_PROMPT_TOKENS and AbstractVLMParser.VLM_COMPLETION_TOKENS in metadata when the information is available.
      Specified by:
      extractResponseText in class AbstractVLMParser
      Parameters:
      responseBody - raw JSON response body
      metadata - metadata to enrich with token counts
      Returns:
      the extracted text content
      Throws:
      TikaException
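
      For orientation, OpenAI-style responses usually report token counts in a usage object with prompt_tokens and completion_tokens fields. The sketch below pulls those two numbers out of a sample body. It uses a regular expression only to stay dependency-free; a real implementation would more likely use a JSON library, and a regex can be fooled if the same field name appears inside the model's text. UsageSketch, intField, and SAMPLE are hypothetical names.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UsageSketch {

    // A trimmed-down response body in the OpenAI chat completions shape.
    static final String SAMPLE =
            "{\"choices\":[{\"message\":{\"role\":\"assistant\",\"content\":\"Hello\"}}],"
            + "\"usage\":{\"prompt_tokens\":12,\"completion_tokens\":3}}";

    // Find the first occurrence of "field": <digits> and return the number, if any.
    static OptionalInt intField(String json, String field) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(field) + "\"\\s*:\\s*(\\d+)");
        Matcher m = p.matcher(json);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println("prompt tokens: " + intField(SAMPLE, "prompt_tokens"));
        System.out.println("completion tokens: " + intField(SAMPLE, "completion_tokens"));
    }
}
```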
    • getSupportedMediaTypes

      protected Set<MediaType> getSupportedMediaTypes()
      Specified by:
      getSupportedMediaTypes in class AbstractVLMParser
      Returns:
      the set of media types this parser handles (images, PDFs, etc.)
    • configKey

      protected String configKey()
      Specified by:
      configKey in class AbstractVLMParser
      Returns:
      the JSON config key for ParseContextConfig lookup (e.g. "openai-vlm-parser", "gemini-vlm-parser")
    • getHealthCheckUrl

      protected String getHealthCheckUrl(VLMOCRConfig config)
      Specified by:
      getHealthCheckUrl in class AbstractVLMParser
      Returns:
      an optional health-check URL to probe at init time, or null to skip the probe
    • getCompletionsPath

      public String getCompletionsPath()
    • setCompletionsPath

      public void setCompletionsPath(String completionsPath)
      Set the URL path for chat completions requests. Default is /v1/chat/completions.

      For Azure OpenAI, use something like /openai/deployments/my-gpt4o/chat/completions?api-version=2024-02-01.

    • getApiKeyHeaderName

      public String getApiKeyHeaderName()
    • setApiKeyHeaderName

      public void setApiKeyHeaderName(String apiKeyHeaderName)
      Set the HTTP header name for API key authentication. Default is Authorization. For Azure OpenAI, set to api-key.
    • getApiKeyPrefix

      public String getApiKeyPrefix()
    • setApiKeyPrefix

      public void setApiKeyPrefix(String apiKeyPrefix)
      Set the prefix prepended to the API key in the auth header. Default is "Bearer " (with trailing space). For Azure OpenAI, set to "" (empty string).
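
      As a rough illustration of how the header name and prefix interact, the sketch below composes the auth header for the default settings and for the Azure settings. It assumes the header value is simply the prefix followed by the key. AuthHeaderSketch and authHeader are hypothetical names, and the keys shown are dummies.

```java
public class AuthHeaderSketch {

    // Return {headerName, headerValue}, where the value is prefix + apiKey.
    static String[] authHeader(String headerName, String prefix, String apiKey) {
        return new String[] {headerName, prefix + apiKey};
    }

    public static void main(String[] args) {
        // Defaults documented above: header "Authorization", prefix "Bearer ".
        String[] standard = authHeader("Authorization", "Bearer ", "sk-dummy");
        // Azure OpenAI: header "api-key", empty prefix.
        String[] azure = authHeader("api-key", "", "azure-dummy");

        System.out.println(standard[0] + ": " + standard[1]);
        System.out.println(azure[0] + ": " + azure[1]);
    }
}
```

      The trailing space in the default prefix matters: without it the scheme and the key would run together in the header value.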