java.lang.Object

org.apache.tika.parser.vlm.AbstractVLMParser

org.apache.tika.parser.vlm.GeminiVLMParser

All Implemented Interfaces: Serializable, Initializable, SelfConfiguring, Parser

public class GeminiVLMParser extends AbstractVLMParser

VLM parser for the Google Gemini generateContent API.

Supports both images and PDFs natively. Gemini processes PDFs with native vision, so it interprets layout, charts, tables, and diagrams rather than only extracting text.

The API key is sent as the "key" query parameter, not as a Bearer header.

The default base URL points to the public Gemini API; change it to target Vertex AI or a proxy.

Configuration key: "gemini-vlm-parser"
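
The URL handling described above can be sketched as follows. This is a minimal illustration, not the parser's actual code: the class GeminiUrlSketch, the ASSUMED_BASE_URL constant, the local stripTrailingSlash helper, and the model name are all hypothetical, and the real default base URL of this parser is not stated on this page. The sketch only shows the general shape of a generateContent URL that carries the API key as a query parameter.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class GeminiUrlSketch {

    // Assumption: the public Gemini REST base URL. The parser's real default may differ.
    public static final String ASSUMED_BASE_URL =
            "https://generativelanguage.googleapis.com/v1beta";

    // Local stand-in for trailing-slash normalization (not the inherited method itself).
    public static String stripTrailingSlash(String url) {
        return url.endsWith("/") ? url.substring(0, url.length() - 1) : url;
    }

    // Builds <base>/models/<model>:generateContent?key=<apiKey>.
    // The key travels in the query string, so no Authorization header is involved.
    public static String buildGenerateContentUrl(String baseUrl, String model, String apiKey) {
        String encodedKey = URLEncoder.encode(apiKey, StandardCharsets.UTF_8);
        return stripTrailingSlash(baseUrl)
                + "/models/" + model
                + ":generateContent?key=" + encodedKey;
    }

    public static void main(String[] args) {
        // "gemini-2.0-flash" and "MY_KEY" are placeholders for illustration only.
        System.out.println(
                buildGenerateContentUrl(ASSUMED_BASE_URL + "/", "gemini-2.0-flash", "MY_KEY"));
    }
}
```

Because the key is part of the URL, it can end up in access logs and proxy traces, which is worth keeping in mind when pointing the base URL at an intermediary.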

Since:

Apache Tika 4.0

See Also:

Serialized Form

Nested Class Summary

Nested classes/interfaces inherited from class org.apache.tika.parser.vlm.AbstractVLMParser
AbstractVLMParser.HttpCall
Field Summary

Fields inherited from class org.apache.tika.parser.vlm.AbstractVLMParser
VLM_COMPLETION_TOKENS, VLM_META, VLM_MODEL, VLM_PROMPT_TOKENS
Constructor Summary

Constructors

- GeminiVLMParser()
- GeminiVLMParser(JsonConfig jsonConfig)
- GeminiVLMParser(VLMOCRConfig config)
Method Summary

- protected AbstractVLMParser.HttpCall buildHttpCall(VLMOCRConfig config, String base64Data, String mimeType)
  
  Build a fully formed AbstractVLMParser.HttpCall for the target API.
- protected String configKey()
- protected String extractResponseText(String responseBody, Metadata metadata)
  
  Parse the API response body and extract the model's text output.
- protected String getHealthCheckUrl(VLMOCRConfig config)
- protected Set<MediaType> getSupportedMediaTypes()

Methods inherited from class org.apache.tika.parser.vlm.AbstractVLMParser
getApiKey, getBaseUrl, getConfig, getDefaultConfig, getMaxFileSizeToOcr, getMaxTokens, getMinFileSizeToOcr, getModel, getPrompt, getSupportedTypes, getTimeoutSeconds, initialize, isInlineContent, isServerAvailable, isSkipOcr, parse, setApiKey, setBaseUrl, setInlineContent, setMaxFileSizeToOcr, setMaxTokens, setMinFileSizeToOcr, setModel, setPrompt, setSkipOcr, setTimeoutSeconds, stripTrailingSlash

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- GeminiVLMParser
  
  public GeminiVLMParser()
- GeminiVLMParser
  
  public GeminiVLMParser(VLMOCRConfig config)
- GeminiVLMParser
  
  public GeminiVLMParser(JsonConfig jsonConfig)
Method Details
- buildHttpCall
  
  protected AbstractVLMParser.HttpCall buildHttpCall(VLMOCRConfig config, String base64Data, String mimeType)
  
  Description copied from class: AbstractVLMParser
  
  Build a fully formed AbstractVLMParser.HttpCall for the target API.
  
  Specified by:
  
  buildHttpCall in class AbstractVLMParser
  
  Parameters:
  
  config - resolved config for this parse
  
  base64Data - base64-encoded version of the file bytes
  
  mimeType - the MIME type of the input (e.g. image/png)
  
  Returns:
  
  a ready-to-execute AbstractVLMParser.HttpCall
- extractResponseText
  
  protected String extractResponseText(String responseBody, Metadata metadata) throws TikaException
  
  Description copied from class: AbstractVLMParser
  
  Parse the API response body and extract the model's text output. Implementations should also populate AbstractVLMParser.VLM_PROMPT_TOKENS and AbstractVLMParser.VLM_COMPLETION_TOKENS in metadata when the information is available.
  
  Specified by:
  
  extractResponseText in class AbstractVLMParser
  
  Parameters:
  
  responseBody - raw JSON response body
  
  metadata - metadata to enrich with token counts
  
  Returns:
  
  the extracted text content
  
  Throws:
  
  TikaException
- getSupportedMediaTypes
  
  protected Set<MediaType> getSupportedMediaTypes()
  
  Specified by:
  
  getSupportedMediaTypes in class AbstractVLMParser
  
  Returns:
  
  the set of media types this parser handles (images, PDFs, etc.)
- configKey
  
  protected String configKey()
  
  Specified by:
  
  configKey in class AbstractVLMParser
  
  Returns:
  
  the JSON config key for ParseContextConfig lookup (e.g. "openai-vlm-parser", "gemini-vlm-parser")
- getHealthCheckUrl
  
  protected String getHealthCheckUrl(VLMOCRConfig config)
  
  Specified by:
  
  getHealthCheckUrl in class AbstractVLMParser
  
  Returns:
  
  an optional health-check URL to probe at init time, or null to skip the probe
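
The buildHttpCall and extractResponseText contracts above can be illustrated with a dependency-free sketch. It assumes the publicly documented Gemini REST shapes: a request with contents, parts, and inline_data, and a response with a usageMetadata object. The class GeminiExchangeSketch and its helpers are hypothetical, and the regex lookup is only a stand-in for a real JSON parser; none of this is taken from the parser's source.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GeminiExchangeSketch {

    // Escapes the characters that must not appear raw inside a JSON string.
    public static String jsonEscape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // Assumed request shape: one text part (the prompt) plus one inline_data part
    // carrying the base64-encoded file and its MIME type.
    public static String buildBody(String prompt, String base64Data, String mimeType,
                                   int maxTokens) {
        return "{\"contents\":[{\"parts\":["
                + "{\"text\":\"" + jsonEscape(prompt) + "\"},"
                + "{\"inline_data\":{\"mime_type\":\"" + jsonEscape(mimeType) + "\","
                + "\"data\":\"" + base64Data + "\"}}"
                + "]}],"
                + "\"generationConfig\":{\"maxOutputTokens\":" + maxTokens + "}}";
    }

    // Finds the first integer-valued field with the given name.
    // A simplification: production code would walk the parsed JSON tree instead.
    public static OptionalInt intField(String json, String fieldName) {
        Pattern p = Pattern.compile("\"" + Pattern.quote(fieldName) + "\"\\s*:\\s*(\\d+)");
        Matcher m = p.matcher(json);
        return m.find() ? OptionalInt.of(Integer.parseInt(m.group(1))) : OptionalInt.empty();
    }

    public static void main(String[] args) {
        System.out.println(buildBody("Extract all text.", "QUJD", "image/png", 1024));

        // Hand-written sample response, not captured from a real API call.
        String response = "{\"candidates\":[{\"content\":{\"parts\":[{\"text\":\"ABC\"}]}}],"
                + "\"usageMetadata\":{\"promptTokenCount\":264,\"candidatesTokenCount\":12}}";
        System.out.println("prompt tokens: " + intField(response, "promptTokenCount"));
        System.out.println("completion tokens: " + intField(response, "candidatesTokenCount"));
    }
}
```

In this sketch the two counts would correspond to the values an implementation stores under VLM_PROMPT_TOKENS and VLM_COMPLETION_TOKENS; returning an empty OptionalInt mirrors the "when the information is available" wording of the contract.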
