Package org.apache.tika.parser.vlm
Class OpenAIVLMParser
java.lang.Object
org.apache.tika.parser.vlm.AbstractVLMParser
org.apache.tika.parser.vlm.OpenAIVLMParser
- All Implemented Interfaces:
Serializable, Initializable, SelfConfiguring, Parser
VLM parser for OpenAI-compatible chat completions endpoints
(OpenAI, Azure OpenAI, OpenRouter, vLLM, Ollama, LiteLLM, Together AI,
Groq, Fireworks, Mistral, NVIDIA NIM, Jina, local FastAPI wrappers, etc.).
Images are base64-encoded and sent as image_url content parts.
Azure OpenAI is supported by configuring:
- baseUrl: https://{resource}.openai.azure.com
- completionsPath: /openai/deployments/{deployment}/chat/completions?api-version=2024-02-01
- apiKeyHeaderName: api-key
- apiKeyPrefix: "" (empty string)
Configuration key: "openai-vlm-parser"
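The Azure OpenAI settings described above determine the request URL and the authentication header. The following is a minimal sketch of how they might combine, assuming the URL is baseUrl (with a trailing slash stripped) followed by completionsPath, and the header value is apiKeyPrefix followed directly by the key. The class EndpointSketch and its methods are hypothetical and not part of Tika; the host names, deployment name, and keys are placeholders.

```java
import java.util.Map;

/** Hypothetical helper, not part of Tika: shows how the four settings could combine. */
final class EndpointSketch {

    /** Assumed rule: strip one trailing slash from baseUrl, then append the path. */
    static String requestUrl(String baseUrl, String completionsPath) {
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + completionsPath;
    }

    /** Assumed rule: the header value is the prefix followed directly by the key. */
    static Map.Entry<String, String> authHeader(String headerName, String prefix, String apiKey) {
        return Map.entry(headerName, prefix + apiKey);
    }

    public static void main(String[] args) {
        // Documented defaults: /v1/chat/completions, Authorization, "Bearer ".
        // The base URL here is only an example value.
        System.out.println(requestUrl("https://api.openai.com/", "/v1/chat/completions"));
        System.out.println(authHeader("Authorization", "Bearer ", "sk-placeholder"));

        // Azure OpenAI settings (placeholder resource and deployment names).
        System.out.println(requestUrl("https://my-resource.openai.azure.com",
                "/openai/deployments/my-gpt4o/chat/completions?api-version=2024-02-01"));
        System.out.println(authHeader("api-key", "", "azure-placeholder"));
    }
}
```

With an empty prefix, the Azure header carries the bare key, which is why apiKeyPrefix must be set to "" rather than left at its default.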
- Since:
- Apache Tika 4.0
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.tika.parser.vlm.AbstractVLMParser
AbstractVLMParser.HttpCall
-
Field Summary
Fields inherited from class org.apache.tika.parser.vlm.AbstractVLMParser
VLM_COMPLETION_TOKENS, VLM_META, VLM_MODEL, VLM_PROMPT_TOKENS
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
protected AbstractVLMParser.HttpCall | buildHttpCall(VLMOCRConfig config, String base64Data, String mimeType) | Build a fully formed AbstractVLMParser.HttpCall for the target API.
protected String | |
protected String | extractResponseText(String responseBody, Metadata metadata) | Parse the API response body and extract the model's text output.
protected String | getHealthCheckUrl(VLMOCRConfig config) |
void | setApiKeyHeaderName(String apiKeyHeaderName) | Set the HTTP header name for API key authentication.
void | setApiKeyPrefix(String apiKeyPrefix) | Set the prefix prepended to the API key in the auth header.
void | setCompletionsPath(String completionsPath) | Set the URL path for chat completions requests.
Methods inherited from class org.apache.tika.parser.vlm.AbstractVLMParser
getApiKey, getBaseUrl, getConfig, getDefaultConfig, getMaxFileSizeToOcr, getMaxTokens, getMinFileSizeToOcr, getModel, getPrompt, getSupportedTypes, getTimeoutSeconds, initialize, isInlineContent, isServerAvailable, isSkipOcr, parse, setApiKey, setBaseUrl, setInlineContent, setMaxFileSizeToOcr, setMaxTokens, setMinFileSizeToOcr, setModel, setPrompt, setSkipOcr, setTimeoutSeconds, stripTrailingSlash
-
Constructor Details
-
OpenAIVLMParser
public OpenAIVLMParser()
-
OpenAIVLMParser
-
OpenAIVLMParser
-
-
Method Details
-
buildHttpCall
protected AbstractVLMParser.HttpCall buildHttpCall(VLMOCRConfig config, String base64Data, String mimeType)
Description copied from class: AbstractVLMParser
Build a fully formed AbstractVLMParser.HttpCall for the target API.
- Specified by:
buildHttpCall in class AbstractVLMParser
- Parameters:
config - resolved config for this parse
base64Data - base64-encoded version of the file bytes
mimeType - the MIME type of the input (e.g. image/png)
- Returns:
- a ready-to-execute AbstractVLMParser.HttpCall
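As an illustration of the request this method targets, the sketch below builds a chat completions body in which the image travels as an image_url content part carrying a base64 data URL. It is a hand-rolled approximation, not Tika's implementation: the class ImageUrlPayloadSketch, its method names, the model name, the prompt, and the use of the max_tokens field are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

/** Hypothetical sketch, not Tika's implementation. */
final class ImageUrlPayloadSketch {

    /** Minimal JSON string escaping; covers only backslash, quote, and common whitespace. */
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
    }

    /** Wrap already-encoded bytes in a data URL, e.g. data:image/png;base64,.... */
    static String dataUrl(String mimeType, String base64Data) {
        return "data:" + mimeType + ";base64," + base64Data;
    }

    /** One user message with a text part (the prompt) and an image_url part (the file). */
    static String requestBody(String model, String prompt, int maxTokens,
                              String mimeType, String base64Data) {
        return "{"
                + "\"model\":\"" + jsonEscape(model) + "\","
                + "\"max_tokens\":" + maxTokens + ","
                + "\"messages\":[{\"role\":\"user\",\"content\":["
                + "{\"type\":\"text\",\"text\":\"" + jsonEscape(prompt) + "\"},"
                + "{\"type\":\"image_url\",\"image_url\":{\"url\":\""
                + dataUrl(mimeType, base64Data) + "\"}}"
                + "]}]}";
    }

    public static void main(String[] args) {
        // Stand-in bytes; a real caller would pass the bytes of the image file.
        byte[] fakeImage = "not a real PNG".getBytes(StandardCharsets.UTF_8);
        String base64 = Base64.getEncoder().encodeToString(fakeImage);
        System.out.println(requestBody("gpt-4o", "Extract all text.", 1024, "image/png", base64));
    }
}
```

A production implementation would normally build the body with a JSON library rather than string concatenation; the concatenation here only keeps the sketch free of dependencies.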
-
extractResponseText
Description copied from class: AbstractVLMParser
Parse the API response body and extract the model's text output. Implementations should also populate AbstractVLMParser.VLM_PROMPT_TOKENS and AbstractVLMParser.VLM_COMPLETION_TOKENS in metadata when the information is available.
- Specified by:
extractResponseText in class AbstractVLMParser
- Parameters:
responseBody - raw JSON response body
metadata - metadata to enrich with token counts
- Returns:
- the extracted text content
- Throws:
TikaException
-
getSupportedMediaTypes
- Specified by:
getSupportedMediaTypes in class AbstractVLMParser
- Returns:
- the set of media types this parser handles (images, PDFs, etc.)
-
configKey
- Specified by:
configKey in class AbstractVLMParser
- Returns:
- the JSON config key for ParseContextConfig lookup (e.g. "openai-vlm-parser", "gemini-vlm-parser")
-
getHealthCheckUrl
- Specified by:
getHealthCheckUrl in class AbstractVLMParser
- Returns:
- an optional health-check URL to probe at init time, or null to skip the probe
-
getCompletionsPath
-
setCompletionsPath
Set the URL path for chat completions requests. Default is /v1/chat/completions. For Azure OpenAI, use something like /openai/deployments/my-gpt4o/chat/completions?api-version=2024-02-01.
-
getApiKeyHeaderName
-
setApiKeyHeaderName
Set the HTTP header name for API key authentication. Default is Authorization. For Azure OpenAI, set to api-key.
-
getApiKeyPrefix
-
setApiKeyPrefix
Set the prefix prepended to the API key in the auth header. Default is "Bearer " (with trailing space). For Azure OpenAI, set to "" (empty string).
-