Describe Image Block
Leverage AI to analyze and describe images based on natural language prompts
Overview
The Describe Image Block uses Vision Language Model (VLM) capabilities to analyze and describe images. By providing an image and an optional natural language prompt, you can instruct the AI to focus on specific aspects or provide general descriptions of the image content.
Inputs
The system prompt to send to the model. Optional. Used to provide high-level guidance to the AI model.
The prompt message or messages to send to the model. Only available if “Use Prompt Input” is enabled in settings.
The input image to be analyzed. Required. The image will be converted to a data URI before being sent to the model.
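The data URI conversion mentioned above can be sketched as follows. This is a minimal illustration of the standard `data:<media type>;base64,<payload>` form, not the block's actual implementation; the function name `to_data_uri` and the default media type are assumptions made for the example.

```python
import base64


def to_data_uri(image_bytes: bytes, media_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URI (hypothetical helper)."""
    # Base64 output is pure ASCII, so decoding it to str is always safe.
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


# Stand-in data: the 8-byte signature that begins every PNG file.
png_signature = b"\x89PNG\r\n\x1a\n"
uri = to_data_uri(png_signature)
print(uri)
```

Because base64 encodes every 3 bytes as 4 characters, the data URI is roughly a third larger than the original file, which is worth keeping in mind for large images.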
Outputs
The resulting description of the image. The content and focus of this description will depend on the input image and any provided prompts.
Editor Settings
The AI vision model used to describe the image. Available models are dynamically populated based on the LLM provider configuration.
When enabled, allows the prompt to be provided via an input port instead of being set in the settings.
The prompt to use when “Use Prompt Input” is disabled. This text will be sent to the model along with the image.
The maximum number of tokens to generate in the response.
The sampling temperature to use. Lower values produce more focused and deterministic outputs, while higher values allow for more creativity in descriptions.
Available settings may vary depending on the selected LLM provider and model.
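To illustrate why lower temperature values give more focused output, the sketch below applies temperature scaling to a softmax over made-up token scores. It shows the general technique only; how a given provider applies temperature internally is not specified here, and the function name and example scores are assumptions.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate next tokens.
scores = [2.0, 1.0, 0.1]
focused = softmax_with_temperature(scores, 0.2)  # low temperature
varied = softmax_with_temperature(scores, 2.0)   # high temperature
print(focused)
print(varied)
```

At the low temperature nearly all probability mass lands on the top-scoring token, so repeated runs tend to produce the same description; at the high temperature the mass spreads out and the wording varies more between runs.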
Example: Analyzing a Chart Image
- Add a Describe Image block to your flow.
- Connect your input image (e.g., a chart or graph) to the `image` input of the Describe Image block.
- Add a Text block with a prompt like “Describe the main trends and key data points in this chart” and connect it to the `prompt` input if using prompt input mode.
- Select your desired model in the Describe Image block settings.
- Run your flow. The block will output a detailed description of the chart, focusing on the trends and key data points.
Error Handling
- If the input image is empty, invalid, or in an unsupported format, the block will return an error.
- If the AI provider fails to analyze the image, the block will retry up to 3 times with exponential backoff (1-10 seconds between retries).
- If the image is too large or complex for the model to process, the block may return an error or a partial description.
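The retry behavior described above can be sketched as follows. This is an illustration under stated assumptions, not the block's actual code: it assumes one initial attempt followed by up to 3 retries, with a delay that starts at 1 second, doubles each time, and is capped at 10 seconds. The names `backoff_delays` and `call_with_retries` are hypothetical.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


def backoff_delays(retries: int = 3, base: float = 1.0, cap: float = 10.0) -> List[float]:
    """Delays before each retry: base * 2**i seconds, never more than cap."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def call_with_retries(
    call: Callable[[], T],
    retries: int = 3,
    base: float = 1.0,
    cap: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(); on failure, wait and retry up to `retries` more times."""
    delays = backoff_delays(retries, base, cap)
    for attempt in range(retries + 1):
        try:
            return call()
        except Exception:
            # Assumption: every exception is retryable. Re-raise once retries run out.
            if attempt == retries:
                raise
            sleep(delays[attempt])
    raise RuntimeError("unreachable")  # the loop always returns or raises


print(backoff_delays())
```

The `sleep` parameter is injectable so the logic can be exercised without actually waiting. A real implementation would likely retry only on transient provider errors rather than on every exception, and might add random jitter to the delays.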
Always validate the output of the Describe Image block, especially when using it for critical applications or decision-making processes.