Describe Image Block
Leverage AI to analyze and describe images based on natural language prompts
Overview
The Describe Image Block uses Vision Language Model (VLM) capabilities to analyze and describe images. By providing an image and an optional natural language prompt, you can instruct the AI to focus on specific aspects or provide general descriptions of the image content.
Inputs
The system prompt to send to the model. Optional. Used to provide high-level guidance to the AI model.
The prompt message or messages to send to the model. Only available if “Use Prompt Input” is enabled in settings.
The input image to be analyzed. Required. The image will be converted to a data URI before being sent to the model.
Outputs
The resulting description of the image. The content and focus of this description will depend on the input image and any provided prompts.
Editor Settings
The AI vision model used to describe the image. Available models are dynamically populated based on the LLM provider configuration.
When enabled, allows the prompt to be provided via an input port instead of being set in the settings.
The prompt to use when “Use Prompt Input” is disabled. This text will be sent to the model along with the image.
The maximum number of tokens to generate in the response.
The sampling temperature to use. Lower values produce more focused and deterministic outputs, while higher values allow for more creativity in descriptions.
Available settings may vary depending on the selected LLM provider and model.
Example: Analyzing a Chart Image
- Add a Describe Image block to your flow.
- Connect your input image (e.g., a chart or graph) to the
image
input of the Describe Image block. - Add a Text block with a prompt like “Describe the main trends and key data points in this chart” and connect it to the
prompt
input if using prompt input mode. - Select your desired model in the Describe Image block settings.
- Run your flow. The block will output a detailed description of the chart, focusing on the trends and key data points.
Error Handling
- If the input image is empty, invalid, or in an unsupported format, the block will return an error.
- If the AI provider fails to analyze the image, the block will retry up to 3 times with exponential backoff (1-10 seconds between retries).
- If the image is too large or complex for the model to process, the block may return an error or a partial description.
Always validate the output of the Describe Image block, especially when using it for critical applications or decision-making processes.
FAQ
What types of images can the Describe Image block analyze?
What types of images can the Describe Image block analyze?
The block can analyze a wide variety of images, including photographs, charts, graphs, diagrams, and more. The effectiveness may vary depending on the complexity of the image and the capabilities of the chosen AI model.
How detailed are the image descriptions?
How detailed are the image descriptions?
The level of detail in the descriptions can vary based on the complexity of the image, the specificity of the prompt (if provided), and the capabilities of the chosen AI model. You can often get more detailed or focused descriptions by using specific prompts.
Can the Describe Image block identify specific objects or people in images?
Can the Describe Image block identify specific objects or people in images?
While the block can generally describe the contents of an image, including objects and people, it typically doesn’t identify specific individuals. The level of object recognition depends on the AI model’s training and capabilities.