mirror of
https://github.com/simstudioai/sim.git
synced 2026-02-10 14:45:16 -05:00
---
title: Vision
description: Analyze images with vision models
---

import { BlockInfoCard } from "@/components/ui/block-info-card"

<BlockInfoCard
  type="vision_v2"
  color="#4D5FFF"
/>

{/* MANUAL-CONTENT-START:intro */}
Vision lets you analyze images with vision-capable models.

With Vision, you can:

- **Analyze images**: Run a vision model against an image URL or an uploaded file
- **Extract text**: Pull text out of images
- **Identify objects**: Identify the objects that appear in an image
- **Describe images**: Produce detailed descriptions of image content

In Sim, the Vision integration lets your agents analyze images as a step in their workflows. An agent can pass an image URL or file to a vision model, then extract text, identify objects, or produce a detailed description, optionally guided by a custom prompt. This brings image analysis into your AI workflows without manual intervention or custom code.
{/* MANUAL-CONTENT-END */}

## Usage Instructions

Integrate Vision into your workflow to analyze images with vision models.

## Tools

### `vision_tool`

#### Input

| Parameter | Type | Required | Description |
| --------- | ---- | -------- | ----------- |
| `apiKey` | string | Yes | API key for the selected model provider |
| `imageUrl` | string | No | Publicly accessible image URL |
| `imageFile` | file | No | Image file to analyze |
| `model` | string | No | Vision model to use \(gpt-4o, claude-3-opus-20240229, etc.\) |
| `prompt` | string | No | Custom prompt for image analysis |

#### Output

This tool does not produce any outputs.
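
To make the Input parameters concrete, the following is a minimal TypeScript sketch, not Sim's actual implementation. The `VisionToolInput` interface, the `validateVisionInput` helper, and the placeholder key and URL are hypothetical names introduced for illustration. The check that at least one of `imageUrl` or `imageFile` is present is an assumption: the table marks both as optional, but an analysis presumably needs an image from one of them.

```typescript
// Hypothetical type mirroring the Input table; not Sim's own definition.
interface VisionToolInput {
  apiKey: string; // required: API key for the selected model provider
  imageUrl?: string; // optional: publicly accessible image URL
  imageFile?: { name: string; data: Uint8Array }; // optional: stand-in for an uploaded file
  model?: string; // optional: e.g. "gpt-4o"
  prompt?: string; // optional: custom prompt for image analysis
}

// Returns a list of problems; an empty list means the input looks usable.
function validateVisionInput(input: VisionToolInput): string[] {
  const problems: string[] = [];

  // The table marks apiKey as the only required parameter.
  if (input.apiKey.trim() === "") {
    problems.push("apiKey is required");
  }

  // Assumption: at least one image source must be supplied.
  if (input.imageUrl === undefined && input.imageFile === undefined) {
    problems.push("provide either imageUrl or imageFile");
  }

  // "Publicly accessible" is approximated here by requiring an http(s) URL.
  if (input.imageUrl !== undefined && !/^https?:\/\//i.test(input.imageUrl)) {
    problems.push("imageUrl must start with http:// or https://");
  }

  return problems;
}

// Example input with placeholder values.
const exampleInput: VisionToolInput = {
  apiKey: "sk-placeholder",
  imageUrl: "https://example.com/receipt.png",
  model: "gpt-4o",
  prompt: "Extract the total amount from this receipt.",
};

const problems = validateVisionInput(exampleInput); // empty for this example
```

A check along these lines, run before the block executes, could catch a missing key or image source early instead of letting it surface later as a provider error.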