genai-toolbox/docs/en/resources/embeddingModels/_index.md at f57352fcaafd8d3eb7cb2b2fda4366b453773a99

mirror of https://github.com/googleapis/genai-toolbox.git synced 2026-02-09 06:35:38 -05:00

Files

Wenxin Du 15101b1edb feat(tools): Add valueFromParam support to Tool config (#2333 )

This PR introduces a new configuration field valueFromParam to the tool
definitions. This feature allows a parameter to automatically inherit
its value from another sibling parameter, mainly to streamline the
configuration of vector insertion tools.

Parameters utilizing valueFromParam are excluded from the Tool and MCP
manifests. This means the LLM does not see these parameters and is not
required to generate them. The value is resolved internally by the
Toolbox during execution.

2026-01-21 16:35:27 -08:00

3.2 KiB

Raw Blame History

title, type, weight, description

title	type	weight	description
EmbeddingModels	docs	2	EmbeddingModels represent services that transform text into vector embeddings for semantic search.

EmbeddingModels represent services that generate vector representations of text data. In the MCP Toolbox, these models enable Semantic Queries, allowing Tools to automatically convert human-readable text into numerical vectors before using them in a query.

This is primarily used in two scenarios:

Vector Ingestion: Converting a text parameter into a vector string during an INSERT operation.
Semantic Search: Converting a natural language query into a vector to perform similarity searches.

Hidden Parameter Duplication (valueFromParam)

When building tools for vector ingestion, you often need the same input string twice:

To store the original text in a TEXT column.
To generate the vector embedding for a VECTOR column.

Requesting an Agent (LLM) to output the exact same string twice is inefficient and error-prone. The valueFromParam field solves this by allowing a parameter to inherit its value from another parameter in the same tool.

Key Behaviors

Hidden from Manifest: Parameters with valueFromParam set are excluded from the tool definition sent to the Agent. The Agent does not know this parameter exists.
Auto-Filled: When the tool is executed, the Toolbox automatically copies the value from the referenced parameter before processing embeddings.

Example

The following configuration defines an embedding model and applies it to specific tool parameters.

{{< notice tip >}} Use environment variable replacement with the format ${ENV_NAME} instead of hardcoding your API keys into the configuration file. {{< /notice >}}

Step 1 - Define an Embedding Model

Define an embedding model in the embeddingModels section:

embeddingModels:
  gemini-model: # Name of the embedding model
    kind: gemini
    model: gemini-embedding-001
    apiKey: ${GOOGLE_API_KEY}
    dimension: 768

Step 2 - Embed Tool Parameters

Use the defined embedding model, embed your query parameters using the embeddedBy field. Only string-typed parameters can be embedded:

tools:
  # Vector ingestion tool
  insert_embedding:
    kind: postgres-sql
    source: my-pg-instance
    statement: |
      INSERT INTO documents (content, embedding) 
      VALUES ($1, $2);
    parameters:
      - name: content
        type: string
        description: The raw text content to be stored in the database.
      - name: vector_string
        type: string
        # This parameter is hidden from the LLM.
        # It automatically copies the value from 'content' and embeds it.
        valueFromParam: content
        embeddedBy: gemini-model

  # Semantic search tool
  search_embedding:
    kind: postgres-sql
    source: my-pg-instance
    statement: |
      SELECT id, content, embedding <-> $1 AS distance 
      FROM documents
      ORDER BY distance LIMIT 1
    parameters:
      - name: semantic_search_string
        type: string
        description: The search query that will be converted to a vector.
        embeddedBy: gemini-model # refers to the name of a defined embedding model

3.2 KiB Raw Blame History