feat: add the Gemini Data Analytics (GDA) integration for DB NL2SQL conversion to Toolbox (#2181)

## Description

This PR is to add the Gemini Data Analytics (GDA) integration for DB
NL2SQL conversion to Toolbox. It allows the user to convert a natural
language query to SQL statement based on their database instance. See
the doc section for details.

## PR Checklist

> Thank you for opening a Pull Request! Before submitting your PR, there
are a
> few things you can do to make sure it goes smoothly:

- [x] Make sure you reviewed

[CONTRIBUTING.md](https://github.com/googleapis/genai-toolbox/blob/main/CONTRIBUTING.md)
- [x] Make sure to open an issue as a

[bug/issue](https://github.com/googleapis/genai-toolbox/issues/new/choose)
  before writing your code! That way we can discuss the change, evaluate
  designs, and agree on the general idea
- [x] Ensure the tests and linter pass
- [x] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)
- [x] Make sure to add `!` if this involve a breaking change

🛠️ Fixes #2180

---------

Co-authored-by: Yuan Teoh <45984206+Yuan325@users.noreply.github.com>
This commit is contained in:
Juexin Wang
2025-12-18 09:58:46 -08:00
committed by GitHub
parent e1bd98ef5b
commit aa270b2630
14 changed files with 1471 additions and 6 deletions

View File

@@ -0,0 +1,40 @@
---
title: "Gemini Data Analytics"
type: docs
weight: 1
description: >
A "cloud-gemini-data-analytics" source provides a client for the Gemini Data Analytics API.
aliases:
- /resources/sources/cloud-gemini-data-analytics
---
## About
The `cloud-gemini-data-analytics` source provides a client to interact with the [Gemini Data Analytics API](https://docs.cloud.google.com/gemini/docs/conversational-analytics-api/reference/rest). This allows tools to send natural language queries to the API.
Authentication can be handled in two ways:
1. **Application Default Credentials (ADC) (Recommended):** By default, the source uses ADC to authenticate with the API. The Toolbox server will fetch the credentials from its running environment (server-side authentication). This is the recommended method.
2. **Client-side OAuth:** If `useClientOAuth` is set to `true`, the source expects the authentication token to be provided by the caller when making a request to the Toolbox server (typically via an HTTP Bearer token). The Toolbox server will then forward this token to the underlying Gemini Data Analytics API calls.
## Example
```yaml
sources:
my-gda-source:
kind: cloud-gemini-data-analytics
projectId: my-project-id
my-oauth-gda-source:
kind: cloud-gemini-data-analytics
projectId: my-project-id
useClientOAuth: true
```
## Reference
| **field** | **type** | **required** | **description** |
| -------------- | :------: | :----------: | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| kind | string | true | Must be "cloud-gemini-data-analytics". |
| projectId | string | true | The Google Cloud Project ID where the API is enabled. |
| useClientOAuth | boolean | false | If true, the source uses the token provided by the caller (forwarded to the API). Otherwise, it uses server-side Application Default Credentials (ADC). Defaults to `false`. |

View File

@@ -0,0 +1,7 @@
---
title: "Gemini Data Analytics"
type: docs
weight: 1
description: >
Tools for Gemini Data Analytics.
---

View File

@@ -0,0 +1,92 @@
---
title: "Gemini Data Analytics QueryData"
type: docs
weight: 1
description: >
A tool to convert natural language queries into SQL statements using the Gemini Data Analytics QueryData API.
aliases:
- /resources/tools/cloud-gemini-data-analytics-query
---
## About
The `cloud-gemini-data-analytics-query` tool allows you to send natural language questions to the Gemini Data Analytics API and receive structured responses containing SQL queries, natural language answers, and explanations. For details on defining data agent context for database data sources, see the official [documentation](https://docs.cloud.google.com/gemini/docs/conversational-analytics-api/data-agent-authored-context-databases).
## Example
```yaml
tools:
my-gda-query-tool:
kind: cloud-gemini-data-analytics-query
source: my-gda-source
description: "Use this tool to send natural language queries to the Gemini Data Analytics API and receive SQL, natural language answers, and explanations."
location: ${your_database_location}
context:
datasourceReferences:
cloudSqlReference:
databaseReference:
projectId: "${your_project_id}"
region: "${your_database_instance_region}"
instanceId: "${your_database_instance_id}"
databaseId: "${your_database_name}"
engine: "POSTGRESQL"
agentContextReference:
contextSetId: "${your_context_set_id}" # E.g. projects/${project_id}/locations/${context_set_location}/contextSets/${context_set_id}
generationOptions:
generateQueryResult: true
generateNaturalLanguageAnswer: true
generateExplanation: true
generateDisambiguationQuestion: true
```
### Usage Flow
When using this tool, a `prompt` parameter containing a natural language query is provided to the tool (typically by an agent). The tool then interacts with the Gemini Data Analytics API using the context defined in your configuration.
The structure of the response depends on the `generationOptions` configured in your tool definition (e.g., enabling `generateQueryResult` will include the SQL query results).
See [Data Analytics API REST documentation](https://clouddocs.devsite.corp.google.com/gemini/docs/conversational-analytics-api/reference/rest/v1alpha/projects.locations/queryData?rep_location=global) for details.
**Example Input Prompt:**
```text
How many accounts who have region in Prague are eligible for loans? A3 contains the data of region.
```
**Example API Response:**
```json
{
"generatedQuery": "SELECT COUNT(T1.account_id) FROM account AS T1 INNER JOIN loan AS T2 ON T1.account_id = T2.account_id INNER JOIN district AS T3 ON T1.district_id = T3.district_id WHERE T3.A3 = 'Prague'",
"intentExplanation": "I found a template that matches the user's question. The template asks about the number of accounts who have region in a given city and are eligible for loans. The question asks about the number of accounts who have region in Prague and are eligible for loans. The template's parameterized SQL is 'SELECT COUNT(T1.account_id) FROM account AS T1 INNER JOIN loan AS T2 ON T1.account_id = T2.account_id INNER JOIN district AS T3 ON T1.district_id = T3.district_id WHERE T3.A3 = ?'. I will replace the named parameter '?' with 'Prague'.",
"naturalLanguageAnswer": "There are 84 accounts from the Prague region that are eligible for loans.",
"queryResult": {
"columns": [
{
"type": "INT64"
}
],
"rows": [
{
"values": [
{
"value": "84"
}
]
}
],
"totalRowCount": "1"
}
}
```
## Reference
| **field** | **type** | **required** | **description** |
| ----------------- | :------: | :----------: | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| kind | string | true | Must be "cloud-gemini-data-analytics-query". |
| source | string | true | The name of the `cloud-gemini-data-analytics` source to use. |
| description | string | true | A description of the tool's purpose. |
| location | string | true | The Google Cloud location of the target database resource (e.g., "us-central1"). This is used to construct the parent resource name in the API call. |
| context | object | true | The context for the query, including datasource references. See [QueryDataContext](https://github.com/googleapis/googleapis/blob/b32495a713a68dd0dff90cf0b24021debfca048a/google/cloud/geminidataanalytics/v1beta/data_chat_service.proto#L156) for details. |
| generationOptions | object | false | Options for generating the response. See [GenerationOptions](https://github.com/googleapis/googleapis/blob/b32495a713a68dd0dff90cf0b24021debfca048a/google/cloud/geminidataanalytics/v1beta/data_chat_service.proto#L135) for details. |