Files
genai-toolbox/docs/en/resources/sources/serverless-spark.md
Yuan Teoh 293c1d6889 feat!: update configuration file v2 (#2369)
This PR introduces a significant update to the Toolbox configuration
file format, which is one of the primary **breaking changes** required
for the implementation of the Advanced Control Plane.

# Summary of Changes
The configuration schema has been updated to enforce resource isolation
and facilitate atomic, incremental updates.
* Resource Isolation: Resource definitions are now separated into
individual blocks, using a distinct structure for each resource type
(Source, Tool, Toolset, etc.). This improves readability, management,
and auditing of configuration files.
* Field Name Modification: Internal field names have been modified to
align with declarative methodologies. Specifically, the configuration
now separates kind (general resource type, e.g., Source) from type
(specific implementation, e.g., Postgres).

# User Impact
Existing tools.yaml configuration files are now in an outdated format.
Users must eventually update their files to the new YAML format.

# Mitigation & Compatibility
Backward compatibility is maintained during this transition to ensure no
immediate user action is required for existing files.
* Immediate Backward Compatibility: The source code includes a
pre-processing layer that automatically detects outdated configuration
files (v1 format) and converts them to the new v2 format under the hood.
* [COMING SOON] Migration Support: The new toolbox migrate subcommand
will be introduced to allow users to automatically convert their old
configuration files to the latest format.

# Example
Example for config file v2:
```
kind: sources
name: my-pg-instance
type: cloud-sql-postgres
project: my-project
region: my-region
instance: my-instance
database: my_db
user: my_user
password: my_pass
---
kind: authServices
name: my-google-auth
type: google
clientId: testing-id
---
kind: tools
name: example_tool
type: postgres-sql
source: my-pg-instance
description: some description
statement: SELECT * FROM SQL_STATEMENT;
parameters:
- name: country
  type: string
  description: some description
---
kind: tools
name: example_tool_2
type: postgres-sql
source: my-pg-instance
description: returning the number one
statement: SELECT 1;
---
kind: toolsets
name: example_toolset
tools:
- example_tool
```

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Averi Kitsch <akitsch@google.com>
2026-01-27 16:58:43 -08:00

2.7 KiB

title, type, weight, description
title type weight description
Serverless for Apache Spark docs 1 Google Cloud Serverless for Apache Spark lets you run Spark workloads without requiring you to provision and manage your own Spark cluster.

About

The Serverless for Apache Spark source allows Toolbox to interact with Spark batches hosted on Google Cloud Serverless for Apache Spark.

Available Tools

Requirements

IAM Permissions

Serverless for Apache Spark uses Identity and Access Management (IAM) to control user and group access to serverless Spark resources like batches and sessions.

Toolbox will use your Application Default Credentials (ADC) to authorize and authenticate when interacting with Google Cloud Serverless for Apache Spark. When using this method, you need to ensure the IAM identity associated with your ADC has the correct permissions for the actions you intend to perform. Common roles include roles/dataproc.serverlessEditor (which includes permissions to run batches) or roles/dataproc.serverlessViewer. Follow this guide to set up your ADC.

Example

kind: sources
name: my-serverless-spark-source
type: serverless-spark
project: my-project-id
location: us-central1

Reference

field type required description
type string true Must be "serverless-spark".
project string true ID of the GCP project with Serverless for Apache Spark resources.
location string true Location containing Serverless for Apache Spark resources.