Commit Graph

11 Commits

Author SHA1 Message Date
Yuan Teoh
293c1d6889 feat!: update configuration file v2 (#2369)
This PR introduces a significant update to the Toolbox configuration
file format, which is one of the primary **breaking changes** required
for the implementation of the Advanced Control Plane.

# Summary of Changes
The configuration schema has been updated to enforce resource isolation
and facilitate atomic, incremental updates.
* Resource Isolation: Resource definitions are now separated into
individual blocks, using a distinct structure for each resource type
(Source, Tool, Toolset, etc.). This improves readability, management,
and auditing of configuration files.
* Field Name Modification: Internal field names have been modified to
align with declarative methodologies. Specifically, the configuration
now separates `kind` (the general resource type, e.g. Source) from
`type` (the specific implementation, e.g. Postgres).

# User Impact
Existing `tools.yaml` configuration files are now in an outdated format.
Users must eventually update their files to the new YAML format.

# Mitigation & Compatibility
Backward compatibility is maintained during this transition, so no
immediate user action is required for existing files.
* Immediate Backward Compatibility: The source code includes a
pre-processing layer that automatically detects outdated configuration
files (v1 format) and converts them to the new v2 format under the hood.
* [COMING SOON] Migration Support: The new `toolbox migrate` subcommand
will be introduced to allow users to automatically convert their old
configuration files to the latest format.

# Example
Example of a v2 config file:
```yaml
kind: sources
name: my-pg-instance
type: cloud-sql-postgres
project: my-project
region: my-region
instance: my-instance
database: my_db
user: my_user
password: my_pass
---
kind: authServices
name: my-google-auth
type: google
clientId: testing-id
---
kind: tools
name: example_tool
type: postgres-sql
source: my-pg-instance
description: some description
statement: SELECT * FROM SQL_STATEMENT;
parameters:
- name: country
  type: string
  description: some description
---
kind: tools
name: example_tool_2
type: postgres-sql
source: my-pg-instance
description: returning the number one
statement: SELECT 1;
---
kind: toolsets
name: example_toolset
tools:
- example_tool
```
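
For contrast, here is a sketch of how the same resources would be
declared in the outdated v1 format, where resources are nested under
top-level maps keyed by name and `kind` carries the specific
implementation. This is a reconstruction for illustration, not a file
copied from this PR:
```yaml
sources:
  my-pg-instance:
    kind: cloud-sql-postgres
    project: my-project
    region: my-region
    instance: my-instance
    database: my_db
    user: my_user
    password: my_pass
authServices:
  my-google-auth:
    kind: google
    clientId: testing-id
tools:
  example_tool:
    kind: postgres-sql
    source: my-pg-instance
    description: some description
    statement: SELECT * FROM SQL_STATEMENT;
    parameters:
      - name: country
        type: string
        description: some description
toolsets:
  example_toolset:
    - example_tool
```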

---------

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Averi Kitsch <akitsch@google.com>
2026-01-27 16:58:43 -08:00
Yuan Teoh
0691a6f715 refactor: move source implementation in Invoke() function to Source (#2274)
Move source-related queries from the `Invoke()` function into the Source.

This PR addresses the following sources:
* dataplex
* http
* serverlessspark

This is an effort to generalize tools to work with any Source that
implements a specific interface, providing a cleaner separation of
roles between Tool and Source (see the sketch after the list below).

The Tool's role will be limited to the following:
* Resolving any pre-implementation steps or parameters (e.g. template
parameters)
* Retrieving the Source
* Calling the Source's implementation
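
A minimal Go sketch of that division of labor, under the assumption of
a simplified interface; the names here (`Source`, `RunQuery`, the
`Tool` fields) are illustrative, not the actual types in the codebase:

```go
package example

import (
	"context"
	"fmt"
)

// Source owns the implementation-specific work (queries, API calls).
type Source interface {
	// RunQuery executes the source-specific operation for a tool.
	RunQuery(ctx context.Context, statement string, params map[string]any) (any, error)
}

// Tool keeps only the orchestration steps listed above.
type Tool struct {
	SourceName string
	Statement  string
}

func (t Tool) Invoke(ctx context.Context, sources map[string]Source, params map[string]any) (any, error) {
	// 1. Resolve any pre-implementation steps, e.g. template parameters (elided here).
	// 2. Retrieve the Source by name.
	src, ok := sources[t.SourceName]
	if !ok {
		return nil, fmt.Errorf("unknown source %q", t.SourceName)
	}
	// 3. Call the source's implementation.
	return src.RunQuery(ctx, t.Statement, params)
}
```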
2026-01-12 18:16:32 +00:00
Dave Borowitz
c6ccf4bd87 feat(serverless-spark)!: add URLs to create batch tool outputs 2025-12-10 15:10:40 -08:00
Dave Borowitz
5605eabd69 feat(serverless-spark)!: add URLs to list_batches output
Unlike get_batch, in this case we are not returning a JSON type directly
from the server, so we can add the new fields to our top-level object
rather than wrapping it.
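
A rough sketch of that distinction; the field names below are
assumptions for illustration, not the tool's actual output schema:

```go
package example

import "encoding/json"

// listBatchesOutput is built by the tool itself, so the new URL fields
// can sit directly at the top level next to the batch list.
type listBatchesOutput struct {
	Batches    []json.RawMessage `json:"batches"`
	ConsoleURL string            `json:"consoleUrl"` // new field
	LoggingURL string            `json:"loggingUrl"` // new field
}

// getBatchOutput shows the contrast: the server's verbatim JSON would
// have to be wrapped in an envelope to make room for extra fields.
type getBatchOutput struct {
	Batch      json.RawMessage `json:"batch"` // untouched server response
	ConsoleURL string          `json:"consoleUrl"`
	LoggingURL string          `json:"loggingUrl"`
}
```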
2025-12-10 15:10:40 -08:00
Dave Borowitz
e29c0616d6 feat(serverless-spark)!: add Cloud Console and Logging URLs to get_batch
These are useful links for humans to follow for more information
(output, metrics, logs) that's not readily available via MCP.
2025-12-10 15:10:40 -08:00
Dave Borowitz
17a979207d feat(serverless-spark): add create_spark_batch tool
This tool is almost identical to create_pyspark_batch, but for Java
Spark batches. There are some minor differences in how the main files
and args are provided.
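
A sketch of that difference at the level of the Dataproc proto types
these tools build on; the URIs and args are placeholders:

```go
package example

import "cloud.google.com/go/dataproc/v2/apiv1/dataprocpb"

// pysparkConfig: a PySpark batch takes a single main Python file plus args.
func pysparkConfig() *dataprocpb.Batch {
	return &dataprocpb.Batch{
		BatchConfig: &dataprocpb.Batch_PysparkBatch{
			PysparkBatch: &dataprocpb.PySparkBatch{
				MainPythonFileUri: "gs://my-bucket/main.py",
				Args:              []string{"--input", "gs://my-bucket/data"},
			},
		},
	}
}

// sparkConfig: a Java Spark batch takes a main jar (or main class) plus args.
func sparkConfig() *dataprocpb.Batch {
	return &dataprocpb.Batch{
		BatchConfig: &dataprocpb.Batch_SparkBatch{
			SparkBatch: &dataprocpb.SparkBatch{
				Driver: &dataprocpb.SparkBatch_MainJarFileUri{
					MainJarFileUri: "gs://my-bucket/app.jar",
				},
				Args: []string{"--input", "gs://my-bucket/data"},
			},
		},
	}
}
```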
2025-12-04 11:05:53 -08:00
Dave Borowitz
1bf0b51f03 feat(serverless-spark): add create_pyspark_batch tool
This tool creates a PySpark batch from a minimal set of parameters, to
keep things simple for the LLM. Advanced runtime and environment config
can be specified in tools.yaml.
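
For instance, advanced settings like these map onto the Batch proto's
RuntimeConfig and EnvironmentConfig rather than onto LLM-facing
parameters. A sketch with placeholder values, not real defaults:

```go
package example

import "cloud.google.com/go/dataproc/v2/apiv1/dataprocpb"

// advancedBatch shows where tools.yaml-level settings would land on
// the Dataproc Batch proto.
func advancedBatch() *dataprocpb.Batch {
	return &dataprocpb.Batch{
		RuntimeConfig: &dataprocpb.RuntimeConfig{
			Version:    "2.2", // Serverless Spark runtime version
			Properties: map[string]string{"spark.executor.instances": "4"},
		},
		EnvironmentConfig: &dataprocpb.EnvironmentConfig{
			ExecutionConfig: &dataprocpb.ExecutionConfig{
				ServiceAccount: "batch-sa@my-project.iam.gserviceaccount.com",
			},
		},
	}
}
```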
2025-12-04 11:05:53 -08:00
Dave Borowitz
2881683226 feat(serverless-spark): add cancel-batch tool 2025-11-05 11:13:35 -08:00
Dave Borowitz
8ef0566e1e refactor(serverless-spark): rearrange and parallelize integration tests
In general, tests should be parallelizable, since they interact only
with a deterministic set of batches. The exception is list-batches,
especially with pagination, so run that one sequentially.

This doesn't make much difference for the current set of tests, but in
the near future we will have tests that create batches, which take tens
of seconds to even start running.

Rearrange subtests to be primarily organized by tool, which is more
understandable and easier to filter with `-run`. Test helper methods can
still be called multiple times in different subtests for different
tools.
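
A sketch of that shape using the standard `t.Parallel()` pattern; the
helper names (`runListBatchesTests`, `runToolTests`) are hypothetical
stand-ins for the shared test bodies:

```go
package serverlessspark_test

import "testing"

// Hypothetical helpers standing in for the real shared test bodies.
func runListBatchesTests(t *testing.T)       { t.Log("list-batches cases") }
func runToolTests(t *testing.T, tool string) { t.Log("cases for " + tool) }

func TestServerlessSparkToolEndpoints(t *testing.T) {
	// list-batches (especially pagination) depends on the full, stable
	// set of batches, so it runs sequentially, without t.Parallel().
	t.Run("list-batches", runListBatchesTests)
	t.Run("parallel-tool-tests", func(t *testing.T) {
		for _, tool := range []string{"get-batch", "cancel-batch"} {
			tool := tool // capture for the parallel closure (pre-Go 1.22)
			t.Run(tool, func(t *testing.T) {
				t.Parallel() // each tool touches only its own deterministic batches
				runToolTests(t, tool)
			})
		}
	})
}
```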

Sample test output showing the new structure:

```
--- PASS: TestServerlessSparkToolEndpoints (2.01s)
    --- PASS: TestServerlessSparkToolEndpoints/list-batches (1.78s)
        --- PASS: TestServerlessSparkToolEndpoints/list-batches/success (1.23s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/success/filtered (0.34s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/success/empty (0.40s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/success/omit_page_size (0.42s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/success/one_page (0.43s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/success/20_batches (0.44s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/success/two_pages (0.54s)
        --- PASS: TestServerlessSparkToolEndpoints/list-batches/errors (0.00s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/errors/negative_page_size (0.01s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/errors/zero_page_size (0.01s)
        --- PASS: TestServerlessSparkToolEndpoints/list-batches/auth (0.77s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/auth/no_auth_token (0.00s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/auth/invalid_auth_token (0.00s)
            --- PASS: TestServerlessSparkToolEndpoints/list-batches/auth/valid_auth_token (0.18s)
    --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests (0.00s)
        --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch (0.09s)
            --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/errors (0.00s)
                --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/errors/full_batch_name (0.01s)
                --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/errors/missing_batch (0.11s)
            --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/success (0.21s)
                --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/success/found_batch (0.11s)
            --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/auth (0.60s)
                --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/auth/invalid_auth_token (0.00s)
                --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/auth/no_auth_token (0.00s)
                --- PASS: TestServerlessSparkToolEndpoints/parallel-tool-tests/get-batch/auth/valid_auth_token (0.11s)
```
2025-11-05 11:13:35 -08:00
Dave Borowitz
7ad10720b4 feat(serverless-spark): Add get_batch tool 2025-10-28 13:42:02 -07:00
Dave Borowitz
816dbce268 feat(serverless-spark): Add serverless-spark source with list_batches tool
Built as a thin wrapper over the official Google Cloud Dataproc Go
client library, with support for filtering and pagination.
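
A sketch of what such a thin wrapper might look like against the
Dataproc Go client (`cloud.google.com/go/dataproc/v2`); the function
shape and parameter plumbing are assumptions, not the tool's actual code:

```go
package example

import (
	"context"

	dataproc "cloud.google.com/go/dataproc/v2/apiv1"
	"cloud.google.com/go/dataproc/v2/apiv1/dataprocpb"
	"google.golang.org/api/iterator"
	"google.golang.org/api/option"
)

// listBatches fetches one page of batches, with an optional filter.
func listBatches(ctx context.Context, project, region, filter string, pageSize int, pageToken string) ([]*dataprocpb.Batch, string, error) {
	// The Batch Controller is a regional service, so point the client
	// at the region's endpoint.
	client, err := dataproc.NewBatchControllerClient(ctx,
		option.WithEndpoint(region+"-dataproc.googleapis.com:443"))
	if err != nil {
		return nil, "", err
	}
	defer client.Close()

	it := client.ListBatches(ctx, &dataprocpb.ListBatchesRequest{
		Parent: "projects/" + project + "/locations/" + region,
		Filter: filter,
	})
	// iterator.NewPager exposes the page-at-a-time API, which is what
	// a paginated tool wants: one page of results plus a next-page token.
	var batches []*dataprocpb.Batch
	nextToken, err := iterator.NewPager(it, pageSize, pageToken).NextPage(&batches)
	if err != nil {
		return nil, "", err
	}
	return batches, nextToken, nil
}
```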
2025-10-23 20:40:52 -07:00