mirror of
https://github.com/googleapis/genai-toolbox.git
synced 2026-01-10 07:58:12 -05:00
feat(tools/dataform): add dataform compile tool (#1470)
## Description
This change introduces a new tool for compiling local Dataform projects.
The new tool, `dataform-compile-local`, allows users to programmatically run
the `dataform compile` command against a project on the local
filesystem. This tool does not require a `source` and instead relies on
the `dataform` CLI being available in the server's `PATH`.
### Changes:
* Added the new tool definition in the
  `internal/tools/dataform/dataformcompilelocal` package.
* The tool requires the following parameter:
  * `project_dir`: The local Dataform project directory to compile.
* The tool uses `os/exec` to run `dataform compile <project_dir> --json`
  and returns the CLI's JSON output as a string.
* Added a new integration test in
  `tests/dataform/dataform_integration_test.go` which:
  * Skips the test if the `dataform` CLI is not found in the `PATH`.
  * Uses `dataform init` to create a temporary, minimal project for
    testing.
  * Verifies success, missing parameter errors, and errors from a
    non-existent directory.
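
As a rough illustration of the `os/exec` step described in the list above, the sketch below runs the CLI and decodes its JSON output into a generic map. It is not the code from this commit: `parseCompileOutput` and `compileProject` are hypothetical names, and decoding into `map[string]any` is an assumption made so that no particular shape of the compiled graph is implied.

```go
package main

import (
	"bytes"
	"context"
	"encoding/json"
	"fmt"
	"os/exec"
)

// parseCompileOutput (hypothetical helper) decodes the JSON document printed
// by the CLI into a generic map. No particular schema is assumed.
func parseCompileOutput(raw []byte) (map[string]any, error) {
	var graph map[string]any
	if err := json.Unmarshal(bytes.TrimSpace(raw), &graph); err != nil {
		return nil, fmt.Errorf("decoding dataform output: %w", err)
	}
	return graph, nil
}

// compileProject (hypothetical helper) runs `dataform compile <dir> --json`
// and decodes stdout only, so anything written to stderr stays out of the JSON.
func compileProject(ctx context.Context, projectDir string) (map[string]any, error) {
	out, err := exec.CommandContext(ctx, "dataform", "compile", projectDir, "--json").Output()
	if err != nil {
		return nil, fmt.Errorf("running dataform compile: %w", err)
	}
	return parseCompileOutput(out)
}

func main() {
	graph, err := compileProject(context.Background(), ".")
	if err != nil {
		// Expected when the CLI is not installed or "." is not a project.
		fmt.Println("compile failed:", err)
		return
	}
	fmt.Println("top-level keys:", len(graph))
}
```

Reading only stdout is a design choice of this sketch; it keeps stray stderr output from making the result undecodable.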
---
> Should include a concise description of the changes (bug or feature), its
> impact, along with a summary of the solution
## PR Checklist
---
> Thank you for opening a Pull Request! Before submitting your PR, there are a
> few things you can do to make sure it goes smoothly:
- [x] Make sure you reviewed
[CONTRIBUTING.md](https://github.com/googleapis/genai-toolbox/blob/main/CONTRIBUTING.md)
- [x] Make sure to open an issue as a
[bug/issue](https://github.com/googleapis/genai-toolbox/issues/new/choose)
before writing your code! That way we can discuss the change, evaluate
designs, and agree on the general idea
- [x] Ensure the tests and linter pass
- [x] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)
- [x] Make sure to add `!` if this involves a breaking change
🛠️ Fixes #1469
---------
Co-authored-by: Yuan Teoh <45984206+Yuan325@users.noreply.github.com>
@@ -194,6 +194,26 @@ steps:
        dataplex \
        dataplex

  - id: "dataform"
    name: golang:1
    waitFor: ["compile-test-binary"]
    entrypoint: /bin/bash
    env:
      - "GOPATH=/gopath"
    secretEnv: ["CLIENT_ID"]
    volumes:
      - name: "go"
        path: "/gopath"
    args:
      - -c
      - |
        apt-get update && apt-get install -y npm && \
        npm install -g @dataform/cli && \
        .ci/test_with_coverage.sh \
        "Dataform" \
        dataform \
        dataform

  - id: "postgres"
    name: golang:1
    waitFor: ["compile-test-binary"]
@@ -80,6 +80,7 @@ import (
	_ "github.com/googleapis/genai-toolbox/internal/tools/cloudsqlmysql/cloudsqlmysqlcreateinstance"
	_ "github.com/googleapis/genai-toolbox/internal/tools/cloudsqlpg/cloudsqlpgcreateinstances"
	_ "github.com/googleapis/genai-toolbox/internal/tools/couchbase"
	_ "github.com/googleapis/genai-toolbox/internal/tools/dataform/dataformcompilelocal"
	_ "github.com/googleapis/genai-toolbox/internal/tools/dataplex/dataplexlookupentry"
	_ "github.com/googleapis/genai-toolbox/internal/tools/dataplex/dataplexsearchaspecttypes"
	_ "github.com/googleapis/genai-toolbox/internal/tools/dataplex/dataplexsearchentries"
7
docs/en/resources/tools/dataform/_index.md
Normal file
@@ -0,0 +1,7 @@
---
title: "Dataform"
type: docs
weight: 1
description: >
  Tools that work with Dataform.
---
48
docs/en/resources/tools/dataform/dataform-compile-local.md
Normal file
@@ -0,0 +1,48 @@
---
title: "dataform-compile-local"
type: docs
weight: 1
description: >
  A "dataform-compile-local" tool runs the `dataform compile` CLI command on a local project directory.
aliases:
- /resources/tools/dataform-compile-local
---

## About

A `dataform-compile-local` tool runs the `dataform compile` command on a local Dataform project.

It is a standalone tool and **is not** compatible with any sources.

At invocation time, the tool executes `dataform compile --json` in the specified project directory and returns the resulting JSON object from the CLI.

`dataform-compile-local` takes the following parameter:

- `project_dir` (string): The absolute or relative path to the local Dataform project directory. The server process must have read access to this path.

## Requirements

### Dataform CLI

This tool executes the `dataform` command-line interface (CLI) via a system call. You must have the **`dataform` CLI** installed and available in the server's system `PATH`.

You can typically install the CLI via `npm`:

```bash
npm install -g @dataform/cli
```

See the [official Dataform documentation](https://cloud.google.com/dataform/docs/install-dataform-cli) for more details.

## Example

```yaml
tools:
  my_dataform_compiler:
    kind: dataform-compile-local
    description: Use this tool to compile a local Dataform project.
```

## Reference

| **field** | **type** | **required** | **description** |
| :---- | :---- | :---- | :---- |
| kind | string | true | Must be "dataform-compile-local". |
| description | string | true | Description of the tool that is passed to the LLM. |
@@ -0,0 +1,122 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package dataformcompilelocal

import (
	"context"
	"fmt"
	"os/exec"
	"strings"

	"github.com/goccy/go-yaml"
	"github.com/googleapis/genai-toolbox/internal/sources"
	"github.com/googleapis/genai-toolbox/internal/tools"
)

const kind string = "dataform-compile-local"

func init() {
	if !tools.Register(kind, newConfig) {
		panic(fmt.Sprintf("tool kind %q already registered", kind))
	}
}

func newConfig(ctx context.Context, name string, decoder *yaml.Decoder) (tools.ToolConfig, error) {
	actual := Config{Name: name}
	if err := decoder.DecodeContext(ctx, &actual); err != nil {
		return nil, err
	}
	return actual, nil
}

type Config struct {
	Name         string   `yaml:"name" validate:"required"`
	Kind         string   `yaml:"kind" validate:"required"`
	Description  string   `yaml:"description" validate:"required"`
	AuthRequired []string `yaml:"authRequired"`
}

var _ tools.ToolConfig = Config{}

func (cfg Config) ToolConfigKind() string {
	return kind
}

func (cfg Config) Initialize(srcs map[string]sources.Source) (tools.Tool, error) {
	allParameters := tools.Parameters{
		tools.NewStringParameter("project_dir", "The Dataform project directory."),
	}
	paramManifest := allParameters.Manifest()
	mcpManifest := tools.GetMcpManifest(cfg.Name, cfg.Description, cfg.AuthRequired, allParameters)

	t := Tool{
		Name:         cfg.Name,
		Kind:         kind,
		AuthRequired: cfg.AuthRequired,
		Parameters:   allParameters,
		manifest:     tools.Manifest{Description: cfg.Description, Parameters: paramManifest, AuthRequired: cfg.AuthRequired},
		mcpManifest:  mcpManifest,
	}

	return t, nil
}

var _ tools.Tool = Tool{}

type Tool struct {
	Name         string           `yaml:"name"`
	Kind         string           `yaml:"kind"`
	AuthRequired []string         `yaml:"authRequired"`
	Parameters   tools.Parameters `yaml:"allParams"`
	manifest     tools.Manifest
	mcpManifest  tools.McpManifest
}

func (t Tool) Invoke(ctx context.Context, params tools.ParamValues, accessToken tools.AccessToken) (any, error) {
	paramsMap := params.AsMap()

	projectDir, ok := paramsMap["project_dir"].(string)
	if !ok || projectDir == "" {
		return nil, fmt.Errorf("error casting 'project_dir' to string or invalid value")
	}

	cmd := exec.CommandContext(ctx, "dataform", "compile", projectDir, "--json")
	output, err := cmd.CombinedOutput()
	if err != nil {
		return nil, fmt.Errorf("error executing dataform compile: %w\nOutput: %s", err, string(output))
	}

	return strings.TrimSpace(string(output)), nil
}

func (t Tool) ParseParams(data map[string]any, claims map[string]map[string]any) (tools.ParamValues, error) {
	return tools.ParseParams(t.Parameters, data, claims)
}

func (t Tool) Manifest() tools.Manifest {
	return t.manifest
}

func (t Tool) McpManifest() tools.McpManifest {
	return t.mcpManifest
}

func (t Tool) Authorized(verifiedAuthServices []string) bool {
	return tools.IsAuthorized(t.AuthRequired, verifiedAuthServices)
}

func (t Tool) RequiresClientAuthorization() bool {
	return false
}
@@ -0,0 +1,71 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package dataformcompilelocal_test

import (
	"testing"

	yaml "github.com/goccy/go-yaml"
	"github.com/google/go-cmp/cmp"
	"github.com/googleapis/genai-toolbox/internal/server"
	"github.com/googleapis/genai-toolbox/internal/testutils"
	"github.com/googleapis/genai-toolbox/internal/tools/dataform/dataformcompilelocal"
)

func TestParseFromYamlDataformCompile(t *testing.T) {
	ctx, err := testutils.ContextWithNewLogger()
	if err != nil {
		t.Fatalf("unexpected error: %s", err)
	}
	tcs := []struct {
		desc string
		in   string
		want server.ToolConfigs
	}{
		{
			desc: "basic example",
			in: `
			tools:
				example_tool:
					kind: dataform-compile-local
					description: some description
			`,
			want: server.ToolConfigs{
				"example_tool": dataformcompilelocal.Config{
					Name:         "example_tool",
					Kind:         "dataform-compile-local",
					Description:  "some description",
					AuthRequired: []string{},
				},
			},
		},
	}
	for _, tc := range tcs {
		t.Run(tc.desc, func(t *testing.T) {
			got := struct {
				Tools server.ToolConfigs `yaml:"tools"`
			}{}
			// Parse contents
			err := yaml.UnmarshalContext(ctx, testutils.FormatYaml(tc.in), &got)
			if err != nil {
				t.Fatalf("unable to unmarshal: %s", err)
			}
			if diff := cmp.Diff(tc.want, got.Tools); diff != "" {
				t.Fatalf("incorrect parse: diff %v", diff)
			}
		})
	}
}
138
tests/dataform/dataform_integration_test.go
Normal file
@@ -0,0 +1,138 @@
// Copyright 2025 Google LLC
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package dataformcompilelocal

import (
	"context"
	"fmt"
	"net/http"
	"os"
	"os/exec"
	"path/filepath"
	"regexp"
	"strings"
	"testing"
	"time"

	"github.com/googleapis/genai-toolbox/internal/testutils"
	"github.com/googleapis/genai-toolbox/tests"
)

// setupTestProject creates a minimal dataform project using the 'dataform init' CLI.
// It returns the path to the directory and a cleanup function.
func setupTestProject(t *testing.T) (string, func()) {
	tmpDir, err := os.MkdirTemp("", "dataform-project-*")
	if err != nil {
		t.Fatalf("Failed to create temp dir: %v", err)
	}
	cleanup := func() {
		os.RemoveAll(tmpDir)
	}

	cmd := exec.Command("dataform", "init", tmpDir, "test-project-id", "US")
	if output, err := cmd.CombinedOutput(); err != nil {
		cleanup()
		t.Fatalf("Failed to run 'dataform init': %v\nOutput: %s", err, string(output))
	}

	definitionsDir := filepath.Join(tmpDir, "definitions")
	exampleSQLX := `config { type: "table" } SELECT 1 AS test_col`
	err = os.WriteFile(filepath.Join(definitionsDir, "example.sqlx"), []byte(exampleSQLX), 0644)
	if err != nil {
		cleanup()
		t.Fatalf("Failed to write example.sqlx: %v", err)
	}

	return tmpDir, cleanup
}

func TestDataformCompileTool(t *testing.T) {
	if _, err := exec.LookPath("dataform"); err != nil {
		t.Skip("dataform CLI not found in $PATH, skipping integration test")
	}

	projectDir, cleanupProject := setupTestProject(t)
	defer cleanupProject()

	toolsFile := map[string]any{
		"tools": map[string]any{
			"my-dataform-compiler": map[string]any{
				"kind":        "dataform-compile-local",
				"description": "Tool to compile dataform projects",
			},
		},
	}

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Minute)
	defer cancel()

	cmd, cleanupServer, err := tests.StartCmd(ctx, toolsFile)
	if err != nil {
		t.Fatalf("command initialization returned an error: %s", err)
	}
	defer cleanupServer()

	waitCtx, cancelWait := context.WithTimeout(ctx, 30*time.Second)
	defer cancelWait()
	out, err := testutils.WaitForString(waitCtx, regexp.MustCompile(`Server ready to serve`), cmd.Out)
	if err != nil {
		t.Logf("toolbox command logs: \n%s", out)
		t.Fatalf("toolbox didn't start successfully: %s", err)
	}

	nonExistentDir := filepath.Join(os.TempDir(), "non-existent-dir")

	testCases := []struct {
		name       string
		reqBody    string
		wantStatus int
		wantBody   string // Substring to check for in the response
	}{
		{
			name:       "success case",
			reqBody:    fmt.Sprintf(`{"project_dir":"%s"}`, projectDir),
			wantStatus: http.StatusOK,
			wantBody:   "test_col",
		},
		{
			name:       "missing parameter",
			reqBody:    `{}`,
			wantStatus: http.StatusBadRequest,
			wantBody:   `parameter \"project_dir\" is required`,
		},
		{
			name:       "non-existent directory",
			reqBody:    fmt.Sprintf(`{"project_dir":"%s"}`, nonExistentDir),
			wantStatus: http.StatusBadRequest,
			wantBody:   "error executing dataform compile",
		},
	}

	api := "http://127.0.0.1:5000/api/tool/my-dataform-compiler/invoke"

	for _, tc := range testCases {
		t.Run(tc.name, func(t *testing.T) {
			resp, bodyBytes := tests.RunRequest(t, http.MethodPost, api, strings.NewReader(tc.reqBody), nil)

			if resp.StatusCode != tc.wantStatus {
				t.Fatalf("unexpected status: got %d, want %d. Body: %s", resp.StatusCode, tc.wantStatus, string(bodyBytes))
			}

			if tc.wantBody != "" && !strings.Contains(string(bodyBytes), tc.wantBody) {
				t.Fatalf("expected body to contain %q, got: %s", tc.wantBody, string(bodyBytes))
			}
		})
	}
}