This only checks within `SourceConfig`, `ToolConfig`, and
`AuthSourceConfig`.
Error when an unknown field is provided:
`2025-01-27T22:43:46.988401-08:00 ERROR "unable to parse tool file at
\"tools.yaml\": unable to parse as \"cloud-sql-postgres\": [2:1] unknown
field \"extra\"\n 1 | database: test_database\n> 2 | extra: here\n ^\n 3
| instance: toolbox-cloudsql\n 4 | kind: cloud-sql-postgres\n 5 |
password: postgres\n 6 | "`
Error when a required field is not provided:
`2025-01-27T17:49:47.584846-08:00 ERROR "unable to parse tool file at
\"tools.yaml\": validation failed: Key: 'Config.Region' Error:Field
validation for 'Region' failed on the 'required' tag"`
---------
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Add user agent to cloud databases that provides us anonymized data
request count, number of users, number of projects, and other
environment settings.
User agent is using the format: `genai-toolbox/$version+metadata`
Currently, we are throwing 401 error immediately after auth token
verification failure. This is not expected in the following situations:
1. Non-auth tool invocation with auth token that is invalid.
2. Auth tool invocation with all the required auth token, but the header
contains extra non-required token that is invalid
These requests should pass the authorization check but fail under the
current implementation.
Change made in this PR:
1. Do not throw error immediately after auth token verification failure.
Instead only log it and continue to the next header iteration.
2. In the parseParams() method, if an auth parameter is missing, we
should error with the message telling the user that either the auth
header is missing or is invalid.
1. `sql/database` provides a `Scan()`interface to scan query results
into typed variables. Therefore we have to create a slice of typed
variables (types retrieved from rows.ColumnTypes()) to pass them into
`Scan()`. Using []byte works but makes the printing result different
from other tools (e.g [1] instead of %!s(int32=1)]
2. MS SQL supports both named (e.g @name) and positional args (e.g @p2),
so we have to check if the name is contained in the original statement
before passing them into `db.Query()` as either named arg or as values.
Add CloudSQL for MySQL source and tool.
CloudSQLMySQL source is initialize with the following config:
```
sources:
my-cloudsqlmysql-source:
kind: cloud-sql-mysql
project: my-project-name
region: my-region
instance: my-instance-name
user: my_user
password: my_pass
database: my_db
# ipType: public # The default dialect is public.
```
MySQL tool is initialize with the following config.
```
tools:
test_tool:
kind: mysql
source: my-cloudsqlmysql-source
description: >
Testing tool.
statement: "SELECT 1;"
```
- configure neo4j source with url, username, password, database
- configure neo4j tools with cypher statement and paramters
- tests based on the postgres tests
- neo4j.yaml for integration tests
---------
Co-authored-by: duwenxin <duwenxin@google.com>
Add debug logs to Toolbox.
For example when a http fail, it will just show Error at the http level,
but not log with actual error message. err message are returned to the
api as following `{"status":"Internal Server Error","error":"error while
invoking tool: unable to execute client: spanner: code =
\"InvalidArgument\", desc = \"invalid session pool\""}`.
After adding this, if user/dev run toolbox with `--log-level=debug`, it
will output the following (debug log in addition to the error for http
request):
```
2025-01-08T14:16:25.040824-08:00 DEBUG "error while invoking tool: unable to execute client: spanner: code = \"InvalidArgument\", desc = \"invalid session pool\""
2025-01-08T14:16:25.040968-08:00 ERROR Response: 500 Server Error service: "httplog" httpRequest: {url: "http://127.0.0.1:5000/api/tool/test_tool_two/invoke" method: "POST" path: "/api/tool/test_tool_two/invoke" remoteIP: "127.0.0.1:51708" proto: "HTTP/1.1" requestID: "yuanteoh-macbookpro.roam.internal/N7LNMcLIUH-000001" scheme: "http" header: {user-agent: "curl/8.7.1" accept: "*/*" content-type: "application/json" content-length: "2"}} httpResponse: {status: 500 bytes: 167 elapsed: 0.301917}
```
Corrects an issue caused by Go defaulting to parsing JSON Numbers as
float64s. This caused some numbers to be incorrectly parsed as floats
when they were integers. This defaults to parsing using json.Number,
which allows us to parse between Int/Float more accurately.
Adds logic to make the server shutdown gracefully, including better
respecting cancelled contexts and providing up to 10 seconds to finish
current connections.
1. Add []ParamAuthSource to every Parameter type implementation to
support authenticated configs. Create new constructors for types with
auth.
2. Tool invocation API changes to parse auth header and authentecated
parameters.
3. Add authSources to Tool manifest.
End to end integration test for cloudsql postgres.
Include checks for one tool's get (manifest) and post (invoke) endpoint.
Integration tests are excluded from regular unit tests.
Add Spanner source and tool.
Spanner source is initialize with the following config:
```
sources:
my-spanner-source:
kind: spanner
project: my-project-name
instance: my-instance-name
database: my_db
# dialect: postgresql # The default dialect is google_standard_sql.
```
Spanner tool (with gsql dialect) is initialize with the following
config.
```
tools:
get_flight_by_id:
kind: spanner
source: my-cloud-sql-source
description: >
Use this tool to list all airports matching search criteria. Takes
at least one of country, city, name, or all and returns all matching
airports. The agent can decide to return the results directly to
the user.
statement: "SELECT * FROM flights WHERE id = @id"
parameters:
- name: id
type: int
description: 'id' represents the unique ID for each flight.
```
Spanner tool (with postgresql dialect) is initialize with the following
config.
```
tools:
get_flight_by_id:
kind: spanner
source: my-cloud-sql-source
description: >
Use this tool to list all airports matching search criteria. Takes
at least one of country, city, name, or all and returns all matching
airports. The agent can decide to return the results directly to
the user.
statement: "SELECT * FROM flights WHERE id = $1"
parameters:
- name: id
type: int
description: 'id' represents the unique ID for each flight.
```
Note: the only difference in config for both dialects is the sql
statement.
---------
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Allow user to set if their database uses private or public ip. The
reason we add this is because the dialer require different
initialization with private and public ip.
By default, toolbox will use public ip.
Different databases require different types for `Params` field when
adding parameters to their statement. e.g. alloydb, cloudsql, and
postgres uses `pgxpool` to query and build sql statement, whereas
spanner uses `Spanner` library.
Added a new `ParamValue` struct. `ParseParams` helper function parses
arbitraryJSON object into `[]ParamValue`, and the tool's invoke will
convert `[]ParamValue` into it's required type.
---------
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Allow user to set if their database uses private or public ip. The
reason we add this is because the dialer require different
initialization with private and public ip.
By default, toolbox will use public ip.
---------
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Logging support 4 different types of logging (debug, info, warn, error).
The default logging level is Info.
User will be able to set flag for log level (allowed values: "debug",
"info", "warn", "error"), example:
`go run . --log-level debug`
User will be able to set flag for logging format (allowed values:
"standard", "JSON"), example:
`go run . --logging-format json`
**sample http request log - std:**
server
```
2024-11-12T15:08:11.451377-08:00 INFO "Initalized 0 sources.\n"
```
httplog
```
2024-11-26T15:15:53.947287-08:00 INFO Response: 200 OK service: "httplog" httpRequest: {url: "http://127.0.0.1:5000/" method: "GET" path: "/" remoteIP: "127.0.0.1:64216" proto: "HTTP/1.1" requestID: "macbookpro.roam.interna/..." scheme: "http" header: {user-agent: "curl/8.7.1" accept: "*/*"}} httpResponse: {status: 200 bytes: 22 elapsed: 0.012417}
```
**sample http request log - structured:**
server
```
{
"timestamp":"2024-11-04T16:45:11.987299-08:00",
"severity":"ERROR",
"logging.googleapis.com/sourceLocation":{
"function":"github.com/googleapis/genai-toolbox/internal/log.(*StructuredLogger).Errorf",
"file":"/Users/yuanteoh/github/genai-toolbox/internal/log/log.go","line":157
},
"message":"unable to parse tool file at \"tools.yaml\": \"cloud-sql-postgres1\" is not a valid kind of data source"
}
```
httplog
```
{
"timestamp":"2024-11-26T15:12:49.290974-08:00",
"severity":"INFO",
"logging.googleapis.com/sourceLocation":{
"function":"github.com/go-chi/httplog/v2.(*RequestLoggerEntry).Write",
"file":"/Users/yuanteoh/go/pkg/mod/github.com/go-chi/httplog/v2@v2.1.1/httplog.go","line":173
},
"message":"Response: 200 OK",
"service":"httplog",
"httpRequest":{
"url":"http://127.0.0.1:5000/",
"method":"GET",
"path":"/",
"remoteIP":"127.0.0.1:64140",
"proto":"HTTP/1.1",
"requestID":"yuanteoh-macbookpro.roam.internal/NBrtYBu3q9-000001",
"scheme":"http",
"header":{"user-agent":"curl/8.7.1","accept":"*/*"}
},
"httpResponse":{"status":200,"bytes":22,"elapsed":0.0115}
}
```
This PR introduces the following breaking change: The
`alloydb-pg-generic`, `cloud-sql-pg-generic`, and
`postgres-generic-tool` have been replaced by the `postgres-sql` tool,
which works with all 3 Postgres sources.
If you were using of the the previous tools, you will need to update it
as follows:
```diff
example_tool:
- kind: cloud-sql-pg-generic
+ kind: postgres-sql
source: my-cloud-sql-pg-instance
description: some description
statement: |
SELECT * FROM SQL_STATEMENT;
parameters:
- name: country
type: string
description: some description
```
I'm proposing this change for the following reasons:
1. It provides greater flexibility between postgres-compatible sources
-- you can change between "postgres" and "alloydb-postgres" without
issue
2. The name "postgres-sql" is more clear that "postgres-generic" -- it
indicates it's a tool that runs SQL on the source
3. It's easier for us to maintain feature compatibility across a single
"postgres-sql" tool
Moves all of the "source" and "tool" implementations into their own
packages. This layout makes it a bit more clear where the
implementations are, and seems likely to scale more cleanly as more
sources and tools are added.
Adds support for "array" type parameters. Uses a subet of JSONSchema for
specification, in that arrays can be specified in the following way:
```yaml
parameters:
name: "my_array"
type: "array"
description: "some description"
items:
type: "integer"
```
1. Calculate tool manifests when server starts.
2. Add toolset manifest endpoints.
---------
Co-authored-by: Kurtis Van Gent <31518063+kurtisvg@users.noreply.github.com>
Makes the following changes:
- Fills in previously stubbed "Invoke()" function for the Cloud SQL
Generic Tool
- Updates API to /{tool_name}/invoke to for invocation of said tool
- Updates response to use JSON
- Correctly returns error messages for invalid http codes
Add `Toolset` implementation to the `tools` package:
- struct and configs.
- Custom `UnmarshalYAML` function.
- Initialization function that validates if tools specified for the
toolset exist.
This PR adds preliminary parsing of parameters. Currently it only
supports 4 types: string, int, float32, and bool. Almost certainly we
will need to introduce more complicated parsing configuration (to handle
objects and arrays), but my initial attempts got quickly complicated, so
I simplified in the short term.
This also makes 2 breaking changes to config.yaml:
- changes "parameters" to be a list over object -- this is because
parameter ordering is important, and needs to be preserved
- removed the "required" field from parameter objects -- we need to
determine how to handle optional parameters in SQL queries