Update source code `Kind` to `Type`. It's only changes within our code.
Changes to yaml tag (that will affect users) will be done in later PRs.
This is a breaking change since it updates telemetry's span attribute
from `source_kind` to `source_type`.
Related #817
Future updates will include:
* Updating a preprocessing function to convert config file from v1 to v2
* Update unmarshal function for ToolsFile to convert config file (test
will fail since the yaml tag is not yet updated).
* Update yaml tag (test will pass).
Move source-related queries from `Invoke()` function into Source.
This PR addresses the following sources:
* dataplex
* http
* serverlessspark
This is an effort to generalizing tools to work with any Source that
implements a specific interface. This will provide a better segregation
of the roles for Tools vs Source.
Tool's role will be limited to the following:
* Resolve any pre-implementation steps or parameters (e.g. template
parameters)
* Retrieving Source
* Calling the source's implementation
Add parameter `embeddedBy` field to support vector embedding & semantic
search.
Major change in `internal/util/parameters/parameters.go`
This PR only adds vector formatter for the postgressql tool. Other tools
requiring vector formatting may not work with embeddedBy.
Second part of the Semantic Search support. First part:
https://github.com/googleapis/genai-toolbox/pull/2121
This PR update the linking mechanism between Source and Tool.
Tools are directly linked to their Source, either by pointing to the
Source's functions or by assigning values from the source during Tool's
initialization. However, the existing approach means that any
modification to the Source after Tool's initialization might not be
reflected. To address this limitation, each tool should only store a
name reference to the Source, rather than direct link or assigned
values.
Tools will provide interface for `compatibleSource`. This will be used
to determine if a Source is compatible with the Tool.
```
type compatibleSource interface{
Client() http.Client
ProjectID() string
}
```
During `Invoke()`, the tool will run the following operations:
* retrieve Source from the `resourceManager` with source's named defined
in Tool's config
* validate Source via `compatibleSource interface{}`
* run the remaining `Invoke()` function. Fields that are needed is
retrieved directly from the source.
With this update, resource manager is also added as input to other
Tool's function that require access to source (e.g.
`RequiresClientAuthorization()`).
This tool is almost identical to create_pyspark_batch, but for Java
Spark batches. There are some minor differences in how the main files
and args are provided.
We are planning to add several very similar tools for creating batches
like the existing pyspark batches: spark (Java), R, etc. They will use
an identical approach of specifying environment and runtime config in
the YAML, differing only in how language-specific args are passed. We
can streamline this.