feat: add support for DuckDB (#879)

Fixes #861 
This PR adds support for DuckDB which is a free, open-source, embedded,
in-process, relational database management system (RDBMS) designed for
analytical processing (OLAP)

---------

Co-authored-by: Averi Kitsch <akitsch@google.com>
This commit is contained in:
Pranava B
2025-07-30 03:01:22 +05:30
committed by GitHub
parent c1305b5ab4
commit fd149337e9
13 changed files with 912 additions and 21 deletions

View File

@@ -0,0 +1,73 @@
---
title: DuckDB
linkTitle: DuckDB
type: docs
weight: 1
description: >
DuckDB is an in-process SQL OLAP database management system designed for analytical query processing.
---
## About
[DuckDB](https://duckdb.org/) is an embedded analytical database management system that runs in-process with the client application. It is optimized for analytical workloads, providing high performance for complex queries with minimal setup.
DuckDB has the following notable characteristics:
- In-process, serverless database engine
- Supports complex SQL queries for analytical processing
- Can operate on in-memory or persistent storage
- Zero-configuration - no external dependencies or server setup required
- Highly optimized for columnar data storage and query execution
For more details, refer to the [DuckDB Documentation](https://duckdb.org/).
## Available Tools
- [`duckdb-sql`](../tools/duckdb/duckdb-sql.md)
Execute pre-defined prepared SQL queries in DuckDB.
## Requirements
### Database File
To use DuckDB, you can either:
- Specify a file path for a persistent database stored on the filesystem
- Omit the file path to use an in-memory database
## Example
For a persistent DuckDB database:
```yaml
sources:
my-duckdb:
kind: "duckdb"
dbFilePath: "/path/to/database.db"
configuration:
memory_limit: "2GB"
threads: "4"
```
For an in-memory DuckDB database:
```yaml
sources:
my-duckdb-memory:
name: "my-duckdb-memory"
kind: "duckdb"
```
## Reference
### Configuration Fields
| **field** | **type** | **required** | **description** |
|-------------------|:-----------------:|:------------:|---------------------------------------------------------------------------------|
| kind | string | true | Must be "duckdb". |
| dbFilePath | string | false | Path to the DuckDB database file. Omit for an in-memory database. |
| configuration | map[string]string | false | Additional DuckDB configuration options (e.g., `memory_limit`, `threads`). |
For a complete list of available configuration options, refer to the [DuckDB Configuration Documentation](https://duckdb.org/docs/stable/configuration/overview.html#local-configuration-options).
For more details on the Go implementation, see the [go-duckdb package documentation](https://pkg.go.dev/github.com/scottlepp/go-duckdb#section-readme).

View File

@@ -0,0 +1,7 @@
---
title: "DuckDB"
type: docs
weight: 1
description: >
Tools that work with DuckDB Sources.
---

View File

@@ -0,0 +1,80 @@
---
title: "duckdb-sql"
type: docs
weight: 1
description: >
Execute SQL statements against a DuckDB database using the DuckDB SQL tools configuration.
aliases:
- /resources/tools/duckdb-sql
---
## About
A `duckdb-sql` tool executes a pre-defined SQL statement against a [DuckDB](https://duckdb.org/) database. It is compatible with any DuckDB source configuration as defined in the [DuckDB source documentation](../../sources/duckdb.md).
The specified SQL statement is executed as a prepared statement, and parameters are inserted according to their position: e.g., `$1` is the first parameter, `$2` is the second, and so on. If template parameters are included, they are resolved before execution of the prepared statement.
DuckDB's SQL dialect closely follows the conventions of the PostgreSQL dialect, with a few exceptions listed in the [DuckDB PostgreSQL Compatibility documentation](https://duckdb.org/docs/stable/sql/dialect/postgresql_compatibility.html). For an introduction to DuckDB's SQL dialect, refer to the [DuckDB SQL Introduction](https://duckdb.org/docs/stable/sql/introduction).
### Concepts
DuckDB is a relational database management system (RDBMS). Data is stored in relations (tables), where each table is a named collection of rows. Each row in a table has the same set of named columns, each with a specific data type. Tables are stored within schemas, and a collection of schemas constitutes the entire database.
For more details, see the [DuckDB SQL Introduction](https://duckdb.org/docs/stable/sql/introduction).
## Example
> **Note:** This tool uses parameterized queries to prevent SQL injections. Query parameters can be used as substitutes for arbitrary expressions but cannot be used for identifiers, column names, table names, or other parts of the query.
```yaml
tools:
search-users:
kind: duckdb-sql
source: my-duckdb
description: Search users by name and age
statement: SELECT * FROM users WHERE name LIKE $1 AND age >= $2
parameters:
- name: name
type: string
description: The name to search for
- name: min_age
type: integer
description: Minimum age
```
## Example with Template Parameters
> **Note:** Template parameters allow direct modifications to the SQL statement, including identifiers, column names, and table names, which makes them more vulnerable to SQL injections. Using basic parameters (see above) is recommended for performance and safety. For more details, see the [templateParameters](../#template-parameters) section.
```yaml
tools:
list_table:
kind: duckdb-sql
source: my-duckdb
statement: |
SELECT * FROM {{.tableName}};
description: |
Use this tool to list all information from a specific table.
Example:
{{
"tableName": "flights",
}}
templateParameters:
- name: tableName
type: string
description: Table to select from
```
## Reference
### Configuration Fields
| **field** | **type** | **required** | **description** |
|--------------------|:-------------------------------:|:------------:|--------------------------------------------------------------------------------------------------------------------------------------------|
| kind | string | true | Must be "duckdb-sql". |
| source | string | true | Name of the DuckDB source configuration (see [DuckDB source documentation](../../sources/duckdb.md)). |
| description | string | true | Description of the tool that is passed to the LLM. |
| statement | string | true | The SQL statement to execute. |
| authRequired | []string | false | List of authentication requirements for the tool (if any). |
| parameters | [parameters](../#specifying-parameters) | false | List of parameters that will be inserted into the SQL statement |
| templateParameters | [templateParameters](../#template-parameters) | false | List of template parameters that will be inserted into the SQL statement before executing the prepared statement. |