`genai-toolbox/docs/en/resources/sources/bigquery.md`
Commit a4506009b9 by Shobhit Singh (2026-01-07): feat(bigquery): Make credentials scope configurable (#2210)
## Description

This change addresses the case where a user wants to use custom scopes. For
instance, the default scope (bigquery) falls short of running SQL that relies on
integrations with other Google products, such as Drive, Vertex AI, and Cloud Run.
With this change, users can configure custom scopes depending on their use case.

The custom scopes can be configured in the `tools.yaml` file, e.g.:

```yaml
sources:
  bigquery-source:
    kind: "bigquery"
    project: ${BIGQUERY_PROJECT}
    location: ${BIGQUERY_LOCATION:}
    useClientOAuth: ${BIGQUERY_USE_CLIENT_OAUTH:false}
    scopes:
      - "https://www.googleapis.com/auth/bigquery"
      - "https://www.googleapis.com/auth/drive"
```

If the [bigquery prebuilt
config](https://github.com/googleapis/genai-toolbox/blob/main/internal/prebuiltconfigs/tools/bigquery.yaml)
is being used, the scopes can also be set via an environment variable:

```shell
$ export BIGQUERY_SCOPES="https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/drive"
```

## PR Checklist

> Thank you for opening a Pull Request! Before submitting your PR, there are a
> few things you can do to make sure it goes smoothly:

- [ ] Make sure you reviewed
  [CONTRIBUTING.md](https://github.com/googleapis/genai-toolbox/blob/main/CONTRIBUTING.md)
- [x] Make sure to open an issue as a
  [bug/issue](https://github.com/googleapis/genai-toolbox/issues/new/choose)
  before writing your code! That way we can discuss the change, evaluate
  designs, and agree on the general idea
- [x] Ensure the tests and linter pass
- [x] Code coverage does not decrease (if any source code was changed)
- [x] Appropriate docs were updated (if necessary)
- [ ] Make sure to add `!` if this involves a breaking change

🛠️ Fixes #1942

| title | type | weight | description |
|-------|------|--------|-------------|
| BigQuery | docs | 1 | BigQuery is Google Cloud's fully managed, petabyte-scale, and cost-effective analytics data warehouse that lets you run analytics over vast amounts of data in near real time. With BigQuery, there's no infrastructure to set up or manage, letting you focus on finding meaningful insights using GoogleSQL and taking advantage of flexible pricing models across on-demand and flat-rate options. |

# BigQuery Source

BigQuery is Google Cloud's fully managed, petabyte-scale, and cost-effective analytics data warehouse that lets you run analytics over vast amounts of data in near real time. With BigQuery, there's no infrastructure to set up or manage, letting you focus on finding meaningful insights using GoogleSQL and taking advantage of flexible pricing models across on-demand and flat-rate options.

If you are new to BigQuery, you can try loading and querying data with the `bq` tool.
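For example, a quick sanity check with the `bq` CLI might look like the sketch below (it assumes the Google Cloud SDK is installed and uses a public dataset purely for illustration):

```shell
# Run a small GoogleSQL query against a public dataset to confirm that the
# CLI and your credentials work. The dataset and query are illustrative only.
$ bq query --use_legacy_sql=false \
    'SELECT name, SUM(number) AS total
     FROM `bigquery-public-data.usa_names.usa_1910_2013`
     GROUP BY name
     ORDER BY total DESC
     LIMIT 10'
```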

BigQuery uses GoogleSQL for querying data. GoogleSQL is an ANSI-compliant structured query language (SQL) that is also implemented for other Google Cloud services. The usual SQL best practices apply when writing queries to run against your BigQuery data, such as avoiding full table scans or unnecessarily complex filters.

## Available Tools

## Pre-built Configurations
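As a minimal sketch (assuming the `toolbox` binary, its `--prebuilt` flag, and the environment variables referenced by the prebuilt bigquery config linked above), starting Toolbox with the prebuilt BigQuery tools could look like:

```shell
# BIGQUERY_PROJECT is required by the prebuilt config; scopes are optional
# and only needed when the defaults are not sufficient.
$ export BIGQUERY_PROJECT="my-project-id"
$ export BIGQUERY_SCOPES="https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/drive"
$ ./toolbox --prebuilt bigquery
```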

## Requirements

### IAM Permissions

BigQuery uses Identity and Access Management (IAM) to control user and group access to BigQuery resources like projects, datasets, and tables.

### Authentication via Application Default Credentials (ADC)

By default, Toolbox will use your Application Default Credentials (ADC) to authorize and authenticate when interacting with BigQuery.

When using this method, you need to ensure the IAM identity associated with your ADC (such as a service account) has the correct permissions for the queries you intend to run. Common roles include roles/bigquery.user (which includes permissions to run jobs and read data) or roles/bigquery.dataViewer. Follow this guide to set up your ADC.
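For instance, granting a service account the BigQuery user role and creating ADC on a local workstation could look like the following sketch (the project ID and service account are placeholders):

```shell
# Allow the ADC identity to run jobs and read data in the project.
$ gcloud projects add-iam-policy-binding my-project-id \
    --member="serviceAccount:toolbox-sa@my-project-id.iam.gserviceaccount.com" \
    --role="roles/bigquery.user"

# Create Application Default Credentials for local development.
$ gcloud auth application-default login
```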

If you are running on Google Compute Engine (GCE) or Google Kubernetes Engine (GKE), you might need to explicitly set the access scopes for the service account. While you can configure scopes when creating the VM or node pool, you can also specify them in the source configuration using the scopes field. Common scopes include https://www.googleapis.com/auth/bigquery or https://www.googleapis.com/auth/cloud-platform.
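As an illustration (the instance name and zone are placeholders), the access scopes can be set when the VM is created; alternatively, list the same scopes under the source's `scopes` field as shown in the examples below:

```shell
# Create a VM whose default service account tokens carry the BigQuery and
# Drive scopes, so Toolbox running on it can reach both APIs.
$ gcloud compute instances create toolbox-vm \
    --zone=us-central1-a \
    --scopes=https://www.googleapis.com/auth/bigquery,https://www.googleapis.com/auth/drive
```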

### Authentication via User's OAuth Access Token

If the useClientOAuth parameter is set to true, Toolbox will instead use the OAuth access token for authentication. This token is parsed from the Authorization header passed in with the tool invocation request. This method allows Toolbox to make queries to BigQuery on behalf of the client or the end-user.

When using this on-behalf-of authentication, you must ensure that the identity used has been granted the correct IAM permissions.
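A hedged sketch of such an on-behalf-of call, assuming a Toolbox server listening locally on port 5000, a hypothetical tool named `execute_sql` with a `sql` parameter, and Toolbox's HTTP invoke endpoint (`/api/tool/<name>/invoke`):

```shell
# Obtain the end user's OAuth access token (here via gcloud, for testing)
# and forward it so the query runs with that user's permissions.
$ ACCESS_TOKEN="$(gcloud auth print-access-token)"
$ curl -X POST http://127.0.0.1:5000/api/tool/execute_sql/invoke \
    -H "Authorization: Bearer ${ACCESS_TOKEN}" \
    -H "Content-Type: application/json" \
    -d '{"sql": "SELECT 1"}'
```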

## Example

Initialize a BigQuery source that uses ADC:

```yaml
sources:
  my-bigquery-source:
    kind: "bigquery"
    project: "my-project-id"
    # location: "US" # Optional: Specifies the location for query jobs.
    # writeMode: "allowed" # One of: allowed, blocked, protected. Defaults to "allowed".
    # allowedDatasets: # Optional: Restricts tool access to a specific list of datasets.
    #   - "my_dataset_1"
    #   - "other_project.my_dataset_2"
    # impersonateServiceAccount: "service-account@project-id.iam.gserviceaccount.com" # Optional: Service account to impersonate
    # scopes: # Optional: List of OAuth scopes to request.
    #   - "https://www.googleapis.com/auth/bigquery"
    #   - "https://www.googleapis.com/auth/drive.readonly"
```

Initialize a BigQuery source that uses the client's access token:

```yaml
sources:
  my-bigquery-client-auth-source:
    kind: "bigquery"
    project: "my-project-id"
    useClientOAuth: true
    # location: "US" # Optional: Specifies the location for query jobs.
    # writeMode: "allowed" # One of: allowed, blocked, protected. Defaults to "allowed".
    # allowedDatasets: # Optional: Restricts tool access to a specific list of datasets.
    #   - "my_dataset_1"
    #   - "other_project.my_dataset_2"
    # impersonateServiceAccount: "service-account@project-id.iam.gserviceaccount.com" # Optional: Service account to impersonate
    # scopes: # Optional: List of OAuth scopes to request.
    #   - "https://www.googleapis.com/auth/bigquery"
    #   - "https://www.googleapis.com/auth/drive.readonly"
```

## Reference

| field | type | required | description |
|-------|------|----------|-------------|
| kind | string | true | Must be "bigquery". |
| project | string | true | Id of the Google Cloud project to use for billing and as the default project for BigQuery resources. |
| location | string | false | Specifies the location (e.g., 'us', 'asia-northeast1') in which to run the query job. This location must match the location of any tables referenced in the query. Defaults to the table's location or 'US' if the location cannot be determined. Learn More |
| writeMode | string | false | Controls the write behavior for tools. allowed (default): All queries are permitted. blocked: Only SELECT statements are allowed for the bigquery-execute-sql tool. protected: Enables session-based execution where all tools associated with this source instance share the same BigQuery session. This allows for stateful operations using temporary tables (e.g., CREATE TEMP TABLE). For bigquery-execute-sql, SELECT statements can be used on all tables, but write operations are restricted to the session's temporary dataset. For tools like bigquery-sql, bigquery-forecast, and bigquery-analyze-contribution, the writeMode restrictions do not apply, but they will operate within the shared session. Note: The protected mode cannot be used with useClientOAuth: true. It is also not recommended for multi-user server environments, as all users would share the same session. A session is terminated automatically after 24 hours of inactivity or after 7 days, whichever comes first. A new session is created on the next request, and any temporary data from the previous session will be lost. |
| allowedDatasets | []string | false | An optional list of dataset IDs that tools using this source are allowed to access. If provided, any tool operation attempting to access a dataset not in this list will be rejected. To enforce this, two types of operations are also disallowed: 1) Dataset-level operations (e.g., CREATE SCHEMA), and 2) operations where table access cannot be statically analyzed (e.g., EXECUTE IMMEDIATE, CREATE PROCEDURE). If a single dataset is provided, it will be treated as the default for prebuilt tools. |
| useClientOAuth | bool | false | If true, forwards the client's OAuth access token from the "Authorization" header to downstream queries. Note: This cannot be used with writeMode: protected. |
| scopes | []string | false | A list of OAuth 2.0 scopes to use for the credentials. If not provided, default scopes are used. |
| impersonateServiceAccount | string | false | Service account email to impersonate when making BigQuery and Dataplex API calls. The authenticated principal must have the roles/iam.serviceAccountTokenCreator role on the target service account. Learn More |
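Putting a few of these fields together, a more restrictive configuration might look like the sketch below (the project ID and dataset names are placeholders):

```yaml
sources:
  locked-down-bigquery-source:
    kind: "bigquery"
    project: "my-project-id"
    # Session-based execution: tools sharing this source use one BigQuery
    # session, and writes from bigquery-execute-sql are confined to the
    # session's temporary dataset. Cannot be combined with useClientOAuth.
    writeMode: "protected"
    # Only these datasets may be accessed by tools using this source.
    allowedDatasets:
      - "my_dataset_1"
      - "other_project.my_dataset_2"
```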