mirror of
https://github.com/googleapis/genai-toolbox.git
synced 2026-01-06 22:24:02 -05:00
ci: add link checker workflow (#2189)
This workflow catches broken links (for example, 404 errors) by checking documentation links during development and before changes are merged into the main code base. This keeps all project documentation (README, contribution files) current and functional, proactively addressing technical debt.

Please note this is a resubmission of a previous [PR](https://github.com/googleapis/genai-toolbox/pull/1756) that was closed due to merge conflicts.

---------

Co-authored-by: Twisha Bansal <58483338+twishabansal@users.noreply.github.com>
59
.github/workflows/link_checker_workflow.yaml
vendored
Normal file
@@ -0,0 +1,59 @@
# Copyright 2025 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
name: Link Checker

on:
  pull_request:


jobs:
  link-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v5

      - name: Restore lychee cache
        uses: actions/cache@v4
        with:
          path: .lycheecache
          key: cache-lychee-${{ github.sha }}
          restore-keys: cache-lychee-

      - name: Link Checker
        uses: lycheeverse/lychee-action@v2
        with:
          args: >
            --verbose
            --no-progress
            --cache
            --max-cache-age 1d
            README.md
            docs/
          output: /tmp/foo.txt
          fail: true
          jobSummary: true
          debug: true
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      # This step only runs if an earlier step in the job fails (in practice,
      # the link check), so the context note only appears when the developer
      # needs to troubleshoot.
      - name: Display Link Context Note on Failure
        if: ${{ failure() }}
        run: |
          echo "## Link Resolution Note" >> $GITHUB_STEP_SUMMARY
          echo "Local links and directory changes work differently on GitHub than on the docsite." >> $GITHUB_STEP_SUMMARY
          echo "You must ensure fixes pass the **GitHub check** and also work with **\`hugo server\`**." >> $GITHUB_STEP_SUMMARY
          echo "---" >> $GITHUB_STEP_SUMMARY
45
.lycheeignore
Normal file
@@ -0,0 +1,45 @@
# Ignore documentation placeholders and generic example domains
^https?://([a-zA-Z0-9-]+\.)?example\.com(:\d+)?(/.*)?$
^http://example\.net

# Shields.io badges often trigger rate limits or intermittent 503s
^https://img\.shields\.io/.*

# PDF files are ignored as lychee cannot reliably parse internal PDF links
\.pdf$

# Standard mailto: protocol is not a web URL
^mailto:

# Ignore local development endpoints that won't resolve in CI/CD environments
^https?://(127\.0\.0\.1|localhost)(:\d+)?(/.*)?$

# Placeholder for Google Cloud Run service discovery
https://cloud-run-url.app/

# DGraph Cloud and private instance endpoints
https://xxx.cloud.dgraph.io/
https://cloud.dgraph.io/login
https://dgraph.io/docs

# MySQL Community downloads and main site (often protected by bot mitigation)
https://dev.mysql.com/downloads/installer/
https://www.mysql.com/

# Claude desktop download link
https://claude.ai/download

# Google Cloud Run product page
https://cloud.google.com/run

# These specific deep links are known to cause redirect loops or 403s in automated scrapers
https://dev.mysql.com/doc/refman/8.4/en/sql-prepared-statements.html
https://dev.mysql.com/doc/refman/8.4/en/user-names.html

# npmjs links can occasionally trigger rate limiting during high-frequency CI builds
https://www.npmjs.com/package/@toolbox-sdk/core
https://www.npmjs.com/package/@toolbox-sdk/adk


# Ignore social media and blog profiles to reduce external request overhead
https://medium.com/@mcp_toolbox
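
Reviewer aid (not part of the commit): the ignore file above mixes anchored regular expressions with plain URLs. The sketch below approximates how such entries behave, assuming each line is treated as an unanchored regular expression; it uses Python's `re` module as a stand-in, so edge cases may differ from lychee's own regex engine. The helper name `is_ignored` and the sample URLs are hypothetical.

```python
import re

# A subset of the entries from the .lycheeignore in this commit.
IGNORE_PATTERNS = [
    r"^https?://([a-zA-Z0-9-]+\.)?example\.com(:\d+)?(/.*)?$",
    r"^https://img\.shields\.io/.*",
    r"\.pdf$",
    r"^mailto:",
    r"^https?://(127\.0\.0\.1|localhost)(:\d+)?(/.*)?$",
    r"https://claude.ai/download",  # plain URL entry: the dots are unescaped
]


def is_ignored(url: str, patterns=IGNORE_PATTERNS) -> bool:
    """Return True if any pattern matches somewhere in the URL.

    Assumption: entries are applied as unanchored regex searches unless the
    entry itself uses ^ or $.
    """
    return any(re.search(pattern, url) for pattern in patterns)


if __name__ == "__main__":
    samples = [
        "http://localhost:5000/api",
        "https://docs.example.com/page",
        "https://example.org/guide.pdf",
        "https://github.com/googleapis/genai-toolbox",
    ]
    for url in samples:
        print(f"{url}: {'ignored' if is_ignored(url) else 'checked'}")
```

Under this model, a plain URL entry still behaves as a pattern: an unescaped `.` matches any character, so such entries can match slightly more than the literal URL. Escaping the dots (as the first entries do) keeps the match exact.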
25
DEVELOPER.md
@@ -207,6 +207,30 @@ variables for each source.
* SQLite - setup in the integration test, where we create a temporary database
  file

### Link Checking and Fixing with Lychee

We use **[lychee](https://github.com/lycheeverse/lychee-action)** for repository link checks.

* To run the checker **locally**, see the [command-line usage guide](https://github.com/lycheeverse/lychee?tab=readme-ov-file#commandline-usage).

#### Fixing Broken Links

1. **Update the Link:** Correct the broken URL or update the content where it is used.
2. **Ignore the Link:** If you can't fix the link (e.g., due to **external rate-limits** or if it's a **local-only URL**), tell Lychee to **ignore** it.

    * List **regular expressions** or **direct links** in the **[.lycheeignore](https://github.com/googleapis/genai-toolbox/blob/main/.lycheeignore)** file, one entry per line.
    * **Always add a comment** explaining **why** the link is being skipped to prevent link rot. **Example `.lycheeignore`:**
      ```text
      # These are email addresses, not standard web URLs, and usually cause check failures.
      ^mailto:.*
      ```
> [!NOTE]
> To avoid build failures in GitHub Actions, follow the linking pattern demonstrated here: <br>
> **Avoid:** (Works in Hugo, breaks Link Checker): `[Read more](docs/setup)` or `[Read more](docs/setup/)` <br>
> **Reason:** The link checker cannot find a file named "setup" or a directory with that name containing an index. <br>
> **Preferred:** `[Read more](docs/setup.md)` <br>
> **Reason:** The GitHub Action finds the physical file. Hugo then uses its internal logic (or render hooks) to resolve this to the correct `/docs/setup/` web URL. <br>

### Other GitHub Checks

* License header check (`.github/header-checker-lint.yml`) - Ensures files have
@@ -280,6 +304,7 @@ There are 3 GHA workflows we use to achieve document versioning:
Request a repo owner to run the preview deployment workflow on your PR. A
preview link will be automatically added as a comment to your PR.


#### Maintainers

1. **Inspect Changes:** Review the proposed changes in the PR to ensure they are
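
Reviewer aid (not part of the commit): the `[!NOTE]` in the `DEVELOPER.md` hunk says that a link such as `docs/setup` fails because no physical file or indexed directory exists at that path, while `docs/setup.md` succeeds. The sketch below is a simplified model of that rule, not lychee's actual implementation. The function names and the set of index file names are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

# Assumed index file names; the real checker may look for different ones.
INDEX_NAMES = ("index.md", "_index.md", "index.html")


def resolves_on_disk(repo_root: Path, link: str) -> bool:
    """Return True if a relative link points at a physical file, or at a
    directory that contains one of the assumed index files."""
    target = repo_root / link.split("#", 1)[0]  # drop any #fragment
    if target.is_file():
        return True
    if target.is_dir():
        return any((target / name).is_file() for name in INDEX_NAMES)
    return False


def demo() -> dict:
    """Build a throwaway repo containing only docs/setup.md and test the
    three link styles from the note."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "docs").mkdir()
        (root / "docs" / "setup.md").write_text("# Setup\n")
        links = ("docs/setup", "docs/setup/", "docs/setup.md")
        return {link: resolves_on_disk(root, link) for link in links}


if __name__ == "__main__":
    for link, ok in demo().items():
        print(f"{link}: {'resolves' if ok else 'broken'}")
```

In this model only the `.md` form resolves, which matches the note's recommendation to link to the physical file and let Hugo map it to the rendered URL.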