ci: zstd-compress the src cache and drop the doubled win_toolchain (#50702)

* ci: shrink src cache and fix Windows tar cleanup

- Exclude platform-specific toolchains (llvm-build, rust-toolchain) from
  the src cache; all platforms now fetch them via fix-sync post-restore
- Exclude unused test data and benchmarks: blink/web_tests, jetstream,
  speedometer, catapult/tracing/test_data, swiftshader/tests/regres
- Fix Windows restore leaving the tarball on disk after extraction
  ($src_cache was scoped to the previous PowerShell step)
- Bump src-cache key v1 -> v2

* ci: fetch llvm/rust toolchains in gn-check and clang-tidy

These workflows restore the src cache but don't run fix-sync. Now that
llvm-build and rust-toolchain are excluded from the cache, they need to
download them directly — gn gen read_file()s both, and clang-tidy runs
the binary from llvm-build.

* ci: fetch clang-tidy package explicitly

update.py's default 'clang' package doesn't include the clang-tidy
binary; it ships as a separate package.

* ci: preserve blink/web_tests/BUILD.gn when stripping test data

//BUILD.gn references //third_party/blink/web_tests:wpt_tests as a
target label, so the BUILD.gn must exist for gn gen. The data = [...]
entries it declares are runtime-only and not existence-checked at gen
time, so the actual test directories can still be removed.

* ci: compress src cache with zstd and drop gclient sync -vv

The src cache was an uncompressed tar (~16GB after exclusions). Switch
to zstd -T0 --long=30 for ~4x smaller transfer and multi-threaded
compression. Decompress on restore:
- Linux/macOS: zstd -d -c | tar -xf -
- Windows: zstd -d to an intermediate .tar, then the existing 7z
  -snld20 extraction (preserves symlink handling)

All filename references updated .tar -> .tar.zst. -f added to the two
-o invocations so re-runs overwrite instead of failing.

Also drop -vv from gclient sync; default verbosity is sufficient.

* ci: keep .tar extension for src cache (zstd content inside)

The sas-sidecar that issues Azure SAS tokens validates filenames against
/^v[0-9]+-[a-z\-]+-[a-f0-9]+\.(tar|tgz)$/ and is not easily redeployed,
so keep the .tar extension and decode zstd on restore. Windows
decompresses to a distinct intermediate (src_cache.tar) so input and
output don't collide.

* ci: log NTFS 8.3/lastaccess/Defender state before Windows cache extract

Temporary diagnostics to see whether 8.3 short-name generation is the
cause of the ~20 min tar extraction.

* ci: revert src-cache exclusion additions

The new exclusions (web_tests contents, jetstream, speedometer,
catapult test_data, regres, llvm-build, rust-toolchain) caused siso/RBE
cache misses — even data-only deps are part of action input hashes.
Revert to the original exclusion list and drop the corresponding
toolchain-fetch plumbing. zstd compression, the Windows tar cleanup,
and the -vv removal remain.

* ci: drop win_toolchain from src cache; remove NTFS diagnostics

The Windows src cache includes 14.6GB of depot_tools/win_toolchain —
7.3GB of MSVC/SDK doubled because tar captures both the vs_files.ciopfs
backing store and the live ciopfs mount at vs_files/. Every Windows
cache consumer already re-fetches this via vs_toolchain.py update
--force (fix-sync for build/publish, inline for gn-check/clang-tidy),
so the cached copy is never used.

Diagnostics removed — CI confirmed 8dot3, last-access, and Defender are
all already off on the AKS Windows nodes.

* ci: unmount ciopfs vs_files before removing win_toolchain

vs_files is a live ciopfs mount during the win-targeted checkout; rm -rf
fails with EBUSY until it's unmounted.

* ci: skip win_toolchain download during checkout instead of removing after

fusermount isn't on the checkout container, so the ciopfs mount can't be
torn down before rm. Setting DEPOT_TOOLS_WIN_TOOLCHAIN=0 makes the
win_toolchain hook a no-op (vs_toolchain.py:525-527), so there's no
download and no mount. All Windows consumers re-fetch it post-restore
anyway. The rm -rf stays as a safety net.

* ci: also set ELECTRON_DEPOT_TOOLS_WIN_TOOLCHAIN=0 for checkout sync

build.yml sets ELECTRON_DEPOT_TOOLS_WIN_TOOLCHAIN=1 at the job level for
the Windows checkout, which makes e d inject DEPOT_TOOLS_WIN_TOOLCHAIN=1
and override the inline =0. Need both: the ELECTRON_ var stops e d from
overriding, the plain one stops vs_toolchain.py from defaulting to 1.

* ci: extract Windows src cache with piped tar instead of 7z

7z takes ~20 min to extract the ~1.1M-entry tar regardless of size —
~1ms per entry of header parsing and path handling, single-threaded,
well under the 75k IOPS / 1000 MBps the ephemeral disk can do. Switch
to the same zstd -d | tar -xf - pipe used on Linux/macOS (via Git Bash
tar). No intermediate src_cache.tar, download deleted after extract.

The -snld20 flag was working around 7z's own "dangerous symlink"
refusal; GNU tar extracts symlinks as-is so it shouldn't be needed.

* ci: keep depot_tools/win_toolchain scripts in src cache

The rm -rf removed get_toolchain_if_necessary.py (a depot_tools source
file), breaking vs_toolchain.py update --force on restore.
DEPOT_TOOLS_WIN_TOOLCHAIN=0 on the sync already prevents the vs_files
download, so the rm was only removing scripts.

* ci: split src cache into 4 parallel-extractable shards

Windows tar extraction is ~1ms/entry for ~1.2M entries (~20 min)
regardless of tool, well under the 75k IOPS / 1000 MBps the D16lds_v5
ephemeral disk can do. Tar is a sequential stream so the only way to
parallelize is to split at creation time.

Shards (balanced by entry count, ~220-360k each):
  a: src/third_party/blink
  b: src/third_party/{dawn,electron_node,tflite,devtools-frontend}
  c: src/third_party (rest)
  d: src (excluding third_party)

DEPSHASH is now the raw hash; shard files are
v2-src-cache-shard-{a..d}-${DEPSHASH}.tar (all pass the sas-sidecar
filename regex). sas-token is now a JSON keyed by shard letter. All
restore paths extract the four shards in parallel with per-PID wait so
a failed shard aborts the step.

* Revert "ci: split src cache into 4 parallel-extractable shards"

This reverts commit 970574998b.
This commit is contained in:
Samuel Attard
2026-04-05 20:56:03 -04:00
committed by GitHub
parent 4d8fd31e5f
commit fef2fd2941
8 changed files with 21 additions and 24 deletions

View File

@@ -28,7 +28,7 @@ runs:
shell: bash
run: |
node src/electron/script/generate-deps-hash.js
DEPSHASH="v1-src-cache-$(cat src/electron/.depshash)"
DEPSHASH="v2-src-cache-$(cat src/electron/.depshash)"
echo "DEPSHASH=$DEPSHASH" >> $GITHUB_ENV
echo "CACHE_FILE=$DEPSHASH.tar" >> $GITHUB_ENV
if [ "${{ inputs.target-platform }}" = "win" ]; then
@@ -109,7 +109,7 @@ runs:
echo "target_os=['$TARGET_OS']" >> ./.gclient
fi
ELECTRON_USE_THREE_WAY_MERGE_FOR_PATCHES=1 e d gclient sync --with_branch_heads --with_tags -vv
ELECTRON_DEPOT_TOOLS_WIN_TOOLCHAIN=0 DEPOT_TOOLS_WIN_TOOLCHAIN=0 ELECTRON_USE_THREE_WAY_MERGE_FOR_PATCHES=1 e d gclient sync --with_branch_heads --with_tags
if [[ "${{ inputs.is-release }}" != "true" ]]; then
# Re-export all the patches to check if there were changes.
python3 src/electron/script/export_all_patches.py src/electron/patches/config.json
@@ -187,7 +187,9 @@ runs:
shell: bash
run: |
echo "Uncompressed src size: $(du -sh src | cut -f1 -d' ')"
tar -cf $CACHE_FILE src
# Named .tar but zstd-compressed; the sas-sidecar's filename allowlist
# only permits .tar/.tgz so we keep the extension and decode on restore.
tar -cf - src | zstd -T0 --long=30 -f -o $CACHE_FILE
echo "Compressed src to $(du -sh $CACHE_FILE | cut -f1 -d' ')"
cp ./$CACHE_FILE $CACHE_DRIVE/
- name: Persist Src Cache

View File

@@ -31,7 +31,7 @@ runs:
fi
mkdir temp-cache
tar -xf $cache_path -C temp-cache
zstd -d --long=30 -c $cache_path | tar -xf - -C temp-cache
echo "Unzipped cache is $(du -sh temp-cache/src | cut -f1)"
if [ -d "temp-cache/src" ]; then

View File

@@ -61,9 +61,9 @@ runs:
echo "Cache is empty - exiting"
exit 1
fi
mkdir temp-cache
tar -xf $DEPSHASH.tar -C temp-cache
zstd -d --long=30 -c $DEPSHASH.tar | tar -xf - -C temp-cache
echo "Unzipped cache is $(du -sh temp-cache/src | cut -f1)"
if [ -d "temp-cache/src" ]; then
@@ -85,19 +85,17 @@ runs:
- name: Unzip and Ensure Src Cache (Windows)
if: ${{ inputs.target-platform == 'win' }}
shell: powershell
shell: bash
run: |
$src_cache = "$env:DEPSHASH.tar"
$cache_size = $(Get-Item $src_cache).length
Write-Host "Downloaded cache is $cache_size"
if ($cache_size -eq 0) {
Write-Host "Cache is empty - exiting"
echo "Downloaded cache is $(du -sh $DEPSHASH.tar | cut -f1)"
if [ `du $DEPSHASH.tar | cut -f1` = "0" ]; then
echo "Cache is empty - exiting"
exit 1
}
fi
$TEMP_DIR=New-Item -ItemType Directory -Path temp-cache
$TEMP_DIR_PATH = $TEMP_DIR.FullName
C:\ProgramData\Chocolatey\bin\7z.exe -y -snld20 x $src_cache -o"$TEMP_DIR_PATH"
mkdir temp-cache
zstd -d --long=30 -c $DEPSHASH.tar | tar -xf - -C temp-cache
rm -f $DEPSHASH.tar
- name: Move Src Cache (Windows)
if: ${{ inputs.target-platform == 'win' }}
@@ -112,9 +110,6 @@ runs:
Write-Host "Relocating Cache"
Remove-Item -Recurse -Force src
Move-Item temp-cache\src src
Write-Host "Deleting zip file"
Remove-Item -Force $src_cache
}
if (-Not (Test-Path "src\third_party\blink")) {
Write-Host "Cache was not correctly restored - exiting"

View File

@@ -35,7 +35,7 @@ jobs:
- name: Generate DEPS Hash
run: |
node src/electron/script/generate-deps-hash.js
DEPSHASH=v1-src-cache-$(cat src/electron/.depshash)
DEPSHASH=v2-src-cache-$(cat src/electron/.depshash)
echo "DEPSHASH=$DEPSHASH" >> $GITHUB_ENV
echo "CACHE_PATH=$DEPSHASH.tar" >> $GITHUB_ENV
- name: Restore src cache via AKS

View File

@@ -156,7 +156,7 @@ jobs:
- name: Generate DEPS Hash
run: |
node src/electron/script/generate-deps-hash.js
DEPSHASH=v1-src-cache-$(cat src/electron/.depshash)
DEPSHASH=v2-src-cache-$(cat src/electron/.depshash)
echo "DEPSHASH=$DEPSHASH" >> $GITHUB_ENV
echo "CACHE_PATH=$DEPSHASH.tar" >> $GITHUB_ENV
- name: Restore src cache via AZCopy

View File

@@ -80,7 +80,7 @@ jobs:
- name: Generate DEPS Hash
run: |
node src/electron/script/generate-deps-hash.js
DEPSHASH=v1-src-cache-$(cat src/electron/.depshash)
DEPSHASH=v2-src-cache-$(cat src/electron/.depshash)
echo "DEPSHASH=$DEPSHASH" >> $GITHUB_ENV
echo "CACHE_PATH=$DEPSHASH.tar" >> $GITHUB_ENV
- name: Restore src cache via AZCopy

View File

@@ -81,7 +81,7 @@ jobs:
- name: Generate DEPS Hash
run: |
node src/electron/script/generate-deps-hash.js
DEPSHASH=v1-src-cache-$(cat src/electron/.depshash)
DEPSHASH=v2-src-cache-$(cat src/electron/.depshash)
echo "DEPSHASH=$DEPSHASH" >> $GITHUB_ENV
echo "CACHE_PATH=$DEPSHASH.tar" >> $GITHUB_ENV
- name: Restore src cache via AZCopy

View File

@@ -165,7 +165,7 @@ jobs:
- name: Generate DEPS Hash
run: |
node src/electron/script/generate-deps-hash.js
DEPSHASH=v1-src-cache-$(cat src/electron/.depshash)
DEPSHASH=v2-src-cache-$(cat src/electron/.depshash)
echo "DEPSHASH=$DEPSHASH" >> $GITHUB_ENV
echo "CACHE_PATH=$DEPSHASH.tar" >> $GITHUB_ENV
- name: Restore src cache via AZCopy