Files
tinygrad/.github/actions/setup-tinygrad/action.yml
Christopher Milan 0aabc1e938 Mesa NIR backend (NAK/LLVMpipe) (#12089)
* nak works

* TestOps::test_add works

* testop has no crashes

* fix bool casts

* fix typo

* add disassemble

* RANGE and locals/regs

* simplify NAKCompiler

* disass cleanup

* cleanup nir codegen

* almost all tests passing

* cleanup notes in extra/

* old notes

* only import nak if NIR=1

* fix new SPECIAL syntax

* fix local/shared memory

* more tests passing

* add DEFINE_VAR support

* llvmpipe kinda works

* diskcache

* some mypy stuff

* lvp passing test_ops.py

* fix imports

* actually fix imports

* remove 'stdout'

* fix llvm import

* fix mypy issues

* nicer errors

* simpler test_dtype skips

* test lvp in CI

* fix github action syntax

* fix more actions typos

* switch to mesa 25.1.0

* diskcache_put

* better generation for lvp nir_options

* b64encode shader blobs

* Revert diskcache changes

This reverts commits 930fa3de8a and 8428c694b3.

* general cleanup

* better error messages

* fix llvm import

* fix windows tests

* link with libm and libgcc_s

* fix some errors

* dont check for 'float4'

* NIR uses pointer arithmetic

* use tinymesa

* bump tinymesa

* bump tinymesa again

* update lvp nir_options

* print nir shader with DEBUG

* simplify LVPCompiler

* more tests

* "gated" STORE

* NAK is cacheable

* more tests

* all tests pass locally for NAK

* test autogen in CI

* autogen deps

* more deps

* fix uop_gc

* fix macos

* mypy

* save 2 lines

* save two more lines

* save 1 line

* save 4 lines

* save more lines

* Revert "save more lines"

This reverts commit dd3a720c5a.

* save more lines

* fix LVP on windows

* refactor

* reorganize some code

* refactor lib_gpu

* move LVP check

* out of order loads

* remove support.mesa

* bump tinymesa version

* simplify LVP jit

* macos

* macos ci

* shell: bash

* testing

* more testing

* compute brew prefix

* stupid typo

* actually fix

* lib

* stdout on macos

* inline gallivm_compile_module

* Revert "inline gallivm_compile_module"

This reverts commit b65983b151.

* elf macos

* semicolon

* inherit from CPULLVMCompiler

* ruff

* disas test

* fix libm linking

* default is fine actually

* arm works

* add elf loader link test

* fix NAK beam

* pylint is too smart by half

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-10-15 17:38:33 +08:00

306 lines
12 KiB
YAML

name: Setup Python & Install
description: Sets up Python and installs project dependencies.
inputs:
python-version:
description: 'Python version to use'
required: false
default: '3.12'
key:
description: 'Key for the python cache'
required: false
default: '' # if you don't set a key, it doesn't cache
deps:
description: 'Extra dependency groups (comma separated)'
required: false
default: ''
pydeps:
description: 'Extra Python dependency groups (space separated)'
required: false
default: ''
opencl:
description: "Install OpenCL?"
required: false
default: 'false'
amd:
description: "Install AMD?"
required: false
default: 'false'
cuda:
description: "Install CUDA?"
required: false
default: 'false'
ocelot:
description: "Install gpuocelot?"
required: false
default: 'false'
webgpu:
description: "Install webgpu?"
required: false
default: 'false'
llvm:
description: "Install LLVM?"
required: false
default: 'false'
mesa:
description: "Install mesa"
required: false
default: 'false'
runs:
using: "composite"
steps:
- name: Set up Python ${{ inputs.python-version }}
id: setup-python
uses: actions/setup-python@v5
with:
python-version: ${{ inputs.python-version }}
# **** Caching packages ****
- name: Cache Python packages
id: restore-venv
uses: actions/cache@v4
with:
path: ${{ github.workspace }}/.venv
key: venv-${{ runner.os }}-python-${{ steps.setup-python.outputs.python-version }}-${{ inputs.deps }}-${{ inputs.pydeps }}-${{ hashFiles('**/setup.py') }}-${{ env.PYTHON_CACHE_VERSION }}
# **** Caching downloads ****
- name: Cache downloads (Linux)
if: inputs.key != '' && runner.os == 'Linux'
uses: actions/cache@v4
with:
path: ~/.cache/tinygrad/downloads/
key: downloads-cache-${{ inputs.key }}-${{ env.DOWNLOAD_CACHE_VERSION }}
- name: Cache downloads (macOS)
if: inputs.key != '' && runner.os == 'macOS'
uses: actions/cache@v4
with:
path: ~/Library/Caches/tinygrad/downloads/
key: osx-downloads-cache-${{ inputs.key }}-${{ env.DOWNLOAD_CACHE_VERSION }}
# **** Python deps ****
- name: Install dependencies in venv (with extra)
if: inputs.deps != '' && steps.restore-venv.outputs.cache-hit != 'true'
shell: bash
run: |
python -m venv .venv
if [[ "$RUNNER_OS" == "Windows" ]]; then
source .venv/Scripts/activate
else
. .venv/bin/activate
fi
python -m pip install -e ".[${{ inputs.deps }}]" ${{ inputs.pydeps }} --extra-index-url https://download.pytorch.org/whl/cpu --extra-index-url https://aiinfra.pkgs.visualstudio.com/PublicPackages/_packaging/Triton-Nightly/pypi/simple/
- name: Install dependencies in venv (without extra)
if: inputs.deps == '' && steps.restore-venv.outputs.cache-hit != 'true'
shell: bash
run: |
python -m venv .venv
if [[ "$RUNNER_OS" == "Windows" ]]; then
source .venv/Scripts/activate
else
. .venv/bin/activate
fi
python -m pip install -e . ${{ inputs.pydeps }}
- name: Set up venv environment
shell: bash
run: |
echo "VIRTUAL_ENV=${{ github.workspace }}/.venv" >> "$GITHUB_ENV"
echo "OMP_NUM_THREADS=1" >> "$GITHUB_ENV"
# no buffers should be over 300MB in CI
echo "MAX_BUFFER_SIZE=300000000" >> "$GITHUB_ENV"
if [[ "$RUNNER_OS" == "Windows" ]]; then
echo "${{ github.workspace }}/.venv/Scripts" >> "$GITHUB_PATH"
else
echo "${{ github.workspace }}/.venv/bin" >> "$GITHUB_PATH"
fi
# ******************* apt *******************
- name: Setup apt
if: runner.os == 'Linux' && (inputs.opencl == 'true' || inputs.amd == 'true' || inputs.cuda == 'true' || inputs.webgpu == 'true' || inputs.llvm == 'true')
shell: bash
run: |
sudo chown -R $USER:$USER /var/cache/apt/archives
echo 'Acquire::GzipIndexes "true";' | sudo tee /etc/apt/apt.conf.d/gzip
echo 'Acquire::http::Pipeline-Depth "5";' | sudo tee -a /etc/apt/apt.conf.d/99parallel
echo 'Binary::apt::APT::Keep-Downloaded-Packages "true";' | sudo tee -a /etc/apt/apt.conf.d/99keep-debs
- name: Add OpenCL Repo
if: inputs.opencl == 'true' && runner.os == 'Linux'
shell: bash
run: echo "deb [ allow-insecure=yes ] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
- name: Add AMD Repo (Linux)
if: inputs.amd == 'true' && runner.os == 'Linux'
shell: bash
run: |
wget https://repo.radeon.com/rocm/rocm.gpg.key -O - | gpg --dearmor | sudo tee /etc/apt/keyrings/rocm.gpg > /dev/null
sudo tee /etc/apt/sources.list.d/rocm.list <<EOF
deb [arch=amd64 signed-by=/etc/apt/keyrings/rocm.gpg] https://repo.radeon.com/rocm/apt/6.2 $(lsb_release -cs) main
EOF
echo -e 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600' | sudo tee /etc/apt/preferences.d/rocm-pin-600
- name: Add LLVM Repo (Linux)
if: inputs.llvm == 'true' && runner.os == 'Linux'
shell: bash
run: |
wget -qO- https://apt.llvm.org/llvm-snapshot.gpg.key | sudo tee /etc/apt/trusted.gpg.d/apt.llvm.org.asc
echo "deb http://apt.llvm.org/$(lsb_release -cs)/ llvm-toolchain-$(lsb_release -cs)-20 main" | sudo tee /etc/apt/sources.list.d/llvm.list
- name: Compute Package List + Hash
if: runner.os == 'Linux' && (inputs.opencl == 'true' || inputs.amd == 'true' || inputs.cuda == 'true' || inputs.webgpu == 'true' || inputs.llvm == 'true')
id: apt-pkgs
shell: bash
run: |
pkgs=""
# **** OpenCL ****
if [[ "${{ inputs.opencl }}" == "true" ]]; then
pkgs+=" opencl-headers \
intel-oneapi-runtime-openmp=2023.2.1-16 intel-oneapi-runtime-compilers-common=2023.2.1-16 intel-oneapi-runtime-compilers=2023.2.1-16 \
intel-oneapi-runtime-dpcpp-sycl-opencl-cpu=2023.2.1-16 intel-oneapi-runtime-tbb-common=2021.10.0-49541 \
intel-oneapi-runtime-tbb=2021.10.0-49541 intel-oneapi-runtime-opencl=2023.2.1-16"
fi
# **** AMD ****
if [[ "${{ inputs.amd }}" == "true" ]]; then
pkgs+=" hsa-rocr comgr hsa-rocr-dev liburing-dev libibverbs-dev libc6-dev"
fi
# **** CUDA ****
if [[ "${{ inputs.cuda }}" == "true" ]]; then
pkgs+=" git g++ cmake ninja-build llvm-15-dev zlib1g-dev libglew-dev \
flex bison libfl-dev libboost-thread-dev libboost-filesystem-dev nvidia-cuda-toolkit-gcc libzstd-dev"
fi
# **** WebGPU (dependencies for software-based vulkan) ****
if [[ "${{ inputs.webgpu }}" == "true" ]]; then
pkgs+=" libgl1 libglx-mesa0 libgl1-mesa-dri libxcb-xfixes0-dev mesa-vulkan-drivers"
fi
# **** LLVM ****
if [[ "${{ inputs.llvm }}" == "true" ]]; then
pkgs+=" libllvm20 clang-20 lld-20"
fi
echo "pkgs=$pkgs" >> "$GITHUB_OUTPUT"
echo "hash=$(echo -n "$pkgs" | sha256sum | cut -d' ' -f1)" >> "$GITHUB_OUTPUT"
- name: Cache apt
if: runner.os == 'Linux' && (inputs.opencl == 'true' || inputs.amd == 'true' || inputs.cuda == 'true' || inputs.webgpu == 'true' || inputs.llvm == 'true')
uses: actions/cache@v4
with:
path: /var/cache/apt/archives/
key: ${{ runner.os }}-apt-${{ steps.apt-pkgs.outputs.hash }}-${{ env.APT_CACHE_VERSION }}
- name: Run apt Update + Install
if: runner.os == 'Linux' && (inputs.opencl == 'true' || inputs.amd == 'true' || inputs.cuda == 'true' || inputs.webgpu == 'true' || inputs.llvm == 'true')
shell: bash
run: |
sudo apt -qq update || true
# ******** do install ********
if [[ -n "${{ steps.apt-pkgs.outputs.pkgs }}" ]]; then
sudo apt-get -y --allow-unauthenticated --no-install-recommends install ${{ steps.apt-pkgs.outputs.pkgs }}
fi
sudo chown -R $USER:$USER /var/cache/apt/archives/
# **** AMD ****
- name: Setup AMD (Linux)
if: inputs.amd == 'true' && runner.os == 'Linux'
shell: bash
run: |
cargo build --release --manifest-path ./extra/remu/Cargo.toml
sudo ln -sf ${{ github.workspace }}/extra/remu/target/release/libremu.so /usr/local/lib/libremu.so
sudo tee --append /etc/ld.so.conf.d/rocm.conf <<'EOF'
/opt/rocm/lib
/opt/rocm/lib64
EOF
sudo ldconfig
- name: Setup AMD comgr+remu (macOS)
if: inputs.amd == 'true' && runner.os == 'macOS'
shell: bash
run: |
sudo mkdir -p /usr/local/lib
curl -s -H "Authorization: token $GH_TOKEN" curl -s https://api.github.com/repos/nimlgen/amdcomgr_dylib/releases/latest | \
jq -r '.assets[] | select(.name == "libamd_comgr.dylib").browser_download_url' | \
sudo xargs curl -L -o /usr/local/lib/libamd_comgr.dylib
cargo build --release --manifest-path ./extra/remu/Cargo.toml
# **** gpuocelot ****
- name: Install gpuocelot dependencies (MacOS)
if: inputs.ocelot == 'true' && runner.os == 'macOS'
shell: bash
run: |
pkgs=(cmake ninja llvm@15 zlib glew flex bison boost@1.85 zstd ncurses)
for f in "${pkgs[@]}"; do
brew ls --versions "$f" >/dev/null 2>&1 || brew install --quiet "$f"
done
# Fix boost 1.85 for gpuocelot
ln -s /opt/homebrew/opt/boost@1.85 /opt/homebrew/opt/boost || true
ln -s /opt/homebrew/opt/boost/lib/libboost_atomic-mt.dylib /opt/homebrew/opt/boost/lib/libboost_atomic.dylib || true
ln -s /opt/homebrew/opt/boost/lib/libboost_thread-mt.dylib /opt/homebrew/opt/boost/lib/libboost_thread.dylib || true
- name: Cache gpuocelot
if: inputs.ocelot == 'true'
id: cache-build
uses: actions/cache@v4
env:
cache-name: cache-gpuocelot-build-1
with:
path: ${{ github.workspace }}/gpuocelot/ocelot
key: ${{ runner.os }}-gpuocelot-b16039dc940dc6bc4ea0a98380495769ff35ed99-rebuild-${{ env.BUILD_CACHE_VERSION }}
- name: Clone/compile gpuocelot
if: inputs.ocelot == 'true' && steps.cache-build.outputs.cache-hit != 'true'
shell: bash
run: |
git clone --recurse-submodules https://github.com/gpuocelot/gpuocelot.git ${{ github.workspace }}/gpuocelot
cd ${{ github.workspace }}/gpuocelot/ocelot
git checkout b16039dc940dc6bc4ea0a98380495769ff35ed99
mkdir build
cd build
CMAKE_ARGS="-Wno-dev -G Ninja -DOCELOT_BUILD_TOOLS=OFF -DCMAKE_BUILD_ALWAYS=0 -DBUILD_TESTS_CUDA=OFF -DCMAKE_POLICY_VERSION_MINIMUM=3.5"
if [[ "${{ runner.os }}" == "macOS" ]]; then
CMAKE_ARGS="$CMAKE_ARGS -DBoost_INCLUDE_DIR=$(brew --prefix boost)/include -DBoost_LIBRARY_DIR=$(brew --prefix boost)/lib"
fi
cmake .. $CMAKE_ARGS
ninja
- name: Install gpuocelot
if: inputs.ocelot == 'true'
shell: bash
run: |
cd ${{ github.workspace }}/gpuocelot/ocelot/build
sudo cp libgpuocelot.${{ runner.os == 'macOS' && 'dylib' || 'so' }} /usr/${{ runner.os == 'macOS' && 'local/' || '' }}lib/
# **** WebGPU ****
- name: Install WebGPU dawn (Linux)
if: inputs.webgpu == 'true' && runner.os == 'Linux'
shell: bash
run: |
sudo curl -L https://github.com/wpmed92/pydawn/releases/download/v0.1.6/libwebgpu_dawn.so -o /usr/local/lib/libwebgpu_dawn.so
sudo ldconfig
- name: Install WebGPU dawn (macOS)
if: inputs.webgpu == 'true' && runner.os == 'macOS'
shell: bash
run: |
brew tap wpmed92/dawn
brew install dawn
# **** LLVM ****
- name: Install LLVM (macOS)
if: inputs.llvm == 'true' && runner.os == 'macOS'
shell: bash
run: brew install llvm@20
# **** mesa ****
- name: Install mesa (linux)
if: inputs.mesa == 'true' && runner.os == 'Linux'
shell: bash
run: sudo curl -L https://github.com/sirhcm/tinymesa/releases/download/tinymesa-32dc66c/libtinymesa_cpu-mesa-25.2.4-linux-amd64.so -o /usr/lib/libtinymesa_cpu.so
- name: Install mesa (macOS)
if: inputs.mesa == 'true' && runner.os == 'macOS'
shell: bash
run: brew install sirhcm/tinymesa/tinymesa