CI < 5 minutes (#1252)

* models matrix

* fix typo and install gpu deps

* install llvm deps if needed

* fix

* testops with cuda

* remove pip cache since not work

* cuda env

* install cuda deps

* maybe it will work now

* i can't read

* all tests in matrix

* trim down more

* opencl stuff in matrix

* opencl pip cache

* test split

* change cuda test exclusion

* test

* fix cuda maybe

* add models

* add more n=auto

* third thing

* fix bug

* cache pip more

* change name

* update tests

* try again cause why not

* balance

* try again...

* try apt cache for cuda

* try on gpu:

* try cuda again

* update packages step

* replace libz-dev with zlib1g-dev

* only cache cuda

* why error

* fix gpuocelot bug

* apt cache err

* apt cache too slow?

* opt and image in single runner

* add a couple n=autos

* remove test matrix

* try cuda apt cache again

* libz-dev -> zlib1g-dev

* remove -s since not supported by xdist

* the cache takes too long and doesn't work

* combine webgpu and metal tests

* combine imagenet to c and cpu tests

* torch tests with linters

* torch back by itself

* small windows clang test with torch tests

* fix a goofy windows bug

* im dumb

* bro

* clang with linters

* fix pylint error

* linter not work on windows

* try with clang again

* clang and imagenet?

* install deps

* fix

* fix quote

* clang by itself (windows too slow)

* env vars for imagenet

* cache pip for metal and webgpu tests

* try torch with metal and webgpu

* doesn't work, too long

* remove -v

* try -n=logical

* don't use logical

* revert accidental thing

* remove some prints unless CI

* fix print unless CI

* ignore speed tests for slow tests

* clang windows in matrix (ubuntu being tested in imagenet->c test)

* try manual pip cache

* fix windows pip cache path

* all manual pip cache

* fix pip cache dir for macos

* print_ci function in helpers

* CI as variable, no print_ci

* missed one

* cuda tests with docker image

* remove setup-python action for cuda

* python->python3?

* remove -s -v

* try fix pip cache

* maybe fix

* try to fix pip cache

* is this the path?

* maybe cache pip

* try again

* create wheels dir

* ?

* cuda pip deps in dockerfile

* disable pip cache for clang

* image from ghcr instead of docker hub

* why is clang like this

* fast deps

* try use different caches

* remove the fast thing

* try with lighter image

* remove setup python for cuda

* small docker and cuda fast deps

* ignore a few more tests

* cool docker thing (maybe)

* oops

* quotes

* fix docker command

* fix bug

* ignore train efficientnet test

* remove dockerfile (docker stuff takes too long)

* remove docker stuff and normal cuda

* oops

* ignore the tests for cuda

* does this work

* ignore test_train on slow backends

* add space

* llvm ignore same tests as cuda

* nvm

* ignore lr scheduler tests

* get some stats

* fix ignore bug

* remove extra '

* remove and

* ignore test for llvm

* change ignored tests and durations on all backends

* fix

* and -> or

* ignore some more cuda tests

* finally?

* does this fix it

* remove durations=0

* add some more tests to llvm

* make last pytest more readable

* fix

* don't train efficientnet on cpu

* try w/out pip cache

* pip cache seems to be generally better

* pytest file markers

* try apt fast for cuda

* use quick install for apt-fast

* apt-fast not worth

* apt-get to apt

* fix typo

* suppress warnings

* register markers

* disable debug on fuzz tests

* change marker names

* apt update and apt install in one command

* update marker names in test.yml

* webgpu pytest marker
This commit is contained in:
cheeetoo
2023-07-23 15:00:56 -05:00
committed by GitHub
parent 47f9d82722
commit a0965ee198
23 changed files with 237 additions and 226 deletions

@@ -18,6 +18,11 @@ jobs:
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Cache pip
uses: actions/cache@v3
with:
path: ~/.cache/pip
key: linting
- name: Install dependencies
run: pip install -e '.[linting,testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Repo line count
@@ -31,12 +36,12 @@ jobs:
- name: Run mypy
run: mypy tinygrad/ --ignore-missing-imports --check-untyped-defs --explicit-package-bases --warn-unreachable
- name: Install SLOCCount
run: sudo apt-get install sloccount
run: sudo apt install sloccount
- name: Check <5000 lines
run: sloccount tinygrad test examples extra; if [ $(sloccount tinygrad | sed -n 's/.*Total Physical Source Lines of Code (SLOC)[ ]*= \([^ ]*\).*/\1/p' | tr -d ',') -gt 5000 ]; then exit 1; fi
testcpu:
name: CPU Tests
testcpuimagenet:
name: CPU and ImageNet to C Tests
runs-on: ubuntu-latest
timeout-minutes: 20
@@ -47,6 +52,11 @@ jobs:
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Cache pip
uses: actions/cache@v3
with:
path: ~/.cache/pip
key: testing
- name: Install Dependencies
run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Test Docs
@@ -54,49 +64,11 @@ jobs:
- name: Test Quickstart
run: awk '/```python/{flag=1;next}/```/{flag=0}flag' docs/quickstart.md > quickstart.py && PYTHONPATH=. python3 quickstart.py
- name: Run Pytest
run: python -m pytest -s -v -n=auto test/
run: python -m pytest -n=auto test/ -k "not (test_efficientnet and models/test_train.py)"
- name: Fuzz Test symbolic
run: DEBUG=1 python test/external/fuzz_symbolic.py
run: python test/external/fuzz_symbolic.py
- name: Fuzz Test shapetracker
run: PYTHONPATH="." DEBUG=1 python test/external/fuzz_shapetracker.py
testwebgpu:
name: WebGPU Tests
runs-on: macos-13
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install Dependencies
run: pip install -e '.[testing,webgpu]' --extra-index-url https://download.pytorch.org/whl/cpu
# - name: Set Env
# run: printf "WEBGPU=1\nWGPU_BACKEND_TYPE=D3D12\n" >> $GITHUB_ENV
- name: Run Pytest
run: WEBGPU=1 WGPU_BACKEND_TYPE=Metal python -m pytest -s -v -n=auto test/test_ops.py test/test_speed_v_torch.py test/test_nn.py test/test_jit.py test/test_randomness.py test/test_tensor.py test/test_assign.py test/test_conv.py test/test_nn.py test/test_custom_function.py test/test_conv_shapetracker.py
- name: Build WEBGPU Efficientnet
run: WEBGPU=1 WGPU_BACKEND_TYPE=Metal python -m examples.webgpu.compile_webgpu
# - name: Install Puppeteer
# run: npm install puppeteer
# - name: Run Efficientnet
# run: node test/test_webgpu.js
testimagenet:
name: ImageNet to C Compile Test
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install Dependencies
run: pip install -e .
run: PYTHONPATH="." python test/external/fuzz_shapetracker.py
- name: Compile EfficientNet to C
run: PYTHONPATH="." CLANG=1 python3 examples/compile_efficientnet.py > recognize.c
- name: Compile C to native
@@ -104,44 +76,6 @@ jobs:
- name: Test EfficientNet
run: curl https://media.istockphoto.com/photos/hen-picture-id831791190 | ./recognize | grep hen
testllvm:
name: LLVM Tests
runs-on: ubuntu-latest
timeout-minutes: 20
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install Dependencies
run: pip install -e '.[llvm,testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Run Pytest
run: ENABLE_METHOD_CACHE=1 LLVM=1 python -m pytest -s -v -n=auto test/
testclang:
strategy:
matrix:
os: [ubuntu-latest, windows-latest]
runs-on: ${{ matrix.os }}
name: CLANG Tests ${{ matrix.os }} (w method cache)
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install Dependencies
run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Set env
run: printf "CI=1\nCLANG=1\nENABLE_METHOD_CACHE=1" >> $GITHUB_ENV
- name: Run Pytest
run: python -m pytest -s -v -n=auto test/
testtorch:
name: Torch Tests
runs-on: ubuntu-latest
@@ -154,79 +88,72 @@ jobs:
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Cache pip
uses: actions/cache@v3
with:
path: ~/.cache/pip
key: testing
- name: Install Dependencies
run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Run Pytest
run: TORCH=1 python -m pytest -s -v -n=auto test/
run: TORCH=1 python -m pytest -n=auto test/
- name: Run ONNX
run: TORCH=1 python -m pytest test/external/external_test_onnx_backend.py --tb=no --disable-warnings || true
testgpu:
name: GPU Tests
runs-on: ubuntu-20.04
timeout-minutes: 20
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Update packages
run: |
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt-get update
- name: Install OpenCL
#run: sudo apt-get install -y pocl-opencl-icd
run: sudo apt-get install -y intel-oneapi-runtime-compilers intel-oneapi-runtime-opencl
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install Dependencies
run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Run Optimizer Test (OPT 2 and 3)
run: |
PYTHONPATH="." OPT=2 GPU=1 python test/external/external_test_opt.py
PYTHONPATH="." OPT=3 GPU=1 python test/external/external_test_opt.py
- name: Run Pytest (default)
run: GPU=1 python -m pytest -s -v -n=auto test/
run: TORCH=1 python -m pytest -n=auto test/external/external_test_onnx_backend.py --tb=no --disable-warnings || true
testopencl:
name: openpilot (OpenCL) Test
strategy:
matrix:
task: [optimage, openpilot]
name: ${{ matrix.task=='optimage'&&'GPU OPT and IMAGE Tests'||'openpilot (OpenCL) Tests'}}
runs-on: ubuntu-20.04
timeout-minutes: 20
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Update packages
run: |
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt-get update
- name: Install OpenCL
#run: sudo apt-get install -y pocl-opencl-icd
run: sudo apt-get install -y intel-oneapi-runtime-compilers intel-oneapi-runtime-opencl
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Install Dependencies
run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Test openpilot model compile and size
run: |
DEBUG=2 ALLOWED_KERNEL_COUNT=199 FLOAT16=1 DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
python3 -c 'import os; assert os.path.getsize("/tmp/output.thneed") < 100_000_000'
- name: Test GPU IMAGE ops
run: |
GPU=1 IMAGE=1 python3 test/test_ops.py
FORWARD_ONLY=1 GPU=1 IMAGE=2 python3 test/test_ops.py
- name: Test openpilot model correctness (float32)
run: DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
- name: Test tensor core ops
run: GPU=1 TC=2 python3 test/test_ops.py
- name: Checkout Code
uses: actions/checkout@v3
- name: Update packages
run: |
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
- name: Install OpenCL
#run: sudo apt-get install -y pocl-opencl-icd
run: sudo apt install -y intel-oneapi-runtime-compilers intel-oneapi-runtime-opencl
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Cache pip
uses: actions/cache@v3
with:
path: ~/.cache/pip
key: testing
- name: Install Dependencies
run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- if: ${{ matrix.task == 'optimage' }}
name: Run Optimizer Test (OPT 2 and 3)
run: |
PYTHONPATH="." OPT=2 GPU=1 python -m pytest -n=auto test/external/external_test_opt.py
PYTHONPATH="." OPT=3 GPU=1 python -m pytest -n=auto test/external/external_test_opt.py
- if: ${{ matrix.task == 'optimage'}}
name: Test GPU IMAGE ops
run: |
GPU=1 IMAGE=1 python3 -m pytest -n=auto test/test_ops.py
FORWARD_ONLY=1 GPU=1 IMAGE=2 python3 -m pytest -n=auto test/test_ops.py
- if: ${{ matrix.task == 'openpilot' }}
name: Test openpilot model compile and size
run: |
DEBUG=2 ALLOWED_KERNEL_COUNT=199 FLOAT16=1 DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
python3 -c 'import os; assert os.path.getsize("/tmp/output.thneed") < 100_000_000'
- if: ${{ matrix.task == 'openpilot' }}
name: Test openpilot model correctness (float32)
run: DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
- if: ${{ matrix.task == 'openpilot' }}
name: Test tensor core ops
run: GPU=1 TC=2 python3 -m pytest -n=auto test/test_ops.py
testmetal:
name: Metal Tests
testmetalwebgpu:
name: Metal and WebGPU Tests
runs-on: macos-13
timeout-minutes: 20
@@ -237,19 +164,27 @@ jobs:
uses: actions/setup-python@v4
with:
python-version: 3.11
- name: Cache pip
uses: actions/cache@v3
with:
path: ~/Library/Caches/pip
key: metalwebgpu
- name: Install Dependencies
run: pip install -e '.[metal,testing]'
run: pip install -e '.[metal,webgpu,testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Test LLaMA compile speed
run: PYTHONPATH="." METAL=1 python3 test/external/external_test_speed_llama.py
#- name: Run dtype test
# run: DEBUG=4 METAL=1 python -m pytest test/test_dtype.py
# dtype test has issues on test_half_to_int8
- name: Run ops test
- name: Run metal ops test
run: DEBUG=2 METAL=1 python -m pytest test/test_ops.py
- name: Run JIT test
run: DEBUG=2 METAL=1 python -m pytest test/test_jit.py
# TODO: why not testing the whole test/?
- name: Run webgpu pytest
run: WEBGPU=1 WGPU_BACKEND_TYPE=Metal python -m pytest -n=auto -m 'webgpu'
- name: Build WEBGPU Efficientnet
run: WEBGPU=1 WGPU_BACKEND_TYPE=Metal python -m examples.webgpu.compile_webgpu
testdocker:
name: Docker Test
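The `-m 'webgpu'` selection above, like the `-m 'not exclude_<backend>'` filters in the backend matrix job, depends on custom pytest markers, which the commit message says were registered ("register markers"). The registration code is not part of this hunk. The sketch below shows one minimal way it could be done in a `conftest.py`. Only `webgpu` and the `exclude_<backend>` pattern come from the workflow; the exact marker list, the descriptions, and the `MARKERS` name are assumptions for illustration.

```python
# conftest.py (hypothetical sketch; the real registration is not shown in this diff)

# Marker names are assumptions derived from the -m expressions in test.yml.
MARKERS = {
    "webgpu": "tests selected for the WebGPU run (-m 'webgpu')",
    "exclude_llvm": "tests deselected on the LLVM backend",
    "exclude_clang": "tests deselected on the CLANG backend",
    "exclude_gpu": "tests deselected on the GPU backend",
    "exclude_cuda": "tests deselected on the CUDA backend",
}


def pytest_configure(config):
    """pytest hook: register custom markers so -m expressions do not trigger
    unknown-marker warnings."""
    for name, description in MARKERS.items():
        config.addinivalue_line("markers", f"{name}: {description}")
```

A test module would then opt in at file level with something like `pytestmark = pytest.mark.webgpu`, which is likely what "pytest file markers" in the commit message refers to.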
@@ -264,58 +199,73 @@ jobs:
- name: Test Docker
run: docker run --rm tinygrad /usr/bin/env python3 -c "from tinygrad.tensor import Tensor; print(Tensor.eye(3).numpy())"
tests:
strategy:
matrix:
backend: [llvm, clang, gpu, cuda]
testcuda:
name: (emulated) cuda test
runs-on: ubuntu-22.04
name: Tests on (${{ matrix.backend }})
runs-on: ${{ matrix.backend == 'gpu' && 'ubuntu-20.04' || matrix.backend=='clang'&&'windows-latest'|| 'ubuntu-latest' }}
timeout-minutes: 20
steps:
- name: Checkout Code
uses: actions/checkout@v3
- name: Update packages
run: |
export DEBIAN_FRONTEND=noninteractive
sudo apt-get update -y
- name: Install packages
run: sudo apt-get install -y --no-install-recommends git g++ cmake ninja-build llvm-15-dev libz-dev libglew-dev flex bison libfl-dev libboost-thread-dev libboost-filesystem-dev nvidia-cuda-toolkit-gcc
- name: Cache gpuocelot
id: cache-build
uses: actions/cache@v3
env:
cache-name: cache-gpuocelot-build
with:
path: ${{ github.workspace }}/gpuocelot/ocelot/
key: ubuntu22.04-gpuocelot-19626fc00b6ee321638c3111074269c69050e091
restore-keys: |
ubuntu22.04-gpuocelot-19626fc00b6ee321638c3111074269c69050e091
- if: ${{ steps.cache-build.outputs.cache-hit != 'true' }}
name: Clone gpuocelot
uses: actions/checkout@v3
with:
repository: gpuocelot/gpuocelot
ref: 19626fc00b6ee321638c3111074269c69050e091
path: ${{ github.workspace }}/gpuocelot
submodules: true
- if: ${{ steps.cache-build.outputs.cache-hit != 'true' }}
name: Compile gpuocelot
run: |
cd ${{ github.workspace }}/gpuocelot/ocelot
mkdir build
cd build
cmake .. -Wno-dev -G Ninja -DOCELOT_BUILD_TOOLS=OFF
ninja
- name: Install gpuocelot
run: |
cd ${{ github.workspace }}/gpuocelot/ocelot/build
sudo ninja install
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
cache: 'pip'
cache-dependency-path: setup.py
- name: Install tinygrad dependencies
run: pip install -e '.[testing, cuda]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Run pytest
run: FORWARD_ONLY=1 JIT=1 OPT=2 CUDA=1 CUDACPU=1 python -m pytest -s -v -n=auto test --ignore=test/external --ignore=test/models --ignore=test/test_speed_v_torch.py --ignore=test/test_specific_conv.py --ignore=test/test_net_speed.py --ignore=test/test_nn.py -k "not half"
- name: Checkout Code
uses: actions/checkout@v3
- name: Set up Python 3.8
uses: actions/setup-python@v4
with:
python-version: 3.8
- name: Cache pip
uses: actions/cache@v3
with:
path: ${{ matrix.backend=='clang'&&'~\AppData\Local\pip\cache'||'~/.cache/pip' }}
key: ${{ matrix.backend }}
- name: Set env
run: printf "${{ matrix.backend == 'llvm' && 'ENABLE_METHOD_CACHE=1\nLLVM=1' || matrix.backend == 'clang' && 'CLANG=1\nENABLE_METHOD_CACHE=1' || matrix.backend == 'gpu' && 'GPU=1' || matrix.backend == 'cuda' && 'FORWARD_ONLY=1\nJIT=1\nOPT=2\nCUDA=1\nCUDACPU=1\n'}}" >> $GITHUB_ENV
- name: Install packages (gpu)
if: matrix.backend == 'gpu'
run: |
wget -O- https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB | gpg --dearmor | sudo tee /usr/share/keyrings/oneapi-archive-keyring.gpg > /dev/null
echo "deb [signed-by=/usr/share/keyrings/oneapi-archive-keyring.gpg] https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update && \
sudo apt install -y intel-oneapi-runtime-compilers intel-oneapi-runtime-opencl
- name: Install packages (cuda)
if: matrix.backend == 'cuda'
run: |
export DEBIAN_FRONTEND=noninteractive
sudo apt update -y && \
sudo apt install -y --no-install-recommends git g++ cmake ninja-build llvm-15-dev zlib1g-dev libglew-dev flex bison libfl-dev libboost-thread-dev libboost-filesystem-dev nvidia-cuda-toolkit-gcc
- name: Cache gpuocelot
if: matrix.backend == 'cuda'
id: cache-build
uses: actions/cache@v3
env:
cache-name: cache-gpuocelot-build
with:
path: ${{ github.workspace }}/gpuocelot/ocelot/
key: ubuntu22.04-gpuocelot-19626fc00b6ee321638c3111074269c69050e091
restore-keys: |
ubuntu22.04-gpuocelot-19626fc00b6ee321638c3111074269c69050e091
- name: Clone/compile gpuocelot
if: matrix.backend == 'cuda' && steps.cache-build.outputs.cache-hit != 'true'
run: |
git clone --recurse-submodules https://github.com/gpuocelot/gpuocelot.git ${{ github.workspace }}/gpuocelot
cd ${{ github.workspace }}/gpuocelot/ocelot
git checkout 19626fc00b6ee321638c3111074269c69050e091
mkdir build
cd build
cmake .. -Wno-dev -G Ninja -DOCELOT_BUILD_TOOLS=OFF
ninja
- name: Install gpuocelot
if: matrix.backend == 'cuda'
run: |
cd ${{ github.workspace }}/gpuocelot/ocelot/build
sudo ninja install
- name: Install dependencies
run: pip install -e '.[testing${{matrix.backend=='llvm'&&',llvm'||matrix.backend=='cuda'&&',cuda'||''}}]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Run pytest (not cuda)
if: matrix.backend!='cuda'
run: python -m pytest -n=auto test/ -k '${{matrix.backend=='llvm'&&'not (test_nn.py and test_conv_transpose2d)'||'test'}}' -m 'not exclude_${{matrix.backend}}'
- name: Run pytest (cuda)
if: matrix.backend=='cuda'
run: python -m pytest -n=auto test/ -k 'not (half or test_efficientnet_safetensors) and not (test_conv2d and test_tensor.py)' -m 'not exclude_cuda' --ignore=test/external --ignore=test/models
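The matrix job above leans on the `cond && 'a' || 'b'` idiom in GitHub Actions expressions to pick a runner and a set of environment variables per backend. Python's `and`/`or` short-circuit the same way for non-empty strings, so the selection logic can be mirrored and checked locally. This is a sketch rather than part of the commit: the function names `backend_env` and `runner_for` are hypothetical, the trailing `or ""` fallback is an addition, and the clang branch uses `ENABLE_METHOD_CACHE`, the spelling from the earlier dedicated CLANG job.

```python
def backend_env(backend: str) -> dict:
    """Environment variables the 'Set env' step would write for a backend."""
    raw = (
        backend == "llvm" and "ENABLE_METHOD_CACHE=1\nLLVM=1"
        or backend == "clang" and "CLANG=1\nENABLE_METHOD_CACHE=1"
        or backend == "gpu" and "GPU=1"
        or backend == "cuda" and "FORWARD_ONLY=1\nJIT=1\nOPT=2\nCUDA=1\nCUDACPU=1\n"
        or ""  # assumption: unknown backends get no extra variables
    )
    # Each non-empty line has the form NAME=VALUE, as appended to $GITHUB_ENV.
    return dict(line.split("=", 1) for line in raw.splitlines() if line)


def runner_for(backend: str) -> str:
    """Runner label chosen by the runs-on expression."""
    return (
        backend == "gpu" and "ubuntu-20.04"
        or backend == "clang" and "windows-latest"
        or "ubuntu-latest"
    )


if __name__ == "__main__":
    for b in ("llvm", "clang", "gpu", "cuda"):
        print(b, runner_for(b), backend_env(b))
```

The idiom only works because every "true" branch is a non-empty string. An empty string in the middle of the chain would fall through to the next alternative, in Actions expressions as in Python.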