Refactor load/store before tensor cores (#1193)

* minor cleanups
* render_const
* now that's a nice refactor
* clean up vload/vstore
* clean up render_load
* debugs there
* dumb
* err, this?
* const float4
* what's failing
* bugfix
* statement includes semicolon
* bugfix
2026-04-29 03:00:14 -04:00 · 2023-07-08 15:54:58 -07:00
parent ef1909500e
commit 7151382364
9 changed files with 85 additions and 90 deletions
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -185,14 +185,14 @@ jobs:
        python-version: 3.8
    - name: Install Dependencies
      run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
+    - name: Test openpilot model compile and size
+      run: |
+        DEBUG=2 ALLOWED_KERNEL_COUNT=199 FLOAT16=1 DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
+        python3 -c 'import os; assert os.path.getsize("/tmp/output.thneed") < 100_000_000'
    - name: Test GPU IMAGE ops
      run: |
        GPU=1 IMAGE=1 python3 test/test_ops.py
        FORWARD_ONLY=1 GPU=1 IMAGE=2 python3 test/test_ops.py
-    - name: Test openpilot model compile and size
-      run: |
-        ALLOWED_KERNEL_COUNT=199 FLOAT16=1 DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
-        python3 -c 'import os; assert os.path.getsize("/tmp/output.thneed") < 100_000_000'
    - name: Test openpilot model correctness (float32)
      run: DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py