global -> group (#1007)

* global -> group

* allow None for local_size in custom function

* lil local

* comment on shape

* fix cuda

* smart local cast

* better local heuristic

* fix ptx, and work_dim cleanup

* fix metal

* fix ops test

* fix openpilot jit

* no more optlocal

* might fix metal tests

* try metal now

* see generated metal code

* test free removal. REVERT THIS

* mergable
George Hotz
2023-06-21 11:50:43 -07:00
committed by GitHub
parent aab9ee0fca
commit 18892242b0
17 changed files with 81 additions and 90 deletions

@@ -95,7 +95,7 @@ jobs:
run: pip install -e '.[llvm,testing]' --extra-index-url https://download.pytorch.org/whl/cpu
- name: Run Pytest
run: ENABLE_METHOD_CACHE=1 LLVM=1 python -m pytest -s -v -n=auto test/
testclang:
strategy:
matrix:
@@ -207,11 +207,11 @@ jobs:
python-version: 3.11
- name: Install Dependencies
run: pip install -e '.[metal,testing]'
- name: Run ops test
run: METAL=1 python -m pytest test/test_ops.py
# dtype test has issues on test_half_to_int8
#- name: Run dtype test
# run: METAL=1 python -m pytest test/test_dtype.py
# run: DEBUG=4 METAL=1 python -m pytest test/test_dtype.py
- name: Run ops test
run: DEBUG=2 METAL=1 python -m pytest test/test_ops.py
# dtype test has issues on test_half_to_int8
# disabled, this test is flaky
testdocker: