conv2d is an hlop (#589)

* conv2d is an hlop

* shorter conv

* KOPT=-1

* alt imp

* MULACC

* smarter mulacc

* pop conv

* 7x7 -> 5x5

* didn't fix, that's not going to work

* this is faster and matches old behavior

* oh, non lazy just won't work with mulacc

* mulacc in torch

* bool types were creeping in

* optimizer is actually better with hlop conv

* fix pushing permutes issue

* refactor einsum_mulacc

* fix up readme

* update readme

* _image_conv2d

* fix bias addition location

* pushing permutes gets back to 200 kernels

* conv cleanup

* disable hlop conv

* don't hide that in helpers
George Hotz
2023-02-23 17:52:31 -08:00
committed by GitHub
parent 8835df7a5c
commit 758515dcc0
13 changed files with 177 additions and 55 deletions

@@ -56,9 +56,7 @@ jobs:
     - name: Install Dependencies
       run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
     - name: Run Pytest
-      run: LAZY=0 python -m pytest -s -v -n=auto
-    - name: Run Pytest (lazy)
-      run: LAZY=1 python -m pytest -s -v -n=auto
+      run: python -m pytest -s -v -n=auto

   testimagenet:
     name: ImageNet to C Compile Test
@@ -110,9 +108,7 @@ jobs:
     - name: Install Dependencies
       run: pip install -e '.[testing]' --extra-index-url https://download.pytorch.org/whl/cpu
     - name: Run Pytest
-      run: LAZY=0 TORCH=1 python -m pytest -s -v -n=auto
-    - name: Run Pytest (lazy)
-      run: LAZY=1 TORCH=1 python -m pytest -s -v -n=auto
+      run: TORCH=1 python -m pytest -s -v -n=auto

   testgpu:
     name: GPU Tests