4 Commits

Author SHA1 Message Date
George Hotz
382f346523 clean up opt (#649)
* clean up opt

* don't let global kernels get too small

* 8192 -> 1024

* disable local shape for clang

* fix can_merge

* unroll the 5x5 depthwise convs in op

* load float4 check
2023-03-05 20:49:36 -08:00
George Hotz
7930c6ab5c CLImage backing bug + test_vec_mul 2023-03-05 16:32:05 -08:00
George Hotz
8de24e3b05 accumulator can be a float4 (#647)
* remove reduceopop

* not float4 yet

* float4 acc works

* group_float4 on store
2023-03-05 15:44:41 -08:00
George Hotz
b1ba78ac38 move applegpu disassembler 2023-03-05 11:21:12 -08:00