Tensor Cores 2: Local Buffers Edition (#1057)

* local buffers

* work

* works

* invert_strides

* work

* non tc

* fix shapetracker bug

* stride priority

* touchups

* gate tensor cores

* tensor core conv

* cleanups

* bug fixes

* fix metal_matmul

* fast tensor cores

* more speed

* buffer selection bug fix

* fix CI maybe

* ugh, CI is set to true, not 1

* tc allowed

* add_gl_dimension

* split out padding conv tests

* does padding add fail

* test_padded_conv2d_1x1

* skip metal ci stuff

* more strict on yellow

* float2

* strip parens

* fix float2

* touch up

* dtype

* strip parens

* no alias

* bugfix

* cast float2 and test tensor core ops

* oops, don't hardcode 4
George Hotz
2023-07-09 09:06:00 -07:00
committed by GitHub
parent 67e34b356a
commit beb4d3ab01
3 changed files with 223 additions and 44 deletions

@@ -195,6 +195,8 @@ jobs:
         FORWARD_ONLY=1 GPU=1 IMAGE=2 python3 test/test_ops.py
     - name: Test openpilot model correctness (float32)
       run: DEBUGCL=1 GPU=1 IMAGE=2 python3 openpilot/compile.py
+    - name: Test tensor core ops
+      run: GPU=1 TC=2 python3 test/test_ops.py
 
   testmetal:
     name: Metal Tests