Simple version of the new GPU backend (#458)

* newgpu
* more to delete
* hmm, tests pass with constant folding
* fix lint/type
* fix constant folding
* comment and rerun tests
* lazy touchups
* fix graph_batchnorm test
* smaller transformer to fix OOM
* Revert "smaller transformer to fix OOM"
  This reverts commit a44ef8edc2.
* no func cache
* introspect
* touchups
* CLASTKernel
* ugh, it was lru_cache
* codegen
* spacing
* old gpu still in opencl
* typing fix
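
Note on the "introspect" and "ugh, it was lru_cache" items: together with the `print_objects() == 0` assertion added to test/test_train.py, they suggest a check that no GPU objects stay alive after a training step. The sketch below illustrates that kind of check in general terms. It assumes nothing about how `extra.introspection.print_objects` is implemented. `FakeBuffer`, `count_live`, `cached_kernel` and `run_step` are hypothetical names invented for this illustration and are not part of the repository.

```python
import functools
import gc


class FakeBuffer:
  """Hypothetical stand-in for a GPU buffer object (not a real tinygrad class)."""
  def __init__(self, size):
    self.size = size


def count_live(cls):
  """Count instances of cls that are still reachable after a full collection."""
  gc.collect()
  # type(obj) is used instead of isinstance so that proxy objects are never dereferenced.
  return sum(1 for obj in gc.get_objects() if issubclass(type(obj), cls))


@functools.lru_cache(maxsize=None)
def cached_kernel(buf):
  # lru_cache stores its arguments as cache keys, so it keeps a strong
  # reference to buf for as long as the cache entry exists.
  return buf.size * 2


def run_step():
  # buf goes out of scope when this function returns, but the cache still holds it.
  buf = FakeBuffer(16)
  return cached_kernel(buf)


if __name__ == "__main__":
  run_step()
  print("live after step:", count_live(FakeBuffer))
  cached_kernel.cache_clear()
  print("live after cache_clear:", count_live(FakeBuffer))
```

In this sketch the buffer survives the step only because the cache holds it as a key. Clearing the cache, or not caching on objects that own device memory, lets the count drop back to zero.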
2026-04-07 03:00:26 -04:00 · 2023-01-10 19:16:02 -08:00
parent 66123c99b9
commit fff1f046b0
7 changed files with 220 additions and 105 deletions
--- a/test/test_train.py
+++ b/test/test_train.py
@@ -46,6 +46,10 @@ class TestTrain(unittest.TestCase):
    Y = np.zeros((BS,6), dtype=np.int32)
    train_one_step(model,X,Y)

+    if Device.DEFAULT == "GPU":
+      from extra.introspection import print_objects
+      assert print_objects() == 0
+
  def test_resnet(self):
    X = np.zeros((BS, 3, 224, 224), dtype=np.float32)
    Y = np.zeros((BS), dtype=np.int32)