Simple version of the new GPU backend (#458)

* newgpu
* more to delete
* hmm, tests pass with constant folding
* fix lint/type
* fix constant folding
* comment and rerun tests
* lazy touchups
* fix graph_batchnorm test
* smaller transformer to fix OOM
* Revert "smaller transformer to fix OOM"
  This reverts commit a44ef8edc2.
* no func cache
* introspect
* touchups
* CLASTKernel
* ugh, it was lru_cache
* codegen
* spacing
* old gpu still in opencl
* typing fix
2026-02-11 23:25:04 -05:00 · 2023-01-10 19:16:02 -08:00
parent 66123c99b9
commit fff1f046b0
7 changed files with 220 additions and 105 deletions
--- a/test/graph_batchnorm.py
+++ b/test/graph_batchnorm.py
@@ -6,8 +6,8 @@ import unittest
 def model_step(lm):
  Tensor.training = True
  x = Tensor.ones(8,12,128,256, requires_grad=False)
-  loss = lm.forward(x).sum()
  optimizer = optim.SGD(get_parameters(lm), lr=0.001)
+  loss = lm.forward(x).sum()
  optimizer.zero_grad()
  loss.backward()
  del x,loss