Commit Graph

791 Commits

Author SHA1 Message Date
George Hotz
a11deb5150 shapetracker check for noop 2022-06-16 16:29:18 -07:00
George Hotz
52505faaf4 minor 2022-06-16 15:53:45 -07:00
George Hotz
d5b3e18540 Accelerate with CL (#325)
* accelerated opencl

* it's running, it's just wrong

* bugfix

* model is correct in opencl

* lazy image convert

* add padding support to convolution

* that stuff was all upstreamed

* remove HEAD

* oops

* test_simple_conv2d_4 passes, add dilation support

* put logic in ops_opencl

* fix crash

* hmm, stride seems okay

* padding for batched inputs

* just an issue now with cout%4

* op model still passes

* fix startPackedInputChannel

* pre and post processing ops for graph

* don't break other llops

* shapetrackering

* reshapes are free

* lazy movement ops
2022-06-16 15:40:52 -07:00
George Hotz
bd7068f635 fix tests hopefully 2022-06-16 14:07:37 -07:00
George Hotz
ce15bf2bdb the big memory gradient didn't even need to be computed 2022-06-16 11:41:29 -07:00
George Hotz
2e58948f6a Revert "can put that test back"
This reverts commit 51b082b41a.
2022-06-16 11:25:49 -07:00
George Hotz
51b082b41a can put that test back 2022-06-16 11:18:14 -07:00
George Hotz
85fe25e27b add stride support to shapetracker 2022-06-15 17:48:41 -07:00
George Hotz
3d4657167b fix tests hopefully 2022-06-15 17:26:37 -07:00
George Hotz
2a14befb74 support padding 2022-06-15 14:46:44 -07:00
George Hotz
fef6c82491 wow dilation support was simple 2022-06-15 11:38:23 -07:00
George Hotz
0b182029dd support dilated convolution in torch 2022-06-14 18:03:35 -07:00
George Hotz
a690ba4588 add test for padding 2022-06-14 17:41:22 -07:00
George Hotz
e057ca23bb add flip 2022-06-14 17:28:43 -07:00
George Hotz
6261a0639b ShapeTracker (#328)
* start shapetracker

* that late reshape is crushing our hopes

* simple failure

* DumbShapeTracker passes tests

* improve st tests

* stacked view tracker works

* flip works

* tests pass

* shapetracker works

* use ShapeTracker in ops_gpu

* a couple lines

* fix 0 shape

* less lines

* use shapetracker for new_shape in ops.py

* simpler still

* padding with a ZeroView

* gamed it a little
2022-06-14 16:08:22 -07:00
George Hotz
dcbca4fdf1 Expand Operator (#327)
* replace broadcasting with expand

* Tensor, not self

* remove broadcasting from mlops

* delete useless A operator

* expand, not repeat

* remove A op

* expand on gpu

* binary_op doesn't broadcast anymore

* expand is still total junk, but the tests should pass
2022-06-12 12:31:48 -07:00
George Hotz
33f18c61a1 test_broadcasted_add 2022-06-12 10:19:58 -07:00
George Hotz
af300b121b refactor to pass conv args into llops 2022-06-11 23:08:46 -07:00
George Hotz
d747a4b9e2 add padding to conv2d function, other minor things 2022-06-11 22:29:42 -07:00
George Hotz
9a3c048724 skip broken tests, no float64 allowed 2022-06-11 17:12:04 -07:00
George Hotz
9ebd472375 move ops to ops.py 2022-06-11 15:58:56 -07:00
George Hotz
b5b68e75ff simpler onnx 2022-06-11 15:35:45 -07:00
George Hotz
2305a5347b test_onnx works with enet also 2022-06-11 14:30:26 -07:00
George Hotz
6fdb276886 flip batchnorm function order 2022-06-11 13:20:41 -07:00
George Hotz
85d17a2acd running resnet onnx 2022-06-11 13:17:15 -07:00
George Hotz
0225360191 fixed with one return x 2022-06-11 12:08:53 -07:00
George Hotz
db5a632e8c multicat + test onnx is generic onnx 2022-06-11 11:50:47 -07:00
George Hotz
a710b3a210 it's a real test now 2022-06-11 11:33:33 -07:00
George Hotz
8440dbfa5d support inputs 2022-06-11 11:21:45 -07:00
George Hotz
08de1aa636 add flatten to tinygrad 2022-06-11 11:15:16 -07:00
George Hotz
aee251cc41 op model test 2022-06-11 11:06:03 -07:00
George Hotz
d061ce8d5e add ELU support 2022-06-11 10:47:23 -07:00
George Hotz
8864b37333 fix torch convdw 2022-06-10 15:04:39 -07:00
George Hotz
aac1a9b419 this breaks tests 2022-06-10 12:20:42 -07:00
George Hotz
e01ed64d7c restore that naming 2022-06-09 08:38:34 -07:00
George Hotz
60a48455ad still over line count, maybe test pass 2022-06-08 09:51:28 -07:00
George Hotz
70561f3d90 way over the line limit 2022-06-08 09:36:31 -07:00
George Hotz
4f7ee235c5 not a real test now 2022-06-08 09:00:59 -07:00
George Hotz
ae33060dae early float4 stuff for binary 2022-06-08 08:59:54 -07:00
George Hotz
82f29b5dbf better GPU block 2022-06-08 08:01:04 -07:00
George Hotz
42ae78241e only run test on GPU 2022-06-08 07:54:40 -07:00
George Hotz
cdf4b5f142 opencl perf test 2022-06-08 07:49:08 -07:00
George Hotz
d8ee8a39ac sgd threestep graph is so pretty 2022-06-06 09:45:37 -07:00
George Hotz
c143c92828 adam threestep 2022-06-06 09:38:28 -07:00
George Hotz
d302049e53 don't use div 2022-06-06 09:25:31 -07:00
George Hotz
a1dff4061b minor cleanups 2022-06-06 08:14:52 -07:00
George Hotz
3dac8fa728 this fix the gc 2022-06-05 17:16:40 -07:00
George Hotz
0ee21ba115 add ViT test and car 2022-06-05 17:12:43 -07:00
George Hotz
1de75b67d5 fix bug in graph with use of id 2022-06-05 16:31:20 -07:00
George Hotz
f0fe37bd34 simpler graph demo 2022-06-05 12:40:12 -07:00