Commit Graph

2143 Commits

Author · SHA1 · Message · Date
George Hotz
a7131b6a46 Non contig (#339)
* contiguous_view
* non contig reduce too
* conv fast
* maybe faster valid
* improve test_onnx
* improve params
* elementwise_op
* draw non contig
* improve contiguous
2022-06-19 22:40:48 -07:00
George Hotz
d05e7c291a contiguous_view (#336)
* contiguous_view
* non contig reduce too
* conv fast
* maybe faster valid
* improve test_onnx
* improve params
* elementwise_op
* draw non contig
2022-06-19 20:37:28 -07:00
George Hotz
fb72ea3fbd gpu uses shapetracker (fix tests) (#335)
* shapetracker
* movement_op
* hmm, that's why repr failed
2022-06-19 17:32:07 -07:00
George Hotz
ce2e20b768 fix test 2022-06-19 17:07:09 -07:00
George Hotz
f5f21ecb86 gpu buffer is shapetracker 2022-06-19 17:02:24 -07:00
George Hotz
6b652dafb2 touchups 2022-06-19 16:57:14 -07:00
George Hotz
e364849b3b stuff from lazy 2022-06-19 09:57:16 -07:00
Tim Lügger
2069fef292 unnecessary assign add in cpu processing_op (#334)
We can replace += with = since we only write tmp once. Now np.empty() can replace np.zeros(), which might be slightly faster. This saves a few milliseconds, ~60ms in the best case (sketched below).

(However, most of the time in ops_cpu.processing_op() seems to be spent in np.reshape().)
2022-06-19 07:41:40 -07:00
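To make the change above concrete, here is a minimal sketch; the shapes and names are illustrative, not the actual ops_cpu.processing_op() code:

```python
import numpy as np

x = np.random.randn(64, 128).astype(np.float32)
w = np.random.randn(128, 32).astype(np.float32)

# Before (illustrative): zero-initialize, then accumulate exactly once.
tmp = np.zeros((64, 32), dtype=np.float32)
tmp += x @ w

# After: tmp is written only once, so plain assignment works, and
# np.empty() skips the zero-fill that np.zeros() performs.
tmp = np.empty((64, 32), dtype=np.float32)
tmp[:] = x @ w
```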
George Hotz
8d08e41c21 print time in test 2022-06-19 00:59:09 -07:00
George Hotz
395eb60f46 less lines, and oddly faster 2022-06-18 21:48:42 -07:00
George Hotz
aa164d901e remove ctx from buffers (#333) 2022-06-18 17:27:10 -07:00
George Hotz
77f5cef8a6 First batch from lazy branch (#332)
* test and helpers from lazy
* lazy pt2
2022-06-18 17:26:59 -07:00
George Hotz
3faf8353ca remove out_shape from processing_op 2022-06-16 17:07:57 -07:00
George Hotz
a11deb5150 shapetracker check for noop 2022-06-16 16:29:18 -07:00
George Hotz
52505faaf4 minor 2022-06-16 15:53:45 -07:00
George Hotz
d5b3e18540 Accelerate with CL (#325)
* accelerated opencl
* it's running, it's just wrong
* bugfix
* model is correct in opencl
* lazy image convert
* add padding support to convolution
* that stuff was all upstreamed
* remove HEAD
* oops
* test_simple_conv2d_4 passes, add dilation support
* put logic in ops_opencl
* fix crash
* hmm, stride seems okay
* padding for batched inputs
* just an issue now with cout%4
* op model still passes
* fix startPackedInputChannel
* pre and post processing ops for graph
* don't break other llops
* shapetrackering
* reshapes are free
* lazy movement ops
2022-06-16 15:40:52 -07:00
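Several bullets above add padding, stride, and dilation support to convolution. For reference, all three feed into the standard output-size formula; a small sketch (the function name is illustrative):

```python
def conv_out_size(n, k, stride=1, pad=0, dilation=1):
    # the dilated kernel spans dilation*(k-1)+1 input positions
    return (n + 2 * pad - dilation * (k - 1) - 1) // stride + 1

assert conv_out_size(8, 3) == 6              # valid conv
assert conv_out_size(8, 3, pad=1) == 8       # "same" padding
assert conv_out_size(8, 3, dilation=2) == 4  # effective kernel spans 5
```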
George Hotz
bd7068f635 fix tests hopefully 2022-06-16 14:07:37 -07:00
George Hotz
9306759cbc put the allocations back in the ops 2022-06-16 12:12:55 -07:00
George Hotz
ce15bf2bdb the big memory gradient didn't even need to be computed 2022-06-16 11:41:29 -07:00
George Hotz
2e58948f6a Revert "can put that test back"
This reverts commit 51b082b41a.
2022-06-16 11:25:49 -07:00
George Hotz
51b082b41a can put that test back 2022-06-16 11:18:14 -07:00
George Hotz
73bc181fbe cleaner output shape 2022-06-16 10:24:03 -07:00
George Hotz
b5796ae4f9 remove useless reshape 2022-06-16 10:15:43 -07:00
George Hotz
89db797e57 get rid of reduce using channels 2022-06-16 10:01:54 -07:00
George Hotz
38d6cfec2a remove the expand 2022-06-16 09:54:56 -07:00
George Hotz
bcfbb4c81b minor cleanups 2022-06-15 22:27:46 -07:00
George Hotz
3667200df5 remove unused unstride 2022-06-15 20:03:43 -07:00
George Hotz
ff648e9510 remove convt and compute dx with conv 2022-06-15 19:54:15 -07:00
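This one-liner relies on a standard identity: for a stride-1 correlation, the gradient with respect to the input is a full convolution of the output gradient with the weights (convolution flips them), so no separate transposed-conv llop is needed. A 1-D numeric check, illustrative rather than the tinygrad code:

```python
import numpy as np

N, K = 8, 3
x = np.random.randn(N)
w = np.random.randn(K)

# forward: valid cross-correlation, y[i] = sum_k x[i+k] * w[k]
y = np.correlate(x, w, mode='valid')
g = np.random.randn(y.size)  # upstream gradient dL/dy

# backward for x: a *full* convolution of g with w (np.convolve flips w)
dx = np.convolve(g, w, mode='full')

# analytic reference: y[i] touches x[i:i+K] through w
dx_ref = np.zeros(N)
for i in range(y.size):
    dx_ref[i:i + K] += g[i] * w
assert np.allclose(dx, dx_ref)
```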
George Hotz
142c88f2e3 move to mlops 2022-06-15 18:06:07 -07:00
George Hotz
85fe25e27b add stride support to shapetracker 2022-06-15 17:48:41 -07:00
George Hotz
827e8f67eb comment 2022-06-15 17:31:27 -07:00
George Hotz
3d4657167b fix tests hopefully 2022-06-15 17:26:37 -07:00
George Hotz
e4ab57e39d oops, only stride 2022-06-15 15:25:58 -07:00
George Hotz
86f55b078d transpose dilation was simple 2022-06-15 15:20:51 -07:00
George Hotz
2a14befb74 support padding 2022-06-15 14:46:44 -07:00
George Hotz
6d98366214 move CONVDW out of llops 2022-06-15 12:05:11 -07:00
George Hotz
fef6c82491 wow dilation support was simple 2022-06-15 11:38:23 -07:00
George Hotz
0b182029dd support dilated convolution in torch 2022-06-14 18:03:35 -07:00
George Hotz
a690ba4588 add test for padding 2022-06-14 17:41:22 -07:00
George Hotz
e057ca23bb add flip 2022-06-14 17:28:43 -07:00
George Hotz
a8aeebfb0c use shapetracker to combine adj reduce axis 2022-06-14 17:08:12 -07:00
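The trick behind that commit: reducing over adjacent axes is equivalent to reshaping them into a single axis and reducing once, and a shape tracker can do that reshape for free. In plain numpy:

```python
import numpy as np

x = np.random.randn(4, 5, 6)
# reducing over adjacent axes 1 and 2 == merging them and reducing once
a = x.sum(axis=(1, 2))
b = x.reshape(4, -1).sum(axis=1)
assert np.allclose(a, b)
```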
George Hotz
906cce9916 reduce with loops 2022-06-14 16:38:33 -07:00
George Hotz
6261a0639b ShapeTracker (#328)
* start shapetracker
* that late reshape is crushing our hopes
* simple failure
* DumbShapeTracker passes tests
* improve st tests
* stacked view tracker works
* flip works
* tests pass
* shapetracker works
* use ShapeTracker in ops_gpu
* a couple lines
* fix 0 shape
* less lines
* use shapetracker for new_shape in ops.py
* simpler still
* padding with a ZeroView
* gamed it a little
2022-06-14 16:08:22 -07:00
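To make the ShapeTracker idea concrete: movement ops become pure metadata rewrites over (shape, strides), so no data moves until an op actually reads the buffer. A toy sketch of the idea, not the real tinygrad class (stacked views and ZeroView padding omitted):

```python
class MiniShapeTracker:
    # Toy illustration: movement ops only rewrite shape/stride
    # metadata; the underlying buffer is untouched.
    def __init__(self, shape):
        self.shape = list(shape)
        self.strides = [1] * len(shape)
        for i in range(len(shape) - 2, -1, -1):  # row-major strides
            self.strides[i] = self.strides[i + 1] * shape[i + 1]

    def permute(self, order):
        # transpose is free: just reorder the metadata
        self.shape = [self.shape[i] for i in order]
        self.strides = [self.strides[i] for i in order]
        return self

    def offset(self, idx):
        # map a logical index to a flat offset into the unchanged buffer
        return sum(i * st for i, st in zip(idx, self.strides))

# element (0, 1) of the permuted view is element (1, 0) of the
# original, i.e. flat offset 3 in a contiguous (2, 3) buffer
assert MiniShapeTracker((2, 3)).permute((1, 0)).offset((0, 1)) == 3
```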
George Hotz
e58b5711ec simpler convdw 2022-06-13 17:56:54 -07:00
George Hotz
dcbca4fdf1 Expand Operator (#327)
* replace broadcasting with expand
* Tensor, not self
* remove broadcasting from mlops
* delete useless A operator
* expand, not repeat
* remove A op
* expand on gpu
* binary_op doesn't broadcast anymore
* expand is still total junk, but the tests should pass
2022-06-12 12:31:48 -07:00
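"Replace broadcasting with expand" means binary_op never sees mismatched shapes: a size-1 axis is explicitly expanded first, which at the buffer level is just a stride-0 view. An illustrative numpy version of that idea:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# "expand" a size-1 axis by giving it stride 0; the elementwise op
# that follows then never has to know about broadcasting
a = np.arange(3, dtype=np.float32).reshape(1, 3)   # shape (1, 3)
expanded = as_strided(a, shape=(4, 3), strides=(0, a.strides[1]))
b = np.ones((4, 3), dtype=np.float32)
out = expanded + b  # plain same-shape elementwise add
assert np.allclose(out, a + b)
```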
George Hotz
5cf7649eda register the operators outside 2022-06-12 10:26:34 -07:00
George Hotz
33f18c61a1 test_broadcasted_add 2022-06-12 10:19:58 -07:00
George Hotz
d47a421970 add cout to conv_args, don't change the first 12 2022-06-12 00:10:15 -07:00
George Hotz
af300b121b refactor to pass conv args into llops 2022-06-11 23:08:46 -07:00
George Hotz
d747a4b9e2 add padding to conv2d function, other minor things 2022-06-11 22:29:42 -07:00