George Hotz
27f209b80b
add time sum
2022-08-06 14:35:21 +00:00
George Hotz
64c0d2dedb
add gflop estimate
2022-08-06 14:21:48 +00:00
George Hotz
94d526f8fc
fix op estimate
2022-08-06 14:15:50 +00:00
George Hotz
f2847cb710
remove useless init, add ops counter
2022-08-06 14:05:25 +00:00
George Hotz
18fde22dac
fix that soon
2022-07-20 09:07:09 -07:00
George Hotz
187d581dc6
fix options on old pyopencl
2022-07-20 08:24:05 -07:00
Comma Device
6da956b9fa
that should be right
2022-07-19 19:47:37 -07:00
Comma Device
f4ed837f2f
float16 fixups
2022-07-19 19:44:40 -07:00
Comma Device
aa00a3948e
needs_load in image correct
2022-07-19 19:25:47 -07:00
Comma Device
314d70ff17
zero out the buffer
2022-07-19 19:17:47 -07:00
Comma Device
b8a67905e5
save weights
2022-07-19 19:14:14 -07:00
George Hotz
46e7dfade1
REQUIRES_SIMPLE_REDUCE
2022-07-19 11:42:14 -07:00
George Hotz
acbeaf0ba9
adam in benchmark_train_efficientnet
2022-07-19 09:33:07 -07:00
George Hotz
ef1100fdff
touchups
2022-07-19 09:30:06 -07:00
George Hotz
5d45c6e516
Fold reduce (#362)
* folding reduce
* fold through movementops
* fixup shapes
* was too aggressive
* i knew we needed that
* don't recompute reduce
* working
* fix openpilot compile
* prunegraph openpilot
* types and reduce_shape
* refactor
* cleanups
* neater
* 1009
* 1004
* clean up reduce for 998
2022-07-19 09:24:02 -07:00
Comma Device
2d402d1135
buffer_id is 8 bytes
2022-07-18 20:27:45 -07:00
Comma Device
577c23731e
outputs with size
2022-07-18 20:21:33 -07:00
Comma Device
29581b5c85
inputs and outputs
2022-07-18 20:17:26 -07:00
Comma Device
ae30641b0d
fix row pitch
2022-07-18 19:48:19 -07:00
Comma Device
02f23e526c
output file to disk
2022-07-18 19:23:22 -07:00
George Hotz
ef4afdb5d2
tests maybe
2022-07-18 08:24:14 -07:00
George Hotz
a2c4bcf313
disable opencl tests
2022-07-18 08:17:21 -07:00
George Hotz
5093455166
don't shuffle if there's children involved
2022-07-17 21:15:40 -07:00
George Hotz
f76d41812b
prune graph
2022-07-17 15:38:43 -07:00
George Hotz
eda6f071b2
default opt level 2
2022-07-17 14:54:40 -07:00
George Hotz
73b0471b25
join expands
2022-07-17 13:42:05 -07:00
George Hotz
cfabbbd6bb
more crap to remove without convs
2022-07-17 13:02:27 -07:00
George Hotz
5e96ed523a
fix opencl bug, no training on opencl
2022-07-17 12:55:26 -07:00
George Hotz
f93e297804
fix bug caused by rounding
2022-07-17 12:49:58 -07:00
George Hotz
cff297ef9d
w/e, that's a later prob
2022-07-17 12:32:50 -07:00
George Hotz
4bc07326d4
we need that opt to make gpu decent speed
2022-07-17 12:26:18 -07:00
George Hotz
6375e7129a
opencl not imported
2022-07-17 12:14:39 -07:00
George Hotz
bf299802f8
fixup tests
2022-07-17 12:11:53 -07:00
George Hotz
762e859089
testopencl
2022-07-17 11:56:40 -07:00
George Hotz
608e2431f7
test opencl, commit to removing the crap conv code from GPU
2022-07-17 11:55:37 -07:00
George Hotz
3c4565fa21
SLICE -> PAD,SHRINK
2022-07-17 11:33:59 -07:00
George Hotz
9574dd8559
some permutes are reshapes
2022-07-17 10:34:24 -07:00
George Hotz
77806e0d64
fix permute stacking
2022-07-17 10:24:57 -07:00
George Hotz
c28a99087b
add PAD movementop
2022-07-17 10:22:03 -07:00
George Hotz
4527c0453d
get_movementroot
2022-07-17 09:37:27 -07:00
George Hotz
b00cc93102
bugfixes
2022-07-17 09:19:22 -07:00
George Hotz
d07f379038
don't merge movement ops
2022-07-17 09:09:50 -07:00
George Hotz
f6ea7c022a
Revert "EXPAND -> REPEAT"
This reverts commit 115d2eadf5.
2022-07-17 08:42:10 -07:00
George Hotz
cca089b11d
Revert "more expand -> repeat"
This reverts commit 2e7b1630a8.
2022-07-17 08:41:48 -07:00
George Hotz
2e7b1630a8
more expand -> repeat
2022-07-17 08:40:49 -07:00
George Hotz
115d2eadf5
EXPAND -> REPEAT
2022-07-17 08:38:54 -07:00
George Hotz
1eb247f823
contiguous and same length
2022-07-16 08:49:07 -07:00
George Hotz
d04b274cd2
noop removal can replace with reshape
2022-07-16 08:32:42 -07:00
George Hotz
d985217fa4
skip reduce noops
2022-07-16 07:47:43 -07:00
George Hotz
bcf422dfdd
Device2 (#358)
* option for matmul
* fixups
* fast like a nascar
* running
* thneed runner
* no buffer id makes no backing buffer
* move constant folding to the top
* runs on mac
* folded biases
* was v slow
* maybe just that
* elu touchup
* speed and float32
Co-authored-by: Comma Device <device@comma.ai>
2022-07-16 07:26:19 -07:00