George Hotz
392e57aea7
ugh, why did that fail
2022-10-01 13:38:43 -04:00
George Hotz
7a61dc7ee9
test_sd_big_conv
2022-10-01 13:26:05 -04:00
Ollin Boer Bohan
3b1767e013
Fix OpenCL Metal texture issues (#378)
* Fix OpenCL Metal texture issues
Tile CL images when needed, to fit into the 16384 max Metal image size;
gets me to ~4.8s/iteration for SD on M1 Pro with OPENCL=1 FLOAT16=1.
* Minor cleanup
* Fix mish in CI, or no-op?
* Is mish being framed?
* It would help if any of this reproduced locally
* ???
* OPT is reverted; use original mish
* Cleanup post-review
* Fix some shape usage
* Tiler tests, shouldn't oom or overflow either
* Can't CL if there's no CL?
* Run tiler tests even if GPU=1
* relu6 segfault binary chop; revert test
* relu6 segfault binary chop; revert accel
* relu6 segfault binary chop; revert . (???)
* end relu6 segfault binary chop; repo's haunted
2022-09-29 01:21:54 -04:00
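The tiling idea in the commit above (#378) can be illustrated with a small sketch. This is not the code from that commit: `tile_shape` and `tile_coord` are hypothetical helpers, and the layout (folding an over-wide image into extra rows) is an assumption about one way to fit within the 16384 limit quoted in the message.

```python
MAX_DIM = 16384  # max Metal image size quoted in the commit message

def tile_shape(width, height, max_dim=MAX_DIM):
    """Hypothetical helper: fold an over-wide image into extra rows.

    Returns (tiled_width, tiled_height, n_tiles).
    """
    n_tiles = -(-width // max_dim)   # ceiling division
    tiled_w = min(width, max_dim)
    tiled_h = height * n_tiles
    if tiled_h > max_dim:
        raise ValueError("image does not fit even after tiling")
    return tiled_w, tiled_h, n_tiles

def tile_coord(x, y, tiled_w, n_tiles):
    """Map a coordinate in the original image to the tiled image."""
    return x % tiled_w, y * n_tiles + x // tiled_w

# A 40000-wide, 2-row image becomes 16384 wide and 6 rows tall.
w, h, n = tile_shape(40000, 2)
print(w, h, n)
print(tile_coord(39999, 1, w, n))
```

Images that already fit are left unchanged: with a single tile, the coordinate mapping is the identity.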
George Hotz
e737513c52
external_test_opt
2022-09-28 23:29:41 -04:00
George Hotz
650c011646
notrain test
2022-09-28 23:27:20 -04:00
George Hotz
af87d692e4
should this be 10?
2022-09-28 23:25:52 -04:00
George Hotz
0fd459b24e
ugh, global state
2022-09-28 23:10:49 -04:00
George Hotz
fa4eff9cc1
Device.GPU isn't defined
2022-09-28 23:00:15 -04:00
George Hotz
0b6537a572
fix tests
2022-09-28 22:57:58 -04:00
George Hotz
726cca78cd
fix bn folding issue, add new test
2022-09-28 22:52:18 -04:00
George Hotz
60df954377
Fix weight init: this work? (#391)
* this work?
* glorot uniform
* requires_grad broke
* propagate the None correctly
* so this weight init works
* ahh, i think it's this
* can't beat this
* glorot is best for ae
* remove comments
2022-09-25 16:46:33 -04:00
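For reference, the "glorot uniform" initialization named in the commit above (#391) is usually defined as sampling from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)). The sketch below is a plain-Python illustration of that formula, not the project's implementation; `glorot_uniform` and its parameters are names chosen here for the example.

```python
import math
import random

def glorot_uniform(fan_in, fan_out, rng=None):
    """Illustrative Glorot/Xavier uniform init for a fan_in x fan_out matrix."""
    rng = rng or random.Random()
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)]
            for _ in range(fan_in)]

# Seeded generator so the example is repeatable.
weights = glorot_uniform(4, 2, random.Random(0))
print(len(weights), len(weights[0]))
```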
George Hotz
271446e3eb
set requires_grad to None (#387)
* set requires_grad to None
* some things need gradients
* hmm, why was get_parameters filtering
2022-09-21 11:16:02 -04:00
George Hotz
29ae21bb0d
import tests from CL metal texture fix
2022-09-19 20:01:47 -04:00
George Hotz
57e804a9bf
add min support
2022-09-18 20:39:41 -04:00
YassineYousfi
2f0f91ba3d
support float16 onnx weights (#384)
2022-09-15 09:12:18 -04:00
George Hotz
3c3534736e
fix matmul kernel and tests
2022-09-13 08:31:04 -07:00
Comma Device
62e9419206
fix test failure on MATMUL=1 backward pass
2022-09-13 11:18:52 -04:00
Comma Device
3b82afc6a0
simple on device failing test
2022-09-13 10:59:15 -04:00
George Hotz
4efde1ba0a
test_matmul
2022-09-13 07:51:33 -07:00
George Hotz
0b8c2221b5
relax mnist test a tiny bit
2022-09-07 07:52:05 -07:00
George Hotz
ecc1a0470d
add Linear to tinygrad.nn
2022-09-07 07:40:48 -07:00
George Hotz
790af99a48
fix slice one multi, and linear can be simpler with new broadcasting
2022-09-06 19:51:33 -07:00
YassineYousfi
5aad460c7a
broadcast from right to left (#375)
* broadcast from right to left
* add another broadcasted add test
2022-09-06 16:36:13 -07:00
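"Broadcast from right to left" in the commit above (#375) presumably refers to the NumPy-style rule in which shapes are aligned at their trailing dimension and missing leading dimensions are treated as 1. A minimal sketch of that rule, assuming NumPy semantics rather than quoting the project's code (`broadcast_shape` is a hypothetical name):

```python
from itertools import zip_longest

def broadcast_shape(a, b):
    """Compute the broadcast shape of two shapes, aligning from the right."""
    out = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(y if x == 1 else x)
    return tuple(reversed(out))

print(broadcast_shape((3, 4), (4,)))       # trailing 4 matches, leading 3 is kept
print(broadcast_shape((5, 1, 4), (3, 1)))  # size-1 dimensions stretch
```

Under this rule a `(3, 4)` tensor can be added to a `(4,)` tensor but not to a `(3,)` tensor, since the trailing dimensions 4 and 3 disagree.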
George Hotz
bcb867cdd6
better idea for numbers, do the division in python
2022-09-03 16:23:39 -07:00
George Hotz
033a3ecccf
found tinygrad bug
2022-09-03 12:32:43 -07:00
George Hotz
7f15779942
t.assign in optim
2022-08-20 14:04:33 -07:00
George Hotz
1eb12dafbc
reduce axis at the end
2022-08-20 07:40:56 -07:00
George Hotz
b132de677d
tinygrad.nn (#367)
* tinygrad.nn
* flake8
* working on pylint
* more pylint
* more pylint
* pylint passes
* networkx
* mypy can't infer that type
* junk
2022-08-18 07:41:00 -07:00
George Hotz
18fde22dac
fix that soon
2022-07-20 09:07:09 -07:00
George Hotz
5d45c6e516
Fold reduce (#362)
* folding reduce
* fold through movementops
* fixup shapes
* was too aggressive
* i knew we needed that
* don't recompute reduce
* working
* fix openpilot compile
* prunegraph openpilot
* types and reduce_shape
* refactor
* cleanups
* neater
* 1009
* 1004
* clean up reduce for 998
2022-07-19 09:24:02 -07:00
George Hotz
f76d41812b
prune graph
2022-07-17 15:38:43 -07:00
George Hotz
73b0471b25
join expands
2022-07-17 13:42:05 -07:00
George Hotz
cfabbbd6bb
more crap to remove without convs
2022-07-17 13:02:27 -07:00
George Hotz
5e96ed523a
fix opencl bug, no training on opencl
2022-07-17 12:55:26 -07:00
George Hotz
f93e297804
fix bug caused by rounding
2022-07-17 12:49:58 -07:00
George Hotz
cff297ef9d
w/e, that's a later prob
2022-07-17 12:32:50 -07:00
George Hotz
6375e7129a
opencl not imported
2022-07-17 12:14:39 -07:00
George Hotz
bf299802f8
fixup tests
2022-07-17 12:11:53 -07:00
George Hotz
3c4565fa21
SLICE -> PAD,SHRINK
2022-07-17 11:33:59 -07:00
George Hotz
cca089b11d
Revert "more expand -> repeat"
This reverts commit 2e7b1630a8.
2022-07-17 08:41:48 -07:00
George Hotz
2e7b1630a8
more expand -> repeat
2022-07-17 08:40:49 -07:00
George Hotz
d04b274cd2
noop removal can replace with reshape
2022-07-16 08:32:42 -07:00
George Hotz
bcf422dfdd
Device2 (#358)
* option for matmul
* fixups
* fast like a nascar
* running
* thneed runner
* no buffer id makes no backing buffer
* move constant folding to the top
* runs on mac
* folded biases
* was v slow
* maybe just that
* elu touchup
* speed and float32
Co-authored-by: Comma Device <device@comma.ai>
2022-07-16 07:26:19 -07:00
George Hotz
5e46561f7e
no_grad = NOT backward
2022-07-10 20:54:57 -07:00
George Hotz
b34ae7876f
lol chr(10) not chr(13)
2022-07-10 20:03:11 -07:00
George Hotz
44848ee5dc
prints show we can precompute from the outside
2022-07-08 10:59:20 -07:00
George Hotz
04e7e4104c
track graph children and make lazycache use weak references
2022-07-07 11:01:18 -07:00
George Hotz
001cfe83a2
local
2022-07-07 10:05:26 -07:00
George Hotz
2720ef49ca
extra and test and tuple
2022-07-07 10:01:33 -07:00
George Hotz
81b73f97a3
Optimization (#355)
* constant folding into kernels
* that opt worth it?
* fix mypy
* ast one kernel
* save 2 lines in conv kernel
* debug print kernel count
* cl debugging
* early realize inputs
* refactor Device
2022-07-04 08:58:57 -07:00