George Hotz
178ba50c03
some args for stable diffusion
2022-09-29 01:52:04 -04:00
Ollin Boer Bohan
3b1767e013
Fix OpenCL Metal texture issues (#378)
...
* Fix OpenCL Metal texture issues
Tile CL images when needed, to fit into the 16384 max Metal image size;
gets me to ~4.8s/iteration for SD on M1 Pro with OPENCL=1 FLOAT16=1.
* Minor cleanup
* Fix mish in CI, or no-op?
* Is mish being framed?
* It would help if any of this reproduced locally
* ???
* OPT is reverted; use original mish
* Cleanup post-review
* Fix some shape usage
* Tiler tests, shouldn't oom or overflow either
* Can't CL if there's no CL?
* Run tiler tests even if GPU=1
* relu6 segfault binary chop; revert test
* relu6 segfault binary chop; revert accel
* relu6 segfault binary chop; revert . (???)
* end relu6 segfault binary chop; repo's haunted
2022-09-29 01:21:54 -04:00
George Hotz
e737513c52
external_test_opt
2022-09-28 23:29:41 -04:00
George Hotz
650c011646
notrain test
2022-09-28 23:27:20 -04:00
George Hotz
af87d692e4
should this be 10?
2022-09-28 23:25:52 -04:00
George Hotz
0fd459b24e
ugh, global state
2022-09-28 23:10:49 -04:00
George Hotz
fa4eff9cc1
Device.GPU isn't defined
2022-09-28 23:00:15 -04:00
George Hotz
0b6537a572
fix tests
2022-09-28 22:57:58 -04:00
George Hotz
726cca78cd
fix bn folding issue, add new test
2022-09-28 22:52:18 -04:00
George Hotz
a0d169eb59
fix efficientnet
2022-09-28 14:23:01 -07:00
George Hotz
dec5334da9
revert layernorm to have axis param
2022-09-26 10:11:38 -04:00
George Hotz
dc80bf6f85
layernorm is all axis but the first
2022-09-25 17:55:48 -04:00
George Hotz
60df954377
Fix weight init: this work? (#391)
...
* this work?
* glorot uniform
* requires_grad broke
* propagate the None correctly
* so this weight init works
* ahh, i think it's this
* can't beat this
* glorot is best for ae
* remove comments
2022-09-25 16:46:33 -04:00
George Hotz
ff11c4316b
move get_parameters to optim.py
2022-09-25 13:16:58 -04:00
George Hotz
a0c0239ff1
fix mnist load from other dirs
2022-09-25 12:50:28 -04:00
Jacky Lee
2c01a66265
Reshape dataset from fetch_mnist (#390)
2022-09-24 21:16:29 -04:00
George Hotz
acae9a20c1
clipnorm support
2022-09-24 13:26:38 -04:00
George Hotz
271446e3eb
set requires_grad to None (#387)
...
* set requires_grad to None
* some things need gradients
* hmm, why was get_parameters filtering
2022-09-21 11:16:02 -04:00
George Hotz
29ae21bb0d
import tests from CL metal texture fix
2022-09-19 20:01:47 -04:00
George Hotz
a8aa1f9589
that's simpler
2022-09-18 20:40:46 -04:00
George Hotz
57e804a9bf
add min support
2022-09-18 20:39:41 -04:00
YassineYousfi
2f0f91ba3d
support float16 onnx weights (#384)
2022-09-15 09:12:18 -04:00
Comma Device
75f937227a
add barrier
2022-09-13 11:39:48 -04:00
George Hotz
3c3534736e
fix matmul kernel and tests
2022-09-13 08:31:04 -07:00
Comma Device
62e9419206
fix test failure on MATMUL=1 backward pass
2022-09-13 11:18:52 -04:00
Comma Device
3b82afc6a0
simple on device failing test
2022-09-13 10:59:15 -04:00
George Hotz
4efde1ba0a
test_matmul
2022-09-13 07:51:33 -07:00
George Hotz
894a7cee79
forgot a few
2022-09-12 09:21:46 -07:00
George Hotz
801ecd4a07
cleanup clip tokenizer
2022-09-12 09:20:12 -07:00
Fernand Pajot
ff0da4c802
Added standalone CLIP tokenizer (#382)
...
* Added standalone CLIP tokenizer.
* Fixed empty phrase.
* Truncating long prompts.
* Keeping two slots for the start and end token.
* Fixed empty phrase.
* Using tokenizer for empty phrase.
* Typo.
2022-09-12 09:12:55 -07:00
David Redmon
a1810c8617
update serious_mnist.py (#380)
2022-09-11 13:37:40 -07:00
George Hotz
ce348f0c92
Revert "change default opt to 2"
...
This reverts commit 726f4e98e9.
2022-09-11 13:35:42 -07:00
George Hotz
726f4e98e9
change default opt to 2
2022-09-09 07:50:25 -07:00
YassineYousfi
1a7bdc51f8
support more onnx ops (#376)
...
* broadcast from right to left
* add another broadcasted add test
* more onnx ops
* use float32 range in clip
2022-09-07 15:15:24 -07:00
George Hotz
0b8c2221b5
relax mnist test a tiny bit
2022-09-07 07:52:05 -07:00
George Hotz
ecc1a0470d
add Linear to tinygrad.nn
2022-09-07 07:40:48 -07:00
George Hotz
d26bd73c1e
have to ignore that type
2022-09-07 07:24:27 -07:00
George Hotz
b7783565af
cpu line savings and cleaner
2022-09-06 21:24:22 -07:00
George Hotz
1c92a6da22
make gpu code readable
2022-09-06 21:17:36 -07:00
George Hotz
790af99a48
fix slice one multi, and linear can be simpler with new broadcasting
2022-09-06 19:51:33 -07:00
George Hotz
4f4ecbec97
add div to operators
2022-09-06 17:39:26 -07:00
George Hotz
5a76e652b8
simpler movement op
2022-09-06 17:27:33 -07:00
George Hotz
896f9f74a9
hmm, need this with broadcast change
2022-09-06 16:54:01 -07:00
George Hotz
a18a6a0773
fix sd with TORCH=1
2022-09-06 16:51:16 -07:00
YassineYousfi
5aad460c7a
broadcast from right to left (#375)
...
* broadcast from right to left
* add another broadcasted add test
2022-09-06 16:36:13 -07:00
George Hotz
0516359af8
fix stupid OPENCL=1 OOM
2022-09-06 14:29:23 -07:00
George Hotz
f215534a64
1100 lines, but sane linter rules
2022-09-06 13:47:45 -07:00
George Hotz
682dc64430
works at work
2022-09-06 08:06:11 -07:00
George Hotz
f683b26eef
bring back native exp log
2022-09-06 07:59:04 -07:00
George Hotz
d6f499fd69
improve opencl, why is it OOMing
2022-09-05 20:14:31 -07:00