George Hotz
a5a55ac19e
GlobalCounters cache + assign in optim
2023-02-08 17:10:55 -06:00
George Hotz
3d63934995
refactor to keep cl in the runtime ( #545 )
* refactor to keep cl in the runtime
* fix thneed, rename cl to _cl
* bugfix + _cuda
* fix tests
* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
2844482a60
Mypy fun ( #541 )
* mypy fun
* things are just faster
* running fast
* mypy is fast
* compile.sh
* no gpu hack
* refactor ops_cpu and ops_torch to not subclass
* make weak buffer work
* tensor works
* fix test failing
* cpu/torch cleanups
* no or operator on dict in python 3.8
* that was junk
* fix warnings
* comment and touchup
2023-02-08 09:56:51 -06:00
George Hotz
f7291f6ca3
fixes big KOPT, breaks opencl ( #505 )
* fixes big KOPT, breaks opencl
* fix optimizer
* KernelCache
* oops, broke batchnorm
* hack to fix it
* fix llvm, less hacky gpu
* disable the cache
* cache just breaks things
2023-02-05 10:46:17 -08:00
James Roberts
db0a9b0a2d
Refactor CL.time_sum into GlobalCounters ( #519 )
2023-02-01 20:13:56 -08:00
George Hotz
5e37f084db
stable diffusion: clean up constant folding
2023-02-01 12:53:16 -08:00
Jacky Lee
486f023e81
Rename Normalize and move to nn ( #513 )
* Rename Normalize and move to nn
* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
Jacky Lee
799b3f185a
Refactor getenv into helpers ( #508 )
* Refactor getenv into helpers
* Remove unused os
* Fix default value
* Fix more defaults for CI
* Fix bracket
* Revert changes to openpilot/compile.py
* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
21f2af08d5
getenv + graphing
2023-01-30 19:15:03 -08:00
George Hotz
60ccddb58b
reenable SWAP
2023-01-30 17:32:02 -08:00
George Hotz
aea55eb196
found failing upcast
2023-01-30 16:12:56 -08:00
George Hotz
7ee0d99c70
CLCACHE
2023-01-30 14:02:06 -08:00
George Hotz
cccfea4b25
factor out KOPT code
2023-01-30 13:13:55 -08:00
George Hotz
de2c419fd4
make_pair and first attempt at hlb_cifar10
2023-01-30 11:07:23 -08:00
AllentDan
7b6b1f32b1
[Fix] fix typo: test_mnist -> datasets ( #492 )
* test_mnist -> datasets
* fix mnist_gan
2023-01-29 21:30:47 -08:00
George Hotz
2db272c7f7
Kernel Optimizer ( #489 )
* kernel optimizer
* 10x faster, but wrong. not good deal
* move test -> extra
* print x speedup
* clcache
* fix clcache + DEBUG
* GFLOPS estimate
* i==3
2023-01-29 17:15:00 -08:00
George Hotz
66da3bc3c0
reset the benchmark timer
2023-01-25 09:20:34 -08:00
George Hotz
487685919b
Revert "Rename Normalize and move to nn ( #415 )" ( #474 )
This reverts commit d768acb6a9.
2023-01-25 07:50:04 -08:00
Jacky Lee
d768acb6a9
Rename Normalize and move to nn ( #415 )
* Rename Normalize and move to nn
* Fix comparison to None error
* Add test for GroupNorm
* Rename test case
* Flip parameters to match PyTorch
* Increase error tolerance
* Fix elementwise_affine on channels
* Match arguments with PyTorch
* Initialize weight and bias only when affine is true
* Is this it?
* A bit cleaner
* Handle case where weight or bias is None
2023-01-25 07:47:59 -08:00
George Hotz
6d7658db12
delete opencl <celebration>
2023-01-24 14:18:35 -08:00
nogira
2e744ef2f2
confirmed ( #449 )
w/ a bunch of print statements in the official model here: ce05de2819/ldm/modules/diffusionmodules/openaimodel.py (L413)
2023-01-07 08:41:06 -08:00
Drew Hintz
165fb4d631
remove redundant list comprehension from inside all. ( #397 )
remove explicit inherit from object.
2022-10-13 09:58:35 -07:00
George Hotz
178ba50c03
some args for stable diffusion
2022-09-29 01:52:04 -04:00
George Hotz
a0d169eb59
fix efficientnet
2022-09-28 14:23:01 -07:00
George Hotz
60df954377
Fix weight init: this work? ( #391 )
* this work?
* glorot uniform
* requires_grad broke
* propagate the None correctly
* so this weight init works
* ahh, i think it's this
* can't beat this
* glorot is best for ae
* remove comments
2022-09-25 16:46:33 -04:00
Jacky Lee
2c01a66265
Reshape dataset from fetch_mnist ( #390 )
2022-09-24 21:16:29 -04:00
George Hotz
894a7cee79
forgot a few
2022-09-12 09:21:46 -07:00
George Hotz
801ecd4a07
cleanup clip tokenizer
2022-09-12 09:20:12 -07:00
Fernand Pajot
ff0da4c802
Added standalone CLIP tokenizer ( #382 )
* Added standalone CLIP tokenizer.
* Fixed empty phrase.
* Truncating long prompts.
* Keeping two slots for the start and end token.
* Fixed empty phrase.
* Using tokenizer for empty phrase.
* Typo.
2022-09-12 09:12:55 -07:00
David Redmon
a1810c8617
update serious_mnist.py ( #380 )
2022-09-11 13:37:40 -07:00
George Hotz
ecc1a0470d
add Linear to tinygrad.nn
2022-09-07 07:40:48 -07:00
George Hotz
896f9f74a9
hmm, need this with broadcast change
2022-09-06 16:54:01 -07:00
George Hotz
a18a6a0773
fix sd with TORCH=1
2022-09-06 16:51:16 -07:00
George Hotz
0516359af8
fix stupid OPENCL=1 OOM
2022-09-06 14:29:23 -07:00
George Hotz
f215534a64
1100 lines, but sane linter rules
2022-09-06 13:47:45 -07:00
George Hotz
682dc64430
works at work
2022-09-06 08:06:11 -07:00
George Hotz
d6f499fd69
improve opencl, why is it OOMing
2022-09-05 20:14:31 -07:00
George Hotz
0ba6179de7
stable diffusion in readme
2022-09-05 18:51:56 -07:00
George Hotz
c1d5af8b0c
stable diffusion cleanups
2022-09-05 18:34:13 -07:00
George Hotz
3728ef6d02
better alphas
2022-09-05 16:48:26 -07:00
George Hotz
0fda854b3e
other prompt example
2022-09-05 16:14:16 -07:00
George Hotz
16cb4290c4
cat horse winning ❗
2022-09-05 16:05:14 -07:00
George Hotz
1043fa067a
it renders something
2022-09-05 15:52:14 -07:00
George Hotz
5a685b93ac
brown img
2022-09-05 15:20:18 -07:00
George Hotz
98d6264987
all models match
2022-09-05 12:27:54 -07:00
George Hotz
b8bd34b5d2
fix last bug in unet probz
2022-09-05 11:32:44 -07:00
George Hotz
3df67aa0af
fix transformer bugs
2022-09-05 11:26:32 -07:00
George Hotz
2ed3bb6223
clip model is running
2022-09-05 11:26:32 -07:00
George Hotz
1a54ea2417
runs on torch cpu
2022-09-04 12:06:42 -07:00
George Hotz
9590d92750
stable diffusion compiles (add no_init)
2022-09-04 11:40:50 -07:00