1206 Commits

Author SHA1 Message Date
George Hotz
a5a55ac19e GlobalCounters cache + assign in optim 2023-02-08 17:10:55 -06:00
George Hotz
3d63934995 refactor to keep cl in the runtime (#545)
* refactor to keep cl in the runtime

* fix thneed, rename cl to _cl

* bugfix + _cuda

* fix tests

* thneed more correct
2023-02-08 16:46:09 -06:00
George Hotz
2844482a60 Mypy fun (#541)
* mypy fun

* things are just faster

* running fast

* mypy is fast

* compile.sh

* no gpu hack

* refactor ops_cpu and ops_torch to not subclass

* make weak buffer work

* tensor works

* fix test failing

* cpu/torch cleanups

* no or operator on dict in python 3.8

* that was junk

* fix warnings

* comment and touchup
2023-02-08 09:56:51 -06:00
George Hotz
f7291f6ca3 fixes big KOPT, breaks opencl (#505)
* fixes big KOPT, breaks opencl

* fix optimizer

* KernelCache

* oops, broke batchnorm

* hack to fix it

* fix llvm, less hacky gpu

* disable the cache

* cache just breaks things
2023-02-05 10:46:17 -08:00
James Roberts
db0a9b0a2d Refactor CL.time_sum into GlobalCounters (#519) 2023-02-01 20:13:56 -08:00
George Hotz
5e37f084db stable diffusion: clean up constant folding 2023-02-01 12:53:16 -08:00
Jacky Lee
486f023e81 Rename Normalize and move to nn (#513)
* Rename Normalize and move to nn

* Match PyTorch for dim>1
2023-02-01 11:55:03 -08:00
Jacky Lee
799b3f185a Refactor getenv into helpers (#508)
* Refactor getenv into helpers

* Remove unused os

* Fix default value

* Fix more defaults for CI

* Fix bracket

* Revert changes to openpilot/compile.py

* Use getenv from helpers when possible
2023-01-31 15:09:09 -08:00
George Hotz
21f2af08d5 getenv + graphing 2023-01-30 19:15:03 -08:00
George Hotz
60ccddb58b reenable SWAP 2023-01-30 17:32:02 -08:00
George Hotz
aea55eb196 found failing upcast 2023-01-30 16:12:56 -08:00
George Hotz
7ee0d99c70 CLCACHE 2023-01-30 14:02:06 -08:00
George Hotz
cccfea4b25 factor out KOPT code 2023-01-30 13:13:55 -08:00
George Hotz
de2c419fd4 make_pair and first attempt at hlb_cifar10 2023-01-30 11:07:23 -08:00
AllentDan
7b6b1f32b1 [Fix] fix typo: test_mnist -> datasets (#492)
* test_mnist -> datasets

* fix mnist_gan
2023-01-29 21:30:47 -08:00
George Hotz
2db272c7f7 Kernel Optimizer (#489)
* kernel optimizer

* 10x faster, but wrong. not good deal

* move test -> extra

* print x speedup

* clcache

* fix clcache + DEBUG

* GFLOPS estimate

* i==3
2023-01-29 17:15:00 -08:00
George Hotz
66da3bc3c0 reset the benchmark timer 2023-01-25 09:20:34 -08:00
George Hotz
487685919b Revert "Rename Normalize and move to nn (#415)" (#474)
This reverts commit d768acb6a9.
2023-01-25 07:50:04 -08:00
Jacky Lee
d768acb6a9 Rename Normalize and move to nn (#415)
* Rename Normalize and move to nn

* Fix comparison to None error

* Add test for GroupNorm

* Rename test case

* Flip parameters to match PyTorch

* Increase error tolerance

* Fix elementwise_affine on channels

* Match arguments with PyTorch

* Initialize weight and bias only when affine is true

* Is this it?

* A bit cleaner

* Handle case where weight or bias is None
2023-01-25 07:47:59 -08:00
George Hotz
6d7658db12 delete opencl <celebration> 2023-01-24 14:18:35 -08:00
nogira
2e744ef2f2 confirmed (#449)
w/ a bunch of print statements in the official model here: ce05de2819/ldm/modules/diffusionmodules/openaimodel.py (L413)
2023-01-07 08:41:06 -08:00
Drew Hintz
165fb4d631 remove redundant list comprehension from inside all. (#397)
remove explicit inherit from object.
2022-10-13 09:58:35 -07:00
George Hotz
178ba50c03 some args for stable diffusion 2022-09-29 01:52:04 -04:00
George Hotz
a0d169eb59 fix efficientnet 2022-09-28 14:23:01 -07:00
George Hotz
60df954377 Fix weight init: this work? (#391)
* this work?

* glorot uniform

* requires_grad broke

* propagate the None correctly

* so this weight init works

* ahh, i think it's this

* can't beat this

* glorot is best for ae

* remove comments
2022-09-25 16:46:33 -04:00
Jacky Lee
2c01a66265 Reshape dataset from fetch_mnist (#390) 2022-09-24 21:16:29 -04:00
George Hotz
894a7cee79 forgot a few 2022-09-12 09:21:46 -07:00
George Hotz
801ecd4a07 cleanup clip tokenizer 2022-09-12 09:20:12 -07:00
Fernand Pajot
ff0da4c802 Added standalone CLIP tokenizer (#382)
* Added standalone CLIP tokenizer.

* Fixed empty phrase.

* Truncating long prompts.

* Keeping two slots for the start and end token.

* Fixed empty phrase.

* Using tokenizer for empty phrase.

* Typo.
2022-09-12 09:12:55 -07:00
David Redmon
a1810c8617 update serious_mnist.py (#380) 2022-09-11 13:37:40 -07:00
George Hotz
ecc1a0470d add Linear to tinygrad.nn 2022-09-07 07:40:48 -07:00
George Hotz
896f9f74a9 hmm, need this with broadcast change 2022-09-06 16:54:01 -07:00
George Hotz
a18a6a0773 fix sd with TORCH=1 2022-09-06 16:51:16 -07:00
George Hotz
0516359af8 fix stupid OPENCL=1 OOM 2022-09-06 14:29:23 -07:00
George Hotz
f215534a64 1100 lines, but sane linter rules 2022-09-06 13:47:45 -07:00
George Hotz
682dc64430 works at work 2022-09-06 08:06:11 -07:00
George Hotz
d6f499fd69 improve opencl, why is it OOMing 2022-09-05 20:14:31 -07:00
George Hotz
0ba6179de7 stable diffusion in readme 2022-09-05 18:51:56 -07:00
George Hotz
c1d5af8b0c stable diffusion cleanups 2022-09-05 18:34:13 -07:00
George Hotz
3728ef6d02 better alphas 2022-09-05 16:48:26 -07:00
George Hotz
0fda854b3e other prompt example 2022-09-05 16:14:16 -07:00
George Hotz
16cb4290c4 cat horse winning 2022-09-05 16:05:14 -07:00
George Hotz
1043fa067a it renders something 2022-09-05 15:52:14 -07:00
George Hotz
5a685b93ac brown img 2022-09-05 15:20:18 -07:00
George Hotz
98d6264987 all models match 2022-09-05 12:27:54 -07:00
George Hotz
b8bd34b5d2 fix last bug in unet probz 2022-09-05 11:32:44 -07:00
George Hotz
3df67aa0af fix transformer bugs 2022-09-05 11:26:32 -07:00
George Hotz
2ed3bb6223 clip model is running 2022-09-05 11:26:32 -07:00
George Hotz
1a54ea2417 runs on torch cpu 2022-09-04 12:06:42 -07:00
George Hotz
9590d92750 stable diffusion compiles (add no_init) 2022-09-04 11:40:50 -07:00