1642 Commits

Author SHA1 Message Date
George Hotz
893f136fe0 lines from helpers 2023-03-03 23:07:46 -08:00
George Hotz
81cda2b672 zero out s == 1 strides 2023-03-03 22:57:02 -08:00
George Hotz
aef336c079 merge_views is very powerful 2023-03-03 22:53:59 -08:00
George Hotz
b5b4edf59b comments 2023-03-03 22:39:31 -08:00
George Hotz
cfb050e2d1 simple modrange, thanks Jacky 2023-03-03 22:37:04 -08:00
George Hotz
3dab721f9f lazy cleanup 2023-03-03 22:01:03 -08:00
George Hotz
c53efb3635 optimize for CL (#633)
* required opt

* simplify

* works

* shift_to_last

* required is fine

* print shape in colored

* better shape

* args was wrong

* debugs

* fix empty shape

* colored shape printer
2023-03-03 22:00:09 -08:00
Pankaj Doharey
9d97d97b26 Opens image in default viewer after saving. (#612) 2023-03-03 17:28:49 -08:00
George Hotz
1a84976d4d fix thneed gflops 2023-03-03 16:52:59 -08:00
George Hotz
7a1d96fd76 No negative (#632)
* behavior is correct without VALIDHACKS

* simple div and mod

* fix tests

* no negative variables

* alt form is correct

* still correct

* bug in mulnode

* at least validhacks works now

* cleanups

* test validhacks, and to_image_idx

* cache compare key

* tests and __neg__
2023-03-03 16:48:14 -08:00
George Hotz
8c475ea86a relax atol, merge_view 2023-03-03 07:48:44 -08:00
George Hotz
b9ce20c374 openpilot test wasn't running, factor out image idx 2023-03-03 07:41:53 -08:00
George Hotz
9bd2cdee08 skip broken bn training test for speed 2023-03-03 06:52:11 -08:00
George Hotz
999b44c274 fix external test + speed 2023-03-03 06:46:16 -08:00
George Hotz
8919ca8163 test cleanups 2023-03-03 06:36:06 -08:00
George Hotz
459488bba2 fix linter (#630)
* fix linter

* no imports okay

* explicit bases

* disable in pylintrc
2023-03-02 20:06:20 -08:00
George Hotz
3915c89fb6 symbolic improvements (#629)
* fixups

* shorter diff

* wow, okay removing that had side effects

* more numeric tests

* MIN MAX tests
2023-03-02 19:50:38 -08:00
George Hotz
b842cdf11f support wait in cuda 2023-03-02 10:39:26 -08:00
George Hotz
dc88ad3342 fix ops print bug 2023-03-02 10:33:03 -08:00
George Hotz
0335cb86b9 refactor comparison. there's a bug in the method cache 2023-03-02 10:10:16 -08:00
George Hotz
11f257ddf9 clean up cmp tests 2023-03-02 09:35:16 -08:00
George Hotz
8902764167 fit nits in compare 2023-03-02 08:15:26 -08:00
Diogo
52204a7b88 adding comparison operators (#616)
* Less, LessOrEqual, Greater, GreaterOrEqual, Equal

* lint fix

* using built in functions

* overriding __eq__ breaks things

* backwards pass for less - foward only tests

* one other spot

* removing backwards for comparison ops to match pytorch

* raise runtime error

* more tests for comparison ops

* fixed the lineup

* added number upcast tests
2023-03-02 08:10:44 -08:00
George Hotz
2e26286294 speed like you wouldn't believe (#626)
* speed like you wouldn't believe

* fix tests
2023-03-02 07:49:19 -08:00
Martin Loretz
51fb6aeb45 Fix cuda runtime (#625) 2023-03-02 06:52:34 -08:00
George Hotz
fca055bd66 NOOP means contiguous 2023-03-01 21:54:51 -08:00
George Hotz
d062cc82b8 put restrict back 2023-03-01 21:34:45 -08:00
George Hotz
201d9a2d58 remove extra copy on output 2023-03-01 21:19:44 -08:00
George Hotz
b442e75c7a test speed v torch 2023-03-01 19:50:12 -08:00
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00
Jacky Lee
5e41d5857c Add tests for randomness (#621)
* Add tests for random creation functions

* It worked on my machine!

* Rename to helper_same_distribution

* Remove extra line

* Add tests for equal distribution

* Test without scipy

* Do a different test for randn
2023-03-01 15:39:20 -08:00
George Hotz
0055f0c2b3 touchups 2023-02-28 20:36:11 -08:00
George Hotz
f4aa3868e3 remove save_for_backward, there's still lines to save in the simplest places 2023-02-28 20:11:05 -08:00
George Hotz
7ff92550bb slice -> pad, shrink 2023-02-28 19:58:12 -08:00
George Hotz
ea3fa07c2a bump tinygrad to 0.5, move reshape logic from mlops 2023-02-28 18:07:03 -08:00
George Hotz
e9e71fbfc4 remove mlop (#619)
* remove mlop

* lil simpler
2023-02-28 17:58:24 -08:00
George Hotz
6b423b675d fix mulacc when both strides are 0 2023-02-28 17:27:52 -08:00
George Hotz
4c4d88aad4 fix the last bug, and make HLOP the default 2023-02-28 17:04:28 -08:00
George Hotz
fde6c2d62b fix image grouping 2023-02-28 16:50:46 -08:00
George Hotz
17c55f051d fix test symbolic 2023-02-28 16:37:08 -08:00
George Hotz
28f52f7c24 improve symbolic 2023-02-28 16:21:58 -08:00
George Hotz
1702a5779f remove hacks from can_merge 2023-02-28 15:30:20 -08:00
George Hotz
e21df1701b distribute + refactor merge_views 2023-02-28 14:57:56 -08:00
George Hotz
7e6edfbc64 unbreak onnx conv padding 2023-02-28 13:55:03 -08:00
George Hotz
7d556ca7e0 avg/max pool work in N-D 2023-02-28 13:38:27 -08:00
George Hotz
dcb50a3a9f better hlop image conv 2023-02-28 13:06:58 -08:00
George Hotz
9d539b8ebb more intuitive output shape from _pool 2023-02-28 11:41:48 -08:00
George Hotz
d722ffbd04 _pool2d -> _pool 2023-02-28 11:35:19 -08:00
George Hotz
3c8da6bd03 add typing 2023-02-28 10:54:46 -08:00
George Hotz
922f96e527 DeviceBuffer : shape can be correct type now 2023-02-28 10:08:55 -08:00