Commit Graph

500 Commits

Author SHA1 Message Date
George Hotz
1826ff6b89 dtypes nice and clean (#673)
* add dtype class

* dtypes

* buffers are lazy

* dtype is tracked by lazybuffer and GenericShape

* fix types in llvm

* llvm store

* dtype tests

* fix tests maybe

* fix flop counter

* fix CI

* CI fix and check format

* fix dtype and dtype check

* fix custom test

* fix test graph
2023-03-10 16:56:07 -08:00
George Hotz
036737a12a mem_estimate tracks bytes, not items 2023-03-10 09:44:12 -08:00
George Hotz
1a039306d2 good changes from llama branch (#671)
* good changes from llama

* transpose behavior changed
2023-03-09 20:51:22 -08:00
George Hotz
dbbaa0bdd7 int32, and refactor pad/shrink 2023-03-09 12:57:17 -08:00
George Hotz
fb5ee9260f add pad tests to shapetracker 2023-03-09 12:51:18 -08:00
George Hotz
022c5835fc fix GPU import error and old python Tuple 2023-03-08 12:22:11 -08:00
George Hotz
c22afc52db move the custom function example to a test 2023-03-08 10:05:04 -08:00
George Hotz
00641aa45d add challenge tests 2023-03-07 19:39:04 -08:00
George Hotz
e0244baf60 3 letters for graph op 2023-03-07 19:20:48 -08:00
George Hotz
4eb880550f enable contract test 2023-03-07 17:32:28 -08:00
Alex Wang
d885d2d0f5 Allow 1s for contraction detection (#663)
* Allow 1s for contraction check

* More test cases for 1s
2023-03-07 17:31:28 -08:00
George Hotz
b561256a0e allow all reduces (#661)
* allow all reduces

* push permute tests

* explicit permute reshape push

* contractw1s
2023-03-07 15:36:01 -08:00
George Hotz
b14d31d6db ConvNeXt + extras (#657)
* simple convnext implementation

* shorter function names

* need to realize the random functions now

* creating an optimizer realizes all params

* assign contiguous

* fix lazy lazy

* why was i doing that...add convnext to tests

* LazyNumpyArray

* enable assert + comment

* no two tiny
2023-03-06 22:10:56 -08:00
George Hotz
8c5dea8d72 fix CUDA float4 issues 2023-03-06 07:16:38 -08:00
George Hotz
7dbcc26582 fix up external tests 2023-03-06 06:52:28 -08:00
George Hotz
50012f679b move get_contraction to shapetracker 2023-03-06 06:42:57 -08:00
Alex Wang
64ecbd91b5 Refactor contraction and add integration test cases for push permute (#650)
* Refactor contraction and add unit tests

* Fix typo; Fix TestConv.test_elu failure due to some ones in old_shape

* Add push permute test cases

* Fix mypy type annotation check error

* Add contraction unit test; Reshape to higher dimension is not contraction
2023-03-06 06:36:55 -08:00
George Hotz
382f346523 clean up opt (#649)
* clean up opt

* don't let global kernels get too small

* 8192 -> 1024

* disable local shape for clang

* fix can_merge

* unroll the 5x5 depthwise convs in op

* load float4 check
2023-03-05 20:49:36 -08:00
George Hotz
7930c6ab5c CLImage backing bug + test_vec_mul 2023-03-05 16:32:05 -08:00
George Hotz
8de24e3b05 accumulator can be a float4 (#647)
* remove reduceopop

* not float4 yet

* float4 acc works

* group_float4 on store
2023-03-05 15:44:41 -08:00
George Hotz
7940ad258e fix dropout test 2023-03-05 12:24:04 -08:00
George Hotz
b1ba78ac38 move applegpu disassembler 2023-03-05 11:21:12 -08:00
George Hotz
16b03f3c3b wow, can't believe that was broken (#642)
* wow, can't believe that was broken

* remove namedtuple comment
2023-03-04 22:28:28 -08:00
George Hotz
4a607f7d65 more ext gpu tests 2023-03-04 21:00:08 -08:00
George Hotz
69198a73d2 test_1x1_24_6 2023-03-04 20:37:46 -08:00
George Hotz
b02a392d69 Improve local (#635)
* local is improving

* local is finding bugs

* new local should work
2023-03-04 09:30:49 -08:00
George Hotz
528cb3b3b9 fix ast test 2023-03-04 07:49:25 -08:00
George Hotz
28a6ada4ce line reduction in metal 2023-03-03 23:14:40 -08:00
George Hotz
7a1d96fd76 No negative (#632)
* behavior is correct without VALIDHACKS

* simple div and mod

* fix tests

* no negative variables

* alt form is correct

* still correct

* bug in mulnode

* at least validhacks works now

* cleanups

* test validhacks, and to_image_idx

* cache compare key

* tests and __neg__
2023-03-03 16:48:14 -08:00
George Hotz
8c475ea86a relax atol, merge_view 2023-03-03 07:48:44 -08:00
George Hotz
b9ce20c374 openpilot test wasn't running, factor out image idx 2023-03-03 07:41:53 -08:00
George Hotz
9bd2cdee08 skip broken bn training test for speed 2023-03-03 06:52:11 -08:00
George Hotz
999b44c274 fix external test + speed 2023-03-03 06:46:16 -08:00
George Hotz
8919ca8163 test cleanups 2023-03-03 06:36:06 -08:00
George Hotz
3915c89fb6 symbolic improvements (#629)
* fixups

* shorter diff

* wow, okay removing that had side effects

* more numeric tests

* MIN MAX tests
2023-03-02 19:50:38 -08:00
George Hotz
0335cb86b9 refactor comparison. there's a bug in the method cache 2023-03-02 10:10:16 -08:00
George Hotz
11f257ddf9 clean up cmp tests 2023-03-02 09:35:16 -08:00
Diogo
52204a7b88 adding comparison operators (#616)
* Less, LessOrEqual, Greater, GreaterOrEqual, Equal

* lint fix

* using built in functions

* overriding __eq__ breaks things

* backwards pass for less - forward only tests

* one other spot

* removing backwards for comparison ops to match pytorch

* raise runtime error

* more tests for comparison ops

* fixed the lineup

* added number upcast tests
2023-03-02 08:10:44 -08:00
George Hotz
2e26286294 speed like you wouldn't believe (#626)
* speed like you wouldn't believe

* fix tests
2023-03-02 07:49:19 -08:00
George Hotz
fca055bd66 NOOP means contiguous 2023-03-01 21:54:51 -08:00
George Hotz
d062cc82b8 put restrict back 2023-03-01 21:34:45 -08:00
George Hotz
b442e75c7a test speed v torch 2023-03-01 19:50:12 -08:00
George Hotz
bfcec234a2 Refactor ASTs (#622)
* ugh worst branch name

* compiler refactor continues

* scc -> cloc

* buf -> _buf

* finish _buf, and program -> runtime

* gpu is still working, clang isn't

* clang in new style

* ops_metal

* something broke it

* improve metal

* clean up tons of cl crap

* hack fix sync

* cleaner gpu

* gpu metal clang

* cleanups

* minor refactor

* GPUCodegen

* fix up LLVM

* blind CUDA refactor

* codegen / runtime

* keep ops naming

* linter passes

* woah, llvm was allocing 4x what it needed to

* bugfixes

* fix openpilot compiler

* fix compile_efficientnet

* method cache should fix tests

* deal with duped functions
2023-03-01 18:57:29 -08:00
Jacky Lee
5e41d5857c Add tests for randomness (#621)
* Add tests for random creation functions

* It worked on my machine!

* Rename to helper_same_distribution

* Remove extra line

* Add tests for equal distribution

* Test without scipy

* Do a different test for randn
2023-03-01 15:39:20 -08:00
George Hotz
e9e71fbfc4 remove mlop (#619)
* remove mlop

* lil simpler
2023-02-28 17:58:24 -08:00
George Hotz
4c4d88aad4 fix the last bug, and make HLOP the default 2023-02-28 17:04:28 -08:00
George Hotz
17c55f051d fix test symbolic 2023-02-28 16:37:08 -08:00
George Hotz
28f52f7c24 improve symbolic 2023-02-28 16:21:58 -08:00
George Hotz
e21df1701b distribute + refactor merge_views 2023-02-28 14:57:56 -08:00
George Hotz
9d539b8ebb more intuitive output shape from _pool 2023-02-28 11:41:48 -08:00