Commit Graph

10417 Commits

Author SHA1 Message Date
George Hotz
d41ac5f5f1 touchups 2023-03-11 11:41:34 -08:00
Cyril Roumégous
3f08613a2a apply flake8 E203 rule (#684) 2023-03-11 11:35:16 -08:00
Diogo
784afc6c6f Eq magic function support (#683)
* add eq magic func

* changed from eq to __eq__

* ignore type for linter

* mypy doesn't like descriptions :(
2023-03-11 10:31:46 -08:00
George Hotz
5ea44cefcc llama: add lexie personality 2023-03-11 10:23:33 -08:00
George Hotz
c908f911a7 llama defaults to metal on osx 2023-03-11 09:30:13 -08:00
George Hotz
fd65edf595 fix mem_estimate for dtype.itemsize 2023-03-11 09:20:05 -08:00
George Hotz
fe8c05b96f allow disabling method cache 2023-03-11 08:57:49 -08:00
George Hotz
5e1380df6a profiling llama + cache is_contiguous 2023-03-11 08:23:21 -08:00
George Hotz
01f39b19dc move to shapetracker.py 2023-03-11 07:50:07 -08:00
George Hotz
f3ac52aee8 Mypyc (#680)
* building shapetracker

* default ENABLE_METHOD_CACHE

* symbolic compiles

* improve types

* tensor compiles

* oops, that's a bug

* best of both worlds

* find legit typing bugs

* pad2d can take list or tuple

* sub 200ms when compiled
2023-03-11 07:33:30 -08:00
George Hotz
22905dd657 speedups from llama branch 2023-03-10 22:01:32 -08:00
George Hotz
0b03216cc3 losing lines (#678)
* losing lines

* FLIP -> STRIDE

* shapetracker refactor
2023-03-10 21:57:05 -08:00
George Hotz
d7cb8e3e56 multithreaded fake_torch_load_zipped 2023-03-10 19:16:27 -08:00
George Hotz
b1206bcb18 third try at torch loading (#677)
* third try at torch loading

* numpy fixed

* fix enet compile

* load_single_weight supports empty weights

* oops, CPU wasn't the default

* so many bugs
2023-03-10 19:11:29 -08:00
Connor Henderson
8b7a16cf85 Add conv binops_no_rerun test assertions (#665)
* Add conv binops_no_rerun assertions

* use assert_allclose

* widen tolerance for elu
2023-03-10 19:09:48 -08:00
George Hotz
8bf75a7fdd fix stable diffusion and CI 2023-03-10 17:48:12 -08:00
George Hotz
c7d17c25d9 ugh, that's getting ugly 2023-03-10 17:41:19 -08:00
George Hotz
4780f9a6df llama runs (slowly) in master 2023-03-10 17:36:51 -08:00
George Hotz
1826ff6b89 dtypes nice and clean (#673)
* add dtype class

* dtypes

* buffers are lazy

* dtype is tracked by lazybuffer and GenericShape

* fix types in llvm

* llvm store

* dtype tests

* fix tests maybe

* fix flop counter

* fix CI

* CI fix and check format

* fix dtype and dtype check

* fix custom test

* fix test graph
2023-03-10 16:56:07 -08:00
George Hotz
d26345595d more llama stuff 2023-03-10 10:48:10 -08:00
George Hotz
442e1bcd5a typo + EARLY_STOPPING 2023-03-10 10:43:07 -08:00
George Hotz
6142e63a3e touchups, print GB/s 2023-03-10 10:37:37 -08:00
George Hotz
036737a12a mem_estimate tracks bytes, not items 2023-03-10 09:44:12 -08:00
George Hotz
1a039306d2 good changes from llama branch (#671)
* good changes from llama

* transpose behavior changed
2023-03-09 20:51:22 -08:00
George Hotz
de1b6d3e08 check shrink is actually smaller 2023-03-09 12:59:45 -08:00
George Hotz
dbbaa0bdd7 int32, and refactor pad/shrink 2023-03-09 12:57:17 -08:00
George Hotz
fb5ee9260f add pad tests to shapetracker 2023-03-09 12:51:18 -08:00
jspieler
da7fb4b227 Fixed DDPG example (#667) 2023-03-09 11:49:52 -08:00
George Hotz
022c5835fc fix GPU import error and old python Tuple 2023-03-08 12:22:11 -08:00
George Hotz
c22afc52db move the custom function example to a test 2023-03-08 10:05:04 -08:00
George Hotz
7d3b9d0e95 oops, things relied on that API. the global cache needs access to the ASTRunner class 2023-03-08 08:39:31 -08:00
George Hotz
4f957423c3 jitting custom ops + OPTLOCAL assignment bugfix 2023-03-08 08:30:37 -08:00
George Hotz
7285de41a1 tinygrad supports CUSTOM functions 2023-03-08 07:50:33 -08:00
George Hotz
00641aa45d add challenge tests 2023-03-07 19:39:04 -08:00
George Hotz
e0244baf60 3 letters for graph op 2023-03-07 19:20:48 -08:00
George Hotz
46df02115d bring back SHUFFLE_PAD_OPS as OPT>=4 2023-03-07 17:42:34 -08:00
George Hotz
4eb880550f enable contract test 2023-03-07 17:32:28 -08:00
Alex Wang
d885d2d0f5 Allow 1s for contraction detection (#663)
* Allow 1s for contraction check

* More test cases for 1s
2023-03-07 17:31:28 -08:00
George Hotz
b561256a0e allow all reduces (#661)
* allow all reduces

* push permute tests

* explicit permute reshape push

* contractw1s
2023-03-07 15:36:01 -08:00
George Hotz
b14d31d6db ConvNeXt + extras (#657)
* simple convnext implementation

* shorter function names

* need to realize the random functions now

* creating an optimizer realizes all params

* assign contiguous

* fix lazy lazy

* why was i doing that...add convnext to tests

* LazyNumpyArray

* enable assert + comment

* no two tiny
2023-03-06 22:10:56 -08:00
George Hotz
d8dda2af3a openpilot fixups v0.5.0 2023-03-06 14:14:44 -08:00
George Hotz
4b9bc1615b While fusion (#654)
* try this

* readme

* opt comments
2023-03-06 09:13:23 -08:00
George Hotz
066a65dad5 remove tflop number, i'll never update that when it's fast 2023-03-06 08:30:31 -08:00
George Hotz
6e763dc446 matmul example in readme 2023-03-06 08:25:13 -08:00
George Hotz
5dc227dba6 fix bug in ENABLE_METHOD_CACHE and enable for llvm 2023-03-06 07:43:40 -08:00
George Hotz
8c5dea8d72 fix CUDA float4 issues 2023-03-06 07:16:38 -08:00
George Hotz
7dbcc26582 fix up external tests 2023-03-06 06:52:28 -08:00
George Hotz
50012f679b move get_contraction to shapetracker 2023-03-06 06:42:57 -08:00
Alex Wang
64ecbd91b5 Refactor contraction and add integration test cases for push permute (#650)
* Refactor contraction and add unit tests

* Fix typo; Fix TestConv.test_elu failure due to some ones in old_shape

* Add push permute test cases

* Fix mypy type annotation check error

* Add contraction unit test; Reshape to higher dimension is not contraction
2023-03-06 06:36:55 -08:00
Peter McDevitt
cb5be9697c One less line in consume_flops (#651)
* less lines

* using walrus

* using original way
2023-03-05 23:34:45 -08:00