George Hotz
d41ac5f5f1
touchups
2023-03-11 11:41:34 -08:00
Cyril Roumégous
3f08613a2a
apply flake8 E203 rule ( #684 )
2023-03-11 11:35:16 -08:00
Diogo
784afc6c6f
Eq magic function support ( #683 )
* add eq magic func
* changed from eq to __eq__
* ignore type for linter
* mypy doesn't like descriptions :(
2023-03-11 10:31:46 -08:00
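A minimal sketch of what the __eq__ support in #683 amounts to, on a hypothetical MiniTensor class rather than tinygrad's actual Tensor: the method must be the dunder __eq__ (not a plain eq) for == to dispatch to it, and the override draws a mypy complaint because object.__eq__ is typed to return bool, hence the ignore comment.

    import numpy as np

    class MiniTensor:
        def __init__(self, data):
            self.data = np.asarray(data)

        # == dispatches here; elementwise compare instead of identity.
        # mypy expects object.__eq__ to return bool, so the override is ignored.
        def __eq__(self, other):  # type: ignore[override]
            return MiniTensor(self.data == getattr(other, "data", other))

    print((MiniTensor([1, 2]) == MiniTensor([1, 3])).data)  # [ True False]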
George Hotz
5ea44cefcc
llama: add lexie personality
2023-03-11 10:23:33 -08:00
George Hotz
c908f911a7
llama defaults to metal on osx
2023-03-11 09:30:13 -08:00
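Picking a default device per platform is a one-liner; a hedged sketch of the pattern (the env-var and device names here are illustrative, not tinygrad's exact resolution logic):

    import os, platform

    # prefer Metal on macOS unless the user overrides via environment
    DEFAULT = os.getenv("DEVICE", "METAL" if platform.system() == "Darwin" else "CPU")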
George Hotz
fd65edf595
fix mem_estimate for dtype.itemsize
2023-03-11 09:20:05 -08:00
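Together with "mem_estimate tracks bytes, not items" below, the idea is that a memory estimate should be element count times dtype.itemsize. A minimal sketch of that arithmetic (standalone helper, not tinygrad's code):

    from functools import reduce
    import operator

    def mem_estimate(shape, itemsize):
        # bytes = number of elements * bytes per element
        return reduce(operator.mul, shape, 1) * itemsize

    assert mem_estimate((1024, 1024), 4) == 4_194_304  # 4 MiB of float32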
George Hotz
fe8c05b96f
allow disabling method cache
2023-03-11 08:57:49 -08:00
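ENABLE_METHOD_CACHE also shows up in #680 below; a sketch of what an env-gated method cache can look like (the decorator is illustrative, not the real implementation):

    import os
    from functools import lru_cache

    ENABLE_METHOD_CACHE = int(os.getenv("ENABLE_METHOD_CACHE", "1"))

    def method_cache(fn):
        # memoize compiled methods only when the toggle is on
        return lru_cache(maxsize=None)(fn) if ENABLE_METHOD_CACHE else fn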
George Hotz
5e1380df6a
profiling llama + cache is_contiguous
2023-03-11 08:23:21 -08:00
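Caching is_contiguous pays off because the check walks the strides on every call. A sketch on a hypothetical View class, with functools.cached_property standing in for whatever memoization the real code uses:

    from functools import cached_property

    class View:
        def __init__(self, shape, strides):
            self.shape, self.strides = shape, strides

        @cached_property
        def contiguous(self):
            # row-major iff strides are the running products of the shape
            expected, acc = [], 1
            for s in reversed(self.shape):
                expected.append(acc)
                acc *= s
            return tuple(reversed(expected)) == tuple(self.strides)

    assert View((2, 3, 4), (12, 4, 1)).contiguous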
George Hotz
01f39b19dc
move to shapetracker.py
2023-03-11 07:50:07 -08:00
George Hotz
f3ac52aee8
Mypyc ( #680 )
* building shapetracker
* default ENABLE_METHOD_CACHE
* symbolic compiles
* improve types
* tensor compiles
* oops, that's a bug
* best of both worlds
* find legit typing bugs
* pad2d can take list or tuple
* sub 200ms when compiled
2023-03-11 07:33:30 -08:00
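The bullets above ("symbolic compiles", "tensor compiles", "sub 200ms when compiled") describe compiling the hot Python modules to C extensions with mypyc. The standard way to wire that up is mypycify in setup.py; the module paths below are illustrative:

    # setup.py -- compile type-annotated modules to C extensions
    from setuptools import setup
    from mypyc.build import mypycify

    setup(
        name="myproj",
        ext_modules=mypycify([
            "myproj/shapetracker.py",
            "myproj/symbolic.py",
        ]),
    )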
George Hotz
22905dd657
speedups from llama branch
2023-03-10 22:01:32 -08:00
George Hotz
0b03216cc3
losing lines ( #678 )
* losing lines
* FLIP -> STRIDE
* shapetracker refactor
2023-03-10 21:57:05 -08:00
George Hotz
d7cb8e3e56
multithreaded fake_torch_load_zipped
2023-03-10 19:16:27 -08:00
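A torch checkpoint is a zip of weight files, so its members can be read on worker threads. A minimal sketch of the pattern, not tinygrad's fake_torch_load_zipped itself (since Python 3.5, separate handles opened from one ZipFile can be read concurrently):

    import zipfile
    from concurrent.futures import ThreadPoolExecutor

    def load_zip_members(path):
        with zipfile.ZipFile(path) as zf:
            def read_one(name):
                with zf.open(name) as f:
                    return name, f.read()
            with ThreadPoolExecutor() as pool:
                return dict(pool.map(read_one, zf.namelist()))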
George Hotz
b1206bcb18
third try at torch loading ( #677 )
* third try at torch loading
* numpy fixed
* fix enet compile
* load_single_weight supports empty weights
* oops, CPU wasn't the default
* so many bugs
2023-03-10 19:11:29 -08:00
Connor Henderson
8b7a16cf85
Add conv binops_no_rerun test assertions ( #665 )
* Add conv binops_no_rerun assertions
* use assert_allclose
* widen tolerance for elu
2023-03-10 19:09:48 -08:00
George Hotz
8bf75a7fdd
fix stable diffusion and CI
2023-03-10 17:48:12 -08:00
George Hotz
c7d17c25d9
ugh, that's getting ugly
2023-03-10 17:41:19 -08:00
George Hotz
4780f9a6df
llama runs (slowly) in master
2023-03-10 17:36:51 -08:00
George Hotz
1826ff6b89
dtypes nice and clean ( #673 )
* add dtype class
* dtypes
* buffers are lazy
* dtype is tracked by lazybuffer and GenericShape
* fix types in llvm
* llvm store
* dtype tests
* fix tests maybe
* fix flop counter
* fix CI
* CI fix and check format
* fix dtype and dtype check
* fix custom test
* fix test graph
2023-03-10 16:56:07 -08:00
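"dtype is tracked by lazybuffer" suggests a small value type carried through the graph. A rough sketch of such a class (the field set is illustrative; the real DType differs):

    from dataclasses import dataclass
    import numpy as np

    @dataclass(frozen=True)
    class DType:
        itemsize: int   # bytes per element, used by mem_estimate
        name: str
        np: type        # matching numpy scalar type

    float32 = DType(4, "float32", np.float32)
    int32 = DType(4, "int32", np.int32)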
George Hotz
d26345595d
more llama stuff
2023-03-10 10:48:10 -08:00
George Hotz
442e1bcd5a
typo + EARLY_STOPPING
2023-03-10 10:43:07 -08:00
George Hotz
6142e63a3e
touchups, print GB/s
2023-03-10 10:37:37 -08:00
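Bandwidth is just bytes moved over elapsed seconds; a tiny helper showing the arithmetic (1 GB taken as 1e9 bytes):

    import time

    def print_gbps(nbytes, fn):
        st = time.perf_counter()
        fn()
        dt = time.perf_counter() - st
        print(f"{nbytes / dt / 1e9:.2f} GB/s")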
George Hotz
036737a12a
mem_estimate tracks bytes, not items
2023-03-10 09:44:12 -08:00
George Hotz
1a039306d2
good changes from llama branch ( #671 )
* good changes from llama
* transpose behavior changed
2023-03-09 20:51:22 -08:00
George Hotz
de1b6d3e08
check shrink is actually smaller
2023-03-09 12:59:45 -08:00
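A guess at the invariant being asserted here: each per-dimension (start, end) must lie inside the old shape, and at least one must be a proper sub-range, otherwise the shrink is a no-op. Sketch:

    def check_shrink(shape, arg):
        # arg is one (start, end) pair per dimension
        assert len(arg) == len(shape)
        for (b, e), s in zip(arg, shape):
            assert 0 <= b <= e <= s, "shrink out of bounds"
        assert any((b, e) != (0, s) for (b, e), s in zip(arg, shape)), \
            "shrink must actually be smaller"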
George Hotz
dbbaa0bdd7
int32, and refactor pad/shrink
2023-03-09 12:57:17 -08:00
George Hotz
fb5ee9260f
add pad tests to shapetracker
2023-03-09 12:51:18 -08:00
jspieler
da7fb4b227
Fixed DDPG example ( #667 )
2023-03-09 11:49:52 -08:00
George Hotz
022c5835fc
fix GPU import error and old python Tuple
2023-03-08 12:22:11 -08:00
George Hotz
c22afc52db
move the custom function example to a test
2023-03-08 10:05:04 -08:00
George Hotz
7d3b9d0e95
oops, things relied on that API. the global cache needs access to the ASTRunner class
2023-03-08 08:39:31 -08:00
George Hotz
4f957423c3
jitting custom ops + OPTLOCAL assignment bugfix
2023-03-08 08:30:37 -08:00
George Hotz
7285de41a1
tinygrad supports CUSTOM functions
2023-03-08 07:50:33 -08:00
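A generic forward/backward custom-op pairing gives the flavor, using atan2 as the op (numpy stand-in, illustrative only; tinygrad instead wires CUSTOM kernels into its lazy graph, as in the example later moved to a test):

    import numpy as np

    class CustomFn:
        def __init__(self, forward, backward):
            self.forward, self.backward = forward, backward
        def __call__(self, *args):
            self.saved = args            # keep inputs for the backward pass
            return self.forward(*args)
        def grad(self, dout):
            return self.backward(dout, *self.saved)

    atan2 = CustomFn(
        forward=lambda y, x: np.arctan2(y, x),
        backward=lambda d, y, x: (d * x / (x*x + y*y), d * -y / (x*x + y*y)),
    )
    print(atan2(np.array([1.0]), np.array([1.0])))  # ~0.7854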
George Hotz
00641aa45d
add challenge tests
2023-03-07 19:39:04 -08:00
George Hotz
e0244baf60
3 letters for graph op
2023-03-07 19:20:48 -08:00
George Hotz
46df02115d
bring back SHUFFLE_PAD_OPS as OPT>=4
2023-03-07 17:42:34 -08:00
George Hotz
4eb880550f
enable contract test
2023-03-07 17:32:28 -08:00
Alex Wang
d885d2d0f5
Allow 1s for contraction detection ( #663 )
* Allow 1s for contraction check
* More test cases for 1s
2023-03-07 17:31:28 -08:00
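Contraction here means each output dim is the product of a run of input dims, e.g. (3,4,5) -> (12,5); #663 lets size-1 dims be absorbed into any run, so (1,3,4) -> (12,) also counts. A simplified sketch (a real get_contraction would return the index groupings rather than a bool):

    def is_contraction(old_shape, new_shape):
        i = 0
        for n in new_shape:
            acc = 1
            while acc < n and i < len(old_shape):
                acc *= old_shape[i]; i += 1   # 1s fold in for free
            if acc != n: return False
        return all(d == 1 for d in old_shape[i:])  # trailing 1s are fine

    assert is_contraction((1, 3, 4), (12,))
    assert not is_contraction((4,), (2, 2))   # a split, not a contraction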
George Hotz
b561256a0e
allow all reduces ( #661 )
* allow all reduces
* push permute tests
* explicit permute reshape push
* contract w/ 1s
2023-03-07 15:36:01 -08:00
George Hotz
b14d31d6db
ConvNeXt + extras ( #657 )
* simple convnext implementation
* shorter function names
* need to realize the random functions now
* creating an optimizer realizes all params
* assign contiguous
* fix lazy lazy
* why was i doing that...add convnext to tests
* LazyNumpyArray
* enable assert + comment
* no two tiny
2023-03-06 22:10:56 -08:00
George Hotz
d8dda2af3a
openpilot fixups
v0.5.0
2023-03-06 14:14:44 -08:00
George Hotz
4b9bc1615b
While fusion ( #654 )
* try this
* readme
* opt comments
2023-03-06 09:13:23 -08:00
George Hotz
066a65dad5
remove tflop number, i'll never update that when it's fast
2023-03-06 08:30:31 -08:00
George Hotz
6e763dc446
matmul example in readme
2023-03-06 08:25:13 -08:00
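For reference, a matmul in tinygrad looks roughly like this (the README's exact snippet may differ):

    from tinygrad.tensor import Tensor

    x = Tensor.randn(1024, 1024)
    w = Tensor.randn(1024, 1024)
    out = x.dot(w)            # lazy; the kernel runs when realized
    print(out.numpy().shape)  # (1024, 1024)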
George Hotz
5dc227dba6
fix bug in ENABLE_METHOD_CACHE and enable for llvm
2023-03-06 07:43:40 -08:00
George Hotz
8c5dea8d72
fix CUDA float4 issues
2023-03-06 07:16:38 -08:00
George Hotz
7dbcc26582
fix up external tests
2023-03-06 06:52:28 -08:00
George Hotz
50012f679b
move get_contraction to shapetracker
2023-03-06 06:42:57 -08:00
Alex Wang
64ecbd91b5
Refactor contraction and add integration test cases for push permute ( #650 )
* Refactor contraction and add unit tests
* Fix typo; Fix TestConv.test_elu failure due to some ones in old_shape
* Add push permute test cases
* Fix mypy type annotation check error
* Add contraction unit test; Reshape to higher dimension is not contraction
2023-03-06 06:36:55 -08:00
Peter McDevitt
cb5be9697c
One less line in consume_flops ( #651 )
* less lines
* using walrus
* using original way
2023-03-05 23:34:45 -08:00