George Hotz
2ab282bfec
run on update_benchmark too ( #1460 )
...
* run on update_benchmark too
* amd inference test
* name it better
* add 10 CIFAR training steps
2023-08-06 08:58:37 -07:00
terafo
3d41674b42
Fixed regression ( #1447 )
...
Co-authored-by: terafo <terafo@protonmail.com>
2023-08-06 07:55:58 -07:00
George Hotz
d67e248d9b
simple bitcast 2 ( #1445 )
...
* simple bitcast 2
* bc 2
* empty
* Revert "empty"
This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
George Hotz
943b227cb1
only on push to master
2023-08-06 00:10:07 -07:00
George Hotz
2274e3e757
Fix benchmark ( #1454 )
...
* do benchmarking
* system
* artifact
* go
* name artifact
* only on push
2023-08-05 23:44:36 -07:00
George Hotz
bf21aec81f
do benchmarking ( #1451 )
...
* do benchmarking
* system
* artifact
* go
* name artifact
2023-08-05 23:35:01 -07:00
nimlgen
1ba8ae62a1
Match Torch speed for sum reduction ( #1387 )
...
Co-authored-by: Alexander Edwards <alex@alexedw.com>
2023-08-05 22:27:33 -07:00
chenyu
09ede08b23
simplify Node.sum aggregating ( #1449 )
2023-08-05 22:19:36 -07:00
George Hotz
7fa730b506
external model benchmark test
2023-08-05 22:10:48 -07:00
chenyu
cb5dcc7b57
remove view_from_shape ( #1448 )
2023-08-05 20:39:13 -07:00
Diogo
e2af95c2f8
moved global_max and local_max to LinearizerOptions; also added assert for max bufs ( #1446 )
2023-08-05 18:23:18 -07:00
George Hotz
7b8d06c9f1
test uops ( #1444 )
...
* test uops
* tests should pass
* improve uops
* precision
2023-08-05 12:35:56 -07:00
George Hotz
84c430355e
fix backends for new style ( #1443 )
...
* fix backends for new style
* fix method cache
* fix fakeless
* llvm blacklist
* fix kernel optimizer
2023-08-05 11:07:04 -07:00
George Hotz
67781fcf5d
fix fail fast in CI
2023-08-05 10:24:24 -07:00
George Hotz
bd7f4b1249
move renamer to linearizer ( #1442 )
...
* move renamer to linearizer
* uops converter
* Delete test_uops.py
2023-08-05 08:53:25 -07:00
nimlgen
669b406ec6
correct children count with lazycache ( #1429 )
2023-08-05 00:30:16 -07:00
Felix
97a6029cf7
Corrected a few misspelled words ( #1435 )
2023-08-04 16:51:08 -07:00
Adrian Kretz
043d5f2cb5
Fix NOUNROLL ( #1439 )
2023-08-04 16:50:19 -07:00
Francesco Castelli
579f4615a0
Add assert for wrong matmul/dot shapes ( #1438 )
2023-08-04 18:16:56 -04:00
Umut Zengin
52db7d7435
inf, -inf support for pad ( #1436 )
2023-08-04 15:05:25 -04:00
Alex Telon
7325bc914f
fix: Context ( #1430 )
...
* Fixed issue in Context
* Cleaned up fix
Now that DEBUG.value = 3 always works we can do so in __new__ as well.
2023-08-04 10:53:48 -04:00
ian
c08ed1949f
Fix plt output comment ( #1428 )
2023-08-03 23:35:52 -07:00
wozeparrot
801bed4f66
Add ops_shm ( #1413 )
...
* feat: add ops_shm
* clean: extra newline
* feat: add test
* feat: ci doesn't like that
* feat: ci still doesn't like that
* feat: skip big test on ci
* feat: testing
* feat: big
* feat: testing again
* feat: reskip test
2023-08-03 17:40:52 -07:00
chenyu
34f348643b
Support constant expand to symbolic shape ( #1411 )
2023-08-02 21:21:22 -07:00
chenyu
6572ca6835
support symbolic expand ( #1407 )
2023-08-02 20:03:46 -04:00
wozeparrot
a367f71fea
fix: don't put kernels into cache when optimizing ( #1409 )
2023-08-02 18:17:16 -04:00
Paolo Gavazzi
9ffa1eb7e2
Removed dep of torch, torchaudio, kept librosa only ( #1264 )
2023-08-02 13:52:04 -04:00
George Hotz
fc2303e520
gitignore in weights
2023-08-02 16:26:41 +00:00
chenyu
18d0a93f09
LazyBuffer.get_variable_buffers() ( #1391 )
...
* LazyBuffer.get_variable_buffers()
* remove left_only, add ProdNode
* no vars for OpNode.b
* do not change symbolic vars, remove ProdNode
2023-08-02 09:01:35 -07:00
Umut Zengin
8889821547
Const pad support to pad2d and slice ( #1392 )
...
* slice to pad2d migrate
* Gain line
* Mypy happy
* Mypy happy
* Revert
* whitespace
2023-08-02 08:58:52 -07:00
wozeparrot
ab9e4a2e93
Make cuda CI a bit more consistent ( #1403 )
...
* feat: use fast-apt-mirror
* feat: use in more places
2023-08-02 07:38:22 -07:00
wozeparrot
7aff8c4ded
cl fixes ( #1402 )
...
* feat: non-blocking
* feat: store event on buffer
2023-08-01 22:13:51 -07:00
Alex Telon
b66361843a
Timing and Context can now be used as decorators ( #1385 )
...
* Context and Timing can now be used as decorators
* Using Timing decorator in quickstart.md
The time formatting is better, and the decorator is a useful tool to learn.
Old: Time: 3.5260659999912605
New: Time: 3526.14 ms
* Updated env_vars documentation for Context
* Added test for Context decorator
* Put new import on same line as others
2023-08-01 17:16:10 -07:00
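The decorator usage described in this commit can be illustrated with a minimal sketch. It assumes a hypothetical stand-in class built on the standard library's `contextlib.ContextDecorator`; it is not tinygrad's actual `Timing` implementation, and the `prefix` parameter and `elapsed_ms` attribute are illustrative names only. The output format mirrors the "New: Time: 3526.14 ms" style quoted in the commit message.

```python
import contextlib
import time


class Timing(contextlib.ContextDecorator):
    """Hypothetical stand-in for a timing helper (not tinygrad's code)."""

    def __init__(self, prefix="Time: "):
        self.prefix = prefix

    def __enter__(self):
        # Record the start time in nanoseconds.
        self.start = time.perf_counter_ns()
        return self

    def __exit__(self, *exc):
        # Convert elapsed nanoseconds to milliseconds and print with two decimals.
        self.elapsed_ms = (time.perf_counter_ns() - self.start) * 1e-6
        print(f"{self.prefix}{self.elapsed_ms:.2f} ms")
        return False  # do not swallow exceptions


# Used as a decorator: every call to work() is timed.
@Timing("Time: ")
def work():
    return sum(range(100_000))


result = work()

# The same class still works as a plain context manager.
with Timing("Block: ") as t:
    sum(range(1_000))
```

Inheriting from `ContextDecorator` is one way to get both usages from a single class; the real helper may be implemented differently.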
chenyu
d9d1372dd0
Update pytest.ini format ( #1398 )
2023-08-01 18:00:51 -04:00
George Hotz
f4218b709f
Revert "Improve Metal runtime command buffer handling ( #1335 )" ( #1397 )
...
This reverts commit bd54105b6b.
2023-08-01 12:10:20 -07:00
Diogo
4dc8595069
simple exporting models ( #1344 )
...
* unified exporting
* json exporting
* ignore more
* simplified buffer export
* added dtypes
* added assert
* swift example
* fix tests
* linter
* remove whitespace
* fixed tests
* remove swift example
* remove unintended changes
* allow callable models to be used
* whitespace
* more readable json export
* name change
* whitespace
* whitespace
2023-08-01 09:35:48 -07:00
wozeparrot
7c7cf16ef2
use host ptr for speed on copyouts ( #1393 )
...
* feat: use mapped buffer for speed
* fix: whoops don't need that
* feat: don't need explicit call to memoryview
2023-08-01 09:34:12 -07:00
Diogo
ba5e3818a0
Limit dims based on max size ( #1390 )
...
* working
* whitespace
* changed defaults to None
* linter
* last linter error
2023-07-31 19:18:19 -07:00
chenyu
b2fde9ec36
reshape to register variable value ( #1386 )
...
* reshape to register variable value
* better error message
2023-07-31 17:10:02 -07:00
Umut Zengin
0de5f20970
Re-open constant pad support to Tensor.pad ( #1388 )
...
* Added const padding support to .pad
* Linter
2023-07-31 17:08:57 -07:00
David Hou
3300d0aeaf
syncthreads before wmma ( #1389 )
...
(venv) chaos@tiny3:~/tinygrad$ KX=2 KY=2 N=2048 python extra/gemm/hip_matmul.py
4194304 289.60 us, would be 59322.55 GFLOPS matmul, 173.80 GB/s
2023-07-31 17:05:49 -07:00
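The figures in the `hip_matmul.py` output above can be cross-checked with a little arithmetic. The sketch below is not taken from the script. It assumes the usual 2·N³ floating-point operation count for an N×N matrix multiply, and it assumes the bandwidth figure counts three N×N matrices at 4 bytes per element, which is inferred from the reported numbers rather than confirmed by the source.

```python
# Values copied from the commit message.
N = 2048
elapsed_s = 289.60e-6  # 289.60 us

# Assumption: a dense N x N matmul costs 2 * N^3 floating-point operations.
flops = 2 * N**3

# Assumption: three N x N matrices (two inputs, one output) at 4 bytes each.
bytes_moved = 3 * N * N * 4

gflops = flops / elapsed_s / 1e9
gbps = bytes_moved / elapsed_s / 1e9

print(f"{N * N} {elapsed_s * 1e6:.2f} us, would be {gflops:.2f} GFLOPS matmul, {gbps:.2f} GB/s")
```

Under these assumptions the computed values agree with the logged 59322.55 GFLOPS and 173.80 GB/s to within the rounding of the reported time, and `N * N` reproduces the leading 4194304.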
Alex Telon
2d10e0340e
Refactored ContextVars ( #1331 )
2023-07-31 15:44:46 -04:00
George Hotz
f27df835a6
delete dead stuff ( #1382 )
...
* delete bpe from repo
* remove yolo examples
* Revert "remove yolo examples"
This reverts commit cd1f49d466.
* no windows
2023-07-31 11:17:49 -07:00
Yixiang Gao
6e62dcfbf3
add check global dim limit in linearizer ( #1299 )
...
* need a better place for reshape and permute
* add permutation
* cuda fixed
* clean up
* enable nvidia GPU with global max
* fix order
* fix CI
* add check for global dim limit but need refactor
* refactor
* fix ignore
2023-07-31 11:14:54 -07:00
ronak69
ce0ab1c14e
convert $@ to "$@" in run_multibackend.sh ( #1379 )
2023-07-31 10:39:22 -07:00
chenyu
f5ef445cb6
trim space ( #1381 )
2023-07-31 10:37:57 -07:00
JaSpa99
5ab12059da
rng hlops: add normal and kaiming_normal ( #1378 )
...
* add normal and kaiming_normal
* make sure it's float
* add tests
2023-07-31 10:37:02 -07:00
George Hotz
37fa7e96fb
Revert "update editorconfig, enforce via CI ( #1343 )" ( #1380 )
...
This reverts commit da2efecbe2.
2023-07-31 10:35:50 -07:00
Pavol Rusnak
da2efecbe2
update editorconfig, enforce via CI ( #1343 )
...
* update editorconfig to set unix-style newlines and trim whitespace
* add editorconfig github action to the CI
* fix whitespace
2023-07-30 18:44:30 -07:00
S-Lykles
c2b82ea8ac
fix to_shape_strides ( #1374 )
...
* add tests for expr_node and expr_idxs
* simplify condition and add missing optimization
2023-07-30 18:42:46 -07:00