10633 Commits

Author SHA1 Message Date
George Hotz
ec405b919f Revert "Revert "do not block gc in UOp.toposort (#9623)" (#9624)" (#9639)
This reverts commit 7ef02d0e1c.
2025-03-31 14:03:38 +08:00
George Hotz
49b1c46d16 good changes from the dsp branch (#9638) 2025-03-31 13:02:53 +08:00
qazal
9d67d3a2f3 simpler viz codeblocks (#9636)
* simpler viz codeblocks

* err
2025-03-31 11:48:35 +08:00
chenyu
60eb0c4ed7 exclude slow tests on PYTHON (#9634) 2025-03-30 22:55:05 -04:00
chenyu
5012ba3f04 cumalu touchup [pr] (#9632) 2025-03-30 22:43:11 -04:00
chenyu
d8d7ac1bb1 fix bert free_intermediates (#9633)
fix when only run eval `TRAIN=0 BERT_SIZE=tiny examples/mlperf/training_submission_v5.0/tinycorp/benchmarks/bert/implementations/tinybox_green/dev_beam.sh`
2025-03-30 22:42:52 -04:00
qazal
ff984c807d hotfix: less lines for viz helpers (#9631) 2025-03-31 10:10:34 +08:00
qazal
c206a7ae6d refactor viz state updates (#9630)
* refactor viz state updates

* onclick
2025-03-31 09:54:54 +08:00
Yvon Manzi
6652003839 Add cumprod to Tensor (#9629)
* probably how cumprod should look like

* update _cumalu to work with MUL

* shorter

* cumprod testing

* clean

* more cleanup

* add cumprod to torch backend.

* make it look like cumsum

* mypy fix

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 21:49:18 -04:00
geohotstan
d52e91db7b ONNX ops clean ups (#9622)
* combine work from remove numpy and onnx ops tests

* clippy

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 21:39:22 -04:00
uuuvn
962c0f65f8 Fix generate_am (#9626)
This should be a comment
2025-03-31 01:15:44 +08:00
uuuvn
2a4247b8c2 RDNA 3.5 support (#9627) 2025-03-31 01:15:20 +08:00
geohotstan
a08b07b4da Bump onnx==1.17.0 (#9618)
* bump

* remove resize tf_crop_and_resize

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 03:21:51 -04:00
qazal
7ef02d0e1c Revert "do not block gc in UOp.toposort (#9623)" (#9624)
This reverts commit f1a35bbb54.
2025-03-30 12:57:06 +08:00
qazal
f1a35bbb54 do not block gc in UOp.toposort (#9623) 2025-03-30 11:54:17 +08:00
nimlgen
54e1e59b44 am: rdna 4 support (#9621)
* hm

* fix

* return this

* fine

* g

* ruff

* fix
2025-03-29 23:16:27 +07:00
nimlgen
118bd1cbed hotfix: amd imports (#9620) 2025-03-29 20:19:53 +07:00
uuuvn
5908b89f71 MI300X support (WIP) (#9585) 2025-03-29 19:46:42 +08:00
George Hotz
77f0d09ecf hotfix: HIP supports parallel BEAM search 2025-03-29 11:49:53 +08:00
chenyu
162f286a0e add a few Tensor method to doc (#9614)
* add a few Tensor method to doc

* clone
2025-03-28 13:47:16 -04:00
uuuvn
dd9aae02c3 Refactor ops_amd.py (MI300X prereq) (#9428) 2025-03-29 00:17:20 +07:00
uuuvn
3e1168ff5e am: module import in common (#9615) 2025-03-28 21:29:34 +07:00
nimlgen
a8ff85369e cpugraph for dsp (#9601)
* cpugraph init

* fixes

* no cpu for now

* mypy

* fix
2025-03-28 19:06:31 +07:00
nimlgen
fa0ebbd237 jit: optimize before pickle (#9611)
* jit: optimize before pickle

* optimize weights

* fix

* mypy

* mypy2
2025-03-28 19:06:09 +07:00
George Hotz
392a311312 Revert "add copy button in VIZ code-block (#9605)" (#9610)
This reverts commit d1e8598c81.
2025-03-28 17:05:44 +08:00
Harsh Natuskar
d1e8598c81 add copy button in VIZ code-block (#9605)
* works

* only second block has copy

* better function

* better

* ...

* smol function

* update copy-btn css

* updates
2025-03-28 16:52:21 +08:00
qazal
b4ea45b4a6 fix viz recenter + worker cleanup (#9607) 2025-03-28 15:24:53 +08:00
Andrew Furey
50dee4a7b3 add test for checking const gradients (#9598) 2025-03-27 15:17:37 -04:00
chenyu
5358b0904b update uop_given_valid if a node becomes const (#9604)
* update uop_given_valid if a node becomes const

* cleanup
2025-03-27 14:57:46 -04:00
chenyu
a187dfd3df bert BEAM_UOPS_MAX 3000->4000 (#9603)
more stable for the final step time

green 410ms (master) -> 397ms (BEAM=4) -> 392ms (this)
red 561ms (master) -> 550ms (this)
2025-03-27 11:58:47 -04:00
qazal
088a677e25 rescale to fit viz graph [pr] (#9599)
* zoom to fit the graph in viz [pr]

* always on screen fit graph

* space key recenters
2025-03-27 23:33:51 +08:00
nimlgen
3737821b9e prepare for clang graph (#9600)
* prepare for clang graph

* emu

* ops

* ops2

* better type

* fix
2025-03-27 20:09:37 +07:00
qazal
bf94924d5a fix viz with nested graph_rewrite (#9595) 2025-03-27 13:14:28 +08:00
qazal
c011751b41 statically define viz arrow heads (#9594) 2025-03-27 12:22:04 +08:00
qazal
0877497bad hotfix: use captured uops in viz render [pr] (#9593)
* hotfix: use captured uops in viz render [pr]

* better error
2025-03-27 11:52:12 +08:00
qazal
e5ff7b23d7 refactor to @track_matches + add failing test_nested_rewrite (#9592)
* test_nested_rewrite

* refactor to track_matches

* positional arg
2025-03-27 11:11:56 +08:00
chenyu
62888614f6 lower bert eval bs to 24 (#9590)
oom during eval
2025-03-26 21:25:23 -04:00
nimlgen
dc9da1d917 memplan into one buffer (#9526)
* new memplanner

* new should works

* fix

* VALIDATE_MEMORY_PLANNER

* hm?

* ugh

* fix alignment

* fix2

* rm

* tiny fixes

* test

* comments and fixes

* fix2

* liiiinetr

* t

* fix
2025-03-27 01:46:50 +07:00
qazal
8b717c345c cache viz worker at launch (#9589) 2025-03-27 01:10:02 +08:00
George Hotz
d62ced8981 symbolic -> symbolic_flat (#9588) 2025-03-26 23:34:43 +08:00
George Hotz
8aaa5e1ec5 generate the individual indexes (#9587) 2025-03-26 22:32:06 +08:00
George Hotz
5c6cd884e3 multiple simplifies is faster [pr] (#9586)
* multiple simplifies is faster [pr]

* cleanup

* cleanup
2025-03-26 21:42:52 +08:00
George Hotz
1e6e75e39a little changes from dsp branch (#9582)
* little changes from dsp branch

* not that one

* need the where

* Revert "need the where"

This reverts commit 140f89c878.
2025-03-26 20:01:21 +08:00
nimlgen
e88a640ca5 fix _access_resources for offset buffers (#9580)
* fix _access_resources for offset buffers

* test
2025-03-26 18:42:43 +07:00
Andrey
7b865ed03d use tuple in isinstance for type checking (#9583) 2025-03-26 19:36:48 +08:00
George Hotz
9115ce8860 linearizer fixups from DSP branch (#9581) 2025-03-26 18:28:15 +08:00
qazal
e799df537e prep viz UI cleanup for grid scales (#9579)
* less ways to make a button

* move collapse out

* work

* do not create extra resizers

* better

* ul

* safari
2025-03-26 17:48:15 +08:00
nimlgen
ccbcdca473 add memplanner tests (#9577) 2025-03-26 10:59:39 +07:00
qazal
c03dadfcb9 add TORCHVIZ=1 to beautiful_mnist_torch (#9576) 2025-03-26 11:17:08 +08:00
qazal
93bcb974c5 select torch device in examples/beautiful_mnist_torch.py (#9575) 2025-03-26 11:01:25 +08:00