George Hotz
ec405b919f
Revert "Revert "do not block gc in UOp.toposort ( #9623 )" ( #9624 )" ( #9639 )
...
This reverts commit 7ef02d0e1c.
2025-03-31 14:03:38 +08:00
George Hotz
49b1c46d16
good changes from the dsp branch ( #9638 )
2025-03-31 13:02:53 +08:00
qazal
9d67d3a2f3
simpler viz codeblocks ( #9636 )
...
* simpler viz codeblocks
* err
2025-03-31 11:48:35 +08:00
chenyu
60eb0c4ed7
exclude slow tests on PYTHON ( #9634 )
2025-03-30 22:55:05 -04:00
chenyu
5012ba3f04
cumalu touchup [pr] ( #9632 )
2025-03-30 22:43:11 -04:00
chenyu
d8d7ac1bb1
fix bert free_intermediates ( #9633 )
...
fix when only run eval `TRAIN=0 BERT_SIZE=tiny examples/mlperf/training_submission_v5.0/tinycorp/benchmarks/bert/implementations/tinybox_green/dev_beam.sh`
2025-03-30 22:42:52 -04:00
qazal
ff984c807d
hotfix: less lines for viz helpers ( #9631 )
2025-03-31 10:10:34 +08:00
qazal
c206a7ae6d
refactor viz state updates ( #9630 )
...
* refactor viz state updates
* onclick
2025-03-31 09:54:54 +08:00
Yvon Manzi
6652003839
Add cumprod to Tensor ( #9629 )
...
* probably how cumprod should look like
* update _cumalu to work with MUL
* shorter
* cumprod testing
* clean
* more cleanup
* add cumprod to torch backend.
* make it look like cumsum
* mypy fix
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 21:49:18 -04:00
geohotstan
d52e91db7b
ONNX ops clean ups ( #9622 )
...
* combine work from remove numpy and onnx ops tests
* clippy
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 21:39:22 -04:00
uuuvn
962c0f65f8
Fix generate_am ( #9626 )
...
This should be a comment
2025-03-31 01:15:44 +08:00
uuuvn
2a4247b8c2
RDNA 3.5 support ( #9627 )
2025-03-31 01:15:20 +08:00
geohotstan
a08b07b4da
Bump onnx==1.17.0 ( #9618 )
...
* bump
* remove resize tf_crop_and_resize
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 03:21:51 -04:00
qazal
7ef02d0e1c
Revert "do not block gc in UOp.toposort ( #9623 )" ( #9624 )
...
This reverts commit f1a35bbb54.
2025-03-30 12:57:06 +08:00
qazal
f1a35bbb54
do not block gc in UOp.toposort ( #9623 )
2025-03-30 11:54:17 +08:00
nimlgen
54e1e59b44
am: rdna 4 support ( #9621 )
...
* hm
* fix
* return this
* fine
* g
* ruff
* fix
2025-03-29 23:16:27 +07:00
nimlgen
118bd1cbed
hotfix: amd imports ( #9620 )
2025-03-29 20:19:53 +07:00
uuuvn
5908b89f71
MI300X support (WIP) ( #9585 )
2025-03-29 19:46:42 +08:00
George Hotz
77f0d09ecf
hotfix: HIP supports parallel BEAM search
2025-03-29 11:49:53 +08:00
chenyu
162f286a0e
add a few Tensor method to doc ( #9614 )
...
* add a few Tensor method to doc
* clone
2025-03-28 13:47:16 -04:00
uuuvn
dd9aae02c3
Refactor ops_amd.py (MI300X prereq) ( #9428 )
2025-03-29 00:17:20 +07:00
uuuvn
3e1168ff5e
am: module import in common ( #9615 )
2025-03-28 21:29:34 +07:00
nimlgen
a8ff85369e
cpugraph for dsp ( #9601 )
...
* cpugraph init
* fixes
* no cpu for now
* mypy
* fix
2025-03-28 19:06:31 +07:00
nimlgen
fa0ebbd237
jit: optimize before pickle ( #9611 )
...
* jit: optimize before pickle
* optimize weights
* fix
* mypy
* mypy2
2025-03-28 19:06:09 +07:00
George Hotz
392a311312
Revert "add copy button in VIZ code-block ( #9605 )" ( #9610 )
...
This reverts commit d1e8598c81.
2025-03-28 17:05:44 +08:00
Harsh Natuskar
d1e8598c81
add copy button in VIZ code-block ( #9605 )
...
* works
* only second block has copy
* better function
* better
* ...
* smol function
* update copy-btn css
* updates
2025-03-28 16:52:21 +08:00
qazal
b4ea45b4a6
fix viz recenter + worker cleanup ( #9607 )
2025-03-28 15:24:53 +08:00
Andrew Furey
50dee4a7b3
add test for checking const gradients ( #9598 )
2025-03-27 15:17:37 -04:00
chenyu
5358b0904b
update uop_given_valid if a node becomes const ( #9604 )
...
* update uop_given_valid if a node becomes const
* cleanup
2025-03-27 14:57:46 -04:00
chenyu
a187dfd3df
bert BEAM_UOPS_MAX 3000->4000 ( #9603 )
...
more stable for the final step time
green 410ms (master) -> 397ms (BEAM=4) -> 392ms (this)
red 561ms (master) -> 550ms (this)
2025-03-27 11:58:47 -04:00
qazal
088a677e25
rescale to fit viz graph [pr] ( #9599 )
...
* zoom to fit the graph in viz [pr]
* always on screen fit graph
* space key recenters
2025-03-27 23:33:51 +08:00
nimlgen
3737821b9e
prepare for clang graph ( #9600 )
...
* prepare for clang graph
* emu
* ops
* ops2
* better type
* fix
2025-03-27 20:09:37 +07:00
qazal
bf94924d5a
fix viz with nested graph_rewrite ( #9595 )
2025-03-27 13:14:28 +08:00
qazal
c011751b41
statically define viz arrow heads ( #9594 )
2025-03-27 12:22:04 +08:00
qazal
0877497bad
hotfix: use captured uops in viz render [pr] ( #9593 )
...
* hotfix: use captured uops in viz render [pr]
* better error
2025-03-27 11:52:12 +08:00
qazal
e5ff7b23d7
refactor to @track_matches + add failing test_nested_rewrite ( #9592 )
...
* test_nested_rewrite
* refactor to track_matches
* positional arg
2025-03-27 11:11:56 +08:00
chenyu
62888614f6
lower bert eval bs to 24 ( #9590 )
...
oom during eval
2025-03-26 21:25:23 -04:00
nimlgen
dc9da1d917
memplan into one buffer ( #9526 )
...
* new memplanner
* new should works
* fix
* VALIDATE_MEMORY_PLANNER
* hm?
* ugh
* fix alignment
* fix2
* rm
* tiny fixes
* test
* comments and fixes
* fix2
* liiiinetr
* t
* fix
2025-03-27 01:46:50 +07:00
qazal
8b717c345c
cache viz worker at launch ( #9589 )
2025-03-27 01:10:02 +08:00
George Hotz
d62ced8981
symbolic -> symbolic_flat ( #9588 )
2025-03-26 23:34:43 +08:00
George Hotz
8aaa5e1ec5
generate the individual indexes ( #9587 )
2025-03-26 22:32:06 +08:00
George Hotz
5c6cd884e3
multiple simplifies is faster [pr] ( #9586 )
...
* multiple simplifies is faster [pr]
* cleanup
* cleanup
2025-03-26 21:42:52 +08:00
George Hotz
1e6e75e39a
little changes from dsp branch ( #9582 )
...
* little changes from dsp branch
* not that one
* need the where
* Revert "need the where"
This reverts commit 140f89c878.
2025-03-26 20:01:21 +08:00
nimlgen
e88a640ca5
fix _access_resources for offset buffers ( #9580 )
...
* fix _access_resources for offset buffers
* test
2025-03-26 18:42:43 +07:00
Andrey
7b865ed03d
use tuple in isinstance for type checking ( #9583 )
2025-03-26 19:36:48 +08:00
George Hotz
9115ce8860
linearizer fixups from DSP branch ( #9581 )
2025-03-26 18:28:15 +08:00
qazal
e799df537e
prep viz UI cleanup for grid scales ( #9579 )
...
* less ways to make a button
* move collapse out
* work
* do not create extra resizers
* better
* ul
* safari
2025-03-26 17:48:15 +08:00
nimlgen
ccbcdca473
add memplanner tests ( #9577 )
2025-03-26 10:59:39 +07:00
qazal
c03dadfcb9
add TORCHVIZ=1 to beautiful_mnist_torch ( #9576 )
2025-03-26 11:17:08 +08:00
qazal
93bcb974c5
select torch device in examples/beautiful_mnist_torch.py ( #9575 )
2025-03-26 11:01:25 +08:00