nimlgen
bfe28ee2ad
rm run_schedule ( #15847 )
2026-04-21 18:14:30 +03:00
nimlgen
ae9b84d32f
rm beam uop ( #15844 )
2026-04-21 13:10:26 +03:00
qazal
f9655af2a3
viz/cli: move to tinygrad ( #15835 )
...
* move cli
* update imports
* cleanup the readme
* edit
* work
* details
* python -m tinygrad.viz.cli
* do not execv in non tty
* option
* lint
* simpler
* gemm pmc
2026-04-21 13:35:10 +09:00
qazal
601b9d3f59
viz/cli: dedup DEBUG=3 pyrender ( #15826 )
2026-04-20 19:29:09 +09:00
qazal
b05b1010bf
viz/cli: ux cleanups, show user python ( #15817 )
...
* small fixes
* print python trace
* jsonl
* cleanup fmt, fix tqdm
* print mode
* types
* less
* keep those
* fix
* everyone can print json
* pmc p2
2026-04-20 03:50:48 +03:00
qazal
c6d8753ee1
viz/cli: --json support, refine docs ( #15528 )
...
* refine
* remove
* refine
* keep
* need to say this
* back
* feedback
* feedback
* json
* dur_ms
* et_ms
* remove useless thing
* docs
* respect NO_COLOR
* DEBUG also produces valid json
2026-04-19 21:53:38 +03:00
wozeparrot
f28ea84de2
llama: fused silu fp8 amax ( #15798 )
...
* llama: combined w13
* llama: fused swiglu+fp8
* llama: fix amax interleaving
* llama: don't need separate matmul
2026-04-19 12:03:55 +08:00
nimlgen
022d8c4a11
remove jit_cache usage in extra/examples ( #15808 )
...
* remove jit_cache usage in extra/examples
* cached
2026-04-18 23:00:18 +03:00
qazal
2581985532
viz/cli: multi device profiler output, print markers ( #15795 )
...
* yield
* all devices
* better
* add unittests
* markers like this
* profile_markers work
* less
* update README
* tiny and null
2026-04-17 23:40:10 +03:00
qazal
a227dbece1
viz/cli: reconstruct DEBUG output ( #15791 )
...
* work
* work
* ext
* padding
* at time
* work
* reorder
* less flags
* num_rows
* feedback
* pmc
2026-04-17 18:27:58 +03:00
qazal
afc3904e58
viz/cli: unit tests in CI ( #15788 )
...
* simple failing test
* test stdout
* cleanup sqttmap
2026-04-17 22:34:44 +09:00
qazal
7bdb3adbbf
viz/cli: simplification and reordering ( #15785 )
...
* remove
* work
* this is all one thing
* the reorder
2026-04-17 15:16:07 +03:00
wozeparrot
9e60e4a7e7
llama: native fp8 ( #15733 )
2026-04-16 22:16:05 -07:00
qazal
0e69388f6b
viz/cli: add DEBUG, optional number of rows ( #15777 )
...
* tabulate switch
* support DEBUG
* --top
* improve
* work
* feedback
* 0
* print_kernel both ways
* simplify
2026-04-17 04:36:47 +03:00
qazal
6d9320ffb3
add NO_COLOR ( #15765 )
...
* NO_COLOR in cli
* add in helpers
* rm flags
* docs
* fix that
* temp
* Revert "temp"
This reverts commit 7522e664f6.
2026-04-16 22:44:55 +03:00
qazal
12c653a743
remove opts arg in get_program, everything uses opts_to_apply [pr] ( #15767 )
...
* check Ops.BEAM in process replay
* remove opts from the get_program api
* lint
* simplify
* cleanup
2026-04-16 22:42:43 +03:00
qazal
126cda45f8
viz/cli: cleanups, add memory printer ( #15762 )
...
* simple repro
* use context
* work
* memory printer
* rm
* memory printer
* pylint
2026-04-16 22:44:47 +09:00
George Hotz
d1cce7a476
put the ranges on store instead of after ( #15759 )
...
* put the ranges on store instead of after
* better assert
* fix stuff
* comment out slow rules i don't understand
* simpler rule
* closer
* return false for store
* fix loop
* only a few schedule failures remain
* remove stores to self
* all tests pass locally
* remove junk
* regression test and fix
* better test, bump broken torch count
* bugfix with regression test
* new fusion is better
2026-04-16 19:06:40 +08:00
qazal
1f26584b2e
viz/cli: cleanups from linter ( #15745 )
...
* run linter
* pmc
2026-04-16 03:36:24 +09:00
chenyu
3394d18066
size*itemsize -> nbytes ( #15729 )
...
and some UOp.size removal to prep for size to mixin change
2026-04-14 16:27:54 -04:00
qazal
905b8adc97
viz: cli and server cleanups ( #15713 )
...
* update get_profile arg[0]
* uop_to_json arg[0]
* data is standalone in cli
2026-04-14 06:42:29 +09:00
George Hotz
16f50a40a5
remove REMU from tree ( #15706 )
...
* no more compare emulators
* remove remu from tree
2026-04-13 20:43:08 +08:00
qazal
ac027055ef
viz: no global state ( #15705 )
...
* start viz data
* get_full_rewrites also moves
* update ref_map
* work
* update consumers
* cleaner cli
* linter
* cleanup tests
* back
* better
* sqtt tests
2026-04-13 21:35:20 +09:00
wozeparrot
457508d5a0
llama: save more 2 ( #15681 )
2026-04-11 01:03:36 -07:00
wozeparrot
55bcd7cc9e
llama amax outside ( #15670 )
2026-04-09 23:08:03 -07:00
nimlgen
057dc173ab
beam uop ( #15660 )
...
* beam as uop
* x
2026-04-09 19:13:03 +03:00
George Hotz
48a7627b04
add RDNA4 support to copy WMMA ( #15663 )
...
* add RDNA4 support to copy WMMA
* simpler
* simpler
* comment
* assert
2026-04-09 22:48:20 +08:00
qazal
742b3894d7
viz/cli: add pmc printer ( #15651 )
...
* viz/cli: add pmc printer
* cli work
* s
* linter
* pack workgroups
* add : to wgp
* counter name
2026-04-09 08:50:54 +09:00
nimlgen
28b14b0e38
mlx: remove to_be, use helpers ( #15655 )
2026-04-08 20:07:28 +03:00
qazal
71c83cc3f6
viz: put OTHER_ on the wave row ( #15650 )
...
* viz: put OTHER_ on the wave row
* update tests
* cleanup cli
2026-04-08 23:13:44 +09:00
George Hotz
1ebeb52e59
RDNA4 asm gemm ( #15427 )
...
* sqtt: rdna4 decoder work
* diff cleanup
* more diff
* test
* 125
* r4
---------
Co-authored-by: qazal <qazal.software@gmail.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2026-04-08 21:26:44 +08:00
qazal
3ac16b3bea
viz: add wmma row, update exec duration logic ( #15646 )
...
* viz: split wmma to its own row, fix duration logic
* regs
* decrease number of loops, add pickle
* assert overlaps
2026-04-08 20:24:23 +09:00
wozeparrot
70dbd35023
llama: move custom_kernel into flat_llama ( #15643 )
2026-04-08 00:19:14 -07:00
qazal
a508b8fd2a
viz: delete redundant things ( #15637 )
...
* delete that
* remove
* delete graph config
2026-04-08 07:18:04 +09:00
chenyu
1483f7e71c
support shift by Tensor ( #15623 )
...
* support shift by Tensor
* use mixin
2026-04-06 15:14:57 -04:00
chenyu
6e30a5f5ea
update shifts in torch backend ( #15622 )
2026-04-06 14:08:33 -04:00
nimlgen
e3986a6b74
mlx: init runtime ( #15612 )
...
* mlx: init
* x
* swap
2026-04-05 22:52:29 +03:00
wozeparrot
7e54992bf6
fp8 llama ( #15588 )
...
Co-authored-by: qazal <qazal.software@gmail.com>
2026-04-04 18:24:57 -07:00
qazal
f7aed180e4
viz/cli: add Other row in profiler ( #15600 )
2026-04-04 22:40:53 +09:00
Christopher Milan
645d45d968
DEV has arch ( #15577 )
...
Co-authored-by: Comma Device <device@comma.ai>
2026-04-03 19:17:19 -04:00
nimlgen
237084b276
remote: support several hosts ( #15585 )
...
* remote: support several hosts
* f
2026-04-03 11:22:15 +03:00
Christopher Milan
0ed8d9271d
Renderers accept Target or nothing ( #15590 )
2026-04-03 01:09:41 -04:00
nimlgen
046c3f1240
mlx: add loopback with send/recv ( #15583 )
2026-04-02 18:15:46 +03:00
qazal
fefb0ebc2a
gemm/asm: fp8 cleanups ( #15580 )
...
* normal gemm here
* s/dtypes.fp8e4m3/FP8_DTYPE
* gemm_bw
* device UOp stays NULL
2026-04-02 19:02:38 +09:00
chenyu
1aa04eab08
simple CreationMixin ( #15567 )
...
start with full_like, zeros_like, ones_like
2026-04-01 23:00:56 -04:00
nimlgen
da12c2ea16
better install msg ( #15570 )
2026-04-01 20:09:37 +03:00
qazal
9275f283e5
viz: update flag and display names ( #15566 )
...
* rename to occ, other_simd
* se pkts
* match viz cli tool in names
2026-04-01 21:48:37 +09:00
Christopher Milan
acf239e4d2
specify renderer in DEV, <dev>_<ren>=1 is deprecated ( #15551 )
2026-03-31 18:35:14 -04:00
nimlgen
477d194630
hipcomgr and tinygpu scripts ( #15549 )
2026-03-31 20:07:52 +03:00
qazal
a15345a53e
viz/cli: improve --help message ( #15546 )
...
* viz/cli: improve --help message
* not the default
* more work
* -s
* respect colored
2026-03-31 22:31:33 +09:00