wozeparrot
|
8713ae6de9
|
fix: dead sdv2 download link (#13521)
|
2025-12-01 22:50:53 -08:00 |
|
George Hotz
|
44104b0b7f
|
mnist with grad acc + Adam on CPU (#13520)
* mnist with grad acc + Adam on CPU
* still broken, but closer
* works w/o jit
* this works without the jit
|
2025-12-01 18:27:32 -08:00 |
|
George Hotz
|
7307120311
|
shard to one device is to (#13519)
* shard to one device is to
* fst
|
2025-12-01 16:29:53 -08:00 |
|
chenyu
|
0b92fd30f5
|
simpler simplify_valid [pr] (#13514)
dedup instead of getting a True clause which is removed later
|
2025-12-01 17:36:33 -05:00 |
|
qazal
|
a5ec3b24be
|
viz: start PMC in the counters view (#13510)
|
2025-12-02 00:01:57 +08:00 |
|
nimlgen
|
759b41ab91
|
amd: fix rsrc_word3 on gfx9 (#13509)
|
2025-12-01 12:47:54 +03:00 |
|
chenyu
|
ebbd114885
|
simpler invalid alu [pr] (#13508)
|
2025-11-30 22:18:42 -05:00 |
|
George Hotz
|
ada6b92b2d
|
add a gate to rewrite if there's no rules [pr] (#13506)
|
2025-11-30 17:40:52 -08:00 |
|
George Hotz
|
97b56e11e0
|
hotfix: 32 workgroups for radeon 8050s
|
2025-11-30 08:20:17 -08:00 |
|
George Hotz
|
bd4b9de7d2
|
use numpy in amd_uop_matmul for simpler tracing (#13503)
|
2025-11-30 08:04:38 -08:00 |
|
qazal
|
9023ca30ef
|
show number of waves in each SE/CU (#13491)
* show number of waves in each SE/CU
* update to test_ones
|
2025-11-30 22:29:16 +08:00 |
|
nimlgen
|
455dd88236
|
nv: minimal hevc (#13502)
* nv: minimal hevc
* validate
* not needed
* tralin
* var
* cpu
* fix
* desc
* move
* cleanup
|
2025-11-30 16:46:55 +03:00 |
|
George Hotz
|
fd373fea7a
|
fix a few tests [pr] (#13498)
|
2025-11-29 13:43:45 -08:00 |
|
George Hotz
|
29b11c8992
|
bug in device enumerate where we didn't put default back (#13495)
|
2025-11-29 13:00:55 -08:00 |
|
George Hotz
|
6a140f74fe
|
split out unique_const and cache const [pr] (#13493)
* split out unique_const
* add cache to const
* call const in unique_const
|
2025-11-29 10:44:28 -08:00 |
|
George Hotz
|
c38b7684dc
|
improve microbenchmarks (#13492)
* improve microbenchmarks
* bugfix + ubench
* lil
* no src in const method
|
2025-11-29 10:15:22 -08:00 |
|
qazal
|
941597db71
|
viz UI cleanups (#13490)
|
2025-11-29 22:07:00 +08:00 |
|
qazal
|
d457ee0ba4
|
viz: correctly handle multiple sqtt traces of the same prg (#13460)
|
2025-11-29 20:52:41 +08:00 |
|
George Hotz
|
6f4d7c0c70
|
directly create tensor in _apply_uop (#13489)
|
2025-11-28 19:51:06 -08:00 |
|
kamilisjon
|
3d76ef9ba8
|
Update tests (#13479)
|
2025-11-28 18:35:28 -08:00 |
|
nimlgen
|
192bf4e00a
|
amd,nv: remove unused env vars (#13487)
|
2025-11-28 23:12:53 +03:00 |
|
qazal
|
ae9c56134e
|
skip test_tk failing locally on macbook (#13476)
|
2025-11-29 01:15:37 +08:00 |
|
qazal
|
f33ccd31fd
|
viz: instruction deduping for SQTT inst waves (#13482)
|
2025-11-28 23:17:07 +08:00 |
|
Roelof van Dijk
|
eb543a91e8
|
perf: remove graph-in-graph from expand_index (#13473)
* remove graph-in-graph from devectorizer
* vectorize, not sink
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
|
2025-11-27 11:32:16 -08:00 |
|
Roelof van Dijk
|
d3e125d05d
|
keyword changed (import reserved in python) (#13477)
|
2025-11-27 11:23:00 -08:00 |
|
qazal
|
72ef533d9c
|
tracing: use u32 for buffer args encoding (#13472)
|
2025-11-28 00:19:51 +08:00 |
|
George Hotz
|
18addc0a1d
|
process replay only get_program (#13475)
|
2025-11-27 08:18:18 -08:00 |
|
George Hotz
|
a8e005b095
|
enable process replay (non-checking) by default (#13474)
|
2025-11-27 07:28:44 -08:00 |
|
qazal
|
952a6a8b10
|
viz: add kernel buffers back to the sidebar (#13471)
|
2025-11-27 22:10:35 +08:00 |
|
Kirill R.
|
57869387f9
|
Update wording in mnist.md (#13469)
|
2025-11-27 05:59:49 -08:00 |
|
nimlgen
|
1d207eca3d
|
cuda: fix fmt in compiler (#13470)
|
2025-11-27 16:51:17 +03:00 |
|
qazal
|
2df8a3474e
|
viz: bring back flops and mem in sidebar (#13467)
|
2025-11-27 17:27:44 +08:00 |
|
George Hotz
|
05cd2279d0
|
add cache on reshape (#13466)
* remove cache on divmod, way less objects
* _apply_reshape
* reshape
* no gc on realize
* wow that cache is fast
|
2025-11-26 18:57:40 -08:00 |
|
George Hotz
|
f4123b66df
|
add DEBUG_GC (#13465)
* add DEBUG_GC
* fixup create_schedule_with_vars
* work
|
2025-11-26 17:44:44 -08:00 |
|
George Hotz
|
19228e8d37
|
test_graph is flaky
|
2025-11-26 16:37:42 -08:00 |
|
George Hotz
|
268b3eb392
|
factor scheduling into complete_create_schedule_with_vars (#13464)
|
2025-11-26 15:43:27 -08:00 |
|
George Hotz
|
e4cd649ff0
|
remove kernelize to prepare for refactors (#13463)
* remove kernelize to prepare for refactors
* less kernelize
* last test
|
2025-11-26 14:18:50 -08:00 |
|
qazal
|
b63e5a7568
|
viz: full range x axis scroll (#13459)
|
2025-11-26 21:28:07 +08:00 |
|
qazal
|
c12e218751
|
viz: double click on INST wave (#13458)
|
2025-11-26 21:12:40 +08:00 |
|
qazal
|
e9cb738c7a
|
viz: event sidebar cleanup (#13457)
|
2025-11-26 19:47:15 +08:00 |
|
qazal
|
2a3b665972
|
viz: initial zoom at first event (#13456)
* viz: initial zoom at first event
* sidebar work
|
2025-11-26 16:42:06 +08:00 |
|
Christopher Milan
|
b2af92c821
|
fix HCQGraph.__del__ bug when finalizing (#13298)
* fix _do_ioctl import
* fix circular import
* suppress_finalizing instead
|
2025-11-25 20:33:48 -08:00 |
|
qazal
|
8c1e2a42fd
|
viz: start work on profiler speed (#13455)
|
2025-11-26 07:54:04 +08:00 |
|
wozeparrot
|
ffc31a23f4
|
tk mi350 (#13288)
|
2025-11-25 15:49:44 -08:00 |
|
nimlgen
|
436ab6bfc7
|
nv: use opt multiple vaspaces (#13453)
|
2025-11-25 23:10:21 +03:00 |
|
qazal
|
7238df7a94
|
viz: cleanup sort_fn (#13454)
|
2025-11-26 04:10:10 +08:00 |
|
qazal
|
5520f1fb0b
|
viz: per cu timeline (#13451)
* add cu_loc
* work
* WAVE -> W
|
2025-11-26 00:05:20 +08:00 |
|
qazal
|
4a9562e353
|
viz: draw markers on top (#13449)
* viz: draw markers on top
* create generic label drawer
* same text rendering infrastructure for markers
* minor details
* diff
|
2025-11-25 17:27:01 +08:00 |
|
George Hotz
|
5373fd2d66
|
add user device (#13447)
* add user device
* add device_sort_fn (#13448)
Co-authored-by: qazal <qazal.software@gmail.com>
* linter
* order by dname
---------
Co-authored-by: qazal <qazal.software@gmail.com>
|
2025-11-25 15:25:45 +08:00 |
|
George Hotz
|
241e533451
|
toposort recursive_property is faster (#13446)
|
2025-11-24 22:29:15 -08:00 |
|