Roelof van Dijk
d3e125d05d
keyword changed (import reserved in python) ( #13477 )
2025-11-27 11:23:00 -08:00
qazal
72ef533d9c
tracing: use u32 for buffer args encoding ( #13472 )
2025-11-28 00:19:51 +08:00
George Hotz
18addc0a1d
process replay only get_program ( #13475 )
2025-11-27 08:18:18 -08:00
George Hotz
a8e005b095
enable process replay (non-checking) by default ( #13474 )
2025-11-27 07:28:44 -08:00
qazal
952a6a8b10
viz: add kernel buffers back to the sidebar ( #13471 )
2025-11-27 22:10:35 +08:00
Kirill R.
57869387f9
Update wording in mnist.md ( #13469 )
2025-11-27 05:59:49 -08:00
nimlgen
1d207eca3d
cuda: fix fmt in compiler ( #13470 )
2025-11-27 16:51:17 +03:00
qazal
2df8a3474e
viz: bring back flops and mem in sidebar ( #13467 )
2025-11-27 17:27:44 +08:00
George Hotz
05cd2279d0
add cache on reshape ( #13466 )
* remove cache on divmod, way less objects
* _apply_reshape
* reshape
* no gc on realize
* wow that cache is fast
2025-11-26 18:57:40 -08:00
George Hotz
f4123b66df
add DEBUG_GC ( #13465 )
* add DEBUG_GC
* fixup create_schedule_with_vars
* work
2025-11-26 17:44:44 -08:00
George Hotz
19228e8d37
test_graph is flaky
2025-11-26 16:37:42 -08:00
George Hotz
268b3eb392
factor scheduling into complete_create_schedule_with_vars ( #13464 )
2025-11-26 15:43:27 -08:00
George Hotz
e4cd649ff0
remove kernelize to prepare for refactors ( #13463 )
* remove kernelize to prepare for refactors
* less kernelize
* last test
2025-11-26 14:18:50 -08:00
qazal
b63e5a7568
viz: full range x axis scroll ( #13459 )
2025-11-26 21:28:07 +08:00
qazal
c12e218751
viz: double click on INST wave ( #13458 )
2025-11-26 21:12:40 +08:00
qazal
e9cb738c7a
viz: event sidebar cleanup ( #13457 )
2025-11-26 19:47:15 +08:00
qazal
2a3b665972
viz: initial zoom at first event ( #13456 )
* viz: initial zoom at first event
* sidebar work
2025-11-26 16:42:06 +08:00
Christopher Milan
b2af92c821
fix HCQGraph.__del__ bug when finalizing ( #13298 )
* fix _do_ioctl import
* fix circular import
* suppress_finalizing instead
2025-11-25 20:33:48 -08:00
qazal
8c1e2a42fd
viz: start work on profiler speed ( #13455 )
2025-11-26 07:54:04 +08:00
wozeparrot
ffc31a23f4
tk mi350 ( #13288 )
2025-11-25 15:49:44 -08:00
nimlgen
436ab6bfc7
nv: use opt multiple vaspaces ( #13453 )
2025-11-25 23:10:21 +03:00
qazal
7238df7a94
viz: cleanup sort_fn ( #13454 )
2025-11-26 04:10:10 +08:00
qazal
5520f1fb0b
viz: per cu timeline ( #13451 )
* add cu_loc
* work
* WAVE -> W
2025-11-26 00:05:20 +08:00
qazal
4a9562e353
viz: draw markers on top ( #13449 )
* viz: draw markers on top
* create generic label drawer
* same text rendering infrastructure for markers
* minor details
* diff
2025-11-25 17:27:01 +08:00
George Hotz
5373fd2d66
add user device ( #13447 )
* add user device
* add device_sort_fn (#13448 )
Co-authored-by: qazal <qazal.software@gmail.com>
* linter
* order by dname
---------
Co-authored-by: qazal <qazal.software@gmail.com>
2025-11-25 15:25:45 +08:00
George Hotz
241e533451
toposort recursive_property is faster ( #13446 )
2025-11-24 22:29:15 -08:00
George Hotz
8e8fec408e
fix n^2 _apply_map_to_tensors [pr] ( #13443 )
* clean up slow rules
* fix rule
* non n^2 toposort
* topovisit
* state dict profile_marker
2025-11-24 18:59:16 -08:00
wozeparrot
249553a119
tinyfs tweaks ( #13444 )
2025-11-24 18:07:32 -08:00
wozeparrot
f46bc31156
tk: start and step in range ( #13442 )
2025-11-24 15:43:24 -08:00
George Hotz
cc5e6323ac
stable diffusion profiling ( #13441 )
* stable diffusion profiling
Signed-off-by: George Hotz <geohot@gmail.com>
* profile_marker
* profile per step
* fix slow Context
* profile that
---------
Signed-off-by: George Hotz <geohot@gmail.com>
2025-11-24 15:25:45 -08:00
nimlgen
18cfb54736
amd: a bit better se limiting ( #13440 )
* amd: a bit better se limiting
* SQTT_LIMIT_SE=0
2025-11-24 21:51:47 +03:00
C T
2d53029be3
Whisper less flaky tests ( #13435 )
* use less flaky metric for whisper long transcription
* multiline long transcription 3 reference
* fix reference transcript
see https://homepage.ntu.edu.tw/~karchung/miniconversations/MC.htm
sanitized for whisper
* try lower wer threshold
* add test for wer metric
* extract TRANSCRIPTION_3_ALT
* rename test
* rename
* add tests for high WER difference
* move tests
* sync metric
2025-11-24 09:50:49 -08:00
qazal
2a9bd12700
sqtt: add occupancy events to the timeline ( #13430 )
2025-11-24 22:28:05 +08:00
Sieds Lykles
63a931ff76
Symbolic divisor fuzzer ( #13433 )
* render z3 range better
* working version
* rename
* add to workflow
* factor out variable_names
* smaller expressions
* smaller
* + back
2025-11-23 20:29:32 +01:00
nimlgen
677db34eba
nv: cleanup map flags ( #13434 )
2025-11-23 19:54:52 +03:00
qazal
712c7a6448
sqtt loader cleanups from the occupancy branch ( #13431 )
* cleanup err handling
* from disasms
* s/wave_execs/wave_insts
2025-11-23 21:50:34 +08:00
George Hotz
9d7a17ee39
beautiful SQTT_PARSE=1 with color ( #13428 )
* beautiful SQTT_PARSE=1 with color
* linter
* linter 2
* a few more labels
* filter and or
* wave alloc
* a few more
2025-11-23 01:05:14 -08:00
qazal
474a631877
viz: align left offset for nested items ( #13420 )
2025-11-23 14:22:51 +08:00
George Hotz
da0aa57a3b
add cu parsing to attempt_sqtt_parse
2025-11-22 22:09:05 -08:00
qazal
320ed78803
can view wave timeline with SQTT_ITRACE_SE_MASK=0 ( #13427 )
2025-11-23 13:55:47 +08:00
Pranil
c1838c71fc
display service name typo ( #13426 )
it's tinybox-display.service
2025-11-22 20:49:56 -08:00
George Hotz
5110409339
continue work on parse sqtt, enable with SQTT_PARSE ( #13425 )
* continue work on parse sqtt, enable with SQTT_PARSE
* fix timing
* delta is pre instruction
* hi8 values
* a few more
* a bit more
* let it crash if you enabled it
* figure out simd
* hide 0x11
2025-11-22 19:03:17 -08:00
George Hotz
92170d0ff1
lil op cleanup ( #13424 )
* track flag count and op count
* text
* more
* file count
* lil op cleanup
* cleanups
* move
2025-11-22 15:21:15 -08:00
George Hotz
423b76a852
improve sqtt format parser (saturday coffee shop project) ( #13419 )
* improve sqtt format parser
* actually read the trash code ChatGPT wrote
* cleanups
* hand written parser
* quality
* more
* was missing first packet
* maybe
* filt
* fixups
* label the waves
* progress
2025-11-22 15:04:10 -08:00
George Hotz
9d6cf3472e
remove op/sentinel
2025-11-22 15:01:47 -08:00
Christopher Milan
310da2a201
remove hashFiles in setup-tinygrad ( #13423 )
* fix hashFiles in setup-tinygrad on macos
* remove hashFiles altogether
2025-11-22 17:47:10 -05:00
qazal
c14033e10f
viz: faster startup time with SQTT=1 ( #13337 )
* roc.py cleanups
* direct append
* viz index cleanup
* simd row details
* add kernel arg
* late instructions decode
* more instruction decode to sep server request
* 200ms startup, 6 seconds to waves timeline
* sort units
* creating new http paths is easy now
* instructions unpacker
* min diff, use hyphens
* summary table
2025-11-22 22:02:30 +08:00
qazal
1655fdb6de
viz: cleanup sqtt loader ( #13417 )
2025-11-22 20:10:23 +08:00
qazal
903eec3754
fix sz.py tinygrad import in ci ( #13418 )
2025-11-22 19:20:26 +08:00
nimlgen
3a42680e22
amd: pmc generic arch for gfx10+ ( #13407 )
2025-11-22 12:31:23 +03:00