11171 Commits

Author SHA1 Message Date
Irwin1138
c18ad6f937 test github action install cuda-toolkit 2025-11-24 21:27:47 +02:00
Irwin1138
00aa943e91 fix linter errors 2025-11-23 22:23:26 +02:00
Irwin1138
7aea5256b1 fix cuda on windows 2025-11-23 22:05:01 +02:00
Sieds Lykles
63a931ff76 Symbolic divisor fuzzer (#13433)
* render z3 range better

* working version

* rename

* add to workflow

* factor out variable_names

* smaller expressions

* smaller

* + back
2025-11-23 20:29:32 +01:00
nimlgen
677db34eba nv: cleanup map flags (#13434) 2025-11-23 19:54:52 +03:00
qazal
712c7a6448 sqtt loader cleanups from the occupancy branch (#13431)
* cleanup err handling

* from disasms

* s/wave_execs/wave_insts
2025-11-23 21:50:34 +08:00
George Hotz
9d7a17ee39 beautiful SQTT_PARSE=1 with color (#13428)
* beautiful SQTT_PARSE=1 with color

* linter

* linter 2

* a few more labels

* filter and or

* wave alloc

* a few more
2025-11-23 01:05:14 -08:00
qazal
474a631877 viz: align left offset for nested items (#13420) 2025-11-23 14:22:51 +08:00
George Hotz
da0aa57a3b add cu parsing to attempt_sqtt_parse 2025-11-22 22:09:05 -08:00
qazal
320ed78803 can view wave timeline with SQTT_ITRACE_SE_MASK=0 (#13427) 2025-11-23 13:55:47 +08:00
Pranil
c1838c71fc display service name typo (#13426)
its tinybox-display.service
2025-11-22 20:49:56 -08:00
George Hotz
5110409339 continue work on parse sqtt, enable with SQTT_PARSE (#13425)
* continue work on parse sqtt, enable with SQTT_PARSE

* fix timing

* delta is pre instruction

* hi8 values

* a few more

* a bit more

* let it crash if you enabled it

* figure out simd

* hide 0x11
2025-11-22 19:03:17 -08:00
George Hotz
92170d0ff1 lil op cleanup (#13424)
* track flag count and op count

* text

* more

* file count

* lil op cleanup

* cleanups

* move
2025-11-22 15:21:15 -08:00
George Hotz
423b76a852 improve sqtt format parser (saturday coffee shop project) (#13419)
* improve sqtt format parser

* actually read the trash code ChatGPT wrote

* cleanups

* hand written parser

* quality

* more

* was missing first packet

* maybe

* filt

* fixups

* label the waves

* progress
2025-11-22 15:04:10 -08:00
George Hotz
9d6cf3472e remove op/sentinel 2025-11-22 15:01:47 -08:00
Christopher Milan
310da2a201 remove hashFiles in setup-tinygrad (#13423)
* fix hashFiles in setup-tinygrad on macos

* remove hashFiles altogether
2025-11-22 17:47:10 -05:00
qazal
c14033e10f viz: faster startup time with SQTT=1 (#13337)
* roc.py cleanups

* direct append

* viz index cleanup

* simd row details

* add kernel arg

* late instructions decode

* more instruction decode to sep server request

* 200ms startup, 6 second to waves timeline

* sort units

* creating new http paths is easy now

* instructions unpacker

* min diff, use hyphens

* summary table
2025-11-22 22:02:30 +08:00
qazal
1655fdb6de viz: cleanup sqtt loader (#13417) 2025-11-22 20:10:23 +08:00
qazal
903eec3754 fix sz.py tinygrad import in ci (#13418) 2025-11-22 19:20:26 +08:00
nimlgen
3a42680e22 amd: pmc generic arch for gfx10+ (#13407) 2025-11-22 12:31:23 +03:00
George Hotz
1f8b24a6b9 track flag count and op count (#13416)
* track flag count and op count

* text

* more

* file count
2025-11-21 22:46:33 -08:00
George Hotz
4c0f4226b9 delete the PRECAST op [p] (#13415)
* don't use PRECAST in cstyle renderer [p]

* fix in metal

* fix opencl

* __builtin_bit_cast

* precast is unused

* cuda is c99?

* lambda_union_bitcast

* helper function

* delete precast op
2025-11-21 21:47:14 -08:00
wozeparrot
1f648bb1ba feat: reenable mobilenetv2 dsp (#13320) 2025-11-21 15:21:49 -08:00
chenyu
054477a44f remove full_symbolic in simplify (#13413)
only flip one schedule in winograd backward, no functional difference
2025-11-21 15:04:00 -05:00
chenyu
cb29265f23 add test that shows the validhack regression with bad rewrite order (#13411) 2025-11-21 13:48:30 -05:00
qazal
fdfe83880b viz: unique sqtt wave names (#13410)
* viz: unique sqtt wave names

* better name for the shape

* it's a per program counter now

* table view, refactor to wave:insts dict
2025-11-22 02:43:31 +08:00
chenyu
a6c9b4ff6a fix symbolic comments [pr] (#13408) 2025-11-21 09:18:50 -05:00
Sieds Lykles
114bb94c55 Fix load collapse MAX to ADD (#13406)
* add Ops.ADD to pattern

* add test
2025-11-21 12:26:14 +01:00
qazal
87c248eafa small cleanups from viz memory usage fixes (#13405)
* shape link cleanups

* cleanup findRectAtPosition
2025-11-21 17:05:08 +08:00
qazal
0de1b24154 viz: SE : CU : SIMD : WAVE in sqtt timeline (#13404)
* wave id in device rows

* SE : CU : SIMD : WAVE

* automatic width

* better styling

* rm the blue

* sort
2025-11-21 15:42:29 +08:00
George Hotz
dabb02767f set AMD profile mode with sudo on SQTT or PMC (#13403)
* require profile mode

* add mode setter

* cleanup

* not needed

* SQTT_LIMIT_SE
2025-11-20 23:19:11 -08:00
George Hotz
e1051d00d7 multi like on full_like as well as rand_like (#13402)
* multi like on full_like as well as rand_like

* add test and fix bug

* mismatch, optim match

* one line
2025-11-20 20:46:48 -08:00
chenyu
fa3def2f12 call less simplify in simplify_valid_load [pr] (#13401) 2025-11-20 19:54:22 -05:00
qazal
895ec7417e viz: enable mapping function names to colors (#13400) 2025-11-21 06:43:02 +08:00
George Hotz
a74f6020d5 track apply map to tensors (#13399)
* track apply map to tensors

* sub
2025-11-20 14:24:55 -08:00
chenyu
647fde64e6 no sym in pm_reduce [pr] (#13398)
* no sym in pm_reduce [pr]

* fix that
2025-11-20 16:49:09 -05:00
qazal
1313250e0d viz: use system helper for llvm-mca (#13395) 2025-11-21 04:47:25 +08:00
Christopher Milan
de3593957f Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388)
This reverts commit 0901a40685.
2025-11-20 15:36:13 -05:00
qazal
1220072328 viz: refactor to generic steps api (#13393) 2025-11-21 04:33:23 +08:00
George Hotz
26ccbf7040 debufferize with symbolic in one pm (#13392) 2025-11-20 11:47:03 -08:00
George Hotz
c46f608703 top down remove_bufferize (#13391)
* top down remove_bufferize

* removable if ALWAYS_CONTIGUOUS
2025-11-20 11:32:00 -08:00
Christopher Milan
4043489803 set curl -f in setup-tinygrad (#13389)
* set curl -f in setup-tinygrad

* test bad redirect

* Revert "test bad redirect"

This reverts commit ad945e7ffc.
2025-11-20 13:45:47 -05:00
chenyu
0251a8e628 parse_valid minor cleanup [pr] (#13385)
* stricter parse_valid [pr]

* not stricter

* no VCONST

* Revert "no VCONST"

This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.
2025-11-20 13:15:06 -05:00
Christopher Milan
0901a40685 Revert "autogen: fix formatting on zero-argument function-like macros (#13386)" (#13387)
This reverts commit 58d85d4bab.
2025-11-20 12:45:35 -05:00
b1tg
91e289cb14 amd fp8 llvm (#13186)
* amd fp8 llvm support

* fix max

* clean

* add test_mi350.sh

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-11-20 12:35:57 -05:00
Roelof van Dijk
1058748440 torch backend: no aten.detach for torch 2.10 compat (#13381)
* this works, less cpp?

* simpler = better

* keep torch 2.9 working as well
2025-11-20 09:12:15 -08:00
Christopher Milan
58d85d4bab autogen: fix formatting on zero-argument function-like macros (#13386)
* fix formatting on zero-argument function-like macros

* autogen tests should run

* ugh
2025-11-20 12:11:04 -05:00
qazal
9dbc550692 roc: map disassembly to prog name (#13384) 2025-11-20 23:47:19 +08:00
qazal
ebcdf68bab viz: use content headers for profiler (#13383) 2025-11-20 23:33:16 +08:00
nimlgen
0b0ea4981c hcq: unwrap signals (#13382) 2025-11-20 18:12:41 +03:00