Commit Graph

10576 Commits

Author SHA1 Message Date
George Hotz
2c90f3ea76 split test_advancedindex 2025-10-14 19:02:21 +08:00
George Hotz
d99457657b svd nonfull in parallel 2025-10-14 18:50:11 +08:00
George Hotz
8a34a4e2c7 fix up some slow tests that launch python 2025-10-14 18:42:42 +08:00
qazal
d3bfcd3277 minor patches for SQTT over usb on gfx12 (#12627)
* disable cpu_access in the sqtt buffer allocation

not sure if this is required, it results in a very slow call to
pcie_mem_write over USB GPU, removing it worked fine.

* fix itrace_se_mask on gfx12

on gfx11 it gave 6 se, on gfx12 this value is 2 so no instructions were
traced.

* Revert "fix itrace_se_mask on gfx12"

This reverts commit 0644adbcd1.
2025-10-14 18:07:46 +08:00
Sieds Lykles
1e6e5a0efd parse_valid returns None instead of raising (#12663)
* parse_valid returns None

* change there too
2025-10-14 11:57:38 +02:00
qazal
471bd30d16 cleanup viz/serve.py (#12665)
* use load_pickle

* update comment
2025-10-14 17:50:39 +08:00
George Hotz
fb61f3519f remove assign contiguous hack (#12659)
* remove assign contiguous hack

* remove bad contiguous usage in torch backend

* assign
2025-10-14 16:42:14 +08:00
George Hotz
30ee7c4c26 cleanup Device usage in Tensor (#12662) 2025-10-14 16:22:22 +08:00
Sieds Lykles
e06cbfcb8a combine pm_drop_and_clauses (#12660)
* combine those

* wino kernels decreased
2025-10-14 10:09:41 +02:00
George Hotz
84d4589ed4 remove pylint from pre-commit and CI (#12658)
* remove pylint from pre-commit and CI

* multidevice test is fast

* faster pre-commit

* 8 is faster than 4

* better name

* how did that typecheck?
2025-10-14 15:39:59 +08:00
qazal
8ecaf839e2 cleanup UOp tracing [pr] (#12657) 2025-10-14 14:50:59 +08:00
George Hotz
b9eb5b5d49 clean up the LLM tokenizer (#12653)
* clean up the LLM tokenizer

* simple tokenizer is actually simple

* ugh write good code
2025-10-14 14:22:01 +08:00
qazal
a9ef93176f viz: add colored text helper (#12654) 2025-10-14 13:05:26 +08:00
George Hotz
ecdc7539a2 add typing to MathTraits (#12650)
* add typing to MathTraits

* fix assign
2025-10-14 12:35:20 +08:00
qazal
9bf032de69 viz: keep focused shape in view (#12648) 2025-10-14 10:49:08 +08:00
chenyu
77b5e6774e fix bert training config (#12647)
FREE_INTERMEDIATE=0 REWRITE_STACK_LIMIT=500000
2025-10-13 15:03:47 -04:00
nimlgen
f1041dc0ac pylint 4.0.0 (#12642)
* cpu: fix spacing

* fix pylint

* fix pylint

* pylint 4.0.0

* lambda

* keep eval for now

* im so sorry
2025-10-13 23:28:36 +08:00
wozeparrot
47e0c43976 feat: Tensor.{load, store} (#12629) 2025-10-13 08:04:41 -07:00
chenyu
0f776c6e46 examples/mlperf/training_submission_v6.0 (#12644)
copied from v5.1
2025-10-13 09:58:25 -04:00
Sieds Lykles
e0139fafc1 UOp symbolic tests use eval to check against string (#12643) 2025-10-13 14:19:42 +02:00
b1tg
218225e8d0 pylint error (#12630)
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-10-13 05:05:12 -07:00
nimlgen
9096d7cc2e amd: support for rx9060 (#12640) 2025-10-13 19:44:15 +08:00
qazal
066d25f5fb refactor to trace_num property in buffers (#12638) 2025-10-13 18:06:55 +08:00
qazal
cd6aeebfee sqtt: osx decoder installer (#12637) 2025-10-13 17:26:12 +08:00
Sieds Lykles
e537e895b1 drop unused invalid conditions (#12635)
* drop where conditions if the ranges are not used inside the index

* remove allow_any_len
2025-10-13 10:52:21 +02:00
wozeparrot
9ab06dffad hotfix: block from env (#12628) 2025-10-12 08:07:32 -07:00
wozeparrot
12435a2dab actual tinyfs device (#12620) 2025-10-12 07:51:17 -07:00
chenyu
8f5f57c7d9 smaller CNT fuzz shapetracker (#12626) 2025-10-12 08:52:30 -04:00
George Hotz
1ecf403294 cleanup long lines [pr] (#12623)
* cleanup long lines

* more

* a few more

* all noqa fixed

* fix amd + cuda

* clean that up
2025-10-12 20:18:05 +08:00
qazal
fd51ecf983 process_replay for get_rangeify_map (#12624) 2025-10-12 15:14:40 +03:00
qazal
b5afa3848e viz: fix memory graph total nbytes (#12622)
* viz: fix memory graph total nbytes

* post increment

* simple regression test

* loop with markers + slightly off text baseline

* cpu events clear
2025-10-12 14:32:46 +03:00
nimlgen
822eab057f cpu: respect taskset + allow all cores (#12619)
* cpu: account taskset + allow all cores

* spaces
2025-10-12 14:31:40 +08:00
chenyu
7ac74d1550 remove unused type ignore [pr] (#12618) 2025-10-11 21:24:04 -04:00
Sieds Lykles
772a8dfe31 reshape uses valid when simplifying (#12597)
* reshape uses valid when simplifying

* try with IGNORE_OOB=0

* is it this test?

* skipif gpuocelot
2025-10-11 17:02:54 +02:00
nimlgen
08e62454b6 amd: use cpu_view() in sqtt (#12610) 2025-10-11 18:11:25 +08:00
Sieds Lykles
a2ae56674a uop_given_valid try multiple clauses (#12615)
* uop_given_valid uses less simplify

* enable test

* try all expressions together

* enable test
2025-10-11 11:53:42 +02:00
Sieds Lykles
dccdd190aa uop_given_valid uses less simplify (#12612)
* uop_given_valid uses less simplify

* enable test
2025-10-11 10:57:39 +02:00
qazal
9205527db0 viz: draw highlights above shapes (#12613) 2025-10-11 11:39:13 +03:00
George Hotz
cab034b863 improve typing (#12611)
* improve typing and bump to 3.11

* no need for Self yet

* improve typing

* binop also
2025-10-11 16:20:23 +08:00
Sieds Lykles
4300ebc455 cache apply_movement_op (#12609)
* cache apply_movement_op

* pylint and clear cache

* fix types

* ignore

* cleanup
2025-10-11 08:53:10 +02:00
George Hotz
7596c1b8f5 TestOuterworldReduce works (#12608) 2025-10-10 20:06:41 +08:00
chenyu
001b3710d3 enable some test_ops tests (#12607) 2025-10-10 07:23:21 -04:00
qazal
a62dc9ceb5 viz: light up buffer path (#12603) 2025-10-10 14:07:30 +03:00
qazal
464c56862f viz: update ansi regex (#12605)
* viz: update ansi regex

* better

* add ansi_colors_light

* javascript
2025-10-10 13:58:58 +03:00
George Hotz
ac96d98745 GROUP_REDUCE is now bright RED instead of green (#12604) 2025-10-10 18:23:57 +08:00
nimlgen
89be3590aa amd: sqtt on gfx12 (#12564)
* amd: sqtt on gfx12

* cleaner

* thi

* and this

* ops

* ugh

* back

* rm this

* rm
2025-10-10 17:54:14 +08:00
chenyu
95ad047445 do not use sint_to_uop in renderer [pr] (#12601) 2025-10-10 05:29:10 -04:00
Sieds Lykles
e625c27598 update min step times openpilot (#12600) 2025-10-10 11:24:27 +02:00
nimlgen
6ec96f6088 amd: remove dup flags in sqtt (#12595) 2025-10-10 17:23:33 +08:00
wozeparrot
9471157346 feat: bump llvm version (#12598) 2025-10-10 02:20:22 -07:00