George Hotz
2c90f3ea76
split test_advancedindex
2025-10-14 19:02:21 +08:00
George Hotz
d99457657b
svd nonfull in parallel
2025-10-14 18:50:11 +08:00
George Hotz
8a34a4e2c7
fix up some slow tests that launch python
2025-10-14 18:42:42 +08:00
qazal
d3bfcd3277
minor patches for SQTT over usb on gfx12 ( #12627 )
...
* disable cpu_access in the sqtt buffer allocation
not sure if this is required, it results in a very slow call to
pcie_mem_write over USB GPU, removing it worked fine.
* fix itrace_se_mask on gfx12
on gfx11 it gave 6 se, on gfx12 this value is 2 so no instructions were
traced.
* Revert "fix itrace_se_mask on gfx12"
This reverts commit 0644adbcd1.
2025-10-14 18:07:46 +08:00
Sieds Lykles
1e6e5a0efd
parse_valid returns None instead of raising ( #12663 )
...
* parse_valid returns None
* change there too
2025-10-14 11:57:38 +02:00
qazal
471bd30d16
cleanup viz/serve.py ( #12665 )
...
* use load_pickle
* update comment
2025-10-14 17:50:39 +08:00
George Hotz
fb61f3519f
remove assign contiguous hack ( #12659 )
...
* remove assign contiguous hack
* remove bad contiguous usage in torch backend
* assign
2025-10-14 16:42:14 +08:00
George Hotz
30ee7c4c26
cleanup Device usage in Tensor ( #12662 )
2025-10-14 16:22:22 +08:00
Sieds Lykles
e06cbfcb8a
combine pm_drop_and_clauses ( #12660 )
...
* combine those
* wino kernels decreased
2025-10-14 10:09:41 +02:00
George Hotz
84d4589ed4
remove pylint from pre-commit and CI ( #12658 )
...
* remove pylint from pre-commit and CI
* multidevice test is fast
* faster pre-commit
* 8 is faster than 4
* better name
* how did that typecheck?
2025-10-14 15:39:59 +08:00
qazal
8ecaf839e2
cleanup UOp tracing [pr] ( #12657 )
2025-10-14 14:50:59 +08:00
George Hotz
b9eb5b5d49
clean up the LLM tokenizer ( #12653 )
...
* clean up the LLM tokenizer
* simple tokenizer is actually simple
* ugh write good code
2025-10-14 14:22:01 +08:00
qazal
a9ef93176f
viz: add colored text helper ( #12654 )
2025-10-14 13:05:26 +08:00
George Hotz
ecdc7539a2
add typing to MathTraits ( #12650 )
...
* add typing to MathTraits
* fix assign
2025-10-14 12:35:20 +08:00
qazal
9bf032de69
viz: keep focused shape in view ( #12648 )
2025-10-14 10:49:08 +08:00
chenyu
77b5e6774e
fix bert training config ( #12647 )
...
FREE_INTERMEDIATE=0 REWRITE_STACK_LIMIT=500000
2025-10-13 15:03:47 -04:00
nimlgen
f1041dc0ac
pylint 4.0.0 ( #12642 )
...
* cpu: fix spacing
* fix pylint
* fix pylint
* pylint 4.0.0
* lambda
* keep eval for now
* im so sorry
2025-10-13 23:28:36 +08:00
wozeparrot
47e0c43976
feat: Tensor.{load, store} ( #12629 )
2025-10-13 08:04:41 -07:00
chenyu
0f776c6e46
examples/mlperf/training_submission_v6.0 ( #12644 )
...
copied from v5.1
2025-10-13 09:58:25 -04:00
Sieds Lykles
e0139fafc1
UOp symbolic tests use eval to check against string ( #12643 )
2025-10-13 14:19:42 +02:00
b1tg
218225e8d0
pylint error ( #12630 )
...
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-10-13 05:05:12 -07:00
nimlgen
9096d7cc2e
amd: support for rx9060 ( #12640 )
2025-10-13 19:44:15 +08:00
qazal
066d25f5fb
refactor to trace_num property in buffers ( #12638 )
2025-10-13 18:06:55 +08:00
qazal
cd6aeebfee
sqtt: osx decoder installer ( #12637 )
2025-10-13 17:26:12 +08:00
Sieds Lykles
e537e895b1
drop unused invalid conditions ( #12635 )
...
* drop where conditions if the ranges are not used inside the index
* remove allow_any_len
2025-10-13 10:52:21 +02:00
wozeparrot
9ab06dffad
hotfix: block from env ( #12628 )
2025-10-12 08:07:32 -07:00
wozeparrot
12435a2dab
actual tinyfs device ( #12620 )
2025-10-12 07:51:17 -07:00
chenyu
8f5f57c7d9
smaller CNT fuzz shapetracker ( #12626 )
2025-10-12 08:52:30 -04:00
George Hotz
1ecf403294
cleanup long lines [pr] ( #12623 )
...
* cleanup long lines
* more
* a few more
* all noqa fixed
* fix amd + cuda
* clean that up
2025-10-12 20:18:05 +08:00
qazal
fd51ecf983
process_replay for get_rangeify_map ( #12624 )
2025-10-12 15:14:40 +03:00
qazal
b5afa3848e
viz: fix memory graph total nbytes ( #12622 )
...
* viz: fix memory graph total nbytes
* post increment
* simple regression test
* loop with markers + slightly off text baseline
* cpu events clear
2025-10-12 14:32:46 +03:00
nimlgen
822eab057f
cpu: respect taskset + allow all cores ( #12619 )
...
* cpu: account taskset + allow all cores
* spaces
2025-10-12 14:31:40 +08:00
chenyu
7ac74d1550
remove unused type ignore [pr] ( #12618 )
2025-10-11 21:24:04 -04:00
Sieds Lykles
772a8dfe31
reshape uses valid when simplifying ( #12597 )
...
* reshape uses valid when simplifying
* try with IGNORE_OOB=0
* is it this test?
* skipif gpuocelot
2025-10-11 17:02:54 +02:00
nimlgen
08e62454b6
amd: use cpu_view() in sqtt ( #12610 )
2025-10-11 18:11:25 +08:00
Sieds Lykles
a2ae56674a
uop_given_valid try multiple clauses ( #12615 )
...
* uop_given_valid uses less simplify
* enable test
* try all expressions together
* enable test
2025-10-11 11:53:42 +02:00
Sieds Lykles
dccdd190aa
uop_given_valid uses less simplify ( #12612 )
...
* uop_given_valid uses less simplify
* enable test
2025-10-11 10:57:39 +02:00
qazal
9205527db0
viz: draw highlights above shapes ( #12613 )
2025-10-11 11:39:13 +03:00
George Hotz
cab034b863
improve typing ( #12611 )
...
* improve typing and bump to 3.11
* no need for Self yet
* improve typing
* binop also
2025-10-11 16:20:23 +08:00
Sieds Lykles
4300ebc455
cache apply_movement_op ( #12609 )
...
* cache apply_movement_op
* pylint and clear cache
* fix types
* ignore
* cleanup
2025-10-11 08:53:10 +02:00
George Hotz
7596c1b8f5
TestOuterworldReduce works ( #12608 )
2025-10-10 20:06:41 +08:00
chenyu
001b3710d3
enable some test_ops tests ( #12607 )
2025-10-10 07:23:21 -04:00
qazal
a62dc9ceb5
viz: light up buffer path ( #12603 )
2025-10-10 14:07:30 +03:00
qazal
464c56862f
viz: update ansi regex ( #12605 )
...
* viz: update ansi regex
* better
* add ansi_colors_light
* javascript
2025-10-10 13:58:58 +03:00
George Hotz
ac96d98745
GROUP_REDUCE is now bright RED instead of green ( #12604 )
2025-10-10 18:23:57 +08:00
nimlgen
89be3590aa
amd: sqtt on gfx12 ( #12564 )
...
* amd: sqtt on gfx12
* cleaner
* this
* and this
* ops
* ugh
* back
* rm this
* rm
2025-10-10 17:54:14 +08:00
chenyu
95ad047445
do not use sint_to_uop in renderer [pr] ( #12601 )
2025-10-10 05:29:10 -04:00
Sieds Lykles
e625c27598
update min step times openpilot ( #12600 )
2025-10-10 11:24:27 +02:00
nimlgen
6ec96f6088
amd: remove dup flags in sqtt ( #12595 )
2025-10-10 17:23:33 +08:00
wozeparrot
9471157346
feat: bump llvm version ( #12598 )
2025-10-10 02:20:22 -07:00