George Hotz
a8dca47fbc
fix
2025-10-03 10:52:18 +08:00
George Hotz
769db23df6
that
2025-10-03 10:45:47 +08:00
George Hotz
1cd40941c8
delete junk
2025-10-03 10:35:12 +08:00
George Hotz
9a607e69e1
Merge branch 'master' into support_opts_in_contig
2025-10-03 10:32:08 +08:00
George Hotz
9cd365c12e
little changes from double gemm (#12429)
...
* little changes from double gemm
* split pm_group_for_reduce
* pm_add_buffers_local
* Revert "pm_add_buffers_local"
This reverts commit 4d30a91db2.
2025-10-03 10:31:51 +08:00
Sieds Lykles
16a65b4fd0
fix test_symbolic_gcd_div hang (#12427)
2025-10-03 04:21:16 +02:00
George Hotz
5b05bf4ab4
Merge branch 'master' into support_opts_in_contig
2025-10-03 10:03:14 +08:00
chenyu
2d24af888b
REWRITE_STACK_LIMIT (#12426)
2025-10-02 21:51:04 -04:00
hooved
1b58ef0d60
Increase stack size limit in unified_rewrite (#12424)
...
* increase stack size limit
* rerun CI due to random tqdm test fail
2025-10-03 09:06:47 +08:00
qazal
17d36d0952
don't tag MSTACK/MSELECT on global buffers (#12423)
...
* don't tag MSTACK/MSELECT
* fix
2025-10-02 13:32:15 +03:00
George Hotz
e59e0aadc1
opts
2025-10-02 18:25:38 +08:00
George Hotz
3688afa513
fix swap
2025-10-02 18:17:03 +08:00
George Hotz
dae164ffb1
warp
2025-10-02 18:04:45 +08:00
George Hotz
3fb3dd4c06
flash attention with two gemms
2025-10-02 17:48:48 +08:00
George Hotz
e5028d58e9
flash attention sort of works
2025-10-02 17:29:23 +08:00
chenyu
7b3912d8e4
relax atol for some tests (#12422)
2025-10-02 05:04:44 -04:00
chenyu
98163832e4
update RANGEIFY test_cast_padded (#12421)
...
* update RANGEIFY test_cast_padded
* update test
2025-10-02 04:37:35 -04:00
George Hotz
5a602e6c36
double wmma works
2025-10-02 16:06:46 +08:00
chenyu
37beef6de3
add null bert training test in ci (#12420)
...
fails with RANGEIFY `RuntimeError: children not making progress`
2025-10-02 04:05:19 -04:00
George Hotz
6640514555
demote works on both matmuls
2025-10-02 15:42:22 +08:00
qazal
f21851b099
ops: n^2 .device property fix (#12419)
...
* test case for a long rand chain
currently failing with RANGEIFY because device propagates too deep
* skip
* ops: n^2 .device property fix
* unskip
---------
Co-authored-by: Chen-Yu Yang <chenyu@fastmail.com>
2025-10-02 03:28:12 -04:00
b1tg
ec177c80c2
rangeify: fix test_where_fold (llvm) (#12416)
...
* rangeify: fix test_where_fold (AMD_LLVM)
* rm comment
2025-10-02 02:57:49 -04:00
qazal
13a25b2e67
rangeify: don't shape INDEX on kernelize (#12417)
2025-10-02 09:45:37 +03:00
hooved
5d9035f5a6
Eval for Stable Diffusion mlperf (#12316)
...
* add diff
* rerun ci
* refactor beam workaround, add test
* fix conflict
* linting
2025-10-02 02:35:38 -04:00
hooved
0f804c9a83
Stable Diffusion model init for mlperf (#12314)
...
* include clip pr diff
* updated unet and sd init
* dehardcode default device
* revert beam hang workaround
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-10-02 02:28:41 -04:00
George Hotz
2fbd7d21f9
tc passes
2025-10-02 14:27:52 +08:00
George Hotz
0eee93f0c0
hotfix: disable split ranges for non rangeify
2025-10-02 13:15:24 +08:00
George Hotz
f32a497f08
bug
2025-10-02 13:10:07 +08:00
George Hotz
3fd25a425b
fix gfr
2025-10-02 13:06:31 +08:00
George Hotz
3da569c20b
Merge branch 'master' into support_opts_in_contig
2025-10-02 12:57:43 +08:00
George Hotz
583553f467
split ranges (#12411)
...
* split ranges
* simpler
* split ranges
* range str
* fix test
* oops
* faster
* no group 2
* tests
* dont_sub_ranges_for_image
* revert that
2025-10-02 12:57:22 +08:00
qazal
6fc6b51b59
fix limit_bufs with kernelize (#12415)
2025-10-02 07:49:11 +03:00
George Hotz
9d5d4b248c
Merge branch 'master' into support_opts_in_contig
2025-10-02 12:39:50 +08:00
qazal
d1c868f990
fix limit_bufs with multi (#12414)
2025-10-02 05:51:56 +03:00
qazal
2fcd55583f
allow less kernels in external_test_opt (#12412)
...
* allow less kernels in external_test_opt
* this was always 2
2025-10-02 05:05:42 +03:00
qazal
8b48e19ce2
skip more multi remote tests (#12410)
2025-10-02 04:50:46 +03:00
George Hotz
3770dd9d80
annotate bufferize in viz
2025-10-02 09:20:50 +08:00
qazal
5b649616ff
rangeify: detect and assert cycles (#12405)
...
* rangeify: assert cycles
* rng=2
* any
2025-10-02 03:39:43 +03:00
Sieds Lykles
9a64fc0d28
Load alt value with cast try 2 (#12407)
...
* add or_casted
* add tests and fix old tests
* cast load
* move that to pm_render
* add allow_any_len to gated load patterns in renderers
* slice [:2]
2025-10-02 00:55:29 +02:00
nimlgen
3e0e0290ce
increase timeout in test_module_runs (#12408)
2025-10-01 22:01:44 +03:00
Sieds Lykles
2f8ac77c25
add allow_any_len to gated load patterns in renderers (#12406)
2025-10-01 20:35:32 +02:00
George Hotz
89bed28716
split reduceop (#12404)
...
* some rangeify tests fixed
* bring split reduceop to rangeify
* fix tests
2025-10-01 18:45:16 +08:00
George Hotz
74ee305948
some rangeify tests fixed (#12403)
2025-10-01 18:23:37 +08:00
qazal
f198a9e1ba
skip test_multihost_aware_schedule, assign devices mismatch (#12396)
...
* minimal failing remote test
* this should've never worked?
* skip that test
2025-10-01 13:09:15 +03:00
b1tg
ac3d457d5e
rangeify: TestReduceOpsConstFolding (#12397)
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-10-01 17:58:19 +08:00
George Hotz
c449e8eb17
don't change that
2025-10-01 17:47:43 +08:00
George Hotz
3dc1b2e98e
broken
2025-10-01 17:27:17 +08:00
George Hotz
8e6126160f
Merge branch 'master' into support_opts_in_contig
2025-10-01 17:20:51 +08:00
George Hotz
60e52fbe36
support opts in contig, simpler (#12400)
2025-10-01 17:20:04 +08:00
chenyu
6c95b1f39d
explicitly set device for CI unit test (#12399)
2025-10-01 05:16:54 -04:00