Commit Graph

4667 Commits

Author SHA1 Message Date
Sieds Lykles
16a65b4fd0 fix test_symbolic_gcd_div hang (#12427) 2025-10-03 04:21:16 +02:00
chenyu
7b3912d8e4 relax atol for some tests (#12422) 2025-10-02 05:04:44 -04:00
chenyu
98163832e4 update RANGEIFY test_cast_padded (#12421)
* update RANGEIFY test_cast_padded

* update test
2025-10-02 04:37:35 -04:00
qazal
f21851b099 ops: n^2 .device property fix (#12419)
* test case for a long rand chain

currently failing with RANGEIFY because device propagates too deep

* skip

* ops: n^2 .device property fix

* unskip

---------

Co-authored-by: Chen-Yu Yang <chenyu@fastmail.com>
2025-10-02 03:28:12 -04:00
qazal
13a25b2e67 rangeify: don't shape INDEX on kernelize (#12417) 2025-10-02 09:45:37 +03:00
hooved
5d9035f5a6 Eval for Stable Diffusion mlperf (#12316)
* add diff

* rerun ci

* refactor beam workaround, add test

* fix conflict

* linting
2025-10-02 02:35:38 -04:00
hooved
0f804c9a83 Stable Diffusion model init for mlperf (#12314)
* include clip pr diff

* updated unet and sd init

* dehardcode default device

* revert beam hang workaround

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-10-02 02:28:41 -04:00
George Hotz
583553f467 split ranges (#12411)
* split ranges

* simpler

* split ranges

* range str

* fix test

* oops

* faster

* no group 2

* tests

* dont_sub_ranges_for_image

* revert that
2025-10-02 12:57:22 +08:00
qazal
6fc6b51b59 fix limit_bufs with kernelize (#12415) 2025-10-02 07:49:11 +03:00
qazal
2fcd55583f allow less kernels in external_test_opt (#12412)
* allow less kernels in external_test_opt

* this was always 2
2025-10-02 05:05:42 +03:00
qazal
8b48e19ce2 skip more multi remote tests (#12410) 2025-10-02 04:50:46 +03:00
Sieds Lykles
9a64fc0d28 Load alt value with cast try 2 (#12407)
* add or_casted

* add tests and fix old tests

* cast load

* move that to pm_render

* add allow_any_len to gated load patterns in renderers

* slice [:2]
2025-10-02 00:55:29 +02:00
nimlgen
3e0e0290ce increase timeout in test_module_runs (#12408) 2025-10-01 22:01:44 +03:00
George Hotz
89bed28716 split reduceop (#12404)
* some rangeify tests fixed

* bring split reduceop to rangeify

* fix tests
2025-10-01 18:45:16 +08:00
George Hotz
74ee305948 some rangeify tests fixed (#12403) 2025-10-01 18:23:37 +08:00
qazal
f198a9e1ba skip test_multihost_aware_schedule, assign devices mismatch (#12396)
* minimal failing remote test

* this should've never worked?

* skip that test
2025-10-01 13:09:15 +03:00
George Hotz
60e52fbe36 support opts in contig, simpler (#12400) 2025-10-01 17:20:04 +08:00
chenyu
6ba8bf282f skip test_masked_select for RANGEIFY PYTHON (#12395) 2025-10-01 04:13:31 -04:00
chenyu
adc8c3b28f Revert "load alt value with cast (#12384)" (#12392)
This reverts commit 05e91a248d.
2025-10-01 03:20:04 -04:00
qazal
90b1c0dd96 rangeify: test_where_fold kernel count (#12379)
* rangeify: test_where_fold kernel count

* get these from the index

* replace ranges

* fine

* movement ops

* diff

* better
2025-10-01 09:35:12 +03:00
b1tg
42748ccb92 rangeify: fix test_prequant_conv2d_1x1 (#12391) 2025-10-01 02:33:47 -04:00
Sieds Lykles
05e91a248d load alt value with cast (#12384)
* add or_casted

* add tests and fix old tests

* cast load

* move that to pm_render
2025-10-01 07:14:26 +02:00
b1tg
57ad46c6e4 rangeify: increase atol for test_two_binops_no_rerun passing on real windows machine (#12389)
CPU_LLVM=1
2025-10-01 00:56:45 -04:00
chenyu
0662946fac atol in test_two_binops_no_rerun (#12387)
for RANGEIFY LLVM
2025-10-01 00:05:47 -04:00
wozeparrot
4204edc60b feat: skip test_long (#12383) 2025-09-30 20:07:39 -07:00
George Hotz
4c9a930de2 rangeify attn tests (#12377) 2025-10-01 09:59:19 +08:00
hooved
969a1b35ca LR scheduler for Stable Diffusion mlperf training (#12201)
* add lr scheduler for stable diffusion training

* add lr scheduler test

* rerun ci

* rerun CI

* use np for testing

* move test to CI path

* remove unneeded copy
2025-09-30 21:21:08 -04:00
George Hotz
9ef319f349 bad conv in rangeify (#12373)
* bad conv with broken rangeify

* no maxpool needed

* add empty_like

* typo

* no self

* issue remains for test
2025-10-01 08:56:22 +08:00
George Hotz
44558a37f7 fix some rangeify tests (#12370)
* fix bad range merges

* fix rng

* fix uop gc

* fix some rangeify tests

* now that needs rangeify 2 also
2025-09-30 20:12:08 +08:00
nimlgen
2c397eb2a2 rangeify: buf limit (#12336)
* limit bufs

* g

* fix buffer limit

* um?

* fix

* only these?

* typo

* f

* cleaner
2025-09-30 14:59:47 +03:00
George Hotz
a83f219253 fix bad range merges (#12368)
* fix bad range merges

* fix rng

* fix uop gc
2025-09-30 19:30:21 +08:00
qazal
a95159d579 remove TestShapeSpec, it relies on ShapeTracker [pr] (#12369) 2025-09-30 14:20:35 +03:00
qazal
de1d562b69 rangeify: update test_pickle asserts (#12366)
* realized exists on the base

* use is_realized
2025-09-30 13:27:41 +03:00
qazal
e8c595c29e remu: add new instructions introduced in RANGEIFY (#12363)
* add v_mad_i64_i32 for test_output_padded_conv_transpose2d

* run amd test_ops

* skip test_masked_select
2025-09-30 12:36:29 +03:00
qazal
109c63b904 update Tensor unit tests for RANGEIFY (#12359)
* update test_kernelize for RANGEIFY

* also kernelizes user contiguous

* skip that test

* tensor uop repr

* 4 kernels, still realizes a float
2025-09-30 11:17:21 +03:00
George Hotz
7129419500 fix cifar training in RANGEIFY (#12355)
* fix cifar training in RANGEIFY

* even more wino fuse

* bugfix

* test to show issue
2025-09-30 15:59:19 +08:00
qazal
4ff7f20b9d rangeify: fix kernelize (#12357) 2025-09-30 10:10:08 +03:00
chenyu
86c5c969ea linalg cosmetic change (#12356) 2025-09-30 03:00:59 -04:00
qazal
6a56d3c859 rangeify: only test correctness in multi (#12339)
* work

* more work

* back here

* skip tests

* work
2025-09-30 09:55:59 +03:00
George Hotz
ab6b0d3a21 enable cleanup_dead_axes (#12351)
* enable cleanup_dead_axes

* don't mess with user contig

* correct tag behavior

* double reshape isn't correct

* block on assign too

* skip messing with symbolic

* Fix tests

* disable RANGEIFY=2

* test w rangeify
2025-09-30 14:09:39 +08:00
Sieds Lykles
73b25bf47d z3 fix loaded mask (#12353)
* z3 fix loaded mask

* indentation
2025-09-30 06:55:50 +02:00
wozeparrot
2a0caa09c2 push copy to disk (#12348) 2025-09-29 21:55:05 -07:00
hooved
39aae679e4 Support bfloat16 on NULL backend (#12340)
* add failing test

* move test

* only run test with NULL default

* add skip reason

* add fix
2025-09-30 00:02:30 -04:00
George Hotz
f522e83a02 fix rangeify elu fusion for openpilot (#12341)
* fix rangeify elu fusion for openpilot

* flip the metadata

* copy over permuted contiguous support

* this is correct

* update that
2025-09-30 11:41:52 +08:00
Sieds Lykles
d55d829635 Lower index dtype spec fix (#12337)
* new pm_lower_index_dtype

* load_store_indexing after index lowering

* shorten line

* separate rule for long removal

* fix test

* fix index_to_concrete_int

* minor fixes

* add sink there

* update types in linearizer test
2025-09-30 04:26:50 +02:00
hooved
c2689c505e Clip model updates for Stable Diffusion mlperf training (#12313)
* stable diffusion mlperf clip changes

* add clip tests

* set gelu as attribute

* add more tests

* factor out GPUS

* rerun CI

* add imports to if blocks

* remove unneeded axis

* add clip tests to CI

* move clip tests

* add deps, disable max buf size
2025-09-29 21:50:14 -04:00
George Hotz
cdfa0f29fd add rendering to index (#12338) 2025-09-30 09:18:05 +08:00
qazal
9513f025c5 apply multi before rangeify (#12298)
* it doesn't realize it when i reshape

* cleaner graph

* map out

* REDUCE_AXIS also gives the wrong answer

* maybe

* work

* back here

* try

* more

* refactor tests

* check MultiBuffer

* or copy

* fine with this

* don't need graph_rewrite_map in rangeify
2025-09-29 14:16:31 +03:00
George Hotz
3291e00df7 fix efficientnet slowness on rangeify (#12332) 2025-09-29 18:01:01 +08:00
chenyu
9d2f2b8e34 skip test_mean_half_precision_overflow (#12331)
it only works with SPLIT_REDUCEOP=1
2025-09-29 05:15:04 -04:00