Commit Graph

10417 Commits

Author SHA1 Message Date
chenyu
18e9ec3ea1 add wino cifar to search benchmark (#10615)
* add wino cifar to search benchmark

* FUSE_OPTIM=1

* revert those
2025-06-03 20:38:43 -04:00
Bhavya Gada
bafd0c30d7 fix some minor typos and grammar (#10619) 2025-06-03 15:55:25 -07:00
nimlgen
4381b54543 am: disable page migration (#10608)
* am: disable page migration

* fixed

* enable

* fxi

* typ

* fix check
2025-06-03 18:51:28 +03:00
chenyu
1c1f578490 DISABLE_COMPILER_CACHE in sdxl search (#10614) 2025-06-03 09:22:25 -04:00
qazal
ce9f12dc13 reorder cast before masking constants (#10609)
* failing test from fuzzer

* .numpy() handles bfloat16 better

* const->view->cast becomes const->cast->view

* update TestMovedConstFolding.test_cast_padded
2025-06-03 15:44:03 +03:00
qazal
910cabb081 add kernel count to grouper process replay differ [pr] (#10611) 2025-06-03 15:21:27 +03:00
chenyu
26dee71bc1 hotfix don't overwrite acc dtype in scatter_reduce (#10606)
dtype is inferred by individual reduce
2025-06-02 21:17:01 -04:00
ihar
ba02a6331e removed unnecessary 'isinstance(data, UOp)' check (#10605) 2025-06-02 20:58:14 -04:00
nimlgen
07de095b27 am: more info on PFs (#10602)
* am: more info on PFs

* fix
2025-06-02 23:48:40 +03:00
qazal
b8fb2ba829 rename to finalize_gbarrier [pr] (#10596) 2025-06-02 12:55:31 +03:00
Ahmed Harmouche
650404a143 [webgpu] Proper shared mem size for packed types (#10585)
* Proper shared mem size in webgpu

* Add test

* Refactor test
2025-06-01 20:18:33 -04:00
qazal
00822603ec allow stacking of VIEW UOps [pr] (#10532)
* allow stacking of VIEW UOps [pr]

* merge_views is first

* simpler

* loc for pr, this needs a helper

* keep

* diff [pr]

* formatting
2025-06-01 23:27:04 +03:00
qazal
3cc73a0172 simpler process replay main loop [pr] (#10588)
* simpler process replay main loop [pr]

* use logging

* default to 1
2025-06-01 15:03:21 +03:00
qazal
dc882d3d7d merge process replay and viz captures [pr] (#10581)
* refactoring

* test script

* work

* more work

* diff

* repr splits lines correctly

* that

* add location

* add location

* also don't need name_override

* k.copy

* [pr]

* name_override 2

* err
2025-06-01 12:30:10 +03:00
qazal
1f8a8721e9 remove test_unaligns_idxs, UOps don't have order like this [pr] (#10587) 2025-06-01 12:16:14 +03:00
ihar
c45936c4fc replaced '.upper()' which is never needed with '.lower()' which were duplicated (#10586) 2025-05-31 20:58:42 -04:00
ihar
88f38d3fcc remove '_metaop' because it is an old wrapper around 'UOp.metaop' with no additional functionality anymore (#10583) 2025-05-31 14:06:39 -04:00
chenyu
77c7989fa0 remove a MUL rewrite rule for wgsl (#10582)
tests are fine without it
2025-05-31 14:05:49 -04:00
Ahmed Harmouche
35eb4d357a [webgpu] Fix atomic shared mem load inside loop (#10530)
* Disable shared mem atomics on webgpu

* allow_any_len in load pattern matcher to fix temp load inside loop
2025-05-31 09:29:02 -04:00
qazal
6af4b02374 use plain dict and list in grouper [pr] (#10580) 2025-05-31 13:09:59 +03:00
chenyu
4ab3391e6f set -o pipefail for mlperf run_and_time (#10577)
also run the 5.1 script in ci cron job
2025-05-30 16:36:44 -04:00
chenyu
baf482d314 copy mlperf stuff to 5.1 (#10576)
5.0 is finalized, new changes go to 5.1
2025-05-30 16:12:39 -04:00
nimlgen
883bb4541c am: reserve address space (#10564)
* am: reserve address space

* f

* cc

* errno

* fix

* always has cpu mapping
2025-05-30 19:31:03 +03:00
qazal
e0305e54fc remove custom merge_views rewrite rule for buffer ops [pr] (#10574) 2025-05-30 15:27:13 +03:00
qazal
de9597a8a9 cleanup kernel.py ShapeTracker replacement [pr] (#10573) 2025-05-30 15:06:01 +03:00
qazal
5b59728c75 refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) (#10541)
* changes to core tinygrad

* fixups pt1

TC=3
docs/abstractions2.py
IMAGE=2
test_quantize_dsp
test_schedule

* more tests

* green now

* images stay images
2025-05-30 14:27:58 +03:00
chenyu
116ffc4e92 cstyle strips paren for AND and OR (#10560) 2025-05-30 07:09:05 -04:00
qazal
bbf05110a2 use kernelize in TestLinearizer.test_indexing_multireduce [pr] (#10571) 2025-05-30 11:27:09 +03:00
qazal
7051bf3fd5 fixup hardcoded asts ptr dtype and constants [pr] (#10570)
* fixup hardcoded asts ptr dtype and constants [pr]

* use kernelize for test_kernel_count
2025-05-30 09:38:32 +03:00
qazal
066196415f UOp.valid and const_like work with just shapes [pr] (#10569)
* UOp.valid and const_like work with just shapes [pr]

* pm_quant left

* pm_quant
2025-05-30 08:55:06 +03:00
wozeparrot
5e3c4a8431 fix: comma testsig (#10568) 2025-05-29 19:00:07 -07:00
Eitan Turok
c07f13c438 Docs for masked_fill (#10558)
* add docs

* fix doc examples

* add to docs

* fix typo
2025-05-29 03:49:02 -07:00
George Hotz
b3b43a82c4 remove Tensor.no_grad, it's meaningless now [pr] (#10556) 2025-05-28 22:20:02 -07:00
George Hotz
e4e7b5d7e1 continue work on beautiful cifar (#10555) 2025-05-28 21:42:01 -07:00
George Hotz
e140f8f0d8 linearizer test_failure_61 (#10552)
* enumerate cases of Tensors in the JIT

* optional fused optimizers

* add fused optimizer test

* move that there

* ugh

* work on beautiful_cifar

* speed close to hlb_cifar

* test_failure_61

* just the failure
2025-05-28 21:30:50 -07:00
George Hotz
871df1436a more beautiful cifar (#10551)
* enumerate cases of Tensors in the JIT

* optional fused optimizers

* add fused optimizer test

* move that there

* ugh

* work on beautiful_cifar

* speed close to hlb_cifar

* schedule to corealize all

* one line sched step

* less lines
2025-05-28 20:48:20 -07:00
George Hotz
ee12e801a3 optional fused optimizers (#10549)
* enumerate cases of Tensors in the JIT

* optional fused optimizers

* add fused optimizer test

* move that there

* ugh
2025-05-28 13:50:30 -07:00
Sieds Lykles
ae02a1e232 [bounty] Z3 symbolic fuzzer [pr] (#10514)
* First version, caught a bug?

* Nicely print failure to reproduce

* Remove that

* Put the assert back

* Change fuzzing to use testing_unit so it has z3

* Test key to match

* Add rule

* Add test

* Add test for edge case 0

* Merge patterns

* update comment

* consistent whitespace

* whitespace

* add condition

* add test

* update comment

* use Variable

* fuzzer using z3_renderer

* Cleaned up printing and debugging

* working new fuzzer

* change some comments and printing

* more formatting

* fuzz failures in separate file

* fix fstring

* more tests

* naming

* remove added line

* remove comment

* print number of skipped expressions

* use self.assertEqual

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-28 16:28:37 -04:00
chenyu
74cf5dbd9e mlperf system updates (#10550)
standardized processor and accelerator names
2025-05-28 16:15:46 -04:00
George Hotz
98f3d1c26d enumerate cases of Tensors in the JIT (#10548) 2025-05-28 11:51:27 -07:00
nimlgen
d1d9e729fd am_smi: mem usage (#10547) 2025-05-28 16:53:31 +03:00
chenyu
23e41f523a sdxl also run with cached search (#10546) 2025-05-28 06:51:56 -04:00
chenyu
fffdc4d31c workflow to run sdxl with search (#10543) 2025-05-27 17:25:41 -04:00
qazal
d1f0043331 use store_val helper in test_schedule asserts [pr] (#10540) 2025-05-27 21:48:06 +03:00
George Hotz
5b268121d4 remove becomes map (#10533)
* remove becomes map

* add comment and delete dead code

* multi is a view
2025-05-27 11:47:11 -07:00
qazal
271110bb5a s/src[0]/buf in lowerer.py [pr] (#10539) 2025-05-27 21:08:54 +03:00
qazal
f0042629d1 replace arg in merge_view [pr] (#10537) 2025-05-27 21:00:24 +03:00
qazal
617ecc1a7b fixup grouper process replay [pr] (#10538) 2025-05-27 20:46:55 +03:00
George Hotz
a07caaca0d handle stride 0 variable reshape (#10536) 2025-05-27 10:00:24 -07:00
George Hotz
0515622d95 the schedule graph is the tensor graph (#10534)
* the schedule graph is the tensor graph

* gate type_verify on debug

* relax that spec

* unmasked check is okay
2025-05-27 09:23:57 -07:00