chenyu
18e9ec3ea1
add wino cifar to search benchmark ( #10615 )
...
* add wino cifar to search benchmark
* FUSE_OPTIM=1
* revert those
2025-06-03 20:38:43 -04:00
Bhavya Gada
bafd0c30d7
fix some minor typos and grammar ( #10619 )
2025-06-03 15:55:25 -07:00
nimlgen
4381b54543
am: disable page migration ( #10608 )
...
* am: disable page migration
* fixed
* enable
* fix
* typ
* fix check
2025-06-03 18:51:28 +03:00
chenyu
1c1f578490
DISABLE_COMPILER_CACHE in sdxl search ( #10614 )
2025-06-03 09:22:25 -04:00
qazal
ce9f12dc13
reorder cast before masking constants ( #10609 )
...
* failing test from fuzzer
* .numpy() handles bfloat16 better
* const->view->cast becomes const->cast->view
* update TestMovedConstFolding.test_cast_padded
2025-06-03 15:44:03 +03:00
qazal
910cabb081
add kernel count to grouper process replay differ [pr] ( #10611 )
2025-06-03 15:21:27 +03:00
chenyu
26dee71bc1
hotfix don't overwrite acc dtype in scatter_reduce ( #10606 )
...
dtype is inferred by individual reduce
2025-06-02 21:17:01 -04:00
ihar
ba02a6331e
removed unnecessary 'isinstance(data, UOp)' check ( #10605 )
2025-06-02 20:58:14 -04:00
nimlgen
07de095b27
am: more info on PFs ( #10602 )
...
* am: more info on PFs
* fix
2025-06-02 23:48:40 +03:00
qazal
b8fb2ba829
rename to finalize_gbarrier [pr] ( #10596 )
2025-06-02 12:55:31 +03:00
Ahmed Harmouche
650404a143
[webgpu] Proper shared mem size for packed types ( #10585 )
...
* Proper shared mem size in webgpu
* Add test
* Refactor test
2025-06-01 20:18:33 -04:00
qazal
00822603ec
allow stacking of VIEW UOps [pr] ( #10532 )
...
* allow stacking of VIEW UOps [pr]
* merge_views is first
* simpler
* loc for pr, this needs a helper
* keep
* diff [pr]
* formatting
2025-06-01 23:27:04 +03:00
qazal
3cc73a0172
simpler process replay main loop [pr] ( #10588 )
...
* simpler process replay main loop [pr]
* use logging
* default to 1
2025-06-01 15:03:21 +03:00
qazal
dc882d3d7d
merge process replay and viz captures [pr] ( #10581 )
...
* refactoring
* test script
* work
* more work
* diff
* repr splits lines correctly
* that
* add location
* add location
* also don't need name_override
* k.copy
* [pr]
* name_override 2
* err
2025-06-01 12:30:10 +03:00
qazal
1f8a8721e9
remove test_unaligns_idxs, UOps don't have order like this [pr] ( #10587 )
2025-06-01 12:16:14 +03:00
ihar
c45936c4fc
replaced '.upper()' which is never needed with '.lower()' which were duplicated ( #10586 )
2025-05-31 20:58:42 -04:00
ihar
88f38d3fcc
remove '_metaop' because it is an old wrapper around 'UOp.metaop' with no additional functionality anymore ( #10583 )
2025-05-31 14:06:39 -04:00
chenyu
77c7989fa0
remove a MUL rewrite rule for wgsl ( #10582 )
...
tests are fine without it
2025-05-31 14:05:49 -04:00
Ahmed Harmouche
35eb4d357a
[webgpu] Fix atomic shared mem load inside loop ( #10530 )
...
* Disable shared mem atomics on webgpu
* allow_any_len in load pattern matcher to fix temp load inside loop
2025-05-31 09:29:02 -04:00
qazal
6af4b02374
use plain dict and list in grouper [pr] ( #10580 )
2025-05-31 13:09:59 +03:00
chenyu
4ab3391e6f
set -o pipefail for mlperf run_and_time ( #10577 )
...
also run the 5.1 script in ci cron job
2025-05-30 16:36:44 -04:00
chenyu
baf482d314
copy mlperf stuff to 5.1 ( #10576 )
...
5.0 is finalized, new changes go to 5.1
2025-05-30 16:12:39 -04:00
nimlgen
883bb4541c
am: reserve address space ( #10564 )
...
* am: reserve address space
* f
* cc
* errno
* fix
* always has cpu mapping
2025-05-30 19:31:03 +03:00
qazal
e0305e54fc
remove custom merge_views rewrite rule for buffer ops [pr] ( #10574 )
2025-05-30 15:27:13 +03:00
qazal
de9597a8a9
cleanup kernel.py ShapeTracker replacement [pr] ( #10573 )
2025-05-30 15:06:01 +03:00
qazal
5b59728c75
refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) ( #10541 )
...
* changes to core tinygrad
* fixups pt1
TC=3
docs/abstractions2.py
IMAGE=2
test_quantize_dsp
test_schedule
* more tests
* green now
* images stay images
2025-05-30 14:27:58 +03:00
chenyu
116ffc4e92
cstyle strips paren for AND and OR ( #10560 )
2025-05-30 07:09:05 -04:00
qazal
bbf05110a2
use kernelize in TestLinearizer.test_indexing_multireduce [pr] ( #10571 )
2025-05-30 11:27:09 +03:00
qazal
7051bf3fd5
fixup hardcoded asts ptr dtype and constants [pr] ( #10570 )
...
* fixup hardcoded asts ptr dtype and constants [pr]
* use kernelize for test_kernel_count
2025-05-30 09:38:32 +03:00
qazal
066196415f
UOp.valid and const_like work with just shapes [pr] ( #10569 )
...
* UOp.valid and const_like work with just shapes [pr]
* pm_quant left
* pm_quant
2025-05-30 08:55:06 +03:00
wozeparrot
5e3c4a8431
fix: comma testsig ( #10568 )
2025-05-29 19:00:07 -07:00
Eitan Turok
c07f13c438
Docs for masked_fill ( #10558 )
...
* add docs
* fix doc examples
* add to docs
* fix typo
2025-05-29 03:49:02 -07:00
George Hotz
b3b43a82c4
remove Tensor.no_grad, it's meaningless now [pr] ( #10556 )
2025-05-28 22:20:02 -07:00
George Hotz
e4e7b5d7e1
continue work on beautiful cifar ( #10555 )
2025-05-28 21:42:01 -07:00
George Hotz
e140f8f0d8
linearizer test_failure_61 ( #10552 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
* work on beautiful_cifar
* speed close to hlb_cifar
* test_failure_61
* just the failure
2025-05-28 21:30:50 -07:00
George Hotz
871df1436a
more beautiful cifar ( #10551 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
* work on beautiful_cifar
* speed close to hlb_cifar
* schedule to corealize all
* one line sched step
* less lines
2025-05-28 20:48:20 -07:00
George Hotz
ee12e801a3
optional fused optimizers ( #10549 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
2025-05-28 13:50:30 -07:00
Sieds Lykles
ae02a1e232
[bounty] Z3 symbolic fuzzer [pr] ( #10514 )
...
* First version, caught a bug?
* Nicely print failure to reproduce
* Remove that
* Put the assert back
* Change fuzzing to use testing_unit so it has z3
* Test key to match
* Add rule
* Add test
* Add test for edge case 0
* Merge patterns
* update comment
* consistent whitespace
* whitespace
* add condition
* add test
* update comment
* use Variable
* fuzzer using z3_renderer
* Cleaned up printing and debugging
* working new fuzzer
* change some comments and printing
* more formatting
* fuzz failures in separate file
* fix fstring
* more tests
* naming
* remove added line
* remove comment
* print number of skipped expressions
* use self.assertEqual
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-28 16:28:37 -04:00
chenyu
74cf5dbd9e
mlperf system updates ( #10550 )
...
standardized processor and accelerator names
2025-05-28 16:15:46 -04:00
George Hotz
98f3d1c26d
enumerate cases of Tensors in the JIT ( #10548 )
2025-05-28 11:51:27 -07:00
nimlgen
d1d9e729fd
am_smi: mem usage ( #10547 )
2025-05-28 16:53:31 +03:00
chenyu
23e41f523a
sdxl also run with cached search ( #10546 )
2025-05-28 06:51:56 -04:00
chenyu
fffdc4d31c
workflow to run sdxl with search ( #10543 )
2025-05-27 17:25:41 -04:00
qazal
d1f0043331
use store_val helper in test_schedule asserts [pr] ( #10540 )
2025-05-27 21:48:06 +03:00
George Hotz
5b268121d4
remove becomes map ( #10533 )
...
* remove becomes map
* add comment and delete dead code
* multi is a view
2025-05-27 11:47:11 -07:00
qazal
271110bb5a
s/src[0]/buf in lowerer.py [pr] ( #10539 )
2025-05-27 21:08:54 +03:00
qazal
f0042629d1
replace arg in merge_view [pr] ( #10537 )
2025-05-27 21:00:24 +03:00
qazal
617ecc1a7b
fixup grouper process replay [pr] ( #10538 )
2025-05-27 20:46:55 +03:00
George Hotz
a07caaca0d
handle stride 0 variable reshape ( #10536 )
2025-05-27 10:00:24 -07:00
George Hotz
0515622d95
the schedule graph is the tensor graph ( #10534 )
...
* the schedule graph is the tensor graph
* gate type_verify on debug
* relax that spec
* unmasked check is okay
2025-05-27 09:23:57 -07:00