qazal
bbf05110a2
use kernelize in TestLinearizer.test_indexing_multireduce [pr] ( #10571 )
2025-05-30 11:27:09 +03:00
qazal
7051bf3fd5
fixup hardcoded asts ptr dtype and constants [pr] ( #10570 )
...
* fixup hardcoded asts ptr dtype and constants [pr]
* use kernelize for test_kernel_count
2025-05-30 09:38:32 +03:00
qazal
066196415f
UOp.valid and const_like work with just shapes [pr] ( #10569 )
...
* UOp.valid and const_like work with just shapes [pr]
* pm_quant left
* pm_quant
2025-05-30 08:55:06 +03:00
wozeparrot
5e3c4a8431
fix: comma testsig ( #10568 )
2025-05-29 19:00:07 -07:00
Eitan Turok
c07f13c438
Docs for masked_fill ( #10558 )
...
* add docs
* fix doc examples
* add to docs
* fix typo
2025-05-29 03:49:02 -07:00
George Hotz
b3b43a82c4
remove Tensor.no_grad, it's meaningless now [pr] ( #10556 )
2025-05-28 22:20:02 -07:00
George Hotz
e4e7b5d7e1
continue work on beautiful cifar ( #10555 )
2025-05-28 21:42:01 -07:00
George Hotz
e140f8f0d8
linearizer test_failure_61 ( #10552 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
* work on beautiful_cifar
* speed close to hlb_cifar
* test_failure_61
* just the failure
2025-05-28 21:30:50 -07:00
George Hotz
871df1436a
more beautiful cifar ( #10551 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
* work on beautiful_cifar
* speed close to hlb_cifar
* schedule to corealize all
* one line sched step
* less lines
2025-05-28 20:48:20 -07:00
George Hotz
ee12e801a3
optional fused optimizers ( #10549 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
2025-05-28 13:50:30 -07:00
Sieds Lykles
ae02a1e232
[bounty] Z3 symbolic fuzzer [pr] ( #10514 )
...
* First version, caught a bug?
* Nicely print failure to reproduce
* Remove that
* Put the assert back
* Change fuzzing to use testing_unit so it has z3
* Test key to match
* Add rule
* Add test
* Add test for edge case 0
* Merge patterns
* update comment
* consistent whitespace
* whitespace
* add condition
* add test
* update comment
* use Variable
* fuzzer using z3_renderer
* Cleaned up printing and debugging
* working new fuzzer
* change some comments and printing
* more formatting
* fuzz failures in separate file
* fix fstring
* more tests
* naming
* remove added line
* remove comment
* print number of skipped expressions
* use self.assertEqual
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-28 16:28:37 -04:00
chenyu
74cf5dbd9e
mlperf system updates ( #10550 )
...
standardized processor and accelerator names
2025-05-28 16:15:46 -04:00
George Hotz
98f3d1c26d
enumerate cases of Tensors in the JIT ( #10548 )
2025-05-28 11:51:27 -07:00
nimlgen
d1d9e729fd
am_smi: mem usage ( #10547 )
2025-05-28 16:53:31 +03:00
chenyu
23e41f523a
sdxl also run with cached search ( #10546 )
2025-05-28 06:51:56 -04:00
chenyu
fffdc4d31c
workflow to run sdxl with search ( #10543 )
2025-05-27 17:25:41 -04:00
qazal
d1f0043331
use store_val helper in test_schedule asserts [pr] ( #10540 )
2025-05-27 21:48:06 +03:00
George Hotz
5b268121d4
remove becomes map ( #10533 )
...
* remove becomes map
* add comment and delete dead code
* multi is a view
2025-05-27 11:47:11 -07:00
qazal
271110bb5a
s/src[0]/buf in lowerer.py [pr] ( #10539 )
2025-05-27 21:08:54 +03:00
qazal
f0042629d1
replace arg in merge_view [pr] ( #10537 )
2025-05-27 21:00:24 +03:00
qazal
617ecc1a7b
fixup grouper process replay [pr] ( #10538 )
2025-05-27 20:46:55 +03:00
George Hotz
a07caaca0d
handle stride 0 variable reshape ( #10536 )
2025-05-27 10:00:24 -07:00
George Hotz
0515622d95
the schedule graph is the tensor graph ( #10534 )
...
* the schedule graph is the tensor graph
* gate type_verify on debug
* relax that spec
* unmasked check is okay
2025-05-27 09:23:57 -07:00
qazal
142f6ba873
move merge_views to grouper swizzler [pr] ( #10531 )
2025-05-27 16:33:26 +03:00
qazal
c03e9c8995
fixup typing for @diskcache [pr] ( #10529 )
2025-05-27 15:30:49 +03:00
George Hotz
41e3d07d7f
view gradient is tricky ( #10528 )
...
* view gradient is tricky
* explicit
2025-05-26 22:28:30 -07:00
George Hotz
ab4ca5da29
remove gradient nonsense [pr] ( #10527 )
...
* remove gradient nonsense [pr]
* grads to base
2025-05-26 19:09:59 -07:00
chenyu
76eb130d8c
hotfix: BenchEvent MLPERF_RUN is mlperf_run ( #10526 )
2025-05-26 20:19:37 -04:00
uuuvn
c29c46853f
Very basic mock sqtt ( #10512 )
...
This mockgpu sqtt emulation will just ignore basically everything and end
up with a 0x1000 size trace full of zeroes, but just testing for things
like register rename is better than nothing, I guess
2025-05-26 14:38:28 -07:00
chenyu
51dc7eedb0
correct use AM for resnet run_and_time ( #10524 )
2025-05-26 15:33:11 -04:00
chenyu
c1919ad55f
use AM for resnet run_and_time ( #10523 )
2025-05-26 14:50:49 -04:00
George Hotz
e9bb2052cf
hotfix: update readme
2025-05-26 10:28:16 -07:00
qazal
6d07087fe1
remove contiguous from MSELECT 2 ( #10522 )
...
* remove contiguous from MSELECT
* test_shrink_on_shard_axis
---------
Co-authored-by: George Hotz <geohot@gmail.com>
2025-05-26 19:19:01 +03:00
geohotstan
602a145f8f
Add Tensor.unfold ( #10518 )
...
* yoinked 10272
* eitanturok's fixes
* hmmm should size be sint?
* add test
2025-05-26 11:15:44 -04:00
qazal
9169dcfb49
do not create kernels with more inputs than the backend allows ( #10510 )
...
* work
* no itertools + top down pass
* clean viz
* python can do that
* webgpu
* gbarrier of gbarrier is gbarrier
* device can be tuple
* bug in toposort
* failing test for gated toposort
* contiguous of gbarrier is gbarrier
* check for binops
* Revert "check for binops"
This reverts commit 53e3cdf720.
* viz + match on gbarrier, self exists by default
* alt
* green now
* cleanup
2025-05-26 18:02:03 +03:00
nimlgen
deb369417c
am_smi: print device usage ( #10520 )
...
* am_smi: print device usage
* tiny comments
2025-05-26 17:17:56 +03:00
chenyu
2d50efb92b
set -e on mlperf run_and_time scripts ( #10519 )
2025-05-26 09:22:30 -04:00
Sieds Lykles
478c76f4b7
More div conditions ( #10432 )
...
* add condition
* add test
* use Variable
2025-05-26 07:36:05 -04:00
Sieds Lykles
c6c7882bdf
bugfix: separate rule for x//d<-c ( #10148 )
...
* Add rule
* Add test
* Add test for edge case 0
* Merge patterns
* update comment
* consistent whitespace
* whitespace
* update comment
2025-05-26 07:35:41 -04:00
chenyu
2eeea373af
add BENCHMARK_LOG for mlperf resnet cron ( #10516 )
2025-05-25 22:00:29 -04:00
b1tg
a1f64af92d
ci: setup llvm for amdremote ( #10507 )
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-05-25 21:52:27 -04:00
geohotstan
fd9f236a82
move test over ( #10508 )
2025-05-25 21:51:51 -04:00
wozeparrot
7c81f9f95e
fix: gate mlperf workflow ( #10515 )
2025-05-25 17:06:21 -07:00
Panagiotis Kourouklidis
4941486cb0
Add method to return field masks for AMDReg ( #10511 )
2025-05-25 14:47:20 -07:00
nimlgen
88c5864bf3
nv: do not hardcode sass version ( #10513 )
2025-05-25 22:41:15 +03:00
George Hotz
941cbd3471
hotfix: amd works on arch linux w/o rocm
2025-05-24 16:47:13 -07:00
nimlgen
d90ddcc365
nv: blackwell support ( #10487 )
...
* nv: blackwell support
* fixes
* hm
* h
* fixes
* mypy
* xx
* yy
* arr
* revert
* oops
* unrelated
2025-05-24 18:23:53 +03:00
chenyu
dc6309242d
WallTimeEvent for mlperf ci ( #10506 )
2025-05-24 10:56:03 -04:00
qazal
dd5601af68
readable COPY(VIEW) reordering [pr] ( #10505 )
...
* readable COPY(VIEW) reordering [pr]
* assert that
* spec
* resolve
* Revert "resolve"
This reverts commit f5629fbef8.
* arg
2025-05-24 17:08:58 +03:00
Ahmed Harmouche
bbb6deff53
Increase op limit in test_index_mnist to pass on webgpu ( #10504 )
...
* Increase op limit to enable mnist indexing on webgpu
* Only relax op_limit on WebGPU
2025-05-24 09:37:31 -04:00