qazal
b515d796fb
inline viz get_name [pr] ( #10682 )
...
* inline viz get_name [pr]
* changing name_fxn makes this simpler
* waitUntil dom
2025-06-07 11:16:16 +03:00
wozeparrot
e3805171e2
feat: variable bs bitcast ( #10674 )
2025-06-06 17:21:53 -07:00
George Hotz
54db1f8ee8
prevent huge waste of multi ram ( #10669 )
...
* prevent huge waste of multi ram
* fix ram usage
* only define var
* add resolve
* fix tests
* fix cifar training
* remove that logic
* fix test without long
2025-06-06 17:17:21 -07:00
George Hotz
b68b7dbc2a
test winograd is close to normal conv [pr] ( #10557 )
...
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-06-06 19:11:49 -04:00
leopf
eb7305e6a4
Tensor.keccak("sha3_256") ( #7186 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-06-06 15:24:05 -07:00
chenyu
bdede4924e
fix odd number in get_test_global_size ( #10671 )
...
factor might not be an integer if input global_size has an odd number in it
2025-06-06 17:31:35 -04:00
George Hotz
7f0f97aa76
new test_multitensor tests ( #10667 )
...
* new test_multitensor tests
* cleanup scheduler
2025-06-06 10:26:28 -07:00
chenyu
4a6d84c4c3
hotfix llama start_pos vmax is max_context-1 ( #10659 )
...
* hotfix llama start_pos vmax is max_context-1
fixed `IGNORE_OOB=0 python3 examples/llama3.py --size 1B --benchmark --temperature 0`
* hotfix: multitensor transformer test tests kv cache
---------
Co-authored-by: George Hotz <geohot@gmail.com>
2025-06-06 00:41:25 -04:00
George Hotz
5eb6e1e65a
Revert "hotfix: multitensor transformer test tests kv cache"
...
This reverts commit ad9f88419a.
2025-06-05 21:15:34 -07:00
George Hotz
ad9f88419a
hotfix: multitensor transformer test tests kv cache
2025-06-05 21:08:57 -07:00
George Hotz
8325c4f192
tests for multi assign ( #10658 )
...
* tests for multi assign
* transformer tests
* add that assert
2025-06-05 20:56:40 -07:00
wozeparrot
0d86f8d375
fix failed threefry ( #10646 )
2025-06-05 17:17:42 -07:00
chenyu
ff1aad7b69
fix const float pow to int tensor ( #10655 )
...
was incorrectly cast to int
2025-06-05 19:15:12 -04:00
George Hotz
baba274a76
minimal mstack pr to fix allreduce ( #10649 )
...
* minimal mstack pr to fix allreduce
* fix webgpu
2025-06-05 15:14:53 -07:00
George Hotz
4c315f8e17
MSTACK little non-functional changes ( #10648 )
2025-06-05 13:20:22 -07:00
chenyu
46811d0d3c
minor external_model_benchmark cleanup ( #10644 )
2025-06-05 14:13:28 -04:00
qazal
26afbc954f
delete redundant tests from test_schedule [pr] ( #10643 )
2025-06-05 20:08:39 +03:00
chenyu
80ebce421d
remove metal buffer limit in external_model_benchmark [pr] ( #10642 )
...
not needed anymore
2025-06-05 13:00:51 -04:00
qazal
28c4997236
check for matching shape order in fused reduce ( #10641 )
...
* failing test
* shapes match with ones removed
2025-06-05 19:37:22 +03:00
qazal
1190062812
prevent grouper can_chase while fusing arange [pr] ( #10623 )
2025-06-05 18:50:21 +03:00
qazal
8c5ea00522
push permutes through fused reduces ( #10628 )
...
* fix pushing reshapes through reduceops
* reduceop_view_right should assert on ndims mismatch
* update that, view.reshape asserts it
2025-06-05 16:14:04 +03:00
chenyu
d0969f5a1f
cleanup multi tests ( #10635 )
2025-06-05 00:28:44 -04:00
qazal
571c0296a9
linearizer failure from FUSE_ARANGE default diff ( #10629 )
...
* start with test_arange_sum
* test_arange_avgpool2d
* device.renderer.supports_float4
2025-06-04 19:11:52 +03:00
qazal
5056d21b29
add failing TestSchedule.test_arange_sum [pr] ( #10627 )
2025-06-04 17:23:59 +03:00
qazal
7114b6ab31
viz browser tests ( #10626 )
...
* viz browser tests
* expect failure if js/ isn't included
* back green
2025-06-04 14:58:24 +03:00
wozeparrot
4d1686f767
clean: becnhmark -> benchmark ( #10620 )
2025-06-03 19:28:18 -07:00
qazal
ce9f12dc13
reorder cast before masking constants ( #10609 )
...
* failing test from fuzzer
* .numpy() handles bfloat16 better
* const->view->cast becomes const->cast->view
* update TestMovedConstFolding.test_cast_padded
2025-06-03 15:44:03 +03:00
qazal
910cabb081
add kernel count to grouper process replay differ [pr] ( #10611 )
2025-06-03 15:21:27 +03:00
Ahmed Harmouche
650404a143
[webgpu] Proper shared mem size for packed types ( #10585 )
...
* Proper shared mem size in webgpu
* Add test
* Refactor test
2025-06-01 20:18:33 -04:00
qazal
3cc73a0172
simpler process replay main loop [pr] ( #10588 )
...
* simpler process replay main loop [pr]
* use logging
* default to 1
2025-06-01 15:03:21 +03:00
qazal
dc882d3d7d
merge process replay and viz captures [pr] ( #10581 )
...
* refactoring
* test script
* work
* more work
* diff
* repr splits lines correctly
* that
* add location
* add location
* also don't need name_override
* k.copy
* [pr]
* name_override 2
* err
2025-06-01 12:30:10 +03:00
qazal
1f8a8721e9
remove test_unaligns_idxs, UOps don't have order like this [pr] ( #10587 )
2025-06-01 12:16:14 +03:00
Ahmed Harmouche
35eb4d357a
[webgpu] Fix atomic shared mem load inside loop ( #10530 )
...
* Disable shared mem atomics on webgpu
* allow_any_len in load pattern matcher to fix temp load inside loop
2025-05-31 09:29:02 -04:00
qazal
5b59728c75
refactor LOAD(DEFINE_GLOBAL, VIEW) in kernels to LOAD(VIEW(DEFINE_GLOBAL)) ( #10541 )
...
* changes to core tinygrad
* fixups pt1
TC=3
docs/abstractions2.py
IMAGE=2
test_quantize_dsp
test_schedule
* more tests
* green now
* images stay images
2025-05-30 14:27:58 +03:00
chenyu
116ffc4e92
cstyle strips paren for AND and OR ( #10560 )
2025-05-30 07:09:05 -04:00
qazal
bbf05110a2
use kernelize in TestLinearizer.test_indexing_multireduce [pr] ( #10571 )
2025-05-30 11:27:09 +03:00
qazal
7051bf3fd5
fixup hardcoded asts ptr dtype and constants [pr] ( #10570 )
...
* fixup hardcoded asts ptr dtype and constants [pr]
* use kernelize for test_kernel_count
2025-05-30 09:38:32 +03:00
qazal
066196415f
UOp.valid and const_like work with just shapes [pr] ( #10569 )
...
* UOp.valid and const_like work with just shapes [pr]
* pm_quant left
* pm_quant
2025-05-30 08:55:06 +03:00
George Hotz
b3b43a82c4
remove Tensor.no_grad, it's meaningless now [pr] ( #10556 )
2025-05-28 22:20:02 -07:00
George Hotz
e140f8f0d8
linearizer test_failure_61 ( #10552 )
...
* enumerate cases of Tensors in the JIT
* optional fused optimizers
* add fused optimizer test
* move that there
* ugh
* work on beautiful_cifar
* speed close to hlb_cifar
* test_failure_61
* just the failure
2025-05-28 21:30:50 -07:00
Sieds Lykles
ae02a1e232
[bounty] Z3 symbolic fuzzer [pr] ( #10514 )
...
* First version, caught a bug?
* Nicely print failure to reproduce
* Remove that
* Put the assert back
* Change fuzzing to use testing_unit so it has z3
* Test key to match
* Add rule
* Add test
* Add test for edge case 0
* Merge patterns
* update comment
* consistent whitespace
* whitespace
* add condition
* add test
* update comment
* use Variable
* fuzzer using z3_renderer
* Cleaned up printing and debugging
* working new fuzzer
* change some comments and printing
* more formatting
* fuzz failures in separate file
* fix fstring
* more tests
* naming
* remove added line
* remove comment
* print number of skipped expressions
* use self.assertEqual
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-05-28 16:28:37 -04:00
George Hotz
98f3d1c26d
enumerate cases of Tensors in the JIT ( #10548 )
2025-05-28 11:51:27 -07:00
qazal
d1f0043331
use store_val helper in test_schedule asserts [pr] ( #10540 )
2025-05-27 21:48:06 +03:00
George Hotz
5b268121d4
remove becomes map ( #10533 )
...
* remove becomes map
* add comment and delete dead code
* multi is a view
2025-05-27 11:47:11 -07:00
George Hotz
a07caaca0d
handle stride 0 variable reshape ( #10536 )
2025-05-27 10:00:24 -07:00
George Hotz
41e3d07d7f
view gradient is tricky ( #10528 )
...
* view gradient is tricky
* explicit
2025-05-26 22:28:30 -07:00
uuuvn
c29c46853f
Very basic mock sqtt ( #10512 )
...
This mockgpu sqtt emulation will just ignore basically everything and end
up with a 0x1000 size trace full of zeroes, but just testing for things
like register rename is better than nothing, I guess
2025-05-26 14:38:28 -07:00
qazal
6d07087fe1
remove contiguous from MSELECT 2 ( #10522 )
...
* remove contiguous from MSELECT
* test_shrink_on_shard_axis
---------
Co-authored-by: George Hotz <geohot@gmail.com>
2025-05-26 19:19:01 +03:00
geohotstan
602a145f8f
Add Tensor.unfold ( #10518 )
...
* yoinked 10272
* eitanturok's fixes
* hmmm should size be sint?
* add test
2025-05-26 11:15:44 -04:00
qazal
9169dcfb49
do not create kernels with more inputs than the backend allows ( #10510 )
...
* work
* no itertools + top down pass
* clean viz
* python can do that
* webgpu
* gbarrier of gbarrier is gbarrier
* device can be tuple
* bug in toposort
* failing test for gated toposort
* contiguous of gbarrier is gbarrier
* check for binops
* Revert "check for binops"
This reverts commit 53e3cdf720.
* viz + match on gbarrier, self exists by default
* alt
* green now
* cleanup
2025-05-26 18:02:03 +03:00