George Hotz
996c907c0b
rewrite not ready + children machinery ( #11607 )
...
* rewrite not ready + children machinery
* it doesn't like track rewrites
2025-08-10 15:28:30 -07:00
geohotstan
b0dab6a4cd
onnx Resize OP clean up ( #11603 )
...
* start
* slight clean up
2025-08-10 14:10:39 -04:00
Sieds Lykles
10540414cd
Add Ops.CMPEQ ( #10431 )
...
* Add op
* add to GroupOp.ALU
* fix spec
* fix ptx
* temporary pickle by name to see process replay
* add Ops.EQ to binary ops
* Actually rename properly
* add test to assert CMPEQ is being used
* Ops.CMPEQ is automatic cast to bool
* add Ops.CMPEQ to llvm
2025-08-10 13:13:16 +02:00
chenyu
dfb702ef33
fix sort for small dim ( #11601 )
...
* fix sort for small dim
* fixed test_sort_empty
2025-08-10 01:17:41 -04:00
Sieds Lykles
01c770c77b
Fix z3 float cast in indexing ( #11590 )
...
* adjust dtype of z3_renderer and add rule for cast
* dtypes.bool is also cast noop
* add regression test
* make embedding smaller
* even smaller test
2025-08-09 17:59:23 +02:00
Sieds Lykles
10d388499d
Refactor optional.py ( #11578 )
...
* move fast_idiv to transcendental
* move optional.py
* adjust comment
* change import
* mypy needs this?
2025-08-09 17:35:05 +02:00
qazal
16f0edbe90
pass opts arg in get_program process replay [pr] ( #11571 )
...
* fix ptx process replay
* keyword arg
* renderer is also optional [pr]
* test_linearizer fixup
* name function order is args,ret,kwargs
* can use opts_to_apply
* pass through p.applied_opts
* sink_arg
* now it opens devices too
2025-08-08 03:05:09 +03:00
qazal
960cc6533a
pass through name function args in track_rewrites ( #11572 )
2025-08-08 02:28:52 +03:00
George Hotz
82be8abfd2
move opt under codegen ( #11569 )
2025-08-07 14:19:17 -07:00
George Hotz
6ed2dfd187
delete the arange dim mismatch restriction ( #11568 )
...
* delete the arange dim mismatch restriction
* skip that test race
2025-08-07 13:46:17 -07:00
chenyu
aa1a6f2132
support threshold in Tensor.softplus ( #11564 )
...
fix gradient for large input
2025-08-07 13:43:18 -04:00
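A minimal sketch of what a softplus threshold does, assuming PyTorch-style semantics in which softplus falls back to the identity once beta*x exceeds the threshold. This is plain Python, not tinygrad code: the function names, parameter names, and the default threshold of 20 are assumptions, since the log does not show the actual signature.

```python
import math

def softplus(x: float, beta: float = 1.0, threshold: float = 20.0) -> float:
    # Assumed semantics: above the threshold, softplus(x) is numerically
    # indistinguishable from x, so return x and skip exp() entirely.
    z = beta * x
    if z > threshold:
        return x
    return math.log1p(math.exp(z)) / beta

def softplus_grad(x: float, beta: float = 1.0, threshold: float = 20.0) -> float:
    # The derivative of softplus is sigmoid(beta*x); on the linear branch it is 1.
    z = beta * x
    if z > threshold:
        return 1.0
    e = math.exp(z)  # safe: z <= threshold, so exp cannot overflow
    return e / (1.0 + e)

# Without the threshold branch, math.exp(1000.0) would raise OverflowError.
print(softplus(1000.0), softplus_grad(1000.0))
```

The point of the sketch is that large inputs never reach exp(), so both the value and the gradient stay finite.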
chenyu
7ee3770961
FUSE_ARANGE=1 ( #11427 )
...
* FUSE_ARANGE=1
* fix test
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-07 13:32:34 -04:00
George Hotz
9764c6cdee
fix mismatch reduce, try 2 ( #11560 )
...
* fix mismatch reduce, try 2
* fix heuristic
* delete that test
* don't start allowing ones
2025-08-07 07:57:58 -07:00
nimlgen
4f29a2c441
fix flaky test on macos ( #11557 )
2025-08-07 15:55:35 +03:00
George Hotz
a1aa5670aa
Revert "fix mismatch reduce ( #11547 )" ( #11549 )
...
This reverts commit 49d21a9055.
2025-08-06 22:43:15 -07:00
George Hotz
49d21a9055
fix mismatch reduce ( #11547 )
...
* fix mismatch reduce
* cleanups
* fix shape
* fix mypy
* resolve
2025-08-06 21:12:51 -07:00
George Hotz
21570545d3
move view pushing to codegen, try 2 ( #11534 )
...
* move view pushing to codegen, try 2
* fix up some linearizer tests
* fix test search
* fix test schedule
* delete that test
* fix test arange
* fix a few tests
* update tests
* push views
* ebs cleanup
* fix local/reg
* test and lint
* fix more tests
* test cleanups
* skipped that one
2025-08-06 15:58:38 -07:00
George Hotz
80d9cced07
more test cleanups ( #11544 )
...
* more test cleanups
* revert that
2025-08-06 15:05:21 -07:00
George Hotz
6fd1332763
update some tests for less Kernel ( #11543 )
...
* update some tests for less Kernel
* get_program update
2025-08-06 14:19:59 -07:00
George Hotz
09dc7af8e9
move bind to big graph ( #11539 )
...
* move bind to big graph
* fix tests
* unbind inside kernel only
* merge views
* fix multitensor
* failure text change
2025-08-06 13:27:51 -07:00
George Hotz
7c5e115747
test_mismatch_reduce ( #11538 )
2025-08-06 10:02:14 -07:00
George Hotz
4fe11725c6
pass through sink arg, update linearizer test ( #11536 )
...
* pass through sink arg, update linearizer test
* get_program help
* bump line count
* use new api
2025-08-06 09:48:48 -07:00
chenyu
c9225d22ce
only disable flaky test_jit_multidev_xfer ( #11523 )
2025-08-05 22:17:25 -04:00
George Hotz
07b0df0d86
hotfix: test tensor dims start at 1
2025-08-05 15:40:24 -07:00
George Hotz
4dabdf7c6d
Revert "optimize in rewrite ( #11516 )" ( #11517 )
...
This reverts commit 3b777a9e05.
2025-08-05 15:39:07 -07:00
George Hotz
3b777a9e05
optimize in rewrite ( #11516 )
...
* changes
* fix test uops
* dim shouldn't be 0
* huh, why did that one not save
2025-08-05 15:33:26 -07:00
nimlgen
fc4e713d1c
jit graph split tests ( #11507 )
...
* jit graph split tests
* fix
* one more test
* more tests
* fix
* xm
* remote
2025-08-05 21:32:37 +03:00
chenyu
ace8e9a706
fix test_conv2d_winograd ( #11511 )
2025-08-05 12:15:46 -04:00
chenyu
223aaa0492
clean up more conv tests ( #11510 )
2025-08-05 12:15:30 -04:00
Garret Castro
76e62a1c23
extract conv layer test logic ( #11488 )
...
* refactor: extract conv layer test logic
* tuple is unnecessary
* integrate _test_conv logic into all conv tests
* fix linter, forgot dilation
* undo winograd extraction
adds too many if statements for a single case
2025-08-05 11:15:54 -04:00
uuuvn
011ef8fa9d
Fix incorrect jit current batch devs reset ( #11505 )
...
`current_batch_devs = []` (in `flush_batch()`) runs between
`new_batched_devs = ...` and `current_batch_devs = new_batched_devs`, so it
doesn't actually reset anything. As a result, things don't JIT properly,
which doubles remote BERT step time (and should have similar effects on any
non-HCQ backend).
2025-08-05 08:16:16 +03:00
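The ordering problem described in the entry above can be sketched with a toy model. Only the names `current_batch_devs`, `new_batched_devs`, and `flush_batch` come from the commit message; the `Batcher` class, its methods, and the shape of the fix are hypothetical and are not taken from the tinygrad source.

```python
class Batcher:
    """Toy stand-in for the JIT batching state (hypothetical)."""

    def __init__(self):
        self.current_batch_devs = []

    def flush_batch(self):
        # Intended to start a fresh batch.
        self.current_batch_devs = []

    def add_buggy(self, dev, flush):
        # The new list is computed from the pre-flush state...
        new_batched_devs = self.current_batch_devs + [dev]
        if flush:
            self.flush_batch()  # ...the reset happens here...
        # ...and is immediately overwritten by the stale list.
        self.current_batch_devs = new_batched_devs

    def add_fixed(self, dev, flush):
        # One possible fix: flush first, then build the new list.
        if flush:
            self.flush_batch()
        self.current_batch_devs = self.current_batch_devs + [dev]

buggy, fixed = Batcher(), Batcher()
for b, add in ((buggy, buggy.add_buggy), (fixed, fixed.add_fixed)):
    add("A", False)
    add("B", True)  # a flush is requested before "B" joins

print(buggy.current_batch_devs)  # "A" survives the flush
print(fixed.current_batch_devs)  # only "B" remains
```

In the buggy variant the reset is a no-op because the assignment after it restores the pre-flush contents, which matches the behaviour the commit message describes.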
chenyu
f02720ca2d
fix fuse gate_contiguous unique ( #11504 )
2025-08-04 23:43:31 -04:00
qazal
846a2826ab
viz: remove TracingKey.fmt ( #11482 )
...
* viz: remove TracingKey.fmt
* remove from test too
2025-08-05 00:00:03 +03:00
leopf
4f0ee4e982
BPE tokenizer ( #11415 )
...
* BPE works
* refactor tok
* oops
* basic tests
* fix eval
* smaller diff
* fix error
* proper vocab decoding
* use regex for splitting
* escape ucatrange
* full compat
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-04 09:52:38 -07:00
b1tg
06af9f9236
fix double exception + add name,loc in error msg ( #11487 )
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-04 13:41:23 +03:00
chenyu
e0106b6b25
1/(x*c) -> (1/c)*(1/x) ( #11491 )
...
example: 2*(2*a).reciprocal() -> a.reciprocal()
# TODO: bounds for reciprocal
# TODO: should z3 work?
2025-08-03 23:35:46 -04:00
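The rewrite in the entry above rests on the identity 1/(x*c) = (1/c)*(1/x) for nonzero x and c. A minimal sketch that checks it with exact rational arithmetic in plain Python, not with tinygrad's rewrite rules; the helper name `split_reciprocal` is hypothetical.

```python
from fractions import Fraction

def split_reciprocal(x: Fraction, c: Fraction) -> Fraction:
    # Hypothetical helper: evaluates 1/(x*c) in the factored form (1/c)*(1/x).
    return (1 / c) * (1 / x)

a = Fraction(7, 3)

# Identity behind the rewrite: 1/(x*c) == (1/c)*(1/x)
assert 1 / (a * 2) == split_reciprocal(a, Fraction(2))

# The commit's example: 2*(2*a).reciprocal() simplifies to a.reciprocal(),
# because the factored constant 1/2 cancels against the leading 2.
assert 2 * (1 / (2 * a)) == 1 / a
```

Exact fractions are used so the check is not blurred by floating-point rounding; with floats the two forms can differ in the last bit.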
chenyu
dbc7807c61
enable WEBGPU tests with buffer limit ( #11489 )
...
TestSample still fails?
2025-08-03 13:02:44 -07:00
chenyu
66be747908
few more dtype cast convenience methods ( #11480 )
2025-08-02 15:47:09 -04:00
chenyu
e22e5da9a5
move some test_dtype tests to unit ( #11479 )
2025-08-02 15:25:00 -04:00
nimlgen
da0b955be4
hcq: cpu can be graphed ( #11474 )
...
* hcq: cpu can be graphed
* ops
* new jit decisions
* fix test
* fix remote
* cleaner
* fix
2025-08-02 21:01:19 +03:00
kevvz
ef7e01cadf
Fix SVD shape bug + Fix batched SVD bug ( #11477 )
...
* failing test case
* fix
* better test
* space
2025-08-02 09:47:41 -07:00
qazal
fa66d9772d
viz: show const node when it's root ( #11456 )
2025-08-01 01:01:58 +03:00
Eitan Turok
cba3655de5
Add Test for Setitem ( #10559 )
...
* init
* update
* better
* failing test
* works
* Delete test file
* clean
* lint
* simplify variable name
* rm contiguous, rm int dtype, and add assertEqual
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-07-30 22:03:41 -04:00
chenyu
4ca430e5bf
fix search dedup ( #11439 )
...
it should check against pre real_axis axis in actions, not real_axis.
2025-07-30 17:24:16 -04:00
chenyu
d5fc6af4a2
remove unused ShapeTracker.consecutive [pr] ( #11426 )
2025-07-29 18:36:19 -04:00
George Hotz
49a2583584
real new lowerer ( #11419 )
...
* real new lowerer
* fix group for reduce
* skip missing ranges
* fix wmma and unroll/contract
* real fix for wmma
* disable that test
* fix if gate
* simpler
* flash attention fusion works
* no end barriers
* still broken
* flash attention finally works
2025-07-29 15:35:51 -07:00
chenyu
0e5d8d5c3c
remove tests that used .to_uop() ( #11425 )
...
* remove tests that used .to_uop()
* import
2025-07-29 15:52:16 -04:00
nimlgen
d38d285489
ci: add h machines ( #11416 )
...
* ci: add h machines
* more
* fix names
* names not collide
* 20
* 10
2025-07-29 19:21:51 +03:00
George Hotz
8c10085459
assert shape on lowerer store [pr] ( #11395 )
...
* assert shape on lowerer store [pr]
* fix ptx
2025-07-27 10:41:57 -07:00
George Hotz
dfeee63d30
uop matmul work ( #11388 )
...
* uop matmul work
* works with locals
2025-07-26 21:23:55 -07:00