Sieds Lykles
d267a2d9eb
Div mod recombine test for issue ( #7957 )
...
* Add test for failing div_mod recombine
* Add test case when there is gcd in div/mod
2024-11-29 08:47:50 -05:00
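The identity that a div/mod recombine rewrite relies on can be sketched numerically. This is a minimal sketch using plain Python integers, not tinygrad's symbolic code; `recombine` is a hypothetical helper name introduced here for illustration.

```python
def recombine(x: int, c: int) -> int:
    # (x // c) * c drops the remainder, x % c restores it,
    # so the sum is x again for any integer x and any c > 0.
    return (x // c) * c + (x % c)

def check_recombine(limit: int = 50) -> bool:
    # Brute-force check of the identity over a small range of values.
    return all(recombine(x, c) == x
               for c in range(1, 10)
               for x in range(-limit, limit))
```

A symbolic simplifier that recognises this pattern can replace the whole expression with `x`; the tests in this commit presumably exercise cases where that recombination was missed.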
Sieds Lykles
864758423e
Don't take const in gcd and change the "nothing_changed" condition ( #7926 )
...
* Don't take const in gcd and change the "nothing_changed" condition
Biggest difference is probably actually that I forgot to check if gcd
changed if nothing else changed
The TODO was fixed by not using the const in the gcd, and then taking it
out
* Fix more tests
2024-11-27 18:07:36 -05:00
chenyu
988d64900b
add TODO case to test_mod_congruence ( #7925 )
...
same alu count but better bounds
2024-11-27 15:23:21 -05:00
Sieds Lykles
d318867776
Factoring gcd out of mod ( #7916 )
...
* Factoring gcd out of mod
Curious if this will be faster/better
* Update bounds on test
2024-11-26 21:17:22 -05:00
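The arithmetic fact behind factoring a gcd out of a mod is that `(g*a) % (g*b) == g * (a % b)` for positive `g` and `b`. The following is a minimal numeric sketch under that assumption; `mod_via_gcd` is a hypothetical helper and does not reflect how the commit is implemented.

```python
from math import gcd

def mod_via_gcd(n: int, m: int) -> int:
    # Factor g = gcd(n, m) out of both operands, take the smaller mod,
    # then scale the result back up. Requires m > 0.
    g = gcd(n, m)
    return g * ((n // g) % (m // g))

def check_mod_via_gcd(limit: int = 40) -> bool:
    # Compare against the direct computation over a small range.
    return all(mod_via_gcd(n, m) == n % m
               for m in range(1, 20)
               for n in range(-limit, limit))
```

In a symbolic setting the same step would turn something like `(6*x + 4*y) % 8` into `2 * ((3*x + 2*y) % 4)`, which leaves a mod with smaller coefficients and a smaller modulus to bound.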
chenyu
ff3f2a9c1a
Revert "move attention upcast ( #7830 )" ( #7903 )
...
This reverts commit c07daf40e7.
2024-11-25 18:59:51 -05:00
chenyu
a49ca0c2ff
clean up fully_flatten [pr] ( #7885 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-25 06:53:18 -05:00
Sieds Lykles
a49a7c4784
Improved mod folding ( #7887 )
...
* Remove unnecessary if statement
In all paths where something_changed was set to True, remainder is
appended so the list can't be empty
* Working version of improved mod folding
* Fix offset calculation
Passing fuzz_symbolic.py to 130_000 so far
Added an extra test
* Cleaner offset calculation
2024-11-24 22:21:34 -05:00
George Hotz
8c3d3181dd
bottom up rewrite fixes substitute [pr] ( #7862 )
...
* single pass rewrite fixes substitute [pr]
* caching for single_pass_rewrite
* allow multiple rewrites
* a simple test
* bottom_up_rewrite is fully flexible
2024-11-23 20:53:37 +08:00
George Hotz
144e9f00df
viz is local, new test, and new quantize [pr] ( #7859 )
...
* viz is local, new test, and new quantize [pr]
* fix mime types
* remove font
* after index
2024-11-23 14:27:10 +08:00
chenyu
c07daf40e7
move attention upcast ( #7830 )
...
still upcast before softmax, but faster because intermediate buffer can be stored in half (as long as qk is within half range).
2024-11-22 17:10:51 -05:00
George Hotz
c5d458ce02
BufferSpec and ProgramSpec [pr] ( #7814 )
...
* BufferSpec and ProgramSpec [pr]
* delete preallocate, it's unused
* Revert "delete preallocate, it's unused"
This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
Francis Lata
a1c1b9547f
Context manager support for tqdm ( #7770 )
...
* add context manager support
* add test case for context manager usage
2024-11-18 14:12:03 -05:00
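Context manager support for a progress bar usually amounts to adding `__enter__` and `__exit__`. The sketch below shows that pattern on a hypothetical `MiniBar` class; it is not tinygrad's `tqdm`, and the attribute names are assumptions made for illustration.

```python
class MiniBar:
    """Hypothetical stand-in for a progress bar, showing only the protocol."""

    def __init__(self, total: int):
        self.total = total
        self.n = 0
        self.closed = False

    def update(self, k: int = 1) -> None:
        # Advance the counter by k steps.
        self.n += k

    def close(self) -> None:
        # A real bar would flush its final line here.
        self.closed = True

    def __enter__(self) -> "MiniBar":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Always close, even if the body raised; do not swallow exceptions.
        self.close()
        return False
```

Usage then looks like `with MiniBar(total=3) as bar: bar.update()`, and the bar is closed when the block exits, whether or not it raised.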
chenyu
e3105675fb
cond.where(True, False) is cond ( #7733 )
2024-11-16 09:44:17 -05:00
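The rewrite named in this commit title says that selecting `True` when a boolean condition holds and `False` otherwise is the condition itself. A minimal sketch on a hypothetical tuple-based expression format (not tinygrad's `UOp` or pattern matcher) might look like this:

```python
def where(cond: bool, a, b):
    # Scalar version of a select: a if cond holds, else b.
    return a if cond else b

def simplify_where(expr):
    # expr is assumed to be ("where", cond, a, b) or any other value.
    # Only the exact pattern where(cond, True, False) is folded to cond.
    if (isinstance(expr, tuple) and len(expr) == 4 and expr[0] == "where"
            and expr[2] is True and expr[3] is False):
        return expr[1]
    return expr
```

The fold is only valid when `cond` is already boolean, which is why the sketch checks for the literal `True` and `False` branches rather than any truthy values.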
ignaciosica
597a239e28
Remove UnaryOps, BinaryOps, TernaryOps, MetaOps [pr] ( #7725 )
...
* remove unaryops
* remove ternaryops
* remove metaops
* hotfix
* remove binaryops
* hotfix: test_pattern_matcher
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-16 20:56:56 +08:00
chenyu
aeb1301bab
enable a few tests that work now ( #7721 )
...
should mark the ones that are expected to work with expectedFailure, and delete the ones that are not expected to work
2024-11-15 14:30:52 -05:00
qazal
e84d089ef1
delete ReduceOps, only use REDUCE_AXIS ( #7667 )
2024-11-13 19:04:27 +08:00
qazal
9d6b03d691
early assert swizzle in kernel [pr] ( #7610 )
...
* early assert swizzle in kernel [pr]
* better
* note changes
* TestIndexing 2
2024-11-09 21:54:43 +08:00
chenyu
74b4d1c1e1
rewrite idx again in real_strides after uop_given_valid ( #7600 )
...
uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.
2024-11-08 14:30:32 -05:00
chenyu
c6189e38c1
simplify_valid in real_strides ( #7599 )
...
improved one more real_strides case. after finishing the last one, will think about always applying these in to_indexed_uops
2024-11-08 10:45:22 -05:00
chenyu
a1dfd288bb
different valid order ( #7589 )
...
in simplify_valid, we start with valids that are in others' parents so the others are more likely to be simplified
2024-11-07 20:27:56 -05:00
chenyu
4378b100ad
make UOp.range arg a tuple [pr] ( #7583 )
...
* make UOp.range arg a tuple [pr]
so render works on output of ShapeTracker.to_indexed_uops
* fix
2024-11-07 11:58:09 -05:00
chenyu
bb7b5362be
uop_given_valid in real_strides ( #7231 )
...
simplified idx allows deriving more strides
2024-11-07 09:41:16 -05:00
George Hotz
205befa788
move is_dtype_supported to device [pr] ( #7575 )
2024-11-07 20:38:03 +08:00
Carl Basho
630a7f37cf
update tests ( #7554 )
...
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
George Hotz
99bd4372a5
Ops.ALU is no more, the arg is just an op ( #7525 )
...
* op arg alu [pr]
* more
* more passing
* fix more tests
* more tests passing
* fix single failing test
* so much cleaner
* noop to not have process replay trigger
* fix ptx
2024-11-05 00:22:22 +08:00
George Hotz
9c3ee64a3e
hotfix: QoL assert if op is a str
2024-11-04 17:11:38 +08:00
George Hotz
0c19b6298b
rename ops to have unique names ( #7522 )
2024-11-04 17:09:45 +08:00
George Hotz
bac251d2c1
idx_load_store in lowerer [pr] ( #7477 )
...
* idx_load_store in lowerer [pr]
* fix tests (#7513)
Co-authored-by: John Doe <null@mail.com>
* work
---------
Co-authored-by: Carl Basho <76494676+oldpondplop@users.noreply.github.com>
Co-authored-by: John Doe <null@mail.com>
2024-11-04 10:18:40 +08:00
chenyu
7758f7211b
Revert "s/UPat/Pat ( #7506 )" [pr] ( #7517 )
...
* Revert "s/UPat/Pat (#7506)"
This reverts commit 400011a8c1.
* fix
2024-11-03 16:33:02 -05:00
chenyu
84592225d8
tweak tqdm ( #7510 )
...
reduce parentheses and fuzz more tests now there's no sleep
2024-11-03 12:07:11 -05:00
chenyu
c25a69b97e
fix tqdm tests ( #7509 )
...
time.sleep masked two issues:
(1) iters_per_sec might have unitscale in it, and calling `float` on it fails
(2) default rate is too low to ensure the output matches, it might skip updating
2024-11-03 10:53:22 -05:00
chenyu
400011a8c1
s/UPat/Pat ( #7506 )
2024-11-03 08:26:19 -05:00
George Hotz
c8bf09b7d4
s/UOps/Ops ( #7500 )
...
* s/UOps/Ops [pr]
* fix
2024-11-03 11:26:10 +08:00
George Hotz
a7ba3d2d91
move reduce to lowerer [pr] ( #7462 )
...
* move reduce to lowerer [pr]
* simpler
2024-11-01 16:39:20 +08:00
chenyu
a21434504b
update payne_hanek_reduction [pr] ( #7455 )
2024-10-31 18:41:22 -04:00
chenyu
4065c3dec8
remove special 0 case in frexp ( #7450 )
...
we can safely assume input is non-zero, also removed unneeded bitcast
2024-10-31 13:02:33 -04:00
chenyu
0739895b4d
tiny clean up pow2if and payne_hanek_reduction ( #7423 )
2024-10-30 22:22:48 -04:00
chenyu
118dd7721f
clean up transcendental.rintk [pr] ( #7422 )
...
added unit tests and updated the comment. it's rounding away from 0 for negatives
2024-10-30 20:37:28 -04:00
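The rounding behaviour described for `rintk` (away from 0 for negatives) matches the common "add a signed 0.5, then truncate" trick. The sketch below assumes that is the approach; `rintk_sketch` is a hypothetical scalar stand-in and not the function from `transcendental`.

```python
def rintk_sketch(d: float) -> int:
    # Shift by +0.5 for non-negative inputs and -0.5 for negative ones,
    # then truncate toward zero (which is what int() does on floats).
    return int(d + (-0.5 if d < 0.0 else 0.5))
```

Under this assumption ties round away from zero in both directions (`2.5` gives `3`, `-2.5` gives `-3`), which differs from Python's built-in `round`, where ties go to the nearest even integer.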
chenyu
16e60d25b9
move polyN to helper [pr] ( #7405 )
...
also move `eval_uop` to `test.helpers`
2024-10-30 10:09:57 -04:00
chenyu
6bf38c35e5
clean up transcendental frexp [pr] ( #7384 )
...
also added some unit tests for frexp
2024-10-29 18:51:37 -04:00
chenyu
d3c192b056
Device method cleanup [pr] ( #7375 )
2024-10-29 12:49:47 -04:00
George Hotz
2cfc7b6695
Index everywhere 2 ( #7363 )
...
* indexing everywhere [pr]
* fix tests
2024-10-29 19:29:40 +08:00
George Hotz
572499c71a
add indexing to ops_python ( #7358 )
...
* add indexing to ops_python
* fix image
2024-10-29 18:11:03 +08:00
George Hotz
b647fa7514
rename MathTraits to maximum [pr] ( #7356 )
2024-10-29 16:43:04 +08:00
George Hotz
3989bd2682
idiv + reciprocal [pr] ( #7354 )
...
* idiv + reciprocal
* remove upcast from div
* fix docs
2024-10-29 15:54:19 +08:00
chenyu
c398f2467c
test uop mul min/max do not have nan in 0*inf ( #7340 )
2024-10-28 17:52:01 -04:00
Sieds Lykles
75dcd98e79
Fix calculation of vmin and vmax in multiplication when one src is negative and the other src has negative min and positive max ( #7333 )
...
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-28 16:01:46 -04:00
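The case named in this commit title (one operand entirely negative, the other spanning zero) is where shortcut rules for product bounds tend to go wrong. A safe general rule is to take the min and max over the four corner products. The sketch below illustrates that rule on plain integers; `mul_bounds` is a hypothetical helper, not the code changed in the commit.

```python
from itertools import product

def mul_bounds(a_min: int, a_max: int, b_min: int, b_max: int) -> tuple[int, int]:
    # The extremes of a*b over a box are always attained at a corner.
    corners = [a * b for a, b in product((a_min, a_max), (b_min, b_max))]
    return min(corners), max(corners)

def brute_bounds(a_min: int, a_max: int, b_min: int, b_max: int) -> tuple[int, int]:
    # Exhaustive reference: evaluate every integer pair in the ranges.
    vals = [a * b
            for a in range(a_min, a_max + 1)
            for b in range(b_min, b_max + 1)]
    return min(vals), max(vals)
```

For example, with `a` in `[-3, -1]` and `b` in `[-2, 5]` the corner products are `6, -15, 2, -5`, so the bounds are `(-15, 6)`.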
chenyu
4c855ae692
unit test transcendental helpers ( #7325 )
...
added a test to run UOps with const inputs. there seem to be issues with both payne_hanek_reduction and cody_waite_reduction
2024-10-27 19:55:00 -04:00
chenyu
d66fe7a66f
fix simplify_valid ( #7313 )
...
the simplex should compare with valid bound, not its vmin
2024-10-26 14:21:12 -04:00
chenyu
0a4d01f6d4
disable simplify_valid ( #7312 )
...
fixed test_failure_55. will reenable it later after fixing the bug
2024-10-26 12:42:48 -04:00