456 Commits

Author SHA1 Message Date
Sieds Lykles
d267a2d9eb Div mod recombine test for issue (#7957)
* Add test for failing div_mod recombine

* Add test case when there is gcd in div/mod
2024-11-29 08:47:50 -05:00
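Note: the div/mod recombination these tests exercise rests on the integer identity (x // c) * c + x % c == x. The following is a minimal sketch in plain Python, not the project's own rewrite code; `recombine` and `recombine_gcd` are hypothetical helper names, and the constants 6 and 4 are chosen only to illustrate a shared gcd between the coefficient and the divisor.

```python
def recombine(x: int, c: int) -> int:
    """Rebuild x from its floor quotient and remainder by c (assumes c > 0)."""
    return (x // c) * c + x % c


def recombine_gcd(x: int) -> int:
    """Same identity where the coefficient 6 and the divisor 4 share gcd 2."""
    return ((6 * x) // 4) * 4 + (6 * x) % 4


# The identity holds for every integer x under Python's floor division.
for x in range(-20, 21):
    for c in (1, 2, 3, 6):
        assert recombine(x, c) == x
    assert recombine_gcd(x) == 6 * x
```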
Sieds Lykles
864758423e Don't take const in gcd and change the "nothing_changed" condition (#7926)
* Don't take const in gcd and change the "nothing_changed" condition

Biggest difference is probably actually that I forgot to check if gcd
changed if nothing else changed
The TODO was fixed by not using the const in the gcd, and then taking it
out

* Fix more tests
2024-11-27 18:07:36 -05:00
chenyu
988d64900b add TODO case to test_mod_congruence (#7925)
same alu count but better bounds
2024-11-27 15:23:21 -05:00
Sieds Lykles
d318867776 Factoring gcd out of mod (#7916)
* Factoring gcd out of mod

Curious if this will be faster/better

* Update bounds on test
2024-11-26 21:17:22 -05:00
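Note: the arithmetic behind factoring a gcd out of a mod is that (g*a) % (g*c) == g * (a % c) for g > 0 and c > 0. The sketch below checks this with plain Python integers. It is an illustration of the identity only, under the assumption that the commit relies on it; `mod_with_gcd_factored` is a hypothetical name and is not taken from the repository.

```python
from math import gcd


def mod_with_gcd_factored(a: int, x: int, c: int) -> int:
    """Compute (a * x) % c by dividing g = gcd(a, c) out first (assumes c > 0)."""
    g = gcd(a, c)
    # (a * x) % c == g * (((a // g) * x) % (c // g)) under floor modulo
    return g * (((a // g) * x) % (c // g))


# Compare against the direct computation over a small range of inputs.
for a in (-6, 4, 6, 9):
    for c in (4, 6, 12):
        for x in range(-15, 16):
            assert mod_with_gcd_factored(a, x, c) == (a * x) % c
```

The factored form has a smaller modulus (c // g), which is one reason such a rewrite can give tighter bounds.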
chenyu
ff3f2a9c1a Revert "move attention upcast (#7830)" (#7903)
This reverts commit c07daf40e7.
2024-11-25 18:59:51 -05:00
chenyu
a49ca0c2ff clean up fully_flatten [pr] (#7885)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-25 06:53:18 -05:00
Sieds Lykles
a49a7c4784 Improved mod folding (#7887)
* Remove unnecessary if statement

In all paths where something_changed was set to True, remainder is
appended so the list can't be empty

* Working version of improved mod folding

* Fix offset calculation

Passing fuzz_symbolic.py to 130_000 so far
Added an extra test

* Cleaner offset calculation
2024-11-24 22:21:34 -05:00
George Hotz
8c3d3181dd bottom up rewrite fixes substitute [pr] (#7862)
* single pass rewrite fixes substitute [pr]

* caching for single_pass_rewrite

* allow multiple rewrites

* a simple test

* bottom_up_rewrite is fully flexible
2024-11-23 20:53:37 +08:00
George Hotz
144e9f00df viz is local, new test, and new quantize [pr] (#7859)
* viz is local, new test, and new quantize [pr]

* fix mime types

* remove font

* after index
2024-11-23 14:27:10 +08:00
chenyu
c07daf40e7 move attention upcast (#7830)
still upcast before softmax, but faster because intermediate buffer can be stored in half (as long as qk is within half range).
2024-11-22 17:10:51 -05:00
George Hotz
c5d458ce02 BufferSpec and ProgramSpec [pr] (#7814)
* BufferSpec and ProgramSpec [pr]

* delete preallocate, it's unused

* Revert "delete preallocate, it's unused"

This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
Francis Lata
a1c1b9547f Context manager support for tqdm (#7770)
* add context manager support

* add test case for context manager usage
2024-11-18 14:12:03 -05:00
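Note: context manager support for a progress bar usually means implementing `__enter__` and `__exit__` so the bar is closed when the `with` block ends. The class below is a minimal sketch of that pattern; `MiniBar` is a hypothetical stand-in and does not reproduce the project's tqdm implementation or its attributes.

```python
class MiniBar:
    """Hypothetical stand-in for a progress bar; shows only the with-protocol."""

    def __init__(self, total: int):
        self.total = total
        self.n = 0
        self.closed = False

    def update(self, k: int = 1) -> None:
        self.n += k

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "MiniBar":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        self.close()
        return False  # never swallow exceptions raised inside the block


with MiniBar(total=3) as bar:
    for _ in range(3):
        bar.update()
```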
chenyu
e3105675fb cond.where(True, False) is cond (#7733) 2024-11-16 09:44:17 -05:00
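Note: the fold named in this commit title follows from the definition of a select: choosing True when the condition holds and False otherwise returns the condition itself. A minimal sketch with plain Python booleans, where `where` is a hypothetical free function standing in for the tensor/UOp method:

```python
def where(cond: bool, a, b):
    """Scalar select: a if cond is true, otherwise b."""
    return a if cond else b


# For both possible boolean inputs, the select collapses to the condition.
for cond in (True, False):
    assert where(cond, True, False) is cond
```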
ignaciosica
597a239e28 Remove UnaryOps, BinaryOps, TernaryOps, MetaOps [pr] (#7725)
* remove unaryops

* remove ternaryops

* remove metaops

* hotfix

* remove binaryops

* hotfix: test_pattern_matcher

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-16 20:56:56 +08:00
chenyu
aeb1301bab enable a few tests that work now (#7721)
should mark the ones that are expected to work with expectedFailure, and delete the ones that are not expected to work
2024-11-15 14:30:52 -05:00
qazal
e84d089ef1 delete ReduceOps, only use REDUCE_AXIS (#7667) 2024-11-13 19:04:27 +08:00
qazal
9d6b03d691 early assert swizzle in kernel [pr] (#7610)
* early assert swizzle in kernel [pr]

* better

* note changes

* TestIndexing 2
2024-11-09 21:54:43 +08:00
chenyu
74b4d1c1e1 rewrite idx again in real_strides after uop_given_valid (#7600)
uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.
2024-11-08 14:30:32 -05:00
chenyu
c6189e38c1 simplify_valid in real_strides (#7599)
improved one more real_strides. after finishing the last one will think about always applying these in to_indexed_uops
2024-11-08 10:45:22 -05:00
chenyu
a1dfd288bb different valid order (#7589)
in simplify_valid, we start with valids that are in others' parent so the others are more likely to be simplified
2024-11-07 20:27:56 -05:00
chenyu
4378b100ad make UOp.range arg a tuple [pr] (#7583)
* make UOp.range arg a tuple [pr]

so render works on output of ShapeTracker.to_indexed_uops

* fix
2024-11-07 11:58:09 -05:00
chenyu
bb7b5362be uop_given_valid in real_strides (#7231)
simplified idx allows deriving more strides
2024-11-07 09:41:16 -05:00
George Hotz
205befa788 move is_dtype_supported to device [pr] (#7575) 2024-11-07 20:38:03 +08:00
Carl Basho
630a7f37cf update tests (#7554)
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
George Hotz
99bd4372a5 Ops.ALU is no more, the arg is just an op (#7525)
* op arg alu [pr]

* more

* more passing

* fix more tests

* more tests passing

* fix single failing test

* so much cleaner

* noop to not have process replay trigger

* fix ptx
2024-11-05 00:22:22 +08:00
George Hotz
9c3ee64a3e hotfix: QoL assert if op is a str 2024-11-04 17:11:38 +08:00
George Hotz
0c19b6298b rename ops to have unique names (#7522) 2024-11-04 17:09:45 +08:00
George Hotz
bac251d2c1 idx_load_store in lowerer [pr] (#7477)
* idx_load_store in lowerer [pr]

* fix tests (#7513)

Co-authored-by: John Doe <null@mail.com>

* work

---------

Co-authored-by: Carl Basho <76494676+oldpondplop@users.noreply.github.com>
Co-authored-by: John Doe <null@mail.com>
2024-11-04 10:18:40 +08:00
chenyu
7758f7211b Revert "s/UPat/Pat (#7506)" [pr] (#7517)
* Revert "s/UPat/Pat (#7506)"

This reverts commit 400011a8c1.

* fix
2024-11-03 16:33:02 -05:00
chenyu
84592225d8 tweak tqdm (#7510)
reduce parentheses and fuzz more tests now there's no sleep
2024-11-03 12:07:11 -05:00
chenyu
c25a69b97e fix tqdm tests (#7509)
time.sleep masked two issues:
(1) iters_per_sec might have unitscale in it, and calling `float` on it fails
(2) default rate is too low to ensure the output matches, it might skip updating
2024-11-03 10:53:22 -05:00
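Note: issue (1) above can be reproduced in isolation: a rate string with a unit-scale suffix is not a valid argument to `float`. The sketch below assumes a suffix format such as "1.50k"; the suffix table and the helper name `parse_rate` are hypothetical and are not taken from the repository's tests.

```python
# Assumed unit-scale suffixes; the real output format may differ.
SUFFIXES = {"k": 1e3, "M": 1e6, "G": 1e9}


def parse_rate(text: str) -> float:
    """Parse a rate such as '42.0' or '1.50k' into a plain float."""
    text = text.strip()
    if text and text[-1] in SUFFIXES:
        return float(text[:-1]) * SUFFIXES[text[-1]]
    return float(text)


# Calling float() directly on a scaled value raises ValueError.
try:
    float("1.50k")
    direct_float_failed = False
except ValueError:
    direct_float_failed = True
```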
chenyu
400011a8c1 s/UPat/Pat (#7506) 2024-11-03 08:26:19 -05:00
George Hotz
c8bf09b7d4 s/UOps/Ops (#7500)
* s/UOps/Ops [pr]

* fix
2024-11-03 11:26:10 +08:00
George Hotz
a7ba3d2d91 move reduce to lowerer [pr] (#7462)
* move reduce to lowerer [pr]

* simpler
2024-11-01 16:39:20 +08:00
chenyu
a21434504b update payne_hanek_reduction [pr] (#7455) 2024-10-31 18:41:22 -04:00
chenyu
4065c3dec8 remove special 0 case in frexp (#7450)
we can safely assume input is non-zero, also removed unneeded bitcast
2024-10-31 13:02:33 -04:00
chenyu
0739895b4d tiny clean up pow2if and payne_hanek_reduction (#7423) 2024-10-30 22:22:48 -04:00
chenyu
118dd7721f clean up transcendental.rintk [pr] (#7422)
added unit tests and updated the comment. it's rounding away from 0 for negatives
2024-10-30 20:37:28 -04:00
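Note: one common way to get rounding that moves away from 0 for negatives is to add 0.5 with the sign of the input and then truncate. The sketch below assumes that approach; `rint_away` is a hypothetical scalar helper and is not the project's `rintk`.

```python
def rint_away(d: float) -> int:
    """Round to the nearest int, with ties going away from zero."""
    # int() truncates toward zero, so the signed 0.5 offset pushes ties outward.
    return int(d + (-0.5 if d < 0.0 else 0.5))
```

This differs from Python's built-in `round`, which sends ties to the nearest even integer, so `round(-2.5)` is -2 while `rint_away(-2.5)` is -3.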
chenyu
16e60d25b9 move polyN to helper [pr] (#7405)
also move `eval_uop` to `test.helpers`
2024-10-30 10:09:57 -04:00
chenyu
6bf38c35e5 clean up transcendental frexp [pr] (#7384)
also added some unit tests for frexp
2024-10-29 18:51:37 -04:00
chenyu
d3c192b056 Device method cleanup [pr] (#7375) 2024-10-29 12:49:47 -04:00
George Hotz
2cfc7b6695 Index everywhere 2 (#7363)
* indexing everywhere [pr]

* fix tests
2024-10-29 19:29:40 +08:00
George Hotz
572499c71a add indexing to ops_python (#7358)
* add indexing to ops_python

* fix image
2024-10-29 18:11:03 +08:00
George Hotz
b647fa7514 rename MathTraits to maximum [pr] (#7356) 2024-10-29 16:43:04 +08:00
George Hotz
3989bd2682 idiv + reciprocal [pr] (#7354)
* idiv + reciprocal

* remove upcast from div

* fix docs
2024-10-29 15:54:19 +08:00
chenyu
c398f2467c test uop mul min/max do not have nan in 0*inf (#7340) 2024-10-28 17:52:01 -04:00
Sieds Lykles
75dcd98e79 Fix calculation of vmin and vmax in multiplication when one src is negative and the other src has negative min and positive max (#7333)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-28 16:01:46 -04:00
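Note: the case named in this commit title (one source entirely negative, the other spanning zero) is where sign-based shortcuts for product bounds tend to go wrong. A generic interval-arithmetic sketch that is always safe takes the min and max over the four corner products; `mul_bounds` is a hypothetical helper and is not the project's vmin/vmax code.

```python
def mul_bounds(a_min: int, a_max: int, b_min: int, b_max: int) -> tuple[int, int]:
    """Bounds of a*b for a in [a_min, a_max] and b in [b_min, b_max]."""
    corners = (a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max)
    return min(corners), max(corners)


# One source is negative, the other has a negative min and a positive max.
lo, hi = mul_bounds(-3, -1, -2, 5)
```

Here the minimum comes from a_min * b_max and the maximum from a_min * b_min, so neither bound is the product of the two minimums or the two maximums alone.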
chenyu
4c855ae692 unit test transcendental helpers (#7325)
added a test to run UOps with const inputs. seems to have issue with both payne_hanek_reduction and cody_waite_reduction
2024-10-27 19:55:00 -04:00
chenyu
d66fe7a66f fix simplify_valid (#7313)
the simplex should compare with valid bound, not its vmin
2024-10-26 14:21:12 -04:00
chenyu
0a4d01f6d4 disable simplify_valid (#7312)
fixed test_failure_55. will reenable it later after fixing the bug
2024-10-26 12:42:48 -04:00