George Hotz
144e9f00df
viz is local, new test, and new quantize [pr] ( #7859 )
...
* viz is local, new test, and new quantize [pr]
* fix mime types
* remove font
* after index
2024-11-23 14:27:10 +08:00
chenyu
c07daf40e7
move attention upcast ( #7830 )
...
still upcast before softmax, but faster because intermediate buffer can be stored in half (as long as qk is within half range).
2024-11-22 17:10:51 -05:00
George Hotz
c5d458ce02
BufferSpec and ProgramSpec [pr] ( #7814 )
...
* BufferSpec and ProgramSpec [pr]
* delete preallocate, it's unused
* Revert "delete preallocate, it's unused"
This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
Francis Lata
a1c1b9547f
Context manager support for tqdm ( #7770 )
...
* add context manager support
* add test case for context manager usage
2024-11-18 14:12:03 -05:00
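The context manager support described in #7770 can be illustrated with a minimal sketch. The `ProgressBar` class below is hypothetical and is not tinygrad's `tqdm` implementation; it only shows the `__enter__`/`__exit__` protocol that lets a progress bar be used in a `with` block and closed automatically.

```python
class ProgressBar:
    """Hypothetical stand-in for a progress bar; not tinygrad's tqdm."""

    def __init__(self, total):
        self.total = total
        self.n = 0
        self.closed = False

    def update(self, k=1):
        # Advance the counter by k steps.
        self.n += k

    def close(self):
        # Mark the bar as finished (a real bar would flush its final line).
        self.closed = True

    def __enter__(self):
        # The object bound by "with ... as bar" is the bar itself.
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always close, even if the body raised; returning False re-raises.
        self.close()
        return False


with ProgressBar(total=3) as bar:
    for _ in range(3):
        bar.update()
```

After the `with` block exits, the bar has been closed without an explicit `close()` call.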
chenyu
e3105675fb
cond.where(True, False) is cond ( #7733 )
2024-11-16 09:44:17 -05:00
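The identity in the title of #7733 can be checked on plain Python lists. The `where` helper below is a hypothetical elementwise select written for illustration, not the tinygrad `Tensor.where` method; it assumes `cond` already holds booleans.

```python
def where(cond, a, b):
    # Elementwise select: pick a where cond is true, otherwise b.
    return [a if c else b for c in cond]


cond = [True, False, True]
# Selecting True where cond holds and False elsewhere reproduces cond,
# so the select can be dropped entirely.
result = where(cond, True, False)
```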
ignaciosica
597a239e28
Remove UnaryOps, BinaryOps, TernaryOps, MetaOps [pr] ( #7725 )
...
* remove unaryops
* remove ternaryops
* remove metaops
* hotfix
* remove binaryops
* hotfix: test_pattern_matcher
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-16 20:56:56 +08:00
chenyu
aeb1301bab
enable a few tests that work now ( #7721 )
...
should mark the ones that are expected to work with expectedFailure, and delete the ones that are not expected to work
2024-11-15 14:30:52 -05:00
qazal
e84d089ef1
delete ReduceOps, only use REDUCE_AXIS ( #7667 )
2024-11-13 19:04:27 +08:00
qazal
9d6b03d691
early assert swizzle in kernel [pr] ( #7610 )
...
* early assert swizzle in kernel [pr]
* better
* note changes
* TestIndexing 2
2024-11-09 21:54:43 +08:00
chenyu
74b4d1c1e1
rewrite idx again in real_strides after uop_given_valid ( #7600 )
...
uop_given_valid does not guarantee output to be flat. fixed one last real_strides test.
2024-11-08 14:30:32 -05:00
chenyu
c6189e38c1
simplify_valid in real_strides ( #7599 )
...
improved one more real_strides. after finishing the last one will think about always applying these in to_indexed_uops
2024-11-08 10:45:22 -05:00
chenyu
a1dfd288bb
different valid order ( #7589 )
...
in simplify_valid, we start with valids that are in others' parents so the others are more likely to be simplified
2024-11-07 20:27:56 -05:00
chenyu
4378b100ad
make UOp.range arg a tuple [pr] ( #7583 )
...
* make UOp.range arg a tuple [pr]
so render works on output of ShapeTracker.to_indexed_uops
* fix
2024-11-07 11:58:09 -05:00
chenyu
bb7b5362be
uop_given_valid in real_strides ( #7231 )
...
simplified idx allows deriving more strides
2024-11-07 09:41:16 -05:00
George Hotz
205befa788
move is_dtype_supported to device [pr] ( #7575 )
2024-11-07 20:38:03 +08:00
Carl Basho
630a7f37cf
update tests ( #7554 )
...
Co-authored-by: John Doe <null@mail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-05 11:35:15 -05:00
George Hotz
99bd4372a5
Ops.ALU is no more, the arg is just an op ( #7525 )
...
* op arg alu [pr]
* more
* more passing
* fix more tests
* more tests passing
* fix single failing test
* so much cleaner
* noop to not have process replay trigger
* fix ptx
2024-11-05 00:22:22 +08:00
George Hotz
9c3ee64a3e
hotfix: QoL assert if op is a str
2024-11-04 17:11:38 +08:00
George Hotz
0c19b6298b
rename ops to have unique names ( #7522 )
2024-11-04 17:09:45 +08:00
George Hotz
bac251d2c1
idx_load_store in lowerer [pr] ( #7477 )
...
* idx_load_store in lowerer [pr]
* fix tests (#7513)
Co-authored-by: John Doe <null@mail.com>
* work
---------
Co-authored-by: Carl Basho <76494676+oldpondplop@users.noreply.github.com>
Co-authored-by: John Doe <null@mail.com>
2024-11-04 10:18:40 +08:00
chenyu
7758f7211b
Revert "s/UPat/Pat ( #7506 )" [pr] ( #7517 )
...
* Revert "s/UPat/Pat (#7506)"
This reverts commit 400011a8c1.
* fix
2024-11-03 16:33:02 -05:00
chenyu
84592225d8
tweak tqdm ( #7510 )
...
reduce parentheses and fuzz more tests now there's no sleep
2024-11-03 12:07:11 -05:00
chenyu
c25a69b97e
fix tqdm tests ( #7509 )
...
time.sleep masked two issues:
(1) iters_per_sec might have unitscale in it, and calling `float` on it fails
(2) default rate is too low to ensure the output matches, it might skip updating
2024-11-03 10:53:22 -05:00
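Issue (1) in #7509 is easy to reproduce in isolation: Python's `float` rejects a string that carries a unit-scale suffix. The `parse_rate` helper and its suffix table below are hypothetical, shown only to make the failure mode concrete; they are not the fix applied in the commit.

```python
def parse_rate(text):
    """Hypothetical parser for rate strings such as '1.50k' or '12.5'."""
    suffixes = {"k": 1e3, "M": 1e6, "G": 1e9}  # assumed unit-scale suffixes
    if text and text[-1] in suffixes:
        # Strip the suffix and scale the numeric part.
        return float(text[:-1]) * suffixes[text[-1]]
    return float(text)


try:
    float("1.50k")  # a unit-scaled value is not a valid float literal
    float_failed = False
except ValueError:
    float_failed = True

rate = parse_rate("1.50k")
```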
chenyu
400011a8c1
s/UPat/Pat ( #7506 )
2024-11-03 08:26:19 -05:00
George Hotz
c8bf09b7d4
s/UOps/Ops ( #7500 )
...
* s/UOps/Ops [pr]
* fix
2024-11-03 11:26:10 +08:00
George Hotz
a7ba3d2d91
move reduce to lowerer [pr] ( #7462 )
...
* move reduce to lowerer [pr]
* simpler
2024-11-01 16:39:20 +08:00
chenyu
a21434504b
update payne_hanek_reduction [pr] ( #7455 )
2024-10-31 18:41:22 -04:00
chenyu
4065c3dec8
remove special 0 case in frexp ( #7450 )
...
we can safely assume input is non-zero, also removed unneeded bitcast
2024-10-31 13:02:33 -04:00
chenyu
0739895b4d
tiny clean up pow2if and payne_hanek_reduction ( #7423 )
2024-10-30 22:22:48 -04:00
chenyu
118dd7721f
clean up transcendental.rintk [pr] ( #7422 )
...
added unit tests and updated the comment. it's rounding away from 0 for negatives
2024-10-30 20:37:28 -04:00
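The comment about `rintk` rounding away from 0 for negatives can be read as round-half-away-from-zero. The sketch below is one way to get that behavior, by adding a signed 0.5 and truncating; it is an assumption about the intent, not the code in `transcendental.rintk`, and `rint_half_away` is a hypothetical name.

```python
def rint_half_away(x: float) -> int:
    # Add 0.5 with the sign of x, then truncate toward zero.
    # Ties therefore move away from zero on both sides.
    return int(x + 0.5) if x >= 0 else int(x - 0.5)


positive_tie = rint_half_away(2.5)
negative_tie = rint_half_away(-2.5)
```

This differs from Python's built-in `round`, which rounds ties to the nearest even integer.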
chenyu
16e60d25b9
move polyN to helper [pr] ( #7405 )
...
also move `eval_uop` to `test.helpers`
2024-10-30 10:09:57 -04:00
chenyu
6bf38c35e5
clean up transcendental frexp [pr] ( #7384 )
...
also added some unit tests for frexp
2024-10-29 18:51:37 -04:00
chenyu
d3c192b056
Device method cleanup [pr] ( #7375 )
2024-10-29 12:49:47 -04:00
George Hotz
2cfc7b6695
Index everywhere 2 ( #7363 )
...
* indexing everywhere [pr]
* fix tests
2024-10-29 19:29:40 +08:00
George Hotz
572499c71a
add indexing to ops_python ( #7358 )
...
* add indexing to ops_python
* fix image
2024-10-29 18:11:03 +08:00
George Hotz
b647fa7514
rename MathTraits to maximum [pr] ( #7356 )
2024-10-29 16:43:04 +08:00
George Hotz
3989bd2682
idiv + reciprocal [pr] ( #7354 )
...
* idiv + reciprocal
* remove upcast from div
* fix docs
2024-10-29 15:54:19 +08:00
chenyu
c398f2467c
test uop mul min/max do not have nan in 0*inf ( #7340 )
2024-10-28 17:52:01 -04:00
Sieds Lykles
75dcd98e79
Fix calculation of vmin and vmax in multiplication when one src is negative and the other src has negative min and positive max ( #7333 )
...
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-10-28 16:01:46 -04:00
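The case fixed in #7333 (one source entirely negative, the other spanning zero) is covered by the standard interval-arithmetic rule: the bounds of a product are the minimum and maximum of the four corner products. The `mul_bounds` function below is a hypothetical sketch of that rule for finite bounds, not the tinygrad implementation.

```python
def mul_bounds(a_min, a_max, b_min, b_max):
    # The extremes of a*b over two intervals always occur at the corners,
    # so checking all four products is sufficient for finite bounds.
    products = [a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max]
    return min(products), max(products)


# One source is strictly negative, the other has a negative min and positive max.
bounds = mul_bounds(-3, -1, -2, 4)
```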
chenyu
4c855ae692
unit test transcendental helpers ( #7325 )
...
added a test to run UOps with const inputs. seems to have issues with both payne_hanek_reduction and cody_waite_reduction
2024-10-27 19:55:00 -04:00
chenyu
d66fe7a66f
fix simplify_valid ( #7313 )
...
the simplex should compare with valid bound, not its vmin
2024-10-26 14:21:12 -04:00
chenyu
0a4d01f6d4
disable simplify_valid ( #7312 )
...
fixed test_failure_55. will reenable it later after fixing the bug
2024-10-26 12:42:48 -04:00
chenyu
e7cd21c5e3
remove custom render in test_simplify_valid_idx ( #7303 )
...
use UOp render to compare
2024-10-25 10:20:26 -04:00
George Hotz
199a991237
line reduction [pr] ( #7296 )
2024-10-25 17:05:09 +07:00
George Hotz
4812801aa6
try for canonical order ( #7286 )
...
* try for canonical order
* cmp better
* disable bad tests
* flip const order
* fix test
* fix tests
* different fix for NOOP
* metaclass here
* fix tests
* narrower scope
2024-10-25 16:04:54 +08:00
George Hotz
004af512e6
try all matches in the function ( #7288 )
2024-10-25 14:17:04 +08:00
chenyu
90f720d703
limit idiv by neg bound to only if s0 is non-negative [pr] ( #7277 )
...
also updated the tests when div by negative const
2024-10-24 15:46:50 -04:00
George Hotz
63048ad880
don't recreate COMMUTATIVE the other way ( #7255 )
...
* don't recreate COMMUTATIVE the other way
* add shl and add passing test
* fix tests and move assignment to __new__
* that can stay there
* happy mypy
2024-10-24 14:38:29 +08:00
chenyu
e90bbe6bbc
failed test cases for 3+ views shapetracker strides ( #7226 )
2024-10-22 18:49:13 -04:00
George Hotz
4013c9848c
don't use tons of memory for tests non CI [pr] ( #7209 )
...
* don't use tons of memory for tests
* fix import and clean up pre-commit
* use pathlib
* no shm on windows
* Revert "use pathlib"
This reverts commit 7c38489820.
* run pre-commit hooks in test
* ugh, fix later
2024-10-22 15:04:51 +08:00