George Hotz
52243b258c
move flops_mem to renderer [pr] ( #8320 )
2024-12-18 12:13:17 -08:00
George Hotz
bd9c015b09
tests from grad uop path [pr] ( #8313 )
2024-12-18 09:25:05 -08:00
George Hotz
32df46cd73
test const pattern [pr] ( #8304 )
...
* test const pattern [pr]
* add model to test_tiny
2024-12-17 23:34:17 -08:00
Jyotirmaya Mahanta
45f2fb82d5
add failing tests for merge views ( #8306 )
...
* add failing tests for merge views
* assert is not none
* make linter happy
2024-12-18 01:27:38 -05:00
George Hotz
801e199196
change buffer to not be pointer [pr] ( #8302 )
2024-12-17 16:47:51 -08:00
George Hotz
0794af97db
consts do not realize
2024-12-17 08:53:53 -08:00
George Hotz
4764a4c172
Revert "TIP 3 - Tensor realization spec tests ( #8288 )" ( #8289 )
...
This reverts commit c0d4346b5a.
2024-12-17 08:36:11 -08:00
qazal
c0d4346b5a
TIP 3 - Tensor realization spec tests ( #8288 )
2024-12-18 00:04:50 +08:00
George Hotz
e3731766c9
add a test for UOp representation as Tensor [pr] ( #8278 )
2024-12-16 19:41:29 -08:00
chenyu
3195bd0d12
more test examples to merge views [pr] ( #8277 )
...
these have masks in self and masks in the merged views
2024-12-16 20:44:35 -05:00
chenyu
6e2e56c0ff
unit test for view add when self has a mask [pr] ( #8276 )
2024-12-16 20:07:35 -05:00
chenyu
2bb298f38d
add a missing unittest.main() [pr] ( #8274 )
2024-12-16 14:28:10 -05:00
George Hotz
bcd7ea60f0
hotfix: a few more grad tests
2024-12-13 21:03:02 -08:00
George Hotz
734f2c5344
compute gradient [pr] ( #8237 )
...
* compute gradient [pr]
* schedule_step_with_grads
* second deriv works
2024-12-13 20:46:01 -08:00
chenyu
0708a169dd
more comments and tests to reshape [pr] ( #8236 )
2024-12-13 23:21:51 -05:00
George Hotz
8396d90f91
non controversial changes from optim branch [pr] ( #8234 )
2024-12-13 19:24:16 -08:00
George Hotz
37fa38d272
Revert "switch beautiful_mnist to use new optimizer [pr] ( #8231 )" ( #8233 )
...
This reverts commit e9ee39df22.
2024-12-13 19:07:09 -08:00
George Hotz
e9ee39df22
switch beautiful_mnist to use new optimizer [pr] ( #8231 )
...
* switch beautiful_mnist to use new optimizer [pr]
* fix abstractions3 + docs
* fix OptimizerGroup with schedule_step api
2024-12-13 18:27:16 -08:00
chenyu
e0956c518c
move some ifs from merge_dims to reshape [pr] ( #8229 )
...
the third return value is only used in reshape
2024-12-13 19:56:15 -05:00
George Hotz
e2f87ecf36
start work on new gradient ( #7838 )
...
* start work on new gradient
* more correct
* working tests
* more tests
* work
* add (failing) gradient test
* add view and reduce gradient
* test_add works, many failing test_ops
* add max and reduce max
* add max and reduce max
* 129 failing
* 108 failed
* better view drawing
* 101 failed
* i got 99 failures
* 94 failures
* it's tons of terrible code, but only 50 tests fail
* only 19 failures
* same 19 but shorter
* minimal doesn't matter
* shorter
* lil simpler
* simpler
* simpler
* simpler
* 13 test failures
* nine tests fail
* all ops tests pass
* add contiguous gradient + fix sched tests
* faster by removing toposort calls
* missed one
* add jax to testing
2024-12-13 16:45:53 -08:00
chenyu
e371a23c45
more comments and tests to reshape [pr] ( #8228 )
2024-12-13 18:50:13 -05:00
chenyu
eb0e5a14fd
reorder and comments to reshape [pr] ( #8223 )
...
something feels wrong... constructing a counterexample next
2024-12-13 17:02:27 -05:00
chenyu
ce41e6572d
unit test merge_dim [pr] ( #8195 )
...
looking for better ways to write this. first adding some tests
2024-12-12 17:55:52 -05:00
chenyu
d47530c0d4
fix device canonicalize for :0 in middle [pr] ( #8193 )
...
replace is wrong because it does not check if `:0` is at the end. use re.sub instead
2024-12-12 16:32:36 -05:00
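The difference between the two approaches described in #8193 can be sketched as follows. This is a minimal illustration, not the project's actual code: the helper names `strip_default_index_naive` and `strip_default_index` and the sample device string are hypothetical.

```python
import re

def strip_default_index_naive(device: str) -> str:
    # str.replace removes every ":0", including one in the middle of the name
    return device.replace(":0", "")

def strip_default_index(device: str) -> str:
    # the "$" anchor strips ":0" only when it is the trailing suffix
    return re.sub(r":0$", "", device)

# hypothetical device string with ":0" in the middle
print(strip_default_index_naive("DISK:/tmp/a:0/b"))  # middle ":0" is lost
print(strip_default_index("DISK:/tmp/a:0/b"))        # left unchanged
print(strip_default_index("GPU:0"))                  # trailing ":0" removed
```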
chenyu
40a4c603b9
remove more test skip for webgpu [pr] ( #8192 )
2024-12-12 14:06:35 -05:00
chenyu
0e57152dbb
clean up test_uop_symbolic [pr] ( #8165 )
...
removed old `Node` references
2024-12-11 14:13:19 -05:00
ttomsa
e22d7b6fb0
fix var vmax inside special ( #8116 )
2024-12-09 01:16:08 -05:00
qazal
5dd61035f7
revert VALID early folding for now ( #8114 )
...
This reverts commit 4074f52317.
2024-12-09 00:34:24 +08:00
qazal
4074f52317
VALID early folding ( #8100 )
...
* fold valid
* :)
* fix test_verify_ast
* keep symbolic working
2024-12-07 18:37:47 +08:00
qazal
a97b8fa3c5
maskless const can lower without valid, p1 [pr] ( #8094 )
2024-12-06 23:21:19 +02:00
qazal
df84dc6444
unrelated test fixups from delete_lazy [pr] ( #8088 )
...
* unrelated test fixups from delete_lazy [pr]
* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
chenyu
e7d5fe4a32
improve idiv _min_max ( #8066 )
...
for the cases where we don't know the exact bounds, we might still know the sign. with this, we can remove some resolve for symbolic shapetracker
2024-12-05 23:02:16 -05:00
Sieds Lykles
49c6dab74b
Add pattern for div mod recombine with gcd ( #8061 )
...
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 13:16:58 -05:00
George Hotz
df18e7cc37
accept filename decorator [pr] ( #8049 )
...
* accept filename decorator [pr]
* add test for safe_load
* bring old tar tests back
2024-12-05 11:40:59 +08:00
chenyu
b3220ca7b1
test cases of always True/False lt ( #8048 )
...
* test cases of always True/False lt
* one more
2024-12-04 20:38:40 -05:00
Sieds Lykles
70db1bab5c
Fold nested div with const ( #8010 )
...
* Rebase nested div and with const
* Update the ordering
* return None on vectors
Fixes cpu test
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 14:59:09 -05:00
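The title of #8010 suggests the usual identity for nested division by constants, `(x // a) // b == x // (a * b)`. The sketch below only checks that identity in plain Python, assuming non-negative `x` and positive constants `a` and `b`; the function names are hypothetical and it says nothing about how the rewrite itself is implemented.

```python
def nested_div(x: int, a: int, b: int) -> int:
    # two divisions by constants, one after the other
    return (x // a) // b

def folded_div(x: int, a: int, b: int) -> int:
    # a single division by the product of the constants
    return x // (a * b)

# exhaustive check over a small range (assumption: x >= 0, a > 0, b > 0)
mismatches = [(x, a, b)
              for x in range(200)
              for a in range(1, 7)
              for b in range(1, 7)
              if nested_div(x, a, b) != folded_div(x, a, b)]
print(len(mismatches))
```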
leopf
f0401e14e8
tar_extract with Tensors ( #7853 )
...
* initial
* USTAR, PAX and GNU support + testing
* from_bytes byteorder
* use TarInfo.frombuf
* tensor only usage
* remove contextlib.suppress
* shorter ow,pax
* more tests
* testing length + move tests
* cleanup
* new approach: RawTensorIO
* fix fetch
* enable read test
* cleanup and ignore fix
* fix for python < 3.12
* make it RawIO
* functions
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 17:03:19 +08:00
chenyu
0c060fa040
update uop and tests to not use lt/gt/le/ge [pr] ( #8023 )
...
just use dunder methods, eventually remove those from ops
2024-12-03 21:02:52 -05:00
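As a rough illustration of "just use dunder methods" from #8023, the sketch below builds comparison nodes through `__lt__` and `__gt__` instead of named `lt`/`gt` methods. The `Node` class, its fields, and the op names `CMPLT`, `CONST`, and `VAR` are assumptions made for this example and are not taken from the commit.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Node:
    # hypothetical expression node: an op name, source nodes, and an optional argument
    op: str
    src: tuple = ()
    arg: object = None

    @staticmethod
    def wrap(x):
        # turn plain Python values into constant nodes
        return x if isinstance(x, Node) else Node("CONST", arg=x)

    def __lt__(self, other):
        # `a < b` builds a less-than node
        return Node("CMPLT", (self, Node.wrap(other)))

    def __gt__(self, other):
        # `a > b` is expressed as `b < a`, so only one comparison op is needed
        return Node("CMPLT", (Node.wrap(other), self))

x = Node("VAR", arg="x")
print(x < 3)
print(x > 3)
```

Because Python falls back to the reflected method, `3 < x` produces the same node as `x > 3` without any extra code.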
chenyu
ef3752625b
add test case of realize_size with 0 in shape ( #8011 )
2024-12-03 09:19:50 -05:00
George Hotz
09eac42fd6
cache indexed uops in st [pr] ( #8008 )
...
* cache indexed uops in st [pr]
* remove arg from range
2024-12-03 21:27:07 +08:00
Sieds Lykles
e44183647f
Improved div folding ( #7996 )
...
* First version of div_mod folding together
* Working version with old div folding behaviour
* Test is fixed
* Fix linting
* Happy mypy
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-03 08:11:25 -05:00
chenyu
c7bc75e634
alu(c?t0:f0, c?t1:f1) -> c?alu(t0,t1):alu(f0,f1) ( #7900 )
...
* alu(c?t0:f0, c?t1:f1) -> c?alu(t0,t1):alu(f0,f1)
only do if at least one branch is const, so total alu won't increase
* tests and interesting TODO cases
2024-12-02 17:19:27 -05:00
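The rewrite in #7900 relies on both selects sharing the same condition `c`. The following sketch checks the equivalence by brute force for a few binary operations; `where`, `before`, `after`, and `rewrite_is_equivalent` are hypothetical helpers written for this illustration.

```python
import operator
from itertools import product

def where(c, t, f):
    # scalar stand-in for the ternary c ? t : f
    return t if c else f

def before(alu, c, t0, f0, t1, f1):
    # alu(c?t0:f0, c?t1:f1)
    return alu(where(c, t0, f0), where(c, t1, f1))

def after(alu, c, t0, f0, t1, f1):
    # c?alu(t0,t1):alu(f0,f1)
    return where(c, alu(t0, t1), alu(f0, f1))

def rewrite_is_equivalent(alu, values=range(-2, 3)):
    # compare both forms for every condition and every combination of operands
    return all(before(alu, c, *v) == after(alu, c, *v)
               for c in (False, True)
               for v in product(values, repeat=4))

print(rewrite_is_equivalent(operator.add))
print(rewrite_is_equivalent(operator.mul))
```

The rewritten form contains two ALU ops instead of one, which is presumably why the commit applies it only when at least one branch is constant: an `alu` over two constants can be folded away, so the total does not grow.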
Sieds Lykles
d267a2d9eb
Div mod recombine test for issue ( #7957 )
...
* Add test for failing div_mod recombine
* Add test case when there is gcd in div/mod
2024-11-29 08:47:50 -05:00
Sieds Lykles
864758423e
Don't take const in gcd and change the "nothing_changed" condition ( #7926 )
...
* Don't take const in gcd and change the "nothing_changed" condition
Biggest difference is probably actually that I forgot to check if gcd
changed if nothing else changed
The TODO was fixed by not using the const in the gcd, and then taking it
out
* Fix more tests
2024-11-27 18:07:36 -05:00
chenyu
988d64900b
add TODO case to test_mod_congruence ( #7925 )
...
same alu count but better bounds
2024-11-27 15:23:21 -05:00
Sieds Lykles
d318867776
Factoring gcd out of mod ( #7916 )
...
* Factoring gcd out of mod
Curious if this will be faster/better
* Update bounds on test
2024-11-26 21:17:22 -05:00
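One reading of "factoring gcd out of mod" in #7916 is the identity `(g*y) % (g*c) == g * (y % c)`: when every coefficient of a sum and the modulus share a common factor `g`, the mod can be taken on the reduced terms and scaled back up. The sketch below checks that identity numerically; the function names and the coefficient lists are hypothetical, and the commit's actual rewrite rule may differ.

```python
from functools import reduce
from math import gcd

def mod_direct(coeffs, values, c):
    # (k0*v0 + k1*v1 + ...) % c computed as written
    return sum(k * v for k, v in zip(coeffs, values)) % c

def mod_gcd_factored(coeffs, values, c):
    # divide coefficients and modulus by their common gcd, then scale the result back
    g = reduce(gcd, coeffs, c)
    reduced = sum((k // g) * v for k, v in zip(coeffs, values))
    return (reduced % (c // g)) * g

# example: (6*a + 12*b) % 18 versus ((a + 2*b) % 3) * 6
agree = all(mod_direct((6, 12), (a, b), 18) == mod_gcd_factored((6, 12), (a, b), 18)
            for a in range(20) for b in range(20))
print(agree)
```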
chenyu
ff3f2a9c1a
Revert "move attention upcast ( #7830 )" ( #7903 )
...
This reverts commit c07daf40e7.
2024-11-25 18:59:51 -05:00
chenyu
a49ca0c2ff
clean up fully_flatten [pr] ( #7885 )
...
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-25 06:53:18 -05:00
Sieds Lykles
a49a7c4784
Improved mod folding ( #7887 )
...
* Remove unnecessary if statement
In all paths where something_changed was set to True, remainder is
appended so the list can't be empty
* Working version of improved mod folding
* Fix offset calculation
Passing fuzz_symbolic.py to 130_000 so far
Added an extra test
* Cleaner offset calculation
2024-11-24 22:21:34 -05:00
George Hotz
8c3d3181dd
bottom up rewrite fixes substitute [pr] ( #7862 )
...
* single pass rewrite fixes substitute [pr]
* caching for single_pass_rewrite
* allow multiple rewrites
* a simple test
* bottom_up_rewrite is fully flexible
2024-11-23 20:53:37 +08:00