Commit Graph

848 Commits

Author SHA1 Message Date
George Hotz
52243b258c move flops_mem to renderer [pr] (#8320) 2024-12-18 12:13:17 -08:00
George Hotz
bd9c015b09 tests from grad uop path [pr] (#8313) 2024-12-18 09:25:05 -08:00
George Hotz
32df46cd73 test const pattern [pr] (#8304)
* test const pattern [pr]

* add model to test_tiny
2024-12-17 23:34:17 -08:00
Jyotirmaya Mahanta
45f2fb82d5 add failing tests for merge views (#8306)
* add failing tests for merge views

* assert is not none

* make linter happy
2024-12-18 01:27:38 -05:00
George Hotz
801e199196 change buffer to not be pointer [pr] (#8302) 2024-12-17 16:47:51 -08:00
George Hotz
0794af97db consts do not realize 2024-12-17 08:53:53 -08:00
George Hotz
4764a4c172 Revert "TIP 3 - Tensor realization spec tests (#8288)" (#8289)
This reverts commit c0d4346b5a.
2024-12-17 08:36:11 -08:00
qazal
c0d4346b5a TIP 3 - Tensor realization spec tests (#8288) 2024-12-18 00:04:50 +08:00
George Hotz
e3731766c9 add a test for UOp representation as Tensor [pr] (#8278) 2024-12-16 19:41:29 -08:00
chenyu
3195bd0d12 more test examples to merge views [pr] (#8277)
these have masks in self and masks in the merged views
2024-12-16 20:44:35 -05:00
chenyu
6e2e56c0ff unit test for view add when self has a mask [pr] (#8276) 2024-12-16 20:07:35 -05:00
chenyu
2bb298f38d add a missing unittest.main() [pr] (#8274) 2024-12-16 14:28:10 -05:00
George Hotz
bcd7ea60f0 hotfix: a few more grad tests 2024-12-13 21:03:02 -08:00
George Hotz
734f2c5344 compute gradient [pr] (#8237)
* compute gradient [pr]

* schedule_step_with_grads

* second deriv works
2024-12-13 20:46:01 -08:00
chenyu
0708a169dd more comments and tests to reshape [pr] (#8236) 2024-12-13 23:21:51 -05:00
George Hotz
8396d90f91 non controversial changes from optim branch [pr] (#8234) 2024-12-13 19:24:16 -08:00
George Hotz
37fa38d272 Revert "switch beautiful_mnist to use new optimizer [pr] (#8231)" (#8233)
This reverts commit e9ee39df22.
2024-12-13 19:07:09 -08:00
George Hotz
e9ee39df22 switch beautiful_mnist to use new optimizer [pr] (#8231)
* switch beautiful_mnist to use new optimizer [pr]

* fix abstractions3 + docs

* fix OptimizerGroup with schedule_step api
2024-12-13 18:27:16 -08:00
chenyu
e0956c518c move some ifs from merge_dims to reshape [pr] (#8229)
the third return value is only used in reshape
2024-12-13 19:56:15 -05:00
George Hotz
e2f87ecf36 start work on new gradient (#7838)
* start work on new gradient

* more correct

* working tests

* more tests

* work

* add (failing) gradient test

* add view and reduce gradient

* test_add works, many failing test_ops

* add max and reduce max

* add max and reduce max

* 129 failing

* 108 failed

* better view drawing

* 101 failed

* i got 99 failures

* 94 failures

* it's tons of terrible code, but only 50 tests fail

* only 19 failures

* same 19 but shorter

* minimal doesn't matter

* shorter

* lil simpler

* simpler

* simpler

* simpler

* 13 test failures

* nine tests fail

* all ops tests pass

* add contiguous gradient + fix sched tests

* faster by removing toposort calls

* missed one

* add jax to testing
2024-12-13 16:45:53 -08:00
chenyu
e371a23c45 more comments and tests to reshape [pr] (#8228) 2024-12-13 18:50:13 -05:00
chenyu
eb0e5a14fd reorder and comments to reshape [pr] (#8223)
something feels wrong... constructing a counter example next
2024-12-13 17:02:27 -05:00
chenyu
ce41e6572d unit test merge_dim [pr] (#8195)
looking for better ways to write this. first adding some tests
2024-12-12 17:55:52 -05:00
chenyu
d47530c0d4 fix device canonicalize for :0 in middle [pr] (#8193)
replace is wrong because it does not check if `:0` is at the end. use re.sub instead
2024-12-12 16:32:36 -05:00
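
The message for d47530c0d4 above describes a common string-handling pitfall: `str.replace` removes every occurrence of `:0`, while an anchored `re.sub` removes it only at the end. A minimal sketch of the difference follows. The function names and the sample device strings are hypothetical illustrations and are not taken from the repository's code.

```python
import re

def canonicalize_with_replace(device: str) -> str:
    # Hypothetical "before" version: strips ":0" wherever it occurs.
    return device.replace(":0", "")

def canonicalize_with_resub(device: str) -> str:
    # Hypothetical "after" version: "$" anchors the match to the end of the string.
    return re.sub(r":0$", "", device)

# A trailing ":0" is dropped by both versions.
print(canonicalize_with_replace("gpu:0"), canonicalize_with_resub("gpu:0"))

# A ":0" in the middle is dropped only by the replace version.
sample = "disk:/tmp/a:0/file"  # made-up device string
print(canonicalize_with_replace(sample))
print(canonicalize_with_resub(sample))
```

With the anchored pattern, a device string that merely contains `:0` somewhere in the middle is left unchanged.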
chenyu
40a4c603b9 remove more test skip for webgpu [pr] (#8192) 2024-12-12 14:06:35 -05:00
chenyu
0e57152dbb clean up test_uop_symbolic [pr] (#8165)
removed old `Node` references
2024-12-11 14:13:19 -05:00
ttomsa
e22d7b6fb0 fix var vmax inside special (#8116) 2024-12-09 01:16:08 -05:00
qazal
5dd61035f7 revert VALID early folding for now (#8114)
This reverts commit 4074f52317.
2024-12-09 00:34:24 +08:00
qazal
4074f52317 VALID early folding (#8100)
* fold valid

* :)

* fix test_verify_ast

* keep symbolic working
2024-12-07 18:37:47 +08:00
qazal
a97b8fa3c5 maskless const can lower without valid, p1 [pr] (#8094) 2024-12-06 23:21:19 +02:00
qazal
df84dc6444 unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]

* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
chenyu
e7d5fe4a32 improve idiv _min_max (#8066)
for the cases where we don't know the exact bounds, we might still know the sign. with this, can remove some resolve for symbolic shapetracker
2024-12-05 23:02:16 -05:00
Sieds Lykles
49c6dab74b Add pattern for div mod recombine with gcd (#8061)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 13:16:58 -05:00
George Hotz
df18e7cc37 accept filename decorator [pr] (#8049)
* accept filename decorator [pr]

* add test for safe_load

* bring old tar tests back
2024-12-05 11:40:59 +08:00
chenyu
b3220ca7b1 test cases of always True/False lt (#8048)
* test cases of always True/False lt

* one more
2024-12-04 20:38:40 -05:00
Sieds Lykles
70db1bab5c Fold nested div with const (#8010)
* Rebase nested div and with const

* Update the ordering

* return None on vectors

Fixes cpu test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 14:59:09 -05:00
leopf
f0401e14e8 tar_extract with Tensors (#7853)
* initial

* USTAR, PAX and GNU support + testing

* from_bytes byteorder

* use TarInfo.frombuf

* tensor only usage

* remove contextlib.suppress

* shorter ow,pax

* more tests

* testing length + move tests

* cleanup

* new approach: RawTensorIO

* fix fetch

* enable read test

* cleanup and ignore fix

* fix for python < 3.12

* make it RawIO

* functions

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 17:03:19 +08:00
chenyu
0c060fa040 update uop and tests to not use lt/gt/le/ge [pr] (#8023)
just use dunder methods, eventually remove those from ops
2024-12-03 21:02:52 -05:00
chenyu
ef3752625b add test case of realize_size with 0 in shape (#8011) 2024-12-03 09:19:50 -05:00
George Hotz
09eac42fd6 cache indexed uops in st [pr] (#8008)
* cache indexed uops in st [pr]

* remove arg from range
2024-12-03 21:27:07 +08:00
Sieds Lykles
e44183647f Improved div folding (#7996)
* First version of div_mod folding together

* Working version with old div folding behaviour

* Test is fixed

* Fix linting

* Happy mypy

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-03 08:11:25 -05:00
chenyu
c7bc75e634 alu(c?t0:f0, c?t1:f1) -> c?alu(t0,t1):alu(f0,f1) (#7900)
* alu(c?t0:f0, c?t1:f1) -> c?alu(t0,t1):alu(f0,f1)

only do if at least one branch is const, so total alu won't increase

* tests and interesting TODO cases
2024-12-02 17:19:27 -05:00
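
The rewrite named in c7bc75e634 above, `alu(c?t0:f0, c?t1:f1) -> c?alu(t0,t1):alu(f0,f1)`, relies on both selects sharing the same condition `c`. The sketch below checks that identity on plain Python integers. It is an illustration only: `where`, `before` and `after` are hypothetical helpers, not the repository's UOp pattern matcher, and the three operators are an arbitrary sample.

```python
import operator
from itertools import product

def where(c, t, f):
    # Scalar stand-in for the ternary "c ? t : f".
    return t if c else f

def before(alu, c, t0, f0, t1, f1):
    # alu applied to two selects that share the condition c
    return alu(where(c, t0, f0), where(c, t1, f1))

def after(alu, c, t0, f0, t1, f1):
    # a single select over the two alu results
    return where(c, alu(t0, t1), alu(f0, f1))

# Exhaustively compare both forms over a small set of values.
values = (-2, 0, 3)
for alu in (operator.add, operator.sub, operator.mul):
    for c in (False, True):
        for t0, f0, t1, f1 in product(values, repeat=4):
            assert before(alu, c, t0, f0, t1, f1) == after(alu, c, t0, f0, t1, f1)
```

The rewritten form contains two `alu` applications instead of one, which is presumably why the message limits it to cases where at least one branch is const, so that one of them can fold away.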
Sieds Lykles
d267a2d9eb Div mod recombine test for issue (#7957)
* Add test for failing div_mod recombine

* Add test case when there is gcd in div/mod
2024-11-29 08:47:50 -05:00
Sieds Lykles
864758423e Don't take const in gcd and change the "nothing_changed" condition (#7926)
* Don't take const in gcd and change the "nothing_changed" condition

Biggest difference is probably actually that I forgot to check if gcd
changed if nothing else changed
The TODO was fixed by not using the const in the gcd, and then taking it
out

* Fix more tests
2024-11-27 18:07:36 -05:00
chenyu
988d64900b add TODO case to test_mod_congruence (#7925)
same alu count but better bounds
2024-11-27 15:23:21 -05:00
Sieds Lykles
d318867776 Factoring gcd out of mod (#7916)
* Factoring gcd out of mod

Curious if this will be faster/better

* Update bounds on test
2024-11-26 21:17:22 -05:00
chenyu
ff3f2a9c1a Revert "move attention upcast (#7830)" (#7903)
This reverts commit c07daf40e7.
2024-11-25 18:59:51 -05:00
chenyu
a49ca0c2ff clean up fully_flatten [pr] (#7885)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-11-25 06:53:18 -05:00
Sieds Lykles
a49a7c4784 Improved mod folding (#7887)
* Remove unnecessary if statement

In all paths where something_changed was set to True, remainder is
appended so the list can't be empty

* Working version of improved mod folding

* Fix offset calculation

Passing fuzz_symbolic.py to 130_000 so far
Added an extra test

* Cleaner offset calculation
2024-11-24 22:21:34 -05:00
George Hotz
8c3d3181dd bottom up rewrite fixes substitute [pr] (#7862)
* single pass rewrite fixes substitute [pr]

* caching for single_pass_rewrite

* allow multiple rewrites

* a simple test

* bottom_up_rewrite is fully flexible
2024-11-23 20:53:37 +08:00