Commit Graph

7301 Commits

Author SHA1 Message Date
uuuvn
da2245a458 Fix double => half cast on clang (#8265) 2024-12-15 11:24:05 -08:00
qazal
1d21651823 free toposort cache after it goes out of scope [pr] (#8264) 2024-12-15 19:50:42 +02:00
George Hotz
53603c4ec1 simple ignore mop noops [pr] (#8263) 2024-12-15 09:32:42 -08:00
qazal
e1518f1e38 minimal failing test for UOp.toposort gc [pr] (#8261) 2024-12-15 19:30:56 +02:00
qazal
67e66ac1ab hotfix: schedule_uop in process replay (#8260)
* hotfix: schedule_uop in process replay

* notes
2024-12-15 21:24:54 +08:00
qazal
58b224a40f schedule_uop api refactor [pr] (#8259) 2024-12-15 19:50:00 +08:00
qazal
ef1346ab39 simplify to_uop more, toward deleting early bufferize (#8254) 2024-12-15 17:41:31 +08:00
qazal
d05e21cb69 replace lazy srcs with the new uop api [pr] (#8255)
* buf_uop_view function

* srcs shouldn't exist

* fix TestTensorMetadata

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2024-12-15 17:09:54 +08:00
qazal
e0aeb2e9f4 unwrap existing buffer from assign src [pr] (#8252)
* unwrap existing buffer from assign src [pr]

* this is an upat
2024-12-15 03:29:51 +02:00
qazal
ace654a7e4 keep realized passthrough [pr] (#8248)
* keep realized passthrough [pr]

* more pruning
2024-12-15 02:23:15 +02:00
qazal
d78e75f710 hotfix: use ubuntu-22.04 ci from 8249 (#8251) 2024-12-15 02:23:00 +02:00
chenyu
16c1b2379d remove an if in reshape [pr] (#8246)
not needed, likely a hack to work around pre-resolve?
2024-12-14 16:05:50 -05:00
chenyu
4c1733440d failed test case for stable sigmoid (#8245)
it should also work if implemented differently
2024-12-14 15:19:41 -05:00
chenyu
3eb952f537 fix some sigmoid extreme (#8238)
* fix some sigmoid extreme

quite brittle... the problem is that it has 3 terms and mul might have a bad order

* test_tanh_extreme

* just sigmoid gradient
2024-12-14 14:37:06 -05:00
George Hotz
bcd7ea60f0 hotfix: a few more grad tests 2024-12-13 21:03:02 -08:00
George Hotz
734f2c5344 compute gradient [pr] (#8237)
* compute gradient [pr]

* schedule_step_with_grads

* second deriv works
2024-12-13 20:46:01 -08:00
chenyu
0708a169dd more comments and tests to reshape [pr] (#8236) 2024-12-13 23:21:51 -05:00
George Hotz
8396d90f91 non controversial changes from optim branch [pr] (#8234) 2024-12-13 19:24:16 -08:00
George Hotz
37fa38d272 Revert "switch beautiful_mnist to use new optimizer [pr] (#8231)" (#8233)
This reverts commit e9ee39df22.
2024-12-13 19:07:09 -08:00
George Hotz
e9ee39df22 switch beautiful_mnist to use new optimizer [pr] (#8231)
* switch beautiful_mnist to use new optimizer [pr]

* fix abstractions3 + docs

* fix OptimizerGroup with schedule_step api
2024-12-13 18:27:16 -08:00
chenyu
e0956c518c move some ifs from merge_dims to reshape [pr] (#8229)
the third return value is only used in reshape
2024-12-13 19:56:15 -05:00
George Hotz
e2f87ecf36 start work on new gradient (#7838)
* start work on new gradient

* more correct

* working tests

* more tests

* work

* add (failing) gradient test

* add view and reduce gradient

* test_add works, many failing test_ops

* add max and reduce max

* add max and reduce max

* 129 failing

* 108 failed

* better view drawing

* 101 failed

* i got 99 failures

* 94 failures

* it's tons of terrible code, but only 50 tests fail

* only 19 failures

* same 19 but shorter

* minimal doesn't matter

* shorter

* lil simpler

* simpler

* simpler

* simpler

* 13 test failures

* nine tests fail

* all ops tests pass

* add contiguous gradient + fix sched tests

* faster by removing toposort calls

* missed one

* add jax to testing
2024-12-13 16:45:53 -08:00
chenyu
e371a23c45 more comments and tests to reshape [pr] (#8228) 2024-12-13 18:50:13 -05:00
George Hotz
6d83a96440 retry: use movement ops [pr] (#8225)
* Revert "Revert "use movement ops [pr] (#8222)" (#8224)"

This reverts commit da19c37f0a.

* fix cast before view
2024-12-13 15:14:26 -08:00
George Hotz
4679f9fb44 add detach to graph [pr] (#8221)
* add detach to graph [pr]

* accept failure
2024-12-13 14:21:32 -08:00
chenyu
62e19649c0 lower test_conv_3x3_256_32_32_256_256 (#8226)
tiny7 is slow
2024-12-13 17:15:53 -05:00
George Hotz
da19c37f0a Revert "use movement ops [pr] (#8222)" (#8224)
This reverts commit 0d26c970ba.
2024-12-13 14:10:47 -08:00
George Hotz
0d26c970ba use movement ops [pr] (#8222)
* use movement ops [pr]

* test indexing
2024-12-13 14:06:01 -08:00
chenyu
eb0e5a14fd reorder and comments to reshape [pr] (#8223)
something feels wrong... constructing a counter example next
2024-12-13 17:02:27 -05:00
pkotzbach
c1b79c118f add unit tests for to_dtype (#8217)
* add unit test for to_dtype

* add unit test for to_dtype

---------

Co-authored-by: pkotzbach <pawkotz@gmail.com>
2024-12-13 16:21:02 -05:00
George Hotz
8a50868264 touchup function.py [pr] (#8220)
* touchup function.py [pr]

* remove ALLOWED_READ_IMAGE

* eh, keep it, just change it
2024-12-13 13:07:00 -08:00
George Hotz
aff112f8ab add new uops to prep for gradient (#8219) 2024-12-13 11:54:26 -08:00
George Hotz
dbe549e462 rename expand to unroll [pr] (#8218) 2024-12-13 11:41:52 -08:00
ignaciosica
0a00187dce add real AMX tests to benchmark (#8216)
* add real amx to benchmark

* add debug=2 to check tc are triggered
2024-12-13 14:03:41 -05:00
Ahmed Harmouche
70f6183f34 Remove unnecessary wgsl string rewrite pattern (#8215) 2024-12-13 18:27:29 +01:00
geohotstan
eebb3a1bb9 unique names (#8213) 2024-12-13 12:14:47 -05:00
qazal
1824cbd72c s/lazybufs/tensor_uops [pr] (#8207) 2024-12-13 19:20:02 +08:00
qazal
6d6c34eb1e scheduler local graph_rewrite cleanups [pr] (#8206)
* scheduler local graph_rewrite cleanups [pr]

* extra merge
2024-12-13 19:07:09 +08:00
qazal
4a617c84e1 cleanup ctx usage in scheduler upats [pr] (#8205) 2024-12-13 18:01:13 +08:00
qazal
55b8c4e8bf apply_swizzle can apply to any views [pr] (#8204) 2024-12-13 17:58:35 +08:00
Ahmed Harmouche
651f72442c encapsulate the exported webgpu model (#8203) 2024-12-13 10:55:37 +01:00
qazal
5864627abe process replay filter warnings [pr] (#8199) 2024-12-13 17:43:43 +08:00
qazal
c5c0d0277d flatten buffer args, delete dtype [pr] (#8202) 2024-12-13 16:43:47 +08:00
Ahmed Harmouche
5198415bfb No unpack_map in wgsl (#8200) 2024-12-13 08:10:31 +01:00
leopf
fe68dbdb23 GroupOp.Idempotent (#8198) 2024-12-12 20:44:04 -05:00
chenyu
ce41e6572d unit test merge_dim [pr] (#8195)
looking for better ways to write this. first adding some tests
2024-12-12 17:55:52 -05:00
chenyu
d47530c0d4 fix device canonicalize for :0 in middle [pr] (#8193)
replace is wrong because it does not check if `:0` is at the end. use re.sub instead
2024-12-12 16:32:36 -05:00
chenyu
40a4c603b9 remove more test skip for webgpu [pr] (#8192) 2024-12-12 14:06:35 -05:00
chenyu
d586c7e108 remove had_counter from rand (#8191) 2024-12-12 13:35:39 -05:00
chenyu
2fe98e44cd unneeded isinstance(size, int) in alloc [pr] (#8189) 2024-12-12 13:05:02 -05:00