uuuvn
da2245a458
Fix double => half cast on clang ( #8265 )
2024-12-15 11:24:05 -08:00
qazal
1d21651823
free toposort cache after it goes out of scope [pr] ( #8264 )
2024-12-15 19:50:42 +02:00
George Hotz
53603c4ec1
simple ignore mop noops [pr] ( #8263 )
2024-12-15 09:32:42 -08:00
qazal
e1518f1e38
minimal failing test for UOp.toposort gc [pr] ( #8261 )
2024-12-15 19:30:56 +02:00
qazal
67e66ac1ab
hotfix: schedule_uop in process replay ( #8260 )
...
* hotfix: schedule_uop in process replay
* notes
2024-12-15 21:24:54 +08:00
qazal
58b224a40f
schedule_uop api refactor [pr] ( #8259 )
2024-12-15 19:50:00 +08:00
qazal
ef1346ab39
simplify to_uop more, toward deleting early bufferize ( #8254 )
2024-12-15 17:41:31 +08:00
qazal
d05e21cb69
replace lazy srcs with the new uop api [pr] ( #8255 )
...
* buf_uop_view function
* srcs shouldn't exist
* fix TestTensorMetadata
---------
Co-authored-by: George Hotz <geohot@gmail.com>
2024-12-15 17:09:54 +08:00
qazal
e0aeb2e9f4
unwrap existing buffer from assign src [pr] ( #8252 )
...
* unwrap existing buffer from assign src [pr]
* this is an upat
2024-12-15 03:29:51 +02:00
qazal
ace654a7e4
keep realized passthrough [pr] ( #8248 )
...
* keep realized passthrough [pr]
* more pruning
2024-12-15 02:23:15 +02:00
qazal
d78e75f710
hotfix: use ubuntu-22.04 ci from 8249 ( #8251 )
2024-12-15 02:23:00 +02:00
chenyu
16c1b2379d
remove an if in reshape [pr] ( #8246 )
...
not needed, likely a hack to work around pre-resolve?
2024-12-14 16:05:50 -05:00
chenyu
4c1733440d
failed test case for stable sigmoid ( #8245 )
...
it should also work if implemented differently
2024-12-14 15:19:41 -05:00
chenyu
3eb952f537
fix some sigmoid extreme ( #8238 )
...
* fix some sigmoid extreme
quite brittle... the problem is that it has 3 terms and the mul might have a bad order
* test_tanh_extreme
* just sigmoid gradient
2024-12-14 14:37:06 -05:00
George Hotz
bcd7ea60f0
hotfix: a few more grad tests
2024-12-13 21:03:02 -08:00
George Hotz
734f2c5344
compute gradient [pr] ( #8237 )
...
* compute gradient [pr]
* schedule_step_with_grads
* second deriv works
2024-12-13 20:46:01 -08:00
chenyu
0708a169dd
more comments and tests to reshape [pr] ( #8236 )
2024-12-13 23:21:51 -05:00
George Hotz
8396d90f91
non controversial changes from optim branch [pr] ( #8234 )
2024-12-13 19:24:16 -08:00
George Hotz
37fa38d272
Revert "switch beautiful_mnist to use new optimizer [pr] ( #8231 )" ( #8233 )
...
This reverts commit e9ee39df22.
2024-12-13 19:07:09 -08:00
George Hotz
e9ee39df22
switch beautiful_mnist to use new optimizer [pr] ( #8231 )
...
* switch beautiful_mnist to use new optimizer [pr]
* fix abstractions3 + docs
* fix OptimizerGroup with schedule_step api
2024-12-13 18:27:16 -08:00
chenyu
e0956c518c
move some ifs from merge_dims to reshape [pr] ( #8229 )
...
the third return value is only used in reshape
2024-12-13 19:56:15 -05:00
George Hotz
e2f87ecf36
start work on new gradient ( #7838 )
...
* start work on new gradient
* more correct
* working tests
* more tests
* work
* add (failing) gradient test
* add view and reduce gradient
* test_add works, many failing test_ops
* add max and reduce max
* add max and reduce max
* 129 failing
* 108 failed
* better view drawing
* 101 failed
* i got 99 failures
* 94 failures
* it's tons of terrible code, but only 50 tests fail
* only 19 failures
* same 19 but shorter
* minimal doesn't matter
* shorter
* lil simpler
* simpler
* simpler
* simpler
* 13 test failures
* nine tests fail
* all ops tests pass
* add contiguous gradient + fix sched tests
* faster by removing toposort calls
* missed one
* add jax to testing
2024-12-13 16:45:53 -08:00
chenyu
e371a23c45
more comments and tests to reshape [pr] ( #8228 )
2024-12-13 18:50:13 -05:00
George Hotz
6d83a96440
retry: use movement ops [pr] ( #8225 )
...
* Revert "Revert "use movement ops [pr] (#8222 )" (#8224 )"
This reverts commit da19c37f0a .
* fix cast before view
2024-12-13 15:14:26 -08:00
George Hotz
4679f9fb44
add detach to graph [pr] ( #8221 )
...
* add detach to graph [pr]
* accept failure
2024-12-13 14:21:32 -08:00
chenyu
62e19649c0
lower test_conv_3x3_256_32_32_256_256 ( #8226 )
...
tiny7 is slow
2024-12-13 17:15:53 -05:00
George Hotz
da19c37f0a
Revert "use movement ops [pr] ( #8222 )" ( #8224 )
...
This reverts commit 0d26c970ba.
2024-12-13 14:10:47 -08:00
George Hotz
0d26c970ba
use movement ops [pr] ( #8222 )
...
* use movement ops [pr]
* test indexing
2024-12-13 14:06:01 -08:00
chenyu
eb0e5a14fd
reorder and comments to reshape [pr] ( #8223 )
...
something feels wrong... constructing a counterexample next
2024-12-13 17:02:27 -05:00
pkotzbach
c1b79c118f
add unit tests for to_dtype ( #8217 )
...
* add unit test for to_dtype
* add unit test for to_dtype
---------
Co-authored-by: pkotzbach <pawkotz@gmail.com>
2024-12-13 16:21:02 -05:00
George Hotz
8a50868264
touchup function.py [pr] ( #8220 )
...
* touchup function.py [pr]
* remove ALLOWED_READ_IMAGE
* eh, keep it, just change it
2024-12-13 13:07:00 -08:00
George Hotz
aff112f8ab
add new uops to prep for gradient ( #8219 )
2024-12-13 11:54:26 -08:00
George Hotz
dbe549e462
rename expand to unroll [pr] ( #8218 )
2024-12-13 11:41:52 -08:00
ignaciosica
0a00187dce
add real AMX tests to benchmark ( #8216 )
...
* add real amx to benchmark
* add debug=2 to check tc are triggered
2024-12-13 14:03:41 -05:00
Ahmed Harmouche
70f6183f34
Remove unnecessary wgsl string rewrite pattern ( #8215 )
2024-12-13 18:27:29 +01:00
geohotstan
eebb3a1bb9
unique names ( #8213 )
2024-12-13 12:14:47 -05:00
qazal
1824cbd72c
s/lazybufs/tensor_uops [pr] ( #8207 )
2024-12-13 19:20:02 +08:00
qazal
6d6c34eb1e
scheduler local graph_rewrite cleanups [pr] ( #8206 )
...
* scheduler local graph_rewrite cleanups [pr]
* extra merge
2024-12-13 19:07:09 +08:00
qazal
4a617c84e1
cleanup ctx usage in scheduler upats [pr] ( #8205 )
2024-12-13 18:01:13 +08:00
qazal
55b8c4e8bf
apply_swizzle can apply to any views [pr] ( #8204 )
2024-12-13 17:58:35 +08:00
Ahmed Harmouche
651f72442c
encapsulate the exported webgpu model ( #8203 )
2024-12-13 10:55:37 +01:00
qazal
5864627abe
process replay filter warnings [pr] ( #8199 )
2024-12-13 17:43:43 +08:00
qazal
c5c0d0277d
flatten buffer args, delete dtype [pr] ( #8202 )
2024-12-13 16:43:47 +08:00
Ahmed Harmouche
5198415bfb
No unpack_map in wgsl ( #8200 )
2024-12-13 08:10:31 +01:00
leopf
fe68dbdb23
GroupOp.Idempotent ( #8198 )
2024-12-12 20:44:04 -05:00
chenyu
ce41e6572d
unit test merge_dim [pr] ( #8195 )
...
looking for better ways to write this. first adding some tests
2024-12-12 17:55:52 -05:00
chenyu
d47530c0d4
fix device canonicalize for :0 in middle [pr] ( #8193 )
...
replace is wrong because it does not check that `:0` is at the end; use re.sub instead (see the sketch below)
2024-12-12 16:32:36 -05:00
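A minimal sketch of the idea in #8193, not tinygrad's actual device canonicalization (which does more); the helper name and example device strings are made up for illustration:

```python
import re

def strip_default_index(device: str) -> str:
  # Hypothetical helper: drop a trailing ":0" (the default device index).
  # device.replace(":0", "") would also rewrite a ":0" in the middle of the
  # string, e.g. "DISK:0:/tmp/weights" -> "DISK:/tmp/weights", which is the
  # bug described above; anchoring the pattern to the end avoids that.
  return re.sub(r":0$", "", device)

assert strip_default_index("CUDA:0") == "CUDA"
assert strip_default_index("DISK:0:/tmp/weights") == "DISK:0:/tmp/weights"  # middle ":0" untouched
```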
chenyu
40a4c603b9
remove more test skip for webgpu [pr] ( #8192 )
2024-12-12 14:06:35 -05:00
chenyu
d586c7e108
remove had_counter from rand ( #8191 )
2024-12-12 13:35:39 -05:00
chenyu
2fe98e44cd
unneeded isinstance(size, int) in alloc [pr] ( #8189 )
2024-12-12 13:05:02 -05:00