Commit Graph

4618 Commits

Author SHA1 Message Date
qazal
1d21651823 free toposort cache after it goes out of scope [pr] (#8264) 2024-12-15 19:50:42 +02:00
qazal
e1518f1e38 minimal failing test for UOp.toposort gc [pr] (#8261) 2024-12-15 19:30:56 +02:00
qazal
67e66ac1ab hotfix: schedule_uop in process replay (#8260)
* hotfix: schedule_uop in process replay

* notes
2024-12-15 21:24:54 +08:00
qazal
d05e21cb69 replace lazy srcs with the new uop api [pr] (#8255)
* buf_uop_view function

* srcs shouldn't exist

* fix TestTensorMetadata

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2024-12-15 17:09:54 +08:00
chenyu
4c1733440d failed test case for stable sigmoid (#8245)
it should also work if implemented differently
2024-12-14 15:19:41 -05:00
chenyu
3eb952f537 fix some sigmoid extreme (#8238)
* fix some sigmoid extreme

quite brittle... the problem is it has 3 terms and mul might have bad order

* test_tanh_extreme

* just sigmoid gradient
2024-12-14 14:37:06 -05:00
George Hotz
bcd7ea60f0 hotfix: a few more grad tests 2024-12-13 21:03:02 -08:00
George Hotz
734f2c5344 compute gradient [pr] (#8237)
* compute gradient [pr]

* schedule_step_with_grads

* second deriv works
2024-12-13 20:46:01 -08:00
chenyu
0708a169dd more comments and tests to reshape [pr] (#8236) 2024-12-13 23:21:51 -05:00
George Hotz
8396d90f91 non controversial changes from optim branch [pr] (#8234) 2024-12-13 19:24:16 -08:00
George Hotz
37fa38d272 Revert "switch beautiful_mnist to use new optimizer [pr] (#8231)" (#8233)
This reverts commit e9ee39df22.
2024-12-13 19:07:09 -08:00
George Hotz
e9ee39df22 switch beautiful_mnist to use new optimizer [pr] (#8231)
* switch beautiful_mnist to use new optimizer [pr]

* fix abstractions3 + docs

* fix OptimizerGroup with schedule_step api
2024-12-13 18:27:16 -08:00
chenyu
e0956c518c move some ifs from merge_dims to reshape [pr] (#8229)
the third return value is only used in reshape
2024-12-13 19:56:15 -05:00
George Hotz
e2f87ecf36 start work on new gradient (#7838)
* start work on new gradient

* more correct

* working tests

* more tests

* work

* add (faliing) gradient test

* add view and reduce gradient

* test_add works, many failing test_ops

* add max and reduce max

* add max and reduce max

* 129 failing

* 108 failed

* better view drawing

* 101 failed

* i got 99 failures

* 94 failures

* it's tons of terrible code, but only 50 tests fail

* only 19 failures

* same 19 but shorter

* minimal doesn't matter

* shorter

* lil simpler

* simpler

* simpler

* simpler

* 13 test failures

* nine tests fail

* all ops tests pass

* add contiguous gradient + fix sched tests

* faster by removing toposort calls

* missed one

* add jax to testing
2024-12-13 16:45:53 -08:00
chenyu
e371a23c45 more comments and tests to reshape [pr] (#8228) 2024-12-13 18:50:13 -05:00
George Hotz
6d83a96440 retry: use movement ops [pr] (#8225)
* Revert "Revert "use movement ops [pr] (#8222)" (#8224)"

This reverts commit da19c37f0a.

* fix cast before view
2024-12-13 15:14:26 -08:00
George Hotz
4679f9fb44 add detach to graph [pr] (#8221)
* add detach to graph [pr]

* accept failure
2024-12-13 14:21:32 -08:00
chenyu
62e19649c0 lower test_conv_3x3_256_32_32_256_256 (#8226)
tiny7 is slow
2024-12-13 17:15:53 -05:00
George Hotz
da19c37f0a Revert "use movement ops [pr] (#8222)" (#8224)
This reverts commit 0d26c970ba.
2024-12-13 14:10:47 -08:00
George Hotz
0d26c970ba use movement ops [pr] (#8222)
* use movement ops [pr]

* test indexing
2024-12-13 14:06:01 -08:00
chenyu
eb0e5a14fd reorder and comments to reshape [pr] (#8223)
something feels wrong... contructing a counter example next
2024-12-13 17:02:27 -05:00
pkotzbach
c1b79c118f add unit tests for to_dtype (#8217)
* add unit test for to_dtype

* add unit test for to_dtype

---------

Co-authored-by: pkotzbach <pawkotz@gmail.com>
2024-12-13 16:21:02 -05:00
George Hotz
dbe549e462 rename expand to unroll [pr] (#8218) 2024-12-13 11:41:52 -08:00
qazal
5864627abe process replay filter warnings [pr] (#8199) 2024-12-13 17:43:43 +08:00
chenyu
ce41e6572d unit test merge_dim [pr] (#8195)
looking for better ways to write this. first adding some tests
2024-12-12 17:55:52 -05:00
chenyu
d47530c0d4 fix device canonicalize for :0 in middle [pr] (#8193)
replace is wrong because it does not check if `:0` is at the end. use re.sub instead
2024-12-12 16:32:36 -05:00
chenyu
40a4c603b9 remove more test skip for webgpu [pr] (#8192) 2024-12-12 14:06:35 -05:00
chenyu
72ff631f8d remove unreachable tensor dtype assert (#8190)
it would have failed in `to_dtype`. added some tests for it too
2024-12-12 13:04:49 -05:00
George Hotz
8a04a3a77a rename LazyBuffer -> UOp [pr] (#8169)
* rename LazyBuffer -> UOp [pr]

* fix docs
2024-12-11 16:15:52 -08:00
qazal
9044b0746a delete lazy [pr] (#7801)
* LazyBuffer = UOp

* try 4 at this diff

* skip optimization tests p1

* raise kernel count expectations

* BIND isn't the _only_ uop that can become a tensor

* fix test_ones_sum on symbolic

* bump openpilot, correctness first

* offset on assign is fine

* uop is immutable

* what if this was higher

* more optimization skips

* instant fold const copy

* test_multitensor shouldn't expect buffer for unrealized

* move copy folder to upats

* start BUFFER_VIEW

* kinda BUFFER_VIEW

* Revert "kinda BUFFER_VIEW"

This reverts commit 94b4fe3040.

* BUFFER_VIEW try 2

* linter and missed _device

* pylint

* keep Ops.CONTIGUOUS

* always BUFFER_VIEW disk

* test

* cpu isn't a real device

* buffer references afte del

* add that back

* start bringing some of these back

* more test updates

* simpler simplify copy

* subbufer everything

* this is fine with buffer view

* cleanup the diff in test/ 1

* copy is one thing

* diff pruning

* diff pruning 2

* oh bind unbinds way too early

* extra

* more diff pruning

* more const folding

* experiment with symbolic here

* Revert "experiment with symbolic here"

This reverts commit cb87d61f7a.

* Revert "more const folding"

This reverts commit 2a7d258a2b.

* Revert VALID early folding

This reverts commit 4074f52317.

* storing const is fine

* fix test_prefer_half_buffer

* iterate on test_real_world

* this fixes test_train_mnist memory, breaks everything else

* Revert "this fixes test_train_mnist memory, breaks everything else"

This reverts commit dccfcbe068.

* always expect buffer to exist here

* temp debug: something is mutating lazydata in compile3

* Revert "temp debug: something is mutating lazydata in compile3"

This reverts commit 71400f0d55.

* everything back to normal

* compile3

* compile3 test

* start captured jit work, that test passes

* finalized memory skip set

* linter err

* back to base here

* tiny metaop cleanup

* print tensor

* 4th type this unbind got me

* green pickle

* tensor_variable sanity

* cast sanity

* link from the reds

* COPY sanity + minor repr change

* you can exist

* enable test_winograd

* bye bye nbytes

* danger, uop is mutating

* real become

* delete those from uop init

* put it in buffer init

* buffer inits with so much stuff

* buffer pickle try 2

* toposort can't be a cached property

* fix test_schedule_gc_with_inputs

* remove all @unittest.skip(gc)

* Revert "remove all @unittest.skip(gc)"

This reverts commit 9d8d92dd85.

* reenable real world + test_schedule_gc

* test: RUN_PROCESS_REPLAY=0

* fix pickle jit

* test changes

* reenable test_lru_alloc and TestTrain

* fix imagedtype

* bring pr back

* reenable 3 gc tests

* test_schedule better diff

* disable SPLIT_REDUCEOP

* test_save_all_dtypes looks fixed

* fix metadata

* skip that one

* fix viz by not pickling buffers

* simple test for const folding

* bring split reduceop back

* add simplify_alu

* simplify_binop fixes a test

* fix cast folding

* disable that test

* that test looks fine

* changes from delete_lazy pruning p1

* cast folding and children base

* test: cast folding from pruning branch

* green test_sgd_4convs_fuse_conv_bw

* enable some indexing folding

* test_complex_backward is fixed

* prune more, 295 -> 233

* fix test_multi_const_folding_literal

* fix double copy

* early become test

* ooooops

* clean up ctx in all big_graph

* fix openpilot 208 kernels

* train_cifar is fine now

* fix CAST_BEFORE_VIEW

* ever faker const

* back to 13

* mark expectedFailure

* fine don't create them

* test_multi_const_folding_tensor

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-12 05:05:19 +08:00
chenyu
0e57152dbb clean up test_uop_symbolic [pr] (#8165)
removed old `Node` references
2024-12-11 14:13:19 -05:00
chenyu
5eadae204b test multi device rand with manual_seed (#8164) 2024-12-11 13:11:31 -05:00
qazal
047a6dabc3 prereq for scheduler contiguous_child [pr] (#8163)
* the whole context is fine here [pr]

* fix that
2024-12-12 02:02:22 +08:00
Ahmed Harmouche
a73e3677d0 Test linearizer on webgpu (#8159)
* Test linearizer on wgpu

* Skip tests due to exceeded dims
2024-12-11 17:03:26 +01:00
qazal
b894657aa7 assert the same things without mutating or accessing internal ops state [pr] (#8157)
* don't mutate internal state in test_lazybuffer

* fix test_schedule internals

* save time

* third si

* fine sometimes buffer_view isn't there
2024-12-11 22:01:27 +08:00
George Hotz
c8e7707a7e hotfix: disable flaky move tensor test 2024-12-10 17:11:21 -08:00
chenyu
155f7df599 lower test_gemm_4096 expectation on green (#8152)
getting 119 sometimes, so lowered to 115
2024-12-10 18:05:12 -05:00
chenyu
c4be1529cf update test for Tensor.softplus (#8150)
test beta and extreme inputs.
to pass big input, it needs to support `threshold`, which needs fix on backward that we punt until new gradient api
2024-12-10 17:48:02 -05:00
Ahmed Harmouche
a8cfdc70ed Run more webgpu tests (#8142) 2024-12-10 23:20:04 +01:00
George Hotz
aa3b094334 changes from delete lazy [pr] (#8146)
* changes from delete lazy [pr]

* test tweak
2024-12-10 11:06:17 -08:00
chenyu
286fec115e fix Tensor.minimum for int (#8145)
use invert instead of just neg. consolidate min, argmin, and minimum

also update maximum to not apply the mid point for int
2024-12-10 13:34:41 -05:00
qazal
3a2658efbd small changes to refine the delete_lazy diff (#8134)
* _view -> view

* const_arg things
2024-12-10 18:46:10 +08:00
qazal
6d33da09c9 split scalar getitem tests into correctness and optimization [pr] (#8133) 2024-12-10 18:18:46 +08:00
qazal
7436ebef2f spend lines on const_arg for tensor and scheduler [pr] (#8132)
* spend lines on const_arg for tensor and scheduler [pr]

* simple test_const_arg

* base on lazy
2024-12-10 18:07:35 +08:00
chenyu
917deb88a4 make //0 return 0 in python_alu (#8131)
on master it raises because it cannot truncate inf to int, which crashes valid expression like `(t > 0).where(1//t, t)`.
2024-12-09 19:32:06 -05:00
chenyu
358287959b fix pow of int to negative const int (#8129)
it should return in int
2024-12-09 17:20:18 -05:00
chenyu
12f7d284e0 failed test case for int pow (#8128)
also updated test_ops so that non-float compares with `assert_equal`. removed `test_multinomial` which is tested better in test_randomness
2024-12-09 16:15:09 -05:00
qazal
80de06c8b9 scheduler ops_folding from delete_lazy (#8124)
* scheduler diff from delete_lazy

* test_std_mean

* late fold copy of CONST

* clang const is fine
2024-12-10 00:36:01 +08:00
qazal
22d99f1421 test_pickle_realized_tensor actually tests pickle [pr] (#8119)
* test_pickle_realized_tensor actually tests pickle [pr]

* clang
2024-12-09 17:26:19 +08:00
chenyu
ccf54c2375 fix argmax/min on int32 min (#8118) 2024-12-09 02:29:23 -05:00