* split shared_codegen_spec and fix index
* add VCONST to program_spec and move index to shared_codegen_spec
* working ignore_oob=0
* cleanup
* fix spec
* undo that
* move barrier and special earlier
* fix more spec issues
* more updates
* remove special from program_spec
* cleanup and fixes
* move more to shared
* special is not in shared_spec
* some comments
* dont do bounds check there
* split pm_cleanups
* update test_schedule
* shrink when we remove bufferize
* dont do shrink if shape is empty
* update tests
* remove *1 from metadata
* deal with the noop bufferize
* only noop on cvar
* cleanup
* fix if
* rename
* remove restrictions on range ending in indexing
* early simplify
* Revert "early simplify"
This reverts commit 657d9972c2.
* disable const folding tests
* Prevent const folding in test_payne_hanek_reduction
* Do not use list as a default parameter
* Bitcast constant folding
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
* remove Tensor._to_const_val
added a TODO for advance indexing on const, which was the last place that checks const in Tensor
* that is not folding now
* one more
* LazyBuffer = UOp
* try 4 at this diff
* skip optimization tests p1
* raise kernel count expectations
* BIND isn't the _only_ uop that can become a tensor
* fix test_ones_sum on symbolic
* bump openpilot, correctness first
* offset on assign is fine
* uop is immutable
* what if this was higher
* more optimization skips
* instant fold const copy
* test_multitensor shouldn't expect buffer for unrealized
* move copy folder to upats
* start BUFFER_VIEW
* kinda BUFFER_VIEW
* Revert "kinda BUFFER_VIEW"
This reverts commit 94b4fe3040.
* BUFFER_VIEW try 2
* linter and missed _device
* pylint
* keep Ops.CONTIGUOUS
* always BUFFER_VIEW disk
* test
* cpu isn't a real device
* buffer references afte del
* add that back
* start bringing some of these back
* more test updates
* simpler simplify copy
* subbufer everything
* this is fine with buffer view
* cleanup the diff in test/ 1
* copy is one thing
* diff pruning
* diff pruning 2
* oh bind unbinds way too early
* extra
* more diff pruning
* more const folding
* experiment with symbolic here
* Revert "experiment with symbolic here"
This reverts commit cb87d61f7a.
* Revert "more const folding"
This reverts commit 2a7d258a2b.
* Revert VALID early folding
This reverts commit 4074f52317.
* storing const is fine
* fix test_prefer_half_buffer
* iterate on test_real_world
* this fixes test_train_mnist memory, breaks everything else
* Revert "this fixes test_train_mnist memory, breaks everything else"
This reverts commit dccfcbe068.
* always expect buffer to exist here
* temp debug: something is mutating lazydata in compile3
* Revert "temp debug: something is mutating lazydata in compile3"
This reverts commit 71400f0d55.
* everything back to normal
* compile3
* compile3 test
* start captured jit work, that test passes
* finalized memory skip set
* linter err
* back to base here
* tiny metaop cleanup
* print tensor
* 4th type this unbind got me
* green pickle
* tensor_variable sanity
* cast sanity
* link from the reds
* COPY sanity + minor repr change
* you can exist
* enable test_winograd
* bye bye nbytes
* danger, uop is mutating
* real become
* delete those from uop init
* put it in buffer init
* buffer inits with so much stuff
* buffer pickle try 2
* toposort can't be a cached property
* fix test_schedule_gc_with_inputs
* remove all @unittest.skip(gc)
* Revert "remove all @unittest.skip(gc)"
This reverts commit 9d8d92dd85.
* reenable real world + test_schedule_gc
* test: RUN_PROCESS_REPLAY=0
* fix pickle jit
* test changes
* reenable test_lru_alloc and TestTrain
* fix imagedtype
* bring pr back
* reenable 3 gc tests
* test_schedule better diff
* disable SPLIT_REDUCEOP
* test_save_all_dtypes looks fixed
* fix metadata
* skip that one
* fix viz by not pickling buffers
* simple test for const folding
* bring split reduceop back
* add simplify_alu
* simplify_binop fixes a test
* fix cast folding
* disable that test
* that test looks fine
* changes from delete_lazy pruning p1
* cast folding and children base
* test: cast folding from pruning branch
* green test_sgd_4convs_fuse_conv_bw
* enable some indexing folding
* test_complex_backward is fixed
* prune more, 295 -> 233
* fix test_multi_const_folding_literal
* fix double copy
* early become test
* ooooops
* clean up ctx in all big_graph
* fix openpilot 208 kernels
* train_cifar is fine now
* fix CAST_BEFORE_VIEW
* ever faker const
* back to 13
* mark expectedFailure
* fine don't create them
* test_multi_const_folding_tensor
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>