* LazyBuffer = UOp
* try 4 at this diff
* skip optimization tests p1
* raise kernel count expectations
* BIND isn't the _only_ uop that can become a tensor
* fix test_ones_sum on symbolic
* bump openpilot, correctness first
* offset on assign is fine
* uop is immutable
* what if this was higher
* more optimization skips
* instant fold const copy
* test_multitensor shouldn't expect buffer for unrealized
* move copy folder to upats
* start BUFFER_VIEW
* kinda BUFFER_VIEW
* Revert "kinda BUFFER_VIEW"
This reverts commit 94b4fe3040.
* BUFFER_VIEW try 2
* linter and missed _device
* pylint
* keep Ops.CONTIGUOUS
* always BUFFER_VIEW disk
* test
* cpu isn't a real device
* buffer references after del
* add that back
* start bringing some of these back
* more test updates
* simpler simplify copy
* subbuffer everything
* this is fine with buffer view
* cleanup the diff in test/ 1
* copy is one thing
* diff pruning
* diff pruning 2
* oh bind unbinds way too early
* extra
* more diff pruning
* more const folding
* experiment with symbolic here
* Revert "experiment with symbolic here"
This reverts commit cb87d61f7a.
* Revert "more const folding"
This reverts commit 2a7d258a2b.
* Revert VALID early folding
This reverts commit 4074f52317.
* storing const is fine
* fix test_prefer_half_buffer
* iterate on test_real_world
* this fixes test_train_mnist memory, breaks everything else
* Revert "this fixes test_train_mnist memory, breaks everything else"
This reverts commit dccfcbe068.
* always expect buffer to exist here
* temp debug: something is mutating lazydata in compile3
* Revert "temp debug: something is mutating lazydata in compile3"
This reverts commit 71400f0d55.
* everything back to normal
* compile3
* compile3 test
* start captured jit work, that test passes
* finalized memory skip set
* linter err
* back to base here
* tiny metaop cleanup
* print tensor
* 4th time this unbind got me
* green pickle
* tensor_variable sanity
* cast sanity
* link from the reds
* COPY sanity + minor repr change
* you can exist
* enable test_winograd
* bye bye nbytes
* danger, uop is mutating
* real become
* delete those from uop init
* put it in buffer init
* buffer inits with so much stuff
* buffer pickle try 2
* toposort can't be a cached property
* fix test_schedule_gc_with_inputs
* remove all @unittest.skip(gc)
* Revert "remove all @unittest.skip(gc)"
This reverts commit 9d8d92dd85.
* reenable real world + test_schedule_gc
* test: RUN_PROCESS_REPLAY=0
* fix pickle jit
* test changes
* reenable test_lru_alloc and TestTrain
* fix imagedtype
* bring pr back
* reenable 3 gc tests
* test_schedule better diff
* disable SPLIT_REDUCEOP
* test_save_all_dtypes looks fixed
* fix metadata
* skip that one
* fix viz by not pickling buffers
* simple test for const folding
* bring split reduceop back
* add simplify_alu
* simplify_binop fixes a test
* fix cast folding
* disable that test
* that test looks fine
* changes from delete_lazy pruning p1
* cast folding and children base
* test: cast folding from pruning branch
* green test_sgd_4convs_fuse_conv_bw
* enable some indexing folding
* test_complex_backward is fixed
* prune more, 295 -> 233
* fix test_multi_const_folding_literal
* fix double copy
* early become test
* ooooops
* clean up ctx in all big_graph
* fix openpilot 208 kernels
* train_cifar is fine now
* fix CAST_BEFORE_VIEW
* even faker const
* back to 13
* mark expectedFailure
* fine don't create them
* test_multi_const_folding_tensor
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
* assign early folding [pr]
* move to to_si
* -
* fix generate_dataset
* diff too big
* no recreation, no diff
* gzip
* new sops from tiny10
* final try
* 1 is simpler than 2
* variable name
* change error wording
* shapes for sequence type must be homogeneous
* bug fix for model benchmark
* fix comments too
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
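The "shapes for sequence type must be homogeneous" check above can be sketched as a recursive shape inference over nested sequences. This is a hypothetical minimal version (not the actual repo code); `get_shape` is an illustrative name, and only the error wording is taken from the commit:

```python
def get_shape(x):
    # Recursively infer the shape of a nested sequence, insisting that
    # every sub-sequence at the same depth has the same length.
    if not isinstance(x, (list, tuple)):
        return ()
    subs = [get_shape(xi) for xi in x]
    if not all(s == subs[0] for s in subs):
        raise ValueError(f"shapes for sequence type must be homogeneous, got {subs}")
    return (len(subs),) + (subs[0] if subs else ())
```

A ragged input like `[[1, 2], [3]]` fails the `all(...)` check and raises, while `[[1, 2], [3, 4]]` infers `(2, 2)`.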
* hacky fix for cast
* only float to uint8
* limit to float -> uint8
* touchup alu cast test
* improve tests and support more float to unsigned casts
* del one repeated test
* del 1 more repeated test
* try removing expected failure test
* hmmm try 1 more
* skip tests for flakiness
* uint64 super flaky
* clean up
* grammar
* just match numpy
* why is CI numpy different from local numpy
* increase verbosity
* try
* try2
* try3
* try4
* yeah idk
* new direction
* try again
* just don't support uint32 and uint64
* done?
* oops
* comment
* documentation
* it is what it is
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
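The likely reason uint32/uint64 were dropped and float -> uint8 kept: in C, casting a float to an unsigned integer type is undefined behavior when the value is negative or out of range, so results vary by platform and compiler (and can make CI NumPy disagree with local NumPy). A hedged sketch of the well-defined part, plus one possible defined emulation of the wrapping case (`cast_f32_to_u8` is an illustrative helper, not a tinygrad API):

```python
import numpy as np

# In-range float -> uint8 truncates toward zero and is well defined:
assert np.float32(3.7).astype(np.uint8) == 3

# Out-of-range values (negative, or > 255 for uint8) hit undefined
# behavior in the underlying C cast, so the result is platform- and
# library-dependent. One way to pick a defined semantics is to
# truncate through Python int, then wrap modulo 256:
def cast_f32_to_u8(x: float) -> int:
    return int(np.float32(x)) & 0xFF
```

Note this wrapping choice is just one convention; the flaky tests above are flaky precisely because no single convention is guaranteed by the hardware cast.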
* wip pool
* check CI for remove alternative implementation
* Revert "check CI for remove alternative implementation"
This reverts commit 7b1bb900e5.
* fix test
* tests tests tests
* slap a resolve on it
* fix comment
* a little simpler pool
* check CI for removal again
* Revert "check CI for removal again"
This reverts commit be798b7857.
* small
* update
* some ez tests
* english
* clean up code
* fix ruff
* how did I +25 lines?
* small clean ups
* moar clean ups
* try test_avgpool2d_failure2 in CI
* final clean up
* exclude bug fix
* avg underscore pool
* no more edge case stuff
* add better comments for explanation
* add test cases for decreasing end padding
* address feedback
* improve test coverage
* tiny more polish as we wait for lines :D
* more readable code ordering
* add to documentation
* oops
* set to False instead
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
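For the pooling work above (end padding, "set to False instead"): a plausible reading is count_include_pad=False semantics, where zero padding at the end is excluded from the averaging divisor, as in PyTorch. A reference sketch over one axis, assuming that semantics (not tinygrad's actual implementation):

```python
# 1-D average pooling with zero end-padding, where padded positions
# are excluded from the divisor (count_include_pad=False semantics).
def avg_pool1d(xs, kernel, stride, end_pad):
    padded = xs + [0.0] * end_pad          # zero-pad the end only
    out = []
    for start in range(0, len(padded) - kernel + 1, stride):
        window = padded[start:start + kernel]
        # divide by the number of *real* (unpadded) elements covered
        real = max(0, min(start + kernel, len(xs)) - start)
        out.append(sum(window) / real if real else 0.0)
    return out
```

For example, `avg_pool1d([1.0, 2.0, 3.0], kernel=2, stride=2, end_pad=1)` averages the last window `[3, 0]` over 1 real element, giving `[1.5, 3.0]` rather than `[1.5, 1.5]`.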