* cdna emulator work
* accvgprs
* cdna passes most tests
* ruff
* add cdna4 to tests
* cdna emu
* crash
* pass?
* work
* gen
* clean up wave_size access
* asm_gemm passes
* remove acc from dsl.py, emulator can keep its different reg file
it's purely an encoding here, the ASM_GEMM already encodes acc srcs with v[], this can
be cleaned up later, but not functionally required for emulator.
* split asm_gemm tests to ones fast on the emulator
* don't do that
* 124 stays null on rdna
* the segfault was because of hw regs, not this
* Revert "clean up wave_size access", it's explicitly tested
This reverts commit 1202ff5787.
* nullcopyout
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
* no after removal
* we are using walk
* null schedule test
* pytest deps
* Revert "pytest deps"
This reverts commit 5e1c5304ec.
* Revert "null schedule test"
This reverts commit 02da66053e.
* clean null tests
* preallocate all realized buffers
* contiguous
* work
* comment that out
* move to schedule
* better
* correct fix
* just buffer
* disk bufs
* fixes disk tensor stuff
* fix symbolic stuff
* fix multi
* 162 failures
* bugfixes
* don't check that anymore
* fix schedule tests
* mnist should be contiguious
* type and buffer
* fix tests
* shrink axis correction
* mypy fixes
* tests skips
* same 37 failures
* dedup
* no shrink in the graph
* 29 failures
* skips
* fix custom kernel
* fix training
* those optimizations aren't supported currently
* simpler
* more correct
* tests
* 14 failures
* works
* fix that test
* broken
* 11 failures
* only kernel counts left
* fixes
* all tests pass
* remove tensor_map
* op test
* 200 -> 230
* test fixes
* fixes
* revert test_tiny thing
* guard
* revert that
* test tiny passes
* no contigs there
* base realize back
* Revert "no contigs there"
This reverts commit c45bb9fcfd.
* revert that
* chop many assigns
* 12 failures
* fix tests
* tests
* apply after
* pre-commit
* remove old code
* delete that
* fix types
* remove extra contig
* fix dataloader
* torch fix
* disk fix
* update kernel fusion numbres
* runs on amd
* restore kernel count
* add that rule back
* that
* disable that
* wrong
* add the correct rule for that folding
* more tests
* guard c1.arg
* no newlines
* realize those
* split into a different file
* remove detach/contig back
* skip 2
* update that
* Revert "hotfix: skip test/amd in macpytest"
This reverts commit b7dade2adf.
* no llvm subprocess
* simpler
* sys.exec
* cleanup
* process safe
* diag
* arm ftz support
* 5 sec
* this one