* cvar dtype:DType|tuple[DType, ...]|None=None
* fmt
* add a test
* list typeguard as a dep for CI
* extra step to install mypy
* fix venv
* ci fixes
* mv typeguard to testing install group
* simpler TYPED=1 test
* add typeguard to lint group
* rangeify, try 3
* bring that over
* bufferize, don't use contig tag
* work
* ish
* fix rangeify
* flash attention is back
* fix rangeify tests
* stuff passes
* fix test_log_softmax
* more stuff passes
* progress children
* new endrange solution
* progress
* progress counter
* basic assign
* contigs only
* symbolic in schedule
* unbind_kernel
* late children
* ops fixed
* beautiful mnist is close
* that seems to work
* mnist works
* improve names
* fix bmnist
* no pcontig
* testing backward
* work
* clone movement ops
* new_range helper
* MBLOCK/MERGE
* ops tests pass
* revert mblock stuff
* cleanups...but it breaks ops
* remove reindex
* hack for relu
* disable the hacks
* more hacks
* upd
* mostly works with cleanups disabled
* ndr
* ops tests pass
* terrible hacks for indexing to work
* context mismatch
* pcontig
* split pcontig v contig
* z3 trunc
* null
* no fuse in rangeify
* ops test passes
* lnorm
* fix assign
* nd rangeify
* both should work
* tests for rangeify
* cleanups
* stores pass the pointer through
* disable pcontig for now
* PARTIAL_CONTIG is a flag
* move device tests to test/device
* test speedups
* test device
* linalg to unit
* upd
* so pytest just works
* more divide and skip
* speed
* test devectorize
* add pillow
* BOOM
* cache extra/huggingface/models/
* why max buffer size is not 0
* override MAX_BUFFER_SIZE
* less models
* remove more models and change cache dir to already cached dir
* only metal
* less is more?
* remove check ops
* why is this not setting the ENVVAR
* ughhhhh just test in models
* only cpu and gpu
* only cpu actually
* just override it idk
* final
* move extra dependencies up top
* simplification
* fix print
* make README better
* revert ops_disk fix for now
* clean up test_onnx
* remove testing fashion clip model cuz sloooowwwwww
* actually let METAL run this
* fix comment mistake
* fix download path in run_models
* does this work?
* cleanup setup and teardown
* contextvar like this?
* prove model is cached
* do I need to increment DOWNLOAD_CACHE_VERSION?
* see if cached with incremented DOWNLOAD_CACHE_VERSION
* use warnings to see if the model exists
* revert DOWNLOAD_CACHE_VERSION stuff and clean up
* add retry to download
* nit
* move view pushing to codegen, try 2
* fix up some linearizer tests
* fix test search
* fix test schedule
* delete that test
* fix test arange
* fix a few tests
* update tests
* push views
* ebs cleanup
* fix local/reg
* test and lint
* fix more tests
* test cleanups
* skipped that one
* cleanup fix_kernel
* early load buffer
* early meta ops
* move those to fix_kernel_ops
* fix tests
* remote metal was flaky
* Revert "fix tests"
This reverts commit a27019383d.
* that hack broke things
* fine for ptx
* add caching for apt packages
* remove 'inputs' from apt cache key, use outputs instead of env
* remove unnecessary mkdir for partial
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Instead of running tests with 3.10, add onnx to mypy, which would have caught the StrEnum regression. Several type annotations now fail mypy; they do not affect running the code and are skipped for now.
* start
* remove onnx.load from compile4 and move np to dropout
* clean up and enable test
* clean up
* move WebGPU ONNX test into MacOS (WebGPU)
* leave test in ONNX (CPU)
* fix raw_data init None, and simplify onnx_runner test a little?
* THESE TESTS ARE SO UGLY UGHH
* need to really think about how to structure the test
* wow LLMs are quite something
* not always on disk now
* also add external data loading test
* cleaner tests
* minimize diff and add const folding tests
* add external data loading too
* whoops add webgpu back.. but why was it not needed in the first place?
* better comment
* move webgpu test to macos(webgpu)?
* llm english so much better than me wow
* trigger CI to check flakiness
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* cast after sum
* comment out skipif
* minor fix
* only test IMAGE
* IMAGE is supported now
* simpler
* simplerr
* only cast if dtype is None
* don't need to change base_image_type
* only cast when dtype is half
* add explicit test
* actually no, workflow seems better
* actually, keep both
* move test
* fix indent
---------
Co-authored-by: Utkarsh Gill <engelbart@Utkarshs-MacBook-Pro.local>