* hopeful impl for Tensor.einsum
* satisfy mypy by having less typing. :(
* a few simple tests
* even more tests
* permute tests
* xfails for improper usage
* fix LLVM test fail
* use argfix
* more helpful error message on shape mismatch
The current yolov3 example is broken with the current implementation of fetch in the helpers. I was tempted to fix the helpers instead, but that could just as well have broken other examples.
* add bf16 test support
this model takes me almost a minute to download though:
https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded/resolve/main/pytorch_model-00001-of-00014.bin?download=true: 100%|█████████████████████████████| 981M/981M [00:40<00:00, 24.2MB/s]
* ensure we first load if it is bitcast to avoid taking the address of an rvalue
* tiny bf16 in the cloud
skip GPU
* should skip torch
lint
* Revert "ensure we first load if it is bitcast to avoid taking the address of an rvalue"
This reverts commit b86a28ab84.
* break the kernel
* skip LLVM and GPU in CI
* skip CUDA
* universal test cast
* disable div
* midcast fixup
* add 64-bit types
* hack maximum
* use Metal precise::sin instead of default
This is because the default sin function defaults to single-precision math: https://developer.apple.com/metal/Metal-Shading-Language-Specification.pdf#page=164
* LLVM code_for_op support for var_dtype
* comment out maximum for now with a TODO explaining it
* Revert "hack maximum"
This reverts commit d170048c5f.
* make the comment more specific
* slightly more forgiving
* ok does this fail in all backends?
* weird, it's only Metal CI
* add graph
* skip sin of nan for CUDACPU
This is only happening in the CUDACPU runtime and not CUDA itself. https://github.com/tinygrad/tinygrad/actions/runs/7128973726/job/19412000385#step:16:36
* METAL and CUDACPU behave differently in overflows with numpy running on CI
* that skip is wrong
* skip fp16 tests on LLVM similar to test_dtype
original commit that skipped LLVM in CI 1826ff6b89
* remove all of sin from CUDACPU
* limit range of values in CUDACPU and METAL CI
* Revert "use Metal precise::sin instead of default"
This reverts commit d960094d4a.
* change atol and rtol for Metal sin
* METAL CI is more imprecise
* cleanup
---------
Co-authored-by: George Hotz <geohot@gmail.com>