* add FUZZ_NTH to fuzz_linearizer
also update tests in test_linearizer_failures to not just run on METAL
* update failures for HIP/HSA
* test_failure_21 LLVM PADTO
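For context, a minimal sketch of how an env-var filter like FUZZ_NTH can work; the exact semantics and the fuzz_kernel/fuzz_all names here are illustrative, not the actual fuzz_linearizer code:

```python
import os

def fuzz_kernel(kernel):
  # stand-in for the real per-kernel fuzz entry point (hypothetical)
  print("fuzzing", kernel)

def fuzz_all(kernels):
  # FUZZ_NTH, if set, restricts the run to the kernel at that index
  # (assumed semantics: handy for reproducing one failure in isolation)
  nth = os.getenv("FUZZ_NTH")
  for i, kernel in enumerate(kernels):
    if nth is not None and i != int(nth): continue
    fuzz_kernel(kernel)

if __name__ == "__main__":
  fuzz_all(["k0", "k1", "k2"])  # FUZZ_NTH=1 would fuzz only k1
```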
* working PolynomialDecayWithWarmup + tests (sketched after this list)
add lars_util.py, oops
* keep lars_util.py as intact as possible, simplify our interface
* whitespace
* clean up
* clean up
* asserts
* test polylr for full resnet training run
* add comment
* rename
* fix do_optim
* don't cast lr
* info
* calculate from train_files
* skip it
included a non-reduce kernel and a kernel with variables; print a green msg when everything passes
creating rawbufs can fail with a memory error, so that is now included in the failure cases
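A minimal sketch of that failure handling, with injected alloc_rawbufs/fuzz_kernel stand-ins and an illustrative failure-category name:

```python
def run_one(kernel, alloc_rawbufs, fuzz_kernel):
  # Allocating rawbufs can itself fail (e.g. a kernel with huge buffers),
  # so record that as a failure case instead of crashing the whole run.
  try:
    rawbufs = alloc_rawbufs(kernel)
  except MemoryError as e:
    return ("FAILED_ALLOC", e)  # illustrative failure-category name
  return fuzz_kernel(kernel, rawbufs)
```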
* lars optimizer + tests
* fix skip list!
* use id to compare in skip list
* go back to using set
* Tensor(bool) * Tensor(bool) is logical AND
* don't lint external/mlperf_resnet
* whitespace
* add external_test_optim to opencl tests
* give mlperf task a name
* mlperf under onnx
* remove track_gnorm
* contiguous instead of realize
* assert momentum and weight decay positive
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
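Two minimal sketches of the pieces above, assuming the usual definitions: linear warmup followed by polynomial decay (the MLPerf ResNet reference commonly uses power 2), and the LARS per-layer trust ratio. Parameter names are illustrative; this is not the actual lars_util.py code:

```python
import numpy as np

def polynomial_decay_with_warmup(step, warmup_steps, total_steps,
                                 base_lr, end_lr=0.0, power=2.0):
  # Linear warmup from 0 to base_lr, then polynomial decay to end_lr.
  if step < warmup_steps:
    return base_lr * (step + 1) / warmup_steps
  decay_steps = total_steps - warmup_steps
  progress = min(step - warmup_steps, decay_steps) / decay_steps
  return (base_lr - end_lr) * (1.0 - progress) ** power + end_lr

def lars_trust_ratio(w, g, weight_decay):
  # LARS scales each layer's step by ||w|| / (||g|| + wd * ||w||).
  w_norm, g_norm = np.linalg.norm(w), np.linalg.norm(g)
  if w_norm == 0.0 or g_norm == 0.0: return 1.0
  return w_norm / (g_norm + weight_decay * w_norm)
```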
* this mem fault still happening
* smaller
* that print doesn't work
* overflows test
* hip doesn't set uses_ptr_arithmetic
* only with locals
* test overflow new name
* it's not ptr arith
* simpler
* simple repro
* old compiler
* simpler
* put that back
* test/external/fuzz_linearizer: add a FUZZ_MAX_SIZE option
this allows us to limit kernel size, reducing running time by skipping kernels that take a long time to fuzz
* fix spacing and re-order to put parameters together
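A minimal sketch of the cap, assuming "size" means the output element count and that unset/0 means no limit; the function name and metric are assumptions:

```python
import os
import operator
from functools import reduce

def within_fuzz_max_size(output_shape):
  # FUZZ_MAX_SIZE caps how big a fuzzed kernel may be so very large
  # (and slow) kernels get skipped; "size" here is taken to be the
  # output element count, which is an assumption about the metric.
  max_size = int(os.getenv("FUZZ_MAX_SIZE", "0"))
  if max_size == 0: return True  # unset/0 means no limit (assumed)
  return reduce(operator.mul, output_shape, 1) <= max_size
```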
* remove cpu and torch backends
* don't copy to cpu
* use clang instead of cpu
* multitensor gathers on the first device
* clang is cpu + use default
* fixup
* bugfix
* set metal fast math default to 0 (disabled)
It's a correctness fix because we use inf and nan (the identities are illustrated after this list). Let's see how slow it is.
* skip failed onnx tests
* tmp DISABLE_COMPILER_CACHE=1 in metal benchmark
* Revert "tmp DISABLE_COMPILER_CACHE=1 in metal benchmark"
This reverts commit 22267df380.
* env var METAL_FAST_MATH to disable fastmath for metal
use this to test the impact of fast math. might need to disable the compiler cache with DISABLE_COMPILER_CACHE=1
* failed onnx test with fast math
METAL_FAST_MATH=0 DISABLE_COMPILER_CACHE=1 NOOPT=1 python -m pytest -n=auto test/external/external_test_onnx_backend.py -k test_MaxPool3d_stride_padding_cpu
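As referenced above, these are the identities the fast-math fix protects, checked host-side in Python; a fast-math compiler may assume no NaN/inf exists and fold such checks away:

```python
import math

x = float("nan")
assert x != x                            # NaN must compare unequal to itself
assert math.inf + 1.0 == math.inf        # inf arithmetic must be preserved
assert math.isnan(math.inf - math.inf)   # inf - inf must produce NaN
```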
* Reapply "take merge views from corsix branch" (#3278)
This reverts commit d298916232.
* reintroduce merge views
* update second any
* isinstance -> not
* 25% less same but unequal
* add onnx test_reduce_log_sum_exp
* more reuse
* more
* stuff
* good CenterCropPad
* imports
* good ArrayFeatureExtractor
* pretty good Pad
* stuff
* stuff
* onnx.py
* Atan
* pass int8 test
* dtype related
* fastmath stuff
* Resize linear
* fix CI
* move back
exceptions can be raised from either model conversion or an individual backend failing. openpilot on torch mps works, but does not work with torch cpu.
separate the exception blocks so that the benchmark can include torch mps for openpilot.
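A minimal sketch of that split, with hypothetical convert_model/run_backend helpers: conversion failures end the run, but each backend fails independently, so torch mps still gets benchmarked for openpilot even when torch cpu raises:

```python
def benchmark(model_path, backends, convert_model, run_backend):
  # A conversion failure ends the benchmark for this model...
  try:
    model = convert_model(model_path)
  except Exception as e:
    print(f"conversion failed: {e}")
    return
  # ...but each backend gets its own exception block, so one backend's
  # failure doesn't drop the others from the benchmark.
  for backend in backends:
    try:
      run_backend(model, backend)
    except Exception as e:
      print(f"{backend} failed: {e}")
```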
* cached size
* simplify simplify
* 0 doesn't have base
* fix test
* cleaner cache
* hmm, metal is flaky on this... might be real(ish) but useless as a test
* short circuit reshape/expand properly
* better reshape bypass
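A minimal sketch of the bypass idea (not the actual tinygrad lazy code): reshaping or expanding to the tensor's current shape can return the tensor unchanged instead of building a new node:

```python
class LazySketch:
  def __init__(self, shape): self.shape = tuple(shape)
  def reshape(self, new_shape):
    if tuple(new_shape) == self.shape: return self  # short circuit: no-op
    return LazySketch(new_shape)                    # else build a new node
  def expand(self, new_shape):
    if tuple(new_shape) == self.shape: return self  # same bypass for expand
    return LazySketch(new_shape)

t = LazySketch((4, 4))
assert t.reshape((4, 4)) is t  # no new lazy op created
```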
* updated most dtype hacks in onnx_ops
* temporarily revert dequantizelinear change
* I think this is right...
* MORE FIXES WOOOO NEW DTYPE IS AWESOME
* ok
* oops missed a print
* half -> float32 for CI
* is npdtype
* some more
* fix if ordering
* more clean ups
* final cleanups
* casting to half not allowed
* k nvm
* revert ArgMax change
* only GPU
* llvm begone
* teeny tiny change
* fix: attempt to add cast tests
* try this
* fix dequantizelinear
* revert some stuff
* tests pass pls
* fewer lines in onnx_tests
* oops missed string tensor tests
* clean up
* try: revert default behavior changes
* fix: disabled Cast and CastLike tests
* docs: small changes
* fix: fixed the IsNaN op and enabled associated tests (see the sketch after this list)
* fix: forgot about float16
* done
* update disabled test
* gah missed another float16
* disable rest of failing tests
* rm extra line
* try...
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
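For reference, the standard trick behind an IsNaN op, as a sketch rather than the actual onnx_ops code:

```python
import numpy as np

def isnan_sketch(x):
  # NaN is the only value that compares unequal to itself, so x != x
  # is an elementwise NaN mask; this works for float16 and float32 alike.
  return x != x

print(isnan_sketch(np.array([1.0, float("nan"), np.inf])))  # [False  True False]
```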