* invert (broken)
* decent invert
* shapetracker invert works
* plus is meh, invert is good
* support invert mask
* a few more invert tests
* shapetracker math invert test
* remove match_type in ops_torch and ops_cpu
input dtypes are aligned and cast in mlops
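A hedged sketch of what "aligned and cast" means here (promote both operands to a common dtype before the op); the helper below is illustrative, not the actual mlops code:

```python
import numpy as np

def aligned_add(x: np.ndarray, y: np.ndarray) -> np.ndarray:
  # illustrative: promote both inputs to a common dtype up front,
  # so the backend op never sees mixed-dtype operands
  out_dtype = np.promote_types(x.dtype, y.dtype)
  return x.astype(out_dtype) + y.astype(out_dtype)

assert aligned_add(np.ones(2, np.float16), np.ones(2, np.float32)).dtype == np.float32
```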
* dict union only after python3.9
* fix that
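For reference, a minimal sketch of the compatibility issue and the fix (variable names are illustrative):

```python
defaults = {"device": "CPU", "dtype": "float32"}
overrides = {"device": "GPU"}

# `defaults | overrides` raises TypeError before Python 3.9;
# dict unpacking is the backwards-compatible equivalent (later keys win)
merged = {**defaults, **overrides}
assert merged == {"device": "GPU", "dtype": "float32"}
```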
* fix Sigmoid forward cast
* Uncripple dtype tests; TestBFloat16DType never actually ran.
* Fix conversion from/to bfloat16.
Call cast() recursively, so that it works for any type combo.
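A hedged sketch of the recursive-cast idea: numpy has no native bfloat16, so any from/to-bfloat16 conversion can hop through float32 and recurse for the remaining leg. Helper names and the uint16 bit representation are illustrative, not the exact code from the diff:

```python
import numpy as np

def f32_to_bf16_bits(x: np.ndarray) -> np.ndarray:
  # bfloat16 is the top 16 bits of an IEEE float32 (truncating round)
  return (x.astype(np.float32).view(np.uint32) >> 16).astype(np.uint16)

def bf16_bits_to_f32(bits: np.ndarray) -> np.ndarray:
  return (bits.astype(np.uint32) << 16).view(np.float32)

def cast(x: np.ndarray, src: str, dst: str) -> np.ndarray:
  # any combo involving bfloat16 routes through float32, then recurses
  if src == "bfloat16": return cast(bf16_bits_to_f32(x), "float32", dst)
  if dst == "bfloat16": return f32_to_bf16_bits(cast(x, src, "float32"))
  return x.astype(np.dtype(dst))

assert cast(np.array([1.5], np.float64), "float64", "bfloat16").dtype == np.uint16
```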
* Run this test on torch backend as well.
* Add torch.bfloat16.
* Add support for ushort and uint.
* Convert np.uint32 to np.int32 when loading.
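A one-liner sketch of the load-time conversion; whether to reinterpret bits or value-cast depends on the data, so both spellings are shown:

```python
import numpy as np

raw = np.arange(4, dtype=np.uint32)
as_i32_view = raw.view(np.int32)    # reinterpret the same bits, no copy (values >= 2**31 wrap negative)
as_i32_cast = raw.astype(np.int32)  # value-preserving copy for values that fit in int32
```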
* Fix warning.
* the universe is flat as a 2D tensor
* try this
* TESTS
* fewer lines in test
* don't change all_int since other places use it
* add tests and drop the noqa by making the spacing non-aesthetic
* some reordering
* fix empty list handling and add tests
* more tests
* add support for tensors from lists of bools
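A usage sketch of what this enables (import paths have moved between tinygrad versions, so treat them as approximate):

```python
from tinygrad.tensor import Tensor
from tinygrad.helpers import dtypes

t = Tensor([True, False, True])  # a list of bools now builds a bool tensor
assert t.dtype == dtypes.bool
```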
* clearer with the fewest lines added
* added bool
* oops
* more tests
* improved tests
* oops
* assert UnaryOps has same dtype as input in uop
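The invariant, as a sketch; the check itself is one assert in the uop builder (names here are illustrative):

```python
def assert_unary_uop_dtype(op: str, in_dtype, out_dtype):
  # UnaryOps (NEG, EXP2, LOG2, SIN, ...) must not change dtype;
  # any dtype change has to go through an explicit CAST uop instead
  assert out_dtype == in_dtype, f"{op}: dtype changed {in_dtype} -> {out_dtype}"
```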
* fallback to float on images
* just unary ops for now
* pass amt
* noqa on the temp line
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* upcast the other way
* Revert "upcast the other way"
This reverts commit 355692ba79.
* remove uop cast; this should never have been there
* add regression test
* now fuzz it
correct test
* the accumulator is always the output type
lint
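A sketch of the rule with a plain sum-reduce; the point is that the accumulator is allocated in the output dtype, not the input's:

```python
import numpy as np

def reduce_sum(x: np.ndarray, out_dtype) -> np.ndarray:
  acc = np.zeros((), dtype=out_dtype)  # accumulator in the *output* dtype
  for v in x.ravel():
    acc += v  # in-place add keeps the accumulator's dtype
  return acc

assert reduce_sum(np.ones(10, np.float16), np.float32).dtype == np.float32
```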
* fuzz all reduce ops
* MULACC upcast_dtype could be half too
OpenCL supports it: https://man.opencl.org/mad.html
* cast to the same dtype is a noop
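As a sketch (the method and helper names here are hypothetical):

```python
def cast(self, dtype):
  if dtype == self.dtype: return self  # casting to the same dtype is a noop
  return self._real_cast(dtype)        # hypothetical: emits an actual CAST op
```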
* internal casting support for MULACC
* fuzz test mulacc internal casting
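A hedged sketch of what "internal casting" means for MULACC: the operands are cast up to the accumulator's dtype before the fused multiply-add, rather than accumulating in their own dtype. Names are illustrative:

```python
import numpy as np

def mulacc(a: np.ndarray, b: np.ndarray, acc: np.ndarray) -> np.ndarray:
  # cast the operands up to the accumulator dtype before a*b+acc,
  # e.g. half inputs accumulating into a float32 buffer
  return acc + a.astype(acc.dtype) * b.astype(acc.dtype)

out = mulacc(np.ones(3, np.float16), np.ones(3, np.float16), np.zeros(3, np.float32))
assert out.dtype == np.float32
```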
* get_reduce_dtype
handle vectorized acc
update get_reduce_acc calls with the correct dtype
update tests
* pending _complete_ implementation of a function that gets the dtype based on self.reduceop
+more failing tests
* get_reduce_dtype try 2
add TODO
* get_lazyop_info already does it
* cleanup
* bring back internal casting support for mulacc
* use the scalar version of the acc dtype
* conceptual diff cleanup
* one extra line to a cleaner linearizer
* correct test assumptions - these should promote?
* rm mulacc cast, the cast of vins happens with the acc dtype promotion
linearizer hacks
* Revert "rm mulacc cast, the cast of vins happens with the acc dtype promotion"
This reverts commit afdd540733.
Revert "correct test assumptions - these should promote?"
This reverts commit 49ae2206ed.
* skip tests blocked by MULACC->lazyop cleanup
* final changes to add back internal casting for MULACC and update the skip-test logic; upcast works but downcast does not
* only test the linearizer abstraction layer
we want to ensure that the linearizer matches whatever lazy returns
* remove unused hypothesis module
* remove mulacc related changes, those will move to the lazy pr
* remove midcast test
* move to helpers
* Revert "remove midcast test"
This reverts commit 86e74d7960.
add TODO with skip
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
* least upper float
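A minimal sketch of the promotion rule the name suggests (floats pass through, everything else promotes to the default float); the actual tinygrad promotion lattice may differ:

```python
import numpy as np

def least_upper_float(dt: np.dtype) -> np.dtype:
  # already a float: keep it; int/bool: promote to the default float32
  return dt if np.issubdtype(dt, np.floating) else np.dtype(np.float32)

assert least_upper_float(np.dtype(np.float16)) == np.float16
assert least_upper_float(np.dtype(np.int64)) == np.float32
```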
* don't cast to the same thing
* tests for least_upper_float
* add regression tests to test_dtype_alu
* the call is pretty cheap; a cache is probably too much overhead
* validate stable diffusion for seed 0
the closest false positive I can get is the same setup with one fewer step: dist = 0.0036.
the same setup with fp16 has dist = 5e-6,
so setting the validation threshold to 1e-4 should be good
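A sketch of the validation check described above; the distance metric and the known-good array are stand-ins for whatever the test actually stores:

```python
import numpy as np

def validate(output: np.ndarray, known_good: np.ndarray, threshold: float = 1e-4):
  # closest observed false positive (one fewer step) sits at dist ~ 3.6e-3 and
  # the matching fp16 run at ~ 5e-6, so 1e-4 separates them with margin
  dist = float(np.linalg.norm(output.astype(np.float64) - known_good.astype(np.float64)))
  assert dist < threshold, f"validation failed: dist={dist}"
```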
* run with --seed 0