chenyu
f511ad9103
No pyint again (#7156)
* Revert "bring back pyint (#7150)"
This reverts commit 37e83ca6fc.
* remove truncate in const folding
* truncate_output=False
2024-10-19 13:48:59 -04:00
chenyu
37e83ca6fc
bring back pyint (#7150)
fixed test_failure_52 and resnet. need to understand this better
2024-10-18 14:54:37 -04:00
Bhavya Gada
b7b2017cb9
only ignore warnings not errors (#7146)
2024-10-18 07:41:11 -04:00
Bhavya Gada
534597e753
fix all test warnings (#7024)
* fix pytorch warning in nn.conv2d for same padding
* fix future warning in torch load
* fix overflow warning in tensor list test: https://github.com/numpy/numpy/issues/23606#issuecomment-1512752172
* fix floating point warnings in dtype tests using docs https://numpy.org/doc/stable/reference/generated/numpy.errstate.html and a neat solution https://stackoverflow.com/questions/53634965/change-np-seterr-behavior-inside-a-function-only
* put err state in one place; comment taken care of by function hover
* enter np errstate context manager on test setup
* put decorator on class
2024-10-18 08:56:40 +08:00
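The last two bullets lean on `np.errstate` working both as a context manager and, since NumPy 1.17, as a decorator. A minimal sketch of both forms:

```python
import numpy as np

# as a context manager: silence FP warnings for one block
with np.errstate(over="ignore"):
    np.float16(65504) * np.float16(2)  # overflows to inf, no warning

# as a decorator (NumPy >= 1.17): one line covers a whole test function
@np.errstate(all="ignore")
def noisy_case():
    return np.float32(1.0) / np.float32(0.0)  # inf, no divide warning
```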
George Hotz
ded1b38b84
minor dtype cleanup [pr] (#7124)
* minor dtype cleanup [pr]
* use ptr() function
2024-10-17 17:41:23 +08:00
George Hotz
f85c9ba00a
rewrite max to use cmplt + where (#7037)
2024-10-14 20:00:51 +08:00
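The identity behind the rewrite is elementwise: max(a, b) == where(a < b, b, a), so any backend with a compare and a select gets max for free. A minimal sketch in tinygrad:

```python
from tinygrad import Tensor

a, b = Tensor([1.0, 5.0, 3.0]), Tensor([4.0, 2.0, 3.0])
m = (a < b).where(b, a)  # cmplt builds the mask, where selects per element
assert m.tolist() == [4.0, 5.0, 3.0]
```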
George Hotz
85a45164fb
remove pyint [pr] (#7016)
* remove pyint
* bump time on tp [pr]
* dont truncate in const fold
* remove dead code
* Revert "dont truncate in const fold"
This reverts commit 29c81db0f7.
* remove define_var
2024-10-12 22:36:24 +08:00
chenyu
75d9dcf000
support dtype in softmax and log_softmax (#6914)
matches torch. for mixed precision training, we would want to use float for softmax
2024-10-06 07:18:15 -04:00
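A hedged usage sketch, mirroring the torch `softmax(dim, dtype=...)` API the commit says it matches: run the reduction in float32 while activations stay half.

```python
from tinygrad import Tensor, dtypes

x = Tensor.randn(4, 8, dtype=dtypes.half)
y = x.softmax(axis=-1, dtype=dtypes.float32)  # reduce in float, like torch
```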
wozeparrot
2b899164c6
no numpy (#6751)
2024-09-26 16:40:18 +08:00
George Hotz
cb22ef379a
truncate consts early (#6741)
* truncate consts early
* ptx still fails
* Update dtype.py
2024-09-25 16:49:51 +08:00
George Hotz
1b4d1823b7
add pyint to DTYPES_DICT [run_process_replay] (#6477)
* add pyint to DTYPES_DICT [run_process_replay]
* also fix uop alu bug
* exclude pyint there too
* ne ne
* force explicit dtype
2024-09-11 17:31:59 +08:00
chenyu
002303c145
fix output of truncate_fp16 (#6381)
make sure the non-inf path returns the truncated value
2024-09-05 22:55:43 -04:00
chenyu
590c0922b6
Tensor.prod (#6250)
* Tensor.prod
a new reduce op!
* onnx ReduceProd
2024-08-23 10:06:32 -04:00
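Usage is symmetric with the existing reduce ops; a quick sketch:

```python
from tinygrad import Tensor

t = Tensor([[1.0, 2.0], [3.0, 4.0]])
assert t.prod().item() == 24.0                 # full reduce
assert t.prod(axis=1).tolist() == [2.0, 12.0]  # reduce one axis
```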
wozeparrot
0c5189de25
threefry half (#6154)
2024-08-18 15:23:12 -07:00
samm393
2dc586ffe5
Shape change bitcast for more dtypes (#6047)
* bitcast & tests
* use to_dtype
* put disk tensor tests back
* tests
* bitmask
* no bitmask
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-08-14 10:03:34 -07:00
chenyu
4a65010de8
remove CUDACPU flag in tests [run_process_replay] (#5902)
no longer used
2024-08-04 16:06:38 -04:00
chenyu
c67e9887f7
support using str to specify dtype (#5897)
* support using str to specify dtype
in Tensor creation and args into `cast` and `bitcast`, and acc_dtype
* more tests
2024-08-04 12:56:28 -04:00
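A sketch of the string form in the three places the commit lists:

```python
from tinygrad import Tensor

t = Tensor([1, 2, 3], dtype="float16")  # Tensor creation
u = t.cast("int32")                     # cast (bitcast takes a str too)
s = t.sum(acc_dtype="float32")          # acc_dtype
```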
samm393
2c94316bd2
ull literal support and test (#5789)
* ull literal support and test
* missing .numpy()
2024-07-29 11:50:49 -04:00
chenyu
600a39771d
fix Tensor.arange if (stop-start) and step have different signs (#5775)
2024-07-28 14:34:10 -04:00
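The element count is ceil((stop - start) / step) clamped at zero, so when (stop - start) and step disagree in sign the result must be empty rather than a wrapped-around length; a quick check against numpy's behavior:

```python
import numpy as np
from tinygrad import Tensor

# positive span, negative step -> empty, matching numpy
assert np.arange(0, 10, -1).size == 0
assert Tensor.arange(0, 10, -1).tolist() == []
```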
kormann
2c4add6844
pretty print lazy op by default (#5505)
* pretty lop
* min diff
* walrus
* fix
* min diff
* simplify
* pretty helper function
* ws
* pretty uop upat
* tests
* stricter tests
* test passes
* ws
* stronger upat test
* delete print_tree
* min diff
* stricter exp test
* fix merge
* stronger uops eval test
* +readable and deep upat test
* +readable and deep upat test
* sort inv fix
* fix
* revert allowed_len
2024-07-18 09:34:08 -07:00
chenyu
f8a47608cc
test dtype.min and dtype.max (#5479)
compared with np.iinfo for integer dtype
2024-07-14 15:31:37 -04:00
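The shape of the comparison, sketched (tinygrad exposes the bounds via `dtypes.min`/`dtypes.max`):

```python
import numpy as np
from tinygrad import dtypes

# integer bounds should agree with numpy's iinfo
assert dtypes.min(dtypes.int8) == np.iinfo(np.int8).min == -128
assert dtypes.max(dtypes.uint8) == np.iinfo(np.uint8).max == 255
```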
chenyu
ca021229e4
fix attention to always return in the same dtype as input (#5100)
mid cast to default_float does not work as intended when default is float32 and qkv is in half
2024-06-22 10:34:57 -04:00
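The shape of the fix, as a hedged sketch (illustrative, not the actual attention code): do the softmax in float, then cast back to the query's dtype rather than leaving the output in default_float.

```python
from tinygrad import Tensor

def attention(q: Tensor, k: Tensor, v: Tensor) -> Tensor:
    w = (q @ k.transpose(-2, -1) * (q.shape[-1] ** -0.5)).softmax(-1)
    return (w @ v).cast(q.dtype)  # always return the input dtype
```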
chenyu
cc2be9064f
fix out of bound python list into numpy array (#5043)
numpy 2.0 does not allow oob python const and recommends writing as `np.array(value).astype(dtype)`
2024-06-18 18:05:21 -04:00
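The behavior change in one line each:

```python
import numpy as np

# numpy >= 2.0: OverflowError, out-of-bounds python ints are rejected
# np.array([256], dtype=np.uint8)
# recommended spelling: build first, then wrap explicitly via astype
assert np.array([255, 256]).astype(np.uint8).tolist() == [255, 0]
```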
chenyu
acaf9a490d
RECIP(-0.0) should be -inf (#5024)
* RECIP(-0.0) should be -inf
added test_dtype_alu for PYTHON backend
* catch that
* fix those two
2024-06-17 22:26:58 -04:00
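IEEE 754 floats keep the sign of zero and the reciprocal must follow it; numpy shows the expected values (plain python would raise ZeroDivisionError here):

```python
import numpy as np

with np.errstate(divide="ignore"):
    assert np.float32(1.0) / np.float32(-0.0) == -np.inf  # RECIP(-0.0)
    assert np.float32(1.0) / np.float32(0.0) == np.inf    # RECIP(0.0)
```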
chenyu
03b367c014
handle float16 overflow in PYTHON (#5022)
* handle float16 overflow in PYTHON
use `truncate` when constructing tensor from list to make sure all values are packable (might be slow, but should be correct). add truncate_fp16 to cast overflowed values to inf/-inf.
* all valid fmt supports truncate
2024-06-17 21:12:52 -04:00
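A minimal sketch of the idea (close in spirit to the helper, not the exact code): round-trip through an IEEE binary16 so in-range values come back rounded to half precision and out-of-range values become +/-inf (the non-inf path is what #6381 above later fixed).

```python
import math, struct

def truncate_fp16(x: float) -> float:
    # struct's "e" format is IEEE 754 binary16; packing raises OverflowError
    # for values beyond the half range instead of producing inf itself
    try:
        return struct.unpack("e", struct.pack("e", x))[0]
    except OverflowError:
        return math.copysign(math.inf, x)

assert truncate_fp16(65519.0) == 65504.0  # rounds back into range
assert truncate_fp16(1e5) == math.inf     # overflow -> inf
```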
chenyu
4296507021
Tensor.sum returns in acc_dtype if specified (#5012)
* Tensor.sum returns in acc_dtype if specified
* skip PYTHON for now
* revert that
* relax that
2024-06-17 16:35:52 -04:00
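Sketch of the rule (which #4994 below extends to matmul): unspecified, the sum is cast back to the input dtype; specified, the result stays in acc_dtype.

```python
from tinygrad import Tensor, dtypes

x = Tensor.ones(8, dtype=dtypes.half)
assert x.sum().dtype == dtypes.half  # default: cast back to input dtype
assert x.sum(acc_dtype=dtypes.float32).dtype == dtypes.float32
```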
chenyu
2b07847f2b
matmul returns in acc_dtype if specified (#4994)
more flexible to not automatically downcast, can fix bert mixed precision training with this
2024-06-16 12:56:15 -04:00
chenyu
67e8df4969
remove numpy from dtype (#4969)
replaced all dtype.np with _to_np_dtype defined in tensor.py.
after this, the only numpy usages are (1) Tensor(np.ndarray), (2) construct .numpy() output, (3) numpy random buffer
2024-06-14 15:38:45 -04:00
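An illustrative sketch of the boundary this sets up (the real `_to_np_dtype` lives in tensor.py; the mapping here is a hypothetical subset):

```python
import numpy as np

# hypothetical subset of the name -> numpy mapping, for illustration only
_NP_MAP = {"float": np.float32, "half": np.float16, "int": np.int32, "bool": np.bool_}

def _to_np_dtype(dtype):
    return _NP_MAP.get(dtype.name)
```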
chenyu
287d3c3b84
support list, tuple input in dtypes.from_py (#4945)
* support list, tuple input in dtypes.from_py
and used it to infer dtype from python list and tuple in Tensor constructor.
* fix tests
2024-06-13 13:38:06 -04:00
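A hedged sketch of the inference this enables (exact promotion rules live in dtype.py):

```python
from tinygrad import dtypes

assert dtypes.from_py(True) == dtypes.bool
assert dtypes.from_py([1, 2]) == dtypes.default_int
assert dtypes.from_py([1, 2.0]) == dtypes.default_float  # float wins
```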
qazal
637f482588
configure derandomizing CI tests (#4793)
2024-05-31 17:06:58 +03:00
Szymon Ożóg
de5c69c4c9
Unify test_dtype naming conventions (#4730)
2024-05-25 10:12:40 -04:00
chenyu
47aba47f64
update Torch.gather api (#4692)
* update Torch.gather api
gather(self, dim, index) to match torch
* fix that
2024-05-22 21:54:06 -04:00
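The new argument order, sketched against torch's `gather(dim, index)`:

```python
from tinygrad import Tensor

t = Tensor([[1, 2], [3, 4]])
idx = Tensor([[0, 0], [1, 0]])
# dim first, then index: out[i][j] = t[i][idx[i][j]] for dim=1
assert t.gather(1, idx).tolist() == [[1, 1], [4, 3]]
```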
chenyu
286b4dbdf2
compile raise CompileError and skip only RuntimeError in multiprocess beam (#4646)
* compile raise CompileError and skip only RuntimeError in multiprocess beam
renderer error with multiprocess should not be skipped by beam
* use `==` for dtype to dtype comparison
* that needs to be is
* typo
2024-05-19 00:25:25 -04:00
chenyu
04f2327ca3
fix abs of diff of uint (#4411)
2024-05-15 18:39:11 -04:00
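Why this needs care: unsigned subtraction wraps, so abs of the raw difference is wrong whenever the second operand is larger. numpy shows the trap:

```python
import numpy as np

a = np.array([3], dtype=np.uint8)
b = np.array([5], dtype=np.uint8)
assert (a - b)[0] == 254                              # wraps, never negative
assert (np.maximum(a, b) - np.minimum(a, b))[0] == 2  # the safe distance
```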
nimlgen
eb9689336e
nv mockgpu (#4600)
* mockgpu nv
* works
* comment that out
* fix merge
* setup gpuocelot
* install packages
* not run all of them
* passes
* fix ci
* almost
* should pass
* linter
* linter 2
* try this?
* ugn, not supported
* ci
* remove ticket from description
* better descs
2024-05-15 23:46:08 +03:00
chenyu
3c11ca452e
skip CLANG test casts between double and half for now (#4609)
started breaking after the GitHub CI image update
2024-05-15 16:17:06 -04:00
chenyu
7eb035e7c5
stronger test case for half mean overflow (#4470)
2024-05-07 22:40:09 -04:00
chenyu
ca7300c783
fix half mean and its backward (#4469)
* fix half mean and its backward
cast to sum_acc_type, sum, div, then cast back
* mean dtype tests
2024-05-07 21:46:41 -04:00
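The forward path as a sketch (illustrative, not the function.py change itself):

```python
from tinygrad import Tensor, dtypes

def half_mean(x: Tensor) -> Tensor:
    # cast to the acc dtype, sum, divide, then cast back: the intermediate
    # sum may not fit in half even though the mean does
    return (x.cast(dtypes.float32).sum() / x.numel()).cast(x.dtype)

x = Tensor.full((1000,), 100.0, dtype=dtypes.half)  # sum 100000 > half.max
assert half_mean(x).item() == 100.0
```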
qazal
35dfbc6354
rand_for_dtype helper (#4459)
2024-05-07 00:03:42 +03:00
chenyu
826cccd54d
fix mean underflow for half tensor (#4377)
* fix mean underflow for half tensor
divide only the reduce factor. added unit test and non-nan assertion in resnet training. also added a failed test case for symbolic shape var
* skip for python backend
2024-05-01 13:38:57 -04:00
chenyu
077ea6926c
remove downcast_half in sum (#4376)
breaks boolean mean and other stuff
2024-05-01 11:46:44 -04:00
chenyu
93abcd3113
fix function.py sum backward without downcast_half (#4353)
without downcast_half, sum output dtype can be different from input dtype. cast back to input dtype in function.py
2024-04-29 17:53:02 -04:00
chenyu
c1d8d425eb
fix mean of half tensor if sum is greater than half.max (#4327)
sum of half does acc in float32 already, add an arg to not downcast to half and use that in mean
2024-04-28 18:04:54 -04:00
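The overflow being guarded against, shown with numpy's half:

```python
import numpy as np

x = np.full(100000, 1.0, dtype=np.float16)
with np.errstate(over="ignore"):
    assert np.isinf(x.sum(dtype=np.float16))  # partial sums pass 65504 -> inf
assert x.sum(dtype=np.float32) == 100000.0    # accumulate in float32 instead
```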
qazal
23445db2b9
no skipped tests in RHIP (#4337)
* delete skip
* delete split skip
* remu dev
* compiler fails here
* Revert "remu dev"
This reverts commit 28b933d4eb.
2024-04-28 12:23:05 -04:00
chenyu
63eb0a68af
fix return dtype of gather (#4159)
2024-04-12 16:25:12 -04:00
chenyu
d9c5a2b1bb
fix return dtype of getitem Tensor indexing (#4158)
the use of sum can auto-upcast the result. fixed by using the data dtype as the acc_dtype
2024-04-12 15:55:02 -04:00
chenyu
380f27d629
move sum acc_dtype into lazy so it applies to backward (#4149)
* move sum acc_dtype into lazy so it applies to backward
* unit test
2024-04-11 14:43:56 -04:00
chenyu
7bc560ec49
remove outdated bf16 comments in test_dtype (#3987)
2024-03-29 00:56:18 -04:00
uuuvn
8a40d7d423
Shape changing bitcast and assert bitcast in disk (#3973)
* Shape changing bitcast
* only support it on disk
* basic test
* more tests
* RuntimeError instead of assert
* create unique temp files
* move tests that use disk to test_disk_tensor
* linter
* remove assert on error messages
* that's RuntimeError now
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-03-28 21:49:10 -07:00
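numpy's `view` shows the same shape rule the disk-tensor bitcast applies: only the last axis rescales, by the ratio of the itemsizes.

```python
import numpy as np

a = np.zeros((2, 2), dtype=np.float32)  # 16 bytes
b = a.view(np.uint16)                   # itemsize 4 -> 2
assert b.shape == (2, 4)                # last axis doubles, others unchanged
```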
chenyu
793ab0512e
use ctypes to truncate float64 and float32 in uops (#3986)
this fixed the softmax.argmax bug for ops_python as the float is truncated to float32
2024-03-28 23:56:50 -04:00
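The trick in isolation: round-trip a python double through a `ctypes.c_float`, which stores the nearest float32; applying this to float32 consts is what made ops_python agree with the compiled backends on cases like softmax.argmax.

```python
import ctypes

def truncate_fp32(x: float) -> float:
    return ctypes.c_float(x).value  # c_float holds the nearest float32

assert truncate_fp32(0.5) == 0.5  # exactly representable, unchanged
assert truncate_fp32(0.1) != 0.1  # 0.1 has no exact float32 form
```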