Max-We
ab2714423b
Add einsum tests (#6286)
...
Co-authored-by: Maximilian Weichart <maximilian.weichart@icloud.com>
2024-08-26 09:09:25 -07:00
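Tensor.einsum follows the torch/numpy formula-string convention; below is a minimal sketch of the kind of case the new tests cover, assuming the formula-first static-method form:
```
from tinygrad import Tensor

# matrix multiply expressed as an einsum; same result as (A @ B)
A = Tensor([[1.0, 2.0], [3.0, 4.0]])
B = Tensor([[5.0, 6.0], [7.0, 8.0]])
print(Tensor.einsum("ij,jk->ik", A, B).numpy())  # [[19. 22.] [43. 50.]]
```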
chenyu
af7c04ff57
Tensor.__floordiv__ (#6283)
...
support Tensor.__floordiv__ and friends
2024-08-26 09:43:40 -04:00
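A quick sketch of the new operator ("and friends" meaning the reflected variant, `__rfloordiv__`):
```
from tinygrad import Tensor

t = Tensor([5.0, -5.0, 7.0])
print((t // 2).numpy())                   # floor division: [ 2. -3.  3.]
print((7 // Tensor([2.0, 3.0])).numpy())  # __rfloordiv__:  [3. 2.]
```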
chenyu
da5cf11859
fix acc init value for MUL (#6263)
2024-08-23 23:19:44 -04:00
chenyu
590c0922b6
Tensor.prod (#6250)
...
* Tensor.prod
a new reduce op!
* onnx ReduceProd
2024-08-23 10:06:32 -04:00
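Like the other reductions, `prod` reduces over every element by default; the axis argument is assumed to work as it does for `sum`:
```
from tinygrad import Tensor

t = Tensor([[1.0, 2.0], [3.0, 4.0]])
print(t.prod().numpy())        # 24.0, product over every element
print(t.prod(axis=1).numpy())  # per-row products: [ 2. 12.]
```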
Gabe Caldwell
bdd6325f31
default num_classes value for one_hot (#6182)
...
* num_classes=-1
If num_classes is set to -1, the number of classes will be inferred as one greater than the largest class value in the input tensor.
* num_classes desc
comment to explain num_classes default and what that means.
* replacing ' with `
2024-08-19 12:07:14 -07:00
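A short example of the inference behavior described above:
```
from tinygrad import Tensor

t = Tensor([0, 2, 1])
# with the default num_classes=-1, the width is inferred as max(t)+1 = 3
print(t.one_hot().numpy())             # shape (3, 3)
print(t.one_hot(num_classes=5).shape)  # explicit width: (3, 5)
```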
Alessandro Benetti
9328248610
support for std_mean and cross_entropy (#6181)
...
* support for std_mean and cross_entropy (#3)
* Cross entropy and std mean support
* remove extra examples
2024-08-19 12:06:44 -07:00
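A sketch of the two additions, assuming torch-style signatures (`std_mean` returning a `(std, mean)` pair, `cross_entropy` taking logits and class-index targets):
```
from tinygrad import Tensor

t = Tensor([[1.0, 2.0], [3.0, 4.0]])
std, mean = t.std_mean()  # same values as (t.std(), t.mean())
print(std.numpy(), mean.numpy())

logits = Tensor([[2.0, 0.5, 0.1], [0.1, 2.5, 0.3]])
print(logits.cross_entropy(Tensor([0, 1])).numpy())  # scalar mean loss
```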
George Hotz
553ae9ebc0
bilinear interp uint8 fails (#6103)
...
* new test for e2e compile failures
* fix bug
* bilinear interp uint8 fails
* better tests
2024-08-15 19:34:39 -07:00
chenyu
4a65010de8
remove CUDACPU flag in tests [run_process_replay] (#5902)
...
no longer used
2024-08-04 16:06:38 -04:00
chenyu
b392b8edc3
increase atol and rtol test_gemm_fp16 (#5866)
...
* increase atol and rtol test_gemm_fp16
made it pass with NOOPT, which has larger accumulated error
* revert that
2024-08-01 19:09:58 -04:00
chenyu
defd89e8e0
unify negative shape creation to raise ValueError (#5817)
...
[run_process_replay]
2024-07-30 13:42:59 -04:00
P4ssenger
6742a4789a
Add check for negative dimension in view (#5790)
...
* add check for negative dimension in view
* add negative dim tests
* move check to tensor level
* fix error message
* move check to view create
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-07-30 13:26:27 -04:00
samm393
573e0f9a48
remove float division from idiv in python_alu (#5777)
...
* removes float division from idiv in python_alu
* add test
* cleaner logic
* pass clang unsigned literals correctly
* suffix ULL instead of U
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-07-29 12:14:12 -04:00
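Python's `//` floors toward negative infinity while C-style idiv truncates toward zero, so emulating idiv via `int(a / b)` leans on float division and loses precision for large integers. A minimal sketch of truncating division with integer ops only (an illustrative helper, not tinygrad's actual python_alu code):
```
def trunc_idiv(a: int, b: int) -> int:
  # hypothetical helper: divide magnitudes, then reapply the sign,
  # which truncates toward zero without any float division
  q = abs(a) // abs(b)
  return q if (a < 0) == (b < 0) else -q

assert trunc_idiv(7, 2) == 3
assert trunc_idiv(-7, 2) == -3  # Python's -7 // 2 would give -4
```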
George Hotz
053550c3f3
remove MERGE opt, cleanup wmma upcast (#5669)
...
* remove MERGE opt, cleanup wmma upcast
* upcast first
* fix broken vectorize folding rule
2024-07-23 20:43:42 -07:00
George Hotz
e3f00ac77d
Fix cuda tc emu test (#5663)
...
* fix acc folding for NV tensor cores
* fix correctness of reduce_before_expand
* fix test emulated CUDA tensor cores
* test_gemm_fp16 on some devices
2024-07-23 15:04:25 -07:00
George Hotz
386fb5e7f8
folding without UNMUL (#5628)
...
* folding without UNMUL
* fix failures, index_collapse
* import ReduceOps
* test_arange_4096 isn't folding
2024-07-21 20:14:44 -07:00
George Hotz
0ad87021e2
move acc to end (#5568)
...
* move acc to end
* confirmed pictures are the same
* relax that
* Update test_ops.py
2024-07-19 03:06:52 -07:00
chenyu
6e405b0a2b
add 0d tensor to trunc/floor/ceil/round tests (#5512)
...
existing trunc test passes backward, but its backward is incorrect in general. added tests that would fail
2024-07-16 16:48:25 -04:00
Tobias Fischer
87a2ef2bc2
Add Interpolate Function (#5482)
...
* add interpolate function
* fixed linter issue
* reduced sizes in test
---------
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2024-07-16 09:44:01 -07:00
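A minimal usage sketch, assuming a torch-style `size`/`mode` signature where `size` covers the trailing dimensions:
```
from tinygrad import Tensor

# upsample the last axis of a (N, C, W) tensor from 4 to 8 samples
t = Tensor([[[0.0, 1.0, 2.0, 3.0]]])
print(t.interpolate(size=(8,), mode="linear").shape)  # (1, 1, 8)
```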
Tobias Fischer
e219103677
Add Pad to Pooling (#5488)
2024-07-14 21:50:20 -07:00
Tobias Fischer
5849130cbb
gather negative dim fix (#5486)
2024-07-14 20:20:53 -04:00
chenyu
00813a92a0
update Tensor.eye api to match torch (#5433)
...
* update Tensor.eye api to match torch
input is n for nrows and optional m for ncols
* space
* fix onnx
2024-07-12 20:25:12 -04:00
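With the updated api:
```
from tinygrad import Tensor

print(Tensor.eye(3).shape)     # (3, 3) identity
print(Tensor.eye(2, 3).shape)  # (2, 3), matching torch.eye(n, m)
```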
chenyu
64986f949c
more transcend math tests in ci (#5368)
...
* more transcend math tests in ci
test large inputs to trig functions that hit a different reduction algo, and test TRANSCENDENTAL=2 for all backends
* no CUDACPU
* try that
2024-07-10 21:19:09 -04:00
chenyu
0f0940225a
fix Tensor.all and Tensor.any for PTX (#5335)
...
support boolean acc and boolean phi, and rewrite boolean max to uint8 max
2024-07-08 18:15:04 -04:00
chenyu
6856f915d6
Tensor.any and Tensor.all (#5320)
...
does not work in ptx yet due to how boolean tensors are handled
2024-07-07 14:36:00 -04:00
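A quick sketch of the two reductions (the axis argument is assumed to work as for the other reduce ops):
```
from tinygrad import Tensor

t = Tensor([[True, False], [True, True]])
print(t.any().numpy())        # True
print(t.all().numpy())        # False
print(t.all(axis=1).numpy())  # per-row: [False  True]
```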
chenyu
2029cb7047
support passing None to Tensor.clip (#5319)
...
passing None for no upper bound or no lower bound
2024-07-07 13:04:22 -04:00
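For example:
```
from tinygrad import Tensor

t = Tensor([-2.0, 0.5, 3.0])
print(t.clip(0, None).numpy())  # lower bound only: [0.  0.5 3. ]
print(t.clip(None, 1).numpy())  # upper bound only: [-2.   0.5  1. ]
```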
chenyu
c1e330f302
Tensor.int and Tensor.bool (#5317)
2024-07-07 11:52:58 -04:00
George Hotz
e53b164e1a
small changes from lowerer (#5266)
2024-07-02 15:03:54 -07:00
George Hotz
3df47bc21e
OpenELM + repeat_interleave (#5234)
...
* start writing openelm
* progress...hit bug
* repeat_interleave support
* gqa
* add rotary embedding
* spp
* i think it runs correctly
* broken
* output is good now
* cleanups
* no io_uring on android
2024-06-30 15:18:39 -07:00
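Most of this commit is the OpenELM model itself; `repeat_interleave` is the reusable tensor op (used there for grouped-query attention). A minimal sketch:
```
from tinygrad import Tensor

t = Tensor([1, 2, 3])
print(t.repeat_interleave(2).numpy())  # [1 1 2 2 3 3]
```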
hikettei
ad1ca7da64
[Feature] Added BinaryOps.AND/BinaryOps.OR (#5223)
...
* [Feature] Added BinaryOps.AND/BinaryOps.OR
* Add: __rand__, __ror__
2024-06-29 17:20:25 -07:00
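These surface as the `&` and `|` operators (with `__rand__`/`__ror__` covering the reflected forms):
```
from tinygrad import Tensor

a, b = Tensor([True, True, False]), Tensor([True, False, False])
print((a & b).numpy())  # [ True False False]
print((a | b).numpy())  # [ True  True False]
```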
chenyu
ee0c6dfc15
build Tensor._tri with movements only (#5110)
...
* build Tensor._tri with movements only
doesn't need arange; saves a kernel in the attention mask
* simpler, more tests
2024-06-23 00:07:36 -04:00
chenyu
20fabd8a5b
update Tensor.triu and Tensor.tril (#5109)
...
renamed the arg to `diagonal` to match the torch api, and added documentation and examples
2024-06-22 21:59:50 -04:00
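With the renamed argument, matching torch:
```
from tinygrad import Tensor

t = Tensor.ones(3, 3)
print(t.triu(diagonal=1).numpy())  # zeros on and below the main diagonal
print(t.tril(diagonal=0).numpy())  # keeps the main diagonal and below
```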
George Hotz
9f875123b6
small changes from lowerer. [run_process_replay] [no_assert] (#5102)
2024-06-22 11:09:35 -07:00
chenyu
166a2b19b5
fix reduce axis of 0d tensors (#5089)
...
`x.sum(())` is fine, and `x.sum((1,))` should throw IndexError
2024-06-21 13:51:40 -04:00
chenyu
36b4a492a1
explicitly check getitem indices can have at most one ellipsis (#5087)
...
* explicitly check getitem indices can have at most one ellipsis
previous error with multiple `...`:
```
if index_type not in [None, int, slice, Tensor]: raise IndexError(f"{index_type=} not supported")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: index_type=<class 'ellipsis'> not supported
```
this pr:
```
if len(ellipsis_idx) > 1: raise IndexError("an index can only have a single ellipsis ('...')")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: an index can only have a single ellipsis ('...')
```
* oh we have that already
* test that
* test these
2024-06-21 12:33:18 -04:00
chenyu
f6d6760f71
don't cast tuple to list before creating Tensor (#5071)
...
the Tensor constructor now supports creating from a tuple
2024-06-20 13:32:56 -04:00
chenyu
50700171ef
minor cleanup to reshape arg handling (#5070)
...
moved None handling to be with argfix, and only resolve -1 if there's a -1
2024-06-20 10:27:27 -04:00
chenyu
f4355d0f1b
check Tensor.permute input arg is a valid permutation (#5069)
...
also added support for negative axes
2024-06-20 10:01:28 -04:00
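A sketch of the checked behavior (the exact exception type is not shown here):
```
from tinygrad import Tensor

t = Tensor.ones(2, 3, 4)
print(t.permute(2, 0, 1).shape)   # (4, 2, 3)
print(t.permute(-1, 0, 1).shape)  # negative axes resolve the same way
# t.permute(0, 0, 1) now raises: (0, 0, 1) is not a valid permutation
```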
chenyu
e8f39fcaaa
check arg to Tensor.flip can appear only once (#5068)
...
* check arg to Tensor.flip can appear only once
raise RuntimeError if there are multiple
* fix test
2024-06-20 09:33:42 -04:00
chenyu
620fa6e5a2
check Tensor.reshape can have at most one -1 (#5026)
...
raise RuntimeError to match torch. on master it threw weird errors from the shapetracker
2024-06-18 08:17:12 -04:00
chenyu
c0139b05d8
python_alu sin(inf) is nan (#5020)
...
* python_alu sin(inf) is nan
without special handling, it throws ValueError: math domain error
* skip CUDACPU
2024-06-17 19:47:30 -04:00
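A minimal sketch of the special case (an illustrative helper, not the actual python_alu code): plain `math.sin` raises on infinite input, while the emulator should return nan the way real hardware does:
```
import math

def safe_sin(x: float) -> float:
  # hypothetical helper: math.sin(math.inf) raises "ValueError: math domain error"
  return math.nan if math.isinf(x) else math.sin(x)

print(safe_sin(1.0))       # 0.8414709848078965
print(safe_sin(math.inf))  # nan
```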
Ray
1ad3b25461
fix einsum output str (#4998)
...
* fix einsum output str
* new line to satisfy linter
* removed redundant cast (satisfy linter)
2024-06-17 12:18:14 -04:00
chenyu
67e8df4969
remove numpy from dtype (#4969)
...
replaced all dtype.np with _to_np_dtype defined in tensor.py.
after this, the only numpy usages are (1) Tensor(np.ndarray), (2) constructing .numpy() output, (3) the numpy random buffer
2024-06-14 15:38:45 -04:00
geohotstan
90332eb529
Getitem pin None dimension (#4960)
...
* fix
* remove torch out of bounds test
* 1 more test case
2024-06-14 10:48:59 -04:00
chenyu
74586bc339
fix getitem with leading None (#4943)
...
i think all None handling can be unified, removing calc_dim in advanced indexing
2024-06-13 11:23:40 -04:00
chenyu
fae08c4d48
fix Tensor.triu / Tensor.tril with boolean input (#4941)
...
`where(self, 0)` incorrectly upcasted the output. `where(self, False)` is correct but looks unnatural, so added a cast at the end. The pattern matcher can fold the cast into the where branches
2024-06-12 20:16:13 -04:00
chenyu
eb0f5b5660
failed test case for getitem with leading Nones (#4936)
...
* failed test case for getitem with leading Nones
torch matches numpy, so tinygrad is incorrect.
another repro
```
import numpy as np
import torch
from tinygrad import Tensor

t = np.arange(12).reshape((3, 4))
print(t[None, None, np.array([1, 2])])
t = torch.arange(12).reshape((3, 4))
print(t[None, None, torch.tensor([1, 2])].numpy())
t = Tensor.arange(12).reshape(3, 4)
print(t[None, None, Tensor([1, 2])].numpy())
```
* # noqa
2024-06-12 16:19:42 -04:00
chenyu
1326f29e24
fix Tensor.gather shape checking criteria (#4932)
...
it's fine if `self.shape[d] >= index.shape[d]` for all `d != dim`, not for all `d`
2024-06-12 13:10:14 -04:00
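Restated with the torch-style `gather(dim, index)` form, where `out[i][j] = t[i][index[i][j]]` for `dim=1`: only the non-gathered dims need `self.shape[d] >= index.shape[d]`:
```
from tinygrad import Tensor

t = Tensor([[1, 2], [3, 4]])
idx = Tensor([[0, 0], [1, 0]])
print(t.gather(1, idx).numpy())  # [[1 1] [4 3]]
```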
chenyu
798ea61377
widen test_ops [low, high] and more strict atol (#4906)
...
default [low, high] changed from [-1.5, 1.5] to [-2, 2] (except tan).
dropped several explicit atol values that were unnecessarily larger than the default 1e-6.
tested on mac, tinybox red / green
2024-06-10 20:47:09 -04:00
chenyu
c8cd637236
test case for Tensor.var reducing over size = 1 axis (#4902)
...
backward failed when correction >= the n being reduced over
2024-06-10 12:11:39 -04:00
chenyu
a70e8a80d7
test_ops test cmp with special floats (#4826)
...
preparation for fixing nan; it did not work with ge and le before either
2024-06-04 12:10:21 -04:00