chenyu
12a910f1d2
update torch 2.8 ( #12172 )
...
support _reshape_alias. something is wrong with one case of unfold
2025-09-14 15:19:03 -04:00
George Hotz
bcafa72b7f
use tags instead of graph_rewrite_map in rangeify ( #12110 )
...
* use tags instead of graph_rewrite_map in rangeify
* new style, add realize
* metadata works
* simple failure
* fix
* loops
* stuff becomes a NOOP when you remove it
* stuff becomes a NOOP when you remove it
* tags on bufferize
* bmnist works
* locals don't work
* shippable
* fix some tests
* simpler map_realize
* remove const hack
* debuggable test
* broke
* assign test
* straight up bug
* wooo it passes
* sink shouldn't be there
* fix ops
* bmnist
* kv cache ish
* Set RANGEIFY context variable to 0
* should work normal
* better
* types
* hacks to fix test_symbolic
* pm_add_buffers
* tests should pass
2025-09-14 11:39:01 +08:00
chenyu
aac3dceaf6
merge two PYTHON backend ci job ( #12143 )
...
* merge two PYTHON backend ci job
and mark anything that takes > 10 in test_ops slow
* two more
2025-09-12 17:36:46 -04:00
chenyu
544eb2c402
clean up test_scatter_reduce ( #12125 )
2025-09-11 16:36:58 -04:00
chenyu
0e266f376c
ops_gpu -> ops_cl ( #12103 )
2025-09-10 15:15:48 -04:00
nimlgen
1c6c42715f
unify cpu and llvm ( #11982 )
...
* try unify cpu and llvm
* fixes
* fix
* ops
* no llvm
* fix
* rm
* lvmm is ot
* oops
* override
* no llvm
* ignore
* skip llvm
* ooops
2025-09-09 13:54:44 +03:00
chenyu
ce7163e9b4
clean up skip slow tests in PYTHON ( #12028 )
...
skip with SKIP_SLOW_TEST and decorators
2025-09-05 11:35:26 -04:00
chenyu
52166fd7eb
smaller test_ops inputs ( #12007 )
2025-09-04 16:22:33 -04:00
chenyu
d0e739453e
update many einsum tests ( #11981 )
...
correct the exception testing, and raise ValueError instead of assert when checking args
2025-09-03 15:40:20 -04:00
chenyu
69dd1817d0
raise RuntimeError in merge_dicts instead of assert [pr] ( #11965 )
2025-09-02 17:18:44 -04:00
chenyu
7123df3928
Use Tensor.logaddexp to implement Tensor.softplus ( #11796 )
...
instead of piecewise linear, numerical is handled by logaddexp. jax does this and i think it's more elegant than torch's approach
2025-08-23 11:52:29 -04:00
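Note on the softplus change above: the identity behind it is softplus(x) = log(1 + exp(x)) = logaddexp(x, 0). The sketch below shows that identity with plain Python floats. It is not tinygrad's implementation; the function names and the `beta` parameter are assumptions made for illustration only.

```python
import math

def logaddexp(a: float, b: float) -> float:
    """Numerically stable log(exp(a) + exp(b)) for Python floats."""
    m = max(a, b)
    if math.isinf(m):
        return m  # both inputs are -inf, or at least one is +inf
    # Factor out the larger term so exp() never sees a large positive value.
    return m + math.log1p(math.exp(-abs(a - b)))

def softplus(x: float, beta: float = 1.0) -> float:
    """softplus written as logaddexp(beta * x, 0) / beta (illustrative only)."""
    return logaddexp(beta * x, 0.0) / beta
```

With this form no piecewise threshold is needed for large inputs: `softplus(1000.0)` stays finite, whereas the naive `math.log(1 + math.exp(1000.0))` raises OverflowError.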
chenyu
fb8ee02424
Tensor.logaddexp ( #11793 )
2025-08-23 09:15:00 -04:00
geohotstan
1e679bd789
fix max_unpool2d inf ( #11784 )
...
* start
* add regression test for maxunpool2d
2025-08-22 08:31:24 -04:00
chenyu
91a4de4ca7
fix getitem with inf in tensor ( #11781 )
2025-08-21 21:55:32 -04:00
chenyu
5276fbc9c5
fix gather with inf values ( #11760 )
...
(mask * x) is wrong because 0*inf is nan. i feel we have a lot of those still...
2025-08-20 20:35:40 -04:00
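Note on the gather fix above: under IEEE 754, 0 * inf is nan, so selecting values by multiplying with a 0/1 mask poisons the result whenever a masked-out element is infinite. The sketch below reproduces the problem with plain Python floats. Both helper names are hypothetical and neither is tinygrad code.

```python
import math

def gather_by_multiply(values, mask):
    """Select by multiplying with a 0/1 mask, then summing (the broken pattern)."""
    return sum(m * v for m, v in zip(mask, values))

def gather_by_select(values, mask):
    """Select with a conditional so masked-out values never enter the arithmetic."""
    return sum(v if m else 0.0 for m, v in zip(mask, values))

values = [1.0, math.inf, 3.0]
mask = [1, 0, 0]  # only the first element is wanted

broken = gather_by_multiply(values, mask)  # 0 * inf contributes nan
fixed = gather_by_select(values, mask)     # inf is skipped entirely
```

The conditional form plays the role of a `where`-style select: the unwanted element is replaced before any multiplication happens.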
chenyu
4fe19eec72
Ops.TRUNC ( #11659 )
2025-08-13 18:40:48 -04:00
chenyu
0c97d6de1b
don't round pow output for int pow int ( #11625 )
...
also added atol=0 and big pows for the tests
2025-08-11 20:57:47 -04:00
chenyu
d623f6d850
support int Tensor pow to const non-negative int ( #11624 )
...
matches torch
2025-08-11 19:50:19 -04:00
chenyu
a67e0917c3
list indexing can normalize in python ( #11609 )
...
* list indexing can normalize in python
list index does not need to be normalized in tensor
* update those
2025-08-10 20:02:38 -04:00
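Note on the list indexing change above: when the index is a plain Python list, negative entries can be wrapped into the range [0, size) on the host before any tensor is built. The sketch below shows one way to do that; `normalize_indices` is a hypothetical helper, and the log does not show how tinygrad actually performs this step.

```python
def normalize_indices(indices, size):
    """Map Python-style negative indices into [0, size), checking bounds."""
    out = []
    for i in indices:
        if not -size <= i < size:
            raise IndexError(f"index {i} is out of bounds for size {size}")
        out.append(i + size if i < 0 else i)
    return out
```

For example, `normalize_indices([0, -1, -3, 2], 3)` gives `[0, 2, 0, 2]`.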
chenyu
1181ec0cd2
few more tensor indexing test cases ( #11608 )
2025-08-10 18:56:42 -04:00
chenyu
dfb702ef33
fix sort for small dim ( #11601 )
...
* fix sort for small dim
* fixed test_sort_empty
2025-08-10 01:17:41 -04:00
chenyu
aa1a6f2132
support threshold in Tensor.softplus ( #11564 )
...
fix gradient for large input
2025-08-07 13:43:18 -04:00
chenyu
dbc7807c61
enable WEBGPU tests with buffer limit ( #11489 )
...
TestSample still fails?
2025-08-03 13:02:44 -07:00
chenyu
2d7c28de6a
clean up dup lambdas in helper_test_exception ( #11325 )
2025-07-22 12:21:57 -04:00
chenyu
fb42c84365
merge TestRollEdgeCases into test_ops ( #11321 )
2025-07-22 10:55:57 -04:00
chenyu
1d8b3e9d1c
movementop only Tensor.roll ( #11317 )
...
* movementop only Tensor.roll
* fixed
2025-07-22 10:34:15 -04:00
chenyu
6e9506e6fd
Tensor.roll supports dims=None ( #11313 )
2025-07-21 17:29:23 -04:00
chenyu
d3a93185a6
clean up test_roll ( #11312 )
2025-07-21 16:00:50 -04:00
chenyu
341a686799
Tensor.diagonal ( #11122 )
...
only implemented main diagonal for 2-D tensors. with diagonal and qr, we can get determinant
2025-07-07 16:21:26 -04:00
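Note on the remark about diagonal and qr above: if A = QR with Q orthogonal and R upper triangular, then |det(A)| is the product of the absolute values of R's main diagonal, since det(Q) is +1 or -1. The sketch below checks this with a small pure-Python Gram-Schmidt QR on list-of-rows matrices. It recovers only the magnitude of the determinant (the sign would also need det(Q)), it assumes a square full-rank input, and all names are hypothetical rather than tinygrad's API.

```python
import math

def qr_gram_schmidt(a):
    """QR of a square, full-rank matrix given as a list of rows.

    Returns (q_cols, r): the columns of Q and the upper-triangular R.
    """
    n = len(a)
    cols = [[a[i][j] for i in range(n)] for j in range(n)]  # columns of A
    q_cols = []
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k in range(j):
            # Remove the component of v along the k-th orthonormal column.
            r[k][j] = sum(q_cols[k][i] * v[i] for i in range(n))
            v = [v[i] - r[k][j] * q_cols[k][i] for i in range(n)]
        r[j][j] = math.sqrt(sum(x * x for x in v))
        q_cols.append([x / r[j][j] for x in v])
    return q_cols, r

def abs_det(a):
    """|det(A)| as the product of the main diagonal of R."""
    _, r = qr_gram_schmidt(a)
    return math.prod(r[i][i] for i in range(len(a)))
```

For `[[2.0, 1.0], [1.0, 3.0]]` the diagonal of R is (sqrt(5), sqrt(5)), so `abs_det` returns 5 up to rounding, matching 2*3 - 1*1.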
Nino Risteski
a1a146a499
adding enable_gqa in SDPA ( #11097 )
...
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-07-06 23:25:33 -07:00
chenyu
7468959f4b
Tensor.argsort ( #11112 )
2025-07-06 13:56:35 -04:00
kevvz
b7af9cf849
clean svd tests, set full_matrices false in torch backend ( #11113 )
...
* clean tests, set full_matrices false
* add more shape asserts
2025-07-06 13:55:49 -04:00
chenyu
ba88ec3ad0
pipe linalg svd to torch ( #11109 )
...
and found a bug in svd
2025-07-06 08:37:25 -04:00
chenyu
845a4d32bc
Tensor.diag ( #11108 )
...
also updated Tensor.eye to use it
2025-07-05 23:03:02 -04:00
ttomsa
4905af4ae0
remove invalid int div test ( #11106 )
...
* rm test
* also rm this
2025-07-05 18:57:55 -04:00
chenyu
a2f5a54458
move sparse_categorical_crossentropy to test_ops ( #11083 )
...
also flattened the tests
2025-07-03 21:40:54 -04:00
chenyu
678cabc6f2
use argfix in Tensor.stack ( #11077 )
...
works for multiple Tensor args or single tuple/list of Tensors, but not the mixed
2025-07-03 12:15:11 -04:00
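Note on the Tensor.stack change above: the behaviour described (varargs or a single tuple/list accepted, a mix of the two rejected) can be sketched with a small argument-normalizing helper. The function below is an illustrative stand-in named `argfix_sketch`, not a copy of tinygrad's helper, and the choice of ValueError for the mixed case is an assumption.

```python
def argfix_sketch(*args):
    """Accept f(a, b, c) or f([a, b, c]) and return a tuple in both cases."""
    if args and isinstance(args[0], (tuple, list)):
        if len(args) != 1:
            # Mixed form such as f([a, b], c) is ambiguous, so reject it.
            raise ValueError(f"cannot mix a sequence with extra arguments: {args!r}")
        return tuple(args[0])
    return args
```

Both `argfix_sketch(1, 2, 3)` and `argfix_sketch([1, 2, 3])` return `(1, 2, 3)`, while `argfix_sketch([1, 2], 3)` raises.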
Ahmed Harmouche
e992ed10dc
WebGPU on Windows ( #10890 )
...
* WebGPU on Windows
* Fix dawn-python install
* New test
* pydeps
* Minor fix
* Only install dawn-python on windows webgpu
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 08:38:45 -07:00
chenyu
126fcf4129
clean up AMD_LLVM in tests ( #11021 )
2025-06-28 22:45:47 -04:00
chenyu
49bba2f0a0
improve test_nll_loss ( #10986 )
...
build target and weight tensors outside so it tests backward too.
2025-06-26 02:46:55 -04:00
chenyu
0612acfc70
improve Tensor.cross_entropy ( #10985 )
...
separate when Y is prob vs indices and check shapes for indices. also fix higher dim cases
2025-06-26 01:39:48 -04:00
chenyu
18e264a449
Tensor.logsigmoid ( #10955 )
2025-06-24 11:16:14 -04:00
chenyu
35504c938e
torch.clip(x,y) -> x.clip(y) in test_ops ( #10954 )
...
* torch.clip(x,y) -> x.clip(y) in test_ops
* test_binary_crossentropy_logits_pos_weights
2025-06-24 10:22:19 -04:00
Fang-Pen Lin
86d458533f
Add pos_weight for binary_crossentropy_logits ( #10855 )
...
* Add pos_weight for binary_crossentropy_logits
* Remove debug code
* Code style
* Code style
* Rename
2025-06-24 09:42:37 -04:00
chenyu
2d9c61e39e
test more dims in test_logsumexp and test_logcumsumexp ( #10907 )
...
refactoring squeeze and unsqueeze is easy to get wrong
2025-06-20 21:42:18 -04:00
Nino Risteski
3771cc0f77
fix test logcumsumexp broken devectorize=0 ( #10880 )
...
* fix test logcumsumexp numerical
* lint
* Use dtypes.min instead of -1e4
2025-06-20 20:54:50 -04:00
George Hotz
a493eb396c
fix view add 0 ( #10840 )
2025-06-16 16:46:12 -07:00
chenyu
e5d5ae55f9
smaller inputs for test_sort and test_topk ( #10829 )
2025-06-16 00:21:15 -04:00
chenyu
7a6df0a161
remove .relu() call in several conv tests in test_ops ( #10807 )
...
* remove .relu() call in several conv tests in test_ops
testing negative parts double the effectiveness. keep the relu between two convs and the tests that explicitly test relu
* relax tol
2025-06-13 17:10:16 -04:00
George Hotz
81b9c04574
move high level stuff to unit tests [pr] ( #10708 )
...
* move high level stuff to unit tests [pr]
* process replay on unit tests
* fix pr, less compute
* set omp num threads
* set 200MB buffer size limit
* delete junk
* fix tests
* faster
* move test_indexing to unit
* faster
2025-06-08 14:05:56 -07:00