chenyu
4fe19eec72
Ops.TRUNC ( #11659 )
2025-08-13 18:40:48 -04:00
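The `Ops.TRUNC` op above is a rounding primitive; a minimal pure-Python sketch of truncation semantics (rounding toward zero, as opposed to floor's rounding toward negative infinity) — illustrative only, not tinygrad's implementation:

```python
import math

# Truncation rounds toward zero; floor rounds toward -inf.
# They agree for non-negative inputs and differ for negatives.
def trunc(x: float) -> float:
    return float(math.trunc(x))

print(trunc(2.7))   # 2.0
print(trunc(-2.7))  # -2.0 (floor would give -3.0)
```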
chenyu
0c97d6de1b
don't round pow output for int pow int ( #11625 )
...
also added atol=0 and big pows for the tests
2025-08-11 20:57:47 -04:00
chenyu
d623f6d850
support int Tensor pow to const non-negative int ( #11624 )
...
matches torch
2025-08-11 19:50:19 -04:00
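The two pow commits above hinge on integer pow being exact: an int base raised to a const non-negative int exponent can be computed by repeated squaring with no float rounding step. A sketch of that semantics (hypothetical helper, not tinygrad's kernel):

```python
# Exponentiation by squaring: exact integer result, so no rounding of the
# output is needed — the behavior the "don't round pow output" fix preserves.
def int_pow(base: int, exp: int) -> int:
    assert exp >= 0, "only non-negative exponents"
    result = 1
    while exp:
        if exp & 1:
            result *= base
        base *= base
        exp >>= 1
    return result

print(int_pow(3, 13))  # 1594323, exact even where a float round-trip loses precision
```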
chenyu
a67e0917c3
list indexing can normalize in python ( #11609 )
...
* list indexing can normalize in python
list index does not need to be normalized in tensor
* update those
2025-08-10 20:02:38 -04:00
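Normalizing a list index "in python" means resolving negatives before the tensor machinery ever sees them. A hypothetical sketch of that normalization (the helper name is illustrative, not from the codebase):

```python
# Map a possibly-negative Python index into [0, size), raising on out-of-range,
# so downstream tensor indexing only deals with non-negative indices.
def normalize_index(i: int, size: int) -> int:
    if not -size <= i < size:
        raise IndexError(f"index {i} out of range for size {size}")
    return i + size if i < 0 else i

print(normalize_index(-1, 4))  # 3
```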
chenyu
1181ec0cd2
few more tensor indexing test cases ( #11608 )
2025-08-10 18:56:42 -04:00
chenyu
dfb702ef33
fix sort for small dim ( #11601 )
...
* fix sort for small dim
* fixed test_sort_empty
2025-08-10 01:17:41 -04:00
chenyu
aa1a6f2132
support threshold in Tensor.softplus ( #11564 )
...
fix gradient for large input
2025-08-07 13:43:18 -04:00
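The threshold in softplus is a numerical-stability device: above it, softplus(x) is indistinguishable from x in floating point, and short-circuiting there keeps both the value and the gradient finite for large inputs. A scalar sketch of the formula, with the threshold default of 20 assumed from torch's convention:

```python
import math

# softplus(x) = log(1 + exp(x)); for x well above 0 this equals x to machine
# precision, and computing exp(x) directly would overflow for very large x.
def softplus(x: float, threshold: float = 20.0) -> float:
    return x if x > threshold else math.log1p(math.exp(x))

print(softplus(0.0))     # log(2)
print(softplus(1000.0))  # 1000.0 — naive exp(1000.0) would overflow
```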
chenyu
dbc7807c61
enable WEBGPU tests with buffer limit ( #11489 )
...
TestSample still fails?
2025-08-03 13:02:44 -07:00
chenyu
2d7c28de6a
clean up dup lambdas in helper_test_exception ( #11325 )
2025-07-22 12:21:57 -04:00
chenyu
fb42c84365
merge TestRollEdgeCases into test_ops ( #11321 )
2025-07-22 10:55:57 -04:00
chenyu
1d8b3e9d1c
movementop only Tensor.roll ( #11317 )
...
* movementop only Tensor.roll
* fixed
2025-07-22 10:34:15 -04:00
chenyu
6e9506e6fd
Tensor.roll supports dims=None ( #11313 )
2025-07-21 17:29:23 -04:00
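With dims=None, roll follows the torch convention: flatten, rotate the flat sequence by the shift, restore the shape. A flat-list illustration of that semantics (not the Tensor implementation):

```python
# Rotate a flat sequence right by `shift`, wrapping around — the dims=None case
# of roll, shown on a plain list.
def roll_flat(xs: list, shift: int) -> list:
    n = len(xs)
    shift %= n
    return xs[-shift:] + xs[:-shift] if shift else list(xs)

print(roll_flat([1, 2, 3, 4, 5], 2))  # [4, 5, 1, 2, 3]
```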
chenyu
d3a93185a6
clean up test_roll ( #11312 )
2025-07-21 16:00:50 -04:00
chenyu
341a686799
Tensor.diagonal ( #11122 )
...
only implemented main diagonal for 2-D tensors. with diagonal and qr, we can get determinant
2025-07-07 16:21:26 -04:00
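The implemented case — main diagonal of a 2-D tensor — has simple semantics, sketched here in pure Python:

```python
# Main diagonal of a (possibly non-square) 2-D matrix: elements m[i][i] up to
# the smaller dimension. Illustration of the semantics only.
def diagonal(m: list[list[float]]) -> list[float]:
    return [m[i][i] for i in range(min(len(m), len(m[0])))]

print(diagonal([[1, 2], [3, 4], [5, 6]]))  # [1, 4]
```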
Nino Risteski
a1a146a499
adding enable_gqa in SDPA ( #11097 )
...
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-07-06 23:25:33 -07:00
chenyu
7468959f4b
Tensor.argsort ( #11112 )
2025-07-06 13:56:35 -04:00
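argsort returns the permutation of indices that would sort the values. A one-line semantic sketch using Python's `sorted`, not tinygrad's device implementation:

```python
# Indices that sort xs ascending: position of the smallest value first.
def argsort(xs: list) -> list[int]:
    return sorted(range(len(xs)), key=xs.__getitem__)

print(argsort([30, 10, 20]))  # [1, 2, 0]
```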
kevvz
b7af9cf849
clean svd tests, set full_matrices false in torch backend ( #11113 )
...
* clean tests, set full_matrices false
* add more shape asserts
2025-07-06 13:55:49 -04:00
chenyu
ba88ec3ad0
pipe linalg svd to torch ( #11109 )
...
and found a bug in svd
2025-07-06 08:37:25 -04:00
chenyu
845a4d32bc
Tensor.diag ( #11108 )
...
also updated Tensor.eye to use it
2025-07-05 23:03:02 -04:00
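The eye-via-diag refactor falls out of the semantics: diag places a vector on the main diagonal of an otherwise-zero square matrix, so eye(n) is just diag of a ones vector. A pure-Python sketch:

```python
# diag: vector -> square matrix with v on the main diagonal, zeros elsewhere.
def diag(v: list[float]) -> list[list[float]]:
    n = len(v)
    return [[v[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

# eye expressed through diag, mirroring the refactor the commit describes.
def eye(n: int) -> list[list[float]]:
    return diag([1.0] * n)

print(eye(2))  # [[1.0, 0.0], [0.0, 1.0]]
```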
ttomsa
4905af4ae0
remove invalid int div test ( #11106 )
...
* rm test
* also rm this
2025-07-05 18:57:55 -04:00
chenyu
a2f5a54458
move sparse_categorical_crossentropy to test_ops ( #11083 )
...
also flattened the tests
2025-07-03 21:40:54 -04:00
chenyu
678cabc6f2
use argfix in Tensor.stack ( #11077 )
...
works for multiple Tensor args or a single tuple/list of Tensors, but not a mix of the two
2025-07-03 12:15:11 -04:00
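The argfix convention the commit relies on can be sketched as follows — accept either varargs or a single tuple/list, and reject a mix of the two (hypothetical simplified version, not tinygrad's actual argfix):

```python
# Normalize call styles f(a, b, c) and f([a, b, c]) to one tuple; a mixed call
# like f([a, b], c) is ambiguous and rejected.
def argfix(*args):
    if len(args) == 1 and isinstance(args[0], (tuple, list)):
        return tuple(args[0])
    assert not any(isinstance(a, (tuple, list)) for a in args), "mixed args not supported"
    return args

print(argfix(1, 2, 3))    # (1, 2, 3)
print(argfix([1, 2, 3]))  # (1, 2, 3)
```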
Ahmed Harmouche
e992ed10dc
WebGPU on Windows ( #10890 )
...
* WebGPU on Windows
* Fix dawn-python install
* New test
* pydeps
* Minor fix
* Only install dawn-python on windows webgpu
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-07-02 08:38:45 -07:00
chenyu
126fcf4129
clean up AMD_LLVM in tests ( #11021 )
2025-06-28 22:45:47 -04:00
chenyu
49bba2f0a0
improve test_nll_loss ( #10986 )
...
build target and weight tensors outside so it tests backward too.
2025-06-26 02:46:55 -04:00
chenyu
0612acfc70
improve Tensor.cross_entropy ( #10985 )
...
separate when Y is prob vs indices and check shapes for indices. also fix higher dim cases
2025-06-26 01:39:48 -04:00
chenyu
18e264a449
Tensor.logsigmoid ( #10955 )
2025-06-24 11:16:14 -04:00
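logsigmoid has a standard numerically stable form, log(sigmoid(x)) = min(x, 0) - log1p(exp(-|x|)), which avoids overflow at both extremes. A scalar sketch of that formula (not the Tensor method's actual lowering):

```python
import math

# Stable log-sigmoid: never exponentiates a large positive argument.
def logsigmoid(x: float) -> float:
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))

print(logsigmoid(0.0))      # -log(2)
print(logsigmoid(-1000.0))  # -1000.0, no overflow
```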
chenyu
35504c938e
torch.clip(x,y) -> x.clip(y) in test_ops ( #10954 )
...
* torch.clip(x,y) -> x.clip(y) in test_ops
* test_binary_crossentropy_logits_pos_weights
2025-06-24 10:22:19 -04:00
Fang-Pen Lin
86d458533f
Add pos_weight for binary_crossentropy_logits ( #10855 )
...
* Add pos_weight for binary_crossentropy_logits
* Remove debug code
* Code style
* Code style
* Rename
2025-06-24 09:42:37 -04:00
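Assuming torch's pos_weight convention for BCE-with-logits (the positive term scaled by pos_weight), the per-element loss is -(pos_weight * y * log σ(x) + (1 - y) * log(1 - σ(x))). A scalar sketch of that formula; the naive sigmoid here is for clarity, not numerical robustness:

```python
import math

# Weighted binary cross-entropy on a logit x against target y in {0, 1}.
def bce_logits(x: float, y: float, pos_weight: float = 1.0) -> float:
    p = 1.0 / (1.0 + math.exp(-x))  # sigmoid
    return -(pos_weight * y * math.log(p) + (1.0 - y) * math.log(1.0 - p))

print(bce_logits(0.0, 1.0, pos_weight=2.0))  # 2*log(2)
```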
chenyu
2d9c61e39e
test more dims in test_logsumexp and test_logcumsumexp ( #10907 )
...
refactoring squeeze and unsqueeze is easy to get wrong
2025-06-20 21:42:18 -04:00
Nino Risteski
3771cc0f77
fix test logcumsumexp broken with devectorize=0 ( #10880 )
...
* fix test logcumsumexp numerical
* lint
* Use dtypes.min instead of -1e4
2025-06-20 20:54:50 -04:00
George Hotz
a493eb396c
fix view add 0 ( #10840 )
2025-06-16 16:46:12 -07:00
chenyu
e5d5ae55f9
smaller inputs for test_sort and test_topk ( #10829 )
2025-06-16 00:21:15 -04:00
chenyu
7a6df0a161
remove .relu() call in several conv tests in test_ops ( #10807 )
...
* remove .relu() call in several conv tests in test_ops
testing negative parts doubles the effectiveness. keep the relu between two convs and in the tests that explicitly test relu
* relax tol
2025-06-13 17:10:16 -04:00
George Hotz
81b9c04574
move high level stuff to unit tests [pr] ( #10708 )
...
* move high level stuff to unit tests [pr]
* process replay on unit tests
* fix pr, less compute
* set omp num threads
* set 200MB buffer size limit
* delete junk
* fix tests
* faster
* move test_indexing to unit
* faster
2025-06-08 14:05:56 -07:00
George Hotz
8c76250d31
speed up a few tests ( #10692 )
2025-06-07 20:39:25 -07:00
ihar
74b849b5e1
remove unnecessary 'argfix' because 'view' is an alias for 'reshape'. all functionality must be inside 'reshape' ( #10677 )
...
* remove unnecessary 'argfix' because 'view' is an alias for 'reshape'. all functionality must be inside 'reshape'
* added the same set of unit tests for 'view' as for 'reshape' since 'view' is just an alias for 'reshape'
* improved tests for 'view' op
2025-06-07 22:15:31 -04:00
chenyu
ff1aad7b69
fix const float pow to int tensor ( #10655 )
...
was incorrectly cast to int
2025-06-05 19:15:12 -04:00
geohotstan
602a145f8f
Add Tensor.unfold ( #10518 )
...
* yoinked 10272
* eitanturok's fixes
* hmmm should size be sint?
* add test
2025-05-26 11:15:44 -04:00
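unfold slides a window of a given size over a dimension at a given step and yields the windows. A flat-list illustration of the semantics (parameter names `size` and `step` assumed from the torch-style convention):

```python
# All windows of length `size` starting every `step` elements.
def unfold(xs: list, size: int, step: int) -> list[list]:
    return [xs[i:i + size] for i in range(0, len(xs) - size + 1, step)]

print(unfold([1, 2, 3, 4, 5], size=2, step=1))  # [[1, 2], [2, 3], [3, 4], [4, 5]]
```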
chenyu
7bfb20757c
fix tensor int floor div ( #10327 )
...
* fix tensor int floor div
* test_float_floordiv_scalar
2025-05-21 06:46:54 -04:00
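The subtlety behind int floor-div fixes: floor division rounds toward negative infinity (the Python/torch convention), while C-style integer division truncates toward zero, and the two disagree exactly when the operands have opposite signs and a nonzero remainder. A semantics sketch:

```python
# Floor division built from a truncated quotient: correct the truncation down by
# one when signs differ and the division is inexact.
def floordiv(a: int, b: int) -> int:
    q = int(a / b)  # truncates toward zero
    return q - 1 if (a % b != 0) and ((a < 0) != (b < 0)) else q

print(floordiv(-7, 2))  # -4 (truncation alone would give -3)
```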
chenyu
145e51247a
split CAST and BITCAST in PYTHON [pr] ( #10123 )
...
CAST only needs truncate and does not require dtype fmt. added bfloat16 tests that can run locally
2025-04-30 23:27:35 -04:00
George Hotz
11113c9d07
reduce_unparented ( #10056 )
2025-04-26 09:48:16 -04:00
George Hotz
78caf55154
Revert "FP8 support on NVIDIA ( #8631 )"
...
This reverts commit 2c8e4ea865.
2025-04-09 12:27:41 +08:00
George Hotz
d1505137ad
Revert "move TestOpsFp8s skipTest ( #9797 )"
...
This reverts commit a3aaf92b21.
2025-04-09 12:27:40 +08:00
chenyu
a3aaf92b21
move TestOpsFp8s skipTest ( #9797 )
...
so get_available_devices is not called when running other tests
2025-04-08 22:44:07 -04:00
pkotzbach
2c8e4ea865
FP8 support on NVIDIA ( #8631 )
...
* squashed fp8 commits
* tensorcore start
* minor changes
* pre-commit
* pylint
* Delete fp8mul.cu
* clean
* small bugfix
* fix test_dtype
* fix test_dtype_alu
* add EMULATE_CUDA_SM89
* fix ci
* fix test_linearizer
* fix test_linearizer
* fix swizzle
* add debug to simple_matmul
* fixed swizzle
* python emulator
* refactor python emulator
* setup fix
* numpy setup
* ml_dtypes only in emulate_cuda_sm89
* fix pylint
* fix tests
* fix mypy
* fix mypy
* fix ruff
* done python emulator
* add acc type
* tests
* mypy
* clean code
* add cuda tensor core tests to CI
* minor fix
* clean test_dtype.py
* clean cstyle.py
* clean test_ops.py
* fix test
* fix test
* whitespaces
* pylint
* pylint
* amd?
* amd?
* amd
* reduce lines
* mockgpu remove
* fix
* ruff
* ruff
* fix mypy
* ruff
* test only for cuda
* fixed formatting
* small fixes
* small fix
* least_upper_dtype if fp8s not supported
* log and reciprocal are supported for fp8s
* ops python fixes
* dtypes.fp8s use
* e4m3 + e5m2 result dtype test
* truncate linter fix
---------
Co-authored-by: pkotzbach <pawkotz@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-04-08 21:54:04 -04:00
chenyu
3b8d923692
remove skip LLVM in test_div_int ( #9686 )
2025-04-02 04:15:00 -04:00
chenyu
0e34f9082e
helper functions for cstyle div mod [pr] ( #9673 )
2025-04-01 08:06:56 -04:00
Yvon Manzi
6652003839
Add cumprod to Tensor ( #9629 )
...
* probably how cumprod should look
* update _cumalu to work with MUL
* shorter
* cumprod testing
* clean
* more cleanup
* add cumprod to torch backend.
* make it look like cumsum
* mypy fix
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-03-30 21:49:18 -04:00
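The "_cumalu with MUL" reuse described above reflects that cumprod is just cumsum with multiplication as the accumulator. The same idea in stdlib Python:

```python
import itertools
import operator

# Running product: each output element is the product of the prefix ending there.
xs = [1, 2, 3, 4]
print(list(itertools.accumulate(xs, operator.mul)))  # [1, 2, 6, 24]
```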
b1tg
f90001e1a6
amd llvm render (no_comgr prereq) ( #9543 )
...
* amd llvm render
* skip test_div_rounding_mode
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-03-24 22:50:51 +08:00