Commit Graph

1265 Commits

Shun Usami
34a05b31fe Fix advanced tensor indexing setitem (#12128)
* Add failure test case for advanced tensor indexing setitem

* Fix advanced tensor indexing setitem when permuted

* Reduce line count

* Revert unnecessary change

* Combine two lines into one
2025-09-14 15:22:40 -04:00
Sieds Lykles
2fc0bd150b Arange overflow raises error and one_hot upcast (#11975)
* add error

* to_dtype

* shorten line

* add test

* upcast one_hot dim if it overflows
2025-09-13 00:18:25 +02:00
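
A hedged sketch of the kind of bounds check this commit describes (the helper name and exact rule are assumptions, not tinygrad's actual code): raise instead of silently wrapping when arange's endpoints don't fit the target integer dtype.

```python
# hypothetical helper illustrating the check; not tinygrad's actual code
import math

def check_arange_overflow(start: int, stop: int, step: int, bits: int = 32) -> None:
  lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
  n = max(0, math.ceil((stop - start) / step))          # number of elements
  if n and not (lo <= start <= hi and lo <= start + (n - 1) * step <= hi):
    raise ValueError(f"arange({start}, {stop}, {step}) overflows int{bits}")

check_arange_overflow(0, 10, 1)       # fine
# check_arange_overflow(0, 2**40, 1)  # would raise ValueError
```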
b1tg
14faf7a5c0 AutoCastType tests for fp8s/bf16 (#12084) 2025-09-09 11:33:01 -04:00
nimlgen
9182948951 remove llvm_bf16_cast (#12075) 2025-09-08 20:51:15 +03:00
Sieds Lykles
f326df8ae8 add type: ignore (#12059) 2025-09-06 21:17:35 +02:00
Sieds Lykles
581b2388c2 add dtypes.index (#12015)
* add dtypes.index

* cast shape, stride and mask to dtypes.index in view.create

* move pm_lower_index_dtype to ops

* DEFINE_VAR is dtype.index by default

* merge var_val_using_str

* remove int from commutative

* fix test_rewrite_map

* change that to dtypes.index

* change some int to index

* shorten those

* remove old cast in renderer

* cleanup

* change that back

* add comment

* delete comment

* just delete those

* view doesnt have to cast anymore

* adjust comment
2025-09-06 06:03:44 +02:00
Sieds Lykles
c6c16b2946 var_vals uses str for var (#12011)
* var_vals is str,int

* remove imports

* remove print

* fix test

* change var_vals in hcq

* update test_hcq

* fix multitensor _device_num var

* fix syminfer test

* shorten line

* p.vars stays list[Variable]

* shorten line

* vars is back to tuple[Variable, ...]

* change var_vals in extra

* change var_vals from shapetracker

* var_vals is str:int

* fix signature
2025-09-06 04:16:12 +02:00
George Hotz
870f63d9cc add WARP axistype, fix postopt bugs (#12033)
* postopt is 83% match

* warp is bright CYAN

* beautiful mnist beam works

* fix shutdown bug
2025-09-05 10:36:55 -07:00
chenyu
d0e739453e update many einsum tests (#11981)
correct the exception testing, and raise ValueError instead of assert when checking args
2025-09-03 15:40:20 -04:00
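
A minimal sketch of why the distinction matters (the check itself is illustrative, not the actual einsum code): assert statements vanish under `python -O`, so user-facing argument validation should raise explicitly.

```python
# illustrative only; not the actual einsum validation
def validate_einsum_args(formula: str, n_operands: int) -> None:
  expected = formula.split("->")[0].count(",") + 1
  if expected != n_operands:
    raise ValueError(f"einsum {formula!r} expects {expected} operands, got {n_operands}")

validate_einsum_args("ij,jk->ik", 2)    # ok
# validate_einsum_args("ij,jk->ik", 3)  # raises ValueError even under python -O
```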
chenyu
561318fea7 Tensor.cos in test_stype_alu (#11916)
* Tensor.cos in test_stype_alu

* need this fix anyway
2025-08-29 20:26:36 -04:00
Ben Waldron
ea1be2e4cd [bounty] Remove using reshape to register symbolic shape (#11771)
* Modify tests and start work towards removing symbolic reshape

* Refactor symbolic reshape

* fix small error

* much cleaner + fix more tests

* Can remove this now

* Update test_symbolic_ops and test_tiny

* Couple more tests

* Unused import

* More tests and add EXPAND to Tensor.empty

* Fix test beam search

* all int

* Fix rangeify by adding shrink

* Remove OOB check and so fix test_symbolic_jit

* test_symbolic_jit doesn't need OOB Context anymore either

* Should remove that test now

* Cleanups part 1

* fix linters

* Final cleanups

* Don't reassign inside for loop

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
chenyu
beb5982165 FUSE_ATTENTION (#11884) 2025-08-27 19:59:17 -04:00
chenyu
337e979a59 call dtypes.as_const in Tensor(list) (#11840) 2025-08-25 22:08:26 -04:00
chenyu
7123df3928 Use Tensor.logaddexp to implement Tensor.softplus (#11796)
instead of a piecewise linear approximation, numerical stability is handled by logaddexp. jax does this and i think it's more elegant than torch's approach
2025-08-23 11:52:29 -04:00
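
A minimal sketch of the identity being used, in plain Python rather than tinygrad's actual Tensor code: softplus(x) = log(1 + e^x) = logaddexp(x, 0), and the max-shift inside logaddexp is what provides the numerical stability.

```python
import math

def logaddexp(a: float, b: float) -> float:
  m = max(a, b)                                   # shift by the max so exp never overflows
  return m + math.log(math.exp(a - m) + math.exp(b - m))

def softplus(x: float, beta: float = 1.0) -> float:
  return logaddexp(beta * x, 0.0) / beta          # softplus(x) = logaddexp(x, 0)

print(softplus(700.0))   # ~700.0, no overflow
print(softplus(-700.0))  # ~0.0, no nan
```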
chenyu
fb8ee02424 Tensor.logaddexp (#11793) 2025-08-23 09:15:00 -04:00
chenyu
e39b25cd36 upcast float exp to at least float32 (#11758)
* upcast float exp to at least float32

* unlucky seed
2025-08-22 20:16:34 -04:00
geohotstan
1e679bd789 fix max_unpool2d inf (#11784)
* start

* add regression test for maxunpool2d
2025-08-22 08:31:24 -04:00
George Hotz
bb8de51e5f remove unused early cleanups + contig w range [pr] (#11780)
* remove unused early cleanups [pr]

* contiguous with range

* woah, this works
2025-08-21 20:04:45 -07:00
chenyu
91a4de4ca7 fix getitem with inf in tensor (#11781) 2025-08-21 21:55:32 -04:00
chenyu
5276fbc9c5 fix gather with inf values (#11760)
(mask * x) is wrong because 0*inf is nan. i feel we have a lot of those still...
2025-08-20 20:35:40 -04:00
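
A minimal sketch of the failure mode in plain Python (tinygrad's actual fix lives in the gather lowering): one-hot masking by multiplication poisons the sum when the input holds inf, while a where-style select does not.

```python
inf = float("inf")
x = [1.0, inf, 3.0]
mask = [0.0, 0.0, 1.0]                                # one-hot: select index 2

print(sum(m * v for m, v in zip(mask, x)))            # nan, because 0.0 * inf is nan
print(sum(v if m else 0.0 for m, v in zip(mask, x)))  # 3.0, select never multiplies by inf
```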
George Hotz
9635592141 ** rangeify, try 3 (#11683)
* ** rangeify, try 3

* bring that over

* bufferize, don't use contig tag

* work

* ish

* fix rangeify

* flash attention is back

* fix rangeify tests

* stuff passes

* fix test_log_softmax

* more stuff passes

* progress children

* new endrange solution

* progress

* progress counter

* basic assign

* contigs only

* symbolic in schedule

* unbind_kernel

* late children

* ops fixed

* beautiful mnist is close

* that seems to work

* mnist works

* improve names

* fix bmnist

* no pcontig

* testing backward

* work

* clone movement ops

* new_range helper

* MBLOCK/MERGE

* ops tests pass

* revert mblock stuff

* cleanups...but it breaks ops

* remove reindex

* hack for relu

* disable the hacks

* more hacks

* upd

* mostly works with cleanups disabled

* ndr

* ops tests pass

* terrible hacks for indexing to work

* context mismatch

* pcontig

* split pcontig v contig

* z3 trunc

* null

* no fuse in rangeify

* ops test passes

* lnorm

* fix assign

* nd rangeify

* both should work

* tests for rangeify

* cleanups

* stores pass the pointer through

* disable pcontig for now

* PARTIAL_CONTIG is a flag
2025-08-20 14:22:44 -07:00
chenyu
5f08a3e928 hotfix: cast half to float in Tensor.tolist (#11755)
workaround for python < 3.12
2025-08-20 12:18:35 -04:00
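
For context (a sketch, not the tinygrad code): CPython's memoryview only gained the half-float "e" format in 3.12, so on older interpreters half values have to be widened to float before going through memoryview.

```python
import struct, sys

buf = struct.pack("<2e", 1.5, -0.25)         # two packed float16 values
if sys.version_info >= (3, 12):
  print(memoryview(buf).cast("e").tolist())  # works: [1.5, -0.25]
else:
  print(list(struct.unpack("<2e", buf)))     # pre-3.12 fallback: unpack to python floats
```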
chenyu
02353588cb small getitem cleanup (#11730) 2025-08-19 12:25:58 -04:00
chenyu
712a5c651a minor Tensor.triu cleanup (#11728)
less confusing dtype
2025-08-19 08:07:38 -04:00
George Hotz
4b3fcb4064 Revert "REDUCE_AXIS keepdim=False (#11311)" (#11718)
This reverts commit b518a7378a.
2025-08-18 13:28:53 -07:00
b1tg
b518a7378a REDUCE_AXIS keepdim=False (#11311)
* progress

* fix tests

* fix tests

* remove hack for test_symfold

* fix test_conv.py  on llvm

* hack test_cache_speed

* lint

* remove hack for helper_linearizer_opt

* tests

* fix DSP

* clean up

* remove hack for kernelize.py

* hack for test/test_multitensor.py TestMultiTensor.test_matmul_shard_none

* clean

* uop.r need reshape?

* lower_store cause fail

* fix lower?

* avoid contiguous hack

* 2134

* conv2d count

* remove unused

* hack lower

* reduced and clean up

* fix TestMultiTensor.test_matmul_shard_none

* src sync + fix TestMultiTensor.test_matmul_shard_none

* remove excluded in mop

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-08-18 10:09:17 -07:00
chenyu
c30a113b2a support bf16 and fp8 in Tensor.tolist (#11704)
memoryview does not support them, but casting works fine, so we cast first
2025-08-17 15:11:13 -04:00
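
A minimal sketch of why a cast is needed at all: struct/memoryview have no format codes for bfloat16 or the fp8 types, but widening is cheap, e.g. a bf16 bit pattern becomes its float32 value by shifting into the high 16 bits.

```python
import struct

def bf16_bits_to_float(bits: int) -> float:
  # bf16 is the top 16 bits of a float32, so widen by shifting left
  return struct.unpack("<f", struct.pack("<I", bits << 16))[0]

print(bf16_bits_to_float(0x3FC0))  # 1.5
print(bf16_bits_to_float(0xC000))  # -2.0
```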
George Hotz
9366a23eb0 test backward in test_tiny (#11697)
* test backward in test_tiny

* empty
2025-08-16 20:29:39 -07:00
chenyu
4fe19eec72 Ops.TRUNC (#11659) 2025-08-13 18:40:48 -04:00
George Hotz
22bdf48cdd render ranges in viz, name gbufs with sizes. changes from rangeify (#11656)
* render ranges in viz, name gbufs with sizes. changes from rangeify

* fix unit test dtypes
2025-08-13 12:46:16 -07:00
kevvz
e2873a3a41 [bounty] Muon optim (#11414)
* newton schulz

* add muon + move newton schulz to tensor

* compact newton schulz

* better tests

* cleanup

* add comments for muon

* cleanup

* add export with tests

* match muon optim with test optim

* cleanup

* unused import

* correct comment

* whitespace

* move export

* muon test fix

* match reference impl + tests

* remove export by moving muon device

* add credit

* cleanup

* remove print

* spacing

* spacing

* comma

* cleanup

* removal

* fix tests + optim momentum

* consistent is not/ not

* more consistency

* fix test

* cleanup

* fix the nones

* remove comment

* cast

* comment

* comment

* muon teeny test

* muon flag beautiful mnist

* set steps

* steps as hyperparam

* match default test steps

* name

* large cleanup

* dont care about steps

* nesterov false default

* match each other impl

* steps

* switch nest

* swap defaults

* update docstring

* add no nesterov test

* ban fuse_optim

* prints

* classical momentum

* alternative condition

* recon

* pre + post wd

* false default

* detach

* signature changes

* context

* swap order

* big cleanup

* 0 step instead

* parity

* remove fuse

* remove fused

* better paper

* assert message

* correct shape check + eps

* multidim

* add eps

* cleanup

* correct assert message

* lint

* better tests

* naming

* ns_steps,ns_params

* update docstring

* docstring

* match sgd and muon together

* sandwich

* add back fused

* parity

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-13 14:27:55 -04:00
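
For reference, a hedged sketch of the Newton-Schulz orthogonalization at the heart of Muon, with the coefficients from Keller Jordan's reference implementation (not necessarily tinygrad's exact code):

```python
import numpy as np

def newton_schulz(G: np.ndarray, steps: int = 5) -> np.ndarray:
  a, b, c = 3.4445, -4.7750, 2.0315       # tuned quintic coefficients
  X = G / (np.linalg.norm(G) + 1e-7)      # normalize so singular values are <= 1
  transposed = X.shape[0] > X.shape[1]
  if transposed: X = X.T                  # iterate on the wide orientation
  for _ in range(steps):
    A = X @ X.T
    X = a * X + (b * A + c * A @ A) @ X   # push singular values toward 1
  return X.T if transposed else X

O = newton_schulz(np.random.randn(4, 6))
print(np.linalg.svd(O, compute_uv=False))  # all roughly 1: G is near-orthogonalized
```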
chenyu
94e6d84e32 rewrite Tensor.round to not use cast int (#11654) 2025-08-13 13:51:08 -04:00
chenyu
0d7075f2de assign should broadcast input tensor (#11629)
fixed test_assign_broadcast
2025-08-11 23:36:35 -04:00
chenyu
0c97d6de1b don't round pow output for int pow int (#11625)
also added atol=0 and big pows for the tests
2025-08-11 20:57:47 -04:00
chenyu
d623f6d850 support int Tensor pow to const non-negative int (#11624)
matches torch
2025-08-11 19:50:19 -04:00
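
A hedged sketch of one standard way to evaluate an int base raised to a constant non-negative int exponent without going through float pow, here exponentiation by squaring on plain ints (not necessarily the rewrite tinygrad uses):

```python
def int_pow(base: int, exp: int) -> int:
  if exp < 0: raise ValueError("only non-negative exponents, matching the commit")
  out = 1
  while exp:
    if exp & 1: out *= base   # multiply in this bit's power of the base
    base *= base
    exp >>= 1
  return out

print(int_pow(3, 13))  # 1594323, exact with no rounding step
print(int_pow(7, 0))   # 1
```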
chenyu
0806677b51 rewrite sort idx (#11613) 2025-08-11 16:20:56 -04:00
George Hotz
700c11597b switch contextvars.ContextVar to _ContextVar (#11621) 2025-08-11 12:20:09 -07:00
chenyu
a67e0917c3 list indexing can normalize in python (#11609)
* list indexing can normalize in python

list index does not need to be normalized in tensor

* update those
2025-08-10 20:02:38 -04:00
chenyu
f7aa1b85fe minor sort cleanups (#11602) 2025-08-10 01:51:23 -04:00
chenyu
dfb702ef33 fix sort for small dim (#11601)
* fix sort for small dim

* fixed test_sort_empty
2025-08-10 01:17:41 -04:00
chenyu
aa1a6f2132 support threshold in Tensor.softplus (#11564)
fix gradient for large input
2025-08-07 13:43:18 -04:00
b1tg
8b8bd6c534 make einsum generate same kernels (#11508)
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-05 11:12:52 -04:00
chenyu
8a11af01ed remove broken paperswithcode links in doc (#11497) 2025-08-04 13:12:33 -04:00
chenyu
823f1a01db move cast around expand backward to tensor.py (#11483) 2025-08-02 23:03:54 -04:00
chenyu
66be747908 few more dtype cast convenience methods (#11480) 2025-08-02 15:47:09 -04:00
kevvz
ef7e01cadf Fix SVD shape bug + Fix batched SVD bug (#11477)
* failing test case

* fix

* better test

* space
2025-08-02 09:47:41 -07:00
wozeparrot
24dd0d52ed feat: test remove to cpu (#11444) 2025-07-30 20:18:56 -07:00
chenyu
88c338bfcc add kernelize to keccak for each data block (#11370)
* add kernelize to keccak for each data block

test_long works now. this prevents internal uops from growing proportionally to data length and eventually getting too deep

* this?

* hash stuff

* gate test

* mv
2025-07-25 16:07:20 -04:00
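
A hedged sketch of the pattern (the loop body is a stand-in, not the actual keccak rounds): kernelize the state once per absorbed block so the pending uop graph stays bounded instead of growing with the input length.

```python
from tinygrad import Tensor, dtypes

state = Tensor.zeros(25, dtype=dtypes.uint64)     # sponge state placeholder
for block_idx in range(4):                        # one permutation per data block
  state = (state + (block_idx + 1)).contiguous()  # stand-in for absorb + keccak-f
  state = state.kernelize()                       # cut the graph here each iteration
print(state.tolist())
```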
chenyu
cc795c6656 simplify keccak pad mask code (#11362) 2025-07-24 19:24:10 -04:00
chenyu
c0c4bc9d7c use int32 for keccak reorder_indexes (#11360)
it's used for tensor indexing, so int32 instead of uint64 is slightly faster
2025-07-24 15:54:50 -04:00