Commit Graph

12781 Commits

nimlgen
0d6fc0f571 jit: graphing in uops (#15489)
* jit: graphing as rewrite rule
* f
* +metal,cuda
* x
* cl
* x
* x
* simpler
* f
* m
* x
* revert?
* revert2
* back
* back
* t
* x
* m
* x
* c
* x
* l
* x
* comment
* smaller
* rv
* x
* x
2026-03-27 19:09:02 +03:00
chenyu
30ebbe7f17 few more fold valid tests (#15509)
from the attempt to remove CORRECT_DIVMOD_FOLDING
2026-03-27 10:38:42 -04:00
Christopher Milan
9e0cc5c6ae create image buffers in late codegen (#15493) 2026-03-27 04:50:53 -04:00
chenyu
1198d6e908 move pow to mixin (#15507) 2026-03-27 03:16:40 -04:00
chenyu
323fcefd7d Revert "DEV is a ContextVar (#15505)" (#15506)
This reverts commit fdb30cba96.
2026-03-27 02:22:40 -04:00
Christopher Milan
fdb30cba96 DEV is a ContextVar (#15505) 2026-03-27 00:57:09 -04:00
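The "DEV is a ContextVar" change (since reverted) can be sketched with the stdlib `contextvars` module. This is a hypothetical illustration of the pattern, not tinygrad's actual code (tinygrad has its own ContextVar helper):

```python
from contextvars import ContextVar

# Hypothetical sketch: the selected device as a context variable with a
# default, so it can be overridden and restored within a scope.
DEV = ContextVar("DEV", default="CPU")

def current_device() -> str:
    return DEV.get()

token = DEV.set("METAL")   # temporarily override the device
assert current_device() == "METAL"
DEV.reset(token)           # restore the previous value
assert current_device() == "CPU"
```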
wozeparrot
a65e958be9 llama: new apply_grad (#15503) 2026-03-26 19:39:25 -07:00
Christopher Milan
67a50fb738 move where on load with casts (#15492) 2026-03-26 22:11:27 -04:00
qazal
586c49642f viz/cli: test in CI (#15501)
* viz cli work
* baseline test
* make cli test work without subprocess
* more checks
* check itrace
* s/return/return None
* change
* minimal
* colored
2026-03-27 06:47:15 +09:00
qazal
3f9f0fa846 viz: yield sqtt alt events (#15500)
* yield other
* less
* work
* less
2026-03-27 04:43:41 +09:00
qazal
237c25031f sqtt: construct OTHER_SIMD op types with for loop (#15495)
* other-lds from amd_copy_matmul
* more other
* other simd work
2026-03-26 23:07:18 +09:00
nimlgen
7193f90746 test view input in jit (#15497)
* will anything fail?
* add test
2026-03-26 16:59:47 +03:00
nimlgen
de24b3fe37 jit: pass init params straight to base (#15496)
* jit: pass init params straight to base
* linter
2026-03-26 16:59:10 +03:00
qazal
ec5b7a249e viz: refactor sqtt timeline builder (#15494)
* viz: refactor sqtt timeline builder
* barrier maps to waves
* clean up cli
2026-03-26 21:16:15 +09:00
Christopher Milan
313937ad6d fix IMAGE TestEnd2End.test_linear_mnist (#15488) 2026-03-26 04:12:47 -04:00
Christopher Milan
bc180a963c deprecate <dev>=1 in favor of DEV=<dev> (#15467)
* start work on target
* add test
* update actions to use DEV
* update docs
* update readmes
* tests need that too
* update example
* update tests (comments)
* fix that test
* ruff
* mypy
* oops
* remove getenvs
* don't add Target yet
* and the test
* lint
* and docs
* more stuff
* assert
* few more fixes
* test assert
2026-03-26 03:48:03 -04:00
chenyu
8426f820a1 Tensor.sub to mixin (#15486)
also _broadcasted now skips shape broadcasting when the operand has no shape
2026-03-25 23:20:56 -04:00
wozeparrot
1ca178f379 llama: stochastic rounding (#15456) 2026-03-25 18:16:31 -07:00
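Stochastic rounding, named in the llama commit above, is the general technique of rounding down or up at random so the result is unbiased in expectation. A minimal sketch of the idea (not the commit's implementation, which operates on tensors):

```python
import math
import random

# Round x down with probability 1 - frac(x) and up with probability
# frac(x); over many samples the mean converges to x itself.
def stochastic_round(x: float, rng: random.Random) -> int:
    lo = math.floor(x)
    return lo + (1 if rng.random() < (x - lo) else 0)

rng = random.Random(0)
samples = [stochastic_round(2.25, rng) for _ in range(10_000)]
print(sum(samples) / len(samples))  # close to 2.25 on average
```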
chenyu
7c8f992894 move EXPAND dtype cast back to gradient.py (#15481)
only a concern for gradient, not mixin
2026-03-25 19:25:26 -04:00
nimlgen
9d2d0774b4 remote: disk copies (#15482)
* remote: disk copies
* linter
* r
* nv
* x
2026-03-25 22:14:25 +03:00
qazal
7c2c8d3905 viz: small ux improvements (#15483)
* test
* better
* work
2026-03-26 03:18:25 +09:00
qazal
737d5f67f9 viz: compute canvas dims for auto zoom (#15474) 2026-03-26 00:05:23 +09:00
qazal
60bd546593 sqtt: add cycle count to rdna3 enums (#15473)
* update rdna3 sqtt enums to include cycle_count
* dispatch_to_exec
2026-03-25 23:19:54 +09:00
chenyu
142bf11926 logical_not to mixin [pr] (#15472)
also UPat.cast skips casts to the same dtype
2026-03-25 09:16:45 -04:00
George Hotz
25ff7146f2 add a status line to REMOTE with DEBUG=1 (#15471)
* python speedups of hot paths
* add a status line to REMOTE with DEBUG=1
* pc
* t
2026-03-25 20:54:56 +08:00
qazal
c973b508b8 viz/cli: pass ctrlc (#15470) 2026-03-25 21:13:28 +09:00
George Hotz
c1a7d90ccc python speedups of hot paths (#15469) 2026-03-25 20:02:42 +08:00
George Hotz
ae7090b13b print function timing with DEBUG=2 (#15468)
* add DEBUG=2 function timing
* remove those functions, they aren't useful
* fix spec
2026-03-25 19:07:32 +08:00
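The DEBUG=2 function-timing commit above can be illustrated with a decorator gated on an env var. This is a hedged sketch in the spirit of the change; tinygrad's actual mechanism may differ:

```python
import functools
import os
import time

# Timing is only printed when DEBUG reaches 2, matching the commit title.
DEBUG = int(os.getenv("DEBUG", "0"))

def timed(fn):
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        if DEBUG >= 2:  # report elapsed wall time in milliseconds
            print(f"{fn.__name__}: {(time.perf_counter() - start) * 1e3:.2f} ms")
        return result
    return wrapper

@timed
def work(n: int) -> int:
    return sum(range(n))

print(work(10))  # 45
```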
Christopher Milan
e7f389efda fix height=1 images on macos (#15460) 2026-03-25 05:59:56 -04:00
George Hotz
789628df2e hotfix: add USE_BOT flag to ASM24 USB 2026-03-25 15:00:08 +08:00
George Hotz
cd1a276f47 llm: support gguf path or url (#15464)
* llm: support gguf path or url
* one line
2026-03-25 14:43:19 +08:00
chenyu
713b322e70 add weakint to promo_lattice (#15463)
it sits between bool and the smallest int
2026-03-25 00:27:34 -04:00
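A promotion lattice like the one the weakint commit above extends can be sketched as a rank table: promoting two dtypes yields the higher-ranked one, and weakint ranks between bool and the smallest int. The surrounding entries and the `promote` rule here are illustrative assumptions, not tinygrad's actual lattice:

```python
# Hypothetical rank ordering; higher rank wins on promotion.
PROMO_RANK = {"bool": 0, "weakint": 1, "int8": 2, "int32": 3, "float32": 4}

def promote(a: str, b: str) -> str:
    return a if PROMO_RANK[a] >= PROMO_RANK[b] else b

print(promote("bool", "weakint"))  # weakint
print(promote("weakint", "int8"))  # int8
```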
chenyu
02878c5a2f move _broadcasted to OpMixin (#15461)
it needs both ElementwiseMixin and MovementMixin
2026-03-24 23:56:01 -04:00
chenyu
519ba22470 more Tensor._broadcasted cleanup (#15459)
prep moving to mixin
2026-03-24 22:55:45 -04:00
George Hotz
fe2690399b llm: support assistant prefill + refactor to TransformerConfig (#15457)
* llm: support assistant prefill
* refactor to ModelConfig
* TransformerConfig
* more
2026-03-25 10:50:48 +08:00
Christopher Milan
fd92aec094 cleanup unused image pitch code (#15458) 2026-03-24 22:47:16 -04:00
chenyu
f6ed4da268 Tensor.ufix (#15452)
* Tensor.ufix (prep moving _broadcasted to mixin)
* remove backward_cast
2026-03-24 22:34:43 -04:00
qazal
1b3d00d6ac viz/cli: remove --offset and --limit flags (#15439)
* work
* also no more no-color
* reorder
* update llama
* sqtt readme
* itertools
* rm that
* signals back
2026-03-25 09:52:27 +09:00
wozeparrot
da2031266a llama: correct 8b init (#15397) 2026-03-24 13:41:41 -07:00
qazal
652bab8aad viz: support nested track_rewrites (#15454)
* simple test
* stack active groups
2026-03-25 05:01:30 +09:00
qazal
41eb2cc41b viz: preserve zoom between re-renders (#15451) 2026-03-25 03:11:10 +09:00
Salman Chishti
84049fdc07 Upgrade GitHub Actions to latest versions (#15446)
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-24 10:28:49 -04:00
Salman Chishti
9567075e20 Upgrade GitHub Actions for Node 24 compatibility (#15445)
Signed-off-by: Salman Muin Kayser Chishti <13schishti@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-24 10:28:19 -04:00
chenyu
b7960841af support shape broadcast in UOp.alu (#15442)
I think it can be integrated more tightly, but Tensor now also does ufix from UOp and implicit dtype upcast
2026-03-24 10:14:57 -04:00
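Shape broadcast, as referenced in the commit above, conventionally follows the trailing-dimension rule: shapes are aligned from the right, and each pair of dims must match or contain a 1. A hedged sketch of that rule (not tinygrad's UOp.alu implementation):

```python
from itertools import zip_longest

def broadcast_shape(a: tuple[int, ...], b: tuple[int, ...]) -> tuple[int, ...]:
    out = []
    # walk both shapes from the right, padding the shorter one with 1s
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(max(x, y))
    return tuple(reversed(out))

print(broadcast_shape((3, 1, 5), (4, 5)))  # (3, 4, 5)
```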
George Hotz
a33ac869aa llm server: temperature + test client (#15444)
* improvements to the llm server
* eval script
* eval llm
* better eval gets 58.71
* cleanups
* add temperature, but multinomial is absurdly slow
* claude is so smart
* lint
* remove slop
* no more stop
2026-03-24 21:07:15 +08:00
nimlgen
9db5d677c7 jit in viz (#15447) 2026-03-24 18:23:53 +08:00
Christopher Milan
2e4fbbcc9c ir3: fix texture mapping and benchmark (#15443) 2026-03-24 04:52:54 -04:00
Christopher Milan
d5320a9ddf QCOM cleanups (#15435) 2026-03-23 22:18:38 -04:00
George Hotz
85dee83f5d amd flash attention cleanups + emulator fixes (#15431)
* amd flash attention cleanups
* simpler
* params
* fix emulator bugs
* fix idiv bug
* remove that test
* more emu fixes
2026-03-24 10:10:46 +08:00
chenyu
018a9e2d3c remove match_dtype arg in Tensor._broadcasted (#15440)
reworked Tensor.where to not need it; also updated dtypes.from_py to use isinstance because of ConstFloat issues
2026-03-23 22:10:39 -04:00