tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Author	SHA1	Message	Date
George Hotz	ec00cefa5b	llm is the only app (#15779 ) * tinygrad/llm is the only app * upd pyproject * claude refs * scoping * min diff	2026-04-17 10:44:48 +08:00
Christopher Milan	9f4b7bed25	add pickled jit regression test (#15774 )	2026-04-16 16:59:09 -04:00
qazal	12c653a743	remove opts arg in get_program, everything uses opts_to_apply [pr] (#15767 ) * check Ops.BEAM in process replay * remove opts from the get_program api * lint * simplify * cleanup	2026-04-16 22:42:43 +03:00
chenyu	f0c12a2004	another form of assign to itself (#15770 )	2026-04-16 15:17:19 -04:00
b1tg	4e88d875ba	llm: glm 4.7 flash (#15738 ) * glm 4.7 * test * temperature, server enable_thinking * --no-think * remove think stuff	2026-04-16 22:42:04 +08:00
chenyu	d147e2a549	update test_nested_after_contiguous_store (#15763 ) add kernel counts and some TODOs	2026-04-16 09:59:26 -04:00
qazal	126cda45f8	viz/cli: cleanups, add memory printer (#15762 ) * simple repro * use context * work * memory printer * rm * memory printer * pylint	2026-04-16 22:44:47 +09:00
George Hotz	f57380cbc2	simplify GatedDeltaNetBlock using two state tensors (#15704 ) * test double after * simpler ssm * no double test	2026-04-16 21:14:00 +08:00
George Hotz	d1cce7a476	put the ranges on store instead of after (#15759 ) * put the ranges on store instead of after * better assert * fix stuff * comment out slow rules i don't understand * simpler rule * closer * return false for store * fix loop * only a few schedule failures remain * remove stores to self * all tests pass locally * remove junk * regression test and fix * better test, bump broken torch count * bugfix with regression test * new fusion is better	2026-04-16 19:06:40 +08:00
George Hotz	d24466c844	CALL with return value is FUNCTION (#15758 ) * CALL with return value is FUNCTION (GPT try) * cleanups	2026-04-16 13:25:07 +08:00
chenyu	218d6b8988	delete old UOp.size [pr] (#15756 )	2026-04-15 23:21:00 -04:00
Muzammil	983a7bb576	exclude __del__ from TRACEMETA wrapping (#15747 ) Session-Id: 019d9234-2531-75a0-a252-f0302cd9931f	2026-04-16 10:49:55 +08:00
chenyu	8bd4fead26	UOp.size -> prod(max_shape) (#15755 ) and more test updates	2026-04-15 22:41:30 -04:00
chenyu	10c262ced8	update tests that use UOp.size (#15753 )	2026-04-15 21:58:27 -04:00
qazal	96092d110c	fix process_replay Ops.BEAM [pr] (#15752 )	2026-04-16 07:35:28 +09:00
Christopher Milan	be8005c5dc	DEV: secondary targets (#15748 )	2026-04-15 17:26:20 -04:00
chenyu	507c02cecb	fix symbolic contiguous_view_offset (#15749 ) * fix symbolic contiguous_view_offset * flatten	2026-04-15 16:54:38 -04:00
nimlgen	164495678c	test_graph to use uops (#15746 ) * test_graph to use uops * x * n	2026-04-15 21:59:41 +03:00
Christopher Milan	1c36878008	DEV: suggest alternatives (#15732 )	2026-04-14 23:42:32 -04:00
George Hotz	1ae6528bb6	move schedule into schedule (#15736 ) * move schedule into schedule * callify to root * sched docs	2026-04-15 11:03:25 +08:00
chenyu	3394d18066	size*itemsize -> nbytes (#15729 ) and some UOp.size removal to prep for size to mixin change	2026-04-14 16:27:54 -04:00
George Hotz	2450c8cba8	rename to callify + fix mypy (#15727 ) * rename to callify + fix mypy * update test	2026-04-14 23:43:19 +08:00
George Hotz	359b1582d6	amd: EMU DPP support (#15719 ) * EMU DPP support from GPT 5.4 * cleanups * simple * nope * fix	2026-04-14 14:58:41 +08:00
wozeparrot	2b8d303f75	allreduce in precast dtype (#15689 )	2026-04-13 20:24:12 -07:00
George Hotz	5683126844	llm: support for tekken tokenizer (#15720 )	2026-04-14 10:52:07 +08:00
chenyu	70883a6950	cat the stack to mixin (#15715 )	2026-04-13 18:44:39 -04:00
qazal	905b8adc97	viz: cli and server cleanups (#15713 ) * update get_profile arg[0] * uop_to_json arg[0] * data is standalone in cli	2026-04-14 06:42:29 +09:00
Christopher Milan	d83707ec29	autogen: explicit types (#15679 )	2026-04-13 16:54:39 -04:00
chenyu	ac41f15fc1	cumsum to mixin (#15712 ) built on top of getitem	2026-04-13 15:06:08 -04:00
chenyu	931d6cc62a	basic getitem to mixin (#15697 ) * basic getitem to mixin * cleanup * fix * cleanup	2026-04-13 13:04:36 -04:00
George Hotz	7610bdc59e	block multistore, it's not supported (#15708 )	2026-04-13 20:57:59 +08:00
George Hotz	16f50a40a5	remove REMU from tree (#15706 ) * no more compare emulators * remove remu from tree	2026-04-13 20:43:08 +08:00
qazal	ac027055ef	viz: no global state (#15705 ) * start viz data * get_full_rewrites also moves * update ref_map * work * update consumers * cleaner cli * linter * cleanup tests * back * better * sqtt tests	2026-04-13 21:35:20 +09:00
George Hotz	4c1fb18a09	Revert "Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (…" (#15703 ) This reverts commit `0cec42db71`.	2026-04-13 19:09:38 +08:00
George Hotz	0cec42db71	Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700 )" (#15702 ) This reverts commit `6f5d756282`.	2026-04-13 19:06:44 +08:00
George Hotz	6f5d756282	Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700 ) * broken after/assign test * test for GatedDeltaNet * better comments * fix issue 1 with multi kernel * fix 2 * fix * linter * public api + cleanup	2026-04-13 18:43:23 +08:00
chenyu	f7ff480fa6	start mixin getitem tests (#15695 ) goal is to make Tensor[idx].uop equal to Tensor.uop[idx]	2026-04-12 18:54:33 -04:00
chenyu	e706f408cb	suppress test warnings from numpy (#15688 )	2026-04-11 22:33:20 -04:00
Graham Robbins	4ca844e96b	add Q1_0 gguf type (#15683 ) * add Q1_0 * better description * fix trailing whitespace --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2026-04-11 18:17:24 +08:00
wozeparrot	457508d5a0	llama: save more 2 (#15681 )	2026-04-11 01:03:36 -07:00
George Hotz	b5a9465b13	llm: add support for moonlight (deepseek MLA) (#15466 ) * add gguf Q5_0 * it works * rebase * simpler test * class * less diff * dicts * normal names * simplify * this * simpler * work * work	2026-04-11 10:32:48 +08:00
chenyu	8e7fcc8ca3	remove _include_initial in _cumalu (#15674 ) handle negative pad in caller	2026-04-10 08:33:30 -04:00
George Hotz	9092f2a8c0	llm: add shared_expert and rope_dim support from qwen35 (#15673 ) * llm: add shared_expert and rope_dim support from qwen35 * refactor into FFNBlock and TransformerBlock * norms where they belong	2026-04-10 19:18:27 +08:00
b1tg	9ab1415937	llm: fix streaming UTF-8 decode (#15653 )	2026-04-10 17:01:02 +08:00
Christopher Milan	dbc23e8a1b	move HCQ_VISIBLE_DEVICES into DEV (#15668 )	2026-04-09 22:01:35 -04:00
Christopher Milan	d08c76d9cb	c.Struct cleanup (#15640 )	2026-04-08 20:07:16 -04:00
chenyu	4cf2759fc8	fix merge_reduce_ends (#15659 ) * fix merge_reduce_ends same range with different nesting should not merge, like cumsum twice should not merge * skip that	2026-04-08 17:20:01 -04:00
qazal	71c83cc3f6	viz: put OTHER_ on the wave row (#15650 ) * viz: put OTHER_ on the wave row * update tests * cleanup cli	2026-04-08 23:13:44 +09:00
qazal	3ac16b3bea	viz: add wmma row, update exec duration logic (#15646 ) * viz: split wmma to its own row, fix duration logic * regs * decrease number of loops, add pickle * assert overlaps	2026-04-08 20:24:23 +09:00
George Hotz	35e3983840	Add Q5_0, Q5_1, and bfloat16 GGUF types (#15644 )	2026-04-08 17:16:19 +08:00

5447 Commits