Commit Graph

12959 Commits

chenyu
7cbfa1896a comment out unused arm, triton in toml (#15741)
fixed `PYTHONPATH=. uv run tinygrad/apps/llm.py`
2026-04-15 10:05:19 -04:00
Christopher Milan
1c36878008 DEV: suggest alternatives (#15732) 2026-04-14 23:42:32 -04:00
George Hotz
1ae6528bb6 move schedule into schedule (#15736)
* move schedule into schedule

* callify to root

* sched docs
2026-04-15 11:03:25 +08:00
wozeparrot
3721c60bef llama: bs 16 (#15737) 2026-04-14 19:52:03 -07:00
wozeparrot
480ad264a4 llama: per device amax (#15735) 2026-04-14 19:01:17 -07:00
Christopher Milan
adc96cd724 qcom: synchronize for copyin (#15731)
fixes: #15698
2026-04-14 18:31:15 -04:00
chenyu
3394d18066 size*itemsize -> nbytes (#15729)
and some UOp.size removal to prep for size to mixin change
2026-04-14 16:27:54 -04:00
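The rename above encodes a standard identity (byte count = element count × element size). A minimal sketch using numpy, assuming tinygrad's buffers follow the same convention:

```python
import numpy as np

# nbytes is the product of element count and per-element size,
# so storing/naming it directly avoids recomputing size * itemsize.
buf = np.zeros((4, 8), dtype=np.float32)
assert buf.nbytes == buf.size * buf.itemsize  # 32 elements * 4 bytes = 128
```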
nimlgen
e9ecc990ea amd: add r9700 devid (#15721) 2026-04-14 20:15:00 +03:00
George Hotz
2450c8cba8 rename to callify + fix mypy (#15727)
* rename to callify + fix mypy

* update test
2026-04-14 23:43:19 +08:00
chenyu
528faa18ec update env_vars.md (#15722)
remove HCQ_VISIBLE_DEVICES, IMAGE=2 and old DEBUG=3 stuff
2026-04-14 09:13:35 -04:00
George Hotz
359b1582d6 amd: EMU DPP support (#15719)
* EMU DPP support from GPT 5.4

* cleanups

* simple

* nope

* fix
2026-04-14 14:58:41 +08:00
wozeparrot
2b8d303f75 allreduce in precast dtype (#15689) 2026-04-13 20:24:12 -07:00
George Hotz
5683126844 llm: support for tekken tokenizer (#15720) 2026-04-14 10:52:07 +08:00
chenyu
70883a6950 cat the stack to mixin (#15715) 2026-04-13 18:44:39 -04:00
qazal
355e2729d3 viz: keep program UOp in data (#15714)
* refactor program uop access

* c.name
2026-04-14 07:04:16 +09:00
qazal
905b8adc97 viz: cli and server cleanups (#15713)
* update get_profile arg[0]

* uop_to_json arg[0]

* data is standalone in cli
2026-04-14 06:42:29 +09:00
Christopher Milan
d83707ec29 autogen: explicit types (#15679) 2026-04-13 16:54:39 -04:00
chenyu
ac41f15fc1 cumsum to mixin (#15712)
built on top of getitem
2026-04-13 15:06:08 -04:00
nimlgen
eac481b67f mlx: fix ctypes (#15711)
* mlx: fix ctypes

* x
2026-04-13 20:43:56 +03:00
nimlgen
b370f5c5ac hcq: call free for unmap (#15710) 2026-04-13 20:30:21 +03:00
chenyu
931d6cc62a basic getitem to mixin (#15697)
* basic getitem to mixin

* cleanup

* fix

* cleanup
2026-04-13 13:04:36 -04:00
George Hotz
7610bdc59e block multistore, it's not supported (#15708) 2026-04-13 20:57:59 +08:00
George Hotz
84d64b5835 hotfix: abstractions4 works in mock except asm 2026-04-13 20:57:00 +08:00
George Hotz
16f50a40a5 remove REMU from tree (#15706)
* no more compare emulators

* remove remu from tree
2026-04-13 20:43:08 +08:00
qazal
ac027055ef viz: no global state (#15705)
* start viz data

* get_full_rewrites also moves

* update ref_map

* work

* update consumers

* cleaner cli

* linter

* cleanup tests

* back

* better

* sqtt tests
2026-04-13 21:35:20 +09:00
George Hotz
4c1fb18a09 Revert "Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (…" (#15703)
This reverts commit 0cec42db71.
2026-04-13 19:09:38 +08:00
George Hotz
0cec42db71 Revert "Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)" (#15702)
This reverts commit 6f5d756282.
2026-04-13 19:06:44 +08:00
George Hotz
6f5d756282 Tests for GatedDeltaNetBlock + fix multi after assign issue (#15700)
* broken after/assign test

* test for GatedDeltaNet

* better comments

* fix issue 1 with multi kernel

* fix 2

* fix

* linter

* public api + cleanup
2026-04-13 18:43:23 +08:00
b1tg
2b5ba0095d qwen3.5 (#15210)
* qwen3.5

* faster

* or

* rm zero hack

* less float

* T=1

* clean

* clean

* 4b

* rope_dim

* Revert "jit: captures linears, not execitems (#15399)"

This reverts commit 9656d97d97.

* DeltaNetBlock

* pairwise_topk

* clean

* Reapply "jit: captures linears, not execitems (#15399)"

This reverts commit cf3deff53d.

* clean topk, _swiglu

* common

* FFNBlock

* clean

* half

* no mix

* qwen3.5 test

* fix ssm cache invalidation

* TransformerConfig

* SSMConfig

* clean

* reset_state

* llm: reuse server conversation tokens to avoid BPE roundtrip cache miss

* import error

* prefill

* none check

* put it back

* clean pairwise_topk

* symbolic: fold BIND(CONST, CONST) to CONST

* clean

* simpler pm

* _cached_msg_count

* stream decoder; ssm checkpoints

* rm checkpoint

* attn_output_gate

* conflict, attn_output_gate

* clean, less has_ssm, assert

* chunked prefill

* _reset_cache

* _reusable_prefix_len

* revert loop

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-04-13 15:35:24 +08:00
qazal
2ada38f777 viz: execv after all producers complete (#15696) 2026-04-13 08:15:47 +09:00
chenyu
f7ff480fa6 start mixin getitem tests (#15695)
goal is to make Tensor[idx].uop equal to Tensor.uop[idx]
2026-04-12 18:54:33 -04:00
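The stated goal — `Tensor[idx].uop` equal to `Tensor.uop[idx]` — is a commutativity invariant between indexing and the `.uop` accessor. A hypothetical toy sketch of that invariant (class names are illustrative, not tinygrad's real classes):

```python
# Toy stand-ins: indexing the wrapper then taking .uop should give
# the same result as taking .uop then indexing.
class UOpLike:
    def __init__(self, data): self.data = data
    def __getitem__(self, idx): return UOpLike(self.data[idx])

class TensorLike:
    def __init__(self, uop): self.uop = uop
    def __getitem__(self, idx): return TensorLike(self.uop[idx])

t = TensorLike(UOpLike([1, 2, 3]))
assert t[1].uop.data == t.uop[1].data  # both paths agree
```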
chenyu
77385ccb37 more trivial stuff to mixin (#15693) 2026-04-12 15:17:16 -04:00
chenyu
ff1de5ae13 normalize logsumexp contiguous_backward to mixin (#15692)
* normalize logsumexp contiguous_backward to mixin

* more
2026-04-12 13:13:00 -04:00
chenyu
0254cfe642 move usum and uprod to mixin (#15690)
and used it to clean up ops and tensor
2026-04-12 11:42:24 -04:00
nimlgen
e9b2e156b4 add jitbeam to tinygpu docs (#15691) 2026-04-12 18:20:26 +03:00
chenyu
e706f408cb suppress test warnings from numpy (#15688) 2026-04-11 22:33:20 -04:00
nimlgen
938cba4fdf amd: a bit faster usb, skip interrupts on sync (#15686) 2026-04-11 17:26:36 +03:00
qazal
054d78e6ff fix llama profile.sh NULL source (#15685) 2026-04-11 22:56:05 +09:00
Graham Robbins
4ca844e96b add Q1_0 gguf type (#15683)
* add Q1_0

* better description

* fix trailing whitespace

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-04-11 18:17:24 +08:00
George Hotz
5156a04cf5 add support for AM_POWER_LIMIT (#15684)
* add support for AM_POWER_LIMIT

* level None
2026-04-11 17:14:54 +08:00
wozeparrot
457508d5a0 llama: save more 2 (#15681) 2026-04-11 01:03:36 -07:00
George Hotz
29238b772f AMD USB: support for 0xF3 power toggle 2026-04-11 13:04:38 +08:00
George Hotz
b5a9465b13 llm: add support for moonlight (deepseek MLA) (#15466)
* add gguf Q5_0

* it works

* rebase

* simpler test

* class

* less diff

* dicts

* normal names

* simplify

* this

* simpler

* work

* work
2026-04-11 10:32:48 +08:00
wozeparrot
590464c8d8 llama: only support wqkv path + cleanups (#15680)
* llama: only support wqkv path + cleanups

* llama: missing transpose
2026-04-11 07:39:27 +08:00
nimlgen
aa012d6f08 usb: faster custom (#15678)
* usb: _f0_out_buf for e4 cmd as well

* custom speed

* fast
2026-04-10 23:00:31 +03:00
nimlgen
58646f9569 usb fast copyout (#15677)
* usb

* fix usb
2026-04-10 21:04:49 +03:00
qazal
0d5cdc9600 viz: split draw loop (#15676)
* split draw loop

* one draw

* no functions

* inline all highlights

* cleanup
2026-04-10 23:25:50 +09:00
chenyu
e1334d3852 move canonicalize_device to device.py (#15675) 2026-04-10 09:43:56 -04:00
chenyu
8e7fcc8ca3 remove _include_initial in _cumalu (#15674)
handle negative pad in caller
2026-04-10 08:33:30 -04:00
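Assuming `_include_initial` distinguished inclusive from exclusive scans (the usual meaning of that flag), the caller-side padding the commit mentions can be sketched generically in numpy:

```python
import numpy as np

x = np.array([1, 2, 3, 4])
inclusive = np.cumsum(x)                           # [1, 3, 6, 10]
# Exclusive scan via the caller: shift the inclusive result right,
# seeding with 0, instead of a special flag inside the scan itself.
exclusive = np.concatenate(([0], inclusive[:-1]))  # [0, 1, 3, 6]
assert (exclusive + x == inclusive).all()
```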
George Hotz
9092f2a8c0 llm: add shared_expert and rope_dim support from qwen35 (#15673)
* llm: add shared_expert and rope_dim support from qwen35

* refactor into FFNBlock and TransformerBlock

* norms where they belong
2026-04-10 19:18:27 +08:00