Commit Graph

5313 Commits

nimlgen
86eec01f97 limit gl*lc (#15359) 2026-03-19 12:38:55 +08:00
chenyu
b39816e998 failed test case for Tensor(np, "bf16") (#15358) 2026-03-18 23:40:14 -04:00
wozeparrot
c45a606750 feat: no if in rand (#15333) 2026-03-18 15:09:51 -07:00
qazal
709fc52d7b viz: fix auto zoom range in sqtt, include endpgm packet (#15349)
* viz: fix automatic zoom range in sqtt packets

* it's x+width

* include s_endpgm

* endpgm also doesn't have exec
2026-03-18 22:52:32 +09:00
nimlgen
d4836ddbb0 canonicalize device from tuple (#15348)
* will it fix ci?

* test

* um
2026-03-18 20:35:52 +08:00
George Hotz
5524916e39 llama compute gradients explicitly + 243 GB of RAM on MP=8 (#15343)
* llama compute gradients explicitly

* apply grads

* fix multi issue

* multi BUFFER_VIEW support

* simpler

* skip the flaky test
2026-03-18 19:54:40 +08:00
nimlgen
f853371c83 fix compilers autoselect (#15346) 2026-03-18 18:19:53 +08:00
chenyu
761ce8c0d3 fix Invalid combine rules (#15345)
* fix Invalid combine rules

wrong conditions broke setitem into invalids

* fix
2026-03-18 04:58:02 -04:00
chenyu
fceb21c315 Tensor(uop) uses device from uop (#15340) 2026-03-18 02:56:06 -04:00
George Hotz
6109117af1 anonymous buffers are Invalid (#15336)
* anonymous buffers are Invalid

* unique_const

* work

* remove invalid writes

* test_anonymous_buffers_in_function
2026-03-18 14:52:56 +08:00
nimlgen
d720d50e12 memory: traverse all valid ranges only (#15338)
* memory: traverse all valid ranges only

* x
2026-03-18 14:03:39 +08:00
chenyu
ac7a348d06 dtypes.as_const -> DType.const (#15337)
does not need to be a staticmethod
2026-03-18 00:48:41 -04:00
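A minimal sketch of what the rename above looks like at a call site, assuming the new instance method keeps the old coercion behavior (value coerced to the dtype's Python type); the exact signature is inferred from the commit title, not verified against the code:

```python
from tinygrad import dtypes

# old (staticmethod-style): dtypes.as_const(3.7, dtypes.int32)
# new (per the commit title): dtypes.int32.const(3.7)
print(dtypes.int32.const(3.7))  # assumed to coerce to the dtype's Python type -> 3
```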
Christopher Milan
864d3917d5 add openpilot onnx parser test (#15334) 2026-03-18 00:12:02 -04:00
chenyu
94926d00d8 fix rand > uint32.max (#15330)
need to keep low and high as 1D tensor.
`PYTHONPATH=. LLAMA3_SIZE=405B python3 examples/mlperf/models/flat_llama.py` works now
2026-03-17 22:00:01 -04:00
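For context on the overflow class this commit fixes: an element count above uint32.max cannot live in a 32-bit counter. A tiny, self-contained illustration in plain Python (not the repository's code):

```python
# Element counts past 2**32 wrap if squeezed into a uint32 counter.
n = 2**32 + 3
print(n & 0xFFFFFFFF)  # -> 3, i.e. the count silently wraps

# With the fix, something like Tensor.rand(n) should work (roughly 16 GB of fp32),
# which is what the flat_llama command above exercises at 405B scale.
```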
wozeparrot
b45edeb965 fix: rand supports large tensors (#15329) 2026-03-17 15:45:41 -07:00
qazal
00817cf65e viz: all tests can run on the NULL device (#15328)
* remove that

* move to test_viz

* get_cfg

* do not use os.environ

* hm

* it's always on NULL

* import renderer

* no import *
2026-03-18 04:14:20 +09:00
chenyu
14eb8170e4 skip TestRunAsModule if libclang is loaded (#15323)
the reverse of the TestAutogen skip rule; otherwise `NULL=1 python -m pytest test/null/test_autogen.py test/null/test_device.py` crashes for me
2026-03-17 06:02:53 -04:00
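A hedged sketch of the skip pattern this commit describes, using stock `unittest`; the module name `clang.cindex` for libclang's Python bindings and the test body are assumptions, not the repository's exact code:

```python
import sys, unittest

# Skip when libclang's Python bindings are already loaded in this process,
# mirroring (in reverse) the condition the TestAutogen skip checks.
@unittest.skipIf("clang.cindex" in sys.modules, "libclang already loaded")
class TestRunAsModule(unittest.TestCase):
  def test_runs(self): self.assertTrue(True)

if __name__ == "__main__": unittest.main()
```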
George Hotz
9d95321be3 set allow_implicit=False by default (#15319)
* set allow_implicit=False by default

* modernize beautiful mnist
2026-03-17 17:14:38 +08:00
George Hotz
584ec75aa2 precompile backward (#15311)
* add precompile backward support

* cleanups

* fix

* compact grad

* split v not split

* simpler

* no NOOPT
2026-03-17 15:28:40 +08:00
b1tg
856a839efc llm: fix qwen3 moe topk renormalization (#15201) 2026-03-17 12:57:33 +08:00
George Hotz
3ff03be413 call always has tuple (#15297)
* call always has tuple

* fix pre-commit and simplify

* update

* fix

* move that assert

* tuple

* fix multi

* cleanups

* fix merge
2026-03-17 10:58:46 +08:00
wozeparrot
674c760974 embedded bwd vocab shard (#15001)
* fix: remove more multi from call

* feat: embedding bwd vocab sharding

* clean: unused import

* clean: don't actually need this pattern
2026-03-16 19:37:16 -07:00
chenyu
02afb45f29 remove UOp.assign [pr] (#15300)
* remove UOp.assign [pr]

it's all store and after, UOp is immutable

* fix test
2026-03-16 21:45:41 -04:00
qazal
33bd33e783 sqtt: add CDNA ops enum, show in viz (#15140) 2026-03-17 09:38:42 +09:00
chenyu
3e2b7803e6 view assign replaces at buffer identity (#15298)
matches what functions capture
2026-03-16 19:58:38 -04:00
George Hotz
476276f4b4 support grads on tuples (#15287)
* support grads on tuples

* simpler

* grad_fxn works

* cleanups

* unused
2026-03-16 17:39:34 +08:00
George Hotz
08662bc4ab add TUPLE/GETTUPLE, simple tests pass (#15286)
* simple tuple stuff passes

* resolved
2026-03-16 15:06:02 +08:00
qazal
4445f50356 viz: variable duration rdna barriers (#15277)
* viz: variable length rdna barriers

* work

* tiny changes

* simple wave simd test

* small wave sync test

* good multi barrier bug find

* simple fix

* wave_sync asserts

* rdna4 work

* more rdna4

* find more bugs in my model

* it's so much simpler

* wave_sync tests duration

* r4

* should just call this rdna4
2026-03-16 06:06:19 +09:00
qazal
5cd1daa3bc cdna asm_gemm in one file, remove old rdna3 asm (#15281) 2026-03-16 04:32:30 +09:00
chenyu
cd14e8e64b allocations contiguous is store+after (#15280) 2026-03-15 11:58:40 -04:00
qazal
7b6211fdd7 sqtt: remove discover_ops script (#15279) 2026-03-15 22:17:06 +09:00
qazal
3858bfc83d sqtt: CDNA inst decodes (#15274)
* sqtt: CDNA inst decodes

* JUMP packets other way

* cdna insts

* r3

* r4

* lds from simd1 and simd2
2026-03-14 21:03:46 +09:00
Christopher Milan
9047249a7c m.where(x.pad_to(m.shape), Invalid) ranges shrink (#15275) 2026-03-14 07:26:36 -04:00
Christopher Milan
dabdc986df shrink guarded ranges, try 2 (#15272) 2026-03-14 04:24:05 -04:00
Christopher Milan
7cf4b16c91 Revert "shrink guarded ranges" (#15271) 2026-03-14 03:44:38 -04:00
Christopher Milan
d9951e2f8e shrink guarded ranges (#15263) 2026-03-14 03:38:48 -04:00
qazal
4d60312f7f viz: asm python dsl syntax highlighting (#15259) 2026-03-14 06:37:43 +09:00
qazal
6209ddfc90 viz: improve disasm of s_code_end (#15258)
* viz: improve amd disasm of s_code_end

* better tests

* order was good
2026-03-14 03:31:14 +09:00
Sieds Lykles
4b59083d7c assign into empty works (#15256) 2026-03-13 10:24:29 -04:00
qazal
60b1b908c6 sqtt: CDNA layout header packet is the same size (#15255) 2026-03-13 22:28:24 +09:00
chenyu
018c01508d test case for call precompile multi (#15254) 2026-03-13 06:28:43 -04:00
qazal
d893b14193 sqtt: update cdna packet names (#15243)
* sqtt: update cdna packet names

* change

* order
2026-03-13 08:49:09 +09:00
chenyu
90b7f4341d failed two level divmod recombine case (#15233) 2026-03-12 04:04:36 -04:00
chenyu
842c978df3 remove staticmethod dtypes.max/min (#15227)
always use x.dtype.max/min
2026-03-11 23:11:24 -04:00
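A small sketch of the call-site change the commit title describes, assuming `min`/`max` are now properties on the `DType` instance, as "always use x.dtype.max/min" implies:

```python
from tinygrad import Tensor, dtypes

x = Tensor([1, 2, 3], dtype=dtypes.int32)
# old: dtypes.max(x.dtype), dtypes.min(x.dtype)  (staticmethods, now removed)
# new: instance properties on the tensor's dtype
print(x.dtype.max, x.dtype.min)  # 2147483647 -2147483648
```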
b1tg
18dc77ccab add fp8 fnuz dtypes with PYTHON backend support (#14945)
* add fp8 fnuz dtypes with PYTHON backend support

* rm emu related change

* clarify fp8 fnuz zero handling

* Revert "rm emu related change"

This reverts commit efa4763c22.

---------

Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2026-03-11 22:30:18 -04:00
George Hotz
4f3f55328b do not patch on invalid tensor tests (#15226)
* do not patch on invalid tensor tests

* cleanup
2026-03-12 09:35:20 +08:00
qazal
d3eef70162 viz: render shader clock frequency graph (#15197) 2026-03-12 01:32:49 +09:00
Christopher Milan
2fb8a7f60f fix test_invalid_tensor when before values are nan (#15215) 2026-03-10 23:51:19 -04:00
chenyu
fce87f19a8 better fold_add_divmod_recombine (#15214) 2026-03-10 23:24:22 -04:00
chenyu
df8deec949 test for nest_by_factor selection (#15213) 2026-03-10 22:41:31 -04:00