nimlgen
9656d97d97
jit: captures linears, not execitems ( #15399 )
...
* jit: captures linears, not execitems
* x
* um
* tests
* mockcuda
2026-03-21 16:32:12 +08:00
Christopher Milan
a12d3951de
fix test_export_model imports ( #15389 )
2026-03-20 07:27:01 -04:00
Christopher Milan
1560b534a5
remove IMAGE=2 ( #15312 )
2026-03-20 06:26:52 -04:00
chenyu
c491345766
pass device into Tensor._frompy ( #15385 )
...
* pass device into Tensor._frompy
with this, canonicalize_device is the only usage of Device in tensor.py
* export_model.py
2026-03-20 05:09:01 -04:00
George Hotz
3b75d8a7a2
fix double after bug in rangeify ( #15381 )
2026-03-20 14:53:46 +08:00
Christopher Milan
0c89340a1e
automatically emulate unsupported (tiny) floats [skip_process_replay] ( #15366 )
2026-03-20 02:31:44 -04:00
qazal
cf6a429aaa
mypy emulator pre-commit passing ( #15379 )
...
* fix dict stuff
* add type: ignores
* fix pcode to put uops not ints
2026-03-20 14:44:09 +09:00
chenyu
da1700e16b
dtypes.index -> dtypes.weakint ( #15377 )
2026-03-20 01:08:46 -04:00
chenyu
bf33c5f796
remove gradient materialize_grads ( #15367 )
...
effectively defaults to True
also removed the *0 hack in Tensor.copysign. now dy/dx=0 if y does not depend on x
2026-03-19 23:36:03 -04:00
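The copysign note in the commit above can be illustrated outside tinygrad: away from the sign boundary, the output of copysign(x, w) does not depend on w, so its derivative with respect to w is zero. A minimal standard-library sketch (the helper name is hypothetical, not code from the repo):

```python
import math

def copysign_grad_w(x: float, w: float, eps: float = 1e-6) -> float:
    """Central-difference derivative of copysign(x, w) with respect to w."""
    return (math.copysign(x, w + eps) - math.copysign(x, w - eps)) / (2 * eps)

# away from w == 0 the sign of w is constant, so the derivative is 0
print(copysign_grad_w(3.0, 2.0))    # 0.0
print(copysign_grad_w(-3.0, -2.0))  # 0.0
```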
qazal
176ad47d7d
cdna4 emulator testing ASM_GEMM in CI ( #15373 )
...
* cdna emulator work
* accvgprs
* cdna passes most tests
* ruff
* add cdna4 to tests
* cdna emu
* crash
* pass?
* work
* gen
* clean up wave_size access
* asm_gemm passes
* remove acc from dsl.py, emulator can keep its different reg file
it's purely an encoding here; the ASM_GEMM already encodes acc srcs with v[]. this can
be cleaned up later, but it's not functionally required for the emulator.
* split asm_gemm tests to ones fast on the emulator
* don't do that
* 124 stays null on rdna
* the segfault was because of hw regs, not this
* Revert "clean up wave_size access", it's explicitly tested
This reverts commit 1202ff5787.
* nullcopyout
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-03-20 05:51:30 +09:00
Christopher Milan
68d7a6b7be
PYTHONREMU: fix vop3p literals ( #15372 )
2026-03-19 07:05:01 -04:00
nimlgen
86eec01f97
limit gl*lc ( #15359 )
2026-03-19 12:38:55 +08:00
chenyu
b39816e998
failed test case for Tensor(np, "bf16") ( #15358 )
2026-03-18 23:40:14 -04:00
wozeparrot
c45a606750
feat: no if in rand ( #15333 )
2026-03-18 15:09:51 -07:00
qazal
709fc52d7b
viz: fix auto zoom range in sqtt, include endpgm packet ( #15349 )
...
* viz: fix automatic zoom range in sqtt packets
* it's x+width
* include s_endpgm
* endpgm also doesn't have exec
2026-03-18 22:52:32 +09:00
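The "it's x+width" note in the commit above reflects a general rule for auto-zoom over timeline packets: the right edge of the visible range is the maximum of each event's start plus its duration, not the maximum start alone. A stand-alone sketch (the event fields here are hypothetical, not the viz schema):

```python
def zoom_range(events):
    # auto zoom spans min(start) .. max(start + width); using max(start)
    # alone would clip the tail of the longest trailing packet
    start = min(e["x"] for e in events)
    end = max(e["x"] + e["width"] for e in events)
    return start, end

packets = [{"x": 0, "width": 5}, {"x": 3, "width": 1}]
print(zoom_range(packets))  # (0, 5)
```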
nimlgen
d4836ddbb0
canonicalize device from tuple ( #15348 )
...
* will it fix CI?
* test
* um
2026-03-18 20:35:52 +08:00
George Hotz
5524916e39
llama compute gradients explicitly + 243 GB of RAM on MP=8 ( #15343 )
...
* llama compute gradients explicitly
* apply grads
* fix multi issue
* multi BUFFER_VIEW support
* simpler
* skip the flaky test
2026-03-18 19:54:40 +08:00
nimlgen
f853371c83
fix compilers autoselect ( #15346 )
2026-03-18 18:19:53 +08:00
chenyu
761ce8c0d3
fix Invalid combine rules ( #15345 )
...
* fix Invalid combine rules
wrong conditions broke setitem into invalids
* fix
2026-03-18 04:58:02 -04:00
chenyu
fceb21c315
Tensor(uop) uses device from uop ( #15340 )
2026-03-18 02:56:06 -04:00
George Hotz
6109117af1
anonymous buffers are Invalid ( #15336 )
...
* anonymous buffers are Invalid
* unique_const
* work
* remove invalid writes
* test_anonymous_buffers_in_function
2026-03-18 14:52:56 +08:00
nimlgen
d720d50e12
memory: traverse all valid ranges only ( #15338 )
...
* memory: traverse all valid ranges only
* x
2026-03-18 14:03:39 +08:00
chenyu
ac7a348d06
dtypes.as_const -> DType.const ( #15337 )
...
does not need to be a staticmethod
2026-03-18 00:48:41 -04:00
Christopher Milan
864d3917d5
add openpilot onnx parser test ( #15334 )
2026-03-18 00:12:02 -04:00
chenyu
94926d00d8
fix rand > uint32.max ( #15330 )
...
need to keep low and high as 1D tensor.
`PYTHONPATH=. LLAMA3_SIZE=405B python3 examples/mlperf/models/flat_llama.py` works now
2026-03-17 22:00:01 -04:00
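The "keep low and high as 1D tensor" note in the commit above concerns counter-based RNGs: once the element count exceeds uint32.max, the per-element counter must be carried as two 32-bit words. A hedged stand-alone sketch of the split (an illustration of the general technique, not tinygrad's actual implementation):

```python
def split_counter(n: int) -> tuple[int, int]:
    # split a 64-bit element index into 32-bit low/high words, since
    # threefry-style generators consume 32-bit lanes
    return n & 0xFFFFFFFF, n >> 32

lo, hi = split_counter((1 << 32) + 5)
print(lo, hi)  # 5 1
```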
wozeparrot
b45edeb965
fix: rand supports large tensors ( #15329 )
2026-03-17 15:45:41 -07:00
qazal
00817cf65e
viz: all tests can run on the NULL device ( #15328 )
...
* remove that
* move to test_viz
* get_cfg
* do not use os.environ
* hm
* it's always on NULL
* import renderer
* no import *
2026-03-18 04:14:20 +09:00
chenyu
14eb8170e4
skip TestRunAsModule if libclang is loaded ( #15323 )
...
reverses the TestAutogen skip rule, otherwise `NULL=1 python -m pytest test/null/test_autogen.py test/null/test_device.py` crashes for me
2026-03-17 06:02:53 -04:00
George Hotz
9d95321be3
set allow_implicit=False by default ( #15319 )
...
* set allow_implicit=False by default
* modernize beautiful mnist
2026-03-17 17:14:38 +08:00
George Hotz
584ec75aa2
precompile backward ( #15311 )
...
* add precompile backward support
* cleanups
* fix
* compact grad
* split v not split
* simpler
* no NOOPT
2026-03-17 15:28:40 +08:00
b1tg
856a839efc
llm: fix qwen3 moe topk renormalization ( #15201 )
2026-03-17 12:57:33 +08:00
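The qwen3 fix above touches a standard MoE routing step: after selecting the top-k expert weights, they are rescaled so the kept weights sum to 1. A minimal illustration in plain Python (the helper name is hypothetical and this is not the repo's code):

```python
def topk_renormalize(scores, k):
    # keep the k largest routing weights and rescale them to sum to 1
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    total = sum(scores[i] for i in top)
    return {i: scores[i] / total for i in top}

print(topk_renormalize([1, 5, 3, 1], 2))  # {1: 0.625, 2: 0.375}
```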
George Hotz
3ff03be413
call always has tuple ( #15297 )
...
* call always has tuple
* fix pre-commit and simplify
* update
* fix
* move that assert
* tuple
* fix multi
* cleanups
* fix merge
2026-03-17 10:58:46 +08:00
wozeparrot
674c760974
embedded bwd vocab shard ( #15001 )
...
* fix: remove more multi from call
* feat: embedding bwd vocab sharding
* clean: unused import
* clean: don't actually need this pattern
2026-03-16 19:37:16 -07:00
chenyu
02afb45f29
remove UOp.assign [pr] ( #15300 )
...
* remove UOp.assign [pr]
it's all store and after, UOp is immutable
* fix test
2026-03-16 21:45:41 -04:00
qazal
33bd33e783
sqtt: add CDNA ops enum, show in viz ( #15140 )
2026-03-17 09:38:42 +09:00
chenyu
3e2b7803e6
view assign replaces at buffer identity ( #15298 )
...
matches what functions capture
2026-03-16 19:58:38 -04:00
George Hotz
476276f4b4
support grads on tuples ( #15287 )
...
* support grads on tuples
* simpler
* grad_fxn works
* cleanups
* unused
2026-03-16 17:39:34 +08:00
George Hotz
08662bc4ab
add TUPLE/GETTUPLE, simple tests pass ( #15286 )
...
* simple tuple stuff passes
* resolved
2026-03-16 15:06:02 +08:00
qazal
4445f50356
viz: variable duration rdna barriers ( #15277 )
...
* viz: variable length rdna barriers
* work
* tiny changes
* simple wave simd test
* small wave sync test
* good multi barrier bug find
* simple fix
* wave_sync asserts
* rdna4 work
* more rdna4
* find more bugs in my model
* it's so much simpler
* wave_sync tests duration
* r4
* should just call this rdna4
2026-03-16 06:06:19 +09:00
qazal
5cd1daa3bc
cdna asm_gemm in one file, remove old rdna3 asm ( #15281 )
2026-03-16 04:32:30 +09:00
chenyu
cd14e8e64b
allocations contiguous is store+after ( #15280 )
2026-03-15 11:58:40 -04:00
qazal
7b6211fdd7
sqtt: remove discover_ops script ( #15279 )
2026-03-15 22:17:06 +09:00
qazal
3858bfc83d
sqtt: CDNA inst decodes ( #15274 )
...
* sqtt: CDNA inst decodes
* JUMP packets other way
* cdna insts
* r3
* r4
* lds from simd1 and simd2
2026-03-14 21:03:46 +09:00
Christopher Milan
9047249a7c
m.where(x.pad_to(m.shape), Invalid) ranges shrink ( #15275 )
2026-03-14 07:26:36 -04:00
Christopher Milan
dabdc986df
shrink guarded ranges, try 2 ( #15272 )
2026-03-14 04:24:05 -04:00
Christopher Milan
7cf4b16c91
Revert "shrink guarded ranges" ( #15271 )
2026-03-14 03:44:38 -04:00
Christopher Milan
d9951e2f8e
shrink guarded ranges ( #15263 )
2026-03-14 03:38:48 -04:00
qazal
4d60312f7f
viz: asm python dsl syntax highlighting ( #15259 )
2026-03-14 06:37:43 +09:00
qazal
6209ddfc90
viz: improve disasm of s_code_end ( #15258 )
...
* viz: improve amd disasm of s_code_end
* better tests
* order was good
2026-03-14 03:31:14 +09:00
Sieds Lykles
4b59083d7c
assign into empty works ( #15256 )
2026-03-13 10:24:29 -04:00