tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Author	SHA1	Message	Date
nimlgen	aec1ae0de1	llama: set manual_seed (#14409 )	2026-01-28 14:40:00 -08:00
chenyu	0870ed28b1	add Self type to MathMixin (#14411 ) these don't cause error	2026-01-28 16:59:38 -05:00
chenyu	079f33c208	fix type in Tensor.mean and Tensor.var (#14410 ) use Tensor.from_uop to wrap UOp from symbolic shape, kernels are the same	2026-01-28 15:24:02 -05:00
chenyu	2b5e99ccc1	minor type cleanups [pr] (#14408 ) mypy --warn-redundant-casts has false negative	2026-01-28 14:11:50 -05:00
chenyu	726415dbc8	import sint directly in movement.py TYPE_CHECKING (#14406 ) avoid creating string TypeAlias, fixed warning in `TYPED=1 python test/test_tiny.py`	2026-01-28 12:47:26 -05:00
nimlgen	acb2fc36ba	nv_pma: add decoder (#14404 ) * nv_pma: add decoder * cl	2026-01-28 20:44:02 +03:00
chenyu	7b9bc1d8cf	_MockMemoryviewMeta for mockgpu (#14405 ) fixed `PYTHONPATH=. TYPED=1 DEV=AMD MOCKGPU=1 python test/test_tiny.py`. basically make `isinstance(TrackedMemoryView_instance, memoryview)` true	2026-01-28 11:59:00 -05:00
chenyu	93793a645b	use cl.cl_mem instead of internal ctypes._CData (#14403 ) fixed `CHECK_OOB=0 DEV=CL TYPED=1 python test/test_tiny.py`	2026-01-28 10:56:41 -05:00
chenyu	a9b44070a8	fix webgpu runtime types (#14402 ) `CHECK_OOB=0 DEV=WEBGPU TYPED=1 python test/test_tiny.py` passed, also skip tests that failed locally	2026-01-28 10:37:25 -05:00
George Hotz	0c6b3f50aa	add marker to llama training (#14401 )	2026-01-28 22:44:28 +08:00
Jakob Sachs	2b7c00d3d2	fix sd-example dtype for CLIP embeddings (#14397 )	2026-01-28 09:07:19 -05:00
qazal	a5a9ce3fdf	viz: disasm cleanups from null emulate (#14399 ) * it's AMDHIPRenderer * don't need that indent * less assignment stuff * that arg order did not make sense * pmc	2026-01-28 22:03:30 +09:00
nimlgen	544928766d	hcq_smi: kill mac pids (#14398 )	2026-01-28 15:00:28 +03:00
George Hotz	202b74b369	assembly/amd: continue refactors (#14386 ) * simpler * merge * flat * no ctx * use the correct apis * dup code * write clean code * remove bad helpers * bits junk remove * junk remove * smem test * fix tests * correct fix + tests * Fmt matters it seems * wmma refactor * a lil more * kimi cleanups * line	2026-01-28 17:33:03 +08:00
qazal	5bffa17f82	llama train: better NULL=1 EMULATE=AMD_CDNA4 dev experience (#14395 ) * beam opens devices * switch to hip renderer * amd: true? * llvm true is for test_autogen	2026-01-28 17:31:22 +09:00
qazal	0294014108	fix bufferize cost function for multi, improve VIZ=-1 cli (#14394 ) * improve cli * remove_bufferize change	2026-01-28 15:53:18 +09:00
qazal	c158acea29	failing multi ram usage test from llama gemm (#14392 )	2026-01-28 14:32:32 +09:00
Christopher Milan	067e27857e	nested composite actions don't work (#14393 )	2026-01-28 00:13:30 -05:00
Christopher Milan	9dddf3d478	don't save caches for PRs, try 2 (#14391 )	2026-01-27 23:30:17 -05:00
Christopher Milan	68fe5d8b36	Revert "don't save caches for PRs (#14389 )" (#14390 )	2026-01-27 23:22:26 -05:00
Christopher Milan	4ab228b498	don't save caches for PRs (#14389 )	2026-01-27 23:21:31 -05:00
Christopher Milan	5e36482314	decompose long to ints where unsupported, try 2 (#14383 )	2026-01-27 23:20:43 -05:00
wozeparrot	e496547720	llama3 gradacc (#14291 )	2026-01-27 19:48:10 -08:00
George Hotz	88bc5ee212	assembly/amd: rename to better names (#14384 ) * assembly/amd: rename to better names * might help fuzzing segfault * emu2 -> emu	2026-01-28 10:00:54 +08:00
George Hotz	065b95cfb0	Revert "add retry to fetch (#14370 )" (#14385 ) This reverts commit `dc4d7f2d55`.	2026-01-28 09:35:37 +08:00
Eitan Turok	dc4d7f2d55	add retry to fetch (#14370 )	2026-01-27 14:04:25 -08:00
chenyu	8d1f3c8885	fix copysign for inf input (#14381 ) * fix copysign for inf input * llvm olt	2026-01-27 16:45:48 -05:00
Christopher Milan	289a3e415e	also skip test_nonoverlapping_shrink_assignment (#14382 )	2026-01-27 16:26:26 -05:00
Christopher Milan	f34efc1ad1	DISABLE_FAST_IDIV actually works as a ContextVar (#14378 )	2026-01-27 16:12:42 -05:00
chenyu	8c899e4aaf	fix copysign for -0 (#14380 ) test both x and 1/x < 0 work too. and found another big with the * 0 hack	2026-01-27 15:44:58 -05:00
chenyu	62884585a7	failed test case for copysign -0.0 (#14379 ) * failed test case for copysign -0.0 * skip those	2026-01-27 14:37:17 -05:00
nimlgen	ec1b28bc2c	am: exit early in case of failures (#14376 ) * am: exit early in case of failures * sorry, pre-linter * reset when error state	2026-01-27 22:10:02 +03:00
chenyu	cd22ee9ed0	add InvalidType to ConstType [pr] (#14373 ) * add InvalidType to ConstType [pr] TYPED=1 python test/test_tiny.py passes. added PyConst = float|int|bool for some Tensor level input types * hcq	2026-01-27 14:09:34 -05:00
Christopher Milan	5b42a1357b	SCACHE=0 works with DEBUG (#14377 )	2026-01-27 13:12:43 -05:00
chenyu	db010a31be	IGNORE_OOB -> CHECK_OOB [pr] (#14374 ) flip the meaning	2026-01-27 12:20:59 -05:00
chenyu	c22667b0c4	also skip test_overlapping_shrink_assignment_reverse (#14375 ) crashing	2026-01-27 12:20:39 -05:00
nimlgen	e52d58b041	autogen: update amd (#14372 )	2026-01-27 19:53:14 +03:00
nimlgen	cbf94a0a95	nv: exit early in case of failures (#14363 ) * nv: exit early in case of failures * f * cleaner	2026-01-27 19:16:22 +03:00
nimlgen	ec691cb299	am: print sq intrs (#14366 ) * am: print sq intrs * cleaner	2026-01-27 18:28:13 +03:00
qazal	a5f3d46423	hcq: do not assume kernel names are unique (#14371 ) * hcq: do not assume kernel names are unique * colored kernel name	2026-01-27 23:03:15 +09:00
George Hotz	e5df7e640b	fix branches in amd_asm_matmul (#14369 )	2026-01-27 20:48:42 +08:00
George Hotz	0ced258726	HOTFIX: skip crashing assign test	2026-01-27 20:35:17 +08:00
George Hotz	131ae604de	force_transcendental on sqrt (#14368 )	2026-01-27 20:24:41 +08:00
imaolo	14574c68fa	Add ContextVar to disable the scheduler cache (#14257 ) * add scheduler cache ContextVar * test scheduler cache context var --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2026-01-27 19:55:29 +08:00
George Hotz	bfc88bcfb8	assembly/amd: emu refactors + enable PYTHON_REMU by default (#14361 ) * assembly/amd: start refactors * cleanups * those are global * methods on ctx * const cleanup * range helper * types and imports * cleanups * cleanups * remove stale name * fix emu2 types * more typing * more mypy * cleanups * fxns * scc cleanup * cleanups * cleanups * simpler parse_pcode * laneid * no defaults for pcode * pcode is not optional * cleanups * functions cleanup * splat * expr_parser functions * single tok * invert global loops * try_eat * minor * run parser on all * no silent 0 * tests	2026-01-27 17:42:24 +08:00
Christopher Milan	2e72625652	Revert "decompose dtypes.long to ints where unsupported (#14261 )" (#14362 )	2026-01-27 02:04:59 -05:00
qazal	f866b2a513	mfma loop in asm dsl (#14349 ) * mfma loop in asm dsl * work	2026-01-27 11:11:37 +09:00
Christopher Milan	0793319929	decompose dtypes.long to ints where unsupported (#14261 ) * add works * use carry not overflow * bitwise ops * use tag instead of vec * cleaner * mul somewhat works * mul actually works * SUB and NEG work * SHL/SHR * ulong support * this should work? * oops * fix indexing * all ALU mostly works * refactor * test_dtype passing * signed division works * format * clean * some tests * ruff	2026-01-26 18:34:13 -05:00
wozeparrot	a987a4abc3	feat: llama8b dev_beam.sh (#14358 )	2026-01-26 14:51:23 -08:00
Christopher Milan	c9c533fc78	libclang path is homebrew on macos (#14357 ) * libclang path is homebrew macos * typo * ugh * typo * regen * no LIBCLANG_PATH	2026-01-26 17:32:09 -05:00

11930 Commits