b1tg
61884f2057
add cstyle renderer to the NULL device ( #11709 )
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-18 09:52:22 -07:00
uuuvn
18db8fa311
Allow choosing leaders in multinode reduce ( #11506 )
...
Co-authored-by: wozeparrot <wozeparrot@gmail.com>
2025-08-18 12:43:20 -04:00
b1tg
799a637b03
fix the misused cast in amd llvm tc ( #11711 )
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-18 09:15:34 -07:00
qazal
fef97547f9
viz: preset the final timestamp ( #11712 )
2025-08-18 17:51:21 +03:00
chenyu
c30a113b2a
support bf16 and fp8 in Tensor.tolist ( #11704 )
...
memoryview does not support these dtypes, but casting to a supported dtype first works fine, so we cast
2025-08-17 15:11:13 -04:00
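A minimal sketch of the cast workaround described in #11704 above, assuming a bfloat16 tensor: bfloat16 and fp8 have no memoryview format code, so the data is cast to a supported dtype before tolist (illustrative of the idea the commit applies internally, not the exact patch):

    from tinygrad import Tensor, dtypes

    # memoryview has no format code for bfloat16/fp8, but casting first works fine
    t = Tensor([1.5, 2.25, -3.0], dtype=dtypes.bfloat16)
    print(t.cast(dtypes.float32).tolist())  # plain Python floats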
nimlgen
1c62a3833b
am: add versioned_header to load_fw ( #11702 )
...
* am: add versioned_header to load_fw
* fix mypy
2025-08-17 20:11:57 +03:00
qazal
eb3c918c5b
viz: s/area/height ( #11703 )
2025-08-17 19:20:01 +03:00
qazal
d762edd694
viz: define tracks in python ( #11701 )
...
* viz: defines tracks in python
* update unittests
* figuring it out
* works
* diff cleanup
* math
* y axis is back
2025-08-17 18:19:13 +03:00
qazal
eeeea29171
viz: device list refactor ( #11700 )
...
* viz: device list refactor
* paddingTop/padding-top
2025-08-17 15:08:54 +03:00
George Hotz
9366a23eb0
test backward in test_tiny ( #11697 )
...
* test backward in test_tiny
* empty
2025-08-16 20:29:39 -07:00
chenyu
4666df71c1
fix test_fuse_and_tc_opt ( #11699 )
2025-08-16 21:10:53 -04:00
geohotstan
3d7c35d615
add fuse and tc opt bug repro ( #11695 )
...
* FINALLY HAVE A SMALL REPRO OH BOY
* show failure in CI
* cleaner?
* 1 possible fix
* Revert "1 possible fix"
This reverts commit 9e0fd215dd.
2025-08-16 18:24:49 -04:00
nimlgen
d1224a7c4a
am: check both signatures ( #11694 )
...
* am: check both signatures
* fix
2025-08-16 20:01:07 +03:00
qazal
58c8991fa4
add Ops.REWRITE_ERROR ( #11689 )
2025-08-16 00:56:53 +03:00
qazal
ec4fccb1da
viz: pass through RewriteNotReady ( #11690 )
2025-08-16 00:33:59 +03:00
qazal
e954decb44
viz: pass UOp.st errors ( #11688 )
2025-08-16 00:07:56 +03:00
nimlgen
bf0c45fd16
system: resource_resize might be unavail ( #11680 )
2025-08-15 22:03:23 +03:00
George Hotz
4ab9fb2edd
explicit fixed point rewrite ( #11685 )
...
* explicit fixed point rewrite
* local cache
* fix that
2025-08-15 11:08:41 -07:00
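Loosely, a fixed-point rewrite keeps applying its rules until a full pass changes nothing. A toy standalone sketch of that loop over strings (hypothetical, not tinygrad's graph_rewrite):

    # toy fixed-point rewriter: apply every rule each pass until nothing changes
    def fixed_point(rules, expr):
      while True:
        new = expr
        for pattern, replacement in rules:
          new = new.replace(pattern, replacement)
        if new == expr: return expr  # fixed point reached: no rule fired this pass
        expr = new

    # '+0' and '*1' simplify away; the second pass confirms the result is stable
    assert fixed_point([("+0", ""), ("*1", "")], "x*1+0*1") == "x"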
chenyu
5d6963c968
RuntimeError for unsupported dtype in PYTHON ( #11686 )
2025-08-15 13:59:27 -04:00
nimlgen
b970cd6895
am: fix psp ring completion ( #11679 )
...
* am: psp ring timeout + fix 0 fence_value
* no sleep
2025-08-15 20:15:49 +03:00
qazal
c8ba48b223
show rewrite errors in viz ( #11684 )
2025-08-15 19:09:47 +03:00
George Hotz
560984fd8d
small changes from rangeify ( #11682 )
...
* small changes from rangeify
* const like thing
* ksym
2025-08-15 08:45:52 -07:00
chenyu
d0d39885c3
onnx in tinygrad ( #11675 )
2025-08-14 19:57:21 -04:00
wozeparrot
71260a5ea4
feat: only bench openpilot 0.9.9 models ( #11664 )
2025-08-14 19:27:18 -04:00
chenyu
4ddefbccb4
update setup packages ( #11674 )
...
sorted, and added missing 'tinygrad.frontend' and 'tinygrad.runtime.autogen.nv'
2025-08-14 19:24:57 -04:00
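A hedged sketch of the shape of that change: setup.py enumerates subpackages explicitly, so new modules such as tinygrad.frontend and tinygrad.runtime.autogen.nv must be added by hand, and keeping the list sorted makes omissions easier to spot. Abbreviated and illustrative, not the full list:

    from setuptools import setup

    setup(
      name="tinygrad",
      packages=[
        "tinygrad",
        "tinygrad.codegen",
        "tinygrad.frontend",
        "tinygrad.nn",
        "tinygrad.runtime",
        "tinygrad.runtime.autogen",
        "tinygrad.runtime.autogen.nv",
        # remaining subpackages elided
      ],
    )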
chenyu
48c4033ae1
fix pylint for onnx ( #11673 )
...
* fix pylint for onnx
* too long
2025-08-14 18:48:02 -04:00
chenyu
e9d0027591
llama MP realize weight after shard ( #11672 )
...
* llama MP realize weight after shard
prevents a memory spike on device 0
* empty weight for FAKEDATA
2025-08-14 16:17:46 -04:00
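A hedged sketch of the ordering this commit enforces: shard the weight across devices first, then realize, so the full tensor never materializes on device 0 alone (the device list and sizes here are placeholders):

    from tinygrad import Tensor

    GPUS = tuple(f"CUDA:{i}" for i in range(2))   # placeholder device list
    w = Tensor.empty(4096, 4096)                  # stand-in for a llama weight
    w.shard_(GPUS, axis=0).realize()              # realize after shard: each device only holds its slice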
nimlgen
4176b24264
amd: support xcc in regs ( #11670 )
...
* amd: support xcc in regs
* mockamd
* typo
2025-08-14 21:20:11 +03:00
Sieds Lykles
f399d0d75d
Render mod in terms of idiv ( #11668 )
...
* Render mod in terms of idiv
* cvar -> var
2025-08-14 19:59:39 +02:00
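The identity behind #11668 above: with the truncating integer division the backends emit, the remainder can be rebuilt from the quotient as x - (x // d) * d. A small standalone check of that equivalence (not the renderer change itself):

    import math

    def c_idiv(x, d): return math.trunc(x / d)     # C-style division truncates toward zero
    def c_mod(x, d):  return x - c_idiv(x, d) * d  # mod expressed in terms of idiv

    for x in range(-7, 8):
      for d in (1, 2, 3, 5):
        assert c_mod(x, d) == math.trunc(math.fmod(x, d))  # matches C-style remainder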
nimlgen
d747eeed32
amd logs parser based on device ( #11669 )
2025-08-14 19:49:33 +03:00
geohotstan
1e904155e3
Add Onnx Huggingface to test/models/test_onnx.py ( #11468 )
...
* BOOM
* cache extra/huggingface/models/
* why max buffer size is not 0
* override MAX_BUFFER_SIZE
* less models
* remove more models and change cache dir to already cached dir
* only metal
* less is more?
* remove check ops
* why is this not setting the ENVVAR
* ughhhhh just test in models
* only cpu and gpu
* only cpu actually
* just override it idk
* final
* move extra dependencies up top
* simplification
* fix print
* make README better
* revert ops_disk fix for now
* clean up test_onnx
* remove testing fashion clip model cuz sloooowwwwww
* actually let METAL run this
* fix comment mistake
* fix download path in run_models
* does this work?
* cleanup setup and teardown
* contextvar like this?
* prove model is cached
* do I need to increment DOWNLOAD_CACHE_VERSION?
* see if cached with incremented DOWNLOAD_CACHE_VERSION
* use warnings to see if the model exists
* revert DOWNLOAD_CACHE_VERSION stuff and clean up
* add retry to download
* nit
2025-08-14 11:16:41 -04:00
Sieds Lykles
06beeb6e13
Nest div even if factor is negative ( #11666 )
2025-08-14 13:58:59 +02:00
Sieds Lykles
661e9a2d5d
div_and_mod_folding refactor ( #11585 )
...
* divmod const folding is its own function
* split nested mod optimization out of div and mod folding
* make `fold_binary_numerator` its own function
* factor out `fold_divmod_congruence`
* check sign of numerator
* add tests
* assert int on vmin and vmax
* add type: ignore
* factor out more rules
* remove div_and_mod_folding
* cached_property to property
* remove import
* add returns
* restore old order
* check sign of x.vmin and newx.vmin
* check more signs
* add some test that would have caught bugs
* better test if the div simplified
* shorten line
* replace terms_factors_const with pop_const
* move that back
* minor cleanup
* remove comments
* some cleanup
2025-08-14 11:52:42 +02:00
chenyu
0fc43c2e54
fix test_const_tensor_index index ( #11660 )
...
indices should be ints
2025-08-13 19:50:16 -04:00
chenyu
4fe19eec72
Ops.TRUNC ( #11659 )
2025-08-13 18:40:48 -04:00
qazal
eb10a9c76a
viz: always left align timeline values ( #11658 )
2025-08-13 23:55:28 +03:00
George Hotz
22bdf48cdd
render ranges in viz, name gbufs with sizes. changes from rangeify ( #11656 )
...
* render ranges in viz, name gbufs with sizes. changes from rangeify
* fix unit test dtypes
2025-08-13 12:46:16 -07:00
George Hotz
9b4da590bb
remove need for cast_vec ( #11653 )
...
* remove need for cast_vec
* fix amdllvm
2025-08-13 12:09:47 -07:00
kevvz
e2873a3a41
[bounty] Muon optim ( #11414 )
...
* newton schulz
* add muon + move newton schulz to tensor
* compact newton schulz
* better tests
* cleanup
* add comments for muon
* cleanup
* add export with tests
* match muon optim with test optim
* cleanup
* unsed import
* correct comment
* whitespace
* move export
* muon test fix
* match reference impl + tests
* remove export by moving muon device
* add credit
* cleanup
* remove print
* spacing
* spacing
* comma
* cleanup
* removal
* fix tests + optim momentum
* consistent is not/ not
* more consistency
* fix test
* cleanup
* fix the nones
* remove comment
* cast
* comment
* comment
* muon teeny test
* muon flag beautiful mnist
* set steps
* steps as hyperparam
* match default test steps
* name
* large cleanup
* dont care about steps
* nesterov false default
* match each other impl
* steps
* switch nest
* swap defaults
* update docstring
* add no nesterov test
* ban fuse_optim
* prints
* classical momentum
* alternative condition
* recon
* pre + post wd
* false default
* detach
* signature changes
* context
* swap order
* big cleanup
* 0 step instead
* parity
* remove fuse
* remove fused
* better paper
* assert message
* correct shape check + eps
* multidim
* add eps
* cleanup
* correct assert message
* lint
* better tests
* naming
* ns_steps,ns_params
* update docstring
* docstring
* match sgd and muon together
* sandwich
* add back fused
* parity
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-08-13 14:27:55 -04:00
chenyu
94e6d84e32
rewrite Tensor.round to not use cast int ( #11654 )
2025-08-13 13:51:08 -04:00
George Hotz
d2521d828a
transcendental+idiv+threefry are uop decompositions ( #11636 )
...
* transcendental+idiv+threefry are uop decompositions [pr]
* threefry decomp
* fix randomness tests
* fix webgpu
* unneeded now
* fix
* move prematcher
* all cast should probably be cast_vec
2025-08-13 09:37:12 -07:00
geohotstan
cf7224ce3e
fully lint onnx.py ( #11647 )
...
* mypy
* ruff ruff ruff
2025-08-13 08:22:06 -07:00
geohotstan
925555b62a
Fix onnx Domain bug ( #11650 )
2025-08-13 08:20:50 -07:00
Sieds Lykles
67df617fe1
add launch bounds to ptx ( #11646 )
2025-08-13 13:05:39 +02:00
qazal
88f95e9f59
viz: minor fixups for firefox ( #11645 )
...
* fix circle attr
* set fill color
2025-08-13 12:59:28 +03:00
qazal
6f88eac0fc
viz: refactor node and edge tagging ( #11644 )
2025-08-13 12:41:01 +03:00
qazal
8140bf9778
viz: create layout once ( #11643 )
...
* start
* work
* works
* diff cleanup
2025-08-13 09:24:58 +03:00
chenyu
3fb79bb43a
minor onnx cleanups ( #11642 )
2025-08-13 01:05:19 -04:00
chenyu
e9e5a08a04
simplify onnx cubic ( #11641 )
...
we can drop the double where and the abs since we know which ranges the inputs fall into
2025-08-12 19:57:31 -04:00
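For reference, the kernel in question is the standard cubic convolution weight used by ONNX Resize (cubic_coeff_a defaults to -0.75); once the offsets feeding each branch are known to lie in [0, 1) and [1, 2), the abs and the piecewise where selection are redundant. A plain-Python statement of the kernel, not the tinygrad code:

    # Keys cubic convolution kernel as used by ONNX Resize (a = cubic_coeff_a, default -0.75)
    def cubic_weight(x, a=-0.75):
      x = abs(x)
      if x <= 1: return (a + 2) * x**3 - (a + 3) * x**2 + 1
      if x < 2:  return a * x**3 - 5 * a * x**2 + 8 * a * x - 4 * a
      return 0.0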
George Hotz
18cdbec447
split decompositions pass ( #11638 )
...
* split decompositions pass
* fix ptx
* pack load store early
* restore that
2025-08-12 12:56:05 -07:00