tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-21 04:47:56 -05:00

Author	SHA1	Message	Date
George Hotz	15309ea0d8	cleanup tests, bump caches	2025-08-19 21:08:57 -07:00
George Hotz	00391db628	no ast for mem estimate (#11744 ) * no ast for mem estimate * skip for webgpu	2025-08-19 20:18:45 -07:00
ttomsa	70c3f1fb29	x.where(False, True) -> !x (#11738 ) * add pat * add test	2025-08-19 19:08:16 -04:00
George Hotz	1d307f568c	move device tests to test/device + test cleanups (#11735 ) * move device tests to test/device * test speedups * test device * linalg to unit * upd * so pytest just works * more divide and skip * speed * test devectorize * add pillow	2025-08-19 16:02:20 -07:00
nimlgen	9c9e337c78	amd: parse soc enums (#11727 ) * amd: parse soc enums * remove from mock * fix * minimal amd_gpu	2025-08-19 15:06:09 +03:00
George Hotz	4b3fcb4064	Revert "REDUCE_AXIS keepdim=False (#11311 )" (#11718 ) This reverts commit `b518a7378a`.	2025-08-18 13:28:53 -07:00
b1tg	b518a7378a	REDUCE_AXIS keepdim=False (#11311 ) * progress * fix tests * fix tests * remove hack for test_symfold * fix test_conv.py on llvm * hack test_cache_speed * lint * remove hack for helper_linearizer_opt * tests * fix DSP * clean up * remove hack for kernelize.py * hack for test/test_multitensor.py TestMultiTensor.test_matmul_shard_none * clean * uop.r need reshape? * lower_store cause fail * fix lower? * avoid contiguous hack * 2134 * conv2d count * remove unused * hack lower * reduced and clean up * fix TestMultiTensor.test_matmul_shard_none * src sync + fix TestMultiTensor.test_matmul_shard_none * remove excluded in mop --------- Co-authored-by: b1tg <b1tg@users.noreply.github.com> Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com> Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>	2025-08-18 10:09:17 -07:00
chenyu	c30a113b2a	support bf16 and fp8 in Tensor.tolist (#11704 ) memoryview does not support it, but casting works fine so cast is fine	2025-08-17 15:11:13 -04:00
qazal	d762edd694	viz: define tracks in python (#11701 ) * viz: defines tracks in python * update unittests * figuring it out * works * diff cleanup * math * y axis is back	2025-08-17 18:19:13 +03:00
George Hotz	9366a23eb0	test backward in test_tiny (#11697 ) * test backward in test_tiny * empty	2025-08-16 20:29:39 -07:00
chenyu	4666df71c1	fix test_fuse_and_tc_opt (#11699 )	2025-08-16 21:10:53 -04:00
geohotstan	3d7c35d615	add fuse and tc opt bug repro (#11695 ) * FINALLY HAVE A SMALL REPRO OH BOY * show failure in CI * cleaner? * 1 possible fix * Revert "1 possible fix" This reverts commit `9e0fd215dd`.	2025-08-16 18:24:49 -04:00
qazal	c8ba48b223	show rewrite errors in viz (#11684 )	2025-08-15 19:09:47 +03:00
George Hotz	560984fd8d	small changes from rangeify (#11682 ) * small changes from rangeify * const like thing * ksym	2025-08-15 08:45:52 -07:00
chenyu	d0d39885c3	onnx in tinygrad (#11675 )	2025-08-14 19:57:21 -04:00
nimlgen	4176b24264	amd: support xcc in regs (#11670 ) * amd: support xcc in regs * mockamd * typong	2025-08-14 21:20:11 +03:00
geohotstan	1e904155e3	Add Onnx Huggingface to test/models/test_onnx.py (#11468 ) * BOOM * cache extra/huggingface/models/ * why max buffer size is not 0 * override MAX_BUFFER_SIZE * less models * remove more models and change cache dir to already cached dir * only metal * less is more? * remove check ops * why is this not setting the ENVVAR * ughhhhh just test in models * only cpu and gpu * only cpu actually * just override it idk * final * move extra dependencies up top * simplification * fix print * make README better * revert ops_disk fix for now * clean up test_onnx * remove testing fashion clip model cuz sloooowwwwww * actually let METAL run this * fix comment mistake * fix download path in run_models * does this work? * cleanup setup and teardown * contextvar like this? * prove model is cached * do I need to increment DOWNLOAD_CACHE_VERSION? * see if cached with incremented DOWNLOAD_CACHE_VERSION * use warnings to see if the model exists * revert DOWNLOAD_CACHE_VERSION stuff and clean up * add retry to download * nit	2025-08-14 11:16:41 -04:00
Sieds Lykles	06beeb6e13	Nest div even if factor is negative (#11666 )	2025-08-14 13:58:59 +02:00
Sieds Lykles	661e9a2d5d	div_and_mod_folding refactor (#11585 ) * divmod const folding is its own function * split nested mod optimization out of div and mod folding * make `fold_binary_numerator` its own function * factor out `fold_divmod_congruence` * check sign of numerator * add tests * assert int on vmin and vmax * add type: ignore * factor out more rules * remove div_and_mod_folding * cached_property to property * remove import * add returns * restore old order * check sign of x.vmin and newx.vmin * check more signs * add some test that would have caught bugs * better test if the div simplified * shorten line * replace terms_factors_const with pop_const * move that back * minor cleanup * remove comments * some cleanup	2025-08-14 11:52:42 +02:00
chenyu	0fc43c2e54	fix test_const_tensor_index index (#11660 ) index should be ints	2025-08-13 19:50:16 -04:00
chenyu	4fe19eec72	Ops.TRUNC (#11659 )	2025-08-13 18:40:48 -04:00
George Hotz	22bdf48cdd	render ranges in viz, name gbufs with sizes. changes from rangeify (#11656 ) * render ranges in viz, name gbufs with sizes. changes from rangeify * fix unit test dtypes	2025-08-13 12:46:16 -07:00
kevvz	e2873a3a41	[bounty] Muon optim (#11414 ) * newton schulz * add muon + move newton schulz to tensor * compact newton schulz * better tests * cleanup * add comments for muon * cleanup * add export with tests * match muon optim with test optim * cleanup * unsed import * correct comment * whitespace * move export * muon test fix * match reference impl + tests * remove export by moving muon device * add credit * cleanup * remove print * spacing * spacing * comma * cleanup * removal * fix tests + optim momentum * consistent is not/ not * more consistency * fix test * cleanup * fix the nones * remove comment * cast * comment * comment * muon teeny test * muon flag beautiful mnist * set steps * steps as hyperparam * match default test steps * name * large cleanup * dont care about steps * nesterov false default * match each other impl * steps * switch nest * swap defaults * update docstring * add no nesterov test * ban fuse_optim * prints * classical momentum * alternative condition * recon * pre + post wd * false default * detach * signature changes * context * swap order * big cleanup * 0 step instead * parity * remove fuse * remove fused * better paper * assert message * correct shape check + eps * multidim * add eps * cleanup * correct assert message * lint * better tests * naming * ns_steps,ns_params * update docstring * docstring * match sgd and muon together * sandwich * add back fused * parity --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-08-13 14:27:55 -04:00
George Hotz	d2521d828a	transcendental+idiv+threefry are uop decompositions (#11636 ) * transcendental+idiv+threefry are uop decompositions [pr] * threefry decomp * fix randomness tests * fix webgpu * unneeded now * fix * move prematcher * all cast should probably be cast_vec	2025-08-13 09:37:12 -07:00
geohotstan	925555b62a	Fix onnx Domain bug (#11650 )	2025-08-13 08:20:50 -07:00
chenyu	0d8a0d7a96	update test_multi_const_folding_tensor to include pow (#11635 ) pow folds now	2025-08-12 13:35:37 -04:00
Sieds Lykles	4d6e407eb0	Extend fast_idiv to negative ints (#11632 ) * fast idiv for signed ints * Add rule and test * fix tests * redo fuzz_fast_idiv to do negative ints as well * adjust comments * remove unused imports	2025-08-12 19:34:49 +02:00
geohotstan	ad9dec25b3	combine onnx parser and onnx (#11485 ) * start * more * fix onnx_runner test * pass * patch for disk and add domains from huggingface * simpler docs * revert domain changes * rerun ci * revert onnx ops test change * add fix from strenum stuff * correct way * revert correct way to leave the fix for another PR * test segfault * Revert "test segfault" This reverts commit `4e1aaf41e7`. * remove some unnecessary documentation * test segfault again * Revert "test segfault again" This reverts commit `56fc5f03e7`. * try gemini suggested patch for sys._getframe * keep trying with gemini * revert not working gemini suggestions and try faulthandler * remove pythonfaulthandler * trigger CI a few times * minimize diff --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-08-12 12:56:39 -04:00
Sieds Lykles	4c3982c44e	Take sign out of mod (#11631 ) * Add rule and test * fix tests	2025-08-12 18:44:36 +02:00
chenyu	0d7075f2de	assign should broadcast input tensor (#11629 ) fixed test_assign_broadcast	2025-08-11 23:36:35 -04:00
George Hotz	ca41b5e38b	skip_0 in graph rewrite [pr] (#11627 ) * skip_0 in graph rewrite [pr] * no track_rewrites on test * use dict instead of set	2025-08-11 18:29:04 -07:00
chenyu	0c97d6de1b	don't round pow output for int pow int (#11625 ) also added atol=0 and big pows for the tests	2025-08-11 20:57:47 -04:00
chenyu	d623f6d850	support int Tensor pow to const non-negative int (#11624 ) matches torch	2025-08-11 19:50:19 -04:00
chenyu	857a830dcc	fix test_arange_float_step (#11623 )	2025-08-11 16:58:42 -04:00
ttomsa	ae0c3cfff6	change clang -march flag to -mcpu on arm (#10970 ) Co-authored-by: wozeparrot <wozeparrot@gmail.com>	2025-08-11 13:38:48 -04:00
geohotstan	27bcb9fd1c	Support cubic mode for ONNX Resize OP (#11612 ) * start * add reference * this is so much slower * this makes sense but differs from official impl, but results are still correct..? * add a comment * Just keep it simple for now since I don't fully get it yet * address comments * correct * teeny clean up * another small comment improvement lol	2025-08-11 11:49:30 -04:00
nimlgen	d2bb1bcb97	cloud: a bit better err handling (#11616 ) * cloud: err propagation to client * fix * print exc * linter * excs * fix * hm * flaky	2025-08-11 15:51:22 +03:00
chenyu	a67e0917c3	list indexing can normalize in python (#11609 ) * list indexing can normalize in python list index does not need to be normalized in tensor * update those	2025-08-10 20:02:38 -04:00
chenyu	1181ec0cd2	few more tensor indexing test cases (#11608 )	2025-08-10 18:56:42 -04:00
George Hotz	996c907c0b	rewrite not ready + children machinery (#11607 ) * rewrite not ready + children machinery * it doesn't like track rewrites	2025-08-10 15:28:30 -07:00
geohotstan	b0dab6a4cd	onnx Resize OP clean up (#11603 ) * start * slight clean up	2025-08-10 14:10:39 -04:00
Sieds Lykles	10540414cd	Add Ops.CMPEQ (#10431 ) * Add op * add to Groupop.ALU * fix spec * fix ptx * temporary pickle by name to see process replay * add Ops.EQ to binary ops * Actuall rename properly * add test to assert CMPEQ is being used * Ops.CMPEQ is automatic cast to bool * add Ops.CMPEQ to llvm * add Ops.CMPEQ to llvm	2025-08-10 13:13:16 +02:00
chenyu	dfb702ef33	fix sort for small dim (#11601 ) * fix sort for small dim * fixed test_sort_empty	2025-08-10 01:17:41 -04:00
Sieds Lykles	01c770c77b	Fix z3 float cast in indexing (#11590 ) * adjust dtype of z3_renderer and add rule for cast * dtypes.bool is also cast noop * add regression test * make embedding smaller * even smaller test	2025-08-09 17:59:23 +02:00
Sieds Lykles	10d388499d	Refactor optional.py (#11578 ) * move fast_idiv to transcendental * move optional.py * adjust comment * change import * mypy needs this?	2025-08-09 17:35:05 +02:00
qazal	16f0edbe90	pass opts arg in get_program process replay [pr] (#11571 ) * fix ptx process replay * keyword arg * renderer is also optional [pr] * test_linearizer fixup * name function order is args,ret,kwargs * can use opts_to_apply * pass through p.applied_opts * sink_arg * now it opens devices too	2025-08-08 03:05:09 +03:00
qazal	960cc6533a	pass through name function args in track_rewrites (#11572 )	2025-08-08 02:28:52 +03:00
George Hotz	82be8abfd2	move opt under codegen (#11569 )	2025-08-07 14:19:17 -07:00
George Hotz	6ed2dfd187	delete the arange dim mismatch restriction (#11568 ) * delete the arange dim mismatch restriction * skip that test race	2025-08-07 13:46:17 -07:00
chenyu	aa1a6f2132	support threshold in Tensor.softplus (#11564 ) fix gradient for large input	2025-08-07 13:43:18 -04:00

4144 Commits