* remove ExecItem and merge it with ScheduleItem
* less diff
* fix issues
* min diff
* don't change bufs in _lower
* min diff
* update
* revert
* fixes
* diff
* work on shape property
* reshape causing issues
* more mops
* all mops
* need to cache it
* _shape is like _device
* mostly works
* shape is good
* const uses _shape
* fix tests
* size doesn't use st
* close
* test is broken
* one less st
* hack for 3 op assign
* oops, I didn't mean to change that
* support emulate in the NullDevice
* reproed failure in emulation
* fix wmma
* it doesn't realize it when I reshape
* cleaner graph
* map out
* REDUCE_AXIS also gives the wrong answer
* maybe
* work
* back here
* try
* more
* refactor tests
* check MultiBuffer
* or copy
* fine with this
* don't need graph_rewrite_map in rangeify
* RANGEIFY=1 test_jit
* don't do any of that
* disk
* simple disk tensor
* more work
* run more tests
* it also doesn't copy every time
* skip tests that hang everything
* move device tests to test/device
* test speedups
* test device
* linalg to unit
* upd
* so pytest just works
* more divide and skip
* speed
* test devectorize
* add pillow
* viz: non-blocking UOp tracing
* u.arg
* no if Ops.KERNEL
* drop replace
* switch to weakref.WeakKeyDictionary
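For context, a minimal sketch of the `weakref.WeakKeyDictionary` caching pattern this switches to (generic Python, not tinygrad's actual cache): entries disappear when the key object is garbage collected, so the cache never keeps otherwise-dead objects alive.
```python
import weakref

class Node:
  """Stand-in for a UOp-like object (illustrative only)."""

# keys are held weakly: when the last strong reference to a key dies,
# its entry is dropped from the dictionary automatically
cache: "weakref.WeakKeyDictionary[Node, int]" = weakref.WeakKeyDictionary()

n = Node()
cache[n] = 42
assert cache[n] == 42
del n                    # drop the last strong reference
assert len(cache) == 0   # entry is gone (immediate under CPython refcounting)
```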
* back
* remove ram usage skips, viz works here
* cache on reconstruct
* move high level stuff to unit tests [pr]
* process replay on unit tests
* fix pr, less compute
* set omp num threads
* set 200MB buffer size limit
* delete junk
* fix tests
* faster
* move test_indexing to unit
* faster
`test_data_parallel_resnet_train_step` is already skipped on LLVM/CPU:
```python
@unittest.skipIf(CI and REAL_DEV in ("CUDA", "NV", "LLVM", "CPU"), "slow, and flaky on LLVM/CPU")
@unittest.skipIf(REAL_DEV == "WEBGPU" and not OSX, "WEBGPU Vulkan can only run kernels with up to 10 buffers")
def test_data_parallel_resnet_train_step(self):
```
It looks like `test_data_parallel_resnet` (no `_train_step`) is flaky in a similar way:
https://github.com/tinygrad/tinygrad/actions/runs/15472667248/job/43560773882?pr=10642#step:9:64
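If it is, one plausible fix is to extend the same skip decorator to the sibling test; a sketch (not the actual change):
```python
@unittest.skipIf(CI and REAL_DEV in ("CUDA", "NV", "LLVM", "CPU"), "slow, and flaky on LLVM/CPU")
def test_data_parallel_resnet(self):
  ...
```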
* prevent huge waste of multi ram
* fix ram usage
* only define var
* add resolve
* fix tests
* fix cifar training
* remove that logic
* fix test without long
This should fix the remote CPU test flakiness (the segfaults were in
`test_data_parallel_resnet_train_step`, which is skipped on CPU but wasn't
skipped on remote CPU).
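A hedged sketch of the idea: resolve the device behind the remote proxy before evaluating the skip, so a CPU-only skip also fires when CPU is reached via REMOTE (the `REMOTE_DEV` environment lookup here is illustrative, not the repo's actual helper):
```python
import os
from tinygrad import Device

def backing_device() -> str:
  # when the default device is the remote proxy, check which device the
  # remote worker actually runs on (env-var resolution is an assumption)
  dev = Device.DEFAULT
  if dev == "REMOTE": dev = os.getenv("REMOTE_DEV", dev)
  return dev

# skip conditions then test the backing device, not the proxy
SKIP_SLOW = backing_device() in ("CUDA", "NV", "LLVM", "CPU")
```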
* multi is O(1)
* allreduce
* no new uops needed
* junk
* something
* simple
* that's really what I want
* closer
* inject _device_num
* pretty print
* cleanups
* this
* early dnum
* ops allreduce is good
* ish
* device is the tuple and this is fine
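That matches tinygrad's public sharding behavior: a multi-device tensor's `.device` is the tuple of devices it lives on. A small illustration (device names are placeholders):
```python
from tinygrad import Tensor

# shard a tensor across two devices along axis 0
t = Tensor.ones(256, 256).shard(("CPU:0", "CPU:1"), axis=0)
print(t.device)  # ("CPU:0", "CPU:1") -- the device *is* the tuple
```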
* simpler
* progress
* copy_multi
* work
* more tests
* more tests pass
* work
* no None axis
* tests
* no none multi
* type fixes
* pre commit passes
* lil
* remove this
* mlperf dataloader on mac
* that test was wrong
* unbind
* support DEBUG=2
* realize
* only unbind bound vars
* don't include fixedvars
* graph test
* one test
* fixedvars in hcq
* new ring reduce
* ring reduce
* simpler ring
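For reference, a generic simulation of the ring-allreduce pattern these commits implement (plain Python, not tinygrad's kernels): each device reduces one chunk while passing partial sums around the ring, then the finished chunks are circulated, so each device moves roughly 2(N-1)/N of the data.
```python
def ring_allreduce(vals: list[list[float]]) -> list[list[float]]:
  """Simulate ring allreduce: vals[i][j] is device i's j-th chunk (a scalar here)."""
  n = len(vals)
  acc = [row[:] for row in vals]
  # reduce-scatter: n-1 steps, each device forwards a running partial sum
  for step in range(n - 1):
    sends = [(i, (i - step) % n, acc[i][(i - step) % n]) for i in range(n)]
    for i, c, v in sends:
      acc[(i + 1) % n][c] += v
  # device i now holds the complete sum of chunk (i + 1) % n
  # allgather: n-1 steps circulating the finished chunks
  for step in range(n - 1):
    sends = [(i, (i + 1 - step) % n, acc[i][(i + 1 - step) % n]) for i in range(n)]
    for i, c, v in sends:
      acc[(i + 1) % n][c] = v
  return acc

# every device ends with the elementwise sum [12.0, 15.0, 18.0]
print(ring_allreduce([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]))
```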
* mselect
* mselect doesn't work
* Revert "mselect doesn't work"
This reverts commit c78b77bd7d.
* Revert "mselect"
This reverts commit bb2e430ac3.
* simpler
* fixups
* no optional
* fix jit
* move things around
* cleanup multi
* simpler multi
* simpler reshape