Commit Graph

65 Commits

Author SHA1 Message Date
chenyu
14d1c5fdfd assign fusion tests on detach and contiguous_backward (#15092) 2026-03-02 15:21:51 -05:00
qazal
f7aeff6061 viz: cli.py cleanups, do not require PYTHONPATH (#15085)
* cleanup the print

* sys.exit

* equal check

* cleanup unpacker

* cli doesn't need PYTHONPATH

* no semicolons

* %s/PYTHONPATH=. //g
2026-03-02 19:24:38 +09:00
chenyu
fe0fa8333b Revert "improve Tensor.sort indices (#15070)" (#15072)
This reverts commit e3003631f2.
2026-02-28 14:40:30 -05:00
chenyu
e3003631f2 improve Tensor.sort indices (#15070)
* improve Tensor.sort indices

instead of an N^2 match at the end, start with an arange and go through the same N(logN)^2 path

* contiguous
2026-02-28 14:16:16 -05:00
chenyu
d345f7f5dc remove _pending_assigns (#15040) 2026-02-26 22:38:10 -05:00
George Hotz
e3fa9896b7 start function and add walk rewrite (#14992)
* start function and add walk rewrite

* work

* add function on feed_forward

* llm progress

* stuff

* none of that
2026-02-25 13:56:27 +08:00
George Hotz
b643fca51e clean up complete_create_schedule_with_vars (#14980)
* clean up complete_create_schedule_with_vars

* transform_to_call

* update viz tests
2026-02-24 16:12:36 +08:00
ttomsa
0366474089 Bool cast to cmpne (#14544)
* test

* rm in llvmir

* rm in ptx and nir

* hmmmm

* rm in decompositions

* skip tests

* add test

* just this

* rm comment

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2026-02-23 10:31:36 -05:00
George Hotz
b824490e3f allocate generates a call (#14958)
* allocate generates a call

* symbolic works too

* DEFINE_VAR is param

* replace param later

* apply buffers

* name

* upd

* this was a bug...
2026-02-23 15:59:20 +08:00
chenyu
4424757b9a update test_sharded_memory (#14956)
cleaned up and moved to test/null
2026-02-22 16:56:08 -05:00
qazal
c5029fa460 jit case with Tensor.empty input, realized means allocated (#14930)
* simple failing jit test case with Tensor.empty

* this used to exist in ops.py...

* Revert "removed if self.buffer.is_allocated() in realized (#14836)"

This reverts commit 72cf603805.
2026-02-21 16:33:55 +09:00
George Hotz
df7774661a remove late numbering of UOps (#14923)
* remove late numbering of UOps

* stupid fix

* dead code
2026-02-21 09:18:48 +08:00
chenyu
24286c5593 fix clone for multi (#14919)
also update empty_like to make sure it's backed by buffers
2026-02-20 17:21:09 -05:00
chenyu
a4634b253a fix empty_like for sharded tensor (#14915) 2026-02-20 16:30:04 -05:00
George Hotz
2611907afb start ripping out old scheduler -- no maps (#14909)
* start ripping out old scheduler -- no maps

* no more metadata
2026-02-20 21:05:04 +08:00
George Hotz
55d3a5def9 preallocate all realized buffers (#14823)
* preallocate all realized buffers

* contiguous

* work

* comment that out

* move to schedule

* better

* correct fix

* just buffer

* disk bufs

* fixes disk tensor stuff

* fix symbolic stuff

* fix multi

* 162 failures

* bugfixes

* don't check that anymore

* fix schedule tests

* mnist should be contiguous

* type and buffer

* fix tests

* shrink axis correction

* mypy fixes

* tests skips

* same 37 failures

* dedup

* no shrink in the graph

* 29 failures

* skips

* fix custom kernel

* fix training

* those optimizations aren't supported currently

* simpler

* more correct

* tests

* 14 failures

* works

* fix that test

* broken

* 11 failures

* only kernel counts left

* fixes

* all tests pass

* remove tensor_map

* op test

* 200 -> 230

* test fixes

* fixes

* revert test_tiny thing

* guard

* revert that

* test tiny passes

* no contigs there

* base realize back

* Revert "no contigs there"

This reverts commit c45bb9fcfd.

* revert that

* chop many assigns

* 12 failures

* fix tests

* tests

* apply after

* pre-commit

* remove old code

* delete that

* fix types

* remove extra contig

* fix dataloader

* torch fix

* disk fix

* update kernel fusion numbers

* runs on amd

* restore kernel count

* add that rule back

* that

* disable that

* wrong

* add the correct rule for that folding

* more tests

* guard c1.arg

* no newlines

* realize those

* split into a different file

* remove detach/contig back

* skip 2

* update that
2026-02-20 20:05:54 +08:00
George Hotz
6610255654 add the correct rule for gcd div/mod folding (#14905)
* add the correct rule for that folding

* more tests

* guard c1.arg
2026-02-20 18:11:54 +08:00
George Hotz
a28fc2fba7 hotfix: remove wrong symbolic rule 2026-02-20 17:09:18 +08:00
qazal
e9ae3da711 viz: click on CALL node goes to codegen (#14609)
* viz: click on CALL node goes to codegen

* colored name
2026-02-20 11:13:11 +09:00
George Hotz
fc5677c28b resnet dataloader + more test cleanups (#14899)
* resnet dataloader

* tests
2026-02-20 10:05:47 +08:00
chenyu
b9744ab62b one more test_gpudims test (#14898)
failure from the bad simplification attempt
2026-02-19 18:18:44 -05:00
chenyu
9d6cf00be2 fix gpudim bug and test_split_2d_to_3d (#14896) 2026-02-19 16:46:24 -05:00
chenyu
2b31823ef9 update test_gpudims to prove bijectivity (#14895)
* update test_gpudims to prove bijectivity

* one more
2026-02-19 16:18:59 -05:00
chenyu
19ce7a3f7f use z3 to verify gpudims output index (#14894)
found a bug with z3
2026-02-19 15:24:38 -05:00
chenyu
52f727738b move test_grouped_dims to test/null (#14893)
it's a pure helper
2026-02-19 14:50:53 -05:00
chenyu
7400362a86 remove UOp.vars [pr] (#14891) 2026-02-19 12:09:39 -05:00
George Hotz
f6c1cf343c new symbolic rule from prealloc_bufs (#14883)
* new symbolic rule from prealloc_bufs

* optim
2026-02-19 20:57:30 +08:00
George Hotz
2f0f8b5776 more test relaxations from prealloc_bufs (#14880) 2026-02-19 14:23:28 +08:00
George Hotz
ab61c16730 fixes and test relaxations from prealloc_bufs (#14875)
* fixes and test relaxations from prealloc_bufs

* fix error type and guard _mop

* revert that

* contiguous makes extra/torch_backend/test_kernel_fusion.py fail
2026-02-19 11:37:25 +08:00
chenyu
f771de6738 gc.collect() to get the correct GlobalCounters.mem_used in tests (#14868)
test can be flaky if gc happens in between
2026-02-18 15:01:23 -05:00
chenyu
5746a605ce UOp.axis raises for invalid reshape (#14863)
reshape is lazy now, so it's better to raise from the .axis call rather than have the caller handle the invalid case
2026-02-18 11:28:56 -05:00
George Hotz
ab55e8c6b9 assign should be used as output buffer (#14845)
* assign should be used as buffer

* late removed

* the fix

* better fix

* backward slice
2026-02-18 09:37:46 +08:00
chenyu
72cf603805 removed if self.buffer.is_allocated() in realized (#14836)
automatically fixes is_realized issue for empty
2026-02-17 15:35:56 -05:00
chenyu
f147791105 update test to reset and test kernel_count directly (#14832) 2026-02-17 11:48:46 -05:00
George Hotz
bc3487d607 VIZ display cleanups (#14811)
* exclude reshape/expand broadcasts from viz

* limit src lines
2026-02-17 10:03:08 +08:00
nimlgen
9f8afb518c viz: sdma gb/s in graph (#14798)
* viz: sdma gb/s in graph

* f
2026-02-16 16:45:06 +03:00
qazal
db3db476ff viz: add GB/s to SDMA (#14795)
* work

* better

* fix that

* no decimal
2026-02-16 20:09:20 +09:00
qazal
c2be31e75b move Estimates to rewrite rules [pr] (#14782)
* move Estimates to rewrite rules [pr]

* don't need this cached_property

* tuple

* return
2026-02-16 12:59:42 +09:00
George Hotz
0abcb9aac2 move more to mixins (#14780)
* move more to mixins

* revert

* move some

* do not change

* more

* fix tests

* Revert "more"

This reverts commit d942d59fa4.

* go

* work

* more

* work

* guard

* base
2026-02-16 11:35:00 +08:00
George Hotz
9759fd6193 dtype mixin (#14763)
* dtype mixin

* dtype mixin methods
2026-02-15 16:03:48 +08:00
George Hotz
32980c74d1 hotfix: skip flaky tests, looped many times on tinymac3 2026-02-15 07:46:29 +08:00
chenyu
043f5dbfa0 fix write-after-read tracking (#14754)
AFTER-AFTER was silently dropped, which breaks write-after-read
2026-02-14 17:23:05 -05:00
chenyu
0ce4a55dad clean up test_setitem_slice (#14750)
moved to test_setitem_schedule, and use contiguous zeros since the scheduler now handles empty differently
2026-02-14 14:29:16 -05:00
nimlgen
e1a18dadae fix devices for copies (#14747)
* fix devices for copies

* add test
2026-02-14 17:39:41 +03:00
George Hotz
c0fe78f73b BUG: metadata is lost with partial assign (#14732) 2026-02-13 21:35:21 +08:00
chenyu
50cb40be88 clean up test/null/test_indexing.py (#14720) 2026-02-12 22:36:53 -05:00
qazal
5b624b5e93 viz: better error message for out of range timestamps (#14722)
* test_timestamp_out_of_range

* rel_ts helper

* linter
2026-02-13 12:13:40 +09:00
chenyu
86352988d8 update test_uops_stats for setitem (#14710)
realizing both the full tensor and the slice should not add to global_mem
2026-02-12 12:26:13 -05:00
chenyu
56caf6a3a2 fix Estimate.from_uops for sliced access (#14695)
"assume all DEFINE_GLOBAL memory is accessed" is wrong for partial loads. Accumulate the accessed size from INDEX, then cap it at the full size. Now mem_est never exceeds lds_est.
2026-02-12 11:18:07 -05:00
chenyu
8551fa50d3 support bitcast in sym_infer (#14708)
fixed `DEBUG=2 DEV=WEBGPU python -m pytest test/backend/test_tensor_variable.py::TestTensorVariable::test_symbolic_pad`
2026-02-12 10:21:05 -05:00