nimlgen
abd830b260
am: setup_rinf returns only doorbell (#15112)
2026-03-03 19:27:41 +03:00
nimlgen
4b42bb54aa
am: reset sdma to start from 0 (#15109)
2026-03-03 18:14:46 +03:00
George Hotz
01ddb4c267
add precompile to call (#15099)
* add precompile to call
* put get back
* something
* after structure
* alt
* keep it call
* resolve call
* resolve linear call
* precompile works with llm
* revert rangeify
* color for debugging
* getenv PRECOMPILE
* clean up deco pattern
* fully recursive sink scheduling
* revert llama
* fix SPEC=2
2026-03-03 22:32:42 +08:00
qazal
c7f908b788
sqtt: fix rdna4 structs (#15111)
* work
* DEBUG=2
2026-03-03 23:32:14 +09:00
qazal
8dd691761d
sqtt: remove old files (#15108)
2026-03-03 22:43:24 +09:00
Christopher Milan
de043226ba
benchmark comma usbgpu driving_vision step and load time (#15103)
Co-authored-by: Comma Device <device@comma.ai>
2026-03-03 06:08:03 -05:00
Christopher Milan
5f6b610da1
FLOAT16 logic for IMAGE==1 goes back to image_conv2d (#15105)
2026-03-03 05:37:57 -05:00
wozeparrot
529318259c
fix: fix null tests to actually use null device (#15104)
2026-03-03 02:05:47 -08:00
George Hotz
7d025089e3
no after removal (#15102)
* no after removal
* we are using walk
* null schedule test
* pytest deps
* Revert "pytest deps"
This reverts commit 5e1c5304ec.
* Revert "null schedule test"
This reverts commit 02da66053e.
* clean null tests
2026-03-03 17:50:31 +08:00
wozeparrot
92c16810ac
feat: per device mem_used (#15100)
2026-03-03 01:31:28 -08:00
qazal
e3a0598d0b
viz: the whole pc should be in view (#15101)
2026-03-03 17:17:53 +09:00
b1tg
a9ea36de79
assembly/amd: v_cmp_lg_f32 is ordered not-equal (#14982)
2026-03-03 15:37:48 +08:00
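For context on the fix above: in IEEE-754 terms, LG ("less than or greater than") is the ordered not-equal, which is false whenever either operand is NaN, whereas NEQ is the unordered not-equal. A minimal Python sketch of the distinction (illustrative only, not the project's actual code):

```python
import math

def v_cmp_lg_f32(a: float, b: float) -> bool:
    # ordered not-equal: a NaN in either operand makes the comparison false
    return not math.isnan(a) and not math.isnan(b) and a != b

def v_cmp_neq_f32(a: float, b: float) -> bool:
    # unordered not-equal: a NaN in either operand makes the comparison true
    return math.isnan(a) or math.isnan(b) or a != b

assert v_cmp_lg_f32(1.0, 2.0) and v_cmp_neq_f32(1.0, 2.0)
assert not v_cmp_lg_f32(float("nan"), 1.0)  # ordered: false on NaN
assert v_cmp_neq_f32(float("nan"), 1.0)     # unordered: true on NaN
```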
wozeparrot
c35de9bd68
asm_gemm: support more sharding (#15002)
2026-03-02 23:16:37 -08:00
wozeparrot
824ba4386a
llama3 dp fix (#15098)
2026-03-02 22:43:07 -08:00
chenyu
5dcf29b1a0
use clone in test_swap_slices (#15096)
2026-03-02 22:05:12 -05:00
Christopher Milan
c70e8af068
move IMAGE FLOAT16 logic to allocations (#15095)
* FLOAT16 logic in allocations
* cleanup
* separate that
* only apply when IMAGE == 1
* test passing now
* create image buffers earlier
2026-03-02 22:00:05 -05:00
George Hotz
d483e4153a
buffer view is like buffer (#15082)
* buffer view is like buffer
* fix
* swap_reshape_shrink
* contiguous on gguf, fix overlap
* revert that
* _device_supports_view
* this
* fix that test
* 0 buffers
* that test was wrong
* this
* check correct size
* contig BUFFER_VIEW
* this
* fix tests
* buffer view tests
* om
* fix torch
* no MOCKGPU
* skip
2026-03-03 09:52:33 +08:00
qazal
62ee976c1b
gemm/asm: cleanup repeated patterns to helper functions (#15094)
2026-03-03 08:14:47 +09:00
qazal
848f5cea96
viz: sqtt instruction packet trace (#15065)
2026-03-03 07:55:04 +09:00
chenyu
14d1c5fdfd
assign fusion tests on detach and contiguous_backward (#15092)
2026-03-02 15:21:51 -05:00
nimlgen
dfa180413d
tbgpu: sign nv (#15087)
2026-03-02 22:58:30 +03:00
chenyu
71f228f80f
test exact kernel count in torch_backend/test_kernel_fusion (#15091)
2026-03-02 14:26:32 -05:00
chenyu
f80b1033c5
simpler Tensor.all (#15089)
same generated kernel
2026-03-02 11:08:55 -05:00
chenyu
4008f7d4e8
move Tensor.one_hot +1 to python (#15088)
2026-03-02 10:56:41 -05:00
nimlgen
dafbe9733a
am: cleanup (#15086)
2026-03-02 17:06:21 +03:00
qazal
f7aeff6061
viz: cli.py cleanups, do not require PYTHONPATH (#15085)
* cleanup the print
* sys.exit
* equal check
* cleanup unpacker
* cli doesn't need PYTHONPATH
* no semicolons
* %s/PYTHONPATH=. //g
2026-03-02 19:24:38 +09:00
George Hotz
5ff278446c
add contiguous_view_offset (#15084)
* add contiguous_view_offset
* no int
2026-03-02 18:05:04 +08:00
Christopher Milan
977c270774
IMAGE=1 kernel count failing tests (#15083)
2026-03-02 04:35:26 -05:00
George Hotz
3539693555
Support triu variable on diagonal + SDPA symbolic (#15081)
* triu variable
* fails
* dumbbb
* no commutative in reshape
* real fix
* revert that
* sdpa symbolic tests
2026-03-02 12:19:48 +08:00
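A hedged usage sketch of what the commit above enables; the exact call pattern is an assumption from the title (Variable is tinygrad's symbolic integer, and triu's diagonal argument is assumed to now accept a bound Variable):

```python
from tinygrad import Tensor, Variable  # top-level Variable export assumed

d = Variable("d", 0, 4).bind(1)    # symbolic diagonal, bound to 1 at runtime
x = Tensor.ones(4, 4)
print(x.triu(diagonal=d).numpy())  # upper triangle shifted by the symbolic d
```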
wozeparrot
a4f6365929
llama3: fstep takes grads (#15069)
2026-03-01 20:05:07 -08:00
Nick
8e8e9f6ff6
assert removal for _tri() + tests (#15073)
* assert removal for _tri() and tests
* removed import
* tests triu/tril like in prefill
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2026-03-02 10:34:28 +08:00
nimlgen
ccbbca05ef
beam: add dev_timeout for am (#15063)
* beam: add dev_timeout for am
* all covered
* fk
* x
* fuzz
* reset
* f
2026-03-01 16:57:29 +03:00
chenyu
8cb4368967
delete unused END NOOP rule [pr] (#15077)
2026-03-01 00:09:05 -05:00
chenyu
efce99adc9
skip isComposing key press in llm.py (#15076)
for CJK input users
2026-02-28 20:31:53 -05:00
chenyu
103ea16ec0
add contiguous back to svd (#15074)
can cause an infinite loop
2026-02-28 16:49:26 -05:00
chenyu
fe0fa8333b
Revert "improve Tensor.sort indices ( #15070 )" ( #15072 )
...
This reverts commit e3003631f2 .
2026-02-28 14:40:30 -05:00
chenyu
e3003631f2
improve Tensor.sort indices (#15070)
* improve Tensor.sort indices
instead of an N^2 match at the end, start with an arange and go through the same N(log N)^2 path
* contiguous
2026-02-28 14:16:16 -05:00
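The idea in #15070 (reverted above in #15072): rather than recovering indices with an N^2 match of sorted values against the input, pair each value with its position from an arange up front so the indices ride through the sort itself. A plain-Python sketch of the pairing trick (not the tensor-level implementation, which the message says reuses the N(log N)^2 sort path):

```python
def sort_with_indices(xs: list[float]) -> tuple[list[float], list[int]]:
    # attach each value's original position (the "arange") before sorting,
    # so no post-hoc matching of sorted values back to the input is needed
    paired = sorted(zip(xs, range(len(xs))))
    return [v for v, _ in paired], [i for _, i in paired]

print(sort_with_indices([3.0, 1.0, 2.0]))  # ([1.0, 2.0, 3.0], [1, 0, 2])
```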
wozeparrot
cfc5cf65ad
llama3: vocab padding fix + jit copies on fakedata (#15067)
2026-02-28 08:44:55 -08:00
chenyu
76170d035a
relax atol for test_xlm_roberta_large (#15066)
2026-02-28 11:22:35 -05:00
qazal
cfb8e6922d
viz: arrow keys move through time (#15064)
* work
* automatic zoom, keeping scale
* the whole shape should be out of view
2026-02-28 23:52:36 +09:00
nimlgen
9b3450c9da
test gpu crash on cdna (#15062)
2026-02-28 13:17:59 +03:00
nimlgen
6bbf813dd3
ci: switch to tinygrad/amdcomgr_dylib (#15061)
2026-02-28 13:09:39 +03:00
nimlgen
77846300b2
am: reset vm fault (#15060)
2026-02-28 12:58:56 +03:00
George Hotz
dc54441e1f
add better printing to tinygrad.apps.llm (#15059)
* add better printing to tinygrad.apps.llm
* add gc.collect
* comment
2026-02-28 16:38:50 +08:00
George Hotz
bb84e389cf
functions for llama trainer (#15045)
* functions for llama trainer
* function there
* axis match
* fix multi
* lil cleaner
* there's a bug with HK_FLASH_ATTENTION
* training functions
* for commit
2026-02-28 12:15:18 +08:00
chenyu
9b4ba3f838
remove ReduceContext.range_to_ends [pr] (#15055)
* remove ReduceContext.range_to_ends [pr]
make merge_reduce_ends pure. this state causes issues when introducing more reduce-merging rewrites
* tag
2026-02-27 22:15:44 -05:00
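A sketch of the refactor pattern described above, with invented names (not the actual ReduceContext code): a helper that accumulates results on shared mutable state makes rewrite order matter, while a pure helper recomputes from its inputs and stays correct as more reduce-merging rewrites are added:

```python
class ReduceContextStateful:
    """Stateful variant: each call's answer depends on every earlier call."""
    def __init__(self):
        self.range_to_ends: dict[str, list[str]] = {}

    def merge_reduce_ends(self, rng: str, end: str) -> list[str]:
        # hidden accumulation: a new rewrite firing in between changes the result
        self.range_to_ends.setdefault(rng, []).append(end)
        return self.range_to_ends[rng]

def merge_reduce_ends(ends_by_range: dict[str, list[str]], rng: str, end: str) -> list[str]:
    # pure variant: same inputs always yield the same output, no order dependence
    return ends_by_range.get(rng, []) + [end]
```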
chenyu
151608aa90
update test_multiple_to_single_device (#15056)
follow-up to #14482, add SCACHE=0 to the test
2026-02-27 21:44:33 -05:00
chenyu
5fd06f4f02
differentiable setitem (#15054)
* differentiable setitem
go through the where path for bw
* no return
2026-02-27 17:25:15 -05:00
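A minimal sketch of the "where path" named above (illustrative usage, not the PR's code): expressing the update as a pure where makes the overwrite differentiable, routing gradient to the untouched elements of the base and to the written value:

```python
from tinygrad import Tensor

base  = Tensor([1.0, 2.0, 3.0], requires_grad=True)
value = Tensor([10.0], requires_grad=True)
mask  = Tensor([False, True, False])  # "set index 1"

out = mask.where(value, base)         # functional form of base[1] = value
out.sum().backward()
print(base.grad.numpy())   # [1. 0. 1.] -- no gradient through the overwritten slot
print(value.grad.numpy())  # [1.]      -- gradient reaches the written value
```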
chenyu
db6b3e1edc
fix mixed setitem with both basic and tensor indexing (#15050)
2026-02-27 15:35:48 -05:00
chenyu
c9f6d8751b
don't remove_bufferize for Invalid (#15053)
* don't remove_bufferize for Invalid
* replaced
2026-02-27 15:16:09 -05:00