tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Author	SHA1	Message	Date
George Hotz	41d00a046d	add device to local, fix PCONTIG=2 (#14266 ) * add device to local, fix PCONTIG=2 * regression test * remove the device when we render * viz slowness * no long	2026-01-21 22:12:18 +09:00
nimlgen	22af7132cd	fix test_dev_jitter_matrix (#14255 )	2026-01-20 20:07:51 +03:00
C T	26f8b12e01	Whisper audio helpers (mel filters in tinygrad) (#13478 ) * add whisper audio helpers for stft/mel/resample * cleanup * add whisper stft test * make only stft test explicitly depend on librosa * extract sinc_window_kernel * dehardcode device * use same device argument * simplify * type annotate * ruff format audio_helpers.py * ruff format test_whisper.py * add WHISPER_NEW_STFT * rename * undo ruff format changes * use new stft and mel for whisper * remove stft test that depends on librosa * remove whitespace * add Tensor.log10 with test\test_ops.py::TestOps::test_log10 * use Tensor.log10 * fix lint * future: remove unused STFT class * future: remove resample code since it isn't used (yet) * match openai with pad_mode="reflect" * pad_to * future: cut resample leftovers * cleanup * add mel tests * future: cut stft * future: cut non-mel prep_audio changes * reduce diff * move audio_helpers.py to examples * reduce whitespace * fix imports * reduce whitespace --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2026-01-20 10:50:02 -05:00
George Hotz	5e24643889	minor import speedups (#14244 ) * minor import speedups * server stuff in server places * pre-commit * fix	2026-01-20 15:05:36 +09:00
qazal	b1c5a242b7	Revert "move is_dtype_supported logic to renderer (#14188 )" (#14237 ) This reverts commit `161fee9a48`.	2026-01-20 12:19:14 +09:00
chenyu	9ea63d7d52	failed test case for onnx IF with jit (#14235 ) silently fails now since onnx treats IF cond as a const	2026-01-19 18:10:05 -05:00
George Hotz	31bcbed6bb	AMD_DISABLE_SDMA for testing with -n12 (#14216 )	2026-01-19 16:10:30 +09:00
Christopher Milan	161fee9a48	move is_dtype_supported logic to renderer (#14188 ) * move is_dtype_supported logic to renderer * fix CPU_COUNT * mypy happy * early import libclang too with llvm * run with debug * skip autogen tests if MTLCompiler or llvm is loaded * run autogen tests separately in CI * lint	2026-01-18 22:37:04 -05:00
chenyu	67d9712ef6	jit copy aliased output if it's read later (#14210 )	2026-01-18 18:48:59 -05:00
chenyu	97333b1954	jit footguns test case on assign with same buffer outputs (#14209 ) related https://github.com/tinygrad/tinygrad/issues/13364	2026-01-18 16:01:09 -05:00
chenyu	e7c2df9113	improve consecutive Tensor indexing (#14208 ) * improve consecutive Tensor indexing instead of O(idx_counts*src_dims), it can just be O(idx_counts) test correctness	2026-01-18 15:14:33 -05:00
chenyu	c7b8f6496f	remove dtypes.index_like and dtypes.fields [pr] (#14207 ) barely used, so just use inline and DTYPES_DICT	2026-01-18 11:49:01 -05:00
chenyu	5e6a72c33f	new Onnx Gather (#14187 ) instead of assuming const indices, check if it showed as a const	2026-01-16 22:24:07 -05:00
chenyu	ab244c7f81	onnx Gather should not assume indices to be const (#14185 ) * onnx Gather should not assume indices to be const added a failed test case * just list	2026-01-16 20:55:00 -05:00
wozeparrot	a879b54234	tk: fa jit fix (#14170 )	2026-01-16 16:38:45 -08:00
Christopher Milan	a021b84604	autogen: fix enum (#14171 )	2026-01-16 01:30:11 -05:00
chenyu	14e9a71a41	move test_assign to unit (#14165 ) scheduling these should not depend on device	2026-01-15 17:10:13 -05:00
Christopher Milan	0cb024a5bb	remove ctypes.Structure (#13651 )	2026-01-15 05:06:22 -05:00
qazal	164bc678a6	scheduler: sched_cache bugfix for different Tensor.custom_kernel schedules (#14161 ) * simplest failing test * min fix * same function reuses the cache * SPEC=2 never worked for custom_kernel	2026-01-15 14:59:14 +09:00
qazal	b46da603fe	codegen/custom_kernel: do not attach KernelInfo to user program (#14160 )	2026-01-15 14:01:48 +09:00
chenyu	add7da268f	multiple slice assign test (#14157 ) GANing test cases	2026-01-14 21:08:03 -05:00
chenyu	1381daac06	many more failed assign tests (#14153 ) assign is quite broken	2026-01-14 16:20:28 -05:00
chenyu	899a56446e	failed assign test cases with write before read (#14148 ) slice assign write before read fails now. this is why kv cache needs a realize	2026-01-14 10:30:50 -05:00
chenyu	2a2c1eacf6	disable fast_idiv on metal (#14137 ) there's a metal compiler bug which was the root cause that keccak needs a contiguous hack	2026-01-13 21:40:40 -05:00
wozeparrot	a92778aa0c	tk: fa multi fix (#14134 )	2026-01-13 17:22:15 -08:00
chenyu	fe00682502	clean up svd tests (#14133 ) removed from test_ops and added to TestTorchBackend	2026-01-13 16:32:21 -05:00
chenyu	e610821c52	Tensor.cummin and Tensor.nonzero (#14131 )	2026-01-13 15:09:56 -05:00
chenyu	176a934ddd	Tensor.diagonal support offset and dims (#14130 )	2026-01-13 14:49:06 -05:00
qazal	79d00521f8	viz: fix cfg err when endpgm is in the middle of stream (#14128 ) * kernel from beautiful_mnist * minimal test * correct way to do this * rm that	2026-01-14 02:00:34 +09:00
qazal	fd10fd245a	viz: cfg tokenizer fix and unit tests (#14121 ) * output Ops.BINARY * failing test for the cfg * dsl renamed to offset and sz * add better asserts * move the note	2026-01-13 15:08:55 +09:00
chenyu	05fcb57696	also return index in Tensor.cummax (#14117 ) * also return index in Tensor.cummax * fix	2026-01-12 22:42:10 -05:00
wozeparrot	7c967399a4	tk: add failing test for fa multidevice (#14116 )	2026-01-12 19:11:09 -08:00
George Hotz	330a0b686e	assembly/amd: clean up dsl and make type verification strict (#14102 ) * assembly/amd: start newdsl * work * newdsl upd * Reg is p nice * cleaner * work * getting clean * all fields * more BitFields * redo the pdfs with dsl2 syntax * no lit * cleanups * more defaults * fix get and remove crap * aliases * ugly but kind of works * NULL, not rawimm * clean up defaults * only dsl * asm fixes * lit fixup * more lit * cleanups * olddsl * single pcode dict * emu sort of works * trash test * global is global * types property * reg mods * fix a few tests * remove monkey patch * fixes * less hacks in tests * less hacks in tests * 4 test failures * hw tests all pass * fix compare emulator * fix some tests * 3 more * fix and shorten sqtt * handwritten * fix validation * test corrections * all types validate * fix dsl2 tests * fix bugs in disasm * skips on cdna * work * repr with reg[] * fix bitfield tests * merge pcodes in dsl * remove override * disasm uses inst.types * simpler	2026-01-13 08:52:16 +09:00
C T	a8c821f45e	add Tensor.log10 with test\test_ops.py::TestOps::test_log10 (#14113 )	2026-01-12 13:45:47 -05:00
chenyu	6b0a9f5ee6	don't strip sink in to_uops_list [pr] (#14111 )	2026-01-12 11:19:03 -05:00
chenyu	cad7feec02	more onnx ops (#14104 ) HannWindow, HammingWindow, BlackmanWindow, Hardmax, LpNormalization	2026-01-12 09:11:13 -05:00
chenyu	9973a81356	add channels_last to QLinearGlobalAveragePool (#14094 ) and other minor cleanups	2026-01-10 18:38:19 -05:00
chenyu	35c9701df0	update outdated tests and comments (#14090 )	2026-01-10 01:00:48 -05:00
chenyu	92246ea731	update tests, `WEBGPU=1 pytest .` passes (#14089 ) * update tests, `WEBGPU=1 pytest .` passes * minor update	2026-01-10 00:03:02 -05:00
chenyu	c34c6d9468	fix wgsl packed_store can drop valid (#14088 ) * fix wgsl packed_store can drop valid * fix	2026-01-09 15:22:06 -05:00
chenyu	eacccc5ace	more disk assign tests (#14087 ) covers more edge cases	2026-01-09 14:14:52 -05:00
chenyu	ed295e74dc	don't skip gguf test if ggml is not installed (#14086 ) * don't skip gguf test if ggml is not installed should just let it fail * fix	2026-01-09 12:05:58 -05:00
chenyu	cff33c8d78	add some disk assign tests (#14085 )	2026-01-09 11:50:59 -05:00
chenyu	74fa3c7d09	decomp pow for LVP (#14084 ) test failed due to undefined behavior, so use decomp instead	2026-01-09 10:50:28 -05:00
b1tg	0fbc551622	train bert with fp8 (#13874 ) * fp8 train * clean * lint * test fix from #13439 * skip first/last layer * rm __init__, restore unroll <=32 check * tests * clean test, remove unused * multi-gpu test, clean quantize_to_fp8 * remove bert contiguous * run script * test: better check * run script search * add seed in bert data shuffle * move script to mi350x folder --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2026-01-09 09:21:59 -05:00
chenyu	efcb32f6a9	unique const when requires_grad is set to True (#14075 ) * unique const when requires_grad is set to True * fix pyrender	2026-01-08 16:30:45 -05:00
chenyu	b34c637767	support bfloat16 for CL (#14073 )	2026-01-08 14:14:29 -05:00
Garret Castro	16b652302e	skip bf16 test if not supported by device (#14070 )	2026-01-08 13:37:24 -05:00
wozeparrot	027b935269	tk: fix grouped load store (#14035 )	2026-01-07 22:38:02 -08:00
chenyu	3caa1e2c98	fix cast HALF with PYTHON backend (#14058 )	2026-01-07 16:52:05 -05:00


4887 Commits