tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-09 15:08:02 -05:00

Author	SHA1	Message	Date
chenyu	0a98fd38b3	fix tests that failed locally on mac (#13872 ) keccak output was silently broken without contiguous	2025-12-29 11:23:38 -05:00
Clément Verrier	0e409ff5ce	fix indentation in UOp pretty_print for repeated references (#13857 ) * fix correct indentation in UOp pretty_print for repeated references When a UOp was referenced multiple times, the walrus operator notation (e.g., x0:=) was correctly used for the first occurrence, but subsequent references had misaligned indentation due to an extra space character. Fix indentation misalignment in pretty_print() when UOps are referenced multiple times. * add simple unit tests for UOp repr --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-12-29 10:46:16 -05:00
anu	9b4de8abc7	fix beam in python 3.14+ (#13836 ) * fix beam search on python 3.14 * add PickleableCount class to helpers * change name, add test, add step * tidy count init	2025-12-27 16:24:22 -05:00
chenyu	54af29dbdb	trange can just be a function (#13827 )	2025-12-24 23:57:10 -05:00
George Hotz	43c6e973d8	add optional compiler in Renderer (#13817 ) * add optional compiler in Renderer [pr] * fix * late init * remove precompiled * cleanup	2025-12-23 17:58:46 -05:00
George Hotz	6439a515be	test fixups / speedups / var_vals refactor (#13812 ) * no PYTHONPATH + llm server port 0 * llm tok speedup * refactor var_vals	2025-12-23 12:05:59 -05:00
George Hotz	8dcba2e2cc	no full_rewrite [pr] (#13809 ) * no full_rewrite [pr] * fix * fix docs	2025-12-22 23:20:01 -05:00
George Hotz	df0f9d6860	add olmoe support to llm (#13792 ) * add olmoe support to llm * cleanups * simpler * clean * fix mypy * lil * remove dumb assert	2025-12-22 10:41:35 -04:00
chenyu	5cb827f7bf	clean up can_lossless_cast and add missing pairs [p] (#13793 )	2025-12-21 12:18:33 -05:00
George Hotz	75a6a03664	add qwen3 moe support to tinygrad.apps.llm (#13775 ) * qwen moe works * simple moe * one test * integration	2025-12-21 12:36:02 -04:00
chenyu	733ef0452c	update test_uop_resolve (#13777 ) plain @unittest.expectedFailure is too broad	2025-12-20 12:40:59 -05:00
chenyu	185a000882	gradient of COPY (#13760 )	2025-12-19 13:33:59 -05:00
George Hotz	aeb7516c8a	tests passing on tinybox h3 (#13742 )	2025-12-17 19:04:34 -04:00
George Hotz	b013244c38	fix local tests for AMD_LLVM (#13738 ) * fix local tests for AMD_LLVM * fix linters * skip that for now * fix segfault	2025-12-17 12:23:46 -04:00
George Hotz	3dbde178c1	mark slow tests as slow instead of as CI (#13736 ) * mark slow tests as slow instead of as CI * CI shouldn't have different behavior * more skips / CI * slow	2025-12-17 10:29:57 -04:00
George Hotz	9015a22523	make tests faster (#13734 )	2025-12-17 09:39:44 -04:00
George Hotz	cf0c28d5ae	all tests pass on strix halo (#13728 )	2025-12-16 19:35:50 -04:00
George Hotz	321ab943b2	qwen model is working (#13690 ) * qwen model is mostly working * add Q4_K quantization support to GGUF parser, add qwen3:1.7b model - Add Q4_K (type 12) dequantization in nn/state.py - Add qwen3:1.7b model using Q4_K_M quantization (smaller than Q8_0) - Make bos_token_id optional for models like Qwen3 that don't have it - Fix line length issues and add preset parameter to SimpleTokenizer 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * smaller diff * test dequant * half split * better * simple tok * mock token * polish * better * fix * replace --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-15 18:00:34 -04:00
George Hotz	a657a4e0f4	add Q4_K GGUF quantization support (#13700 ) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-15 10:17:56 -05:00
George Hotz	572ca80046	fast tinygrad.apps.llm (#13685 ) * llm: add --benchmark support * fix speed * debug logging * fix test attention	2025-12-14 21:05:21 -05:00
chenyu	ed962786d6	use assign in Tensor.backward (#13674 ) preserve the grad object so that jit works	2025-12-13 22:43:06 -05:00
George Hotz	55845f7de7	schedule: cache unbinds for consistent cache keys (#13664 ) * schedule: cache unbinds for consistent cache keys strip BIND values before computing cache key so different bound values (e.g. KV cache positions) hit the same schedule cache entry. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * spec: allow single-src BIND for schedule cache key normalization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: add lessons learned to CLAUDE.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * more claude.md --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 17:27:42 -05:00
George Hotz	8c87a0bf8d	Revert "schedule: cache unbinds for consistent cache keys (#13662 )" This reverts commit `af86cae10c`.	2025-12-12 16:49:50 -05:00
George Hotz	af86cae10c	schedule: cache unbinds for consistent cache keys (#13662 ) * schedule: cache unbinds for consistent cache keys different bound variable values (e.g. kv cache positions) now produce the same schedule cache key by unbinding BIND(DEFINE_VAR, CONST) before computing the cache key and rebinding after lookup. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * schedule: cache unbinds for consistent cache keys When scheduling, BIND(DEFINE_VAR, CONST) nodes are now unbound to tagged DEFINE_VARs before computing the cache key. This ensures that the same computation with different bound values (e.g., different KV cache positions in LLM) gets the same cache key and reuses the cached schedule. The fix: - pm_pre_sched_cache: replaces BIND with tagged DEFINE_VAR - pm_post_sched_cache: restores tagged DEFINE_VAR back to original BIND - pm_remove_rangeify_tags: excludes DEFINE_VAR to preserve tags through rangeify - var_vals extracted from BINDs before cache key computation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * schedule: fix BIND handling and add CLAUDE.md - Handle BIND to RANGE in create_schedule (not matched by CONST pattern) - Assert all BINDs on same variable have same value - Add CLAUDE.md codebase guide 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 16:40:10 -05:00
George Hotz	316da9f7ff	llm: add created/model fields, non-streaming support, and tests (#13660 ) * llm: add created/model fields, non-streaming support, and tests - Add `created` timestamp and `model` fields to response (required by OpenAI spec) - Add non-streaming mode support for /v1/chat/completions - Add `send_data` helper to HTTPRequestHandler for responses with Content-Length - Refactor viz/serve.py to use send_data - Add integration tests using real OpenAI client 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * add openai to testing * toml * Remove 'openai' from dependencies Removed 'openai' from the dependencies list. * bump cache --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 14:50:36 -05:00
Christopher Milan	94d7646bdc	fix anonymous struct fields (#13610 )	2025-12-07 12:56:38 -05:00
nimlgen	ac5f1e115d	autogen: repro for the bug (#13607 ) * autogen: repro for the test * mute	2025-12-07 15:51:03 +03:00
George Hotz	c5bd28e21d	start work on schedule cache (#13529 ) * start work on schedule cache * local unique * schedule cache works * schedule cache cleanup * fix tests * preserve metadata * oops, fix cache * put that there * fix spec * always miss * why is that broken? * src[0].op * fix process replay * delete abstractions2 * reenable the actual schedule cache * metadata is best effort * fix JIT in examples/gradaccum_mnist.py * full jit * fixed and test is real	2025-12-04 17:24:49 -08:00
ayanhan	edf929ec9d	fix: add __delitem__ to Tensor with proper TypeError (#13561 )	2025-12-04 00:53:08 -08:00
Christopher Milan	0a54434b15	mitigate ctypes c_bool bitfield bug (#13558 ) * mitigate ctypes c_bool bitfield bug * don't delete old test	2025-12-03 20:46:04 -05:00
chenyu	22777a89ea	minor test_uop_symbolic updates (#13551 )	2025-12-03 13:17:44 -05:00
chenyu	a205f98ef4	tighter bound for MOD (#13550 )	2025-12-03 11:24:29 -05:00
nimlgen	549f3287a8	fix caching for fetch (#13544 )	2025-12-03 14:34:14 +03:00
George Hotz	6bd355fa26	add needs_second_gpu decorator (#13543 ) * add needs_second_gpu decorator * more skips * two more fixes	2025-12-02 19:08:23 -08:00
Roelof van Dijk	c158e3c988	add cifar gated uop_given_valid regression test (#13536 )	2025-12-02 16:02:47 -05:00
nimlgen	77a76d1b13	device: respect compiler ContextVars (#13523 ) * device: envvars for cc * fix * fix * x * um * fix * remote * em * cleanup * typing * fix * debug * lvp? * ugh * singl * rm * lol * fix * ? * this? * why? * rev * mod test * l	2025-12-02 14:42:04 +03:00
George Hotz	c38b7684dc	improve microbenchmarks (#13492 ) * improve microbenchmarks * bugfix + ubench * lil * no src in const method	2025-11-29 10:15:22 -08:00
qazal	72ef533d9c	tracing: use u32 for buffer args encoding (#13472 )	2025-11-28 00:19:51 +08:00
George Hotz	e4cd649ff0	remove kernelize to prepare for refactors (#13463 ) * remove kernelize to prepare for refactors * less kernelize * last test	2025-11-26 14:18:50 -08:00
qazal	7238df7a94	viz: cleanup sort_fn (#13454 )	2025-11-26 04:10:10 +08:00
wozeparrot	249553a119	tinyfs tweaks (#13444 )	2025-11-24 18:07:32 -08:00
chenyu	cb29265f23	add test that shows the validhack regression with bad rewrite order (#13411 )	2025-11-21 13:48:30 -05:00
chenyu	0251a8e628	parse_valid minor cleanup [pr] (#13385 ) * stricter parse_valid [pr] * not stricter * no VCONST * Revert "no VCONST" This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.	2025-11-20 13:15:06 -05:00
George Hotz	986d113024	symbolic fuzz failure (#13367 ) * symbolic fuzz failure * skip flaky test	2025-11-19 14:21:08 -08:00
George Hotz	05ccc69248	Revert "merge to fold_divmod_general [p] (#13359 )" This reverts commit `7711bbac7f`.	2025-11-19 14:18:09 -08:00
George Hotz	7711bbac7f	merge to fold_divmod_general [p] (#13359 ) * merge to fold_divmod_general [p] * merge more * merge more * merge more	2025-11-19 11:37:45 -08:00
George Hotz	957cf717e7	Python speed (#13355 ) * skip process replay by default * work on python speed * fix names of rewrite rules * fix that test	2025-11-19 09:03:00 -08:00
Christopher Milan	a438c277de	autogen tests for 3.14 (#13343 )	2025-11-18 22:16:59 -05:00
George Hotz	cabd4add48	more work parsing SQTT, separate VIZ/PROFILE (#13308 ) * more work parsing SQTT * more minimal runner * sep VIZ/PROFILE * parse print new * improve parser * more filter * that * split them * lil cleanup * skip flaky test * AQL in mmapeak	2025-11-16 10:40:39 -08:00
nimlgen	c80d459d99	autogen: fix packed args structs (#13274 ) * autogen: fix packed args structs * and test this	2025-11-14 20:24:06 +08:00

961 Commits