tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
George Hotz	aeb7516c8a	tests passing on tinybox h3 (#13742)	2025-12-17 19:04:34 -04:00
George Hotz	b013244c38	fix local tests for AMD_LLVM (#13738) * fix local tests for AMD_LLVM * fix linters * skip that for now * fix segfault	2025-12-17 12:23:46 -04:00
George Hotz	3dbde178c1	mark slow tests as slow instead of as CI (#13736) * mark slow tests as slow instead of as CI * CI shouldn't have different behavior * more skips / CI * slow	2025-12-17 10:29:57 -04:00
George Hotz	9015a22523	make tests faster (#13734)	2025-12-17 09:39:44 -04:00
George Hotz	cf0c28d5ae	all tests pass on strix halo (#13728)	2025-12-16 19:35:50 -04:00
George Hotz	321ab943b2	qwen model is working (#13690) * qwen model is mostly working * add Q4_K quantization support to GGUF parser, add qwen3:1.7b model - Add Q4_K (type 12) dequantization in nn/state.py - Add qwen3:1.7b model using Q4_K_M quantization (smaller than Q8_0) - Make bos_token_id optional for models like Qwen3 that don't have it - Fix line length issues and add preset parameter to SimpleTokenizer 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * smaller diff * test dequant * half split * better * simple tok * mock token * polish * better * fix * replace --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-15 18:00:34 -04:00
George Hotz	a657a4e0f4	add Q4_K GGUF quantization support (#13700) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-15 10:17:56 -05:00
George Hotz	572ca80046	fast tinygrad.apps.llm (#13685) * llm: add --benchmark support * fix speed * debug logging * fix test attention	2025-12-14 21:05:21 -05:00
chenyu	ed962786d6	use assign in Tensor.backward (#13674) preserve the grad object so that jit works	2025-12-13 22:43:06 -05:00
George Hotz	55845f7de7	schedule: cache unbinds for consistent cache keys (#13664) * schedule: cache unbinds for consistent cache keys strip BIND values before computing cache key so different bound values (e.g. KV cache positions) hit the same schedule cache entry. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * spec: allow single-src BIND for schedule cache key normalization 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * docs: add lessons learned to CLAUDE.md 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * more claude.md --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 17:27:42 -05:00
George Hotz	8c87a0bf8d	Revert "schedule: cache unbinds for consistent cache keys (#13662)" This reverts commit `af86cae10c`.	2025-12-12 16:49:50 -05:00
George Hotz	af86cae10c	schedule: cache unbinds for consistent cache keys (#13662) * schedule: cache unbinds for consistent cache keys different bound variable values (e.g. kv cache positions) now produce the same schedule cache key by unbinding BIND(DEFINE_VAR, CONST) before computing the cache key and rebinding after lookup. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * schedule: cache unbinds for consistent cache keys When scheduling, BIND(DEFINE_VAR, CONST) nodes are now unbound to tagged DEFINE_VARs before computing the cache key. This ensures that the same computation with different bound values (e.g., different KV cache positions in LLM) gets the same cache key and reuses the cached schedule. The fix: - pm_pre_sched_cache: replaces BIND with tagged DEFINE_VAR - pm_post_sched_cache: restores tagged DEFINE_VAR back to original BIND - pm_remove_rangeify_tags: excludes DEFINE_VAR to preserve tags through rangeify - var_vals extracted from BINDs before cache key computation 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * schedule: fix BIND handling and add CLAUDE.md - Handle BIND to RANGE in create_schedule (not matched by CONST pattern) - Assert all BINDs on same variable have same value - Add CLAUDE.md codebase guide 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 16:40:10 -05:00
George Hotz	316da9f7ff	llm: add created/model fields, non-streaming support, and tests (#13660) * llm: add created/model fields, non-streaming support, and tests - Add `created` timestamp and `model` fields to response (required by OpenAI spec) - Add non-streaming mode support for /v1/chat/completions - Add `send_data` helper to HTTPRequestHandler for responses with Content-Length - Refactor viz/serve.py to use send_data - Add integration tests using real OpenAI client 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * add openai to testing * toml * Remove 'openai' from dependencies Removed 'openai' from the dependencies list. * bump cache --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 14:50:36 -05:00
Christopher Milan	94d7646bdc	fix anonymous struct fields (#13610)	2025-12-07 12:56:38 -05:00
nimlgen	ac5f1e115d	autogen: repro for the bug (#13607) * autogen: repro for the test * mute	2025-12-07 15:51:03 +03:00
George Hotz	c5bd28e21d	start work on schedule cache (#13529) * start work on schedule cache * local unique * schedule cache works * schedule cache cleanup * fix tests * preserve metadata * oops, fix cache * put that there * fix spec * always miss * why is that broken? * src[0].op * fix process replay * delete abstractions2 * reenable the actual schedule cache * metadata is best effort * fix JIT in examples/gradaccum_mnist.py * full jit * fixed and test is real	2025-12-04 17:24:49 -08:00
ayanhan	edf929ec9d	fix: add __delitem__ to Tensor with proper TypeError (#13561)	2025-12-04 00:53:08 -08:00
Christopher Milan	0a54434b15	mitigate ctypes c_bool bitfield bug (#13558) * mitigate ctypes c_bool bitfield bug * don't delete old test	2025-12-03 20:46:04 -05:00
chenyu	22777a89ea	minor test_uop_symbolic updates (#13551)	2025-12-03 13:17:44 -05:00
chenyu	a205f98ef4	tighter bound for MOD (#13550)	2025-12-03 11:24:29 -05:00
nimlgen	549f3287a8	fix caching for fetch (#13544)	2025-12-03 14:34:14 +03:00
George Hotz	6bd355fa26	add needs_second_gpu decorator (#13543) * add needs_second_gpu decorator * more skips * two more fixes	2025-12-02 19:08:23 -08:00
Roelof van Dijk	c158e3c988	add cifar gated uop_given_valid regression test (#13536)	2025-12-02 16:02:47 -05:00
nimlgen	77a76d1b13	device: respect compiler ContextVars (#13523) * device: envvars for cc * fix * fix * x * um * fix * remote * em * cleanup * typing * fix * debug * lvp? * ugh * singl * rm * lol * fix * ? * this? * why? * rev * mod test * l	2025-12-02 14:42:04 +03:00
George Hotz	c38b7684dc	improve microbenchmarks (#13492) * improve microbenchmarks * bugfix + ubench * lil * no src in const method	2025-11-29 10:15:22 -08:00
qazal	72ef533d9c	tracing: use u32 for buffer args encoding (#13472)	2025-11-28 00:19:51 +08:00
George Hotz	e4cd649ff0	remove kernelize to prepare for refactors (#13463) * remove kernelize to prepare for refactors * less kernelize * last test	2025-11-26 14:18:50 -08:00
qazal	7238df7a94	viz: cleanup sort_fn (#13454)	2025-11-26 04:10:10 +08:00
wozeparrot	249553a119	tinyfs tweaks (#13444)	2025-11-24 18:07:32 -08:00
chenyu	cb29265f23	add test that shows the validhack regression with bad rewrite order (#13411)	2025-11-21 13:48:30 -05:00
chenyu	0251a8e628	parse_valid minor cleanup [pr] (#13385) * stricter parse_valid [pr] * not stricter * no VCONST * Revert "no VCONST" This reverts commit 330dbdf4060562596febcbf970bda6051a35012f.	2025-11-20 13:15:06 -05:00
George Hotz	986d113024	symbolic fuzz failure (#13367) * symbolic fuzz failure * skip flaky test	2025-11-19 14:21:08 -08:00
George Hotz	05ccc69248	Revert "merge to fold_divmod_general [p] (#13359)" This reverts commit `7711bbac7f`.	2025-11-19 14:18:09 -08:00
George Hotz	7711bbac7f	merge to fold_divmod_general [p] (#13359) * merge to fold_divmod_general [p] * merge more * merge more * merge more	2025-11-19 11:37:45 -08:00
George Hotz	957cf717e7	Python speed (#13355) * skip process replay by default * work on python speed * fix names of rewrite rules * fix that test	2025-11-19 09:03:00 -08:00
Christopher Milan	a438c277de	autogen tests for 3.14 (#13343)	2025-11-18 22:16:59 -05:00
George Hotz	cabd4add48	more work parsing SQTT, separate VIZ/PROFILE (#13308) * more work parsing SQTT * more minimal runner * sep VIZ/PROFILE * parse print new * improve parser * more filter * that * split them * lil cleanup * skip flaky test * AQL in mmapeak	2025-11-16 10:40:39 -08:00
nimlgen	c80d459d99	autogen: fix packed args structs (#13274) * autogen: fix packed args structs * and test this	2025-11-14 20:24:06 +08:00
George Hotz	bcdfc109b5	hotfix: disable flaky test	2025-11-13 06:19:28 -08:00
George Hotz	ab9fa964d8	DISABLE_COMPILER_CACHE -> CCACHE (#13234) * DISABLE_COMPILER_CACHE -> CCACHE * Fix cachekey assignment in Compiler constructor	2025-11-12 15:07:09 -08:00
qazal	7a6853fa40	viz: show python callstack in the first graph (#13218)	2025-11-12 20:52:28 +08:00
Christopher Milan	41a098a82d	In-tree autogen: libc.py (#13217) * checkout changes from autogen branch * parents * pylint happy * move sys to system in helpers.py * typo * typo	2025-11-11 19:13:48 -08:00
qazal	bc55bc4849	cleanup test_viz profiler tests (#13221)	2025-11-12 03:46:48 +08:00
nimlgen	b8e48effcb	device: no compilers message with reasons (#13146) * device: no compilers message with reasons * typings * mypy	2025-11-07 23:01:45 +08:00
chenyu	bb8cf948f2	variation of (x%c)+(x//c)*c = x (#13135) when x is in the form of y//b, the idiv term might have combined	2025-11-06 18:53:28 -05:00
George Hotz	bcfe42937f	move permute/flip/shrink to mixins (#13113) * move permute to mixins * move more stuff * two more * fix local mypy * fix tests * fix shrink	2025-11-05 14:14:15 -08:00
Sieds Lykles	3dc593c536	add strip_params to pyrender (#13021) * add strip_params to pyrender * update that one too * strip_parens fix * cleaner * add test * add some more tests * cleaner strip_parens	2025-10-31 14:15:56 +01:00
Sieds Lykles	4c8362128b	New symbolic renderer + strip parens (#13017) * new uop renderer * better tester * strip parens * update tests * split method check_uop_against_string * use ctx.update instead of add_rendered method * strip parens based on precedence * update test * new symbolic renderer * add comment	2025-10-30 16:41:32 +01:00
George Hotz	2da02f1ae1	add loads at the end (#12988) * add loads at the end * simpler * late load * tests passing * fix matvec * spec test passes * fix where on load * fix abs2 * fix more tests	2025-10-30 10:42:19 +08:00
Sieds Lykles	70bce62c67	dont collapse possibly empty symbolic range (#12994) * dont collapse a symbolic range based on min/max * refactor z3 renderer * include sink explicitely instead of dtypes.void * use dtype.scalar()	2025-10-29 12:17:09 +01:00

949 Commits