tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-09 15:08:02 -05:00

Author	SHA1	Message	Date
George Hotz	f1471a3b99	speed up rdna3 unit tests + add to CI (#13871 ) * speed up rdna3 unit tests * add test to CI * faster and simpler * speedups * bugfixes * use helper * fix CI maybe * test fixes * llvm-21 on 24.04 * upd * llvm-21 * fix test * bring that back * merge gen into lib * test generators	2025-12-29 10:26:48 -05:00
chenyu	f5090192c8	reorder AMD tensor core benchmark test (#13860 ) * reorder AMD tensor core benchmark test * disable that	2025-12-28 12:29:51 -05:00
chenyu	cba05acadf	re-enable TYPED=1 import test (#13858 )	2025-12-28 11:49:06 -05:00
qazal	a1c1684b91	set .amdhsa_kernarg_size in asm test (#13826 )	2025-12-25 13:08:14 +09:00
George Hotz	4702da41d5	hotfix: mkdir for extra/disassemblers	2025-12-19 17:18:37 -04:00
chenyu	80b84f5267	ruff lint tinykitten (#13762 ) deleted used import and double spaces. a few ignore to not change the real code	2025-12-19 14:31:00 -05:00
Christopher Milan	97103831c5	Revert "remove image from BufferSpec (#13636 )" (#13761 ) This reverts commit `2571a1eb47`.	2025-12-19 13:54:36 -05:00
Christopher Milan	2571a1eb47	remove image from BufferSpec (#13636 ) * remove image from BufferSpec * cl tiny_gemm (64) works * mypy * padding * openpilot CL * reshape properly * remove extra qcom checks * pad output * mypy * update compile test * move undo * TestImageCopy valid images * TestImageRealization valid images * TestImageDType valid images * cleanups * test_renderer_failures * ruff * mypy * simplify ops_qcom * bump step time	2025-12-19 13:41:20 -05:00
George Hotz	4b741e893f	remove REMOTE=1 (#13722 ) * remove REMOTE=1 * leave ibverbs	2025-12-16 15:58:10 -04:00
George Hotz	e5a66ace80	multi custom kernel support (#13716 ) * multi custom kernel support * custom kernel xfrom * works * no SPEC=2 on ck * panic * touchups	2025-12-16 11:36:30 -04:00
George Hotz	7589c897b2	split usbgpu tests into their own benchmark [pr] (#13711 )	2025-12-15 21:42:40 -04:00
qazal	6bafd90248	remove unused process replay input [pr] (#13712 )	2025-12-16 09:29:35 +08:00
George Hotz	fd49bb512d	download cache by job (#13703 )	2025-12-15 10:47:17 -05:00
George Hotz	316da9f7ff	llm: add created/model fields, non-streaming support, and tests (#13660 ) * llm: add created/model fields, non-streaming support, and tests - Add `created` timestamp and `model` fields to response (required by OpenAI spec) - Add non-streaming mode support for /v1/chat/completions - Add `send_data` helper to HTTPRequestHandler for responses with Content-Length - Refactor viz/serve.py to use send_data - Add integration tests using real OpenAI client 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com> * add openai to testing * toml * Remove 'openai' from dependencies Removed 'openai' from the dependencies list. * bump cache --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2025-12-12 14:50:36 -05:00
George Hotz	f0fa9bcd98	openai api for llm (#13648 ) * openai api for llm * responds to simple request * schedule cache needs to unbind * stream works * share stream code * 20k * one print * cid	2025-12-12 08:25:33 -05:00
nimlgen	cbae33003d	ci: add usb4 (#13643 ) * ci: add usb4 * debug=3 * undef * revert	2025-12-11 19:41:41 +03:00
chenyu	2471b49e45	minor bert / llama change from grad acc branch (#13622 ) * minor bert / llama change from grad acc branch * revert those	2025-12-08 16:04:14 -05:00
Christopher Milan	cb3d756547	NAK compile-only test (#13621 )	2025-12-08 15:53:46 -05:00
Christopher Milan	a4c3d48aa9	compile-only test for IR3 actually works (#13619 )	2025-12-08 15:07:49 -05:00
Christopher Milan	1c16b6e082	Mesa: freedreno (#12746 ) * ir3 init * got a program * 1 + 1 works * use isa_disasm instead of shader_disasm * wip * matmul works * works on py3.14 * fix const loading * skip QCOM failing tests * cleanup * args actually work * add compile-only tests * fix typo and install tinymesa * IR3 NULL backend * (float32) images work * autogen fix * fix compile only test * typo * mypy happy * compile-only uses py3.14 * bump mesa * unify qcom disassembler * float16 works * disasm shows in viz * save a line * add real del * variable workgroup sizes * simplify diff * bump line count * properly set wgsz * regen mesa * no preamble * bump lines	2025-12-08 14:02:08 -05:00
chenyu	b981b6f89e	remove old llama grad_acc (#13611 ) * remove old llama grad_acc * GRADIENT_ACC_STEPS=1	2025-12-07 13:03:47 -05:00
Christopher Milan	4eae4b0ce6	unify adreno autogen with mesa (#13604 ) * unify adreno autogen with mesa * gen pm4 * TestTiny::test_plus works * add a6xx enums * IMAGE=2 TestTiny::test_gemm works * remove adreno from CI * cleanup	2025-12-06 15:17:36 -05:00
Christopher Milan	dec2f50aee	reenable process replay for lvp (#13592 )	2025-12-05 12:36:35 -05:00
chenyu	ac1227575f	IMAGE=1 driving_vision in benchmark (#13587 )	2025-12-05 10:20:54 -05:00
qazal	6d92e9ffbf	hotfix: skip process replay on lvp (#13585 )	2025-12-05 19:25:23 +08:00
George Hotz	24ca8eeaa7	small fixups from schedule_cache (#13557 )	2025-12-03 15:41:16 -08:00
Douglas Nyberg	f5abd38132	remove tfa dependency: use keras.optimizers.Lamb and tf.raw_ops for LARS (#13555 )	2025-12-03 17:48:27 -05:00
chenyu	8902781dc1	enable more benchmarks (#13540 ) * enable more benchmarks * disable some * adjust ASSERT_MIN_STEP_TIME * mac NOCLANG=1	2025-12-02 20:31:14 -05:00
George Hotz	21184ae6b1	bump cache to 14 (#13530 )	2025-12-02 08:02:19 -08:00
nimlgen	77a76d1b13	device: respect compiler ContextVars (#13523 ) * device: envvars for cc * fix * fix * x * um * fix * remote * em * cleanup * typing * fix * debug * lvp? * ugh * singl * rm * lol * fix * ? * this? * why? * rev * mod test * l	2025-12-02 14:42:04 +03:00
nimlgen	455dd88236	nv: minimal hevc (#13502 ) * nv: minimal hevc * validate * not needed * tralin * var * cpu * fxi * desc * move * cleanup	2025-11-30 16:46:55 +03:00
Sieds Lykles	63a931ff76	Symbolic divisor fuzzer (#13433 ) * render z3 range better * working version * rename * add to workflow * factor out variable_names * smaller expressions * smaller * + back	2025-11-23 20:29:32 +01:00
Christopher Milan	310da2a201	remove hashFiles in setup-tinygrad (#13423 ) * fix hashFiles in setup-tinygrad on macos * remove hashFiles altogether	2025-11-22 17:47:10 -05:00
qazal	903eec3754	fix sz.py tinygrad import in ci (#13418 )	2025-11-22 19:20:26 +08:00
wozeparrot	1f648bb1ba	feat: reenable mobilenetv2 dsp (#13320 )	2025-11-21 15:21:49 -08:00
Christopher Milan	de3593957f	Revert "Revert "autogen: fix formatting on zero-argument function-like macros…" (#13388 ) This reverts commit `0901a40685`.	2025-11-20 15:36:13 -05:00
Christopher Milan	4043489803	set curl -f in setup-tinygrad (#13389 ) * set curl -f in setup-tinygrad * test bad redirect * Revert "test bad redirect" This reverts commit `ad945e7ffc`.	2025-11-20 13:45:47 -05:00
Christopher Milan	0901a40685	Revert "autogen: fix formatting on zero-argument function-like macros (#13386 )" (#13387 ) This reverts commit `58d85d4bab`.	2025-11-20 12:45:35 -05:00
Christopher Milan	58d85d4bab	autogen: fix formatting on zero-argument function-like macros (#13386 ) * fix formatting on zero-argument function-like macros * autogen tests should run * ugh	2025-11-20 12:11:04 -05:00
Roelof van Dijk	0dc2ff431d	fix: revive torch backend (#13280 ) * fix: revive torch backend * as_strided view vs copy * Revert "as_strided view vs copy" This reverts commit `82a61223f2`. * add extra tests (move inplace, add fusion tests) * better fusion with inplace_op * no optimizer hooks (break mnist training fusion) * split off fusion tests in separate file, assert on resnet fusion fix: remove comments * cleanup, reduce diff * reduce diff * better fusion and identity checks --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-19 15:26:50 -08:00
George Hotz	1a332afa76	spec test on 3.14 (#12957 )	2025-11-19 00:43:04 -08:00
chenyu	6372c95094	disable benchmark MobileNetV2 on DSP (#13305 ) failed on tinyc2	2025-11-16 09:42:52 -05:00
Christopher Milan	5b823af696	Remove (pypi) clang dep for autogen (#13284 ) * no more clang * regen comgr_3 * ci doesn't need pypi clang * fix objc * REGEN for libclang --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-15 09:05:11 -08:00
George Hotz	df53c62a9f	bump line count	2025-11-15 08:16:20 -08:00
Christopher Milan	d1bb08c5a1	In-tree autogen: objective c (#13223 ) * checkout changes from autogen branch * move assert --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-14 14:08:42 -08:00
nimlgen	14eb48b13a	autogen: rename nv_gpu to nv_570 (#13273 ) * autogen: rename nv_gpu to nv_570 * rename	2025-11-14 20:07:19 +08:00
George Hotz	44d84228ff	move comgr_3 logic back to the old place (#13266 ) * move comgr_3 logic back to the old place * explicit	2025-11-13 20:05:54 -08:00
Christopher Milan	09f3aae169	In-tree autogen: all C libraries (#13220 ) * checkout files from autogen branch * ioctl with payload * fix am generations * properly fix generations This reverts commit `b2a54f4f41`. * revert discovery.h * support pragma pack(1) * typo * better getter * typo * NVCEC0_QMDV05_00_RELEASE[01]_ENABLE * align support * anon handling fix --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-13 18:57:44 -08:00
Harald Schäfer	3af231904e	openpilot compile tests: assert pre-rangify speeds (#12775 ) * assert pre-rangify speeds * typo --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-13 09:39:06 -08:00
George Hotz	263b724143	one cache and bump it (#13258 )	2025-11-13 07:33:31 -08:00

1147 Commits