Commit Graph

11329 Commits

Author SHA1 Message Date
Christopher Milan
950d8de00e automatically inline anonymous (#13652) 2025-12-12 00:02:44 -05:00
chenyu
01e9ad0d52 clean up bert next_data (#13650)
train iter was designed to never stop for both real and fake data
2025-12-11 22:56:28 -05:00
Jakob Sachs
ab2220b834 Handle missing bfloat16 natives on CPU architectures (#13553)
* CPU: fix compiler-rt libcall by adding intermediate casts for bfloat16

* fix lint

* remove old manual bypass of bf16 for CPU tests, and add diversion conversion from bf16 to/from fp16

---------

Co-authored-by: Jakob Sachs <jakobs99@purelymail.com>
2025-12-11 15:38:43 -05:00
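
The entry above handles CPUs without native bfloat16 support by adding intermediate casts so the compiler-rt libcalls are avoided. A hedged user-level sketch of the same idea in tinygrad (assuming a CPU backend where bf16 arithmetic is unsupported): do the math in float32 and keep bfloat16 only as a storage type.

```python
from tinygrad import Tensor, dtypes

# Sketch only: route bf16 math through float32, mirroring the intermediate-cast fix above.
a = Tensor([1.0, 2.0, 3.0], dtype=dtypes.bfloat16)
b = Tensor([0.5, 0.25, 0.125], dtype=dtypes.bfloat16)
c = (a.cast(dtypes.float32) + b.cast(dtypes.float32)).cast(dtypes.bfloat16)
print(c.cast(dtypes.float32).numpy())  # cast back up to print; numpy has no bf16
```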
nimlgen
cbae33003d ci: add usb4 (#13643)
* ci: add usb4

* debug=3

* undef

* revert
2025-12-11 19:41:41 +03:00
chenyu
03600aef1e failed test case when init jit with empty inputs (#13641)
not related to bert grad acc, but still seems to be a bug
2025-12-10 22:03:06 -05:00
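
To make the failing scenario concrete, a minimal sketch of what "init jit with empty inputs" can look like (illustrative only; the actual test in the PR may differ): a TinyJit-wrapped function with no Tensor arguments, so the JIT is captured with an empty input list.

```python
from tinygrad import Tensor, TinyJit

@TinyJit
def step() -> Tensor:
  # no Tensor arguments, so the jit is called with empty inputs
  return (Tensor.ones(16) * 2).contiguous().realize()

for _ in range(3):  # TinyJit captures on the first calls and replays afterwards
  out = step()
```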
nimlgen
51f3c9f615 am: use va_base as base (#13640) 2025-12-10 21:09:35 +03:00
chenyu
5034c6fb37 reenable FREE_INTERMEDIATE for bert (#13639)
* reenable FREE_INTERMEDIATE for bert

* comment
2025-12-10 12:08:09 -05:00
qazal
be6d538351 viz: add kernel walltime to pmc scoreboard (#13638)
* viz: add kernel walltime to pmc scoreboard

* fix typing

* tiny TracingKey refactor

* key on kernel name
2025-12-10 20:16:42 +08:00
qazal
1666c4aaab viz: fix counter names ordering (#13637) 2025-12-10 17:05:27 +08:00
qazal
c801bb7054 viz: show all kernel pmcs (#13635) 2025-12-10 07:16:02 +08:00
wozeparrot
4854a0c02c fix: getattr returns AttributeError not ImportError when missing (#13633) 2025-12-09 14:26:54 -08:00
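
The fix above follows the standard Python contract (PEP 562): a module-level __getattr__ should raise AttributeError for unknown names so that hasattr() and getattr() with a default keep working; raising ImportError breaks both. An illustrative sketch, with a made-up attribute name:

```python
# illustrative module-level __getattr__ (PEP 562); "heavy_thing" is not a real name in the repo
def __getattr__(name: str):
  if name == "heavy_thing":
    import json  # stand-in for a lazily imported dependency
    return json
  raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```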
chenyu
016a59cafa remove contiguous and use where in EmbeddingBert (#13632) 2025-12-09 15:49:21 -05:00
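
As an illustration of the general where-based technique referenced above (not the repo's exact EmbeddingBert code), an embedding lookup can be written as a comparison plus where followed by a matmul, avoiding an explicit gather:

```python
from tinygrad import Tensor

# illustrative where-based embedding; the actual EmbeddingBert implementation may differ
def embedding(idx: Tensor, weight: Tensor) -> Tensor:  # idx: (B, T) ints, weight: (V, E)
  vocab = weight.shape[0]
  onehot = (idx.unsqueeze(-1) == Tensor.arange(vocab)).where(1.0, 0.0)  # (B, T, V)
  return onehot @ weight  # (B, T, E)
```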
nimlgen
ddecba300f amd: use getattr for autogen (#13630)
* amd: use getattr for autogen

* fix
2025-12-09 20:36:26 +03:00
Nino Risteski
76d465dbc3 optim empty shard #13513 (#13598)
* optim empty shard

* remove tuple

* simplify

* lint

* lint2

* test

* remove original buffer unique id

* new rule

* reset shard

* update

* reset shard
2025-12-09 12:28:36 -05:00
ayanhan
47a170be2e test: enable cummax scalar IndexError test (#13625) 2025-12-09 12:25:56 -05:00
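
A minimal sketch of what the enabled test presumably exercises: a 0-d (scalar) Tensor has no axis to scan, so cummax along axis 0 raises IndexError.

```python
from tinygrad import Tensor

# presumably what the enabled test checks: a scalar has no axis 0 to cummax over
try:
  Tensor(3.0).cummax(0)
except IndexError as e:
  print("IndexError as expected:", e)
```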
Christopher Milan
9eae9dc3be regen smu_v13 with stdint (#13631)
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
2025-12-09 12:20:01 -05:00
nimlgen
7cd8852f60 autogen: do not return tuples (#13629) 2025-12-09 20:08:13 +03:00
nimlgen
9e484b5b1c hcq: check size is None, do not read the whole size for 0s (#13628) 2025-12-09 19:37:44 +03:00
nimlgen
1329033b8c am: fix hot-queue restarts, only dequeue (#13627) 2025-12-09 19:37:21 +03:00
nimlgen
b07839493d proclogs with xccs (#13626) 2025-12-09 16:46:08 +03:00
qazal
2c333818f4 simplify UOp stringifier [pr] (#13618)
* simplify UOp stringifier [pr]

* fix tuple
2025-12-09 05:06:16 +08:00
chenyu
2471b49e45 minor bert / llama change from grad acc branch (#13622)
* minor bert / llama change from grad acc branch

* revert those
2025-12-08 16:04:14 -05:00
Christopher Milan
cb3d756547 NAK compile-only test (#13621) 2025-12-08 15:53:46 -05:00
Christopher Milan
a4c3d48aa9 compile-only test for IR3 actually works (#13619) 2025-12-08 15:07:49 -05:00
Christopher Milan
a17077d1d9 skip test_double_assign in CI LVP (#13620) 2025-12-08 14:54:02 -05:00
Christopher Milan
1c16b6e082 Mesa: freedreno (#12746)
* ir3 init

* got a program

* 1 + 1 works

* use isa_disasm instead of shader_disasm

* wip

* matmul works

* works on py3.14

* fix const loading

* skip QCOM failing tests

* cleanup

* args actually work

* add compile-only tests

* fix typo and install tinymesa

* IR3 NULL backend

* (float32) images work

* autogen fix

* fix compile only test

* typo

* mypy happy

* compile-only uses py3.14

* bump mesa

* unify qcom disassembler

* float16 works

* disasm shows in viz

* save a line

* add real del

* variable workgroup sizes

* simplify diff

* bump line count

* properly set wgsz

* regen mesa

* no preamble

* bump lines
2025-12-08 14:02:08 -05:00
Douglas Nyberg
947c6eefc3 add Swish op (#13541)
* add Swish ONNX operator

* add Swish regression test

* remove trailing whitespace

* upgrade ONNX to 1.20, add excludes for unimplemented ops

* upgrade ONNX to 1.19, add Swish op

* upgrade ONNX to 1.19, TensorFlow to 2.18, add Swish op

* exclude attention_3d and attention_4d_gqa tests

* exclude attention fp16 tests

* exclude all attention tests

* retrigger CI

* retrigger CI - worker crash
2025-12-08 12:41:18 -05:00
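
For reference, the Swish activation added above is swish(x) = x * sigmoid(alpha * x), with alpha defaulting to 1.0 (where it coincides with SiLU). A minimal tinygrad sketch; the repo's ONNX handler may differ in its details:

```python
from tinygrad import Tensor

# swish(x) = x * sigmoid(alpha * x); with alpha = 1.0 this is SiLU
def swish(x: Tensor, alpha: float = 1.0) -> Tensor:
  return x * (alpha * x).sigmoid()

print(swish(Tensor([-2.0, 0.0, 2.0])).numpy())
```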
nimlgen
dd8a1a10d4 amd: tiny cleanups (#13616) 2025-12-08 13:15:56 +03:00
qazal
2b07336c82 viz server cleanups (#13615)
* depths start at 0

* rename the api path
2025-12-08 17:44:43 +08:00
wozeparrot
89c4206e22 fix: typing (#13614) 2025-12-07 20:10:30 -08:00
qazal
572dfd5506 add static amd program info to viz (#13594)
* llvm-readelf

* amd_readelf + soft_err

* cleanup

* multiple metadata

* max wgp size, may be less
2025-12-08 04:08:14 +08:00
qazal
73093314bd viz: support list of sidebar info (#13612) 2025-12-08 03:09:43 +08:00
chenyu
b981b6f89e remove old llama grad_acc (#13611)
* remove old llama grad_acc

* GRADIENT_ACC_STEPS=1
2025-12-07 13:03:47 -05:00
Christopher Milan
94d7646bdc fix anonymous struct fields (#13610) 2025-12-07 12:56:38 -05:00
nimlgen
dcd50baca4 amd/nv: cleanup (#13608) 2025-12-07 17:05:26 +03:00
nimlgen
ac5f1e115d autogen: repro for the bug (#13607)
* autogen: repro for the test

* mute
2025-12-07 15:51:03 +03:00
Christopher Milan
4eae4b0ce6 unify adreno autogen with mesa (#13604)
* unify adreno autogen with mesa

* gen pm4

* TestTiny::test_plus works

* add a6xx enums

* IMAGE=2 TestTiny::test_gemm works

* remove adreno from CI

* cleanup
2025-12-06 15:17:36 -05:00
kamilisjon
e20bc0b9b5 remove unused function parameter in beam search (#13602) 2025-12-06 11:40:47 -05:00
nimlgen
abafb96441 hcq: check all subbufs are free (#13599)
* hcq: check all subbufs are free

* fix

* Update ops_amd.py
2025-12-06 17:43:18 +03:00
nimlgen
f2b549d921 amd: refactor scratch calc (#13595)
* amd: refactor scratch calc

* fix
2025-12-06 16:41:35 +03:00
chenyu
4562f217e1 more bert updates (#13597)
prep split jit
also lower BS to 72
2025-12-06 08:32:43 -05:00
wozeparrot
93f1baca77 feat: tk fa in tensor (#13580) 2025-12-05 14:36:29 -08:00
chenyu
cb4c6324ef revert bert grad accumulation (#13596)
prep for the new split jit style
2025-12-05 17:30:08 -05:00
qazal
f20212e1ec refactor viz error handler (#13593) 2025-12-06 02:37:39 +08:00
Christopher Milan
dec2f50aee reenable process replay for lvp (#13592) 2025-12-05 12:36:35 -05:00
chenyu
0977206b1c Revert am (#13591)
* Revert "hotfix: amd: tmpring (#13589)"

This reverts commit 4d8b283b36.

* Revert "amd: use correct structs (#13583)"

This reverts commit d8b09eda57.
2025-12-05 11:03:12 -05:00
chenyu
ac1227575f IMAGE=1 driving_vision in benchmark (#13587) 2025-12-05 10:20:54 -05:00
nimlgen
4d8b283b36 hotfix: amd: tmpring (#13589)
* hotfix: amd: tmpring

* more
2025-12-05 18:19:05 +03:00
qazal
8c332219f9 viz: remove x86asm highlighter (#13586)
* viz: remove x86asm highlighter

* formatting
2025-12-05 21:05:50 +08:00
qazal
5d8726d8d2 viz: refactor to generic sidebar (#13584) 2025-12-05 20:09:41 +08:00