tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-09 15:08:02 -05:00

Author	SHA1	Message	Date
qazal	572dfd5506	add static amd program info to viz (#13594) * llvm-readelf * amd_readelf + soft_err * cleanup * multiple metadata * max wgp size, may be less	2025-12-08 04:08:14 +08:00
qazal	73093314bd	viz: support list of sidebar info (#13612)	2025-12-08 03:09:43 +08:00
chenyu	b981b6f89e	remove old llama grad_acc (#13611) * remove old llama grad_acc * GRADIENT_ACC_STEPS=1	2025-12-07 13:03:47 -05:00
Christopher Milan	94d7646bdc	fix anonymous struct fields (#13610)	2025-12-07 12:56:38 -05:00
nimlgen	dcd50baca4	amd/nv: cleanup (#13608)	2025-12-07 17:05:26 +03:00
nimlgen	ac5f1e115d	autogen: repro for the bug (#13607) * autogen: repro for the test * mute	2025-12-07 15:51:03 +03:00
Christopher Milan	4eae4b0ce6	unify adreno autogen with mesa (#13604) * unify adreno autogen with mesa * gen pm4 * TestTiny::test_plus works * add a6xx enums * IMAGE=2 TestTiny::test_gemm works * remove adreno from CI * cleanup	2025-12-06 15:17:36 -05:00
kamilisjon	e20bc0b9b5	remove unused function parameter in beam search (#13602)	2025-12-06 11:40:47 -05:00
nimlgen	abafb96441	hcq: check all subbufs are free (#13599) * hcq: check all subbufs are free * fix * Update ops_amd.py	2025-12-06 17:43:18 +03:00
nimlgen	f2b549d921	amd: refactor scratch calc (#13595) * amd: refactor scratch calc * fix	2025-12-06 16:41:35 +03:00
chenyu	4562f217e1	more bert updates (#13597) prep split jit also lower BS to 72	2025-12-06 08:32:43 -05:00
wozeparrot	93f1baca77	feat: tk fa in tensor (#13580)	2025-12-05 14:36:29 -08:00
chenyu	cb4c6324ef	revert bert grad accumulation (#13596) prep for the new split jit style	2025-12-05 17:30:08 -05:00
qazal	f20212e1ec	refactor viz error handler (#13593)	2025-12-06 02:37:39 +08:00
Christopher Milan	dec2f50aee	reenable process replay for lvp (#13592)	2025-12-05 12:36:35 -05:00
chenyu	0977206b1c	Revert am (#13591) * Revert "hotfix: amd: tmpring (#13589)" This reverts commit `4d8b283b36`. * Revert "amd: use correct structs (#13583)" This reverts commit `d8b09eda57`.	2025-12-05 11:03:12 -05:00
chenyu	ac1227575f	IMAGE=1 driving_vision in benchmark (#13587)	2025-12-05 10:20:54 -05:00
nimlgen	4d8b283b36	hotfix: amd: tmpring (#13589) * hotfix: amd: tmpring * more	2025-12-05 18:19:05 +03:00
qazal	8c332219f9	viz: remove x86asm highlighter (#13586) * viz: remove x86asm highlighter * formatting	2025-12-05 21:05:50 +08:00
qazal	5d8726d8d2	viz: refactor to generic sidebar (#13584)	2025-12-05 20:09:41 +08:00
nimlgen	d8b09eda57	amd: use correct structs (#13583)	2025-12-05 14:46:38 +03:00
qazal	6d92e9ffbf	hotfix: skip process replay on lvp (#13585)	2025-12-05 19:25:23 +08:00
Christopher Milan	8011b953c9	mesa: remove glsl type hack (#13578) * mesa: remove glsl type hack * lazy type access * save a line * fix windows? * mypy happy	2025-12-04 21:18:56 -05:00
George Hotz	c5bd28e21d	start work on schedule cache (#13529) * start work on schedule cache * local unique * schedule cache works * schedule cache cleanup * fix tests * preserve metadata * oops, fix cache * put that there * fix spec * always miss * why is that broken? * src[0].op * fix process replay * delete abstractions2 * reenable the actual schedule cache * metadata is best effort * fix JIT in examples/gradaccum_mnist.py * full jit * fixed and test is real	2025-12-04 17:24:49 -08:00
wozeparrot	62e2fc5108	tk: global load/store rv (#13577)	2025-12-04 17:23:48 -08:00
Christopher Milan	5cfe1698e8	autogen: strip function parameter qualifiers (#13576) * autogen: strip function parameter qualifiers * regen hip * re-regen hip	2025-12-04 19:54:34 -05:00
qazal	f21c9dbf4b	enable PMC with VIZ=2 (#13575)	2025-12-05 03:09:53 +08:00
qazal	d7caae5f61	viz: tabulate pmc (#13574) * viz: tabulate pmc * linter * enable nesting * pmc comes before waves	2025-12-05 03:08:39 +08:00
chenyu	42f6cf3a90	tighter test_real_world mem and kernel count bounds (#13573) also check if actual usage is within 20% of set limit, the old limits are too big to be useful	2025-12-04 13:35:39 -05:00
chenyu	89f9e1dcd5	add SGD to beautiful_mnist (#13571)	2025-12-04 12:17:29 -05:00
qazal	512a8f3dd4	viz: start global memory PMC tests (#13569)	2025-12-05 00:40:27 +08:00
chenyu	7df56d3b99	Optimizer.device is a property (#13568)	2025-12-04 09:25:15 -05:00
nimlgen	db99a61fad	qcom: support cpu mappings (#13565) * test * qcom: support cpu mappings * clean * msg	2025-12-04 14:50:46 +03:00
George Hotz	bd6a068ef7	move track_rewrites to outer schedule cache (#13556) Co-authored-by: qazal <qazal.software@gmail.com>	2025-12-04 19:13:45 +08:00
qazal	3eae146139	faster process replay [pr] (#13564)	2025-12-04 18:52:07 +08:00
Rory Clear	6eab756578	fix and test loading num_batches_tracked (#13538) * fix and test loading num_batches_tracked * add failing reverse case * try reshape state dict if mismatch * reshape for () and (1,) --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-12-04 01:22:49 -08:00
nimlgen	877a7fdd61	jit: support encdec (#13563) * jit: support encdec * fix	2025-12-04 11:58:34 +03:00
Douglas Nyberg	a8a62bc08e	add max/min reduction support to ScatterND (#13562)	2025-12-04 00:53:47 -08:00
ayanhan	edf929ec9d	fix: add __delitem__ to Tensor with proper TypeError (#13561)	2025-12-04 00:53:08 -08:00
Douglas Nyberg	9411ecedc4	fix CUDA half-precision trunc() type mismatch (#13559)	2025-12-03 21:53:16 -05:00
ayanhan	92b40290c7	fix: add test_sum_int and remove outdated TODO in test_custom_kernel (#13560)	2025-12-03 21:51:58 -05:00
Christopher Milan	0a54434b15	mitigate ctypes c_bool bitfield bug (#13558) * mitigate ctypes c_bool bitfield bug * don't delete old test	2025-12-03 20:46:04 -05:00
George Hotz	96d16675fe	update examples/gradaccum_mnist.py to use the JIT	2025-12-03 16:11:42 -08:00
George Hotz	24ca8eeaa7	small fixups from schedule_cache (#13557)	2025-12-03 15:41:16 -08:00
Douglas Nyberg	f5abd38132	remove tfa dependency: use keras.optimizers.Lamb and tf.raw_ops for LARS (#13555)	2025-12-03 17:48:27 -05:00
George Hotz	a4c4e48385	add LUNIQUE op (#13554)	2025-12-03 14:34:34 -08:00
George Hotz	a909cd4581	faster HEVC decode (#13552) * faster HEVC decode * bind to variables * cleanups * more cleanups	2025-12-03 11:33:05 -08:00
chenyu	22777a89ea	minor test_uop_symbolic updates (#13551)	2025-12-03 13:17:44 -05:00
chenyu	a205f98ef4	tighter bound for MOD (#13550)	2025-12-03 11:24:29 -05:00
nimlgen	fcdb01abe7	hip: fix ioctl (#13548)	2025-12-03 16:40:43 +03:00

11299 Commits