tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
Christopher Milan	4043489803	set curl -f in setup-tinygrad (#13389 ) * set curl -f in setup-tinygrad * test bad redirect * Revert "test bad redirect" This reverts commit `ad945e7ffc`.	2025-11-20 13:45:47 -05:00
Christopher Milan	0901a40685	Revert "autogen: fix formatting on zero-argument function-like macros (#13386 )" (#13387 ) This reverts commit `58d85d4bab`.	2025-11-20 12:45:35 -05:00
Christopher Milan	58d85d4bab	autogen: fix formatting on zero-argument function-like macros (#13386 ) * fix formatting on zero-argument function-like macros * autogen tests should run * ugh	2025-11-20 12:11:04 -05:00
Roelof van Dijk	0dc2ff431d	fix: revive torch backend (#13280 ) * fix: revive torch backend * as_strided view vs copy * Revert "as_strided view vs copy" This reverts commit `82a61223f2`. * add extra tests (move inplace, add fusion tests) * better fusion with inplace_op * no optimizer hooks (break mnist training fusion) * split off fusion tests in separate file, assert on resnet fusion fix: remove comments * cleanup, reduce diff * reduce diff * better fusion and identity checks --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-19 15:26:50 -08:00
George Hotz	1a332afa76	spec test on 3.14 (#12957 )	2025-11-19 00:43:04 -08:00
chenyu	6372c95094	disable benchmark MobileNetV2 on DSP (#13305 ) failed on tinyc2	2025-11-16 09:42:52 -05:00
Christopher Milan	5b823af696	Remove (pypi) clang dep for autogen (#13284 ) * no more clang * regen comgr_3 * ci doesn't need pypi clang * fix objc * REGEN for libclang --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-15 09:05:11 -08:00
George Hotz	df53c62a9f	bump line count	2025-11-15 08:16:20 -08:00
Christopher Milan	d1bb08c5a1	In-tree autogen: objective c (#13223 ) * checkout changes from autogen branch * move assert --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-14 14:08:42 -08:00
nimlgen	14eb48b13a	autogen: rename nv_gpu to nv_570 (#13273 ) * autogen: rename nv_gpu to nv_570 * rename	2025-11-14 20:07:19 +08:00
George Hotz	44d84228ff	move comgr_3 logic back to the old place (#13266 ) * move comgr_3 logic back to the old place * explicit	2025-11-13 20:05:54 -08:00
Christopher Milan	09f3aae169	In-tree autogen: all C libraries (#13220 ) * checkout files from autogen branch * ioctl with payload * fix am generations * properly fix generations This reverts commit `b2a54f4f41`. * revert discovery.h * support pragma pack(1) * typo * better getter * typo * NVCEC0_QMDV05_00_RELEASE[01]_ENABLE * align support * anon handling fix --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-13 18:57:44 -08:00
Harald Schäfer	3af231904e	openpilot compile tests: assert pre-rangify speeds (#12775 ) * assert pre-rangify speeds * typo --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-13 09:39:06 -08:00
George Hotz	263b724143	one cache and bump it (#13258 )	2025-11-13 07:33:31 -08:00
chenyu	3f939f3d3c	update pm_simplify_valid (#13241 ) * update pm_simplify_valid fixed openpilot conv regression * IMAGE training is broken	2025-11-12 19:40:02 -05:00
George Hotz	ab9fa964d8	DISABLE_COMPILER_CACHE -> CCACHE (#13234 ) * DISABLE_COMPILER_CACHE -> CCACHE * Fix cachekey assignment in Compiler constructor	2025-11-12 15:07:09 -08:00
Christopher Milan	41a098a82d	In-tree autogen: libc.py (#13217 ) * checkout changes from autogen branch * parents * pylint happy * move sys to system in helpers.py * typo * typo	2025-11-11 19:13:48 -08:00
chenyu	23b90945c3	add a benchmark for openpilot vision with DEBUG=2 (#13219 ) see per kernel speed, also disable the jobs for 0.9.9	2025-11-11 14:41:52 -05:00
Gaétan Lepage	6fd7ce3832	migrate to pyproject.toml (#13189 ) * migrate to pyproject.toml * move mypy config to pyproject.toml	2025-11-11 09:09:27 -08:00
chenyu	60e55d9a2d	line count 18500 (#13191 )	2025-11-10 13:52:13 -05:00
chenyu	6c48c87e51	improved ASSERT_MIN_STEP_TIME (#13182 ) * improved ASSERT_MIN_STEP_TIME getting close, current time +1ms then round up * relax	2025-11-09 16:41:12 -05:00
chenyu	e1d46de8f8	update GROUPTOP heuristic more (#13178 ) reverts #13176	2025-11-09 02:31:12 -05:00
chenyu	8e868dced8	only GROUPTOP one reduce kernel (#13176 ) * only GROUPTOP one reduce kernel * ALLOWED_GATED_READ_IMAGE=148	2025-11-08 22:38:44 -05:00
George Hotz	42b34cf83d	bottom up linearizer (#13133 ) * bottom up linearizer * late stores * more complete * remove broken heuristic * upcast size * opt * more conservative * it needs that * disable opencl half on QCOM * fix * make that a real test * cpu test okay * ptx skip * end is after the range	2025-11-06 15:30:32 -08:00
chenyu	54141e9cb9	DISABLE_COMPILER_CACHE=1 in speed_v_theoretical (#13096 )	2025-11-04 11:28:18 -05:00
chenyu	ddf01fdb15	revert mlperf.yml setting (#13080 )	2025-11-03 15:24:13 -05:00
chenyu	a317d6e625	extra/amdpci/setup_python_cap.sh (#13070 )	2025-11-02 19:19:36 -05:00
chenyu	ad501ce50a	mlperf cron install tqdm (#13069 ) one more...	2025-11-02 18:09:27 -05:00
chenyu	2c8d619147	mlperf cron install influxdb3-python (#13068 )	2025-11-02 17:55:40 -05:00
chenyu	4c22f089fc	mlperf cron install tensorflow try 2 (#13067 )	2025-11-02 17:11:01 -05:00
chenyu	c58cf91850	mlperf cron install tensorflow (#13066 )	2025-11-02 16:48:05 -05:00
chenyu	74db65cf72	update mlperf bert LOGMLPERF (#13065 )	2025-11-02 15:26:37 -05:00
chenyu	b18293de96	train bert in mlperf cron (#13064 ) more relevant now	2025-11-02 15:04:02 -05:00
George Hotz	036ee9f84c	Self type + mixins (#13056 ) * use Self type * mixin * fix later	2025-11-02 13:30:01 +08:00
George Hotz	65a0a31475	AMD mi350x matmul from stream (#13040 ) * works * working mfma * 120 TFLOPS * regs * 192 TFLOPS * try pipelining * something * notes * contract * linter to 3.11 * that was a bug	2025-11-01 17:55:19 +08:00
nimlgen	f6786c1bfd	autogen: py314 (#13038 ) * autogen: py314 * bump py?	2025-11-01 04:02:19 +08:00
George Hotz	5eb87ab131	hotfix: bump cifar time to 350	2025-10-30 17:29:20 +08:00
nimlgen	4b001ec723	amd: pmc in mockgpu (#13000 ) * amd: pmc in mockgpu * fix * do not open in ci	2025-10-30 01:52:02 +08:00
b1tg	bb307b9e81	fix fp8 vectorization (#12977 ) * fix fp8 vectorization * add fp8 tc to benchmark	2025-10-28 13:55:30 -04:00
George Hotz	5e01cc299b	zero len ranges fail (#12974 ) * zero len ranges fail * fix Python backend * fix llvm * fix ptx * yolo fix nir * this works... * always store... * always store... * Revert "always store..." This reverts commit `0816cf344d`.	2025-10-28 22:49:55 +08:00
George Hotz	e936aa7974	cleanups from if range branch (#12973 )	2025-10-28 20:58:47 +08:00
George Hotz	2832954bcb	test with IGNORE_OOB=0 (#12960 )	2025-10-28 10:32:19 +08:00
George Hotz	7784cec48e	pytest-split on spec (#12959 )	2025-10-28 10:09:01 +08:00
b1tg	45e2f916a3	add quantize fp8 in llama3 (#12893 ) * add quantize fp8 in llama3 * don't truncate fp8 alu result * cast to float32 before matmul * --model weights/LLaMA-3/8B-SF-DPO/ --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-10-27 10:22:57 -04:00
George Hotz	25c2da1579	check SPEC=2 in CI (#12945 ) * check SPEC=2 in CI * split SPEC=2 * fast enough	2025-10-27 21:53:57 +08:00
George Hotz	8a941d95a4	SPEC=2 is full spec, SPEC=1 is default (#12910 ) * SPEC=1 passes all tests * just use SPEC, not __debug__	2025-10-25 11:10:43 +08:00
chenyu	4b7329001d	clean up test_avg_pool3d (#12905 )	2025-10-24 14:31:36 -04:00
chenyu	154b4f9f40	test FUSE_OPTIM=1 test/test_optim.py (#12895 )	2025-10-23 15:54:27 -04:00
wozeparrot	6e00dec95d	feat: pin openpilot 0.10.1 models (#12878 )	2025-10-22 14:57:54 -07:00
chenyu	f0831c8c30	add 0.10.0 to comma benchmark (#12875 ) * add 0.10.0 to comma benchmark disabled the 0.10.1 ones which are pinned to master. it does not work because benchmark uses the cached old version * that's pinned	2025-10-22 15:18:21 -04:00

1161 Commits