tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-09 15:08:02 -05:00

Author	SHA1	Message	Date
chenyu	41e45c20ff	minor stuff reading the printed code [pr] (#13177 )	2025-11-09 00:58:51 -05:00
chenyu	834067d91f	move onnx import in compile3 (#13172 ) only used in test_vs_onnx	2025-11-08 09:44:34 -08:00
Harald Schäfer	587ccc0e5c	compile3: make selftests opt-in (#12851 )	2025-10-21 11:32:27 -07:00
wozeparrot	990e8b97ee	feat: log openpilot 0.10.1 times (#12816 )	2025-10-20 18:30:34 -07:00
Harald Schäfer	addc54b96c	Simplify openpilot compile3.py (#12748 ) * Simpler compile3 * tests * remove default args * onnx file is still fp16 * self-test FP16 too * allow test disable * absurd tolerance * Just do latest * Try simplest * use later models * kernel count not relevant if speed is good * dead improts * Revert "dead improts" This reverts commit `f68c2cd15d`. * Revert "kernel count not relevant if speed is good" This reverts commit `0955ca4ee0`. * add back kernal count check on latest model	2025-10-18 10:12:22 -04:00
George Hotz	612e3d6143	replace mop arg with vectorized index (#12695 ) * replace mop arg with vectorized index * tests passing * better viz * no compile4	2025-10-15 20:50:06 +08:00
nimlgen	658c566e22	vars in gated_read_image_count (#12486 ) * vars in gated_read_image_count * nc	2025-10-09 14:54:15 +08:00
chenyu	be05028419	move ASSERT_MIN_STEP_TIME to compile3 (#12535 ) threshold is current time +20%	2025-10-08 22:16:59 -04:00
qazal	7e0b14243e	delete grouper and kernelize (#12517 ) * delete grouper and kernelize * +sys.setrecursionlimit	2025-10-08 12:27:26 +03:00
George Hotz	0f25b4b289	move frontend dir to nn [pr] (#12470 )	2025-10-07 10:42:22 +08:00
qazal	1af05dae77	fix rangeify in compile4.py (#12467 ) * fix rangeify in compile4.py * fix type_verify	2025-10-06 13:37:46 +03:00
chenyu	0e266f376c	ops_gpu -> ops_cl (#12103 )	2025-09-10 15:15:48 -04:00
George Hotz	842184a1ab	rename kernelize to schedule, try 2 (#11305 )	2025-07-21 11:18:36 -07:00
chenyu	85ddd72038	simpler grouptop in hcopt (#11219 ) * simpler grouptop in hcopt keep the only perf relevant conditions and the rest is handled by try except * update openpilot read image count	2025-07-13 16:06:09 -04:00
geohotstan	5ce278b245	OnnxRunner file as input (#10789 ) * file path as input and have parse be in OnnxRunner.__init__ * modelproto_to_onnxrunner -> modelproto_to_runner * whoops, fix import * oh flakiness again, is it because it's getting gc-ed? * small changes * CI flaky so just move compile4 fix in * copy typing of onnx_load * actually can just import onnx_load instead of onnx.load * fix external_benchmark_openpilot * fix onnx_runner test to use onnx_helper * rerun CI * try run_modelproto * spam CI a few times * revert run_modelproto since that's flaky also * no external onnx_load usage except onnx.py * cursor tab complete is evil. Snuck a darn sorted in. But does order change result? Why? * model_benchmark 193s -> 80s, add OnnxRunner.to()... * minimize diff and clean up * device can be None, weird but eh --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-07-12 14:27:46 -04:00
geohotstan	50936b4a18	ONNX real float16 (#10694 ) * squash commits * temp fix for const tensor * actually realizing float16 can only happen in raw_data * .float -> cast(float) to rerun CI --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-06-26 14:05:12 -04:00
George Hotz	b41e0563a3	move stuff to kernelize folder (#10902 ) * move stuff to kernelize folder * oops, forgot that	2025-06-20 16:10:20 -07:00
George Hotz	cba6e15937	split grouper and kernelize [pr] (#10854 )	2025-06-17 17:54:20 -07:00
chenyu	7d5c769c6b	fix compile4 (#10797 )	2025-06-12 22:28:56 -04:00
b1tg	24d328e313	onnx parser (#10435 ) * onnx parser * fix compile, lint * onnx.load -> onnx_load * compatible with ModelProto * fix test external_test_onnx_ops.py * fix tests * fix signed int * reduce to 261 lines * fix TypeProto.Optional * debug for _parse_message, add TypeProto.Sequence, cleanup * onnx_load from Tensor * remove BufferedReader * 174 lines and reduce tensor copy * cleanup * use onnx_load in external_model_benchmark.py * fix qcom test * [onnx] parser support external data --------- Co-authored-by: b1tg <b1tg@users.noreply.github.com> Co-authored-by: chenyu <chenyu@fastmail.com>	2025-06-09 12:44:28 -04:00
George Hotz	32e9949052	rename lazydata to uop (#10698 )	2025-06-08 08:42:22 -07:00
George Hotz	0d39bb5de1	rename to get_kernelize_map (#10465 )	2025-05-22 11:44:44 -07:00
George Hotz	577a0b4cfa	openpilot compile4 (wip) (#10407 ) * openpilot compile4 * add copies * remove junk	2025-05-22 10:47:34 -07:00
George Hotz	74d98eafb8	add onnx frontend stub [pr] (#9558 )	2025-03-24 12:24:34 +08:00
ZwX1616	c977781b3c	no numpy change if no NPY (#9281 ) * skip np change check if no NPY * use any	2025-02-28 09:32:35 +08:00
George Hotz	8b16c65bca	add compile3 benchmark [pr] (#8929 )	2025-02-06 22:49:31 +08:00
geohotstan	dd82b4c913	make onnx runner a class (#8647 ) * this * clean up * more clean ups and improve debug msg * more correct training toggler * remove manual training toggling * change some variable names * actually just add the training toggle for LIMIT envvar too * more refinement * __call__ and OnnxRunner * fix half pylint, other half is importing from onnx while this file is onnx.py, figure out later * ahhhh found another mistake * remove limit from __call__ --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-20 10:11:05 -08:00
Harald Schäfer	7059459648	Openpilot compile: fix for openpilot use (#8338 ) * compile3 changes * merge conflict * merge conflict * give dm npy for now * Revert "give dm npy for now" This reverts commit bfd980da7d2c2bab5b073127442c361922032ba1. * updates * Always float32 floats * Update compile3.py * Update compile3.py --------- Co-authored-by: ZwX1616 <zwx1616@gmail.com>	2024-12-19 19:43:15 -05:00
chenyu	26e049ab40	add ALLOWED_READ_IMAGE=2131 to openpilot (#8166 ) added as exact number check now as it's not clear if more/less than allowed is any better	2024-12-11 12:14:17 -08:00
George Hotz	f83d715f41	move checks into compile3, delete compile2 [pr] (#8127 ) * move checks into compile3 [pr] * test_vs_onnx * test v torch works * float16 won't compile on compile3 * actually delete compile2	2024-12-09 14:21:42 -08:00
George Hotz	00ac0db9d4	np tensors have the memory from numpy in compile3 [pr] (#8098 )	2024-12-07 14:01:51 +08:00
George Hotz	22feb3a2f1	move copy into the JIT for openpilot compile3 (#7937 ) * move copy into the JIT, test fails * ahh, prune was the issue	2024-12-07 13:26:26 +08:00
George Hotz	fbb4099b3c	add test for compile3 [pr] (#7783 ) Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2024-11-19 19:26:51 +08:00
Harald Schäfer	e7cbc29f48	openpilot benchmark: add cast from numpy to benchmark (#7593 ) * openpilot benchmark: add cast from numpy to benchmark * whitespace * comment	2024-11-08 19:31:00 +08:00
George Hotz	c8bf09b7d4	s/UOps/Ops (#7500 ) * s/UOps/Ops [pr] * fix	2024-11-03 11:26:10 +08:00
George Hotz	72a9ac27e9	support image dtype in cloud [pr] (#7482 ) * support image dtype in cloud [pr] * remove outdated osx hack * unused imports	2024-11-02 23:54:27 +08:00
George Hotz	26df50cf43	move memory_planner to memory.py [pr] (#7079 )	2024-10-16 10:04:35 +08:00
George Hotz	5c9f76e274	hotfix: openpilot compile3 compare to i==1	2024-10-12 09:44:24 +08:00
George Hotz	f45d178a55	hotfix: support JIT_BATCH_SIZE=0, make that the default	2024-09-25 10:36:04 +08:00
George Hotz	b9e6d42a1f	Revert "gated native math in OpenCL (#6683 )" (#6691 ) This reverts commit `2fe3eeed17`.	2024-09-24 08:48:10 +08:00
George Hotz	2fe3eeed17	gated native math in OpenCL (#6683 ) * gated native math * Update cstyle.py	2024-09-23 19:22:13 +08:00
chenyu	b14c1bc417	UOps.RANGE is_increasing (#6615 ) * UOps.RANGE is_increasing 283 -> 47 valids * test	2024-09-20 03:14:52 -04:00
George Hotz	d02bb270b7	add copyin copyout for image on GPU [run_process_replay] (#6580 ) * add copyin copyout for image on GPU [run_process_replay] * add timing * enqueue vs total run * it's failing but that's fine	2024-09-18 16:06:20 +08:00
George Hotz	d4b662c318	new openpilot compile (#6573 ) * new openpilot compile * note, copyout doesn't work for images	2024-09-18 14:22:50 +08:00
chenyu	798be6bb74	add gated read_image count in openpilot compile2 (#6546 ) 530 to go	2024-09-16 21:17:00 -04:00
qazal	28c75bf2a6	merge uops with ops (#6111 ) Co-authored-by: chenyu <chenyu@fastmail.com>	2024-08-16 18:17:57 -04:00
qazal	c23d44c779	AST is UOp (#6030 ) * most of the work from the uops2 branch * schedule * realize * kernel * lowerer * search * green * merge uops with ops * Revert "merge uops with ops" This reverts commit `1408a59f12`. * fix benchmark * remove extra dedup	2024-08-16 22:09:00 +03:00
George Hotz	e077bc7baf	move memory planner to realize (#5937 )	2024-08-06 10:41:29 -07:00
George Hotz	fa7e734b49	MetaOps.KERNEL (#5543 )	2024-07-17 19:41:23 -07:00
chenyu	4df63da190	clean up rest of the loadop [run_process_replay] (#5440 ) to metaop and filter_sink	2024-07-12 23:38:51 -04:00

57 Commits