tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-25 23:08:06 -05:00

Author	SHA1	Message	Date
chenyu	b5d700adae	update openpilot supercombo.onnx to 0.9.4 (#1681) * update openpilot supercombo.onnx to 0.9.4 * update tests for the new model * comment out comma models from external_model_benchmark	2023-08-26 19:16:08 -04:00
chenyu	ae39cf84ab	Symbolic Shape JIT main PR (#1353) * Symbolic Shape JIT update tests 2 variables symbolic ops, adding more tests test passing cleanup * more test cases * single flag * review update * jit attention one piece * realize * symbolic_jit test for cuda * old artifact * works with cuda gpu but failed ci * CUDACPU	2023-08-18 14:39:55 -07:00
George Hotz	18892242b0	global -> group (#1007) * global -> group * allow None for local_size in custom function * lil local * comment on shape * fix cuda * smart local cast * better local heuristic * fix ptx, and work_dim cleanup * fix metal * fix ops test * fix openpilot jit * no more optlocal * might fix metal tests * try metal now * see generated metal code * test free removal. REVERT THIS * mergable	2023-06-21 11:50:43 -07:00
George Hotz	7fbf96b992	jit: TODO, use abstractions	2023-05-05 22:51:30 -07:00
George Hotz	7ecf4dff68	multi cl_queue (#762) * multi cl_queue * only platforms 1 * gpus first, then cpus * put device on underlying buffer * cl_queue array	2023-05-03 12:15:28 -07:00
George Hotz	b12b60af20	fix binop, other tests failure (#723) * fix binop, other tests failure * that was a bad idea * better layernorm * inference kernel count tests * new style reshape pushing * fixup replacement * 199 kernels is okay. fix flops * push reshape through unaryops only * GRAPH=2 draws the phantom ops * found resnet issue * non working test * mul is cheaper than div * OPT inflation * SHUFFLE_PAD_OPS in OPT=2	2023-03-22 18:15:07 -07:00
George Hotz	f5467cfedc	Devicebufferless (#708) * runs one metal kernel * conv2d works * ops tests are passing * const folding * all ops work * pre commit always passes * torch works * working still * fix graph test * tests passing * image almost works * image conv works * most images * fix custom * fix assignment * fix compile enet * clean up comments * fix realize return value * include shapetracker in LB repr * copy should make a copy * reenable method cache * fix lna * dtypes in graph * forward only for IMAGE=2 * simple realize * getting close * fixup new api, it's good except the kernel count * back to 197 kernels * tests should pass * go to a real float * no type_on_cpu * fix the docs * put shapetracker back in it's proper place	2023-03-18 14:40:23 -07:00
George Hotz	d8dda2af3a	openpilot fixups	2023-03-06 14:14:44 -08:00
George Hotz	382f346523	clean up opt (#649) * clean up opt * don't let global kernels get too small * 8192 -> 1024 * disable local shape for clang * fix can_merge * unroll the 5x5 depthwise convs in op * load float4 check	2023-03-05 20:49:36 -08:00
George Hotz	c53efb3635	optimize for CL (#633) * required opt * simplify * works * shift_to_last * required is fine * print shape in colored * better shape * args was wrong * debugs * fix empty shape * colored shape printer	2023-03-03 22:00:09 -08:00
George Hotz	1a84976d4d	fix thneed gflops	2023-03-03 16:52:59 -08:00
George Hotz	b9ce20c374	openpilot test wasn't running, factor out image idx	2023-03-03 07:41:53 -08:00
George Hotz	2e26286294	speed like you wouldn't believe (#626) * speed like you wouldn't believe * fix tests	2023-03-02 07:49:19 -08:00
George Hotz	bfcec234a2	Refactor ASTs (#622) * ugh worst branch name * compiler refactor continues * scc -> cloc * buf -> _buf * finish _buf, and program -> runtime * gpu is still working, clang isn't * clang in new style * ops_metal * something broke it * improve metal * clean up tons of cl crap * hack fix sync * cleaner gpu * gpu metal clang * cleanups * minor refactor * GPUCodegen * fix up LLVM * blind CUDA refactor * codegen / runtime * keep ops naming * linter passes * woah, llvm was allocing 4x what it needed to * bugfixes * fix openpilot compiler * fix compile_efficientnet * method cache should fix tests * deal with duped functions	2023-03-01 18:57:29 -08:00
voidz	94bec40110	moved extras/jit.py -> tinygrad/jit.py (#599) * moved extras/jit.py to tinygrad/jit.py * fixed indent * removed tinygrad.helpers.DEBUG from jit.py	2023-02-25 08:32:33 -08:00
George Hotz	d3029c91c5	no rng for op test	2023-02-24 00:23:20 -08:00
George Hotz	661812ffef	don't ignore type	2023-02-23 19:38:52 -08:00
George Hotz	8b0082540b	openpilot compile cleanups	2023-02-20 09:16:03 -08:00
George Hotz	de71c13934	test speed v torch uses jit	2023-02-12 07:43:17 -08:00
George Hotz	031edd01e6	switch openpilot compile to TinyJit	2023-02-11 09:51:44 -08:00
George Hotz	3d63934995	refactor to keep cl in the runtime (#545) * refactor to keep cl in the runtime * fix thneed, rename cl to _cl * bugfix + _cuda * fix tests * thneed more correct	2023-02-08 16:46:09 -06:00
Jacky Lee	799b3f185a	Refactor getenv into helpers (#508) * Refactor getenv into helpers * Remove unused os * Fix default value * Fix more defaults for CI * Fix bracket * Revert changes to openpilot/compile.py * Use getenv from helpers when possible	2023-01-31 15:09:09 -08:00
George Hotz	92001a06e1	openpilot/go.sh	2023-01-28 13:57:43 -08:00
George Hotz	6d7658db12	delete opencl <celebration>	2023-01-24 14:18:35 -08:00
George Hotz	e313c8af20	update openpilot tests from OPENCL to GPU	2023-01-24 14:05:20 -08:00
George Hotz	281b0db773	three from image	2023-01-12 12:26:58 -08:00
George Hotz	4885fce56e	shapetracker from newgpu (#456) * shapetracker from newgpu * touchup ops * test * testst * thneed deletes unused inputs * test * bugfix	2023-01-09 12:40:01 -08:00
George Hotz	e6b65f8e01	fix graph in openpilot/compile.py	2022-10-28 08:55:34 -07:00
George Hotz	ef62db3186	cleanups, remove E701	2022-10-28 08:28:56 -07:00
George Hotz	b65b70812a	Exec AST (#404) * working exec ast * exec_ast is staticmethod * GenericExecAST * fold that sometimes * ExplicitExecAST * exec_ast for GPU * gpu working * get_lazyop_shape * now gpubuffer is ExplicitExecAST * dedup * add a type * RESHAPE in opencl code * fix linter * that too for linter * cleanups * remove dead code * GenericShape is less lines * add ALLOWED_KERNEL_COUNT to tests * fix mypy * that's gotta be recursive * fix opencl shape processing * remove unneeded lambda	2022-10-28 08:27:03 -07:00
George Hotz	6a8fb53304	move ops.py into lazy.py (#402) * move ops.py into lazy.py * fix graph and linter * ugh, didn't add	2022-10-25 13:58:03 -07:00
George Hotz	3b9b7eda48	remove run_thneed dead code	2022-10-20 17:24:18 -07:00
George Hotz	1bec4651b3	fix nonstatic weights	2022-10-20 17:04:14 -07:00
George Hotz	50c95c7d9a	add assert to catch issue in attention	2022-10-20 15:13:00 -07:00
George Hotz	26c78ccf7d	remove useless buffer	2022-10-20 14:07:28 -07:00
George Hotz	a18c1f3178	zero out the inputs	2022-10-20 13:46:52 -07:00
George Hotz	61ee428e4c	rerun	2022-10-20 13:29:14 -07:00
George Hotz	5dae64b7b0	read input shapes and break down the layers	2022-10-20 13:11:24 -07:00
George Hotz	e00601faea	fix thneed self test	2022-10-20 12:55:02 -07:00
George Hotz	ace8db29f8	ReduceSum	2022-10-20 12:48:14 -07:00
George Hotz	c400ee0beb	refactoring thneed (#400) * refactoring thneed * continue * minor update * looks like it's working * big refactor * confirm thneed got the right output * code is there but it's broken * works now * always OPTWG, input -> dat * fix type issue	2022-10-20 12:35:59 -07:00
YassineYousfi	ae0f9b17df	openpilot: new models and onnx ops (#401) * ngrl stuff * fngrl * fix typo in compile script * workflow dispatch * new models in tests * dont need to up this threshold Co-authored-by: HaraldSchafer <harald.the.engineer@gmail.com>	2022-10-20 11:49:19 -07:00
George Hotz	d6f499fd69	improve opencl, why is it OOMing	2022-09-05 20:14:31 -07:00
George Hotz	2e9b7637b3	don't save input buffers	2022-08-31 15:37:38 -07:00
George Hotz	a3fc64a585	fix batchnorm folding in openpilot compile	2022-08-31 13:04:49 -07:00
Comma Device	a734df98fa	TEST_ENET for openpilot compiler	2022-08-31 13:23:36 -04:00
George Hotz	d919ac32af	fix wrong size input	2022-08-31 09:07:34 -07:00
George Hotz	040640a580	fix cl import error	2022-08-31 08:43:44 -07:00
George Hotz	33ac355bcd	still broken	2022-08-29 19:08:07 -07:00
George Hotz	5efab7cf1d	add reciprocal	2022-08-29 18:00:24 -07:00

77 Commits