Francis Lam
dece9958f8
wmma: clean up to make WMMA arg order consistent (#2014)
also add cache defeat to extra/gemm/simple_matmul.py
2023-10-07 17:45:40 -07:00
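The "cache defeat" note above refers to keeping a benchmark honest: if the same input buffers are reused on every iteration, a kernel or result cache can make later runs look artificially fast. A minimal, hypothetical sketch of the pattern in plain numpy (not the actual extra/gemm/simple_matmul.py code):

```python
import time
import numpy as np

N = 1024

def bench_matmul(defeat_cache: bool, iters: int = 5) -> float:
    a = np.random.rand(N, N).astype(np.float32)
    b = np.random.rand(N, N).astype(np.float32)
    best = float("inf")
    for _ in range(iters):
        if defeat_cache:
            # fresh inputs every iteration, so a result cache can never
            # hand back a previously computed product
            a = np.random.rand(N, N).astype(np.float32)
            b = np.random.rand(N, N).astype(np.float32)
        st = time.perf_counter()
        c = a @ b  # the matmul being timed
        best = min(best, time.perf_counter() - st)
    return best

print(f"reused inputs: {bench_matmul(False):.4f}s, fresh inputs: {bench_matmul(True):.4f}s")
```

With numpy both numbers are similar; the pattern matters for frameworks that cache compiled kernels or realized results.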
George Hotz
cea4cbfc7a
move image+kopt to features (#2015)
* move image+kopt to features
* fix tests
* debug prints (unrelated)
2023-10-07 15:41:08 -07:00
George Hotz
44ed94ef5c
use the device abstraction in handcode_resnet50_opt
2023-10-07 13:22:20 -07:00
George Hotz
6ee9cae44f
don't extract CIFAR every time / use the cache
2023-10-07 12:33:50 -07:00
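The idea behind "don't extract CIFAR every time / use the cache" is standard dataset caching: download and extract once into a cache directory, then reuse the extracted files on later runs. A hedged sketch, where the paths and helper name are illustrative rather than tinygrad's actual dataset code:

```python
import os, tarfile, urllib.request

CACHE_DIR = os.path.expanduser("~/.cache/datasets")  # hypothetical cache location

def fetch_cifar(url: str = "https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz") -> str:
    os.makedirs(CACHE_DIR, exist_ok=True)
    tar_path = os.path.join(CACHE_DIR, os.path.basename(url))
    extracted = os.path.join(CACHE_DIR, "cifar-10-batches-py")
    if not os.path.exists(tar_path):      # download only once
        urllib.request.urlretrieve(url, tar_path)
    if not os.path.isdir(extracted):      # extract only once; later runs skip straight here
        with tarfile.open(tar_path) as tar:
            tar.extractall(CACHE_DIR)
    return extracted
```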
nimlgen
d07ac379f9
add var_vals to kopt with symbolic (#2008)
* add var_vals to kopt with symbolic again
* no copies
2023-10-07 09:34:21 -07:00
George Hotz
121f7aa8c5
Schedule item (#2012)
* ScheduleItem
* put var_vals in the schedule
* fix tests, wow that proliferated quickly
* not ready to be in the schedule
2023-10-07 08:59:25 -07:00
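The ScheduleItem commit groups everything needed to run one step, including the var_vals for symbolic shapes, into a single schedule entry. The fields and names below are hypothetical, shown only to illustrate the shape of such a structure, not tinygrad's actual definition:

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple

@dataclass(frozen=True)
class ScheduleItem:
    run: Callable[..., None]                  # compiled kernel or load op to execute
    outputs: Tuple[object, ...]               # buffers this item writes
    inputs: Tuple[object, ...]                # buffers this item reads
    var_vals: Dict[str, int] = field(default_factory=dict)  # bound values for symbolic variables

def run_schedule(schedule: Tuple[ScheduleItem, ...]) -> None:
    # execute items in order, passing each the variable values it was scheduled with
    for si in schedule:
        si.run(*si.outputs, *si.inputs, **si.var_vals)
```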
George Hotz
f1f64bc88d
remove val_vars from the linearizer (#2009)
* remove val_vars from the linearizer
* no need to store var vals
2023-10-07 07:47:28 -07:00
George Hotz
dea8bb0938
triton isn't tested, and allows this refactor (#2007)
* triton isn't tested
* cuda buffer
2023-10-07 07:29:59 -07:00
George Hotz
23de1db727
strip whitespace
2023-10-07 06:06:27 -07:00
Roelof van Dijk
26fcc8dff6
fix: remove runtime imports (#1982)
fix: import what is used
probably monkeypatched
fix: import
revert selective import
2023-10-07 05:23:08 -07:00
George Hotz
f54959e5cd
move print tree into graph (#2003)
* move print tree into graph
* add winograd profiling test
* change pre-commit to run ruff first
2023-10-07 04:39:21 -07:00
Ahmed Harmouche
2114dc13d1
Allow multi-input model export (#1995)
* Allow multi-input model export
* Add model export unit test
* Fix efficientnet compilation
* Only run model export test on JIT supported devices
* Skip export model test if not EXPORT_SUPPORTED_DEVICE
2023-10-07 04:13:34 -07:00
George Hotz
ffa33d743a
good changes from openpilot_compile2 (#2000)
* good changes from openpilot_compile2
* float32 image type was wrong
* cleaner way to write that + a test
2023-10-06 13:33:24 -07:00
chenyu
05be57f57f
Fix llama with empty prompt (#1997)
* fix llama with one token prompt
* llama is all_jitted
2023-10-06 06:48:07 -07:00
George Hotz
7a68060422
Revert "allow local + grouped reduce in hand_coded ( #1996 )" ( #1998 )
...
This reverts commit 219a1f7063 .
2023-10-06 06:43:28 -07:00
nimlgen
219a1f7063
allow local + grouped reduce in hand_coded (#1996)
* allow local + grouped reduce in hand_coded
* allowed loop size based on global_dims
* fix const
* fix const one more time
* better divisor
* a small fix
* can take 2, why not
* fix linter
* better comments
* start with 2
* don't always pick group reduce
* fix images
* better images
* better
2023-10-06 06:11:28 -07:00
George Hotz
fa9945dac0
remove stale tests
2023-10-06 02:14:56 -07:00
Vidhan Bhatt
94b21c41a7
ci: use mypy.ini (#1993)
2023-10-06 01:45:28 -07:00
George Hotz
e43d8977f8
Revert "chore: add py.typed marker. ( #1991 )" ( #1994 )
...
This reverts commit 6d581e8911 .
2023-10-06 01:44:34 -07:00
Vidhan Bhatt
6d581e8911
chore: add py.typed marker. (#1991)
* chore: add `py.typed` marker.
* fix: add comma
2023-10-05 16:27:33 -07:00
chenyu
da2b3e55f4
simpler llama - don't shrink twice (#1981)
2023-10-05 14:31:46 -07:00
Roelof van Dijk
972d9ea215
fix: PRUNEGRAPH is unused (#1985)
2023-10-05 14:28:43 -07:00
George Hotz
21a2c5df73
fix up contiguous (#1978)
2023-10-05 07:22:05 -07:00
chenyu
c99fa58dd2
simplify gpt2 example (#1973)
* simplify gpt2 example
* kernel_jitted_count and jit tests
* Revert "kernel_jitted_count and jit tests"
This reverts commit 31a3c26dd0.
* all_jitted test in test_real_world
2023-10-05 07:09:29 -07:00
George Hotz
2d0c1037b1
Fix up latest openpilot model (#1976)
* fix gemv triggering for gemm
* fixup_openpilot
* external test issues
2023-10-05 05:24:28 -07:00
George Hotz
1862e14a4f
fix gemv triggering for gemm (#1975)
2023-10-05 05:23:00 -07:00
Francis Lam
0ba75c4370
optimizer: add matvec optimizations (#1972)
* optimizer: add matvec optimizations
* renderer: fix alignment of shared memory in opencl
2023-10-04 14:16:27 -07:00
George Hotz
3d5127038c
don't create linearizer if we are in the method cache (#1969)
* don't create linearizer if we are in the method cache
* remove unchecked properties
* that key isn't used
* fix default type is sticky
2023-10-04 12:42:58 -07:00
George Hotz
de5d603ec1
corealize + remove realize from lazybuffer (#1968)
* corealize + remove realize from lazybuffer
* fix multigpu
* fix graph
2023-10-04 10:59:31 -07:00
George Hotz
88b6ed6945
disable broken optim_conv2d
2023-10-04 07:33:50 -07:00
George Hotz
d449b3bef1
think about removing realize from lazybuffer (#1965)
* remove realize from lazybuffer
* okay fine, back that off
* fix tests maybe
* fix test
2023-10-04 07:18:58 -07:00
nimlgen
2ea1dd3e87
no process() in Linearizer (#1966)
* no process() in Linearizer
* more process() clean up
2023-10-04 07:18:42 -07:00
George Hotz
0945848b5f
schedule the loadops like everything else (#1964)
* schedule the loadops like everything else
* unify loadops with other things we schedule
* delete all the ops
* fix symbolic jit
2023-10-04 02:36:04 -07:00
Ahmed Harmouche
fb4d830a2a
Fix cast error in render_load in wgsl (#1956)
* Fix cast error in wgsl
* Use render_cast instead of introducing a new method
* Make it shorter
* Add back webgpu tests: efficientnet and dtypes
2023-10-04 02:29:14 -07:00
George Hotz
6a79d4044a
unrealized consts everywhere (#1963)
* unrealized consts everywhere
* don't import device from lazy
* Device isn't in Lazy
* same issue
* disable jit random
2023-10-04 01:48:10 -07:00
nimlgen
f04c1a63ae
Rand works in jit (#1960)
* rand works in jit
* better jitted rand creation
* Update realize.py
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-03 12:55:25 -07:00
George Hotz
f64d5b3ba8
move to realize.py (#1961)
* move to realize.py
* run_schedule moved
2023-10-03 07:25:40 -07:00
George Hotz
717451a244
Revert "optimizer: add matvec optimizations ( #1753 )" ( #1959 )
...
This reverts commit f520323054 .
2023-10-03 00:28:42 -07:00
Francis Lam
f520323054
optimizer: add matvec optimizations (#1753)
* optimizer: add matvec optimizations
* Update optimizer.py
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-03 00:01:59 -07:00
nimlgen
e1f2c2cc19
fix jitted dist (#1955)
2023-10-02 11:45:13 -04:00
Roelof van Dijk
35ac60775b
simplify line (#1950)
* no need to index here, zip automatically truncates
* enumerate is faster
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-02 03:19:15 -07:00
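The two notes in the "simplify line" commit are general Python facts worth illustrating: zip stops at the shortest input, so no manual index bound is needed, and enumerate replaces the range(len(...)) pattern. A small standalone example:

```python
a = [10, 20, 30, 40]
b = ["x", "y", "z"]

# zip truncates to the shorter iterable; no slicing or length check needed
pairs = list(zip(a, b))                 # [(10, 'x'), (20, 'y'), (30, 'z')]

# enumerate is cleaner (and typically faster) than indexing with range(len(...))
for i, (av, bv) in enumerate(zip(a, b)):
    print(i, av, bv)
```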
nimlgen
08e884217c
metal batch executor (#1920)
* metal batch executor
* no sym_infer in backends
* calc_stat in BasicBatchExecutor
* run in batches of size 8
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-02 03:18:31 -07:00
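The batch-executor commit mentions submitting work "in batches of size 8" rather than one kernel at a time, which amortizes per-submission overhead on the Metal backend. A hypothetical, backend-agnostic sketch of the batching pattern (not the actual BasicBatchExecutor):

```python
from typing import Callable, List

def run_batched(kernels: List[Callable[[], None]], batch_size: int = 8) -> None:
    # group queued kernels into fixed-size batches; a real backend would encode each
    # batch into one command buffer and commit it once, instead of one submit per kernel
    for start in range(0, len(kernels), batch_size):
        for kernel in kernels[start:start + batch_size]:
            kernel()

# usage: run_batched([lambda: None] * 20)  # 20 kernels submitted as 3 batches
```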
George Hotz
d48a90859c
use the opts from the default device (#1954)
2023-10-02 03:13:46 -07:00
nimlgen
c27971d51f
fix llvm nan/inf const (#1951)
* allow llvm
* llvm works with inf/nan
* enable some fast math back
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-10-02 03:08:57 -07:00
George Hotz
6a4ec4776e
fix CI (#1953)
* this works
* unauth
* update in all places
2023-10-02 02:58:58 -07:00
Daniel Riege
579cabf668
Fix examples/train_efficientnet (#1947)
* added missing colon
* bug fixes for cifar10 dataset loading
needed a reshape to work with conv layers, and resolved the fetched tensor to numpy since the following code expects a numpy array
2023-10-02 02:23:38 -07:00
David Hou
d4671cd8e3
use schedule in more places in linearizer tests (#1946)
* pass current linearizer opts to Linearizer in TestFloat4
* use schedule instead of exec_ast hook
2023-10-02 02:22:56 -07:00
Roelof van Dijk
e7a49e84c8
perf: assert behind if is not optimized (#1847)
* perf: assert behind if is not optimized
* Update helpers.py
---------
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-09-29 11:07:24 -07:00
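One reading of "assert behind if is not optimized": a bare assert is compiled under an implicit __debug__ check and stripped entirely by `python -O`, while an equivalent explicit if/raise always runs. The sketch below only shows that general Python behavior; it is not the PR's actual diff.

```python
def expensive_check(xs: list) -> bool:
    # stand-in for a costly invariant check
    return all(v >= 0 for v in xs)

xs = list(range(1000))

# removed entirely (check included) when running with `python -O`
assert expensive_check(xs), "negative value found"

# never stripped: the costly check runs on every call, even under -O
if not expensive_check(xs):
    raise AssertionError("negative value found")
```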
David Hou
8e9db88474
expand after expr_idxs in Linearizer.global_load (#1818)
* small changes
* expand in terms of substitute, directly expand g_idxs g_valid
* delete expand_ops
* don't compare using hash
* any instead of in
thanks gijskoning
Co-authored-by: Gijs Koning <gijs-koning@live.nl>
* support tc
* testing code
* no more create_rednode
* maxsize none in view/node
* oops
* undo
* typing
* oops
* oops
* lmao
* lmao
* add expand multi test
* Node.iter_idxs
* type
* type
* delete checks!
* clean up a little?
* expand_idx in symbolic
* un-golf
* play around with types >.>
* test_substitute and also remove an incorrect test?
* get rid of range
* Update symbolic.py
* split out view cache change
* split out flat components change
* reduce diff
* reduce diff
* add some float4 tests
* fix
---------
Co-authored-by: Gijs Koning <gijs-koning@live.nl>
2023-09-29 10:33:34 -07:00
nimlgen
692bec7b6f
simplify CacheCollector (#1944)
* rewrite cc
* fix
* fix tests
* fix all tests
* is it better
* better with shape
* cleaner
* linter fix
* no ;
* better comment
* better comments
* no thneed changes
2023-09-29 10:13:04 -07:00