* optimizer: add test for correctness of opts
Also added OptOps.UPCASTMID to constrain valid axes for opts with
group_for_reduce.
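A minimal sketch of the idea, with illustrative member and field names (only OptOps.UPCASTMID itself is from this change): giving the mid-level upcast its own action lets the optimizer reject it on kernels that don't group the reduce.
```python
from enum import Enum, auto

class OptOps(Enum):
  UPCAST = auto(); LOCAL = auto(); GROUP = auto(); UPCASTMID = auto()

def check_opt(kernel, op: OptOps, axis: int) -> None:
  # UPCASTMID targets the axis introduced by group_for_reduce, so it is
  # invalid elsewhere (`group_for_reduce` is a hypothetical kernel field)
  if op is OptOps.UPCASTMID:
    assert kernel.group_for_reduce, "UPCASTMID requires group_for_reduce"
    assert axis < len(kernel.group_for_reduce), "axis must be a grouped dim"
```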
* llvm: fix LinearizerOptions to correctly set has_shared=False
* optimizer: remove premature test scaffold for TC opts
* search: fix the action space
* uops graphing
* add_phi_node
* fewer phi nodes
* where graph uops should live
* naming
* move it to external
* fix triton yolo
* fix clang and preserve behavior
* add mops to graph, refactor IMAGE
* no reshape pushing
* add todo
* fix openpilot model alt
* pushing reshapes reduces kernels in new op
* IMAGE=2 is a first-class citizen now
* Revert "disable flaky triton test"
This reverts commit 1e15fdaee7.
* Update test.yml
* check if has_shared for matvec
* disable ocelot cache for triton
* disable ocelot cache
* disable ocelot cache
* pass shared to triton uops tests
* temporary debugs for CI crash
* Revert "temporary debugs for CI crash"
This reverts commit fee3ea96c8.
* Revert "triton isn't tested, and allows this refactor (#2007)"
This reverts commit dea8bb0938.
* add runtime_args to every renderer, move triton local size override to runtime args
* Add binary to args, correct type returned
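A hedged sketch of the shape of that change (the function name, dict keys, and toy source here are assumptions, not the real API): each renderer hands back runtime args alongside the source, so backend quirks like triton choosing its own local size ride along instead of being special-cased downstream.
```python
from typing import Any, Dict, List, Tuple

def render_triton(name: str, uops: List[Any]) -> Tuple[str, Dict[str, Any]]:
  src = f"# triton kernel {name} ({len(uops)} uops)"  # stand-in for real codegen
  # binary: whether compile output is a binary blob; local_size_override:
  # triton picks its own local size, so tell the runtime to ignore ours
  return src, {"binary": False, "local_size_override": [32, 1, 1]}

src, runtime_args = render_triton("r_add", uops=[])
```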
* update to new loops
* Update test.yml
* some cleanup
* move continue back
* more more more
* added to CI
* try
* try intentionally breaking some tests
* wtf
* del True for test
* yay tests broke, now pls no break
* try AGAIN
* gahy
* lol
* try
* move over constant
* moved over MORE
* move shrink over
* trailing lines
* try CUDA CI
* try again
* boom
* oops
* improved comments
* try: disable some flags and disable CUDA
* try breaking tests
* traceback has too much info, so add --tb=no
* revert forced CI failure
* add comments and del unused imports
* oooooooo using regular debug, try enabling tb
* intentionally break tests
* added tb back. Maybe not too verbose
* strip whitespace
* missed something
* Shape op int32 -> int64
* oops missed something
* add some types
* get rid of crazy one-liners in pad op
* actually test Split this time LOL
* strip that whitespace
* limit metal buffers
* look at the base, not the srcs
* Revert "Revert "openpilot kernel fix from 209 to 207 (#2006)" (#2065)"
This reverts commit 924ecc4d6a.
* add a test for that
* create cache for q learning
* make linter happy
* global beam
* where it belongs
* bugfix
* ditch the kopt, use the beam
* faster lin and DEBUG=2 okay
* remove kopt, move search to features
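A minimal sketch of the beam search over linearizer actions, assuming hypothetical helpers `get_actions` (enumerate applicable opts), `apply_opt` (return a new linearizer), and `time_linearizer` (compile and time a candidate): keep the `width` fastest kernels at each depth instead of relying on hand-coded heuristics.
```python
def beam_search(lin, get_actions, apply_opt, time_linearizer, width=4, depth=3):
  beam = [(time_linearizer(lin), lin)]
  for _ in range(depth):
    candidates = []
    for _, parent in beam:
      for act in get_actions(parent):
        child = apply_opt(parent, act)
        candidates.append((time_linearizer(child), child))
    if not candidates: break
    # keep the `width` fastest kernels, including the current beam
    beam = sorted(beam + candidates, key=lambda x: x[0])[:width]
  return beam[0][1]  # fastest linearizer found
```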
* Fix openpilot kernel from 209 to 206
1. Use push_movement_ops conditions in _movement_op. Don't push PAD, or check whether the ops are safe to be pushed with PAD.
2. Don't push if all the op.buffers are realized.
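Roughly, the decision looks like this hedged sketch (the op names and the `realized` field are illustrative stand-ins for the real conditions in _movement_op):
```python
SAFE_TO_PUSH = {"RESHAPE", "PERMUTE", "SHRINK", "STRIDE"}  # PAD intentionally absent

def should_push(op_name: str, buffers) -> bool:
  if op_name not in SAFE_TO_PUSH: return False          # e.g. never push PAD
  if all(getattr(b, "realized", None) is not None for b in buffers):
    return False                                        # all inputs already realized
  return True
```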
* change ALLOWED_KERNEL_COUNT to 206 for openpilot
* don't push through sourceless buffers
* change the tests to adjust kernel counts for new behaviour
* restore pushing of movement ops through childless buffer
* don't push EXPAND, causes OOM
* allow push of intermediate movement ops
* adding new test behaviour
* modifying external_test_opt for new behaviour
* restore old tests
* Reenable push of EXPAND and introduce new tests
I was wrong initially in thinking EXPAND could cause OOM, so I had disabled it. Since it is 0-stride and doesn't allocate memory, it's cool.
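The 0-stride point is easy to see with numpy, whose broadcast_to behaves like EXPAND:
```python
import numpy as np

a = np.zeros((4, 1))
b = np.broadcast_to(a, (4, 1024))   # EXPAND-like op
print(b.strides)                    # (8, 0): the expanded axis has stride 0
print(b.base is a)                  # True: no new memory was allocated
```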
* Don't push EXPAND above LoadOps LB. This is causing OOM
* Push should be decided on movement root of bufs
To check if ast.op.buffers are sourceless/realized, go to the movement root and then decide whether pushing should be done.
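A hedged sketch of that walk (the op names and lazybuffer fields are loose stand-ins for the real ones):
```python
MOVEMENT_OPS = {"RESHAPE", "PERMUTE", "PAD", "SHRINK", "EXPAND", "STRIDE"}

def movement_root(buf):
  # follow single-source movement ops down to the underlying buffer
  while getattr(buf, "op", None) in MOVEMENT_OPS:
    buf = buf.src[0]
  return buf

def pushable(bufs) -> bool:
  roots = [movement_root(b) for b in bufs]
  # don't push when every root is already realized or has no source
  return not all(getattr(r, "realized", None) is not None or not getattr(r, "src", ())
                 for r in roots)
```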
* refactor for readability
* use .base instead
* don't push expand, bad memory/compute consumption
* restrict push of reshape, seeing improvement
* push reshape if unary without further check
* disabling PAD solves convnext kernel count increase
* reenable test_cache_binaryop_transpose
* small nit
* init compiled cache
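A minimal sketch of such a cache, assuming compilation is deterministic: key on a hash of the source (plus kwargs) and store the raw bytes. The cache location, key scheme, and decorator shape here are illustrative, not the real implementation.
```python
import hashlib, pathlib, functools

CACHE_DIR = pathlib.Path("/tmp/tinygrad_cc_cache")  # illustrative location

def diskcache_compile(fxn):
  @functools.wraps(fxn)
  def wrapper(src: str, **kwargs) -> bytes:
    CACHE_DIR.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(
      (fxn.__name__ + src + repr(sorted(kwargs.items()))).encode()).hexdigest()
    path = CACHE_DIR / key
    if path.exists(): return path.read_bytes()
    binary = fxn(src, **kwargs)   # compile returns bytes
    path.write_bytes(binary)
    return binary
  return wrapper
```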
* clang: don't compile to stdout
* use kwargs in compile
* remove some useless lines
* slimmer
* fix
* tabs
* retry
* remove decorator
* no race in hip
* smaller hip
* unused import
* unused pathlib
* path to str
* add test
* fix linter
* fewer lines?
* decorator is back
* update tests
* no hip version
* better comments
* a bit better test
* linter
* work w/o decorator
* linter happy
* simpler return type
* more tests
* better comment
* readable
* readable
* readable
* compile returns bytes
* no unused imports
* readable
* start work on auto opt
* lin failure
* not beating hcopt
* greedy
* timing is fast
* codegen.search
* greedy search in handcode_opt
* track running gflops
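A minimal sketch of the greedy loop, using the same hypothetical helpers as the beam-search sketch above plus an `op_estimate` FLOP count for the running-GFLOPS readout:
```python
def greedy_search(lin, get_actions, apply_opt, time_linearizer, op_estimate):
  best_time = time_linearizer(lin)
  while True:
    children = [apply_opt(lin, a) for a in get_actions(lin)]
    if not children: return lin
    timed = [(time_linearizer(c), c) for c in children]
    t, best_child = min(timed, key=lambda tc: tc[0])
    if t >= best_time: return lin   # greedy: stop at the first non-improvement
    best_time, lin = t, best_child
    print(f"{op_estimate / best_time * 1e-9:6.2f} GFLOPS")  # running gflops
```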
* clean up those files
* no failure
* testing with the test_ops pattern
* add assign test
* flake8 complaining about single line fn
* slice 2d and minor cleanup
* make assign_slice a one-liner
* we don't need to repeat the same lambda twice; default tinygrad_fxn to np_fxn
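i.e., roughly this pattern (helper name and signature simplified):
```python
def helper_test_op(shapes, np_fxn, tinygrad_fxn=None):
  # if no separate tinygrad lambda is given, reuse the numpy one
  tinygrad_fxn = tinygrad_fxn if tinygrad_fxn is not None else np_fxn
  # ... build inputs of `shapes`, run both fxns, compare outputs ...
```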
* back assign fn for np array
* implement __setitem__ in tensor.py
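A usage sketch of the new slice assignment (exact accepted value types depend on the implementation):
```python
from tinygrad.tensor import Tensor
import numpy as np

t = Tensor(np.zeros((4, 4), dtype=np.float32))
t[1:3, 1:3] = Tensor(np.ones((2, 2), dtype=np.float32))  # routes through __setitem__
print(t.numpy())  # center 2x2 block is now ones
```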
* don't re-slice the ret tensor
* one liner assign
* drop the permute test
* Allow multi-input model export
* Add model export unit test
* Fix efficientnet compilation
* Only run model export test on JIT supported devices
* Skip export model test if not EXPORT_SUPPORTED_DEVICE
* simplify gpt2 example
* kernel_jitted_count and jit tests
* Revert "kernel_jitted_count and jit tests"
This reverts commit 31a3c26dd0.
* all_jitted test in test_real_world