tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
Ahmed Harmouche	4b01839774	support vals on WebGPU, run more tests (#2668 ) * Vals on webgpu, run more tests * Skip slow tests, run symbolic ops tests * Balance out tests	2023-12-07 16:45:21 -08:00
George Hotz	00d9eda961	FROM -> COPY, move vars_from_ast (#2675 )	2023-12-07 16:32:30 -08:00
Ahmed Harmouche	50dcd532d5	Get all WEBGPU test_ops passing (#2646 ) * Get all WEBGPU tests passing * Custom render store is not needed in wgsl	2023-12-06 07:40:37 -08:00
George Hotz	35b5e95097	parallel beam search (#2610 ) * better print * fix beam search with vars * cleanups * parallel is not default * restore that * bugfix * cleanups * bugfix	2023-12-05 10:09:45 -08:00
chenyu	1ac958a058	update pytest marks and CI test filters (#2587 ) * remove pytest marks * test more stuff * fine revert some * add that mark back * skip that * hmm LLVM does not work on ubuntu * too slow on CUDA CI * dup test	2023-12-03 15:20:44 -05:00
George Hotz	5068e99d18	refactor to remove extra kernel params (#2563 ) * refactor to have compiled kernel * bugfixes * docs/beautiful.py * revert that * fix tests	2023-12-02 00:32:25 -08:00
George Hotz	27481b9206	Switch ops_gpu -> gpuctypes (#2532 ) * ops_gpu is go * fix size 0 * fix image, and add more tests * nerf openpilot test, doesn't test thneed * run the schedule * better * oops, new inputs * delete pyopencl * Update ops_gpu.py	2023-12-01 22:30:21 -08:00
George Hotz	4c984bba7e	bump version to 0.8.0, clean CI, remove requests (#2545 ) * bump version to 0.8.0, clean CI, remove requests * why was that even there	2023-12-01 10:42:50 -08:00
George Hotz	8fd8399437	remove flake8 (#2544 )	2023-12-01 09:48:41 -08:00
George Hotz	d8175a4380	simple fix (#2543 )	2023-12-01 09:42:15 -08:00
George Hotz	2c363b5f0b	new style device (#2530 ) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
chenyu	7d26452305	call ruff with --preview (#2522 ) some checks are ignored without --preview	2023-11-30 13:59:00 -05:00
George Hotz	3dedeaae74	rebalance tests (#2504 ) * rebalance * balance * parallel apt-get for all * .local/lib/python3.11/site-packages * what is user doing * is that path right * Update test.yml * okay where are you * site-packages	2023-11-29 11:18:22 -08:00
George Hotz	065aff747e	make webgpu test reliable (#2502 ) * remove retry that doesn't work * fix cleanup * process exit in cleanup * add space	2023-11-29 10:02:24 -08:00
George Hotz	947711a532	split metal and webgpu tests (#2501 )	2023-11-29 09:32:09 -08:00
chenyu	3eb3c74675	metal ci tests everything (#2499 ) * metal ci tests everything * pretty good * METAL	2023-11-29 12:04:37 -05:00
George Hotz	889acefe85	Support weird loads in Image (#2498 ) * image support weird loads * umm, that was always wrong * openpilot compile fails with a weird error * image test passes * we have valids now * clean that up * no more required opts * add fastvits test, fix bug * minor cleanups	2023-11-29 08:30:46 -08:00
Liam	cf0c9096a9	Removing METAL Skips as CI works (#2488 ) * Test metal CI * remove metal and CI restrictions * enable dtype tests for metal ci	2023-11-28 19:46:59 -08:00
George Hotz	d87a246439	move to new cached fetch (#2493 ) * move to new cached fetch * extra.utils is over * loads * bump download cache * bump timeout	2023-11-28 17:36:55 -08:00
chenyu	28a67106ca	enable symbolic ops tests for hip (#2485 )	2023-11-27 22:33:41 -08:00
Davi Silva	136dbd8b36	HIP CI that compiles (to RDNA3) but doesn't have to run (#2482 ) * hip amd compilation * gate the test properly * cleanup unused import * remove superfluous numpy conversion * add SpeedyNet tests (f32 [passes] & f16 [fails]) * make CI verbose (error log from hip compiler) * test the real ops_hip * Merge branch 'tinygrad:master' into ci/hip-compilation * fix CI * cleanup * really fix CI * Fix CI Three: the refixening --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-27 21:17:06 -08:00
George Hotz	acbe6d1b53	Revert "HIP compilation on CI targeting RDNA3 (#2459 )" (#2481 ) This reverts commit `d275ff930a`.	2023-11-27 20:41:21 -08:00
Davi Silva	d275ff930a	HIP compilation on CI targeting RDNA3 (#2459 ) * hip amd compilation * gate the test properly * cleanup unused import * remove superfluous numpy conversion * add SpeedyNet tests (f32 [passes] & f16 [fails]) * make CI verbose (error log from hip compiler) * test the real ops_hip * Merge branch 'tinygrad:master' into ci/hip-compilation * fix CI * cleanup * really fix CI	2023-11-27 20:33:11 -08:00
George Hotz	9e07824542	move device to device.py (#2466 ) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
andresgit	259a869fc1	Fix UnicodeDecodeError when debugging on Intel APU (#2421 ) * test DEBUG=5 * print prg if NVIDIA, fixes error on Intel APU	2023-11-25 12:30:50 -08:00
George Hotz	095e2ced61	add name support to fetch (#2407 ) * add name support * use fetch in gpt2 * remove requests from main lib, networkx also optional * umm, keep that assert * updates to fetch * i love the walrus so much * stop bundling mnist with tinygrad * err, https * download cache names * add DOWNLOAD_CACHE_VERSION * need env. * ugh, wrong path * replace get_child	2023-11-23 14:16:17 -08:00
Francis Lata	6d672785db	Update Whisper to use fetch helper (#2401 ) * update whisper to use new fetch helper * simplify file opening * update name * update key name to "downloads-cache"	2023-11-23 12:59:59 -08:00
George Hotz	66c75f30c6	remove triton (#2396 )	2023-11-23 07:40:59 -08:00
George Hotz	8656eebb42	jit doesn't use named tensors (#2393 ) * jit doesn't use named tensors * move to compile2 * remove broken single root junk * explicit float32 * skip slow test	2023-11-23 00:13:18 -08:00
mmmkkaaayy	08d09eb666	Enable whisper test in CI for more backends (#2355 )	2023-11-18 17:52:50 -05:00
chenyu	8e22c0d95c	everything can jit now (#2338 )	2023-11-16 23:54:57 -05:00
George Hotz	1d5501594e	force rebuild of ocelot (#2334 ) * force rebuild of ocelot * SzymonOzog gpuocelot * delete that * downgrade that * non parallel * force rebuild * use llvm * nauto * less mem maybe * print test * helper_test_exception skip CUDACPU * helper_test_exception * shippable	2023-11-16 20:44:14 -08:00
chenyu	163b2bc26a	wgpu.utils._device -> wgpu.utils.device (#2330 ) * wgpu.utils._device -> wgpu.utils.device * can i do this? * no need to specify metal	2023-11-16 12:52:13 -05:00
forcefieldsovereign	b64738e1d6	Remove AS_STRIDED from shapetracker (#2216 ) * very close * remove comment * negative strides working * almost everything passes * calculate offset with list comprehension * some cleanup * got disk load working * review suggestions * fix after merge * overlap working * did it * clean * fixed disk load * lint * mypy * removed as_strided * trying without simplify * added back simplify * make sure expanding to smaller shape * cleanup * removed comment * removed env file * trying whisper test again * onnx test sqlite issue * working on test * finished test * eliminate unnecessary shrink-then-pad * don't shrink buffer * added strides check * added to ci under linters * switch issue * allow symbolic stride * removed .env * isinstance * adjust strides for double expand * cleanup * needed to add type hint for mypy * set pythonpath	2023-11-15 15:50:17 -05:00
mmmkkaaayy	91546225f4	Add cache step for model weights in CI, re-enable whisper test (#2307 )	2023-11-14 21:16:04 -08:00
George Hotz	01f8781c26	fix CI (#2300 ) * might work * might work 2 * might work 3 * sneak that in to llama too * pin them all	2023-11-14 11:02:59 -08:00
George Hotz	38b7f5a7fd	less phi, proper phi (#2241 ) * less phi, proper phi * disable flaky whisper test	2023-11-08 16:13:43 -08:00
George Hotz	8ba7ced7f9	extract const if it's const (#2193 ) * extract const if it's const * fix if statement * fast math issue * fix graphing and casting * disable flaky copyout test	2023-10-31 18:52:35 -07:00
George Hotz	a27c9f9de5	openpilot compile2 (#2189 ) * try compile2 * pass to thneed * fix tanh onnx	2023-10-31 11:08:58 -07:00
Akshay Kashyap	018bd29e37	Enable Multi-Output Export (#2179 ) * Enable Multi-Output Export * Add test * Update examples and lint * fix padding * test ops * dummy commit to rerun test * revert cuda lint * Enforce tuple/list of tensors * subscripted generics * put back webgpu test * Re-enable WebGPU Efficientnet test	2023-10-30 18:42:26 -07:00
chenyu	6c58bf3e9c	in time_linearizer, allocate a scratch buffer if output buffer is also input (#2152 ) * in time_linearizer, allocate a scratch buffer if output buffer is also input * move scratch buffer creation outside search	2023-10-28 07:17:41 -10:00
chenyu	0ca0e9ee5e	exclude ast with variables from beam search (#2140 ) * exclude ast with variables from beam search * test that * add to CI	2023-10-25 16:35:29 -04:00
Szymon Ożóg	a52b420fb3	switch ocelot back to main repo (#2147 ) * return to ocelot main branch * cd before checkout	2023-10-25 15:14:26 -04:00
Francis Lam	bf3490cdf9	wmma: refactor tensor cores using existing local dims (#2097 ) * wmma: refactor tensor cores using existing local dims * optimizer: fix bad rebase and break after one late local --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-10-25 13:10:46 -04:00
George Hotz	abeba8f1fc	optimization: get actions in CI (#2125 ) * get actions in CI * actually run the test * pythonpath	2023-10-20 12:22:01 -07:00
George Hotz	4526891db7	parallel apt (#2111 )	2023-10-18 14:49:00 -07:00
George Hotz	15da96f393	print test durations and add speed (#2107 ) * print test durations * decrease sizes to increase speed * faster * GPU/CLANG onnx in seperate runner * test split, move ONNX CPU CI * simpler tests * simpler uops test * faster * less cuda apt * running ninja install * apt install * split fancy indexing	2023-10-18 13:46:42 -07:00
George Hotz	e2a1c2aaa6	force ruff reinstall	2023-10-18 11:40:46 -07:00
George Hotz	0d2b3a9d33	full path for ruff	2023-10-18 11:27:49 -07:00
George Hotz	8940c89d13	tests: remove 2 runners, make cache reliable (#2106 ) * remove 2 runners * device.DEFAULT printing * explain rebuild * disable ocelot rebuild * try again to fix workflow * this? fix cache hash * force no rebuild * fix pylint	2023-10-18 11:10:41 -07:00

227 Commits