Commit Graph

458 Commits

George Hotz
8ff2e13550 From teeny (#2426)
* changes from teenygrad work

* support not supporting ImageDType/PtrDType

* fixups from teeny
2023-11-24 12:50:56 -08:00
nimlgen
e68aebfff9 bring hip graph back (#2385)
* bring hip graph back

* share with metal

* fix linter

* remove hasattrs

* Update ops_hip.py

* hip wrapper does not use _buf

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-24 07:53:44 -08:00
George Hotz
12023b6824 onnx ops cleanup (#2413)
* onnx ops cleanup

* revert those
2023-11-23 18:39:49 -08:00
George Hotz
095e2ced61 add name support to fetch (#2407)
* add name support

* use fetch in gpt2

* remove requests from main lib, networkx also optional

* umm, keep that assert

* updates to fetch

* i love the walrus so much

* stop bundling mnist with tinygrad

* err, https

* download cache names

* add DOWNLOAD_CACHE_VERSION

* need env.

* ugh, wrong path

* replace get_child
2023-11-23 14:16:17 -08:00
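
The fetch work above bundles two patterns worth seeing side by side: a `name` override for the download cache and the walrus operator in the read loop. A minimal sketch under those assumptions; the real `tinygrad.helpers.fetch` handles details like `DOWNLOAD_CACHE_VERSION` differently:

```python
import hashlib, tempfile, urllib.request
from pathlib import Path

def fetch(url: str, name: str | None = None) -> Path:
  # cache key: the explicit name if given, otherwise a hash of the URL
  fp = Path(tempfile.gettempdir()) / "downloads" / (name or hashlib.md5(url.encode()).hexdigest())
  if not fp.is_file():
    fp.parent.mkdir(parents=True, exist_ok=True)
    with urllib.request.urlopen(url) as r, open(fp, "wb") as f:
      while chunk := r.read(16384):  # the walrus keeps read-and-test in one expression
        f.write(chunk)
  return fp
```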
George Hotz
0505c5ea50 remove force_wait, refactor to graph (#2405)
* remove force_wait

* refactor

* get rid of stupid ASTRunner

* fix del in diskbuffer

* BufferOps.FROM_UNDERLYING

* put offset in the rawbuffer

* fix bugs

* use exec
2023-11-23 12:46:07 -08:00
George Hotz
4f8f0ac139 minor cleanups, remove dead files (#2398)
* minor cleanups, remove dead files

* s.name

* use disk

* pytest passes on mac
2023-11-23 09:01:50 -08:00
George Hotz
66c75f30c6 remove triton (#2396) 2023-11-23 07:40:59 -08:00
chenyu
8798d120bb autopad shapetracker for BEAM (#2375)
* autopad shapetracker for BEAM

* OptOps.PADTO

* skip that test for now

* correct padding reduce axis

* just 32

* avoid more than double the FLOPs

* cleanups

* test case

* no support for triton and llvm yet

* typos

* symbolic shape would not work

* cannot PADTO with MAX kernel

* advance db version

* no breaking change - don't advance db version

* is triton just python?

* Revert "is triton just python?"

This reverts commit 17e776c25587615e33a3634c2fb0bb8591ce65d4.

* Revert "Revert "is triton just python?""

This reverts commit 6c434c01e1c4b0ea0431ec18632cd859fb3cf260.

* support llvm

* is it really passing in CI only?

* update tests

* oh triton test passed

* simpler

* revert that, with a test

* check if st are the same

* Revert "check if st are the same"

This reverts commit d2a5eac110a5da1af82a2728c883779ef69c3cad.

* update the db version

* rebase artifact
2023-11-22 21:05:25 -05:00
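
The PADTO idea above: round a reduce axis up to a multiple of 32 so BEAM has more tiling options, skip the pad if it would more than double the FLOPs, and never pad a MAX kernel, since padded zeros are an identity for SUM but change a MAX over negative values. An illustrative sketch, not the actual OptOps.PADTO:

```python
def padto(shape: tuple[int, ...], axis: int, mult: int = 32) -> tuple[int, ...]:
  padded = -(-shape[axis] // mult) * mult           # round up to a multiple of mult
  if padded > 2 * shape[axis]: return shape         # avoid more than double the FLOPs
  return shape[:axis] + (padded,) + shape[axis+1:]

assert padto((100, 3), 0) == (128, 3)
assert padto((100, 3), 1) == (100, 3)               # 3 -> 32 far exceeds 2x, skip
```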
qazal
0eda545946 dtypes.float.vec(sz) (#2386)
* replace all _dtypen with dtype.vec(n)

fix: print works

* conceptual refactor of cstyle render_load logic

* linearizer GEP is explicit that its dtype is the scalar version of localtype

* vectorized global_store and load don't need a conditional
2023-11-22 17:43:14 -08:00
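
The refactor above replaces per-width constants like `_float4` with a `vec(n)` constructor on the scalar dtype. A self-contained sketch of that API shape, not tinygrad's real DType:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DType:
  name: str
  itemsize: int
  count: int = 1
  def vec(self, sz: int) -> "DType":
    # a vectorized dtype is sz lanes of the scalar type
    assert self.count == 1, "can't vectorize a vector dtype"
    return DType(f"{self.name}{sz}", self.itemsize * sz, sz)

float32 = DType("float", 4)
float4 = float32.vec(4)  # replaces the old ad-hoc _float4 constant
print(float4)            # DType(name='float4', itemsize=16, count=4)
```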
George Hotz
cbb8486779 ResNet training changes (update benchmark) (#2390)
* default arg for chunk

* bring back to_

* good changes

* new set

* unused hash

* fix optim

* new torch loader

* fix test lr scheduler
2023-11-22 17:41:12 -08:00
wozeparrot
abbcc7aefa missed cleanup from cache_id removal (#2376) 2023-11-21 01:03:43 -05:00
George Hotz
a0890f4e6c move fetch to helpers (#2363)
* switch datasets to new fetch

* add test_helpers

* fix convnext and delete old torch load
2023-11-19 12:29:51 -08:00
chenyu
d7d078c7f9 Node.vars() returns a set and properly dedup (#2356)
* dedup RedNode.vars()

* vars returns a set

* fix more vars

* unused import

* update to_movement_ops

* comment
2023-11-18 17:44:52 -05:00
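
Returning a set from `vars()` makes the dedup structural: a symbolic tree that mentions the same Variable twice collapses to one entry. A minimal sketch with hypothetical Node classes:

```python
class Node:
  def vars(self) -> set:
    return set()

class Variable(Node):
  def __init__(self, name: str): self.name = name
  def vars(self): return {self}

class RedNode(Node):  # a reduction over child nodes
  def __init__(self, nodes: list): self.nodes = nodes
  def vars(self): return set().union(*[n.vars() for n in self.nodes])

i = Variable("i")
assert RedNode([i, i]).vars() == {i}  # dedup falls out of the set type
```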
George Hotz
40246d35bc ops_shm removed (#2351)
* ops_shm removed

* buf.cast

* err, forgot those
2023-11-18 11:41:58 -08:00
George Hotz
c7b38b324b A beautiful MNIST training example (#2272)
* beautiful mnist

* beautiful mnist example

* from tinygrad import Tensor

* more beautiful

* the jit is super core tinygrad

* globalcounters reset on jit run

* symlinks and exclude

* beautiful_cartpole

* evaluate is its own function

* no symlinks

* more beautiful

* jit reset for double speed

* type hinting for JIT

* beautiful_mnist gets 98%

* beautiful_mnist < 4s with BEAM=2

* better cartpole

* use actor critic

* zero_grad got lost

* delete double relu

* stable cartpole with PPO

* beautiful_cartpole is more beautiful

* REPLAY_BUFFER

* beautiful stuff typechecks

* None support in shape

* hp tuning
2023-11-17 19:42:43 -08:00
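
A condensed sketch in the spirit of examples/beautiful_mnist.py; the layer sizes here are arbitrary and the import paths have moved between tinygrad versions:

```python
from tinygrad import Tensor
from tinygrad.nn import Linear
from tinygrad.nn.optim import Adam
from tinygrad.nn.state import get_parameters

class Model:
  def __init__(self): self.l1, self.l2 = Linear(784, 128), Linear(128, 10)
  def __call__(self, x: Tensor) -> Tensor: return self.l2(self.l1(x).relu())

Tensor.training = True  # enable grads for the optimizer
model = Model()
opt = Adam(get_parameters(model))

def step(x: Tensor, y: Tensor) -> Tensor:
  opt.zero_grad()       # "zero_grad got lost" above: it matters every step
  loss = model(x).sparse_categorical_crossentropy(y)
  loss.backward()
  opt.step()
  return loss
```

Per the bullets above, wrapping the step in the JIT (and resetting it between hyperparameter runs) is what the example leans on for its sub-4s BEAM=2 time.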
chenyu
d2c0035c73 add back as_strided, move rebuilt mops to extra (#2344)
* add back as_strided, move rebuilt mops to extra

* negative stride for ops_cpu

* Revert "negative stride for ops_cpu"

This reverts commit a13b6815ac.

* skip that

* style
2023-11-17 14:34:30 -05:00
George Hotz
652d2de256 wow how did i think that was okay (#2339) 2023-11-16 21:21:11 -08:00
chenyu
822d6e6f18 Simpler mops verify (#2325)
* rewrite the to_movement_ops check using symbolic

* tweak
2023-11-15 21:47:18 -05:00
forcefieldsovereign
b64738e1d6 Remove AS_STRIDED from shapetracker (#2216)
* very close

* remove comment

* negative strides working

* almost everything passes

* calculate offset with list comprehension

* some cleanup

* got disk load working

* review suggestions

* fix after merge

* overlap working

* did it

* clean

* fixed disk load

* lint

* mypy

* removed as_strided

* trying without simplify

* added back simplify

* make sure expanding to smaller shape

* cleanup

* removed comment

* removed env file

* trying whisper test again

* onnx test sqlite issue

* working on test

* finished test

* eliminate unnecessary shrink-then-pad

* don't shrink buffer

* added strides check

* added to ci under linters

* switch issue

* allow symbolic stride

* removed .env

* isinstance

* adjust strides for double expand

* cleanup

* needed to add type hint for mypy

* set pythonpath
2023-11-15 15:50:17 -05:00
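
One detail behind "negative strides working" and "calculate offset with list comprehension" above: a view that flips an axis starts reading at that axis's last element, so the base offset absorbs |stride| * (len - 1) per flipped axis. Illustrative only:

```python
def base_offset(shape: tuple[int, ...], strides: tuple[int, ...]) -> int:
  # sum of the distances to the last element along every negative-stride axis
  return sum(-st * (sh - 1) for sh, st in zip(shape, strides) if st < 0)

assert base_offset((4, 5), (5, 1)) == 0    # contiguous view, no flip
assert base_offset((4, 5), (-5, 1)) == 15  # rows reversed: start at row 3
```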
geohotstan
3c5a51fb3a aaaaaaa finally (#2310) 2023-11-15 07:12:38 -08:00
George Hotz
4f7b1ac0d2 cleanups before interpreted jit (#2306)
* jit mnist

* InterpretedFlopCounter doesn't rely on Interpreted

* allocator for cpu and torch

* types for exec_ast

* fix type issues

* fix onnx, remove print

* always self.from_underlying
2023-11-14 21:44:25 -08:00
nimlgen
4e0d47533e beam works with var vals (#2296)
* beam works with var vals

* test passes now

* better comment

* linter happy
2023-11-14 13:03:19 -05:00
George Hotz
0cbf6c1811 move things, clean up extra (#2292)
* move things

* idk why pylint needs that now

* delete unused
2023-11-13 20:18:40 -08:00
George Hotz
b1f7f29525 metal indirect command buffers (#2285)
* metal indirect command buffers

* sub 1ms gpt

* metal batch exec is good

* remove whitespace

* input_replace

* fix ci

* useResources

* very simple cacheallocator

* update_stats

* fix CI

* minor

* remove that from jit
2023-11-13 17:58:26 -08:00
rodfer
53c5baa8b6 add dilation to avg_pool2d (#2270)
* add dilation to avg_pool2d

* avg_pool_fix

* avg_pool_fix

* woo

* oops

* force it correct

---------

Co-authored-by: rodfer0x80 <rodfer0x80@proton.me>
Co-authored-by: zibokapi <zibokapi@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2023-11-13 08:47:56 -08:00
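
Usage sketch for the new `dilation` argument (exact defaults may differ by version): with dilation=2 each 2x2 window samples taps one pixel apart, so the effective window is 3x3 and an 8x8 input at stride 1 yields 6x6.

```python
from tinygrad import Tensor

x = Tensor.rand(1, 1, 8, 8)
y = x.avg_pool2d(kernel_size=(2, 2), stride=1, dilation=2)
print(y.shape)  # (1, 1, 6, 6): effective window is 3x3 with holes
```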
valar
123ea051e6 refactor/ci: delete many # type: ignore (#2281)
* refactor/ci: delete many `# type: ignore`

* replace `axis.__class__ is int` with `isinstance(axis, int)` to make mypy happy
* add `--warn-unused-ignores` to mypy flag

refs #2240

* ci: move `--warn-unused-ignores` flag to mypy config

refs #2240
2023-11-12 11:04:20 -08:00
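
The substitution is small but load-bearing for typing: `isinstance(axis, int)` narrows the type so mypy no longer needs the ignore, while `axis.__class__ is int` does not. For example:

```python
def normalize_axis(axis: int | tuple[int, ...]) -> tuple[int, ...]:
  # isinstance() narrows axis to int in this branch; axis.__class__ is int would not
  return (axis,) if isinstance(axis, int) else axis
```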
geohotstan
b853e9bb8c Onnx 1.15.0 gogogo (#2217)
* lol

* lol

* add GELULULULUL

* onnx 1.50

* fuk torch bool neg

* exclude regex tests

* exclude dequantizelinear for now

* is sunny in philly

* damn it affinegrid

* fixed auto_pad VALID

* skip 0 shape tests

* add temporary cast in Reduces

* tests should pass now

* added comments and cleanup

* try moving dequantizelinear to onnx.py

* fixed dequantizelinear?

* cleanup

* try?

* float16 segfaults LLVM CI..???

* cleanup comments

* pin to 1.50.0

* remove use of -np.inf cuz numpy is kill

* 1.50? lol I'm actually retarded

* thx for review, muhbad

* moved Gelu higher up
2023-11-10 15:36:48 -08:00
chenyu
a753c8e071 examples of new GPT2 and JIT change (#2261)
* var_vals are global

* working with global ish

* better

* fix export model

* fix tests

* better kv cache

* does it run?

* use where for kvmask

* fix excessive var_vals

* fix import

* how does multigpu use this?

* llama kinda work

* faster and simpler

* cleanup

* fix conversation mode

* test cleanups

* fix one more test

* test cleanup

---------

Co-authored-by: George Hotz <geohot@gmail.com>
2023-11-10 15:07:02 -05:00
George Hotz
80bf0b8586 proper wmma (#2245)
* proper wmma

* hip cast

* bugfixes

* bugfix

* that bug is fixed

---------

Co-authored-by: George Hotz <george@tinygrad.org>
2023-11-09 15:15:18 -08:00
wozeparrot
4c44d1344b feat: remove cache_id (#2236) 2023-11-08 08:09:21 -08:00
Rory Clear
553688f12a update metal matmul and matvec for compile api (#2238) 2023-11-08 08:08:35 -08:00
George Hotz
2f7aab3d13 move optimize_local_size (#2221)
* move optimize_local_size

* interpret_ast
2023-11-05 21:00:52 -08:00
chenyu
f582ec56d5 Replace (getenv("CI", "") != "") with helpers.CI (#2213) 2023-11-03 15:20:44 -07:00
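
The pattern behind this one-liner: evaluate the environment check once at import time and reference the module-level constant everywhere. A sketch matching the commit's naming; the real helpers definition may differ:

```python
import os

CI = os.getenv("CI", "") != ""  # computed once, imported everywhere

if CI:
  print("running under CI")
```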
George Hotz
f17bc16f46 simple runtime args (#2211)
* simple runtime args

* fix some tests

* fix abstractions and triton

* fix search
2023-11-03 12:31:29 -07:00
George Hotz
ddbc6eecaf some refactors in the realization (#2206)
* some refactors

* delete old kernel search
2023-11-02 19:51:28 -07:00
George Hotz
03cf0afa4f move all to compile api (#2203)
* move metal+clang to compile api

* all to the new style

* remove binary arg

* fix triton

* fixup tests

* fix clang

* diskcache is generic

* __wrapped__

* compile_gpu

* fix thneed

* keep the src in the ASTRunner

* lib

* move compile_gpu

* compile_gpu in device

* put compiler in astrunner

* test reverts

* triton compiler

* ugh, that too
2023-11-01 23:01:32 -07:00
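
Two bullets above, "diskcache is generic" and "__wrapped__", suggest the shape of the compile cache: a decorator keyed on the function name plus its arguments, backed by sqlite, with functools.wraps preserving `__wrapped__` for introspection. A hedged sketch, not the actual helpers.diskcache:

```python
import functools, hashlib, pickle, sqlite3, tempfile
from pathlib import Path

def diskcache(fn):
  db = sqlite3.connect(str(Path(tempfile.gettempdir()) / "compile_cache.db"))
  db.execute("CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, val BLOB)")
  @functools.wraps(fn)  # wraps sets __wrapped__, as the bullet above hints
  def wrapper(*args):
    key = hashlib.sha256(pickle.dumps((fn.__name__, args))).hexdigest()
    if row := db.execute("SELECT val FROM cache WHERE key=?", (key,)).fetchone():
      return pickle.loads(row[0])
    val = fn(*args)
    db.execute("INSERT INTO cache VALUES (?, ?)", (key, pickle.dumps(val)))
    db.commit()
    return val
  return wrapper

@diskcache
def compile_gpu(src: str) -> bytes:  # hypothetical compiler entry point
  return src.encode()                # stand-in for a real compile step
```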
George Hotz
8932816816 remove arm64, caching for cuda (#2201)
* remove arm64, caching for cuda

* caching in llvm

* switch cache_compiled to new cache

* fix clang

* caching for metal

* fix pylint

* cleanups

* perf_counter and binary
2023-11-01 18:44:00 -07:00
George Hotz
7103b716c4 merge kernel and optimizer (#2200)
* merge kernel and optimizer

* linearize is reentrant

* move global/local size

* clean up linearizer copy

* remove unneeded lin copies

* stop linearizing twice

* oops, that should be None
2023-11-01 15:20:01 -07:00
George Hotz
33bb650e94 use mad in opencl (#2198)
Co-authored-by: Comma Device <device@comma.ai>
2023-11-01 10:40:08 -07:00
Comma Device
2e9982fe2d fastvits example that's 10% faster 2023-10-31 21:48:23 -07:00
George Hotz
8ba7ced7f9 extract const if it's const (#2193)
* extract const if it's const

* fix if statement

* fast math issue

* fix graphing and casting

* disable flaky copyout test
2023-10-31 18:52:35 -07:00
George Hotz
5aaa8a0cc1 fix shape 2023-10-31 11:36:19 -07:00
George Hotz
a27c9f9de5 openpilot compile2 (#2189)
* try compile2

* pass to thneed

* fix tanh onnx
2023-10-31 11:08:58 -07:00
forcefieldsovereign
f294bdd681 fixed imports (#2185) 2023-10-30 22:07:17 -07:00
Akshay Kashyap
018bd29e37 Enable Multi-Output Export (#2179)
* Enable Multi-Output Export

* Add test

* Update examples and lint

* fix padding

* test ops

* dummy commit to rerun test

* revert cuda lint

* Enforce tuple/list of tensors

* subscripted generics

* put back webgpu test

* Re-enable WebGPU Efficientnet test
2023-10-30 18:42:26 -07:00
chenyu
6c58bf3e9c in time_linearizer, allocate a scratch buffer if output buffer is also input (#2152)
* in time_linearizer, allocate a scratch buffer if output buffer is also input

* move scratch buffer creation outside search
2023-10-28 07:17:41 -10:00
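
The aliasing problem this fixes: if the output rawbuffer is also one of the inputs, repeatedly timing the kernel feeds it its own stale output and can skew the measurement. A hedged sketch of the guard, with a hypothetical RawBuffer stand-in:

```python
from dataclasses import dataclass

@dataclass
class RawBuffer:  # stand-in for a real device buffer
  size: int
  dtype: str

def with_scratch_output(rawbufs: list[RawBuffer]) -> list[RawBuffer]:
  out, ins = rawbufs[0], rawbufs[1:]
  if any(b is out for b in ins):          # output aliases an input
    out = RawBuffer(out.size, out.dtype)  # time against a scratch output instead
  return [out, *ins]

buf = RawBuffer(1024, "float32")
timed = with_scratch_output([buf, buf])   # e.g. an out += in style kernel
assert timed[0] is not timed[1]
```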
George Hotz
e0201922e3 Q network for pruning BEAM / uops deduping / BEAM_ESTIMATE (#2142)
* stable diffusion < 324ms

* revert swap action

* fix tests due to more sum splitting

* REDUCEOP_SPLIT_THRESHOLD env var

* added from unaligned np test (#2134)

* align cpu buffer before copy into cl buffer (#2135)

* remove shelve from handcode_resnet50_opt.py (#2139)

* Add dictionary keys to reduce db size (#2131)

* work

* ignore beam cache

* dictionary keys are generic

* minor db cleanups

* fix baseline and extract dataset

* fix training

* log likelihood

* more lin to feats

* sts

* training policynet

* net sort of works

* dedup

* refactor, stupid new actions

* fix uops deduping

* BEAM_ESTIMATE

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
Co-authored-by: imaolo <56898718+imaolo@users.noreply.github.com>
2023-10-27 10:53:06 -10:00
chenyu
0ca0e9ee5e exclude ast with variables from beam search (#2140)
* exclude ast with variables from beam search

* test that

* add to CI
2023-10-25 16:35:29 -04:00
wozeparrot
c29653605e hip multigpu training (#1878)
* feat: move to hip

* feat: special path for RawBufferTransfer

* feat: initial rawbuffertransfer

* feat: hip ipc

* feat: working hip ipc

* feat: need to base device without args

* feat: close mem handle

* feat: modified test

* feat: more multihip stuff

* clean: cleanup

* feat: cleaner

* feat: don't crash

* feat: test more

* clean: way cleaner hip wrapper

* feat: barrier

* feat: barrier

* feat: this breaks stuff

* feat: we can use empty here

* feat: maybe fix tests

* feat: maybe fix tests again?

* fix: probably fix tests

* feat: no waiting here

* feat: wait here

* feat: much larger test

* feat: need to sync here

* feat: make this async

* feat: no waiting!

* feat: cut here

* feat: sync copy

* feat: random imports

* feat: much cleaner world

* feat: restore this

* feat: restore this

* clean: cleanup

* feat: set this
2023-10-24 17:35:53 -04:00
nimlgen
2e89fd264f Refactor hipgraph (#2141)
* refactor hip graph

* linter happy

* happy linter
2023-10-24 15:45:56 -04:00