tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-26 07:18:40 -05:00

Author	SHA1	Message	Date
George Hotz	c36d306606	KOPT is over, BEAM is upstream (#2071 ) * create cache for q learning * make linter happy * global beam * where it belongs * bugfix * ditch the kopt, use the beam * faster lin and DEBUG=2 okay * remove kopt, move search to features	2023-10-16 09:46:03 -07:00
George Hotz	121f7aa8c5	Schedule item (#2012 ) * ScheduleItem * put var_vals in the schedule * fix tests, wow that proliferated quickly * not ready to be in the schedule	2023-10-07 08:59:25 -07:00
Roelof van Dijk	972d9ea215	fix: PRUNEGRAPH is unused (#1985 )	2023-10-05 14:28:43 -07:00
George Hotz	de5d603ec1	corealize + remove realize from lazybuffer (#1968 ) * corealize + remove realize from lazybuffer * fix multigpu * fix graph	2023-10-04 10:59:31 -07:00
nimlgen	2ea1dd3e87	no process() in Linearizer (#1966 ) * no process() in Linearizer * more process() clean up	2023-10-04 07:18:42 -07:00
George Hotz	0945848b5f	schedule the loadops like everything else (#1964 ) * schedule the loadops like everything else * unify loadops with other things we schedule * delete all the ops * fix symbolic jit	2023-10-04 02:36:04 -07:00
Yixiang Gao	094d3d71be	with Tensor.train() (#1935 ) * add with.train * remove the rest TODOs * fix pyflake * fix pyflake error * fix mypy	2023-09-28 18:02:31 -07:00
George Hotz	adab724caa	schedule2, keep the tests working with small changes (#1932 ) * lazy cleanups * ast functions take in LazyOps * op instead of self.op * _base for mops * fix contiguous * start schedule * test_schedule * fix openpilot * more tests * bugfix and test skip * work * make sure things get freed * fix zerosized tensors * fix failing test * fix ceil and friends * fix openpilot * disable training * disable test collectives	2023-09-28 09:14:43 -07:00
George Hotz	c907efbf4a	reorder a few things (#1915 ) * reorder a few things * huh, that has to be there * move apply shapetracker * BufferOps * only for type checking	2023-09-25 10:17:21 +08:00
George Hotz	20059dc55b	Make ShapeTracker Immutable (#1909 ) * ugh * ops test pass * fix shapetracker tests * sym shapetracker * shapetracker is a tuple of views now * from_shape * fix has variable shape * key isn't needed * post init assert	2023-09-24 21:09:03 +08:00
George Hotz	7ff7aacdb4	LazyOp out of Linearizer (#1908 ) * loadop buffer on cpu * works for GPU * sort of working * has bugs * gpu tests pass * fix some tests * fix tensor cores * fix test linearizer * fix symbolic * fix has_variable_shape * non symbolic size * disable weird test * simple cache fix * fix custom function * fix kopt * cleanups * a bit broken on the assign * contig check * only buffer * need that order * idx * dedup buffers * hmm, bugfix * fix tensor cores * opts device	2023-09-24 14:30:53 +08:00
George Hotz	97dc813329	Revert "All LazyOps in the Linearizer (#1905 )" (#1907 ) This reverts commit `a5820390db`.	2023-09-24 11:51:22 +08:00
George Hotz	a5820390db	All LazyOps in the Linearizer (#1905 ) * loadop buffer on cpu * works for GPU * sort of working * has bugs * gpu tests pass * fix some tests * fix tensor cores * fix test linearizer * fix symbolic * fix has_variable_shape * non symbolic size * disable weird test * simple cache fix * fix custom function * fix kopt * cleanups * a bit broken on the assign * contig check * only buffer * need that order * idx	2023-09-24 11:50:00 +08:00
George Hotz	9cf13bd055	rename reduce_op (#1900 ) * rename reduce_op * more design v2	2023-09-23 11:27:36 +08:00
chenyu	b8fde6bb0f	Test KOPT in CI (#1744 ) * test kopt in ci * getenv takes dtype from default	2023-09-03 14:37:20 -07:00
crankygrumpster	c8025c319c	Remove Token from abstractions.py (#1741 ) * Remove Token from abstractions.py, update output string * add dtype	2023-09-02 21:56:11 -07:00
George Hotz	453e437598	move stuff in the linearizer (#1726 ) * move stuff in linearizer * move stuff in linearizer * minor * fix opts import	2023-08-31 14:42:09 -07:00
nimlgen	1c0449e190	add cache collector (#1595 ) * init cache collector * add test_cache_collector.py * switch GlobalCounters.cache to CacheCollector * init jit models test * jitted SD * add debug msg to print loaded bufs count * moved cache collctor to jit * clearer SD * no double device import	2023-08-28 19:59:55 -07:00
wozeparrot	f61d0657d1	document new envvars (#1676 ) * feat: document some new envvars * feat: actually put values * feat: no more cifar torch * feat: no fakedata	2023-08-26 20:17:02 -04:00
DavidFarago	1ba8f0dca3	Quickstart: Upgrade section "Training" to new code (#1663 ) Co-authored-by: Dave Farago <dfarago@innoopract.com>	2023-08-24 17:12:16 -04:00
DavidFarago	29adae84eb	Quickstart: Use tensors to compute train accuracy (#1662 ) Co-authored-by: Dave Farago <dfarago@innoopract.com>	2023-08-24 17:09:12 -04:00
George Hotz	a6d842af7a	move device to ops (#1646 ) * move device to ops * mlops types * 2 lines	2023-08-23 08:30:17 -07:00
Niklas D	a7752ad65d	Fix link to state.py in quickstart (#1632 )	2023-08-22 17:39:30 -04:00
George Hotz	718ced296c	move state to nn/state (#1619 )	2023-08-22 07:36:24 -07:00
Umut Zengin	f720682beb	np.argmax to Tensor.argmax (#1608 ) * to tensor argmax * removed keepdim * training update	2023-08-21 15:22:29 -07:00
Yixiang Gao	4d54afb6df	sparse cat cross entropy (#1597 ) * add sparse cat cross entropy * minor fix * add log_softmax into loss function * add test * update docs * fix training loss * add device	2023-08-21 14:14:54 -07:00
George Hotz	2e60920317	Revert "sparse cat cross entropy (#1591 )" (#1596 ) This reverts commit `f0ee850e98`.	2023-08-21 10:04:26 -07:00
Yixiang Gao	f0ee850e98	sparse cat cross entropy (#1591 ) * add sparse cat cross entropy * minor fix * add log_softmax into loss function * add test * update docs	2023-08-21 09:56:41 -07:00
chenyu	ae39cf84ab	Symbolic Shape JIT main PR (#1353 ) * Symbolic Shape JIT update tests 2 variables symbolic ops, adding more tests test passing cleanup * more test cases * single flag * review update * jit attention one piece * realize * symbolic_jit test for cuda * old artifact * works with cuda gpu but failed ci * CUDACPU	2023-08-18 14:39:55 -07:00
George Hotz	d24f936501	just cmplt (#1493 ) * just cmplt * fix maximum * don't save, there's no backward * ugh, no slot either * eq is a scam	2023-08-08 13:58:10 -07:00
George Hotz	7b8d06c9f1	test uops (#1444 ) * test uops * tests should pass * improve uops * precision	2023-08-05 12:35:56 -07:00
George Hotz	84c430355e	fix backends for new style (#1443 ) * fix backends for new style * fix method cache * fix fakeless * llvm blacklist * fix kernel optimizer	2023-08-05 11:07:04 -07:00
Alex Telon	b66361843a	Timing and Context can now be used as decorators (#1385 ) * Context and Timing can now be used as decorators * Using Timing decorator in quickstart.md The time formating is better and is a useful tool to learn. Old: Time: 3.5260659999912605 New: Time: 3526.14 ms * Updated env_vars documentation for Context * Added test for Context decorator * Put new import on same line as others	2023-08-01 17:16:10 -07:00
chenyu	940b6fd21a	Revert "Fix constant folding for Tensor([3]) (#1227 )" (#1274 ) This reverts commit `ab645317c9`.	2023-07-19 10:51:06 -07:00
David Hou	56ee97b37f	dedup kernel args v2 (#1272 ) * new version * fix abstractions * try remove test * Revert "try remove test" This reverts commit `2fc18a9f8e`. * assert_allclose * minimize the test * minimize the test * minimize the test * minimize the test * Revert "minimize the test" This reverts commit `e0c0929596`. * Revert "minimize the test" This reverts commit `88240551b1`. * Revert "minimize the test" This reverts commit `78328a7ce2`. * Revert "minimize the test" This reverts commit `989523fded`. * skip test inside body * oops * oops	2023-07-18 20:03:42 -07:00
Adrian Kretz	5a8ad57163	Add WHERE ternary (or trinary?) op (#1196 ) * Rename FusedOps to TernaryOps * Support ternary broadcast * Add where llop and mlop * Make where op work in cstyle codegen * Don't skip test_inf_where * Add backward path to where op * Use bool in cstyle codegen * Add LLVM where op * Add numpy where op * Add torch where op * Simplify where mlop * Update documentation * Forgot a rename * Merged relevant changes from PR #1195 onto PR #1196 * Add test to cover changes to linearizer.ast_parse for WHERE op Without this METAL will try to use ternary op on float4 and fail * Make where op work in wgsl backend * Allow ternary ops to be merged * Make mypy happy --------- Co-authored-by: Francis Lam <flam@alum.mit.edu>	2023-07-16 00:31:55 -07:00
chenyu	ab645317c9	Fix constant folding for Tensor([3]) (#1227 ) * Fix constant folding for Tensor([3]) * Remove duplicated prod import * load in the same device * better numpy * add constant fold shape test cases * improve tests	2023-07-11 14:01:32 -07:00
George Hotz	2952b8e7a8	Fix up abstractions.py to include the Linearizer (#1177 ) * fix up docs * remove pow, add sqrt	2023-07-07 18:33:51 -07:00
terafo	aa60feda48	Fix naming conflict with huggingface datasets (#1161 ) * Rename in files * Move files * Moved to extra/datasets as suggested * Changes to files * Fixed stupid mistake --------- Co-authored-by: terafo <terafo@protonmail.com>	2023-07-07 10:43:44 -07:00
Eli Frigo	801564f31b	Remove POW llop and add SQRT llop (#1104 ) * fixed division by zero for fast operations * made et closer to 0 * replace POW llop with SQRT * updated mlops to swap SQRT and POW llops * updated hlops to swap POW and SQRT * added sqrt llop to cpu runtime * added sqrt llop to cstyle codegen * added POW llop to llvm ir codegen * added SQRT llop to torch runtime * moved pow from mlops to hlops * found a better way to do reverse pow * fixed indentation * added SQRT llop to triton * update docs to match new llops * removed POW operator from assembly codegen * added sqrt and rsqrt to pow hlop * rewrote pow function in tensor.py * Adjust tolerance * Adjust for adamw * Reduce for Adam too * removed accidental leftover code * removed all of accidental code * added rsqrt test * removed pow from mlops again it was added back when resolving merge conflicts --------- Co-authored-by: Jacky Lee <jla524@sfu.ca>	2023-07-05 18:07:58 -07:00
tricky-labyrinth	fd98f6cffa	Small fix to abstractions.py so it runs on Windows without throwing an AttributeError (#1109 ) Co-authored-by: Tricky Labyrinth <trickylabyrinth@gmail.com>	2023-07-03 13:44:49 -07:00
Reza Rezvan	8ae9a054ae	Refactor nn.optim (#1091 ) * Refactor: nn.optim.py * Refactor: nn.optim.py; Fix all tests * Refactor: Replace all optim.get_parameters() * Refactor: Revert list comp. * Refactor: Replace optim.get_state_dict * Refactor: Change quickstart.md	2023-07-02 15:07:30 -07:00
foreign-sub	574cbda979	Quickstart (#1015 ) * fix quickstart md * add quickstart to ci	2023-06-29 13:26:58 -07:00
ernie	4d703be6d7	fix typo (#1065 )	2023-06-27 10:56:54 -07:00
George Hotz	18892242b0	global -> group (#1007 ) * global -> group * allow None for local_size in custom function * lil local * comment on shape * fix cuda * smart local cast * better local heuristic * fix ptx, and work_dim cleanup * fix metal * fix ops test * fix openpilot jit * no more optlocal * might fix metal tests * try metal now * see generated metal code * test free removal. REVERT THIS * mergable	2023-06-21 11:50:43 -07:00
Pasan Perera	b6102ba4ac	added CUDA and PTX to env_vars.md (#1009 )	2023-06-19 08:47:44 -07:00
Casey Primozic	651d6ea457	Minor improvements + cleanup to `ops_gpu.py` (#1006 ) * Minor improvements + cleanup to `ops_gpu.py` * Add some previously undocumented environment variables from `ops_gpu.py` to `env_vars.md` * Update debug print for OpenCL to print the devices that will be used post-filtering with `CL_EXCLUDE` * Remove a couple unused or superfluous variables and assignments * Use `fromimport` shorthand to shave off a couple precious LOC * Couple small whitespace changes to clean things up * Revert change to ordering of OpenCL devices * Small refactor for OpenCL context creation	2023-06-18 21:26:40 -07:00
sehaj	775287ed91	Add yolov8 implementation (#806 ) * added SPPF module from yolov8 * added conv_block, bottleneck modules * cleaned modules * c2f example * spf changes * C2f * fixed and tested bottleneck * improved detect class * tested spf and conv * checked c2f * DFL structure * fixed dfl * added dist2bbox function * added dist2bbox function * added and tested make_anchors function for the head * keeping functions above * creating the detection head * fixing head * untested blocks a. scale_boxes b. clip_boxes c. xywh2xyxy d. box_iou * head works * structure fixx * added darknet (backbone) * yolov8 neck, and intialize bias function while detection * fixed spacing * yolov8 class, init bias, and fixed c2f * forward pass almost working * fixed net structure * init bias not needed, forward pass working * load weights boilerplate * load weights done? * all variants loading! * post process: clip_boxes, scale_boxes, xywh2xyxy, and box_iou(untested) * fix scale_boxes * box_iou fixed and tested * created the pre nms function * fix nms * fixed load weights, apparently the latest commit broke something, excluding num_batches_tracked * added letterbox and pre_tranform for pre_process function * fixed letterbox, pre_transform and added preprocess function * custom NMS done, integrated prepare_boxes and nms, improved box_iou * added postprocess function till parsing * added draw_bounding_boxes_and_save function * testing full flow * using fetch for class names * fixed make_anchors + all tinygrad now * added command line arguments, weight downloading * single image for now only * made draw boxes more efficient * made NMS functions efficient * made compute_transform better * v8 working now, inference is done * prints objects detected in console now * fixed image loading (pre processing) * batch post processing * created initial tests * fixes bounding box thickness AND added get_detected_classes_with_frequency function * cleaning for testing * two tests * added url option for image, removed need for specifiying arguments * tests complete, but lots on things are printed on screen by ultralytics * remove parse arguments * fixed weight location * fixed colours of classes, and black font when high brightness * minor changes * TODOs for later * removed use of torch, using .npz weights * fixed tests * one path for fetch * preprocess now in tinygrad, plus test fix for that * updated tests * fix tests * no class labels needed * Add files via upload * Update showcase.md * Update showcase.md * added safe tensors as weights, and tests fix for that * safe tensors test * using safe_load * using tinygrad functions now to load weights * update tests --------- Co-authored-by: r3sist-uniq <amanmatreja@gmail.com> Co-authored-by: r3sist <72573738+r3sist-uniq@users.noreply.github.com>	2023-06-16 18:55:19 -07:00
John Moore	45bc040a63	Fix typo (#978 )	2023-06-13 15:15:45 -07:00
Nicklas Boman	5c7248c72d	imagenet download and prepare (#928 ) Changing if not exist to the exist_ok=True parameter and adding a variable check if you want to download training data also adding variable to env_vars.md	2023-06-08 12:55:33 -07:00

77 Commits