tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
DavidFarago	29adae84eb	Quickstart: Use tensors to compute train accuracy (#1662 ) Co-authored-by: Dave Farago <dfarago@innoopract.com>	2023-08-24 17:09:12 -04:00
George Hotz	a6d842af7a	move device to ops (#1646 ) * move device to ops * mlops types * 2 lines	2023-08-23 08:30:17 -07:00
Niklas D	a7752ad65d	Fix link to state.py in quickstart (#1632 )	2023-08-22 17:39:30 -04:00
George Hotz	718ced296c	move state to nn/state (#1619 )	2023-08-22 07:36:24 -07:00
Umut Zengin	f720682beb	np.argmax to Tensor.argmax (#1608 ) * to tensor argmax * removed keepdim * training update	2023-08-21 15:22:29 -07:00
Yixiang Gao	4d54afb6df	sparse cat cross entropy (#1597 ) * add sparse cat cross entropy * minor fix * add log_softmax into loss function * add test * update docs * fix training loss * add device	2023-08-21 14:14:54 -07:00
George Hotz	2e60920317	Revert "sparse cat cross entropy (#1591 )" (#1596 ) This reverts commit `f0ee850e98`.	2023-08-21 10:04:26 -07:00
Yixiang Gao	f0ee850e98	sparse cat cross entropy (#1591 ) * add sparse cat cross entropy * minor fix * add log_softmax into loss function * add test * update docs	2023-08-21 09:56:41 -07:00
chenyu	ae39cf84ab	Symbolic Shape JIT main PR (#1353 ) * Symbolic Shape JIT update tests 2 variables symbolic ops, adding more tests test passing cleanup * more test cases * single flag * review update * jit attention one piece * realize * symbolic_jit test for cuda * old artifact * works with cuda gpu but failed ci * CUDACPU	2023-08-18 14:39:55 -07:00
George Hotz	d24f936501	just cmplt (#1493 ) * just cmplt * fix maximum * don't save, there's no backward * ugh, no slot either * eq is a scam	2023-08-08 13:58:10 -07:00
George Hotz	7b8d06c9f1	test uops (#1444 ) * test uops * tests should pass * improve uops * precision	2023-08-05 12:35:56 -07:00
George Hotz	84c430355e	fix backends for new style (#1443 ) * fix backends for new style * fix method cache * fix fakeless * llvm blacklist * fix kernel optimizer	2023-08-05 11:07:04 -07:00
Alex Telon	b66361843a	Timing and Context can now be used as decorators (#1385 ) * Context and Timing can now be used as decorators * Using Timing decorator in quickstart.md The time formating is better and is a useful tool to learn. Old: Time: 3.5260659999912605 New: Time: 3526.14 ms * Updated env_vars documentation for Context * Added test for Context decorator * Put new import on same line as others	2023-08-01 17:16:10 -07:00
chenyu	940b6fd21a	Revert "Fix constant folding for Tensor([3]) (#1227 )" (#1274 ) This reverts commit `ab645317c9`.	2023-07-19 10:51:06 -07:00
David Hou	56ee97b37f	dedup kernel args v2 (#1272 ) * new version * fix abstractions * try remove test * Revert "try remove test" This reverts commit `2fc18a9f8e`. * assert_allclose * minimize the test * minimize the test * minimize the test * minimize the test * Revert "minimize the test" This reverts commit `e0c0929596`. * Revert "minimize the test" This reverts commit `88240551b1`. * Revert "minimize the test" This reverts commit `78328a7ce2`. * Revert "minimize the test" This reverts commit `989523fded`. * skip test inside body * oops * oops	2023-07-18 20:03:42 -07:00
Adrian Kretz	5a8ad57163	Add WHERE ternary (or trinary?) op (#1196 ) * Rename FusedOps to TernaryOps * Support ternary broadcast * Add where llop and mlop * Make where op work in cstyle codegen * Don't skip test_inf_where * Add backward path to where op * Use bool in cstyle codegen * Add LLVM where op * Add numpy where op * Add torch where op * Simplify where mlop * Update documentation * Forgot a rename * Merged relevant changes from PR #1195 onto PR #1196 * Add test to cover changes to linearizer.ast_parse for WHERE op Without this METAL will try to use ternary op on float4 and fail * Make where op work in wgsl backend * Allow ternary ops to be merged * Make mypy happy --------- Co-authored-by: Francis Lam <flam@alum.mit.edu>	2023-07-16 00:31:55 -07:00
chenyu	ab645317c9	Fix constant folding for Tensor([3]) (#1227 ) * Fix constant folding for Tensor([3]) * Remove duplicated prod import * load in the same device * better numpy * add constant fold shape test cases * improve tests	2023-07-11 14:01:32 -07:00
George Hotz	2952b8e7a8	Fix up abstractions.py to include the Linearizer (#1177 ) * fix up docs * remove pow, add sqrt	2023-07-07 18:33:51 -07:00
terafo	aa60feda48	Fix naming conflict with huggingface datasets (#1161 ) * Rename in files * Move files * Moved to extra/datasets as suggested * Changes to files * Fixed stupid mistake --------- Co-authored-by: terafo <terafo@protonmail.com>	2023-07-07 10:43:44 -07:00
Eli Frigo	801564f31b	Remove POW llop and add SQRT llop (#1104 ) * fixed division by zero for fast operations * made et closer to 0 * replace POW llop with SQRT * updated mlops to swap SQRT and POW llops * updated hlops to swap POW and SQRT * added sqrt llop to cpu runtime * added sqrt llop to cstyle codegen * added POW llop to llvm ir codegen * added SQRT llop to torch runtime * moved pow from mlops to hlops * found a better way to do reverse pow * fixed indentation * added SQRT llop to triton * update docs to match new llops * removed POW operator from assembly codegen * added sqrt and rsqrt to pow hlop * rewrote pow function in tensor.py * Adjust tolerance * Adjust for adamw * Reduce for Adam too * removed accidental leftover code * removed all of accidental code * added rsqrt test * removed pow from mlops again it was added back when resolving merge conflicts --------- Co-authored-by: Jacky Lee <jla524@sfu.ca>	2023-07-05 18:07:58 -07:00
tricky-labyrinth	fd98f6cffa	Small fix to abstractions.py so it runs on Windows without throwing an AttributeError (#1109 ) Co-authored-by: Tricky Labyrinth <trickylabyrinth@gmail.com>	2023-07-03 13:44:49 -07:00
Reza Rezvan	8ae9a054ae	Refactor nn.optim (#1091 ) * Refactor: nn.optim.py * Refactor: nn.optim.py; Fix all tests * Refactor: Replace all optim.get_parameters() * Refactor: Revert list comp. * Refactor: Replace optim.get_state_dict * Refactor: Change quickstart.md	2023-07-02 15:07:30 -07:00
foreign-sub	574cbda979	Quickstart (#1015 ) * fix quickstart md * add quickstart to ci	2023-06-29 13:26:58 -07:00
ernie	4d703be6d7	fix typo (#1065 )	2023-06-27 10:56:54 -07:00
George Hotz	18892242b0	global -> group (#1007 ) * global -> group * allow None for local_size in custom function * lil local * comment on shape * fix cuda * smart local cast * better local heuristic * fix ptx, and work_dim cleanup * fix metal * fix ops test * fix openpilot jit * no more optlocal * might fix metal tests * try metal now * see generated metal code * test free removal. REVERT THIS * mergable	2023-06-21 11:50:43 -07:00
Pasan Perera	b6102ba4ac	added CUDA and PTX to env_vars.md (#1009 )	2023-06-19 08:47:44 -07:00
Casey Primozic	651d6ea457	Minor improvements + cleanup to `ops_gpu.py` (#1006 ) * Minor improvements + cleanup to `ops_gpu.py` * Add some previously undocumented environment variables from `ops_gpu.py` to `env_vars.md` * Update debug print for OpenCL to print the devices that will be used post-filtering with `CL_EXCLUDE` * Remove a couple unused or superfluous variables and assignments * Use `fromimport` shorthand to shave off a couple precious LOC * Couple small whitespace changes to clean things up * Revert change to ordering of OpenCL devices * Small refactor for OpenCL context creation	2023-06-18 21:26:40 -07:00
sehaj	775287ed91	Add yolov8 implementation (#806 ) * added SPPF module from yolov8 * added conv_block, bottleneck modules * cleaned modules * c2f example * spf changes * C2f * fixed and tested bottleneck * improved detect class * tested spf and conv * checked c2f * DFL structure * fixed dfl * added dist2bbox function * added dist2bbox function * added and tested make_anchors function for the head * keeping functions above * creating the detection head * fixing head * untested blocks a. scale_boxes b. clip_boxes c. xywh2xyxy d. box_iou * head works * structure fixx * added darknet (backbone) * yolov8 neck, and intialize bias function while detection * fixed spacing * yolov8 class, init bias, and fixed c2f * forward pass almost working * fixed net structure * init bias not needed, forward pass working * load weights boilerplate * load weights done? * all variants loading! * post process: clip_boxes, scale_boxes, xywh2xyxy, and box_iou(untested) * fix scale_boxes * box_iou fixed and tested * created the pre nms function * fix nms * fixed load weights, apparently the latest commit broke something, excluding num_batches_tracked * added letterbox and pre_tranform for pre_process function * fixed letterbox, pre_transform and added preprocess function * custom NMS done, integrated prepare_boxes and nms, improved box_iou * added postprocess function till parsing * added draw_bounding_boxes_and_save function * testing full flow * using fetch for class names * fixed make_anchors + all tinygrad now * added command line arguments, weight downloading * single image for now only * made draw boxes more efficient * made NMS functions efficient * made compute_transform better * v8 working now, inference is done * prints objects detected in console now * fixed image loading (pre processing) * batch post processing * created initial tests * fixes bounding box thickness AND added get_detected_classes_with_frequency function * cleaning for testing * two tests * added url option for image, removed need for specifiying arguments * tests complete, but lots on things are printed on screen by ultralytics * remove parse arguments * fixed weight location * fixed colours of classes, and black font when high brightness * minor changes * TODOs for later * removed use of torch, using .npz weights * fixed tests * one path for fetch * preprocess now in tinygrad, plus test fix for that * updated tests * fix tests * no class labels needed * Add files via upload * Update showcase.md * Update showcase.md * added safe tensors as weights, and tests fix for that * safe tensors test * using safe_load * using tinygrad functions now to load weights * update tests --------- Co-authored-by: r3sist-uniq <amanmatreja@gmail.com> Co-authored-by: r3sist <72573738+r3sist-uniq@users.noreply.github.com>	2023-06-16 18:55:19 -07:00
John Moore	45bc040a63	Fix typo (#978 )	2023-06-13 15:15:45 -07:00
Nicklas Boman	5c7248c72d	imagenet download and prepare (#928 ) Changing if not exist to the exist_ok=True parameter and adding a variable check if you want to download training data also adding variable to env_vars.md	2023-06-08 12:55:33 -07:00
George Hotz	df40a9c238	EXP+LOG -> EXP2+LOG2 (#954 ) * EXP+LOG -> EXP2+LOG2 * update docs	2023-06-08 10:57:31 -07:00
Timothy Lindblom	a149f12a5b	Replaced broken link to /tests with /test (#939 )	2023-06-06 10:29:09 -07:00
kposborne2	00360da05b	Update broken `docs/abstractions.py` for changed ops, and add to CI (#930 ) * fix and add to ci * still have those * ocd * update other doc	2023-06-04 19:21:20 -07:00
wozeparrot	091bd65a68	feat: quick doc fixups (#923 )	2023-06-04 11:03:57 -07:00
wozeparrot	e9c1ae3825	Add a quick start guide (#900 ) * feat: initial quick start guide * fix: fix link * feat: add note about jit * feat: add note about load/store ops * feat: add link to discord * feat: add note about saving and loading models * fix: correct code for saving and loading * feat: overhaul docs * fix: fix link * feat: wording * feat: add link to discord * feat: contributing guidelines * feat: make contributing section more doc focused * feat: add link to env_vars from readme * fix: wording * feat: move community to bottom * feat: showcase * feat: linebreak * feat: redesigned header * feat: tweaks * feat: tweaks * feat: badge for lines of code * feat: move installation instructions to repo readme * feat: readme overhaul number 2 * feat: move visualization to quick start guide * feat: readme 2 electric boogaloo * fix: grammar * fix: formatting * feat: no ugly line * feat: add line back * feat: new load method * feat: split adding accelerator docs out * feat: showcase whisper * feat: smaller tweaks * feat: bring back oneliner	2023-06-04 08:51:20 -07:00
George Hotz	791530045d	Refactor LoadOps (#910 ) * test * work * upd test * loadops * cleanups * real ones * remove LazyNumpyArray * fix assign test * remove range * np.require * llama uses arange kernels * no caching consts * fix enet * torch load support * tests cleanup * fix shufflenet * fix image * fix torch_load test	2023-06-03 09:40:43 -07:00
Nicklas Boman	0e9e0fd718	document environment variables (#887 )	2023-06-01 13:11:17 -07:00
George Hotz	1e56aced05	add changeable DEBUG (#816 )	2023-05-27 13:28:25 -07:00
George Hotz	f5467cfedc	Devicebufferless (#708 ) * runs one metal kernel * conv2d works * ops tests are passing * const folding * all ops work * pre commit always passes * torch works * working still * fix graph test * tests passing * image almost works * image conv works * most images * fix custom * fix assignment * fix compile enet * clean up comments * fix realize return value * include shapetracker in LB repr * copy should make a copy * reenable method cache * fix lna * dtypes in graph * forward only for IMAGE=2 * simple realize * getting close * fixup new api, it's good except the kernel count * back to 197 kernels * tests should pass * go to a real float * no type_on_cpu * fix the docs * put shapetracker back in it's proper place	2023-03-18 14:40:23 -07:00
George Hotz	3a8af99adb	i understand ClassVar now	2023-03-15 09:00:25 -07:00
Pasan Perera	df48753692	fixed the import error for latest changes in master (#705 )	2023-03-15 08:59:42 -07:00
George Hotz	54f499b623	Move rawbuffer (#697 ) * move GlobalCounters to helpers * that's not part of the public api * move InterpretedBuffer * remove fromCPU from devicebuffer	2023-03-13 22:30:36 -07:00
George Hotz	cbc5a7222a	symbolic is now a 6/10 due to the infinite loop. do better.	2023-03-13 00:07:59 -07:00
George Hotz	c594a0a835	fix flip bug, add new unit tests	2023-03-12 23:55:31 -07:00
George Hotz	ce1564b05e	fix shapetracker test	2023-03-12 22:33:25 -07:00
George Hotz	153cce0f7e	tutorial	2023-03-12 22:31:46 -07:00
George Hotz	8d16ebaea7	we have docs:	2023-03-12 19:05:44 -07:00
George Hotz	0ba6179de7	stable diffusion in readme	2022-09-05 18:51:56 -07:00
George Hotz	81c9438ea1	keepdim avoids reshapes	2022-06-05 15:56:42 -07:00
George Hotz	7a3fe34db1	GPU llops	2022-06-05 13:49:39 -07:00

357 Commits