tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-15 01:48:23 -05:00

Author	SHA1	Message	Date
George Hotz	ab5d14d4ba	MEM -> LOAD (#2492) * MEM -> LOAD * keep legacy working	2023-11-28 16:46:37 -08:00
chenyu	847f0a02b1	non-simplifiable mod should result in ModNode (#2490) * non-simplifiable mod should result in ModNode * space	2023-11-28 16:52:19 -05:00
Christopher Mauri Milan	7f01dd04f0	Apply ruff linting rules to tests (#2473) * everything except F821 * enable F821 with noqa * dumb fix * fix remaining imports and (former) lambdas * replace _ with noqa to avoid gc	2023-11-27 21:24:06 -08:00
Paul Gustafson	98cd9e8926	Add assertion to prevent nonsense mod values (#2474)	2023-11-27 18:37:44 -08:00
chenyu	61a80a0675	asserts LtNodes of SumNode with MulNode of Nodes (#2465)	2023-11-27 12:56:59 -05:00
Paul Gustafson	1d89c018fa	Add isinstance check before gcd call in SumNode.__lt__ (#2450) * Add isinstance check before gcd call * Delete blank lines * Fix unit test typo * Delete blank lines again --------- Co-authored-by: Paul Gustafson <paul.gustafson@theambrusgroup.com>	2023-11-26 13:05:04 -08:00
George Hotz	8e9cdef61f	clean up the buffers (#2447) * clean up the buffers * remove allocate_output * functools.lru_cache is methodcache * add TestShapeTrackerSize * cache_clear * no 0 sz buffer, add _ on functions that shouldn't be imported * fix size * if -> while	2023-11-26 11:02:29 -08:00
George Hotz	095e2ced61	add name support to fetch (#2407) * add name support * use fetch in gpt2 * remove requests from main lib, networkx also optional * umm, keep that assert * updates to fetch * i love the walrus so much * stop bundling mnist with tinygrad * err, https * download cache names * add DOWNLOAD_CACHE_VERSION * need env. * ugh, wrong path * replace get_child	2023-11-23 14:16:17 -08:00
George Hotz	a0890f4e6c	move fetch to helpers (#2363) * switch datasets to new fetch * add test_helpers * fix convnext and delete old torch load	2023-11-19 12:29:51 -08:00
chenyu	d7d078c7f9	Node.vars() returns a set and properly dedup (#2356) * dedup RedNode.vars() * vars returns a set * fix more vars * unused import * update to_movement_ops * comment	2023-11-18 17:44:52 -05:00
chenyu	f02e17a967	Variable.num -> NumNode (#2354)	2023-11-18 15:45:52 -05:00
George Hotz	40246d35bc	ops_shm removed (#2351) * ops_shm removed * buf.cast * err, forgot those	2023-11-18 11:41:58 -08:00
George Hotz	3baaf298d6	two stage cumsum in tensor.py (#2331) * two stage cumsum in tensor.py * 2 more kernels for llama cumsum * gpt-2 and llama use fast multinomial	2023-11-16 12:09:53 -08:00
George Hotz	0cbf6c1811	move things, clean up extra (#2292) * move things * idk why pylint needs that now * delete unused	2023-11-13 20:18:40 -08:00
qazal	e2428b63a6	external (#2191)	2023-10-31 13:57:24 -07:00
chenyu	3c88af5071	use unique table name for each disk_cache test (#2184)	2023-10-30 13:49:49 -07:00
George Hotz	cea2bc7964	Add dictionary keys to reduce db size (#2131) * work * ignore beam cache * dictionary keys are generic * minor db cleanups * fix baseline and extract dataset * fix training * log likelihood	2023-10-24 10:49:22 -04:00
George Hotz	6dc8eb5bfd	universal disk cache (#2130) * caching infra for tinygrad * nons tr key * fix linter * no shelve in beam search * beam search caching * check tensor cores with beam too * pretty print * LATEBEAM in stable diffusion	2023-10-22 10:56:57 -07:00
Umut Zengin	01b98b7f42	MulNode.__lt__ rule (#2086) * Added the rule * Added tests * flake8 * self.b == -1 shortcut	2023-10-17 13:18:35 -07:00
Umut Zengin	776605f2fc	O(1) VALIDHACKS (#2072) * first refactoring * O(1) validhacks * O(1) validhacks * Some cleaning * mypy * flake8 * Trim trim * flake8 * clean * less chaotic * less chaotic * flake8 * Symbolic, SumNode include mulnode for gcd * fix tests * smal optim * revert * clean * clean * flake8 * small fix * Add symbolic test	2023-10-15 11:26:41 -07:00
Umut Zengin	6b7ac5c431	ModNode __mod__ rule (#2039) * Implement mod rule * mypy * feat: New test added	2023-10-12 11:30:10 -07:00
qazal	e40f141203	Refactor and add more unit tests for disktensors (#2022) * testing with the test_ops pattern * add assign test * flake8 complaining about single line fn * slice 2d and minor cleanup * make assign_slice a one-liner * we dont need to repeat the same lambda twice, default tinygrad_fxn to be np_fxn * back assign fn for np array * implement __setitem__ in tensor.py * dont re-slice the ret tesnsor * one liner assign * drop the permute test	2023-10-09 18:46:29 -07:00
George Hotz	ffa33d743a	good changes from openpilot_compile2 (#2000) * good changed from openpilot_compile2 * float32 image type was wrong * cleaner way to write that + a test	2023-10-06 13:33:24 -07:00
George Hotz	22b8576887	more lazy cleanup (#1938) * small lazy cleanups * a few more * cleanups * no more realizing in the scheduler test * a few more minor things * that was just wrong * fix graph. the graph test was completely useless * make graph usable * fix op graph	2023-09-29 00:53:29 -07:00
George Hotz	c907efbf4a	reorder a few things (#1915) * reorder a few things * huh, that has to be there * move apply shapetracker * BufferOps * only for type checking	2023-09-25 10:17:21 +08:00
George Hotz	20059dc55b	Make ShapeTracker Immutable (#1909) * ugh * ops test pass * fix shapetracker tests * sym shapetracker * shapetracker is a tuple of views now * from_shape * fix has variable shape * key isn't needed * post init assert	2023-09-24 21:09:03 +08:00
George Hotz	7ff7aacdb4	LazyOp out of Linearizer (#1908) * loadop buffer on cpu * works for GPU * sort of working * has bugs * gpu tests pass * fix some tests * fix tensor cores * fix test linearizer * fix symbolic * fix has_variable_shape * non symbolic size * disable weird test * simple cache fix * fix custom function * fix kopt * cleanups * a bit broken on the assign * contig check * only buffer * need that order * idx * dedup buffers * hmm, bugfix * fix tensor cores * opts device	2023-09-24 14:30:53 +08:00
George Hotz	97dc813329	Revert "All LazyOps in the Linearizer (#1905)" (#1907) This reverts commit `a5820390db`.	2023-09-24 11:51:22 +08:00
George Hotz	a5820390db	All LazyOps in the Linearizer (#1905) * loadop buffer on cpu * works for GPU * sort of working * has bugs * gpu tests pass * fix some tests * fix tensor cores * fix test linearizer * fix symbolic * fix has_variable_shape * non symbolic size * disable weird test * simple cache fix * fix custom function * fix kopt * cleanups * a bit broken on the assign * contig check * only buffer * need that order * idx	2023-09-24 11:50:00 +08:00
Umut Zengin	3987280daf	Fix VALIDHACKS for Images and make it default (#1832) * valid hacks * valid hacks * valid hacks * new method * new method * handtune * is gate load breaking? * lint ruff less junk new approach? maybe this? * Make it more clear * Make it more clear * Will deal with the linter later * hack for linter * subs the idx but dont touch the valid * Updated the mod rules * lint hack * I believe bug fix lets see * Mod Node left * revert * Maybe this wont break? * revert * implemented "handtuned garbage" * revert and use VALIDHACKS * Lets see the CI * still broken? * currently its jungle * maybe this jungle ? * This works for everything somehow * Added test for symbolic * lint * final touch * This still works * lint * midway clean * less garbage * lint * final form * Slow but working way * lint and other stuff * lint * mypy * Make sure CI test Openpilot valid checks * test if CI break * Convert back * refactor * refactor * Managed to reduce openpilot time from 30 secs to 5 secs * Refactor * Substitute a node with variable * flake8 * Comment and refactor * More comprehensive mod * refactor * bug fix * More shave off * remove not sure part	2023-09-23 07:34:43 +08:00
George Hotz	78576915de	Add needed contiguous to DiskBuffer. SHM support on OSX (#1891) * add some contiguous * remove second contig * Revert "remove second contig" This reverts commit fc164f7dca1ad75b1e466e4e45a05eca58b7e0e0. * shm on osx * can repro bug * don't contig zeros and ones	2023-09-22 09:16:42 +08:00
chenyu	a5090f0ee9	remove NumNode.int() (#1876)	2023-09-21 10:29:16 +08:00
chenyu	1b46de1a3e	fix type of helpers.prod, add test cases (#1859)	2023-09-14 05:16:55 +08:00
chenyu	3ec301c2d7	apply view.py patch (#1844)	2023-09-10 17:32:15 -07:00
George Hotz	47e602f717	view: do not trade complexity for speed (#1839) * view: do not trade complexity for speed * staticmethods * view create	2023-09-10 11:29:53 -07:00
David Hou	e74a6ca7e4	expand in terms of substitute (#1827)	2023-09-09 14:43:00 -07:00
George Hotz	63c46e0287	Parens and gls (#1768) * more paren stripping * remove global and local size from renderers * complex strip parens * extra helpers + minor webgpu fix * fix test uops * one more parens test	2023-09-04 16:09:01 -07:00
chenyu	f964b9e5ee	visitor pattern for sym_infer and unit tests (#1733) * visitor pattern for sym_infer and unit tests * comments	2023-09-01 09:47:45 -07:00
George Hotz	5c403d43b9	New >3 indexing (#1729) * move reindexing into linearizer * get_grouped_dims * don't limit for clang	2023-08-31 21:24:15 -07:00
Karan Handa	a8aa13dc91	[ready] Replacing os with pathlib (#1708) * replace os.path with pathlib * safe convert dirnames to pathlib * replace all os.path.join * fix cuda error * change main chunk * Reviewer fixes * fix vgg * Fixed everything * Final fixes * ensure consistency * Change all parent.parent... to parents	2023-08-30 10:41:08 -07:00
Max Hahn	f9cb31fdc2	added visitor pattern (#1669) * added visitor pattern * pylint bug workaround * added tests, made abstract OpNode inherit from ABC * fixed assert * fix check of abstract classes in negative test * remove assert False	2023-08-30 09:03:44 -07:00
George Hotz	a6d842af7a	move device to ops (#1646) * move device to ops * mlops types * 2 lines	2023-08-23 08:30:17 -07:00
George Hotz	718ced296c	move state to nn/state (#1619)	2023-08-22 07:36:24 -07:00
George Hotz	86a32ffb1a	lt sum (#1617)	2023-08-21 21:19:16 -07:00
George Hotz	012ee7d162	not worth the speed (#1584) * not worth the speed * no slots * uops comments * bump to python 3.11 for speed * add critical slots back	2023-08-20 10:24:58 -07:00
chenyu	be50b2fe8f	more symbolic symbolic ops (#1564) * more symbolic symbolic ops * handle NumNode in __mul__	2023-08-18 09:21:41 -07:00
chenyu	11dd9b1741	symbolic codegen and exec (#1552) * symbolic codegen and exec * fix and add test * no sketchy * merge_dicts type * dtypes._arg_int32	2023-08-16 14:43:41 -07:00
chenyu	a89142e46f	ShapeTracker.var_vals (#1540)	2023-08-14 18:53:37 -07:00
wozeparrot	9cb2bda34f	Revert "Better reshape (#1423)" (#1538)	2023-08-14 13:04:54 -04:00
Sieds Lykles	cf2bf1518d	Better reshape (#1423) * do reshaping without merge_views and reshape masks * added tests * properly do reshaping of zero or negative masks * replace while loop with single expression * remove old condition * add more tests and comments * remove empty file	2023-08-14 09:09:04 -07:00

952 Commits