kormann
3a04e518ec
print_tree UPat +fix ( #5132 )
* fix and extend print_tree
* typing
* typing
* fix upat
* fix none
* ws
* rm prefix
* mv luop dag
* typo
* test print_tree
2024-06-26 15:02:19 -07:00
nimlgen
16405b973a
fix hcq sync ( #5062 )
* fix hcq sync
* rewrite
* linter + comment
* fix profiler
* no default dict
* correct sync of unjitted transfer
* fix test
2024-06-26 17:50:37 +03:00
nimlgen
fd27f19e92
graph tests ( #5153 )
* graph tests
* add test
* cleanup
2024-06-26 16:31:20 +03:00
qazal
6ca7b13ed1
limit pickled objects [run_process_replay] ( #5154 )
* limit pickled objects
* delete uop from the list
* debug metal
* need self.opts for TC
* dont need device
* [run_process_replay]
* minor
2024-06-26 13:51:32 +03:00
David Hou
666a9c1448
don't view origin buffer when sharding ( #5122 )
* make buffer view optional with a flag
* do not view when sharding to save memory
2024-06-25 20:19:09 -07:00
George Hotz
c98ca23cb9
test pickle variable ( #5150 )
* test pickle variable
* fix process replay
2024-06-25 19:49:21 -07:00
George Hotz
63ba2d05d1
uops dfs cleanup ( #5147 )
* uops dfs cleanup
* Update uops.py
2024-06-25 18:51:42 -07:00
Jhenner Tigreros
fa78755f19
Add new patterns to unfold division ( #5139 )
* Add new patterns to unfold division
* Create regression test and fix pattern
2024-06-25 18:07:47 -07:00
qazal
c4fdb9c725
second iteration on verify_lazyop ( #5140 )
2024-06-25 09:44:32 +03:00
qazal
981afb114f
safely fold NEG in lazy.py ( #5135 )
* safe
* add test
2024-06-24 19:40:37 -04:00
chenyu
7948b05738
fix uneven shard with shrink and pad args on sharded axis ( #5131 )
it's incorrect to assume that the first (len(device)-1) shards all have the same size, e.g. sharding size 2 across 4 devices gives (1, 1, 0, 0)
2024-06-24 16:55:50 -04:00
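A minimal sketch of the sizing described above (plain Python; `shard_sizes` is a hypothetical helper for illustration, not tinygrad's code):

```python
import math

def shard_sizes(dim: int, n: int) -> tuple:
  # each shard gets at most ceil(dim / n) elements; trailing shards may be smaller or empty
  per = math.ceil(dim / n)
  return tuple(min(per, max(0, dim - i * per)) for i in range(n))

assert shard_sizes(2, 4) == (1, 1, 0, 0)  # size 2 over 4 devices: the first 3 shards are NOT all equal
assert shard_sizes(7, 4) == (2, 2, 2, 1)  # the "equal leading shards" assumption only happens to hold here
```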
qazal
18e70deec3
verify_lazyop ( #5124 )
* start verify_lazyop
* bfs order
* assert
* assert shapetrackers 2
* refactor
* more iteration
* skips
* that ast was wrong too
2024-06-24 13:45:35 -07:00
chenyu
4a7d403777
cleanup test_multitensor ( #5118 )
renamed d_zero, d0, d1, d2, ... to d0, d1, d2, d3 and reused some multi device tuples
2024-06-23 20:54:22 -04:00
chenyu
c0ba5e0dfb
multi copy_to_device return the copy on same device if possible ( #5117 )
previously it always returned the copy from the first device
2024-06-23 20:25:56 -04:00
Francis Lam
b563cd52ed
linearizer: change globals to merge into left axis/gridDims.x first ( #5033 )
* linearizer: change order of collapse to be left-most
also fixes Variable max size to be correct and adds docs for the off parameter
* fix multiple global dim oversizes
* add passing variable test and reorganize tests
* use assert RuntimeError for failing test
2024-06-23 18:53:15 -04:00
qazal
28bf8d86d8
test_linearizer with multi output ASTs ( #5115 )
* ast is tuple
* run test_phi_simplification
* update reason
* more tc
* beam
* a few more
* use test_opt directly
2024-06-23 15:41:24 +03:00
chenyu
ee0c6dfc15
build Tensor._tri with movements only ( #5110 )
* build Tensor._tri with movements only
doesn't need arange; saves a kernel in the attention mask
* simpler, more tests
2024-06-23 00:07:36 -04:00
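The movement-only trick can be sketched in numpy, with broadcast/reshape/slice standing in for expand/reshape/shrink; this is an illustration of the idea, not tinygrad's exact `_tri` code:

```python
import numpy as np

def tri_lower(n: int) -> np.ndarray:
  row = np.concatenate([np.ones(n), np.zeros(n)])       # a row of n ones padded with n zeros
  flat = np.broadcast_to(row, (n, 2 * n)).reshape(-1)    # "expand" to n rows, then flatten
  # reading the flat buffer back with width 2n-1 shifts each row by one; keep the last n columns
  return flat[: n * (2 * n - 1)].reshape(n, 2 * n - 1)[:, n - 1:]

print(tri_lower(3))   # no arange anywhere
# [[1. 0. 0.]
#  [1. 1. 0.]
#  [1. 1. 1.]]
```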
chenyu
20fabd8a5b
update Tensor.triu and Tensor.tril ( #5109 )
renamed the arg to `diagonal` to match the torch API, and added documentation and examples
2024-06-22 21:59:50 -04:00
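A quick usage sketch of the renamed argument, assuming the torch-style semantics the commit describes:

```python
from tinygrad import Tensor

t = Tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(t.triu(diagonal=1).numpy())  # keep entries strictly above the main diagonal
# [[0 2 3]
#  [0 0 6]
#  [0 0 0]]
print(t.tril().numpy())            # diagonal defaults to 0, as in torch
# [[1 0 0]
#  [4 5 0]
#  [7 8 9]]
```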
chenyu
33211f356b
fix desc in tqdm ( #5107 )
per the tqdm docs (`https://tqdm.github.io/docs/tqdm/`), the user does not need to put `: ` in desc; the `: ` appended after desc is automatically removed when desc is empty.
updated test cases and added a test for set_description
2024-06-22 19:00:38 -04:00
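Illustrated with upstream tqdm, whose documented desc behavior the commit references (tinygrad's clone is expected to match):

```python
import time
from tqdm import tqdm

bar = tqdm(range(3), desc="epoch 0")   # no trailing ": " needed, it is appended automatically
for i in bar:
  bar.set_description(f"epoch {i}")    # prefix becomes "epoch i: "; an empty desc drops the ": "
  time.sleep(0.01)
```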
chenyu
e356807696
tinytqdm.set_description and tinytrange ( #5101 )
2024-06-22 14:45:06 -04:00
chenyu
8080298739
s/tinytqdm/tqdm ( #5103 )
except in the unit test where tqdm is imported
2024-06-22 14:18:26 -04:00
George Hotz
9f875123b6
small changes from lowerer. [run_process_replay] [no_assert] ( #5102 )
2024-06-22 11:09:35 -07:00
chenyu
ca021229e4
fix attention to always return in the same dtype as input ( #5100 )
the mid-computation cast to default_float did not work as intended when the default is float32 and qkv are in half
2024-06-22 10:34:57 -04:00
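A minimal sketch of the guarantee, assuming half-precision q/k/v and the usual float32 default_float:

```python
from tinygrad import Tensor, dtypes

q = Tensor.rand(1, 4, 8, dtype=dtypes.half)
k = Tensor.rand(1, 4, 8, dtype=dtypes.half)
v = Tensor.rand(1, 4, 8, dtype=dtypes.half)
out = q.scaled_dot_product_attention(k, v)
assert out.dtype == dtypes.half  # output follows the input dtype, not default_float
```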
chenyu
166a2b19b5
fix reduce axis of 0d tensors ( #5089 )
`x.sum(())` is fine, and `x.sum((1,))` should throw IndexError
2024-06-21 13:51:40 -04:00
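The two cases from the commit message, as a small runnable sketch:

```python
from tinygrad import Tensor

x = Tensor(3.0)              # 0-d tensor
print(x.sum(()).item())      # reducing over no axes is fine -> 3.0
try:
  x.sum((1,))                # a 0-d tensor has no axis 1
except IndexError as e:
  print("IndexError:", e)
```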
chenyu
36b4a492a1
explicitly check getitem indices can have at most one ellipsis ( #5087 )
* explicitly check getitem indices can have at most one ellipsis
previous error with multiple `...`:
```
if index_type not in [None, int, slice, Tensor]: raise IndexError(f"{index_type=} not supported")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: index_type=<class 'ellipsis'> not supported
```
this pr:
```
if len(ellipsis_idx) > 1: raise IndexError("an index can only have a single ellipsis ('...')")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
IndexError: an index can only have a single ellipsis ('...')
```
* oh we have that already
* test that
* test these
2024-06-21 12:33:18 -04:00
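Behavior sketch of the new check (the error text is the one quoted above):

```python
from tinygrad import Tensor

x = Tensor.ones(2, 3, 4)
print(x[..., 0].shape)   # (2, 3): a single ellipsis is fine
try:
  _ = x[..., 0, ...]     # two ellipses are ambiguous
except IndexError as e:
  print(e)               # an index can only have a single ellipsis ('...')
```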
nimlgen
f1e758bacb
graph fuzzer ( #5082 )
* graph fuzzer
* more options
* mypy
* no underscores for funcs
2024-06-21 18:47:23 +03:00
qazal
5717a54b28
don't use Tensor.empty in kernel opts tests ( #5086 )
2024-06-21 18:41:03 +03:00
qazal
8aa786232d
docs for running process replay locally ( #5083 )
2024-06-21 09:55:08 -04:00
nimlgen
fb1bf48cfe
io_uring for copies from disk ( #5035 )
* exp uring
* fixes and old version
* nv
* cleaner
* cmp vs aio
* fix
* no lib
* fix nv
* linter
* disk_speed_test now runs default
* fixes
* uring -> io_uring
* linter happy
* get_temp_buf comment added
* tiny nits
* put wait back
* test runs everywhere
* remove consts
* remove mmap consts
* do not require io_uring to run the tests, they are generic
2024-06-21 11:36:51 +03:00
chenyu
f6d6760f71
don't cast tuple to list before creating Tensor ( #5071 )
the Tensor constructor now supports creating from a tuple
2024-06-20 13:32:56 -04:00
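Usage sketch of what now works without the intermediate list(...) cast:

```python
from tinygrad import Tensor

t = Tensor((1, 2, 3))           # a plain tuple, no list(...) needed
u = Tensor(((1, 2), (3, 4)))    # nested tuples behave like nested lists
print(t.numpy(), u.shape)       # [1 2 3] (2, 2)
```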
George Hotz
6f6b3b10c9
import from uops, not linearizer ( #5064 )
2024-06-20 08:08:44 -07:00
chenyu
50700171ef
minor cleanup to reshape arg handling ( #5070 )
moved None handling to be with argfix, and only resolve -1 if there is a -1
2024-06-20 10:27:27 -04:00
chenyu
f4355d0f1b
check Tensor.permute input arg is a valid permutation ( #5069 )
also added support for negative axes
2024-06-20 10:01:28 -04:00
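A short sketch of both changes; the exact exception type for an invalid permutation is not stated in the commit message, so it is caught generically here:

```python
from tinygrad import Tensor

x = Tensor.ones(2, 3, 4)
print(x.permute(2, 0, 1).shape)    # (4, 2, 3)
print(x.permute(-1, 0, 1).shape)   # negative axes now resolve, also (4, 2, 3)
try:
  x.permute(0, 0, 1)               # not a valid permutation of (0, 1, 2)
except Exception as e:             # exception type not specified in the commit
  print(type(e).__name__, e)
```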
qazal
24c89a2a33
move assert_equiv_uops to helpers + use == for dtypes ( #5067 )
* dtypes should use ==
* use TestUOps
* should use assertIs
2024-06-20 16:39:34 +03:00
chenyu
e8f39fcaaa
check arg to Tensor.flip can appear only once ( #5068 )
* check arg to Tensor.flip can appear only once
raise RuntimeError if there are multiple
* fix test
2024-06-20 09:33:42 -04:00
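Sketch of the new check; the commit states a RuntimeError is raised for repeated axes:

```python
from tinygrad import Tensor

x = Tensor.ones(2, 3)
print(x.flip(0).shape)         # (2, 3): flipping a single axis is fine
print(x.flip((0, 1)).shape)    # (2, 3): each axis appears at most once
try:
  x.flip((0, 0))               # axis 0 repeated
except RuntimeError as e:
  print(e)
```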
qazal
55e02cdd84
generic gate folding ( #5061 )
* add assert
* fold truthy gates [run_process_replay]
* fold falsy gates [run_process_replay] [no_assert]
* redo asserts
* check both barriers
* spec start
* spec end
* assert srcs
* make test_fold_gated_load_local better
* [run_process_replay] [no_assert]
2024-06-20 16:10:08 +03:00
qazal
ee01e464e3
use process replay as a diff creator ( #4903 )
* add no_assert option [run_process_replay] [no_assert]
* test [run_process_replay] [no_assert]
* [run_process_replay]
* back to normal [run_process_replay]
* remove the log
2024-06-19 18:17:31 +03:00
chenyu
cc2be9064f
fix out of bound python list into numpy array ( #5043 )
numpy 2.0 does not allow out-of-bound python constants and recommends writing `np.array(value).astype(dtype)` instead
2024-06-18 18:05:21 -04:00
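A small illustration of the numpy 2.0 behavior and the recommended spelling from the commit message:

```python
import numpy as np

# on numpy >= 2.0 this raises OverflowError instead of silently wrapping:
#   np.array([255, 256], dtype=np.uint8)
# the recommended spelling casts after construction, so it wraps explicitly:
wrapped = np.array([255, 256]).astype(np.uint8)
print(wrapped)  # [255   0]
```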
chenyu
4e5add4d01
move test_tqdm to test/unit/ ( #5042 )
2024-06-18 17:41:39 -04:00
chenyu
2b2488f2e2
revert creating Tensor from a list without numpy ( #5041 )
the change was incomplete and broke creating a Tensor from a list of np arrays
2024-06-18 17:31:22 -04:00
chenyu
e2c5054bdd
update resnet.load_from_pretrained ( #5040 )
2024-06-18 16:29:22 -04:00
chenyu
a3ed4176c8
use tinytqdm in active tests and examples ( #5038 )
* use tinytqdm in active tests and examples
stress test this before 0.9.1
* no set_description
2024-06-18 16:01:19 -04:00
kormann
fe332464d2
src->vin [run_process_replay] ( #5036 )
2024-06-18 22:23:49 +03:00
reddyn12
f171006ded
Should this symbolic test fail? ( #4501 )
* add test
* skip test
* use expected failure decorator
---------
Co-authored-by: schlimeszn <schlimeszn@gmail.com>
Co-authored-by: reddyn <nikidsniper@gmail.com>
2024-06-18 15:21:26 -04:00
kormann
7c3b877216
rename uop [run_process_replay] ( #5031 )
* rename
* fix unittests
* rename vin
* fix test
* fix type [run_process_replay]
* rm pre commit hook change
2024-06-18 21:34:05 +03:00
chenyu
dc942bf1f6
jit sampling function in test_randomness.test_multinomial ( #5034 )
* jit sampling function in test_randomness.test_multinomial
`THREEFRY=1 python3 -m pytest test/test_randomness.py::TestRandomness::test_multinomial --durations 1` 7 sec -> 1.2 sec
* skip that
2024-06-18 14:21:05 -04:00
Francis Lam
8d33998e0d
[run_process_replay] linearizer: fix get_grouping_dims to respect global/local max ( #4855 )
* linearizer: fix get_grouping_dims to respect global/local max
* fix lidx variable index offset and unrestrict clang/llvm global len
* test reverse variable indexing when reverse_dims is true
* change the collapse axis to be the right most if reversed
2024-06-18 16:51:27 +03:00
Junjun Dong
c8cd6e725c
Remove BinaryOps.SUB. Replace SUB by ADD and NEG in all tests. Regenerate dataset ( #4977 )
* feat: remove BinaryOps.SUB
* remove SUB in test_early_end_local
* regenerate dataset. remove SUB in test_linearizer_*
* reenable overflow tests
* simplify tensor.sub function by returning a+(-b)
* remove whitespaces
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-06-18 09:06:13 -04:00
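The user-visible behavior is unchanged; per the bullets above, a - b is now expressed as a + (-b). A tiny sketch:

```python
from tinygrad import Tensor

a, b = Tensor([3.0, 5.0]), Tensor([1.0, 2.0])
print((a - b).numpy())       # [2. 4.]
print((a + (-b)).numpy())    # same result; SUB is built from ADD and NEG underneath
```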
chenyu
620fa6e5a2
check Tensor.reshape can have at most one -1 ( #5026 )
raise RuntimeError to match torch; on master it throws weird errors from the shapetracker
2024-06-18 08:17:12 -04:00
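Sketch of the check (RuntimeError to match torch, as stated above):

```python
from tinygrad import Tensor

x = Tensor.ones(2, 3, 4)
print(x.reshape(6, -1).shape)   # (6, 4): a single -1 is inferred
try:
  x.reshape(-1, -1, 4)          # two -1s are ambiguous
except RuntimeError as e:
  print(e)
```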
chenyu
acaf9a490d
RECIP(-0.0) should be -inf ( #5024 )
* RECIP(-0.0) should be -inf
added test_dtype_alu for the PYTHON backend
* catch that
* fix those two
2024-06-17 22:26:58 -04:00
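The IEEE 754 behavior being matched, shown with numpy (an illustration, not the PYTHON backend code):

```python
import numpy as np

with np.errstate(divide="ignore"):
  print(np.float32(1.0) / np.float32(-0.0))   # -inf: the sign of the zero is kept
  print(np.float32(1.0) / np.float32(0.0))    # inf
```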