chenyu
3fc8203475
remove NEG from handwritten ast in tests ( #6234 )
...
* remove NEG from handwritten ast in tests
* test_linearizer_failures
2024-08-22 09:06:59 -04:00
chenyu
1c5ef5b793
format test_linearizer_failure ( #6231 )
...
made it easier to remove NEG
2024-08-21 21:10:56 -04:00
George Hotz
5cdec79469
simpler expand without dont_expand_args [run_process_replay] ( #6230 )
...
* simpler expand without dont_expand_args [run_process_replay]
* Revert "simpler expand without dont_expand_args [run_process_replay]"
This reverts commit 81693024c097c31e601f1a199a631e9eda0d9638.
* exclude_args
* why does that fix it
* correct fix
* _swizzle_args should be fast
* add comment
* zip is tuples
2024-08-21 17:48:45 -07:00
nimlgen
78c94abe9c
raise time limit for ci in test_profile_multidev_transfer ( #6227 )
2024-08-21 22:42:03 +03:00
gswangg
c74b318458
migrate test_linearizer.py to UOp AST, pt. 2 ( #6228 )
2024-08-21 22:16:11 +03:00
George Hotz
c3168952f0
wip: tracking pattern matcher [run_process_replay] ( #6225 )
...
* wip: tracking pattern matcher
* better
* proper dedup
* timing
* early reject
* mergable match stats
* TrackedPattenMatcher
* fix TrackedPattenMatcher
* cleanups
* clean that too
* remove early_reject
* Revert "remove early_reject"
This reverts commit dc2aef14b8f5da58f5ec9566daf252513cac394c.
* total
* sort by time
* match_stats cleanup
2024-08-21 11:57:26 -07:00
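A rough sketch of the idea behind a tracking pattern matcher, using a simplified rule list rather than tinygrad's actual class: each rule records how many times it matched and how much time it took, so stats can be deduped and sorted by time as the bullets above describe.

```python
import time
from collections import defaultdict

class TrackingMatcher:
  # illustrative sketch of per-rule match counts and timing, not tinygrad's actual class
  def __init__(self, rules):
    self.rules = rules                                # list of (name, predicate, rewrite_fn)
    self.match_stats = defaultdict(lambda: [0, 0.0])  # name -> [matches, seconds]
  def rewrite(self, node):
    for name, pred, fn in self.rules:
      st = time.perf_counter()
      if pred(node):
        node = fn(node)
        self.match_stats[name][0] += 1
      self.match_stats[name][1] += time.perf_counter() - st
    return node
  def print_stats(self):
    # sort by total time, like the "sort by time" bullet above
    for name, (n, t) in sorted(self.match_stats.items(), key=lambda kv: -kv[1][1]):
      print(f"{name:12s} {n:4d} matches {t*1e6:8.1f} us")

# usage: a single x+0 folding rule over tiny tuple expressions
rules = [("add_zero", lambda n: isinstance(n, tuple) and n[0] == "+" and n[2] == 0, lambda n: n[1])]
m = TrackingMatcher(rules)
print(m.rewrite(("+", "x", 0)))  # -> 'x'
m.print_stats()
```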
chenyu
a666450e4d
UOp pattern x + x -> x * 2 ( #6224 )
...
* UOp pattern x + x -> x * 2
now that there's no NEG, this covers all forms of a*x+b*x
* can remove x-x
2024-08-21 12:06:19 -04:00
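A minimal sketch of the rewrite idea, written over hand-rolled tuple expressions rather than tinygrad's UPat/PatternMatcher API: with NEG gone, `a*x + b*x` folds to `x * (a+b)`, and `x + x` is just the `a = b = 1` case.

```python
# illustrative only: tiny tuple-based terms, not tinygrad UOps or UPat rules
def rewrite_add(expr):
  op, lhs, rhs = expr
  assert op == "+"
  if lhs == rhs:                                              # x + x -> x * 2
    return ("*", lhs, 2)
  if isinstance(lhs, tuple) and isinstance(rhs, tuple) and lhs[0] == rhs[0] == "*" and lhs[1] == rhs[1]:
    return ("*", lhs[1], lhs[2] + rhs[2])                     # a*x + b*x -> x * (a+b)
  return expr

print(rewrite_add(("+", "x", "x")))                           # ('*', 'x', 2)
print(rewrite_add(("+", ("*", "x", 3), ("*", "x", 4))))       # ('*', 'x', 7)
```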
chenyu
c9a9631818
no UnaryOps.NEG in generated UOp patterns ( #6209 )
...
* no UnaryOps.NEG in generated UOp patterns
removed the patterns `x * (-1) -> -x` and `x != True`
* those are fine because NEG became CMPNE and True
* fix sd validation L2 norm
2024-08-21 11:08:22 -04:00
qazal
3b8cc5a3e0
more multireduce tests prep for neg removal [run_process_replay] ( #6220 )
2024-08-21 12:45:24 +03:00
qazal
86c036f0d3
reorder uops.py [run_process_replay] ( #6219 )
...
* reorder uops.py [run_process_replay]
* nop spacing
2024-08-21 11:39:55 +03:00
qazal
f03e5a4b3b
test_multireduce const has a shape ( #6218 )
2024-08-21 11:02:45 +03:00
George Hotz
911bf7216c
remove unused match rules [run_process_replay] ( #6217 )
2024-08-21 00:16:04 -07:00
George Hotz
2c42e9c2c6
faster rewrite, no folder in expand/reduce [run_process_replay] ( #6216 )
...
* faster rewrite, no folder in expand/reduce [run_process_replay]
* is removing the expander there okay
* parens
* don't reconstruct exact match uop
* fast do_reduce
* expand pyint
* most of the parents gains with fewer lines
2024-08-20 23:36:58 -07:00
George Hotz
16f420f7a7
split full_graph_rewrite and linearize_uop [run_process_replay] ( #6215 )
...
* split full_graph_rewrite and linearize_uop
* fix tests
* graph rewrite in test uops
* add types
2024-08-20 20:12:33 -07:00
George Hotz
9faf205601
CIFAR trainer + various bugfixes / improvements ( #6146 )
...
* move cifar into datasets
* support for pathlib Tensors, tar_extract, and fetch gunzip
* too early for Device.DEFAULT
* simpler hlb_cifar + .to(None) is default
* new compiler failure, start beautiful_cifar
* beautiful cifar runs but is broken
* jit train step
* cleaner
* std_mean, not mean_std
* more correct
* fast indexing
* don't print that
* torch load broken
* add eval
* nicer bar
* decorators are the way to do this
* bounds check the load
* a few ops
* batchnorm bugfix, if track_running_stats is False, use online estimate
* full timing
* fix fusion
* unneeded realize
* master tensor
2024-08-20 16:58:46 -07:00
George Hotz
296368f0dd
Revert "delete arg from cast [run_process_replay] ( #6202 )" ( #6214 )
...
This reverts commit ec52a09393.
2024-08-20 16:45:30 -07:00
nimlgen
89c4cffd86
nv fix size in SET_SEMAPHORE_A ( #6213 )
2024-08-21 01:47:10 +03:00
qazal
ec52a09393
delete arg from cast [run_process_replay] ( #6202 )
2024-08-20 14:06:16 -07:00
Francis Lam
7376b67e36
extra/gemm/triton_nv_matmul: fix Program arguments ( #6212 )
...
remove op_estimate
2024-08-20 14:05:38 -07:00
madt2709
4bb98d8882
Fix track_running_stats in batchnorm ( #6200 )
...
* Fix track_running_stats in batchnorm
* Fix linter
* Update test_fold_conv_batchnorm_notrain to keep allowed at 1
* Add test_fold_conv_batchnorm_notrain_no_running_stats
* Save 1 line
2024-08-20 14:01:22 -07:00
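A sketch of the behavior this fix targets, in plain numpy rather than tinygrad's BatchNorm: when track_running_stats is False there are no running buffers, so normalization must fall back to the online (per-batch) estimate, matching the "use online estimate" bullet in the CIFAR trainer commit above.

```python
import numpy as np

def batchnorm(x, running_mean=None, running_var=None, track_running_stats=True, eps=1e-5):
  # if running stats are not tracked, fall back to the batch's own (online) estimate
  if track_running_stats and running_mean is not None:
    mean, var = running_mean, running_var
  else:
    mean, var = x.mean(axis=0), x.var(axis=0)
  return (x - mean) / np.sqrt(var + eps)

x = np.random.randn(8, 4).astype(np.float32)
out = batchnorm(x, track_running_stats=False)
print(out.mean(axis=0).round(6), out.std(axis=0).round(3))  # ~0 mean, ~1 std per feature
```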
George Hotz
d9c62a33c3
add cifar to datasets.py ( #6210 )
2024-08-20 11:42:49 -07:00
George Hotz
a5d79688db
fix indexing out of bounds ( #6208 )
...
* fix indexing out of bounds
* 5 ops per access is fine
2024-08-20 11:34:56 -07:00
chenyu
4451bcaf95
update test_arange test_llama_embedding_opt ( #6207 )
...
non-CI uses a larger embedding, still the same order of magnitude
2024-08-20 13:58:43 -04:00
ignaciosica
e4bb63c1be
Refactor amd kernel prefix ( #6205 )
...
* refactor amd kernel_prefix
* restore removed comment
* nit
2024-08-20 10:37:36 -07:00
qazal
074cf780dd
add option to only benchmark schedule [run_process_replay] ( #6204 )
2024-08-20 16:51:27 +03:00
Francis Lata
8fd8b970b0
update URL to eval cases from recent MLPerf file movements ( #6201 )
2024-08-20 08:43:13 -04:00
gswangg
0e6f057eae
migrate test_linearizer.py to UOP AST (pt. 1) ( #6150 )
...
* migrate test_multioutput to UOP AST
* inline buf declarations
* migrate test_multireduce to UOp AST
* update test_mid_dim_multireduce to UOp AST
* update test_triple_multireduce with UOp AST
* make global definitions more concise
* update test_double_reduce_multireduce with UOp AST
* update test_multireduce_with_parallel with UOp AST
* update test_multiout_multireduce to UOp AST
* make gidx style consistent across updated tests
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-08-20 10:02:20 +03:00
chenyu
10330a41c7
add CMPNE tests in test_uops ( #6196 )
...
fixed the output_dtype for CMPNE and matched the tests to those for CMPLT
2024-08-19 19:41:21 -04:00
chenyu
21d6739237
remove UnaryOps.NEG from lazy.py ( #6193 )
...
* remove UnaryOps.NEG from lazy.py
* neg is no longer unary
2024-08-19 18:41:28 -04:00
chenyu
4d1b5781b5
remove UnaryOps.NEG from function.py ( #6187 )
...
* remove function.Neg
prep to remove UnaryOps.NEG
* replace all NEG in function.py
2024-08-19 17:39:15 -04:00
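A sketch of how negation can be expressed without a dedicated UnaryOps.NEG, assuming the mul-by-negative-one form implied by the NEG-removal commits above (not the literal function.py code):

```python
import numpy as np

def neg(x):
  # negation as a binary op: multiply by -1 instead of applying a dedicated NEG
  return x * -1

def sub(a, b):
  # subtraction then needs no NEG either: a - b == a + b*(-1)
  return a + neg(b)

x = np.array([1.0, -2.0, 3.0])
print(neg(x), sub(x, np.array([0.5, 0.5, 0.5])))  # [-1.  2. -3.] [ 0.5 -2.5  2.5]
```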
nimlgen
bc44e6501b
_gpu_alloc -> allocator.alloc ( #6189 )
...
* _gpu_alloc -> allocator.alloc
* this import is not needed
* pylint
2024-08-19 23:34:22 +03:00
chenyu
96d502d8b7
update function.Max backward ( #6190 )
...
instead of `(1-(x!=max))`, use `(x!=max)!=True`.
prep to remove UnaryOps.NEG; this can also be instruction fused later more easily
2024-08-19 16:06:14 -04:00
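A quick numpy check of the equivalence behind this change: `1 - (x != max)` and `(x != max) != True` both produce 1 exactly where x equals the max, but the second form avoids the NEG-style `1 - ...`.

```python
import numpy as np

x = np.array([1.0, 3.0, 2.0, 3.0])
m = x.max()
old_mask = 1 - (x != m)          # 1 - bool -> [0, 1, 0, 1]
new_mask = (x != m) != True      # bool != True -> [False, True, False, True]
assert np.array_equal(old_mask.astype(bool), new_mask)
print(old_mask, new_mask)
```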
Gabe Caldwell
bdd6325f31
default num_classes value for one_hot ( #6182 )
...
* num_classes=-1
If num_classes is set to -1, the number of classes is inferred as one greater than the largest class value in the input tensor.
* num_classes desc
comment to explain num_classes default and what that means.
* replacing ' with `
2024-08-19 12:07:14 -07:00
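A small sketch of the inference rule described above, in plain numpy rather than the Tensor.one_hot implementation: with num_classes=-1 the class count is taken as one greater than the largest value in the input.

```python
import numpy as np

def one_hot(labels, num_classes=-1):
  labels = np.asarray(labels)
  if num_classes == -1: num_classes = int(labels.max()) + 1  # infer from the data
  return (labels[..., None] == np.arange(num_classes)).astype(np.int32)

print(one_hot([0, 2, 1]))           # inferred num_classes=3 -> shape (3, 3)
print(one_hot([0, 2, 1], 5).shape)  # explicit num_classes    -> (3, 5)
```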
Alessandro Benetti
9328248610
support for std_mean and cross_entropy ( #6181 )
...
* support for std_mean and cross_entropy (#3 )
* Cross entropy and std mean support
* remove extra examples
2024-08-19 12:06:44 -07:00
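A hedged sketch of what these helpers compute, in numpy (the exact tinygrad signatures may differ): std_mean returns the standard deviation and mean in one call, mirroring torch.std_mean, and cross_entropy combines log-softmax with negative log-likelihood.

```python
import numpy as np

def std_mean(x, axis=None):
  # returned in (std, mean) order, mirroring torch.std_mean
  return x.std(axis=axis, ddof=1), x.mean(axis=axis)

def cross_entropy(logits, target):
  # log-softmax followed by negative log-likelihood of the target classes
  logsm = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
  return -logsm[np.arange(len(target)), target].mean()

x = np.random.randn(4, 3)
print(std_mean(x))
print(cross_entropy(x, np.array([0, 2, 1, 1])))
```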
Max-We
53b20afa3f
Write tar_extract ( #6180 )
...
* Add tar_extract
* Add tar_extract tests
* Fix dtype for initialization from path
* Tests for path initialization
* rm print
---------
Co-authored-by: Maximilian Weichart <maximilian.weichart@icloud.com>
2024-08-19 12:06:17 -07:00
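A minimal sketch of what a tar_extract helper might look like, using the standard tarfile module; the actual tinygrad API, output location, and return type are assumptions here and may differ.

```python
import tarfile, pathlib

def tar_extract(fn):
  # extract the archive next to the file and return the extracted paths (illustrative only)
  fn = pathlib.Path(fn)
  out_dir = fn.with_suffix("")
  with tarfile.open(fn) as tar:
    tar.extractall(out_dir)
    return [out_dir / m.name for m in tar.getmembers()]

# usage, assuming some archive exists locally:
# for p in tar_extract("cifar-10-python.tar.gz"): print(p)
```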
Eitan Turok
8556d0c642
Support gunzip in fetch ( #6176 )
...
* init
* update
* clean
* add type
* clean
* fix import order
* shorten variable names
2024-08-19 12:04:40 -07:00
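A sketch of the gunzip-on-fetch idea with urllib and gzip from the standard library; fetch's real signature, caching behavior, and the parameter name are assumptions here, not tinygrad's actual helpers.

```python
import gzip, pathlib, urllib.request

def fetch(url, gunzip=False):
  # download url to a local file, optionally decompressing a .gz payload (illustrative only)
  name = url.split("/")[-1]
  out = pathlib.Path(name[:-3] if gunzip and name.endswith(".gz") else name)
  with urllib.request.urlopen(url) as r:
    data = r.read()
  out.write_bytes(gzip.decompress(data) if gunzip else data)
  return out

# usage: fetch("https://example.com/train-images-idx3-ubyte.gz", gunzip=True)
```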
chenyu
705b8066ab
function.Div -> function.IDiv [run_process_replay] ( #6188 )
...
float div is equivalent to multiplying by a reciprocal
2024-08-19 14:59:41 -04:00
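The reasoning as a quick numpy check: floating-point division a / b can be lowered to a * (1 / b), so only integer division needs its own function, hence function.Div becoming function.IDiv.

```python
import numpy as np

a, b = np.float32(6.0), np.float32(4.0)
print(a / b, a * (np.float32(1.0) / b))  # 1.5 1.5 -> float div lowered to mul by reciprocal
print(np.int32(7) // np.int32(2))        # 3       -> integer div keeps a dedicated op (IDiv)
```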
qazal
ee5fe12630
disallow some uops at different levels [run_process_replay] ( #6186 )
...
* assert intermediate ones
* assert low-level uops
2024-08-19 21:23:44 +03:00
samm393
5d742f7fe3
Missing features from rearrange ( #6184 )
...
* fixes and tests
* typo in test
2024-08-19 11:19:07 -07:00
qazal
2242ff84be
type verify intermediate UOps [run_process_replay] ( #6140 )
...
* type verify intermediate UOps [run_process_replay]
* merge asserts
* variable const
2024-08-19 20:59:01 +03:00
qazal
478145cb8e
lowering error in diff_schedule is fine [run_process_replay] ( #6185 )
2024-08-19 20:51:12 +03:00
chenyu
00578a021b
re:6125 switch real_size to use uops [run_process_replay] ( #6138 )
...
* switch real_size to use uops [run_process_replay]
* enough to pass
---------
Co-authored-by: George Hotz <geohot@gmail.com>
2024-08-19 13:20:24 -04:00
qazal
e28d29641f
more scheduler process replay tooling [run_process_replay] ( #6178 )
2024-08-19 15:35:51 +03:00
chenyu
b36a7273c6
RUF018 assignment-in-assert [run_process_replay] ( #6172 )
...
assertions should not have side effects, or `-O` breaks them.
initially just wanted to fix the one in rearrange, but it also made some long lines shorter
2024-08-19 00:34:52 -04:00
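A concrete example of why RUF018 matters: under `python -O`, assert statements are stripped, so a walrus assignment inside one silently never runs.

```python
def load_config():
  return {"debug": True}

# bad: under `python -O` the whole assert (including the walrus assignment) is stripped,
# so cfg is never assigned here
assert (cfg := load_config()) is not None

# good: assign first, then assert
cfg = load_config()
assert cfg is not None
print(cfg)
```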
chenyu
9c60a27ece
lower float64 sin fuzzer threshold ( #6173 )
...
139216373.71875 failed
https://github.com/tinygrad/tinygrad/actions/runs/10446960642/job/28925156240
2024-08-19 00:25:42 -04:00
samm393
fd7c84c1c8
Rearrange ( #6106 )
...
* rearrange and tests
* tidy
* whitespace
* remove line
* -5 lines
* test fix
* static -> instance
* fix () & add more tests
* remove flags
* -1 line
* match einops
* whitespace
* repeated names
2024-08-18 20:22:28 -07:00
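A hedged usage sketch of einops-style rearrange semantics; the commit says the tinygrad method matches einops, so the reference behavior is shown with einops itself on numpy (the exact tinygrad method surface is not shown here).

```python
import numpy as np
from einops import rearrange  # reference semantics the tinygrad method is said to match

x = np.zeros((2, 3, 4, 5))
print(rearrange(x, "b c h w -> b (c h w)").shape)           # (2, 60)
print(rearrange(x, "b c h w -> b h w c").shape)             # (2, 4, 5, 3)
print(rearrange(x, "b (g c) h w -> b g c h w", g=3).shape)  # (2, 3, 1, 4, 5)
```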
chenyu
2de174677a
threefry touchup [run_process_replay] ( #6169 )
...
also, why is test_gc testing that _rng_counter is allocated??
2024-08-18 23:01:24 -04:00
David González Martínez
724e408736
add support for retain_graph in backward ( #6145 )
...
* add support for retain_graph in backward
* fix: don't accumulate grad on non-leaf tensors
* fix order
* fix: do not delete grad on leafs
* fix linter
* fix: can't exactly match torch behaviour internally
* allow numerical room for test
* refactor
2024-08-18 16:08:31 -07:00
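A hedged usage sketch of the new keyword: retain_graph=True keeps the saved graph alive so backward can be called a second time. The accumulation value shown assumes torch-like leaf-grad accumulation; the bullets above note tinygrad can't exactly match torch behaviour internally.

```python
from tinygrad import Tensor

x = Tensor([2.0], requires_grad=True)
y = (x * x).sum()

y.backward(retain_graph=True)  # keep the graph so a second backward pass is possible
y.backward()
print(x.grad.numpy())          # assuming torch-like leaf-grad accumulation: 2x + 2x = [8.]
```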
wozeparrot
0c5189de25
threefry half ( #6154 )
2024-08-18 15:23:12 -07:00
qazal
fad1818530
move graph rewrite to ops [run_process_replay] ( #6159 )
...
* move graph rewrite to ops [run_process_replay]
* better place
2024-08-18 20:02:28 +03:00