tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-03 03:05:03 -05:00

Author	SHA1	Message	Date
chenyu	36a1f38049	lazy folding: mul -1 is neg, and neg neg is noop (#4472)	2024-05-08 01:52:22 -04:00
chenyu	c508eb7425	revert the removal of CAST_BEFORE_VIEW (#4471) this brings most of the memory gain for resnet back.	2024-05-08 00:14:29 -04:00
chenyu	f363f39e83	fix dtype of const folded sum (#4349) const folding sum should return in the same dtype the same as regular sum, which can be different from input dtype	2024-04-29 11:40:45 -04:00
George Hotz	ba7314c26b	cleanup lbs (#4163)	2024-04-12 22:32:16 -07:00
chenyu	a7c6864260	remove CAST_BEFORE_VIEW (#4152) * remove CAST_BEFORE_VIEW testing perf, also this might have issue with assign? * remove all	2024-04-13 01:05:08 -04:00
geohotstan	1a1dd1c1a7	add and enable tests for indexing const folding (#4068) * enable test in test_indexing * added tests * rename stuff * del a test case cuz it's loadops.copy	2024-04-04 10:46:28 -04:00
chenyu	406cb5fd90	const fold ReduceOps (#4059)	2024-04-03 14:39:28 -04:00
chenyu	fe03725b21	const fold cast unrealized_unpadded_const (#4047) * const fold unrealized_unpadded_const changed the underlying arg directly * CAST_BEFORE_VIEW folds some * fix const index in getitem	2024-04-03 12:31:24 -04:00
chenyu	f61ed869f5	Use exec_alu for lazy const folding (#4039)	2024-04-02 20:52:05 -04:00
chenyu	85edc493b0	uops const fold rules to prevent tautological compare warnings (#4041) * uops const fold rules to prevent tautological compare warnings `bool < false` is false, `true < bool` is false, `a == a` is true, `a != a` is false * not true for nan * and nan does not work with llvm * full truth table test * revert a==a * comments and indents	2024-04-02 16:45:58 -04:00
chenyu	82440d3416	don't call contiguous for unpadded const into multi tensor (#4032) * don't call contiguous for unpadded const into multi tensor fixed multi const folding for sharded const. still wip, need to be careful that this does not break multi device cache somewhere * ehh need a memory test for that * simple sharded memory test	2024-04-01 19:22:14 -04:00
chenyu	77a68fc52f	test examples for multi tensor const folding (#4031) works with literal const operand now because it's copied to each shard and handled by lazy. does not work for sharded const	2024-04-01 16:53:43 -04:00
chenyu	379d52548d	const fold left const operand for ADD and MUL (#4029) * const fold left const operand for ADD and MUL * neg have dtype issue	2024-04-01 15:09:04 -04:00
chenyu	0e02d074bd	fix Tensor.pow folding for exponent 0 and 1 (#4025)	2024-03-31 19:57:23 -04:00
chenyu	d3f27761b0	move const folding of ADD/SUB/MUL from tensor to lazy (#4020) * move const folding of ADD/SUB/MUL from tensor to lazy will do div and pow separately. * fix onnx adding with None	2024-03-31 16:35:36 -04:00
chenyu	7f859593b8	fix _to_const_val and const folding around it (#4017) * fix _to_const_val and const folding around it is_unrealized_contiguous_const is too strict and almost never hit if const is expanded. suffice to check if there's no pad * that test is folded * test_const_folding	2024-03-31 13:09:23 -04:00

16 Commits