Commit Graph

10633 Commits

Author SHA1 Message Date
qazal
78148e16d8 init changes from the dtypes_void branch [run_process_replay] (#6475) 2024-09-11 16:34:50 +08:00
qazal
d6d9234985 cleanup some scheduler rewrites [run_process_replay] (#6474) 2024-09-11 16:10:59 +08:00
George Hotz
1cadddee26 Revert "fold lt (#6472)" (#6473)
This reverts commit 81bda4d304.
2024-09-11 15:59:25 +08:00
George Hotz
81bda4d304 fold lt (#6472) 2024-09-11 15:56:57 +08:00
qazal
e645a0e766 allow selecting UPat files in TRACK_MATCH_STATS [run_process_replay] (#6470) 2024-09-11 14:32:46 +08:00
qazal
3cde1503ce enable graph rewrite in the scheduler (#6249)
* test: enable

* skip those

* skip pads tests
2024-09-11 14:30:04 +08:00
chenyu
d9d1ae7248 more lt folding using gcd (#6469) 2024-09-11 02:09:35 -04:00
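
For context, the gcd trick works because an inequality whose left-hand side is always a multiple of some g > 0 can be divided through by g, with a ceiling division on the right. A minimal standalone sketch of the idea (the helper name is illustrative; this is not tinygrad's actual rewrite pattern):

```python
import math

def fold_lt_gcd(coeffs: list[int], c: int) -> tuple[list[int], int]:
  # if every coefficient of `sum(coeffs[i]*x_i) < c` is divisible by
  # g = gcd(coeffs), the lhs is always a multiple of g, so the comparison
  # is equivalent to `sum((coeffs[i]//g)*x_i) < ceil(c/g)`
  g = math.gcd(*coeffs)
  if g <= 1: return coeffs, c  # nothing to fold
  return [a//g for a in coeffs], -(-c//g)  # -(-c//g) is ceil(c/g)

# 4*x + 8*y < 10  ->  x + 2*y < 3  (g=4, ceil(10/4)=3)
assert fold_lt_gcd([4, 8], 10) == ([1, 2], 3)
```
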
madt2709
dfe1db1cff Fix typo in docs (#6468)
Co-authored-by: theordias <theo.dias@cresta.ai>
2024-09-11 01:47:26 -04:00
qazal
262569a3eb green conv bw AST_REWRITE=1 (#6466)
* green conv bw AST_REWRITE=1

* new strides and dtype fix
2024-09-11 10:51:24 +08:00
chenyu
15c4d4f406 fold unrolled arange div pattern (#6465) 2024-09-10 22:35:52 -04:00
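
The shape this targets (assuming the usual unrolled-arange form) is a sum of floordivs over consecutive offsets, which collapses by a Hermite-style identity. A sketch of the identity itself, not the rewrite rule:

```python
# the residues of x, x+1, ..., x+c-1 mod c are a permutation of 0..c-1, so
#   sum((x + k) // c for k in range(c)) == x   for any integer x and c > 0
for c in [2, 3, 4, 8]:
  for x in range(-20, 20):
    assert sum((x + k) // c for k in range(c)) == x
```
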
qazal
4259311006 merge views in conv swizzle (#6464) 2024-09-11 10:11:01 +08:00
George Hotz
6d195fb653 small changes from new style expand [run_process_replay] (#6462) 2024-09-11 09:10:56 +08:00
qazal
803b8b9313 conv bw schedule and correctness tests to iterate on (#6461)
first to fix AST_REWRITE=1, then to implement the same fusion for dtypes.half.
2024-09-11 08:47:07 +08:00
chenyu
b574caadc9 fix UOp const_factor for ADD [run_process_replay] (#6459)
currently not used, fixed for completeness
2024-09-10 20:04:26 -04:00
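
The natural rule for this case (a sketch of the idea, not necessarily the exact code in the PR): the largest integer guaranteed to divide a sum is the gcd of the factors guaranteed to divide each operand.

```python
import math

def const_factor_add(lhs_factor: int, rhs_factor: int) -> int:
  # every value a+b takes is a multiple of gcd(const_factor(a), const_factor(b))
  return math.gcd(lhs_factor, rhs_factor)

# 6*x + 9*y is always a multiple of 3, but not of 6 or 9
assert const_factor_add(6, 9) == 3
```
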
chenyu
2105832b87 _min_max of MUL of 2 non-positive inputs (#6454) 2024-09-10 07:13:01 -04:00
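
The interval-arithmetic case this adds, sketched under the assumption that _min_max tracks per-UOp [vmin, vmax] bounds: when both factors are non-positive, the product is non-negative, smallest at the endpoints nearest zero and largest at the most negative endpoints.

```python
def mul_min_max_nonpos(x_min: int, x_max: int, y_min: int, y_max: int):
  # with x_max <= 0 and y_max <= 0 the product is non-negative
  assert x_max <= 0 and y_max <= 0
  return x_max*y_max, x_min*y_min

# x in [-4, -1], y in [-3, -2]  ->  x*y in [2, 12]
assert mul_min_max_nonpos(-4, -1, -3, -2) == (2, 12)
```
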
Francis Lata
b7ce9a1530 UNet3D MLPerf (#3470)
* add training set transforms

* add DICE cross entropy loss

* convert pred and label to Tensor when calculating DICE score

* cleanups and allow train dataset batching

* fix DICE CE loss calculation

* jitted training step

* clean up DICE CE loss calculation

* initial support for sharding

* Revert "initial support for sharding"

This reverts commit e3670813b8.

* minor updates

* cleanup imports

* add support for sharding

* apply temp patch to try to avoid OOM

* revert cstyle changes

* add gradient acc

* hotfix

* add FP16 support

* add ability to train on smaller image sizes

* add support for saving and loading checkpoints + cleanup various modes


* fix issue with using smaller patch size + update W&B logging

* disable LR_WARMUP_EPOCHS

* updates

* minor cleanups

* cleanup

* update order of transformations

* more cleanups

* realize loss

* cleanup

* more cleanup

* some cleanups

* add RAM usage

* minor cleanups

* add support for gradient accumulation

* cleanup imports

* minor updates to not use GA_STEPS

* remove FP16 option since it's available now globally

* update multi-GPU setup

* add timing logs for training loop

* go back to using existing dataloader and add ability to preprocess data to save time

* clean up optimization and re-enable JIT and multi-GPU support for training and evaluation

* free train and eval steps memory

* cleanups and scale batch size based on the number of GPUs

* fix GlobalCounters import

* fix seed

* fix W&B setup

* update default batch size

* add back metric divergence check

* put back JIT on UNet3d eval

* move dataset preprocessing inside training code

* add test for dice_loss

* add config logging support to W&B and other cleanups

* change how default float is getting retrieved

* remove TinyJit import duplicate

* update config logging to W&B and remove JIT on eval_step

* no need for caching preprocessed data anymore

* fix how evaluation is run and how often

* add support for LR scaling

* fix issue with gaussian being moved to scipy.signal.windows

* remove DICE loss unit test

* fix issue where loss isn't compatible with multiGPU

* add individual BEAM control for train and eval steps

* fix ndimage scipy import

* add BENCHMARK

* cleanups on BENCHMARK + fix on rand_flip augmentation during training

* cleanup train and eval BEAM envs

* add checkpointing support after every eval

* cleanup model_eval

* disable grad during eval

* use new preprocessing dataset mechanism

* remove unused import

* use training and inference_mode contexts

* start eval after benchmarking

* add data fetching time

* cleanup decorators

* more cleanups on training script

* add message during benchmarking mode

* realize when reassigning LR on scheduler and update default number of epochs

* add JIT on eval step

* remove JIT on eval_step

* add train dataloader for unet3d

* move checkpointing to be done after every epoch

* revert removal of JIT on unet3d inference

* save checkpoint if metric is not successful

* Revert "add train dataloader for unet3d"

This reverts commit c166d129df.

* Revert "Revert "add train dataloader for unet3d""

This reverts commit 36366c65d2.

* hotfix: seed was defaulting to a value of 0

* fix SEED value

* remove the usage of context managers for setting BEAM and going from training to inference

* support new stack API for calculating eval loss and metric

* Revert "remove the usage of context managers for setting BEAM and going from training to inference"

This reverts commit 2c0ba8d322.

* check training and test preprocessed folders separately

* clean up imports and log FUSE_CONV_BW

* use train and val preprocessing constants

* add kits19 dataset setup script

* update to use the new test decorator for disabling grad

* update kits19 dataset setup script

* add docs on how to train the model

* set default value for BASEDIR

* add detailed instruction about BASEDIR usage

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-09-10 04:37:28 -04:00
qazal
f4f705a07c can push SWIZZLE through reduce both ways (#6453) 2024-09-10 16:00:50 +08:00
qazal
1347e49e82 second iteration on UOps.SWIZZLE (#6451)
* new swizzle

* fix the failing tests

* test a double swizzle

* ci
2024-09-10 14:43:21 +08:00
chenyu
e0d35e3657 update test_padto_sum_not_ok (#6450)
updated the setup as `exp() < -1` could be folded to False
2024-09-09 22:46:42 -04:00
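
The fold is sound because exp is strictly positive, so the comparison is constant-False. A quick check of the identity (illustrative, not the test itself):

```python
import math
assert all(not (math.exp(v) < -1) for v in [-100.0, -1.0, 0.0, 1.0, 100.0])
```
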
qazal
95c9fe841e UOp.st infra for the new SWIZZLE (#6449) 2024-09-10 09:39:45 +08:00
qazal
abfbd9fd2f fix Variable init from the DEFINE_VAR refactor (#6448)
prereq for UOps.VALID.
2024-09-10 09:14:29 +08:00
chenyu
fcc69adfc5 simplify c0*x<c1 for negative int c0,c1 (#6431)
* simplify c0*x<c1 for negative int c0,c1

* fine if rhs is zero
2024-09-09 21:05:53 -04:00
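
The underlying algebra, as a sketch rather than the PR's pattern: dividing `c0*x < c1` by a negative c0 flips the comparison, and "fine if rhs is zero" is the c1 == 0 case.

```python
def fold_neg_lt(c0: int, c1: int) -> int:
  # for c0 < 0, `c0*x < c1` flips to `x > c1/c0`, i.e. for integer x:
  # `x >= (-c1) // (-c0) + 1`
  assert c0 < 0 and c1 <= 0
  return (-c1) // (-c0) + 1  # fold to: x >= this bound

# -3*x < -7  <=>  3*x > 7  <=>  x >= 3
assert fold_neg_lt(-3, -7) == 3
assert all((-3*x < -7) == (x >= 3) for x in range(-10, 10))
```
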
kormann
f6f4f3222f whisper long batch (#6335)
* reset

* test

* only part refactor
2024-09-09 21:03:59 -04:00
qazal
29e63097a0 st is a cached_property on UOp [run_process_replay] (#6433) 2024-09-10 08:30:35 +08:00
qazal
cf64f8bb40 start with the UOps.VALID spec [run_process_replay] (#6435)
* document UOps.VALID [run_process_replay]

* now the assert
2024-09-10 08:00:19 +08:00
Tim Becker
58a1b4f427 Faster UOp hashing (#6447)
* Faster hashing of Enums and UOp

* NOp should not define __eq__

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-10 07:16:04 +08:00
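
Two Python facts sit behind this commit: caching the hash avoids rehashing the same immutable node's tuple on every dict/set lookup, and a class that defines __eq__ without also defining __hash__ gets __hash__ set to None (hence "NOp should not define __eq__"). A sketch of the caching pattern with an illustrative Node class, not tinygrad's UOp:

```python
class Node:
  def __init__(self, op: str, src: tuple):
    self.op, self.src = op, src
    self._hash = hash((op, src))  # computed once at construction
  def __hash__(self): return self._hash
  def __eq__(self, other): return isinstance(other, Node) and (self.op, self.src) == (other.op, other.src)

# NOTE: a subclass that redefines __eq__ but not __hash__ becomes unhashable,
# since Python then sets its __hash__ to None
assert hash(Node("ADD", ())) == hash(Node("ADD", ()))
```
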
George Hotz
92e4126793 Revert "Revert "RewriteContext [run_process_replay] (#6428)" (#6438)" (#6443)
This reverts commit e7dd08448f.
2024-09-10 07:00:26 +08:00
George Hotz
904f6a63fa Revert "Revert "cleanup process_replay/* namings [run_process_replay] (#6429)…" (#6442)
This reverts commit eda177da84.
2024-09-10 07:00:16 +08:00
nimlgen
8d3450ceab qcom remove unused commands (#6445)
* qcom remove unused commands

* linter
2024-09-09 20:26:07 +03:00
nimlgen
f63a9fd649 hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability

* linter
2024-09-09 20:04:45 +03:00
George Hotz
dbd4536167 Revert "add UOps.VALID (#6387)" (#6441)
This reverts commit 8186e4e7d6.
2024-09-09 21:33:00 +08:00
George Hotz
e7dd08448f Revert "RewriteContext [run_process_replay] (#6428)" (#6438)
This reverts commit e1d61b048b.
2024-09-09 18:53:18 +08:00
George Hotz
eda177da84 Revert "cleanup process_replay/* namings [run_process_replay] (#6429)" (#6437)
This reverts commit f4e83b30b4.
2024-09-09 18:52:36 +08:00
George Hotz
d5bd38c278 add min max rule for expand [run_process_replay] (#6434) 2024-09-09 18:30:20 +08:00
George Hotz
42e5c8335e remove args from min/max [run_process_replay] (#6430)
* remove args from min/max [run_process_replay]

* it's a ConstType

* sconst_like unused

* any const is fine
2024-09-09 18:18:20 +08:00
qazal
f4e83b30b4 cleanup process_replay/* namings [run_process_replay] (#6429) 2024-09-09 16:59:04 +08:00
George Hotz
8186e4e7d6 add UOps.VALID (#6387)
* uops valid

* broke full_shape

* fixup that st (hardcoded asts still red)

* fixup DEFINE_VAR

debug

more debug

* start moving stuff to ast_const

* move test_linearizer

* move test_linearizer_failures to ast_const

* fixup test_schedule

* small diff change

* regenerate dataset

* fixup test_multitensor

* regen dataset try 2

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-09 16:58:43 +08:00
George Hotz
e1d61b048b RewriteContext [run_process_replay] (#6428) 2024-09-09 16:49:02 +08:00
qazal
935b6b658f delete seen from the scheduler api [run_process_replay] (#6427)
docs
2024-09-09 16:26:34 +08:00
George Hotz
6c7abd18df non-optional bounds (faster) [run_process_replay] (#6425)
* non-optional bounds (faster) [run_process_replay]

* pre-fetch min/max

* Revert "pre-fetch min/max"

This reverts commit cdd71840c5.
2024-09-09 16:00:16 +08:00
qazal
c5bae55ec8 new generate_dataset.sh (#6423)
* new generate_dataset.sh

* keep those there

* test: rm expected failures

* rename to extract
2024-09-09 15:13:07 +08:00
chenyu
1941e66cc9 real strides with uops (#6365)
* real strides with uops [run_process_replay]

* compare with old

* Revert "compare with old"

This reverts commit f53a8d4276.

* make those @unittest.expectedFailure
2024-09-09 03:06:27 -04:00
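
One way to read "real strides with uops" (an assumption about the approach, not a description of the PR): when the index expression is affine in a loop variable, that dimension's effective stride is the change in the flat index for a unit step. An illustrative probe:

```python
def probe_stride(index_expr, ndims: int) -> list[int]:
  # recover per-dimension strides of an affine index expression by
  # finite differences from the origin
  base = index_expr(*([0]*ndims))
  strides = []
  for d in range(ndims):
    point = [0]*ndims
    point[d] = 1
    strides.append(index_expr(*point) - base)
  return strides

# a contiguous (4, 5) view: idx = 5*i + j  ->  strides [5, 1]
assert probe_stride(lambda i, j: 5*i + j, 2) == [5, 1]
```
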
chenyu
ac98f5056e move lt-folding to a function [run_process_replay] (#6422)
and added more tests (some failed to match symbolic)
2024-09-09 02:04:52 -04:00
qazal
ff8a9ac3c1 test new style gated store rendering (#6413)
* test new style gated store rendering

* switch to lidx

* make lidx optional

* fixup [run_process_replay]
2024-09-09 13:59:22 +08:00
George Hotz
90fb17304f put rewrite back in ops [run_process_replay] (#6421) 2024-09-09 13:53:51 +08:00
chenyu
047ab7d256 minor Program post_init size cleanup [run_process_replay] (#6415) 2024-09-08 23:41:27 -04:00
qazal
442150a8df more ast_const for hardcoding consts [run_process_replay] (#6418) 2024-09-09 11:35:08 +08:00
chenyu
25af78c593 failed uop_symbolic divmod test by variable (#6414) 2024-09-08 23:08:58 -04:00
qazal
88941bcf16 fold bitwise noops (#6412)
from `8269a721cd6f5c6030ce120e1139095d7ba117eb`

Co-authored-by: timmy <timmy0x@proton.me>
2024-09-09 10:18:38 +08:00
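
The identities such a fold can target (standard bitwise no-ops, not necessarily the PR's exact pattern list):

```python
# OR/XOR with 0 and AND with an all-ones mask leave x unchanged, as do x|x and x&x
x = 0b1011
assert x | 0 == x and x ^ 0 == x and x & -1 == x
assert x | x == x and x & x == x
```
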
chenyu
ad05302232 tests of real_stride of symbolic shape (#6409)
these would have failed in #6365
2024-09-08 21:37:19 -04:00