Commit Graph

10490 Commits

qazal
a17ea53340 delete USE_COPY_KERNEL from the scheduler [run_process_replay] (#6482) 2024-09-12 07:45:31 +08:00
nimlgen
eac046ea55 hcq check queue size before submit (#6481) 2024-09-11 23:13:13 +03:00
qazal
dda5c63f4a things we can delete after dtypes.void [run_process_replay] (#6480) 2024-09-11 19:21:41 +08:00
qazal
bce73c9a54 more scheduler graph_rewrite cleanups [run_process_replay] (#6479) 2024-09-11 18:26:35 +08:00
George Hotz
bdd0c06f29 add void type to uop (#6471)
* unwrap_dtype maybe

* uopgraph stuff that hardcoded None

* test_ops passes

* dtypes.py fixups

* update test_linearizer and friends

* more ast updates

* test_beam and test_schedule too

* add void type to uop [run_process_replay]

* remove dumb casts

* start making it green

* more cast cleanups

* more cls methods to fix

* regenerate dataset

* split UOp and NOp const

* maybe that too

* fix docs

* update test_uop_symbolic

* test_verify_ast

* new sops with no diff

* meh, type_ignore is alright

* remove that assert

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-11 18:16:28 +08:00
George Hotz
1b4d1823b7 add pyint to DTYPES_DICT [run_process_replay] (#6477)
* add pyint to DTYPES_DICT [run_process_replay]

* also fix uop alu bug

* exclude pyint there too

* ne ne

* force explicit dtype
2024-09-11 17:31:59 +08:00
qazal
5cc142c8b8 add uop.swizzle(st) (#6476) 2024-09-11 16:52:42 +08:00
qazal
78148e16d8 init changes from the dtypes_void branch [run_process_replay] (#6475) 2024-09-11 16:34:50 +08:00
qazal
d6d9234985 cleanup some scheduler rewrites [run_process_replay] (#6474) 2024-09-11 16:10:59 +08:00
George Hotz
1cadddee26 Revert "fold lt (#6472)" (#6473)
This reverts commit 81bda4d304.
2024-09-11 15:59:25 +08:00
George Hotz
81bda4d304 fold lt (#6472) 2024-09-11 15:56:57 +08:00
qazal
e645a0e766 allow selecting UPat files in TRACK_MATCH_STATS [run_process_replay] (#6470) 2024-09-11 14:32:46 +08:00
qazal
3cde1503ce enable graph rewrite in the scheduler (#6249)
* test: enable

* skip those

* skip pads tests
2024-09-11 14:30:04 +08:00
chenyu
d9d1ae7248 more lt folding using gcd (#6469) 2024-09-11 02:09:35 -04:00
madt2709
dfe1db1cff Fix typo in docs (#6468)
Co-authored-by: theordias <theo.dias@cresta.ai>
2024-09-11 01:47:26 -04:00
qazal
262569a3eb green conv bw AST_REWRITE=1 (#6466)
* green conv bw AST_REWRITE=1

* new strides and dtype fix
2024-09-11 10:51:24 +08:00
chenyu
15c4d4f406 fold unrolled arange div pattern (#6465) 2024-09-10 22:35:52 -04:00
qazal
4259311006 merge views in conv swizzle (#6464) 2024-09-11 10:11:01 +08:00
George Hotz
6d195fb653 small changes from new style expand [run_process_replay] (#6462) 2024-09-11 09:10:56 +08:00
qazal
803b8b9313 conv bw schedule and correctness tests to iterate on (#6461)
first to fix AST_REWRITE=1, then to implement the same fusion for dtypes.half.
2024-09-11 08:47:07 +08:00
chenyu
b574caadc9 fix UOp const_factor for ADD [run_process_replay] (#6459)
currently not used, fixed for completeness
2024-09-10 20:04:26 -04:00
chenyu
2105832b87 _min_max of MUL of 2 non-positive inputs (#6454) 2024-09-10 07:13:01 -04:00
Francis Lata
b7ce9a1530 UNet3D MLPerf (#3470)
* add training set transforms

* add DICE cross entropy loss

* convert pred and label to Tensor when calculating DICE score

* cleanups and allow train dataset batching

* fix DICE CE loss calculation

* jitted training step

* clean up DICE CE loss calculation

* initial support for sharding

* Revert "initial support for sharding"

This reverts commit e3670813b8.

* minor updates

* cleanup imports

* add support for sharding

* apply temp patch to try to avoid OOM

* revert cstyle changes

* add gradient acc

* hotfix

* add FP16 support

* add ability to train on smaller image sizes

* add support for saving and loading checkpoints + cleanup some various modes

* fix issue with using smaller patch size + update W&B logging

* disable LR_WARMUP_EPOCHS

* updates

* minor cleanups

* cleanup

* update order of transformations

* more cleanups

* realize loss

* cleanup

* more cleanup

* some cleanups

* add RAM usage

* minor cleanups

* add support for gradient accumulation

* cleanup imports

* minor updates to not use GA_STEPS

* remove FP16 option since it's available now globally

* update multi-GPU setup

* add timing logs for training loop

* go back to using existing dataloader and add ability to preprocess data to save time

* clean up optimization and re-enable JIT and multi-GPU support for training and evaluation

* free train and eval steps memory

* cleanups and scale batch size based on the number of GPUs

* fix GlobalCounters import

* fix seed

* fix W&B setup

* update batch size default size

* add back metric divergence check

* put back JIT on UNet3d eval

* move dataset preprocessing inside training code

* add test for dice_loss

* add config logging support to W&B and other cleanups

* change how default float is getting retrieved

* remove TinyJit import duplicate

* update config logging to W&B and remove JIT on eval_step

* no need for caching preprocessed data anymore

* fix how evaluation is ran and how often

* add support for LR scaling

* fix issue with gaussian being moved to scipy.signal.windows

* remove DICE loss unit test

* fix issue where loss isn't compatible with multiGPU

* add individual BEAM control for train and eval steps

* fix ndimage scipy import

* add BENCHMARK

* cleanups on BENCHMARK + fix on rand_flip augmentation during training

* cleanup train and eval BEAM envs

* add checkpointing support after every eval

* cleanup model_eval

* disable grad during eval

* use new preprocessing dataset mechanism

* remove unused import

* use training and inference_mode contexts

* start eval after benchmarking

* add data fetching time

* cleanup decorators

* more cleanups on training script

* add message during benchmarking mode

* realize when reassigning LR on scheduler and update default number of epochs

* add JIT on eval step

* remove JIT on eval_step

* add train dataloader for unet3d

* move checkpointing to be done after every epoch

* revert removal of JIT on unet3d inference

* save checkpoint if metric is not successful

* Revert "add train dataloader for unet3d"

This reverts commit c166d129df.

* Revert "Revert "add train dataloader for unet3d""

This reverts commit 36366c65d2.

* hotfix: seed was defaulting to a value of 0

* fix SEED value

* remove the usage of context managers for setting BEAM and going from training to inference

* support new stack API for calculating eval loss and metric

* Revert "remove the usage of context managers for setting BEAM and going from training to inference"

This reverts commit 2c0ba8d322.

* check training and test preprocessed folders separately

* clean up imports and log FUSE_CONV_BW

* use train and val preprocessing constants

* add kits19 dataset setup script

* update to use the new test decorator for disabling grad

* update kits19 dataset setup script

* add docs on how to train the model

* set default value for BASEDIR

* add detailed instruction about BASEDIR usage

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-09-10 04:37:28 -04:00
qazal
f4f705a07c can push SWIZZLE through reduce both ways (#6453) 2024-09-10 16:00:50 +08:00
qazal
1347e49e82 second iteration on UOps.SWIZZLE (#6451)
* new swizzle

* fix the failing tests

* test a double swizzle

* ci
2024-09-10 14:43:21 +08:00
chenyu
e0d35e3657 update test_padto_sum_not_ok (#6450)
updated the setup as `exp() < -1` could be folded to False
2024-09-09 22:46:42 -04:00
qazal
95c9fe841e UOp.st infra for the new SWIZZLE (#6449) 2024-09-10 09:39:45 +08:00
qazal
abfbd9fd2f fix Variable init from the DEFINE_VAR refactor (#6448)
prereq for UOps.VALID.
2024-09-10 09:14:29 +08:00
chenyu
fcc69adfc5 simplify c0*x<c1 for negative int c0,c1 (#6431)
* simplify c0*x<c1 for negative int c0,c1

* fine if rhs is zero
2024-09-09 21:05:53 -04:00
kormann
f6f4f3222f whisper long batch (#6335)
* reset

* test

* only part refactor
2024-09-09 21:03:59 -04:00
qazal
29e63097a0 st is a cached_property on UOp [run_process_replay] (#6433) 2024-09-10 08:30:35 +08:00
qazal
cf64f8bb40 start with the UOps.VALID spec [run_process_replay] (#6435)
* document UOps.VALID [run_process_replay]

* now the assert
2024-09-10 08:00:19 +08:00
Tim Becker
58a1b4f427 Faster UOp hashing (#6447)
* Faster hashing of Enums and UOp

* NOp should not define __eq__

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-09-10 07:16:04 +08:00
George Hotz
92e4126793 Revert "Revert "RewriteContext [run_process_replay] (#6428)" (#6438)" (#6443)
This reverts commit e7dd08448f.
2024-09-10 07:00:26 +08:00
George Hotz
904f6a63fa Revert "Revert "cleanup process_replay/* namings [run_process_replay] (#6429)…" (#6442)
This reverts commit eda177da84.
2024-09-10 07:00:16 +08:00
nimlgen
8d3450ceab qcom remove unused commands (#6445)
* qcom remove unused commands

* linetr
2024-09-09 20:26:07 +03:00
nimlgen
f63a9fd649 hcq _cur_cmd_idx for readability (#6444)
* hcq _cur_cmd_idx for readability

* linter
2024-09-09 20:04:45 +03:00
George Hotz
dbd4536167 Revert "add UOps.VALID (#6387)" (#6441)
This reverts commit 8186e4e7d6.
2024-09-09 21:33:00 +08:00
George Hotz
e7dd08448f Revert "RewriteContext [run_process_replay] (#6428)" (#6438)
This reverts commit e1d61b048b.
2024-09-09 18:53:18 +08:00
George Hotz
eda177da84 Revert "cleanup process_replay/* namings [run_process_replay] (#6429)" (#6437)
This reverts commit f4e83b30b4.
2024-09-09 18:52:36 +08:00
George Hotz
d5bd38c278 add min max rule for expand [run_process_replay] (#6434) 2024-09-09 18:30:20 +08:00
George Hotz
42e5c8335e remove args from min/max [run_process_replay] (#6430)
* remove args from min/max [run_process_replay]

* it's a ConstType

* sconst_like unused

* any const is fine
2024-09-09 18:18:20 +08:00
qazal
f4e83b30b4 cleanup process_replay/* namings [run_process_replay] (#6429) 2024-09-09 16:59:04 +08:00
George Hotz
8186e4e7d6 add UOps.VALID (#6387)
* uops valid

* broke full_shape

* fixup that st (hardcoded asts still red)

* fixup DEFINE_VAR

debug

more debug

* start moving stuff to ast_const

* move test_linearizer

* move test_linearizer_failures to ast_const

* fixup test_schedule

* small diff change

* regenerate dataset

* fixup test_multitensor

* regen dataset try 2

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2024-09-09 16:58:43 +08:00
George Hotz
e1d61b048b RewriteContext [run_process_replay] (#6428) 2024-09-09 16:49:02 +08:00
qazal
935b6b658f delete seen from the scheduler api [run_process_replay] (#6427)
docs
2024-09-09 16:26:34 +08:00
George Hotz
6c7abd18df non-optional bounds (faster) [run_process_replay] (#6425)
* non-optional bounds (faster) [run_process_replay]

* pre-fetch min/max

* Revert "pre-fetch min/max"

This reverts commit cdd71840c5.
2024-09-09 16:00:16 +08:00
qazal
c5bae55ec8 new generate_dataset.sh (#6423)
* new generate_dataset.sh

* keep those there

* test: rm expected failures

* rename to extract
2024-09-09 15:13:07 +08:00
chenyu
1941e66cc9 real strides with uops (#6365)
* real strides with uops [run_process_replay]

* compare with old

* Revert "compare with old"

This reverts commit f53a8d4276.

* make those @unittest.expectedFailure
2024-09-09 03:06:27 -04:00
chenyu
ac98f5056e move lt-folding to a function [run_process_replay] (#6422)
and added more tests (some failed to match symbolic)
2024-09-09 02:04:52 -04:00