tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-13 17:08:11 -05:00

Author	SHA1	Message	Date
George Hotz	14b613e281	add STEPS to beautiful_mnist	2024-08-10 15:23:44 -07:00
wozeparrot	d269bc95fa	faster tinychat (#5993)	2024-08-08 19:16:26 -07:00
Elias Wahl	c9b4602854	no load in INITMLPERF (#5957)	2024-08-08 11:28:24 -04:00
Elias Wahl	c9862e17d4	MLPERF BERT submission scripts (#5931) * green * red * fix benchmark * log * count train samples * oops. 4.0 -> 4.1 * note to todo * no pillow	2024-08-06 18:09:18 -04:00
chenyu	1dab75ae37	clean up mlperf dataloader import (#5940) use tinygrad tqdm for dataset, and PIL Image is only needed for resnet	2024-08-06 17:10:08 -04:00
George Hotz	e077bc7baf	move memory planner to realize (#5937)	2024-08-06 10:41:29 -07:00
Elias Wahl	937bf5fe12	better hparam (#5891)	2024-08-03 12:38:53 -04:00
Elias Wahl	4a114756f6	New BERT dataloader (#5881) * One file == One topic * update test * new dataloader * update train script * get index is faster	2024-08-02 15:12:23 -04:00
David Hou	9a485f36e4	shard kvcache (#5830)	2024-07-30 20:29:54 -07:00
George Hotz	21c5e8e1b7	extreme llama speed, 57.34 tok/s (#5827) * extreme llama speed * mergable	2024-07-30 18:32:09 -07:00
Francis Lata	a0baff7a3d	update dataloader script example (#5818)	2024-07-30 15:18:29 -04:00
wozeparrot	eebb1b9922	feat: temperature 0 llama3 benchmark (#5806)	2024-07-30 12:05:36 -07:00
wozeparrot	639af3f823	llama3 temperature flag (#5803)	2024-07-29 16:33:51 -07:00
George Hotz	8b34ee2f52	remove global_size and local_size from Kernel class [run_process_replay] (#5720) * remove global_size and local_size from Kernel class [run_process_replay] * sizes from the prg	2024-07-25 13:55:08 -07:00
George Hotz	7f5282b2f5	tests if the linearizer is generating dumb code (#5611) * tests if the linearizer is generating dumb code * push consts to the end * sort adds * sorted add and mul * this better * simple expand/contract * no math contract/expand	2024-07-20 20:36:32 -07:00
George Hotz	b399ccd6ef	BEAM bugfix, kernels dedup now (#5617) * BEAM bugfix, kernels dedup now * getenv is default	2024-07-20 19:43:50 -07:00
chenyu	d71308ed68	copy mlperf 4.0 to mlperf 4.1 (#5614)	2024-07-20 16:12:00 -04:00
George Hotz	1113e47f96	print best in MCTS + light up the winner in hcopt	2024-07-20 09:39:36 -07:00
George Hotz	06e336bccb	mcts search (#5598) * mcts search * mcts cleanups * mcts cleanup * random shuffle children order * mcts in handcode_opt * src and remove_node * debug 3 to print ast * print the type * mcts in extra	2024-07-19 21:38:39 -07:00
George Hotz	0ad87021e2	move acc to end (#5568) * move acc to end * confirmed pictures are the same * relax that * Update test_ops.py	2024-07-19 03:06:52 -07:00
George Hotz	2de82b8a5d	remove get_lazyop_info (#5570) * don't use get_lazyop_info more * keep that min * no ptx for that test	2024-07-19 03:05:33 -07:00
kormann	2c4add6844	pretty print lazy op per default (#5505) * pretty lop * min diff * walrus * fix * min diff * simplify * pretty helper function * ws * pretty uop upat * tests * stricter tests * test passes * ws * stronger upat test * delete print_tree * min diff * stricter exp test * fix merge * stronger uops eval test * +readable and deep upat test * +readable and deep upat test * sort inv fix * fix * revert allowed_len	2024-07-18 09:34:08 -07:00
George Hotz	fa7e734b49	MetaOps.KERNEL (#5543)	2024-07-17 19:41:23 -07:00
chenyu	4193095f67	fix handcode_opt.py with DEBUG=2 (#5530) only one ast per kernel now	2024-07-17 14:50:47 -04:00
George Hotz	a9f5a764dc	make BatchNorm work for 2D and 3D (#5477) * make BatchNorm work for 2D and 3D * beautiful mnist shouldn't use BatchNorm2d	2024-07-14 11:39:58 -07:00
George Hotz	aade18d20c	beautiful_mnist in torch	2024-07-14 11:09:58 -07:00
George Hotz	cdf63e41bf	mnist mlx example uses compile to be fair to tinyjit	2024-07-13 18:14:45 -07:00
George Hotz	8940530290	add mlx beautiful_mnist example	2024-07-13 17:55:47 -07:00
chenyu	28972418c4	s/get_linearizer/get_kernel [run_process_replay] (#5467)	2024-07-13 20:32:22 -04:00
Francis Lata	0345577032	UNet3D dataloader shared memory fix (#5465) * create separate SharedMemory between inputs and labels * update path check for shared mem * clean up unit test for dataset	2024-07-13 20:26:00 -04:00
chenyu	4df63da190	clean up rest of the loadop [run_process_replay] (#5440) to metaop and filter_sink	2024-07-12 23:38:51 -04:00
George Hotz	03c2dc8bd7	lowerer is kernel [run_process_replay] (#5437)	2024-07-12 18:50:55 -07:00
chenyu	9a187e6102	fix handcode_opt script (#5435) * fix handcode_opt script * run in ci * real run in ci * HALF=0	2024-07-12 20:52:28 -04:00
George Hotz	870dc8c350	s/Linearizer/Lowerer [run_process_replay] (#5428)	2024-07-12 15:54:07 -07:00
George Hotz	6707c778d0	scheduleitem is not Tuple [run_process_replay] (#5425) * scheduleitem is not Tuple [run_process_replay] * fix tests * fix op + fuzzers * fix mop test	2024-07-12 15:13:19 -07:00
George Hotz	f6ef283e6a	s/loadops/metaops [run_process_replay] (#5421)	2024-07-12 13:26:50 -07:00
wozeparrot	d1cbd6bb95	unity handcode_resnet_opt and handcode_bert_opt (#5418)	2024-07-12 12:05:01 -07:00
wozeparrot	b7cc75a9df	usage summary in handcode opt (#5414)	2024-07-12 11:21:18 -07:00
George Hotz	8390feb7b9	optim.OptimizerGroup in hlb_cifar (#5401)	2024-07-11 20:14:36 -07:00
wozeparrot	c24d495ef9	metadata in handcode_opt (#5400)	2024-07-11 17:45:34 -07:00
George Hotz	5232e405ce	hotfix: add BS to beautiful_mnist	2024-07-11 10:55:05 -07:00
wozeparrot	c9b3ae6bbf	fix llama.py chat mode assert (#5366)	2024-07-10 18:06:14 -07:00
wozeparrot	fa873df9c1	bring tinychat more inline with tinyos' version (#5358)	2024-07-10 13:13:52 -07:00
chenyu	322c37e621	use helpers.JIT in llama and gpt2 examples (#5350) * use helpers.JIT in llama and gpt2 examples replaced getenv("JIT"), effectively made gpt2 default jit * fix test_gpt2	2024-07-09 15:04:43 -04:00
Elias Wahl	73bddc44f6	Fix fake dataloader (#5326)	2024-07-08 09:07:44 -04:00
chenyu	43c3f73fbc	handcode_bert_opt.py (#5295) similar to handcode_resnet50_opt.py, one file to check bert kernels without dataset.	2024-07-05 11:01:20 -04:00
Tobias Fischer	0c3a35e5c2	Stable Diffusion v2 Inference (#5283) * model implementation * clip fix, more qol options	2024-07-03 22:47:10 -04:00
reddyn12	d3e244d8b7	prev speed improvements (#5252) Co-authored-by: reddyn <nikidsniper@gmail.com>	2024-07-03 09:06:01 -07:00
chenyu	191463a919	add timing to SDXL (#5273)	2024-07-02 23:29:54 -04:00
chenyu	b2c3a28a5e	nn.RMSNorm (#5272) the norm itself has no significant value to add to Tensor method, but we would want Tensor.normalize	2024-07-02 21:39:01 -04:00

1207 Commits