Commit Graph

8210 Commits

Author SHA1 Message Date
Francis Lata
27ec792c19 check for CKPT when target metric is reached before saving 2025-03-02 00:41:08 -08:00
Francis Lata
3ac4ae5870 hotfix: log metric and move target metric check outside of CKPT 2025-03-01 04:31:00 -08:00
Francis Lata
974309862d update dataloader seed 2025-02-28 21:41:30 +00:00
Francis Lata
6a62ece474 minor cleanups 2025-02-28 15:43:11 +00:00
Francis Lata
074e9f742b more typing fixes 2025-02-28 15:42:11 +00:00
Francis Lata
e9d1af26b2 undo more changes 2025-02-28 15:11:17 +00:00
Francis Lata
47edcdb834 undo changes 2025-02-28 15:08:55 +00:00
Francis Lata
bdf442717c update seeding on dataloader and the start of training script 2025-02-28 14:58:28 +00:00
Francis Lata
87bfa77f4a some typing cleanups 2025-02-28 14:47:29 +00:00
Francis Lata
dc394e8214 Merge branch 'master' into retinanet_mlperf 2025-02-27 15:33:20 -05:00
chenyu
8ee2b460ee Tensor.var_mean (#9287) 2025-02-27 15:15:31 -05:00
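The `Tensor.var_mean` commit above adds a fused variance/mean pair. A minimal numpy sketch of the likely semantics, assuming it mirrors `torch.var_mean` (the `correction` default and return order here are assumptions, not taken from the commit):

```python
import numpy as np

def var_mean(x, axis=None, correction=1):
    # hypothetical numpy analogue of a var_mean pair: one call
    # returns (variance, mean) instead of two separate reductions
    mean = np.mean(x, axis=axis)
    var = np.var(x, axis=axis, ddof=correction)
    return var, mean
```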
qazal
cdf66cc67f test: recompute expanded CAST (#9286)
* those views should merge

* diff cleanup

* gpu

* put it behind CAST_AFTER_EXPAND
2025-02-27 19:22:17 +01:00
nimlgen
43e60914f3 init torch hooking (#9284)
* smth

* mv

* prof wk

* revert and move

* fix

* nvprof

* fix and don't print much
2025-02-27 19:36:55 +03:00
George Hotz
387ea41e99 increase speed of torch mnist: use gradient api (#9282) 2025-02-27 11:57:41 +08:00
Priyank Patel
a0764f0dc0 (bounty) Make mnist training run with torch backend (#9233)
* yml changes

* torch backend remove meta decomps and add test

* torch backend bump timeout for tests

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-27 11:32:25 +08:00
George Hotz
67ba073c55 hotfix: test accuracy in beautiful_mnist_torch 2025-02-27 11:18:59 +08:00
George Hotz
9088125a6a a lil more torch (#9280) 2025-02-27 11:12:20 +08:00
George Hotz
b6a14911c8 start torch.compile support (#9279) 2025-02-27 10:29:51 +08:00
chenyu
4342300eff lower test_gemm_8192 amd to 70 (#9277)
flaky
2025-02-26 16:32:08 -05:00
nimlgen
c4c29c8acc nv: parse elf attrs (#9275)
* better

* hm

* hm

* fixed
2025-02-26 23:21:57 +03:00
chenyu
6350725e2d simpler leaky_relu (#9271)
rendered as `*(data0+alu0) = ((val0<0.0f)?(0.01f*val0):val0);` instead of two wheres.

possible to update rewrite rules too
2025-02-26 13:43:48 -05:00
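To illustrate the `simpler leaky_relu` commit above: the new form renders as a single select, like the quoted `((val0<0.0f)?(0.01f*val0):val0)`, rather than composing two. A numpy sketch of the two formulations (the two-where version is an assumed illustration, not the exact old tinygrad expression):

```python
import numpy as np

def leaky_relu_two_wheres(x, neg_slope=0.01):
    # assumed older-style formulation built from two selects
    return np.where(x > 0, x, 0.0) + np.where(x < 0, neg_slope * x, 0.0)

def leaky_relu_one_where(x, neg_slope=0.01):
    # single select, matching the quoted rendered ternary
    return np.where(x < 0, neg_slope * x, x)
```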
Francis Lata
4fa62ba304 Merge branch 'master' into retinanet_mlperf 2025-02-26 13:27:35 -05:00
Francis Lata
86b737a120 leakyrelu to leaky_relu (#9270) 2025-02-26 13:22:08 -05:00
chenyu
cd822bbe11 hotfix torch_grad.detach().cpu().numpy() in test_ops (#9268) 2025-02-26 12:27:35 -05:00
chenyu
49ca90df75 update test_ops backward tests (#9267)
instead of `(out+1).square().mean().backward()`, use forward.sum().gradient to test values closer to the op's raw gradients
2025-02-26 12:09:24 -05:00
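Why `forward.sum().gradient` stays closer to the op's own gradients than the old `(out+1).square().mean().backward()` loss: the old loss scales each element's gradient by an out-dependent `2*(out+1)/N` factor. A numpy sketch of that scaling, using an assumed elementwise `out = x**2` as a stand-in op:

```python
import numpy as np

# assumed stand-in op: out = x**2, so d(sum(out))/dx = 2*x
x = np.array([0.5, -1.0, 2.0])
out = x ** 2
grad_sum = 2 * x  # gradient via forward.sum(): the raw op gradient

# old-style loss mean((out+1)**2) multiplies the raw gradient
# by an extra out-dependent factor 2*(out+1)/N
N = out.size
grad_legacy = (2 * (out + 1) / N) * 2 * x
```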
Francis Lata
7cb226d757 Revert "Revert "add nan check during training""
This reverts commit b7b2943197.
2025-02-26 15:43:20 +00:00
Francis Lata
e0e50fc482 Merge branch 'master' into retinanet_mlperf 2025-02-26 15:43:05 +00:00
chenyu
aaf0a8069f xor -> bitwise_xor (#9264) 2025-02-26 10:21:14 -05:00
George Hotz
2158dc4849 full fix for as_strided in torch backend (#9257)
* fixes from chargpt for torch backend

* shrink support

* add stride support

* comment cleanup

* a few more

* work

* import the stream hack

* llvm multi auto
2025-02-26 22:34:05 +08:00
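For context on the `as_strided` fix above: `as_strided` reinterprets one buffer under a new shape and strides, so the resulting views can alias and overlap, which a torch backend has to reproduce faithfully. A numpy sketch of the kind of overlapping view involved (an illustration of the semantics, not tinygrad's implementation):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

# view a flat 6-element buffer as 4 overlapping windows of length 3:
# row i starts at element i, so adjacent rows share memory
buf = np.arange(6)
windows = as_strided(buf, shape=(4, 3),
                     strides=(buf.itemsize, buf.itemsize))
```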
qazal
f60f997bf7 viz ui fixes [pr] (#9261) 2025-02-26 14:52:18 +01:00
qazal
bfd1e55bda show zoom to fit button in VIZ if graph isn't in view [pr] (#9258)
* show zoom to fit button in VIZ if graph isn't in view [pr]

* select #render
2025-02-26 14:20:39 +01:00
qazal
f70bad42ce minor becomes_map cleanup + comments [pr] (#9256)
* substitute assign source for KERNEL + comments [pr]

* minor becomes_map cleanup + comments [pr]
2025-02-26 12:36:27 +01:00
George Hotz
7780393460 rig up torch's testing framework [pr] (#9254)
* rig up torch's testing framework [pr]

* support more movement ops

* dec on expand

* fix tests

* work

* fix tests

* a few more

* decomps + opt hook

* installed pytest
2025-02-26 18:46:22 +08:00
qazal
b3755370ae substitute assign source for KERNEL + comments [pr] (#9255) 2025-02-26 11:44:29 +01:00
qazal
941559098b do not lockup VIZ when rendering big graphs [pr] (#8795)
* new viz renderer

* aesthetics

* progress message

* pruning + timeout at 2s
2025-02-26 09:15:26 +01:00
qazal
e162aa862d is_realized only if buffer is allocated (#9253)
* is_realized only if the buffer is allocated

* fix the image check too

* assert test_lil_model after ExecItems run
2025-02-26 08:58:08 +01:00
George Hotz
b603af373e run some tests from torch [pr] (#9252)
* run some tests from torch [pr]

* yml

* wrap_out

* clean up for the new people

* a lil more
2025-02-26 15:42:22 +08:00
Francis Lata
e006ae24ea Merge branch 'master' into retinanet_mlperf 2025-02-26 07:31:32 +00:00
George Hotz
3f4eb9006a test for device mismatch [pr] (#9250)
* test for device mismatch [pr]

* fix bert
2025-02-26 13:06:33 +08:00
Sieds Lykles
9c4d9d9f10 Acc first (#9232)
* put acc in front of the add chain

* handle the other case

* Make loop collapse more generic

* Remove mulacc_unrolled

* Actually remove it

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-25 22:10:15 -05:00
chenyu
979e84f30e RESET_STEP in bert setup and beam (#9248)
running dev_beam might OOM without it but runs fine in a real run.
2025-02-25 19:15:10 -05:00
Francis Lata
b7b2943197 Revert "add nan check during training"
This reverts commit ddf1f0d5dd.
2025-02-25 21:43:28 +00:00
nimlgen
2676c9d46e dsp: raise exec errors as RuntimeError for beam (#9246) 2025-02-25 19:22:35 +03:00
nimlgen
70db8c3003 hcq: dyn alloc signals (#9238)
* hcq: dyn alloc signals

* types and unique devs

* typing

* mypy

* mypy one more time

* test

* make fds not intersect in mockgpu between drivers
2025-02-25 17:22:24 +03:00
chenyu
6610ad58ab hotfix bert no shard with only one device (#9243)
`LLVM=1 BERT_SIZE="tiny" DEFAULT_FLOAT=HALF BENCHMARK=5 MODEL="bert" python3 examples/mlperf/model_train.py` runs for me with this. it should not fail with a single-device shard though
2025-02-25 09:05:11 -05:00
qazal
bba9c22f53 implement the new subbuffer spec for DISK [pr] (#9241) 2025-02-25 13:36:23 +01:00
qazal
48dfed064a remove const/var from the kernel graph [pr] (#9240) 2025-02-25 12:21:55 +01:00
Francis Lata
ddf1f0d5dd add nan check during training 2025-02-25 10:53:31 +00:00
Francis Lata
8737020d75 add JIT reset support 2025-02-25 10:52:26 +00:00
Francis Lata
30d5daa121 Merge branch 'master' into retinanet_mlperf 2025-02-25 10:32:34 +00:00