Commit Graph

2244 Commits

Yixiang Gao
7c2ea85bb0 Raise memory limit for CIFAR test (#1499) 2023-08-08 19:40:56 -04:00
Thiago Franco de Moraes
293a10204b Add tinygrad.renderer to packages in setup.py (#1497) 2023-08-08 15:51:49 -07:00
chenyu
0415a48cfc patch JIT llama chat mode (#1496) 2023-08-08 15:15:56 -07:00
Yixiang Gao
6480a1a180 CIFAR 94.03% (#1340)
* add disk_tensor

* fix jit

* new baseline before whitening

* whitening through torch

* whitening done, currently at 91.65%

* 91.99%

* clean up mixup and 92.3%

* clean up 92.30%

* 92.49% before searching for new hyper-parameters

* fix CI

* fix white space

* add whitening init in test

* refactor, update hyperparams, 92.72%

* converting whitening to tinygrad operation

* update CI kernels count for CIFAR

* add pad reflect

* add random crop 92.53%

* update hyperparams, 93%

* 93.15% on docker container, need to refactor the assignment for hyperparams

* print out weights and biases to be separated

* bias/non-bias params separated

* fix whitespace

* clean up

* refactor hyper-param with dict

* refactor lr scheduler params

* fix whitespace

* fix cross entropy loss

* fix whitespace

* move opt hyp to hyp dict

* minor fixup

* adjust model, loss scaling

* 92.74% while using half of compute as before

* update hyp for cutmix

* random shuffle during batches

* clean up

* updating the model

* update ConvGroup

* disable gradients for batchnorm layer weights

* whitespace

* 93.92%

* clean up

* finally 94%!

* rewrite whitening to remove dependency on torch

* whitespace

* remove dependency on torch, 93.91%

* back to 94.03%

* clean up

* update test_real_world
2023-08-08 15:13:24 -07:00
Roelof van Dijk
aa83a9e910 ci: fix gpuocelot build cache (#1474)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 14:00:04 -07:00
George Hotz
d24f936501 just cmplt (#1493)
* just cmplt

* fix maximum

* don't save, there's no backward

* ugh, no slot either

* eq is a scam
2023-08-08 13:58:10 -07:00
Roelof van Dijk
e2cf0f322e [READY] ci: missing n=auto (#1486)
* ci: missing n=auto

* fix: add to commented test

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 07:37:24 -07:00
Roelof van Dijk
0ce7511110 fix: is not used with a literal (#1487)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-08-08 07:35:30 -07:00
nimlgen
932dad1a2b fix cast bool->float in llvmir (#1480)
Closes #1479
2023-08-07 21:30:51 -07:00
nimlgen
046fd7437a use fake buffer for external_test_speed_llama.py (#1478) 2023-08-07 22:05:44 -04:00
George Hotz
5fdd248617 don't download cifar (#1472) 2023-08-06 21:38:59 -07:00
George Hotz
d78fb8f4ed add stable diffusion and llama (#1471)
* add stable diffusion and llama

* pretty in CI

* was CI not true

* that

* CI=true, wtf

* pythonpath

* debug=1

* oops, wrong place

* uops test broken for wgpu

* wgpu tests flaky
2023-08-06 21:31:51 -07:00
terafo
24933ab551 Actually flip local_max in CUDA (#1462)
* Actually do the flip

* Fixed typo

---------

Co-authored-by: terafo <terafo@protonmail.com>
2023-08-06 10:35:25 -07:00
Diogo
d7d1011f1e Add WEBGPU tests to CI (#1463)
* webgpu tests

* assert device is webgpu

* missed env set

* exclude failing ci tests

* ignore test file

* changed acc for adam test
2023-08-06 10:32:01 -07:00
George Hotz
486a9dbfd9 speed v torch (#1464)
* speed v torch

* always print

* change print

* torch speed tee

* all exposed
2023-08-06 09:32:33 -07:00
George Hotz
2ab282bfec run on update_benchmark too (#1460)
* run on update_benchmark too

* amd inference test

* name it better

* add 10 CIFAR training steps
2023-08-06 08:58:37 -07:00
terafo
3d41674b42 Fixed regression (#1447)
Co-authored-by: terafo <terafo@protonmail.com>
2023-08-06 07:55:58 -07:00
George Hotz
d67e248d9b simple bitcast 2 (#1445)
* simple bitcast 2

* bc 2

* empty

* Revert "empty"

This reverts commit d8ee083655.
2023-08-06 00:30:50 -07:00
George Hotz
943b227cb1 only on push to master 2023-08-06 00:10:07 -07:00
George Hotz
2274e3e757 Fix benchmark (#1454)
* do benchmarking

* system

* artifact

* go

* name artifact

* only on push
2023-08-05 23:44:36 -07:00
George Hotz
bf21aec81f do benchmarking (#1451)
* do benchmarking

* system

* artifact

* go

* name artifact
2023-08-05 23:35:01 -07:00
nimlgen
1ba8ae62a1 Match Torch speed for sum reduction (#1387)
Co-authored-by: Alexander Edwards <alex@alexedw.com>
2023-08-05 22:27:33 -07:00
chenyu
09ede08b23 simplify Node.sum aggregating (#1449) 2023-08-05 22:19:36 -07:00
George Hotz
7fa730b506 external model benchmark test 2023-08-05 22:10:48 -07:00
chenyu
cb5dcc7b57 remove view_from_shape (#1448) 2023-08-05 20:39:13 -07:00
Diogo
e2af95c2f8 moved global_max and local_max to LinearizerOptions, also added assert for max bufs (#1446) 2023-08-05 18:23:18 -07:00
George Hotz
7b8d06c9f1 test uops (#1444)
* test uops

* tests should pass

* improve uops

* precision
2023-08-05 12:35:56 -07:00
George Hotz
84c430355e fix backends for new style (#1443)
* fix backends for new style

* fix method cache

* fix fakeless

* llvm blacklist

* fix kernel optimizer
2023-08-05 11:07:04 -07:00
George Hotz
67781fcf5d fix fail fast in CI 2023-08-05 10:24:24 -07:00
George Hotz
bd7f4b1249 move renamer to linearizer (#1442)
* move renamer to linearizer

* uops converter

* Delete test_uops.py
2023-08-05 08:53:25 -07:00
nimlgen
669b406ec6 correct children count with lazycache (#1429) 2023-08-05 00:30:16 -07:00
Felix
97a6029cf7 Corrected a few misspelled words (#1435) 2023-08-04 16:51:08 -07:00
Adrian Kretz
043d5f2cb5 Fix NOUNROLL (#1439) 2023-08-04 16:50:19 -07:00
Francesco Castelli
579f4615a0 Add assert for wrong matmul/dot shapes (#1438) 2023-08-04 18:16:56 -04:00
Umut Zengin
52db7d7435 inf, -inf support for pad (#1436) 2023-08-04 15:05:25 -04:00
Alex Telon
7325bc914f fix: Context (#1430)
* Fixed issue in Context

* Cleaned up fix

Now that DEBUG.value = 3 always works, we can do so in __new__ as well.
2023-08-04 10:53:48 -04:00
ian
c08ed1949f Fix plt output comment (#1428) 2023-08-03 23:35:52 -07:00
wozeparrot
801bed4f66 Add ops_shm (#1413)
* feat: add ops_shm

* clean: extra newline

* feat: add test

* feat: ci doesn't like that

* feat: ci still doesn't like that

* feat: skip big test on ci

* feat: testing

* feat: big

* feat: testing again

* feat: reskip test
2023-08-03 17:40:52 -07:00
chenyu
34f348643b Support constant expand to symbolic shape (#1411) 2023-08-02 21:21:22 -07:00
chenyu
6572ca6835 support symbolic expand (#1407) 2023-08-02 20:03:46 -04:00
wozeparrot
a367f71fea fix: don't put kernels into cache when optimizing (#1409) 2023-08-02 18:17:16 -04:00
Paolo Gavazzi
9ffa1eb7e2 Removed dep of torch, torchaudio, kept librosa only (#1264) 2023-08-02 13:52:04 -04:00
George Hotz
fc2303e520 gitignore in weights 2023-08-02 16:26:41 +00:00
chenyu
18d0a93f09 LazyBuffer.get_variable_buffers() (#1391)
* LazyBuffer.get_variable_buffers()

* remove left_only, add ProdNode

* no vars for OpNode.b

* do not change symbolic vars, remove ProdNode
2023-08-02 09:01:35 -07:00
Umut Zengin
8889821547 Const pad support to pad2d and slice (#1392)
* slice to pad2d migrate

* Gain line

* Mypy happy

* Mypy happy

* Revert

* whitespace
2023-08-02 08:58:52 -07:00
wozeparrot
ab9e4a2e93 Make cuda CI a bit more consistent (#1403)
* feat: use fast-apt-mirror

* feat: use in more places
2023-08-02 07:38:22 -07:00
wozeparrot
7aff8c4ded cl fixes (#1402)
* feat: non-blocking

* feat: store event on buffer
2023-08-01 22:13:51 -07:00
Alex Telon
b66361843a Timing and Context can now be used as decorators (#1385)
* Context and Timing can now be used as decorators

* Using Timing decorator in quickstart.md

The time formatting is better and it is a useful tool to learn.

Old: Time: 3.5260659999912605
New: Time: 3526.14 ms

* Updated env_vars documentation for Context

* Added test for Context decorator

* Put new import on same line as others
2023-08-01 17:16:10 -07:00
chenyu
d9d1372dd0 Update pytest.ini format (#1398) 2023-08-01 18:00:51 -04:00
George Hotz
f4218b709f Revert "Improve Metal runtime command buffer handling (#1335)" (#1397)
This reverts commit bd54105b6b.
2023-08-01 12:10:20 -07:00