chenyu
b087663c35
RANGEIFY test_bert uses more RAM somehow ( #12443 )
2025-10-03 04:38:53 -04:00
chenyu
f203d8b221
update RANGEIFY kernel count and test_masked_select ( #12435 )
2025-10-03 00:41:34 -04:00
George Hotz
7129419500
fix cifar training in RANGEIFY ( #12355 )
...
* fix cifar training in RANGEIFY
* even more wino fuse
* bugfix
* test to show issue
2025-09-30 15:59:19 +08:00
chenyu
647965fb09
test_train cleanup ( #12140 )
...
* test_train cleanup
remove skipIf due to buffer sizes, runs locally
* those are slow
2025-09-12 13:21:30 -04:00
chenyu
20cd7177de
delete test_bert_fuse_arange ( #12121 )
...
* delete test_bert_fuse_arange
it's the default now and we are not interested in the FUSE_ARANGE=0 version
* remove -v
2025-09-11 12:35:51 -04:00
chenyu
0e266f376c
ops_gpu -> ops_cl ( #12103 )
2025-09-10 15:15:48 -04:00
nimlgen
1c6c42715f
unify cpu and llvm ( #11982 )
...
* try unify cpu and llvm
* fixes
* fix
* ops
* no llvm
* fix
* rm
* llvm is out
* oops
* override
* no llvm
* ignore
* skip llvm
* ooops
2025-09-09 13:54:44 +03:00
chenyu
bfa87f3490
clean up binary_crossentropy_logits ( #10958 )
2025-06-24 12:23:40 -04:00
George Hotz
81b9c04574
move high level stuff to unit tests [pr] ( #10708 )
...
* move high level stuff to unit tests [pr]
* process replay on unit tests
* fix pr, less compute
* set omp num threads
* set 200MB buffer size limit
* delete junk
* fix tests
* faster
* move test_indexing to unit
* faster
2025-06-08 14:05:56 -07:00
qazal
95c6a736a9
fix FUSE_ARANGE=1 for bert ( #10255 )
2025-05-12 14:44:05 +03:00
chenyu
70c797b107
train bert tests ( #10248 )
...
added a working bert tiny test, and a failing bert FUSE_ARANGE test
2025-05-11 08:42:08 -04:00
George Hotz
b6d2effaf5
assign is contiguous ( #10066 )
...
* assign is contiguous
* disable process replay for SDXL
2025-04-27 08:40:33 -04:00
qazal
14aa2395d0
allow VIEW(BUFFER) in Tensor UOps [pr] ( #9210 )
...
* allow VIEW(BUFFER) in Tensor UOps [pr]
* still reshapes
* update becomes_map tests
* bring copy folder to the scheduler
* lint
* only sgd left
* optimizer assign
* 13 kernels
* rename to test_reorder_expand + assert VIEW
2025-02-24 13:06:15 +01:00
chenyu
2e7c2780a9
CLANG -> CPU ( #9189 )
2025-02-20 18:03:09 -05:00
qazal
1fce864a6d
delete multi output support ( #8822 )
...
* delete multioutput for now
* test_schedule
* test_assign too
* linter
* 515 for sd
* update tests and ctx
* update that assign check
2025-01-30 22:45:50 -05:00
George Hotz
b4bf6a7dea
switch backward to use gradient [pr] ( #8235 )
...
* switch backward to use gradient [pr]
* set device correctly, dedup
* why does that fail?
* add noop cast
* simple backward
* fix beautiful_mnist
* touchups
* set in compute_gradient
* uop_count
* uop_count was wrong
* collections
* no note
* skip that test
* update sched kernel counts
* train mnist is 65
* fix metadata and gc
* fixes
* materialize_grads
* no pathlib stuff
* add contiguous_backward, fix bugs
* add some realize
* fix multi
2025-01-26 09:12:16 +09:00
George Hotz
f29d6f54b8
support multilb gradient [pr] ( #8624 )
2025-01-14 18:33:33 -08:00
George Hotz
aa3b094334
changes from delete lazy [pr] ( #8146 )
...
* changes from delete lazy [pr]
* test tweak
2024-12-10 11:06:17 -08:00
chenyu
aa51f3c14e
update kernel counts in test_real_world ( #7960 )
...
the test was useless because it was looking at the jit graph counts. wrap with JIT=2 for now.
if it's stable we could consider making kernel count strict, which helps changes like #7940
2024-11-29 11:14:54 -05:00
George Hotz
205befa788
move is_dtype_supported to device [pr] ( #7575 )
2024-11-07 20:38:03 +08:00
George Hotz
4013c9848c
don't use tons of memory for tests outside CI [pr] ( #7209 )
...
* don't use tons of memory for tests
* fix import and clean up pre-commit
* use pathlib
* no shm on windows
* Revert "use pathlib"
This reverts commit 7c38489820.
* run pre-commit hooks in test
* ugh, fix later
2024-10-22 15:04:51 +08:00
George Hotz
5ae2de9845
UOp.variable ( #7010 )
...
* UOp.variable [pr]
* fix tests
* clean
* improve name rendering
* last bug
2024-10-12 18:20:44 +08:00
chenyu
a0dbe20dbd
skip some redundant and slow tests in ci ( #5416 )
2024-07-12 14:43:13 -04:00
chenyu
322c37e621
use helpers.JIT in llama and gpt2 examples ( #5350 )
...
* use helpers.JIT in llama and gpt2 examples
replaced getenv("JIT"), effectively making jit the default for gpt2
* fix test_gpt2
2024-07-09 15:04:43 -04:00
chenyu
9a2a82a77f
test stable diffusion unet in ci ( #5268 )
...
unet is parameterized now so we can test a smaller one in ci
2024-07-02 21:37:52 -04:00
Tobias Fischer
9a25ee0b9a
fixed unet call params ( #5262 )
2024-07-02 12:40:27 -04:00
Tobias Fischer
8c9c1cf62f
Pulled CLIP and UNet into Separate Files ( #5253 )
...
* pulled clip and unet into separate files
* reference cleanup, lru cache fix
* better pool indexing
2024-07-01 22:33:01 -04:00
chenyu
6bbbeb93ac
skip a few clang tests that took > 30 seconds in CI ( #4126 )
...
* skip slow CLANG test test_train_cifar
* skip those too
* and that
* only CI
* one more
2024-04-10 02:00:34 -04:00
George Hotz
150ea2eb76
create engine folder and move code ( #3948 )
...
* retry
* older tf
* that
2024-03-26 20:38:03 -07:00
chenyu
a2d3cf64a5
move is_dtype_supported to test.helpers ( #3762 )
...
* move is_dtype_supported to test.helpers
updated all places that check if float16 is supported
* fix tests
2024-03-15 14:33:26 -04:00
chenyu
922f8319cb
Run test_real_world in METAL test ( #3760 )
...
* clean up test_real_world
* skip that
* JIT=2 for metal
* all device
2024-03-15 13:56:52 -04:00
George Hotz
41f0a25b53
lazy.py: cache consts ( #3577 )
...
* lazy.py: cache consts
* add regression test
* always always cache const
* bump by 1
2024-03-02 03:50:05 -08:00
xarkes
28a8b72024
Remove Interpreted device & remaining CPU/TORCH ref ( #3423 )
...
* Remove Interpreted device & remaining CPU/TORCH ref
* Oops
* supports_device was useful
* Fix doc wording
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-02-16 00:30:21 -05:00
George Hotz
41efaa848c
move graph.py and jit.py into features ( #3376 )
...
* move graph.py into features
* move jit into features
* fix quickstart
2024-02-12 17:34:34 +01:00
George Hotz
9e17378b60
Fix metal tests ( #3266 )
...
* small fixes for tests on mac
* remove device from TensorCore
2024-01-27 18:09:42 -08:00
chenyu
f88506e630
move gpt2/llama sampling inside the model call ( #3013 )
...
* move gpt2/llama sampling inside the model call
* argmax uses one more kernel
2024-01-04 17:01:50 -05:00
Yixiang Gao
8a63f26a0f
make LR scheduler work with multigpu ( #3011 )
...
* add a failing test for LR scheduler when using multigpu
* fix calculation order and remove an unnecessary tensor created for float
* min_lr is no longer a tensor
2024-01-04 12:10:56 -08:00
Yixiang Gao
84eb6dd32a
skip GPU because opencl on intel can't compile half
2024-01-03 07:07:21 -08:00
Yixiang Gao
73879b50ad
only need to check the min_lr for the nan bug
2024-01-03 07:00:50 -08:00
Yixiang Gao
99f8740c60
running half in CI CPU is slow
2024-01-02 18:44:35 -08:00
Yixiang Gao
781690fd99
how long it takes on CI CPU without the lr scheduler
2024-01-02 18:33:48 -08:00
Yixiang Gao
dd00bcb9c0
fix whitespace
2024-01-02 18:16:33 -08:00
Yixiang Gao
841487cad9
add half test using hyp from benchmarks
2024-01-02 18:14:30 -08:00
George Hotz
a280cfe169
move dtypes to dtype.py ( #2964 )
...
* move dtypes to dtype.py
* fix urllib
2024-01-01 14:58:48 -08:00
chenyu
1fb815e77e
hotfix fix coder. RMSNorm cannot have float16 input ( #2932 )
...
* hotfix fix coder. RMSNorm cannot have float16 input
* update real world test due to new kernels
* more type casts
2023-12-25 02:28:11 -05:00
chenyu
0723f26c80
dtypes.default_float and dtypes.default_int ( #2824 )
2023-12-18 12:21:44 -05:00
George Hotz
877c78b4ce
lazy tests ( #2796 )
...
* tests
* mini sd is very mini
2023-12-16 08:24:21 -08:00
George Hotz
c6eb618013
tests from new lazy branch ( #2774 )
...
* tests from new lazy branch
* fix lin 11
* that was needed
* doesn't fail
* mark
* meant that
* llvm passes
2023-12-14 23:06:39 -08:00
George Hotz
6d6eb9302d
ruff checks the max line length is 150 ( #2734 )
...
* ruff checks the max line length is 150
* fix tensor.py
* a lot more
* done
2023-12-12 17:34:47 -08:00
George Hotz
4164d0ebbd
multitensor start ( #2676 )
...
* multitensor work
* early gen fixes the tests
* atol for flaky test
2023-12-07 17:07:05 -08:00