tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-15 09:05:40 -05:00

Author	SHA1	Message	Date
George Hotz	8ff2e13550	From teeny (#2426) * changes from teenygrad work * support not supporting ImageDType/PtrDType * fixups from teeny	2023-11-24 12:50:56 -08:00
chenyu	3971259832	fix test_real_world llama (#2335)	2023-11-16 19:50:08 -05:00
George Hotz	70a65c201e	JIT support in Interpreted (#2314) * factor that out * jit is supported everywhere * fix some tests * there's no jit supported device, the jit is everywhere * fix test uops	2023-11-15 11:13:38 -08:00
chenyu	a753c8e071	examples of new GPT2 and JIT change (#2261) * var_vals are global * working with global ish * better * fix export model * fix tests * better kv cache * does it run? * use where for kvmask * fix excessive var_vals * fix import * how does multigpu use this? * llama kinda work * faster and simpler * cleanup * fix conversation mode * test cleanups * fix one more test * test cleanup --------- Co-authored-by: George Hotz <geohot@gmail.com>	2023-11-10 15:07:02 -05:00
Roelof van Dijk	36ab04ae35	perf: lazyop as dataclass (#1603) * perf: lazyop as dataclass fix: linter fix: restore eq * use builtin methods, buffers to property to allow freezing * fix: reduce diff * fix: can't freeze due to KOPT tests, mypy * fix: explicit hash * can freeze if tests are fixed * fix: typo --------- Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com> Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-10-25 17:54:30 -04:00
George Hotz	15da96f393	print test durations and add speed (#2107) * print test durations * decrease sizes to increase speed * faster * GPU/CLANG onnx in seperate runner * test split, move ONNX CPU CI * simpler tests * simpler uops test * faster * less cuda apt * running ninja install * apt install * split fancy indexing	2023-10-18 13:46:42 -07:00
George Hotz	c36d306606	KOPT is over, BEAM is upstream (#2071) * create cache for q learning * make linter happy * global beam * where it belongs * bugfix * ditch the kopt, use the beam * faster lin and DEBUG=2 okay * remove kopt, move search to features	2023-10-16 09:46:03 -07:00
George Hotz	90c777d815	remove apply_auto_opt (#2063)	2023-10-13 07:44:14 -07:00
George Hotz	cea4cbfc7a	move image+kopt to features (#2015) * move image+kopt to features * fix tests * debug prints (unrelated)	2023-10-07 15:41:08 -07:00
nimlgen	d07ac379f9	add var_vals to kopt with symbolic (#2008) * add var_vals to kopt with symbolic again * no copies	2023-10-07 09:34:21 -07:00
chenyu	05be57f57f	Fix llama with empty prompt (#1997) * fix llama with one token prompt * llama is all_jitted	2023-10-06 06:48:07 -07:00
chenyu	c99fa58dd2	simplify gpt2 example (#1973) * simplify gpt2 example * kernel_jitted_count and jit tests * Revert "kernel_jitted_count and jit tests" This reverts commit `31a3c26dd0`. * all_jitted test in test_real_world	2023-10-05 07:09:29 -07:00
George Hotz	2d0c1037b1	Fix up latest openpilot model (#1976) * fix gemv triggering for gemm * fixup_openpilot * external test issues	2023-10-05 05:24:28 -07:00
George Hotz	3d5127038c	don't create linearizer if we are in the method cache (#1969) * don't create linearizer if we are in the method cache * remove unchecked properties * that key isn't used * fix default type is sticky	2023-10-04 12:42:58 -07:00
nimlgen	2ea1dd3e87	no process() in Linearizer (#1966) * no process() in Linearizer * more process() clean up	2023-10-04 07:18:42 -07:00
Yixiang Gao	094d3d71be	with Tensor.train() (#1935) * add with.train * remove the rest TODOs * fix pyflake * fix pyflake error * fix mypy	2023-09-28 18:02:31 -07:00
George Hotz	adab724caa	schedule2, keep the tests working with small changes (#1932) * lazy cleanups * ast functions take in LazyOps * op instead of self.op * _base for mops * fix contiguous * start schedule * test_schedule * fix openpilot * more tests * bugfix and test skip * work * make sure things get freed * fix zerosized tensors * fix failing test * fix ceil and friends * fix openpilot * disable training * disable test collectives	2023-09-28 09:14:43 -07:00
George Hotz	7ff7aacdb4	LazyOp out of Linearizer (#1908) * loadop buffer on cpu * works for GPU * sort of working * has bugs * gpu tests pass * fix some tests * fix tensor cores * fix test linearizer * fix symbolic * fix has_variable_shape * non symbolic size * disable weird test * simple cache fix * fix custom function * fix kopt * cleanups * a bit broken on the assign * contig check * only buffer * need that order * idx * dedup buffers * hmm, bugfix * fix tensor cores * opts device	2023-09-24 14:30:53 +08:00
George Hotz	97dc813329	Revert "All LazyOps in the Linearizer (#1905)" (#1907) This reverts commit `a5820390db`.	2023-09-24 11:51:22 +08:00
George Hotz	a5820390db	All LazyOps in the Linearizer (#1905) * loadop buffer on cpu * works for GPU * sort of working * has bugs * gpu tests pass * fix some tests * fix tensor cores * fix test linearizer * fix symbolic * fix has_variable_shape * non symbolic size * disable weird test * simple cache fix * fix custom function * fix kopt * cleanups * a bit broken on the assign * contig check * only buffer * need that order * idx	2023-09-24 11:50:00 +08:00
nimlgen	31fca43706	kopt works with local+grouped reduce and tests (#1824)	2023-09-09 13:22:09 -07:00
tomtom-95	7344f7c2d1	KeyError fixed. (#1763)	2023-09-04 15:36:16 -04:00
nimlgen	f863c12610	test kopt correctness (#1756) * test kopt correctness * bump BUDGET to 20 * kopt hooks as setUp/tearDown	2023-09-04 10:55:00 -07:00
chenyu	b8fde6bb0f	Test KOPT in CI (#1744) * test kopt in ci * getenv takes dtype from default	2023-09-03 14:37:20 -07:00
chenyu	a2745819f6	faster gpt2 jit path and gpt2 in test_real_world (#1738)	2023-09-02 08:39:12 -07:00
nimlgen	8844a0a822	llvm jitted (#1652)	2023-08-28 20:22:44 -07:00
George Hotz	a6d842af7a	move device to ops (#1646) * move device to ops * mlops types * 2 lines	2023-08-23 08:30:17 -07:00
George Hotz	718ced296c	move state to nn/state (#1619)	2023-08-22 07:36:24 -07:00
George Hotz	739f327d2d	Shorter (#1582) * deleting lines * remove insert dims * if statement is never hit * bug fixes	2023-08-20 08:12:16 -07:00
chenyu	ae39cf84ab	Symbolic Shape JIT main PR (#1353) * Symbolic Shape JIT update tests 2 variables symbolic ops, adding more tests test passing cleanup * more test cases * single flag * review update * jit attention one piece * realize * symbolic_jit test for cuda * old artifact * works with cuda gpu but failed ci * CUDACPU	2023-08-18 14:39:55 -07:00
Yixiang Gao	7c2ea85bb0	Raise memory limit for CIFAR test (#1499)	2023-08-08 19:40:56 -04:00
Yixiang Gao	6480a1a180	CIFAR 94.03% (#1340) * add disk_tensor * fix jit * new baseline before whitening * whitening through torch * whiting done currently at 91.65% * 91.99% * clean up mixup and 92.3% * clean up 92.30% * 92.49% before searching for new hyper-parameters * fix CI * fix white space * add whitening init in test * refactor, update hyperpara, 92.72% * converting whiting to tinygrad operation * update CI kernels count for CIFAR * add pad reflect * add random crop 92.53% * update hyperpara 93% * 93.15% on docker container, need to refactor the assignment for hyper param * print out weights and bias to be separated * bias/non-bias params separated * fix whitespace * clean up * refactor hyper-param with dict * refactor lr schedular params * fix whitespace * fix cross entropy loss * fix whitespace * move opt hyp to hyp dict * minor fixup * adjust model, loss scaling * 92.74% while using half of compute as before * update hyp for cutmix * random shuffle during batches * clean up * updating the model * update ConvGroup * disable gradients for batchnorm layer weights * whitespace * 93.92% * clean up * finally 94%git add .! * rewrite whitening to remove dependency on torch * whitespace * remove dependency on torch, 93.91% * back to 94.03% * clean up * update test_real_world	2023-08-08 15:13:24 -07:00
nimlgen	1ba8ae62a1	Match Torch speed for sum reduction (#1387) Co-authored-by: Alexander Edwards <alex@alexedw.com>	2023-08-05 22:27:33 -07:00
Pavol Rusnak	cd60b8561c	Add LLaMA-2 support (#1284) Co-authored-by: wozeparrot <wozeparrot@gmail.com>	2023-07-24 17:12:02 -04:00
George Hotz	f45013f0a3	stable diffusion: remove realizes we don't need	2023-07-20 19:53:07 -07:00
George Hotz	50a399ffa3	real world test: relax memory	2023-07-20 14:06:22 -07:00
George Hotz	17830e25da	real world tests (#1297) * real world test * touchup * sync device	2023-07-20 10:50:22 -07:00

37 Commits