tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-24 06:18:01 -05:00

Author	SHA1	Message	Date
qazal	ba17786068	do not construct unmasked VALID (#8759) * new lines that exist in codegen/ops * update tests * update sops.gz (13071 -> 13070 asts) * fix viz too * remove that TODO * diff pruning * mask assert + device * work * diff pruning * re: fix viz too --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-01-28 20:51:21 +02:00
qazal	3417bc1814	fix ShapeTracker spec for const [pr] (#8791)	2025-01-28 19:53:36 +02:00
qazal	e8be8a5835	support lowering CONST(VIEW) in lowerer (#8785)	2025-01-28 12:04:41 +02:00
George Hotz	80089536e5	Revert "move llvm_bf16_cast to renderer for CLANG and LLVM [pr] (#8720)" (#8786) This reverts commit `af0452f116`.	2025-01-28 18:59:02 +09:00
mesozoic-egg	af0452f116	move llvm_bf16_cast to renderer for CLANG and LLVM [pr] (#8720) * handle bf16 via bitcasting for CLANG and LLVM * On LLVM, skip float16 cast * float32 on llvm lite, float32 elsewhere * code format * trigger pr * move to rewriter --------- Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.mail> Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-01-28 18:16:43 +09:00
qazal	aefbc2637f	test fixups from unmasked valid deletion [pr] (#8776)	2025-01-28 09:23:30 +02:00
qazal	ed672881b0	remove additions/deletion in pr + check uops are equal [pr] (#8779) * use warnings there [pr] * remove those + move assert_diff [pr] * warn after log * remove * back	2025-01-28 08:57:34 +02:00
George Hotz	62655e4999	move multi into engine [pr] (#8778) * move multi into engine [pr] * all runtime is one sz	2025-01-28 09:15:29 +09:00
Ignacio Sica	b240f12593	[TIP-9] rename Opt's amt to arg 2 (#8770) * rename Opt amt to arg * ignore_beam_cache for test_tiny * move ignore_beam_cache to test_tiny * move to separate pr * revert space change --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-27 14:19:04 -05:00
Ignacio Sica	ed1b573868	ignore beam cache in test_tiny for stateless beam (#8771)	2025-01-27 12:56:30 -05:00
George Hotz	3ed146a5ff	Revert "rename Opt amt to arg (#8767)" (#8769) This reverts commit `bf041659a5`.	2025-01-27 23:46:37 +09:00
Ignacio Sica	bf041659a5	rename Opt amt to arg (#8767)	2025-01-27 23:36:47 +09:00
George Hotz	96bff0b4f7	contiguous is no longer needed in SGD [pr] (#8760) * contiguous is no longer needed in SGD [pr] * add allow condition	2025-01-27 15:19:11 +09:00
George Hotz	a9d9f98d05	hotfix: those tests fail locally on mac due to buffer count	2025-01-27 07:53:48 +09:00
qazal	ac70f63d4b	tensor_map cleanups [pr] (#8754) * tensor_map cleanups [pr] * update test_schedule too	2025-01-26 11:41:54 +02:00
George Hotz	b53fe7c2fc	remove unused ctx [pr] (#8751) * remove unused ctx [pr] * fix test	2025-01-26 17:59:15 +09:00
George Hotz	b4bf6a7dea	switch backward to use gradient [pr] (#8235) * switch backward to use gradient [pr] * set device correctly, dedup * why does that fail? * add noop cast * simple backward * fix beautiful_mnist * touchups * set in compute_gradient * uop_count * uop_count was wrong * collections * no note * skip that test * update sched kernel counts * train mnist is 65 * fix metadata and gc * fixes * materialize_grads * no pathlib stuff * add contiguous_backward, fix bugs * add some realize * fix multi	2025-01-26 09:12:16 +09:00
George Hotz	0ffd572e1e	fix multi with no real srcs (#8749)	2025-01-26 08:41:00 +09:00
qazal	0e42befc6e	viz cleanups 2 [pr] (#8748) * viz cleanups 2 [pr] * test_viz updates	2025-01-25 19:41:57 +02:00
qazal	a037201168	test_viz cleanups + move to /unit directory (#8746) * test_viz cleanups + move to /unit directory * lint	2025-01-25 14:33:31 +02:00
chenyu	e2b380b743	make UOp.multi real a tuple instead of list [pr] (#8744) tuple is immutable. also updated test_rand_like_from_alu test	2025-01-24 20:47:27 -05:00
chenyu	e0e176efbc	failed test case for multi rand_like [pr] (#8740) new multi broke multi device dropout	2025-01-24 13:56:51 -05:00
nimlgen	dc10187fc0	am: add am_smi (#8739) * am: start monitor * cleanups * fixes * hmm * progress * cleanup	2025-01-24 20:16:19 +03:00
George Hotz	e82ba1454b	MultiLazyBuffer is UOp [pr] (#8662) * MultiLazyBuffer is UOp [pr] * this is new mlb * this is the idea * progress * multitensor works * more movement ops * this * MultiLazyBuffer is UOp * cleanups * multi axis * fix more tests * work * not that * add multi grad and move shard to ops * mops not views * no double contig * sweet, all mt tests passing * port old logic * remove lbs * fix realized * whitespace * assign tweak * test_assign_kv_cache_multi passes * fix is_realized * fix JIT for multi * just a few more lines i'll pay them back soon i swear please bro just a few more * no split reduceop for multi	2025-01-24 13:28:55 +09:00
qazal	8e5bd0cd7a	fix buffer init and skip test_swizzle_failure_permute [pr] (#8732) * fix buffer init and skip test_swizzle_failure_permute [pr] * replace preload with just load * add	2025-01-23 17:21:38 +02:00
nimlgen	e4512baea4	am: cleanup mm (#8730) * am: cleanup mm * cle * ops * entries	2025-01-23 15:49:37 +03:00
qazal	07ec99001a	keep VIEW in big_sink + copy of buffer view spec [pr] (#8727) * keep views in sink [pr] * tests * things from the gpt2 bug	2025-01-23 11:29:30 +02:00
qazal	6cb74bb630	fix using clone with shrink [pr] (#8724) * fix using clone with shrink [pr] * remove extra arg, add test_clone_with_shrink_realized	2025-01-23 08:28:07 +02:00
qazal	907dfa0e82	image buffer realization spec [pr] (#8420) * image buffer realization spec [pr] * redo the spec * work	2025-01-22 20:25:22 +02:00
nimlgen	93fb50ce77	allreduce: add flags (#8713)	2025-01-22 17:44:31 +03:00
qazal	2dae467b75	scheduler + process_replay import cleanup (#8711)	2025-01-22 12:44:07 +02:00
qazal	e3d1464ba4	move assign preload out of schedule item [pr] (#8710) * move assign preload out of schedule item [pr] * fix that	2025-01-22 12:43:57 +02:00
nimlgen	c5e46c5eee	am: recover from any boot interrupt (#8703) * am: recover from any load interrupt * add fuzzer * nu	2025-01-21 22:22:23 +03:00
George Hotz	018edd934b	don't use view in copy [pr] (#8704) * don't use view in copy [pr] * oh, remove double contig * fix reps	2025-01-21 09:57:47 -08:00
qazal	d6bf1feaab	remove the "no copy" line from copy_to_device (#8702) * delete the no copy one * add tests	2025-01-21 17:09:33 +02:00
nimlgen	3628f89929	fix deallocate for subbuffers (#8701) * fix deallocate for subbuffers * forgot this * rm name * hmm	2025-01-21 16:34:19 +03:00
qazal	f0d424ecdf	Tensor UOps can become a buffer or const after scheduling (#8698) * spec * work * update test_viewed_consts_do_not_realize * remove	2025-01-21 12:33:19 +02:00
qazal	e2008c98c3	allow symbolic shape in tensor const parents [pr] (#8699)	2025-01-21 12:01:25 +02:00
qazal	66ac0087e8	more high level contiguous tests + scheduler deletions [pr] (#8695) * delete those * move the upat too * rename ops_folding to just sym * keep that	2025-01-21 01:52:58 +02:00
qazal	08eb1f1f56	simplify tensors before scheduling [pr] (#8580) * delete forced_realize * put that back * work * remove forced_realize * expectedFailures * contiguous(buffer) * multi * expectedFailures * cleaner create_subbuffer * more comments * remove that * note * realizes * work * one upat and image is back * remove * cleaner * fix test_complex_backward for now --------- Co-authored-by: George Hotz <geohot@gmail.com>	2025-01-20 23:42:42 +02:00
qazal	02ad450e22	add failing assert for gradient realization [pr] (#8692)	2025-01-20 22:50:09 +02:00
Sieds Lykles	1a15c0e89d	Move define_acc down an unrolled add chain (#8404) * Move define_acc down an unrolled add chain * Prevent possible infinite recursion * Add test * Fix typo in test * Move mulacc_unrolled to devoctorize + load_store_indexing pass * Add test for mulacc_unrolled by itself * undo formatter * import from ops, not rewriter * Add a const version --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-20 14:56:27 -05:00
geohotstan	dd82b4c913	make onnx runner a class (#8647) * this * clean up * more clean ups and improve debug msg * more correct training toggler * remove manual training toggling * change some variable names * actually just add the training toggle for LIMIT envvar too * more refinement * __call__ and OnnxRunner * fix half pylint, other half is importing from onnx while this file is onnx.py, figure out later * ahhhh found another mistake * remove limit from __call__ --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-20 10:11:05 -08:00
George Hotz	46a8c5e1e5	delete forced_realize (#8615) * delete forced_realize * put that back * expectedFailures * cleaner create_subbuffer * more comments --------- Co-authored-by: qazal <qazal.software@gmail.com> Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2025-01-20 09:40:36 -08:00
chenyu	679b1ad058	move softmax upcast to after subtracting max (#8684) * move softmax upcast to after subtracting max max can always be done in the same dtype without any numerical loss, so this is better when explicitly upcasting in softmax * skipUnless half	2025-01-20 12:16:32 -05:00
nimlgen	08ca871d77	am: remove pm block (#8688) * am: remove pm block * hm * oops	2025-01-20 18:05:22 +03:00
nimlgen	9d3c40601f	am: fast memory manager (#8654) * start * progress * fixes * smth * mini fixes * fix2 * ugh, need this for now * faster * cleanups * tiny linters * make mypy happier * test & free pts * ops * linter * cleanup vm * fix * remove map_from * tiny fixes * add test to ci	2025-01-20 16:58:22 +03:00
qazal	9e55495b4d	fold double contiguous [pr] (#8687)	2025-01-20 14:38:33 +02:00
qazal	ed63ff2372	Remove contiguous on buffer (#8676) * remove contiguous on buffer * spec * make things that can't be images not images	2025-01-20 13:48:33 +02:00
qazal	3499a2c72d	start moving image things to rewrite rules (#8678) * start moving image things to rewrite rules [pr] * that too * as expected * fix * Revert "fix" This reverts commit `fd03c9464b`.	2025-01-20 13:34:29 +02:00

3283 Commits