tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-15 01:48:23 -05:00

Author	SHA1	Message	Date
George Hotz	f0c178b7e9	move get_contraction to helpers (#3162 ) * move get_contraction to helpers * move simplify * lines * to_movement_ops is not generic	2024-01-17 19:13:11 -08:00
George Hotz	a464909d79	fast resnet eval (#3135 ) * fast resnet eval * fix HIP multidevice graph * neater expression for devices * lines * add decorator test	2024-01-15 14:15:18 -08:00
Paul Gustafson	6bb65cd02e	fix off-by-one error in st_equal (#3131 ) * fix off by one error * whitespace	2024-01-15 11:32:13 -08:00
chenyu	c658aa4fbf	minor cleanup of test_disk_tensor (#3112 )	2024-01-13 20:54:58 -05:00
chenyu	a300fea2a4	failed test case due to cast resets shapetracker (#3109 ) cast implicitly resets shapetracker and makes it contiguous (for disk tensor), which fails for Interpreted backend if inputs contain non-contiguous st.	2024-01-13 12:46:51 -05:00
chenyu	f018a55ea1	update NumNode.__hash__ to be hash(self.b) (#3105 ) with this, `a:=NumNode(x) == b` implies `hash(a) == hash(b)`	2024-01-12 19:46:21 -05:00
chenyu	dab8214103	unit tests for Device.canonicalize (#3055 )	2024-01-09 12:47:20 -05:00
George Hotz	655c6f61d3	St real size (#3046 ) * track the size in the lazybuffer * shapetracker real size * lint	2024-01-08 14:44:53 -08:00
George Hotz	c003be7309	Revert "track size in shapetracker" (#3043 ) * Revert "track size in shapetracker (#3026)" This reverts commit `a8ba1ac08f`. * st.size	2024-01-08 13:13:39 -08:00
George Hotz	a8ba1ac08f	track size in shapetracker (#3026 ) * track size in shapetracker * shapetracker adapter * size is an int * create Buffer with st.size * only compare the views for the jit * fix webgpu	2024-01-05 20:15:53 -08:00
George Hotz	60abc62a3f	fast hip read (#3014 ) * fast hip read * hip read faster * fix tests * to_mv * simplify * bump to 6k lines	2024-01-05 10:33:13 -08:00
George Hotz	9699c8c90b	don't alloc for InterpretedASTRunner (#2999 )	2024-01-03 17:05:53 -08:00
chenyu	74cc6fd3c2	remove AndNode.__floordiv__ special case (#2996 ) * remove AndNode.__floordiv__ AndNode produces a Node that min/max is bounded by [0, 1] so `//` on top of that is almost always 0. we don't really use that either * keep the test	2024-01-03 17:44:55 -05:00
chenyu	ff5399f053	move one last dtype test from test_helpers to test_dtype (#2975 )	2024-01-02 12:37:56 -05:00
George Hotz	a280cfe169	move dtypes to dtype.py (#2964 ) * move dtypes to dtype.py * fix urllib	2024-01-01 14:58:48 -08:00
George Hotz	c81ce9643d	move globalcounters to ops (#2960 ) * move globalcounters to ops * missed a few * sick of that failing	2024-01-01 14:21:02 -08:00
chenyu	8291986959	Variable.sum -> Node.sum, Variable.ands -> Node.ands (#2961 )	2024-01-01 16:21:28 -05:00
chenyu	3d720b5761	move expand_idx, iter_idxs and expand_node from symbolic to linearizer (#2959 )	2024-01-01 14:41:21 -05:00
George Hotz	5cac6338a4	apply the multitensor optimizations in lazy.py (#2901 ) * apply the multitensor optimizations in lazy.py * less lines * hack for webgpu * save a line	2023-12-21 13:55:49 -08:00
George Hotz	1765849937	new lazy, benchmark (#2878 ) * lazy rewrite, try 2 * min fix tests * pass contig test * put broken pads back * move that to realize * no contig child fixes array packing * so wrong * now that's correct * base children * fix bind issues * disable to_image_idx * fix tests * that failure shouldn't break other tests * more fixes * fix torch * skip failing tests in CI * 1e-7 * half is broken * 1e-6 margin of error	2023-12-20 14:33:21 -08:00
Peter Cawley	dae8976889	Fix reshape merging with masks (#2877 )	2023-12-20 14:00:58 -08:00
George Hotz	ca59054463	fix shapetracker math (#2861 ) * proper test * all st math good now * fix real_strides bug	2023-12-19 22:17:34 -08:00
chenyu	5a739e8c20	update one skipped pad_reshape test that was fine (#2860 ) * update one skipped pad_reshape test that was fine had a typo * this one passed	2023-12-19 23:25:52 -05:00
chenyu	ad233d557f	disable reshape merging with masks (#2858 ) fuzzer found a bug, and it's not complete	2023-12-19 19:06:16 -05:00
Oleg Rybalko	42a038c83f	More readable torch_load ext check (#2853 ) * more readable extension check * enable tarfile test * detach tensor if requires grad in torch	2023-12-19 14:53:15 -05:00
George Hotz	b2192b5400	minor improvements (#2845 )	2023-12-18 22:09:08 -08:00
George Hotz	d086325b1b	hotfix: failing tests	2023-12-18 21:12:42 -08:00
George Hotz	b6d71b131e	hotfix: push broken tests	2023-12-18 21:08:42 -08:00
George Hotz	80f53245e8	shapetracker add and invert (#2828 ) * invert (broken) * decent invert * shapetracker invert works * plus is meh, invert is good * support invert mask * a few more invert tests * shapetracker math invert test	2023-12-18 16:03:27 -08:00
chenyu	b4fa189c8c	Revert "Revert "Make Tensor creation allow multi-dim list of int and bool (#2793 )" (#2810 )" (#2813 ) This reverts commit `71a60762ed`.	2023-12-17 11:48:27 -05:00
chenyu	71a60762ed	Revert "Make Tensor creation allow multi-dim list of int and bool (#2793 )" (#2810 ) This reverts commit `798bf813b1`.	2023-12-17 02:03:52 -05:00
geohotstan	798bf813b1	Make Tensor creation allow multi-dim list of int and bool (#2793 ) * the universe is flat as a 2D tensor * try this * TESTS * less lines in test * don't change all_int since other places use it * add tests and del noqa by making non-aesthetic spacing LOOOOOL * some reordering * fixed empty list and add tests * more tests * add list bool tensors * clearer with least lines added * added bool * oops * more tests * improved tests * oops	2023-12-17 01:58:10 -05:00
George Hotz	877c78b4ce	lazy tests (#2796 ) * tests * mini sd is very mini	2023-12-16 08:24:21 -08:00
chenyu	5235cdee3d	remove _arg_int32 internal type (#2767 ) in DEFINE_GLOBAL, PtrDtype(int32) is buffer and int32 is int	2023-12-14 14:17:14 -05:00
George Hotz	7e5b3e53fe	changes to prep for new lazy (#2748 ) * changes to prep for new lazy * put those back	2023-12-13 10:28:22 -08:00
Umut Zengin	8ad7cfeeb1	More simplification in to_image_idx and symbolic (#2679 ) * less valid * add test --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2023-12-13 12:30:44 -05:00
George Hotz	6d6eb9302d	ruff checks the max line length is 150 (#2734 ) * ruff checks the max line length is 150 * fix tensor.py * a lot more * done	2023-12-12 17:34:47 -08:00
Guy Leroy	ee9e1d3662	Extend available types for `safe_save` (#2720 ) * Extend available types to save with * Linter fix	2023-12-11 14:50:35 -08:00
George Hotz	0fd44259cd	bf16 fix + cleanups from mixtral (#2698 ) * bf16 fix + cleanups from mixtral * generic bf16 cast	2023-12-10 16:31:52 -08:00
qazal	73b067f5ce	Bitcast p2 bfloat16 tests + clang fix (#2635 ) * add bf16 test support this model takes me almost a minute to download though: https://huggingface.co/TinyPixel/Llama-2-7B-bf16-sharded/resolve/main/pytorch_model-00001-of-00014.bin?download=true: 100%\|█████████████████████████████\| 981M/981M [00:40<00:00, 24.2MB/s] * ensure we first load if it is bitcast to avoid taking the address of an rvalue * tiny bf16 in the cloud skip GPU * should skip torch lint * Revert "ensure we first load if it is bitcast to avoid taking the address of an rvalue" This reverts commit `b86a28ab84`. * break the kernel * skip LLVM and GPU in CI * skip CUDA	2023-12-08 10:30:10 -08:00
chenyu	b931a20882	minor shapetracker cleanup (#2652 )	2023-12-06 11:43:52 -05:00
Amrit Sahu	71d989b476	adding test to cover #2644 failure (#2645 )	2023-12-06 11:00:30 -05:00
George Hotz	232ed2af3f	more test cleanups (#2631 ) * more test cleanups * move test example back	2023-12-05 16:17:57 -08:00
George Hotz	35b5e95097	parallel beam search (#2610 ) * better print * fix beam search with vars * cleanups * parallel is not default * restore that * bugfix * cleanups * bugfix	2023-12-05 10:09:45 -08:00
chenyu	dd8b4632a4	regression test for reshape fix #2616 (#2620 )	2023-12-05 11:46:33 -05:00
chenyu	c257a0dd99	minor reshape cleanups (#2619 ) * minor reshape cleanups * mea culpa	2023-12-05 11:23:17 -05:00
Amrit Sahu	e8d6a6ef2e	view.reshape without symbolic (#2218 ) * handle reshape of contiguous subparts with explicit mask * remove the add/remove ones logic in reshape * accomodate ones in accumulate logic * make multiply commutative * fix linting * make mypy happy * add test for commutative mul * merge dimensions in shape_strides for 1 range masks * add offsets for merging * fix linting * add back explicit 1 reshapes * fix mypy errors * fix accumulate by includng state * include non-zero stride dimension in acc * small cleanup * more compact to_shape_strides * more logical cleanup * compress more * compress reshape mask * adding some comments * small bug fix * improve test coverage * remove explicit add remove ones * small bug in test * enable test_reshape_splitting_combining * small fix * 10 lines less to_shape_strides * shorten reshape mask * some more cleanup * more cleanup * introduce some symbols for compactness * more symbols * more cleaner * lessen symbols, it became less readable * remove merge_views from view.reshape * change to_shape_strides to _merge_dims * improve readability * fix corner case * cleanup * better handling of 1 <= Variable('i',1,10) & new_dim = Variable('i',1,10) * rewrite _reshape_mask for readability * fix white space * add comment * nice shorthands for readability * add proof in docs * small nit --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2023-12-04 12:46:53 -05:00
chenyu	e9426f4fe4	simpler get_contraction (#2552 ) * simpler get_contraction * and test	2023-12-01 18:02:52 -05:00
George Hotz	2c363b5f0b	new style device (#2530 ) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
George Hotz	d87a246439	move to new cached fetch (#2493 ) * move to new cached fetch * extra.utils is over * loads * bump download cache * bump timeout	2023-11-28 17:36:55 -08:00

952 Commits