tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-25 14:58:46 -05:00

Author	SHA1	Message	Date
qazal	aec4c4f01b	linearizer ast as a tuple of lazyops (#3689 ) * multi store op linearizer * currently we do only one output per kernel * named opts	2024-03-11 15:39:04 -07:00
Jungwan Woo	e5ee6bb2bd	fix outdated url in showcase doc (#3624 )	2024-03-05 14:44:40 -08:00
geohotstan	9268a8b154	remove MULACC (#3459 ) * init * removed mulacc * is uoptimize the problem? * lol hax make work temporarily fix l8er * revert extra/ changes * clean up * flaky metal tests? * add back mulacc for metal * revert last commit * try skipping linearizer_failure tests * skip flammit tests... cuz tests all work locally * try narrow down exact linearizer failure test * try 2 * try 4 * generated code is the exact same wtf why CI fails * code for 15 and 17 are exact same with or without mulacc, this should pass * try only 1 failure * try garbage collecting lol... * try del variables lol * try gcing after del lol... * is diskcache the problem??? * try disabling opts cache idk * try remove hack * try disable github metal cache... * try CACHELEVEL=0 :D idk anymore * try increase newCommandQueueWithMaxCommandBufferCount_, im almost out of ideas... * revert * actually not a HACK * oops	2024-02-29 07:40:40 -05:00
Caleb Bunch	b41761488d	change specific string 'CLANG' to DEVICE variable in abstractions2.py (#3488 )	2024-02-24 07:51:39 -05:00
qazal	7864fb69d1	delete MovementOps (#3434 ) * delete MovementOps * keep extra/to_movement_ops.py	2024-02-19 23:21:44 +01:00
Daniel Yeh	0a4029c519	fix path to models folder (#3442 ) Co-authored-by: Chen-Chen Yeh <ge96noj@mytum.de>	2024-02-19 13:35:57 +01:00
xarkes	28a8b72024	Remove Interpreted device & remaining CPU/TORCH ref (#3423 ) * Remove Interpreted device & remaining CPU/TORCH ref * Oops * supports_device was useful * Fix doc wording --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-02-16 00:30:21 -05:00
George Hotz	b1c0d8c99d	remove cpu and torch backends (#3399 ) * remove cpu and torch backends * don't copy to cpu * use clang instead of cpu * multitensor gathers on the first device * clang is cpu + use default * fixup * bugfix	2024-02-15 16:55:39 +01:00
George Hotz	a40df14fef	ops_ext to replace cpu import (#3409 ) * ops_ext to replace cpu import * don't allow zero copy with as buffer * memoryview(bytearray * reenable test * fix jit issue	2024-02-15 13:03:42 +01:00
George Hotz	6356474d6d	Revert "ops_ext to replace cpu import (#3406 )" (#3408 ) This reverts commit `91eb93f85a`.	2024-02-15 12:16:10 +01:00
George Hotz	91eb93f85a	ops_ext to replace cpu import (#3406 ) * ops_ext to replace cpu import * don't allow zero copy with as buffer * memoryview(bytearray * reenable test	2024-02-15 12:14:58 +01:00
George Hotz	ce1f9f5556	hotfix: new linearizer docs	2024-02-12 18:56:30 +01:00
George Hotz	2e60012bcf	move create schedule and delete old API (#3377 ) * move create schedule and delete old API * fix test multitensor	2024-02-12 18:10:45 +01:00
George Hotz	41efaa848c	move graph.py and jit.py into features (#3376 ) * move graph.py into features * move jit into features * fix quickstart	2024-02-12 17:34:34 +01:00
Mason Mahaffey	3ebf7a3e38	reflect changes to shapetracker in doc printouts (#3349 )	2024-02-08 16:20:30 +01:00
George Hotz	3c728d1082	compiler support (#3260 ) * compiler support * revert that * fix tests	2024-01-26 23:36:40 -08:00
chenyu	1b508e0f71	fix fuzz_linearizer toCPU to as_buffer (#3158 )	2024-01-17 13:18:46 -05:00
George Hotz	e4528543fa	remove LLVMOPT	2024-01-15 16:01:09 -08:00
chenyu	e39cd3e7f2	update env_vars.md (#3127 ) mostly removed deprecated ones. not clear how to maintain this especially for extra/examples	2024-01-15 01:06:56 -05:00
George Hotz	1f9aee8b6f	remove numpy from device (#3123 ) * remove numpy from device * fix tests * np item * cleanups * simplify with as_buffer * no toCPU * tinygradic * cast to scalar	2024-01-14 19:36:05 -08:00
George Hotz	ea5824657d	move fromcpu out of lazy.py (#3122 ) * move fromcpu out of lazy.py * fix abstractions2	2024-01-14 18:21:08 -08:00
George Hotz	a280cfe169	move dtypes to dtype.py (#2964 ) * move dtypes to dtype.py * fix urllib	2024-01-01 14:58:48 -08:00
chenyu	3ba591c3fd	less outdated abstraction.py (#2917 ) removed some old terms and updated types and code pointers	2023-12-22 15:31:02 -05:00
chenyu	50927defad	s/lazydata.realized/lazydata.base.realized/g (#2914 ) * s/lazydata.realized/lazydata.base.realized/g * not that	2023-12-22 14:45:13 -05:00
George Hotz	1765849937	new lazy, benchmark (#2878 ) * lazy rewrite, try 2 * min fix tests * pass contig test * put broken pads back * move that to realize * no contig child fixes array packing * so wrong * now that's correct * base children * fix bind issues * disable to_image_idx * fix tests * that failure shouldn't break other tests * more fixes * fix torch * skip failing tests in CI * 1e-7 * half is broken * 1e-6 margin of error	2023-12-20 14:33:21 -08:00
George Hotz	00d9eda961	FROM -> COPY, move vars_from_ast (#2675 )	2023-12-07 16:32:30 -08:00
chenyu	9996f1adf9	no document prs (#2622 )	2023-12-05 13:05:36 -05:00
Amrit Sahu	e8d6a6ef2e	view.reshape without symbolic (#2218 ) * handle reshape of contiguous subparts with explicit mask * remove the add/remove ones logic in reshape * accomodate ones in accumulate logic * make multiply commutative * fix linting * make mypy happy * add test for commutative mul * merge dimensions in shape_strides for 1 range masks * add offsets for merging * fix linting * add back explicit 1 reshapes * fix mypy errors * fix accumulate by includng state * include non-zero stride dimension in acc * small cleanup * more compact to_shape_strides * more logical cleanup * compress more * compress reshape mask * adding some comments * small bug fix * improve test coverage * remove explicit add remove ones * small bug in test * enable test_reshape_splitting_combining * small fix * 10 lines less to_shape_strides * shorten reshape mask * some more cleanup * more cleanup * introduce some symbols for compactness * more symbols * more cleaner * lessen symbols, it became less readable * remove merge_views from view.reshape * change to_shape_strides to _merge_dims * improve readability * fix corner case * cleanup * better handling of 1 <= Variable('i',1,10) & new_dim = Variable('i',1,10) * rewrite _reshape_mask for readability * fix white space * add comment * nice shorthands for readability * add proof in docs * small nit --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2023-12-04 12:46:53 -05:00
George Hotz	d6b404ac11	No dtype alloc (#2570 ) * fix all allocs * improve docs * ugh fix fake alloc	2023-12-02 13:29:40 -08:00
George Hotz	5068e99d18	refactor to remove extra kernel params (#2563 ) * refactor to have compiled kernel * bugfixes * docs/beautiful.py * revert that * fix tests	2023-12-02 00:32:25 -08:00
George Hotz	6733425095	lower schedule (#2559 ) * lower schedule * remove RAND, and don't put load in the JIT yet * better fix for that test	2023-12-01 19:17:46 -08:00
wozeparrot	28183c7438	feat: reword (#2549 )	2023-12-01 10:56:18 -08:00
chenyu	7fec966b5e	bye bye NOOP (#2534 ) * bye bye NOOP * SIN * NEG	2023-11-30 23:10:35 -08:00
George Hotz	2c363b5f0b	new style device (#2530 ) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
Yingbo Ma	d43485ae9e	Fix `graph_uops` (#2457 ) * Load networkx when we need to graph uops * Document GRAPHUOPS * import nx in `graph_uops`	2023-11-27 18:42:48 -08:00
George Hotz	9e07824542	move device to device.py (#2466 ) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
chenyu	c4dfde761e	remove the commented import (#2463 )	2023-11-27 11:50:41 -05:00
George Hotz	4da2ddea6e	Interpreted cleanups (#2312 ) * move the compiler out of ops * don't return realized * var_vals filter, fix custom * typing	2023-11-15 09:02:23 -08:00
chenyu	a753c8e071	examples of new GPT2 and JIT change (#2261 ) * var_vals are global * working with global ish * better * fix export model * fix tests * better kv cache * does it run? * use where for kvmask * fix excessive var_vals * fix import * how does multigpu use this? * llama kinda work * faster and simpler * cleanup * fix conversation mode * test cleanups * fix one more test * test cleanup --------- Co-authored-by: George Hotz <geohot@gmail.com>	2023-11-10 15:07:02 -05:00
George Hotz	0c9b4ab885	no to_underlying (#2222 ) * no to_underlying * context is no longer used * no more optimizing * update docs	2023-11-05 21:34:20 -08:00
George Hotz	f17bc16f46	simple runtime args (#2211 ) * simple runtime args * fix some tests * fix abstractions and triton * fix search	2023-11-03 12:31:29 -07:00
George Hotz	03cf0afa4f	move all to compile api (#2203 ) * move metal+clang to compile api * all to the new style * remove binary arg * fix triton * fixup tests * fix clang * diskcache is generic * __wrapped__ * compile_gpu * fix thneed * keep the src in the ASTRunner * lib * move compile_gpu * compile_gpu in device * put compiler in astrunner * test reverts * triton compiler * ugh, that too	2023-11-01 23:01:32 -07:00
chenyu	5d5921d2c8	small doc env update (#2112 )	2023-10-18 14:49:25 -07:00
George Hotz	c36d306606	KOPT is over, BEAM is upstream (#2071 ) * create cache for q learning * make linter happy * global beam * where it belongs * bugfix * ditch the kopt, use the beam * faster lin and DEBUG=2 okay * remove kopt, move search to features	2023-10-16 09:46:03 -07:00
George Hotz	121f7aa8c5	Schedule item (#2012 ) * ScheduleItem * put var_vals in the schedule * fix tests, wow that proliferated quickly * not ready to be in the schedule	2023-10-07 08:59:25 -07:00
Roelof van Dijk	972d9ea215	fix: PRUNEGRAPH is unused (#1985 )	2023-10-05 14:28:43 -07:00
George Hotz	de5d603ec1	corealize + remove realize from lazybuffer (#1968 ) * corealize + remove realize from lazybuffer * fix multigpu * fix graph	2023-10-04 10:59:31 -07:00
nimlgen	2ea1dd3e87	no process() in Linearizer (#1966 ) * no process() in Linearizer * more process() clean up	2023-10-04 07:18:42 -07:00
George Hotz	0945848b5f	schedule the loadops like everything else (#1964 ) * schedule the loadops like everything else * unify loadops with other things we schedule * delete all the ops * fix symbolic jit	2023-10-04 02:36:04 -07:00
Yixiang Gao	094d3d71be	with Tensor.train() (#1935 ) * add with.train * remove the rest TODOs * fix pyflake * fix pyflake error * fix mypy	2023-09-28 18:02:31 -07:00

120 Commits