tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-08 05:35:11 -05:00

Author	SHA1	Message	Date
George Hotz	d6b404ac11	No dtype alloc (#2570) * fix all allocs * improve docs * ugh fix fake alloc	2023-12-02 13:29:40 -08:00
chenyu	c8774713c5	lazy cleanup (#2567)	2023-12-02 13:21:43 -05:00
George Hotz	5068e99d18	refactor to remove extra kernel params (#2563) * refactor to have compiled kernel * bugfixes * docs/beautiful.py * revert that * fix tests	2023-12-02 00:32:25 -08:00
George Hotz	27481b9206	Switch ops_gpu -> gpuctypes (#2532) * ops_gpu is go * fix size 0 * fix image, and add more tests * nerf openpilot test, doesn't test thneed * run the schedule * better * oops, new inputs * delete pyopencl * Update ops_gpu.py	2023-12-01 22:30:21 -08:00
George Hotz	6733425095	lower schedule (#2559) * lower schedule * remove RAND, and don't put load in the JIT yet * better fix for that test	2023-12-01 19:17:46 -08:00
Christopher Mauri Milan	077567f62d	Remove as_buffer for TORCH (#2554) * remove as_buffer for torch * enable torch zerocopy if on cpu * remove as_buffer even on torch:cpu	2023-12-01 18:51:38 -08:00
chenyu	86fbd413f3	update test_real_world configs (#2557)	2023-12-01 20:03:52 -05:00
andresgit	00523d5656	New fix accessing elements created by padding (#2529) * pad slice test cases, many failing * fix failing test cases check mask if we are outside the base buffer also create a multi-view if in that case we reshape to an empty shape * real_offset calculation more readable --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2023-12-01 19:08:10 -05:00
chenyu	67f4e03724	rewrite 0 size loadop into a CONST (#2556) * rewrite 0 size loadop into a CONST * check alloc size * EMPTY is better * Revert "EMPTY is better" This reverts commit 574fe0f9ed28f1b97da5a81afdfd2cd5d9a94ff9. * no ast is created * fix test	2023-12-01 18:29:06 -05:00
George Hotz	4447188051	gate METAL_FAST_LOAD	2023-12-01 15:28:40 -08:00
chenyu	e9426f4fe4	simpler get_contraction (#2552) * simpler get_contraction * and test	2023-12-01 18:02:52 -05:00
George Hotz	f5de21e753	fast path for copy (#2548) * fast copy * ruff first * flat_mv on malloc * order + webgpu test	2023-12-01 11:34:47 -08:00
George Hotz	12fa846122	zero copy (#2531) * zero copy * zero copy test * loads coder in milliseconds * zero copy for cpu and torch * src_from_buffer is None * SLOW_METAL_COPY there	2023-11-30 18:38:41 -08:00
George Hotz	2c363b5f0b	new style device (#2530) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
chenyu	7d26452305	call ruff with --preview (#2522) some checks are ignored without --preview	2023-11-30 13:59:00 -05:00
chenyu	5db0cdfbd3	support list of ints (or other Tensorable) in tensor indices (#2520) * support list of ints (or other Tensorable) in tensor indices * enable some index test cases	2023-11-30 12:46:33 -05:00
chenyu	bd941a0df1	first version of test_indexing (#2515) * first version of test_indexing * move to test/imported	2023-11-30 00:03:59 -05:00
qazal	370cfbb957	Cleanup vectorized hip renders (#2497) * add typedefs and make_dtypen functions use ext_vector_type for half16 kernels * remove the old test_render because we just use whatever cstyle has * align vectors	2023-11-29 14:02:12 -08:00
George Hotz	065aff747e	make webgpu test reliable (#2502) * remove retry that doesn't work * fix cleanup * process exit in cleanup * add space	2023-11-29 10:02:24 -08:00
George Hotz	6707f2588e	use copyin (#2500) * it's always copyin * all RawBuffer are RawBufferCopyIn * cleanups * this fixes it * requirements='C' * more correct	2023-11-29 09:34:00 -08:00
chenyu	3eb3c74675	metal ci tests everything (#2499) * metal ci tests everything * pretty good * METAL	2023-11-29 12:04:37 -05:00
George Hotz	889acefe85	Support weird loads in Image (#2498) * image support weird loads * umm, that was always wrong * openpilot compile fails with a weird error * image test passes * we have valids now * clean that up * no more required opts * add fastvits test, fix bug * minor cleanups	2023-11-29 08:30:46 -08:00
George Hotz	5629fc368c	Use Buffer.STORE at the end of ASTs (#2494) * work * store broken * interpreteds work * this passes * symbolic cpu * fix tests * fix opt tests * images fail * fix InterpretedFlopCounter * stupid hack for images	2023-11-28 20:11:37 -08:00
Liam	cf0c9096a9	Removing METAL Skips as CI works (#2488) * Test metal CI * remove metal and CI restrictions * enable dtype tests for metal ci	2023-11-28 19:46:59 -08:00
George Hotz	d87a246439	move to new cached fetch (#2493) * move to new cached fetch * extra.utils is over * loads * bump download cache * bump timeout	2023-11-28 17:36:55 -08:00
George Hotz	ab5d14d4ba	MEM -> LOAD (#2492) * MEM -> LOAD * keep legacy working	2023-11-28 16:46:37 -08:00
chenyu	847f0a02b1	non-simplifiable mod should result in ModNode (#2490) * non-simplifiable mod should result in ModNode * space	2023-11-28 16:52:19 -05:00
mmmkkaaayy	ddb6a33ae5	improve test assertions for jit cache len with graph executor (#2476) * improve test assertions for jit cache len with graph executor * delete newline * unused import * another unused import	2023-11-27 23:02:45 -08:00
chenyu	28a67106ca	enable symbolic ops tests for hip (#2485)	2023-11-27 22:33:41 -08:00
Christopher Mauri Milan	7f01dd04f0	Apply ruff linting rules to tests (#2473) * everything except F821 * enable F821 with noqa * dumb fix * fix remaining imports and (former) lambdas * replace _ with noqa to avoid gc	2023-11-27 21:24:06 -08:00
Davi Silva	136dbd8b36	HIP CI that compiles (to RDNA3) but doesn't have to run (#2482) * hip amd compilation * gate the test properly * cleanup unused import * remove superfluous numpy conversion * add SpeedyNet tests (f32 [passes] & f16 [fails]) * make CI verbose (error log from hip compiler) * test the real ops_hip * Merge branch 'tinygrad:master' into ci/hip-compilation * fix CI * cleanup * really fix CI * Fix CI Three: the refixening --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-27 21:17:06 -08:00
George Hotz	acbe6d1b53	Revert "HIP compilation on CI targeting RDNA3 (#2459)" (#2481) This reverts commit `d275ff930a`.	2023-11-27 20:41:21 -08:00
qtkite	cb507a9389	Remove the toCPU copy (#2445) * Remove the rawbuffer copy in runtime/lib.py on line 44 * remove buffer view * added metadata back, oops * delayed cpu testcase * whitespace * whitespace * buffer behavior as is * Update test_jit.py	2023-11-27 20:37:13 -08:00
Davi Silva	d275ff930a	HIP compilation on CI targeting RDNA3 (#2459) * hip amd compilation * gate the test properly * cleanup unused import * remove superfluous numpy conversion * add SpeedyNet tests (f32 [passes] & f16 [fails]) * make CI verbose (error log from hip compiler) * test the real ops_hip * Merge branch 'tinygrad:master' into ci/hip-compilation * fix CI * cleanup * really fix CI	2023-11-27 20:33:11 -08:00
Paul Gustafson	98cd9e8926	Add assertion to prevent nonsense mod values (#2474)	2023-11-27 18:37:44 -08:00
qazal	e267a93124	reset seed on every run (#2468)	2023-11-27 12:55:54 -08:00
George Hotz	9e07824542	move device to device.py (#2466) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
qazal	262cd26d28	Simplify openpilot kernel (#2460) * a conditional with the same results either way is a noop * add unit test	2023-11-27 10:02:27 -08:00
chenyu	61a80a0675	asserts LtNodes of SumNode with MulNode of Nodes (#2465)	2023-11-27 12:56:59 -05:00
Paul Gustafson	1d89c018fa	Add isinstance check before gcd call in SumNode.__lt__ (#2450) * Add isinstance check before gcd call * Delete blank lines * Fix unit test typo * Delete blank lines again --------- Co-authored-by: Paul Gustafson <paul.gustafson@theambrusgroup.com>	2023-11-26 13:05:04 -08:00
George Hotz	8e9cdef61f	clean up the buffers (#2447) * clean up the buffers * remove allocate_output * functools.lru_cache is methodcache * add TestShapeTrackerSize * cache_clear * no 0 sz buffer, add _ on functions that shouldn't be imported * fix size * if -> while	2023-11-26 11:02:29 -08:00
chenyu	511310737e	test_linearizer_failures to run on all backends (#2443) * test_linearizer_failures to run on all backends * test ubuntu and cuda * failed only in CUDA CI * move asserts	2023-11-26 01:17:29 -05:00
George Hotz	9eb2746d62	fix copy issue + add regression test (#2441)	2023-11-25 14:06:08 -08:00
George Hotz	7170a9a057	coder.py can write and run code (#2439) * wip mistral * coder * touchups * cleanups * mistral cleanups * clean up cache create * download the weights, fix tests * fix llama loading * global fixup * clean up all * move llama model * cleanups * Revert "cleanups" This reverts commit `a71c5d59eb`. * fine, leave it	2023-11-25 12:27:54 -08:00
chenyu	9a5d0e70de	Device.DEFAULT instead of getenv to exclude tests (#2429)	2023-11-24 17:10:24 -05:00
George Hotz	8ff2e13550	From teeny (#2426) * changes from teenygrad work * support not supporting ImageDType/PtrDType * fixups from teeny	2023-11-24 12:50:56 -08:00
George Hotz	8f89e21fca	torch and numpy don't share ops anymore (#2412) * torch and numpy don't share ops anymore * that should be filtered out elsewhere * still const * graph + enet example cleanup * hmm, we do still need it because of symbolic	2023-11-23 16:58:10 -08:00
George Hotz	193be14b6c	that had bugs, force an order (#2411)	2023-11-23 15:52:16 -08:00
George Hotz	095e2ced61	add name support to fetch (#2407) * add name support * use fetch in gpt2 * remove requests from main lib, networkx also optional * umm, keep that assert * updates to fetch * i love the walrus so much * stop bundling mnist with tinygrad * err, https * download cache names * add DOWNLOAD_CACHE_VERSION * need env. * ugh, wrong path * replace get_child	2023-11-23 14:16:17 -08:00
qazal	b927942d58	Move HIP render logic to its dedicated place (#2394) * update HIP language * vectorized render_cast with special treatment for hip only * test coverage for all cases --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-11-23 13:03:29 -08:00

1062 Commits