Commit Graph

153 Commits

Author SHA1 Message Date
Christopher Milan
c6ba016da6 fix cuda check (#13726) 2025-12-16 18:00:09 -05:00
nimlgen
77a76d1b13 device: respect compiler ContextVars (#13523)
* device: envvars for cc

* fix

* fix

* x

* um

* fix

* remote

* em

* cleanup

* typing

* fix

* debug

* lvp?

* ugh

* singl

* rm

* lol

* fix

* ?

* this?

* why?

* rev

* mod test

* l
2025-12-02 14:42:04 +03:00
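A minimal sketch of the env-var-backed ContextVar idea behind this change (simplified, hypothetical names; not tinygrad's actual helpers): the compiler choice reads its default from the environment but can be overridden per context.

```python
import os

# hypothetical knob: a ContextVar-style flag whose default comes from an env var,
# so the device picks its compiler from the current context rather than a hardcoded choice
class ContextVar:
  def __init__(self, key:str, default:int): self.value = int(os.getenv(key, default))
  def __bool__(self): return bool(self.value)

NVCC = ContextVar("NVCC", 0)                       # hypothetical env var name
def pick_compiler() -> str: return "nvcc" if NVCC else "nvrtc"
print(pick_compiler())                             # "nvcc" only when NVCC=1 is set
```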
Christopher Milan
09f3aae169 In-tree autogen: all C libraries (#13220)
* checkout files from autogen branch

* ioctl with payload

* fix am generations

* properly fix generations

This reverts commit b2a54f4f41.

* revert discovery.h

* support pragma pack(1)

* typo

* better getter

* typo

* NVCEC0_QMDV05_00_RELEASE[01]_ENABLE

* align support

* anon handling fix

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-11-13 18:57:44 -08:00
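For context on what the in-tree autogen has to emit, here is a minimal hand-written ctypes sketch (not the actual generated code) of two cases called out above: `pragma pack(1)` and anonymous nested unions.

```python
import ctypes

# packed struct: the ctypes equivalent of `#pragma pack(1)`, no padding between fields
class packed_hdr(ctypes.Structure):
  _pack_ = 1
  _fields_ = [("magic", ctypes.c_uint32), ("flags", ctypes.c_uint8)]

# struct with an anonymous union: _anonymous_ exposes the union's fields on the struct itself
class ioctl_arg(ctypes.Structure):
  class _u(ctypes.Union):
    _fields_ = [("ptr", ctypes.c_uint64), ("handle", ctypes.c_uint32)]
  _anonymous_ = ("u",)
  _fields_ = [("op", ctypes.c_uint32), ("u", _u)]

assert ctypes.sizeof(packed_hdr) == 5    # 4 + 1 bytes, no alignment padding
arg = ioctl_arg(op=1); arg.ptr = 0xdead  # union members reachable directly on the struct
```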
wozeparrot
c3149c618a feat: nvcc compiler (#12852) 2025-10-21 11:31:23 -07:00
George Hotz
1ecf403294 cleanup long lines [pr] (#12623)
* cleanup long lines

* more

* a few more

* all noqa fixed

* fix amd + cuda

* clean that up
2025-10-12 20:18:05 +08:00
chenyu
585bd95b50 fix ruff 0.14.0 [pr] (#12547) 2025-10-09 01:52:30 -04:00
nimlgen
fb96394ff5 auto-select available compilers (#12094)
* device: auto select compilers

* fix

* metal+opencl

* nv/cuda

* test without ptx

* ptx

* fix tests

* fix

* fix test

* rename

* test + cleaner

* xx

* ops

* better test

* win?

* um?

* types

* debug

* win??

* sep rung

* wtf?

* debug

* skip win

* revert this

* types
2025-09-10 19:52:01 +03:00
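A minimal sketch of the auto-selection idea (hypothetical helper names, not the actual device code): probe which compilers are usable on the current machine and take the first hit.

```python
import shutil, ctypes.util

def has_nvrtc() -> bool:
  # usable if the NVRTC shared library can be located on this system
  return ctypes.util.find_library("nvrtc") is not None

def has_nvcc() -> bool:
  # usable if the nvcc binary is on PATH
  return shutil.which("nvcc") is not None

def select_cuda_compiler() -> str:
  # hypothetical priority order; a missing compiler is skipped instead of erroring at import time
  for name, probe in (("nvrtc", has_nvrtc), ("nvcc", has_nvcc)):
    if probe(): return name
  raise RuntimeError("no usable CUDA compiler found (tried nvrtc, nvcc)")
```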
nimlgen
75c2c42def suppress exceptions only during finalization (#11451)
* suppress exceptions only during finalization

* fix

* fix typing

* fix more warns

* fix

* better?

* Revert "better?"

This reverts commit a068aa5793.

* mm?

* no as e
2025-07-31 13:57:12 +03:00
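A minimal sketch of the finalization rule described above (simplified): a destructor only swallows cleanup errors while the interpreter is shutting down, and re-raises them otherwise.

```python
import sys

class Resource:
  def __init__(self): self.handle = object()   # stand-in for a real driver handle
  def _free(self): del self.handle             # cleanup that may fail once the runtime is torn down
  def __del__(self):
    try: self._free()
    except Exception:
      # during normal operation the error stays visible; only at interpreter
      # shutdown (when modules may already be gone) is it suppressed
      if not sys.is_finalizing(): raise
```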
nimlgen
188ed38315 replace from_mv with lightweight mv_address (#11280) 2025-07-19 13:50:51 +03:00
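One way to get a memoryview's base address with plain ctypes, as a minimal sketch (not necessarily the exact helper added here): wrap the buffer without copying and read the wrapper's address.

```python
import ctypes

def mv_address(mv: memoryview) -> int:
  # zero-copy: from_buffer shares the memoryview's storage, addressof reads where it lives
  return ctypes.addressof(ctypes.c_char.from_buffer(mv))

buf = bytearray(b"hello")
print(hex(mv_address(memoryview(buf))))   # address of the first byte of `buf`
```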
George Hotz
67a1c92fc0 remove del spam from CI (#10699)
* remove del spam from CI

* more

* preconstruct default buffer spec

* ignore those errors

* check exception

* more exception check

* skip stuff
2025-06-08 10:14:30 -07:00
Ignacio Sica
f69722dc2a refactor cuda disassemble (#10449) 2025-05-22 08:58:24 -07:00
uuuvn
7bc4864bc4 Make dev a property of Allocator (#10286)
* Make `dev` a property of `Allocator`

(this is a prereq refactor for #10285)

At least `BufferXfer.copy` accesses it assuming it's always present;
currently most devices just add this property on their own, repeating
the same code over and over again.

This is also a bit footgunny; see `RemoteAllocator`, which named this
property `device` instead of `dev`. I could obviously just change that
in one place, but doing it globally seems like a better solution (and it
reduces code duplication too).

`MallocAllocator` is a bit special, but passing `None` works just fine.

* typing

* ignore type instead of cast
2025-05-13 17:01:01 -07:00
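A minimal sketch of the shape of this refactor (simplified, hypothetical class names): the base `Allocator` owns `dev` once, so backends stop repeating the attribute and callers like `BufferXfer.copy` can rely on it being there.

```python
class Allocator:
  def __init__(self, dev=None): self.dev = dev          # MallocAllocator-style users can pass None
  def alloc(self, size:int): raise NotImplementedError

class FakeCUDAAllocator(Allocator):
  def __init__(self, dev): super().__init__(dev)        # no per-backend `self.dev = dev` boilerplate
  def alloc(self, size:int): return bytearray(size)     # placeholder for a real driver allocation

class Device:
  def __init__(self, name:str): self.name, self.allocator = name, FakeCUDAAllocator(self)

d = Device("CUDA")
assert d.allocator.dev is d   # transfer code can assume `dev` is always present
```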
Ignacio Sica
cfad139189 bump assembly debug to 7 (#9662) 2025-04-01 11:51:33 +08:00
nimlgen
1d06d61b16 from_blob for cuda (#9223)
* from_blob for cuda

* maybe docs?

* minor docs

* example

* waiting 9224

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-24 14:02:06 +03:00
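A minimal host-memory analogy of the from_blob idea (hypothetical class, no CUDA required): wrap memory that something else allocated, without copying, and keep the owner alive so it isn't freed underneath the wrapper.

```python
import ctypes

class BlobBuffer:
  def __init__(self, address:int, nbytes:int, owner=None):
    self.owner = owner                                            # keeps the external allocation alive
    self.view = (ctypes.c_ubyte * nbytes).from_address(address)   # no copy, same memory

raw = (ctypes.c_ubyte * 4)(1, 2, 3, 4)                 # pretend an external library allocated this
blob = BlobBuffer(ctypes.addressof(raw), 4, owner=raw)
print(list(blob.view))                                 # [1, 2, 3, 4]
```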
nimlgen
9bc317d5d2 mockcuda (#8503)
* init mockcuda

* run gpu ocelot

* fix

* sfixes

* disable broken tests

* linter

* these fails as well

* pylint

* mypy

* this fails on real platforms as well

* mypy please
2025-01-05 01:23:57 +03:00
nimlgen
90f1f0c9d5 eh (#8309)
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-26 13:16:34 -05:00
George Hotz
62e5d96446 more typing work [pr] (#8345) 2024-12-19 21:46:35 -08:00
George Hotz
9c77e9f9b7 replace Tuple with tuple [pr] (#8344)
* replace Tuple with tuple [pr]

* replace List with list [pr]

* replace Dict with dict [pr]

* replace Set with set [pr]
2024-12-19 21:27:56 -08:00
George Hotz
c5d458ce02 BufferSpec and ProgramSpec [pr] (#7814)
* BufferSpec and ProgramSpec [pr]

* delete preallocate, it's unused

* Revert "delete preallocate, it's unused"

This reverts commit dcfcfaccde.
2024-11-21 12:18:05 +08:00
George Hotz
6688539bc9 rename device to dev so Buffer can be Allocator [pr] (#7799)
* rename device to dev so Buffer can be Allocator [pr]

* missed those

* update the Program classes also

* more renames

* oops
2024-11-20 15:47:26 +08:00
George Hotz
d71fe7faa5 rename allocator methods to not conflict [pr] (#7788)
* rename allocator methods to not conflict [pr]

* forgot those

* transfer + offset
2024-11-20 00:10:29 +08:00
chenyu
348d37df46 a few more unused type ignore [pr] (#7568) 2024-11-06 10:17:19 -05:00
nimlgen
99fb115791 cuda correct pointer type (#7153) 2024-10-18 22:39:59 +03:00
George Hotz
ca0dca35f7 move ptx renderer [pr] (#7118) 2024-10-17 14:50:32 +08:00
Francis Lam
b0dd407cdd ops_cuda: add optional dynamic smem parameter (#6956)
* ops_cuda: add optional dynamic smem parameter

This is required to enable shared memory usage larger than 48 KB on
a per-kernel basis.

* move setting max dynamic smem size to init
2024-10-11 21:51:06 +03:00
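For reference, the driver-level opt-in this parameter maps to looks roughly like the sketch below (Python/ctypes; the enum value is from cuda.h, and `func` is assumed to be a `CUfunction` handle you already obtained):

```python
# CUfunction_attribute value from cuda.h for the per-kernel dynamic shared memory cap
CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES = 8

def enable_large_smem(libcuda, func, smem_bytes:int):
  # raise the cap once (e.g. at init); launches then pass `smem_bytes` as the
  # sharedMemBytes argument of cuLaunchKernel when they need more than 48 KB
  status = libcuda.cuFuncSetAttribute(func, CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES, smem_bytes)
  if status != 0: raise RuntimeError(f"cuFuncSetAttribute failed: CUresult {status}")

# usage sketch (needs a CUDA driver and a loaded kernel):
#   libcuda = ctypes.CDLL("libcuda.so.1"); enable_large_smem(libcuda, func, 64 * 1024)
```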
nimlgen
1903542c2d nv/cuda compilers touchup (#5759)
* nv/cuda compilers touchup

* fix cuda check + move nv disasm

* remove includes

* fix nvrtc_check
2024-07-28 00:15:28 +03:00
chenyu
9838c1a6ff update import style in runtime (#5735) 2024-07-26 14:00:23 -04:00
George Hotz
5c688560bc move CUDA/HIP compilers to their own files [run_process_replay] (#5732) 2024-07-26 10:00:15 -07:00
nimlgen
baface413a nv better nvdisasm fail message (#5682)
* nv better nvdisasm message

* cuda
2024-07-24 16:19:26 +03:00
nimlgen
b4c49ae3fa remove cudacpu in favour of mockgpu (#5225)
* remove cudacpu in favour of mockgpu

* remove unused import

* not used as well
2024-06-29 11:05:16 +03:00
nimlgen
ee02dcb98e nv supports PTX=1 (#5222)
* nv supports PTX=1

* not needed

* split nv compiler into nvrtc autogen

* remove to_c_array

* test

* Revert "test"

This reverts commit f0b56f308b.
2024-06-29 10:46:29 +03:00
chenyu
a8e9307e0b pylint runtime/ and shape/ (#5044)
As pointed out by #4877, `__init__.py` needs to be added to trigger pylint. Fixed some errors except ops_python (will do in a separate PR, it has a lot of errors) and sub-folders in runtime.
2024-06-18 19:48:18 -04:00
Roelof van Dijk
0eebb8e998 fix: _free should not return (#4880) 2024-06-08 14:45:06 +02:00
Roelof van Dijk
1785a70e77 fix: else-return on runtime (#4881)
* fix: add init file

* fix: no else-return

* fix: remove file again
2024-06-08 14:44:24 +02:00
Szymon Ożóg
f7201b6852 Remove deprecated code (#4724) 2024-05-25 03:02:12 -04:00
chenyu
286b4dbdf2 compile raise CompileError and skip only RuntimeError in multiprocess… (#4646)
* compile raise CompileError and skip only RuntimeError in multiprocess beam

renderer error with multiprocess should not be skipped by beam

* use `==` for dtype to dtype comparison

* that needs to be is

* typo
2024-05-19 00:25:25 -04:00
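A minimal sketch of the distinction this draws (hypothetical names, not the actual beam-search code): compile failures get their own exception type so the search skips only kernels that genuinely fail to compile, while renderer bugs still surface.

```python
class CompileError(RuntimeError): pass

def try_candidate(compile_fn, src:str):
  try:
    return compile_fn(src)
  except CompileError:
    return None   # this candidate just doesn't compile; skip it and keep searching
  # any other exception (e.g. a renderer error) propagates and fails the search loudly

def bad_compile(src:str): raise CompileError("kernel does not compile")
assert try_candidate(bad_compile, "src") is None
```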
George Hotz
347a3acb37 add renderer class (#4524)
* add renderer class

* tests pass

* fix pylint

* fix tensor cores
2024-05-10 21:40:02 -07:00
George Hotz
d438d5698d bring buffer back to device (#4517) 2024-05-10 11:22:31 -07:00
George Hotz
4eef1ee9bf move renderer into options (#4514)
* move renderer into options

* fix tests

* renders are functions
2024-05-10 10:01:51 -07:00
George Hotz
89e119bc58 move Allocator to buffer.py (#4502)
* move Allocator to buffer.py

* move those to realize

* memory file

* cleanup
2024-05-09 19:45:56 -07:00
George Hotz
9fc4465557 subbuffer support (#4397)
* subbuffer support

* diskbuffer offset

* cuda subbuffer works

* use subbuffer

* more subbuffer tests

* consecutive

* cast

* consec

* offset

* view is a better name

* offset is in nbytes

* fix view + memory planner

* delete unused DiskRunner

* reverse order

* no subbuffers on unrealized consts

* only enabled for disk

* don't reverse memory

* view supported devices

* pickle buffer view

* ring jit

* support extra view inputs in jit

* fix JIT=2 issue

* test copy jit

* p2p isn't an option anymore

* fix dep tracking issue

* fix mypy

* fix pickle

* from_nv is contents now
2024-05-03 18:05:57 -07:00
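A minimal host-memory sketch of the view idea above (simplified): a sub-buffer is a zero-copy window into its parent at a byte offset, so writes through the view are visible in the parent and the parent owns the memory.

```python
class HostBuffer:
  def __init__(self, nbytes:int): self.nbytes, self.mem = nbytes, memoryview(bytearray(nbytes))
  def view(self, offset:int, nbytes:int):
    assert offset + nbytes <= self.nbytes     # the offset is in bytes, as noted above
    return HostBufferView(self, offset, nbytes)

class HostBufferView:
  def __init__(self, base, offset:int, nbytes:int):
    self.base, self.mem = base, base.mem[offset:offset + nbytes]   # no copy, shared storage

buf = HostBuffer(16)
v = buf.view(4, 4)
v.mem[0] = 0xff
assert buf.mem[4] == 0xff   # the view aliases the parent's memory
```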
George Hotz
60e3aa5cb1 more docs (#4271)
* more work on docs

* CompilerOptions is dataclass
2024-04-24 10:52:42 +08:00
Micah Zoltu
7bc862767c Improves error message when CUDA module fails to load. (#4243) 2024-04-21 11:10:14 -04:00
nimlgen
5a57b48134 cuda p2p enable when available (#4153) 2024-04-12 16:21:54 +03:00
George Hotz
af5984df43 cudagraph memcpy through host (#4137) 2024-04-10 13:17:17 -07:00
chenyu
1de9778949 import Buffer and BufferOption from tinygrad.buffer (#4076) 2024-04-04 22:12:23 -04:00
chenyu
b47f6cebb2 LinearizerOptions -> CompilerOptions (#3978) 2024-03-28 17:50:23 -04:00
nimlgen
e2d6f76723 _alloc and _free with options (#3934)
* _alloc has options

* linter

* fix hsa
2024-03-26 09:11:41 -07:00
nimlgen
739f47eb0f check on cuEventSynchronize (#3933) 2024-03-26 16:14:38 +03:00
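A minimal sketch of the check pattern this relies on (hypothetical wrapper): every driver call that returns a `CUresult` goes through one helper, so a failing `cuEventSynchronize` raises instead of being silently ignored.

```python
def check(status:int, name:str="cuda call"):
  # CUDA driver calls return 0 (CUDA_SUCCESS) on success; anything else is an error code
  if status != 0: raise RuntimeError(f"{name} failed with CUresult {status}")

# usage sketch: check(libcuda.cuEventSynchronize(event), "cuEventSynchronize")
```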
nimlgen
f2a9ea4ea9 lru allocator for copyin host buffers (#3918)
* lru allocator for copyin host buffers

* linter happy
2024-03-25 15:57:18 +03:00
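A minimal sketch of the LRU idea (simplified, plain host memory): freed copyin staging buffers go into per-size free lists and get handed back out, instead of re-allocating (and re-pinning) host memory on every copy.

```python
from collections import defaultdict

class LRUHostAllocator:
  def __init__(self): self.free_cache = defaultdict(list)
  def alloc(self, size:int):
    if self.free_cache[size]: return self.free_cache[size].pop()   # reuse a cached buffer
    return bytearray(size)            # placeholder for a real pinned (page-locked) allocation
  def free(self, buf):
    self.free_cache[len(buf)].append(buf)   # keep it around for the next copyin of this size

a = LRUHostAllocator()
b1 = a.alloc(4096); a.free(b1)
assert a.alloc(4096) is b1            # a same-size request reuses the cached buffer
```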