tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-12 23:54:58 -05:00

Author	SHA1	Message	Date
chenyu	a8e9307e0b	pylint runtime/ and shape/ (#5044 ) as pointed out by #4877, need to add `__init__.py` to trigger pylint. fixed some errors except ops_python (will do in a separate pr, it has a lot of errors), and sub-folders in runtime	2024-06-18 19:48:18 -04:00
Roelof van Dijk	1785a70e77	fix: else-return on runtime (#4881 ) * fix: add init file * fix: no else-return * fix: remove file again	2024-06-08 14:44:24 +02:00
chenyu	286b4dbdf2	compile raise CompileError and skip only RuntimeError in multiprocess… (#4646 ) * compile raise CompileError and skip only RuntimeError in multiprocess beam renderer error with multiprocess should not be skipped by beam * use `==` for dtype to dtype comparison * that needs to be is * typo	2024-05-19 00:25:25 -04:00
George Hotz	347a3acb37	add renderer class (#4524 ) * add renderer class * tests pass * fix pylint * fix tensor cores	2024-05-10 21:40:02 -07:00
George Hotz	d438d5698d	bring buffer back to device (#4517 )	2024-05-10 11:22:31 -07:00
George Hotz	4eef1ee9bf	move renderer into options (#4514 ) * move renderer into options * fix tests * renders are functions	2024-05-10 10:01:51 -07:00
George Hotz	89e119bc58	move Allocator to buffer.py (#4502 ) * move Allocator to buffer.py * move those to realize * memory file * cleanup	2024-05-09 19:45:56 -07:00
Sohaib	61c97d5305	refactor ops_gpu ctypes (#4331 ) * refactor ops_gpu ctypes - remove redundant byref as ctypes automatically handles passing `type` as `POINTER(type)` - use walrus operator instead of init_c_var when possible * clSetKernelArg argtype is POINTER(None)	2024-04-30 01:33:34 +08:00
chenyu	1de9778949	import Buffer and BufferOption from tinygrad.buffer (#4076 )	2024-04-04 22:12:23 -04:00
chenyu	b47f6cebb2	LinearizerOptions -> CompilerOptions (#3978 )	2024-03-28 17:50:23 -04:00
nimlgen	e2d6f76723	_alloc and _free with options (#3934 ) * _alloc has options * linter * fix hsa	2024-03-26 09:11:41 -07:00
qazal	27f4de2ce4	delete half_prekernel (#3388 ) * generic rendering of half and bf16 hotfix * fix uops + regression test * fix the test for metal's half4 * uop.uop fixup * mypy with --strict-equality, fix ops_gpu	2024-02-14 15:40:48 +01:00
George Hotz	3c728d1082	compiler support (#3260 ) * compiler support * revert that * fix tests	2024-01-26 23:36:40 -08:00
George Hotz	03a6bc59c1	move autogen to runtime/autogen (#3254 )	2024-01-26 12:44:19 -08:00
George Hotz	a3869ffd46	move gpuctypes in tree (#3253 ) * move gpuctypes in tree * fix mypy * regex exclude * autogen sh * mypy exclude * does that fix it * fix mypy * add hip confirm * verify all autogens * build clang2py * opencl headers * gpu on 22.04	2024-01-26 12:25:03 -08:00
George Hotz	cb372b053f	add device speed test (#3244 )	2024-01-25 12:01:22 -08:00
George Hotz	ed8a32722a	hip mutex signal (#3234 ) * hip mutex * hip mutex 2 * sync	2024-01-24 13:23:09 -08:00
George Hotz	23b084e70a	add device name to device, all are constructed (#3221 )	2024-01-23 20:34:56 -08:00
George Hotz	4a07ea355d	buffer options should work (#3211 ) * buffer options should work * minor * fix dtype	2024-01-22 19:23:55 -08:00
nimlgen	992067399e	clean up exceptions in __del__ everywhere (#3165 )	2024-01-18 08:34:09 -08:00
nimlgen	81ae4ea179	compile cache for several devices (#3148 ) * compile cache for several devices * ops_gpu uses hash to not care about sql * hip rdna test with device * linter happy * no device passed where possible * arch is optional to compile_{hip|cuda}	2024-01-16 11:45:26 -08:00
George Hotz	120c8b1841	update llvm api + add cache key (#3140 ) * update llvm api + add cache key * use_xcode is a different function * types	2024-01-15 17:25:32 -08:00
chenyu	0fe6904351	use device from LinearizerOptions in kernel search (#3090 ) * use device from LinearizerOptions in kernel search removed all Device.DEFAULT in search.py * pass device string for parallel pickle * device for interpreted backends in LinearizerOptions	2024-01-11 14:46:03 -05:00
George Hotz	a280cfe169	move dtypes to dtype.py (#2964 ) * move dtypes to dtype.py * fix urllib	2024-01-01 14:58:48 -08:00
George Hotz	56f44bd10e	move the compiler cache to be global (#2957 ) * move the compiler cache to be global * remove non robust test * remove dead code	2024-01-01 10:59:56 -08:00
Marcus Asteborg	1fa4f161fe	Update CLProgram to use unsigned long long for event profiling (#2808 ) On Windows, the unsigned long type is 32-bit, which is not compatible with the required data size for event profiling.	2023-12-16 23:48:44 -08:00
George Hotz	6d6eb9302d	ruff checks the max line length is 150 (#2734 ) * ruff checks the max line length is 150 * fix tensor.py * a lot more * done	2023-12-12 17:34:47 -08:00
George Hotz	c53e854687	cast image doesn't work on nvidia (#2626 ) * cast image doesn't work on nvidia * hmm, interpreteds use buffer size 0 * fix type * no lru	2023-12-05 12:48:19 -08:00
George Hotz	664475f247	vals is an argument (#2599 ) * vals is an argument * don't even know how that's legal python	2023-12-03 21:50:43 -08:00
George Hotz	fcd0b2ee6c	fix multigpu on tinybox (#2595 ) * fix multigpu on tinybox * fixed multigpu	2023-12-03 16:48:07 -08:00
George Hotz	171543fc8d	cleanups to save lines and files (#2577 ) * runtime/graph -> features/graph * put all the cstyle renderers in cstyle * same line for those * how did that pass mypy	2023-12-02 16:29:56 -08:00
nimlgen	065495e0c9	save a few lines in ops_gpu (#2564 ) Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2023-12-02 15:05:22 -08:00
George Hotz	d6b404ac11	No dtype alloc (#2570 ) * fix all allocs * improve docs * ugh fix fake alloc	2023-12-02 13:29:40 -08:00
George Hotz	5068e99d18	refactor to remove extra kernel params (#2563 ) * refactor to have compiled kernel * bugfixes * docs/beautiful.py * revert that * fix tests	2023-12-02 00:32:25 -08:00
George Hotz	27481b9206	Switch ops_gpu -> gpuctypes (#2532 ) * ops_gpu is go * fix size 0 * fix image, and add more tests * nerf openpilot test, doesn't test thneed * run the schedule * better * oops, new inputs * delete pyopencl * Update ops_gpu.py	2023-12-01 22:30:21 -08:00
chenyu	67f4e03724	rewrite 0 size loadop into a CONST (#2556 ) * rewrite 0 size loadop into a CONST * check alloc size * EMPTY is better * Revert "EMPTY is better" This reverts commit 574fe0f9ed28f1b97da5a81afdfd2cd5d9a94ff9. * no ast is created * fix test	2023-12-01 18:29:06 -05:00
George Hotz	2c363b5f0b	new style device (#2530 ) * cpu tests pass * torch works * works * metal works * fix ops_disk * metal jit works * fix openpilot * llvm and clang work * fix webgpu * docs are rly broken * LRU works on metal * delete comment * revert name to ._buf. LRU only on Compiled * changes * allocator * allocator, getting closer * lru alloc * LRUAllocator * all pass * metal * cuda * test examples * linearizer * test fixes * fix custom + clean realize * fix hip * skip tests * fix tests * fix size=0 * fix MOCKHIP * fix thneed * copy better * simple * old style metal copy * fix thneed * np reshape * give cuda a device	2023-11-30 17:07:16 -08:00
George Hotz	756b01f46f	why were these ever called buffer (#2483 )	2023-11-27 21:02:07 -08:00
George Hotz	9e07824542	move device to device.py (#2466 ) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
andresgit	259a869fc1	Fix UnicodeDecodeError when debugging on Intel APU (#2421 ) * test DEBUG=5 * print prg if NVIDIA, fixes error on Intel APU	2023-11-25 12:30:50 -08:00
George Hotz	cbb8486779	ResNet training changes (update benchmark) (#2390 ) * default arg for chunk * bring back to_ * good changes * new set * unused hash * fix optim * new torch loader * fix test lr scheduler	2023-11-22 17:41:12 -08:00
valar	123ea051e6	refactor/ci: delete many `# type: ignore` (#2281 ) * refactor/ci: delete many `# type: ignore` * replace `axis.__class__ is int` with `isinstance(axis, int)` to make mypy happy * add `--warn-unused-ignores` to mypy flag refs #2240 * ci: move `--warn-unused-ignores` flag to mypy config refs #2240	2023-11-12 11:04:20 -08:00
vish-pr	6051f0ce82	For cuda get current free space from device, and retry alloc failures (#2197 ) * For cuda get current free space from device, and rery alloc failures * type ignore for mypy * add init to get free mem in cuda * Move retry logic in common lib. Fix typo in override _get_cur_free_space * linter error fix in test file * Not catch all, as it will catch KeyboardInterrupt * fix unintened line changes	2023-11-09 15:53:50 -08:00
George Hotz	f17bc16f46	simple runtime args (#2211 ) * simple runtime args * fix some tests * fix abstractions and triton * fix search	2023-11-03 12:31:29 -07:00
George Hotz	03cf0afa4f	move all to compile api (#2203 ) * move metal+clang to compile api * all to the new style * remove binary arg * fix triton * fixup tests * fix clang * diskcache is generic * __wrapped__ * compile_gpu * fix thneed * keep the src in the ASTRunner * lib * move compile_gpu * compile_gpu in device * put compiler in astrunner * test reverts * triton compiler * ugh, that too	2023-11-01 23:01:32 -07:00
George Hotz	8932816816	remove arm64, caching for cuda (#2201 ) * remove arm64, caching for cuda * caching in llvm * switch cache_compiled to new cache * fix clang * caching for metal * fix pylint * cleanups * perf_counter and binary	2023-11-01 18:44:00 -07:00
nimlgen	8c07c73a9b	Fix cl map buffer (#2190 ) * fix gpu enqueue_map_buffer out of space * add test	2023-10-31 12:02:46 -07:00
imaolo	228b310478	align cpu buffer before copy into cl buffer (#2135 )	2023-10-23 21:04:35 -04:00
George Hotz	5472a14544	openpilot compile2 (#1977 ) * start compile2 * tweak * why are there two more kernels? * minor cleanups * don't break onnx tests * add __metadata__ support to safetensors * no early realize in onnx * cleanups * bugfix * clean up image type, add optimize * opt to match old * try that * opt work * run compile2 * optimizer * prt more * prerealize * imp * NOLOCALS works * no locals means no locals * support fractional globals * all locals welcome * int that * cleanups * show gemv regression * clean up diff * use idx for the cond * nolocals --------- Co-authored-by: Comma Device <device@comma.ai>	2023-10-15 20:39:46 -07:00
qazal	71d93ffd79	Refactor GPU and Metal langauges in their own separate renderers (#2033 ) * Refactor GPU and Metal langauges in their own separate renderers * remove CStyleLanguage imports * move renderers too	2023-10-10 07:46:41 -07:00

102 Commits