tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-23 13:58:00 -05:00

Author	SHA1	Message	Date
qazal	40ec9410f9	simpler process replay (#5452 ) * remove check_process_replay * that can go to the top * add assert back * [run_process_replay] * checkout code [run_process_replay] * temp [run_process_replay] * revert temp [run_process_replay] * ahh this is why [run_process_replay] * revert temp [run_process_replay]	2024-07-13 19:55:06 +03:00
George Hotz	955e1179fb	move compile tests and merge (#5451 ) * move compile tests and merge * revert enet move, bump download cache * oh, try setting clang	2024-07-13 08:04:46 -07:00
chenyu	9a187e6102	fix handcode_opt script (#5435 ) * fix handcode_opt script * run in ci * real run in ci * HALF=0	2024-07-12 20:52:28 -04:00
George Hotz	b055ece550	hotfix: bump to cache gpuocelot	2024-07-12 13:54:14 -07:00
chenyu	b17e4adb3a	add `-c advice.detachedHead=false` to process replay git checkout (#5419 ) remove the noisy `Note: switching to 'origin/master'. You are in 'detached HEAD' state. You can look around, make experimental changes...` in log	2024-07-12 15:13:26 -04:00
qazal	31fcc516dc	more process replay tooling (#5407 ) * replays * what's in there * can it be up there * sha is enough * insert sha as the key * fix str * update reset utils * that nested try/except was terrible * github_context can go	2024-07-12 13:11:34 +03:00
Roelof van Dijk	6ec7dbc287	ci: parallelize uops tests (#5405 )	2024-07-12 11:22:41 +03:00
qazal	b91a0ccdc3	make [run_process_replay] [no_assert] the default (#5390 )	2024-07-11 22:36:59 +03:00
qazal	004366b193	context aware process replay [run_process_replay] (#5378 ) * test tc as ctx var * remove from opts * process replay * pop variable * B -> Variable * fix re-assign * pop temp vars * move TRANSCENDENTAL=2	2024-07-11 13:07:28 +03:00
chenyu	2396ab9b33	more transcend cleanup [run_process_replay] (#5369 ) fix test name, less # noqa: E501 and removed the cast	2024-07-10 23:05:03 -04:00
chenyu	64986f949c	more transcend math tests in ci (#5368 ) * more transcend math tests in ci test large input to trig functions that hit different reduction algo, and test TRANSCENDENTAL=2 for all backend * no CUDACPU * try that	2024-07-10 21:19:09 -04:00
chenyu	322c37e621	use helpers.JIT in llama and gpt2 examples (#5350 ) * use helpers.JIT in llama and gpt2 examples replaced getenv("JIT"), effectively made gpt2 default jit * fix test_gpt2	2024-07-09 15:04:43 -04:00
Ian Paul	d5a68ae6b3	Simple abstractions3.py fix (#5343 ) * abstractions3.py fix * Add abstractions3.py to CI tests	2024-07-09 13:48:42 +03:00
chenyu	631bc974a0	raise line count limit to 8500 (#5331 )	2024-07-08 14:00:28 -04:00
SnakeOnex	8c03816ae9	fix README example (#5284 ) * fixed README example * README test * changed py -> python markdown code flags in REAME	2024-07-04 11:15:07 -04:00
chenyu	191463a919	add timing to SDXL (#5273 )	2024-07-02 23:29:54 -04:00
chenyu	5808c37302	hotfix disable flaky llama3 beam benchmark on green (#5249 )	2024-07-01 15:00:47 -04:00
chenyu	b9122ecdaf	revert stable diffusion validation with threefry (#5248 ) * Revert "use threefry in stable diffusion benchmark (#4988)" This reverts commit `44dfa37c70`. * sdxl and validation fix * relax threshold	2024-07-01 14:43:47 -04:00
nimlgen	57e89645cd	hcq spec test (#5226 ) * start hcq spec test * more test * fixes * run on amd as well * test amdgpu exec * fix amd * amd mockgpu support sdma timestamp	2024-07-01 17:36:37 +03:00
chenyu	88763eb9ff	fix stable_diffusion with fp16 (#5239 )	2024-06-30 12:59:31 -04:00
nimlgen	dd7eef7d71	libc defs to autogen (#5217 ) * libc defs to autogen * amd import libc * linter * better a bit * remove comment, check this * not hardcoded path	2024-06-29 14:37:33 +03:00
nimlgen	6b08cb5e38	ptx runs on nv in benchmarks (#5224 )	2024-06-29 11:06:44 +03:00
nimlgen	b4c49ae3fa	remove cudacpu in favour of mockgpu (#5225 ) * remove cudacpu in favour of mockgpu * remove unused import * not used as well	2024-06-29 11:05:16 +03:00
chenyu	7090eac8cb	validate sdxl output and put it in benchmark (#5211 ) * validate sdxl output and put it in benchmark * don't print fetch progress_bar in CI	2024-06-28 11:40:52 -04:00
chenyu	d8dc43ad06	remove JIT_BATCH_SIZE=4 from gpt2 NV benchmark (#5198 ) this no longer helps	2024-06-27 15:20:34 -04:00
chenyu	83da8b3558	use NV instead of CUDA in benchmark (#5192 ) also reenabled mixtral on green	2024-06-27 13:52:58 -04:00
chenyu	0c6c7c5f7b	CACHELEVEL=0 -> IGNORE_BEAM_CACHE=1 in benchmark (#5191 ) ignoring beam cache but using compile cache should be fine, saved some benchmark time. also updated `beam_search` to check flag value before accessing diskcache	2024-06-27 13:15:18 -04:00
chenyu	c12de4f47d	benchmark use JITBEAM for llama and gpt2 (#5189 )	2024-06-27 12:56:02 -04:00
qazal	3af17849bf	safely parse quoted titles [run_process_replay] (#5183 )	2024-06-27 16:39:48 +03:00
qazal	6ca7b13ed1	limit pickled objects [run_process_replay] (#5154 ) * limit pickled objects * delete uop from the list * debug metal * need self.opts for TC * dont need device * [run_process_replay] * minor	2024-06-26 13:51:32 +03:00
qazal	8aa786232d	docs for running process replay locally (#5083 )	2024-06-21 09:55:08 -04:00
nimlgen	fb1bf48cfe	io_uring for copies from disk (#5035 ) * exp uring * fixes and old version * nv * cleaner * cmp vs aio * fix * no lib * fix nv * linter * disk_speed_test now runs default * fixes * uring -> io_uring * linter happy * get_temp_buf comment added * tiny nits * put wait back * test runs everywhere * remove consts * remove mmap consts * do not require iouring to run test, they are generic	2024-06-21 11:36:51 +03:00
qazal	97f1347dd9	fix check_process_replay for special characters (#5072 ) * 'test' [run_process_replay] [no_assert] * test with ( ) { } '' " " * remove the log [run_process_replay] '' () { } '{ * helpful echos [run_process_replay] [no_assert] () '' * test [run_process_replay] [no_assert] * test2 [run_process_replay] [no_assert] * test3 [run_process_replay] [no_assert] * it's also correct this way [run_process_replay] [no_assert] * remove extras [run_process_replay]	2024-06-20 20:23:29 +03:00
qazal	a6a5dba637	Revert "UPat for has_valid in load/store (#5052 )" (#5056 ) * manually insert in the Linearizer * fix process replay	2024-06-19 20:53:36 +03:00
qazal	ee01e464e3	use process replay as a diff creator (#4903 ) * add no_assert option [run_process_replay] [no_assert] * test [run_process_replay] [no_assert] * [run_process_replay] * back to normal [run_process_replay] * remove the log	2024-06-19 18:17:31 +03:00
chenyu	dc942bf1f6	jit sampling functionn in test_randomness.test_multinomial (#5034 ) * jit sampling functionn in test_randomness.test_multinomial `THREEFRY=1 python3 -m pytest test/test_randomness.py::TestRandomness::test_multinomial --durations 1` 7 sec -> 1.2 sec * skip that	2024-06-18 14:21:05 -04:00
chenyu	e9c6a36894	remove CACHELEVEL=0 in llama3 benchmark (#5025 )	2024-06-17 22:43:16 -04:00
chenyu	acaf9a490d	RECIP(-0.0) should be -inf (#5024 ) * RECIP(-0.0) should be -inf added test_dtype_alu for PYTHON backend * catcht that * fix those two	2024-06-17 22:26:58 -04:00
George Hotz	bee8fc29ee	add GPT2 half/half+beam to AMD (#5000 ) * add GPT2 half/half+beam to AMD * winograd in training. half and half/beam file upload	2024-06-16 14:07:14 -07:00
chenyu	44dfa37c70	use threefry in stable diffusion benchmark (#4988 ) also updated default steps to 10. easier to tell the image is following the prompt.	2024-06-15 20:25:29 -04:00
wozeparrot	ce1ed374c9	more tinychat fixes (#4971 )	2024-06-15 16:29:39 -07:00
qazal	ff8e9eefc3	hotfix: don't use ASSERT_COMPILE for benchmarks process replay (#4981 ) * use replay_codegen [run_process_replay] * disable for now [run_process_replay]	2024-06-15 16:57:47 +03:00
uuuvn	92f49efd06	Trigger process replay from pull request title [run_process_replay] (#4980 ) * Trigger process replay from pull request title * idk how this thing works btw * test if it will work * try 2 * Revert "idk how this thing works btw" This reverts commit `580da51b07`. * Revert "try 2" This reverts commit `7ff1e86d5d`. * test if it works * meh * Reapply "idk how this thing works btw" This reverts commit `dd33ad7c14`. * revert	2024-06-15 16:21:00 +03:00
wozeparrot	62dc36d371	autogen _try_dlopen (#4949 )	2024-06-14 12:12:18 -07:00
chenyu	f902af4f0b	increase metal ci test timeout to 20 minutes (#4920 ) make it less annoying for now	2024-06-11 18:45:51 -04:00
qazal	7f3d9e6d94	revert hsa autogen removal (#4914 ) * Revert "only install comgr in AMD CI (#4909)" This reverts commit `7f03420d05`. * rocm-llvm only removal	2024-06-11 12:55:45 -04:00
qazal	7f03420d05	only install comgr in AMD CI (#4909 ) * test * delete hsa autogen	2024-06-11 06:19:33 -04:00
qazal	8b5bcf309a	process replay in all of CI (#4884 )	2024-06-10 14:49:29 -04:00
George Hotz	f42183ba28	hotfix: relax cifar to 93.2	2024-06-09 13:09:21 +02:00
nimlgen	654a8b9ef7	retire hsa (#4885 ) * retire hsa * EMULATE_AMD	2024-06-09 11:33:03 +03:00


1092 Commits