tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-21 04:47:56 -05:00

Author	SHA1	Message	Date
chenyu	18d4ecc1f3	lower nv test_gemm_4096 target (#13107)	2025-11-05 11:05:16 -05:00
chenyu	54141e9cb9	DISABLE_COMPILER_CACHE=1 in speed_v_theoretical (#13096)	2025-11-04 11:28:18 -05:00
George Hotz	d59d4cdbe4	lil less is okay	2025-10-21 17:09:44 +08:00
chenyu	a3dae51085	lower test_gemm_8192 on red (#10883)	2025-06-19 10:01:25 -04:00
wozeparrot	eb739bb96a	hotfix: lower threshold (#10786)	2025-06-11 19:36:20 -04:00
George Hotz	b06291077c	no amdgpu kernel driver (#10408) * no amdgpu kernel driver * don't test hip * lower req	2025-05-18 20:52:39 -07:00
George Hotz	427471550a	hotfix: amd tflops to 74 and some external_benchmark_sdxl_softmax stuff	2025-04-29 09:02:27 -04:00
George Hotz	d1f6701eb7	hotfix: lower amd threshold + improve block reorder test	2025-04-22 20:44:29 +01:00
nimlgen	9bd13de44c	lower test_gemv_4096_16384 to 750 for red (#9367)	2025-03-05 22:44:48 +03:00
chenyu	2cb2fce8d9	lower test_gemm_8192 amd_tflops to 65 (#9364)	2025-03-05 14:06:11 -05:00
chenyu	4342300eff	lower test_gemm_8192 amd to 70 (#9277) flaky	2025-02-26 16:32:08 -05:00
chenyu	0513b0c17d	lower green test_gemm_8192 tflops to 125 [pr] (#8820) flaky	2025-01-30 17:30:08 -05:00
George Hotz	d19c1c7f03	bump 75 -> 73 for test failure	2025-01-13 09:18:38 -08:00
chenyu	6a7f971fa0	hotfix max(DEBUG, 2) -> max(DEBUG.value, 2) [pr] (#8553)	2025-01-10 12:57:44 -05:00
chenyu	9789a83064	hotfix DEBUG in speed_v_theoretical.py conv (#8266) infinite loop with manual DEBUG set `DEBUG=2 python test/external/speed_v_theoretical.py -k conv` ``` File "/Users/chenyu/code/tinygrad/tinygrad/helpers.py", line 95, in __ge__ def __ge__(self, x): return self.value >= x ^^^^^^^^^^^^^^^ [Previous line repeated 4984 more times] RecursionError: maximum recursion depth exceeded in comparison ```	2024-12-15 19:44:45 -05:00
chenyu	62e19649c0	lower test_conv_3x3_256_32_32_256_256 (#8226) tiny7 is slow	2024-12-13 17:15:53 -05:00
chenyu	155f7df599	lower test_gemm_4096 expectation on green (#8152) getting 119 sometimes, so lowered to 115	2024-12-10 18:05:12 -05:00
chenyu	5c6ed5dba6	lower test_conv_3x3_256_32_32_256_256 expectation (#8060) failed https://github.com/tinygrad/tinygrad/actions/runs/12182799887/job/33982676812#step:9:210	2024-12-05 10:30:56 -05:00
George Hotz	20878be2af	lower test_gemv_4096_16384 expectations	2024-12-05 12:08:26 +08:00
chenyu	0693158d28	lower v_theoretical gemv on red (#8042) tiny7 is still slower https://github.com/tinygrad/tinygrad/actions/runs/12166149038/job/33931736130#step:8:209	2024-12-04 13:59:40 -05:00
George Hotz	08657cb7b0	hotfix: bump expectations in speed_v_theoretical	2024-12-04 19:00:33 +08:00
George Hotz	ea65c79ba2	hotfix: don't spam BEAM debug in speed_v_theoretical	2024-12-04 18:47:16 +08:00
George Hotz	09b00b1b04	hotfix: use kernel timings instead of python timings in speed_v_theoretical	2024-12-04 18:36:17 +08:00
qazal	b797aee720	uop global buf number tracking try 2 [pr] (#7912) * uop buffer init small refactor [pr] * add early * this way it doesn't need late * buffer_num * itertools.count * count from 0 * down to 380	2024-12-02 14:45:17 +08:00
George Hotz	cbcc1c20eb	second try at block linearize (#7892) * second try at block linearize * weeee, works for lil matmul * it's so beautiful * test tiny passes * fix bugs * combine matching BLOCKENDS * wrapping * test lin failures passes * those failures were fake * flip sort order * fix ptx tests * deal with store better * dumb ptx fix * expect less * reduce lines * reduce lines * less lines and cleaner * no defaultdict * tighter * simpler block_parent_count	2024-12-02 13:43:09 +08:00
George Hotz	6c1efb9a72	hotfix: amd gemv was flaky	2024-12-02 11:08:24 +08:00
chenyu	bb23469f93	lower conv threshold on red (#7948)	2024-11-28 13:31:06 -05:00
chenyu	f54508549f	don't search conv weight init in speed_v_theoretical (#7943)	2024-11-28 10:03:18 -05:00
chenyu	5c5b1b994c	less flaky benchmarks (#7855) JIT=2 for metal cifar with HALF, and lower tflops for nv test_gemm_4096. failures in https://github.com/tinygrad/tinygrad/actions/runs/11980239535/job/33404098428?pr=7830	2024-11-22 16:39:39 -05:00
chenyu	11cea00090	lower vs_theoretical conv tflops threshold for nv (#7811) less flaky	2024-11-20 20:03:49 -05:00
chenyu	1884f021e3	add conv3x3 to speed_v_theoretical (#7658) * add conv3x3 to speed_v_theoretical * show test duration	2024-11-12 16:41:56 -05:00
chenyu	962dafb467	use randn in speed_v_theoretical instead of rand (#7656) * use randn in speed_v_theoretical instead of rand this made green gemv 20% faster... but why? * update threshold	2024-11-12 15:00:32 -05:00
chenyu	6159790ab8	add gemv to speed_v_theoretical (#7654) * add gemv to speed_v_theoretical getting ~300GB/s if we just count the memory of inputs and output * better green numbers * flip	2024-11-12 11:19:35 -05:00
chenyu	99f29e50b2	update speed_v_theoretical numbers (#7647) better amd after set compute profile	2024-11-11 20:05:13 -05:00
chenyu	773d5b60bf	beam benchmark tests (#7638) * beam benchmark tests * lower AMD number somehow * less flaky	2024-11-11 18:11:18 -05:00

35 Commits