Commit Graph

7181 Commits

Author SHA1 Message Date
chenyu
0e57152dbb clean up test_uop_symbolic [pr] (#8165)
removed old `Node` references
2024-12-11 14:13:19 -05:00
chenyu
5eadae204b test multi device rand with manual_seed (#8164) 2024-12-11 13:11:31 -05:00
Maxim Zakharov
e53a5bf0c3 StableDiffusion UI - convenient send via Enter (#8160) 2024-12-11 19:05:24 +01:00
qazal
047a6dabc3 prereq for scheduler contiguous_child [pr] (#8163)
* the whole context is fine here [pr]

* fix that
2024-12-12 02:02:22 +08:00
ignaciosica
3a8e8ac6c2 remove dead code (#8161) 2024-12-11 12:07:19 -05:00
George Hotz
8f4299fcc8 hotfix: suppress shutdown errors in CLProgram 2024-12-11 08:08:32 -08:00
Ahmed Harmouche
a73e3677d0 Test linearizer on webgpu (#8159)
* Test linearizer on wgpu

* Skip tests due to exceeded dims
2024-12-11 17:03:26 +01:00
qazal
b894657aa7 assert the same things without mutating or accessing internal ops state [pr] (#8157)
* don't mutate internal state in test_lazybuffer

* fix test_schedule internals

* save time

* third si

* fine sometimes buffer_view isn't there
2024-12-11 22:01:27 +08:00
qazal
63de8f2208 late scheduler context builder [pr] (#8155) 2024-12-11 19:59:39 +08:00
chenyu
d462f8ace0 use HALF in cifar wino benchmarks (#8153)
more representative as it hits tensor cores on tinyboxes
2024-12-10 20:21:00 -05:00
George Hotz
c8e7707a7e hotfix: disable flaky move tensor test 2024-12-10 17:11:21 -08:00
chenyu
155f7df599 lower test_gemm_4096 expectation on green (#8152)
getting 119 sometimes, so lowered to 115
2024-12-10 18:05:12 -05:00
chenyu
c4be1529cf update test for Tensor.softplus (#8150)
test beta and extreme inputs.
to pass big inputs, it needs to support `threshold`, which needs a fix on backward that we punt until the new gradient api
2024-12-10 17:48:02 -05:00
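The `threshold` behavior mentioned above follows the usual softplus convention (as in PyTorch's `nn.Softplus`): once `beta*x` exceeds the threshold, fall back to the identity so `exp()` never overflows on big inputs. A minimal numpy reference sketch, assuming those semantics (not tinygrad's implementation):

```python
import numpy as np

def softplus_ref(x: np.ndarray, beta: float = 1.0, threshold: float = 20.0) -> np.ndarray:
    # softplus(x) = (1/beta) * log(1 + exp(beta*x))
    # for beta*x > threshold we return x directly, so large inputs never hit
    # exp() overflow -- this is why the "big input" test needs threshold support
    bx = beta * x
    return np.where(bx > threshold, x, np.log1p(np.exp(np.minimum(bx, threshold))) / beta)

print(softplus_ref(np.array([-2.0, 0.0, 50.0]), beta=2.0))  # ~[0.009, 0.347, 50.0]
```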
Ahmed Harmouche
a8cfdc70ed Run more webgpu tests (#8142) 2024-12-10 23:20:04 +01:00
Ahmed Harmouche
ed7318a3f5 Fix puppeteer install (#8148)
Clean npm cache before puppeteer install
2024-12-10 23:06:33 +01:00
George Hotz
a1b3724ff8 prepickle process replay [pr] (#8147) 2024-12-10 11:46:36 -08:00
George Hotz
aa3b094334 changes from delete lazy [pr] (#8146)
* changes from delete lazy [pr]

* test tweak
2024-12-10 11:06:17 -08:00
chenyu
286fec115e fix Tensor.minimum for int (#8145)
use invert instead of just neg. consolidate min, argmin, and minimum

also update maximum to not apply the mid point for int
2024-12-10 13:34:41 -05:00
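The "invert instead of just neg" trick above expresses integer `min` in terms of `max`: negation breaks on the most negative value in fixed-width arithmetic, while bitwise invert does not. A plain-Python illustration of the identity (the actual change lives in tinygrad's `Tensor.minimum`; this sketch only shows why invert is safe for ints):

```python
def min_via_neg(a: int, b: int) -> int:
    # float-style identity: min(a, b) == -max(-a, -b)
    # in fixed-width int arithmetic, -INT_MIN overflows (Python ints hide this)
    return -max(-a, -b)

def min_via_invert(a: int, b: int) -> int:
    # integer-safe identity: min(a, b) == ~max(~a, ~b), since ~x == -x - 1
    # invert never leaves the representable range, so INT32_MIN works too
    return ~max(~a, ~b)

INT32_MIN = -(2**31)
assert min_via_invert(INT32_MIN, 5) == INT32_MIN
assert min_via_invert(7, -3) == -3
```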
Ahmed Harmouche
71dd222f66 Fix setitem on wgpu (#8144) 2024-12-10 19:34:25 +01:00
qazal
b69fea6ae5 process replay without global list [pr] (#8143) 2024-12-11 02:20:09 +08:00
qazal
08405279f9 pre merge_views+ops_folding refactor [pr] (#8140)
* simple start

* valid early

* more dumb things removed

* don't ever use base

* cleaner
2024-12-11 00:55:00 +08:00
qazal
56c84cee29 derive COPY nbytes late in realize [pr] (#8137)
* derive COPY arg later in realize [pr]

* can assume no implicit casts or movement ops here
2024-12-10 22:04:07 +08:00
qazal
2d26b011ac allow VIEW on BUFFER [pr] (#8136)
* allow VIEW of BUFFER [pr]

* base it later

* better diff

* base shouldn't exist anywhere after merge_views
2024-12-10 21:29:38 +08:00
qazal
3a2658efbd small changes to refine the delete_lazy diff (#8134)
* _view -> view

* const_arg things
2024-12-10 18:46:10 +08:00
qazal
6d33da09c9 split scalar getitem tests into correctness and optimization [pr] (#8133) 2024-12-10 18:18:46 +08:00
qazal
7436ebef2f spend lines on const_arg for tensor and scheduler [pr] (#8132)
* spend lines on const_arg for tensor and scheduler [pr]

* simple test_const_arg

* base on lazy
2024-12-10 18:07:35 +08:00
chenyu
917deb88a4 make //0 return 0 in python_alu (#8131)
on master it raises because it cannot truncate inf to int, which crashes valid expressions like `(t > 0).where(1//t, t)`.
2024-12-09 19:32:06 -05:00
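For context, the expression the commit cites can be reproduced roughly like this with the tinygrad `Tensor` API (values made up; the printed result is what the fix is expected to allow):

```python
from tinygrad import Tensor

# t contains a zero; the masked-out lane still evaluates 1//0 in python_alu,
# which previously raised while truncating inf to int. With //0 defined as 0,
# the expression realizes fine.
t = Tensor([3, 0, -2])
out = (t > 0).where(1 // t, t)
print(out.tolist())  # expected something like [0, 0, -2]
```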
George Hotz
f83d715f41 move checks into compile3, delete compile2 [pr] (#8127)
* move checks into compile3 [pr]

* test_vs_onnx

* test v torch works

* float16 won't compile on compile3

* actually delete compile2
2024-12-09 14:21:42 -08:00
chenyu
358287959b fix pow of int to negative const int (#8129)
it should return an int
2024-12-09 17:20:18 -05:00
chenyu
12f7d284e0 failed test case for int pow (#8128)
also updated test_ops so that non-float results are compared with `assert_equal`. removed `test_multinomial`, which is tested better in test_randomness
2024-12-09 16:15:09 -05:00
qazal
80de06c8b9 scheduler ops_folding from delete_lazy (#8124)
* scheduler diff from delete_lazy

* test_std_mean

* late fold copy of CONST

* clang const is fine
2024-12-10 00:36:01 +08:00
George Hotz
87c360c4b5 hotfix: add --size 8B to llama3 2024-12-09 07:53:20 -08:00
George Hotz
a773c5a571 hotfix: default llama3 is 1B with download_model 2024-12-09 07:23:35 -08:00
Ahmed Harmouche
c6277fce09 Remove f16 decompression lib from SD compile.py (#8121)
* Remove f16-to-f32-gpu lib, use tinygrad exported decompression

* No need to create new instance
2024-12-09 14:09:00 +01:00
qazal
22d99f1421 test_pickle_realized_tensor actually tests pickle [pr] (#8119)
* test_pickle_realized_tensor actually tests pickle [pr]

* clang
2024-12-09 17:26:19 +08:00
chenyu
ccf54c2375 fix argmax/min on int32 min (#8118) 2024-12-09 02:29:23 -05:00
chenyu
c814de2dd4 fix bitwise_not for signed int (#8117)
-1 is correct because 2**32-1 is not within int32 range, so in some cases clang casts the whole thing to uint32
2024-12-09 02:02:51 -05:00
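A small plain-Python sketch of the constant choice described above, assuming the not is lowered to an xor with an all-ones constant: the literal `-1` fits in int32, while the equivalent mask `2**32-1` does not, which is what can push clang into unsigned arithmetic (the rendered-C behavior is the commit's claim; the helper below just shows the identity):

```python
def bitwise_not_i32(x: int) -> int:
    # ~x == x ^ -1 for any two's-complement width; the -1 literal fits in int32,
    # whereas a 0xFFFFFFFF mask would not and could be promoted to unsigned in C
    return x ^ -1

assert bitwise_not_i32(0) == -1
assert bitwise_not_i32(-1) == 0
assert bitwise_not_i32(5) == ~5
```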
ttomsa
e22d7b6fb0 fix var vmax inside special (#8116) 2024-12-09 01:16:08 -05:00
qazal
0033012096 init noop changes from delete_lazy [pr] (#8115) 2024-12-09 01:42:05 +08:00
qazal
5dd61035f7 revert VALID early folding for now (#8114)
This reverts commit 4074f52317.
2024-12-09 00:34:24 +08:00
qazal
69e48da961 set NOOPT in test_avg_pool3d_failure (#8112)
* set NOOPT=0 in test_avg_pool3d_failure

* noopt should still pass
2024-12-08 10:48:29 -05:00
nimlgen
3a7d64b96c hcq remove update from args state (#8104)
* hcq remove update from args state

fix amd

ugh

qcom?

qcom ops

ops

qcom fix

qcom texture info

fx

qcom fix

qcom

qcom, sry

minor

works

* remove old code

* unrelated+sint

* qcom

* typing

* rm comments
2024-12-08 15:22:05 +03:00
nimlgen
d6e66095fd hcq buffer is a class (#8106)
* hcq buffer is a class

* qcom

* no from_mv in qcom

* remove qcombuffer

* useless cast

* mypy

* qcom fix

* _md -> meta
2024-12-08 13:29:43 +03:00
chenyu
b9c977f1c8 clean up bounds in Tensor.shard (#8107) 2024-12-07 17:19:43 -05:00
geohotstan
f8294b3bda add avg pool 3d failure test (#8105)
* add test

* try simplify test case

* add TODO comment
2024-12-07 16:34:38 -05:00
qazal
6be388be86 failing test for const folding breaking indexing [pr] (#8103) 2024-12-07 19:55:02 +08:00
nimlgen
8b1fa9cb7d nv hcq queue touchups (#8102) 2024-12-07 14:09:38 +03:00
qazal
4074f52317 VALID early folding (#8100)
* fold valid

* :)

* fix test_verify_ast

* keep symbolic working
2024-12-07 18:37:47 +08:00
qazal
07b6d5cf63 assign early folding (#8093)
* assign early folding [pr]

* move to to_si

* -

* fix generate_dataset

* diff too big

* no recreation, no diff

* gzip

* new sops from tiny10

* final try
2024-12-07 17:02:55 +08:00
George Hotz
00ac0db9d4 np tensors have the memory from numpy in compile3 [pr] (#8098) 2024-12-07 14:01:51 +08:00