tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-23 22:08:08 -05:00

Author	SHA1	Message	Date
George Hotz	fcbd0e4de3	assigns are no longer used [pr] (#11333 )	2025-07-22 15:35:07 -07:00
George Hotz	09431d4ad1	make DEFINE_REG behave like the others (#11273 ) * simpler define reg * cast * PTRCAT define_acc * cleanups * fix uops stats * fix linearizer tests * llvm * define reg sets const * define reg sets const * no assign * collapse that * fix test_max_pool2d_bigger_stride_dilation * use index, fix webgpu * devec * fix tests * fix webgpu * fix llvm * threads for python * fix ops_python * only for reg * acc_half is real now in the emulator * fix llvm * fix webgpu init * fix wgpu test * fix some tests * fix ptx * fix ptx bool acc * cleanups * broken, meh. will fix with ENDRANGE * line count	2025-07-22 13:53:56 -07:00
chenyu	4535908679	update keccak test_long (#11331 ) it should compare with arg "shake_128"	2025-07-22 16:08:01 -04:00
nimlgen	3faa352dcc	am: bump version after mm changes (#11328 )	2025-07-22 21:54:10 +03:00
George Hotz	affd83961c	small changes from define_reg (#11327 ) * small changes from define_reg * fix webgpu	2025-07-22 11:11:48 -07:00
nimlgen	53b3d87456	am: use 4-lvl pdir (#11326 )	2025-07-22 20:58:15 +03:00
chenyu	2d7c28de6a	clean up dup lambdas in helper_test_exception (#11325 )	2025-07-22 12:21:57 -04:00
chenyu	c6aa8e58ca	fix TestDropoutProbabilityEdgeCases (#11322 )	2025-07-22 11:13:56 -04:00
chenyu	fb42c84365	merge TestRollEdgeCases into test_ops (#11321 )	2025-07-22 10:55:57 -04:00
chenyu	1d8b3e9d1c	movementop only Tensor.roll (#11317 ) * movementop only Tensor.roll * fixed	2025-07-22 10:34:15 -04:00
chenyu	a41140241b	truncate unsigned const in cstyle (#11318 ) it can be a warning or a hard error in clang PTX and PYTHON also need fix, skipping for now	2025-07-22 08:02:12 -04:00
qazal	6668d6d241	fix word_wrap with newlines in input string [pr] (#11319 )	2025-07-22 12:03:13 +03:00
qazal	0c4e19f270	hotfix: disable process replay in REMOTE=1 tests (#11320 ) * hotfix: disable process replay in REMOTE=1 tests * comment	2025-07-22 10:41:58 +03:00
George Hotz	3b674df34b	generic changes from define_reg_2 (#11315 ) * generic changes from define_reg_2 * fix for ptx * ugh, that one	2025-07-21 15:14:06 -07:00
chenyu	6e9506e6fd	Tensor.roll supports dims=None (#11313 )	2025-07-21 17:29:23 -04:00
George Hotz	108aac8af4	use AddrSpace instead of local (#11314 ) * use AddrSpace instead of local * addrspace in test	2025-07-21 14:00:06 -07:00
chenyu	d3a93185a6	clean up test_roll (#11312 )	2025-07-21 16:00:50 -04:00
George Hotz	532b52fcef	store has a dtype, like assign (#11309 ) * store has a dtype, like assign * fix upat * fix test	2025-07-21 12:50:01 -07:00
geohotstan	445ff8de56	ONNX onnx_parser and buffer_parse clean up (#11000 ) * start * remove onnx.load from compile4 and move np to dropout * clean up and enable test * clean up * move WebGPU ONNX test into MacOS (WebGPU) * leave test in ONNX (CPU) * fix raw_data init None, and simplify onnx_runner test a little? * THESE TESTS ARE SO UGLY UGHH * need to really think about how to structure the test * wow LLMs are quite something * not always on disk now * also add external data loading test * cleaner tests * minimize diff and add const folding tests * add external data loading too * whoops add webgpu back.. but why was it not needed in the first place? * better comment * move webgpu test to macos(webgpu)? * llm english so much better than me wow * trigger CI to check flakiness --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-07-21 15:10:25 -04:00
George Hotz	842184a1ab	rename kernelize to schedule, try 2 (#11305 )	2025-07-21 11:18:36 -07:00
George Hotz	7e8f5dde74	matmul style is still reshape (#11308 )	2025-07-21 11:14:57 -07:00
George Hotz	41de76a7fd	put assign and store next to each other [pr] (#11306 )	2025-07-21 11:07:35 -07:00
nimlgen	de2df92551	hcq: use devices instead of ids in HCQGraph (#11303 ) * hcq: use devices instead of ids in HCQGraph * fiz	2025-07-21 20:03:12 +03:00
wozeparrot	30ce16a424	feat: failing test for long keccak (#11292 )	2025-07-21 12:49:23 -04:00
uuuvn	178dbf3f66	Remote scheduler changes (#11177 )	2025-07-21 09:29:44 -07:00
वेदांत	e368628736	Add amin support to Tensor operations in Torch backend (#11290 ) * intiger div mod fix * Revert "intiger div mod fix" This reverts commit `d5d2f201bf`. * feat arg_min support * tets update * test fix	2025-07-21 09:14:08 -04:00
qazal	5eb54e2499	viz: close event streams before profiler render (#11300 )	2025-07-21 15:42:31 +03:00
nimlgen	cc3c1e4c14	hcq: move cpu to hcq (#11262 ) * hcq: move cpu to hcq * import time * upd * fix * windows support * hm * cleaner * fix timer * fix timing * std is ns * skip profiler * mypy * cleaner * cleanups * after merge * default is back	2025-07-21 15:10:38 +03:00
nimlgen	816c01c2d4	hcq: default copy_queue_t=None (#11297 )	2025-07-21 14:45:20 +03:00
qazal	6520a7fcb6	viz: factorize event stream (#11298 )	2025-07-21 14:42:00 +03:00
nimlgen	9c533e5c38	hcq: cpu prereq (#11296 )	2025-07-21 13:35:18 +03:00
nimlgen	e87a42e243	hcq: prepare for windows (#11293 ) * hcq: prepare for windows * comments	2025-07-21 13:08:56 +03:00
nimlgen	df3ba0a7c0	autogen: fix imports in libusb (#11294 )	2025-07-21 13:04:27 +03:00
nimlgen	dd6a2d432f	hcq: default timestamp metrics is ns (#11295 )	2025-07-21 12:56:30 +03:00
wozeparrot	53345ef4e2	feat: make ops_disk work on block devices (#11291 )	2025-07-20 14:39:50 -07:00
qazal	3002c63b1e	process replay: optionally pass tinygrad import error (#11289 ) * process replay: optionally pass tinygrad import error * gate all tinygrad internals * s/getenv/os.getenv pre import * diff	2025-07-20 22:57:56 +03:00
chenyu	9e3a593313	minor kernel.py cleanups [pr] (#11286 )	2025-07-20 10:15:31 -04:00
quortus	5f17927a87	Shorten UOp.load method (#11285 )	2025-07-20 13:48:04 +03:00
chenyu	54924f9969	type remove Union and Optional [pr] (#11283 ) use `|` for consistency	2025-07-19 14:05:52 -04:00
nimlgen	2f72be5055	nv_smi: init basic insmod/rmmod/reset cmds (#11282 )	2025-07-19 15:43:03 +03:00
qazal	577e581943	fix typo in sqtt/readme (#11281 )	2025-07-19 15:10:24 +03:00
nimlgen	188ed38315	replace from_mv with lightweight mv_address (#11280 )	2025-07-19 13:50:51 +03:00
quortus	1a25e27f32	Do not produce out of spec intermediate UOp in gated LOAD/STORE folding (#11207 ) Co-authored-by: chenyu <chenyu@fastmail.com>	2025-07-18 15:42:55 -04:00
chenyu	ec3efd2919	move upcast before reduce (#11250 ) * move upcast before reduce upcast goes to end of global+local+upcast * r_196_32_4_24_8	2025-07-18 14:42:15 -04:00
chenyu	be2f4336e6	use onnx 1.18.0 in DSP test (#11279 )	2025-07-18 14:09:23 -04:00
nimlgen	9a88bd841c	hcq: refactor into peer_groups (#11277 ) * hcq: refactor into peer_groups * fix fors * fixes * ooops * mypy * tiny fixes	2025-07-18 16:34:18 +03:00
nimlgen	f432eef708	hcq: rename CPU -> KICK in graph for kickoff signal (#11278 )	2025-07-18 15:54:35 +03:00
quortus	52bbd9900b	[pr] Stable tensor order in _find_all_tensors_for_uops (#11276 ) * Use dict for all_tensors to get stable tensor order in _find_all_tensors_for_uops * Rerun tests	2025-07-18 13:12:01 +03:00
chenyu	c5a5d74642	Revert "image_dot of 2 half inputs returns half (#11007 )" (#11274 ) This reverts commit `fa8e08f922`.	2025-07-17 17:34:18 -04:00
Utkarsh Gill	fa8e08f922	image_dot of 2 half inputs returns half (#11007 ) * cast after sum * comment out skipif * minor fix * only test IMAGE * IMAGE is supported now * simpler * simplerr * only cast if dtype is None * dont need to change base_imaeg_type * only cast when dtype is half * add explicit test * actually no, workflow seems better * actually, keep both * move test * fix indent --------- Co-authored-by: Utkarsh Gill <engelbart@Utkarshs-MacBook-Pro.local>	2025-07-17 13:47:22 -07:00

9556 Commits