tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-23 22:08:08 -05:00

Author	SHA1	Message	Date
nimlgen	ca09c180dc	cpu: remove del spam (#11343 ) * cpu: remove del spam * fix	2025-07-23 12:02:37 +03:00
nimlgen	304eb9cecb	allocate less memory in am tests (#11342 )	2025-07-23 11:11:26 +03:00
George Hotz	e14b4fefa5	ranges on store (#11334 ) * ranges on store * fix store spec * fix that * fix gates * fix tests * fix ptx	2025-07-22 21:00:50 -07:00
George Hotz	c65b5aab62	small things from endrange (#11339 ) * small things from endrange * store	2025-07-22 19:45:37 -07:00
George Hotz	53339e62f7	no gate store anymore (#11338 ) * no gate store anymore * fix up spec	2025-07-22 18:41:15 -07:00
chenyu	7a9a5cfd28	isolate test/external/external_test_am.py (#11335 ) seems to be the one crashing, also remove -n=auto for that	2025-07-22 19:02:20 -04:00
George Hotz	fcbd0e4de3	assigns are no longer used [pr] (#11333 )	2025-07-22 15:35:07 -07:00
George Hotz	09431d4ad1	make DEFINE_REG behave like the others (#11273 ) * simpler define reg * cast * PTRCAT define_acc * cleanups * fix uops stats * fix linearizer tests * llvm * define reg sets const * define reg sets const * no assign * collapse that * fix test_max_pool2d_bigger_stride_dilation * use index, fix webgpu * devec * fix tests * fix webgpu * fix llvm * threads for python * fix ops_python * only for reg * acc_half is real now in the emulator * fix llvm * fix webgpu init * fix wgpu test * fix some tests * fix ptx * fix ptx bool acc * cleanups * broken, meh. will fix with ENDRANGE * line count	2025-07-22 13:53:56 -07:00
chenyu	4535908679	update keccak test_long (#11331 ) it should compare with arg "shake_128"	2025-07-22 16:08:01 -04:00
nimlgen	3faa352dcc	am: bump version after mm changes (#11328 )	2025-07-22 21:54:10 +03:00
George Hotz	affd83961c	small changes from define_reg (#11327 ) * small changes from define_reg * fix webgpu	2025-07-22 11:11:48 -07:00
nimlgen	53b3d87456	am: use 4-lvl pdir (#11326 )	2025-07-22 20:58:15 +03:00
chenyu	2d7c28de6a	clean up dup lambdas in helper_test_exception (#11325 )	2025-07-22 12:21:57 -04:00
chenyu	c6aa8e58ca	fix TestDropoutProbabilityEdgeCases (#11322 )	2025-07-22 11:13:56 -04:00
chenyu	fb42c84365	merge TestRollEdgeCases into test_ops (#11321 )	2025-07-22 10:55:57 -04:00
chenyu	1d8b3e9d1c	movementop only Tensor.roll (#11317 ) * movementop only Tensor.roll * fixed	2025-07-22 10:34:15 -04:00
chenyu	a41140241b	truncate unsigned const in cstyle (#11318 ) it can be a warning or a hard error in clang PTX and PYTHON also need fix, skipping for now	2025-07-22 08:02:12 -04:00
qazal	6668d6d241	fix word_wrap with newlines in input string [pr] (#11319 )	2025-07-22 12:03:13 +03:00
qazal	0c4e19f270	hotfix: disable process replay in REMOTE=1 tests (#11320 ) * hotfix: disable process replay in REMOTE=1 tests * comment	2025-07-22 10:41:58 +03:00
George Hotz	3b674df34b	generic changes from define_reg_2 (#11315 ) * generic changes from define_reg_2 * fix for ptx * ugh, that one	2025-07-21 15:14:06 -07:00
chenyu	6e9506e6fd	Tensor.roll supports dims=None (#11313 )	2025-07-21 17:29:23 -04:00
George Hotz	108aac8af4	use AddrSpace instead of local (#11314 ) * use AddrSpace instead of local * addrspace in test	2025-07-21 14:00:06 -07:00
chenyu	d3a93185a6	clean up test_roll (#11312 )	2025-07-21 16:00:50 -04:00
George Hotz	532b52fcef	store has a dtype, like assign (#11309 ) * store has a dtype, like assign * fix upat * fix test	2025-07-21 12:50:01 -07:00
geohotstan	445ff8de56	ONNX onnx_parser and buffer_parse clean up (#11000 ) * start * remove onnx.load from compile4 and move np to dropout * clean up and enable test * clean up * move WebGPU ONNX test into MacOS (WebGPU) * leave test in ONNX (CPU) * fix raw_data init None, and simplify onnx_runner test a little? * THESE TESTS ARE SO UGLY UGHH * need to really think about how to structure the test * wow LLMs are quite something * not always on disk now * also add external data loading test * cleaner tests * minimize diff and add const folding tests * add external data loading too * whoops add webgpu back.. but why was it not needed in the first place? * better comment * move webgpu test to macos(webgpu)? * llm english so much better than me wow * trigger CI to check flakiness --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-07-21 15:10:25 -04:00
George Hotz	842184a1ab	rename kernelize to schedule, try 2 (#11305 )	2025-07-21 11:18:36 -07:00
George Hotz	7e8f5dde74	matmul style is still reshape (#11308 )	2025-07-21 11:14:57 -07:00
George Hotz	41de76a7fd	put assign and store next to each other [pr] (#11306 )	2025-07-21 11:07:35 -07:00
nimlgen	de2df92551	hcq: use devices instead of ids in HCQGraph (#11303 ) * hcq: use devices instead of ids in HCQGraph * fiz	2025-07-21 20:03:12 +03:00
wozeparrot	30ce16a424	feat: failing test for long keccak (#11292 )	2025-07-21 12:49:23 -04:00
uuuvn	178dbf3f66	Remote scheduler changes (#11177 )	2025-07-21 09:29:44 -07:00
वेदांत	e368628736	Add amin support to Tensor operations in Torch backend (#11290 ) * intiger div mod fix * Revert "intiger div mod fix" This reverts commit `d5d2f201bf`. * feat arg_min support * tets update * test fix	2025-07-21 09:14:08 -04:00
qazal	5eb54e2499	viz: close event streams before profiler render (#11300 )	2025-07-21 15:42:31 +03:00
nimlgen	cc3c1e4c14	hcq: move cpu to hcq (#11262 ) * hcq: move cpu to hcq * import time * upd * fix * windows support * hm * cleaner * fix timer * fix timing * std is ns * skip profiler * mypy * cleaner * cleanups * after merge * default is back	2025-07-21 15:10:38 +03:00
nimlgen	816c01c2d4	hcq: default copy_queue_t=None (#11297 )	2025-07-21 14:45:20 +03:00
qazal	6520a7fcb6	viz: factorize event stream (#11298 )	2025-07-21 14:42:00 +03:00
nimlgen	9c533e5c38	hcq: cpu prereq (#11296 )	2025-07-21 13:35:18 +03:00
nimlgen	e87a42e243	hcq: prepare for windows (#11293 ) * hcq: prepare for windows * comments	2025-07-21 13:08:56 +03:00
nimlgen	df3ba0a7c0	autogen: fix imports in libusb (#11294 )	2025-07-21 13:04:27 +03:00
nimlgen	dd6a2d432f	hcq: default timestamp metrics is ns (#11295 )	2025-07-21 12:56:30 +03:00
wozeparrot	53345ef4e2	feat: make ops_disk work on block devices (#11291 )	2025-07-20 14:39:50 -07:00
qazal	3002c63b1e	process replay: optionally pass tinygrad import error (#11289 ) * process replay: optionally pass tinygrad import error * gate all tinygrad internals * s/getenv/os.getenv pre import * diff	2025-07-20 22:57:56 +03:00
chenyu	9e3a593313	minor kernel.py cleanups [pr] (#11286 )	2025-07-20 10:15:31 -04:00
quortus	5f17927a87	Shorten UOp.load method (#11285 )	2025-07-20 13:48:04 +03:00
chenyu	54924f9969	type remove Union and Optional [pr] (#11283 ) use `|` for consistency	2025-07-19 14:05:52 -04:00
nimlgen	2f72be5055	nv_smi: init basic insmod/rmmod/reset cmds (#11282 )	2025-07-19 15:43:03 +03:00
qazal	577e581943	fix typo in sqtt/readme (#11281 )	2025-07-19 15:10:24 +03:00
nimlgen	188ed38315	replace from_mv with lightweight mv_address (#11280 )	2025-07-19 13:50:51 +03:00
quortus	1a25e27f32	Do not produce out of spec intermediate UOp in gated LOAD/STORE folding (#11207 ) Co-authored-by: chenyu <chenyu@fastmail.com>	2025-07-18 15:42:55 -04:00
chenyu	ec3efd2919	move upcast before reduce (#11250 ) * move upcast before reduce upcast goes to end of global+local+upcast * r_196_32_4_24_8	2025-07-18 14:42:15 -04:00

9562 Commits