chenyu
cb5702f170
tiny cleanup to transcendental xexp2 ( #7326 )
...
also added test for exp and log of nan and inf
2024-10-27 21:54:20 -04:00
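The nan/inf edge cases such a test exercises follow IEEE-754 semantics. A minimal sketch of the expected behavior using Python's `math` module (illustrative only, not the tinygrad test itself):

```python
import math

# IEEE-754 special-value behavior that exp/log implementations should honor
inf, nan = float("inf"), float("nan")

assert math.exp(inf) == inf       # exp(+inf) -> +inf
assert math.exp(-inf) == 0.0      # exp(-inf) -> 0
assert math.isnan(math.exp(nan))  # exp(nan)  -> nan
assert math.log(inf) == inf       # log(+inf) -> +inf
assert math.isnan(math.log(nan))  # log(nan)  -> nan
```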
chenyu
4c855ae692
unit test transcendental helpers ( #7325 )
...
added a test to run UOps with const inputs. seems to have an issue with both payne_hanek_reduction and cody_waite_reduction
2024-10-27 19:55:00 -04:00
qazal
8d9459f281
always run process replay with contextvars ( #7323 )
...
* always run process replay with contextvars [pr]
* not the last two
* extra
* no pr
2024-10-27 20:44:42 +02:00
qazal
adcdaa17bb
map BUFFER to Metadata [pr] ( #7324 )
2024-10-27 20:31:04 +02:00
qazal
d634261c51
late buffer uops [pr] ( #7322 )
2024-10-27 19:34:01 +02:00
chenyu
cdbe08b94b
use UOp.render in colored_shape ( #7321 )
...
similar to the function name, print the rendered str instead of the raw UOp
2024-10-27 11:42:31 -04:00
chenyu
4a03e00aa1
fix llama3 download_model assert ( #7320 )
...
false positive when neither download_model nor model is provided
2024-10-27 11:20:24 -04:00
talati
d4d201d87b
fixing branch condition on UOps.IF in the ptx renderer ( #7315 )
...
* fixing branch condition on UOps.IF in the ptx renderer
* ptx works
---------
Co-authored-by: Nick Talati <nick.talati@quantworks.com>
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
Co-authored-by: qazal <qazal.software@gmail.com>
2024-10-27 14:27:38 +02:00
qazal
a410b46c1d
unskip test_gated_store_with_if [pr] ( #7319 )
2024-10-27 14:03:12 +02:00
Maximilian Wolf
3c992250d5
Failing test: different behavior on different devices ( #7193 )
...
* add minimal failing test
* more tiny makes linter happy
* tinyfy
* no walrus in assert
* a tiny bit simpler
* minimal
* better place, better name, expected failure
* skip devices with correct behavior
2024-10-27 09:53:58 +08:00
eliotgolding
e920f1d663
Llama 3.2 1B load from GGUF ( #7295 )
...
* gguf 1b-instruct
* not needed
2024-10-27 09:29:02 +08:00
chenyu
d66fe7a66f
fix simplify_valid ( #7313 )
...
the simplex should compare with valid bound, not its vmin
2024-10-26 14:21:12 -04:00
chenyu
0a4d01f6d4
disable simplify_valid ( #7312 )
...
this fixed test_failure_55. will re-enable it later after fixing the bug
2024-10-26 12:42:48 -04:00
nimlgen
293714610a
capture beam log runtime errors ( #7311 )
2024-10-26 13:59:45 +03:00
nimlgen
3c62315aa8
add resnet pf ( #7310 )
...
* add resnet pf
* all platforms
2024-10-26 13:20:32 +03:00
nimlgen
68cd2c0669
nv correct local memory based on device ( #7307 )
...
* nv correct local memory based on device
* linter
* oops
* oops2
2024-10-25 22:23:42 +03:00
chenyu
2ddfb9678a
update exponent_bias in transcendental ( #7304 )
...
from https://en.wikipedia.org/wiki/Exponent_bias : the biases are 15, 127, and 1023 (fp16, fp32, fp64)
2024-10-25 10:45:49 -04:00
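The three values in the message are the standard IEEE-754 exponent biases, which follow from the exponent field width. A quick check of the generic formula (not tinygrad's code):

```python
# IEEE-754 exponent bias for a format with e exponent bits: 2**(e-1) - 1
def exponent_bias(exponent_bits: int) -> int:
    return 2 ** (exponent_bits - 1) - 1

assert exponent_bias(5) == 15     # float16 has 5 exponent bits
assert exponent_bias(8) == 127    # float32 has 8 exponent bits
assert exponent_bias(11) == 1023  # float64 has 11 exponent bits
```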
chenyu
e7cd21c5e3
remove custom render in test_simplify_valid_idx ( #7303 )
...
use UOp render to compare
2024-10-25 10:20:26 -04:00
chenyu
4688c01e3e
transcendental cleanups ( #7301 )
...
simplified polyN and some redundant line cleanups
2024-10-25 09:30:25 -04:00
George Hotz
aadf688aeb
order flipper as *normal* rewrite rule ( #7300 )
...
* instant isn't actually used [pr]
* order flipper as *normal* rewrite rule
* fix inf loop
* need simplify now
2024-10-25 21:28:30 +08:00
George Hotz
3c31497f55
instant isn't actually used [pr] ( #7299 )
...
* instant isn't actually used [pr]
* tolerance bump
2024-10-25 21:01:29 +08:00
George Hotz
199a991237
line reduction [pr] ( #7296 )
2024-10-25 17:05:09 +07:00
George Hotz
4fed358511
hotfix: timeouts to 20 minutes. better no stats update than a red x
2024-10-25 16:31:52 +08:00
George Hotz
dc3148c677
hotfix: minor speed increase + stable diffusion relax
2024-10-25 16:27:21 +08:00
George Hotz
4812801aa6
try for canonical order ( #7286 )
...
* try for canonical order
* cmp better
* disable bad tests
* flip const order
* fix test
* fix tests
* different fix for NOOP
* metaclass here
* fix tests
* narrower scope
2024-10-25 16:04:54 +08:00
George Hotz
d3500af71b
move consts last in uop toposort ( #7290 )
...
* move consts last in uop toposort
* consts first in toposort
2024-10-25 14:58:48 +08:00
qazal
e3c9c94896
Revert "move anything that isn't bfs [pr] ( #7273 )" ( #7289 )
...
This reverts commit b805711f86.
2024-10-25 14:38:30 +08:00
qazal
0b47eca085
schedule.py reorders [pr] ( #7285 )
...
* schedule.py reorders [pr]
* diff
* more renames
2024-10-25 14:30:23 +08:00
George Hotz
004af512e6
try all matches in the function ( #7288 )
2024-10-25 14:17:04 +08:00
George Hotz
bcf0537653
canonicalize the order prereqs ( #7283 )
...
* canonicalize the order
* don't change that yet
* that order isn't safe with uops
2024-10-25 11:37:51 +08:00
qazal
603d637105
split to fuse.py and schedule.py [pr] ( #7284 )
2024-10-25 06:17:24 +03:00
qazal
698457c5ce
big graph ScheduleContext [pr] ( #7282 )
2024-10-25 05:58:23 +03:00
qazal
b805711f86
move anything that isn't bfs [pr] ( #7273 )
2024-10-25 05:34:21 +03:00
George Hotz
6dc7d3c949
instant uop rules [pr] ( #7263 )
...
* instant uop rules [pr]
* real instant
* only instant folder
* better diff
* instant means instant
* Revert "instant means instant"
This reverts commit e58d9161bf.
2024-10-25 10:32:45 +08:00
chenyu
90f720d703
limit idiv by neg bound to only if s0 is non-negative [pr] ( #7277 )
...
also updated the tests when div by negative const
2024-10-24 15:46:50 -04:00
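The caveat with a negative divisor is that integer division becomes decreasing in the numerator, so the min/max bounds of the quotient swap. A sketch using Python's floor division (whether this matches tinygrad's idiv semantics exactly is an assumption):

```python
# With a negative divisor c, floor division is decreasing in x, so for a
# non-negative x in [lo, hi] the quotient lies in [hi // c, lo // c].
lo, hi, c = 0, 10, -3
quotients = [x // c for x in range(lo, hi + 1)]
assert min(quotients) == hi // c  # -4, from the largest numerator
assert max(quotients) == lo // c  # 0, from the smallest numerator
```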
chenyu
d4c94d0d32
disable llama 1 4gpu and 6gpu benchmark ( #7276 )
...
having llama3 4gpu and 6gpu should be good enough
2024-10-24 14:19:22 -04:00
chenyu
e6929f2402
RUN_PROCESS_REPLAY=0 on llama 70B and resnet training ( #7272 )
...
* RUN_PROCESS_REPLAY=0 on llama 70B and resnet training
also added a 15-minute total timeout; this cannot grow indefinitely
* add a few more
* a few more just for NV
2024-10-24 12:09:54 -04:00
chenyu
b777cfdcba
update test_max_simplify_and_cancel ( #7270 )
...
it's fixed and no longer dumb
2024-10-24 10:29:05 -04:00
qazal
d3953c6c55
split the preload path [pr] ( #7271 )
2024-10-24 17:28:03 +03:00
chenyu
acad11ea8e
minor cleanup to View add [pr] ( #7247 )
2024-10-24 09:18:47 -04:00
nimlgen
98f8d0ccf9
nv limit max local memory with envvar ( #7265 )
2024-10-24 16:01:50 +03:00
qazal
f20b651ee0
prescheduling refactor from big graph [pr] ( #7268 )
...
* prescheduling refactor from big graph [pr]
* finally replay
2024-10-24 14:55:07 +03:00
George Hotz
2e6ec43c49
hotfix: become the latest for process replay
2024-10-24 19:02:59 +08:00
George Hotz
9a3d498d9c
with commutative hack, uops can change. fix that ( #7266 )
...
* with commutative hack, uops can change. fix that
* simpler
2024-10-24 18:50:23 +08:00
qazal
d482d927a8
hotfix: nobody uses [run_process_replay] [pr] ( #7264 )
2024-10-24 13:37:29 +03:00
qazal
fa5dc7857a
assign toposort with big graph, bfs [pr] ( #7242 )
...
* assign toposort with big graph, bfs [pr]
* cycle
* merge 2
* filter bufs
* delete inputs
2024-10-24 13:09:01 +03:00
George Hotz
4d081eb560
double mod is single mod ( #7262 )
...
* double mod is single mod
* unused name
2024-10-24 18:02:51 +08:00
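The rule folds a repeated modulus: `(x % c) % c == x % c`, since the first mod already lands the value in range. A brute-force check of the identity (illustrative, not the tinygrad rewrite rule itself):

```python
# (x % c) % c == x % c: the second mod is a no-op because x % c is
# already in [0, c) for a positive modulus c (Python semantics).
for x in range(-20, 20):
    for c in (1, 3, 7):
        assert (x % c) % c == x % c
```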
George Hotz
e4631a47f4
symbolic arange support ( #7252 )
...
* symbolic arange support WIP [pr]
* smin/smax from old try
* pad2d symbolic works
* real test
* sym arange
* symbolic arange test passes
* double mod is single mod
* lol that's not right
* more tests
* Update ops.py
2024-10-24 17:55:53 +08:00
George Hotz
23b26d40d5
small max rules + windows VIZ ( #7261 )
...
* a rule from smax
* that rule too
* fix VIZ on windows
2024-10-24 17:43:39 +08:00
George Hotz
415186da3c
Revert "some rules to simplify max ( #7258 )" ( #7260 )
...
This reverts commit b56fab54ea.
2024-10-24 17:15:52 +08:00