Sieds Lykles
86e908db57
cast parents of int64 alu to int32 if possible (#11977)
...
* add overflows helper
* add rules
* x -> y
* check overflow of u too
* cleaner
* use alu instead of replace to preserve vectorization
* just one rule
* add test
2025-09-03 11:05:04 +02:00
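The "overflows helper" mentioned in the commit body above can be sketched roughly like this — a minimal, hypothetical helper (names and signature are illustrative, not tinygrad's actual code) that decides whether a value range fits in int32:

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1

def overflows(vmin: int, vmax: int, dmin: int = INT32_MIN, dmax: int = INT32_MAX) -> bool:
    # the value range [vmin, vmax] fits in the target dtype
    # exactly when it lies inside [dmin, dmax]
    return vmin < dmin or vmax > dmax
```

When a check like this is false for every parent of an int64 ALU op, the computation can be done in int32 and cast back, which is presumably what the rewrite rule exploits.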
Sieds Lykles
033184b3cb
parse_valid with non const rhs (#11957)
...
* const to using vmin/vmax
* add test
* convert to int
* remove left over part of and
2025-09-03 08:08:46 +02:00
Sieds Lykles
53eff8970a
add Ops.GEP to _min_max (#11976)
2025-09-03 07:07:54 +02:00
Sieds Lykles
d1d0960e6e
remove intermediate cast using bounds - weaker pattern (#11974)
2025-09-03 06:24:40 +02:00
Sieds Lykles
8a2846b31a
assert embedding input is integer dtype (#11963)
...
* cast embedding input
* raise error if not using int for index embedding
2025-09-03 01:44:26 +02:00
George Hotz
1b73993521
pyrender to render uops (#11968)
...
* pyrender to render uops
* new pyrender style
* pyrender works
* list str
* store render
2025-09-02 15:44:01 -07:00
chenyu
69dd1817d0
raise RuntimeError in merge_dicts instead of assert [pr] (#11965)
2025-09-02 17:18:44 -04:00
qazal
f750c15965
viz: add python marker (#11952)
...
* viz: add python marker
* remove duplicate
2025-09-02 23:44:00 +03:00
George Hotz
550cf2ca7f
tests from postopt (#11964)
...
* tests from postopt
* reraise is fine
2025-09-02 13:34:17 -07:00
nimlgen
897254ad6c
ci: add dev<->cpu copy speeds (#11959)
2025-09-02 15:22:44 +03:00
George Hotz
0dfca4e74b
add failing test for rangeify setitem (#11954)
2025-09-01 16:24:35 -07:00
chenyu
6a40216724
correct bf16 fuzz input in test_dtype_alu (#11933)
...
it was using float16 inputs, now it's uint16 then convert to bf16
2025-09-01 10:52:26 -04:00
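The fix above — drawing uint16 bit patterns and converting them to bf16 — can be illustrated with a small sketch (this is not tinygrad's actual test code, just the underlying bit trick):

```python
import numpy as np

def bf16_from_uint16(bits: np.ndarray) -> np.ndarray:
    # bfloat16 shares float32's layout truncated to the top 16 bits, so
    # shifting a uint16 pattern into the high half of a uint32 and
    # reinterpreting it as float32 yields exactly the bf16 value
    return (bits.astype(np.uint32) << 16).view(np.float32)

vals = bf16_from_uint16(np.array([0x3F80, 0xC000], dtype=np.uint16))
# 0x3F80 -> 1.0, 0xC000 -> -2.0
```

Fuzzing with raw uint16 patterns covers the whole bf16 value space, which float16-valued inputs did not.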
chenyu
965ea59b16
test_dtype_alu use AMD_LLVM from helpers (#11950)
2025-09-01 10:03:17 -04:00
b1tg
a9f07c31bc
fix amd llvm sqrt (#11936)
...
* fix amd llvm sqrt
* lint
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-09-01 09:31:14 -04:00
qazal
0a53e72f70
viz: fix trace duration in python test decoder (#11949)
2025-09-01 14:32:25 +03:00
qazal
27c9ed5a84
viz: more consistent naming of events (#11948)
...
* s/shapes/events in test_viz
* s/bufs/events in the memory packer
2025-09-01 14:16:47 +03:00
Sieds Lykles
d9560a631c
remove cast between ints if safe (#11946)
2025-09-01 05:56:49 +02:00
Sieds Lykles
a19d689481
fix vec dtype _min_max (#11944)
2025-09-01 03:24:07 +02:00
Sieds Lykles
f32f3464d6
Can safe cast from certain ints to floats (#11941)
...
* add rule
* add some tests
* prevent infinite loop with bfloat16
* add some ints to double and float can_safe_cast
* add tests
2025-09-01 00:51:24 +02:00
Sieds Lykles
1c6e43c203
Double cast is one cast if intermediate cast is safe (#11939)
...
* add rule
* add some tests
* prevent infinite loop with bfloat16
* prevent more infinite rewrite
2025-09-01 00:36:29 +02:00
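The rule in the two commits above — dropping an exact intermediate cast — can be sketched as follows, using NumPy's `can_cast` as a stand-in for tinygrad's own safe-cast check (an assumption; the real rule operates on UOps, not arrays):

```python
import numpy as np

def fold_double_cast(x: np.ndarray, mid: type, final: type) -> np.ndarray:
    # cast(cast(x, mid), final) == cast(x, final) whenever the
    # intermediate cast is exact, so the middle step can be dropped
    if np.can_cast(x.dtype, mid, casting="safe"):
        return x.astype(final)          # one cast
    return x.astype(mid).astype(final)  # not exact: keep both casts

out = fold_double_cast(np.array([1, 2, 3], dtype=np.int16), np.float32, np.float64)
```

int16 -> float32 is exact (24-bit mantissa covers the full int16 range), so the middle cast folds away; int32 -> float32 would not qualify, matching the "certain ints to floats" restriction.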
b1tg
c1eeb3b99c
only skip AMD_LLVM (#11934)
...
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-31 18:15:47 +03:00
b1tg
75d380a77c
fix transcendentals in python renderer (#11932)
...
* fix transcendentals in python renderer
* add test
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-31 09:37:17 -04:00
Sieds Lykles
d3252ccd85
fix special vmax when arg is UOp (#11930)
2025-08-31 06:54:39 +02:00
chenyu
af89be317e
relax rtol for bfloat16 test_dtype_alu (#11926)
2025-08-30 17:16:08 -04:00
qazal
c27b99d68f
viz: refactor to indexed rewrite traces (#11923)
2025-08-30 20:01:47 +03:00
qazal
bf0d055b39
viz: color by name (#11919)
2025-08-30 16:04:58 +03:00
Sieds Lykles
0bc34c000f
simplify range mod its own upper bound (#11917)
...
* add rules
* add tests
2025-08-30 08:37:35 +02:00
chenyu
561318fea7
Tensor.cos in test_stype_alu (#11916)
...
* Tensor.cos in test_stype_alu
* need this fix anyway
2025-08-29 20:26:36 -04:00
nimlgen
c6e342cdac
mockgpu: no hang if gpuocelot failed (#11915)
2025-08-30 00:44:49 +03:00
chenyu
26d03a86a1
test_symbolic_ops.py cleanup (#11895)
2025-08-29 17:11:59 -04:00
b1tg
b2cc06218a
python bfloat16 (#11912)
...
* python bf16
* _to_torch_storage_type
---------
Co-authored-by: b1tg <b1tg@users.noreply.github.com>
2025-08-29 15:18:02 -04:00
George Hotz
afad7d0cd1
remove dtype from range, it will be dtypes.index soon [pr] (#11914)
...
* remove dtype from range, it will be dtypes.index soon [pr]
* a few more
2025-08-29 09:52:07 -07:00
George Hotz
394c2d1db1
update Kernel API in tests + move optimize_local_size (#11907)
2025-08-28 15:12:47 -07:00
nimlgen
fa695ac1ce
ci: mac gpuocelot (#11906)
...
* gm
* fix?
* ops
* imp
* xx
* add file
2025-08-28 23:29:43 +03:00
George Hotz
b9b438c516
small updates from postopt (#11903)
...
* tests from postopt
* modernize
* skip lin tests
* that's fixed?
* skip, not failure
2025-08-28 12:34:52 -07:00
Ben Waldron
ea1be2e4cd
[bounty] Remove using reshape to register symbolic shape (#11771)
...
* Modify tests and start work towards removing symbolic reshape
* Refactor symbolic reshape
* fix small error
* much cleaner + fix more tests
* Can remove this now
* Update test_symbolic_ops and test_tiny
* Couple more tests
* Unused import
* More tests and add EXPAND to Tensor.empty
* Fix test beam search
* all int
* Fix rangeify by adding shrink
* Remove OOB check and so fix test_symbolic_jit
* test_symbolic_jit doesn't need OOB Context anymore either
* Should remove that test now
* Cleanups part 1
* fix linters
* Final cleanups
* Don't reassign inside for loop
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 12:30:49 -04:00
Ben Waldron
17ecaf4682
Add test_variable_empty (#11889)
...
* Add test_variable_empty
* Move test and add TODO
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 11:38:27 -04:00
Nino Risteski
54be477152
rope cache optim for jit prune in llm.py (#11678)
...
* rope cache optim for jit prune
* rope test
* tests in test attention
* Revert "rope test"
This reverts commit 69ede543d0.
* lint
2025-08-28 08:31:29 -07:00
quortus
5f8fe9a331
Replace ASSIGN with STORE in test_linearizer (#11821)
2025-08-28 07:33:20 -07:00
geohotstan
4e8370309c
Support onnx If OP (#11648)
...
* start
* tiny clean up
* whoops, didn't mean to accidentally fix this
* fix .to(device), kinda hacky and this fix makes it slower?
* merge properly
* FINALLY figured out slowness, also hack pylint for now
* add DEBUGONNX print for subgraph
* oops
* WOOOOOOOO SHAPE CACHE 50% SPEED INCREASE
* small fix, but maybe all deterministic Tensor creation in fp should be cached
* cache condition
* sliiiightly cleaner
* better abstraction?
* remove sam from model_benchmark
* remove shape cache speed up for now
* less lines
* isinstance fix
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-08-28 10:17:35 -04:00
George Hotz
6d6f0dada7
support for tuple ranges (#11890)
...
* support for tuple ranges
* breaks it
2025-08-28 07:02:31 -07:00
chenyu
beb5982165
FUSE_ATTENTION (#11884)
2025-08-27 19:59:17 -04:00
nimlgen
44816218b5
memplan: fix large buffers planning (#11878)
...
* memplan: fix large buffers planning
* fix
* fix dsp
2025-08-27 23:54:27 +03:00
nimlgen
4006366752
Revert "memplan: fix large buffers planning (#11876)" (#11877)
...
This reverts commit 7f90497efc.
2025-08-27 22:36:14 +03:00
nimlgen
7f90497efc
memplan: fix large buffers planning (#11876)
...
* memplan: fix large buffers planning
* fix
2025-08-27 22:04:15 +03:00
Jordan Chalupka
e9789d8a70
Add mxfp4 support (#11873)
...
* bump ggml url
* map mxfp4 to tensor
* tests
2025-08-27 10:56:56 -07:00
Sieds Lykles
d39365809a
add ctx to z3_renderer arg (#11867)
...
* add ctx to z3_renderer arg
* update symbolic fuzzer
* rewrite u1,u2,u3
* update fuzz_fast_idiv
* remove imports
2025-08-27 03:38:15 +02:00
chenyu
7028cb4167
clean up TestBitcastConstFolding (#11856)
2025-08-26 15:26:47 -04:00
George Hotz
b268755d51
small changes from postopt (#11854)
2025-08-26 11:56:16 -07:00
Sieds Lykles
a3aeef45cc
associative variation of where branch-merging (#11851)
...
* add rule and test
* change comment
2025-08-26 19:27:05 +02:00