tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Author	SHA1	Message	Date
George Hotz	caee42e8a6	Revert "name from uops [pr] (#9151 )" (#9154 ) This reverts commit `28897be9a2`.	2025-02-18 16:06:44 +08:00
George Hotz	28897be9a2	name from uops [pr] (#9151 )	2025-02-18 15:52:03 +08:00
George Hotz	a4dab3ec3f	add name uop (#9149 ) * add name uop, TODO: refactor renderer to use * renderer uses name uop * fix tests * render * ptx	2025-02-18 15:26:58 +08:00
George Hotz	df3b320f46	rewriter -> devectorizer [pr] (#9147 )	2025-02-18 12:42:08 +08:00
chenyu	465421b525	fix Tensor.isclose (#9143 ) many corner cases around inf and nan	2025-02-17 12:03:12 -05:00
qazal	36741cbbc1	enable real_size assert for test_conv_2x2_backward_one_view [pr] (#9142 )	2025-02-17 17:53:44 +01:00
Ali Ladjevardi	35e9c4657b	Use proper units when printing beam time (#9103 ) * use proper units when printing beam time * refactor DEBUG=2	2025-02-17 23:41:38 +08:00
Clément Verrier	a7f91224eb	add `Tensor.isclose()` (#8844 ) * add `Tensor.isclose()` * support `equal_nan` so as to match PyTorch's behavior * update unit tests * remove some tests temporarily * re-enable one test * re-enable other test * try to fix failing tests during CI * save one line of code --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-17 10:11:40 -05:00
qazal	660c034da6	KERNEL op try 3 (#9061 ) * work * tolerate shape, maybe this is ASSIGN(RESHAPE(BUF), KERNEL) * err, it's not ASSIGN(BUF, KERNEL), it's ASSIGN(VIEW(BUF), KERNEL) * burn the boats * assign slightly works * assign works * cleanup + var_vals can exist * fine image + fix metadata * metadata, without making everything 30% slower * diff pruning * faster assign schedule * add_buffer_ops stage * add kernel_spec back * add viz display * more strict kernel_spec	2025-02-17 14:47:54 +01:00
George Hotz	4dd10d03b7	move is_increasing to ops [pr] (#9134 )	2025-02-17 19:27:48 +08:00
George Hotz	1bf66d62cf	symbolic gets its own file [pr] (#9132 )	2025-02-17 18:55:21 +08:00
George Hotz	bd694faf6c	factor out the expander logic [pr] (#9131 )	2025-02-17 18:09:48 +08:00
quortus	5bdf0c7951	Bitcast constant folding 2.0 (#9089 ) * Prevent const folding in test_payne_hanek_reduction * Do not use list as a default parameter * Bitcast constant folding --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-17 18:08:20 +08:00
quortus	2be4529f14	Test broken const folding wraparound behavior (#9080 ) * Test broken const folding wraparound behavior * Add repro for test_payne_hanek_reduction const folding bug --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-17 17:44:56 +08:00
quortus	638d925e4e	Prevent const folding in test_payne_hanek_reduction (#9088 ) * Prevent const folding in test_payne_hanek_reduction * Do not use list as a default parameter	2025-02-17 17:31:10 +08:00
George Hotz	9289425170	add ast to ProgramSpec + pre matcher [pr] (#9128 ) * add ast to ProgramSpec + pre matcher [pr] * cleaner cast + test fix	2025-02-17 16:39:14 +08:00
quortus	edf7213f34	Make bitcast to the same dtype noop (#9121 )	2025-02-16 20:28:44 -05:00
Ahmed Harmouche	59fe45f947	Solve get_grouped_dims does not split issue (#9085 ) * Solve dims too large errors on webgpu * Simplify divisor find * Test square root divisor * Fix lint * Refactor into group_dims and split_dims * Refactor * Fix lint * Add back max check in _group_dims * Prefer grouping over split --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-02-16 19:57:29 -05:00
chenyu	c954419bc8	minor tweak to transcendental pow (#9112 ) also added more pow with const test cases	2025-02-15 18:03:25 -05:00
chenyu	8dfa0024f0	raise in scatter if self and src have different dtype [pr] (#9109 ) raise RuntimeError that matches torch instead of an implicit cast	2025-02-15 11:21:34 -05:00
George Hotz	4672d9af73	actual tests for the dsp backend [pr] (#9102 ) * actual tests for the dsp backend [pr] * fix name	2025-02-15 15:17:56 +08:00
Marcello Fuschi	8824f7e9df	Make logcumsumexp numerically stable (#9050 ) * Make logcumsumexp numerically stable * Refactor * Refactor for special case ndim=0 * Refactor * Use the correct device for mask --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-02-14 19:25:17 -05:00
b1tg	1f1362fd27	add truncate_bf16 (#9078 ) Co-authored-by: b1tg <b1tg@users.noreply.github.com>	2025-02-15 07:59:09 +08:00
chenyu	73af42aeab	fix pow backward when base is 0 (#9075 )	2025-02-13 21:06:01 -05:00
qazal	2d04a75a40	start tracking bottom_up_rewrite in viz [pr] (#9071 ) * start tracking bottom_up_rewrite in viz [pr] * use the tracking matcher in test_viz	2025-02-14 00:28:10 +01:00
chenyu	5ef48bbe0a	swap order in rsqrt (#9069 ) fixed backward for 0	2025-02-13 16:51:21 -05:00
chenyu	e02e3b94c3	remove SQRT hack in llvm (#9067 ) replaced with xpow 0.5 in transcendental. fixed sqrt(0) backward	2025-02-13 15:42:34 -05:00
chenyu	947c97e6ff	add test_sqrt to test_speed_v_torch (#9066 ) working on getting rid of llvm sqrt hack	2025-02-13 15:25:54 -05:00
chenyu	49abc09f77	remove the reshapes in test_arange_2_reduce [pr] (#9063 )	2025-02-13 12:33:25 -05:00
chenyu	2573d0621a	Tensor.scatter_reduce touchup [pr] (#9060 )	2025-02-13 10:01:14 -05:00
Josh Moore	1f9d2442b9	Add `Tensor.scatter_reduce` (#8947 ) * pytorch scatter -> scatter_reduce * WIP scatter_reduce implementation * _pre_scatter return type hint * split out src, mask to satisfy linter * Add src cast back in * dict of lambdas instead of ifs * sum and prod reduction ops with include_self * add reduce arg error message * add amax and amin reduction ops * Fix include_self for higher dims * Simplify * Simplify amax and amin too * Pull include_self logic out into _inv_mask function * reduce arg cannot be None for scatter_reduce * Fix self-mask issue * Add mean reduce op * Add tests * any() not needed here * remove comment * End support for Tensor src with reduce arg in tinygrad scatter * Process index, dim inside actual functions * Add scatter_reduce to onnx * Add excluded onnx ScatterElements reduction tests back in * Save 2 lines on the mask helpers * Update docs * Add include_self=False tests * cleanup * Remove unneeded helper function --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-02-13 09:08:54 -05:00
qazal	2b9ce1235a	simple failing case for reorder expand + keep views in tensor_map [pr] (#9057 )	2025-02-13 11:22:55 +01:00
George Hotz	33a1151f2f	Revert "match torch rmsnorm implementation (#6799 )" (#9052 ) This reverts commit `a66b8250e0`.	2025-02-13 14:42:45 +08:00
Ryan Dorrington	a66b8250e0	match torch rmsnorm implementation (#6799 ) * update rmsnorm to match torch implementation * run all tests * formatting * formatting * oneline * default to 1e-6 * restore old test * formatting * don't save elementwise_affine * your message * ignore webgpu --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-13 13:02:51 +08:00
gg	19ae829bd1	test float uop in sym_infer (#7456 ) * float uop in sym_infer * break line :( * rerun mypy * update GlobalCounters types * revert type change and cast assignments to mem and ops * cast inferred value to UOp in reshape * cast hcq, update view reshape to handle inferred float * rm extra space * update error * no type updates	2025-02-13 12:55:28 +08:00
JaSpa99	d2ff55e9c6	OSX GPUOcelot (#8209 ) * add patches * add osx test in ci * macos specific uvm, gpfifo mask * only do that for now * Revert "add patches" This reverts commit `80d3112a57`. * use fork for now * workflow only one worker * merge osxtests with tests * Revert "merge osxtests with tests" This reverts commit `3461c8f46c`. * macos pagesize 16384 --------- Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com> Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-13 12:24:29 +08:00
chenyu	f4f56d7c15	move time_linearizer to extra.optimization.helpers [pr] (#9048 ) no longer used in tinygrad	2025-02-12 15:49:58 -05:00
chenyu	c15486cf39	remove contiguous in test_subbuffer_used [pr] (#9046 ) test works without contiguous	2025-02-12 14:41:16 -05:00
chenyu	f53b819648	UOps. -> Ops. [pr] (#9044 ) updated the comments and doc except extra	2025-02-12 12:53:23 -05:00
Ahmed Harmouche	916d5e7f08	WebGPU f16 support (f16 bounty part 2) (#8653 ) * WebGPU f16 support * Don't enable f16 yet * dtype tests passing after bitcast fix * Maybe all WebGPU green? * Require shader-f16 in examples * Minor wgsl touchup * 1 line shorter * Simpler * Add transcendental support * log2 nan location mismatch on Vulkan * Nan skips	2025-02-12 19:46:53 +08:00
Ignacio Sica	aaed315fee	add AMX support to LLVM (#8957 ) * init amx support for llvm * revert elf changes * fix attributes for AMX asm calls * add comments * add llvm amx job to benchmarks * cleanup * cleanup * hotfix: improve comments * comment for aux buffers * hotfix: * move amx_tc to ClangRenderer * merge master * refactor * add docs * add corsix docs reference --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-12 16:01:18 +08:00
Josh Moore	0c97c10814	TestOps: silence pytorch std()/var() degrees of freedom warnings (#9034 )	2025-02-12 14:49:18 +08:00
chenyu	2845f8797a	failed test cases for rsqrt at 0 and similar ones (#9035 ) * failed test cases for rsqrt at 0 and similar ones related to 0inf this failed	2025-02-11 17:50:16 -05:00
nimlgen	166670a2f2	nv: fill grid/block sizes (#9025 )	2025-02-11 16:30:30 +03:00
qazal	c80603285e	bring back some things from the fix_kernel_ops diff [pr] (#9027 ) * bring fix_kernel_ops back [pr] * fix	2025-02-11 14:20:31 +01:00
George Hotz	fb698920f1	revert scheduler change (#9019 ) * Revert "cleanup ast rewriter [pr] (#9012)" This reverts commit `bf0bcb2d5a`. * Revert "kernel op cleanups + use ScheduleItem [pr] (#9009)" This reverts commit `c52cd2b437`. * Revert "construct the schedule sink 2 (#8925)" This reverts commit `cfd3db7862`.	2025-02-11 11:34:12 +08:00
chenyu	6c39aa4a6b	adjust cuda ci test targets (#9014 )	2025-02-10 15:29:59 -05:00
qazal	bf0bcb2d5a	cleanup ast rewriter [pr] (#9012 )	2025-02-10 19:07:59 +01:00
chenyu	586e48d696	a few more backward tests now pass (#9010 )	2025-02-10 12:46:21 -05:00
chenyu	25fa5e4d5f	enable backward tests in test_std_one_in_axis [pr] (#9007 ) still one correction=0 case is broken Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2025-02-10 10:44:05 -05:00

3402 Commits