tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
Ayman Jabr	256f81bb02	Fix tracemeta 0 (#13049 ) * chore: tclesius branch resolved * fix: indentation --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-11-13 09:07:11 -08:00
George Hotz	b147e7e8e6	flatten bufferize (#12984 ) * flatten bufferize * simpler * tests pass * flat * not flat	2025-10-29 11:23:43 +08:00
George Hotz	b0da173f2f	add unique to const, fix longstanding bug (#12965 ) * add unique to const, fix longstanding bug * _force_unique=True * fix tests * fix more tests	2025-10-28 15:11:37 +08:00
Sieds Lykles	7f798a9630	Cleanup const buffers (#12829 ) * split pm_cleanups * update test_schedule * shrink when we remove bufferize * dont do shrink if shape is empty * update tests * remove 1 from metadata deal with the noop bufferize * only noop on cvar * cleanup * fix if * rename	2025-10-21 14:53:49 +02:00
George Hotz	8be7844b2e	use apply uop for assign to fix assign metadata (#12732 ) * use apply uop for assign * fix metadata for assign * fix backward metadata * those aren't real tests	2025-10-16 20:34:12 +08:00
George Hotz	592e86f6f5	remove UOp.st (#12716 ) * remove UOp.st * fix tests * torch backend disable	2025-10-16 14:44:09 +08:00
chenyu	c3278e5622	clean up old tests (#12708 )	2025-10-15 17:53:17 -04:00
chenyu	312c622d35	support None in pad_to and shrink_to (#12700 )	2025-10-15 09:25:31 -04:00
Christopher Milan	0aabc1e938	Mesa NIR backend (NAK/LLVMpipe) (#12089 ) * nak works * TestOps::test_add works * testop has no crashes * fix bool casts * fix typo * add disassemble * RANGE and locals/regs * simplify NAKCompiler * disass cleanup * cleanup nir codegen * almost all tests passing * cleanup notes in extra/ * old notes * only import nak if NIR=1 * fix new SPECIAL syntax * fix local/shared memory * more tests passing * add DEFINE_VAR support * llvmpipe kinda works * diskcache * some mypy stuff * lvp passing test_ops.py * fix imports * actually fix imports * remove 'stdout' * fix llvm import * fix mypy issues * nicer errors * simpler test_dtype skips * test lvp in CI * fix github action syntax * fix more actions typos * switch to mesa 25.1.0 * diskcache_put * better generation for lvp nir_options * b64encode shader blobs * Revert diskcache changes This reverts commits `930fa3de8a` and `8428c694b3`. * general cleanup * better error messages * fix llvm import * fix windows tests * link with libm and libgcc_s * fix some errors * dont check for 'float4' * NIR uses pointer arithmetic * use tinymesa * bump tinymesa * bump tinymesa again * update lvp nir_options * print nir shader with DEBUG * simplify LVPCompiler * more tests * "gated" STORE * NAK is cacheable * more tests * all tests pass locally for NAK * test autogen in CI * autogen deps * more deps * fix uop_gc * fix macos * mypy * save 2 lines * save two more lines * save 1 line * save 4 lines * save more lines * Revert "save more lines" This reverts commit `dd3a720c5a`. * save more lines * fix LVP on windows * refactor * reorganize some code * refactor lib_gpu * move LVP check * out of order loads * remove support.mesa * bump tinymesa version * simplify LVP jit * macos * macos ci * shell: bash * testing * more testing * compute brew prefix * stupid typo * actually fix * lib * stdout on macos * inline gallivm_compile_module * Revert "inline gallivm_compile_module" This reverts commit `b65983b151`. * elf macos * semicolon * inherit from CPULLVMCompiler * ruff * disas test * fix libm linking * default is fine actually * arm works * add elf loader link test * fix NAK beam * pylint is too smart by half --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com> Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>	2025-10-15 17:38:33 +08:00
George Hotz	db4a359374	fix up some slow tests that launch python (#12672 ) * fix up some slow tests that launch python * svd nonfull in parallel * split test_advancedindex	2025-10-14 19:13:55 +08:00
chenyu	c8dfd10257	ShapeTracker.real_strides -> is_expanded [pr] (#12579 ) only keep the used part	2025-10-09 22:52:45 -04:00
chenyu	ae51bdd06a	remove trivial use of RANGEIFY flag (#12550 ) some tests need update still	2025-10-09 02:29:38 -04:00
George Hotz	3b0b3a2e64	fast RANGEIFY (#12504 ) * rtoposort is fast, can replace rangeify with this * fast rangeify * work * fast rangeify works for mnist * should work * progress * pad fix * FAST * tests passing * don't delete those shape ops * put in rangeify map * ending ranges fix * tests * mstack/mselect no hacks * move to indexing.py * touch up tests + add comments * disable failing test * actually make the file readable * failing * error	2025-10-08 19:38:06 +08:00
chenyu	f82b16a0e9	RANGEIFY test_tensor (#12235 )	2025-09-18 10:35:43 -04:00
chenyu	84d2d047ea	Tensor.pad_to and Tensor.shrink_to (#12210 ) most of the time i want this instead of spelling out the args also add more input validation to shrink	2025-09-16 12:24:55 -04:00
Sieds Lykles	2fc0bd150b	Arange overflow raises error and one_hot upcast (#11975 ) * add error * to_dtype * shorten line * add test * upcast one hot dim im overflows	2025-09-13 00:18:25 +02:00
chenyu	647965fb09	test_train cleanup (#12140 ) * test_train cleanup remove skipIf due to buffer sizes, runs locally * those are slow	2025-09-12 13:21:30 -04:00
nimlgen	551560b87c	do not use getenv('PTX') in tests (#12095 ) * test without ptx * fix tests * fix test * linters	2025-09-10 14:04:07 +03:00
Sieds Lykles	581b2388c2	add dtypes.index (#12015 ) * add dtypes.index * cast shape, stride and mask to dtypes.index in view.create * move pm_lower_index_dtype to ops * DEFINE_VAR is dtype.index by default * merge var_val_using_str * remove int from commutative * fix test_rewrite_map * change that to dtypes.index * change some int to index * shorten those * remove old cast in renderer * cleanup * change that back * add comment * delete comment * just delete those * view doesnt have to cast anymore * adjust comment	2025-09-06 06:03:44 +02:00
chenyu	337e979a59	call dtypes.as_const in Tensor(list) (#11840 )	2025-08-25 22:08:26 -04:00
George Hotz	07b0df0d86	hotfix: test tensor dims start at 1	2025-08-05 15:40:24 -07:00
George Hotz	4dabdf7c6d	Revert "optimize in rewrite (#11516 )" (#11517 ) This reverts commit `3b777a9e05`.	2025-08-05 15:39:07 -07:00
George Hotz	3b777a9e05	optimize in rewrite (#11516 ) * changes * fix test uops * dim shouldn't be 0 * huh, why did that one not save	2025-08-05 15:33:26 -07:00
chenyu	5b570196e4	support `DEV=` to specify device (#11351 )	2025-07-23 17:40:55 -04:00
George Hotz	32e9949052	rename lazydata to uop (#10698 )	2025-06-08 08:42:22 -07:00
George Hotz	4c315f8e17	MSTACK little non-functional changes (#10648 )	2025-06-05 13:20:22 -07:00
George Hotz	b3b43a82c4	remove Tensor.no_grad, it's meaningless now [pr] (#10556 )	2025-05-28 22:20:02 -07:00
George Hotz	b58f2d4544	fix tests (#10493 )	2025-05-23 18:38:07 -07:00
George Hotz	1e4d63e06e	uops can have multiple metadata (#10479 ) * uops can have multiple metadata * fixups	2025-05-22 21:35:02 -07:00
chenyu	8cc2dff4d8	only float Tensors have gradient [pr] (#10475 )	2025-05-22 21:02:11 -04:00
George Hotz	411392dfb7	move files into uop dir (#10399 ) * move files into uop dir [pr] * tinygrad.uop is a thing * fix uop docs, no pr * fix viz	2025-05-18 11:38:28 -07:00
George Hotz	603c03bef2	fix tests for rewrite [pr] (#10167 ) * fix tests for rewrite [pr] * cleaner * delete linearize_uop * clean up the rest	2025-05-05 19:19:49 -07:00
Park Jun	c3ad7b2a84	create randperm and support pytorch backend (#10019 )	2025-04-24 07:29:02 -04:00
George Hotz	e358e0a0c6	move metadata set to tensor [pr] (#9976 ) * move metadata set to tensor [pr] * only track that in tensor.py	2025-04-22 12:30:35 +01:00
Andrew Furey	50dee4a7b3	add test for checking const gradients (#9598 )	2025-03-27 15:17:37 -04:00
chenyu	22fc0a2e36	bert sum acc in half (#9412 ) also BS=96	2025-03-11 23:03:15 -04:00
qazal	81a71ae0f6	hotfix: skip test_exclude_const_metadata (#9208 )	2025-02-22 23:26:04 +02:00
qazal	4578c3e8fd	simpler tensor metadata mapping + tests [pr] (#9203 ) * simpler tensor metadata mapping + tests [pr] * remove kernel metadata * don't map nones	2025-02-22 20:18:46 +01:00
chenyu	2e7c2780a9	CLANG -> CPU (#9189 )	2025-02-20 18:03:09 -05:00
George Hotz	a4dab3ec3f	add name uop (#9149 ) * add name uop, TODO: refactor renderer to use * renderer uses name uop * fix tests * render * ptx	2025-02-18 15:26:58 +08:00
George Hotz	df3b320f46	rewriter -> devectorizer [pr] (#9147 )	2025-02-18 12:42:08 +08:00
Josh Moore	44e0eab8fd	Fix AttributeError occurring after ValueError in _apply_uop (#8905 ) * Fix AttributeError occurring after ValueError in _apply_uop * Update tensor.py --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-02-06 10:56:29 +08:00
chenyu	1f730ae8f8	remove retain_graph in Tensor.backward [pr] (#8835 ) not used. gradient accumulation works directly	2025-01-31 13:41:26 -05:00
George Hotz	b4bf6a7dea	switch backward to use gradient [pr] (#8235 ) * switch backward to use gradient [pr] * set device correctly, dedup * why does that fail? * add noop cast * simple backward * fix beautiful_mnist * touchups * set in compute_gradient * uop_count * uop_count was wrong * collections * no note * skip that test * update sched kernel counts * train mnist is 65 * fix metadata and gc * fixes * materialize_grads * no pathlib stuff * add contiguous_backward, fix bugs * add some realize * fix multi	2025-01-26 09:12:16 +09:00
qazal	6cb74bb630	fix using clone with shrink [pr] (#8724 ) * fix using clone with shrink [pr] * remove extra arg, add test_clone_with_shrink_realized	2025-01-23 08:28:07 +02:00
George Hotz	98d01a059d	rename uopgraph to rewriter [pr] (#8682 )	2025-01-19 17:03:12 -08:00
mesozoic-egg	3506a7585f	upcast overflowed idx to int64 [pr] (#8268 ) * use full_shape to determine if index can potentially overflow * update comment * use shapetracker to check max index value * wip * lint * handle mask * upcast to int64 by st is noop on WGSL * fix comments * Handle negative overflow, intermediaries overflow, int64 support handle negative overflow handle symbolic wip handle intermediate values wip check if typemap support int64 lint comment * add invalid_dtype lint * Fix bug on checking mask overflow wip wip * Add more tests, need to resolve partial upcast test Valid_view_dup test valid op overflow refine test cases clean up cleanup wip refine tests lint * Upcast is handled by lower_load_store upcast as graph_rewrite to backtrack update test wip cleanup wip cleanup do upcast in lower_load_store lint * cleanup * do upcast within lower_load_store and mutate ctx * do upcast in get_idx and view revert lint * cleanup * Upcast in vec, const upcast to const test case 3 upcast on vector lint * simplify idx with symbolic in case of fake overflow test case4 test case 4 update test * test case4 is only for metal * try: upcast inside graph_rewrite instead of shapetracker wip * checking overflow can just be done directly on all views, with idxs * cleanup * REMOVE hard coded uop test for idx upcast * refactor cleanup refactor * do actual casting when necessary, instead of rewriting all idx hard code uop test new upcast * check dtype for int64 in webgpu * cleanup cleanup * cleanup * update tests cleanup comment cleanup cleanup * comment * comment * update comment update comment * refactor * typo * keep the scope to only upcasting * white space * Revert "white space" This reverts commit `314d7eb184`. * Revert "keep the scope to only upcasting" This reverts commit `1ef701dd85`. * sym folding is not necessary lint1 * fold symbolic lint * use symbolic simple when folding shapetracker idx * full sym folding is required after all... * Ops.CAST should retain the src min max * put rewrite to lowerer wip * start testing on higher level wip test higher level in test_tensor * find Ops.STORE in list instead of recursively * check dtype support when upcasting * remove invalid_dtype * lint * fix int64 support checks in upcast lint * skipif skipunless * revert fold to find test case * Revert "revert fold to find test case" This reverts commit `225bb6e801`. * test sym folding * handle ptx * wip * wip * delete hard coded uop test * lint fixes * wip * fix checking for None * lint * handle ptx * comment * dtype for overflow() * update skipIf skipUnless * assert in wgsl renderer for int64 wip * do folded_upcast in to_indexed_op, real_size uses views_to_indexed_ops * assert in lowerer for dtype support lint * Revert "assert in lowerer for dtype support" This reverts commit `8e9b1b79bf`. * assert dtype in kernel.py * Revert "assert dtype in kernel.py" This reverts commit `e29b9a9893`. * wip * assert in render * remove old assert * check dtype from rendere, assert in upcast wip * smaller arange for sym fold case * linearize directly * use expand directly * lint * lint * rename * no need to check dtype in device.py * trigger pr * remove dtype assert in upcast, make wgpu fail in render * use DType for type hint instead of dtypes * assert on KeyError in tests for webgpu backend int64 * use a tuple for src * test real kernel run wip * lint error * restore * fix real_size * update test example * resolve merge stuff --------- Co-authored-by: Mesozoic Egg <mesozoic.egg@proton.mail>	2025-01-17 11:52:31 -05:00
George Hotz	c85737c200	assert to prepare for grad uop [pr] (#8280 ) * assert to prepare for grad uop [pr] * fix test_nn * fix most of test_tensor * few more tests * fix multi * uniform gradient * acc_dtype * any for multi * fix typing * fix assert, CAST_BEFORE_VIEW is still the issue * explict test for CAST_BEFORE_VIEW --------- Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2025-01-14 13:26:56 -08:00
chenyu	c4e33048c6	test Tensor.clone has a different lazydata [pr] (#8600 )	2025-01-13 20:13:44 -05:00
qazal	866dfa1f23	create_schedule([x.lazydata]) -> x.schedule() in tests (#8449 )	2024-12-31 03:15:52 +08:00

1 2 3 4

181 Commits