tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-08 06:34:03 -05:00

Author	SHA1	Message	Date
chenyu	2e2b5fed12	fix misspellings (#13976 )	2026-01-02 10:37:38 -05:00
George Hotz	0221b96761	assembly/amd: fix all ops tests (#13910 ) * assembly/amd: fix all ops tests * test_ops with smaller sizes * ds store/load 2addr	2025-12-30 18:01:34 -05:00
George Hotz	3dbde178c1	mark slow tests as slow instead of as CI (#13736 ) * mark slow tests as slow instead of as CI * CI shouldn't have different behavior * more skips / CI * slow	2025-12-17 10:29:57 -04:00
ayanhan	47a170be2e	test: enable cummax scalar IndexError test (#13625 )	2025-12-09 12:25:56 -05:00
Christopher Milan	1c16b6e082	Mesa: freedreno (#12746 ) * ir3 init * got a program * 1 + 1 works * use isa_disasm instead of shader_disasm * wip * matmul works * works on py3.14 * fix const loading * skip QCOM failing tests * cleanup * args actually work * add compile-only tests * fix typo and install tinymesa * IR3 NULL backend * (float32) images work * autogen fix * fix compile only test * typo * mypy happy * compile-only uses py3.14 * bump mesa * unify qcom disassembler * float16 works * disasm shows in viz * save a line * add real del * variable workgroup sizes * simplify diff * bump line count * properly set wgsz * regen mesa * no preamble * bump lines	2025-12-08 14:02:08 -05:00
chenyu	e8879f7e31	match torch clamp backward (#13533 ) * match torch clamp backward * fix PYTHON	2025-12-02 17:58:32 -05:00
Sieds Lykles	114bb94c55	Fix load collapse MAX to ADD (#13406 ) * add Ops.ADD to pattern * add test	2025-11-21 12:26:14 +01:00
George Hotz	4027eef264	fix test warnings (#13114 ) * fix test warnings * precommit passes * ignore std_mean warning	2025-11-05 15:06:29 -08:00
chenyu	4b7329001d	clean up test_avg_pool3d (#12905 )	2025-10-24 14:31:36 -04:00
George Hotz	c780cd9abb	new linearizer with early endrange (#12823 ) * new linearizer with early endrange * cleanups * second stage removal * not store * do that later * end cleanup * fix globals * end * multi end * fix ends earlier * work * do_merge_ends * mini change * range_gate * fix cpu * test fixups * ranges on index * not for ptx	2025-10-21 17:37:48 +08:00
Christopher Milan	0aabc1e938	Mesa NIR backend (NAK/LLVMpipe) (#12089 ) * nak works * TestOps::test_add works * testop has no crashes * fix bool casts * fix typo * add disassemble * RANGE and locals/regs * simplify NAKCompiler * disass cleanup * cleanup nir codegen * almost all tests passing * cleanup notes in extra/ * old notes * only import nak if NIR=1 * fix new SPECIAL syntax * fix local/shared memory * more tests passing * add DEFINE_VAR support * llvmpipe kinda works * diskcache * some mypy stuff * lvp passing test_ops.py * fix imports * actually fix imports * remove 'stdout' * fix llvm import * fix mypy issues * nicer errors * simpler test_dtype skips * test lvp in CI * fix github action syntax * fix more actions typos * switch to mesa 25.1.0 * diskcache_put * better generation for lvp nir_options * b64encode shader blobs * Revert diskcache changes This reverts commits `930fa3de8a` and `8428c694b3`. * general cleanup * better error messages * fix llvm import * fix windows tests * link with libm and libgcc_s * fix some errors * dont check for 'float4' * NIR uses pointer arithmetic * use tinymesa * bump tinymesa * bump tinymesa again * update lvp nir_options * print nir shader with DEBUG * simplify LVPCompiler * more tests * "gated" STORE * NAK is cacheable * more tests * all tests pass locally for NAK * test autogen in CI * autogen deps * more deps * fix uop_gc * fix macos * mypy * save 2 lines * save two more lines * save 1 line * save 4 lines * save more lines * Revert "save more lines" This reverts commit `dd3a720c5a`. * save more lines * fix LVP on windows * refactor * reorganize some code * refactor lib_gpu * move LVP check * out of order loads * remove support.mesa * bump tinymesa version * simplify LVP jit * macos * macos ci * shell: bash * testing * more testing * compute brew prefix * stupid typo * actually fix * lib * stdout on macos * inline gallivm_compile_module * Revert "inline gallivm_compile_module" This reverts commit `b65983b151`. * elf macos * semicolon * inherit from CPULLVMCompiler * ruff * disas test * fix libm linking * default is fine actually * arm works * add elf loader link test * fix NAK beam * pylint is too smart by half --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com> Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>	2025-10-15 17:38:33 +08:00
George Hotz	fb61f3519f	remove assign contiguous hack (#12659 ) * remove assign contiguous hack * remove bad contiguous usage in torch backend * assign	2025-10-14 16:42:14 +08:00
chenyu	001b3710d3	enable some test_ops tests (#12607 )	2025-10-10 07:23:21 -04:00
chenyu	cf8232ec6a	clean up more RANGEIFY flag (#12556 )	2025-10-09 03:06:48 -04:00
chenyu	ae51bdd06a	remove trivial use of RANGEIFY flag (#12550 ) some tests need update still	2025-10-09 02:29:38 -04:00
George Hotz	0f25b4b289	move frontend dir to nn [pr] (#12470 )	2025-10-07 10:42:22 +08:00
chenyu	f203d8b221	update RANGEIFY kernel count and test_masked_select (#12435 )	2025-10-03 00:41:34 -04:00
wozeparrot	a6dd5a224b	skip webgpu tests (#12433 )	2025-10-02 21:31:07 -07:00
chenyu	6ba8bf282f	skip test_masked_select for RANGEIFY PYTHON (#12395 )	2025-10-01 04:13:31 -04:00
qazal	e8c595c29e	remu: add new instructions introduced in RANGEIFY (#12363 ) * add v_mad_i64_i32 for test_output_padded_conv_transpose2d * run amd test_ops * skip test_masked_select	2025-09-30 12:36:29 +03:00
Sieds Lykles	d21e34e617	enable test_sum_twice (#12270 ) * remove skip * remove import	2025-09-23 00:57:29 +02:00
chenyu	393c6b236c	test case to sum twice in different order (#12253 ) * test case to sum twice in different order fixed by #12251 * try metal	2025-09-20 10:11:57 -04:00
chenyu	edffc246ed	MUL in reduce_unparented (#12223 ) * MUL in reduce_unparented * some test	2025-09-17 11:56:39 -04:00
chenyu	12a910f1d2	update torch 2.8 (#12172 ) support _reshape_alias. something is wrong with one case of unfold	2025-09-14 15:19:03 -04:00
George Hotz	bcafa72b7f	use tags instead of graph_rewrite_map in rangeify (#12110 ) * use tags instead of graph_rewrite_map in rangeify * new style, add realize * metadata works * simple failure * fix * loops * stuff becomes a NOOP when you remove it * stuff becomes a NOOP when you remove it * tags on bufferize * bmnist works * locals don't work * shippable * fix some tests * simpler map_realize * remove const hack * debuggable test * broke * assign test * straight up bug * wooo it passes * sink shouldn't be there * fix ops * bmnist * kv cache ish * Set RANGEIFY context variable to 0 * should work normal * better * types * hacks to fix test_symbolic * pm_add_buffers * tests should pass	2025-09-14 11:39:01 +08:00
chenyu	aac3dceaf6	merge two PYTHON backend ci job (#12143 ) * merge two PYTHON backend ci job and mark anything that takes > 10 in test_ops slow * two more	2025-09-12 17:36:46 -04:00
chenyu	544eb2c402	clean up test_scatter_reduce (#12125 )	2025-09-11 16:36:58 -04:00
chenyu	0e266f376c	ops_gpu -> ops_cl (#12103 )	2025-09-10 15:15:48 -04:00
nimlgen	1c6c42715f	unify cpu and llvm (#11982 ) * try unify cpu and llvm * fixes * fix * ops * no llvm * fix * rm * lvmm is ot * oops * override * no llvm * ignore * skip llvm * ooops	2025-09-09 13:54:44 +03:00
chenyu	ce7163e9b4	clean up skip slow tests in PYTHON (#12028 ) skip with SKIP_SLOW_TEST and decorators	2025-09-05 11:35:26 -04:00
chenyu	52166fd7eb	smaller test_ops inputs (#12007 )	2025-09-04 16:22:33 -04:00
chenyu	d0e739453e	update many einsum tests (#11981 ) correct the exception testing, and raise ValueError instead of assert when checking args	2025-09-03 15:40:20 -04:00
chenyu	69dd1817d0	raise RuntimeError in merge_dicts instead of assert [pr] (#11965 )	2025-09-02 17:18:44 -04:00
chenyu	7123df3928	Use Tensor.logaddexp to implement Tensor.softplus (#11796 ) instead of piecewise linear, numerical is handled by logaddexp. jax does this and i think it's more elegant than torch's approach	2025-08-23 11:52:29 -04:00
chenyu	fb8ee02424	Tensor.logaddexp (#11793 )	2025-08-23 09:15:00 -04:00
geohotstan	1e679bd789	fix max_unpool2d inf (#11784 ) * start * add regression test for maxunpool2d	2025-08-22 08:31:24 -04:00
chenyu	91a4de4ca7	fix getitem with inf in tensor (#11781 )	2025-08-21 21:55:32 -04:00
chenyu	5276fbc9c5	fix gather with inf values (#11760 ) (mask * x) is wrong because 0*inf is nan. i feel we have a lot of those still...	2025-08-20 20:35:40 -04:00
chenyu	4fe19eec72	Ops.TRUNC (#11659 )	2025-08-13 18:40:48 -04:00
chenyu	0c97d6de1b	don't round pow output for int pow int (#11625 ) also added atol=0 and big pows for the tests	2025-08-11 20:57:47 -04:00
chenyu	d623f6d850	support int Tensor pow to const non-negative int (#11624 ) matches torch	2025-08-11 19:50:19 -04:00
chenyu	a67e0917c3	list indexing can normalize in python (#11609 ) * list indexing can normalize in python list index does not need to be normalized in tensor * update those	2025-08-10 20:02:38 -04:00
chenyu	1181ec0cd2	few more tensor indexing test cases (#11608 )	2025-08-10 18:56:42 -04:00
chenyu	dfb702ef33	fix sort for small dim (#11601 ) * fix sort for small dim * fixed test_sort_empty	2025-08-10 01:17:41 -04:00
chenyu	aa1a6f2132	support threshold in Tensor.softplus (#11564 ) fix gradient for large input	2025-08-07 13:43:18 -04:00
chenyu	dbc7807c61	enable WEBGPU tests with buffer limit (#11489 ) TestSample still fails?	2025-08-03 13:02:44 -07:00
chenyu	2d7c28de6a	clean up dup lambdas in helper_test_exception (#11325 )	2025-07-22 12:21:57 -04:00
chenyu	fb42c84365	merge TestRollEdgeCases into test_ops (#11321 )	2025-07-22 10:55:57 -04:00
chenyu	1d8b3e9d1c	movementop only Tensor.roll (#11317 ) * movementop only Tensor.roll * fixed	2025-07-22 10:34:15 -04:00
chenyu	6e9506e6fd	Tensor.roll supports dims=None (#11313 )	2025-07-21 17:29:23 -04:00


638 Commits