Commit Graph

3339 Commits

chenyu
cfd28517df move pow folding tests to test_schedule [pr] (#8955)
doesn't really belong in test_const_folding
2025-02-07 12:51:43 -05:00
George Hotz
c2b4c43edb handle stride 0 reduce (#8068)
* handle stride 0 reduce [pr]

* more test fixups

* a few more

---------

Co-authored-by: qazal <qazal.software@gmail.com>
2025-02-07 15:40:58 +01:00
Ahmed Harmouche
133cacadde Autogen webgpu dawn, removing wgpu-py dependency (f16 support part 1) (#8646)
* Switch to dawn, all tests passing locally

* Use dawn-python

* Skip failing test

* Skip midcast and fix timestamp on metal ci

* Autogen webgpu

* Try fetch dawn lib again

* /usr/lib

* Without lib prefix

* Test autogen diff

* Delete webgpu support, move everything to ops_webgpu

* mypy fix

* Simplify, refactor

* Line savings

* No ResultContainer

* Type annotation for result

* Some more simplifications

* Why was this explicit sync used at all?

* Refactor: delete functions that are only used once

* Create shader module inline

* Clear unit tests cache, maybe that solves it

* That wasn't it

* Try deleting cache to pass failing weight compare

* weights_only=False for pytorch 2.6

* Simplify ctype array creation

* Remove nanosecond precision timestamps

* Simplify error handling

* Refactor, add back type annotations

* Deleted custom submit function, refactor

* read_buffer simplify

* Fix use after free, refactor

* Simplify supported_features

* Runtime docs

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-07 15:16:59 +08:00
Bhavya Gada
3b67712892 [bounty] Fix LLVM=1 NO_DEVECTORIZE=1 python3 test/test_ops.py TestOps.test_strided_conv2d_simple (#8937)
* fix LLVM=1 NO_DEVECTORIZE=1 python3 test/test_ops.py TestOps.test_strided_conv2d_simple

* remove expectedFailure

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-07 10:07:54 +08:00
George Hotz
f54242849d failing test for the devectorize [pr] (#8940)
* failing test for the devectorize [pr]

* add DEVECTORIZE to method_cache
2025-02-07 09:44:54 +08:00
chenyu
a092b6395d Tuple -> tuple, List -> list [pr] (#8936) 2025-02-06 14:21:19 -05:00
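The rename above presumably refers to the PEP 585 cleanup: since Python 3.9 the builtin `tuple` and `list` can be used directly as generic types, so the `typing.Tuple`/`typing.List` aliases are unnecessary. A minimal before/after sketch (the function names are invented for illustration):

```python
from typing import Tuple, List  # old-style aliases, kept only for comparison

def old_style(pairs: List[Tuple[int, int]]) -> Tuple[int, ...]:
  # pre-PEP 585 annotations via typing aliases
  return tuple(a + b for a, b in pairs)

def new_style(pairs: list[tuple[int, int]]) -> tuple[int, ...]:
  # same annotation using the builtins directly (Python 3.9+)
  return tuple(a + b for a, b in pairs)

print(new_style([(1, 2), (3, 4)]))  # (3, 7)
```

The two functions are identical at runtime; only the spelling of the annotations changes.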
qazal
79fb5c6470 hotfix: test_shard_no_recompile shouldn't rely on schedule order [pr] (#8928) 2025-02-06 16:27:59 +02:00
George Hotz
ae45826758 hotfix: GRAPH_ONE_KERNEL + fix timing 2025-02-06 17:52:20 +08:00
George Hotz
1c53e8bf27 Revert "objc fast msg (#8922)" (#8926)
This reverts commit c3f99a727e.
2025-02-06 17:50:49 +08:00
George Hotz
c3f99a727e objc fast msg (#8922)
* benchmark kernel launch

* don't realize unneeded

* faster

* faster metal

* fix mypy

* new objc message style [pr]

* without sync

* no div 0

* lru cache that

* no sync in the profile

* fix

* update all to new style

* remove comment

* graph one kernel

* fix graph one kernel

* remove that sync
2025-02-06 17:49:06 +08:00
George Hotz
a8e54df363 benchmark single kernel launch (#8921)
* benchmark kernel launch

* don't realize unneeded

* faster

* faster metal

* fix mypy

* without sync

* no div 0

* lru cache that

* no sync in the profile
2025-02-06 13:35:34 +08:00
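One bullet above reads "lru cache that"; assuming it refers to memoizing a repeated expensive lookup on the kernel-launch hot path, a generic sketch with `functools.lru_cache` (the `lookup` function and its body are hypothetical, not tinygrad code):

```python
from functools import lru_cache

calls = 0

@lru_cache(maxsize=None)
def lookup(name: str) -> int:
  # stands in for an expensive lookup (e.g. resolving a selector or handle)
  global calls
  calls += 1
  return hash(name) & 0xffff

lookup("alloc"); lookup("alloc"); lookup("init")
print(calls)  # 2: the repeated "alloc" lookup hit the cache
```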
Josh Moore
44e0eab8fd Fix AttributeError occurring after ValueError in _apply_uop (#8905)
* Fix AttributeError occurring after ValueError in _apply_uop

* Update tensor.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-06 10:56:29 +08:00
chenyu
30695da256 remove Tensor._to_const_val (#8917)
* remove Tensor._to_const_val

added a TODO for advanced indexing on const, which was the last place that checks const in Tensor
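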

* that is not folding now

* one more
2025-02-05 21:44:39 -05:00
uuuvn
09ec33a578 Better errors when relocating against undefined symbol (#8902) 2025-02-06 10:13:44 +08:00
chenyu
488200f16c move more pow const to rewrite (#8916)
* move more pow const to rewrite

one less use of _to_const_val

* fix
2025-02-05 20:30:12 -05:00
chenyu
76671381aa move positive const ** t to a rewrite rule (#8914)
* move positive const ** t to a rewrite rule

* one more test
2025-02-05 19:30:12 -05:00
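For a positive constant base `c`, `c ** t` can be rewritten in terms of `exp` and `log`, which is presumably what makes it expressible as a pointwise rewrite rule; a numeric sketch (the helper name is invented for illustration):

```python
import math

def pow_pos_const(c: float, t: float) -> float:
  # valid only for c > 0: c**t == exp(t * log(c))
  assert c > 0
  return math.exp(t * math.log(c))

print(pow_pos_const(3.0, 2.5))
```

Negative bases need separate handling of the sign (see the even/odd-exponent commits above and below), which is why the positive-constant case is the easy one to move first.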
chenyu
189bfa164e enable backward test for pow(neg const ** x) (#8912)
backward works now. 0**x still does not work because it's a special case fixed in transcendental
2025-02-05 15:35:21 -05:00
Ignacio Sica
aec3b8d515 add regression test: test_get_kernel_actions_preserves_actions_state (#8907)
* test_get_kernel_actions_preserves_actions_state

* simplify

* simplify

* refactor assert message
2025-02-05 14:13:01 -05:00
Ignacio Sica
15f94ac964 TC_SEARCH_OVER_SHAPE to search multiple TC shapes (#8793)
* squash search over search

* refactor assert

* init benchmark

* cleaner get_kernel_actions

* cleaner get_kernel_actions

* add comment
2025-02-05 11:03:46 -05:00
qazal
6f0cc2e9c5 rename to KernelContext and move the linearize_sched comment [pr] (#8899)
* rename to KernelContext and move that comment [pr]

* 500
2025-02-05 07:49:58 +01:00
George Hotz
c1c5227acb preserve size in dtype ptr [pr] (#8898) 2025-02-05 14:38:57 +08:00
eliotgolding
bb5ded85cc Don't rewrite idiv to rshift when numerator is negative (#8885)
* more conditions for shift rewrite mul/idiv

* make ptx test uint so the new condition is true

* delete idiv test

* rewrite to 0 is wrong for idiv, as denominator is cast to 0 before division

* mul/div by 2**(large count) is unsupported anyway
2025-02-05 07:47:33 +08:00
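The fix above reflects a general pitfall: rewriting `x // 2**k` to `x >> k` is only sound for non-negative `x` when integer division truncates toward zero (C semantics), because an arithmetic right shift floors toward negative infinity. A sketch using a hand-rolled truncating division (Python's own `//` floors, so we emulate the C behavior):

```python
def trunc_div(x: int, d: int) -> int:
  # C-style integer division: quotient of magnitudes, sign from the operands
  q = abs(x) // abs(d)
  return q if (x < 0) == (d < 0) else -q

x, k = -7, 2
print(trunc_div(x, 1 << k))  # -1: truncating division rounds toward zero
print(x >> k)                # -2: arithmetic shift floors, so the rewrite is wrong here
print(trunc_div(9, 4), 9 >> 2)  # 2 2: the rewrite is safe for non-negative numerators
```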
chenyu
48349efdc1 copy is already contiguous (#8886) 2025-02-04 17:53:33 -05:00
qazal
6a0da51ed0 truncate process replay logs [pr] (#8891)
* truncate process replay logs [pr]

* work

* max_lines

* bump to 1K
2025-02-04 20:26:48 +01:00
qazal
acf0baefee process replay from tensor uops to kernel ast (#8883)
* process replay from tensor uops to kernel ast

* this dedups

* switch back to string key
2025-02-04 18:09:20 +01:00
George Hotz
56fa5c1191 dsp simulator (#8869)
* dsp simulator

* progress

* fix

* close on test tiny

* working

* less waste

* line savings

* Device DSP compiler

* mock DSP at the bottom

* DSP tests

* docker caching

* test update

* need load

* skip that test for CI DSP

* last touch

* ugh
2025-02-04 09:45:04 +08:00
chenyu
836cf42c2e fix rand_like for multi (#8880) 2025-02-03 19:00:14 -05:00
chenyu
746d899dbd move multi axis to property (#8879)
also updated tests so that axis is known prior to realize
2025-02-03 16:02:09 -05:00
chenyu
cce26009f0 simplify pow to not call cos (#8877)
use %2 instead of cos to detect even numbers
2025-02-03 12:54:18 -05:00
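The change above swaps a trigonometric parity test for a modulo: `cos(pi*n)` is `1` for an even integer `n` and `-1` for an odd one, but `n % 2` gives the same information without a transcendental call. A sketch of signed pow for a negative base (the helper is illustrative, not tinygrad's implementation):

```python
import math

def pow_neg_base(b: float, n: int) -> float:
  # b < 0, integer n: magnitude via exp/log of |b|, sign from exponent parity
  assert b < 0
  mag = math.exp(n * math.log(-b))
  # parity via % 2 instead of cos(pi*n) (which is 1 for even n, -1 for odd n)
  return mag if n % 2 == 0 else -mag

print(round(pow_neg_base(-2.0, 3)))  # -8
print(round(pow_neg_base(-2.0, 4)))  # 16
```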
George Hotz
af2c2837f6 hotfix: skip broken test, add KERNEL Op 2025-02-03 14:02:55 +08:00
qazal
83a904aaad just schedule in test_recursive_pad [pr] (#8860) 2025-02-02 15:01:24 +02:00
FICTURE7
66306b5321 Fix disk tensor assignment (#8855)
* Add test for disk tensor assignment failure

* Fix disk tensor assignment

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2025-02-02 13:50:34 +02:00
Ali Ladjevardi
6e523e4d17 Remove size arg from DEFINE_LOCAL [pr] (#8845)
* remove size arg from DEFINE_LOCAL

* make mypy happy

* whitespace

* don't change code in extra

* revert to temp1 to pass pr
2025-02-02 19:47:32 +08:00
nimlgen
7841852870 hcq pci signal fuzzer (#8854)
* hcq pci signal fuzzer

* kk

* correct
2025-02-01 23:42:27 +03:00
qazal
dc34a4146f better process_replay context print [pr] (#8856)
* better process_replay context print [pr]

* test: revert push cast

* Revert "test: revert push cast"

This reverts commit 38a2aef6f8.
2025-02-01 21:50:23 +02:00
chenyu
5b1fc4dcb2 push cast to branches in UOp where (#8850) 2025-02-01 13:55:24 -05:00
chenyu
73ee2d74c0 raise RuntimeError for int base pow (#8852)
current implementation is not precise and is blocking other simplification changes
2025-02-01 12:11:57 -05:00
qazal
72e1f41f8e add unbind_vars pattern matcher (#8851)
* add unbind_vars pattern matcher [pr]

* this can be cvar

* this is empty
2025-02-01 18:25:44 +02:00
George Hotz
431a86615d fix multi Ops.CONTIGUOUS_BACKWARD [pr] (#8843) 2025-02-01 09:21:31 +08:00
Ahmed Harmouche
07d3676019 weights_only=False (#8839) 2025-01-31 17:16:47 -05:00
chenyu
1f730ae8f8 remove retain_graph in Tensor.backward [pr] (#8835)
not used; gradient accumulation works directly
2025-01-31 13:41:26 -05:00
chenyu
0a59db936a raise RuntimeError in schedule_step if not Tensor.training [pr] (#8834) 2025-01-31 12:03:04 -05:00
qazal
af4f9d1aa9 use matchers to verify AST shape [pr] (#8828)
* use matchers to verify kernel AST [pr]

* work

* use swizzle_cnt

* add comment

* imports

* modified_ast comment

* brief
2025-01-31 09:17:42 +02:00
George Hotz
643c09a6c6 tensor uop spec should be in spec.py [pr] (#8827)
* tensor uop spec should be in spec.py [pr]

* err, spec.py

* print uops can stay
2025-01-31 13:54:04 +08:00
qazal
a78f0f85d3 remove support for checking tensor uops in FUSE_ARANGE [pr] (#8829) 2025-01-31 07:48:28 +02:00
qazal
1fce864a6d delete multi output support (#8822)
* delete multioutput for now

* test_schedule

* test_assign too

* linter

* 515 for sd

* update tests and ctx

* update that assign check
2025-01-30 22:45:50 -05:00
Ankit Avinash
7647cd8428 [bounty] Stride is flip (#8792)
* replace stride with flip

* Complete replacing stride with flip

clean flip function in view.py
fix tests

* fix tests for multi shapetracker

* fix tests for fuzz shapetracker

* fix tests for fuzz shapetracker

* debug

* debug

* fix

* fix

* fix

---------

Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-31 11:34:10 +09:00
chenyu
0513b0c17d lower green test_gemm_8192 tflops to 125 [pr] (#8820)
flaky
2025-01-30 17:30:08 -05:00
Ignacio Sica
f0924e0857 fix and test (#8814)
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-30 16:35:53 -05:00
qazal
530961f7d5 realized only exists on base (#8815)
* realized only exists on base [pr]

* shorter

* update that too
2025-01-30 23:02:25 +02:00