qazal
c7c279a6bd
unbind ShapeTrackers without maintaining a cache [pr] (#8889)
* replace with a try [pr]
* check vars
* ahaa
2025-02-04 19:43:41 +01:00
chenyu
61de654efa
minor shard cleanup [pr] (#8888)
2025-02-04 13:22:31 -05:00
qazal
6ec7f1b00f
replace UPat(name="x") with UPat.var("x") [pr] (#8887)
* replace UPat(name="x") with UPat.var("x") [pr]
* a few more
2025-02-04 19:12:40 +01:00
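A minimal before/after sketch of the change above, assuming the tinygrad import path of this period; UPat.var("x") is the existing shorthand constructor for a named wildcard pattern:

    from tinygrad.ops import UPat

    before = UPat(name="x")   # verbose form: matches any UOp, binds it as "x"
    after  = UPat.var("x")    # equivalent shorthand now used throughout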
qazal
c26b06eaeb
delete fold_img_cast [pr] (#8875)
2025-02-04 18:43:45 +01:00
qazal
acf0baefee
process replay from tensor uops to kernel ast (#8883)
* process replay from tensor uops to kernel ast
* this dedups
* switch back to string key
2025-02-04 18:09:20 +01:00
Ignacio Sica
dcf104ee68
ptx wmma render refactor (#8873)
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-02-04 11:01:23 -05:00
qazal
b92f36179d
don't use set in schedule + add GroupOp.All [pr] (#8882)
* don't use set in schedule + add GroupOp.All [pr]
* update that
2025-02-04 08:19:27 +01:00
George Hotz
56fa5c1191
dsp simulator (#8869)
* dsp simulator
* progress
* fix
* close on test tiny
* working
* less waste
* line savings
* Device DSP compiler
* mock DSP at the bottom
* DSP tests
* docker caching
* test update
* need load
* skip that test for CI DSP
* last touch
* ugh
2025-02-04 09:45:04 +08:00
chenyu
836cf42c2e
fix rand_like for multi (#8880)
2025-02-03 19:00:14 -05:00
chenyu
746d899dbd
move multi axis to property (#8879)
also updated tests so that axis is known prior to realize
2025-02-03 16:02:09 -05:00
nimlgen
fa90079370
amd: reallocate scratch (#8872)
* amd: reallocate scratch
* use it
* oops
* allocate default
* mypy
* ops
* address realloc from none better
* types correct
* this better
* ops
* rm
2025-02-03 23:21:37 +03:00
chenyu
ec447a31e7
factor out get_axis in multi [pr] (#8878)
ALU/REDUCE_AXIS/RESHAPE/PERMUTE can change axis. prereq to move this logic to ops.py
2025-02-03 14:39:08 -05:00
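To illustrate the bookkeeping get_axis centralizes, a standalone sketch (a hypothetical helper, not tinygrad's actual code) of how a shard axis moves under a PERMUTE:

    def axis_after_permute(axis: int, perm: tuple[int, ...]) -> int:
      # hypothetical: the dim that was at `axis` ends up wherever perm places it
      return perm.index(axis)

    assert axis_after_permute(0, (2, 0, 1)) == 1  # old dim 0 is now at position 1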
chenyu
cce26009f0
simplify pow to not call cos (#8877)
use %2 instead of cos to detect even numbers
2025-02-03 12:54:18 -05:00
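The trick in isolation (a standalone sketch, not the actual UOp rewrite): for an integer exponent n, the sign of x**n with negative x depends only on the parity of n, which n % 2 yields directly where the old path went through cos(pi*n):

    import math

    # sign of x**n for x < 0: +1 for even n, -1 for odd n
    for n in range(8):
      via_cos = math.cos(math.pi * n)          # old path: ±1 via a transcendental
      via_mod = 1.0 if n % 2 == 0 else -1.0    # new path: a cheap mod
      assert abs(via_cos - via_mod) < 1e-9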
geohotstan
d1aa9f30bc
copy onnx_ops into onnx (#8876)
* just copy it over
* make OnnxOps a global var
* some small style stuff
* rerun CI but also some small clean up
* some comments
2025-02-03 12:15:07 -05:00
Ali Ladjevardi
73c75d6ee1
DEFINE_LOCAL variable names start from temp0, not temp1 (#8870)
2025-02-03 22:50:38 +08:00
qazal
b6c617272a
New schedule.py Order [pr] (#8874)
2025-02-03 14:59:11 +02:00
George Hotz
b075aefc12
hotfix: revert llvm host_arch
2025-02-03 16:46:19 +08:00
George Hotz
a5753095dc
llvm cleanups [pr] (#8867)
2025-02-03 15:32:41 +08:00
George Hotz
f484db0e63
dsp cleanups [pr] (#8866)
2025-02-03 15:18:53 +08:00
George Hotz
af2c2837f6
hotfix: skip broken test, add KERNEL Op
2025-02-03 14:02:55 +08:00
qazal
565c37c681
start simplifying the scheduler context [pr] (#8830)
2025-02-02 18:11:36 +02:00
qazal
d64af3c884
reorder simplifier and grouper logic in scheduler [pr] (#8861)
2025-02-02 17:19:52 +02:00
qazal
83a904aaad
just schedule in test_recursive_pad [pr] (#8860)
2025-02-02 15:01:24 +02:00
uuuvn
6dadb60c93
LLVM JIT (+autogen llvm instead of llvmlite) (#8486)
* LLVM JIT
* Autogen LLVM
* Update autogen
* Move things around
* even more non-determinism
* windows
* more autogen weirdness
* more windows stuff
* blind windows development try 2
* more blind windows development
* even more blind windows development
* maybe i should just set up a windows vm...
* why can't everyone just use sysv abi?
* cleanup debugging stuff
* unused import
* icache flushing isn't required on x86
* merge jit_nt and jit_unix
* more
* Temporary hack to not segfault
* better error
* bad conflict resolution
* Attempt to simplify support/llvm.py
* More refactoring
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-02-02 19:52:42 +08:00
FICTURE7
66306b5321
Fix disk tensor assignment (#8855)
* Add test for disk tensor assignment failure
* Fix disk tensor assignment
---------
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2025-02-02 13:50:34 +02:00
Ali Ladjevardi
6e523e4d17
Remove size arg from DEFINE_LOCAL [pr] (#8845)
* remove size arg from DEFINE_LOCAL
* make mypy happy
* whitespace
* don't change code in extra
* revert to temp1 to pass pr
2025-02-02 19:47:32 +08:00
nimlgen
7841852870
hcq pci signal fuzzer (#8854)
* hcq pci signal fuzzer
* kk
* correct
2025-02-01 23:42:27 +03:00
qazal
dc34a4146f
better process_replay context print [pr] (#8856)
* better process_replay context print [pr]
* test: revert push cast
* Revert "test: revert push cast"
This reverts commit 38a2aef6f8.
2025-02-01 21:50:23 +02:00
chenyu
5b1fc4dcb2
push cast to branches in UOp where (#8850)
2025-02-01 13:55:24 -05:00
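The identity behind the rewrite, sketched with plain Python rather than the actual UOp pattern: casting the result of a select equals selecting between cast branches, which lets later simplifications see through the cast:

    def where(c: bool, a, b): return a if c else b

    # cast(where(c, a, b)) == where(c, cast(a), cast(b))
    for c in (True, False):
      assert float(where(c, 1, 2)) == where(c, float(1), float(2))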
chenyu
73ee2d74c0
raise RuntimeError for int base pow (#8852)
current implementation is not precise and is blocking other simplification changes
2025-02-01 12:11:57 -05:00
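The imprecision at issue, illustrated standalone under the assumption that pow is lowered through exp/log:

    import math

    exact  = 2 ** 3                       # 8
    approx = math.exp(3 * math.log(2.0))  # ~7.999999999999998 on a typical libm
    print(exact, approx, exact == approx)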
qazal
72e1f41f8e
add unbind_vars pattern matcher (#8851)
* add unbind_vars pattern matcher [pr]
* this can be cvar
* this is empty
2025-02-01 18:25:44 +02:00
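A toy sketch of the unbind idea on a generic expression tree (hypothetical Node class, not tinygrad's UOp or PatternMatcher API): strip each BIND node down to its variable and collect the bound value on the side:

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Node:
      op: str                       # e.g. "BIND", "VAR", "ADD", "CONST"
      src: tuple["Node", ...] = ()
      arg: object = None

    def unbind(n: Node, var_vals: dict) -> Node:
      if n.op == "BIND":            # BIND(var, value) -> var, value recorded
        var, val = n.src
        var_vals[var] = val.arg
        return var
      return Node(n.op, tuple(unbind(s, var_vals) for s in n.src), n.arg)

    v = Node("VAR", arg="i")
    expr = Node("ADD", (Node("BIND", (v, Node("CONST", arg=4))), Node("CONST", arg=1)))
    vals: dict = {}
    unbound = unbind(expr, vals)
    assert vals == {v: 4}           # BIND stripped; binding collected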
nimlgen
b3fa76419a
am: move queues to gpus (#8848)
* am: fix
* add flags for those
* do not depend on host parameter
2025-02-01 18:02:52 +03:00
George Hotz
42d7c800a1
hotfix: add missing tinychat fonts + other assets
2025-02-01 09:34:44 +08:00
George Hotz
431a86615d
fix multi Ops.CONTIGUOUS_BACKWARD [pr] (#8843)
2025-02-01 09:21:31 +08:00
Ahmed Harmouche
07d3676019
weights_only=False (#8839)
2025-01-31 17:16:47 -05:00
nimlgen
741bbc900d
Revert "am: queues allocated on gpus ( #8836 )" ( #8837 )
...
This reverts commit 7bbb568dec .
2025-01-31 22:53:41 +03:00
nimlgen
7bbb568dec
am: queues allocated on gpus (#8836)
* am: fix
* add flags for those
2025-01-31 22:14:43 +03:00
chenyu
1f730ae8f8
remove retain_graph in Tensor.backward [pr] (#8835)
not used. gradient accumulation works directly
2025-01-31 13:41:26 -05:00
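The pattern the note refers to, as a hedged example assuming the Tensor API of this period: each backward() call accumulates into .grad, so micro-batch losses add up without any retain_graph flag:

    from tinygrad import Tensor

    w = Tensor([2.0], requires_grad=True)
    for xb in ([1.0], [3.0]):             # two "micro-batches"
      loss = (w * Tensor(xb)).sum()
      loss.backward()                     # no retain_graph needed
    print(w.grad.tolist())                # grads accumulated: [4.0]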
chenyu
0a59db936a
raise RuntimeError in schedule_step if not Tensor.training [pr] (#8834)
2025-01-31 12:03:04 -05:00
qazal
af4f9d1aa9
use matchers to verify AST shape [pr] (#8828)
* use matchers to verify kernel AST [pr]
* work
* use swizzle_cnt
* add comment
* imports
* modified_ast comment
* brief
2025-01-31 09:17:42 +02:00
George Hotz
643c09a6c6
tensor uop spec should be in spec.py [pr] (#8827)
* tensor uop spec should be in spec.py [pr]
* err, spec.py
* print uops can stay
2025-01-31 13:54:04 +08:00
qazal
a78f0f85d3
remove support for checking tensor uops in FUSE_ARANGE [pr] (#8829)
2025-01-31 07:48:28 +02:00
qazal
2a33750e4c
simpler group_realizes + ScheduleItem construction [pr] (#8825)
2025-01-31 06:34:53 +02:00
George Hotz
e63d160376
hotfix: sched comment
2025-01-31 12:10:04 +08:00
qazal
1fce864a6d
delete multi output support (#8822)
* delete multioutput for now
* test_schedule
* test_assign too
* linter
* 515 for sd
* update tests and ctx
* update that assign check
2025-01-30 22:45:50 -05:00
Ankit Avinash
7647cd8428
[bounty] Stride is flip (#8792)
* replace stride with flip
* Complete replacing stride with flip
clean flip function in view.py
fix tests
* fix tests for multi shapetracker
* fix tests for fuzz shapetracker
* fix tests for fuzz shapetracker
* debug
* debug
* fix
* fix
* fix
---------
Co-authored-by: George Hotz <geohot@gmail.com>
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2025-01-31 11:34:10 +09:00
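The identity behind the bounty, shown with plain Python lists rather than view.py internals: a stride of -1 along an axis carries exactly one bit of information, flipped or not, so a per-dim bool can replace the stride argument:

    row = [0, 1, 2, 3]
    assert row[::-1] == list(reversed(row))  # 1-D: stride -1 == flip

    grid = [[0, 1], [2, 3]]
    assert grid[::-1] == [[2, 3], [0, 1]]    # flip along axis 0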
chenyu
0513b0c17d
lower green test_gemm_8192 tflops to 125 [pr] (#8820)
flaky
2025-01-30 17:30:08 -05:00
Ignacio Sica
f0924e0857
fix and test (#8814)
Co-authored-by: chenyu <chenyu@fastmail.com>
2025-01-30 16:35:53 -05:00
qazal
f5da275f46
simpler remove_movement_ops [pr] (#8818)
2025-01-30 23:32:52 +02:00
qazal
c8d878a5c1
remove r.lazydata.buf_uop_view [pr] (#8817)
2025-01-30 23:14:36 +02:00