tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-24 06:18:01 -05:00

Author	SHA1	Message	Date
George Hotz	7636d2cdc5	flip order of get_program args (#10905 )	2025-06-20 17:23:23 -07:00
George Hotz	b41e0563a3	move stuff to kernelize folder (#10902 ) * move stuff to kernelize folder * oops, forgot that	2025-06-20 16:10:20 -07:00
George Hotz	92678e59ee	move kernel to opt (#10899 )	2025-06-20 15:22:28 -07:00
chenyu	a3dae51085	lower test_gemm_8192 on red (#10883 )	2025-06-19 10:01:25 -04:00
George Hotz	18593c9800	one less rewrite on schedule [pr] (#10872 ) * one less rewrite on schedule [pr] * verify in ebs	2025-06-18 17:06:17 -07:00
wozeparrot	bdbf121285	fix: contigous -> contiguous (#10868 )	2025-06-18 13:09:51 -07:00
George Hotz	cba6e15937	split grouper and kernelize [pr] (#10854 )	2025-06-17 17:54:20 -07:00
uuuvn	a51f18f8f9	CI flakiness (#10851 ) https://github.com/tinygrad/tinygrad/actions/runs/15718103629/job/44292845140?pr=10753#step:4:161	2025-06-17 14:46:30 -07:00
nimlgen	c0329148c7	am: check va is aligned to page size (#10815 ) * am: check va is aligned to page size * swap them * is this faster	2025-06-15 22:51:09 +03:00
George Hotz	5dc1bc6070	switch get_kernel -> get_program [pr] (#10817 ) * switch get_kernel -> get_program [pr] * fix tests	2025-06-15 12:26:50 -07:00
wozeparrot	eb739bb96a	hotfix: lower threshold (#10786 )	2025-06-11 19:36:20 -04:00
chenyu	612cdf5146	move fuzz_shape_ops to run with other fuzzer (#10767 ) * move fuzz_shape_ops to run with other fuzzer * don't skip CPU	2025-06-10 17:43:04 -04:00
b1tg	52c49dd4f3	fix onnx ci (#10762 ) Co-authored-by: b1tg <b1tg@users.noreply.github.com>	2025-06-10 14:28:40 -04:00
George Hotz	f84c320548	better external_benchmark_schedule [pr] (#10722 )	2025-06-09 10:26:11 -07:00
b1tg	24d328e313	onnx parser (#10435 ) * onnx parser * fix compile, lint * onnx.load -> onnx_load * compatible with ModelProto * fix test external_test_onnx_ops.py * fix tests * fix signed int * reduce to 261 lines * fix TypeProto.Optional * debug for _parse_message, add TypeProto.Sequence, cleanup * onnx_load from Tensor * remove BufferedReader * 174 lines and reduce tensor copy * cleanup * use onnx_load in external_model_benchmark.py * fix qcom test * [onnx] parser support external data --------- Co-authored-by: b1tg <b1tg@users.noreply.github.com> Co-authored-by: chenyu <chenyu@fastmail.com>	2025-06-09 12:44:28 -04:00
George Hotz	81b9c04574	move high level stuff to unit tests [pr] (#10708 ) * move high level stuff to unit tests [pr] * process replay on unit tests * fix pr, less compute * set omp num threads * set 200MB buffer size limit * delete junk * fix tests * faster * move test_indexing to unit * faster	2025-06-08 14:05:56 -07:00
George Hotz	32e9949052	rename lazydata to uop (#10698 )	2025-06-08 08:42:22 -07:00
leopf	eb7305e6a4	Tensor.keccak("sha3_256") (#7186 ) Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com> Co-authored-by: George Hotz <geohot@gmail.com> Co-authored-by: wozeparrot <wozeparrot@gmail.com>	2025-06-06 15:24:05 -07:00
wozeparrot	0d86f8d375	fix failed threefry (#10646 )	2025-06-05 17:17:42 -07:00
chenyu	46811d0d3c	minor external_model_benchmark cleanup (#10644 )	2025-06-05 14:13:28 -04:00
chenyu	80ebce421d	remove metal buffer limit in external_model_benchmark [pr] (#10642 ) not needed anymore	2025-06-05 13:00:51 -04:00
wozeparrot	4d1686f767	clean: becnhmark -> benchmark (#10620 )	2025-06-03 19:28:18 -07:00
qazal	910cabb081	add kernel count to grouper process replay differ [pr] (#10611 )	2025-06-03 15:21:27 +03:00
qazal	3cc73a0172	simpler process replay main loop [pr] (#10588 ) * simpler process replay main loop [pr] * use logging * default to 1	2025-06-01 15:03:21 +03:00
qazal	dc882d3d7d	merge process replay and viz captures [pr] (#10581 ) * refactoring * test script * work * more work * diff * repr splits lines correctly * that * add location * add location * also don't need name_override * k.copy * [pr] * name_override 2 * err	2025-06-01 12:30:10 +03:00
George Hotz	b3b43a82c4	remove Tensor.no_grad, it's meaningless now [pr] (#10556 )	2025-05-28 22:20:02 -07:00
Sieds Lykles	ae02a1e232	[bounty] Z3 symbolic fuzzer [pr] (#10514 ) * First version, caught a bug? * Nicely print failure to reproduce * Remove that * Put the assert back * Change fuzzing to use testing_unit so it has z3 * Test key to match * Add rule * Add test * Add test for edge case 0 * Merge patterns * update comment * consistent whitespace * whitespace * add condition * add test * update comment * use Variable * fuzzer using z3_renderer * Cleaned up printing and debugging * working new fuzzer * change some comments and printing * more formatting * fuzz failures in seperate file * fix fstring * more tests * naming * remove added line * remove comment * print number of skipped expressions * use self.assertEqual --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-05-28 16:28:37 -04:00
geohotstan	fd9f236a82	move test over (#10508 )	2025-05-25 21:51:51 -04:00
George Hotz	0d39bb5de1	rename to get_kernelize_map (#10465 )	2025-05-22 11:44:44 -07:00
qazal	df4cbb69e9	move fuzz_schedule.py to extra [pr] (#10444 )	2025-05-21 10:07:24 +03:00
chenyu	29624af872	skip commavq in external_model_benchmark (#10439 ) precision issue with different onnxruntime version	2025-05-21 01:45:33 -04:00
nimlgen	2895198c36	am: download regs (#10419 ) * am: download regs * x * linter * mypy * after merge * raise * fixed name * fix * xx * remove * missing reg * missing reg * move to online * ops	2025-05-20 18:59:56 +03:00
George Hotz	b06291077c	no amdgpu kernel driver (#10408 ) * no amdgpu kernel driver * don't test hip * lower req	2025-05-18 20:52:39 -07:00
George Hotz	411392dfb7	move files into uop dir (#10399 ) * move files into uop dir [pr] * tinygrad.uop is a thing * fix uop docs, no pr * fix viz	2025-05-18 11:38:28 -07:00
qazal	9e2089dcd4	don't raise Exception in process replay [pr] (#10392 ) * don't raise Exception in process replay [pr] * continue generating diffs unless [pr] is set, exit(1) otherwise * change * works	2025-05-18 11:23:23 +03:00
qazal	e9e5b54e43	grouper cleanups and merge with insert_kernels [pr] (#10349 ) * grouper cleanups and merge with insert_kernels [pr] * remove that	2025-05-16 14:39:56 +03:00
wozeparrot	1ed04f993b	move benchmark stat tracking to influxdb (#10185 )	2025-05-15 16:14:56 -07:00
qazal	1770e00c41	only CAPTURE_PROCESS_REPLAY=1 + add filterwarnings back [pr] (#10292 )	2025-05-14 11:58:42 +03:00
qazal	1c97338be5	enable process replay assert for schedule [pr] (#10280 ) * enable process replay assert for schedule * start at unique+1	2025-05-14 11:10:47 +03:00
uuuvn	7bc4864bc4	Make `dev` a property of `Allocator` (#10286 ) * Make `dev` a property of `Allocator` (this is a prereq refactor for #10285) At least `BufferXfer.copy` accesses it assuming it's always present, currently most devices just add this property on their own repeating the same code over and over again. This is also a bit footguny, see `RemoteAllocator` that named this property `device` instead of `dev`, i could obviously just change that in one place but doing it globally seems like a better solution (and it reduces code duplication too). `MallocAllocator` is a bit special, but passing `None` works just fine. * typing * ignore type instead of cast	2025-05-13 17:01:01 -07:00
nimlgen	6f42bf8b54	usbgpu: 10 steps in benchmark to hit cache (#10273 )	2025-05-13 17:06:50 +03:00
geohotstan	1c4ab6b991	ONNX add tests against ORT (#10270 ) * start * clean up * indicate file location too	2025-05-13 04:03:52 -04:00
nimlgen	2145bce3f9	usbgpu: copyin size is 16k (#10240 ) * usbgpu: copyin size is 16k * ush	2025-05-09 22:12:54 +03:00
nimlgen	267ba9b592	usbgpu: better names in copy speed benchmark (#10212 )	2025-05-08 16:12:37 +03:00
nimlgen	ba52fce4b2	usbgpu: benchmark in ci (#10208 ) * usbgpu: benchmark * usbgpu: benchmark	2025-05-08 12:02:04 +03:00
wozeparrot	10437904cd	refactor: ops_cloud -> ops_remote [pr] (#10166 )	2025-05-05 15:59:51 -07:00
George Hotz	a0240d8c2b	lil work on llvm speed (#10157 ) * lil work on llvm speed * llvm failing test * 1e-4 * simpler failing test * once is fine * gpt suggests this syntax change * bump that debug	2025-05-04 16:37:26 -07:00
George Hotz	36ccaa88a6	move merge views [pr] (#10156 ) * move merge views [pr] * move flow to __init__ [pr]	2025-05-04 14:41:47 -07:00
George Hotz	5f3f162606	cache rewrites for renderer [pr] (#10155 ) * add caching to rewrites for renderer [pr] * remove that * update ebs	2025-05-04 13:45:15 -07:00
nimlgen	45bf7c5b81	am: add allocation bench (#10135 ) * init allocation bench * sorryg * betetr	2025-05-02 13:51:07 +03:00

779 Commits