tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-04-29 03:00:14 -04:00

Author	SHA1	Message	Date
George Hotz	ef58ab340a	hotfix: remove n=auto from REMOTE=1 test	2025-06-09 09:19:36 -07:00
chenyu	d93a0bee6b	mlperf ci uses its own cache (#10705 ) not to interfere with regular cache which is used by benchmark	2025-06-08 19:43:32 -04:00
George Hotz	81b9c04574	move high level stuff to unit tests [pr] (#10708 ) * move high level stuff to unit tests [pr] * process replay on unit tests * fix pr, less compute * set omp num threads * set 200MB buffer size limit * delete junk * fix tests * faster * move test_indexing to unit * faster	2025-06-08 14:05:56 -07:00
George Hotz	4305f532d9	clean up apt stuff (#10706 ) * clean up apt stuff * single apt install * fixes * fix opencl + ldconfig	2025-06-08 11:06:09 -07:00
George Hotz	4e2c3560b4	smaller tests are faster tests [pr] (#10704 ) * remove del spam from CI * more * preconstruct default buffer spec * ignore those errors * check exception * more exception check * skip stuff * smaller tests mean faster tests * a few more	2025-06-08 10:54:19 -07:00
George Hotz	32141ec867	make apt CI faster (#10702 )	2025-06-08 09:43:39 -07:00
chenyu	4f535641f7	add one huggingface_onnx test to mac benchmark ci (#10700 ) this crashed for me on onnx parser pr but seems fine for the author. see if ci mac is fine	2025-06-08 12:26:12 -04:00
George Hotz	7ff175c022	cache a venv to avoid pip usage (#10689 ) * try built in pip caching * try venv * export venv * set VIRTUAL_ENV * revert that * venv key * fix * ci cache hit? * fix windows	2025-06-07 20:13:41 -07:00
George Hotz	53ed64e133	ci speed work 1 (#10676 ) * skip a few slow tests * use a venv for python packages * create venv * no user, it's in venv * ignore venv * venv * new cache key * try that * this * version the python cache	2025-06-07 16:33:11 -07:00
wozeparrot	37e1ef1be3	feat: cleanup old AM processes (#10653 )	2025-06-05 15:41:00 -07:00
qazal	7114b6ab31	viz browser tests (#10626 ) * viz browser tests * expect failure if js/ isn't included * back green	2025-06-04 14:58:24 +03:00
chenyu	18e9ec3ea1	add wino cifar to search benchmark (#10615 ) * add wino cifar to search benchmark * FUSE_OPTIM=1 * revert those	2025-06-03 20:38:43 -04:00
chenyu	1c1f578490	DISABLE_COMPILER_CACHE in sdxl search (#10614 )	2025-06-03 09:22:25 -04:00
chenyu	4ab3391e6f	`set -o pipefail` for mlperf run_and_time (#10577 ) also run the 5.1 script in ci cron job	2025-05-30 16:36:44 -04:00
wozeparrot	5e3c4a8431	fix: comma testsig (#10568 )	2025-05-29 19:00:07 -07:00
George Hotz	ee12e801a3	optional fused optimizers (#10549 ) * enumerate cases of Tensors in the JIT * optional fused optimizers * add fused optimizer test * move that there * ugh	2025-05-28 13:50:30 -07:00
Sieds Lykles	ae02a1e232	[bounty] Z3 symbolic fuzzer [pr] (#10514 ) * First version, caught a bug? * Nicely print failure to reproduce * Remove that * Put the assert back * Change fuzzing to use testing_unit so it has z3 * Test key to match * Add rule * Add test * Add test for edge case 0 * Merge patterns * update comment * consistent whitespace * whitespace * add condition * add test * update comment * use Variable * fuzzer using z3_renderer * Cleaned up printing and debugging * working new fuzzer * change some comments and printing * more formatting * fuzz failures in seperate file * fix fstring * more tests * naming * remove added line * remove comment * print number of skipped expressions * use self.assertEqual --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-05-28 16:28:37 -04:00
chenyu	23e41f523a	sdxl also run with cached search (#10546 )	2025-05-28 06:51:56 -04:00
chenyu	fffdc4d31c	workflow to run sdxl with search (#10543 )	2025-05-27 17:25:41 -04:00
uuuvn	c29c46853f	Very basic mock sqtt (#10512 ) This mockgpu sqtt emulation will just ignore basically everything and end up with a 0x1000 size trace full of zeroes, but just testing for things like register rename is better than nothing i guess	2025-05-26 14:38:28 -07:00
chenyu	2eeea373af	add BENCHMARK_LOG for mlperf resnet cron (#10516 )	2025-05-25 22:00:29 -04:00
b1tg	a1f64af92d	ci: setup llvm for amdremote (#10507 ) Co-authored-by: b1tg <b1tg@users.noreply.github.com>	2025-05-25 21:52:27 -04:00
wozeparrot	7c81f9f95e	fix: gate mlperf workflow (#10515 )	2025-05-25 17:06:21 -07:00
George Hotz	6b8eb5fec2	split mlperf to its own red benchmark run (#10492 ) * Add mmapeak implementation for 7900 XTX * Change identation * Use a template instead of multiple assebly files * Fix output formatting * Reduce register file bank conflicts * More accurate measurement for quick instructions * Add support for gfx1201 * RDNA4 wmma requires less VGRPs * RDNA4 does not have s_cmpk instructions * Add v_wmma_i32_16x16x32_iu4 for gfx1201 * Add sparse wmma instructions * split to tinybox red MLPerf Benchmark --------- Co-authored-by: Panagiotis Kourouklidis <panagiotis.kourouklidis@gmail.com>	2025-05-23 17:12:41 -07:00
George Hotz	bf2a0907be	gate the mockdsp behind MOCKDSP=1 [pr] (#10486 )	2025-05-23 11:44:02 -07:00
uuuvn	3ca5680920	Test remote in benchmark (#10304 ) hlb cifar is fast so added it, can add bert too if you think it's ok 6 real gpus to test multigraph and transfers + accuracy validation should probably be added to tinystats too, i don't know how though Co-authored-by: chenyu <chenyu@fastmail.com>	2025-05-23 12:12:57 -04:00
chenyu	c5acb4e06e	run mlperf resnet daily (#10482 ) Runs at 08:05 UTC (12:05 AM Pacific Time)	2025-05-23 07:16:20 -04:00
chenyu	116d9e6306	run mlperf resnet on red box (#10413 ) also made push to `update_mlperf` branch trigger	2025-05-19 12:48:36 -04:00
George Hotz	f1fe1f93c1	hotfix: 14000 lines	2025-05-19 09:40:53 -07:00
qazal	90eb3c0e5d	add MobileNetV2 benchmark to comma CI (#10250 ) * add MobileNetV2 to comma CI * symlink imagenet * also the signature * comment that out * need imagenetmock * same train and test set * quantize on CPU=1 * verbose * need __hexagon_divsf3 * 0x858d6c15 * quant cpu + CC=clang-19	2025-05-19 18:22:50 +03:00
George Hotz	b06291077c	no amdgpu kernel driver (#10408 ) * no amdgpu kernel driver * don't test hip * lower req	2025-05-18 20:52:39 -07:00
chenyu	485e80da69	run_and_time for resnet ci (#10405 )	2025-05-18 23:39:57 -04:00
uuuvn	0f825e12f2	Remote fixedvars (#10371 ) * amd mockgpu graph support For testing remote graph stuff (prompted by #10371) in ci * Remote fixedvars Somehow none of existing tests failed when fixedvars were added, looking what to add as an regression test for this --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-05-18 09:57:13 -07:00
uuuvn	27c12be471	amd mockgpu graph support (#10385 ) For testing remote graph stuff (prompted by #10371) in ci	2025-05-18 09:43:16 -07:00
chenyu	9b4e2a75cd	symlink datasets in mlperf workflow (#10391 )	2025-05-18 03:26:05 -04:00
qazal	0294bfe507	simpler can_pad (#10364 ) * simpler can_pad [pr] * 3 kernels * tests * less kernels	2025-05-18 10:00:07 +03:00
chenyu	efa8dfe7fb	test cron job to run resnet (#10368 )	2025-05-17 08:57:02 -04:00
chenyu	c798f2f427	brew --quiet to suppress already installed warnings (#10346 ) example https://github.com/tinygrad/tinygrad/actions/runs/15057000247	2025-05-15 23:31:18 -04:00
wozeparrot	1ed04f993b	move benchmark stat tracking to influxdb (#10185 )	2025-05-15 16:14:56 -07:00
Ignacio Sica	47b3055fe2	set fail-fast behavior (#10336 )	2025-05-15 11:24:45 -07:00
George Hotz	50181ab09f	hotfix: bump to 13500 lines	2025-05-14 18:49:59 -07:00
George Hotz	7a3d4de59a	hotfix: add GRAPH_ONE_KERNEL=1 to UsbGPU openpilot test	2025-05-14 14:50:37 -07:00
George Hotz	f1130ab3d3	openpilot benchmark test (#10290 ) * openpilot benchmark test * that	2025-05-13 22:49:28 -07:00
George Hotz	ec46f658d7	openpilot llvm test [pr] (#10288 )	2025-05-13 16:51:41 -07:00
uuuvn	ddff9857b8	Remote properties is a dataclass (#10283 ) Not strictly required for anything but soon there will be like 4 new properties and having it be a huge json just seems like a bad taste. It also seems right to not have a separate endpoint for this, just `GetProperties` request that returns a repr of this similar to how requests are sent in `BatchRequest`. This will also make a switch to anything other than http much simpler if it will be required for any reason, like just a tcp stream of `BatchRequest`s	2025-05-13 11:56:58 -07:00
uuuvn	ba87eca0f1	Remote multi (basic) (#10269 ) * Basic remote multi support Simplest thing to be able to use remote with multiple gpus, very slow because no transfers (copyin copyout for cross-device copies) * tests	2025-05-13 09:52:47 -07:00
chenyu	ad5cb2717d	FUSE_ARANGE=1 in bert bench (#10263 ) still fails, something multi related maybe Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2025-05-13 09:12:19 -04:00
chenyu	0015b3921f	sleep more in CI Remove amdgpu (#10261 ) see if this is less flaky	2025-05-12 08:13:44 -04:00
hooved	7b4f05fd00	Add test for correctness of Infinity in WebGPU (#10201 ) * use function for infinity instead of uniform * test infinity math locally * test infinity math in CI * make pytest available to MacOS (WebGPU) * revert to master except failing webgpu test	2025-05-08 05:20:05 -07:00
nimlgen	7d6ed1b1e9	hotfix: mac ci (#10210 ) * fixed? * cmnt	2025-05-08 14:13:23 +03:00

833 Commits