tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-09 15:08:02 -05:00

Author	SHA1	Message	Date
George Hotz	acf72872b3	move view left to the outer graph prereqs + testing (#10725) * move view left to the outer graph * global view right * dont need that one * remove comment * test kernelize * simple * split onnx, test sdxl null * fix testing * ugh, wrong one * Update test.yml	2025-06-09 20:43:25 -07:00
b1tg	24d328e313	onnx parser (#10435) * onnx parser * fix compile, lint * onnx.load -> onnx_load * compatible with ModelProto * fix test external_test_onnx_ops.py * fix tests * fix signed int * reduce to 261 lines * fix TypeProto.Optional * debug for _parse_message, add TypeProto.Sequence, cleanup * onnx_load from Tensor * remove BufferedReader * 174 lines and reduce tensor copy * cleanup * use onnx_load in external_model_benchmark.py * fix qcom test * [onnx] parser support external data --------- Co-authored-by: b1tg <b1tg@users.noreply.github.com> Co-authored-by: chenyu <chenyu@fastmail.com>	2025-06-09 12:44:28 -04:00
Sieds Lykles	cfa65bea05	Subtract 1 from Variable upper bound (#10715)	2025-06-09 09:25:53 -07:00
George Hotz	32e9949052	rename lazydata to uop (#10698)	2025-06-08 08:42:22 -07:00
chenyu	e88fe41d37	update vits vctk model to use download from huggingface (#10688) google drive points to a warning page that does not work	2025-06-07 20:47:28 -04:00
Sieds Lykles	c29a56dd51	Fix whisper OOB (#10685) * fix whisper and test * remove import	2025-06-07 20:23:50 -04:00
Sieds Lykles	2f605eadf7	fix oob (#10666)	2025-06-07 11:32:03 -04:00
wozeparrot	0d86f8d375	fix failed threefry (#10646)	2025-06-05 17:17:42 -07:00
chenyu	4ab3391e6f	`set -o pipefail` for mlperf run_and_time (#10577) also run the 5.1 script in ci cron job	2025-05-30 16:36:44 -04:00
chenyu	baf482d314	copy mlperf stuff to 5.1 (#10576) 5.0 is finalized, new changes go to 5.1	2025-05-30 16:12:39 -04:00
George Hotz	b3b43a82c4	remove Tensor.no_grad, it's meaningless now [pr] (#10556)	2025-05-28 22:20:02 -07:00
George Hotz	e4e7b5d7e1	continue work on beautiful cifar (#10555)	2025-05-28 21:42:01 -07:00
George Hotz	871df1436a	more beautiful cifar (#10551) * enumerate cases of Tensors in the JIT * optional fused optimizers * add fused optimizer test * move that there * ugh * work on beautiful_cifar * speed close to hlb_cifar * schedule to corealize all * one line sched step * less lines	2025-05-28 20:48:20 -07:00
chenyu	74cf5dbd9e	mlperf system updates (#10550) standardized processor and accelerator names	2025-05-28 16:15:46 -04:00
chenyu	51dc7eedb0	correct use AM for resnet run_and_time (#10524)	2025-05-26 15:33:11 -04:00
chenyu	c1919ad55f	use AM for resnet run_and_time (#10523)	2025-05-26 14:50:49 -04:00
chenyu	2d50efb92b	`set -e` on mlperf run_and_time scripts (#10519)	2025-05-26 09:22:30 -04:00
chenyu	dc6309242d	WallTimeEvent for mlperf ci (#10506)	2025-05-24 10:56:03 -04:00
George Hotz	0d39bb5de1	rename to get_kernelize_map (#10465)	2025-05-22 11:44:44 -07:00
George Hotz	577a0b4cfa	openpilot compile4 (wip) (#10407) * openpilot compile4 * add copies * remove junk	2025-05-22 10:47:34 -07:00
chenyu	67d1364106	update LOGMLPERF in red resnet run_and_time (#10416)	2025-05-19 13:23:33 -04:00
qazal	90eb3c0e5d	add MobileNetV2 benchmark to comma CI (#10250) * add MobileNetV2 to comma CI * symlink imagenet * also the signature * comment that out * need imagenetmock * same train and test set * quantize on CPU=1 * verbose * need __hexagon_divsf3 * 0x858d6c15 * quant cpu + CC=clang-19	2025-05-19 18:22:50 +03:00
chenyu	485e80da69	run_and_time for resnet ci (#10405)	2025-05-18 23:39:57 -04:00
George Hotz	411392dfb7	move files into uop dir (#10399) * move files into uop dir [pr] * tinygrad.uop is a thing * fix uop docs, no pr * fix viz	2025-05-18 11:38:28 -07:00
George Hotz	0b733ba75e	multi device training with GPT2 [pr] (#10375) * multi device training with GPT2 [pr] * Update grouper.py	2025-05-17 15:33:56 -07:00
wozeparrot	12a1ccc680	clean: double import (#10345)	2025-05-15 20:15:09 -07:00
wozeparrot	1ed04f993b	move benchmark stat tracking to influxdb (#10185)	2025-05-15 16:14:56 -07:00
George Hotz	568d6d96e7	small changes from new multi [pr] (#10318)	2025-05-14 20:50:59 -07:00
George Hotz	bfc30fa6ea	hotfix: typo in shm_name	2025-05-14 19:34:52 -07:00
George Hotz	2bc54b3e22	manually handle OSX	2025-05-14 19:17:51 -07:00
George Hotz	ab460486d7	Revert "resnet dataloader osx (#10316)" This reverts commit `aef336930a`.	2025-05-14 19:15:07 -07:00
George Hotz	aef336930a	resnet dataloader osx (#10316) * mlperf dataloader on mac * resnet dataloader [pr] * simple should work	2025-05-14 18:31:26 -07:00
chenyu	fbaa26247a	randn_like in minrf (#10298) tested that it trains to similar loss	2025-05-14 07:59:50 -04:00
George Hotz	98c84a711d	min rectified flow example [pr] (#10252) * work on minrf example * more * jit sample * t is tensor not const * fixes * more convs * fix dropout * don't print * 504 * big patch * onehot * touch * use embeddings * dumb uses final layer * act * non fl * match * tp * 3 * of * ppsz * normal * add adln * no t * weird transformer * weird transformer * contig * actual speed fix * dumb * cb * 0 * t is 0 * mort-t * args * dumb days are over * readable * contig * no more t mask * mask_t * init to zero * clean * steps * work * tt * t * solid	2025-05-11 18:36:44 -07:00
Adam Van Ymeren	a28ca0680f	update dead link (#10242)	2025-05-09 19:59:52 -04:00
Rory Clear	9f2931ae67	Fix yolo load failing silently (#10046) * wait for js before loading model * use f32 * revert html changes, try both cameras and remove f16 req * clean	2025-05-07 11:46:09 -07:00
Kevin Buhler	363481e2fb	correct mispelled words (#10165)	2025-05-05 08:12:41 -07:00
chenyu	4a04098389	fix llama3 with nf4 quantize (#10107) also int8 outputs is wrong	2025-04-29 15:14:36 -04:00
qazal	a59d18da21	hack for VIZ=1 with examples/llama (#10103) * hack for VIZ=1 with examples/llama * move it alongside BEAM=0	2025-04-29 23:42:17 +08:00
chenyu	3eba3d6ee9	don't pass model in convert_from_huggingface and convert_from_gguf (#10094) it only needs n_layers	2025-04-28 20:11:19 -04:00
chenyu	610ee79b22	cherry pick mlperf5.0 branch to master (#10089)	2025-04-28 15:36:56 -04:00
George Hotz	b341296304	hotfix: save sdxl ram	2025-04-27 12:09:45 -04:00
George Hotz	68c5f7ba80	load fast in sdxl (#10072) * load fast in sdxl * back to that with the ret * no context	2025-04-27 11:58:51 -04:00
George Hotz	4b8ef6ce78	hotfix: sdxl corealize	2025-04-27 10:41:46 -04:00
George Hotz	1253819151	make beautiful indexing use a Variable (#10063) * make beautiful indexing use a Variable * stunning test * better color * training is broken * fix tests * fix variable indexing * fix test * no contiguous * revert that * revert that too * indexing two bind * skip for webgpu * make not slow	2025-04-27 08:22:38 -04:00
Rory Clear	a13a43c4fe	yolo 416 to 640 res (#10047)	2025-04-26 20:45:58 -04:00
George Hotz	ea5dddc537	reduce collapse generic (#10045) * reduce collapse generic * new arange folder * new range folding * correct with sym * all tests pass * indexing ops passes * failing tests * fix tests, remove unused * revert that * torch indexing is fast * skip on webgpu * touchups * comments	2025-04-26 09:13:24 -04:00
Rory Clear	3a189fa561	More yolo processing in tinygrad (#9928) * more tg less np * update webgpu html for new compile * resize boxes * remove text * add back note * fix indentation * fix indentation * remove magic num * remove now unused funcs * back to numpy nms * no loop * fix iou suppression * update test * dont suppress other classes * add working scale * fix expected value, rounded up 0.24 was being counted * add postprocess bool for onnx test * fix indents * clean * clean * fix indent * remove print * fix indent * remove unused import * remove hardcoded 0.25 * space * spacing * clean label_predictions func * remove single item lists * space * use postprocess output in test * space * clean * clean * remove redundant threshold * remove redundant threshold * clean * rename var * move loop into func * unhardcode iou_threshold * remove unused values * clean * add note * clean * keep const * move back funcs --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-04-24 16:21:46 -04:00
chenyu	74c6cf8be3	lint mlperf model_train (#10038)	2025-04-24 16:19:44 -04:00
chenyu	a25abf55e3	retinanet only call postprocess_detections with RUNMLPERF (#10017) during setup only need to compile `_eval_step().numpy()`	2025-04-23 20:45:38 -04:00

1106 Commits