tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-10 22:54:59 -05:00

Author	SHA1	Message	Date
chenyu	f4f56d7c15	move time_linearizer to extra.optimization.helpers [pr] (#9048) no longer used in tinygrad	2025-02-12 15:49:58 -05:00
nimlgen	e5a3f60fc2	am: remove libpciaccess dep (#8980) * am: remove libpciaccess dep * offset in mockhwiface * op * fake regions	2025-02-09 16:06:55 +03:00
George Hotz	ae45826758	hotfix: GRAPH_ONE_KERNEL + fix timing	2025-02-06 17:52:20 +08:00
George Hotz	1c53e8bf27	Revert "objc fast msg (#8922)" (#8926) This reverts commit `c3f99a727e`.	2025-02-06 17:50:49 +08:00
George Hotz	c3f99a727e	objc fast msg (#8922) * benchmark kernel launch * don't realize unneeded * faster * faster metal * fix mypy * new objc message style [pr] * without sync * no div 0 * lru cache that * no sync in the profile * fix * update all to new style * remove comment * graph one kernel * fix graph one kernel * remove that sync	2025-02-06 17:49:06 +08:00
George Hotz	a8e54df363	benchmark single kernel launch (#8921) * benchmark kernel launch * don't realize unneeded * faster * faster metal * fix mypy * without sync * no div 0 * lru cache that * no sync in the profile	2025-02-06 13:35:34 +08:00
qazal	6f0cc2e9c5	rename to KernelContext and move the linearize_sched comment [pr] (#8899) * rename to KernelContext and move that comment [pr] * 500	2025-02-05 07:49:58 +01:00
qazal	6a0da51ed0	truncate process replay logs [pr] (#8891) * truncate process replay logs [pr] * work * max_lines * bump to 1K	2025-02-04 20:26:48 +01:00
qazal	acf0baefee	process replay from tensor uops to kernel ast (#8883) * process replay from tensor uops to kernel ast * this dedups * switch back to string key	2025-02-04 18:09:20 +01:00
George Hotz	56fa5c1191	dsp simulator (#8869) * dsp simulator * progress * fix * close on test tiny * working * less waste * line savings * Device DSP compiler * mock DSP at the bottom * DSP tests * docker caching * test update * need load * skip that test for CI DSP * last touch * ugh	2025-02-04 09:45:04 +08:00
nimlgen	7841852870	hcq pci signal fuzzer (#8854) * hcq pci signal fuzzer * kk * correct	2025-02-01 23:42:27 +03:00
qazal	dc34a4146f	better process_replay context print [pr] (#8856) * better process_replay context print [pr] * test: revert push cast * Revert "test: revert push cast" This reverts commit `38a2aef6f8`.	2025-02-01 21:50:23 +02:00
chenyu	73ee2d74c0	raise RuntimeError for int base pow (#8852) current implementation is not precise and blocking other simplification change	2025-02-01 12:11:57 -05:00
qazal	72e1f41f8e	add unbind_vars pattern matcher (#8851) * add unbind_vars pattern matcher [pr] * this can be cvar * this is empty	2025-02-01 18:25:44 +02:00
Ankit Avinash	7647cd8428	[bounty] Stride is flip (#8792) * replace stride with flip * Complete replacing stride with flip clean flip function in view.py fix tests * fix tests for multi shapetracker * fix tests for fuzz shapetracker * fix tests for fuzz shapetracker * debug * debug * fix * fix * fix --------- Co-authored-by: George Hotz <geohot@gmail.com> Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-01-31 11:34:10 +09:00
chenyu	0513b0c17d	lower green test_gemm_8192 tflops to 125 [pr] (#8820) flaky	2025-01-30 17:30:08 -05:00
nimlgen	a2faa5e49b	am: fix pt free (#8810)	2025-01-30 15:14:55 +03:00
Ignacio Sica	260df1a17f	`tc_select` noop (#8801) * tc_select noop * revert changes in test	2025-01-29 13:53:23 -05:00
qazal	ed672881b0	remove additions/deletion in pr + check uops are equal [pr] (#8779) * use warnings there [pr] * remove those + move assert_diff [pr] * warn after log * remove * back	2025-01-28 08:57:34 +02:00
Ignacio Sica	b240f12593	[TIP-9] rename Opt's amt to arg 2 (#8770) * rename Opt amt to arg * ignore_beam_cache for test_tiny * move ignore_beam_cache to test_tiny * move to separate pr * revert space change --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-27 14:19:04 -05:00
George Hotz	3ed146a5ff	Revert "rename Opt amt to arg (#8767)" (#8769) This reverts commit `bf041659a5`.	2025-01-27 23:46:37 +09:00
Ignacio Sica	bf041659a5	rename Opt amt to arg (#8767)	2025-01-27 23:36:47 +09:00
nimlgen	dc10187fc0	am: add am_smi (#8739) * am: start monitor * cleanups * fixes * hmm * progress * cleanup	2025-01-24 20:16:19 +03:00
nimlgen	e4512baea4	am: cleanup mm (#8730) * am: cleanup mm * cle * ops * entries	2025-01-23 15:49:37 +03:00
nimlgen	93fb50ce77	allreduce: add flags (#8713)	2025-01-22 17:44:31 +03:00
qazal	2dae467b75	scheduler + process_replay import cleanup (#8711)	2025-01-22 12:44:07 +02:00
nimlgen	c5e46c5eee	am: recover from any boot interrupt (#8703) * am: recover from any load interrupt * add fuzzer * nu	2025-01-21 22:22:23 +03:00
geohotstan	dd82b4c913	make onnx runner a class (#8647) * this * clean up * more clean ups and improve debug msg * more correct training toggler * remove manual training toggling * change some variable names * actually just add the training toggle for LIMIT envvar too * more refinement * __call__ and OnnxRunner * fix half pylint, other half is importing from onnx while this file is onnx.py, figure out later * ahhhh found another mistake * remove limit from __call__ --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-20 10:11:05 -08:00
nimlgen	08ca871d77	am: remove pm block (#8688) * am: remove pm block * hm * oops	2025-01-20 18:05:22 +03:00
nimlgen	9d3c40601f	am: fast memory manager (#8654) * start * progress * fixes * smth * mini fixes * fix2 * ugh, need this for now * faster * cleanups * tiny linters * make mypy happier * test & free pts * ops * linter * cleanup vm * fix * remove map_from * tiny fixes * add test to ci	2025-01-20 16:58:22 +03:00
George Hotz	98d01a059d	rename uopgraph to rewriter [pr] (#8682)	2025-01-19 17:03:12 -08:00
eliotgolding	0289fbb1c2	limit real_size to the size of first View of ShapeTracker (#8628) * fix real_size * add fuzzer; typing * spacing --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-16 16:27:39 -05:00
chenyu	52e7003414	Revert "make kits19 dataset samples have small sizes (#8591)" (#8610) This reverts commit `76a03e950a`.	2025-01-14 12:24:27 -05:00
Francis Lata	76a03e950a	make kits19 dataset samples have small sizes (#8591)	2025-01-14 08:27:45 -08:00
qazal	863abc7140	scheduling graph_rewrite prereqs for BLOCK in ASSIGN (#8598) * remove the BUF_LIMIT assert * skip the base one * work * work * good error * ok comment * shorter check	2025-01-14 03:01:59 -05:00
qazal	ae2229d727	assert kernel buffer limit at compile time [pr] (#8595) * remove the BUF_LIMIT assert * skip the base one	2025-01-13 16:32:07 -05:00
geohotstan	4abe631b56	fix onnx mobilenetv2-7-quantized.onnx (#8574) * is 67% considered fixed? * move test up * share function * add qgemm too * make sure qgemm comes out as int * actually that note is not right * remove qgemm (I did it wrong) and add it later lol.	2025-01-13 09:25:06 -08:00
George Hotz	d19c1c7f03	bump 75 -> 73 for test failure	2025-01-13 09:18:38 -08:00
qazal	79738d768c	do not require PYTHONPATH=. for process replay [pr] (#8567)	2025-01-11 09:45:34 -05:00
qazal	a70d1bf439	move print_diff to process replay [pr] (#8566) * move print_diff to process replay [pr] * ruff rightfully complians	2025-01-11 09:28:45 -05:00
qazal	60503c8621	use CAPTURE_PROCESS_REPLAY=1 in CI [pr] (#8564)	2025-01-11 06:03:48 -05:00
chenyu	6a7f971fa0	hotfix max(DEBUG, 2) -> max(DEBUG.value, 2) [pr] (#8553)	2025-01-10 12:57:44 -05:00
George Hotz	9833fe83d8	more work on onnx imagenet [pr] (#8552) * more work on onnx imagenet [pr] * working quantization * static quant * benchmark onnx 0 dim	2025-01-09 20:28:18 -08:00
chenyu	2cbb34535c	simpler allreduce script [pr] (#8551) time everything on tensor level and get time from GlobalCounters.time_sum_s	2025-01-09 21:38:13 -05:00
chenyu	23c56817d8	update and clean up allreduce script [pr] (#8549) make `run` to able to run with ring only	2025-01-09 19:35:28 -05:00
geohotstan	299d333806	Add QLinearConv, QLinearMatMul, QLinearAdd, QLinearGlobalAveragePool to onnx (#8478) * QLinearEverything * ok ort verify passes * this should be int instead * cast to int then char to do wraparound * cleaner * move contrib ops to microsoft ops --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2025-01-09 15:08:53 -08:00
chenyu	85a4397f27	fix create_schedule_with_vars usage in allreduce benchmark [pr] (#8522) * fix create_schedule_with_vars usage in allreduce benchmark [pr] because i didn't know how to use it... * increase time limit because tiny17 is slow	2025-01-07 01:30:01 -05:00
chenyu	0061dc7447	fix benchmark allreduce and add to ci [pr] (#8521)	2025-01-07 00:37:59 -05:00
geohotstan	9229867fec	Support asymmetrical pads for all pooling functions (#8109) * implemented in tensor * apply onnx tests to asymmetrical pads * better onnx op ordering * correct ceil_mode asymmetrical * fix onnx_ops comments * a few more TODOs and fix some stupidity * fix some typing * fix test * mypy still a little messed up * refactor out pad struct transformation * add simple docs for now * add whatever tests possible * add tests for _resolve_pool_pads * better err msg * whoops didn't mean to include this * retry CI * enable asymmetric pads onnx tests * better docs --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2025-01-05 16:01:08 -05:00
qazal	12fa4340b3	pickle ContextVars in process replay [pr] (#8484) * pickle ContextVars in process replay * add test_pickle_context_var [pr] * more realistic	2025-01-03 23:11:54 +08:00

659 Commits