Commit Graph

2480 Commits

Author SHA1 Message Date
wozeparrot
c870764940 Revert "add line changes diff bot to CI (#1863)" (#1870) 2023-09-15 16:56:42 -04:00
Yixiang Gao
789c84a7a3 add line changes diff bot to CI (#1863) 2023-09-15 16:29:58 -04:00
chenyu
29ac8293d7 run gpt2 in CI (#1866) 2023-09-15 04:37:02 +08:00
chenyu
1b46de1a3e fix type of helpers.prod, add test cases (#1859) 2023-09-14 05:16:55 +08:00
chenyu
e67306ba04 symbolic shape type with TypeGuard (#1852) 2023-09-13 05:27:22 +08:00
Roelof van Dijk
c91b44f7bf refactor: move size to view (#1848)
* refactor: move size to view

* fix: pylint

---------

Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-11 07:16:04 -07:00
chenyu
9e9ea20784 Fix view, CI cpu test with python 3.8 (#1845) 2023-09-10 22:37:58 -04:00
chenyu
3ec301c2d7 apply view.py patch (#1844) 2023-09-10 17:32:15 -07:00
Yixiang Gao
a32951a001 add test_tensor_copy (#1840)
* add test_tensor_copy

* fix whitespace

* add value check
2023-09-10 16:01:58 -07:00
Roelof van Dijk
1bc52c60df fix: minor tweaks to view (#1842)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-10 15:55:57 -07:00
George Hotz
47e602f717 view: do not trade complexity for speed (#1839)
* view: do not trade complexity for speed

* staticmethods

* view create
2023-09-10 11:29:53 -07:00
chenyu
c0bc4cfbaf DivNode.b is int (#1833) 2023-09-10 09:04:29 -07:00
nimlgen
13790b1e20 cast types in render_load (#1837) 2023-09-10 07:58:13 -07:00
David Hou
e74a6ca7e4 expand in terms of substitute (#1827) 2023-09-09 14:43:00 -07:00
George Hotz
0e3e2bac13 amd wino: upload results 2023-09-09 13:57:14 -07:00
George Hotz
6f95c5f284 winograd speed test for AMD (#1826) 2023-09-09 13:56:33 -07:00
George Hotz
0f2bd10d00 add winograd CIFAR to mac tests (#1825)
* add winograd CIFAR to mac tests

* symlink already done
2023-09-09 13:45:24 -07:00
nimlgen
31fca43706 kopt works with local+grouped reduce and tests (#1824) 2023-09-09 13:22:09 -07:00
chenyu
9da40c8448 move Node.__lt__ SumNode special case to SumNode (#1823) 2023-09-09 13:20:38 -07:00
Francis Lam
651205fa5c linearizer: support local and group_for_reduce dimensions together (#1821)
also minor changes to test_speed_v_torch.py and size of UOps.SPECIAL
2023-09-08 12:39:27 -07:00
segf00lt
9e8c1dbf34 patch to remove hack from stable_diffusion.py (#1814)
* patch to remove hack from stable_diffusion.py

* sorry linter

* realize after assign?

* float16 broken in llvmlite, use float64 for now

* int32

* idiot forgot to change test array dtype
2023-09-08 09:26:50 -07:00
chenyu
ebcda8a714 Move var_vals from ShapeTracker to LazyBuffer (#1819) 2023-09-08 09:25:10 -07:00
kormann
7ac65a93b4 utils.printtree (#1816)
* utils.printtree

* linter compliance

* rename to print_tree
2023-09-07 23:08:57 -07:00
George Hotz
4613c9e77c add tvm example, formatting (#1813)
* add tvm example

* no realize
2023-09-07 11:50:41 -07:00
nimlgen
5b15a972b5 no functions with same names in test/ (#1811) 2023-09-07 11:27:31 -07:00
George Hotz
722823dee1 stable diffusion: force fp16 free 2023-09-06 15:11:05 -07:00
chenyu
928cb1a64a AndNode.substitute short circuit (#1800)
* AndNode substitute short circuit

* Node.__bool__ is faster than Node.__eq__
2023-09-06 14:58:49 -07:00
nimlgen
a78a1fa499 fix jit buffer reuse when freed (#1802)
* fix jit buffer reuse when freed

* Forbid output_buffer reuse
2023-09-06 14:41:57 -07:00
Yixiang Gao
22cf15e9d0 convert function into tinygrad (#1803) 2023-09-06 14:41:26 -07:00
Pavol Rusnak
52a92bf95d use class Foo: instead of class Foo(): (#1797)
* use class Foo: instead of class Foo():

* add ruff linter, copy settings from .flake8 to ruff.toml
2023-09-06 12:20:25 -07:00
badcc
fd25792c8b Ensure freqs as type float32 in freqs_cis (#1798) 2023-09-06 10:24:15 -07:00
chenyu
35072877ef sym_infer is noop for int input (#1795) 2023-09-06 09:17:20 -07:00
George Hotz
f67638b27a delete broken DDPG example 2023-09-06 08:01:12 -07:00
George Hotz
78a43ad2c7 add uop fixup (#1793) 2023-09-06 07:55:22 -07:00
geohotstan
1bbf26d7fd fix try except not catching fxn() in benchmark (#1783)
* have function raise NotImplementedError

* more lines

* revert back to 2 lines :D

* aahhhhhhhh shoooot im stupid

* keep it minimal?
2023-09-06 07:36:43 -07:00
chenyu
09e78a9d07 Node does not need to subclass ABC (#1792)
* Node does not need to subclass ABC

* class Node:
2023-09-06 07:35:45 -07:00
badcc
ee9ac20752 Use correct dtype in Tensor when data is an ndarray (#1785)
* use correct dtype in Tensor when data is an ndarray

* attempt 2

* add assert to be consistent

* Add test case for ndarray

* Add test case for list

* remove whitespace
2023-09-06 07:35:32 -07:00
nimlgen
130cd55942 fix gpu compilation of const GEP (#1788) 2023-09-06 07:34:46 -07:00
George Hotz
e10a9692ec Revert "fix attn_mask None issue" (#1787)
* Revert "fix attn_mask None issue (#1786)"

This reverts commit bd06d88c73.

* Update tensor.py
2023-09-05 21:18:55 -07:00
David Hou
343b256deb PoC fast winograd compile (#1771)
* proof of concept for variable replace global load

* small hacks to make faster

* clean up a little?

* linter

* allow substituting with an expression

* clean up a little

* fix everything

* try to fix bug?

* type annotation

* typing

* typing
2023-09-05 21:14:40 -07:00
Pavol Rusnak
a50a7ef6f2 revert typo in external_multi_gpu.py (#1777)
introduced by fb1cc6bf4b
2023-09-05 20:46:28 -07:00
George Hotz
bd06d88c73 fix attn_mask None issue (#1786) 2023-09-05 20:45:54 -07:00
Francis Lam
0379b64ac4 add seed option to stable_diffusion (#1784)
useful for testing correctness of model runs
2023-09-05 19:45:15 -07:00
George Hotz
6100d7425f add 2 to locals, uops debug 5 (#1782) 2023-09-05 19:44:43 -07:00
Roelof van Dijk
2a11669e1d perf: faster and more readable merge_dicts (#1775)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-05 14:42:19 -07:00
George Hotz
89a8a02697 disable openpilot model in model benchmark 2023-09-05 13:32:30 -07:00
geohotstan
9af5645ba3 onnx full passing (#1076)
* 1

* 83 failed

* learning how git works

* lol idk

* zero shape aaaa

* space lol

* aaa

* test check

* haha

* fixed gather

* 73 failing

* 71 failing

* 68 failing

* added some debug

* fking resize

* lol

* 62 failing

* 58 failing, fucking did nearest resize hell yeah

* clean up

* 56 failing

* janitor duty

* lol

* 53 failing

* hi mom

* 50 failing

* added linear interp, but coord_trans is wrong

* did lin interpolation woohoo

* 43 failing

* 40 failing

* temporary Gather fix

* 39 failing

* fixed slice onnxver<10

* 37 failing

* 35 failing

* excluded tests that use float64

* 32 failing with hacks

* added _batchnorm() for 3D/5D batchnorm, 29 failing

* changed ALLOWED_KERNEL_COUNT from 199 to 207

* added improved Gather op, reverted ALLOWED_KERNEL_COUNT commit

* support Round op

* added storage_order/indices maxpool, 27 failing

* support maxunpool, 25 failures

* support Gradient, 23 failures

* merged new where

* added Adam

* cleanups

* added Momentum and Nesterov Momentum

* added Adagrad

* support sequence_type, 20 failing

* ugh git

* I give up on cubic interp :D, 9 failing

* sexy 1 liner gather, much improved, wow

* polished gather to make it shine bright like a diamond

* clean 1 liner for gather

* improved readability of gather

* uhh

* clean up

* more clean up

* WHITEspace

* implemented SoftmaxCrossEntropyLoss op

* added comments and cleaned up if statements

* update

* thank based wozeparrot for pow and new GatherElements

* CPU and TORCH all pass | cast float64 -> float32 for all fromCPU()

* _nearest_gather() failing on yolo

* reverted ops_cpu change and added assert in Resize

* added comments for resize for multiple channels

* oops

* merge

* test

* switched np.pad to Tensor.pad for constant padding

* gah

* gah2

* sexy reflect pad with movementops -> add

* delete commented out lines

* edge mode pad sexy as well

* trying out model_benchmark

* revert gitignore change lol

* init

* Revert "init"

This reverts commit 682bf2073a.

* wrote cast workaround for CPU, CPU and TORCH all pass

* wrote cast workaround for CPU, CPU and TORCH all pass

* skipped tests w/ 0 shape for METAL and GPU

* excluded tests for CLANG, CPU, TORCH, CLANG pass

* fixed hacky ConvTranspose

* gotta figure out autopad

* UOps.STORE support cast bool -> float

* small fix for fast gather

* reverted 0 shape skipped tests

* oops missed a file

* added comment

* fixed slice op hack

* First commit to pr

* More trig ops

* More trig ops

* format

* isinf support

* More ops

* changed onnx_ops to use our new gather :D

* Det op bug fix

* rebase

* fixed some tests

* det broken and slow

* fixed compress to use new gather

* implemented argmax argmin

* support variable types in type_proto

* support Upsample and Identity sequence

* we support float64 now and tinygrad supports automatic broadcasting

* added EyeLike op

* resize does support multiple channels now actually

* yolov8 onnx runs successfully

* added batch size 1

* oops

* finally fixed type_proto I think

* fixed some llvm bugs

* del whitespaces

* added ZenginU Format PR

* test

* oops

* added float64 exclude tests back

* more skipped tests

* try

* ok openpilot pass

* flake8 pass

* woooooohooo

* revert external_model_benchmark changes

* perf tested gather

* removed promote types from ops_cpu

* numerical errors from 1681 are fixed

---------

Co-authored-by: ZenginU <umutzengin00@gmail.com>
2023-09-05 13:23:32 -07:00
George Hotz
fb1cc6bf4b llama jit is default, print tok/sec (#1774)
* llama jit is default, print tok/sec

* jit not default in CI
2023-09-05 10:12:16 -07:00
Roelof van Dijk
f6e6a1a4d7 perf: avoid cast, restore isinstance (#1772)
Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>
2023-09-05 09:07:04 -04:00
geohotstan
671101e6b8 Metal stuff pip install by default when on Darwin (#1770)
* added to setup

* split lines for Darwin stuff
2023-09-04 21:59:54 -07:00