tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-25 23:08:06 -05:00

Author	SHA1	Message	Date
George Hotz	c5a941d466	webgl backend in extra (#3041 ) * WebGL WIP * 84% of ops passing test * tests passing 100% * Cleanup, refactor * Shave off some lines * Work on dtypes * TestOps at 100% again * Efficient net shaders compile in browser webgl2 * Compile all efficientnet shaders in browser * Create empty textures for tensor buffers * Run program. Up next weight loading * Exported WebGL model working * Add tests, refactor * Explicit cast alu for GLSL * Fix CI tests * WebGL efficientnet demo * Compile and run yolov8 in browser * Fix imports * Simplify yolo compile * Fix boolbool and cast cmplt to float More tests * Do std tests pass on CI? * Skip std tests on CI * Remove explicit_cast_alu hack, and solve it in code_for_op * Move to new dtype-less alloc api * Remove local size hack: optimize local_size only if device has local * Remove glsl.py, and move content to cstyle * dont_use_locals in opts * Fix dtype tests * type_map in CStyleLanguage * Make core changes smaller, cleaner, refactor export_model and demo * Skip pad_slice * Simplify: render_const, render_conditional * solve bool alu for other binops, cleaner ops_webgl * Fix noopt hack * Remove some skipIfs * WebGL image hack * type_names is a better name * global_max * Fix dtype import * Fix type_names -> type_map * Fix lint * Remove webgpu, back to 5k lines (#3040) * remove webgpu * max 5000 lines * revert those to master * retain that cstyle --------- Co-authored-by: Ahmed Harmouche <ahmedharmouche92@gmail.com>	2024-01-08 09:29:13 -08:00
chenyu	ef5f545fd8	add more Tensor.clip test cases (#3034 ) * add more Tensor.clip test cases add cases for same low/high and both negative etc * case min > max	2024-01-07 13:08:59 -05:00
chenyu	138c17c094	enable argmax tests for METAL/WEBGPU in CI (#3027 ) not sure why it was skipped but works now in CI	2024-01-05 21:43:00 -05:00
chenyu	520406cf3a	add Tensor.unflatten and Tensor.flatten(end_dim) (#3023 ) simplified cases when splitting a dim, or merge dims in prefix	2024-01-05 17:55:29 -05:00
chenyu	4465ef28c5	add test_softmax to test_ops (#3020 ) * add test_softmax to test_ops somehow it was not tested * too many buffers in softmax backward for WEBGPU	2024-01-05 11:19:49 -05:00
chenyu	ae112c9dbe	fix some long lines in tests (#3006 ) * fix some long lines in tests * better	2024-01-03 23:53:33 -05:00
Kevin Herro	bd6a0c90a0	add Tensor.split (#2750 ) * add Tensor.split (#2677) * fix mypy errors * add list support for Tensor.split * fix ruff comments * match tensor.split api * simplify split and test_split --------- Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>	2024-01-01 22:09:04 -08:00
George Hotz	a280cfe169	move dtypes to dtype.py (#2964 ) * move dtypes to dtype.py * fix urllib	2024-01-01 14:58:48 -08:00
Isalia20	8de1fc2539	Einsum space fix (#2927 ) * space removal in formula and a single test to cover it * space in torch einsum as well * replacing spaces in a var formula to support truncating all the spaces	2023-12-24 01:23:27 -05:00
George Hotz	1765849937	new lazy, benchmark (#2878 ) * lazy rewrite, try 2 * min fix tests * pass contig test * put broken pads back * move that to realize * no contig child fixes array packing * so wrong * now that's correct * base children * fix bind issues * disable to_image_idx * fix tests * that failure shouldn't break other tests * more fixes * fix torch * skip failing tests in CI * 1e-7 * half is broken * 1e-6 margin of error	2023-12-20 14:33:21 -08:00
geohotstan	fec8e9060c	Add simple fancy indexing exceptions (#2706 ) * fancy indexing raise error * updated error message * improved error check * oops * fixed onnx * oops typo * merge * add full_flatten * try * merged and updated some tests * more cleaning * done * temp fix onnx * try * add todo in onnx_test * reword * gah	2023-12-19 11:23:51 -05:00
chenyu	220abcd8ff	fix squeeze of 0-dim Tensor with negative dim (#2821 ) if ndim=0, only accepted dim is 0, -1, None. other negative dim results in IndexError	2023-12-17 22:02:07 -05:00
chenyu	85c6250a3e	support Tensor.einsum with no "->" in formula (#2807 ) output is the sorted alphabets if there's no "->"	2023-12-17 00:46:24 -05:00
George Hotz	051402625e	remove pushing contig + fix linearizer bug (#2798 ) * remove that logic * fix test, move LOADs * fix repeat issue on LLVM * with_phi	2023-12-16 09:36:31 -08:00
chenyu	765f8b05e5	TernaryOps.WHERE has vin[0] as bool and BinaryOps.CMPLT always outputs bool (#2782 ) * vin[0] to where is always bool * due to better hack * update test * fix test_uops	2023-12-15 14:51:51 -05:00
chenyu	81a747fc63	more test cases in test_slice_fancy_indexing_with_idx (#2751 )	2023-12-13 17:52:26 -05:00
George Hotz	7e5b3e53fe	changes to prep for new lazy (#2748 ) * changes to prep for new lazy * put those back	2023-12-13 10:28:22 -08:00
chenyu	aa4a0de287	simpler Tensor.pow to integer (#2746 )	2023-12-13 11:39:20 -05:00
George Hotz	6d6eb9302d	ruff checks the max line length is 150 (#2734 ) * ruff checks the max line length is 150 * fix tensor.py * a lot more * done	2023-12-12 17:34:47 -08:00
chenyu	2ee6f689c5	simpler einsum (#2700 )	2023-12-10 21:24:44 -05:00
Davi Silva	7fbebb3df6	Implement einsum (#2686 ) * hopeful impl for Tensor.einsum * satisfy mypy by having less typing. :( * a few simple tests * even more tests * permute tests * xfails for improper usage * fix LLVM test fail * use argfix * more helpful error message on shape mismatch	2023-12-10 15:56:01 -08:00
geohotstan	67ff2b2b18	Formatted test_indexing (#2688 ) * added tensor.clone() for more correct cloning behavior * some work and randint issue * formatted * final cleanups * oops, bug fix	2023-12-09 11:38:36 -05:00
Ahmed Harmouche	50dcd532d5	Get all WEBGPU test_ops passing (#2646 ) * Get all WEBGPU tests passing * Custom render store is not needed in wgsl	2023-12-06 07:40:37 -08:00
wozeparrot	6d58c19736	binaryops xor (#2627 ) * feat: initial xor * feat: numpy xor * feat: llvm xor * feat: quick test for xor * feat: slightly working xor in torch * feat: xor in tensor * feat: slightly better test	2023-12-05 13:21:42 -08:00
geohotstan	f12bcccb87	[ready] refactor getitem round 2 :D (#2568 ) * new getitem * go * add temporary simple tests * better * comments * WOW that took awhile * save 1 line lol * work * still need to add comprehensive tests, but i think getitem looks nice :D * GIMME GREEN CI CHECKMARK PLS * try.. * k idk * added tests for errors * fixed small hack * added tests * almost good * try no contig? * yay no more contig + comments and spacing * finishing touches (comments) * revert regex unittests lol * add suggested change * oops I fell asleep yesterday	2023-12-04 22:36:32 -05:00
andresgit	00523d5656	New fix accessing elements created by padding (#2529 ) * pad slice test cases, many failing * fix failing test cases check mask if we are outside the base buffer also create a multi-view if in that case we reshape to an empty shape * real_offset calculation more readable --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2023-12-01 19:08:10 -05:00
chenyu	7d26452305	call ruff with --preview (#2522 ) some checks are ignored without --preview	2023-11-30 13:59:00 -05:00
chenyu	5db0cdfbd3	support list of ints (or other Tensorable) in tensor indices (#2520 ) * support list of ints (or other Tensorable) in tensor indices * enable some index test cases	2023-11-30 12:46:33 -05:00
Liam	cf0c9096a9	Removing METAL Skips as CI works (#2488 ) * Test metal CI * remove metal and CI restrictions * enable dtype tests for metal ci	2023-11-28 19:46:59 -08:00
Christopher Mauri Milan	7f01dd04f0	Apply ruff linting rules to tests (#2473 ) * everything except F821 * enable F821 with noqa * dumb fix * fix remaining imports and (former) lambdas * replace _ with noqa to avoid gc	2023-11-27 21:24:06 -08:00
George Hotz	9e07824542	move device to device.py (#2466 ) * move device to device.py * pylint test --disable R,C,W,E --enable E0611 * fix tests	2023-11-27 11:34:37 -08:00
George Hotz	8ff2e13550	From teeny (#2426 ) * changes from teenygrad work * support not supporting ImageDType/PtrDType * fixups from teeny	2023-11-24 12:50:56 -08:00
George Hotz	8f89e21fca	torch and numpy don't share ops anymore (#2412 ) * torch and numpy don't share ops anymore * that should be filtered out elsewhere * still const * graph + enet example cleanup * hmm, we do still need it because of symbolic	2023-11-23 16:58:10 -08:00
chenyu	d2c0035c73	add back as_strided, move rebuilt mops to extra (#2344 ) * add back as_strided, move rebuilt mops to extra * negative stride for ops_cpu * Revert "negative stride for ops_cpu" This reverts commit `a13b6815ac`. * skip that * style	2023-11-17 14:34:30 -05:00
George Hotz	1d5501594e	force rebuild of ocelot (#2334 ) * force rebuild of ocelot * SzymonOzog gpuocelot * delete that * downgrade that * non parallel * force rebuild * use llvm * nauto * less mem maybe * print test * helper_test_exception skip CUDACPU * helper_test_exception * shippable	2023-11-16 20:44:14 -08:00
George Hotz	3baaf298d6	two stage cumsum in tensor.py (#2331 ) * two stage cumsum in tensor.py * 2 more kernels for llama cumsum * gpt-2 and llama use fast multinomial	2023-11-16 12:09:53 -08:00
chenyu	27f4c26312	fix getitem slice when end < start (#2329 )	2023-11-16 11:20:27 -05:00
chenyu	f1f863c953	allow 0-dim array to broadcast into zero shape tensor (#2315 ) * allow 0-dim array to broadcast into zero shape tensor * not in	2023-11-15 13:12:21 -05:00
chenyu	123a0b86b2	support zero in shape (#2303 ) * zero in shape start * no assert for that * if output size is 0, return without exec * tweak * strides * reduce over non-zero * shrink and expand * fix import * test_elementwise where * cannot reshape from size 0 to size 1 * compiled backend reduce over 0 * zeros for numpy * reduce over 0 and keepdim resulted in 1 * reduce empty set default values * compare with same input * pad test case * cat test case * torch does not support that?	2023-11-15 11:57:48 -05:00
chenyu	175cdbe815	fix pad None will value (#2308 )	2023-11-14 23:57:05 -05:00
George Hotz	78623ba204	two simple tests	2023-11-10 16:16:06 -08:00
George Hotz	85d26ddc36	uops loop removal (#2262 ) * remove the loop * cleanups * tests failing still * global_loop_ctx wasn't needed * replace_op is cleaner * minor opt * cast opt was wrong * uop_num * uop num was dumb * tuplize_uops * torch tests * fix test_uops	2023-11-10 15:24:47 -08:00
George Hotz	38b7f5a7fd	less phi, proper phi (#2241 ) * less phi, proper phi * disable flaky whisper test	2023-11-08 16:13:43 -08:00
chenyu	719a97b337	fix IMAGE=2 failed with NOOPT=1 (#2209 ) * IMAGE=2 failed with NOOPT=1 * fix it	2023-11-05 13:16:37 -08:00
chenyu	f582ec56d5	Replace (getenv("CI", "") != "") with helpers.CI (#2213 )	2023-11-03 15:20:44 -07:00
George Hotz	b245f1307e	add exp2 (#2192 )	2023-10-31 17:48:42 -07:00
George Hotz	87b714b8cb	split test_conv2d	2023-10-18 14:00:50 -07:00
George Hotz	15da96f393	print test durations and add speed (#2107 ) * print test durations * decrease sizes to increase speed * faster * GPU/CLANG onnx in separate runner * test split, move ONNX CPU CI * simpler tests * simpler uops test * faster * less cuda apt * running ninja install * apt install * split fancy indexing	2023-10-18 13:46:42 -07:00
George Hotz	c5edb3c374	train value net, improve API, add BCE (#2047 ) * api cleanups, BCE losses * valuenet * fixup examples * learning okay * add valuenet runner * net improvements * net improvements * 40% win rate	2023-10-12 07:56:38 -07:00
geohotstan	8d6cecb25c	Torch eq fix (#1562 ) * init * Revert "init" This reverts commit `682bf2073a`. * kids dont do drugs * one way to fix * resolve merge conflict * no more or * clean up	2023-10-11 12:57:11 -07:00

307 Commits