tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-19 02:44:40 -05:00

Author	SHA1	Message	Date
George Hotz	d627349af0	teeny changes (#1589 ) * teeny changes * import order	2023-08-20 13:38:38 -07:00
George Hotz	739f327d2d	Shorter (#1582 ) * deleting lines * remove insert dims * if statement is never hit * bug fixes	2023-08-20 08:12:16 -07:00
geohotstan	a293c18d34	Gather bugfix (#1561 )	2023-08-16 19:53:14 -04:00
geohotstan	8763037f0e	Fancy indexing is fancy wow and gather thing (#1399 )	2023-08-16 18:35:49 -04:00
YiMing Han	e00acb1eaf	fix deepwalk ctx check (#1536 )	2023-08-13 23:03:17 -07:00
Jacky Lee	ef5f648e2f	Tensor.scaled_dot_product_attention to match torch, used in LLaMA, and tested (#1502 ) * Implement scaled_dot_product_attention and test * Support attn_mask * Support is_causal too * Use in llama * Don't forget to reshape * Set requires_grad=False for causal * Remove staticmethod * Remove extra spaces	2023-08-08 23:27:13 -07:00
George Hotz	d24f936501	just cmplt (#1493 ) * just cmplt * fix maximum * don't save, there's no backward * ugh, no slot either * eq is a scam	2023-08-08 13:58:10 -07:00
George Hotz	d67e248d9b	simple bitcast 2 (#1445 ) * simple bitcast 2 * bc 2 * empty * Revert "empty" This reverts commit `d8ee083655`.	2023-08-06 00:30:50 -07:00
Francesco Castelli	579f4615a0	Add assert for wrong matmul/dot shapes (#1438 )	2023-08-04 18:16:56 -04:00
Umut Zengin	52db7d7435	inf, -inf support for pad (#1436 )	2023-08-04 15:05:25 -04:00
Umut Zengin	8889821547	Const pad support to pad2d and slice (#1392 ) * slice to pad2d migrate * Gain line * Mypy happy * Mypy happy * Revert * whitespace	2023-08-02 08:58:52 -07:00
Umut Zengin	0de5f20970	Re-open constant pad support to Tensor.pad (#1388 ) * Added const padding support to .pad * Linter	2023-07-31 17:08:57 -07:00
JaSpa99	5ab12059da	rng hlops: add normal and kaiming_normal (#1378 ) * add normal and kaiming_normal * make sure its float * add tests	2023-07-31 10:37:02 -07:00
wozeparrot	32d1afa4b5	feat: correct case when base is 0 (#1360 )	2023-07-27 13:53:38 -04:00
wozeparrot	c22e77abfd	Match torch on fractional negative base pow (#1352 ) * feat: match torch on fractional negative base pow * feat: tests for trunc	2023-07-26 19:14:54 -07:00
Umut Zengin	d4ebadf2da	Small Tensor.cat optimization and reformating (#1347 )	2023-07-26 18:01:12 -04:00
geohotstan	4056f97187	Gather (#1329 )	2023-07-25 15:05:41 -04:00
waifairer	d89fb729e5	flake8 (#1323 ) * flake8: Ignore frequent violations, correct infrequent ones * Ignore some rules in test * Reorder test ignores * Lint test + main * EOF indent * Include all E71,E72 errors * Test the failing case in CI * Revert "Test the failing case in CI" This reverts commit `110add0a70`. * Push to test! This reverts commit `f317532779`. * ok back to passing This reverts commit `ba5052685f`. * Prove that CI fails when formatting is incorrect. * Fix formatting * Remove duplicitous E117 rule * Use flake8 config for precommit --------- Co-authored-by: waifairer <waifairer@gmail.com>	2023-07-24 11:19:58 -04:00
George Hotz	086382b64e	Revert "Fix max nan (#1298 )" (#1334 ) This reverts commit `50774470b2`.	2023-07-23 20:41:28 -07:00
uncommonSensor	50774470b2	Fix max nan (#1298 ) * Fix max nan * Adds nan check option to max function * Calls to max can pass in "ignore_nan=True" argument * Added max nan CI tests * Fix max nan * Adds nan check option to max function * Calls to max can pass in "ignore_nan=True" argument * Added max nan CI tests * Turned off due to the need for granularity	2023-07-23 19:39:44 -07:00
madt2709	d2c1e8409a	Update arange to be (start, stop, step) (#1308 )	2023-07-21 00:27:23 -04:00
George Hotz	ca77d6cd72	bfloat16 in LLVM (enough for llama 2) (#1293 ) * add bf16 support to LLVM * bf16 read works	2023-07-19 20:18:32 -07:00
Umut Zengin	74e63fe4ee	Added test_chunk and fixed (#1283 )	2023-07-19 22:21:26 -04:00
chenyu	940b6fd21a	Revert "Fix constant folding for Tensor([3]) (#1227 )" (#1274 ) This reverts commit `ab645317c9`.	2023-07-19 10:51:06 -07:00
Umut Zengin	fde9f0e60d	Slice migrated in Eye op (#1281 ) * Migrated from slice to pad and shrink, made cleaner * Changed repeat with reshape and expand	2023-07-19 09:08:38 -07:00
Stan	ed472bffea	Fix: negative axis in `tensor.cumsum` (#1261 )	2023-07-17 16:16:38 -07:00
Adrian Kretz	5a8ad57163	Add WHERE ternary (or trinary?) op (#1196 ) * Rename FusedOps to TernaryOps * Support ternary broadcast * Add where llop and mlop * Make where op work in cstyle codegen * Don't skip test_inf_where * Add backward path to where op * Use bool in cstyle codegen * Add LLVM where op * Add numpy where op * Add torch where op * Simplify where mlop * Update documentation * Forgot a rename * Merged relevant changes from PR #1195 onto PR #1196 * Add test to cover changes to linearizer.ast_parse for WHERE op Without this METAL will try to use ternary op on float4 and fail * Make where op work in wgsl backend * Allow ternary ops to be merged * Make mypy happy --------- Co-authored-by: Francis Lam <flam@alum.mit.edu>	2023-07-16 00:31:55 -07:00
Stan	264d467f2b	Added `tensor.squeeze` and support for testing exceptions (#1241 ) * WIP: `tensor.squeeze` function * Added `test_except` param to `helper_test_op` to avoid false positives * Extracted new method `helper_test_exception` for testing exceptions * Made `squeeze` not throw IndexError when ndim == 0 and dim <= 0 to match PyTorch	2023-07-15 00:33:24 -07:00
Roelof van Dijk	8f2e2f5ee2	style: else-after-return (#1216 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-07-12 10:26:38 -07:00
chenyu	ab645317c9	Fix constant folding for Tensor([3]) (#1227 ) * Fix constant folding for Tensor([3]) * Remove duplicated prod import * load in the same device * better numpy * add constant fold shape test cases * improve tests	2023-07-11 14:01:32 -07:00
madt2709	bb316a42af	Fix pow to work with negative tensors (#1191 )	2023-07-09 17:33:04 -07:00
George Hotz	43385c7dbf	remove contiguous on full (#1212 )	2023-07-09 17:31:15 -07:00
Carson Radtke	13a1abf9e7	remove tuple from type annotation in Tensor.__init__ (#1211 )	2023-07-09 16:27:07 -07:00
fluffy χατγιρλ	ef1909500e	remove superfluous parentheses (#1197 )	2023-07-08 15:11:02 -07:00
fluffy χατγιρλ	628ee46627	Fix bug where Tensor.randn returns inf (#1192 ) * fix randn inf bug * add test * more compact test * clarify test purpose	2023-07-08 12:03:46 -07:00
Yahya Lmallas	fd66d1ca00	fix Tensor.manual_seed() default to wrong type (#1168 ) * fix Tensor.manual_seed() default to wrong type None while it should be int * remove that tests	2023-07-07 10:42:48 -07:00
cheeetoo	f109af3cbb	Don't save parents unless needed (#1142 ) * don't save parents unless requires grad * keep del ctx since idk	2023-07-05 18:11:57 -07:00
Eli Frigo	801564f31b	Remove POW llop and add SQRT llop (#1104 ) * fixed division by zero for fast operations * made et closer to 0 * replace POW llop with SQRT * updated mlops to swap SQRT and POW llops * updated hlops to swap POW and SQRT * added sqrt llop to cpu runtime * added sqrt llop to cstyle codegen * added POW llop to llvm ir codegen * added SQRT llop to torch runtime * moved pow from mlops to hlops * found a better way to do reverse pow * fixed indentation * added SQRT llop to triton * update docs to match new llops * removed POW operator from assembly codegen * added sqrt and rsqrt to pow hlop * rewrote pow function in tensor.py * Adjust tolerance * Adjust for adamw * Reduce for Adam too * removed accidental leftover code * removed all of accidental code * added rsqrt test * removed pow from mlops again it was added back when resolving merge conflicts --------- Co-authored-by: Jacky Lee <jla524@sfu.ca>	2023-07-05 18:07:58 -07:00
Kunwar Raj Singh	9e6067378f	Broken Sigmoid backward: Add test and mlop for Sigmoid (#1113 ) * Add failing sigmoid test * update more tests * add mlop for sigmoid * add back test * math.log(math.e) = 1 * remove divides --------- Co-authored-by: Kunwar Raj Singh <kunwar31@pop-os.localdomain>	2023-07-04 00:14:22 -07:00
Anselm Coogan	a22aad7d32	Use generators instead of lists in `any`s and `all`s (#1111 ) * Use generators in any(..) instead of lists for better best-case * Use generators in all(...) instead of lists * enable R1729 in .pylintrc * revert import sorting --------- Co-authored-by: Anselm Coogan <anselm@scandit.com>	2023-07-03 16:06:06 -07:00
Taras Tsugrii	cbb5c655e5	[tensor][perf] Replace list comprehension with *. (#1102 ) It's more concise, idiomatic and faster: ``` In [8]: %timeit [1 for _ in range(100)] 2.12 µs ± 26.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each) In [9]: %timeit [1] * 100 515 ns ± 5.23 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each) ```	2023-07-02 18:34:23 -07:00
geohotstan	575f75f613	hello (#1084 )	2023-07-01 01:29:35 -07:00
Jacky Lee	754e54ebb9	Fix Tensor ceil and floor for whole numbers (#1071 ) * Works on non-special numbers * Test different cases	2023-06-27 23:22:17 -07:00
George Hotz	3e33befc1d	realize hotspots (#1059 ) * realize hotspots * no str check * minor changes * make this an assert * faster and more readable * nicer self.buffers * tests for weak op + LAZYCACHE=0	2023-06-26 18:31:18 -07:00
Kunwar Raj Singh	5d3310ce56	MaskRCNN Inference (#884 ) * MaskRCNN weights loading * backbone maybe works * backbone works, but resnet body atol 1e-3 * RPN Call, but veryy wrong output * fixed topk * RPN maybe works, not sure about nms * Fix cursed modules * add back editorconfig * Full call, wrong output * Full call works * fix mask * use NMS from retinanet * Removing extra funcs * refactor * readable * Add example to run model * remove filter * Fix split, batched inference is worse * Fix image sizes * Matching reference * merge master * add filter on top detections * cuda backend fixed * add model eval and spec * convert images to rgb * fix eval * simplify examples code * remove extra code * meshgrid using tinygrad * removing numpy * roi align, floor, ceil * remove numpy from level_mapper * remove numpy from pooler * Revert "Merge branch 'master' of github.com:kunwar31/tinygrad into mrcnn-inference" This reverts commit `4b95a3cb49`, reversing changes made to `98f2b1fa2e`. * roi align gather * fix master merge * revert to old floor, ceil as ints present in domain * use log2 op * fix indexes * weird bug with ints and gpu * weird bug with ints and gpu * refactors, add env var for gather * floor with contiguous, where * refactor topk, sort * remove staticmethod * refactor stride * remove log2 mlop * realize -> contiguous * refactor forward * remove num_classes, stride_in_1x1 from state * refactor forward * refactoring * flake8 * removing numpy in anchor gen, use numpy for gather, nonzero, optimize topk * keep using tinygrad for smaller gathers * fix empty tensors * comms * move from tensor.py * resnet test passing * add coco dataset back * fix spaces * add test for log2 * no need to create Tensors * no need to create Tensors --------- Co-authored-by: Kunwar Raj Singh <kunwar31@pop-os.localdomain>	2023-06-25 15:37:51 -07:00
George Hotz	c8fbdeb48e	test speed llama (#1046 ) * test speed llama * oops, put it back * uses the real device codegen * just do it on the mac * pp * is faster? * Revert "is faster?" This reverts commit `42db542010`. * disable docker again for less load on CI	2023-06-25 15:22:56 -07:00
Francesco Castelli	6ff720103e	Reduce tensor dot line count and fixed 1d tensor dot (#1045 ) * fixed tensor.dot * no 1d dot for image=1 * shorter lines * add 3d dot tests	2023-06-25 10:32:45 -07:00
Diogo	d2b837c1d9	Adds floor/ceil (#989 ) * floor ceil impl * control casting in numpy	2023-06-17 10:56:21 -07:00
Rayan Hatout	2d567ef688	Optimizations in tensor.py (#974 ) * optimizations in tensor.py * make mypy happy * revert split of Function class	2023-06-14 08:44:35 -07:00
George Hotz	ba4eadb04c	PTX assembly support (#977 ) * ptx assembly * all ops tests pass * fix tests	2023-06-13 12:31:42 -07:00

427 Commits