tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-26 07:18:40 -05:00

Author	SHA1	Message	Date
Max Hahn	f9cb31fdc2	added visitor pattern (#1669 ) * added visitor pattern * pylint bug workaround * added tests, made abstract OpNode inherit from ABC * fixed assert * fix check of abstract classes in negative test * remove assert False	2023-08-30 09:03:44 -07:00
George Hotz	fdd7f282cb	Reenable tensor cores for self-hosted Mac CI (#1717 ) * debug 5 matmul * allow tensor cores in CI * tensor cores on arm64 * put debug back	2023-08-30 07:53:04 -07:00
chenyu	ac183568be	llama JIT python runtime speedup (#1633 ) * no JIT call in TransformerBlock * idea * move 2 reshapes to jitted function shrink inside jitted too, 6.3ms remove back reshapes, 5.5ms isinstance -> __class__ 4.99ms * think revert ops_gpu.py revert symbolic.py too PYOPENCL_COMPILER_OUTPUT=1 * cleanup * fix cache shape for conversational model only reshape if start_pos > 0 * small cleanup * include var_vals.keys() to st.key * add comments * llama small update * everything jitted again, similar structure to gpt2 * fix typing * add TODO for in place update cache	2023-08-30 07:51:05 -07:00
Umut Zengin	1682e9a38a	Fix: Stable Diffusion index (#1713 )	2023-08-30 00:21:10 -04:00
wozeparrot	2f768e386d	stable diffusion benchmark artifact (#1714 )	2023-08-29 21:08:40 -04:00
George Hotz	0ea22bf249	remove DEBUG=1 from stable diffusion AMD since jit cache is fixed	2023-08-29 12:46:12 -07:00
George Hotz	ab9b9ff3e2	pipefail benchmark (#1709 ) (#1710 ) * feat: specify shell * feat: specify shell for mac Co-authored-by: wozeparrot <wozeparrot@gmail.com>	2023-08-29 08:15:02 -07:00
George Hotz	aa7c98722b	sd timing (#1706 )	2023-08-28 20:22:57 -07:00
nimlgen	8844a0a822	llvm jitted (#1652 )	2023-08-28 20:22:44 -07:00
nimlgen	1c0449e190	add cache collector (#1595 ) * init cache collector * add test_cache_collector.py * switch GlobalCounters.cache to CacheCollector * init jit models test * jitted SD * add debug msg to print loaded bufs count * moved cache collector to jit * clearer SD * no double device import	2023-08-28 19:59:55 -07:00
George Hotz	f5f8b09c13	allow manual release (#1704 )	2023-08-28 17:54:25 -07:00
George Hotz	715047a1e4	fix release publish (#1703 )	2023-08-28 17:48:00 -07:00
Olivier Chafik	ee6d8de2dc	Llama: load models in HuggingFace format (incl. indexed, safetensors) (#1583 )	2023-08-28 15:11:40 -04:00
qazal	3515ba4f23	add dtypes test (#1682 )	2023-08-28 08:12:15 -07:00
Roelof van Dijk	50f669e43b	[ready] perf: simpler Tensor init (#1679 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 22:18:03 -04:00
Roelof van Dijk	b66f54e379	perf: avoid reshaping if not necessary (#1683 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 14:17:04 -04:00
Roelof van Dijk	328cf2e86a	perf: remove cast and revert back to isinstance (#1694 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 14:15:52 -04:00
wozeparrot	8b354b3f73	feat: version bump! (#1687 ) v0.7.0	2023-08-27 12:38:58 -04:00
Roelof van Dijk	abaa605f71	[ready] perf: start enumerate at 1 instead of checking all i (#1691 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 12:00:32 -04:00
Roelof van Dijk	2730ed657f	perf: faster lazyop eq (#1693 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 11:17:02 -04:00
Roelof van Dijk	6ca509a485	perf: constant in while in for in busy func (#1688 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 11:13:16 -04:00
Roelof van Dijk	b89d81330f	fix: restore old behaviour (#1689 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-27 10:45:53 -04:00
chenyu	66fbf4800b	fix symbolic_ops tests with Tensor.training=True (#1686 )	2023-08-26 23:19:56 -04:00
Roelof van Dijk	6c5dc9c153	[ready] perf: faster lazyop init (#1673 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-26 22:59:10 -04:00
wozeparrot	f61d0657d1	document new envvars (#1676 ) * feat: document some new envvars * feat: actually put values * feat: no more cifar torch * feat: no fakedata	2023-08-26 20:17:02 -04:00
Yixiang Gao	9d93a82354	remove FAKEDATA (#1685 )	2023-08-26 20:15:54 -04:00
chenyu	b5d700adae	update openpilot supercombo.onnx to 0.9.4 (#1681 ) * update openpilot supercombo.onnx to 0.9.4 * update tests for the new model * comment out comma models from external_model_benchmark	2023-08-26 19:16:08 -04:00
Roelof van Dijk	89b529c07f	[ready] ci: add py38 to linters (#1674 ) * ci: add py38 to linters * fix: run linters only on py38 --------- Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-26 09:34:15 -04:00
Jordan Wright	25be7f745d	Tensor.uniform with dtype=int bug fix (#1593 )	2023-08-26 01:59:53 -04:00
Roelof van Dijk	f702a8f497	[ready] avoid in-function graph imports in lazy.py (#1666 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-25 13:56:28 -04:00
Roelof van Dijk	02e64da678	refactor: tuples can be concatenated with + (#1671 ) Co-authored-by: Roelof van Dijk <roelof.van.dijk@vitestro.com>	2023-08-25 12:37:13 -04:00
Yixiang Gao	173850f599	fix CIFAR jit (#1657 ) * update mask function * kept 94 with the new fetcher clean up batch fetcher * 94.04% without cutmix * 94.04% with cutmix * move batch fetcher to avoid fetching additional batch last STEP	2023-08-24 16:14:40 -07:00
chenyu	f00325e77d	ops_metal newCommandQueueWithMaxCommandBufferCount_(1024) (#1664 )	2023-08-24 15:42:00 -07:00
DavidFarago	1ba8f0dca3	Quickstart: Upgrade section "Training" to new code (#1663 ) Co-authored-by: Dave Farago <dfarago@innoopract.com>	2023-08-24 17:12:16 -04:00
DavidFarago	29adae84eb	Quickstart: Use tensors to compute train accuracy (#1662 ) Co-authored-by: Dave Farago <dfarago@innoopract.com>	2023-08-24 17:09:12 -04:00
George Hotz	d37d092c14	split linearizer into 3 files (#1654 )	2023-08-23 14:58:47 -07:00
George Hotz	1b8c40234f	Uast start (#1650 ) * work * more tests * more tests 2 * don't break it	2023-08-23 12:00:06 -07:00
geohotstan	484708da87	#1615 fix (#1616 )	2023-08-23 14:51:05 -04:00
Pavol Rusnak	b57c374164	add accelerator links to readme (#1649 )	2023-08-23 14:47:55 -04:00
George Hotz	82623697a8	Move asm renderer (#1648 ) * teeny changes * teeny updates * move to renderer	2023-08-23 10:06:43 -07:00
George Hotz	a89363574d	teeny changes (#1647 ) * teeny changes * teeny updates	2023-08-23 09:53:39 -07:00
George Hotz	a6d842af7a	move device to ops (#1646 ) * move device to ops * mlops types * 2 lines	2023-08-23 08:30:17 -07:00
nimlgen	a65ae1198b	do replace div->mul for non-floats (#1644 )	2023-08-23 07:34:31 -07:00
George Hotz	da694d4241	move that image import	2023-08-22 21:30:55 -07:00
George Hotz	41e83be3dd	simple where broadcast (#1643 )	2023-08-22 21:24:49 -07:00
George Hotz	c831218139	Optional: Reduce line count and simplify the LazyBuffer interface (#1642 ) * less lines in lazybuffer, def e * custom function * cast * reorder functions * lb type	2023-08-22 21:01:10 -07:00
George Hotz	d25046e66a	matvec tests (#1634 ) * matvec tests * f16 * f16 is broken	2023-08-22 17:33:58 -07:00
George Hotz	643cbdfd50	make embedding and GPT-2 fast (#1631 ) * make embedding fast * jit more, variable shape support * print mem bw	2023-08-22 15:14:38 -07:00
Niklas D	a7752ad65d	Fix link to state.py in quickstart (#1632 )	2023-08-22 17:39:30 -04:00
c143	c9c40bb16f	Import whole math module in tensor.py (#1628 )	2023-08-22 17:07:46 -04:00

2388 Commits