Commit Graph

10417 Commits

Author SHA1 Message Date
ttomsa
170ece6605 fix advanced setitem overlap with 0 (#7793)
* fix advanced setitem overlap with 0

* fix comment
2024-11-19 16:03:55 -05:00
Gaétan Lepage
159c0bf25e test_kernel_cache_in_action: fix test (#7792) 2024-11-19 13:34:56 -05:00
George Hotz
913a27ee27 from_buffer on metal was never called [pr] (#7791) 2024-11-20 00:35:17 +08:00
Eitan Turok
56017c52a0 Raise error when model architecture does not match state dict (#7772)
* init

* style

* style

* style

* fix test
2024-11-20 00:11:54 +08:00
George Hotz
d71fe7faa5 rename allocator methods to not conflict [pr] (#7788)
* rename allocator methods to not conflict [pr]

* forgot those

* transfer + offset
2024-11-20 00:10:29 +08:00
chenyu
d5f76462c8 fix CI beautiful_mnist dir (#7790)
fixed `fatal: not a git repository (or any of the parent directories): .git` because $HOME is not $GITHUB_WORKSPACE
2024-11-19 09:59:02 -05:00
geohotstan
aeaf574a05 add failure test for setitem bug (#7786)
* add failure test

* rename

* improve tests

* improve tests, no numpy needed
2024-11-19 08:54:21 -05:00
qazal
1e31b5ba6b hotfix: ctx doesn't impact process replay [pr] (#7785) 2024-11-19 20:17:01 +08:00
qazal
8360bbd88d faster assign view check [pr] (#7781) 2024-11-19 19:42:51 +08:00
George Hotz
3daa376107 remove numpy from assign [pr] (#7784)
* remove numpy from assign [pr]

* cast not required
2024-11-19 19:34:53 +08:00
George Hotz
fbb4099b3c add test for compile3 [pr] (#7783)
Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-19 19:26:51 +08:00
qazal
4f6071d919 capture the schedule context in process replay [pr] (#7782) 2024-11-19 19:12:00 +08:00
qazal
f493d480e3 metadata appending to graph_rewrite (#7780) 2024-11-19 18:05:42 +08:00
chenyu
73ea913050 really not using numpy in gpt2 example (#7779) 2024-11-18 23:21:16 -05:00
chenyu
e6debda5c4 remove numpy from gpt2 and llama examples (#7778) 2024-11-18 22:48:17 -05:00
George Hotz
005636304b have VIZ=1 use HTTP/1.1 for keep-alive [pr] (#7776) 2024-11-19 09:38:12 +08:00
George Hotz
65f188aafb bump version to 0.10.0 v0.10.0 2024-11-19 08:27:28 +08:00
chenyu
26200574dc load_state_dict test cases when model and data shard differently (#7774)
current behavior is inconsistent: when the model is sharded and the state_dict is not, load shards the state_dict and the model shard axis does not change.
but if the model and state_dict are sharded differently, the model shard axis becomes the state_dict axis after load.

it should either always use the model shard axis or always use the state_dict shard axis
2024-11-18 16:08:24 -05:00
Francis Lata
a1c1b9547f Context manager support for tqdm (#7770)
* add context manager support

* add test case for context manager usage
2024-11-18 14:12:03 -05:00
geohotstan
8100109c9d Add replicate mode to Tensor.pad (#7608)
* base implementation

* add tests

* actually remove the assertionerror test

* actually only have reflect for this pr

* change the 4 if-else one liner

* maybe use a lambda

* fix

* maybe a lil cleaner

* fix tests

* complete

* small change

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-18 10:55:38 -05:00
qazal
62db6398a5 delete buffer tracking from ScheduleContext [pr] (#7766) 2024-11-18 22:47:32 +08:00
Shuni
ed76d3ceac Fix AMD queue CWSR memory size (#7765)
* Fix AMD queue CWSR memory size

* fix linter error

* add debug_memory_size field

* align CWSR save area allocation to page size
2024-11-18 17:22:03 +03:00
ignaciosica
f02462c5cb swizzle tc [pr] (#7633)
* swizzle tc draft

* further cleanup

* hotfix: remove typing from fix_st and cleanup

* hotfix: revert cache property (moved into separate pr)

* hotfix

* hotfix: rename

* take patterns from schedule

* hotfix: rename vars

* hotfix

* no more view of store

* hotfix: linter

* as view is only used for tc fix up and tc is only enabled for LOAD, remove valid and preload from pm rule

- also remove inner simplify in fix_st

* add typing to fix_st

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-18 21:08:21 +08:00
qazal
6ea4a173e7 make is_realized a property [pr] (#7763)
* make is_realized a property [pr]

* fix assign

* multi
2024-11-18 19:15:37 +08:00
chenyu
5de0ea40f3 reorder Tensor.__init__ to match type (#7758)
and reordered check lazy devices part
2024-11-17 21:32:48 -05:00
chenyu
66d7d5af50 fix Tensor(MultiLazyBuffer) with different dtype should fail (#7757)
similar to Tensor(LazyBuffer) as we don't cast implicitly
2024-11-17 21:05:45 -05:00
chenyu
b1d734a02c remove the -1 then -(-1) in Tensor.argmax (#7753) 2024-11-17 16:54:09 -05:00
chenyu
e3081355fe minor Tensor.einsum cleanup (#7752)
removed some dead conditions and added types. still reads more complicated than needed
2024-11-17 16:11:30 -05:00
chenyu
8b08a72657 cosmetic change to Tensor._pool (#7751)
aligned the shrink lines
2024-11-17 15:38:11 -05:00
chenyu
df817297b6 fix passing acc_dtype="" to Tensor.prod should fail (#7750)
similar to sum
2024-11-17 11:38:13 -05:00
chenyu
55707fd00d fix passing sum_acc_dtype="" to Tensor.sum should fail (#7748) 2024-11-17 10:58:41 -05:00
chenyu
f18296e23c simpler Tensor._reduce (#7747) 2024-11-17 09:20:00 -05:00
qazal
0cc8de2f15 reverse map buf_uops [pr] (#7743) 2024-11-17 21:29:56 +08:00
chenyu
0292ae7508 Tensor.meshgrid cleanup (#7741) 2024-11-17 08:26:53 -05:00
qazal
40642cb9ea to_uop split paths part 2 [pr] (#7746) 2024-11-17 21:07:28 +08:00
qazal
99024b922b to_uop one path for all ops part 1 (#7745)
* flat meta ops

* one path for everything

* add tests

* view is always base

* just run
2024-11-17 20:12:44 +08:00
qazal
eeb222f98b add UOp.new_buffer [pr] (#7742) 2024-11-17 16:44:52 +08:00
chenyu
a15a900415 fix Tensor.meshgrid for 1D input and check indexing (#7740) 2024-11-16 23:39:30 -05:00
geohotstan
72a41095bc add Tensor.meshgrid (#7714)
* initial implementation and test

* some other places that can use meshgrid

* revert the onnx_ops change

* add to docs

* revert interpolate too

* update

* improve edge case test

* might as well test grad

* add to test can improve docs

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-11-16 23:06:47 -05:00
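The Tensor.meshgrid commit above adds numpy-style meshgrid semantics (including an `indexing` argument, per the follow-up fix for 1D inputs). A minimal sketch of those semantics using numpy as a stand-in — this is an illustration of the behavior, not tinygrad's implementation:

```python
import numpy as np

# numpy stand-in for Tensor.meshgrid semantics
x = np.array([1, 2, 3])
y = np.array([4, 5])
X, Y = np.meshgrid(x, y, indexing="ij")  # "ij" gives shape (len(x), len(y))
assert X.shape == (3, 2) and Y.shape == (3, 2)
# X varies along the first axis, Y along the second
assert X[2, 1] == 3 and Y[2, 1] == 5
```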
mesozoic-egg
1a5e896bd4 [pr] Have PTX share code with LLVM (#7635)
* integrate into ops_cuda

* remove debugging stuff

* lint fix

* mypy fixes

* swap ptx.py

* edit

* simplify wmma

* wip

* space

* refactor

* sync the ops removal changes

* refactor

* rename variables

---------

Co-authored-by: judy <mesozoic.egg@proton.mail>
2024-11-17 10:53:56 +08:00
chenyu
f2f7384b67 _resolve_dim cleanup (#7736)
no duplicated self.ndim+outer
2024-11-16 11:05:39 -05:00
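The _resolve_dim cleanup above concerns normalizing possibly-negative dims. A hypothetical sketch of what such a helper does (names and the `extra` parameter are assumptions for illustration, not tinygrad's actual code):

```python
def resolve_dim(dim: int, ndim: int, extra: int = 0) -> int:
    # map a possibly-negative dim into [0, ndim+extra)
    total = ndim + extra
    if not -max(1, total) <= dim < max(1, total):
        raise IndexError(f"dim {dim} out of range for {total} dimensions")
    return dim + total if dim < 0 else dim

assert resolve_dim(-1, 3) == 2  # last dim of a 3D tensor
assert resolve_dim(0, 3) == 0
```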
chenyu
e777211a00 Tensor.repeat cleanup (#7735)
flatten instead of double for loop comprehension
2024-11-16 10:43:45 -05:00
chenyu
f1efd84c92 fix repeat_interleave with negative dim (#7734) 2024-11-16 10:15:29 -05:00
chenyu
e3105675fb cond.where(True, False) is cond (#7733) 2024-11-16 09:44:17 -05:00
qazal
40ae0e9115 smaller big graph (#7695)
* start

* work

* rewrite to PRELOAD

* st is always from base

* fix aesthetics

* work

* more work

* refactor to is_forced_realize

* uh

* green?

* metaop can be image

* dont count realized

* this is the new src

* test_tiny_add passes

* work
2024-11-16 22:04:57 +08:00
qazal
f3f95ab9d9 flatten fusion upats [pr] (#7732) 2024-11-16 21:26:19 +08:00
qazal
ec8c5598f6 refactor to generic UPat for sourcing unrealized bufs [pr] (#7731)
* base check

* use is_scheduled

* fixup lazy

* update metadata

* match is too slow
2024-11-16 21:01:22 +08:00
ignaciosica
597a239e28 Remove UnaryOps, BinaryOps, TernaryOps, MetaOps [pr] (#7725)
* remove unaryops

* remove ternaryops

* remove metaops

* hotfix

* remove binaryops

* hotfix: test_pattern_matcher

---------

Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
2024-11-16 20:56:56 +08:00
chenyu
22da31b223 clean up Tensor.dot (#7728)
more docs (similar to numpy) and removed many confusing `-min(n2, 2)` expressions
2024-11-15 18:21:15 -05:00
chenyu
4338c450ac fix max_pool2d for int tensor with padding (#7726)
padding with inf messed up the output dtype
2024-11-15 16:22:11 -05:00
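The max_pool2d fix above is about the pad value: padding an int tensor with -inf forces a float dtype, while padding with the dtype's own minimum keeps it intact. A numpy sketch of the dtype issue (an illustration of the problem, not tinygrad's fix):

```python
import numpy as np

x = np.array([[1, 2], [3, 4]], dtype=np.int32)
# padding with -inf requires casting to float, changing the output dtype
bad = np.pad(x.astype(np.float64), 1, constant_values=-np.inf)
assert bad.dtype == np.float64
# padding with the dtype's minimum preserves the integer dtype,
# and is still neutral for a max reduction
good = np.pad(x, 1, constant_values=np.iinfo(x.dtype).min)
assert good.dtype == np.int32
assert good.max() == 4
```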