tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
George Hotz	439911b2e6	disable disable_abstract_method [pr] (#7815 )	2024-11-21 12:28:57 +08:00
George Hotz	c5d458ce02	BufferSpec and ProgramSpec [pr] (#7814 ) * BufferSpec and ProgramSpec [pr] * delete preallocate, it's unused * Revert "delete preallocate, it's unused" This reverts commit `dcfcfaccde`.	2024-11-21 12:18:05 +08:00
George Hotz	490a6130af	more hcq typing [pr] (#7813 ) * more hcq typing [pr] * minor * less generic	2024-11-21 11:23:07 +08:00
George Hotz	9df5a62c5e	unify to HWQueue [pr] (#7812 ) * unify to HWCommandQueue [pr] * all is HWQueue	2024-11-21 10:33:08 +08:00
chenyu	11cea00090	lower vs_theoretical conv tflops threshold for nv (#7811 ) less flaky	2024-11-20 20:03:49 -05:00
chenyu	46aa23539f	generate and print mypy lineprecision report (#7809 )	2024-11-20 16:53:17 -05:00
chenyu	c815d7b56e	run bfloat16 tensor core in metal benchmark (#7808 ) * run bfloat16 tensor core in metal benchmark * separate task	2024-11-20 15:34:07 -05:00
chenyu	33a496279b	load_state_dict check v.shape instead of v.lazydata.shape (#7807 )	2024-11-20 14:39:30 -05:00
ignaciosica	fc3154a7b3	metal bf16 tc support [pr] (#7408 ) * add bf16 tc for metal * hotfix: spacing * fix tolerance and skip metal bf16 in ci * hotfix: check for dtype_out * hotfix: add check for tc.dtype_out is bf16 back * hotfix: add parens	2024-11-20 14:39:08 -05:00
geohotstan	66a069ee25	add replicate mode to Tensor.pad (#7802 ) * base implementation * add tests * actually remove the assertionerror test * good	2024-11-20 08:39:58 -05:00
George Hotz	eb0bb7dc0b	final dname to device [pr] (#7806 ) * final dname to device [pr] * oops, fix nv	2024-11-20 20:20:28 +08:00
George Hotz	bc977fec53	dname -> device [pr] (#7804 ) * dname -> device [pr] * a few more * only one left	2024-11-20 17:57:14 +08:00
George Hotz	0a74acd90e	add proper typing to HCQ [pr] (#7803 ) * add proper typing to HCQ [pr] * more types * and qcom * HCQProgram has device type * typed allocator	2024-11-20 17:20:39 +08:00
George Hotz	6688539bc9	rename device to dev so Buffer can be Allocator [pr] (#7799 ) * rename device to dev to Buffer can be Allocator [pr] * missed those * update the Program classes also * more renames * oops	2024-11-20 15:47:26 +08:00
ttomsa	9adeb1041c	fix advanced setitem with 1 in shape (#7797 ) * fix advanced setitem with 1 in shape * linter	2024-11-19 20:04:59 -05:00
chenyu	d800a79112	use "signed char" for int8 (#7796 ) * use "signed char" for int8 "char" might be unsigned depending on platform. fixed `python -m pytest test/test_ops.py::TestOpsUint8::test_interpolate_bilinear` on arm64 linux * opencl does not have "signed char"	2024-11-19 19:29:54 -05:00
chenyu	f16122f9c4	update README to make it runs with just tinygrad (#7795 )	2024-11-19 17:25:12 -05:00
ttomsa	170ece6605	fix advanced setitem overlap with 0 (#7793 ) * fix advanced setitem overlap with 0 * fix comment	2024-11-19 16:03:55 -05:00
Gaétan Lepage	159c0bf25e	test_kernel_cache_in_action: fix test (#7792 )	2024-11-19 13:34:56 -05:00
George Hotz	913a27ee27	from_buffer on metal was never called [pr] (#7791 )	2024-11-20 00:35:17 +08:00
Eitan Turok	56017c52a0	Raise error when model architecture does not match state dict (#7772 ) * init * style * style * style * fix test	2024-11-20 00:11:54 +08:00
George Hotz	d71fe7faa5	rename allocator methods to not conflict [pr] (#7788 ) * rename allocator methods to not conflict [pr] * forgot those * transfer + offset	2024-11-20 00:10:29 +08:00
chenyu	d5f76462c8	fix CI beautiful_mnist dir (#7790 ) fixed `fatal: not a git repository (or any of the parent directories): .git` because $HOME is not $GITHUB_WORKSPACE	2024-11-19 09:59:02 -05:00
geohotstan	aeaf574a05	add failure test for setitem bug (#7786 ) * add failure test * rename * improve tests * improve tests and no need numpy	2024-11-19 08:54:21 -05:00
qazal	1e31b5ba6b	hotfix: ctx doesn't impact process replay [pr] (#7785 )	2024-11-19 20:17:01 +08:00
qazal	8360bbd88d	faster assign view check [pr] (#7781 )	2024-11-19 19:42:51 +08:00
George Hotz	3daa376107	remove numpy from assign [pr] (#7784 ) * remove numpy from assign [pr] * cast not required	2024-11-19 19:34:53 +08:00
George Hotz	fbb4099b3c	add test for compile3 [pr] (#7783 ) Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2024-11-19 19:26:51 +08:00
qazal	4f6071d919	capture the schedule context in process replay [pr] (#7782 )	2024-11-19 19:12:00 +08:00
qazal	f493d480e3	metadata appending to graph_rewrite (#7780 )	2024-11-19 18:05:42 +08:00
chenyu	73ea913050	really not using numpy in gpt2 example (#7779 )	2024-11-18 23:21:16 -05:00
chenyu	e6debda5c4	remove numpy from gpt2 and llama examples (#7778 )	2024-11-18 22:48:17 -05:00
George Hotz	005636304b	have VIZ=1 use HTTP/1.1 for keep-alive [pr] (#7776 )	2024-11-19 09:38:12 +08:00
George Hotz	65f188aafb	bump version to 0.10.0 v0.10.0	2024-11-19 08:27:28 +08:00
chenyu	26200574dc	load_state_dict test cases when model and data shard differently (#7774 ) current behavior is weird... when model is sharded and state_dict is not, load shards the state_dict and model shard axis does not change. but if model and state_dict are sharded differently, model shard axis becomes the state_dict axis after load. it should either always use model shard axis or always use state_dict shard	2024-11-18 16:08:24 -05:00
Francis Lata	a1c1b9547f	Context manager support for tqdm (#7770 ) * add context manager support * add test case for context manager usage	2024-11-18 14:12:03 -05:00
geohotstan	8100109c9d	Add replicate mode to Tensor.pad (#7608 ) * base implementation * add tests * actually remove the assertionerror test * actually only have reflect for this pr * change the 4 if-else one liner * maybe use a lambda * fix * maybe a lil cleaner * fix tests * complete * small change --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-11-18 10:55:38 -05:00
qazal	62db6398a5	delete buffer tracking from ScheduleContext [pr] (#7766 )	2024-11-18 22:47:32 +08:00
Shuni	ed76d3ceac	Fix AMD queue CWSR memory size (#7765 ) * Fix AMD queue CWSR memory size * fix linter error * add debug_memory_size field * align CWSR save area allocation to page size	2024-11-18 17:22:03 +03:00
ignaciosica	f02462c5cb	swizzle tc [pr] (#7633 ) * swizzle tc draft * further cleanup * hotfix: remove typing from fix_st and cleanup * hotfix: revert cache property (moved into separate pr) * hotfix * hotfix: rename * take patterns from schedule * hotfix: rename vars * hotfix * no more view of store * hotfix: linter * as view is only used for tc fix up and tc is only enabled for LOAD, remove valid and preload from pm rule - also remove inner simplify in fix_st * add typing to fix_st --------- Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>	2024-11-18 21:08:21 +08:00
qazal	6ea4a173e7	make is_realized a property [pr] (#7763 ) * make is_realized a property [pr] * fix assign * multi	2024-11-18 19:15:37 +08:00
chenyu	5de0ea40f3	reorder `Tensor.__init__` to match type (#7758 ) and reordered check lazy devices part	2024-11-17 21:32:48 -05:00
chenyu	66d7d5af50	fix Tensor(MultiLazyBuffer) with different dtype should fail (#7757 ) similar to Tensor(LazyBuffer) as we don't cast implicitly	2024-11-17 21:05:45 -05:00
chenyu	b1d734a02c	remove the -1 then -(-1) in Tensor.argmax (#7753 )	2024-11-17 16:54:09 -05:00
chenyu	e3081355fe	minor Tensor.einsum cleanup (#7752 ) removed some dead conditions and add types. still reads more complicated than needed	2024-11-17 16:11:30 -05:00
chenyu	8b08a72657	cosmetic change to Tensor._pool (#7751 ) aligned the shrink lines	2024-11-17 15:38:11 -05:00
chenyu	df817297b6	fix passing acc_dtype="" to Tensor.prod should fail (#7750 ) similar to sum	2024-11-17 11:38:13 -05:00
chenyu	55707fd00d	fix passing sum_acc_dtype="" to Tensor.sum should fail (#7748 )	2024-11-17 10:58:41 -05:00
chenyu	f18296e23c	simpler Tensor._reduce (#7747 )	2024-11-17 09:20:00 -05:00
qazal	0cc8de2f15	reverse map buf_uops [pr] (#7743 )	2024-11-17 21:29:56 +08:00

6884 Commits