Ahmed Harmouche
ed7318a3f5
Fix puppeteer install (#8148)
Clean npm cache before puppeteer install
2024-12-10 23:06:33 +01:00
George Hotz
a1b3724ff8
prepickle process replay [pr] (#8147)
2024-12-10 11:46:36 -08:00
George Hotz
aa3b094334
changes from delete lazy [pr] (#8146)
* changes from delete lazy [pr]
* test tweak
2024-12-10 11:06:17 -08:00
chenyu
286fec115e
fix Tensor.minimum for int (#8145)
Use invert instead of just neg (see the sketch after this entry); consolidate min, argmin, and minimum. Also update maximum to not apply the midpoint for int.
2024-12-10 13:34:41 -05:00
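A minimal sketch in plain Python of the order-reversing identity this fix appears to rely on (not the actual tinygrad code):

```python
# minimum via maximum through invert: ~x == -x - 1 reverses order on
# two's-complement ints and, unlike negation, has no overflow case
# (-INT_MIN does not fit in int32, but ~INT_MIN == INT_MAX does).
def int_minimum(a: int, b: int) -> int:
    return ~max(~a, ~b)

assert int_minimum(3, -7) == -7
assert int_minimum(-2**31, 0) == -2**31  # the case plain negation would mishandle
```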
Ahmed Harmouche
71dd222f66
Fix setitem on wgpu (#8144)
2024-12-10 19:34:25 +01:00
qazal
b69fea6ae5
process replay without global list [pr] (#8143)
2024-12-11 02:20:09 +08:00
qazal
08405279f9
pre merge_views+ops_folding refactor [pr] (#8140)
* simple start
* valid early
* more dumb things removed
* don't ever use base
* cleaner
2024-12-11 00:55:00 +08:00
qazal
56c84cee29
derive COPY nbytes late in realize [pr] (#8137)
* derive COPY arg later in realize [pr]
* can assume no implicit casts or movement ops here
2024-12-10 22:04:07 +08:00
qazal
2d26b011ac
allow VIEW on BUFFER [pr] (#8136)
* allow VIEW of BUFFER [pr]
* base it later
* better diff
* base shouldn't exist anywhere after merge_views
2024-12-10 21:29:38 +08:00
qazal
3a2658efbd
small changes to refine the delete_lazy diff (#8134)
* _view -> view
* const_arg things
2024-12-10 18:46:10 +08:00
qazal
6d33da09c9
split scalar getitem tests into correctness and optimization [pr] (#8133)
2024-12-10 18:18:46 +08:00
qazal
7436ebef2f
spend lines on const_arg for tensor and scheduler [pr] (#8132)
* spend lines on const_arg for tensor and scheduler [pr]
* simple test_const_arg
* base on lazy
2024-12-10 18:07:35 +08:00
chenyu
917deb88a4
make //0 return 0 in python_alu (#8131)
On master it raises because it cannot truncate inf to int, which crashes a valid expression like `(t > 0).where(1//t, t)` (see the sketch after this entry).
2024-12-09 19:32:06 -05:00
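A sketch of the failure mode (a minimal example, not from the repo; values chosen to hit all lanes):

```python
from tinygrad import Tensor

t = Tensor([-1, 0, 2])
# where() evaluates both branches on every lane, so 1//t is computed even
# where t == 0; with //0 defined as 0 in python_alu this no longer raises
out = (t > 0).where(1 // t, t)
print(out.numpy())  # expected [-1, 0, 0]: lanes with t <= 0 take t's own value
```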
George Hotz
f83d715f41
move checks into compile3, delete compile2 [pr] (#8127)
* move checks into compile3 [pr]
* test_vs_onnx
* test v torch works
* float16 won't compile on compile3
* actually delete compile2
2024-12-09 14:21:42 -08:00
chenyu
358287959b
fix pow of int to negative const int (#8129)
It should return an int (see the sketch after this entry).
2024-12-09 17:20:18 -05:00
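A hedged sketch of the expected behavior (assuming tinygrad's usual `Tensor`/`dtypes` API; the exact value semantics are my reading of the fix):

```python
from tinygrad import Tensor, dtypes

out = Tensor([2], dtype=dtypes.int32) ** -2
assert out.dtype == dtypes.int32  # stays int, no silent promotion to float
print(out.numpy())                # presumably [0]: 2**-2 truncates under int semantics
```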
chenyu
12f7d284e0
failed test case for int pow (#8128)
Also updated test_ops so that non-float dtypes compare with `assert_equal` (sketch after this entry). Removed `test_multinomial`, which is tested better in test_randomness.
2024-12-09 16:15:09 -05:00
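The comparison rationale, sketched with plain numpy (not the actual test_ops helper): float results need a tolerance, while int and bool results should match exactly.

```python
import numpy as np

# floats: rounding differs across backends, so compare with a tolerance
np.testing.assert_allclose([0.1 + 0.2], [0.3], atol=1e-6)
# ints and bools: results are exact, so any difference is a real bug
np.testing.assert_equal(np.array([1, 2], np.int32), np.array([1, 2], np.int32))
```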
qazal
80de06c8b9
scheduler ops_folding from delete_lazy (#8124)
* scheduler diff from delete_lazy
* test_std_mean
* late fold copy of CONST
* clang const is fine
2024-12-10 00:36:01 +08:00
George Hotz
87c360c4b5
hotfix: add --size 8B to llama3
2024-12-09 07:53:20 -08:00
George Hotz
a773c5a571
hotfix: default llama3 is 1B with download_model
2024-12-09 07:23:35 -08:00
Ahmed Harmouche
c6277fce09
Remove f16 decompression lib from SD compile.py (#8121)
* Remove f16-to-f32-gpu lib, use tinygrad exported decompression
* No need to create new instance
2024-12-09 14:09:00 +01:00
qazal
22d99f1421
test_pickle_realized_tensor actually tests pickle [pr] (#8119)
* test_pickle_realized_tensor actually tests pickle [pr]
* clang
2024-12-09 17:26:19 +08:00
chenyu
ccf54c2375
fix argmax/min on int32 min (#8118)
2024-12-09 02:29:23 -05:00
chenyu
c814de2dd4
fix bitwise_not for signed int (#8117)
Using -1 is correct because 2**32-1 is not within int32 range, so in some cases clang casts the whole expression to uint32 (see the sketch after this entry).
2024-12-09 02:02:51 -05:00
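A sketch of the reasoning with numpy scalars (my reconstruction, not the emitted kernel code):

```python
import numpy as np

x = np.int32(5)
assert ~x == np.int32(-6)  # bitwise_not(x) == -x - 1 for signed ints
# XOR with the literal 2**32 - 1 is wrong for int32: that constant does not
# fit in int32, so a C backend may promote the whole expression to uint32.
# XOR with -1 produces the same bit pattern while staying in int32.
assert (x ^ np.int32(-1)) == ~x
```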
ttomsa
e22d7b6fb0
fix var vmax inside special (#8116)
2024-12-09 01:16:08 -05:00
qazal
0033012096
init noop changes from delete_lazy [pr] (#8115)
2024-12-09 01:42:05 +08:00
qazal
5dd61035f7
revert VALID early folding for now (#8114)
This reverts commit 4074f52317.
2024-12-09 00:34:24 +08:00
qazal
69e48da961
set NOOPT in test_avg_pool3d_failure (#8112)
* set NOOPT=0 in test_avg_pool3d_failure
* noopt should still pass
2024-12-08 10:48:29 -05:00
nimlgen
3a7d64b96c
hcq remove update from args state (#8104)
* hcq remove update from args state
fix amd
ugh
qcom?
qcom ops
ops
qcom fix
qcom texture info
fx
qcom fix
qcom
qcom, sry
minor
works
* remove old code
* unrelated+sint
* qcom
* typing
* rm comments
2024-12-08 15:22:05 +03:00
nimlgen
d6e66095fd
hcq buffer is a class (#8106)
* hcq buffer is a class
* qcom
* no from_mv in qcom
* remove qcombuffer
* useless cast
* mypy
* qcom fix
* _md -> meta
2024-12-08 13:29:43 +03:00
chenyu
b9c977f1c8
clean up bounds in Tensor.shard (#8107)
2024-12-07 17:19:43 -05:00
geohotstan
f8294b3bda
add avg pool 3d failure test (#8105)
* add test
* try simplify test case
* add TODO comment
2024-12-07 16:34:38 -05:00
qazal
6be388be86
failing test for const folding breaking indexing [pr] (#8103)
2024-12-07 19:55:02 +08:00
nimlgen
8b1fa9cb7d
nv hcq queue touchups (#8102)
2024-12-07 14:09:38 +03:00
qazal
4074f52317
VALID early folding (#8100)
* fold valid
* :)
* fix test_verify_ast
* keep symbolic working
2024-12-07 18:37:47 +08:00
qazal
07b6d5cf63
assign early folding (#8093)
* assign early folding [pr]
* move to to_si
* -
* fix generate_dataset
* diff too big
* no recreation, no diff
* gzip
* new sops from tiny10
* final try
2024-12-07 17:02:55 +08:00
George Hotz
00ac0db9d4
np tensors have the memory from numpy in compile3 [pr] (#8098)
2024-12-07 14:01:51 +08:00
George Hotz
22feb3a2f1
move copy into the JIT for openpilot compile3 (#7937)
* move copy into the JIT, test fails
* ahh, prune was the issue
2024-12-07 13:26:26 +08:00
leopf
0ed731b5ea
torch_load with Tensors (#8037)
* torch_load with Tensors
* remove passthrough_reset + use accept_filename
* Revert "remove passthrough_reset"
* version note
* cleanup
2024-12-07 09:55:41 +08:00
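A hedged sketch of what the change enables, as I read the PR (the filename is hypothetical; exact accepted types may differ):

```python
from pathlib import Path
from tinygrad import Tensor
from tinygrad.nn.state import torch_load

state = torch_load("model.pt")               # classic path-based load
raw = Tensor(Path("model.pt").read_bytes())  # checkpoint bytes already in memory
state = torch_load(raw)                      # per this PR, a Tensor should now work too
```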
chenyu
2d321646b8
default tensors to int32 in test_ops (#8097)
torch defaults to int64, but we care more about int32 anyway (see the sketch after this entry). Removed tests that were skipped because int64 is not supported.
2024-12-06 20:33:36 -05:00
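The dtype mismatch in question, sketched (assuming tinygrad's default int is int32):

```python
import torch
from tinygrad import Tensor, dtypes

assert torch.tensor([1, 2]).dtype == torch.int64  # torch defaults integer tensors to int64
assert Tensor([1, 2]).dtype == dtypes.int32       # tinygrad's default int (assumption)
# pinning both sides to int32 keeps comparisons apples-to-apples and avoids
# skipping tests on backends without int64 support
```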
chenyu
e9692de42b
don't FUZZ_ALL_ACTIONS in fuzz_linearizer.py (#8096)
Mostly for speed; this is just making sure the script runs.
2024-12-06 17:22:17 -05:00
chenyu
564b3a3e1b
onnx Bitwise ops (#8095)
free stuff!
2024-12-06 16:58:09 -05:00
qazal
a97b8fa3c5
maskless const can lower without valid, p1 [pr] (#8094)
2024-12-06 23:21:19 +02:00
mesozoic-egg
aaf2379f97
remove ordered parents, seems like dead code [pr] (#8092)
* remove ordered parents, seems like dead code
* no need to dedup
2024-12-06 16:19:37 -05:00
nimlgen
e180a31c5e
tiny metal cleanup (#8089)
* tiny metal cleanup
* cast
* sry
2024-12-06 21:44:32 +03:00
chenyu
d000c08f04
fix return type of Tensor.pow (#8091)
Int to the power of int should return int, etc.; it hints that we would like to have Ops.POW (see the sketch after this entry).
2024-12-06 13:38:29 -05:00
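A sketch of the promotion rule as described (dtype outcomes are my reading of the fix, hence the hedged comments):

```python
from tinygrad import Tensor, dtypes

assert (Tensor([2]) ** Tensor([3])).dtype == dtypes.int32  # int ** int should stay int
assert (Tensor([2]) ** 0.5).dtype == dtypes.float32        # a float exponent presumably promotes
```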
qazal
1ea4dc9565
big graph init conceptual cleanup [pr] (#8090)
* keep Ops.BUFFER naming consistent [pr]
* big graph init conceptual cleanup [pr]
* make everything pass through
* pylint doesn't complain now
2024-12-06 20:07:00 +02:00
geohotstan
5184410fc3
combine get inputs and type_parse function in onnx [fixed] (#8081)
* 1 is simpler than 2
* variable name
* change error wording
* shapes for sequence type must be homogeneous
* bug fix for model benchmark
* fix comments too
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 12:34:47 -05:00
nimlgen
d1282da7e8
hcq bump alloc (#8078)
* hcq bump alloc
* hm
* nv
* typo
2024-12-06 19:19:04 +03:00
qazal
df84dc6444
unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]
* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
geohotstan
0b7c44677d
Fix uint8 cast underflow (#6305)
* hacky fix for cast
* only float to uint8
* limit to float -> uint8
* touchup alu cast test
* improve tests and support more float to unsigned casts
* del one repeated test
* del 1 more repeated test
* try removing expected failure test
* hmmm try 1 more
* skip tests for flakiness
* uint64 super flaky
* clean up
* grammar
* just match numpy
* why is CI numpy different from local numpy
* increase verbosity
* try
* try2
* try3
* try4
* yeah idk
* new direction
* try again
* just don't support uint32 and uint64
* done?
* oops
* comment
* documentation
* it is what it is
---------
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 10:25:03 -05:00
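A sketch of the resolution per the "just match numpy" bullet above (values illustrative; out-of-range float-to-unsigned conversion is platform-dependent, which is presumably why uint32/uint64 support was dropped):

```python
import numpy as np
from tinygrad import Tensor, dtypes

x = np.array([-1.0, 300.0], dtype=np.float32)
expected = x.astype(np.uint8)  # numpy wraps out-of-range values, e.g. [255, 44] on x86
got = Tensor(x).cast(dtypes.uint8).numpy()
np.testing.assert_equal(got, expected)
# float -> uint32/uint64 stays unsupported per this PR: behavior was too flaky
# across backends to pin down
```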