Commit Graph

10633 Commits

qazal
07b6d5cf63 assign early folding (#8093)
* assign early folding [pr]

* move to to_si

* -

* fix generate_dataset

* diff too big

* no recreation, no diff

* gzip

* new sops from tiny10

* final try
2024-12-07 17:02:55 +08:00
George Hotz
00ac0db9d4 np tensors have the memory from numpy in compile3 [pr] (#8098) 2024-12-07 14:01:51 +08:00
George Hotz
22feb3a2f1 move copy into the JIT for openpilot compile3 (#7937)
* move copy into the JIT, test fails

* ahh, prune was the issue
2024-12-07 13:26:26 +08:00
leopf
0ed731b5ea torch_load with Tensors (#8037)
* torch_load with Tensors

* remove passthrough_reset + use accept_filename

* Revert "remove passthrough_reset"

* version note

* cleanup
2024-12-07 09:55:41 +08:00
chenyu
2d321646b8 default tensors to int32 in test_ops (#8097)
torch defaults to int64, but we care more about int32 anyway. removed tests that were skipped because int64 is not supported; the dtype-default difference is sketched below
2024-12-06 20:33:36 -05:00
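For context, tinygrad's default integer dtype is int32 while torch's is int64; a minimal sketch of that difference (the assertion is illustrative, not the PR's test code):

```python
from tinygrad import Tensor, dtypes

# tinygrad creates integer tensors as int32 by default, unlike torch's int64
t = Tensor([1, 2, 3])
assert t.dtype == dtypes.int32
```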
chenyu
e9692de42b don't FUZZ_ALL_ACTIONS in fuzz_linearizer.py (#8096)
mostly for speed; this just makes sure the script runs
2024-12-06 17:22:17 -05:00
chenyu
564b3a3e1b onnx Bitwise ops (#8095)
free stuff!
2024-12-06 16:58:09 -05:00
qazal
a97b8fa3c5 maskless const can lower without valid, p1 [pr] (#8094) 2024-12-06 23:21:19 +02:00
mesozoic-egg
aaf2379f97 remove ordered parents, seems like dead code [pr] (#8092)
* remove ordered parents, seems like dead code

* no need to dedup
2024-12-06 16:19:37 -05:00
nimlgen
e180a31c5e tiny metal cleanup (#8089)
* tiny metal cleanup

* cast

* sry
2024-12-06 21:44:32 +03:00
chenyu
d000c08f04 fix return type of Tensor.pow (#8091)
int to the power of int should return int, etc.; this hints that we would like an Ops.POW (illustrated below)
2024-12-06 13:38:29 -05:00
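An illustrative check of the behavior #8091 describes (the values and the float-promotion line are assumptions, not the PR's tests):

```python
from tinygrad import Tensor, dtypes

a = Tensor([2, 3, 4], dtype=dtypes.int32)
print(a.pow(2).dtype)    # int ** int should stay dtypes.int32 after this fix
print(a.pow(2.0).dtype)  # a float exponent still promotes to a float dtype
```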
qazal
1ea4dc9565 big graph init conceptual cleanup [pr] (#8090)
* keep Ops.BUFFER naming consistent [pr]

* big graph init conceptual cleanup [pr]

* make everything pass through

* pylint doesn't complain now
2024-12-06 20:07:00 +02:00
geohotstan
5184410fc3 combine get inputs and type_parse function in onnx [fixed] (#8081)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous

* bug fix for model benchmark

* fix comments too

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 12:34:47 -05:00
nimlgen
d1282da7e8 hcq bump alloc (#8078)
* hcq bump alloc

* hm

* nv

* typo
2024-12-06 19:19:04 +03:00
qazal
df84dc6444 unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]

* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
geohotstan
0b7c44677d Fix uint8 cast underflow (#6305)
* hacky fix for cast

* only float to uint8

* limit to float -> uint8

* touchup alu cast test

* improve tests and support more float to unsigned casts

* del one repeated test

* del 1 more repeated test

* try removing expected failure test

* hmmm try 1 more

* skip tests for flakiness

* uint64 super flaky

* clean up

* grammar

* just match numpy

* why is CI numpy different from local numpy

* increase verbosity

* try

* try2

* try3

* try4

* yeah idk

* new direction

* try again

* just don't support uint32 and uint64

* done?

* oops

* comment

* documentation

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 10:25:03 -05:00
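The numpy behavior being matched, as a sketch; exact results of out-of-range float-to-unsigned casts are platform-dependent, which is the CI-vs-local flakiness the commit trail wrestles with:

```python
import numpy as np

# out-of-range floats wrap modulo 256 when cast to uint8 on most platforms
print(np.float32(-1.0).astype(np.uint8))   # typically 255
print(np.float32(300.0).astype(np.uint8))  # typically 44 (300 % 256)
```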
Ahmed Harmouche
f3983f6743 Move efficientnet example (#8087) 2024-12-06 15:48:16 +01:00
qazal
7dbd166227 skip test_schedule_mem_used_with_inputs [pr] (#8086) 2024-12-06 16:44:34 +02:00
qazal
0356657ced move view_supported_devices to device [pr] (#8085) 2024-12-06 16:44:15 +02:00
Ahmed Harmouche
fad3eaa35e Use atomicLoad builtin when loading atomic type (#8084) 2024-12-06 15:33:11 +01:00
qazal
79966fade0 free up lines for const_arg [pr] (#8083) 2024-12-06 16:28:51 +02:00
Ahmed Harmouche
ba35c4138b Use matching JS TypedArray for buffer dtype (#8080) 2024-12-06 14:52:23 +01:00
geohotstan
a684d72e55 add ceil_mode for avg_pool and max_pool (#7579)
* wip pool

* check CI for remove alternative implementation

* Revert "check CI for remove alternative implementation"

This reverts commit 7b1bb900e5.

* fix test

* tests tests tests

* slap a resolve on it

* fix comment

* a little simpler pool

* check CI for removal again

* Revert "check CI for removal again"

This reverts commit be798b7857.

* small

* update

* some ez tests

* english

* clean up code

* fix ruff

* how did I +25 lines?

* small clean ups

* moar clean ups

* try test_avgpool2d_failure2 in CI

* final clean up

* exclude bug fix

* avg underscore pool

* no more edge case stuff

* add better comments for explanation

* add test cases for decreasing end padding

* address feedback

* improve test coverage

* tiny more polish as we wait for lines :D

* more readable code ordering

* add to documentation

* oops

* set to False instead

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 08:34:14 -05:00
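For reference, ceil_mode switches the pooling output-size computation from floor to ceiling so a trailing partial window is kept rather than dropped; a standalone sketch (not tinygrad's implementation):

```python
import math

def pool_out_size(size: int, kernel: int, stride: int, pad: int = 0, ceil_mode: bool = False) -> int:
    # 1D pooling output length; ceil_mode rounds up instead of down
    rnd = math.ceil if ceil_mode else math.floor
    return rnd((size + 2 * pad - kernel) / stride) + 1

print(pool_out_size(7, kernel=2, stride=2))                  # 3: trailing element dropped
print(pool_out_size(7, kernel=2, stride=2, ceil_mode=True))  # 4: partial window kept
```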
chenyu
b73d9a7d24 Revert "combine get inputs and type_parse function in onnx (#8069)" (#8079)
This reverts commit 074a67a6eb.
2024-12-06 08:04:21 -05:00
Sieds Lykles
c8313a3669 Cleaner rule for mul/idiv by power of two [pr] (#8076)
* Cleaner rule for mul/idiv by power of two

* Change comment
2024-12-06 08:02:24 -05:00
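Worked examples of the identities such a rule can capture (the concrete rewrites below are assumptions about its shape, not taken from the PR):

```python
# power-of-two mul/idiv identities, exact under floor division:
#   (x * 8) // 2  ->  x * 4    (the divisor divides the multiplier)
#   (x * 2) // 8  ->  x // 4   (the common factor of 2 cancels)
for x in range(256):
    assert (x * 8) // 2 == x * 4
    assert (x * 2) // 8 == x // 4
```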
chenyu
a77ee72d11 clean up reshape size check [pr] (#8067)
removed a resolve, and removed the special case for the 0-size assert since it's covered by the generic size check
2024-12-06 07:51:19 -05:00
geohotstan
074a67a6eb combine get inputs and type_parse function in onnx (#8069)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous
2024-12-06 07:42:35 -05:00
nimlgen
c0240855b9 qcom has no transfer (#8075)
* qcom alloc is not hcq alloc

* maybe base?

* test
2024-12-06 14:45:01 +03:00
Ahmed Harmouche
ce72fe1411 u32 to f16 in tinygrad (#8074)
* f16 decompression in tinygrad

* Typing and cleanup
2024-12-06 12:00:13 +01:00
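The core trick in plain Python rather than tinygrad ops: one u32 packs two little-endian f16 halves (the packed value here is made up):

```python
import struct

u32 = 0x3C004200  # high half 0x3C00 (1.0), low half 0x4200 (3.0)
lo, hi = struct.unpack("<2e", struct.pack("<I", u32))
print(lo, hi)     # 3.0 1.0
```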
George Hotz
e37bff6c19 fix bug in jit prune with copy [pr] (#8073) 2024-12-06 18:38:23 +08:00
George Hotz
aae8557ada test copy inside jit [pr] (#8072) 2024-12-06 17:51:50 +08:00
George Hotz
e2fe7f0d2f hotfix: actually fix pylint, it's a python 3.10 issue 2024-12-06 13:53:46 +08:00
George Hotz
b28d660172 update self_tokenize, fix pylint maybe 2024-12-06 13:49:41 +08:00
George Hotz
344fd4845c example: self_tokenize. someday tinygrad will be recursively self improving 2024-12-06 13:35:02 +08:00
JaSpa99
3c5d5f9414 mypy==1.13.0 (#7990)
* explicit instantiation and narrowing asserts

* explicit cast

* bump

* one line assert

* handle case for no copy_queue_t

* Revert "handle case for no copy_queue_t"

This reverts commit 38347806ca.

* more readable control flow

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-06 12:09:14 +08:00
leopf
65b6696f3b refactor safe_load (#8035)
* refactor safe_load

* cleanup
2024-12-06 12:08:21 +08:00
chenyu
e7d5fe4a32 improve idiv _min_max (#8066)
for the cases where we don't know the exact bounds, we might still know the sign (sketched below). with this, we can remove some resolves for the symbolic shapetracker
2024-12-05 23:02:16 -05:00
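A hedged sketch of the sign reasoning (the real _min_max is a method on UOp and handles more cases):

```python
import math

def idiv_bounds(x_min, x_max, d):
    # if the numerator is known nonnegative and the divisor is a positive
    # constant, the quotient stays nonnegative even with an unknown upper bound
    assert d > 0 and x_min >= 0
    return x_min // d, math.inf if x_max == math.inf else x_max // d

print(idiv_bounds(0, math.inf, 4))  # (0, inf): the sign is known, the max is not
```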
chenyu
13b954f22c unify expand conditions [pr] (#8065)
same condition (check if old == new or old == 1) in tensor and view. also renamed _pad_left to _align_left because it's not really a pad
2024-12-05 21:40:14 -05:00
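The shared condition is easy to demonstrate (a sketch; the failing case is commented out since it raises):

```python
from tinygrad import Tensor

t = Tensor.ones(1, 3)
print(t.expand(4, 3).shape)  # (4, 3): allowed because old == 1
# Tensor.ones(2, 3).expand(4, 3) would fail: 2 != 4 and 2 != 1
```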
chenyu
aefdff4ef5 reshape mask cleanups [pr] (#8064)
don't need canonicalize_st because we always merge 1 in `_merge_dims`
2024-12-05 20:20:43 -05:00
chenyu
05dba6e4ee minor to_indexed_uops cleanup [pr] (#8063) 2024-12-05 17:15:03 -05:00
chenyu
b2dd703592 fix typing of UOp.range [pr] (#8062)
start/end should not be float or bool
2024-12-05 14:56:34 -05:00
Sieds Lykles
49c6dab74b Add pattern for div mod recombine with gcd (#8061)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 13:16:58 -05:00
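The base identity behind div/mod recombination (the PR's gcd generalization is not reproduced here):

```python
# for any integer x and positive n: (x // n) * n + x % n == x
for x in range(-16, 16):
    for n in (1, 2, 3, 8):
        assert (x // n) * n + x % n == x
```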
geohotstan
707e9a9c8e add _one_hot_along_dim helper for Tensor.arange masking (#8039)
* feelsbadman

* feelsextrabadman

* make sure indices is on same device as self Tensor

* renamed to _one_hot_along_dim

* revert onnx change will do them in onnx only PRs

* address feedback

* add onnx changes here too

* make pad arg better

* revert pad arg

* maybe still keep dim

* simplify onehot onnx ops more

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 12:43:00 -05:00
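The arange-masking pattern such a helper encapsulates, sketched with assumed shapes (not the PR's exact code):

```python
from tinygrad import Tensor

idx = Tensor([1, 3])                                      # indices to one-hot
mask = Tensor.arange(5).unsqueeze(0) == idx.unsqueeze(1)  # broadcast compare
print(mask.shape)                                         # (2, 5): True at columns 1 and 3
```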
chenyu
3c5983473a combine parentless reduce rule [pr] (#8059) 2024-12-05 11:28:35 -05:00
chenyu
87594a8153 simpler dtypes.max for int [pr] (#8058) 2024-12-05 10:31:41 -05:00
geohotstan
66b8242375 Simple onnx.py clean ups (#8054)
* start

* simplify ops

* why did this not work before

* will split buffer parse to separate pr

* flip the error order

* only this much for now

* to_python_const clean up

* minimize diff

* move tensor_methods into onnx.py

* improve some type signatures

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 10:31:26 -05:00
chenyu
5c6ed5dba6 lower test_conv_3x3_256_32_32_256_256 expectation (#8060)
failed https://github.com/tinygrad/tinygrad/actions/runs/12182799887/job/33982676812#step:9:210
2024-12-05 10:30:56 -05:00
Ahmed Harmouche
c6f5bb03fa YoloV8 WebGPU fixes (#8057)
* Bump up input size to 416, show if webgpu is not supported

* Minor fix in export_model
2024-12-05 16:23:45 +01:00
nimlgen
78c01a5c2b amd general _gpu_alloc (#8056)
* amd general _gpu_alloc

* hmm

* ops
2024-12-05 15:50:23 +03:00
nimlgen
8071600897 nv one _gpu_alloc (#8055) 2024-12-05 15:22:03 +03:00