Commit Graph

4618 Commits

Author / SHA1 / Message / Date
chenyu
c814de2dd4 fix bitwise_not for signed int (#8117)
-1 is correct because 2**32-1 is not within int32 range, so in some cases clang casts the whole thing to uint32
2024-12-09 02:02:51 -05:00
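
A quick illustration of the rationale above (a numpy sketch, not tinygrad's renderer): for signed int32, bitwise NOT is XOR with -1, while a 2**32-1 mask does not fit in int32 and can force the whole expression to uint32 in C-like backends.

    import numpy as np

    x = np.array([0, 1, -5], dtype=np.int32)
    # XOR with -1 flips every bit and keeps the result int32; 2**32-1 does
    # not fit in int32, so using it as the mask can promote to uint32.
    assert (~x == (x ^ np.int32(-1))).all()
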
ttomsa
e22d7b6fb0 fix var vmax inside special (#8116) 2024-12-09 01:16:08 -05:00
qazal
5dd61035f7 revert VALID early folding for now (#8114)
This reverts commit 4074f52317.
2024-12-09 00:34:24 +08:00
qazal
69e48da961 set NOOPT in test_avg_pool3d_failure (#8112)
* set NOOPT=0 in test_avg_pool3d_failure

* noopt should still pass
2024-12-08 10:48:29 -05:00
geohotstan
f8294b3bda add avg pool 3d failure test (#8105)
* add test

* try simplify test case

* add TODO comment
2024-12-07 16:34:38 -05:00
qazal
6be388be86 failing test for const folding breaking indexing [pr] (#8103) 2024-12-07 19:55:02 +08:00
qazal
4074f52317 VALID early folding (#8100)
* fold valid

* :)

* fix test_verify_ast

* keep symbolic working
2024-12-07 18:37:47 +08:00
qazal
07b6d5cf63 assign early folding (#8093)
* assign early folding [pr]

* move to to_si

* -

* fix generate_dataset

* diff too big

* no recreation, no diff

* gzip

* new sops from tiny10

* final try
2024-12-07 17:02:55 +08:00
chenyu
2d321646b8 default tensors to int32 in test_ops (#8097)
torch defaults to int64 but we care more about int32 anyway. remove tests that were skipped because int64 is not supported
2024-12-06 20:33:36 -05:00
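
For context on the dtype default above (a sketch of the idea, not the actual test helper): seeding the test inputs from int32 numpy arrays keeps torch from silently upcasting to its int64 default.

    import numpy as np
    import torch

    # torch.tensor([1, 2, 3]) would default to int64; building from an int32
    # numpy array keeps both frameworks on the dtype tinygrad targets.
    a = np.array([1, 2, 3], dtype=np.int32)
    t = torch.tensor(a)
    assert t.dtype == torch.int32
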
chenyu
564b3a3e1b onnx Bitwise ops (#8095)
free stuff!
2024-12-06 16:58:09 -05:00
qazal
a97b8fa3c5 maskless const can lower without valid, p1 [pr] (#8094) 2024-12-06 23:21:19 +02:00
chenyu
d000c08f04 fix return type of Tensor.pow (#8091)
int to the power of int should return int, etc. it hints that we would like to have Ops.POW
2024-12-06 13:38:29 -05:00
geohotstan
5184410fc3 combine get inputs and type_parse function in onnx [fixed] (#8081)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous

* bug fix for model benchmark

* fix comments too

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 12:34:47 -05:00
qazal
df84dc6444 unrelated test fixups from delete_lazy [pr] (#8088)
* unrelated test fixups from delete_lazy [pr]

* fine if it's scheduled later
2024-12-06 17:31:02 +02:00
geohotstan
0b7c44677d Fix uint8 cast underflow (#6305)
* hacky fix for cast

* only float to uint8

* limit to float -> uint8

* touchup alu cast test

* improve tests and support more float to unsigned casts

* del one repeated test

* del 1 more repeated test

* try removing expected failure test

* hmmm try 1 more

* skip tests for flakiness

* uint64 super flaky

* clean up

* grammar

* just match numpy

* why is CI numpy different from local numpy

* increase verbosity

* try

* try2

* try3

* try4

* yeah idk

* new direction

* try again

* just don't support uint32 and uint64

* done?

* oops

* comment

* documentation

* it is what it is

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 10:25:03 -05:00
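
The underflow being fixed looks roughly like this (numpy sketch; the wrapped value is platform-dependent, which is why the tests ended up just matching the local numpy and dropping uint32/uint64 support):

    import numpy as np

    # Negative or out-of-range floats cast straight to uint8 wrap around;
    # the exact result is undefined behaviour in C and varies by platform.
    x = np.array([-1.0, 300.0], dtype=np.float32)
    print(x.astype(np.uint8))  # e.g. [255  44] on many platforms, not guaranteed
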
Ahmed Harmouche
f3983f6743 Move efficientnet example (#8087) 2024-12-06 15:48:16 +01:00
qazal
7dbd166227 skip test_schedule_mem_used_with_inputs [pr] (#8086) 2024-12-06 16:44:34 +02:00
qazal
0356657ced move view_supported_devices to device [pr] (#8085) 2024-12-06 16:44:15 +02:00
Ahmed Harmouche
ba35c4138b Use matching JS TypedArray for buffer dtype (#8080) 2024-12-06 14:52:23 +01:00
geohotstan
a684d72e55 add ceil_mode for avg_pool and max_pool (#7579)
* wip pool

* check CI for remove alternative implementation

* Revert "check CI for remove alternative implementation"

This reverts commit 7b1bb900e5.

* fix test

* tests tests tests

* slap a resolve on it

* fix comment

* a little simpler pool

* check CI for removal again

* Revert "check CI for removal again"

This reverts commit be798b7857.

* small

* update

* some ez tests

* english

* clean up code

* fix ruff

* how did I +25 lines?

* small clean ups

* moar clean ups

* try test_avgpool2d_failure2 in CI

* final clean up

* exclude bug fix

* avg underscore pool

* no more edge case stuff

* add better comments for explanation

* add test cases for decreasing end padding

* address feedback

* improve test coverage

* tiny more polish as we wait for lines :D

* more readable code ordering

* add to documentation

* oops

* set to False instead

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-06 08:34:14 -05:00
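
The ceil_mode behaviour added above follows the standard pooling output-size rule; a minimal sketch of one spatial dimension with a hypothetical helper, not the tinygrad implementation:

    import math

    def pool_out_dim(n, kernel, stride, pad, ceil_mode=False):
        # Rounding up instead of down can add one extra window; frameworks
        # like PyTorch additionally require that extra window to start
        # inside the input or its left padding.
        rnd = math.ceil if ceil_mode else math.floor
        return rnd((n + 2 * pad - kernel) / stride) + 1
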
chenyu
b73d9a7d24 Revert "combine get inputs and type_parse function in onnx (#8069)" (#8079)
This reverts commit 074a67a6eb.
2024-12-06 08:04:21 -05:00
chenyu
a77ee72d11 clean up reshape size check [pr] (#8067)
removed a resolve, and removed the special case for the 0-size assert since it's covered by the generic size check
2024-12-06 07:51:19 -05:00
geohotstan
074a67a6eb combine get inputs and type_parse function in onnx (#8069)
* 1 is simpler than 2

* variable name

* change error wording

* shapes for sequence type must be homogeneous
2024-12-06 07:42:35 -05:00
nimlgen
c0240855b9 qcom has no transfer (#8075)
* qcom alloc is not hcq alloc

* maybe base?

* test
2024-12-06 14:45:01 +03:00
Ahmed Harmouche
ce72fe1411 u32 to f16 in tinygrad (#8074)
* f16 decompression in tinygrad

* Typing and cleanup
2024-12-06 12:00:13 +01:00
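
The u32-to-f16 decompression above amounts to reinterpreting each 32-bit word as two packed half-precision values; a numpy sketch assuming little-endian packing, not the exported WebGPU code:

    import numpy as np

    packed = np.array([0x3C003800], dtype=np.uint32)  # low half 0x3800 = 0.5, high half 0x3C00 = 1.0
    print(packed.view(np.float16))                    # [0.5 1. ]
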
George Hotz
e37bff6c19 fix bug in jit prune with copy [pr] (#8073) 2024-12-06 18:38:23 +08:00
George Hotz
aae8557ada test copy inside jit [pr] (#8072) 2024-12-06 17:51:50 +08:00
chenyu
e7d5fe4a32 improve idiv _min_max (#8066)
for the cases where we don't know the exact bounds, we might still know the sign. with this, we can remove some resolves for symbolic shapetracker
2024-12-05 23:02:16 -05:00
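
The bound reasoning above can be sketched like this (a hypothetical helper assuming floor division and a positive constant divisor; tinygrad's idiv semantics may differ in detail):

    def idiv_bounds(x_min, x_max, d):
        # x // d is monotonic in x for a constant d > 0, so the interval
        # endpoints give the bounds. Even when x_min/x_max are unknown,
        # knowing just the sign of x still pins down the sign of x // d.
        assert d > 0
        return x_min // d, x_max // d
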
Sieds Lykles
49c6dab74b Add pattern for div mod recombine with gcd (#8061)
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-05 13:16:58 -05:00
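
The identity behind the recombine pattern (a property-check sketch, not the actual rewrite rule): a*(x//c) + b*(x % c) collapses to b*x whenever a == b*c, which is what pulling the gcd out of the coefficients exposes.

    import random

    for _ in range(1000):
        x = random.randint(-100, 100)
        c, b = random.randint(1, 10), random.randint(1, 10)
        a = b * c
        # b*(c*(x//c) + x % c) == b*x by the div/mod identity
        assert a * (x // c) + b * (x % c) == b * x
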
chenyu
5c6ed5dba6 lower test_conv_3x3_256_32_32_256_256 expectation (#8060)
failed https://github.com/tinygrad/tinygrad/actions/runs/12182799887/job/33982676812#step:9:210
2024-12-05 10:30:56 -05:00
Ahmed Harmouche
ff9a89f714 Proper dtypes for input/output of exported WebGPU model (#8053)
* Respect input/output dtypes in exported WebGPU model

* Add some comments about skipped dtypes
2024-12-05 10:38:05 +01:00
qazal
435a51e10c reduce folding simple tests [pr] (#8040)
* reduce folding simple tests [pr]

* test for view and realized src pattern

* realize / buffer behavior
2024-12-05 12:22:45 +08:00
George Hotz
20878be2af lower test_gemv_4096_16384 expectations 2024-12-05 12:08:26 +08:00
George Hotz
df18e7cc37 accept filename decorator [pr] (#8049)
* accept filename decorator [pr]

* add test for safe_load

* bring old tar tests back
2024-12-05 11:40:59 +08:00
chenyu
b3220ca7b1 test cases of always True/False lt (#8048)
* test cases of always True/False lt

* one more
2024-12-04 20:38:40 -05:00
geohotstan
5ce8090d42 simple onnx_ops cleanups (#8003)
* simple clean ups first

* more work

* kinda have adam

* ooo momentum worked nicely

* almost there

* wow.. is the onnx test wrong

* nicer optim stuff

* just skip that test

* small comment changes

* use naming convention from other parts of codebase

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 15:33:03 -05:00
Sieds Lykles
70db1bab5c Fold nested div with const (#8010)
* Rebase nested div and with const

* Update the ordering

* return None on vectors

Fixes cpu test

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 14:59:09 -05:00
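
The folding above relies on the nested-division identity for positive constant divisors (checked here with Python floor division; a sketch, not the tinygrad pattern):

    import random

    for _ in range(1000):
        x = random.randint(-1000, 1000)
        a, b = random.randint(1, 20), random.randint(1, 20)
        # (x // a) // b == x // (a * b) when a, b > 0
        assert (x // a) // b == x // (a * b)
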
chenyu
0693158d28 lower v_theoretical gemv on red (#8042)
tiny7 is still slower https://github.com/tinygrad/tinygrad/actions/runs/12166149038/job/33931736130#step:8:209
2024-12-04 13:59:40 -05:00
qazal
b116e1511d make device on uop optional [pr] (#8034) 2024-12-04 20:18:00 +08:00
Ahmed Harmouche
13eedd373b Run WebGPU tests on ubuntu (#8033) 2024-12-04 12:42:04 +01:00
George Hotz
08657cb7b0 hotfix: bump expectations in speed_v_theoretical 2024-12-04 19:00:33 +08:00
George Hotz
ea65c79ba2 hotfix: don't spam BEAM debug in speed_v_theoretical 2024-12-04 18:47:16 +08:00
George Hotz
09b00b1b04 hotfix: use kernel timings instead of python timings in speed_v_theoretical 2024-12-04 18:36:17 +08:00
leopf
f0401e14e8 tar_extract with Tensors (#7853)
* initial

* USTAR, PAX and GNU support + testing

* from_bytes byteorder

* use TarInfo.frombuf

* tensor only usage

* remove contextlib.suppress

* shorter ow,pax

* more tests

* testing length + move tests

* cleanup

* new approach: RawTensorIO

* fix fetch

* enable read test

* cleanup and ignore fix

* fix for python < 3.12

* make it RawIO

* functions

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-04 17:03:19 +08:00
uuuvn
e9c5b23ba1 Use MTLCompiler directly (v2) (#7920)
* Use MTLCompiler directly (v2)

* to_block_literal and REQUEST_TYPE_COMPILE

* Rewrite command encoding

* Revert to_block_literal

* Maybe that's more readable to some people?

* Typo and comment about stdlib caching

* Update ops_metal.py

* Update ops_metal.py

* Update ops_metal.py

---------

Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
2024-12-04 16:36:48 +08:00
chenyu
0c060fa040 update uop and tests to not use lt/gt/le/ge [pr] (#8023)
just use dunder methods, eventually remove those from ops
2024-12-03 21:02:52 -05:00
Ahmed Harmouche
db330a3110 Remove WebGL (#8012) 2024-12-03 16:02:53 +01:00
chenyu
ef3752625b add test case of realize_size with 0 in shape (#8011) 2024-12-03 09:19:50 -05:00
George Hotz
09eac42fd6 cache indexed uops in st [pr] (#8008)
* cache indexed uops in st [pr]

* remove arg from range
2024-12-03 21:27:07 +08:00
Sieds Lykles
e44183647f Improved div folding (#7996)
* First version of div_mod folding together

* Working version with old div folding behaviour

* Test is fixed

* Fix linting

* Happy mypy

---------

Co-authored-by: chenyu <chenyu@fastmail.com>
2024-12-03 08:11:25 -05:00