tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-02-01 02:05:22 -05:00

Author	SHA1	Message	Date
chenyu	99e7a1d5e9	support symbolic reshape with non-contiguous (#4844 ) * support symbolic reshape with non-contiguous pre-requisite for symbolic arange (make symbolic ones that can be folded). * test cases * typo * shorter	2024-06-05 16:01:19 -04:00
chenyu	a352b6d9ce	symbolic Tensor.var (#4843 ) taken from #4446 and add more tests	2024-06-05 12:55:54 -04:00
Timmy	887643cf34	Multireduce atomic local load/store test (#4786 ) * atomic load/store test * tests for nested & unrolled * check barriers * linters * cleaning up diff * fix assert in _temp_create_multireduce_ast changes * cleaning up the check for redundant barriers * minor cleanups for the assert * always seed randn, helps with debuggability --------- Co-authored-by: qazal <qazal.software@gmail.com>	2024-06-05 14:41:19 +03:00
Szymon Ożóg	273945df67	Regression tests for bitshift (#4829 ) * Regression tests for bitshift * Add test for bitshift not triggered * Enable tests	2024-06-05 11:42:34 +02:00
Alec Chen	5ac30c29d8	Construct UOps patterns using UPat (#4821 ) * Allow UPat pattern definitions * Convert pattern matcher tests to UPat constructions * Convert constant_folder patterns to upat constructions * Convert assembly patterns to upat constructions * [run_process_replay] Drop UPat.from_dict	2024-06-05 10:29:37 +02:00
Szymon Ożóg	e47277d18a	Disable for PTX as well (#4838 ) Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>	2024-06-05 10:37:59 +03:00
Francis Lam	890e7c12bb	test/external/verify_kernel: add support for single pickled kernel (#4836 )	2024-06-04 18:59:21 -04:00
Elias Wahl	04e237328b	Refactor to class style (#4804 )	2024-06-04 14:08:31 -07:00
David Hou	cddce0e168	don't cast before view on shape changing bitcast (#4833 ) * don't cast before view on shape changing bitcast * make sure cast before view triggers	2024-06-04 16:04:52 -04:00
Alec Chen	4909a0d16f	Fix arg set in pattern matcher (#4830 )	2024-06-04 15:10:09 -04:00
Alec Chen	c96026ac65	Add arg set regression test for pattern matcher (#4827 ) * Add arg set regression test for pattern matcher * real regression --------- Co-authored-by: qazalin <qazal.software@gmail.com>	2024-06-04 13:35:09 -04:00
chenyu	a70e8a80d7	test_ops test cmp with special floats (#4826 ) prepare to fix nan, it did not work with ge and le before either	2024-06-04 12:10:21 -04:00
chenyu	3afc914617	CMPEQ -> CMPNE and make it safe to pad (#4818 ) * CMPNE * new dataset	2024-06-03 18:02:15 -04:00
Szymon Ożóg	bb7b031c5c	Bitshift (#4728 ) * WIP * Cleanup * Cleanup * Fix variable, refactor to use set * right shift should be signed/unsigned * Test for bitshifts * Allow a neg	2024-06-03 21:16:01 +02:00
nimlgen	e78a9bf3f2	support view in nv/amd (#4812 ) * support view in nv/amd * fix amd * fix * run test on nv/amd	2024-06-03 22:11:52 +03:00
chenyu	45083ccb43	canonicalize 0 in shape in View.create (#4815 ) set strides to 0, offset to 0, mask to None, and contiguous to True with size 0 view.	2024-06-03 13:37:37 -04:00
qazal	f64fa51a64	process replay for test/* (#4799 ) * add input to unit tests [run_process_replay] * add setup [run_process_replay] * run tests [run_process_replay] * add cuda and amd [run_process_replay] * run everything but BEAM=2 [run_process_replay] * skip export_model [run_process_replay] * fix amd CI * add concurrency back	2024-06-03 12:01:58 +03:00
Timmy	ca32921f84	Multireduce PADTO Test (#4785 ) * padto test * expanded multireduce padto tests * cuda doesnt run on ci * moving padto_where_multireduce test to SUM so that we can check the reduce axis * cleaning up tests some more * add wanna_outputs * refactor test_padto_sum_multireduce * fix max and refactor where * fix axis --------- Co-authored-by: qazal <qazal.software@gmail.com>	2024-06-02 13:46:53 +03:00
chenyu	1ffa5ec492	unit test ShapeTracker.consecutive (#4800 )	2024-06-01 10:10:51 -04:00
chenyu	8942230b1f	minor cleanups of test_tensor and extend some cases (#4794 )	2024-05-31 10:43:22 -04:00
qazal	637f482588	configure derandomizing CI tests (#4793 )	2024-05-31 17:06:58 +03:00
chenyu	7cc883ecee	CMPLT is safe to pad (#4790 ) 0 < 0 evals to False	2024-05-30 22:50:48 -04:00
chenyu	236390aafb	fix lazy r const folding with variable shape (#4783 ) currently not supporting const fold symbolic shape. I think it's possible with a refactor to Tensor.from_node. also added some failed required tests for symbolic arange.	2024-05-30 15:19:28 -04:00
chenyu	4921de1945	fix cumsum of 0-d tensor (#4781 ) * fix cumsum of 0-d tensor * _resolve_dim for all	2024-05-30 12:41:09 -04:00
chenyu	4cf0eadf8f	failed test case for ellipsis in einsum (#4779 ) from #4156	2024-05-30 11:14:42 -04:00
Alec Chen	e89bc42cc7	Add UOps pattern matcher regression tests (#4725 ) * add pattern matcher regression tests * Remove test for dtype str after rebasing * Make test uops match type spec * leave const const, add const alu vin test * correct uops * actually correct uops	2024-05-30 17:12:20 +03:00
qazal	c2945be0a3	add fused tensor core opts tests (#4775 ) * add fused tc opts tests * n=64	2024-05-30 13:50:00 +03:00
chenyu	f1bf916b8a	apply NOOPT in test_arange complexity (#4774 ) with hcopt, arange(2560) uses less ops than arange(256)	2024-05-29 23:12:35 -04:00
chenyu	cde7a7cda7	isolate the 134ms kernel in train_gpt2.py (#4773 ) 133ms on tinybox red with BEAM=2	2024-05-29 17:26:24 -04:00
chenyu	59c6472b9f	check contiguous in View.create after canonicalizing mask and offset (#4770 ) mask / offset / strides can change during canonicalization, and contiguous can be True at the end	2024-05-29 11:31:13 -04:00
nimlgen	019f4680e5	check dims before execution on nv (#4756 ) * check dims before execution on nv * fix linter	2024-05-28 16:57:28 +03:00
qazal	0e824741c4	pre multi reduce codegen/* cleanup (#4755 ) * refactor self.reduceop * free lines * fix test	2024-05-28 08:15:48 -04:00
chenyu	53b9081aab	check arg types of Tensor.randint (#4751 ) raise TypeError if low, high, dtype are not ints	2024-05-27 20:24:10 -04:00
qazal	0e69b22629	multireduce OptOps tests (start) (#4733 ) * start * full tests * add skips * unrelated * notes	2024-05-27 12:21:33 +03:00
qazal	c7b1d802f1	delete duplicate tests in test_linearizer (#4723 ) * delete duplicate test test_simplify_uop isnt needed max works * ci * remove skip * add skip back	2024-05-26 08:11:42 +03:00
Szymon Ożóg	de5c69c4c9	Unify test_dtype naming conventions (#4730 )	2024-05-25 10:12:40 -04:00
chenyu	7e90026eb0	pow cleanup part 2 (#4727 ) more cleanups and fix 0 ** 0	2024-05-25 07:17:40 -04:00
chenyu	31358cbea5	change Tensor.stack to method (#4719 )	2024-05-24 17:04:19 -04:00
Szymon Ożóg	212025b53c	Int mulacc for ptx (#4680) * IntMulacc * don't mov const * Don't do int mulacc on ocelot * Workaround for ocelot * Remove ocelot workaround * Fix tests that merged into mulacc * fix uop count after merging to mulacc	2024-05-24 15:20:48 -04:00
qazal	c170ddceaf	fix commavq benchmark (#4712 ) * fix _slice and assert explicit device * with _slice	2024-05-24 19:40:57 +03:00
Szymon Ożóg	84255069e7	Fix int8 and uint8 on PTX (#4711 ) * Fix mem type for uchar * Bring tests back	2024-05-24 11:08:52 -04:00
chenyu	4398cc3654	update test_linearizer.py (#4707 ) tests passed locally on tinybox green. Also unified test skipping with local/shared/float4/tc	2024-05-23 22:41:22 -04:00
Francis Lam	49225522aa	wmma: chain unrolled WMMAs and phi only at the end (#4703 ) * wmma: chain unrolled WMMAs and phi only at the end * fix linter and tests * reduce lines	2024-05-23 17:50:18 -04:00
chenyu	eb714a600d	fix UOps.CAST noop for vectorized dtypes (#4704 ) * == * add test * not lazyop * use str comparison for PtrDType --------- Co-authored-by: qazal <qazal.software@gmail.com>	2024-05-23 17:33:29 -04:00
qazal	532c9e08e3	proposal: PHI nodes in TC shouldn't have children inside the loop (#4694 ) * expectations from UOpGraph * one with children * minimal repro * replace	2024-05-23 15:11:26 -04:00
Szymon Ożóg	9a9963ba7b	Remove uops deepcopy from PTX (#4671 ) * Remove uops deepcopy from PTX * Update test * Fix test * fix for non-ptx * Clean --------- Co-authored-by: chenyu <chenyu@fastmail.com>	2024-05-22 23:14:17 -04:00
chenyu	47aba47f64	update Torch.gather api (#4692 ) * update Torch.gather api gather(self, dim, index) to match torch * fix that	2024-05-22 21:54:06 -04:00
qazal	498cf3e7e0	fuzzer path search for DEFINE_ACC (#4656 ) * insert acc * add test_ops * find toposorts * todo - not yet ready * remove the import * atol and childless children	2024-05-23 00:50:01 +03:00
qazal	f11a81f707	isolated test for BEAM=2 llama wrong uops toposort (#4687 ) * add ast * skip test in CI	2024-05-23 00:47:37 +03:00
Francis Lam	721f9f6acf	test/external/verify_kernel: fix LOGKERNS variable name in comments (#4685 ) should've been changed with the LOGKERN to LOGKERNS change	2024-05-22 17:08:40 -04:00
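
Two of the commits above (3afc914617 and 7cc883ecee) describe comparison ops as "safe to pad", with the note that 0 < 0 evaluates to False. The sketch below illustrates that idea in plain Python. It assumes "safe to pad" means that applying the op to zero padding yields a zero (falsy) result, so padded elements do not change a later sum. The helpers `keeps_padding_zero` and `padded_sum` are hypothetical names for illustration and are not part of tinygrad.

```python
import operator

def keeps_padding_zero(op) -> bool:
    # Hypothetical check, not tinygrad's API: under the assumption stated above,
    # an op is "safe to pad" if op(0, 0) is falsy, so zero padding stays zero.
    return not op(0, 0)

def padded_sum(op, xs, ys, pad_to):
    # Pad both inputs with zeros up to pad_to, apply op elementwise, sum the results.
    pad = pad_to - len(xs)
    xs_p = list(xs) + [0] * pad
    ys_p = list(ys) + [0] * pad
    return sum(int(op(a, b)) for a, b in zip(xs_p, ys_p))

xs, ys = [1, 2, 3], [2, 2, 4]
for name, op in [("lt", operator.lt), ("ne", operator.ne), ("eq", operator.eq)]:
    unpadded = padded_sum(op, xs, ys, len(xs))
    padded = padded_sum(op, xs, ys, 8)
    print(name, keeps_padding_zero(op), unpadded, padded)
```

With these inputs, the less-than and not-equal sums are the same before and after padding, while the equality sum grows with the padding, because 0 == 0 is True. This is consistent with the move from CMPEQ to CMPNE described in 3afc914617, though the commit message itself does not spell out the reasoning.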

1863 Commits