Commit Graph

11106 Commits

Author SHA1 Message Date
George Hotz
8a38e0d207 only mish failed 2021-01-03 09:47:11 -08:00
George Hotz
a337f7780e smarter way to write sign 2021-01-03 09:46:00 -08:00
George Hotz
1a4487965a remove negative from things w/o negative 2021-01-03 09:43:34 -08:00
George Hotz
0531b848eb second class sign 2021-01-03 09:33:12 -08:00
George Hotz
0702e0c763 nah, no sign, it's not what you want. use relu 2021-01-03 09:30:33 -08:00
George Hotz
29655609d5 fix GPU sign...these tests aren't very good 2021-01-03 09:00:49 -08:00
George Hotz
ea9c9af5d7 faster sign 2021-01-03 08:54:21 -08:00
George Hotz
c2eeb6950b add support for sign. technically relu can be second class now 2021-01-03 08:29:57 -08:00
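A "second class" op here is one composed from existing ops rather than implemented as a new primitive; with sign available, relu itself can be written that way. A minimal NumPy sketch of the composition (illustrative only, not the tinygrad code):

```python
import numpy as np

def relu_via_sign(x):
    # relu(x) = x * (sign(x) + 1) / 2
    # x > 0: mask is 1, x < 0: mask is 0, x == 0: x * 0.5 is still 0
    return x * (np.sign(x) + 1) / 2

x = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu_via_sign(x))  # [0.  0.  0.  0.5 2. ]
```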
George Hotz
6842ad9ec8 minor cleanups, yolo work 2021-01-03 08:14:16 -08:00
NeuralLink
0825cf7f79 Added softplus and mish non stable (#220)
*  Added softplus and mish CPU

* 🔨 refactor

* 🔨 second class softplus and mish

* 🔨 test fix

* no need of device in testing
2021-01-03 08:08:41 -08:00
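For reference, softplus and mish are plain compositions of existing elementwise ops, which is what makes them candidates for "second class" status; the "non stable" in the PR title refers to the naive log(1 + exp(x)) form, which overflows for large x. An illustrative NumPy sketch of the formulas:

```python
import numpy as np

def softplus(x):
    # naive (numerically non-stable) form: log(1 + exp(x))
    return np.log(1.0 + np.exp(x))

def mish(x):
    # mish(x) = x * tanh(softplus(x))
    return x * np.tanh(softplus(x))

x = np.array([-3.0, 0.0, 3.0])
print(softplus(x))  # ~[0.049 0.693 3.049]
print(mish(x))      # ~[-0.146 0.    2.987]
```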
George Hotz
ac229ea750 remove print 2021-01-02 12:53:30 -08:00
George Hotz
895d142503 start trying to load yolo v5 2021-01-02 12:51:55 -08:00
NeuralLink
ece07a3d12 🔨 refactor register ops (#233)
* 🔨 refactor register ops

* 🔨 reorder and register for ANE

* 🔨 refactor

* 🔨 conflicts

* 🔨 minor fix

* ane fix

* extra reshape weird
2021-01-02 07:47:16 -08:00
Marcel Bischoff
42b4761025 transformer >99.98% test accuracy in ~30s (#230)
* transformer

* BS might divide len(Y_test)

* output when accuracy is high

* more readable

* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
Liam
ebd72ff437 Test split (#231)
* Split tests

Split tests into "Test CPU" and "Test GPU".

Add test flag "TEST_DEVICES" which is a comma separated list of devices:
CPU,GPU,ANE

* Run tests based on provided TEST_DEVICES flag

By default will run all "CPU,GPU,ANE"

* fix bad quote

* Revert changes and use GPU=1

This is done by setting the default Tensor Device to Device.GPU if GPU=1 is set.

Run GPU tests: GPU=1 pytest -s -v
2021-01-01 09:19:03 -05:00
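The GPU=1 flag keeps the test suite itself device-agnostic: an environment variable picks the default Tensor device instead of maintaining a separate test matrix. A minimal sketch of that pattern, with an illustrative Device enum (names are assumptions, not the actual tinygrad API):

```python
import os
from enum import Enum

class Device(Enum):
    CPU = 0
    GPU = 1

# pick the default device from the environment, e.g. GPU=1 pytest -s -v
DEFAULT_DEVICE = Device.GPU if os.environ.get("GPU", "0") == "1" else Device.CPU

print(f"running tests on {DEFAULT_DEVICE}")
```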
George Hotz
4a7cf2e420 more reordering 2020-12-31 09:58:02 -05:00
George Hotz
92abe43683 reduce before binary because of unbroadcasting 2020-12-31 09:49:52 -05:00
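Reduce ops have to exist before the binary ops because the binary ops' backward passes need them: a tensor that was broadcast in the forward pass must have its gradient summed back down to the original shape. A small NumPy sketch of that unbroadcast step (an illustrative helper, not tinygrad's actual function):

```python
import numpy as np

def unbroadcast(grad, shape):
    # sum a broadcast gradient back down to `shape`
    # broadcasting forward means summing backward, hence the reduce
    while grad.ndim > len(shape):
        grad = grad.sum(axis=0)          # drop leading axes broadcasting added
    for i, s in enumerate(shape):
        if s == 1:
            grad = grad.sum(axis=i, keepdims=True)  # collapse size-1 axes
    return grad

g = np.ones((4, 3))             # gradient of a broadcasted (4, 3) result
print(unbroadcast(g, (1, 3)))   # [[4. 4. 4.]]
print(unbroadcast(g, (4, 1)))   # [[3.] [3.] [3.] [3.]]
```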
George Hotz
4291002881 reorder GPU ops 2020-12-31 09:46:39 -05:00
George Hotz
de7fe085de no read out of bounds 2020-12-31 09:41:36 -05:00
George Hotz
1fb5fcafce GPU slice should fix tests 2020-12-31 09:37:03 -05:00
Liam
e972a45456 Dynamically register ops to Tensor (#232)
* Dynamically register ops to Tensor

This saves lines and reduces repetition.

* ffs spacing

you don't pay me enough!
2020-12-31 09:10:19 -05:00
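Dynamic registration replaces one hand-written wrapper method per op with a single loop that attaches each op to Tensor. A hedged sketch of the pattern with made-up Add/Mul ops (not the actual tinygrad register code):

```python
class Tensor:
    def __init__(self, data):
        self.data = data

class Add:
    @staticmethod
    def forward(x, y): return x + y

class Mul:
    @staticmethod
    def forward(x, y): return x * y

def register(name, fxn):
    # attach a Tensor method that runs fxn.forward on the raw data
    def op(self, *others):
        return Tensor(fxn.forward(self.data, *[o.data for o in others]))
    setattr(Tensor, name, op)

# one loop instead of one wrapper method per op
for name, fxn in [("add", Add), ("mul", Mul)]:
    register(name, fxn)

a, b = Tensor(2.0), Tensor(3.0)
print(a.add(b).data, a.mul(b).data)  # 5.0 6.0
```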
Marcel Bischoff
e2f833f58f max to behave on ties like torch (#229)
* checkpoint

* fixing pow

* undo pow

* backward max on GPU and CPU rewrite

* indentation

* changing seed for curiosity

* max replaced equality

* undo seed

* rebase

* fixed tests

* merge error
2020-12-30 18:52:50 -05:00
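The tie behaviour only matters in max's backward pass, where the incoming gradient has to be routed back to whichever input element(s) produced the maximum. An equality mask is the usual way to do that ("max replaced equality" above); splitting the gradient evenly across ties is one convention, shown here as an illustrative NumPy sketch rather than the exact tinygrad or torch rule:

```python
import numpy as np

def max_backward(x, grad_out):
    # route the gradient to every position equal to the max,
    # splitting it evenly when there are ties
    mask = (x == x.max()).astype(x.dtype)
    return mask * grad_out / mask.sum()

x = np.array([1.0, 3.0, 3.0, 2.0])
print(max_backward(x, grad_out=1.0))  # [0.  0.5 0.5 0. ]
```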
George Hotz
30f8132646 reorder ops in ops cpu 2020-12-30 11:00:01 -05:00
George Hotz
e5b2803b5d ops in readme 2020-12-30 10:48:55 -05:00
George Hotz
2d44bf7f1a Dot -> Matmul 2020-12-30 10:41:51 -05:00
George Hotz
10fc3ff5b9 cleaner syntax 2020-12-30 10:35:37 -05:00
George Hotz
fcfe3dae01 write slice for CPU 2020-12-30 10:32:53 -05:00
George Hotz
47504004fd ane ops 2020-12-29 18:00:53 -05:00
George Hotz
1f5c9618ef refactor in readme and issue #225 2020-12-29 17:30:04 -05:00
George Hotz
f9170505b3 if you like your transformers twice as slow, use the GPU 2020-12-29 17:14:23 -05:00
George Hotz
6a6a82e999 support multidot on GPU 2020-12-29 16:56:30 -05:00
George Hotz
27208d729b add GPU max thanks to marcelbischoff 2020-12-29 16:44:14 -05:00
George Hotz
4bbad11afe link to papers 2020-12-29 14:15:46 -05:00
George Hotz
3f8e137b6f extra/transformer 2020-12-29 14:14:00 -05:00
George Hotz
c4e7a1ae59 accessors are dumb 2020-12-29 14:10:26 -05:00
George Hotz
fb6aaefb9b save 2 lines 2020-12-29 14:02:50 -05:00
George Hotz
ea341c84fe logsoftmax good, div bad 2020-12-29 13:59:39 -05:00
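One reading of "logsoftmax good, div bad" is that logsoftmax composes from sub, exp, sum and log and so avoids a division op entirely. A small NumPy sketch, with the usual max-subtraction for numerical stability:

```python
import numpy as np

def logsoftmax(x, axis=-1):
    # logsoftmax(x) = x - log(sum(exp(x)))
    # subtracting the max first keeps exp() from overflowing
    m = x.max(axis=axis, keepdims=True)
    return x - (m + np.log(np.exp(x - m).sum(axis=axis, keepdims=True)))

x = np.array([[1.0, 2.0, 3.0]])
print(logsoftmax(x))                 # ~[[-2.408 -1.408 -0.408]]
print(np.exp(logsoftmax(x)).sum())   # ~1.0, i.e. softmax sums to one
```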
George Hotz
f18801c7db simple pool. swimming is very easy now 2020-12-29 13:48:50 -05:00
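A pool really can be "simple" when the spatial dims divide evenly by the kernel size: reshape so each pooling window gets its own axes, then reduce over them. A hedged NumPy sketch of 2x2 max pooling in that style (not necessarily the exact tinygrad code):

```python
import numpy as np

def maxpool2x2(x):
    # x has shape (N, C, H, W) with H and W divisible by 2
    N, C, H, W = x.shape
    # expose each 2x2 window as its own pair of axes, then reduce over them
    return x.reshape(N, C, H // 2, 2, W // 2, 2).max(axis=(3, 5))

x = np.arange(16, dtype=np.float32).reshape(1, 1, 4, 4)
print(maxpool2x2(x)[0, 0])
# [[ 5.  7.]
#  [13. 15.]]
```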
George Hotz
8f9232d59b readme 2020-12-29 13:40:34 -05:00
George Hotz
837aaacfbf Unpad2D on GPU: 2020-12-29 13:16:14 -05:00
George Hotz
02655c07d5 break maxpool2d on GPU 2020-12-29 13:05:57 -05:00
George Hotz
061e37de39 touchups 2020-12-29 12:41:21 -05:00
George Hotz
a2e6562330 fix max op, less lines 2020-12-29 10:47:04 -05:00
Marcel Bischoff
dc8fa7999c Transpose on GPU (#221)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval

* axis=-1

* transpose

* test for permutation using torch.movedims

* another test

* line
2020-12-29 10:40:11 -05:00
George Hotz
36579f66bf max op 2020-12-28 23:54:52 -05:00
George Hotz
bcb3ceeca3 set training in functions 2020-12-28 22:45:46 -05:00
George Hotz
51bf164b72 dropout, training 2020-12-28 22:12:23 -05:00
George Hotz
7b8fee038d it works! forgot the sqrt 2020-12-28 16:23:52 -05:00
George Hotz
1faf05ef67 ahh, it's better if i don't train the embedding 2020-12-28 16:07:02 -05:00
George Hotz
c3832e1bde hmm, fix layernorm to not be batchnorm and it breaks 2020-12-28 13:06:21 -05:00
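The layernorm/batchnorm confusion in the last message comes down to the normalization axis: layernorm normalizes each sample over its feature axis, batchnorm normalizes each feature over the batch axis (and "forgot the sqrt" above is the sqrt in the denominator). An illustrative NumPy sketch of the distinction, not the model code:

```python
import numpy as np

def layernorm(x, eps=1e-5):
    # normalize each row (sample) over its features: axis=-1
    mu = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)   # the easy-to-forget sqrt

def batchnorm(x, eps=1e-5):
    # normalize each column (feature) over the batch: axis=0
    mu = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

x = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])
print(layernorm(x))   # each row has mean ~0, std ~1
print(batchnorm(x))   # each column has mean ~0, std ~1
```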