tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-27 15:58:10 -05:00

Author	SHA1	Message	Date
George Hotz	8c475ea86a	relax atol, merge_view	2023-03-03 07:48:44 -08:00
George Hotz	9d539b8ebb	more intuitive output shape from _pool	2023-02-28 11:41:48 -08:00
George Hotz	758515dcc0	conv2d is an hlop (#589 ) * conv2d is an hlop * shorter conv * KOPT=-1 * alt imp * MULACC * smarter mulacc * pop conv * 7x7 -> 5x5 * didn't fix, that's not going to work * this is faster and matches old behavior * oh, non lazy just won't work with mulacc * mulacc in torch * bool types were creeping in * optimizer is actually better with hlop conv * fix pushing permutes issue * refactor einsum_mulacc * fix up readme * update readme * _image_conv2d * fix bias addition location * pushing permutes gets back to 200 kernels * conv cleanup * disable hlop conv * don't hide that in helpers	2023-02-23 17:52:31 -08:00
George Hotz	628ce067a1	add tests to mypy	2023-02-22 07:07:38 -08:00
Kirill	7944cfdadc	Remove Tensor.data (#565 )	2023-02-18 16:36:12 -08:00
Jacky Lee	e172f0087a	BatchNorm2D -> BatchNorm2d (#558 ) * BatchNorm2D -> BatchNorm2d * Fix typo	2023-02-16 12:31:49 -08:00
Jacky Lee	486f023e81	Rename Normalize and move to nn (#513 ) * Rename Normalize and move to nn * Match PyTorch for dim>1	2023-02-01 11:55:03 -08:00
George Hotz	487685919b	Revert "Rename Normalize and move to nn (#415 )" (#474 ) This reverts commit `d768acb6a9`.	2023-01-25 07:50:04 -08:00
Jacky Lee	d768acb6a9	Rename Normalize and move to nn (#415 ) * Rename Normalize and move to nn * Fix comparison to None error * Add test for GroupNorm * Rename test case * Flip parameters to match PyTorch * Increase error tolerance * Fix elementwise_affine on channels * Match arguments with PyTorch * Initialize weight and bias only when affine is true * Is this it? * A bit cleaner * Handle case where weight or bias is None	2023-01-25 07:47:59 -08:00
George Hotz	392e57aea7	ugh, why did that fail	2022-10-01 13:38:43 -04:00
George Hotz	ecc1a0470d	add Linear to tinygrad.nn	2022-09-07 07:40:48 -07:00
George Hotz	cff297ef9d	w/e, that's a later prob	2022-07-17 12:32:50 -07:00
George Hotz	bf299802f8	fixup tests	2022-07-17 12:11:53 -07:00
George Hotz	8ba3d1f803	fix bn test, affine is True	2022-01-15 19:52:15 -08:00
George Hotz	bd21304e3c	linear takes in weight and bias	2021-11-30 00:38:47 -05:00
George Hotz	dca076dbf1	remove dumb nn ops	2021-11-29 18:05:31 -05:00
George Hotz	ce3d198bb7	less lines and fix default device	2021-11-27 11:18:49 -05:00
Guglielmo Camporese	2b7589db64	Added ResNet-{18, 34, 50, 101, 152} (#271 ) * added resnets * fix minor * fix minor * resnet in models * added resnet test * added resnet train test * added linear, conv2d nn tests * fix minor in extra/training * resnet in models * fix minor * fix tolerance for linear in nn test * fix eval, this causes cpu and gpu UT failing * revert transformer test * fix minor for CPU test * improved model get_params for sequential layer * fix minor for params counting * commented broken ops tests * improved train for resnet	2021-06-21 09:37:24 -07:00
Skosh	78aa147b39	[WIP] YOLO working on tinygrad! (#245 ) * Some progress on yolov3 * Removed some debugging comments… Also, the forward pass eats all RAM for some reason * forward pass almost runs * forward pass runs almost * forward pass runs, now we gotta load the weights * loading weights works * fetches config and weights * everything kind of works, postprocessing of output still needs to be implemented, temp_process_results kind of works, but its kind of terrible, and not how things should be done * some changes * fixed some bugs in the forward pass and load_weights function, now outputs more correct values, however some values are still loaded incorrectly * Something is wrong with the forward pass, Conv2d tests added * forward pass almost outputs correct values, gotta fix one more thign * yolo works * some final changes * reverting changes * removed dataloader * fixed some indentation * comment out failing test, somehow it fails CI even though it passes on my computer… * fixed wrong probabilities * added webcam option to YOLO, now just need to add bounding boxes and speed it up * some progress towards adding bounding boxes * trying to speed up yolo layer on GPU, still faster on CPU but with 30GB ram usage * Faster inference times, bounding boxes added correctly, webcam works, but is slow, and there is a memory leak when running on CPU... Also added tinygrads output on the classic dog image * removed some debugging print statements * updated result image * something weird is going on, mean op on GPU tensor randomly faults, copying a tensor from GPU->CPU takes 10+ seconds…	2021-04-25 18:06:52 -07:00
Liam	ebd72ff437	Test split (#231 ) * Split tests Split tests into "Test CPU" and "Test GPU". Add test flag "TEST_DEVICES" which is a comma separated list of devices: CPU,GPU,ANE * Run tests based on provided TEST_DEVICES flag By default will run all "CPU,GPU,ANE" * fix bad quote * Revert changes and use GPU=1 This is done through setting the default Tensor Device to Device.CPU of GPU=1 is set. Run GPU tests: GPU=1 pytest -s -v	2021-01-01 09:19:03 -05:00
George Hotz	2f1b2c0a3b	add transpose, start on transformer	2020-12-27 16:59:12 -05:00
Liam	bcf1518309	All devices are equal! (#196 ) * Update all devices to be tested ANE, CPU and OCL all now support all tests. However tests are not currently passing on GPU and I cannot test on CPU. Failing GPU test are not an issue caused by this update. Tests have not been passing due to a missing "six" required installation. OpenCL Tests have not been run since commit: `1a1c63a08b` devices have 3 types and are handle by a new DeviceTypes enum. (The goal is to revert to Tensor.<type>, but this current setup allows for keyword argument defaults: `device=DeviceType.CPU`) All references to Tensor.GPU/CPU/ANE as been converted to the corresponding `DeviceTypes` enum. Refactor of the conversion code to allow for any device to any device conversion. * Add six dependency in requirements.txt * Resolve failure to run tests Move six into gpu required installs. Remove six from standard installation. * Remove repeated data conversion * Refactor method names Also reduce code with .to and .to_ * Dynamic device handlers * Refactor DeviceTypes -> Device * Add mem copy profiling back * test_backward_pass_diamond_model passing * Resolve Sum issue on GPU * Revert batchnorm2d tests * Update README with upadated API * ANE testing with * Last minute line gains	2020-12-15 23:44:08 -08:00
George Hotz	ffb96b2d0b	batchnorm by marcelbischoff	2020-12-09 03:23:04 -08:00
George Hotz	17659f7dd7	gpu speedup, tests work on M1	2020-12-06 09:05:49 -08:00
George Hotz	c14473f87d	unit test for batchnorm2d	2020-10-30 08:19:58 -07:00

25 Commits