* added metal int64 and some simple tests
* removed bool return type def
* typo in test
* also missing in clang and gpu runtimes
* switched order for opencl
* increased atol and removed new line in kernel prefix
* added kaiming_uniform init for conv2d and linear layers
* fix: set getattr
* up
* fix: set getattr
* fix comments
* better does not mean it is good
* more nonlinearities
* added test
checks the distribution of the default relu option
* prettier
* fix kernel size
* edit distribution of returned tensor
* complete tests and fix fan_mode
* added higher dim test
* prettier test
* fix silly blank
* just leaky_relu mode
* default fan in and leaky relu
* update params
* fix test
* shorter
* generalize Tensor.uniform and adjust kaiming init
- added low and high parameters to the Tensor.uniform function so it can sample from a specific range (default is 0 to 1)
- adjusted return line of kaiming_uniform
* range from -1 to 1
* delete comment
* adjusted test_uniform
* fixed
* delete comment
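The generalized `Tensor.uniform` and the kaiming init built on it can be sketched in plain numpy. The names and the leaky_relu gain formula follow the commits above; this is an illustrative sketch, not tinygrad's actual implementation:

```python
import math
import numpy as np

def uniform(*shape, low=0.0, high=1.0):
    # sample uniformly in [low, high); the default range stays 0 to 1
    return np.random.rand(*shape) * (high - low) + low

def kaiming_uniform(*shape, a=0.01):
    # fan_in mode with the leaky_relu gain, as the commits settled on
    fan_in = math.prod(shape[1:])
    gain = math.sqrt(2.0 / (1 + a ** 2))
    bound = math.sqrt(3.0) * gain / math.sqrt(fan_in)
    return uniform(*shape, low=-bound, high=bound)

w = kaiming_uniform(64, 32, 3, 3)  # conv2d-style weight
```

With `low`/`high` on `uniform`, the kaiming init reduces to a one-line range computation plus a call to `uniform`, which is the "adjusted return line" the commit mentions.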
* use tensor dtype for zeros_like()
* add tests for zeros_like dtype
* iterate over dtypes
* remove space
* remove print
* fix test, iterate over a list
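The dtype-preserving behaviour those tests check can be illustrated with numpy's `zeros_like`, iterating over a list of dtypes as the final test commit describes (a sketch of the property, not the tinygrad test itself):

```python
import numpy as np

# zeros_like should inherit the source tensor's dtype, not default to float
results = {}
for dt in [np.int8, np.uint8, np.float16, np.float32, np.int64]:
    src = np.ones((2, 3), dtype=dt)
    z = np.zeros_like(src)
    results[np.dtype(dt).name] = (z.dtype == dt) and not z.any()
```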
* feat: int8 support
* feat: uint8 support
* feat: int8 tests
* fix: fix uint8 on clang
* feat: test casting between int8/uint8/float16/float32
* clean: way cleaner dtype tests
* feat: preprocess_imagenet using the correct dtype
* feat: add test for overflow between uint8 and int8
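The uint8/int8 overflow case the last test covers comes down to two's-complement wraparound on cast; numpy shows the expected values (illustrative, independent of which tinygrad backend runs the cast):

```python
import numpy as np

x = np.array([0, 127, 128, 200, 255], dtype=np.uint8)
y = x.astype(np.int8)  # values >= 128 wrap around: v - 256
# y is [0, 127, -128, -56, -1]
```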
* Add ResNet inference test and cannon
* Test with ResNet50
* test_car works with resnet fix
* Add KiTS19 dataset
* KiTS19: Implement iterate
* No batch load for this dataset
* Save results on iterate
* Implement dice score
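The dice score implemented here is the standard overlap metric for segmentation; a minimal numpy sketch (the actual KiTS19 eval may differ in smoothing and per-class handling):

```python
import numpy as np

def dice_score(pred, target, eps=1e-6):
    # Dice = 2|A ∩ B| / (|A| + |B|), with eps guarding empty masks
    inter = (pred * target).sum()
    return (2.0 * inter + eps) / (pred.sum() + target.sum() + eps)

p = np.array([1, 1, 0, 0], dtype=np.float32)
t = np.array([1, 0, 0, 0], dtype=np.float32)
d = dice_score(p, t)  # 2*1 / (2+1)
```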
* Add data prep and eval functions
* Resolve shape issue
* Conversion works but wrong values
* Segfaults when load_from_pretrained is called
* Fix segfault and assign properly
* Final result generated, though very slow
* Store and load final result to save time
* Fix typo in finalize
* Score computes
* More bug fixes, dice score is very low
* Working broken code
* Assign output values to result
* Getting a much higher score now
* Fix dataset preprocessing
* Mean DICE score of 88.5
* Ugh, typo
* Attempt to reimplement model
* Rename layers
* Tiny model works, kinda
* Accuracy? gone
* Implement InstanceNorm and match torch
* Test instance norm 2d and 3d
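InstanceNorm normalizes each (sample, channel) slice over its spatial dims, which is why one formula covers both the 2d and 3d tests; a numpy sketch of the torch-matching behaviour (affine parameters omitted):

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    # x: (N, C, *spatial); reduce over everything after the channel axis
    axes = tuple(range(2, x.ndim))
    mean = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

y2d = instance_norm(np.random.randn(2, 3, 8, 8))
y3d = instance_norm(np.random.randn(2, 3, 4, 4, 4))
```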
* Combined input block with downsample block
* Tiny model works, support strided convtranspose
* Commands to download dataset
* Clean up a bit
* unet3d_v2 -> unet3d
* Remove duplicated code
* Oops, put tests back
* lr schedulers + test
* lr scheduler test moved + integration test
* integration test for all lr schedulers
* lr scheduler test now deterministic
* changed optimizer + parameters for lr sched test
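The optimizer/scheduler interplay those tests pin down can be sketched with a toy step-decay scheduler; the class name and interface here are illustrative, not tinygrad's actual API:

```python
class StepLR:
    # decay the learning rate by gamma every step_size epochs
    def __init__(self, optimizer, step_size, gamma=0.1):
        self.optimizer, self.step_size, self.gamma = optimizer, step_size, gamma
        self.epochs = 0

    def step(self):
        self.epochs += 1
        if self.epochs % self.step_size == 0:
            self.optimizer["lr"] *= self.gamma

opt = {"lr": 0.1}  # stand-in for an optimizer exposing its learning rate
sched = StepLR(opt, step_size=5, gamma=0.1)
for _ in range(10):
    sched.step()
# after 10 epochs the lr has decayed twice: 0.1 * 0.1 * 0.1
```

Such a scheduler is deterministic given the epoch count, which is what makes the "now deterministic" test above possible.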
* optimizations in symbolic.py
* fix infinite recursion when expanding sums
* add test case to make sure NumNodes are hoisted up in cases where MulNodes cancel each other out
* Don't collapse dimensions during batched matmul (FIX#799)
* Avoid reshaping tensor to the same shape
* Skip batched matrix multiply when IMAGE is set
* feat: promote Embedding to nn
* fix: fix failing test
* feat: add test with jit
* feat: rewrite embedding to no longer need stacked for loops
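Rewriting Embedding without stacked for loops usually means expressing the lookup as a one-hot matmul, which maps directly onto tensor ops; a numpy sketch of the idea (not the exact tinygrad code):

```python
import numpy as np

def embedding(idx, weight):
    # (B, T) int indices -> (B, T, E) via one-hot @ weight; no Python loops
    vocab_size = weight.shape[0]
    onehot = (idx[..., None] == np.arange(vocab_size)).astype(weight.dtype)
    return onehot @ weight

W = np.arange(20, dtype=np.float32).reshape(5, 4)  # vocab 5, embed dim 4
out = embedding(np.array([[1, 3]]), W)             # rows W[1] and W[3]
```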
* clean+fix: don't know how that happened
* feat: initial rnn-t
* feat: working with BS>1
* feat: add lstm test
* feat: test passing hidden
* clean: cleanup
* feat: specify start
* feat: way faster lstm & model
* fix: default batch size
* feat: optimization
* fix: fix metrics
* fix: fix feature splicing
* feat: cleaner stacktime
* clean: remove unused import
* clean: remove extra prints
* fix: fix tests and make llvm happy
* feat: have the librispeech dataset in its own dir
* clean: unused variable
* feat: no longer need numpy for the embedding + slightly more memory efficient lstm
* fix: forgot to remove something that broke tests
* feat: use relative paths
* feat: even faster
* feat: remove pointless transposes in StackTime
* fix: correct forward
* feat: switch to soundfile for loading and fix some leaks
* feat: add comment about initial dataset setup
* feat: jit more things
* feat: default batch size back to 1
larger than 1 is broken again :(
and even in the reference implementation it gives worse results
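StackTime in RNN-T stacks consecutive frames along the feature axis while subsampling time; done with a single reshape/transpose, it needs none of the extra transposes the commit removes. A hypothetical numpy sketch (tinygrad's tensor layout may differ):

```python
import numpy as np

def stack_time(x, factor=2):
    # x: (T, B, F) -> (T // factor, B, F * factor)
    T, B, F = x.shape
    T2 = T // factor
    x = x[:T2 * factor].reshape(T2, factor, B, F)
    return x.transpose(0, 2, 1, 3).reshape(T2, B, F * factor)

x = np.arange(8, dtype=np.float32).reshape(4, 1, 2)
y = stack_time(x)  # frames (0,1) and (2,3) concatenated per output step
```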
* e2e testing
* min failure
* no affine on bn, still fails
* why did i think i could detach that?
* allow more kernels for bn
* some test issue i don't understand
* no zeroview start
* closer
* stride mask
* st tests pass, delete ZeroView
* byebye zv
* close to working
* not contiguous with mask
* subtract, don't add
* mask on view
* ugh, that shouldn't have been in there
* shape merge
* bugfixes
* fuzzer + 4 fuzzer failures
* fuzzer for symbolic
* more fuzzing and nothing
* that fuzzer doesn't hit either
* fixes padding...ugh
* no more offsets
* working
* rewrite load and store
* all checks
* fix idxs
* progress
* bugfix
* float4_axis
* works
* cleanups
* complex valids_okay
* make maximum split grad
* added test for maximum split grad when equal
* minor expr simplification
* (2-eq)/2 only once
* update test bc one more sum output child stays
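Splitting the gradient of maximum at ties means each input receives half the incoming gradient where the operands are equal; the `(2 - eq) / 2` expression from the last commits computes that weight in one step. A numpy sketch of the rule (not tinygrad's autograd code):

```python
import numpy as np

def maximum_grad(x, y, g):
    # where x == y the gradient is split evenly between both inputs
    eq = (x == y).astype(g.dtype)
    gx = g * (x >= y) * (2 - eq) / 2  # 1 if x > y, 0.5 at ties, else 0
    gy = g * (y >= x) * (2 - eq) / 2
    return gx, gy

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 2.0, 2.0])
gx, gy = maximum_grad(x, y, np.ones(3))
# gx is [0, 0.5, 1], gy is [1, 0.5, 0]; they always sum to the incoming grad
```

Computing `(2 - eq) / 2` once and reusing it for both inputs is the "only once" simplification the commit refers to.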