tinygrad

mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

Author	SHA1	Message	Date
Friedrich Carl Eichenroth	6f2b3755ca	set axis default to 0 (#854 )	2023-05-29 13:15:28 -07:00
Friedrich Carl Eichenroth	3b158f7a5f	fix onnx versions greater or equal 10 (#853 )	2023-05-29 13:04:06 -07:00
Ubaidullah Khan	0e89c3f456	zeros_like use dtype if specified else default to tensor’s dtype (#848 )	2023-05-29 11:38:34 -07:00
Diogo	1a5d72f812	Onnx ops And, Or, Xor, Not (#847 ) * onnx and, or, xor, not * added bool type to llvm and clang * removed float conversion * switched where op to use tensor func	2023-05-29 11:09:20 -07:00
SnakeOnex	844e6d0753	conv1d & conv3d onnx tests (#835 ) * conv1d onnx * [Work in progress] conv1d + enforcing full padding tuple length * make ONNX padding reorder not hardcoded, works for 1D and 3D convs now * conv2d interprets padding based on the input tensor dimensions	2023-05-29 10:16:45 -07:00
George Hotz	ae204e40c8	move line counter to python	2023-05-29 09:21:40 -07:00
wozeparrot	8c6085a715	Rewrite Adam/W as functions of LAMB (#839 ) * feat: rewrite adam/w as functions of lamb * feat: use adam style adam update + comment * fix: nvm need to use the lamb adam update	2023-05-29 09:21:35 -07:00
George Hotz	ddc9dafe62	tighten up the kernel count tests	2023-05-29 08:48:54 -07:00
vaibhav	9a18dade4b	Fix ResNet Kernel Fusion (from bounty) (#825 ) * Fix ResNet * Revert resnet to orignal * Change fix merge elementwise_op with reduce_op	2023-05-29 08:46:36 -07:00
JudeDavis1	f3168ee69b	default transformer dropout to 0 (#828 ) * default mha dropout to 0 * simplify assert * reform * default to 0.1	2023-05-29 08:06:16 -07:00
Ubaidullah Khan	c825cc4774	use tensor dtype for zeros_like() (#842 ) * use tensor dtype for zeros_like() * add tests for zeros_like dtype * iterate over dtypes * remove space * remove print * fix test, iterate over a list	2023-05-29 08:05:50 -07:00
crthilakraj	7925fa58d9	Fix cuda (#836 ) * disabled float4 ALU ops for CUDA, small fix to add half_prekernel before kernel_prefix * added supports_float4_alu option, and disabled for ops_cuda	2023-05-29 07:59:36 -07:00
Marcello Fuschi	6ea5df19b2	Fix conv_transpose2d asymmetric padding (#840 )	2023-05-29 07:57:06 -07:00
wozeparrot	2fd2fb6380	int8/uint8 support (#837 ) * feat: int8 support * feat: uint8 support * feat: int8 tests * fix: fix uint8 on clang * feat: test casting between int8/uint8/float16/float32 * clean: way cleaner dtype tests * feat: preprocess_imagenet using the correct dtype * feat: add test for overflow between uint8 and int8	2023-05-28 23:15:06 -07:00
Ali Benkassou	2939e40b98	Count Python tokens (#817 ) * Count Python tokens * Minor change * Dumb syntax * 2 spaces indentations + descending order	2023-05-28 21:26:49 -07:00
Jacky Lee	5d212864b5	Add MLPerf UNet3D model (#775 ) * Add ResNet inference test and cannon * Test with ResNet50 * test_car works with resnet fix * Add KiTS19 dataset * KiTS19: Implement iterate * No batch load for this dataset * Save results on iterate * Implement dice score * Add data prep and eval functions * Resolve shape issue * Conversion works but wrong values * Segfaults when load_from_pretrained is called * Fix segfault and assign properly * Final result generated, though very slow * Store and load final result to save time * Fix typo in finalize * Score computes * More bug fixes, dice score is very low * Working broken code * Assign output values to result * Getting a much higher score now * Fix dataset preprocessing * Mean DICE score of 88.5 * Ugh, typo * Attempt to reimplement model * Rename layers * Tiny model works, kinda * Accuracy? gone * Implement InstanceNorm and match torch * Test instance norm 2d and 3d * Combined input block with downsample block * Tiny model works, support strided convtranspose * Commands to download dataset * Clean up a bit * unet3d_v2 -> unet3d * Remove duplicated code * Oops, put tests back	2023-05-28 20:38:19 -07:00
Sohaib	65d09031f2	add retinanet with resnet backbone (#813 ) * add retinanet with resnet backbone * adds resnext to support loading retinanet pretrained on openimages * object detection post processing with numpy * data is downloaded and converted to coco format with fiftyone * data loading and mAP evaluation with pycocotools * remove fiftyone dep * * eval freq * fix model timing * del jit for last batch * faster accumulate	2023-05-28 20:20:16 -07:00
George Hotz	46327f7420	bugfix for stable diffusion	2023-05-29 00:03:09 +00:00
George Hotz	59f9bcd4a4	Disktensors! (#819 ) * make empty a real thing * start ops_disk * disk tensor works * interpreted cleanup * slice write to disk * preprocess imagenet * fix custom function	2023-05-28 15:40:37 -07:00
Marcello Fuschi	6d49925a26	Add max_pool2d dilation (#833 )	2023-05-28 15:16:48 -07:00
wozeparrot	7460bd9b02	Add LAMB optimizer (#821 ) * feat: initial lamb optimizer * feat: corrently match tf impl and add test	2023-05-28 15:09:05 -07:00
SnakeOnex	1b337b5533	ONNX tests exclude all unsupported filetype tests (#832 )	2023-05-28 13:31:20 -07:00
cheeetoo	21d27d31a9	Fix a couple pad tests (#827 ) * fix pad bug * float type hint for value * convert pads to list * update Pad type signature * Change \| to Union since not supported in < python 3.10	2023-05-28 12:06:46 -07:00
Kirill R	081b3ab639	Tensor.where method (#830 )	2023-05-28 10:20:33 -07:00
George Hotz	eea3542975	remove other install method	2023-05-28 08:36:21 -07:00
Kirill R	0c0c7380af	Add Tensor.where (#826 ) * Add Tensor.where * fix linter * fix mypy	2023-05-28 08:04:56 -07:00
kposborne2	2163a1b049	Add shrink step to fix strided conv_transpose2d, and add to nn (#823 ) * implement conv transpose 2d * don't inherit, remove old assert --------- Co-authored-by: Kyle <kposborne@gmail.com>	2023-05-28 07:52:45 -07:00
crthilakraj	01daa74f9b	fixed TestCustomFunction (#820 )	2023-05-27 18:27:46 -07:00
wozeparrot	67de3aa1de	Add mlperf bert model (#803 ) * feat: add mlperf bert model * feat: switch to nn.Embedding * clean+fix: fix formatting * feat: add simple downloader * feat: metrics * feat: don't actually need exact match * feat: doing a run * feat: set eps on the layernorms * clean+fix: cleaner impl + hopefully fixed * feat: move dataset initialization into iterate * feat: move tokenizer out of iterate * clean+fix: cleaner + working * clean: cleanup * fix: fix metrics * feat: need to use original bert gelu + download vocab * feat: make directory if it doesn't exist yet * feat: jit go brrr	2023-05-27 14:53:32 -07:00
George Hotz	1e56aced05	add changeable DEBUG (#816 )	2023-05-27 13:28:25 -07:00
George Hotz	a3feee29c5	make tests faster + add onnx (#815 ) * search one dir, disable slow * onnx tests * fast rnnt test	2023-05-27 08:53:32 -07:00
Mattis Megevand	606b841d3f	LR Schedulers (#755 ) * lr schedulers + test * lr scheduler test moved + integration test * integration test for all lr scheduler * lr scheduler test now deterministic * changed optimizer + parameters for lr sched test	2023-05-27 07:47:49 -07:00
George Hotz	87fa5af70a	ptx example	2023-05-26 19:28:51 -07:00
George Hotz	fd296ce444	have kernels wait on DEBUG=1	2023-05-26 22:51:16 +00:00
Rayan Hatout	8b2c2d6896	Optimizations in `symbolic.py` (#796 ) * optimizations in symbolic.py * fix infinite recursion when expanding sums * add test case to make sure NumNodes are hoisted up in cases where MulNodes cancel eachother out	2023-05-26 12:59:53 -07:00
George Hotz	26014a0fa1	add convtranspose (#809 ) * add convtranspose * onnx convtranspose	2023-05-26 12:35:03 -07:00
symlon	04284414db	Batchnorm2d fixed running_var (#807 ) * BatchNorm2d match pytorch * removed comment * Batchnorm2d test multiple sizes	2023-05-26 12:32:49 -07:00
George Hotz	65d63f5b40	support folding multiple of 4 into float4 (#808 )	2023-05-26 12:17:48 -07:00
Aneesh Durg	6d4a728f62	Don't collapse dimensions during batched matmul (FIX #799 ) (#800 ) * Don't collapse dimensions during batched matmul (FIX #799) * Avoid reshaping tensor to the same shape * Skip batched matrix multiply when IMAGE is set	2023-05-26 11:15:34 -07:00
George Hotz	803587b8b4	update readme	2023-05-26 06:11:05 +00:00
wozeparrot	7351eb4b61	feat: put temperary file in the same directory as the destination file (#805 )	2023-05-25 20:46:02 -07:00
George Hotz	3ddcb5c36f	Half4 load (#804 ) * support half4 load * cast to float4 * dead assert	2023-05-25 20:21:15 -07:00
George Hotz	ee2c8423c7	disable that test on LLVM. i have to stop pushing to master	2023-05-26 03:11:03 +00:00
George Hotz	ea3194f68e	test touchups	2023-05-26 02:39:42 +00:00
wozeparrot	0dc333cfab	Promote Embedding to `nn` (#798 ) * feat: promote Embedding to nn * fix: fix failing test * feat: add test with jit * feat: rewrite embedding to no longer need stacked for loops * clean+fix: don't know how that happened	2023-05-25 18:39:45 -07:00
George Hotz	f4f23dc9a3	version bump v0.6.0	2023-05-26 00:51:25 +00:00
George Hotz	faf80418b7	pyopencl by default since GPU is default (#802 )	2023-05-25 17:48:18 -07:00
wozeparrot	fca5028d78	feat: ability to exclude cl devices from being used (#801 )	2023-05-25 17:31:29 -07:00
Benedikt	3c465470f2	pip installation one liner (#793 )	2023-05-25 16:43:42 -07:00
George Hotz	a968c4c3a4	Cleanup mlperf (#797 ) * improve factorization * cleanups	2023-05-25 11:36:43 -07:00

1 2 3 4 5 ...

1961 Commits