Commit Graph

1890 Commits

Author SHA1 Message Date
SnakeOnex
1b337b5533 ONNX tests exclude all unsupported filetype tests (#832) 2023-05-28 13:31:20 -07:00
cheeetoo
21d27d31a9 Fix a couple pad tests (#827)
* fix pad bug

* float type hint for value

* convert pads to list

* update Pad type signature

* Change | to Union since | is not supported in Python < 3.10
2023-05-28 12:06:46 -07:00
Kirill R
081b3ab639 Tensor.where method (#830) 2023-05-28 10:20:33 -07:00
George Hotz
eea3542975 remove other install method 2023-05-28 08:36:21 -07:00
Kirill R
0c0c7380af Add Tensor.where (#826)
* Add Tensor.where

* fix linter

* fix mypy
2023-05-28 08:04:56 -07:00
kposborne2
2163a1b049 Add shrink step to fix strided conv_transpose2d, and add to nn (#823)
* implement conv transpose 2d

* don't inherit, remove old assert

---------

Co-authored-by: Kyle <kposborne@gmail.com>
2023-05-28 07:52:45 -07:00
crthilakraj
01daa74f9b fixed TestCustomFunction (#820) 2023-05-27 18:27:46 -07:00
wozeparrot
67de3aa1de Add mlperf bert model (#803)
* feat: add mlperf bert model

* feat: switch to nn.Embedding

* clean+fix: fix formatting

* feat: add simple downloader

* feat: metrics

* feat: don't actually need exact match

* feat: doing a run

* feat: set eps on the layernorms

* clean+fix: cleaner impl + hopefully fixed

* feat: move dataset initialization into iterate

* feat: move tokenizer out of iterate

* clean+fix: cleaner + working

* clean: cleanup

* fix: fix metrics

* feat: need to use original bert gelu + download vocab

* feat: make directory if it doesn't exist yet

* feat: jit go brrr
2023-05-27 14:53:32 -07:00
George Hotz
1e56aced05 add changeable DEBUG (#816) 2023-05-27 13:28:25 -07:00
George Hotz
a3feee29c5 make tests faster + add onnx (#815)
* search one dir, disable slow

* onnx tests

* fast rnnt test
2023-05-27 08:53:32 -07:00
Mattis Megevand
606b841d3f LR Schedulers (#755)
* lr schedulers + test

* lr scheduler test moved + integration test

* integration test for all lr scheduler

* lr scheduler test now deterministic

* changed optimizer + parameters for lr sched test
2023-05-27 07:47:49 -07:00
George Hotz
87fa5af70a ptx example 2023-05-26 19:28:51 -07:00
George Hotz
fd296ce444 have kernels wait on DEBUG=1 2023-05-26 22:51:16 +00:00
Rayan Hatout
8b2c2d6896 Optimizations in symbolic.py (#796)
* optimizations in symbolic.py

* fix infinite recursion when expanding sums

* add test case to make sure NumNodes are hoisted up in cases where MulNodes cancel each other out
2023-05-26 12:59:53 -07:00
George Hotz
26014a0fa1 add convtranspose (#809)
* add convtranspose

* onnx convtranspose
2023-05-26 12:35:03 -07:00
symlon
04284414db Batchnorm2d fixed running_var (#807)
* BatchNorm2d match pytorch

* removed comment

* Batchnorm2d test multiple sizes
2023-05-26 12:32:49 -07:00
George Hotz
65d63f5b40 support folding multiple of 4 into float4 (#808) 2023-05-26 12:17:48 -07:00
Aneesh Durg
6d4a728f62 Don't collapse dimensions during batched matmul (FIX #799) (#800)
* Don't collapse dimensions during batched matmul (FIX #799)

* Avoid reshaping tensor to the same shape

* Skip batched matrix multiply when IMAGE is set
2023-05-26 11:15:34 -07:00
George Hotz
803587b8b4 update readme 2023-05-26 06:11:05 +00:00
wozeparrot
7351eb4b61 feat: put temporary file in the same directory as the destination file (#805) 2023-05-25 20:46:02 -07:00
George Hotz
3ddcb5c36f Half4 load (#804)
* support half4 load

* cast to float4

* dead assert
2023-05-25 20:21:15 -07:00
George Hotz
ee2c8423c7 disable that test on LLVM. i have to stop pushing to master 2023-05-26 03:11:03 +00:00
George Hotz
ea3194f68e test touchups 2023-05-26 02:39:42 +00:00
wozeparrot
0dc333cfab Promote Embedding to nn (#798)
* feat: promote Embedding to nn

* fix: fix failing test

* feat: add test with jit

* feat: rewrite embedding to no longer need stacked for loops

* clean+fix: don't know how that happened
2023-05-25 18:39:45 -07:00
George Hotz
f4f23dc9a3 version bump v0.6.0 2023-05-26 00:51:25 +00:00
George Hotz
faf80418b7 pyopencl by default since GPU is default (#802) 2023-05-25 17:48:18 -07:00
wozeparrot
fca5028d78 feat: ability to exclude cl devices from being used (#801) 2023-05-25 17:31:29 -07:00
Benedikt
3c465470f2 pip installation one liner (#793) 2023-05-25 16:43:42 -07:00
George Hotz
a968c4c3a4 Cleanup mlperf (#797)
* improve factorization

* cleanups
2023-05-25 11:36:43 -07:00
Diogo
c19ef0fcce Add sin/cos/tan (#794)
* added sin/cos/tan

* fix lint

* added onnx ops support
2023-05-25 09:04:56 -07:00
wozeparrot
01ae45a43c Add mlperf RNN-T model (#782)
* feat: initial rnn-t

* feat: working with BS>1

* feat: add lstm test

* feat: test passing hidden

* clean: cleanup

* feat: specify start

* feat: way faster lstm & model

* fix: default batch size

* feat: optimization

* fix: fix metrics

* fix: fix feature splicing

* feat: cleaner stacktime

* clean: remove unused import

* clean: remove extra prints

* fix: fix tests and happy llvm

* feat: have the librispeech dataset in its own dir

* clean: unused variable

* feat: no longer need numpy for the embedding + slightly more memory efficient lstm

* fix: forgot to remove something that broke tests

* feat: use relative paths

* feat: even faster

* feat: remove pointless transposes in StackTime

* fix: correct forward

* feat: switch to soundfile for loading and fix some leaks

* feat: add comment about initial dataset setup

* feat: jit more things

* feat: default batch size back to 1

larger than 1 is broken again :(
and even in the reference implementation it gives worse results
2023-05-25 00:41:21 -07:00
Sasha Krassovsky
b258af117a Fix PytestCollectionWarning when running tests (#791) 2023-05-24 23:17:57 -07:00
George Hotz
0400315078 Revert "ops rdna"
This reverts commit 81a11d891d.
2023-05-21 13:02:18 -07:00
George Hotz
325a3bf2cf Revert "writing 2"
This reverts commit dddd6c42f0.
2023-05-21 13:02:17 -07:00
George Hotz
dddd6c42f0 writing 2 2023-05-21 12:52:36 -07:00
George Hotz
81a11d891d ops rdna 2023-05-21 11:45:38 -07:00
George Hotz
ed038ba129 Contract float4 ALU operations (#780)
* wrong expand

* tests passing

* pass lint
2023-05-16 19:03:49 -07:00
George Hotz
90fff82c8a Rdna (#776)
* assembler maybe

* custom asm

* rdna3 on quiet

* trigger crashes

* fixed notes

* non-fatal rdna2 crash

* Crash4

* improve rdna sniffer

* comments

* improve sniffer

* asm

* 131 TFLOPS RDNA3

* opt simple matmul

* todos
2023-05-16 05:33:57 -07:00
George Hotz
89b8b39d9c fix mypy 2023-05-13 21:25:36 -07:00
George Hotz
e0b2035023 fast imagenet eval, gets 76.14% across the set 2023-05-13 21:18:31 -07:00
Jacky Lee
c552f6f92b Inference test: add tests for ResNet50 (#773)
* Add ResNet inference test and cannon

* Test with ResNet50

* test_car works with resnet fix
2023-05-13 21:18:15 -07:00
Rabia Eda Yılmaz
e5b4b36cba add std to tensor.py (#767)
* add std

* delete comment

* edit: one liner std, add: test

* adjust

* fix: shape mismatch

* set unbiased to False

* added unbiased option

* fix unbiased option in test and clean code

* better

* generalize axis

* holly coffee molly

* generalize axes without unbiased opt.

* hopefully done

* complete unbiased true for axes

* Update test_ops.py

* fixed

* std completed without Bessel's correction

* fix comment

* ups
2023-05-13 12:20:44 -07:00
George Hotz
b705510d5c getting 77% on imagenet eval 2023-05-13 07:46:27 -07:00
George Hotz
810f03dafa conv3d + unet3d (#772)
* conv3d, needs test

* test passes, padding wrong on unet

* unet3d

* no conv3d on images
2023-05-12 13:54:07 -07:00
George Hotz
46d419060b start on mlperf models 2023-05-10 16:30:49 -07:00
Jacky Lee
d13629cb26 ResNet: match implementation with Nvidia and PyTorch (#770)
* Match ResNet implementation with pytorch and nvidia

* Reduce number of Epochs
2023-05-10 09:01:22 -07:00
Jacky Lee
b80cf9220c Statistics test: check if distributions match torch (#769)
* Check if tensor values match torch

* Clean up randomness tests and remove dependency

* Remove kaiming uniform test
2023-05-07 21:43:23 -07:00
George Hotz
cb7c22beeb fix mypy 2023-05-06 19:18:54 +00:00
George Hotz
5190037cbc rocm: disassembler for shader 2023-05-06 19:07:52 +00:00
George Hotz
7fbf96b992 jit: TODO, use abstractions 2023-05-05 22:51:30 -07:00