Commit Graph

4667 Commits

Author SHA1 Message Date
SnakeOnex
1b337b5533 ONNX tests exclude all unsupported filetype tests (#832) 2023-05-28 13:31:20 -07:00
Kirill R
081b3ab639 Tensor.where method (#830) 2023-05-28 10:20:33 -07:00
Kirill R
0c0c7380af Add Tensor.where (#826)
* Add Tensor.where

* fix linter

* fix mypy
2023-05-28 08:04:56 -07:00
kposborne2
2163a1b049 Add shrink step to fix strided conv_transpose2d, and add to nn (#823)
* implement conv transpose 2d

* don't inherit, remove old assert

---------

Co-authored-by: Kyle <kposborne@gmail.com>
2023-05-28 07:52:45 -07:00
crthilakraj
01daa74f9b fixed TestCustomFunction (#820) 2023-05-27 18:27:46 -07:00
George Hotz
a3feee29c5 make tests faster + add onnx (#815)
* search one dir, disable slow

* onnx tests

* fast rnnt test
2023-05-27 08:53:32 -07:00
Mattis Megevand
606b841d3f LR Schedulers (#755)
* lr schedulers + test

* lr scheduler test moved + integration test

* integration test for all lr scheduler

* lr scheduler test now deterministic

* changed optimizer + parameters for lr sched test
2023-05-27 07:47:49 -07:00
Rayan Hatout
8b2c2d6896 Optimizations in symbolic.py (#796)
* optimizations in symbolic.py

* fix infinite recursion when expanding sums

* add test case to make sure NumNodes are hoisted up in cases where MulNodes cancel eachother out
2023-05-26 12:59:53 -07:00
George Hotz
26014a0fa1 add convtranspose (#809)
* add convtranspose

* onnx convtranspose
2023-05-26 12:35:03 -07:00
symlon
04284414db Batchnorm2d fixed running_var (#807)
* BatchNorm2d match pytorch

* removed comment

* Batchnorm2d test multiple sizes
2023-05-26 12:32:49 -07:00
Aneesh Durg
6d4a728f62 Don't collapse dimensions during batched matmul (FIX #799) (#800)
* Don't collapse dimensions during batched matmul (FIX #799)

* Avoid reshaping tensor to the same shape

* Skip batched matrix multiply when IMAGE is set
2023-05-26 11:15:34 -07:00
George Hotz
ee2c8423c7 disable that test on LLVM. i have to stop pushing to master 2023-05-26 03:11:03 +00:00
George Hotz
ea3194f68e test touchups 2023-05-26 02:39:42 +00:00
wozeparrot
0dc333cfab Promote Embedding to nn (#798)
* feat: promote Embedding to nn

* fix: fix failing test

* feat: add test with jit

* feat: rewrite embedding to no longer need stacked for loops

* clean+fix: don't know how that happened
2023-05-25 18:39:45 -07:00
Diogo
c19ef0fcce Add sin/cos/tan (#794)
* added sin/cos/tan

* fix lint

* added onnx ops support
2023-05-25 09:04:56 -07:00
wozeparrot
01ae45a43c Add mlperf RNN-T model (#782)
* feat: initial rnn-t

* feat: working with BS>1

* feat: add lstm test

* feat: test passing hidden

* clean: cleanup

* feat: specify start

* feat: way faster lstm & model

* fix: default batch size

* feat: optimization

* fix: fix metrics

* fix: fix feature splicing

* feat: cleaner stacktime

* clean: remove unused import

* clean: remove extra prints

* fix: fix tests and happy llvm

* feat: have the librispeech dataset in its own dir

* clean: unused variable

* feat: no longer need numpy for the embedding + slightly more memory efficient lstm

* fix: forgot to remove something that broke tests

* feat: use relative paths

* feat: even faster

* feat: remove pointless transposes in StackTime

* fix: correct forward

* feat: switch to soundfile for loading and fix some leaks

* feat: add comment about initial dataset setup

* feat: jit more things

* feat: default batch size back to 1

larger than 1 is broken again :(
and even in the reference implementation it gives worse results
2023-05-25 00:41:21 -07:00
Sasha Krassovsky
b258af117a Fix PytestCollectionWarning when running tests (#791) 2023-05-24 23:17:57 -07:00
Jacky Lee
c552f6f92b Inference test: add tests for ResNet50 (#773)
* Add ResNet inference test and cannon

* Test with ResNet50

* test_car works with resnet fix
2023-05-13 21:18:15 -07:00
Rabia Eda Yılmaz
e5b4b36cba add std to tensor.py (#767)
* add std

* delete comment

* edit: one liner std, add: test

* adjust

* fix: shape mismatch

* set unbiased to False

* added unbiased option

* fix unbiased option in test and clean code

* better

* generalize axis

* holly coffee molly

* generalize axes without unbiased opt.

* hopefully done

* complete unbiased true for axes

* Update test_ops.py

* fixed

* std completed without bessels correction

* fix comment

* ups
2023-05-13 12:20:44 -07:00
George Hotz
810f03dafa conv3d + unet3d (#772)
* conv3d, needs test

* test passes, padding wrong on unet

* unet3d

* no conv3d on images
2023-05-12 13:54:07 -07:00
Jacky Lee
b80cf9220c Statistics test: check if distributions match torch (#769)
* Check if tensor values match torch

* Clean up randomness tests and remove dependency

* Remove kaiming uniform test
2023-05-07 21:43:23 -07:00
George Hotz
5b2ae262db assertions for jit 2023-05-05 21:56:32 -07:00
George Hotz
81aa3e546b exclude GPU on tiny (#766) 2023-05-05 10:07:23 -07:00
George Hotz
f2a964f447 nocopy (#764) 2023-05-05 09:32:06 -07:00
George Hotz
f28df9900f multidevice works (#763)
* basic multigpu working

* better multigpu test

* upper

* touchups

* cl sync
2023-05-04 01:04:58 -07:00
George Hotz
7ecf4dff68 multi cl_queue (#762)
* multi cl_queue

* only platforms 1

* gpus first, then cpus

* put device on underlying buffer

* cl_queue array
2023-05-03 12:15:28 -07:00
Joqsan
0b9d4126d0 Add Tensor.stack() and Tensor.repeat() (...trying to make einops work with tinygrad) (#758)
* add stack() and repeat() methods

* make stack a static method
2023-05-01 09:37:46 -07:00
George Hotz
3d15769a8f 50 TFLOPS cuda matmul 2023-04-19 14:38:24 -07:00
George Hotz
03b38864db fix batchnorm at training (#753)
* e2e testing

* min failure

* no affine on bn, still fails

* why did i think i could detach that?

* allow more kernels for bn

* some test issue i don't understand
2023-04-19 08:01:04 -07:00
George Hotz
8b7ecd63bb Remove Zeroview (#748)
* no zeroview start

* closer

* stride mask

* st tests pass, delete ZeroView

* byebye zv

* close to working

* not contiguous with mask

* subtract, don't add

* mask on view

* ugh, that shouldn't have been in there

* shape merge

* bugfixes

* fuzzer + 4 fuzzer failures

* fuzzer for symbolic

* more fuzzing and nothing

* that fuzzer doesn't hit either

* fixes padding...ugh

* no more offsets

* working

* rewrite load and store

* all checks

* fix idxs

* progress

* bugfix

* float4_axis

* works

* cleanups

* complex valids_okay
2023-04-17 08:21:46 -07:00
George Hotz
17e37157b6 fix backward convs (#746)
* fix backward convs

* no pushing in reduce

* late cout

* test_fold_4convs_sgd
2023-04-14 10:42:11 -07:00
George Hotz
f7f416d6f4 back to 6 for test_fold_conv_sgd 2023-04-14 07:34:00 -07:00
George Hotz
133521e730 relu UnaryOp is back 2023-04-14 07:12:53 -07:00
worldwalker2000
552a048a33 make maximum split the grad like torch when equal (#738)
* make maximum split grad

* added test for maximum split grad when equal

* minor expr simplification

* (2-eq)/2 only once

* update test bc one more sum output child stays
2023-04-14 00:17:46 -07:00
George Hotz
20894991ed good changes from the M1 Tensor Core project (#730)
* good changes

* working except llvm

* llvm types

* nice acc

* archprobe

* lang.float4

* use self.acc for late acc

* fix store bug
2023-03-29 05:11:02 +04:00
Andre Slavescu
39d6e1525f Added activation ops + tests (#729)
* activation ops

* type hints + more testing

* formatting correction + parameter testing

* fixes to shape testing

* hardtanh to use clip + removed type hints

* assign val fix
2023-03-28 13:17:53 +04:00
George Hotz
23f88fb026 synchronize for honest speed compare 2023-03-24 10:24:27 -07:00
George Hotz
1cb5b2d015 test_enet_se 2023-03-24 10:04:30 -07:00
Jacky Lee
fafe8e9ce2 casting: support all backends and implement half (#726)
* casting: support all backends and implement half

* map torch types in ops_torch

* reuse type map for torch buffer

* inverse dict lookup
2023-03-24 09:58:03 -07:00
George Hotz
e88b9bfe1e print gflops avg with DEBUG=2 2023-03-23 16:07:08 -07:00
Jacky Lee
e009b6f341 Add tests for casting (#724)
* Add tests for casting

* Skip half_matmul_upcast when TORCH=1

* Fix promotion on torch

* Fix spacing
2023-03-23 08:02:52 -07:00
George Hotz
b12b60af20 fix binop, other tests failure (#723)
* fix binop, other tests failure

* that was a bad idea

* better layernorm

* inference kernel count tests

* new style reshape pushing

* fixup replacement

* 199 kernels is okay. fix flops

* push reshape through unaryops only

* GRAPH=2 draws the phantom ops

* found resnet issue

* non working test

* mul is cheaper than div

* OPT inflation

* SHUFFLE_PAD_OPS in OPT=2
2023-03-22 18:15:07 -07:00
George Hotz
d6f4219952 LayerNorm2d for 2 lines 2023-03-20 16:58:43 -07:00
George Hotz
30b795874a remove RMSprop, nobody uses it anymore 2023-03-20 12:31:34 -07:00
George Hotz
623fb1ef28 do test_conv_with_bn test 2023-03-19 23:53:56 -07:00
George Hotz
5495c7d64e linearizer! (#714)
* linearizer outputs something

* working ish

* cstyle codegen

* clang mostly works

* fix load valid

* fix numberless loop

* fancy gen

* working

* fix enet compiler

* cleanups

* float4 upcasting

* less lines

* supports_float4

* constant folding

* mulacc

* internet tests flaky in CI

* 90% image support

* fix image generic

* bugs exposed with shapetracker and single view

* new llvm

* use vload, remove OLD

* that's really poorly done

* ending up being more lines
2023-03-19 23:43:49 -07:00
Cyril Roumégous
b629fd4cd8 add AdamW optimizer (#716)
* add AdamW optimizer

* one liner Adam optimizer
2023-03-19 12:51:06 -07:00
George Hotz
902906f909 Fix constant folding (#713)
* fix

* codegen

* contiguous is real

* no bufs_to_delete

* don't assign rawconst

* remove neg and not

* need exec to fix custom function jit
2023-03-18 17:52:46 -07:00
George Hotz
f5467cfedc Devicebufferless (#708)
* runs one metal kernel

* conv2d works

* ops tests are passing

* const folding

* all ops work

* pre commit always passes

* torch works

* working still

* fix graph test

* tests passing

* image almost works

* image conv works

* most images

* fix custom

* fix assignment

* fix compile enet

* clean up comments

* fix realize return value

* include shapetracker in LB repr

* copy should make a copy

* reenable method cache

* fix lna

* dtypes in graph

* forward only for IMAGE=2

* simple realize

* getting close

* fixup new api, it's good except the kernel count

* back to 197 kernels

* tests should pass

* go to a real float

* no type_on_cpu

* fix the docs

* put shapetracker back in it's proper place
2023-03-18 14:40:23 -07:00
Kirill
0532025b04 Fix llama 13B weights loading (#700)
* Fix llama 13B weights loading

* refactor more

* add test

* test storage offset

* fix spacing

* fix strides

* llama 13B working?

* yolo?

* better test for seeks
2023-03-15 08:59:52 -07:00