Commit Graph

469 Commits

Author SHA1 Message Date
George Hotz
4efde1ba0a test_matmul 2022-09-13 07:51:33 -07:00
George Hotz
790af99a48 fix slice one multi, and linear can be simpler with new broadcasting 2022-09-06 19:51:33 -07:00
YassineYousfi
5aad460c7a broadcast from right to left (#375)
* broadcast from right to left

* add another broadcasted add test
2022-09-06 16:36:13 -07:00
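
Right-to-left broadcasting is the numpy-style rule: shapes are aligned at the trailing dimension and compared right to left, with size-1 axes stretching to match. A minimal sketch of the shape computation (the function name is illustrative, not tinygrad's API):

```python
def broadcast_shape(a: tuple, b: tuple) -> tuple:
    # pad the shorter shape with leading 1s, then compare right to left
    n = max(len(a), len(b))
    a, b = (1,) * (n - len(a)) + a, (1,) * (n - len(b)) + b
    out = []
    for x, y in zip(a, b):
        if x != y and 1 not in (x, y):
            raise ValueError(f"cannot broadcast {a} with {b}")
        out.append(max(x, y))  # the size-1 side stretches
    return tuple(out)

assert broadcast_shape((3, 1, 5), (4, 5)) == (3, 4, 5)
```
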
George Hotz
bcb867cdd6 better idea for numbers, do the division in python 2022-09-03 16:23:39 -07:00
George Hotz
033a3ecccf found tinygrad bug 2022-09-03 12:32:43 -07:00
George Hotz
5d45c6e516 Fold reduce (#362)
* folding reduce

* fold through movementops

* fixup shapes

* was too aggressive

* i knew we needed that

* don't recompute reduce

* working

* fix openpilot compile

* prunegraph openpilot

* types and reduce_shape

* refactor

* cleanups

* neater

* 1009

* 1004

* clean up reduce for 998
2022-07-19 09:24:02 -07:00
George Hotz
f93e297804 fix bug caused by rounding 2022-07-17 12:49:58 -07:00
George Hotz
bcf422dfdd Device2 (#358)
* option for matmul

* fixups

* fast like a nascar

* running

* thneed runner

* no buffer id makes no backing buffer

* move constant folding to the top

* runs on mac

* folded biases

* was v slow

* maybe just that

* elu touchup

* speed and float32

Co-authored-by: Comma Device <device@comma.ai>
2022-07-16 07:26:19 -07:00
George Hotz
5e46561f7e no_grad = NOT backward 2022-07-10 20:54:57 -07:00
George Hotz
b34ae7876f lol chr(10) not chr(13) 2022-07-10 20:03:11 -07:00
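
For reference: chr(10) is the Unix newline, chr(13) the carriage return, so confusing them silently changes line endings.

```python
assert chr(10) == "\n"  # LF, the newline actually wanted
assert chr(13) == "\r"  # CR, the lookalike the commit message laughs about
```
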
George Hotz
93c378dffc add test for slice_one 2022-07-03 12:14:20 -07:00
George Hotz
dffde3de5a support both asymmetric and negative padding 2022-06-26 17:59:25 -07:00
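
Negative padding here means cropping. A minimal numpy sketch of 2D padding with that convention (the (left, right, top, bottom) pad order is an assumption for illustration):

```python
import numpy as np

def pad2d(x, padding):
    # padding = (left, right, top, bottom); negative amounts crop instead
    l, r, t, b = padding
    x = np.pad(x, ((max(t, 0), max(b, 0)), (max(l, 0), max(r, 0))))
    H, W = x.shape
    return x[max(-t, 0):H - max(-b, 0), max(-l, 0):W - max(-r, 0)]

x = np.ones((4, 4))
assert pad2d(x, (1, 1, 1, 1)).shape == (6, 6)    # pad all sides
assert pad2d(x, (-1, 0, 0, -2)).shape == (2, 3)  # crop left and bottom
```
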
George Hotz
49c954b389 comments 2022-06-26 17:20:25 -07:00
George Hotz
8c483fbdc9 maxpool lazy fix 2022-06-26 17:07:03 -07:00
George Hotz
6b652dafb2 touchups 2022-06-19 16:57:14 -07:00
George Hotz
d5b3e18540 Accelerate with CL (#325)
* accelerated opencl

* it's running, it's just wrong

* bugfix

* model is correct in opencl

* lazy image convert

* add padding support to convolution

* that stuff was all upstreamed

* remove HEAD

* oops

* test_simple_conv2d_4 passes, add dilation support

* put logic in ops_opencl

* fix crash

* hmm, stride seems okay

* padding for batched inputs

* just an issue now with cout%4

* op model still passes

* fix startPackedInputChannel

* pre and post processing ops for graph

* don't break other llops

* shapetrackering

* reshapes are free

* lazy movement ops
2022-06-16 15:40:52 -07:00
George Hotz
2a14befb74 support padding 2022-06-15 14:46:44 -07:00
George Hotz
fef6c82491 wow dilation support was simple 2022-06-15 11:38:23 -07:00
George Hotz
0b182029dd support dilated convolution in torch 2022-06-14 18:03:35 -07:00
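
Dilation spreads the kernel taps apart, so a 3x3 kernel with dilation=2 covers a 5x5 window (effective size d*(k-1)+1). With torch's real API:

```python
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 7, 7)
w = torch.randn(1, 1, 3, 3)
y = F.conv2d(x, w, dilation=2)  # 3x3 taps spread over a 5x5 window
assert y.shape == (1, 1, 3, 3)  # 7 - 5 + 1 = 3
```
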
George Hotz
a690ba4588 add test for padding 2022-06-14 17:41:22 -07:00
George Hotz
e057ca23bb add flip 2022-06-14 17:28:43 -07:00
George Hotz
dcbca4fdf1 Expand Operator (#327)
* replace broadcasting with expand

* Tensor, not self

* remove broadcasting from mlops

* delete useless A operator

* expand, not repeat

* remove A op

* expand on gpu

* binary_op doesn't broadcast anymore

* expand is still total junk, but the tests should pass
2022-06-12 12:31:48 -07:00
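
The idea of the PR: instead of every binary op knowing how to broadcast, inputs are first reshaped to a common rank and each size-1 axis is expanded, so binary_op only ever sees equal shapes. A hedged numpy analogue, with np.broadcast_to standing in for the explicit expand movement op:

```python
import numpy as np

def broadcast_binary_op(op, x, y):
    # align ranks with leading 1s, as in right-to-left broadcasting
    n = max(x.ndim, y.ndim)
    x = x.reshape((1,) * (n - x.ndim) + x.shape)
    y = y.reshape((1,) * (n - y.ndim) + y.shape)
    # expand every size-1 axis so both operands share one shape
    shape = tuple(max(a, b) for a, b in zip(x.shape, y.shape))
    x, y = np.broadcast_to(x, shape), np.broadcast_to(y, shape)
    return op(x, y)  # the binary op itself no longer broadcasts

out = broadcast_binary_op(np.add, np.ones((3, 1, 5)), np.ones((4, 5)))
assert out.shape == (3, 4, 5)
```
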
George Hotz
33f18c61a1 test_broadcasted_add 2022-06-12 10:19:58 -07:00
George Hotz
85d17a2acd running resnet onnx 2022-06-11 13:17:15 -07:00
George Hotz
db5a632e8c multicat + test onnx is generic onnx 2022-06-11 11:50:47 -07:00
George Hotz
08de1aa636 add flatten to tinygrad 2022-06-11 11:15:16 -07:00
George Hotz
d061ce8d5e add ELU support 2022-06-11 10:47:23 -07:00
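
ELU is x for x > 0 and alpha*(e^x - 1) otherwise; conveniently it can be composed from relu and exp alone, which is one way a framework without a dedicated ELU kernel can support it (numpy sketch):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def elu(x, alpha=1.0):
    # relu(x) covers x > 0; the second term gives alpha*(e^x - 1) for x <= 0
    return relu(x) - alpha * relu(1 - np.exp(x))
```
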
George Hotz
8864b37333 fix torch convdw 2022-06-10 15:04:39 -07:00
George Hotz
aac1a9b419 this breaks tests 2022-06-10 12:20:42 -07:00
George Hotz
a1dff4061b minor cleanups 2022-06-06 08:14:52 -07:00
George Hotz
58ed46963e fix broadcastdot 2021-11-29 18:54:57 -05:00
George Hotz
f909ab194f gelu with broken test 2021-11-29 15:00:50 -05:00
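
For reference, the tanh approximation of GELU that most implementations (BERT, GPT) use:

```python
import numpy as np

def gelu(x):
    # tanh approximation: 0.5*x*(1 + tanh(sqrt(2/pi)*(x + 0.044715*x^3)))
    return 0.5 * x * (1 + np.tanh(np.sqrt(2 / np.pi) * (x + 0.044715 * x**3)))
```
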
George Hotz
29dee59368 cat: forward only not required 2021-11-29 00:14:56 -05:00
George Hotz
3cdc77f526 add cat support 2021-11-28 23:21:49 -05:00
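
One way cat needs no custom backward (the "forward only not required" commit above): zero-pad each input to the full output extent at its offset and add, so the gradient falls out of pad and add. A hedged numpy sketch of that construction:

```python
import numpy as np

def cat(tensors, axis=0):
    # pad each input with zeros up to the full output extent, then sum;
    # built only from pad and add, so no hand-written backward is needed
    total = sum(t.shape[axis] for t in tensors)
    out, off = 0, 0
    for t in tensors:
        pad = [(0, 0)] * t.ndim
        pad[axis] = (off, total - off - t.shape[axis])
        out = out + np.pad(t, pad)
        off += t.shape[axis]
    return out

a, b = np.ones((2, 3)), np.zeros((4, 3))
assert cat([a, b], axis=0).shape == (6, 3)
```
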
George Hotz
7ae14179d3 refactor ops 2021-11-27 11:12:23 -05:00
Evan Mays
285621aeda Cherry backprop for conv2d (#281)
* quick math: 0 + x = x.

* gradient w.r.t. x using cherry for conv

* gradient w.r.t. w for conv on cherry but doing vector dot products

* small optimization

* [cherry] optimize conv backpass for large channel count

* get rid of numpy einsum
2021-10-30 16:12:19 -07:00
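
The two gradients being implemented here: the gradient w.r.t. x is a transposed (full) convolution of the output gradient with the weights, and the gradient w.r.t. w is a correlation of the input with the output gradient. torch ships reference implementations of both, handy for checking such a backward pass:

```python
import torch
from torch.nn.grad import conv2d_input, conv2d_weight

x = torch.randn(1, 3, 8, 8, requires_grad=True)
w = torch.randn(4, 3, 3, 3, requires_grad=True)
g = torch.ones(1, 4, 6, 6)                      # upstream gradient
torch.nn.functional.conv2d(x, w).backward(g)

# closed-form gradients match autograd
assert torch.allclose(conv2d_input(x.shape, w, g), x.grad, atol=1e-5)
assert torch.allclose(conv2d_weight(x, w.shape, g), w.grad, atol=1e-5)
```
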
Guglielmo Camporese
2b7589db64 Added ResNet-{18, 34, 50, 101, 152} (#271)
* added resnets

* fix minor

* fix minor

* resnet in models

* added resnet test

* added resnet train test

* added linear, conv2d nn tests

* fix minor in extra/training

* resnet in models

* fix minor

* fix tolerance for linear in nn test

* fix eval, this causes cpu and gpu UT failing

* revert transformer test

* fix minor for CPU test

* improved model get_params for sequential layer

* fix minor for params counting

* commented broken ops tests

* improved train for resnet
2021-06-21 09:37:24 -07:00
George Hotz
2affd226b3 speed up sum 2021-06-17 16:38:34 -07:00
George Hotz
c1d469d440 sum op 2021-06-17 16:19:35 -07:00
George Hotz
2075fdeb4f FPGA Based Accelerator for Tinygrad (#258)
* ops_risk

* risk sim

* guessing is for winners

* minor

* better

* matmul with risk

* conv doesn't work

* closer

* conv2d works

* ops_risk

* opt2 works

* opt1 may not be possible

* opt1 is a mulacc

* arty

* attosoc example building on mac

* minor

* riscv assembler

* gucci gang

* we got C code

* not a scam

* hello

* make risk mergeable into master

* unop support
2021-06-07 17:45:09 -07:00
George Hotz
62e3a8558c fix tolerance maybe 2021-01-05 07:45:47 -08:00
George Hotz
8a38e0d207 only mish failed 2021-01-03 09:47:11 -08:00
George Hotz
1a4487965a remove negative from things w/o negative 2021-01-03 09:43:34 -08:00
George Hotz
0702e0c763 nah, no sign, it's not what you want. use relu 2021-01-03 09:30:33 -08:00
George Hotz
c2eeb6950b add support for sign. technically relu can be second class now 2021-01-03 08:29:57 -08:00
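
relu "can be second class" because it is expressible in terms of sign and multiply: relu(x) = x * (1 + sign(x)) / 2. A quick numpy check:

```python
import numpy as np

def sign(x):
    return (x > 0).astype(x.dtype) - (x < 0).astype(x.dtype)

def relu(x):
    # keeps x where sign(x) == 1, zeroes it where sign(x) is -1 or 0
    return x * (1 + sign(x)) / 2

x = np.array([-2.0, 0.0, 3.0])
assert (relu(x) == np.array([0.0, 0.0, 3.0])).all()
```
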
NeuralLink
0825cf7f79 Added softplus and mish non stable (#220)
* Added softplus and mish CPU

* 🔨 refactor

* 🔨 second class softplus and mish

* 🔨 test fix

* no need of device in testing
2021-01-03 08:08:41 -08:00
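
The definitions: softplus(x) = ln(1 + e^x) and mish(x) = x * tanh(softplus(x)). "Non stable" refers to the naive softplus overflowing for large x; the standard stable rewrite is sketched below:

```python
import numpy as np

def softplus(x):
    # naive ln(1 + e^x) overflows for large x; this equivalent form,
    # max(x, 0) + log1p(e^-|x|), is the usual numerically stable fix
    return np.maximum(x, 0) + np.log1p(np.exp(-np.abs(x)))

def mish(x):
    return x * np.tanh(softplus(x))
```
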
Liam
ebd72ff437 Test split (#231)
* Split tests

Split tests into "Test CPU" and "Test GPU".

Add test flag "TEST_DEVICES" which is a comma-separated list of devices:
CPU,GPU,ANE

* Run tests based on provided TEST_DEVICES flag

By default will run all "CPU,GPU,ANE"

* fix bad quote

* Revert changes and use GPU=1

This is done by setting the default Tensor device to Device.CPU unless
GPU=1 is set.

Run GPU tests: GPU=1 pytest -s -v
2021-01-01 09:19:03 -05:00
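
A minimal sketch of the GPU=1 convention the PR settled on (identifier names are illustrative): read the environment variable once and pick the default device from it, then run e.g. GPU=1 pytest -s -v.

```python
import os

# flip the default device when GPU=1 is set in the environment
DEFAULT_DEVICE = "GPU" if os.getenv("GPU") == "1" else "CPU"
```
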
George Hotz
4291002881 reorder GPU ops 2020-12-31 09:46:39 -05:00
Marcel Bischoff
e2f833f58f max to behave on ties like torch (#229)
* checkpoint

* fixing pow

* undo pow

* backward max on GPU and CPU rewrite

* indentation

* changing seed for curiosity

* max replaced equality

* undo seed

* rebase

* fixed tests

* merge error
2020-12-30 18:52:50 -05:00
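
What matching torch on ties means here: the backward of a reduce-max builds an equality mask against the maximum and spreads the incoming gradient evenly across all tied maxima, as torch's full-reduce max/amax does. A numpy sketch of that backward (function name illustrative):

```python
import numpy as np

def max_backward(x, grad, axis):
    # equality mask marks every element tied for the maximum
    mask = (x == x.max(axis=axis, keepdims=True)).astype(x.dtype)
    # spread the gradient evenly among the ties, as torch does
    return mask * np.expand_dims(grad, axis) / mask.sum(axis=axis, keepdims=True)

x = np.array([[1.0, 3.0, 3.0]])
g = np.array([1.0])
assert (max_backward(x, g, 1) == np.array([[0.0, 0.5, 0.5]])).all()
```
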
George Hotz
fcfe3dae01 write slice for CPU 2020-12-30 10:32:53 -05:00