22 Commits

Author SHA1 Message Date
George Hotz
2e56a4793e rename log_softmax, support dim, fix onnx Softmax 2023-02-24 10:11:24 -08:00
Jacky Lee
cb679cd051 Fix weight initialization (#566)
* Fix weight initialization

* Use scaled_uniform in serious_mnist
2023-02-19 11:25:29 -08:00
Kirill
7944cfdadc Remove Tensor.data (#565) 2023-02-18 16:36:12 -08:00
George Hotz
c8b569a8c7 cleaner comments 2022-05-14 21:28:39 -07:00
George Hotz
d31ef0ae48 make vit names match pytorch 2021-11-30 11:34:14 -05:00
George Hotz
4b7c31b5b7 break vit into it's own file 2021-11-30 11:19:22 -05:00
George Hotz
46bbbcf7f0 model touchups 2021-11-30 11:13:34 -05:00
George Hotz
835869974c clean up vit code 2021-11-30 10:58:03 -05:00
George Hotz
c39824bc62 oops, forgot some stars 2021-11-30 00:46:14 -05:00
George Hotz
bd21304e3c linear takes in weight and bias 2021-11-30 00:38:47 -05:00
George Hotz
535f02cc64 use sequential 2021-11-30 00:25:39 -05:00
George Hotz
de938c2d9d vit is now tested 2021-11-30 00:23:06 -05:00
George Hotz
aff810e722 unify transformer block 2021-11-29 18:58:15 -05:00
George Hotz
58ed46963e fix broadcastdot 2021-11-29 18:54:57 -05:00
George Hotz
dca076dbf1 remove dumb nn ops 2021-11-29 18:05:31 -05:00
George Hotz
8097b8f7d6 vit works 2021-11-29 16:28:14 -05:00
George Hotz
f909ab194f gelu with broken test 2021-11-29 15:00:50 -05:00
George Hotz
1eafa5580e layernorm with learnable parameters 2021-11-29 13:03:57 -05:00
George Hotz
c7f795ca1e added dot affine 2021-11-29 12:55:56 -05:00
George Hotz
30eb3afbe1 add bias term to transformer 2021-11-29 12:45:27 -05:00
George Hotz
99b6051467 add ff_dim to transformer 2021-11-29 12:40:52 -05:00
George Hotz
d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00