| Author | Commit | Message | Date |
|---|---|---|---|
| George Hotz | 2e56a4793e | rename log_softmax, support dim, fix onnx Softmax | 2023-02-24 10:11:24 -08:00 |
| Jacky Lee | cb679cd051 | Fix weight initialization (#566) (Fix weight initialization; Use scaled_uniform in serious_mnist) | 2023-02-19 11:25:29 -08:00 |
| Kirill | 7944cfdadc | Remove Tensor.data (#565) | 2023-02-18 16:36:12 -08:00 |
| George Hotz | c8b569a8c7 | cleaner comments | 2022-05-14 21:28:39 -07:00 |
| George Hotz | d31ef0ae48 | make vit names match pytorch | 2021-11-30 11:34:14 -05:00 |
| George Hotz | 4b7c31b5b7 | break vit into it's own file | 2021-11-30 11:19:22 -05:00 |
| George Hotz | 46bbbcf7f0 | model touchups | 2021-11-30 11:13:34 -05:00 |
| George Hotz | 835869974c | clean up vit code | 2021-11-30 10:58:03 -05:00 |
| George Hotz | c39824bc62 | oops, forgot some stars | 2021-11-30 00:46:14 -05:00 |
| George Hotz | bd21304e3c | linear takes in weight and bias | 2021-11-30 00:38:47 -05:00 |
| George Hotz | 535f02cc64 | use sequential | 2021-11-30 00:25:39 -05:00 |
| George Hotz | de938c2d9d | vit is now tested | 2021-11-30 00:23:06 -05:00 |
| George Hotz | aff810e722 | unify transformer block | 2021-11-29 18:58:15 -05:00 |
| George Hotz | 58ed46963e | fix broadcastdot | 2021-11-29 18:54:57 -05:00 |
| George Hotz | dca076dbf1 | remove dumb nn ops | 2021-11-29 18:05:31 -05:00 |
| George Hotz | 8097b8f7d6 | vit works | 2021-11-29 16:28:14 -05:00 |
| George Hotz | f909ab194f | gelu with broken test | 2021-11-29 15:00:50 -05:00 |
| George Hotz | 1eafa5580e | layernorm with learnable parameters | 2021-11-29 13:03:57 -05:00 |
| George Hotz | c7f795ca1e | added dot affine | 2021-11-29 12:55:56 -05:00 |
| George Hotz | 30eb3afbe1 | add bias term to transformer | 2021-11-29 12:45:27 -05:00 |
| George Hotz | 99b6051467 | add ff_dim to transformer | 2021-11-29 12:40:52 -05:00 |
| George Hotz | d3f169b267 | move good models to models, add a training step test | 2021-06-19 11:24:15 -07:00 |