Commit Graph

20 Commits

Author SHA1 Message Date
George Hotz
b132de677d tinygrad.nn (#367)
* tinygrad.nn

* flake8

* working on pylint

* more pylint

* more pylint

* pylint passes

* networkx

* mypy can't infer that type

* junk
2022-08-18 07:41:00 -07:00
George Hotz
99b6051467 add ff_dim to transformer 2021-11-29 12:40:52 -05:00
George Hotz
d3f169b267 move good models to models, add a training step test 2021-06-19 11:24:15 -07:00
Marcel Bischoff
42b4761025 transformer >99.98% test accuracy in ~30s (#230)
* transformer

* BS might divide len(Y_test)

* outoput when accuracy is high

* more readeable

* fixed loss in serious_mnist for new API
2021-01-02 07:45:09 -08:00
George Hotz
f9170505b3 if you like your transformers twice as slow, use the GPU 2020-12-29 17:14:23 -05:00
George Hotz
3f8e137b6f extra/transformer 2020-12-29 14:14:00 -05:00
George Hotz
bcb3ceeca3 set training in functions 2020-12-28 22:45:46 -05:00
George Hotz
51bf164b72 dropout, training 2020-12-28 22:12:23 -05:00
George Hotz
7b8fee038d it works! forgot the sqrt 2020-12-28 16:23:52 -05:00
George Hotz
1faf05ef67 ahh, it's better if i don't train the embedding 2020-12-28 16:07:02 -05:00
George Hotz
c3832e1bde hmm, fix layernorm to not be batchnorm and it breaks 2020-12-28 13:06:21 -05:00
George Hotz
2e89e75dcb layernorm fixes transformer instability 2020-12-28 12:58:15 -05:00
George Hotz
593233b668 log and exp are first class ops 2020-12-28 10:00:30 -05:00
Marcel Bischoff
ffff98db78 Evaluation in Transformers (#218)
* 2serious

* load/save

* fixing GPU

* added DEBUG

* needs BatchNorm or doesn't learn anything

* old file not needed

* added conv biases

* added extra/training.py and checkpoint

* assert in test only

* save

* padding

* num_classes

* checkpoint

* checkpoints for padding

* training was broken

* merge

* rotation augmentation

* more aug

* needs testing

* streamline augment, augment is fast thus bicubic

* tidying up

* transformer eval
2020-12-28 09:24:51 -05:00
George Hotz
65b07d2f4f fix onehot embed 2020-12-27 18:50:38 -05:00
George Hotz
d864e1c71a transformer is training 2020-12-27 18:46:32 -05:00
George Hotz
a361ef6861 fixup training loop 2020-12-27 18:35:56 -05:00
George Hotz
f15bec6dbc make multidot work on CPU 2020-12-27 17:25:37 -05:00
George Hotz
131e04c90c cpu only decorator 2020-12-27 17:18:55 -05:00
George Hotz
2f1b2c0a3b add transpose, start on transformer 2020-12-27 16:59:12 -05:00