tinygrad/models
JaSpa99 d3d58a37e5 Bert: use Tensor.scaled_dot_product_attention (#1528)
* use scaled attn from Tensor

* add a test for bert

* linter

* no more tokenizer

* without loading weights

* remove prints

* tribute to linter lords

* smaller input and less runs

* small bert
2023-08-12 08:46:04 -07:00
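The commit above replaces Bert's hand-written attention with a single fused `Tensor.scaled_dot_product_attention` call. As a rough illustration only (tinygrad's actual method and its exact signature are not shown in this listing), the math that call computes is `softmax(QKᵀ/√d_k)V`, sketched here in plain numpy:

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    # softmax(q @ k^T / sqrt(d_k)) @ v -- the computation the fused op performs
    d_k = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d_k)
    scores = scores - scores.max(axis=-1, keepdims=True)  # for numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)  # rows sum to 1
    return weights @ v

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 8))  # (batch, seq_len, head_dim)
k = rng.standard_normal((2, 4, 8))
v = rng.standard_normal((2, 4, 8))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # (2, 4, 8)
```

Folding the scale, softmax, and matmuls into one library call lets the framework fuse the kernels instead of materializing the full attention-score matrix through several separate ops.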