tinygrad/test
Latest commit 13e872b53f by Yixiang Gao: add multigpu support for llama attention (#3064)
* add llama attention test for multigpu
* test fails
* kv cache trying to shrink on sharded axis
* mask None works for scaled dot product
* kv cache seems to be working but scaled dot product breaks
* scaled dot product works, but the last linear layer fails
* running into the reshape case where it could be wrong for multigpu
* making sure it was the reshape
* adding contiguous doesn't solve it
* need to shard more properly
* remove reshape test
* minor adjustment to scaled dot product attention test
* weights are sharded wrong
* continue fixing the new weight sharding
* clean up
* fix attention when start_pos is 0
* remove print
* add TODOs for the best multigpu interface
Committed 2024-01-11 16:31:02 -08:00
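The notes above revolve around three mechanics: sharding tensors across devices on the head axis, updating a KV cache via shrink/assign, and calling scaled dot product attention with attn_mask=None. Below is a minimal sketch of those pieces, not the commit's code: it assumes a recent tinygrad with Tensor.shard and Tensor.scaled_dot_product_attention, and the device count, shapes, and MAX_CONTEXT are placeholder values.

```python
from tinygrad import Tensor, Device

# Assumption: two devices of the default backend, e.g. ("GPU:0", "GPU:1").
GPUS = tuple(f"{Device.DEFAULT}:{i}" for i in range(2))
B, H, T, D, MAX_CONTEXT = 1, 8, 16, 64, 128  # placeholder shapes

# Shard q/k/v on the head axis so each device owns half the heads;
# axis=None would replicate a tensor on every device instead.
q = Tensor.randn(B, H, T, D).shard(GPUS, axis=1)
k = Tensor.randn(B, H, T, D).shard(GPUS, axis=1)
v = Tensor.randn(B, H, T, D).shard(GPUS, axis=1)

# KV cache sharded on the same head axis. Writing the new keys shrinks to
# [start_pos, start_pos+T) on an unsharded axis and assigns; shrinking on the
# sharded axis itself is the pitfall the commit log mentions.
start_pos = 0
cache_k = Tensor.zeros(B, H, MAX_CONTEXT, D).shard(GPUS, axis=1).contiguous().realize()
cache_k.shrink((None, None, (start_pos, start_pos + T), None)).assign(k).realize()
keys = cache_k.shrink((None, None, (0, start_pos + T), None))

# With attn_mask=None this is plain softmax(q @ k^T / sqrt(D)) @ v; the head
# shards are independent, so this step needs no cross-device communication.
out = q.scaled_dot_product_attention(keys, v, attn_mask=None)
print(out.shape)  # (1, 8, 16, 64)
```

Sharding on the head axis keeps each attention head whole on one device, so the softmax and matmuls stay local; only a later projection that mixes heads has to gather across shards, which is consistent with the log's note that the last linear layer was the piece that failed.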