Commit Graph

106 Commits

Yixiang Gao
a686663657 make Embedding device aware for multigpu (#3051)
* make Embedding device aware for multigpu

* split line instead of ignore because that's cheating

* add test incomplete

* add test complete

* remove comment

* fix white space

* remove nn.Embedding
2024-01-08 20:09:26 -08:00
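
A minimal sketch of what "device aware" means in the commit above, assuming tinygrad-style APIs (the ToyEmbedding class and its internals are illustrative, not the PR's diff): the embedding's internal arange index tensor must be created on the same device as the weight, or a sharded multigpu input hits a device mismatch.

```python
from tinygrad.tensor import Tensor

class ToyEmbedding:
  # illustrative embedding; tinygrad's real nn.Embedding differs in detail
  def __init__(self, vocab_size: int, embed_size: int):
    self.vocab_size = vocab_size
    self.weight = Tensor.glorot_uniform(vocab_size, embed_size)

  def __call__(self, idx: Tensor) -> Tensor:
    # device=self.weight.device is the "device aware" part: without it the
    # arange lands on the default device and mismatches sharded weights
    counter = Tensor.arange(self.vocab_size, requires_grad=False, device=self.weight.device)
    one_hot = (counter == idx.unsqueeze(-1)).float()  # (..., vocab_size)
    return one_hot @ self.weight                      # (..., embed_size)
```
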
Yixiang Gao
8a63f26a0f make LR scheduler work with multigpu (#3011)
* add a failing test for LR scheduler when using multigpu

* fix calculation order and unnecessary tensor created for float

* min_lr is no longer tensor
2024-01-04 12:10:56 -08:00
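
A hedged sketch of the pattern the bullets above describe (not the actual extra/lr_scheduler.py code): keep min_lr as a plain Python float, compute the schedule in floats, and only at the end materialize a tensor on the lr tensor's own device, which may be a multigpu device.

```python
import math
from tinygrad.tensor import Tensor

class ToyCosineLR:
  # illustrative scheduler; tinygrad's real schedulers differ in detail
  def __init__(self, optimizer, max_lr: float, min_lr: float, total_steps: int):
    self.optimizer, self.max_lr, self.min_lr = optimizer, max_lr, min_lr
    self.total_steps, self.step_count = total_steps, 0

  def step(self) -> None:
    self.step_count += 1
    frac = self.step_count / self.total_steps
    # compute entirely in Python floats (min_lr is no longer a tensor) ...
    new_lr = self.min_lr + 0.5 * (self.max_lr - self.min_lr) * (1 + math.cos(math.pi * frac))
    # ... and only materialize on the lr tensor's own (possibly multi) device
    self.optimizer.lr.assign(Tensor([new_lr], device=self.optimizer.lr.device))
```
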
chenyu
81b97cd2c6 canonicalize device in LazyBuffer constructor (#2991)
fixed the multitensor +1 then sum bug
2024-01-03 12:55:25 -05:00
chenyu
db525cf8c2 multitensor failed test case with +1 then sum on DEVICE:0 (#2990)
2024-01-03 12:17:11 -05:00
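
The two commits above are the fix and its failing test case: a tensor sharded onto "DEVICE:0" did not compare equal to the canonical "DEVICE" as a raw string. A hypothetical helper showing the canonicalization idea that #2991 moves into the LazyBuffer constructor (the function name and exact rules are assumptions):

```python
# hypothetical sketch; the real logic lives in tinygrad's Device handling
def canonicalize_device(device: str) -> str:
  name, _, num = device.partition(":")
  name = name.upper()
  return name if num in ("", "0") else f"{name}:{num}"  # ":0" is implicit

# "gpu", "GPU", and "GPU:0" all name the same device after canonicalization
assert canonicalize_device("gpu") == canonicalize_device("GPU:0") == "GPU"
```
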
George Hotz
5dbaaa7061 hotfix: make multitensor shard contiguous
2024-01-03 08:48:30 -08:00
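
An illustrative view of what this hotfix concerns, assuming tinygrad-style slicing and .to() (this is not the actual diff): sharding splits a tensor along an axis into one slice per device, and each slice should be materialized with .contiguous() rather than left as a lazy view of the source buffer.

```python
from tinygrad.tensor import Tensor

DEVICES = ("GPU:0", "GPU:1")  # assumed device names
t = Tensor.arange(16).reshape(4, 4)
# one contiguous 2-row slice per device, not a lazy view into t's buffer
shards = [t[2*i:2*i+2].contiguous().to(dev) for i, dev in enumerate(DEVICES)]
```
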
George Hotz
f494b9d463 simple multitensor API (#2903)
* simple multitensor API

* test multitensor

* mt work

* new api

* copies

* all but data parallel

* allreduce there

* works, but axis sharded

* fix all mt tests

* features/multi

* work

* backprop

* fix tests

* tests passing

* mt progress

* cleanups

* less lines

* tensor cleanup

* save more lines

* mypy passes

* fix tests

* skip for cuda too

* bump download cache
2024-01-02 17:49:44 -08:00
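
A minimal usage sketch of the multitensor API this PR introduces, pieced together from the commit messages in this log (Tensor.shard and the "GPU:0"/"GPU:1" device names are assumptions, not this PR's documented surface):

```python
from tinygrad.tensor import Tensor

GPUS = ("GPU:0", "GPU:1")            # assumed device names
t = Tensor.arange(16).reshape(4, 4)
st = t.shard(GPUS, axis=0)           # rows 0-1 on GPU:0, rows 2-3 on GPU:1
out = (st + 1).sum()                 # elementwise runs per shard; sum allreduces
print(out.numpy())                   # the "+1 then sum" case fixed in #2991
```
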