uuuvn
|
6729f20aab
|
Ring allreduce try 2 (#3852)
* Ring allreduce v3
* Configurable size, number of gpus and jit in benchmark
* ScheduleBarrier v0
* GB/s that make sense
* ScheduleBarrier v0.1
* Fallback on 2 GPUs
* ScheduleBarrier v0.2
* ScheduleBarrier v0.3
* ScheduleBarrier v0.3.1
* ScheduleBarrier v0.3.2
* Replace ScheduleBarrier with automatic optimization
* unused import
* fix comment
* typing
* better fallback
* python 3.8
* RING=2 and use ContextVar
* DEBUG >= 2 and change name
* linter
* type
---------
Co-authored-by: George Hotz <72895+geohot@users.noreply.github.com>
Co-authored-by: chenyu <chenyu@fastmail.com>
Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
|
2024-03-21 19:17:51 -04:00 |
|