On some machines, the available RAM is not enough to
compile Triton with `2 * num_cpus` parallel jobs. For example, CircleCI's
`large` instance runs out of memory when compiling Triton with the
default settings.
To address this, I propose taking PyTorch's approach and defining a
[`MAX_JOBS` env
var](0e4ddc2b40/tools/setup_helpers/cmake.py (L366-L368))
that lets the user reduce (or increase) the
parallelism during compilation.
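
A minimal sketch of how the build script could pick up the variable, assuming the job count is passed to the build tool as a `-j` flag. The helper name `parallel_build_args` and its parameters are hypothetical and only serve the illustration; the default of `2 * num_cpus` is the one described above.

```python
import os


def parallel_build_args(environ=None, cpu_count=None):
    """Return the parallelism flag for the build tool (hypothetical helper).

    MAX_JOBS, when set, overrides the default of 2 * num_cpus.
    """
    environ = os.environ if environ is None else environ
    # os.cpu_count() may return None; fall back to a single CPU in that case.
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    max_jobs = environ.get("MAX_JOBS", str(2 * cpus))
    return ["-j" + max_jobs]


if __name__ == "__main__":
    # Prints the flag that would be appended to the build command.
    print(parallel_build_args())
```

With this in place, a memory-constrained machine could run the build with something like `MAX_JOBS=4 pip install -e .`, while machines with plenty of RAM keep the default.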
Co-authored-by: Philippe Tillet <phil@openai.com>