Files
tinygrad/docs/env_vars.md
nimlgen 69f116a7e1 nv/amd profiler (#4718)
* nv/amd profiler

* fix

* fix

* profile copies

* profile logger

* fixes

* more fixes

* less lines and fixes

* fixes

* some linter

* back sync, no related change

* fix gpu2cpu time def

* simpler

* linter

* linter

* docs

* add add_event api
2024-06-23 17:10:12 +03:00

51 lines
2.5 KiB
Markdown

# List of environment variables that control tinygrad behavior.
This is a list of environment variable that control the runtime behavior of tinygrad and its examples.
Most of these are self-explanatory, and are usually used to set an option at runtime.
Example: `GPU=1 DEBUG=4 python3 -m pytest`
However you can also decorate a function to set a value only inside that function.
```python
# in tensor.py (probably only useful if you are a tinygrad developer)
@Context(DEBUG=4)
def numpy(self) -> ...
```
Or use contextmanager to temporarily set a value inside some scope:
```python
with Context(DEBUG=0):
a = Tensor.ones(10, 10)
a *= 2
```
## Global Variables
The columns of this list are are: Variable, Possible Value(s) and Description.
- A `#` means that the variable can take any integer value.
These control the behavior of core tinygrad even when used as a library.
Variable | Possible Value(s) | Description
---|---|---
DEBUG | [1-6] | enable debugging output, with 4 you get operations, timings, speed, generated code and more
GPU | [1] | enable the GPU backend
CUDA | [1] | enable CUDA backend
AMD | [1] | enable AMD backend
NV | [1] | enable NV backend
METAL | [1] | enable Metal backend (for Mac M1 and after)
METAL_XCODE | [1] | enable Metal using macOS Xcode SDK
CLANG | [1] | enable Clang backend
LLVM | [1] | enable LLVM backend
BEAM | [#] | number of beams in kernel beam search
GRAPH | [1] | create a graph of all operations (requires graphviz)
GRAPHUOPS | [1] | create a graph of uops (requires graphviz and saves at /tmp/uops.{svg,dot})
GRAPHPATH | [/path/to] | where to put the generated graph
DEFAULT_FLOAT | [HALF, ...]| specify the default float dtype (FLOAT32, HALF, BFLOAT16, FLOAT64, ...), default to FLOAT32
IMAGE | [1-2] | enable 2d specific optimizations
FLOAT16 | [1] | use float16 for images instead of float32
PTX | [1] | enable the specialized [PTX](https://docs.nvidia.com/cuda/parallel-thread-execution/) assembler for Nvidia GPUs. If not set, defaults to generic CUDA codegen backend.
PROFILE | [1] | enable output of [perfetto](https://ui.perfetto.dev/) compatible profile. This feature is supported in NV and AMD backends.