mirror of https://github.com/tinygrad/tinygrad.git synced 2026-01-10 07:28:15 -05:00

George Hotz 18892242b0 global -> group (#1007 )

2023-06-21 11:50:43 -07:00

List of environment variables that control tinygrad behavior.

This is a list of environment variables that control the runtime behavior of tinygrad and its examples. Most of them are self-explanatory and are used to set an option at runtime.

Example: `GPU=1 DEBUG=4 python3 -m pytest`
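
The `VAR=value command` prefix in the example is shell syntax that sets the variables for that one command only. A rough Python equivalent, useful when launching a run from a script, is sketched below. The child command here is a stand-in that only echoes the two variables; it is an assumption for illustration and not part of tinygrad.

```python
import os
import subprocess
import sys

# Copy the current environment and add the overrides for the child only.
env = dict(os.environ, GPU="1", DEBUG="4")

# Stand-in child command (assumption for illustration): it just prints the
# two variables. In practice this could be ["python3", "-m", "pytest"].
child = [sys.executable, "-c",
         "import os; print(os.environ['GPU'], os.environ['DEBUG'])"]

result = subprocess.run(child, env=env, capture_output=True, text=True, check=True)
child_output = result.stdout.strip()
print(child_output)
```

Passing `env=` leaves the parent process's environment untouched, which mirrors the one-command scope of the shell prefix.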

The columns are: Variable, Possible Value(s) and Description.

A # means that the variable can take any integer value.
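
As a minimal sketch of what "any integer value" means in practice, the snippet below reads a variable as an integer and falls back to a default when it is unset. `read_int_env` is a hypothetical helper written for this example, not tinygrad's own implementation.

```python
import os

def read_int_env(name: str, default: int = 0) -> int:
    """Hypothetical helper: integer value of an environment variable,
    or `default` when the variable is not set."""
    raw = os.environ.get(name)
    return default if raw is None else int(raw)

# Simulate `DEBUG=4` being set and `EARLY_STOPPING` being unset.
os.environ["DEBUG"] = "4"
os.environ.pop("EARLY_STOPPING", None)

debug_level = read_int_env("DEBUG")
early_stopping = read_int_env("EARLY_STOPPING")
print(debug_level, early_stopping)
```

A non-integer value such as `DEBUG=high` would raise `ValueError` in this sketch; whether tinygrad behaves the same way is not stated here.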

Global Variables

These control the behavior of core tinygrad even when used as a library.

Variable	Possible Value(s)	Description
DEBUG	[1-4]	enable debugging output; at 4 you get operations, timings, speed, generated code, and more
GPU	[1]	enable the GPU backend
CUDA	[1]	enable CUDA backend
CPU	[1]	enable CPU backend
MPS	[1]	enable MPS device (for Mac M1 and after)
METAL	[1]	enable Metal backend (for Mac M1 and after)
METAL_XCODE	[1]	enable Metal using macOS Xcode SDK
TORCH	[1]	enable PyTorch backend
CLANG	[1]	enable Clang backend
LLVM	[1]	enable LLVM backend
LLVMOPT	[1]	enable slightly more expensive LLVM optimizations
LAZY	[1]	enable lazy operations (this is the default)
OPT	[1-4]	optimization level
GRAPH	[1]	create a graph of all operations (requires graphviz)
GRAPHPATH	[/path/to]	where to put the generated graph
PRUNEGRAPH	[1]	prune MovementOps and LoadOps from the graph
PRINT_PRG	[1]	print program code
IMAGE	[1]	enable 2d specific optimizations
FLOAT16	[1]	use float16 for images instead of float32
ENABLE_METHOD_CACHE	[1]	enable method cache (this is the default)
EARLY_STOPPING	[# > 0]	stop after this many kernels
DISALLOW_ASSIGN	[1]	disallow assignment of tensors
CL_EXCLUDE	[name0,name1]	comma-separated list of device names to exclude when using OpenCL GPU backend (like `CL_EXCLUDE=gfx1036`)
CL_PLATFORM	[# >= 0]	index of the OpenCL platform to run on. Defaults to 0.
RDNA	[1]	enable the specialized RDNA 3 assembler for AMD 7000-series GPUs. If not set, defaults to the generic OpenCL codegen backend.
PTX	[1]	enable the specialized PTX assembler for Nvidia GPUs. If not set, defaults to the generic CUDA codegen backend.
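
To illustrate the comma-separated format used by `CL_EXCLUDE`, here is a small sketch that filters a list of device names against such a value. `filter_devices` and the device names are hypothetical, and exact-name matching is an assumption; the OpenCL backend may match names differently.

```python
def filter_devices(device_names, exclude_spec):
    """Hypothetical helper: drop every device whose name appears in a
    comma-separated exclusion string such as "gfx1036,othername"."""
    excluded = {name.strip() for name in exclude_spec.split(",") if name.strip()}
    return [d for d in device_names if d not in excluded]

# Made-up device names for illustration only.
devices = ["gfx1036", "gfx1100"]

kept = filter_devices(devices, "gfx1036")   # as with CL_EXCLUDE=gfx1036
kept_all = filter_devices(devices, "")      # as with CL_EXCLUDE unset or empty
print(kept, kept_all)
```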

File Specific Variables

These variables control the behavior of a specific file and usually don't affect the library itself. Most of them will rarely be needed, but they are listed here for completeness.

accel/ane/2_compile/hwx_parse.py

Variable	Possible Value(s)	Description
PRINTALL	[1]	print all ANE registers

extra/onnx.py

Variable	Possible Value(s)	Description
ONNXLIMIT	[#]	set a limit for ONNX
DEBUGONNX	[1]	enable ONNX debugging

extra/thneed.py

Variable	Possible Value(s)	Description
DEBUGCL	[1-4]	enable debugging for OpenCL
PRINT_KERNEL	[1]	print OpenCL kernels

extra/kernel_search.py

Variable	Possible Value(s)	Description
OP	[1-3]	different operations
NOTEST	[1]	skip testing the AST
DUMP	[1]	enable dumping of intervention cache
REDUCE	[1]	enable reduce operations
SIMPLE_REDUCE	[1]	enable simpler reduce operations
BC	[1]	enable big conv operations
CONVW	[1]	enable convw operations
FASTCONV	[1]	enable faster conv operations
GEMM	[1]	enable general matrix multiply operations
BROKEN	[1]	enable a kind of operation
BROKEN3	[1]	enable a kind of operation

examples/vit.py

Variable	Possible Value(s)	Description
LARGE	[1]	enable larger dimension model

examples/llama.py

Variable	Possible Value(s)	Description
WEIGHTS	[1]	enable loading weights

examples/mlperf

Variable	Possible Value(s)	Description
MODEL	[resnet,retinanet,unet3d,rnnt,bert,maskrcnn]	what models to use

examples/benchmark_train_efficientnet.py

Variable	Possible Value(s)	Description
CNT	[10]	the number of times to loop the benchmark
BACKWARD	[1]	enable backward pass
TRAINING	[1]	set Tensor.training
CLCACHE	[1]	enable cache for OpenCL

examples/hlb_cifar10.py

Variable	Possible Value(s)	Description
TORCHWEIGHTS	[1]	use torch to initialize weights
DISABLE_BACKWARD	[1]	don't do backward pass

examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py

Variable	Possible Value(s)	Description
ADAM	[1]	use the Adam optimizer

examples/hlb_cifar10.py & examples/hlb_cifar10_torch.py

Variable	Possible Value(s)	Description
STEPS	[0-10]	number of steps
FAKEDATA	[1]	use random data

examples/train_efficientnet.py

Variable	Possible Value(s)	Description
STEPS	[# % 1024]	number of steps
TINY	[1]	use a tiny convolution network
IMAGENET	[1]	use imagenet for training

examples/train_efficientnet.py & examples/train_resnet.py

Variable	Possible Value(s)	Description
TRANSFER	[1]	enable the use of pretrained data

examples & test/external/external_test_opt.py

Variable	Possible Value(s)	Description
NUM	[18, 2]	which ResNet (e.g. 18) / EfficientNet (e.g. 2) to train

test/test_ops.py

Variable	Possible Value(s)	Description
PRINT_TENSORS	[1]	print tensors
FORWARD_ONLY	[1]	use forward operations only

test/test_speed_v_torch.py

Variable	Possible Value(s)	Description
TORCHCUDA	[1]	enable the torch cuda backend

test/external/external_test_gpu_ast.py

Variable	Possible Value(s)	Description
KOPT	[1]	enable kernel optimization
KCACHE	[1]	enable kernel cache

test/external/external_test_opt.py

Variable	Possible Value(s)	Description
ENET_NUM	[-2,-1]	which EfficientNet to use

test/test_dtype.py & test/extra/test_utils.py & extra/training.py

Variable	Possible Value(s)	Description
CI	[1]	disables some tests for CI
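
A flag like `CI` is typically consumed by a skip decorator in the test file. The sketch below shows one way to do that with the standard `unittest` module. `running_in_ci` and the test class are hypothetical, and treating any value other than empty or `0` as "in CI" is an assumption, not necessarily how the tinygrad tests read the flag.

```python
import os
import unittest

def running_in_ci(environ=os.environ) -> bool:
    """Hypothetical check: any CI value other than "" or "0" counts as CI."""
    return environ.get("CI", "0") not in ("", "0")

class ExampleTests(unittest.TestCase):
    # The test is skipped when the process was started with e.g. CI=1.
    @unittest.skipIf(running_in_ci(), "disabled for CI")
    def test_expensive(self):
        self.assertEqual(sum(range(4)), 6)

# The check can be exercised without touching the real environment:
ci_on = running_in_ci({"CI": "1"})
ci_off = running_in_ci({})
print(ci_on, ci_off)
```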

examples & extra & test

Variable	Possible Value(s)	Description
BS	[8, 16, 32, 64, 128]	batch size to use

datasets/imagenet_download.py

Variable	Possible Value(s)	Description
IMGNET_TRAIN	[1]	also download the ImageNet training data
