diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000000..052347e02e
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,125 @@
+### Welcome to the tinygrad documentation
+
+General instructions can be found in the main [README.md](https://github.com/geohot/tinygrad/blob/master/README.md).
+
+[abstractions.py](https://github.com/geohot/tinygrad/blob/master/docs/abstractions.py) is a well-documented showcase of the abstraction stack.
+
+There are plenty of [tests](https://github.com/geohot/tinygrad/tree/master/test) you can read through.
+[Examples](https://github.com/geohot/tinygrad/tree/master/examples) contains tinygrad implementations of popular vision and language models and other neural networks: LLaMA, Stable Diffusion, GANs and YOLO, to name a few.
+
+### Environment variables
+
+Here is a list of environment variables you can use with tinygrad.
+Most of these are self-explanatory and are used to enable an option at runtime.
+Example: `GPU=1 DEBUG=4 python3 -m pytest`
+
+The columns are: Variable, Value and Description.
+The variables are grouped into general tinygrad variables and variables specific to a file.
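+
+Most of these flags are read with the small `getenv` helper in `tinygrad/helpers.py`, which casts the environment string to the type of the default value. A minimal sketch of that pattern (the defaults shown here are illustrative; the real call sites live in the files listed below):
+
+```python
+# Sketch only: how a tinygrad file typically reads its flags via tinygrad.helpers.getenv.
+from tinygrad.helpers import getenv
+
+DEBUG = getenv("DEBUG")                        # int, 0 when the variable is unset
+GRAPH = getenv("GRAPH")                        # int, 0 or 1
+GRAPHPATH = getenv("GRAPHPATH", "/tmp/graph")  # a string default gives a string result (path here is illustrative)
+
+if DEBUG >= 4:
+  print("DEBUG=4: operations, timings, speed and generated code get printed")
+```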
+
+##### General tinygrad
+
+Variable | Value | Description
+---|---|---
+DEBUG | [1-4] | enable debugging output; with 4 you get operations, timings, speed, generated code and more
+GPU | [1] | enable the GPU backend
+CPU | [1] | enable the CPU backend
+MPS | [1] | enable the MPS device (for Mac M1 and later)
+METAL | [1] | enable the Metal backend (for Mac M1 and later)
+METAL_XCODE | [1] | enable Metal using the macOS Xcode SDK
+TORCH | [1] | enable the Torch backend
+CLANG | [1] | enable the Clang backend
+LLVM | [1] | enable the LLVM backend
+LLVMOPT | [1] | enable LLVM optimization
+LAZY | [1] | enable lazy operations
+OPT | [1-4] | enable optimization
+OPTLOCAL | [1] | enable local optimization
+JIT | [1] | enable the JIT
+GRAPH | [1] | create a graph of all operations
+GRAPHPATH | [/path/to] | where to write the generated graph image
+PRUNEGRAPH | [1] | prune MovementOps and LoadOps from the graph
+PRINT_PRG | [1] | print the generated program
+FLOAT16 | [1] | use float16 instead of float32
+ENABLE_METHOD_CACHE | [1] | enable the method cache
+EARLY_STOPPING | [1] | stop early
+DISALLOW_ASSIGN | [1] | do not assign the realized lazydata to the lazy output buffer
+
+##### tinygrad/codegen/cstyle.py
+
+Variable | Value | Description
+---|---|---
+NATIVE_EXPLOG | [1] | use native exp and log operations
+
+##### accel/ane/2_compile/hwx_parse.py
+
+Variable | Value | Description
+---|---|---
+PRINTALL | [1] | print all ANE registers
+
+##### extra/onnx.py
+
+Variable | Value | Description
+---|---|---
+ONNXLIMIT | [ ] | set a limit for ONNX
+DEBUGONNX | [1] | enable ONNX debugging
+
+##### extra/thneed.py
+
+Variable | Value | Description
+---|---|---
+DEBUGCL | [1-4] | enable debugging for OpenCL
+PRINT_KERNEL | [1] | print OpenCL kernels
+
+##### extra/kernel_search.py
+
+Variable | Value | Description
+---|---|---
+OP | [1-3] | choose between different operations
+NOTEST | [1] | do not test the AST
+DUMP | [1] | dump the intervention cache
+REDUCE | [1] | enable reduce operations
+SIMPLE_REDUCE | [1] | enable simpler reduce operations
+BC | [1] | enable big conv operations
+CONVW | [1] | enable convw operations
+FASTCONV | [1] | enable faster conv operations
+GEMM | [1] | enable general matrix multiply operations
+BROKEN | [1] | enable a kind of operation
+BROKEN3 | [1] | enable a kind of operation
+
+##### examples/vit.py
+
+Variable | Value | Description
+---|---|---
+LARGE | [1] | enable the larger-dimension model
+
+##### examples/llama.py
+
+Variable | Value | Description
+---|---|---
+WEIGHTS | [1] | enable using the weights
+
+##### examples/mlperf
+
+Variable | Value | Description
+---|---|---
+MODEL | [resnet,retinanet,unet3d,rnnt,bert,maskrcnn] | which model to use
+
+##### examples/benchmark_train_efficientnet.py
+
+Variable | Value | Description
+---|---|---
+CNT | [10] | number of times to loop the benchmark
+BACKWARD | [1] | enable the backward pass
+TRAINING | [1] | set Tensor.training
+CLCACHE | [1] | enable the cache for OpenCL
+
+##### examples/hlb_cifar10.py
+
+Variable | Value | Description
+---|---|---
+TORCHWEIGHTS | [1] | use torch to initialize the weights
+DISABLE_BACKWARD | [1] | don't use backward operations
+
+##### examples/benchmark_train_efficientnet.py & examples/hlb_cifar10.py
+
+Variable | Value | Description
+---|---|---
+ADAM | [1] | use the Adam optimizer
+
+##### examples/hlb_cifar10.py & examples/hlb_cifar10_torch.py
+
+Variable | Value | Description
+---|---|---
+STEPS | [0-10] | number of steps
+FAKEDATA | [1] | use random data instead of the real dataset
+
+##### examples/train_efficientnet.py
+
+Variable | Value | Description
+---|---|---
+STEPS | [divisor of 1024] | number of steps
+TINY | [1] | use a tiny convolution network
+IMAGENET | [1] | use ImageNet for training
+
+##### examples/train_efficientnet.py & examples/train_resnet.py
+
+Variable | Value | Description
+---|---|---
+TRANSFER | [1] | use pretrained weights
+
+##### examples & test/external/external_test_opt.py
+
+Variable | Value | Description
+---|---|---
+NUM | [18, 2] | which ResNet[18] / EfficientNet[2] to train
+
+##### test/test_ops.py
+
+Variable | Value | Description
+---|---|---
+PRINT_TENSORS | [1] | print tensors
+FORWARD_ONLY | [1] | use forward operations only
+
+##### test/test_speed_v_torch.py
+
+Variable | Value | Description
+---|---|---
+TORCHCUDA | [1] | enable the torch CUDA backend
+
+##### test/external/external_test_gpu_ast.py
+
+Variable | Value | Description
+---|---|---
+KOPT | [1] | enable kernel optimization
+KCACHE | [1] | enable the kernel cache
+
+##### test/external/external_test_opt.py
+
+Variable | Value | Description
+---|---|---
+ENET_NUM | [-2,-1] | which EfficientNet to use
+
+##### test/test_dtype.py & test/extra/test_utils.py & extra/training.py
+
+Variable | Value | Description
+---|---|---
+CI | [1] | set in CI to skip some tests
+
+##### examples & extra & test
+
+Variable | Value | Description
+---|---|---
+BS | [8, 16, 32, 64, 128] | batch size
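+
+All of the above are plain environment variables, so they can also be set from Python before the tinygrad modules that read them are imported, which is handy in notebooks. A small sketch (the script body is only an example):
+
+```python
+# Roughly equivalent to `GPU=1 DEBUG=2 python3 script.py`.
+# Set the variables before importing the tinygrad modules that read them.
+import os
+os.environ["GPU"] = "1"
+os.environ["DEBUG"] = "2"
+
+from tinygrad.tensor import Tensor
+
+x = Tensor.randn(4, 4)
+print((x @ x.transpose()).numpy())
+```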