# Claude Code Guide for tinygrad
## Architecture Overview

tinygrad compiles tensor operations into optimized kernels. The pipeline:

- **Tensor** (`tensor.py`) - User-facing API, creates the UOp graph
- **UOp** (`uop/ops.py`) - Unified IR for all operations (both tensor and kernel level)
- **Schedule** (`engine/schedule.py`, `schedule/`) - Converts tensor UOps to kernel UOps
- **Codegen** (`codegen/`) - Converts kernel UOps to device code
- **Runtime** (`runtime/`) - Device-specific execution
## Key Concepts

### UOp (Universal Operation)

Everything is a UOp - tensors, operations, buffers, kernels. Key properties:

- `op`: The operation type (Ops enum)
- `dtype`: Data type
- `src`: Tuple of source UOps
- `arg`: Operation-specific argument
- `tag`: Optional tag for graph transformations
UOps are immutable and cached - creating the same UOp twice returns the same object (ucache).
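The interning behavior can be illustrated with a few lines of plain Python. This is a toy stand-in for the idea only, not tinygrad's actual `ucache` implementation:

```python
# Toy interning cache: structurally identical nodes share one object.
_ucache = {}

class MiniUOp:
  def __init__(self, op, src=(), arg=None):
    self.op, self.src, self.arg = op, src, arg

def uop(op, src=(), arg=None):
  key = (op, src, arg)  # src must be a tuple of already-interned MiniUOps
  if key not in _ucache: _ucache[key] = MiniUOp(op, src, arg)
  return _ucache[key]

a = uop("CONST", arg=1.0)
b = uop("CONST", arg=1.0)
assert a is b  # same structure -> the very same object
```

Because nodes are interned, structural equality checks reduce to cheap identity checks, which is what makes pattern matching over large graphs fast.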
### PatternMatcher

Used extensively for graph transformations:

```python
pm = PatternMatcher([
  (UPat(Ops.ADD, src=(UPat.cvar("x"), UPat.cvar("x"))), lambda x: x * 2),
])
result = graph_rewrite(uop, pm)
```
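To make the rewrite mechanics concrete, here is a toy rewriter for the same `x + x -> x * 2` rule, using a stand-in `Node` class; tinygrad's real `UPat`/`graph_rewrite` machinery is far more general:

```python
# Stand-in expression node for illustration; not tinygrad's UOp.
class Node:
  def __init__(self, op, *src, arg=None):
    self.op, self.src, self.arg = op, src, arg

def rewrite(n):
  # Match x + x (the same source object twice) and fold it to x * 2;
  # otherwise rebuild the node with rewritten sources.
  if n.op == "ADD" and len(n.src) == 2 and n.src[0] is n.src[1]:
    return Node("MUL", rewrite(n.src[0]), Node("CONST", arg=2))
  return Node(n.op, *(rewrite(s) for s in n.src), arg=n.arg)

x = Node("VAR", arg="x")
folded = rewrite(Node("ADD", x, x))
assert folded.op == "MUL"
```

Note the identity comparison: because real UOps are interned, "both sources are the same variable" is exactly an `is` check, which is what the repeated `UPat.cvar("x")` name expresses.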
### Schedule Cache

Schedules are cached by graph structure. BIND nodes (variables with bound values) are unbound before the cache key is computed, so the same symbolic graph with different bound values hits the same cache entry.
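The unbinding idea can be sketched with nested tuples standing in for the UOp graph (an assumed structure for illustration only, not tinygrad's representation):

```python
# Cache keys drop bound values from BIND nodes, so the same symbolic
# graph with different bound values maps to one cached schedule.
def cache_key(graph):
  if isinstance(graph, tuple) and graph and graph[0] == "BIND":
    return ("BIND", graph[1])  # keep the variable name, drop the value
  if isinstance(graph, tuple):
    return tuple(cache_key(g) for g in graph)
  return graph

k10 = cache_key(("ADD", ("BIND", "i", 10), ("CONST", 1)))
k20 = cache_key(("ADD", ("BIND", "i", 20), ("CONST", 1)))
assert k10 == k20  # different bound values hit the same cache entry
```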
## Directory Structure

```
tinygrad/
├── tensor.py        # Tensor class, user API
├── device.py        # Buffer, device management
├── dtype.py         # Data types
├── helpers.py       # Utilities, environment vars
├── uop/
│   ├── ops.py       # UOp class, Ops enum, PatternMatcher
│   ├── spec.py      # UOp type verification
│   └── symbolic.py  # Symbolic math simplification
├── engine/
│   ├── schedule.py  # Schedule creation, caching
│   ├── realize.py   # Tensor realization
│   ├── jit.py       # JIT compilation
│   └── memory.py    # Memory planning
├── schedule/
│   ├── rangeify.py  # Convert movements to ranges
│   └── indexing.py  # Index calculations
├── codegen/
│   ├── kernel.py    # Kernel optimization
│   └── uopgraph.py  # UOp graph transformations
├── renderer/        # Code generation (CUDA, Metal, etc.)
└── runtime/         # Device backends
```
## Testing

```shell
# Run a specific test
python -m pytest test/unit/test_schedule_cache.py -xvs

# Run with a timeout
python -m pytest test/test_symbolic_ops.py -x --timeout=60

# Debug with prints
DEBUG=2 python -m pytest test/test_schedule.py::test_name -xvs

# Visualize UOp graphs
VIZ=1 python -c "from tinygrad import Tensor; Tensor.ones(10).sum().realize()"
```
## Common Environment Variables

- `DEBUG=1-4` - Increasing verbosity
- `VIZ=1` - Enable graph visualization
- `SPEC=1` - Enable UOp spec verification
- `NOOPT=1` - Disable optimizations
- `DEVICE=CPU/CUDA/AMD/METAL` - Set default device
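These are plain environment variables, so a script can read them the way tinygrad's `helpers.py` does. A minimal stdlib-only sketch (the real helper's name and signature may differ):

```python
import os

def getenv_int(key, default=0):
  # Read an integer-valued env var, e.g. `DEBUG=2 python script.py`
  return int(os.getenv(key, default))

DEBUG = getenv_int("DEBUG")
if DEBUG >= 2:
  print("verbose logging enabled")
```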
## Debugging Tips

- Print UOp graphs: `print(tensor.uop)` or `print(tensor.uop.sink())`
- Check the schedule: `tensor.schedule()` returns a list of ScheduleItems
- Trace graph rewrites: use `VIZ=1` or add prints in PatternMatcher callbacks
- Find UOps by type: `[u for u in uop.toposort() if u.op is Ops.SOMETHING]`
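The `toposort()` idiom above walks the graph sources-first, so filters like the one shown see each node exactly once. A toy version over a stand-in `Node` class (not tinygrad's UOp):

```python
# Depth-first topological sort of a DAG: sources before consumers,
# each node visited once even if it is shared.
class Node:
  def __init__(self, op, *src):
    self.op, self.src = op, src

def toposort(root):
  order, seen = [], set()
  def visit(n):
    if id(n) in seen: return
    seen.add(id(n))
    for s in n.src: visit(s)  # sources first
    order.append(n)
  visit(root)
  return order

a = Node("CONST"); b = Node("ADD", a, a); c = Node("MUL", b, a)
assert [n.op for n in toposort(c)] == ["CONST", "ADD", "MUL"]
assert len([n for n in toposort(c) if n.op == "CONST"]) == 1  # shared, seen once
```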
## Style Notes

- 2-space indentation, 150-character line limit
- PatternMatchers should be defined at module level (they are slow to construct)
- Prefer `graph_rewrite` over manual graph traversal
- UOp methods like `.replace()` preserve tags unless explicitly changed
- Use `.rtag(value)` to add tags to UOps
## Common Patterns

### Graph Transformation

```python
def my_transform(ctx, x):
  # Return a new UOp, or None to skip the rewrite
  return x.replace(arg=new_arg)

pm = PatternMatcher([
  (UPat(Ops.SOMETHING, name="x"), my_transform),
])
result = graph_rewrite(input_uop, pm, ctx={})
```
### Finding Variables

```python
# Get all variables in a UOp graph
variables = uop.variables()

# Get a bound variable's value
var, val = bind_uop.unbind()
```
### Shape Handling

```python
# Shapes can be symbolic (contain UOps)
shape = tensor.shape  # tuple[sint, ...] where sint = int | UOp
```
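A tiny sketch of working with the `int | UOp` union, using a hypothetical `Symbol` class as a stand-in for a variable UOp in a shape:

```python
import math

class Symbol:
  # Stand-in for a variable UOp appearing as a symbolic dimension
  def __init__(self, name): self.name = name

def numel(shape):
  # Product of the dims if the shape is fully concrete, else None
  # to signal that the element count is symbolic.
  if all(isinstance(d, int) for d in shape):
    return math.prod(shape)
  return None

assert numel((4, 8)) == 32
assert numel((Symbol("n"), 8)) is None
```

Code that consumes shapes has to branch the same way: handle plain ints directly, and defer anything involving a UOp to symbolic simplification.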