move graph.py and jit.py into features (#3376)
* move graph.py into features
* move jit into features
* fix quickstart
@@ -277,7 +277,7 @@ result = Tensor(2.0) + Tensor(3.0)
# we have a global cache used by the JIT
# from there, we can see the generated clang code
-from tinygrad.jit import CacheCollector
+from tinygrad.features.jit import CacheCollector
CacheCollector.start() # enables the cache
result.realize() # create the program and runs it
cache_saved = CacheCollector.finish() # disable the cache
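For reference, here is the same cache flow as a minimal self-contained sketch using the relocated import (the Tensor math and the comments mirror the hunk's context lines; the final print is just illustrative):

```python
from tinygrad.tensor import Tensor
from tinygrad.features.jit import CacheCollector  # new location per this commit

result = Tensor(2.0) + Tensor(3.0)        # lazy; nothing has run yet

CacheCollector.start()                    # enables the cache
result.realize()                          # creates the program and runs it
cache_saved = CacheCollector.finish()     # disables the cache, returns what was captured
print(len(cache_saved))                   # how many program invocations were recorded
```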
@@ -91,7 +91,7 @@ sched = out.schedule()
for si in sched: print(si.ast.op) # NOTE: the first two convert it to CLANG

# DEBUGGING: print the compute ast as a tree
-from tinygrad.graph import print_tree
+from tinygrad.features.graph import print_tree
print_tree(sched[-1].ast)
# NOTE: sched[-1].ast is the same as st_0 above
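Similarly, a minimal sketch of the relocated print_tree (assuming the LazyBuffer .schedule() that the walkthrough's `sched = out.schedule()` relies on; the tiny Tensor sum here just stands in for the doc's `out`):

```python
from tinygrad.tensor import Tensor
from tinygrad.features.graph import print_tree  # new location per this commit

out = (Tensor([1.0, 2.0]) + Tensor([3.0, 4.0])).lazydata
sched = out.schedule()              # list of ScheduleItems for the pending compute
for si in sched: print(si.ast.op)   # op at the root of each scheduled kernel's AST
print_tree(sched[-1].ast)           # DEBUGGING: dump the last kernel's AST as a tree
```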
docs/linearizer_v2.md (new file, 29 lines)
@@ -0,0 +1,29 @@
At base, the Linearizer is a function that takes an AST + opts -> uops.

It should be rewritten like this. The AST can't be a LazyOp, because it should be able to have multiple outputs.

We need a generic class to represent DAGs.
This refactor is probably a prereq for the new linearizer, and can be used on existing uops also.
Can this class also represent the large graph? The op graph is a subset of the large graph.
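As a sketch of what that generic class could look like (purely illustrative names, not tinygrad API and not part of this commit): a node with sources, plus a wrapper holding a tuple of outputs.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True, eq=False)  # identity-hashed nodes, usable in sets/dicts
class DAGNode:
  op: str                          # what this node computes
  srcs: Tuple[DAGNode, ...] = ()   # edges to the nodes it reads from
  arg: Any = None                  # op-specific payload

@dataclass(frozen=True, eq=False)
class DAG:
  outputs: Tuple[DAGNode, ...]     # multiple outputs, unlike a single-rooted LazyOp
```

Because it carries a tuple of outputs rather than a single root, the same structure can in principle describe both one kernel's op graph and the larger graph it is a subset of.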
Currently the Linearizer is merging many concerns:

1. LocalBuffers are added. These should be added to the upper DAG, for both grouping and tensor cores. Some opts are used here. NOTE: currently reduce splitting is done in lazy.py and it shouldn't be.
2. The ShapeTrackers at the edges are collected and modified according to the other opts.
3. The Ops are toposorted (see the sketch after this list).
4. The Ops are lowered to UOps. This requires expansion and loop assignment, potentially to global dimensions.
5. The indexes into the Tensor are computed from the ShapeTrackers.
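For concern 3, toposorting such a DAG is a short post-order walk; an illustrative sketch on top of the hypothetical DAGNode/DAG above:

```python
from typing import List, Set

def toposort(dag: DAG) -> List[DAGNode]:
  # post-order DFS: every node is emitted after all of its srcs
  visited: Set[DAGNode] = set()
  order: List[DAGNode] = []
  def visit(n: DAGNode) -> None:
    if n in visited: return
    visited.add(n)
    for s in n.srcs: visit(s)
    order.append(n)
  for o in dag.outputs: visit(o)
  return order

a, b = DAGNode("LOAD"), DAGNode("LOAD")
print([n.op for n in toposort(DAG((DAGNode("ADD", (a, b)),)))])  # ['LOAD', 'LOAD', 'ADD']
```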
More generically, the whole network is a DAG. Ignore the forward/backward stuff, I'm fine with starting at the LazyBuffer level.

1. Is it possible to put an entire network in a single kernel? I think the answer has to be yes, but you may end up doing an absolutely crazy amount of recomputation. This should still be doable to check correctness.
2. You can use intermediate buffers, be they local or global, to do less compute.
This is a rewrite of a lot of tinygrad. I don't think continuing to support Interpreted backends is worth it, and we have to deal with disk in a smart way.

We keep the frontend: tensor.py + mlops.py + lazy.py
We keep the backend (renderer/runtime): cstyle.py + device.py + ops_*.py
We keep the shapetracker/symbolic: shapetracker.py + view.py + symbolic.py
We keep the features and nn stuff.
But codegen is all rewritten.
@@ -247,7 +247,7 @@ To use the JIT we just need to add a function decorator to the forward pass of o
Or in this case we will create a wrapper function and decorate the wrapper function to speed up the evaluation of our neural network.

```python
-from tinygrad.jit import TinyJit
+from tinygrad import TinyJit

@TinyJit
def jit(x):
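# A minimal usage sketch with the new top-level import (an illustration, not part of
# the quickstart diff above; the doubling function and the warm-up loop are assumptions):
from tinygrad import Tensor, TinyJit

@TinyJit
def jit(x: Tensor) -> Tensor:
  return (x * 2.0).realize()          # the jitted function should return realized Tensors

for _ in range(5):                    # early calls run and capture kernels, later calls replay them
  out = jit(Tensor([1.0, 2.0, 3.0]))  # inputs must keep the same shapes across calls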