move graph.py and jit.py into features (#3376)
* move graph.py into features
* move jit into features
* fix quickstart
@@ -277,7 +277,7 @@ result = Tensor(2.0) + Tensor(3.0)
# we have a global cache used by the JIT
# from there, we can see the generated clang code
-from tinygrad.jit import CacheCollector
+from tinygrad.features.jit import CacheCollector
CacheCollector.start() # enables the cache
result.realize() # create the program and runs it
cache_saved = CacheCollector.finish() # disable the cache
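For reference, here is the same cache flow as a minimal self-contained sketch using the relocated import (the Tensor math and the comments mirror the hunk's context lines; the final print is just illustrative):

```python
from tinygrad.tensor import Tensor
from tinygrad.features.jit import CacheCollector  # new location per this commit

result = Tensor(2.0) + Tensor(3.0)        # lazy; nothing has run yet

CacheCollector.start()                    # enables the cache
result.realize()                          # creates the program and runs it
cache_saved = CacheCollector.finish()     # disables the cache, returns what was captured
print(len(cache_saved))                   # how many program invocations were recorded
```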
@@ -91,7 +91,7 @@ sched = out.schedule()
for si in sched: print(si.ast.op) # NOTE: the first two convert it to CLANG

# DEBUGGING: print the compute ast as a tree
-from tinygrad.graph import print_tree
+from tinygrad.features.graph import print_tree
print_tree(sched[-1].ast)
# NOTE: sched[-1].ast is the same as st_0 above
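Similarly, a minimal sketch of the relocated print_tree (assuming the LazyBuffer .schedule() that the walkthrough's `sched = out.schedule()` relies on; the tiny Tensor sum here just stands in for the doc's `out`):

```python
from tinygrad.tensor import Tensor
from tinygrad.features.graph import print_tree  # new location per this commit

out = (Tensor([1.0, 2.0]) + Tensor([3.0, 4.0])).lazydata
sched = out.schedule()              # list of ScheduleItems for the pending compute
for si in sched: print(si.ast.op)   # op at the root of each scheduled kernel's AST
print_tree(sched[-1].ast)           # DEBUGGING: dump the last kernel's AST as a tree
```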
docs/linearizer_v2.md (new file, 29 lines)
@@ -0,0 +1,29 @@
At base, the Linearizer is a function that takes an AST + opts -> uops.

It should be rewritten like this. The AST can't be a LazyOp, because it should be able to have multiple outputs.

We need a generic class to represent DAGs.
This refactor is probably a prereq for the new linearizer, and can be used on existing uops also.
Can this class also represent the large graph? The op graph is a subset of the large graph.
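As a sketch of what that generic class could look like (purely illustrative names, not tinygrad API and not part of this commit): a node with sources, plus a wrapper holding a tuple of outputs.

```python
from __future__ import annotations
from dataclasses import dataclass
from typing import Any, Tuple

@dataclass(frozen=True, eq=False)  # identity-hashed nodes, usable in sets/dicts
class DAGNode:
  op: str                          # what this node computes
  srcs: Tuple[DAGNode, ...] = ()   # edges to the nodes it reads from
  arg: Any = None                  # op-specific payload

@dataclass(frozen=True, eq=False)
class DAG:
  outputs: Tuple[DAGNode, ...]     # multiple outputs, unlike a single-rooted LazyOp
```

Because it carries a tuple of outputs rather than a single root, the same structure can in principle describe both one kernel's op graph and the larger graph it is a subset of.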
Currently the Linearizer is merging many concerns:

1. LocalBuffers are added. These should be added to the upper DAG, for both grouping and tensor cores. Some opts are used here. NOTE: currently reduce splitting is done in lazy.py and it shouldn't be.
2. The ShapeTrackers at the edges are collected and modified according to the other opts.
3. The Ops are toposorted (see the sketch after this list).
4. The Ops are lowered to UOps. This requires expansion and loop assignment, potentially to global dimensions.
5. The indexes into the Tensor are computed from the ShapeTrackers.
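For concern 3, toposorting such a DAG is a short post-order walk; an illustrative sketch on top of the hypothetical DAGNode/DAG above:

```python
from typing import List, Set

def toposort(dag: DAG) -> List[DAGNode]:
  # post-order DFS: every node is emitted after all of its srcs
  visited: Set[DAGNode] = set()
  order: List[DAGNode] = []
  def visit(n: DAGNode) -> None:
    if n in visited: return
    visited.add(n)
    for s in n.srcs: visit(s)
    order.append(n)
  for o in dag.outputs: visit(o)
  return order

a, b = DAGNode("LOAD"), DAGNode("LOAD")
print([n.op for n in toposort(DAG((DAGNode("ADD", (a, b)),)))])  # ['LOAD', 'LOAD', 'ADD']
```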
More generically, the whole network is a DAG. Ignore the forward/backward stuff, I'm fine with starting at the LazyBuffer level.

1. Is it possible to put an entire network in a single kernel? I think the answer has to be yes, but you may end up doing an absolutely crazy amount of recomputation. This should still be doable to check correctness.
2. You can use intermediate buffers, be they local or global, to do less compute.
This is a rewrite of a lot of tinygrad. I don't think continuing to support Interpreted backends is worth it, and we have to deal with disk in a smart way.

We keep the frontend: tensor.py + mlops.py + lazy.py
We keep the backend (renderer/runtime): cstyle.py + device.py + ops_*.py
We keep the shapetracker/symbolic: shapetracker.py + view.py + symbolic.py
We keep the features and nn stuff.
But codegen is all rewritten.
@@ -247,7 +247,7 @@ To use the JIT we just need to add a function decorator to the forward pass of o
Or in this case we will create a wrapper function and decorate the wrapper function to speed up the evaluation of our neural network.

```python
-from tinygrad.jit import TinyJit
+from tinygrad import TinyJit

@TinyJit
def jit(x):
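# A minimal usage sketch with the new top-level import (an illustration, not part of
# the quickstart diff above; the doubling function and the warm-up loop are assumptions):
from tinygrad import Tensor, TinyJit

@TinyJit
def jit(x: Tensor) -> Tensor:
  return (x * 2.0).realize()          # the jitted function should return realized Tensors

for _ in range(5):                    # early calls run and capture kernels, later calls replay them
  out = jit(Tensor([1.0, 2.0, 3.0]))  # inputs must keep the same shapes across calls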