diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
index 18a01d085c..2031342b93 100644
--- a/.github/workflows/test.yml
+++ b/.github/workflows/test.yml
@@ -91,7 +91,7 @@ jobs:
       run: python -m mypy --strict-equality
     - name: Test Docs
       run: |
-        python docs-legacy/abstractions2.py
+        python docs/abstractions2.py
     - name: Test Quickstart
      run: awk '/```python/{flag=1;next}/```/{flag=0}flag' docs/quickstart.md > quickstart.py && PYTHONPATH=. python quickstart.py
     - name: Fuzz Test symbolic
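A note on the "Test Quickstart" step kept above: it pulls the fenced Python out of docs/quickstart.md with awk and executes the result. For readers who don't speak awk, here is a rough Python sketch of the same filter; it assumes only what the CI command itself assumes (a docs/quickstart.md relative to the repo root, fences starting at column 0), nothing else.

```python
# Rough Python equivalent of the "Test Quickstart" awk one-liner above:
#   awk '/```python/{flag=1;next}/```/{flag=0}flag' docs/quickstart.md > quickstart.py
# It concatenates every ```python block in the quickstart into one script and runs it.
import os, subprocess

capture, lines = False, []
with open("docs/quickstart.md") as f:
    for line in f:
        if line.startswith("```python"):
            capture = True       # opening fence: start capturing (the fence itself is skipped)
        elif line.startswith("```"):
            capture = False      # any other fence closes the block
        elif capture:
            lines.append(line)

with open("quickstart.py", "w") as out:
    out.writelines(lines)

# CI runs it with PYTHONPATH=. so the in-repo tinygrad package is importable
subprocess.run(["python", "quickstart.py"], check=True, env={**os.environ, "PYTHONPATH": "."})
```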
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index a1bc8fcbd3..804712aef0 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -21,7 +21,7 @@ repos:
         pass_filenames: false
       - id: docs2
         name: docs2
-        entry: python3 docs-legacy/abstractions2.py
+        entry: python3 docs/abstractions2.py
         language: system
         always_run: true
         pass_filenames: false
diff --git a/.tokeignore b/.tokeignore
deleted file mode 100644
index cb7645b24d..0000000000
--- a/.tokeignore
+++ /dev/null
@@ -1,4 +0,0 @@
-*
-!*/
-
-!tinygrad/**
diff --git a/README.md b/README.md
index d3e74900be..740c3335c9 100644
--- a/README.md
+++ b/README.md
@@ -1,8 +1,8 @@
 <div align="center">
 
 <picture>
-  <source media="(prefers-color-scheme: light)" srcset="/docs-legacy/logo_tiny_light.svg">
-  <img alt="tiny corp logo" src="/docs-legacy/logo_tiny_dark.svg" width="50%" height="50%">
+  <source media="(prefers-color-scheme: light)" srcset="/docs/logo_tiny_light.svg">
+  <img alt="tiny corp logo" src="/docs/logo_tiny_dark.svg" width="50%" height="50%">
 </picture>
 
 tinygrad: For something between [PyTorch](https://github.com/pytorch/pytorch) and [karpathy/micrograd](https://github.com/karpathy/micrograd). Maintained by [tiny corp](https://tinygrad.org).
@@ -87,7 +87,7 @@ tinygrad already supports numerous accelerators, including:
 - [x] [HSA](tinygrad/runtime/ops_hsa.py)
 
 And it is easy to add more! Your accelerator of choice only needs to support a total of ~25 low level ops.
-More information can be found in the [documentation for adding new accelerators](/docs-legacy/adding_new_accelerators.md).
+More information can be found in the [documentation for adding new accelerators](/docs/adding_new_accelerators.md).
 
 ## Installation
diff --git a/docs-legacy/DESIGNv2.md b/docs-legacy/DESIGNv2.md
deleted file mode 100644
index 5dda08e36b..0000000000
--- a/docs-legacy/DESIGNv2.md
+++ /dev/null
@@ -1,17 +0,0 @@
-tinygrad is a bit bloated now, and there's several places where concerns should be seperated and they aren't.
-
-tensor.py and mlops.py are great code. The interface going backward here is:
-
-LazyBuffer.const (this creates a matching size buffer)
-LazyBuffer.contiguous (tbis is not exactly elementwise)
-LazyBuffer.e (elementwise)
-LazyBuffer.r (reduce)
-reshape/permute/expand/stride/shrink/pad (movement)
-
-The lazy.py reordering engine has a lot of junk to deal with movementops that should be removed.
-
-view.py is mostly great code, except it shouldn't have the rendering logic, and the int type should be parameterized to not import from symbolic.
-
-LazyOp shouldn't have LazyBuffers as sources, just LazyOp LoadOps with a tuple of Views. Then the LazyOp uniquely determines the kernel and we don't have to do any replacement.
-
-ShapeTracker probably shouldn't exist and just be a part of LazyBuffer. Most of the stuff in ShapeTracker should move to symbolic_view, which combines view and symbolic.
diff --git a/docs-legacy/OVERVIEW.md b/docs-legacy/OVERVIEW.md
deleted file mode 100644
index 6ccfb1712d..0000000000
--- a/docs-legacy/OVERVIEW.md
+++ /dev/null
@@ -1,25 +0,0 @@
-tinygrad has four pieces
-
-* frontend (Tensor -> LazyBuffer)
-  * See tensor.py, function.py, multi.py, and lazy.py
-  * The user interacts with the Tensor class
-  * This outputs LazyBuffers, which form the simple compute graph
-* scheduler (LazyBuffer -> ScheduleItem)
-  * See engine/schedule.py
-  * When a Tensor is realized, the scheduler is run to get its LazyBuffers to be computed
-  * This takes in LazyBuffers and groups them as appropriate into kernels.
-  * It returns a list of ScheduleItems + all the Variables used in the graph
-* lowering (TODO: lots of work to clean this up still)
-  * See codegen/ (ScheduleItem.ast -> UOps)
-  * ScheduleItems have an ast that's compiled into actual GPU code
-  * Many optimization choices can be made here, this contains a beam search.
-  * renderer/compiler (UOps -> machine code)
-    * UOps are tinygrad's IR, similar to LLVM IR
-    * Here we either convert them to a high level language or machine code directly
-  * engine/realize.py (ScheduleItem -> ExecItem)
-* runtime
-  * See runtime/
-  * Runtime actually interacts with the GPUs
-  * It manages Buffers, Programs, and Queues
-  * Sadly, METAL and GPU (OpenCL) don't have a compiler that can be pulled out from the device itself
-
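The OVERVIEW.md deleted above described the Tensor -> LazyBuffer -> ScheduleItem -> UOps -> runtime pipeline; docs/abstractions2.py, which this PR moves into docs/, now walks the same stack. For orientation, a minimal sketch of the user-facing end of that pipeline — only the public Tensor API is used, and DEBUG is the logging variable documented in env_vars.md:

```python
# Nothing below touches the device until realization; up to that point the frontend is
# only building the LazyBuffer graph that the scheduler, lowerer, and runtime then consume.
from tinygrad import Tensor

x = Tensor.rand(4, 4)
y = Tensor.rand(4, 4)
z = (x @ y).relu().sum()   # still unrealized: just a small compute graph

print(z.numpy())           # realize: schedule -> lower to UOps -> render/compile -> run

# Running the same script with DEBUG=2 set in the environment prints the scheduled kernels
# and their timings (see env_vars.md for the other levels).
```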
diff --git a/docs-legacy/README.md b/docs-legacy/README.md
deleted file mode 100644
index a61c1ccf99..0000000000
--- a/docs-legacy/README.md
+++ /dev/null
@@ -1,31 +0,0 @@
-# Welcome to the tinygrad documentation!
-
-Here you will find documentation for tinygrad, as well as some examples and tutorials.
-
-## Getting Started
-
-Read the quick start guide [here](/docs/quickstart.md).
-
-Or if you want to jump right in to how tinygrad works, you can read the [abstraction stack](/docs-legacy/abstractions2.py) documentation.
-
-Or if you want to see some examples, you can look at the examples in the [examples](/examples) directory.
-
-Or if you just want to see some of the things tinygrad can do, check out the [showcase](/docs/showcase.md).
-
-## API
-
-This is currently a big work in progress.
-
-## Resources
-
-### Environment Variables
-
-[env_vars.md](/docs-legacy/env_vars.md)
-
-### Adding New Accelerators
-
-[adding_new_accelerators.md](/docs-legacy/adding_new_accelerators.md)
-
-### Community
-
-[![tinygrad discord](https://discordapp.com/api/guilds/1068976834382925865/widget.png?style=banner2)](https://discord.gg/ZjZadyC7PK)
diff --git a/docs-legacy/adding_new_accelerators.md b/docs-legacy/adding_new_accelerators.md
deleted file mode 100644
index 728a308398..0000000000
--- a/docs-legacy/adding_new_accelerators.md
+++ /dev/null
@@ -1,33 +0,0 @@
-# Adding a new accelerator to tinygrad
-
-It's pretty easy to add a new accelerator to tinygrad. All you need to do is implement a total of 20 (optionally 21) low level ops. Then tinygrad takes care of the rest, handling derivatives and syntactic sugar.
-
-## llops
-
-These are the ops that you must implement for your accelerator of choice.
-```
-Buffer                                              # class of memory on this device
-unary_op   (NOOP, CAST, EXP2, LOG2, SIN, SQRT)      # A -> A
-reduce_op  (SUM, MAX)                               # A -> B (smaller size, B has 1 in shape)
-binary_op  (ADD, SUB, MUL, DIV, CMPEQ, CMPLT, MAX)  # A + A -> A (all the same size)
-load_op    (EMPTY, CONST, FROM, CONTIGUOUS, CUSTOM) # -> A (initialize data on device)
-ternary_op (WHERE)                                  # A, A, A -> A
-```
-
-## mlops
-
-These are the mid level ops that handle the derivatives.
-```
-Relu, Log, Exp, Sin                          # unary ops
-Sum, Max                                     # reduce ops (with axis argument)
-Add, Sub, Mul, Div, Eq                       # binary ops (no broadcasting, use expand)
-Expand, Reshape, Permute, Pad, Shrink, Flip  # movement ops
-Where                                        # ternary ops
-```
-These are implemented in [function.py](/tinygrad/function.py).
-
-## hlops
-
-These are the syntax sugar. They are built on top of the mlops and support most of the things that you could expect from a tensor library.
-
-These are implemented in [tensor.py](/tinygrad/tensor.py).
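The accelerator doc removed above is where the llops -> mlops -> hlops layering was spelled out. As a reminder of how that layering surfaces in practice, a small sketch follows; the Tensor calls are ordinary public API, while the op names in the comments are only the approximate mapping onto the primitives listed in the deleted doc:

```python
from tinygrad import Tensor

t = Tensor([[1.0, -2.0], [3.0, -4.0]])

a = t.exp2()               # roughly: unary_op EXP2
b = t * t                  # roughly: binary_op MUL (same-shape elementwise)
c = t.sum(axis=1)          # roughly: reduce_op SUM, with axis handling at the mlops level
d = (t > 0).where(t, 0)    # roughly: binary_op CMPLT feeding ternary_op WHERE

print(a.numpy(), b.numpy(), c.numpy(), d.numpy(), sep="\n")
```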
diff --git a/docs-legacy/linearizer_v2.md b/docs-legacy/linearizer_v2.md
deleted file mode 100644
index 52aae91a0e..0000000000
--- a/docs-legacy/linearizer_v2.md
+++ /dev/null
@@ -1,27 +0,0 @@
-At base, the Linearizer a function that takes an AST + opts -> uops
-It should be rewritten like this. The AST can't be a LazyOp, because it should be able to have multiple outputs
-
-We need a generic class to represent DAGs.
-This refactor is probably a prereq for the new linearizer, and can be used on existing uops also.
-Can this class also represent the large graph? The op graph is a subset of the large graph.
-
-Currently the Linearizer is merging many concerns:
-
-1. LocalBuffers are added. These should be added to the upper DAG, for both grouping and tensor cores. Some opts are used here. NOTE: currently reduce splitting is done in lazy.py and it shouldn't be
-2. The ShapeTrackers at the edges are collected and modified according to the other opts.
-3. The Ops are toposorted.
-4. The Ops are lowered to UOps. This requires expansion and loop assignment, potentially to global dimensions
-5. The indexes into the Tensor are computed from the shapetrackers
-
-More generically, the whole network is a DAG. Ignore the forward/backward stuff, I'm fine with starting at the LazyBuffer level.
-
-1. Is it possible to put an entire network in a single kernel? I think the answer has to be yes, but you may end up doing an absolutely crazy amount of recomputation. This should still be doable to check correctness.
-2. You can use intermediate buffers, be they local or global, to do less compute.
-
-This is a rewrite of a lot of tinygrad. I don't think continuing to support Interpreted backends is worth it, have to deal with disk in a smart way.
-
-We keep the features and nn stuff = 793 lines
-We keep the frontend (Tensor -> LazyBuffer): tensor.py + mlops.py + lazy.py + dtype.py = 1032 lines
-We keep the shapetracker/symbolic (part of the frontend): shapetracker.py + view.py + symbolic.py = 603 lines
-Codegen is all rewritten. realize.py is simpler with the new codegen
-We keep the backend (uops renderer/runtime): cstyle.py/llvmir.py + device.py + ops_*.py = 1216 lines (less when we remove interpreted)
diff --git a/docs-legacy/reshape_without_symbolic.md b/docs-legacy/reshape_without_symbolic.md
deleted file mode 100644
index c6da7d7219..0000000000
--- a/docs-legacy/reshape_without_symbolic.md
+++ /dev/null
@@ -1,70 +0,0 @@
-## ["View.reshape without symbolic"](https://github.com/tinygrad/tinygrad/pull/2218)
-
-This section contains the sketch proof of "Complete, Fast and Correct View.reshapes without using Symbolic". The goal is to reduce multi-views which cost runtime.
-
-1. **old_shape = (s1,s2,...,si,s(i+1),...,sn)**
-2. **old_stride = (st1, st2, ... ,sti, st(i+1), ..., stn)**
-3. **merge_old_shape = (p1, p2), where p1 = s1 * ... * si & p2 = s(i+1) * ... * sn**,
-4. **new_shape = (k1, ..., kp, k(p+1), ..., kl)**
-5. **prod(new_shape) = p1 * p2** (trivial)
-6. **mask** and **new_mask** represent valid indexes before & after reshape respectively.
-
-
-### Assumption
-
-**p1** & **p2** individually are mergeable (we will discuss later on this) & we cannot merge **p1** & **p2**.
-
-### Claim
-
-If **prod([k1 ... kp]) < p1** and **prod([k1 ... k(p+1)]) > p1**, reshape is not possible.
-
-**Proof**
-
-**k(p+1)** will require some dimensions from **p1** & some from **p2**, which means **p1** & **p2** should be mergeable, but they are not.
-
-**Conclusion**
-
-Hence, reshape is only possible **if ∃ a p, where prod([k1 .. kp]) = p1**.
-
-
-### Conditions for mergeability
-
-**Case 1 - All non-zero strides**
-
-They will merge **if stx = st(x+1) * s(x+1), where x ∈ [1, ..., i-1, i+1, ..., n-1]**.
-
-**Proof**
-
-Lets consider merging of **(s1 ... si) -> p1**, here we have to get a single new stride corresponding to **p1**. For which it has to be contiguous.
-
-**Case 2 - Some stride is zero**
-
-Let **stj = 0 & st(j+1) != 0 & s(j+1) > 1, where 1 < j < i**.
-
-If **sj = 1** , reshape is trivial.
-
-If **sj > 1**,
-- If **maskj** has range > 1,
-  reshape is not possible, because **s(j+1)** will need to be repeated at-least once and a single stride can't capture repetition.
-- If **maskj** has range = 1, reshape is possible, since it is virtually shape = 1, with some offset.
-
-
-
-### Conditions for reshaping mask
-
-**Case 1 - Splitting Dimension** - Mask shouldn't be cut for successful reshape.
-
-- **Example**
-
-[1,2,3,4,5,6,7,8] -> [[1,2,3,4], [5,6,7,8]] ; **mask** = ((2,6)) ; **new_mask[0]** = (0,2) (trivial split).
-
-- **new_mask[1]** = not possible. It is only possible if **mask spans [1-8] or lies within a single dimension [1-4] or [5-8]**.
-
-
-**Case 2 - Combining Dimension** - Mask should unfold continuously.
-
-- **Example** - **[[1,2],[3,4],[5,6]] -> [1,2,3,4,5,6]**; **mask** = ((0,2),(0,2)).
-
-- **new_mask** = (0,4); only possible because **mask1** span the whole dimension.
-
-- If **mask1** did not span the whole dimension, the only way combining would be possible is if **mask0** had range 1 as shown below.
-
-  **[[1,2,3],[4,5,6]] -> [1,2,3,4,5,6]**; **mask** = ((1,2),(0,2)); **new_mask** = ((3,5))
\ No newline at end of file
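The reshape notes deleted above rest on the mergeability condition for adjacent dimensions, stx = st(x+1) * s(x+1). A quick way to see the arithmetic is to check element strides on a strided array; the numpy snippet below is purely illustrative (numpy reports strides in bytes, so they are divided by itemsize to recover the element strides the condition talks about):

```python
import numpy as np

def mergeable(shape, strides):
    """For each adjacent pair (x, x+1), is st_x == st_(x+1) * s_(x+1)?"""
    return [strides[x] == strides[x + 1] * shape[x + 1] for x in range(len(shape) - 1)]

a = np.arange(24).reshape(2, 3, 4)                 # contiguous: element strides (12, 4, 1)
a_strides = tuple(s // a.itemsize for s in a.strides)
print(mergeable(a.shape, a_strides))               # [True, True] -> all dims collapse into one

b = a.transpose(1, 0, 2)                           # shape (3, 2, 4), element strides (4, 12, 1)
b_strides = tuple(s // b.itemsize for s in b.strides)
print(mergeable(b.shape, b_strides))               # [False, False] -> the permute breaks the condition
```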
diff --git a/docs-legacy/abstractions2.py b/docs/abstractions2.py
similarity index 100%
rename from docs-legacy/abstractions2.py
rename to docs/abstractions2.py
diff --git a/docs-legacy/abstractions3.py b/docs/abstractions3.py
similarity index 100%
rename from docs-legacy/abstractions3.py
rename to docs/abstractions3.py
diff --git a/docs-legacy/env_vars.md b/docs/env_vars.md
similarity index 100%
rename from docs-legacy/env_vars.md
rename to docs/env_vars.md
diff --git a/docs-legacy/logo_tiny_dark.svg b/docs/logo_tiny_dark.svg
similarity index 100%
rename from docs-legacy/logo_tiny_dark.svg
rename to docs/logo_tiny_dark.svg
diff --git a/docs-legacy/logo_tiny_light.svg b/docs/logo_tiny_light.svg
similarity index 100%
rename from docs-legacy/logo_tiny_light.svg
rename to docs/logo_tiny_light.svg
diff --git a/docs/quickstart.md b/docs/quickstart.md
index ba2aaef4ab..94801a04b6 100644
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -76,7 +76,7 @@ print(t6.numpy())
 ```
 
 There are a lot more operations that can be performed on tensors, you can find them in the [Tensor](tensor.md) file.
-Additionally reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
+Additionally reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
 
 ## Models
 
@@ -299,7 +299,7 @@ Many of the models in the [models/](https://github.com/tinygrad/tinygrad/tree/ma
 
 There exist a bunch of environment variables that control the runtime behavior of tinygrad.
 Some of the commons ones are `DEBUG` and the different backend enablement variables.
-You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/env_vars.md).
+You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs/env_vars.md).
 
 ### Visualizing the Computation Graph
diff --git a/mkdocs.yml b/mkdocs.yml
index 9b5f8b09d6..433959b9b7 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -11,6 +11,7 @@ nav:
 - Showcase: showcase.md
 - Developer: developer.md
 - Function: function.md
+- Environment: env_vars.md
 #- tinygrad: reference/
 
 #extra_css:
diff --git a/ruff.toml b/ruff.toml
index b0a7913a43..fd32a4958d 100644
--- a/ruff.toml
+++ b/ruff.toml
@@ -27,7 +27,6 @@ line-length = 150
 
 exclude = [
   "docs/",
-  "docs-legacy/",
   "examples/",
   "extra/",
   "tinygrad/runtime/autogen",
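Finally, with env_vars.md living in docs/ and linked from both the quickstart hunk and the new mkdocs nav entry above, one usage note: these variables are generally read when tinygrad is first imported, so from inside Python they need to be set before the import. DEBUG is the variable the quickstart names; level 2 here is only an illustration (env_vars.md documents the rest):

```python
# Equivalent to running `DEBUG=2 python your_script.py` from the shell.
import os
os.environ["DEBUG"] = "2"   # set before importing tinygrad so it is picked up

from tinygrad import Tensor

(Tensor.rand(64, 64) @ Tensor.rand(64, 64)).realize()   # at DEBUG=2, each executed kernel is logged
```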