diff --git a/docs/developer.md b/docs/developer.md
index 6a294ee384..4eeeb8915c 100644
--- a/docs/developer.md
+++ b/docs/developer.md
@@ -17,13 +17,13 @@ The `LazyBuffer` graph specifies the compute in terms of low level tinygrad ops.
 
 ## Scheduling
 
-The [scheduler](/tinygrad/engine/schedule.py) converts the graph of LazyBuffers into a list of `ScheduleItem`. One `ScheduleItem` is one kernel on the GPU, and the scheduler is responsible for breaking the large compute graph into subgraphs that can fit in a kernel. `ast` specifies what compute to run, and `bufs` specifies what buffers to run it on.
+The [scheduler](https://github.com/tinygrad/tinygrad/tree/master/tinygrad/engine/schedule.py) converts the graph of LazyBuffers into a list of `ScheduleItem`. One `ScheduleItem` is one kernel on the GPU, and the scheduler is responsible for breaking the large compute graph into subgraphs that can fit in a kernel. `ast` specifies what compute to run, and `bufs` specifies what buffers to run it on.
 
 ::: tinygrad.ops.ScheduleItem
 
 ## Lowering
 
-The code in [realize](/tinygrad/engine/realize.py) lowers `ScheduleItem` to `ExecItem` with
+The code in [realize](https://github.com/tinygrad/tinygrad/tree/master/tinygrad/engine/realize.py) lowers `ScheduleItem` to `ExecItem` with
 
 ::: tinygrad.engine.realize.lower_schedule
 
diff --git a/docs/quickstart.md b/docs/quickstart.md
index 8523b2157f..ba2aaef4ab 100644
--- a/docs/quickstart.md
+++ b/docs/quickstart.md
@@ -50,7 +50,7 @@ randn = Tensor.randn(2, 3) # create a tensor of shape (2, 3) filled with random
 uniform = Tensor.uniform(2, 3, low=0, high=10) # create a tensor of shape (2, 3) filled with random values from a uniform distribution between 0 and 10
 ```
 
-There are even more of these factory methods, you can find them in the [tensor.py](/tinygrad/tensor.py) file.
+There are even more of these factory methods; you can find them in the [Tensor](tensor.md) documentation.
 
 All the tensor creation methods can take a `dtype` argument to specify the data type of the tensor.
 
@@ -75,8 +75,8 @@ print(t6.numpy())
 # [-56. -48. -36. -20. 0.]
 ```
 
-There are a lot more operations that can be performed on tensors, you can find them in the [tensor.py](/tinygrad/tensor.py) file.
-Additionally reading through [abstractions2.py](/docs-legacy/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
+There are a lot more operations that can be performed on tensors; you can find them in the [Tensor](tensor.md) documentation.
+Additionally, reading through [abstractions2.py](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/abstractions2.py) will help you understand how operations on these tensors make their way down to your hardware.
 
 ## Models
 
@@ -96,7 +96,7 @@ class Linear:
     return x.linear(self.weight.transpose(), self.bias)
 ```
 
-There are more neural network modules already implemented in [nn](/tinygrad/nn/__init__.py), and you can also implement your own.
+There are more neural network modules already implemented in [nn](nn.md), and you can also implement your own.
 
 We will be implementing a simple neural network that can classify handwritten digits from the MNIST dataset.
 Our classifier will be a simple 2 layer neural network with a Leaky ReLU activation function.
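+
+A minimal sketch of what that network could look like, reusing the `Linear` module from above (the layer sizes are just an illustration; MNIST images are 28x28 = 784 pixels and there are 10 digit classes):
+
+```python
+class TinyNet:
+  def __init__(self):
+    self.l1 = Linear(784, 128, bias=False)   # 784 input pixels -> 128 hidden units
+    self.l2 = Linear(128, 10, bias=False)    # 128 hidden units -> 10 digit classes
+
+  def __call__(self, x):
+    return self.l2(self.l1(x).leakyrelu())   # Leaky ReLU between the two layers
+
+net = TinyNet()
+```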
@@ -126,9 +126,9 @@ Finally, we just initialize an instance of our neural network, and we are ready
 Now that we have our neural network defined we can start training it.
 Training neural networks in tinygrad is super simple.
 All we need to do is define our neural network, define our loss function, and then call `.backward()` on the loss function to compute the gradients.
-They can then be used to update the parameters of our neural network using one of the many optimizers in [optim.py](/tinygrad/nn/optim.py).
+They can then be used to update the parameters of our neural network using one of the many [Optimizers](nn.md#optimizers).
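+
+Concretely, a single optimization step looks roughly like this (only a sketch; `net`, the optimizer `opt`, and the batch `x`, `y` are set up later on this page):
+
+```python
+with Tensor.train():
+  out = net(x)                                   # forward pass
+  loss = out.sparse_categorical_crossentropy(y)  # the loss function described below
+  opt.zero_grad()                                # clear gradients from the previous step
+  loss.backward()                                # backpropagate to compute new gradients
+  opt.step()                                     # let the optimizer update the parameters
+```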
 
-For our loss function we will be using sparse categorical cross entropy loss. The implementation below is taken from [tensor.py](/tinygrad/tensor.py), it's copied below to highlight an important detail of tinygrad.
+For our loss function we will be using sparse categorical cross entropy loss. The implementation below is taken from [tensor.py](https://github.com/tinygrad/tinygrad/blob/master/tinygrad/tensor.py); it's copied here to highlight an important detail of tinygrad.
 
 ```python
 def sparse_categorical_crossentropy(self, Y, ignore_index=-1) -> Tensor:
@@ -156,7 +156,7 @@ There is a simpler way to do this just by using `get_parameters(net)` from `tiny
 The parameters are just listed out explicitly here for clarity.
 
 Now that we have our network, loss function, and optimizer defined all we are missing is the data to train on!
-There are a couple of dataset loaders in tinygrad located in [/extra/datasets](/extra/datasets).
+There are a couple of dataset loaders in tinygrad located in [/extra/datasets](https://github.com/tinygrad/tinygrad/blob/master/extra/datasets).
 We will be using the MNIST dataset loader.
 
 ```python
@@ -229,10 +229,10 @@ with Timing("Time: "):
 
 ## And that's it
 
-Highly recommend you check out the [examples/](/examples) folder for more examples of using tinygrad.
+Highly recommend you check out the [examples/](https://github.com/tinygrad/tinygrad/blob/master/examples) folder for more examples of using tinygrad.
 Reading the source code of tinygrad is also a great way to learn how it works.
-Specifically the tests in [test/](/test) are a great place to see how to use and the semantics of the different operations.
-There are also a bunch of models implemented in [models/](/extra/models) that you can use as a reference.
+Specifically, the tests in [test/](https://github.com/tinygrad/tinygrad/blob/master/test) are a great place to learn the usage and semantics of the different operations.
+There are also a bunch of models implemented in [models/](https://github.com/tinygrad/tinygrad/blob/master/extra/models) that you can use as a reference.
 
 Additionally, feel free to ask questions in the `#learn-tinygrad` channel on the [discord](https://discord.gg/beYbxwxVdx).
 Don't ask to ask, just ask!
@@ -276,7 +276,7 @@ You will find that the evaluation time is much faster than before and that your
 ### Saving and Loading Models
 
 The standard weight format for tinygrad is [safetensors](https://github.com/huggingface/safetensors). This means that you can load the weights of any model also using safetensors into tinygrad.
-There are functions in [state.py](/tinygrad/nn/state.py) to save and load models to and from this format.
+There are functions in [state.py](https://github.com/tinygrad/tinygrad/blob/master/tinygrad/nn/state.py) to save and load models to and from this format.
 
 ```python
 from tinygrad.nn.state import safe_save, safe_load, get_state_dict, load_state_dict
@@ -292,14 +292,14 @@ state_dict = safe_load("model.safetensors")
 load_state_dict(net, state_dict)
 ```
 
-Many of the models in the [models/](/models) folder have a `load_from_pretrained` method that will download and load the weights for you. These usually are pytorch weights meaning that you would need pytorch installed to load them.
+Many of the models in the [models/](https://github.com/tinygrad/tinygrad/tree/master/extra/models) folder have a `load_from_pretrained` method that will download and load the weights for you. These are usually pytorch weights, meaning you will need pytorch installed to load them.
 
 ### Environment Variables
 
 There exist a bunch of environment variables that control the runtime behavior of tinygrad.
 Some of the common ones are `DEBUG` and the different backend enablement variables.
-You can find a full list and their descriptions in [env_vars.md](/docs-legacy/env_vars.md).
+You can find a full list and their descriptions in [env_vars.md](https://github.com/tinygrad/tinygrad/blob/master/docs-legacy/env_vars.md).
 
 ### Visualizing the Computation Graph
diff --git a/docs/showcase.md b/docs/showcase.md
index 04b18b7be4..961365d3c8 100644
--- a/docs/showcase.md
+++ b/docs/showcase.md
@@ -17,15 +17,15 @@ python3 examples/efficientnet.py webcam
 
 ### YOLOv8
 
-Take a look at [yolov8.py](/examples/yolov8.py).
+Take a look at [yolov8.py](https://github.com/tinygrad/tinygrad/tree/master/examples/yolov8.py).
 
-![yolov8 by tinygrad](showcase/yolov8_showcase_image.png)
+![yolov8 by tinygrad](https://github.com/tinygrad/tinygrad/tree/master/docs/showcase/yolov8_showcase_image.png)
 
 ## Audio
 
 ### Whisper
 
-Take a look at [whisper.py](/examples/whisper.py). You need pyaudio and torchaudio installed.
+Take a look at [whisper.py](https://github.com/tinygrad/tinygrad/tree/master/examples/whisper.py). You need pyaudio and torchaudio installed.
 
 ```sh
 SMALL=1 python3 examples/whisper.py
@@ -35,9 +35,9 @@ SMALL=1 python3 examples/whisper.py
 
 ### Generative Adversarial Networks
 
-Take a look at [mnist_gan.py](/examples/mnist_gan.py).
+Take a look at [mnist_gan.py](https://github.com/tinygrad/tinygrad/tree/master/examples/mnist_gan.py).
 
-![mnist gan by tinygrad](showcase/mnist_by_tinygrad.jpg)
+![mnist gan by tinygrad](https://github.com/tinygrad/tinygrad/tree/master/docs/showcase/mnist_by_tinygrad.jpg)
 
 ### Stable Diffusion
 
@@ -45,7 +45,7 @@ Take a look at [mnist_gan.py](/examples/mnist_gan.py).
 ```sh
 python3 examples/stable_diffusion.py
 ```
 
-![a horse sized cat eating a bagel](showcase/stable_diffusion_by_tinygrad.jpg)
+![a horse sized cat eating a bagel](https://github.com/tinygrad/tinygrad/tree/master/docs/showcase/stable_diffusion_by_tinygrad.jpg)
 
 *"a horse sized cat eating a bagel"*
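+
+All of these showcase programs are ordinary Python scripts, so the environment variables described in the quickstart (such as `DEBUG`) apply here as well. As a rough example, this prints every kernel tinygrad runs for stable diffusion along with its timing (higher `DEBUG` levels also print the generated code):
+
+```sh
+DEBUG=2 python3 examples/stable_diffusion.py
+```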