Moves code walkthrough book chapters to docs (#629)

* replaced template blocks with code blocks in stages chapter

* replaced template blocks with code blocks in network chapter

* moved book sections to docs

* fix indentation in recover_signer codeblock

* remove unnecessary TODO comment in network.md
Andrew Kirillov
2022-12-28 01:24:39 -08:00
committed by GitHub
parent a51fa4fd63
commit d4d8a8c882
29 changed files with 1178 additions and 458 deletions


@@ -17,8 +17,5 @@ The book is continuously rendered [here](https://paradigmxyz.github.io/reth/)!
To get started with Reth, install, configure and sync your node.
* To install and build reth, you can use the following [installation instructions](./installation.md).
**[A Tour Of Reth]()**
This section will take a deep dive into the inner workings of Reth.
[gh-book]: https://github.com/paradigmxyz/reth/tree/main/book


@@ -7,44 +7,3 @@
<!-- An overview of all the flags, how they work and how to configure the node -->
- [Configuring The Node]()
- [Running Reth]()
# A Tour Of Reth
- [Database]()
- [codecs]()
- [libmdbx-rs]()
- [db](./db/README.md)
- [Networking]()
- [P2P](./networking/p2p/README.md)
- [network](./networking/p2p/network/README.md)
- [eth-wire]()
- [discv4]()
- [ipc]()
- [RPC]()
- [rpc-api]()
- [rpc]()
- [rpc-types]()
- [Downloaders]()
- [bodies-downloaders]()
- [headers-downloaders]()
- [Ethereum]()
- [executor]()
- [consensus]()
- [transaction-pool]()
- [Staged Sync]()
- [stages](./stages/README.md)
- [Primitives]()
- [primitives]()
- [rlp]()
- [rlp-derive]()
- [Misc]()
- [interfaces]()
- [tracing]()
- [crate-template]()
- [examples]()
# Design
- [Goals](./design/goals.md)


@@ -1,333 +0,0 @@
# db
The database is a central component to Reth, enabling persistent storage for data like block headers, block bodies, transactions and more. The Reth database is comprised of key-value storage written to the disk and organized in tables. This chapter might feel a little dense at first, but shortly, you will feel very comfortable understanding and navigating the `db` crate. This chapter will go through the structure of the database, its tables and the mechanics of the `Database` trait.
<br>
## Tables
Within Reth, the database is organized via "tables". A table is any struct that implements the `Table` trait.
[File: crates/storage/db/src/abstraction/table.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/table.rs#L56-L65)
```rust ignore
pub trait Table: Send + Sync + Debug + 'static {
    /// Return table name as it is present inside the MDBX.
    const NAME: &'static str;
    /// Key element of `Table`.
    ///
    /// Sorting should be taken into account when encoding this.
    type Key: Key;
    /// Value element of `Table`.
    type Value: Value;
}
//--snip--
pub trait Key: Encode + Decode {}
//--snip--
pub trait Value: Compress + Decompress {}
```
The `Table` trait has two associated types, `Key` and `Value`, which must implement the `Key` and `Value` traits, respectively. The `Encode` trait is responsible for transforming data into bytes so it can be stored in the database, while the `Decode` trait transforms the bytes back into their original form. Similarly, the `Compress` and `Decompress` traits transform the data to and from a compressed format when storing or reading data from the database.
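To illustrate why key encoding has to take sorting into account, here is a minimal sketch. The `Encode`, `Decode` and `Table` traits below are simplified stand-ins written for this example (value compression is folded into `Encode`/`Decode` for brevity), and `ToyCanonicalHeaders` and `roundtrip_key` are hypothetical names, not part of Reth.

```rust
use std::convert::TryFrom;
use std::fmt::Debug;

/// Simplified stand-ins for the `Encode`/`Decode` traits described above.
pub trait Encode {
    fn encode(&self) -> Vec<u8>;
}
pub trait Decode: Sized {
    fn decode(bytes: &[u8]) -> Option<Self>;
}

/// Simplified `Table`: a name plus a key type and a value type.
pub trait Table: Debug + 'static {
    const NAME: &'static str;
    type Key: Encode + Decode + Ord;
    type Value: Encode + Decode;
}

impl Encode for u64 {
    fn encode(&self) -> Vec<u8> {
        // Big-endian, so that byte-wise ordering matches numeric ordering.
        self.to_be_bytes().to_vec()
    }
}

impl Decode for u64 {
    fn decode(bytes: &[u8]) -> Option<Self> {
        let arr = <[u8; 8]>::try_from(bytes).ok()?;
        Some(u64::from_be_bytes(arr))
    }
}

/// Hypothetical table mapping a block number to a toy 8-byte "hash".
#[derive(Debug)]
pub struct ToyCanonicalHeaders;

impl Table for ToyCanonicalHeaders {
    const NAME: &'static str = "ToyCanonicalHeaders";
    type Key = u64;
    type Value = u64;
}

/// Round-trips a key of table `T` through its byte encoding.
pub fn roundtrip_key<T: Table>(key: &T::Key) -> Option<T::Key> {
    <T::Key as Decode>::decode(&key.encode())
}
```

Because the store compares keys as raw bytes, a little-endian encoding would place key `256` before key `1`; the big-endian choice in this sketch keeps numeric and byte order aligned.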
There are many tables within the node, all used to store different types of data from `Headers` to `Transactions` and more. Below is a list of all of the tables. You can follow [this link](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/tables/mod.rs#L36) if you would like to see the table definitions for any of the tables below.
- CanonicalHeaders
- HeaderTD
- HeaderNumbers
- Headers
- BlockBodies
- BlockOmmers
- NonCanonicalTransactions
- Transactions
- TxHashNumber
- Receipts
- Logs
- PlainAccountState
- PlainStorageState
- Bytecodes
- BlockTransitionIndex
- TxTransitionIndex
- AccountHistory
- StorageHistory
- AccountChangeSet
- StorageChangeSet
- TxSenders
- Config
- SyncStage
<br>
## Database
Reth's database design revolves around its main [Database trait](https://github.com/paradigmxyz/reth/blob/0d9b9a392d4196793736522f3fc2ac804991b45d/crates/interfaces/src/db/mod.rs#L33), which takes advantage of [generic associated types](https://blog.rust-lang.org/2022/10/28/gats-stabilization.html) and [a few design tricks](https://sabrinajewson.org/blog/the-better-alternative-to-lifetime-gats#the-better-gats) to implement the database's functionality across many types. Let's take a quick look at the `Database` trait and how it works.
[File: crates/storage/db/src/abstraction/database.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/database.rs#L19)
```rust ignore
/// Main Database trait that spawns transactions to be executed.
pub trait Database: for<'a> DatabaseGAT<'a> {
    /// Create read only transaction.
    fn tx(&self) -> Result<<Self as DatabaseGAT<'_>>::TX, Error>;
    /// Create read write transaction only possible if database is open with write access.
    fn tx_mut(&self) -> Result<<Self as DatabaseGAT<'_>>::TXMut, Error>;
    /// Takes a function and passes a read-only transaction into it, making sure it's closed in the
    /// end of the execution.
    fn view<T, F>(&self, f: F) -> Result<T, Error>
    where
        F: Fn(&<Self as DatabaseGAT<'_>>::TX) -> T,
    {
        let tx = self.tx()?;
        let res = f(&tx);
        tx.commit()?;
        Ok(res)
    }
    /// Takes a function and passes a write-read transaction into it, making sure it's committed in
    /// the end of the execution.
    fn update<T, F>(&self, f: F) -> Result<T, Error>
    where
        F: Fn(&<Self as DatabaseGAT<'_>>::TXMut) -> T,
    {
        let tx = self.tx_mut()?;
        let res = f(&tx);
        tx.commit()?;
        Ok(res)
    }
}
```
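The `view` and `update` helpers above follow a "run a closure inside a transaction" pattern. The sketch below mimics that pattern with a hypothetical in-memory store; `MemDb` and `MemTx` are invented for illustration and deliberately skip error handling and the GAT machinery of the real trait.

```rust
use std::cell::RefCell;
use std::collections::BTreeMap;

/// Hypothetical in-memory store standing in for the real database.
#[derive(Default)]
pub struct MemDb {
    data: RefCell<BTreeMap<u64, String>>,
}

/// Hypothetical transaction: buffers writes until `commit` is called.
pub struct MemTx<'a> {
    db: &'a MemDb,
    pending: RefCell<Vec<(u64, String)>>,
}

impl<'a> MemTx<'a> {
    pub fn get(&self, key: u64) -> Option<String> {
        self.db.data.borrow().get(&key).cloned()
    }

    pub fn put(&self, key: u64, value: &str) {
        self.pending.borrow_mut().push((key, value.to_string()));
    }

    /// Applies all buffered writes to the store, consuming the transaction.
    pub fn commit(self) {
        let MemTx { db, pending } = self;
        let mut data = db.data.borrow_mut();
        for (key, value) in pending.into_inner() {
            data.insert(key, value);
        }
    }
}

impl MemDb {
    pub fn tx(&self) -> MemTx<'_> {
        MemTx { db: self, pending: RefCell::new(Vec::new()) }
    }

    /// Runs `f` inside a transaction and commits afterwards, like `update`.
    pub fn update<T>(&self, f: impl Fn(&MemTx<'_>) -> T) -> T {
        let tx = self.tx();
        let res = f(&tx);
        tx.commit();
        res
    }

    /// Runs `f` inside a transaction that is dropped without committing.
    pub fn view<T>(&self, f: impl Fn(&MemTx<'_>) -> T) -> T {
        let tx = self.tx();
        f(&tx)
    }
}
```

A caller never has to remember to close the transaction: `db.update(|tx| tx.put(1, "header"))` commits on the way out, while writes attempted inside `view` in this sketch are simply discarded.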
Any type that implements the `Database` trait can create a database transaction, as well as run a closure against a read-only transaction (`view`) or a read-write transaction (`update`). As an example, let's revisit the `Transaction` struct from the `stages` crate. This struct contains a field named `db` which is a reference to a generic type `DB` that implements the `Database` trait. The `Transaction` struct can use the `db` field to store new headers, bodies and senders in the database. In the code snippet below, you can see the `Transaction::open()` method, which uses the `Database::tx_mut()` function to create a mutable transaction.
[File: crates/stages/src/db.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/db.rs#L95-L98)
```rust ignore
pub struct Transaction<'this, DB: Database> {
    /// A handle to the DB.
    pub(crate) db: &'this DB,
    tx: Option<<DB as DatabaseGAT<'this>>::TXMut>,
}
//--snip--
impl<'this, DB> Transaction<'this, DB>
where
    DB: Database,
{
    //--snip--
    /// Open a new inner transaction.
    pub fn open(&mut self) -> Result<(), Error> {
        self.tx = Some(self.db.tx_mut()?);
        Ok(())
    }
}
```
The `Database` trait also requires the `DatabaseGAT` trait as a supertrait, which defines two associated types, `TX` and `TXMut`.
[File: crates/storage/db/src/abstraction/database.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/database.rs#L11)
```rust ignore
/// Implements the GAT method from:
/// https://sabrinajewson.org/blog/the-better-alternative-to-lifetime-gats#the-better-gats.
///
/// Sealed trait which cannot be implemented by 3rd parties, exposed only for implementers
pub trait DatabaseGAT<'a, __ImplicitBounds: Sealed = Bounds<&'a Self>>: Send + Sync {
    /// RO database transaction
    type TX: DbTx<'a> + Send + Sync;
    /// RW database transaction
    type TXMut: DbTxMut<'a> + DbTx<'a> + Send + Sync;
}
```
In Rust, associated types are like generics in that they can be any type fitting the generic's definition, with the difference being that associated types are associated with a trait and can only be used in the context of that trait.
In the code snippet above, the `DatabaseGAT` trait has two associated types, `TX` and `TXMut`.
The `TX` type can be any type that implements the `DbTx` trait, which provides a set of functions to interact with read only transactions.
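As a small illustration of associated types on their own (without lifetimes or GATs), the sketch below defines a trait whose implementor picks exactly one transaction type. All names here (`ReadTx`, `Store`, `OffsetStore`, `OffsetTx`, `read_one`) are hypothetical and exist only for this example.

```rust
/// Hypothetical read-only transaction behaviour.
pub trait ReadTx {
    fn get(&self, key: u64) -> Option<u64>;
}

/// A trait with an associated type: each implementor chooses one `Tx` type.
pub trait Store {
    type Tx: ReadTx;
    fn tx(&self) -> Self::Tx;
}

/// Toy transaction: every key maps to `key + offset`.
pub struct OffsetTx(u64);

impl ReadTx for OffsetTx {
    fn get(&self, key: u64) -> Option<u64> {
        // `checked_add` returns `None` on overflow instead of panicking.
        key.checked_add(self.0)
    }
}

pub struct OffsetStore {
    pub offset: u64,
}

impl Store for OffsetStore {
    // The implementor fixes the associated type here, once.
    type Tx = OffsetTx;

    fn tx(&self) -> Self::Tx {
        OffsetTx(self.offset)
    }
}

/// Generic code names the transaction type through the trait, as `S::Tx`.
pub fn read_one<S: Store>(store: &S, key: u64) -> Option<u64> {
    let tx: S::Tx = store.tx();
    tx.get(key)
}
```

Callers of `read_one` never name `OffsetTx` directly; the type is determined by the `Store` implementation, which is the same relationship `DatabaseGAT` sets up between a database and its `TX`/`TXMut` types.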
[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L36)
```rust ignore
/// Read only transaction
pub trait DbTx<'tx>: for<'a> DbTxGAT<'a> {
    /// Get value
    fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, Error>;
    /// Commit for read only transaction will consume and free transaction and allows
    /// freeing of memory pages
    fn commit(self) -> Result<bool, Error>;
    /// Iterate over read only values in table.
    fn cursor<T: Table>(&self) -> Result<<Self as DbTxGAT<'_>>::Cursor<T>, Error>;
    /// Iterate over read only values in dup sorted table.
    fn cursor_dup<T: DupSort>(&self) -> Result<<Self as DbTxGAT<'_>>::DupCursor<T>, Error>;
}
```
The `TXMut` type can be any type that implements the `DbTxMut` trait, which provides a set of functions to interact with read/write transactions.
[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L49)
```rust ignore
/// Read write transaction that allows writing to database
pub trait DbTxMut<'tx>: for<'a> DbTxMutGAT<'a> {
    /// Put value to database
    fn put<T: Table>(&self, key: T::Key, value: T::Value) -> Result<(), Error>;
    /// Delete value from database
    fn delete<T: Table>(&self, key: T::Key, value: Option<T::Value>) -> Result<bool, Error>;
    /// Clears database.
    fn clear<T: Table>(&self) -> Result<(), Error>;
    /// Cursor mut
    fn cursor_mut<T: Table>(&self) -> Result<<Self as DbTxMutGAT<'_>>::CursorMut<T>, Error>;
    /// DupCursor mut.
    fn cursor_dup_mut<T: DupSort>(
        &self,
    ) -> Result<<Self as DbTxMutGAT<'_>>::DupCursorMut<T>, Error>;
}
```
Let's take a look at the `DbTx` and `DbTxMut` traits in action. Revisiting the `Transaction` struct as an example, the `Transaction::get_block_hash()` method uses the `DbTx::get()` function to get a block header hash in the form of `self.get::<tables::CanonicalHeaders>(number)`.
[File: crates/stages/src/db.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/db.rs#L106)
```rust ignore
impl<'this, DB> Transaction<'this, DB>
where
    DB: Database,
{
    //--snip--
    /// Query [tables::CanonicalHeaders] table for block hash by block number
    pub(crate) fn get_block_hash(&self, number: BlockNumber) -> Result<BlockHash, StageError> {
        let hash = self
            .get::<tables::CanonicalHeaders>(number)?
            .ok_or(DatabaseIntegrityError::CanonicalHash { number })?;
        Ok(hash)
    }
    //--snip--
}
//--snip--
impl<'a, DB: Database> Deref for Transaction<'a, DB> {
    type Target = <DB as DatabaseGAT<'a>>::TXMut;
    fn deref(&self) -> &Self::Target {
        self.tx.as_ref().expect("Tried getting a reference to a non-existent transaction")
    }
}
```
The `Transaction` struct implements the `Deref` trait, which returns a reference to its `tx` field, which is a `TXMut`. Recall that `TXMut` is an associated type on the `DatabaseGAT` trait, defined as `type TXMut: DbTxMut<'a> + DbTx<'a> + Send + Sync;`, giving it access to all of the functions available to `DbTx`, including the `DbTx::get()` function.
Notice that the function uses a [turbofish](https://techblog.tonsser.com/posts/what-is-rusts-turbofish) to define which table to use when passing in the `key` to the `DbTx::get()` function. Taking a quick look at the function definition, a generic `T` is defined that implements the `Table` trait mentioned at the beginning of this chapter.
[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L38)
```rust ignore
fn get<T: Table>(&self, key: T::Key) -> Result<Option<T::Value>, Error>;
```
This design pattern is very powerful and allows Reth to use the methods available to the `DbTx` and `DbTxMut` traits without having to define implementation blocks for each table within the database.
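A minimal sketch of this "one generic method, many tables" pattern is shown below. It assumes a toy in-memory transaction keyed by table name; `MemTx` and the simplified `Table` trait are hypothetical and far smaller than the real abstractions.

```rust
use std::collections::BTreeMap;

/// Simplified table marker: only a name is needed for this sketch.
pub trait Table {
    const NAME: &'static str;
}

pub struct Headers;
pub struct Receipts;

impl Table for Headers {
    const NAME: &'static str = "Headers";
}
impl Table for Receipts {
    const NAME: &'static str = "Receipts";
}

/// Hypothetical transaction holding every table in a single sorted map.
#[derive(Default)]
pub struct MemTx {
    // (table name, key) -> value
    rows: BTreeMap<(&'static str, u64), String>,
}

impl MemTx {
    /// One generic `put` serves every table; `T` selects the table.
    pub fn put<T: Table>(&mut self, key: u64, value: &str) {
        self.rows.insert((T::NAME, key), value.to_string());
    }

    /// One generic `get` serves every table as well.
    pub fn get<T: Table>(&self, key: u64) -> Option<&String> {
        self.rows.get(&(T::NAME, key))
    }
}
```

With this in place, `tx.get::<Headers>(1)` and `tx.get::<Receipts>(1)` read from different tables through the same method, and adding a new table only requires a new marker type.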
Let's take a look at a couple of examples before moving on. In the snippet below, the `DbTxMut::put()` method is used to insert values into the `CanonicalHeaders`, `Headers` and `HeaderNumbers` tables.
[File: crates/storage/provider/src/block.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/provider/src/block.rs#L121-L125)
```rust ignore
let block_num_hash = BlockNumHash((block.number, block.hash()));
tx.put::<tables::CanonicalHeaders>(block.number, block.hash())?;
// Put header with canonical hashes.
tx.put::<tables::Headers>(block_num_hash, block.header.as_ref().clone())?;
tx.put::<tables::HeaderNumbers>(block.hash(), block.number)?;
```
This next example uses the `DbTx::cursor()` method to get a `Cursor`. The `Cursor` type provides a way to traverse through rows in a database table, one row at a time. A cursor enables the program to perform an operation (updating, deleting, etc) on each row in the table individually. The following code snippet gets a cursor for a few different tables in the database.
[File: crates/stages/src/stages/execution.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stages/execution.rs#L93-L101)
```rust ignore
// Get next canonical block hashes to execute.
let mut canonicals = db_tx.cursor::<tables::CanonicalHeaders>()?;
// Get header with canonical hashes.
let mut headers = db_tx.cursor::<tables::Headers>()?;
// Get bodies (to get tx index) with canonical hashes.
let mut cumulative_tx_count = db_tx.cursor::<tables::CumulativeTxCount>()?;
// Get transaction of the block that we are executing.
let mut tx = db_tx.cursor::<tables::Transactions>()?;
// Skip sender recovery and load signer from database.
let mut tx_sender = db_tx.cursor::<tables::TxSenders>()?;
```
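To give a feel for what a cursor does, here is a hedged sketch of a read-only cursor over a sorted in-memory table. It is not the MDBX cursor: `Cursor` below is a hypothetical type built on `BTreeMap`, and its `first`, `seek` and `next` methods only approximate the positioning behaviour described above.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical read-only cursor over a sorted in-memory table.
pub struct Cursor<'a> {
    table: &'a BTreeMap<u64, String>,
    /// Key of the row the cursor currently points at, if any.
    current: Option<u64>,
}

impl<'a> Cursor<'a> {
    pub fn new(table: &'a BTreeMap<u64, String>) -> Self {
        Cursor { table, current: None }
    }

    /// Positions the cursor at the first row of the table.
    pub fn first(&mut self) -> Option<(u64, &'a String)> {
        let entry = self.table.iter().next().map(|(k, v)| (*k, v));
        self.current = entry.map(|(k, _)| k);
        entry
    }

    /// Positions the cursor at the first row whose key is >= `key`.
    pub fn seek(&mut self, key: u64) -> Option<(u64, &'a String)> {
        let entry = self.table.range(key..).next().map(|(k, v)| (*k, v));
        self.current = entry.map(|(k, _)| k);
        entry
    }

    /// Advances to the row after the current one, if there is one.
    pub fn next(&mut self) -> Option<(u64, &'a String)> {
        let cur = self.current?;
        let entry = self
            .table
            .range((Bound::Excluded(cur), Bound::Unbounded))
            .next()
            .map(|(k, v)| (*k, v));
        self.current = entry.map(|(k, _)| k);
        entry
    }
}
```

The key design point is that the cursor remembers a position, so repeated `next` calls walk the table in key order without the caller supplying a key each time.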
We are almost at the last stop in the tour of the `db` crate. In addition to the methods provided by the `DbTx` and `DbTxMut` traits, `DbTx` also inherits the `DbTxGAT` trait, while `DbTxMut` inherits `DbTxMutGAT`. These next two traits provide various associated types related to cursors as well as methods to utilize the cursor types.
[File: crates/storage/db/src/abstraction/transaction.rs](https://github.com/paradigmxyz/reth/blob/main/crates/storage/db/src/abstraction/transaction.rs#L12-L17)
```rust ignore
pub trait DbTxGAT<'a, __ImplicitBounds: Sealed = Bounds<&'a Self>>: Send + Sync {
    /// Cursor GAT
    type Cursor<T: Table>: DbCursorRO<'a, T> + Send + Sync;
    /// DupCursor GAT
    type DupCursor<T: DupSort>: DbDupCursorRO<'a, T> + DbCursorRO<'a, T> + Send + Sync;
}
```
Let's look at an example of how cursors are used. The code snippet below contains the `unwind` method from the `BodyStage` defined in the `stages` crate. This function is responsible for unwinding any changes to the database if there is an error when executing the body stage within the Reth pipeline.
[File: crates/stages/src/stages/bodies.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stages/bodies.rs#L205-L238)
```rust ignore
/// Unwind the stage.
async fn unwind(
    &mut self,
    db: &mut Transaction<'_, DB>,
    input: UnwindInput,
) -> Result<UnwindOutput, Box<dyn std::error::Error + Send + Sync>> {
    let mut tx_count_cursor = db.cursor_mut::<tables::CumulativeTxCount>()?;
    let mut block_ommers_cursor = db.cursor_mut::<tables::BlockOmmers>()?;
    let mut transaction_cursor = db.cursor_mut::<tables::Transactions>()?;
    let mut entry = tx_count_cursor.last()?;
    while let Some((key, count)) = entry {
        if key.number() <= input.unwind_to {
            break
        }
        tx_count_cursor.delete_current()?;
        entry = tx_count_cursor.prev()?;
        if block_ommers_cursor.seek_exact(key)?.is_some() {
            block_ommers_cursor.delete_current()?;
        }
        let prev_count = entry.map(|(_, v)| v).unwrap_or_default();
        for tx_id in prev_count..count {
            if transaction_cursor.seek_exact(tx_id)?.is_some() {
                transaction_cursor.delete_current()?;
            }
        }
    }
    //--snip--
}
```
This function first grabs a mutable cursor for the `CumulativeTxCount`, `BlockOmmers` and `Transactions` tables.
The `tx_count_cursor` is used to get the last key-value pair written to the `CumulativeTxCount` table and to delete the key-value pair where the cursor is currently pointing.
The `block_ommers_cursor` is used to get the block ommers from the `BlockOmmers` table at the specified key, and to delete the entry where the cursor is currently pointing.
Finally, the `transaction_cursor` is used to delete each of the block's transactions, whose IDs run from the previous block's cumulative transaction count up to (but not including) the current block's count.
While this is a brief look at how cursors work in the context of database tables, the chapter on the `libmdbx` crate will go into further detail on how cursors communicate with the database and what is actually happening under the hood.
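The unwind logic above can be sketched without cursors, assuming simplified tables held in `BTreeMap`s. `Tables` and `unwind_bodies` are hypothetical names; block keys are reduced to plain block numbers and the ommers table is left out.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified tables: block number -> cumulative tx count,
/// and tx id -> transaction payload.
pub struct Tables {
    pub cumulative_tx_count: BTreeMap<u64, u64>,
    pub transactions: BTreeMap<u64, String>,
}

/// Removes every block above `unwind_to`, together with its transactions.
pub fn unwind_bodies(tables: &mut Tables, unwind_to: u64) {
    loop {
        // Look at the last (highest) block still present in the count table.
        let last = tables
            .cumulative_tx_count
            .iter()
            .next_back()
            .map(|(block, count)| (*block, *count));
        let (block, count) = match last {
            Some(entry) if entry.0 > unwind_to => entry,
            _ => break,
        };
        tables.cumulative_tx_count.remove(&block);

        // The previous block's cumulative count is where this block's
        // transaction ids begin (0 if no earlier block remains).
        let prev_count = tables
            .cumulative_tx_count
            .values()
            .next_back()
            .copied()
            .unwrap_or_default();
        for tx_id in prev_count..count {
            tables.transactions.remove(&tx_id);
        }
    }
}
```

Walking backwards from the last entry means each iteration only needs the current count and the one before it, which mirrors the `last()`/`prev()` calls on `tx_count_cursor`.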
<br>
## Summary
This chapter was packed with information, so let's do a quick review. The database is comprised of tables, with each table being a collection of key-value pairs representing various pieces of data in the blockchain. Any struct that implements the `Database` trait can view, update or delete entries in the various tables. The database design leverages nested traits and generic associated types to provide methods to interact with each table in the database.
<br>
# Next Chapter
[Next Chapter]()


@@ -1,83 +0,0 @@
# Reth Goals
### Why are we building this client?
Our goal in building Reth, apart from improving client diversity, is to create a client that delivers maximally along each of the following dimensions:
- Performance
- Configurability
- Open-source friendliness
---
## Performance
### Why does performance matter?
This is a win for everyone:
- Average users & developers benefit from RPC performance, leading to more responsive applications and faster feedback.
- Home node operators benefit from faster sync times.
- Costs are lowered for all operators, whether in terms of storage costs, or being able to serve more requests from the same node.
- Searchers are able to run more simulations.
### What are the performance bottlenecks that need to be addressed?
**Optimizing state access**
The pipeline that a given transaction goes through as it's processed is more or less the following:
RPC -> EVM -> Cache -> Codec -> DB
One of our first and foremost goals in Reth is to minimize the latency and maximize the throughput (think: request concurrency) of this pipeline.
Why? This is a win for everyone. RPC providers meet more impressive SLAs, MEV searchers become more effective, home nodes sync faster, etc.
The biggest bottleneck in this pipeline is not the execution of the EVM interpreter itself, but rather in accessing state and managing I/O. As such, we think the largest optimizations to be made are closest to the DB layer.
Ideally, we can achieve such fast runtime operation that we can avoid storing certain things (e.g.?) on the disk, and are able to generate them on the fly, instead - minimizing disk footprint.
---
## Configurability
### Why does configurability matter?
**Control over tradeoffs**
Almost any given design choice or optimization to the client comes with its own tradeoffs. As such, our long-term goal is not to make opinionated decisions on behalf of everyone, as some users will be negatively impacted and turned away from what could be a great client.
**Profiles**
We aim to facilitate the creation of community-developed configuration presets that are fit to various user profiles, e.g. archive node, RPC provider, MEV searcher, etc.
**Extension to EVM-compatible L1s and L2s**
Another consequence of a configurable design is the ability to quickly extend the client to support other EVM-compatible L1s and L2s, enabling innovation while retaining performance.
### How is Reth made configurable?
**Modularity & generics**
We prioritize a modular design for Reth with reasonable (and zero-cost!) abstractions over generic interfaces. We want it to be quick and easy for others to extend or adapt the implementation to their own needs.
---
## Open-source friendliness
### Why does open-source friendliness matter?
Maintaining a client implementation is *hard*. Bringing in talent and sustaining momentum in workstreams is a known challenge. As such, we take an open-source first approach to ensure that the development of Reth can be carried forward by the community.
We want to be as deliberate as possible in forming a feedback loop with the Ethereum community, and not only make it easy to contribute to Reth, but in fact actively *encourage* doing so.
Our goal is that community members with no Rust experience, and no experience running a node, will still be able to meaningfully contribute to the project, and accrue expertise in doing so.
### How does Reth support open-source contribution?
**Documentation**
It goes without saying that verbose and thorough documentation is a must. The docs should provide full context on the design and implementation of the client, as well as the contribution process, and should be accessible to anyone with a basic understanding of Ethereum.
**Issue tracking**
Everything that is (and is not) being worked on within the client should be tracked accordingly so that anyone in the community can stay on top of the state of development. This makes it clear what kind of help is needed, and where.


@@ -1 +0,0 @@
# network


@@ -1 +0,0 @@
# P2P


@@ -1,327 +0,0 @@
# Network
The `network` crate is responsible for managing the node's connection to the Ethereum peer-to-peer (P2P) network, enabling communication with other nodes via the [various P2P subprotocols](https://github.com/ethereum/devp2p).
Reth's P2P networking consists primarily of 4 ongoing tasks:
- **Discovery**: Discovers new peers in the network
- **Transactions**: Accepts, requests, and broadcasts mempool transactions
- **ETH Requests**: Responds to incoming requests for headers and bodies
- **Network Management**: Handles incoming & outgoing connections with peers, and routes requests between peers and the other tasks
We'll leave most of the discussion of the discovery task for the [discv4](../discv4/README.md) chapter, and will focus on the other three here.
Let's take a look at how the main Reth CLI (i.e., a default-configured full node) makes use of the P2P layer to explore the primary interfaces and entrypoints into the `network` crate.
---
## The Network Management Task
The network management task is the one primarily used in the pipeline to interact with the P2P network. Apart from managing connectivity to the node's peers, it provides a couple of interfaces for sending _outbound_ requests.
Let's take a look at what the provided interfaces are, how they're used in the pipeline, and take a brief glance under the hood to highlight some important structs and traits in the network management task.
### Use of the Network in the Node
The `"node"` CLI command, used to run the node itself, does the following at a high level:
1. Initializes the DB
2. Initializes the consensus API
3. Writes the genesis block to the DB
4. Initializes the network
5. Instantiates a client for fetching data from the network
6. Configures the pipeline by adding stages to it
7. Runs the pipeline
Steps 5-6 are of interest to us as they consume items from the `network` crate:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=bin/reth/src/node/mod.rs anchor=snippet-execute}}
Let's begin by taking a look at the line where the network is started, with the call, unsurprisingly, to `start_network`. Sounds important, doesn't it?
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=bin/reth/src/node/mod.rs anchor=fn-start_network}}
At a high level, this function is responsible for starting the tasks listed at the start of this chapter.
It gets the handles for the network management, transactions, and ETH requests tasks downstream of the `NetworkManager::builder` method call, and spawns them.
The `NetworkManager::builder` constructor requires a `NetworkConfig` struct to be passed in as a parameter, which can be used as the main entrypoint for setting up the entire network layer:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/config.rs anchor=struct-NetworkConfig}}
The discovery task progresses as the network management task is polled, handling events regarding peer management through the `Swarm` struct which is stored as a field on the `NetworkManager`:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/swarm.rs anchor=struct-Swarm}}
The `Swarm` struct glues together incoming connections from peers, managing sessions with peers, and recording the network's state (e.g. number of active peers, genesis hash of the network, etc.). It emits these as `SwarmEvent`s to the `NetworkManager`, and routes commands and events between the `SessionManager` and `NetworkState` structs that it holds.
We'll touch more on the `NetworkManager` shortly! It's perhaps the most important struct in this crate.
More information about the discovery task can be found in the [discv4](../discv4/README.md) chapter.
The ETH requests and transactions tasks will be explained in their own sections, following this one.
The variable `network` returned from `start_network` and the variable `fetch_client` returned from `network.fetch_client` are of types `NetworkHandle` and `FetchClient`, respectively. These are the two main interfaces for interacting with the P2P network, and are currently used in the `HeaderStage` and `BodyStage`.
Let's walk through how each is implemented, and then apply that knowledge to understand how they are used in the pipeline. In doing so, we'll dig deeper under the hood inside the network management task to get a sense of what's going on.
### Interacting with the Network Management Task Using `NetworkHandle`
The `NetworkHandle` struct is a client for the network management task that can be shared across threads. It wraps an `Arc` around the `NetworkInner` struct, defined as follows:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/network.rs anchor=struct-NetworkInner}}
The field of note here is `to_manager_tx`, which is a handle that can be used to send messages in a channel to an instance of the `NetworkManager` struct.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/manager.rs anchor=struct-NetworkManager}}
Now we're getting to the meat of the `network` crate! The `NetworkManager` struct represents the "Network Management" task described above. It is implemented as an endless [`Future`](https://doc.rust-lang.org/std/future/trait.Future.html) that can be thought of as a "hub" process which listens for messages from the `NetworkHandle` or the node's peers and dispatches messages to the other tasks, while keeping track of the state of the network.
While the `NetworkManager` is meant to be spawned as a standalone [`tokio::task`](https://docs.rs/tokio/0.2.4/tokio/task/index.html), the `NetworkHandle` can be passed around and shared, enabling access to the `NetworkManager` from anywhere by sending requests & commands through the appropriate channels.
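The handle/manager split can be sketched with standard-library channels and threads. This is only an analogy: the real `NetworkManager` is a `Future` driven by an async runtime, whereas the hypothetical `Manager`, `Handle` and `HandleMessage` types below use a blocking loop on an OS thread.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::sync::{Arc, Mutex};

/// Hypothetical messages a handle can send to the manager.
pub enum HandleMessage {
    UpdateStatus { height: u64 },
    Shutdown,
}

struct Inner {
    // Wrapped in a `Mutex` so the shared inner struct is `Sync` on any
    // Rust version; an assumption made for this sketch only.
    to_manager_tx: Mutex<Sender<HandleMessage>>,
}

/// Cheaply cloneable client, mirroring the `Arc`-wrapped inner struct.
#[derive(Clone)]
pub struct Handle {
    inner: Arc<Inner>,
}

impl Handle {
    pub fn update_status(&self, height: u64) {
        self.send(HandleMessage::UpdateStatus { height });
    }

    pub fn shutdown(&self) {
        self.send(HandleMessage::Shutdown);
    }

    fn send(&self, msg: HandleMessage) {
        // A send error only means the manager has already stopped.
        let _ = self.inner.to_manager_tx.lock().unwrap().send(msg);
    }
}

/// Hypothetical manager: owns the receiving end and the tracked state.
pub struct Manager {
    from_handle_rx: Receiver<HandleMessage>,
    pub height: u64,
}

impl Manager {
    pub fn new() -> (Manager, Handle) {
        let (tx, rx) = channel();
        let inner = Inner { to_manager_tx: Mutex::new(tx) };
        (Manager { from_handle_rx: rx, height: 0 }, Handle { inner: Arc::new(inner) })
    }

    /// Processes messages until shutdown or until every handle is dropped,
    /// then returns the last recorded height.
    pub fn run(mut self) -> u64 {
        while let Ok(msg) = self.from_handle_rx.recv() {
            match msg {
                HandleMessage::UpdateStatus { height } => self.height = height,
                HandleMessage::Shutdown => break,
            }
        }
        self.height
    }
}
```

Only the manager ever mutates the state; every other part of the program holds a clone of the handle and communicates by message, which is the property that makes the handle safe to pass around freely.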
#### Usage of `NetworkHandle` in the Pipeline
In the pipeline, the `NetworkHandle` is used to instantiate the `FetchClient` - which we'll get into next - and is used in the `HeaderStage` to update the node's ["status"](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#status-0x00) (record the total difficulty, hash, and height of the last processed block).
[File: crates/stages/src/stages/headers.rs](https://github.com/paradigmxyz/reth/blob/main/crates/stages/src/stages/headers.rs)
```rust,ignore
async fn update_head<DB: Database>(
    &self,
    tx: &Transaction<'_, DB>,
    height: BlockNumber,
) -> Result<(), StageError> {
    // --snip--
    self.network_handle.update_status(height, block_key.hash(), td);
    // --snip--
}
```
Now that we have some understanding about the internals of the network management task, let's look at a higher-level abstraction that can be used to retrieve data from other peers: the `FetchClient`.
### Using `FetchClient` to Get Data in the Pipeline Stages
The `FetchClient` struct, similar to `NetworkHandle`, can be shared across threads, and is a client for fetching data from the network. It's a fairly lightweight struct:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/fetch/client.rs anchor=struct-FetchClient}}
The `request_tx` field is a handle to a channel that can be used to send requests for downloading data, and the `peers_handle` field is a wrapper struct around a handle to a channel that can be used to send messages for applying manual changes to the peer set.
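A request channel of this kind usually carries a way to send the answer back. The sketch below shows that request/response shape with blocking standard-library channels; `ToyFetchClient`, `HeadersRequest`, `run_fetcher` and `spawn_fetcher` are hypothetical, and the "headers" are just block numbers rather than data fetched from peers.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical download request: a block range plus a reply channel.
pub struct HeadersRequest {
    pub start: u64,
    pub limit: u64,
    pub response: Sender<Vec<u64>>,
}

/// Hypothetical client: cloning it clones the sending half of the channel.
#[derive(Clone)]
pub struct ToyFetchClient {
    request_tx: Sender<HeadersRequest>,
}

impl ToyFetchClient {
    /// Sends a request and blocks until the fetcher replies.
    /// Returns `None` if the fetcher is no longer running.
    pub fn get_headers(&self, start: u64, limit: u64) -> Option<Vec<u64>> {
        let (response, reply_rx) = channel();
        self.request_tx
            .send(HeadersRequest { start, limit, response })
            .ok()?;
        reply_rx.recv().ok()
    }
}

/// Hypothetical fetcher loop: answers each request with toy "headers"
/// until every client has been dropped.
pub fn run_fetcher(requests: Receiver<HeadersRequest>) {
    for req in requests {
        let end = req.start.saturating_add(req.limit);
        let headers: Vec<u64> = (req.start..end).collect();
        // The requester may have given up; ignore a failed reply.
        let _ = req.response.send(headers);
    }
}

/// Starts the fetcher on a background thread and returns a client for it.
pub fn spawn_fetcher() -> ToyFetchClient {
    let (request_tx, request_rx) = channel();
    std::thread::spawn(move || run_fetcher(request_rx));
    ToyFetchClient { request_tx }
}
```

Bundling the reply channel into the request keeps the client stateless, so any number of clones can issue requests concurrently without coordinating with each other.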
#### Instantiating the `FetchClient`
The fields `request_tx` and `peers_handle` are cloned off of the `StateFetcher` struct when instantiating the `FetchClient`, which is the lower-level struct responsible for managing data fetching operations over the network:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/fetch/mod.rs anchor=struct-StateFetcher}}
This struct itself is nested deeply within the `NetworkManager`: its `Swarm` struct (shown earlier in the chapter) contains a `NetworkState` struct that has the `StateFetcher` as a field:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/state.rs anchor=struct-NetworkState}}
#### Usage of `FetchClient` in the Pipeline
The `FetchClient` implements the `HeadersClient` and `BodiesClient` traits, defining the functionality to get headers and block bodies from available peers.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/fetch/client.rs anchor=trait-HeadersClient-BodiesClient}}
This functionality is used in the `HeaderStage` and `BodyStage`, respectively.
In the pipeline used by the main Reth binary, the `HeaderStage` uses a `LinearDownloader` to stream headers from the network:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/downloaders/src/headers/linear.rs anchor=struct-LinearDownloader}}
A `FetchClient` is passed in to the `client` field, and the `get_headers` method it implements gets used when polling the stream created by the `LinearDownloader` in the `execute` method of the `HeaderStage`.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/downloaders/src/headers/linear.rs anchor=fn-get_or_init_fut}}
In the `BodyStage` configured by the main binary, a `ConcurrentDownloader` is used:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/downloaders/src/bodies/concurrent.rs anchor=struct-ConcurrentDownloader}}
Here, similarly, a `FetchClient` is passed in to the `client` field, and the `get_block_bodies` method it implements is used when constructing the stream created by the `ConcurrentDownloader` in the `execute` method of the `BodyStage`.
[File: crates/net/downloaders/src/bodies/concurrent.rs](https://github.com/paradigmxyz/reth/blob/main/crates/net/downloaders/src/bodies/concurrent.rs)
```rust,ignore
async fn fetch_bodies(
    &self,
    headers: Vec<&SealedHeader>,
) -> DownloadResult<Vec<BlockResponse>> {
    // --snip--
    let (peer_id, bodies) =
        self.client.get_block_bodies(headers_with_txs_and_ommers).await?.split();
    // --snip--
}
```
---
## ETH Requests Task
The ETH requests task serves _incoming_ requests related to blocks in the [`eth` P2P subprotocol](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#protocol-messages) from other peers.
Similar to the network management task, it's implemented as an endless future, but it is meant to run as a background task (on a standalone `tokio::task`) and not to be interacted with directly from the pipeline. It's represented by the following `EthRequestHandler` struct:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/eth_requests.rs anchor=struct-EthRequestHandler}}
The `client` field here is a client that's used to fetch data from the database, not to be confused with the `client` field on a downloader like the `LinearDownloader` discussed above, which is a `FetchClient`.
### Input Streams to the ETH Requests Task
The `incoming_requests` field is the receiver end of a channel that accepts, as you might have guessed, incoming ETH requests from peers. The sender end of this channel is stored on the `NetworkManager` struct as the `to_eth_request_handler` field.
As the `NetworkManager` is polled and listens for events from peers passed through the `Swarm` struct it holds, it sends any received ETH requests into the channel.
### The Operation of the ETH Requests Task
Being an endless future, the core of the ETH requests task's functionality is in its `poll` method implementation. As the `EthRequestHandler` is polled, it listens for any ETH requests coming through the channel, and handles them accordingly. At the time of writing, the ETH requests task can handle the [`GetBlockHeaders`](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#getblockheaders-0x03) and [`GetBlockBodies`](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#getblockbodies-0x05) requests.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/eth_requests.rs anchor=fn-poll}}
The handling of these requests is fairly straightforward. The `GetBlockHeaders` payload is the following:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/eth-wire/src/types/blocks.rs anchor=struct-GetBlockHeaders}}
In handling this request, the ETH requests task starts at `start_block` and repeatedly fetches the associated header from the database, then increments or decrements the block number to fetch by `skip` (depending on the `direction`) while checking for overflow or underflow. It stops once the bounds specifying the maximum number of headers or bytes to send have been reached.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/eth_requests.rs anchor=fn-get_headers_response}}
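The walk described above can be sketched in isolation. This is a minimal illustration, not the code from `eth_requests.rs`: the `Header` and `Direction` types, the `collect_headers` function, and the `MAX_HEADERS` value are all hypothetical, the database is replaced by a `HashMap`, and only the header-count bound is modeled (the byte bound is omitted). The step between headers is assumed to be `skip + 1`, following the `eth` wire specification's meaning of `skip` as the number of blocks skipped between consecutive results.

```rust
use std::collections::HashMap;

/// Hypothetical stand-ins for the real wire and primitive types.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Direction {
    Rising,
    Falling,
}

#[derive(Clone, Debug, PartialEq)]
struct Header {
    number: u64,
}

/// Assumed cap on headers per response (illustrative value only).
const MAX_HEADERS: usize = 4;

fn collect_headers(
    db: &HashMap<u64, Header>,
    start_block: u64,
    limit: u64,
    skip: u64,
    direction: Direction,
) -> Vec<Header> {
    let mut headers = Vec::new();
    let mut next = Some(start_block);
    // Distance between consecutive requested headers; `None` on overflow.
    let step = skip.checked_add(1);

    for _ in 0..limit {
        // Stop if the previous step overflowed or underflowed.
        let number = match next {
            Some(number) => number,
            None => break,
        };
        // Stop at the first header that is not in the database.
        let header = match db.get(&number) {
            Some(header) => header,
            None => break,
        };
        headers.push(header.clone());
        if headers.len() >= MAX_HEADERS {
            break;
        }
        next = step.and_then(|step| match direction {
            Direction::Rising => number.checked_add(step),
            Direction::Falling => number.checked_sub(step),
        });
    }
    headers
}
```

Using `checked_add` and `checked_sub` turns an overflow or underflow into `None`, which ends the walk instead of wrapping around to an unrelated block number.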
The `GetBlockBodies` payload is simpler: it just contains a vector of requested block hashes:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/eth-wire/src/types/blocks.rs anchor=struct-GetBlockBodies}}
In handling this request, similarly, the ETH requests task attempts, for each hash in the requested order, to fetch the block body (transactions & ommers), while checking that bounds specifying the maximum numbers of bodies or bytes to send have not been breached.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/eth_requests.rs anchor=fn-on_bodies_request}}
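A comparable sketch for bodies, under the same caveats: `BlockBody`, `collect_bodies`, and both limit constants are hypothetical, and the database is a `HashMap`. Stopping at the first unknown hash (rather than skipping it) is an assumption made for this example.

```rust
use std::collections::HashMap;

type BlockHash = [u8; 32];

/// Hypothetical body type; only its encoded size matters for this sketch.
#[derive(Clone, Debug, PartialEq)]
struct BlockBody {
    encoded_len: usize,
}

/// Assumed limits (illustrative values only).
const MAX_BODIES: usize = 3;
const SOFT_RESPONSE_LIMIT: usize = 1_000; // bytes

fn collect_bodies(db: &HashMap<BlockHash, BlockBody>, hashes: &[BlockHash]) -> Vec<BlockBody> {
    let mut bodies = Vec::new();
    let mut total_bytes = 0usize;

    // Preserve the order in which the peer requested the hashes.
    for hash in hashes {
        let body = match db.get(hash) {
            Some(body) => body,
            None => break, // assumption: stop at the first unknown block
        };
        total_bytes += body.encoded_len;
        bodies.push(body.clone());
        // Check both bounds after each body is added.
        if bodies.len() >= MAX_BODIES || total_bytes > SOFT_RESPONSE_LIMIT {
            break;
        }
    }
    bodies
}
```

The byte limit here is "soft": the body that crosses it is still included, and the response is cut off afterwards.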
---
## Transactions Task
The transactions task listens for, requests, and propagates transactions, both those from the node's peers and those added locally (e.g., submitted via RPC). Note that this task focuses solely on the network communication involved with Ethereum transactions; we will talk more about the structure of the transaction pool itself in the [transaction-pool](../../../ethereum/transaction-pool/README.md) chapter.
Again, like the network management and ETH requests tasks, the transactions task is implemented as an endless future that runs as a background task on a standalone `tokio::task`. It's represented by the `TransactionsManager` struct:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=struct-TransactionsManager}}
Unlike the ETH requests task, but like the network management task's `NetworkHandle`, the transactions task can also be accessed via a shareable "handle" struct, the `TransactionsHandle`:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=struct-TransactionsHandle}}
### Input Streams to the Transactions Task
We'll touch on most of the fields in the `TransactionsManager` as the chapter continues, but some worth noting now are the 4 streams from which inputs to the task are fed:
- `transaction_events`: A listener for `NetworkTransactionEvent`s sent from the `NetworkManager`, which consist solely of events related to transactions emitted by the network.
- `network_events`: A listener for `NetworkEvent`s sent from the `NetworkManager`, which consist of other "meta" events such as sessions with peers being established or closed.
- `command_rx`: A listener for `TransactionsCommand`s sent from the `TransactionsHandle`.
- `pending_transactions`: A listener for new pending transactions added to the `TransactionPool`.
Let's get a view into the transactions task's operation by walking through the `TransactionsManager::poll` method.
### The Operation of the Transactions Task
The `poll` method lays out an order of operations for the transactions task. It begins by draining the `TransactionsManager.network_events`, `TransactionsManager.command_rx`, and `TransactionsManager.transaction_events` streams, in this order.
Then, it checks on all the current `TransactionsManager.inflight_requests`, which are requests sent by the node to its peers for full transaction objects. After this, it checks on the status of completed `TransactionsManager.pool_imports` events, which are transactions that are being imported into the node's transaction pool. Finally, it drains the new `TransactionsManager.pending_transactions` events from the transaction pool.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=fn-poll}}
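The "drain each stream fully, in a fixed order" pattern can be illustrated without any async machinery. The sketch below is not the real `poll` implementation: it swaps the async streams for standard-library channels, models only the first three inputs, and uses hypothetical names (`Manager`, `Senders`, `new_manager`, `poll_once`) with string payloads in place of the real event types.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};

/// Hypothetical stand-in for `TransactionsManager` with three of its inputs.
struct Manager {
    network_events: Receiver<&'static str>,
    command_rx: Receiver<&'static str>,
    transaction_events: Receiver<&'static str>,
}

/// Sender halves, in the same order as the fields above.
type Senders = (Sender<&'static str>, Sender<&'static str>, Sender<&'static str>);

fn new_manager() -> (Manager, Senders) {
    let (net_tx, network_events) = channel();
    let (cmd_tx, command_rx) = channel();
    let (txs_tx, transaction_events) = channel();
    let manager = Manager { network_events, command_rx, transaction_events };
    (manager, (net_tx, cmd_tx, txs_tx))
}

impl Manager {
    /// One pass: each input is drained completely before the next is touched.
    fn poll_once(&mut self) -> Vec<String> {
        let mut handled = Vec::new();
        while let Ok(event) = self.network_events.try_recv() {
            handled.push(format!("network: {}", event));
        }
        while let Ok(command) = self.command_rx.try_recv() {
            handled.push(format!("command: {}", command));
        }
        while let Ok(event) = self.transaction_events.try_recv() {
            handled.push(format!("transaction: {}", event));
        }
        handled
    }
}
```

Whatever order the messages were sent in, a pass always handles network events first, then commands, then transaction events.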
Let's go through the handling occurring during each of these steps, in order, starting with the draining of the `TransactionsManager.network_events` stream.
#### Handling `NetworkEvent`s
The `TransactionsManager.network_events` stream is the first to have all of its events processed because it contains events concerning peer sessions opening and closing. This ensures, for example, that new peers are tracked in the `TransactionsManager` before events sent from them are processed.
The events received in this channel are of type `NetworkEvent`:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/manager.rs anchor=enum-NetworkEvent}}
They're handled with the `on_network_event` method, which responds to the two variants of the `NetworkEvent` enum in the following ways:
**`NetworkEvent::SessionClosed`**
Removes the peer given by `NetworkEvent::SessionClosed.peer_id` from the `TransactionsManager.peers` map.
**`NetworkEvent::SessionEstablished`**
Begins by inserting a `Peer` into `TransactionsManager.peers` by `peer_id`, which is a struct of the following form:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=struct-Peer}}
Note that the `Peer` struct contains a field `transactions`, which is an [LRU cache](https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)) of the transactions this peer is aware of.
The `request_tx` field on the `Peer` is the sender end of a channel used to send requests to the session with the peer.
After the `Peer` is added to `TransactionsManager.peers`, the hashes of all of the transactions in the node's transaction pool are sent to the peer in a [`NewPooledTransactionHashes` message](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#newpooledtransactionhashes-0x08).
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=fn-on_network_event}}
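Stripped of channels and caches, the two branches reduce to map bookkeeping. The following sketch is illustrative only: `Manager`, `Peer`, and `NetworkEvent` are simplified stand-ins, the pool is a plain `Vec` of hashes, the LRU cache is replaced by a `HashSet`, and instead of sending a `NewPooledTransactionHashes` message the method simply returns the hashes that would be announced.

```rust
use std::collections::{HashMap, HashSet};

type PeerId = u64;
type TxHash = [u8; 32];

/// Simplified stand-in for the real event enum.
enum NetworkEvent {
    SessionEstablished { peer_id: PeerId },
    SessionClosed { peer_id: PeerId },
}

#[derive(Default)]
struct Peer {
    /// Hashes this peer is known to have seen (an LRU cache in the real struct).
    transactions: HashSet<TxHash>,
}

#[derive(Default)]
struct Manager {
    peers: HashMap<PeerId, Peer>,
    /// Stand-in for the transaction pool: just the hashes it holds.
    pool: Vec<TxHash>,
}

impl Manager {
    /// Returns the hashes to announce to a newly connected peer, if any.
    fn on_network_event(&mut self, event: NetworkEvent) -> Option<Vec<TxHash>> {
        match event {
            NetworkEvent::SessionClosed { peer_id } => {
                // Forget everything tracked for this peer.
                self.peers.remove(&peer_id);
                None
            }
            NetworkEvent::SessionEstablished { peer_id } => {
                // Start tracking the peer, then announce the whole pool to it.
                self.peers.entry(peer_id).or_default();
                Some(self.pool.clone())
            }
        }
    }
}
```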
#### Handling `TransactionsCommand`s
Next in the `poll` method, `TransactionsCommand`s sent through the `TransactionsManager.command_rx` stream are handled. These are the next to be handled as they are those sent manually via the `TransactionsHandle`, giving them precedence over transactions-related requests picked up from the network. The `TransactionsCommand` enum has the following form:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=enum-TransactionsCommand}}
`TransactionsCommand`s are handled by the `on_command` method. At the time of writing, the enum has a single variant, `TransactionsCommand::PropagateHash`, which `on_command` handles by calling the `on_new_transactions` method with an iterator over the single hash contained in the variant (though this method can be called with many transaction hashes).
`on_new_transactions` propagates the full transaction object, with the signer attached, to a small random sample of peers using the `propagate_transactions` method. Then, it notifies all other peers of the hash of the new transaction, so that they can request the full transaction object if they don't already have it.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=fn-on_new_transactions-propagate_transactions}}
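The split between "peers that get the full transaction" and "peers that only get the hash" can be sketched as a single slice operation. The function name `split_for_propagation` is hypothetical, the caller is assumed to have shuffled the peer list already (no random number generator is used here), and the sample size of roughly the square root of the peer count is an assumption for illustration; the text above only says the sample is small.

```rust
type PeerId = u64;

/// Splits an already-shuffled peer list into (full transaction recipients,
/// hash-only recipients).
fn split_for_propagation(shuffled_peers: &[PeerId]) -> (&[PeerId], &[PeerId]) {
    // Assumption: sample about sqrt(n) peers, at least one when any exist.
    let sample = ((shuffled_peers.len() as f64).sqrt() as usize)
        .max(1)
        .min(shuffled_peers.len());
    shuffled_peers.split_at(sample)
}
```

Sending full objects to only a small sample keeps bandwidth low, while the hash announcements let every other peer fetch the transaction on demand.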
#### Handling `NetworkTransactionEvent`s
After `TransactionsCommand`s, it's time to take care of transactions-related requests sent by peers in the network, so the `poll` method handles `NetworkTransactionEvent`s received through the `TransactionsManager.transaction_events` stream. `NetworkTransactionEvent` has the following form:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=enum-NetworkTransactionEvent}}
These events are handled with the `on_network_tx_event` method, which responds to the variants of the `NetworkTransactionEvent` enum in the following ways:
**`NetworkTransactionEvent::IncomingTransactions`**
This event is generated from the [`Transactions` protocol message](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#transactions-0x02), and is handled by the `import_transactions` method.
Here, for each transaction in the variant's `msg` field, we attempt to recover the signer, insert the transaction into the LRU cache of the `Peer` identified by the variant's `peer_id` field, and add the `peer_id` to the vector of peer IDs keyed by the transaction's hash in `TransactionsManager.transactions_by_peers`. If an entry does not already exist for the transaction hash, then it begins importing the transaction object into the node's transaction pool, adding a `PoolImportFuture` to `TransactionsManager.pool_imports`. If there was an issue recovering the signer, `report_bad_message` is called for the `peer_id`, which decreases the peer's reputation.
To understand this a bit better, let's double back and examine what `TransactionsManager.transactions_by_peers` and `TransactionsManager.pool_imports` are used for.
`TransactionsManager.transactions_by_peers` is a `HashMap<TxHash, Vec<PeerId>>` that tracks which peers have sent us a transaction with the given hash. This has two uses: the first being that it prevents us from redundantly importing transactions into the transaction pool for which we've already begun this process (this check occurs in `import_transactions`), and the second being that if a transaction we receive is malformed in some way and ends up erroring when imported to the transaction pool, we can reduce the reputation score for all of the peers that sent us this transaction (this occurs in `on_bad_import`, which we'll touch on soon).
`TransactionsManager.pool_imports` is a set of futures representing the transactions which are currently in the process of being imported to the node's transaction pool. This process is asynchronous due to the validation of the transaction that must occur, thus we need to keep a handle on the generated future.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=fn-import_transactions}}
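The deduplication role of `transactions_by_peers` maps naturally onto the standard `HashMap` entry API. In this sketch the `Manager` struct is a hypothetical reduction, signer recovery and the peer's LRU cache are left out, and `pool_imports` is a plain `Vec` of hashes standing in for the set of import futures.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

type PeerId = u64;
type TxHash = [u8; 32];

#[derive(Default)]
struct Manager {
    transactions_by_peers: HashMap<TxHash, Vec<PeerId>>,
    /// Stand-in for `pool_imports`: hashes whose import has been started.
    pool_imports: Vec<TxHash>,
}

impl Manager {
    fn import_transaction(&mut self, peer_id: PeerId, hash: TxHash) {
        match self.transactions_by_peers.entry(hash) {
            // Already being imported: only remember that this peer sent it too.
            Entry::Occupied(mut entry) => entry.get_mut().push(peer_id),
            // First sighting: record the sender and start exactly one import.
            Entry::Vacant(entry) => {
                entry.insert(vec![peer_id]);
                self.pool_imports.push(hash);
            }
        }
    }
}
```

However many peers send the same transaction, only one import is started, and every sender remains on record in case the import later fails.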
**`NetworkTransactionEvent::IncomingPooledTransactionHashes`**
This event is generated from the [`NewPooledTransactionHashes` protocol message](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#newpooledtransactionhashes-0x08), and is handled by the `on_new_pooled_transactions` method.
Here, it begins by adding the transaction hashes included in the `NewPooledTransactionHashes` payload to the LRU cache for the `Peer` identified by `peer_id` in `TransactionsManager.peers`. Next, it filters the list of hashes to those that are not already present in the transaction pool, and for each such hash, requests its full transaction object from the peer by sending it a [`GetPooledTransactions` protocol message](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#getpooledtransactions-0x09) through the `Peer.request_tx` channel. If the request was successfully sent, a `GetPooledTxRequest` gets added to the `TransactionsManager.inflight_requests` vector:
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=struct-GetPooledTxRequest}}
As you can see, this struct also contains a `response` channel from which the peer's response can later be polled.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=fn-on_new_pooled_transactions}}
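The first two steps (record what the peer has seen, then keep only hashes the pool lacks) can be sketched as a pure function. The name `on_new_pooled_hashes` and its parameters are hypothetical, both the peer's LRU cache and the pool are modeled as `HashSet`s, and sending the actual `GetPooledTransactions` request is left out.

```rust
use std::collections::HashSet;

type TxHash = [u8; 32];

/// Records the announced hashes as seen by the peer and returns the ones
/// that still need to be requested from it.
fn on_new_pooled_hashes(
    peer_seen: &mut HashSet<TxHash>,
    pool: &HashSet<TxHash>,
    announced: Vec<TxHash>,
) -> Vec<TxHash> {
    // The peer evidently knows every hash it announced.
    peer_seen.extend(announced.iter().copied());
    // Only hashes missing from our pool are worth requesting.
    announced.into_iter().filter(|hash| !pool.contains(hash)).collect()
}
```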
**`NetworkTransactionEvent::GetPooledTransactions`**
This event is generated from the [`GetPooledTransactions` protocol message](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#getpooledtransactions-0x09), and is handled by the `on_get_pooled_transactions` method.
Here, it collects _all_ the transactions in the node's transaction pool, recovers their signers, adds their hashes to the LRU cache of the requesting peer, and sends them to the peer in a [`PooledTransactions` protocol message](https://github.com/ethereum/devp2p/blob/master/caps/eth.md#pooledtransactions-0x0a). This is sent through the `response` channel that's stored as a field of the `NetworkTransactionEvent::GetPooledTransactions` variant itself.
{{#template ../../../templates/source_and_github.md path_to_root=../../../../ path=crates/net/network/src/transactions.rs anchor=fn-on_get_pooled_transactions}}
#### Checking on `inflight_requests`
Once all the network activity is handled by draining the `TransactionsManager.network_events`, `TransactionsManager.command_rx`, and `TransactionsManager.transaction_events` streams, the `poll` method moves on to checking the status of all `inflight_requests`.
Here, for each in-flight request, the `GetPooledTxRequest.response` field gets polled. If the request is still pending, it remains in the `TransactionsManager.inflight_requests` vector. If the request successfully received a `PooledTransactions` response from the peer, the transactions it contains get handled by the `import_transactions` method (described above). Otherwise, if there was some error in polling the response, we call `report_bad_message` (also described above) on the peer's ID.
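The three outcomes can be sketched with a plain enum in place of a polled future. Everything here is a hypothetical simplification: `Response` models the result of polling, transactions are bare `u32` values, and `Outcome` just records what would be passed to `import_transactions` and `report_bad_message`.

```rust
type PeerId = u64;

/// Stand-in for the result of polling a request's `response` channel.
enum Response {
    Pending,
    Transactions(Vec<u32>),
    Failed,
}

struct GetPooledTxRequest {
    peer_id: PeerId,
    response: Response,
}

#[derive(Default)]
struct Outcome {
    /// (sender, transaction) pairs that would go to `import_transactions`.
    imported: Vec<(PeerId, u32)>,
    /// Peers that would be passed to `report_bad_message`.
    reported: Vec<PeerId>,
}

fn check_inflight(inflight: &mut Vec<GetPooledTxRequest>) -> Outcome {
    let mut outcome = Outcome::default();
    let mut still_pending = Vec::new();

    for request in inflight.drain(..) {
        let peer_id = request.peer_id;
        match request.response {
            // Not ready yet: keep it for the next pass.
            Response::Pending => still_pending.push(request),
            // Ready: hand every returned transaction to the import path.
            Response::Transactions(txs) => {
                outcome.imported.extend(txs.into_iter().map(|tx| (peer_id, tx)));
            }
            // Errored: penalize the peer and drop the request.
            Response::Failed => outcome.reported.push(peer_id),
        }
    }

    *inflight = still_pending;
    outcome
}
```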
#### Checking on `pool_imports`
When the last round of `PoolImportFuture`s has been added to `TransactionsManager.pool_imports` after handling the completed `inflight_requests`, the `poll` method continues by checking the status of the `pool_imports`.
It iterates over `TransactionsManager.pool_imports`, polling each one, and if it's ready (i.e., the future has resolved), it handles successful and unsuccessful import results respectively with `on_good_import` and `on_bad_import`.
`on_good_import`, called when the transaction was successfully imported into the transaction pool, removes the entry for the given transaction hash from `TransactionsManager.transactions_by_peers`.
`on_bad_import` also removes the entry for the given transaction hash from `TransactionsManager.transactions_by_peers`, but also calls `report_bad_message` for each peer in the entry, decreasing all of their reputation scores as they were propagating a transaction that could not be validated.
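A minimal sketch of the two handlers, assuming a simplified `Manager` in which the reputation system is reduced to a hypothetical `penalties` counter per peer:

```rust
use std::collections::HashMap;

type PeerId = u64;
type TxHash = [u8; 32];

#[derive(Default)]
struct Manager {
    transactions_by_peers: HashMap<TxHash, Vec<PeerId>>,
    /// Stand-in for the reputation system: one penalty per bad message.
    penalties: HashMap<PeerId, u32>,
}

impl Manager {
    fn report_bad_message(&mut self, peer_id: PeerId) {
        *self.penalties.entry(peer_id).or_insert(0) += 1;
    }

    /// The import succeeded, so the senders no longer need to be tracked.
    fn on_good_import(&mut self, hash: TxHash) {
        self.transactions_by_peers.remove(&hash);
    }

    /// The import failed, so every peer that sent the transaction is penalized.
    fn on_bad_import(&mut self, hash: TxHash) {
        if let Some(peers) = self.transactions_by_peers.remove(&hash) {
            for peer_id in peers {
                self.report_bad_message(peer_id);
            }
        }
    }
}
```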
#### Checking on `pending_transactions`
Finally, the last thing for the `poll` method to do is to drain the `TransactionsManager.pending_transactions` stream. These transactions are those that were added either via propagation from a peer, the handling of which has been laid out above, or via RPC on the node itself, and which were successfully validated and added to the transaction pool.
It polls `TransactionsManager.pending_transactions`, collecting each resolved transaction into a vector, and calls `on_new_transactions` with said vector. The functionality of the `on_new_transactions` method is described above in the handling of `TransactionsCommand::PropagateHash`.

View File

@@ -1,87 +0,0 @@
# Stages
The `stages` lib plays a central role in syncing the node, maintaining state, updating the database and more. The stages involved in the Reth pipeline are the `HeaderStage`, `BodyStage`, `SenderRecoveryStage`, and `ExecutionStage` (note that this list is non-exhaustive, and more pipeline stages will be added in the near future). Each of these stages is queued up and stored within the Reth pipeline.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/stages/src/pipeline.rs anchor=struct-Pipeline}}
When the node is first started, a new `Pipeline` is initialized and all of the stages are added into `Pipeline.stages`. Then, the `Pipeline::run` function is called, which starts the pipeline, executing all of the stages continuously in an infinite loop. This process syncs the chain, keeping everything up to date with the chain tip.
Each stage within the pipeline implements the `Stage` trait which provides function interfaces to get the stage id, execute the stage and unwind the changes to the database if there was an issue during the stage execution.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/stages/src/stage.rs anchor=trait-Stage}}
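To make the relationship between a pipeline and its stages concrete, here is a simplified, synchronous sketch. It is not the real `Stage` trait or `Pipeline` (the actual signatures are in the source linked above): the trait is reduced to three methods with invented signatures, `ExecOutput` keeps only two fields, and `ToyStage`, `run_once`, and the `checkpoint` parameter are hypothetical names introduced for illustration.

```rust
#[derive(Debug, PartialEq)]
struct ExecOutput {
    stage_progress: u64,
    done: bool,
}

/// Hypothetical, synchronous reduction of the stage interface.
trait Stage {
    fn id(&self) -> &'static str;
    fn execute(&mut self, target: u64) -> Result<ExecOutput, String>;
    fn unwind(&mut self, unwind_to: u64);
}

/// Toy stage that "syncs" by jumping to the target, or fails on demand.
struct ToyStage {
    name: &'static str,
    progress: u64,
    fail: bool,
}

impl Stage for ToyStage {
    fn id(&self) -> &'static str {
        self.name
    }

    fn execute(&mut self, target: u64) -> Result<ExecOutput, String> {
        if self.fail {
            return Err(format!("{} failed", self.name));
        }
        self.progress = target;
        Ok(ExecOutput { stage_progress: target, done: true })
    }

    fn unwind(&mut self, unwind_to: u64) {
        self.progress = self.progress.min(unwind_to);
    }
}

struct Pipeline {
    stages: Vec<Box<dyn Stage>>,
}

impl Pipeline {
    /// Runs each stage once, in order. On an error, the failing stage is
    /// unwound to `checkpoint` and the pass stops.
    fn run_once(&mut self, checkpoint: u64, target: u64) -> Vec<(&'static str, u64)> {
        let mut progress = Vec::new();
        for stage in self.stages.iter_mut() {
            match stage.execute(target) {
                Ok(output) => progress.push((stage.id(), output.stage_progress)),
                Err(_) => {
                    stage.unwind(checkpoint);
                    break;
                }
            }
        }
        progress
    }
}
```

Later stages never run in a pass where an earlier one failed, which mirrors the ordering dependency between headers, bodies, senders, and execution.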
To get a better idea of what is happening at each part of the pipeline, let's walk through what is going on under the hood within the `execute()` function at each stage, starting with `HeaderStage`.
<br>
## HeaderStage
<!-- TODO: Cross-link to eth/65 chapter when it's written -->
The `HeaderStage` is responsible for syncing the block headers, validating the header integrity and writing the headers to the database. When the `execute()` function is called, the local head of the chain is updated to the most recent block height previously executed by the stage. At this point, the node status is also updated with that block's height, hash and total difficulty. These values are used during any new eth/65 handshakes. After updating the head, a stream is established with other peers in the network to sync the missing chain headers between the most recent state stored in the database and the chain tip. The `HeaderStage` contains a `downloader` attribute, which is a type that implements the `HeaderDownloader` trait. The `stream()` method from this trait is used to fetch headers from the network.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/interfaces/src/p2p/headers/downloader.rs anchor=trait-HeaderDownloader}}
The `HeaderStage` relies on the downloader stream to return the headers in descending order starting from the chain tip down to the latest block in the database. While other stages in the `Pipeline` start from the most recent block in the database up to the chain tip, the `HeaderStage` works in reverse to avoid [long-range attacks](https://messari.io/report/long-range-attack). When a node downloads headers in ascending order, it will not know if it is being subjected to a long-range attack until it reaches the most recent blocks. To combat this, the `HeaderStage` starts by getting the chain tip from the Consensus Layer, verifies the tip, and then walks backwards by the parent hash. Each value yielded from the stream is a `SealedHeader`.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/primitives/src/header.rs anchor=struct-SealedHeader}}
Each `SealedHeader` is then validated to ensure that it has the proper parent. Note that this is only a basic response validation, and the `HeaderDownloader` uses the `validate` method during the `stream`, so that each header is validated according to the consensus specification before the header is yielded from the stream. After this, each header is then written to the database. If a header is not valid or the stream encounters any other error, the error is propagated up through the stage execution, the changes to the database are unwound and the stage is resumed from the most recent valid state.
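The parent-link check on a descending header stream can be sketched as follows. The `SealedHeader` here is a hypothetical reduction with `u64` values standing in for 32-byte hashes, and `validate_descending` is an illustrative name; full consensus validation is not modeled.

```rust
/// Stand-in for a 32-byte block hash.
type Hash = u64;

#[derive(Clone, Debug, PartialEq)]
struct SealedHeader {
    number: u64,
    hash: Hash,
    parent_hash: Hash,
}

/// Checks that headers received in descending order form an unbroken chain
/// starting at the verified tip.
fn validate_descending(tip_hash: Hash, headers: &[SealedHeader]) -> Result<(), String> {
    // The first header must be the tip itself; each later header must be
    // the parent named by the header before it.
    let mut expected = tip_hash;
    for header in headers {
        if header.hash != expected {
            return Err(format!("header {} does not link to its child", header.number));
        }
        expected = header.parent_hash;
    }
    Ok(())
}
```

Because every header is pinned by the hash stored in its already-verified child, a peer cannot substitute an alternative history anywhere along the walk.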
This process continues until all of the headers have been downloaded and written to the database. Finally, the total difficulty of the chain's head is updated and the function returns `Ok(ExecOutput { stage_progress: current_progress, reached_tip: true, done: true })`, signaling that the header sync has completed successfully.
<br>
## BodyStage
Once the `HeaderStage` completes successfully, the `BodyStage` will start execution. The body stage downloads block bodies for all of the new block headers that were stored locally in the database. The `BodyStage` first determines which block bodies to download by checking if the block body has an ommers hash and transaction root.
An ommers hash is the Keccak 256-bit hash of the ommers list portion of the block. If you are unfamiliar with ommers blocks, you can [click here to learn more](https://ethereum.org/en/glossary/#ommer). Note that while ommers blocks were important for new blocks created during Ethereum's proof of work chain, Ethereum's proof of stake chain selects exactly one block proposer at a time, causing ommers blocks not to be needed in post-merge Ethereum.
The transactions root is a value that is calculated based on the transactions included in the block. To derive the transactions root, a [merkle tree](https://blog.ethereum.org/2015/11/15/merkling-in-ethereum) is created from the block's transactions list. The transactions root is then derived by taking the Keccak 256-bit hash of the root node of the merkle tree.
When the `BodyStage` is looking at the headers to determine which blocks to download, it will skip the blocks where the `header.ommers_hash` and the `header.transactions_root` are empty (that is, equal to the hashes of an empty ommers list and an empty transaction trie), denoting that the block is empty as well.
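In practice, "empty" means the two header fields equal well-known constants: the Keccak-256 hash of an RLP-encoded empty list for the ommers hash, and the root of an empty trie for the transactions root. The sketch below uses those two constants as hex strings; the `Header` struct, its `String` fields, and the `headers_needing_bodies` function are hypothetical simplifications.

```rust
/// Keccak-256 of the RLP encoding of an empty list.
const EMPTY_OMMERS_HASH: &str =
    "1dcc4de8dec75d7aab85b567b6ccd41ad312451b948a7413f0a142fd40d49347";
/// Root hash of an empty Merkle Patricia trie.
const EMPTY_TRANSACTIONS_ROOT: &str =
    "56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421";

/// Hypothetical header holding only the two fields relevant here, as hex.
struct Header {
    ommers_hash: String,
    transactions_root: String,
}

impl Header {
    /// A block is empty when it has neither ommers nor transactions.
    fn is_empty(&self) -> bool {
        self.ommers_hash == EMPTY_OMMERS_HASH
            && self.transactions_root == EMPTY_TRANSACTIONS_ROOT
    }
}

/// Returns the indices of the headers whose bodies must be downloaded.
fn headers_needing_bodies(headers: &[Header]) -> Vec<usize> {
    headers
        .iter()
        .enumerate()
        .filter(|(_, header)| !header.is_empty())
        .map(|(index, _)| index)
        .collect()
}
```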
Once the `BodyStage` determines which block bodies to fetch, a new `bodies_stream` is created which downloads all of the bodies from the `starting_block`, up until the `target_block` specified. Each time the `bodies_stream` yields a value, a `SealedBlock` is created using the block header, the ommers hash and the newly downloaded block body.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/primitives/src/block.rs anchor=struct-SealedBlock}}
The new block is then pre-validated, checking that the ommers hash and transactions root in the block header match those computed from the block body. Following a successful pre-validation, the `BodyStage` loops through each transaction in the `block.body`, adding the transaction to the database. This process is repeated for every downloaded block body, with the `BodyStage` returning `Ok(ExecOutput { stage_progress: highest_block, reached_tip: true, done })`, signaling that it completed successfully.
<br>
## SenderRecoveryStage
Following a successful `BodyStage`, the `SenderRecoveryStage` starts to execute. The `SenderRecoveryStage` is responsible for recovering the transaction sender for each of the newly added transactions to the database. At the beginning of the execution function, all of the transactions are first retrieved from the database. Then the `SenderRecoveryStage` goes through each transaction and recovers the signer from the transaction signature and hash. The transaction hash is derived by taking the Keccak 256-bit hash of the RLP encoded transaction bytes. This hash is then passed into the `recover_signer` function.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/primitives/src/transaction/signature.rs anchor=fn-recover_signer}}
In an [ECDSA (Elliptic Curve Digital Signature Algorithm) signature](https://wikipedia.org/wiki/Elliptic_Curve_Digital_Signature_Algorithm), the "r", "s", and "v" values are three pieces of data that are used to mathematically verify the authenticity of a digital signature. ECDSA is a widely used algorithm for generating and verifying digital signatures, and it is often used in cryptocurrencies like Ethereum.
The "r" is the x-coordinate of a point on the elliptic curve that is calculated as part of the signature process. The "s" is the s-value that is calculated during the signature process. It is derived from the private key and the message being signed. Lastly, the "v" is the "recovery value" that is used to recover the public key from the signature, which is derived from the signature and the message that was signed. Together, the "r", "s", and "v" values make up an ECDSA signature, and they are used to verify the authenticity of the signed transaction.
Once the transaction signer has been recovered, the signer is then added to the database. This process is repeated for every transaction that was retrieved, and similarly to previous stages, `Ok(ExecOutput { stage_progress: max_block_num, done: true, reached_tip: true })` is returned to signal a successful completion of the stage.
<br>
## ExecutionStage
Finally, after all headers, bodies and senders are added to the database, the `ExecutionStage` starts to execute. This stage is responsible for executing all of the transactions and updating the state stored in the database. For every new block header added to the database, the corresponding transactions have their signers attached to them and `reth_executor::executor::execute_and_verify_receipt()` is called, pushing the state changes resulting from the execution to a `Vec`.
{{#template ../templates/source_and_github.md path_to_root=../../ path=crates/stages/src/stages/execution.rs anchor=snippet-block_change_patches}}
After all headers and their corresponding transactions have been executed, all of the resulting state changes are applied to the database, updating account balances, account bytecode and other state changes. After applying all of the execution state changes, if there was a block reward, it is applied to the validator's account.
At the end of the `execute()` function, a familiar value is returned, `Ok(ExecOutput { done: is_done, reached_tip: true, stage_progress: last_block })` signaling a successful completion of the `ExecutionStage`.
<br>
# Next Chapter
Now that we have covered all of the stages that are currently included in the `Pipeline`, you know how the Reth client stays synced with the chain tip and updates the database with all of the new headers, bodies, senders and state changes. While this chapter provides an overview on how the pipeline stages work, the following chapters will dive deeper into the database, the networking stack and other exciting corners of the Reth codebase. Feel free to check out any parts of the codebase mentioned in this chapter, and when you are ready, the next chapter will dive into the `database`.
[Next Chapter]()