Surge is a Nethermind initiative.
Thanks to @donnoh and @Bartek for their helpful discussions on the article.
Overview
The goal of this document is to detail how the Surge rollup template ensures state roots can be proven and verified with respect to the rollup’s proving protocol. Surge is a fork of Taiko, and many concepts included in this article are borrowed and adapted from Taiko documentation. All of the Surge protocol specs outlined in this document are present and available for review in the open-source Surge codebase at time of publishing.
Key Property 1: Proven State Root
Consider a chain of transaction batches and a corresponding chain of state roots starting from genesis state (N=0), where state root N is the result of applying transaction batch N to the state root N-1. A state root N is considered proven if a validity proof from the rollup’s proving protocol has been submitted on-chain showing that state root N is the result of applying transaction batch N to state root N-1.
Note: A validity proof in Surge’s proving protocol is actually a combination of 2 validity proofs from two of the 3 supported proof systems in Surge.
Key Property 2: Verified State Root
Consider a chain of transaction batches and a corresponding chain of state roots starting from genesis state (N=0), where state root N is the result of applying transaction batch N to the state root N-1. A state root N is considered verified if all state roots [1,…,N] are proven.
Starting from the genesis state, verified state roots are effectively a chain of state updates with which the L1 chain and its users interact. At any point in time on the finalized L1 chain, there is one head of the verified chain of rollup state roots. If a user transfers tokens that originated on L1 (minted on L1) to the bridge contract on the rollup, as soon as a state root containing the transfer is verified on L1, the user is allowed to withdraw the corresponding tokens on L1.
By ensuring the eventual progression of verified state roots, we ensure user withdrawals can eventually be processed.
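The distinction between proven and verified state roots can be illustrated with a minimal Python sketch. The function name `verified_head` and the representation of proven state roots as a set of heights are assumptions made for illustration; they are not taken from the Surge codebase.

```python
def verified_head(proven: set[int]) -> int:
    """Return the largest N such that every state root in [1..N] is proven.

    Height 0 is the genesis state root, which this sketch treats as the
    starting point of the verified chain.
    """
    head = 0
    # Advance while the next state root in sequence has been proven.
    while head + 1 in proven:
        head += 1
    return head


# State roots 1, 2, 3 and 5 are proven; state root 4 is not.
proven_heights = {1, 2, 3, 5}
print(verified_head(proven_heights))  # 3: root 5 is proven but not verified
```

In this example, a withdrawal included in batch 5 cannot be processed on L1 until state root 4 is also proven, which is why the progression of verified state roots matters for users.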
Key Property 3: State Root Progression
Given a chain of verified state roots of height N>0, even in the presence of only one honest entity attempting to prove state roots, a state root at height N+1 is eventually proven and verified.
Note: State Root Progression does not say anything about the budget constraints of the honest entity or their access to proof generation. Both of these components are very important to consider, e.g. an honest entity with no money will not be able to pay for batch submission or proving costs.
State-Root Progression (SRP) Protocol
In this section, we define the general notion of a State-Root Progression (SRP) protocol, which all rollups must specify. After defining the general notion of an SRP protocol in this section, we proceed to outline and discuss Surge’s SRP protocol in the sections that follow.
An SRP protocol describes the steps required to eventually verify a state root update for a rollup. Additionally, an SRP protocol must describe the sub-protocols required to handle delays in state root updates when state roots are not being proven, and how to upgrade the rollup smart contracts to ensure the long-term progression of verified state roots.
An SRP protocol must define the following components:
- Supported proof systems: The proof system(s) used to prove state roots.
- Protocol for progressing state root updates.
- Protocol for pushing upgrades to the smart contracts: To ensure the long-term progression of state root updates, there must be a protocol for updating the on-chain smart contracts, e.g. for rollup performance or security improvements, or proof system patches.
- Protocol halting scenarios: Scenarios from which the protocol cannot recover, meaning an effective loss of access to all rollup funds on L1 that have not already been sent to the rollup bridge.
Note: An ideal SRP protocol would not have any halting scenarios, guaranteeing progression in all scenarios. We do not know whether such an SRP protocol is possible, given that bugs can never be fully ruled out.
The Surge State Root Progression Protocol
Before introducing Surge’s SRP protocol, we highlight some of the key principles and design decisions in Surge:
- Stage 2 from genesis.

  | Requirement | How it is satisfied in Surge |
  | --- | --- |
  | Immediate upgrades by Security Council are limited to, at most, on-chain provable faults | No immediate upgrades possible |
  | A minimum 30-day upgrade window for non-critical upgrades | 45-day upgrade window |
  | Users can exit during the upgrade window | Batch submission and proving is permissionless, so users can submit batches, and provide proofs for any batch they like |
  | Fraud proving is permissionless | Batch submission and proving is permissionless, so users can submit batches, and provide proofs for any batch they like |

- Removing governance-based attack vectors. This is achieved through the following design choices:
  - No emergency upgrade procedure to instantly upgrade Surge’s smart contracts, and as such, no requirement for a Security Council.
  - A 45-day upgrade window after upgrades are signaled by the designated multi-sig holders (see Glossary) before upgrades can be triggered, to enable users to withdraw funds in the case of unwanted upgrades.
  - A requirement for batches to be submitted and verified continuously throughout the upgrade window for an upgrade to be triggered (explained in detail [here]LINK). This ensures state root progression has been maintained during the upgrade window, guaranteeing users have had the opportunity to exit funds from Surge before the upgrade is triggered.
- A multi-proving approach.
  - Agreeing proofs from any 2 of the 3 Surge proof systems can be used to prove state roots. This improves the resilience of the protocol to bugs in individual proof systems.
  - Surge uses permissionlessly available proof systems, such that anyone can pay for and provide proofs to prove state roots in Surge [link, link, link].
  - At least one ZK-proof is required in all circumstances, in addition to one other proof.
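The multi-proving rule can be sketched in Python as follows. This is an illustration only: the `Proof` type, the system labels, and the function `is_accepted_pair` are hypothetical names, and each proof is assumed to have already passed its own on-chain verifier, which is not modeled here.

```python
from dataclasses import dataclass

# Labels are illustrative; SP1 and RISC Zero are the two ZK proof systems.
ZK_SYSTEMS = {"SP1", "RISC0"}
SUPPORTED_SYSTEMS = ZK_SYSTEMS | {"SGX"}


@dataclass(frozen=True)
class Proof:
    system: str      # which proof system produced the proof
    state_root: str  # state root the proof attests to (hex string for illustration)


def is_accepted_pair(a: Proof, b: Proof) -> bool:
    """Check the 2-of-3 rule for two proofs that each verified individually."""
    if a.system not in SUPPORTED_SYSTEMS or b.system not in SUPPORTED_SYSTEMS:
        return False
    if a.system == b.system:
        return False  # the two proofs must come from two different systems
    if a.state_root != b.state_root:
        return False  # the proofs must agree on the resulting state root
    # With a single non-ZK system (SGX), any two distinct systems already
    # include a ZK proof; the check is kept explicit to mirror the text.
    return a.system in ZK_SYSTEMS or b.system in ZK_SYSTEMS


print(is_accepted_pair(Proof("SGX", "0xabc"), Proof("SP1", "0xabc")))    # True
print(is_accepted_pair(Proof("SP1", "0xabc"), Proof("RISC0", "0xdef")))  # False
```

Because only one of the three supported systems is not a ZK proof system, the "at least one ZK-proof" requirement follows from requiring two distinct systems.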
Protocol Description
We now describe the Surge SRP protocol. See the code for full details.
1. Supported proof systems
- SGX-attestation of Reth-execution [link, link].
- SP1 zkVM using the Reth execution client as trace generator [link].
- RISC Zero zkVM using the Reth execution client as trace generator [link].
2. Protocol for progressing state root updates
In the following, we describe Surge’s protocol for ensuring state root progression:
- When a batch is submitted to the Surge smart contract on L1, a designated prover must be specified by the batch submitter. The designated prover is required to submit 2 agreeing state root proofs for the corresponding batch of transactions. The 2 proofs can be any agreeing combination of SGX, SP1, and RISC Zero proofs.
- The designated prover is required to deposit a liveness bond, committing to submit any 2 agreeing proofs within the proving window (currently 24 hours).
- If the designated prover provides the proofs within the proving window, the prover is able to claim back the liveness bond.
- If the designated prover does not provide the proofs within the proving window, any prover can submit 2 agreeing proofs for the batch. The submitting prover receives the entire liveness bond of the original prover. There is no time-limit for submitting these proofs.
- When 2 agreeing proofs for a state root corresponding to the batch are accepted on-chain, the state root is considered proven.
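The liveness bond rules above can be summarized in a short Python sketch. The function `bond_recipient` and its parameters are hypothetical. The sketch also makes two assumptions the text does not state: the window boundary is inclusive, and only the designated prover is modeled inside the proving window.

```python
PROVING_WINDOW_SECONDS = 24 * 60 * 60  # proving window, currently 24 hours


def bond_recipient(
    batch_submitted_at: int,
    proofs_accepted_at: int,
    designated_prover: str,
    submitting_prover: str,
) -> str:
    """Return who receives the liveness bond once 2 agreeing proofs are accepted.

    Timestamps are in seconds. There is no deadline for late proofs, so the
    function only distinguishes "inside" from "outside" the proving window.
    """
    in_window = proofs_accepted_at - batch_submitted_at <= PROVING_WINDOW_SECONDS
    if in_window:
        if submitting_prover != designated_prover:
            # Assumption: the text only describes the designated prover here.
            raise ValueError("only the designated prover is modeled inside the window")
        return designated_prover  # the bond is claimed back
    # After the window, the submitting prover receives the entire bond. The text
    # does not say what happens if that prover is the late designated prover.
    return submitting_prover


print(bond_recipient(0, 3_600, "alice", "alice"))  # alice
print(bond_recipient(0, 90_000, "alice", "bob"))   # bob
```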
3. Protocol for pushing upgrades to the smart contracts
Upgrades to the Surge protocol are implemented as follows:
- Upgrades to the Surge smart contracts can be queued to be pushed by the designated multi-sig key holders through calling the `upgradeTo` function.
- Queueing an upgrade initiates a 45 day upgrade window, after which queued upgrades are potentially eligible to be triggered.
- Triggering an Upgrade: Upgrades can be triggered by the designated multi-sig if all of the following conditions hold at time of triggering:
  - The upgrade was queued through the `upgradeTo` function at least 45 days before triggering.
  - The batch submission time of consecutive verified batches must be no more than 7 days apart throughout the entire upgrade window.
  - At least one batch is submitted and verified in the 7 days before triggering the upgrade.
Together, the requirements of point 3 ensure that batches have been submitted and verified continuously throughout the upgrade window, guaranteeing users have had an opportunity to verify withdrawal transactions on Surge and exit the rollup before an upgrade can be triggered. This captures the true essence of the upgrade window, giving users a genuine opportunity to exit the rollup before an upgrade is triggered, even in the face of adversarial upgraders and proof systems.
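The three triggering conditions can be sketched as a single check in Python. This is a simplified model and not the contract logic: `can_trigger_upgrade` and its parameters are hypothetical, each batch is represented only by its submission time, and every listed batch is assumed to be verified by the time of triggering. The sketch also assumes that the gap between queueing and the first batch counts towards the 7-day limit, which the text does not state explicitly.

```python
DAY = 24 * 60 * 60
UPGRADE_WINDOW = 45 * DAY  # minimum delay between queueing and triggering
MAX_GAP = 7 * DAY          # maximum gap between consecutive verified batches


def can_trigger_upgrade(
    queued_at: int, now: int, verified_batch_submission_times: list[int]
) -> bool:
    """Return True if a queued upgrade may be triggered at time `now` (seconds)."""
    # Condition 1: the upgrade was queued at least 45 days before triggering.
    if now - queued_at < UPGRADE_WINDOW:
        return False

    # Verified batches submitted between queueing and triggering.
    times = sorted(t for t in verified_batch_submission_times if queued_at <= t <= now)
    if not times:
        return False

    # Condition 2: consecutive verified batches are at most 7 days apart
    # (assumption: measured starting from the queueing time).
    points = [queued_at] + times
    if any(later - earlier > MAX_GAP for earlier, later in zip(points, points[1:])):
        return False

    # Condition 3: at least one verified batch in the 7 days before triggering.
    return now - times[-1] <= MAX_GAP


steady = [d * DAY for d in range(5, 50, 5)]  # one batch every 5 days
print(can_trigger_upgrade(0, 46 * DAY, steady))                     # True
print(can_trigger_upgrade(0, 44 * DAY, steady))                     # False: too early
print(can_trigger_upgrade(0, 46 * DAY, [5 * DAY, 30 * DAY, 45 * DAY]))  # False: gap
```

The third call shows the intent of the rule: a long pause in verified batches during the upgrade window means users may not have been able to exit, so the upgrade cannot be triggered.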
4. Protocol halting scenario
We identify two main classifications of protocol halting scenarios:
- Hostile upgrades (as coined by Vitalik). At any point in time, an upgrade can be queued which, if triggered, could permanently halt the chain, and not only by inserting proof system bugs. This is an issue faced by all upgradeable rollups. Some basic examples of hostile upgrades include restricting proving to the null address (for which there is no known signing key), adding a fee of 1 billion ether to submit a batch, or reducing max batch gas to 0.
  Importantly, provided state root progression is maintained while a hostile upgrade is queued (a requirement for pushing upgrades in Surge), users have the full length of the upgrade window to withdraw their funds from the rollup before it halts.
- No 2 agreeing proofs. If no 2 agreeing proofs between SGX, SP1, and RISC Zero proofs for a batch are accepted by the respective verifiers on L1, a state root cannot be considered proven. In this case, the Surge chain is halted and users cannot withdraw or interact with tokens not already withdrawn to L1. Even if further batches are proved, these batches will never become verified since this step is blocked by the unprovable batch. The Proof System Failure Scenarios section in the Appendix outlines the scenarios that may prevent the generation and/or acceptance of 2 agreeing proofs.
Note: Although halting scenarios due to proof system failures are possible in Surge, such scenarios would be as a result of two or more proof system failures. If proof systems are formally verified and rigorously tested, the probability of proof system failures approaches 0. Given our multi-prover set-up, this means the probability of halting is closer to 0 than the probability of any single proof system failing, as even one proof system can fail while still maintaining state root progression. This puts the onus on proof system developers to minimize the probability of proof system failures, work which is ongoing.
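As a rough illustration of this claim, suppose (purely as an assumption, not something established in this document) that each of the three proof systems fails on a given batch independently with the same probability $p$. Since a halt requires at least two failures, the halting probability is bounded by the probability that two or more systems fail:

$$
P(\text{halt}) \le P(\text{at least 2 of 3 fail}) = 3p^2(1-p) + p^3 = 3p^2 - 2p^3 .
$$

This bound is smaller than the single-system failure probability whenever $p < 1/2$:

$$
3p^2 - 2p^3 < p \iff 2p^2 - 3p + 1 > 0 \iff (2p - 1)(p - 1) > 0 \iff p < \tfrac{1}{2} \quad (0 < p < 1).
$$

For example, $p = 0.01$ gives $3(0.01)^2 - 2(0.01)^3 = 0.000298$. In practice failures are unlikely to be fully independent, for instance because SP1 and RISC Zero both rely on the Reth execution client, so this should be read as an idealized estimate rather than a guarantee.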
Our reliance on proof systems is in contrast to governance-based Stage 2 rollups, where a relatively small group of people can introduce halting scenarios with immediate effect in the case of a single proof system failure. Comparing social vulnerabilities to code-based vulnerabilities is not an exact science. However, we at Surge favor transparency through encoded security above any alternative.
Future improvements for the Surge SRP
We list below some SRP protocol improvements that may be adopted by Surge in the future, addressing some of the current SRP protocol’s drawbacks. Note that the effectiveness of these improvements has not been validated, but they are likely to incrementally improve the resilience of Surge to proof system failures and halting.
- Use the Nethermind client as trace generator for supported proof systems. This would eliminate the semantic discrepancies that can appear between the main execution client (currently the Nethermind client) and the clients generating traces within the proof systems (currently Reth). This amendment would reduce the probability of forks between the executed state of the rollup and the state that gets verified on L1.
- Only require a single proof to prove a state root in certain scenarios. In the case of all three proof systems providing disagreeing proofs, or two of three systems crashing, outline a fallback mechanism to reduce the number of required proofs to prove a state root. With this, state root progression can be maintained in more proof system failure scenarios. This alternate design does come with new risks related to how this proof system is chosen, and when exactly the two-proof requirement should be relaxed.
- Support more proof systems for computing state root proofs. Having a larger number of unique proof systems increases the likelihood of computing agreeing state root proofs for meaningful state roots. Some areas that would need to be properly considered for this solution:
- How many of these proof system proofs should be required to prove a state root?
- How will the costs for these new provers be covered?
- What are the criteria for adding new proof systems to the SRP protocol?
Appendix
Glossary
Designated multi-sig holders: A group of trusted individuals or entities that collectively manage certain critical functions of the rollup through a multi-signature wallet. The multi-sig holders are responsible for securing the assets and operations within the rollup. Multi-sig holders can have the authority to approve changes or upgrades to the rollup contracts, allowing the system to evolve and improve over time while ensuring that changes are agreed upon by multiple parties.
In Surge, the initial designated multi-sig holders are all Nethermind employees. This does not affect the Stage 2-ness of Surge, given our strict state root progression requirements before upgrades can be triggered.
Halting: The halting of a rollup refers to the temporary or permanent cessation of its operations. During a halt, users might be unable to execute transactions or access their tokens within the rollup.
Liveness Bond: A term borrowed from Taiko, this is a specific amount of collateral which must be deposited by the designated prover of a batch. The liveness bond is only returned to the designated prover if the batch is proven within the proving window (currently set to 24 hours). If the proving window expires, any prover may submit 2 agreeing proofs to capture the liveness bond.
Proven Batch/State Root: Two agreeing validity proofs for the batch/state root have been pushed on chain.
State root/root update: A state root represents the updated state of all accounts within the rollup after processing the transactions within a rollup batch. State roots are periodically submitted on the underlying L1 for verification. State root updates ensure the integrity of the off-chain computation and synchronize the state of the rollup with the L1.
Upgrade window: According to L2Beat, Stage 2 rollups should offer users at least 30 days to exit the system in case of unwanted upgrades. Surge has a 45-day upgrade window, after which queued upgrades to any smart contract of the rollup can be triggered. The upgrade is triggered by the designated multi-sig key holders. The upgrade window is intended to provide users with ample time to exit the protocol before any upgrade is pushed.
Verified Batch/State Root: A proven batch/state root is verified if all parent batches/state roots of the batch/state root are also proven. This is terminology borrowed from Taiko.
Proof system failure scenarios
To the best of our knowledge, proof system failures can be classified by one of the following scenarios:
- The execution client processes a transaction batch, but the proof system does not: The batch is valid with respect to the execution client, but the proof system is not able to generate a state root proof for it. This can be caused by discrepancies between the semantics of the execution client and that of the proof system, or due to the fact that the computation needed to generate the proof is too complex or resource-intensive, potentially overwhelming the Prover’s capabilities.
- State fork between the execution client and the proof system: After processing a batch, the execution client transitions to a state, while the proof system outputs a state root proof showing the transition to a different state. This can be caused again by discrepancies between the semantics of the execution client and that of the proof system, especially when the execution client differs from that used by the proof system, or by bugs in the proof system which could enable construction of false proofs that an on-chain honest verifier would accept.
- Valid state root proof is not accepted by the on-chain verifier: If the on-chain verifier has bugs, it can refuse valid state root proofs.
Individual prover failures are also possible, but because of the permissionless nature of proof submission in Surge, we do not consider these further.