Achieving base layer functionality escape velocity without on-chain smart contracts, using sovereign ZK rollups

A core design choice of Celestia is to minimise on-chain state, thereby creating an overhead-minimised DA layer. This means that Celestia does not have an on-chain smart contract environment, and does not act as a “settlement layer” (i.e. a bridging hub) for rollups. Instead, developers are expected to deploy their own settlement layers and shared security clusters on Celestia that rollups can use if they wish.

This allows Celestia to be credibly neutral to rollup settlement layers or shared security clusters built on top of it regardless of what execution environment they use. However, this means that although rollups can form trust-minimised validating two-way bridges with each other, they cannot form one with the Celestia L1 itself to bridge Celestia tokens (they can, however, use IBC and non-native committee-based bridging solutions). For this reason, we do not call rollups on Celestia “L2s”; they are independent (and possibly sovereign) clusters of chains that use Celestia for ordering or DA.

In order to allow for rollups to be L2s of Celestia and form a trust-minimised bridge, Celestia needs to achieve “functional escape velocity”, which in this case means having a sufficiently expressive execution environment such that you can run a validating bridge (i.e. run a light client for the rollup) that can withdraw or deposit assets to a rollup.

Previous wisdom has told us that in order to achieve functional escape velocity, you need a smart contract environment where it is possible to “execute custom user-generated scripts on-chain”. However, is it possible to achieve functional escape velocity without a smart contract environment, thus still retaining Celestia’s credible neutrality and overhead-minimisation? It turns out that yes, you can, by simply adding a “ZK verification key address type” to Celestia, and utilising sovereign ZK rollups.

zkSNARK verification key as a blockchain account

A zkSNARK verifier consists of a verifier V that computes V(vk, x, prf) and returns true if the proof is correct, where vk is a verification key that corresponds to some ZK program, x is a public input, and prf is the proof. We notice that this is actually very similar to verifying signatures for standard blockchain addresses with standard public keys (e.g. using ECDSA). Therefore, what if we simply added a new type of blockchain address and “signature scheme”, where a blockchain address is a ZK verification key that funds can be sent to, and funds can only be spent from that address upon providing a valid public input and proof (the “signature”)?
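To make the analogy concrete, here is a minimal sketch (the names and the toy verifier are hypothetical; a real implementation would run an actual SNARK verifier): the spend check for a ZK-verification-key address slots in exactly where a signature check would.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable

# Hypothetical verifier interface: V(vk, x, prf) -> bool
Verifier = Callable[[bytes, bytes, bytes], bool]

@dataclass
class SpendAuthorization:
    public_input: bytes  # x: the public input to the ZK program
    proof: bytes         # prf: plays the role of the "signature"

def can_spend(verifier: Verifier, vk: bytes, auth: SpendAuthorization) -> bool:
    # Funds at the address derived from `vk` are spendable iff the proof
    # verifies against the verification key and public input; this is
    # analogous to checking an ECDSA signature against a public key.
    return verifier(vk, auth.public_input, auth.proof)

# Toy stand-in for a real SNARK verifier, for illustration only:
# it accepts a proof iff the proof equals a hash of (vk, x).
def toy_verifier(vk: bytes, x: bytes, prf: bytes) -> bool:
    return prf == hashlib.sha256(vk + x).digest()
```

The design choice here is that the verification key itself acts as the account identifier, so no new on-chain state is needed beyond what an ordinary account already requires.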

This would allow you to send funds to arbitrary ZK programs that dictate the conditions needed for funds to be withdrawn. That ZK program can be a light client for a sovereign ZK rollup that only allows funds to be withdrawn if a valid withdrawal transaction occurred on the ZK rollup. A sovereign ZK rollup does not require an on-chain smart contract to track its latest state. Instead, the rollup’s fork-choice rule can be executed directly inside the ZK rollup’s program. Quoting Sovereign Labs:

By tying calldata back to the L1 block headers, we can add a statement saying “I have scanned the DA layer for proofs (starting at block X and ending at block Y), and this proof builds on the most recent valid proof”. This lets us prove the fork-choice rule directly, rather than enforcing it client side! And if we’re already scanning the DA layer for proofs, we can easily scan for forced transactions as well.

From an implementation perspective, the basic idea is that the public input to the rollup’s Celestia ZK address would be the last Celestia block height (which would be enforced as an input by Celestia’s state machine), and the identifier of some “withdrawal transaction” on the rollup. The ZK program would return true if up to that block height, the identifier corresponds to a valid, unclaimed, withdrawal transaction.
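A minimal sketch of this check, assuming a hypothetical input encoding and verifier interface (none of these names come from Celestia's actual state machine):

```python
def encode_input(height: int, withdrawal_id: bytes) -> bytes:
    # Celestia's state machine would enforce `height` itself as an input,
    # so the ZK program cannot misstate how much of the DA layer it scanned.
    return height.to_bytes(8, "big") + withdrawal_id

def check_zk_withdrawal(verifier, vk: bytes, celestia_height: int,
                        withdrawal_id: bytes, proof: bytes) -> bool:
    # The ZK program behind `vk` is expected to return true iff, up to
    # `celestia_height`, `withdrawal_id` corresponds to a valid, unclaimed
    # withdrawal transaction on the rollup.
    return verifier(vk, encode_input(celestia_height, withdrawal_id), proof)
```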

Addendum

  • Optimistic rollups can also benefit from this, if they settle and post their fraud proofs to an intermediate or “wrapper” sovereign ZK rollup. This is effectively “ZK proofs of fraud proofs”.

  • There is a question over what zk scheme to support, and if we need to support multiple. However, even if we support one, and a rollup uses a different scheme, they may be able to use recursive proofs (a ZK proof of a ZK proof).

  • This post describes the most state-minimised functional escape velocity possible; however, if we add some extra state, it can be beneficial to rollups. For example, if we add a special transaction type to also update the state root of a rollup on-chain, the rollup no longer has to recursively prove previous proofs.

Thanks to Preston Evans for helpful comments on this post.

Comment from Uma:

The question of what ZK scheme to support is inherently political and can get kind of tricky. There are a lot of different variations, even within families of similar protocols. For example with FRI-based protocols, different protocols use different hash functions for the merkelization and sometimes use different degree extensions.

I think groth16 (which Ethereum has), is probably the most universal one that maybe everyone can agree on (because of Ethereum), but again that is a choice to be made.

Hey, what do you think about other folding schemes, since Groth16 seems to be pretty slow in comparison? Even PLONK is nearly 5x faster than Groth16.

I think this is a great proposal. I have a few questions/observations.

  1. Does this not totally go against the “Sovereign Rollup” narrative? The RUs will not be able to fork away from the validating bridge based on their own social consensus. So we are back to the enshrined bridge on L1, L1<>L2 paradigm.

  2. Going further, you mention:

I didn’t realise this point, thank you for pointing it out: for pure sovereign RUs there is no state tracked on-chain; a proof attests to the validity of the RU, tying it to the current DA layer state root. For this to be valid, the proof has to verify the previous proof.

There are also further efficiency gains:

  • it would also mean that the RU would not have to “scan the DA layer” for proofs, as the L1 would keep track of the state, making the proof more efficient. (Scanning for txs might still be required).
  • It will be easier to bridge assets between the RUs, as RUs will not have to verify each others’ proofs, this will be done by the L1. They can prove that a RU is in a certain state just from the DA layer.
  3. I think we should add further state for X-RU txs; this will make the system more efficient. However, the explanation turned out to be quite long and it does not belong here, so I will create a new post instead.
Does this not totally go against the “Sovereign Rollup” narrative? The RUs will not be able to fork away from the validating bridge based on their own social consensus. So we are back to the enshrined bridge on L1, L1<>L2 paradigm.

Sovereign rollups can have validating bridges too, as long as they are upgradeable. Sovereignty is primarily a social distinction rather than a technical one. See: https://twitter.com/musalbas/status/1644014320555098112

That being said, the idea in this post can also be used by “non-sovereign” rollups, what I actually mean are rollups where the fork-choice rule is proven succinctly inside the rollup (or by the rollup’s client directly) rather than by a smart contract, which is what Sovereign Labs means by sovereign rollup.

I think we need to probably refine the terminology a bit, or differentiate between “technical sovereignty” (rollups that do not need a smart contract to execute the fork choice rule) and “social sovereignty” (rollups that are upgraded by the community with hard forks).

  • it would also mean that the RU would not have to “scan the DA layer” for proofs, as the L1 would keep track of the state, making the proof more efficient. (Scanning for txs might still be required).

I think that’s true, they don’t need to scan the DA layer for the proofs, just the transactions, but they’d still need to post the proofs on-chain using the mentioned special transaction type.

  • It will be easier to bridge assets between the RUs, as RUs will not have to verify each others’ proofs, this will be done by the L1. They can prove that a RU is in a certain state just from the DA layer.

Sovereign Labs has an idea to aggregate ZK proofs so that sovereign ZK rollups don’t need to run a ZK verifier for every other rollup, mentioned here: https://members.delphidigital.io/reports/the-complete-guide-to-rollups/

I love the idea and have some basic questions to clarify how this would work. I think there are a lot of gaps in my understanding so apologies if some of these don’t make sense.

Questions:

  1. This post describes what the inputs to the proof are and what statement the proof proves, but what does the withdrawal transaction on Celestia actually look like and what does the verification script look like?
  2. How does the zk program prove that a given withdrawal transaction is unclaimed? Does it scan for the claim transaction on Celestia? Maybe this would be easier to enforce on the Celestia side by keeping an accumulator of redeemed transaction ids.
  3. How does Celestia know which destination address to send the withdrawn funds to? Is this included in the Celestia withdrawal transaction and if so, how do Celestia nodes verify that this is the correct address?
  4. Will it be expensive to withdraw from the bridge because you need to generate a proof each time?
  5. Will adding this functionality make it harder to fraud-prove the Celestia state machine? I wouldn’t think so, but just being cautious since in other discussions there was mention of the verification script needing access to all the previous block hashes from genesis.
I see, we discussed it once before on twitter, but I’m not sure we got to the end of it.

It seems like there are a few definitions of a sovereign rollup, and I think this is causing my confusion (please correct where wrong; I will also try to add technical and social sovereignty):

  1. Definition in docs: https://celestia.org/learn/sovereign-rollups/an-introduction/

“Uniquely, DA layers don’t verify whether sovereign rollup transactions are correct. Nodes verifying the sovereign rollup are responsible for verifying whether new transactions are correct”

The nodes scan the DA layer here.

Clearly, by adding zk-proof verification (the original topic), the DA layer verifies the transactions, and even the state, so the RUs will not be sovereign. I was going by this definition, but then this is out of date?

This is technically sovereign (as fork choice is done by DA), and socially sovereign, as the community can hard fork.

  2. Sovereign Labs definition (if I understood correctly): the state of a RU is a pure function of the DA layer, so we don’t need the DA layer or an L1 to decide the fork-choice rule; we will just prove it. This is very similar to (1), except the RU is sovereign even if the DA layer receives the proof; that is just a bridge. So this is a method for deciding the fork-choice rule, i.e. for technical sovereignty. The RU itself can be upgradeable, or it can be fixed, so this is agnostic on social sovereignty.

  3. Current definition: any rollup that is upgradeable separately from the L1, i.e. all Eth rollups today, though maybe not rollups on Tezos. (Enshrined, non-upgradable RUs are not sovereign.) This is not technical sovereignty, but it is social.

I would like to add that I think we should further separate strong and weak social sovereignty. A chain is weakly sovereign if it can only upgrade itself according to some predetermined rule (e.g. a smart contract or program). Strong sovereignty means that there are no rules; the community is free to decide. According to this, ETH, BTC, and type 1 Sov RUs are all strongly sovereign, as are most L1s. L1s with fixed governance mechanisms and type 2 and 3 Sovereign RUs are weakly sovereign.

Currently, I think most people understand sovereignty to mean strongly sovereign, which is only possible for L1s and type 1 Sov RUs.

  1. It’s just a standard Celestia transaction, except you’re spending funds from a wallet owned by a ZK verification key, rather than an individually owned key.
  2. Yes, the ZK program would have to scan the DA layer for claimed withdrawals to keep track of which withdrawals have been redeemed or not.
  3. It’s specified in the “to” field in the Celestia transaction as described in (1). True, we need to enforce that the withdrawal address belongs to the user that actually initiated the withdrawal in the rollup. We could do this by adding the “to” field as a Celestia-enforced input to the ZK verifier, which the ZK verifier checks actually matches the withdrawal in the rollup.
  4. My assumption would be that a prover would generate the “base” proof to prove the fork-choice rule of the rollup at a specific height, which can be built upon in a recursive proof for each withdrawal, so each withdrawal wouldn’t need to duplicate effort proving the same computation.
  5. I don’t think so, I believe you only need the most recent hash(es).
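A hedged sketch combining the accumulator idea from question (2) with the Celestia-enforced “to” field from answer (3); all names are hypothetical, and the real design may scan for claims inside the ZK program instead:

```python
import hashlib

def toy_verifier(vk: bytes, x: bytes, prf: bytes) -> bool:
    # Stand-in for a real SNARK verifier, for illustration only.
    return prf == hashlib.sha256(vk + x).digest()

def check_withdrawal_spend(verifier, vk: bytes, height: int,
                           withdrawal_id: bytes, to_addr: bytes,
                           claimed: set, proof: bytes) -> bool:
    # An L1-side accumulator of redeemed withdrawal ids (here just a set)
    # rejects double-claims without the ZK program scanning for claim txs.
    if withdrawal_id in claimed:
        return False
    # The `to` field is bound into the public input, so the ZK program can
    # check it matches the recipient named in the rollup-side withdrawal.
    public_input = height.to_bytes(8, "big") + withdrawal_id + to_addr
    if not verifier(vk, public_input, proof):
        return False
    claimed.add(withdrawal_id)
    return True
```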
To be more precise, this should be that DA layers don’t define the state validity rules of the rollup. If there is a bridge with the DA, it shouldn’t be considered as enshrined, but just another bridge, and may therefore need to be upgradable in practice to follow rollup upgrades via hard forks. Kelvin also has some thoughts on this. The original definition of sovereign rollup is specified in this post:

A rollup chain is sovereign if it does not enshrine a settlement layer to determine the canonical chain and the transaction validity rules of the rollup. Rather, the canonical chain of the rollup is determined by the nodes in the rollup’s peer-to-peer network (provided that the blocks are available on the data availability layer). This means that the settlement layer cannot force inclusion of transactions into the rollup.

To “not enshrine a settlement layer” is primarily a social distinction rather than a technical one, which means that there is a social contract between the rollup’s community that the rollup’s transaction validity rules are defined by the community rather than an immutable L1 contract. In practice, this means that bridges to the rollup, which are not enshrined, must be mutable so that there is an upgrade path which acknowledges hard forks on the sovereign rollup (discussed in the next section).

But as I said above:

That being said, the idea in this post can also be used by “non-sovereign” rollups, what I actually mean are rollups where the fork-choice rule is proven succinctly inside the rollup (or by the rollup’s client directly) rather than by a smart contract, which is what Sovereign Labs means by sovereign rollup.

Thanks, this makes it much clearer.

I think proving the fork-choice rule inside of a ZK rollup is a very interesting idea, but I’m not sure about the practicality of such a construction.

This would allow you to send funds to arbitrary ZK programs that dictate the conditions needed for funds to be withdrawn. That ZK program can be a light client for a sovereign ZK rollup that only allows funds to be withdrawn if a valid withdrawal transaction occurred on the ZK rollup. A sovereign ZK rollup does not require an on-chain smart contract to track its latest state. Instead, the rollup’s fork-choice rule can be executed directly inside the ZK rollup’s program. Quoting Sovereign Labs:

By tying calldata back to the L1 block headers, we can add a statement saying “I have scanned the DA layer for proofs (starting at block X and ending at block Y), and this proof builds on the most recent valid proof”. This lets us prove the fork-choice rule directly, rather than enforcing it client side! And if we’re already scanning the DA layer for proofs, we can easily scan for forced transactions as well.

I have a few preliminary thoughts that I would like to table on the practicality of such a construction, with the intention of motivating discussion and finding potential solutions. I wonder if those more familiar with the construction have considered the following.

  • I believe Celestia’s instantiation of the Namespaced Merkle Tree uses SHA-256 as the hashing function. This is quite inefficient in a ZKP context. Some benchmarks can be found here.
  • A possible attack vector would be to spam a rollup’s namespace with garbage data. The rollup would have to prove that the transactions are garbage, and this would be expensive. The cost of the attack would be cheap but the impact could be large.
  • The depth of the NMT is sensitive to the total throughput of Celestia. This means that the number of layers that have to be traversed to reach the root of the rollup tree of interest is sensitive to the total throughput. Not really a significant point in my opinion but something I have noted.

My biggest concern is about spam transactions and their implications on the practicality of proving the fork choice rule. Has anyone considered this?

  • I believe Celestia’s instantiation of the Namespaced Merkle Tree uses SHA-256 as the hashing function. This is quite inefficient in a ZKP context. Some benchmarks can be found here.

I think it’s possible to optimize this quite a bit. For example, this sha256 STARK using plonky2 can do 140 hashes/sec on a MacBook.

  • A possible attack vector would be to spam a rollup’s namespace with garbage data. The rollup would have to prove that the transactions are garbage, and this would be expensive. The cost of the attack would be cheap but the impact could be large.

This is solvable with intra-namespace blob prioritization, where the rollup only processes, e.g., the 100 highest-priority blobs (according to the gas they paid).
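A minimal sketch of such prioritization (field names hypothetical): deterministically select the top-N blobs in a namespace by gas paid, so every node and the ZK program agree on exactly which blobs the rollup must execute, bounding the proving cost an attacker can impose:

```python
from dataclasses import dataclass

@dataclass
class Blob:
    data: bytes
    gas_paid: int

def select_blobs(namespace_blobs: list[Blob], limit: int = 100) -> list[Blob]:
    # Rank by gas paid (descending), breaking ties by position in the
    # block so the selection is fully deterministic.
    ranked = sorted(enumerate(namespace_blobs),
                    key=lambda p: (-p[1].gas_paid, p[0]))
    # Keep the top `limit`, then restore original block order for execution.
    chosen = sorted(ranked[:limit], key=lambda p: p[0])
    return [blob for _, blob in chosen]
```

Spam blobs below the cutoff are simply never part of the rollup's state transition, so the prover never has to prove anything about them beyond their exclusion.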

  • The depth of the NMT is sensitive to the total throughput of Celestia. This means that the number of layers that have to be traversed to reach the root of the rollup tree of interest is sensitive to the total throughput. Not really a significant point in my opinion but something I have noted.

I don’t think this is a big concern given Merkle tree depths are O(log(n)) for n items.
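To make the O(log(n)) point concrete, a tiny helper (a simplification assuming a balanced binary tree):

```python
import math

def merkle_depth(n_leaves: int) -> int:
    # Depth of a balanced binary Merkle tree over n leaves, i.e. the
    # number of hashes in one inclusion proof path.
    return math.ceil(math.log2(n_leaves)) if n_leaves > 1 else 1

# Going from ~1k to ~1M shares only adds ~10 hashes per proof path.
```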

Great post and thanks for sharing!

As a builder on Celestia, I am both excited and worried about changes to the base layer. Excited, because if approached properly the whole ecosystem stands to benefit from superior base-layer functionality. Worried, because important resources like time and focus can turn out to have been misspent due to shifts in architecture. Community norms and informal policies can be helpful in directing teams to focus on the right items. I really enjoyed your recent post on defining Celestia’s values as a start to this.

This is also relevant in the context of rollup bridging, where teams such as Neutron (my own), Dymension and others have been shaping some of their roadmap around being a bridge solution for Celestia rollups. Of course we support Celestia’s development in building the world’s first (and best?) modular blockchain, but in the same breath it is also useful to understand where the puck is going so we can collaborate as effectively as possible.

My question for you is: 5-10 years down the line, what parts of bridging would you want to see happen directly on Celestia, and what parts would you want to see handled within sovereign rollup clusters?

(also interested in hearing anyone else’s thoughts on this and not just @musalbas)

Can you elaborate on what you see as the different parts of bridging? I think the end goal should be to allow for trust-minimized bridges between Celestia and rollups (i.e. validating bridges that validate the STF using fraud/zk proofs), while keeping the state machine as minimal/baggage-free and neutral as possible (i.e. avoiding enshrining a specific smart contract environment as much as possible). In that sense it’s probably similar to Bitcoin’s roadmap for adding a ZK opcode in a few years, to allow for ZK rollups. It’s also a very similar design philosophy to IBC in some sense, as IBC does not rely on smart contracts, but now we need to generalize IBC into trust-minimized bridges that support fraud or ZK proofs.

I have been meaning to respond to this for some time after seeing this question posed to Zooko on Twitter.

The execution layer tradeoff space for Celestia is an interesting one for enshrined SNARK verification. It seems as though minimization of state is a goal of the Celestia L1, which means succinct proofs would be the best fit. The question is, just how “succinct” do you need them to be? This ranges from pairing-based SNARKs, which include Groth16 and the KZG PCS (~200B), to non-pairing PCSs including IPA, Spartan, and other multivariate PCSs (1-10KB), but probably not hash-based schemes like FRI (10-100s of KB). To Zooko’s point, Groth16 requires a per-circuit trusted setup, but this can be an application-level decision depending on just how specific the verification protocol is. The KZG PCS enables succinct proofs for plonkish schemes with less trust (as it is an updateable, universal setup as opposed to Groth16’s per-circuit setup). So not all trust is created equal here; some of these ceremonies can be taken and built upon, making them quite robust (a powerful tradeoff for the succinctness they provide).

More applications will trend toward folding schemes as the tooling matures, which adds additional verifier logic (hashing and verification over one curve for the primary proof, and a folding proof over a second curve). Doing this directly on-chain is not going to be succinct enough. Compression proofs here will likely be commonplace for efficiency, becoming a proof like any other (with maybe a bit of extra work that is expensive to do in-circuit).

All of this is defined over elliptic curve(s) (pairing-based for pairing PCSs, dlog-hard for non-pairing-based), which means it really comes down to popularity over usefulness for many. Given that BN254 is what Ethereum supports, with a lot of effort going into making tooling efficient for that curve and half-cycles, it will likely be the best choice for many projects. I don’t know the cost of supporting many curves at the protocol level, but that would be most beneficial, as curve choice impacts the efficiency of the proving system for many applications.

In the end it really comes down to the concrete limitations to be enforced on state growth, validator computational overhead, and cost modeling, and less about particular SNARK protocols, because one, they are likely to continue to evolve (while being built upon the same low-level primitives) and two, different apps require different performance and trust.

This is a very interesting discussion that would greatly expand application complexity and what’s possible to build upon Celestia.

It’s also a very similar design philosophy to IBC in some sense, as IBC does not rely on smart contracts, but now we need to generalize IBC into trust-minimized bridges that support fraud or ZK proofs.

Would be great to see this functionality added in the form of a new IBC client.

It seems right now the primary way the Celestia state machine is expanding is through new IBC functionality (e.g. PFM, ICA), so it could make sense for ZK accounts to also be implemented via IBC. As a generalized IBC standard, it could have the added benefit of improving interop security and adoption across the wider rollup/appchain ecosystem.

IBC channels are already structured as native accounts like the proposed ZK accounts, except instead of being controlled by an external private key or multisig (or a specific ZK construction), they are controlled by an IBC client (generally a light client for an external consensus instance). There has long been a desire for IBC clients to expand to support ZK state transition proofs (and there are various zkIBC projects), but without support for such proofs in the SDK, it has been hard to motivate a common standard. Perhaps the use case of Celestia bridging TIA into and out of its rollups can really help here.

I second this motion. I don’t have anything further to add on the technical feasibility, but it is an important ideal to establish common standards that can be transported to other protocols.

I strongly recommend that we do NOT use PFM on Celestia.

PFM violates the intended design of IBC on Celestia and the minimalist philosophy behind Celestia, and it has caused nearly 100% of the chains it has been deployed on to need emergency surgery.


Very yes.
