
ZKP on the Client-side: Challenges & Our Solutions

And our solution when building OpenID3

This blog is more of a casual, chatty piece about the experience and the story behind how we composed a decent client-side proving ZKP system. It's not supposed to be a technical deep dive, but more of a fun piece about the general ideas that inspired us when building OpenID3.

I'll get straight to it. Building a good zero-knowledge proof system on the client side is way harder than it seems. To make it work for OpenID3, we have put in a lot of effort and made plenty of compromises. In this blog, I assume you have some basic knowledge of what zero-knowledge proofs are.

What is it like to build a ZKP system?

ZKP system engineering, as it exists today, mostly serves to scale Ethereum. There are indeed some awesome client-side applications, but most of them either run as an overlay on another system, consist of straightforward circuits, or target a user group that cares less about UX.

On the surface, when I'm at events and open the conversation by saying I'm a ZKP engineer, the most common response is somewhere between asking if I work for zkSync and asking if we are building yet another zk-rollup layer 2. After a while, it gets kinda annoying. It's kinda like saying you live in Singapore and having people ask if they really cane people there LOL. Anyway, I'm just trying to broadly paint the scene here - most ZKP systems are not built for client-side applications.

So why not? There are a few major considerations that take a completely different priority ordering for client-side ZKP than for the rollup types. I'll dive into these questions by starting with a breakdown of the relevant concepts.

  1. The prover and the verifier - it's straightforward: the prover has some information and a defined program to process that information. The prover takes the information and the program and spits out a proof. The verifier takes the program and the proof and verifies that the prover is honest. For zkSNARKs, proving takes a while, while the verifier should be able to check the proof cheaply and quickly. In most ZKP systems, the verifier is a smart contract.

  2. The witness and the circuit - well, following the above concepts: the information the prover has is called the witness, and the defined program is called the circuit. A proof can be generated given a circuit and a witness.

  3. The private and the public inputs - this is a slightly twisted concept. Generally, for any ZKP system, there is not much difference between program outputs and program inputs. After all, the essential job of a ZKP system is to prove that the process was run correctly. It might take a moment to sink in - but just remember this: public inputs are, essentially, equivalent to public outputs.

  4. The application proof and the aggregated proof - most ZKPs need to be aggregated, and some ZKP systems call this feature recursive zkSNARKs. Basically, the application proof covers the circuit that the application is trying to prove, and the aggregated proof is a standardization step attesting that a bunch of ZKPs were run correctly. In other words, 99% of the time when working with ZKP systems, engineers compose an application proof system as well as a proof system to verify the proof system itself. On a very practical level, verifying an aggregated proof is usually cheaper and more efficient for a verifier: it saves gas cost and bloat on-chain.
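To make the prover/verifier/witness/circuit vocabulary concrete, here is a toy Python sketch. This is NOT a real ZKP - the "proof" below is just a hash commitment and is not zero-knowledge at all, and every name is hypothetical - it only illustrates how the pieces fit together:

```python
import hashlib
from dataclasses import dataclass

def circuit(witness: bytes) -> bytes:
    """The 'defined program': here, simply hashing the witness."""
    return hashlib.sha256(witness).digest()

@dataclass
class Proof:
    public_inputs: bytes  # remember: public inputs == public outputs
    blob: bytes           # opaque proof data (faked here)

def prove(witness: bytes) -> Proof:
    # The prover runs the circuit on its private witness and emits a proof
    # that the computation was carried out correctly.
    public = circuit(witness)
    return Proof(public_inputs=public,
                 blob=hashlib.sha256(b"proof|" + public).digest())

def verify(proof: Proof) -> bool:
    # The verifier never sees the witness; it checks the proof against the
    # public inputs cheaply and quickly.
    return proof.blob == hashlib.sha256(b"proof|" + proof.public_inputs).digest()

p = prove(b"my secret JWT")
assert verify(p)
```

In a real system, `prove` is the expensive client-side step and `verify` is what the smart contract does on-chain.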

Now, with these basic concepts in place, here are a few considerations that ZKP engineers usually weigh:

  1. Prover time & memory consumption - for both the application proof and the aggregation proof.

  2. Verifier cost - the cost to verify one or a batch of proofs on-chain.

  3. Size - well... size matters, and there are a couple of sizes to consider. To set the scene: it's common knowledge among engineers that client-side networking is unstable by definition.

    • The proof size, usually correlated with the on-chain verification gas cost

    • The executable size, i.e. how large an executable (system binary or WASM blob) users need to download.

    • The parameter size: some ZKP systems require a trusted setup. If so, how large a trusted setup file do users need to download to securely run the circuit?

    • The key size: there are generally two types of cryptographic keys, commonly shipped as fixtures - the verifier key and the prover key. Trust me, they can be insanely huge sometimes.

  4. Generalization - for lack of a better word, and for my lack of knowledge of what people usually call it: I'm referring to how much the whole proving/verifying experience varies depending on different input parameters. Yes, it matters a lot. For instance, because we have very little idea of the exact length of a user's identity claims, or the length of the JWT before it is hashed, the proving/verifying system will simply not function if the circuit is not constructed carefully. Furthermore, a badly constructed circuit that does not take those client-side variables into account might leave the door open for security exploits against private user information.
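To make the fixed-size constraint concrete, here is a minimal Python sketch (with hypothetical names and a hypothetical maximum - not our actual circuit code) of how a variable-length JWT might be zero-padded to a circuit-defined capacity before hashing:

```python
# Hypothetical circuit capacity: a real circuit hard-codes some maximum
# input length, and every witness must be padded to exactly that size.
MAX_JWT_LEN = 1024

def pad_to_circuit_length(jwt: bytes, max_len: int = MAX_JWT_LEN) -> tuple[bytes, int]:
    """Zero-pad a variable-length JWT to the circuit's fixed input size.

    The true length is returned alongside so the circuit can constrain the
    hash to cover only the real bytes; mishandling this is exactly the kind
    of bug that breaks proving for some users or leaks information.
    """
    if len(jwt) > max_len:
        # Inputs longer than the circuit's capacity simply cannot be proven.
        raise ValueError("JWT exceeds circuit capacity")
    return jwt + b"\x00" * (max_len - len(jwt)), len(jwt)

padded, true_len = pad_to_circuit_length(b"header.payload.signature")
assert len(padded) == MAX_JWT_LEN
```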

ZKP on the client side

On the flip side, between a rollup-style proof system and a typical client-side proof system, the priorities are different. Again, it is usually hard to get every upside without compromises. And the situation is fundamentally different for ZKP on the client side versus the server side. Generally, the server side has the benefit of powerful machines and ultra-fast networking, while on the client side we are talking about unstable networks, limited compute resources, and impatient users who keep refreshing their webpages. Therefore:

  • Prover time and memory consumption are the number one priority for client-side ZKP, while the server side is far more forgiving.

  • Size! For some circuits we composed, the trusted setup parameters can be as large as 20MB; for some DSLs, the resulting WASM blob can reach a few hundred MBs. We simply cannot ship those to clients and expect users to be happy about a long wait just to download files. Any executable, key, or parameter file larger than a large JavaScript file is hard to swallow.

  • On-chain verification cost! The cost to verify a rollup can be shared by lots of users, but a client-side ZKP proof usually requires a single user to bear the whole cost.

  • Delay! ZK rollups are not required to roll up every block, but client-side ZKP needs to finish as soon as the system can manage. Users won't wait a few days to have their proof verified on-chain; a few minutes is the most the acceptable timeline can stretch.

In general, the difference has a familiar ring: it's like comparing building an API server with carrying out a portion of the business logic inside a smart contract. No doubt, rollup-style ZKP systems are rocket science to build, but building a functional client-side ZKP system is no trivial feat either, and it actually comes with stricter requirements on system functionality.

The OpenID3 ZKP System

We started out messing around with a wide range of ZKP systems, and obviously some came closer to the client-side proof system we were expecting. In the end, we settled on a combination of ZKP systems. Our preliminary work drafts the following experience:

  1. The client side will download a WASM executable of approximately 2MB.

  2. The client side will generate an application proof in about 30 seconds.

  3. The client side will then pass the application proof along to a generic aggregator and be placed in a queue. Periodically, the aggregator will be triggered. The aggregator takes a shitload of memory and approximately 3 minutes to aggregate the proof and pass it back to the client side. The aggregator also registers all client public inputs of the proof into a Merkle tree and exposes the tree root as the public input of the aggregated proof.

  4. The aggregator will submit the aggregated proof on-chain; with some overhead, the smart contract can validate the proof and record the Merkle tree root for about 250,000 gas (about the price of a Uniswap swap from ETH to some ERC-20 token).

  5. The user can submit a "claim" call and do a Merkle proof to identify themselves with hashed messages, linking a portion of the ZKP to their address.

  6. The aggregator, the smart contract, and we ourselves know nothing about the user's identity beyond an on-chain address and an identity hash composed of the user's issuer, identity, and client type.
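The Merkle tree bookkeeping in the steps above can be sketched in a few lines of self-contained Python. This is an illustrative toy (hash choices, names, and the commitment format are all assumptions, not our production code): the aggregator commits all public inputs into a tree, and a user's later "claim" call proves membership of their identity commitment against the on-chain root.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def identity_commitment(issuer: bytes, identity: bytes, client_type: bytes) -> bytes:
    # The chain only ever sees this hash, never the raw identity fields.
    return h(b"|".join([issuer, identity, client_type]))

def merkle_root(leaves: list[bytes]) -> bytes:
    level = leaves[:]
    while len(level) > 1:
        if len(level) % 2:          # duplicate last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

def merkle_proof(leaves: list[bytes], index: int) -> list[bytes]:
    proof, level, i = [], leaves[:], index
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        proof.append(level[i ^ 1])  # sibling at this level
        level = [h(level[j] + level[j + 1]) for j in range(0, len(level), 2)]
        i //= 2
    return proof

def verify_membership(leaf: bytes, proof: list[bytes], root: bytes, index: int) -> bool:
    node = leaf
    for sibling in proof:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# The aggregator batches commitments and exposes only the root on-chain.
leaves = [identity_commitment(b"google", u, b"web") for u in (b"alice", b"bob", b"carol")]
root = merkle_root(leaves)
# A user's "claim" call later supplies a Merkle proof against that root.
assert verify_membership(leaves[1], merkle_proof(leaves, 1), root, 1)
```

The point of the design is that the contract stores one 32-byte root per batch, while each user pays only for their own logarithmic-sized membership proof.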

What we have done and what could be better

This part can get slightly technical and requires some background knowledge of ZKP, but I'll try to use plain English for all concepts.

  • On the client side, we opted to use Plonky2, a FRI+Plonk ZKP system. It requires no trusted setup and offers fast prover times and reasonable memory consumption. We trimmed the circuit down to the minimum required to verify an OpenID credential. The compiled WASM blob is about 3MB.

  • On the server side, we first aggregate the Plonky2 proofs into a "wrapped" Plonky2 proof that can combine a bunch of Plonky2 proofs into one, verify the correct circuit construction (i.e. that the user executed exactly the circuit we expected), hash the public inputs to save space, and so on.

  • Right after the first Plonky2 aggregation, we run a Plonky2 verifier inside Gnark, which translates the proof and verifier key into a form that can be verified quickly on-chain. The Gnark aggregation is the major time consumer, taking about two and a half minutes.

  • Finally, with the aggregated proof in hand, the user can rely on a common generic Plonky2+Gnark verifier contract to cheaply verify themselves for as little as 211,000 gas.

For now, two pieces of work are in progress on the ZKP side of OpenID3:

  1. We are looking to accelerate the Gnark aggregation time, targeting less than 1 minute with black-magic tricks like GPU acceleration.

  2. We are tree-shaking dependencies to further minimize the executable size, ideally to under 1MB.

Final Thoughts

Client-side ZKP is a wonderland with too few explorers. We are eager to work with other teams to standardize the process we took.

  • Frameworks and DSLs can be built to make composing Plonky2 circuits for application proofs easy.

  • We haven't introduced folding optimizations into most of our processes yet, and we expect huge performance boosts from folding on aggregation.

  • An aggregator service, or a generic on-chain verification service, can be built as critical infrastructure for a ZKP future that brings security, decentralization, and an awesome developer experience to the larger community.

I hope you found this blog a fun read. As always, questions, concerns, criticisms, and compliments are welcome <3

Subscribe to Song Z and never miss a post.