Quilibrium learning blog

Analyzing Quilibrium from a code perspective

palanpeak

Quilibrium is an amazing project. Cassie has explained the logic of the Quilibrium project multiple times, from top to bottom. Many people, including myself in the past, doubted the feasibility of such a complex system. However, as I delved deeper into the source code, I discovered how well the system is designed. Therefore, I want to analyze Quilibrium from a code perspective, from the bottom up, and explain how each module is actually implemented. Of course, what I know now is just the tip of the iceberg, but I will continue to dig deeper.

The purpose of this blog is not teaching (I don't think I am at that level yet) but attracting more developers like me who are willing to contribute to Quilibrium, and making it easier for them to get started with the project. Hopefully, this will also lead more people to discuss and learn together in the future.

The following code is from commit 20eadfa519c01879bf6164c04acebd9bef87ffd2 and may change in future versions. What follows is just my understanding of the code; if there are any inaccuracies, please feel free to correct me in a comment.

To make things easier to follow, I will omit many details. I have divided the operation of a ceremonyclient node into three parts: initialization, asynchronous services, and the main loop.

  • The initialization process refers to the steps that need to be done before the node starts, such as the Trusted Setup for KZG.

  • Asynchronous services are services the node starts in the background; each passively receives some input and then carries out some logic. For example, the data worker receives requests to calculate a VDF from the given difficulty and challenge data.

  • The main loop refers to the core logic of the program, which ties the previous services together.

Initialization Process

Start subprocesses

Start n subprocesses as gRPC servers, where n = length(cores) - 1 (one core is reserved for the main process). Their specific purpose is detailed in the asynchronous services section, particularly under service1.

// main.go 408
go spawnDataWorkers(nodeConfig)
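
To make the parent/worker split concrete, here is a minimal sketch of spawning one worker per spare core. The --core flag and the worker-mode check are my own illustrative assumptions, not the project's exact flags:

package main

import (
    "os"
    "os/exec"
    "runtime"
    "strconv"
)

func main() {
    if len(os.Args) > 1 {
        // Worker mode: the real worker would start its gRPC server here.
        return
    }
    // Reserve one core for the main process; every remaining core gets
    // its own subprocess, mirroring n = length(cores) - 1.
    for i := 1; i < runtime.NumCPU(); i++ {
        cmd := exec.Command(os.Args[0], "--core", strconv.Itoa(i))
        cmd.Stdout = os.Stdout
        cmd.Stderr = os.Stderr
        if err := cmd.Start(); err != nil {
            panic(err)
        }
    }
}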

Initialize the KZG ceremony

KZG Trusted Setup. I will write a separate article to explain why a trusted setup is necessary and provide more details about KZG.

// main.go 410
kzg.Init()

Asynchronous Services

Service1: Data Worker

Provided Service: Hosts a gRPC server that waits for calls from the main process, following the protocol defined in protobufs/data.proto.

Functionality: Implements the function CalculateChallengeProof, which receives challenge data and, based on the increment in the request, derives the difficulty (inversely proportional to the increment) for the VDF calculation, then returns the proof. From this function it is clear that the larger the increment, the faster the VDF is generated.

Code Entry Point:

// rpc/data_worker_ipc_server.go 31
func (r *DataWorkerIPCServer) CalculateChallengeProof(
    ctx context.Context,
    req *protobufs.ChallengeProofRequest,
) (*protobufs.ChallengeProofResponse, error) {
    ...
}
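
To illustrate the inverse relationship, here is a minimal sketch of an increment-to-difficulty mapping. The base value and divisor are hypothetical constants; only the direction (larger increment, lower difficulty, faster VDF) comes from the function above:

package main

import "fmt"

// difficultyFor maps a request's increment to a VDF difficulty.
// The base value and divisor are hypothetical; the point is that
// difficulty shrinks as the increment grows.
func difficultyFor(increment uint32) uint32 {
    const base = 200000
    step := increment / 4
    if step >= base {
        return 1 // clamp so the difficulty stays positive
    }
    return base - step
}

func main() {
    for _, inc := range []uint32{0, 40000, 400000} {
        fmt.Printf("increment=%d -> difficulty=%d\n", inc, difficultyFor(inc))
    }
}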

Service2: Peerinfo Manager

Provided Service: Manages the addition and querying of peer information.

Functionality: Stores information about peers and sorts these peers based on connection bandwidth, with faster connections listed first and slower ones last. Here, connection bandwidth refers to the bandwidth between the local machine and other peers.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 175
e.peerInfoManager.Start()

// p2p/peer_info_manager.go 127
func (m *InMemoryPeerInfoManager) searchAndInsertPeer(manifest *PeerManifest) {
    ...
}
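
Here is a minimal sketch of the bandwidth-ordered insertion this function performs, assuming a PeerManifest reduced to two fields. sort.Search finds the insertion point in a slice kept in descending bandwidth order:

package main

import (
    "fmt"
    "sort"
)

type PeerManifest struct {
    PeerID    string
    Bandwidth uint64 // measured bandwidth from the local machine to this peer
}

type peerList struct {
    peers []*PeerManifest // sorted: fastest connections first
}

func (l *peerList) searchAndInsertPeer(m *PeerManifest) {
    i := sort.Search(len(l.peers), func(i int) bool {
        return l.peers[i].Bandwidth <= m.Bandwidth
    })
    l.peers = append(l.peers, nil)
    copy(l.peers[i+1:], l.peers[i:])
    l.peers[i] = m
}

func main() {
    l := &peerList{}
    l.searchAndInsertPeer(&PeerManifest{"a", 10})
    l.searchAndInsertPeer(&PeerManifest{"b", 30})
    l.searchAndInsertPeer(&PeerManifest{"c", 20})
    for _, p := range l.peers {
        fmt.Println(p.PeerID, p.Bandwidth) // b 30, c 20, a 10
    }
}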

Service3: Master Time Reel

Provided Service: Manages the insertion and querying of clock frames and stores them.

Functionality: Stores historical clock frames under the .config directory.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 180
err := e.masterTimeReel.Start()

// consensus/time/master_time_reel.go 178
func (m *MasterTimeReel) runLoop() {
    ...
}
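
A minimal sketch of the shape of this run loop: frames arrive on a channel, frames that extend the reel are persisted, and a done channel stops the loop. The field names and the save function are simplified assumptions standing in for the real frame store under .config:

package main

import "time"

type ClockFrame struct {
    FrameNumber uint64
    Output      []byte
}

type MasterTimeReel struct {
    head   *ClockFrame
    frames chan *ClockFrame
    save   func(*ClockFrame) error // stands in for the on-disk frame store
    done   chan struct{}
}

func (m *MasterTimeReel) runLoop() {
    for {
        select {
        case frame := <-m.frames:
            // Only frames that advance the head are persisted.
            if m.head == nil || frame.FrameNumber > m.head.FrameNumber {
                if err := m.save(frame); err != nil {
                    panic(err)
                }
                m.head = frame
            }
        case <-m.done:
            return
        }
    }
}

func main() {
    reel := &MasterTimeReel{
        frames: make(chan *ClockFrame, 1),
        save:   func(*ClockFrame) error { return nil },
        done:   make(chan struct{}),
    }
    go reel.runLoop()
    reel.frames <- &ClockFrame{FrameNumber: 1}
    time.Sleep(10 * time.Millisecond) // give the loop time to persist the frame
    close(reel.done)
}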

Service4: Frame Validator

Provided Service: Waits to receive new frames.

Functionality: Validates new frames by checking their VDF proof. If validation succeeds, it inserts the frame through service3.

Call Relationship:

  • Calls service3.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 195
case newFrame := <-e.frameValidationCh:
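
A minimal sketch of this validation branch, where verifyVDF stands in for the real proof check and insert hands valid frames to service3's time reel; invalid frames are simply dropped:

package main

import "errors"

type ClockFrame struct{ Output []byte }

type validator struct {
    frameValidationCh chan *ClockFrame
    insert            func(*ClockFrame) error // service3's time-reel insert
}

// verifyVDF stands in for the real VDF proof verification.
func verifyVDF(f *ClockFrame) error {
    if len(f.Output) == 0 {
        return errors.New("empty VDF output")
    }
    return nil
}

func (v *validator) run() {
    for newFrame := range v.frameValidationCh {
        if err := verifyVDF(newFrame); err != nil {
            continue // invalid proof: drop the frame
        }
        if err := v.insert(newFrame); err != nil {
            panic(err)
        }
    }
}

func main() {
    v := &validator{
        frameValidationCh: make(chan *ClockFrame, 1),
        insert:            func(*ClockFrame) error { return nil },
    }
    v.frameValidationCh <- &ClockFrame{Output: []byte{1}}
    close(v.frameValidationCh)
    v.run()
}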

Service5: Bandwidth Test

Provided Service: Waits to receive peer IDs.

Functionality: Directly connects to the received peer ID and performs a speed test. If the speed test does not meet expectations, the peer is given a low score. If it meets expectations, the basic information of this peer is saved by calling service2.

Call Relationship:

  • Calls service8 on other machines for network speed testing.

  • Calls service2 to store peer information.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 216
case peerId := <-e.bandwidthTestCh:
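
A minimal sketch of the scoring decision: time a transfer to the peer, compare the rate against a threshold, and either penalize the peer or hand its manifest to service2. The transfer function, payload size, and threshold are illustrative assumptions:

package main

import (
    "fmt"
    "time"
)

const (
    testBytes = 1 << 20 // hypothetical test payload size
    minRate   = 1 << 20 // hypothetical minimum bytes per second
)

// runBandwidthTest times a transfer to the peer and returns bytes/sec.
// transfer stands in for the real direct-channel exchange with the peer.
func runBandwidthTest(transfer func(n int) error) (float64, error) {
    start := time.Now()
    if err := transfer(testBytes); err != nil {
        return 0, err
    }
    return float64(testBytes) / time.Since(start).Seconds(), nil
}

func main() {
    fakeTransfer := func(n int) error {
        time.Sleep(10 * time.Millisecond) // simulate network time
        return nil
    }
    rate, err := runBandwidthTest(fakeTransfer)
    if err != nil || rate < minRate {
        fmt.Println("below expectations: give the peer a low score")
        return
    }
    fmt.Println("passed: store the peer's manifest via service2")
}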

Service6: Challenge Proof Verifier

Provided Service: Waits to receive requests for verifying challenges.

Functionality: Verifies the challenge proof generated by a peer. If verification fails, it indicates that the peer has likely committed fraud, and the peer is given a low score.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 218
case verifyTest := <-e.verifyTestCh:
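
A minimal sketch of this branch, where verifyProof and the score callback are assumptions standing in for the real prover and peer-scoring APIs:

package main

import (
    "errors"
    "fmt"
)

type verifyRequest struct {
    peerID    string
    challenge []byte
    proof     []byte
}

// verifyProof stands in for the real challenge-proof verification.
func verifyProof(challenge, proof []byte) error {
    if len(proof) == 0 {
        return errors.New("missing proof")
    }
    return nil
}

func handleVerifyTest(req verifyRequest, lowerScore func(peerID string)) {
    if err := verifyProof(req.challenge, req.proof); err != nil {
        // A failed verification suggests the peer faked its self-test.
        lowerScore(req.peerID)
    }
}

func main() {
    handleVerifyTest(
        verifyRequest{peerID: "peer-1", challenge: []byte{1}, proof: nil},
        func(id string) { fmt.Println("penalizing", id) },
    )
}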

Service7: Message Subscriber

Provided Service: Subscribes to network messages based on a filter.

Functionality: Once a message matching the filter is received, the handler determines its type. Currently, two types of messages are supported:

  • ClockFrame messages: calls service4 to verify the clock frame.

  • SelfTest messages: calls service6 to verify the received challenge.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 225
e.pubSub.Subscribe(e.filter, e.handleMessage, true)
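
A minimal sketch of the dispatch inside handleMessage. The type strings and message struct are simplified stand-ins for the real protobuf handling; only the two-way routing mirrors the description above:

package main

type message struct {
    typeURL string
    payload []byte
}

type engine struct {
    frameValidationCh chan []byte // consumed by service4
    verifyTestCh      chan []byte // consumed by service6
}

// handleMessage routes a received message by its type.
func (e *engine) handleMessage(m message) {
    switch m.typeURL {
    case "ClockFrame":
        e.frameValidationCh <- m.payload
    case "SelfTest":
        e.verifyTestCh <- m.payload
    default:
        // Unknown types are ignored in this sketch.
    }
}

func main() {
    e := &engine{
        frameValidationCh: make(chan []byte, 1),
        verifyTestCh:      make(chan []byte, 1),
    }
    e.handleMessage(message{typeURL: "ClockFrame"})
}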

Service8: Channel Listener

Provided Service: Provides a listener that other peers can connect to directly.

Functionality: Allows other peers to connect and use the local machine for bandwidth testing.

Code Entry Point:

// consensus/master/master_clock_consensus_engine.go 236
if err := e.pubSub.StartDirectChannelListener(e.pubSub.GetPeerID(), "validation", server); err != nil {
    panic(err)
}
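
A minimal sketch of what such a listener provides, using plain TCP as a stand-in for the libp2p direct channel: accept a connection and echo the tester's payload back so the remote side can time the round trip:

package main

import (
    "io"
    "net"
)

func main() {
    // Stand-in for StartDirectChannelListener: accept direct connections
    // and serve bandwidth-test traffic.
    ln, err := net.Listen("tcp", "127.0.0.1:0")
    if err != nil {
        panic(err)
    }
    defer ln.Close()
    for {
        conn, err := ln.Accept()
        if err != nil {
            return
        }
        go func(c net.Conn) {
            defer c.Close()
            // Echo whatever the tester sends so it can measure throughput.
            io.Copy(c, c)
        }(conn)
    }
}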

Main Loop

  1. Retrieve the last locally verified Commitment.

  2. Combine the Peer ID and Commitment as the input data, and call service1's CalculateChallengeProof to obtain the proof for this challenge.

  3. Calculate the Commitment and hash of this proof.

  4. Store the entire process as a Transaction under the local .config directory.

  5. Broadcast this data over the network.

  6. Other nodes receive this broadcast data through service7.
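
Putting the steps together, here is a minimal sketch of one iteration of this loop. The hashing, storage, and broadcast functions are hypothetical stand-ins; SHA-256 here is only a placeholder for the real VDF proof and commitment scheme:

package main

import (
    "crypto/sha256"
    "fmt"
)

type proofResult struct{ proof []byte }

// calculateChallengeProof stands in for the service1 gRPC call.
func calculateChallengeProof(input []byte) proofResult {
    sum := sha256.Sum256(input) // placeholder for the real VDF proof
    return proofResult{proof: sum[:]}
}

func mainLoopIteration(
    peerID, lastCommitment []byte,
    store func(tx []byte),
    broadcast func(msg []byte),
) []byte {
    // 2. Combine peer ID and last commitment, then request a proof.
    input := append(append([]byte{}, peerID...), lastCommitment...)
    res := calculateChallengeProof(input)
    // 3. Derive the next commitment from the proof.
    next := sha256.Sum256(res.proof)
    // 4. Persist the round locally, then 5. broadcast it.
    store(res.proof)
    broadcast(next[:])
    return next[:]
}

func main() {
    commitment := make([]byte, 32) // 1. last verified commitment
    commitment = mainLoopIteration(
        []byte("peer-id"),
        commitment,
        func([]byte) {},
        func([]byte) {},
    )
    fmt.Printf("next commitment: %x\n", commitment[:8])
}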

