STARKs 101
A short walkthrough of how STARK proofs work
Statement: There exists an $x$ such that the 1023rd value ($a_{1022}$, counting from $a_0$) of the fibonacci square sequence is 2338775057. The whole sequence is computed over the prime field $\mathbb{F}_p$, with $p = 3221225473 = 3 \cdot 2^{30} + 1$.
As the prover, we know that the value of $x$ is 3141592. However, this value should not be leaked to the STARK verifier.
LDE (Low Degree Extension): Helps encode our sequence/data. Done in 3 steps:
Generate Input:
This is what our data looks like:
The above are the first 1023 values of the fibonacci square sequence, with $a_0$ and $a_1$ set according to us ($a_0 = 1$ and $a_1 = x = 3141592$).
This is also called the trace, because it is the trace of the computation.
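To make this concrete, here's a minimal Python sketch of the trace generation, assuming the standard STARK 101 setup ($a_0 = 1$, $a_1 = x$, $a_{n+2} = a_{n+1}^2 + a_n^2 \bmod p$); `fib_sq_trace` is just an illustrative name:

```python
# Minimal sketch of trace generation, assuming the STARK 101 setup.
P = 3 * 2**30 + 1  # the prime field modulus, 3221225473

def fib_sq_trace(x, length=1023):
    trace = [1, x % P]  # a_0 = 1, a_1 = x (the secret)
    while len(trace) < length:
        # a_{n+2} = a_{n+1}^2 + a_n^2 (mod p)
        trace.append((trace[-1] ** 2 + trace[-2] ** 2) % P)
    return trace

trace = fib_sq_trace(3141592)
assert trace[1022] == 2338775057  # the claimed 1023rd value
```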
Now we choose another sequence of numbers to act as x-coordinates.
For our case we choose the powers $g^0, g^1, \dots, g^{1022}$ of a generator $g$ of an order-1024 subgroup $G \subseteq \mathbb{F}_p^*$, so that the trace values become the y-values of points $(g^i, a_i)$ to be plotted on a graph.
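A hedged sketch of how those x-coordinates could be generated (assuming, as in STARK 101, that 5 generates $\mathbb{F}_p^*$, so $g = 5^{(p-1)/1024}$ generates the order-1024 subgroup):

```python
P = 3 * 2**30 + 1
g = pow(5, (P - 1) // 1024, P)            # generator of the order-1024 subgroup G
xs = [pow(g, i, P) for i in range(1023)]  # x-coordinates g^0 .. g^1022
# pair each trace value with its x-coordinate (`trace` from the sketch above)
points = list(zip(xs, trace))
```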
Interpolate Polynomial
Interpolate a polynomial $f$ such that for each $i$, $f(g^i) = a_i$.
This is what it looks like:
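For intuition, a naive single-point Lagrange evaluation over $\mathbb{F}_p$ might look like the sketch below (an $O(n^2)$ illustration only; a real prover would use FFT-style interpolation):

```python
P = 3 * 2**30 + 1

def lagrange_eval(xs, ys, x):
    # evaluate the unique degree < len(xs) interpolant of (xs[i], ys[i]) at x
    total = 0
    for i in range(len(xs)):
        num, den = 1, 1
        for j in range(len(xs)):
            if i != j:
                num = num * (x - xs[j]) % P
                den = den * (xs[i] - xs[j]) % P
        # den^(P-2) is the modular inverse of den, since P is prime (Fermat)
        total = (total + ys[i] * num % P * pow(den, P - 2, P)) % P
    return total
```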
Extend
Now, all you need to do is evaluate the polynomial over a much larger domain as well.
In our case, the extended domain is 8 times the original: 8192 points instead of 1024.
In our case, the extended domain would look something like this:
This is the format of a Reed-Solomon encoding/codeword.
Just as a reminder, this photo is for intuition only, as the graph is drawn over a real field and has a smooth curve. In reality, since the plotting is done over a prime field, it would NOT look like this.
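A rough sketch of how the 8x extended domain could be built (assuming STARK 101's choice of evaluating on the coset $5 \cdot H$, where $H$ is the order-8192 subgroup of $\mathbb{F}_p^*$):

```python
P = 3 * 2**30 + 1
h = pow(5, (P - 1) // 8192, P)                          # generator of H, |H| = 8192
eval_domain = [5 * pow(h, i, P) % P for i in range(8192)]
# the Reed-Solomon codeword is f evaluated at every point of the coset
# (too slow with the naive lagrange_eval above; a real prover uses FFTs):
# codeword = [lagrange_eval(xs, trace, d) for d in eval_domain]
```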
Commitment: Now, we need to commit to our LDE.
To do this, simply send the root of the Merkle tree generated by hashing together all the values of the polynomial evaluation over the extended domain. That would look something like this:
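A minimal Merkle-root sketch (assuming SHA-256 and duplicating the last node on odd-sized layers; real implementations differ in the details):

```python
import hashlib

def merkle_root(values):
    # hash each committed value into a leaf
    layer = [hashlib.sha256(str(v).encode()).digest() for v in values]
    while len(layer) > 1:
        if len(layer) % 2:            # duplicate the last node on odd layers
            layer.append(layer[-1])
        # hash adjacent pairs to form the next layer up
        layer = [hashlib.sha256(layer[i] + layer[i + 1]).digest()
                 for i in range(0, len(layer), 2)]
    return layer[0]

# commitment = merkle_root(codeword)  # what the prover sends to the verifier
```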
These are the few things from the last chapter that we will use in this chapter:
Trace - $a_0, a_1, \dots, a_{1022}$
Generator of G - $g$, where $G$ is the order-1024 subgroup of $\mathbb{F}_p^*$
Trace Polynomial - $f$
The polynomial formed by the interpolation of the trace values.
Constraints on the trace that need to be satisfied in order to convince the verifier of our statement:
1. $a_0 = 1$
2. $a_{1022} = 2338775057$
3. $a_{n+2} = a_{n+1}^2 + a_n^2$ for $0 \le n \le 1020$
So the plan is to do a bunch of reductions on these constraints, so that we get another statement which, if we prove it to be true, implies that these 3 initial constraints are also true.
The first reduction is to convert the constraints on the trace values $a_i$ into constraints on the trace polynomial $f$.
$a_0 = 1$ is the same as saying $f(x) = 1$, where $x$ is $g^0 = 1$.
Similarly, $a_{1022} = 2338775057$ can be thought of as $f(x) = 2338775057$, where $x = g^{1022}$.
Lastly, the constraint $a_{n+2} = a_{n+1}^2 + a_n^2$ can be converted to:
$f(g^2 x) = f(g x)^2 + f(x)^2$, for $x \in \{g^0, g^1, \dots, g^{1020}\}$
This is all possible because of the way the trace polynomial was created. Just to jog thy memory, here's a photo:
Finally, at the end of Reduction 1, this is the state of affairs:
1. $f(1) = 1$
2. $f(g^{1022}) = 2338775057$
3. $f(g^2 x) = f(g x)^2 + f(x)^2$ for $x \in \{g^0, g^1, \dots, g^{1020}\}$
So, instead of saying $f(x) = 1$ for $x = 1$, we could also reframe that as:
$f(x) - 1 = 0$, for $x = 1$
Similarly, $f(x) - 2338775057 = 0$, for $x = g^{1022}$
And lastly, $f(g^2 x) - f(g x)^2 - f(x)^2 = 0$, for $x \in \{g^0, g^1, \dots, g^{1020}\}$
So this basically means that for the first constraint polynomial in this reduction, $x = 1$ is a root
For the second constraint polynomial in this reduction, $x = g^{1022}$ is a root
For the last constraint polynomial in this reduction, $x = g^0, g^1, \dots, g^{1020}$ are the roots
Therefore, at the end of the second reduction this is the state of affairs:
1. $x = 1$ is a root of $f(x) - 1$
2. $x = g^{1022}$ is a root of $f(x) - 2338775057$
3. $x = g^0, g^1, \dots, g^{1020}$ are roots of $f(g^2 x) - f(g x)^2 - f(x)^2$
Given the theorem that $(x - r)$ divides $p(x)$ if and only if $r$ is a root of $p(x)$, the above reduction can be boiled down to the following:
$x = 1$ is a root of $f(x) - 1$ can be written as:
$p_0(x) = \dfrac{f(x) - 1}{x - 1}$ is a polynomial
$x = g^{1022}$ is a root of $f(x) - 2338775057$ can be written as:
$p_1(x) = \dfrac{f(x) - 2338775057}{x - g^{1022}}$ is a polynomial
And finally the last constraint can be re-written as:
$p_2(x) = \dfrac{f(g^2 x) - f(g x)^2 - f(x)^2}{\prod_{i=0}^{1020} (x - g^i)}$ is a polynomial
Regarding $p_2$: its denominator as written is extremely inefficient, since it has 1021 roots and multiplying out 1021 linear factors would be very hard for a computer. So, we do a couple of heavy optimizations.
Since $g^0, g^1, \dots, g^{1023}$ are exactly the 1024 elements of $G$, $\prod_{i=0}^{1023} (x - g^i)$ can be written as $x^{1024} - 1$.
^^This polynomial is much easier to compute.
Now notice there is a difference in the bounds of the two products: ours stops at $g^{1020}$, while the full one runs to $g^{1023}$. So to make it equivalent to $\prod_{i=0}^{1020} (x - g^i)$, we divide $x^{1024} - 1$ by the extra linear factors $(x - g^{1021})(x - g^{1022})(x - g^{1023})$.
Then the final state of affairs after reduction 3 is the following:
$p_2(x) = \dfrac{\left(f(g^2 x) - f(g x)^2 - f(x)^2\right)(x - g^{1021})(x - g^{1022})(x - g^{1023})}{x^{1024} - 1}$
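As a quick sanity check of the $\prod_{i=0}^{1023}(x - g^i) = x^{1024} - 1$ identity, this little script verifies it numerically at an arbitrary point:

```python
P = 3 * 2**30 + 1
g = pow(5, (P - 1) // 1024, P)
x = 987654321                    # an arbitrary field element
prod = 1
for i in range(1024):            # multiply out all 1024 linear factors
    prod = prod * (x - pow(g, i, P)) % P
assert prod == (pow(x, 1024, P) - 1) % P
```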
Now we have established that we do not need to care about the initial constraints, and all we need to do to prove our statement is to show that:
$p_0(x)$, $p_1(x)$, $p_2(x)$ are polynomials
Now instead of showing 3 different rational functions to be polynomials, we can do the process once and for all.
Let's take a random linear combination of these three rational functions. That looks something like this:
$CP(x) = \alpha_0\, p_0(x) + \alpha_1\, p_1(x) + \alpha_2\, p_2(x)$
Where all the $\alpha_i$ are random field elements supplied by the verifier
We call it the composition polynomial $CP$
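Putting the whole reduction together, here's a hedged sketch of evaluating $CP$ at a single point. `cp_at` and `f_at` are illustrative names; `f_at` stands in for evaluating the trace polynomial (e.g. via `lagrange_eval` above):

```python
P = 3 * 2**30 + 1
g = pow(5, (P - 1) // 1024, P)
inv = lambda v: pow(v, P - 2, P)   # field inverse, valid since P is prime

def cp_at(x, f_at, alpha0, alpha1, alpha2):
    # p0: enforces a_0 = 1
    p0 = (f_at(x) - 1) * inv(x - 1) % P
    # p1: enforces a_1022 = 2338775057
    p1 = (f_at(x) - 2338775057) * inv(x - pow(g, 1022, P)) % P
    # p2: enforces the recurrence f(g^2 x) = f(gx)^2 + f(x)^2
    num = (f_at(g * g % P * x % P) - f_at(g * x % P) ** 2 - f_at(x) ** 2) % P
    # the three factors we divided out of x^1024 - 1 in reduction 3
    extra = ((x - pow(g, 1021, P)) * (x - pow(g, 1022, P)) % P
             * (x - pow(g, 1023, P)) % P)
    p2 = num * extra % P * inv(pow(x, 1024, P) - 1) % P
    return (alpha0 * p0 + alpha1 * p1 + alpha2 * p2) % P
```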
Luckily we have another theorem which states that, with very high probability:
$CP$ is a polynomial $\iff$ all the $p_i$'s are polynomials
Commit to the CP using the Merkle Tree
The goal here is to show that $CP$ is a polynomial, which in turn will prove that all the constraints are satisfied and our initial statement was true.
And ofc, we won't prove what we want to prove.
We won't prove that $CP$ is a polynomial.
We will prove that $CP$ is close to a polynomial of low degree.
Now, let us define what close and low degree mean.
Low degree means some agreement between the prover and the verifier on a bound that the polynomial's degree must stay under.
To understand Close, let us first understand distance.
The distance between a function $f$ and a polynomial $p$ is the number of points $x$ where $f(x) \neq p(x)$.
For example, here $f$ and $p$ disagree on 5 points, so the distance between them is 5.
So, a function $f$ is close to a polynomial $p$ if the distance $d(f, p)$ is small.
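In code, the distance is just a disagreement count between two tables of values (a tiny illustrative sketch):

```python
def distance(f_values, p_values):
    # count the domain points where the function disagrees with the polynomial
    return sum(1 for a, b in zip(f_values, p_values) if a != b)
```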
Our goal is to prove that the $CP$ is close to a polynomial of low degree. AND WE ACHIEVE THAT VIA THE FRI PROTOCOL.
FRI stands for Fast Reed-Solomon Interactive Oracle Proofs of Proximity
1. Receive a random $\beta$ from the verifier
2. Apply the FRI operator
3. Commit to the resulting polynomial (its Merkle root)
4. Repeat the above steps again and again until the degree bound drops to the low degree agreed upon by the prover and verifier
5. Lastly, the prover sends the result: the constant value reached at the stage where the degree has been reduced to the agreed low degree
The FRI operator functions (broadly) on the following principle:
As you can see above, the degree bound keeps on getting halved until all we have to prove is that the last function is a constant, and the domain shrinks alongside it (down to 8 points in our case).
The verifier stops sending random values now, since a polynomial bounded by degree 1 is actually a constant.
Each polynomial $p(x)$ can be split into an even and an odd part like this:
$p(x) = g(x^2) + x \cdot h(x^2)$
Here's an example of the split. Take a degree-6 polynomial $p(x) = c_0 + c_1 x + c_2 x^2 + c_3 x^3 + c_4 x^4 + c_5 x^5 + c_6 x^6$.
It will be split into $g(x^2) = c_0 + c_2 x^2 + c_4 x^4 + c_6 x^6$ and $x \cdot h(x^2) = x\,(c_1 + c_3 x^2 + c_5 x^4)$ in the following fashion:
Now we receive the random $\beta$. This is what convinces the verifier that the prover didn't cheat, because the prover could not have calculated the values in advance.
Now the new function will be:
$p_{\text{next}}(y) = g(y) + \beta \cdot h(y)$, where $y = x^2$
Now $g(y) = c_0 + c_2 y + c_4 y^2 + c_6 y^3$
And $h(y) = c_1 + c_3 y + c_5 y^2$
So, $p_{\text{next}}(y) = (c_0 + \beta c_1) + (c_2 + \beta c_3)\,y + (c_4 + \beta c_5)\,y^2 + c_6\,y^3$
As you can see, the bounded degree decreased from 6 to 3. We will keep on going like this until we reach an upper degree bound of 1.
Therefore, each application of the FRI operator halves both the degree bound and the size of the evaluation domain.
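Here's a minimal sketch of one FRI-operator step on coefficient lists (lowest-degree coefficient first; `fri_step` is an illustrative name), reproducing the degree 6 → 3 drop from the example:

```python
P = 3 * 2**30 + 1

def fri_step(coeffs, beta):
    even = coeffs[0::2]                    # g(y): even-power coefficients
    odd = coeffs[1::2]                     # h(y): odd-power coefficients
    odd += [0] * (len(even) - len(odd))    # pad so the lists line up
    # next layer: g(y) + beta * h(y)
    return [(e + beta * o) % P for e, o in zip(even, odd)]

p = [4, 1, 5, 7, 2, 3, 1]      # 4 + x + 5x^2 + 7x^3 + 2x^4 + 3x^5 + x^6
assert len(fri_step(p, beta=17)) == 4     # degree bound 6 -> 3
```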
Up until now, we created a trace polynomial and committed to the trace root (Ch1).
Then we did some reductions on the original requirements on the trace and created the Composition Polynomial. We then committed to the CP root (Ch2).
Then in Ch3 we applied the FRI protocol to prove that CP was close to a polynomial of low degree, and we committed to each layer of FRI, ie, $\text{Root}_1, \text{Root}_2, \dots, \text{Root}_n$.
This is what we have sent so far to the verifier from the prover's side. This is called the Commitment Phase.
The second part of the proof is called Decommitment: actually trying to convince the verifier that every calculation and commitment made so far was truthful.
The way this works is that the verifier will send random elements and the prover will provide validation data for each.
So, the first thing that needs to be proven by the prover to the verifier is the correct transition from trace polynomial to Composition Polynomial.
For that, the prover will send $f(x)$, $f(g x)$, $f(g^2 x)$ for a given $x$.
This is enough information for the verifier to calculate $CP_0(x)$ themselves, based on the formula below:
$CP_0(x) = \alpha_0 \dfrac{f(x) - 1}{x - 1} + \alpha_1 \dfrac{f(x) - 2338775057}{x - g^{1022}} + \alpha_2 \dfrac{\left(f(g^2 x) - f(g x)^2 - f(x)^2\right)(x - g^{1021})(x - g^{1022})(x - g^{1023})}{x^{1024} - 1}$
Now the next thing that needs to be proven to the verifier is the transition between each FRI layer. That is done via the following approach: each layer satisfies
$CP_{i+1}(x^2) = g(x^2) + \beta_i \cdot h(x^2)$
And both $g(x^2)$ and $h(x^2)$ can be calculated via $CP_i(x)$ and $CP_i(-x)$ alone, like this:
$g(x^2) = \dfrac{CP_i(x) + CP_i(-x)}{2}$, $h(x^2) = \dfrac{CP_i(x) - CP_i(-x)}{2x}$
Therefore, the verifier can compute $CP_{i+1}(x^2)$ by only getting $CP_i(x)$ and $CP_i(-x)$ from the prover
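A hedged sketch of that verifier-side computation (illustrative names, formulas exactly as above):

```python
P = 3 * 2**30 + 1

def next_layer_value(cp_x, cp_minus_x, x, beta):
    # even part g(x^2) = (CP_i(x) + CP_i(-x)) / 2
    g_x2 = (cp_x + cp_minus_x) * pow(2, P - 2, P) % P
    # odd part h(x^2) = (CP_i(x) - CP_i(-x)) / (2x)
    h_x2 = (cp_x - cp_minus_x) * pow(2 * x, P - 2, P) % P
    # CP_{i+1}(x^2) = g(x^2) + beta * h(x^2)
    return (g_x2 + beta * h_x2) % P
```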
Assuming that the previous step is clear, the prover will then send the value of $CP_1(-x^2)$, and from that value and from the $CP_1(x^2)$ calculated above, the verifier will be able to check the value of $CP_2(x^4)$, and on and on until the final layer.
The prover is sending out these elements such as $f(x)$, $f(g x)$, $f(g^2 x)$, $CP_i(x)$, $CP_i(-x)$, but remember: we did not commit to these individual elements but rather to the Merkle tree root. So how do we convince the verifier that we did indeed commit these elements into the Merkle tree?
So, the prover sends the authentication path from the element's leaf up to the Merkle root to convince the verifier. This is called decommitment and looks something like this:
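A minimal authentication-path check matching the Merkle sketch from earlier (assumption: path entries are (sibling_hash, sibling_is_left) pairs):

```python
import hashlib

def verify_path(leaf_value, path, root):
    # hash the claimed leaf, then fold in each sibling up to the root
    node = hashlib.sha256(str(leaf_value).encode()).digest()
    for sibling, sibling_is_left in path:
        pair = sibling + node if sibling_is_left else node + sibling
        node = hashlib.sha256(pair).digest()
    return node == root
```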
Now, the question is How Much data does the prover send to the verifier to convince him?
So, if you look at the left side (commitment), we are sending on the order of $\log d$ elements in the commitment phase, and then for a single query $x$, we are sending on the order of $\log d$ authentication paths (from Merkle leaf to root), each of length on the order of $\log d$. So, overall complexity is $O(\log^2 d)$.
Now, for $Q$ different queries, we need to send data in the order of $O(Q \cdot \log^2 d)$. But since $Q$ is constant, the data remains of order $O(\log^2 d)$.
NOW, AFTER ALL THESE SHENANIGANS, THE VERIFIER IS CONVINCED THAT THE PROVER'S INITIAL STATEMENT ABOUT THE 1023rd VALUE OF THE SEQUENCE IS TRUE.