The possibilities of zkSNARKs are impressive: you can verify the correctness of computations without having to execute them, and you will not even learn what was executed – only that it was done correctly. Unfortunately, most explanations of zkSNARKs resort to hand-waving at some point, leaving them looking “magical” and suggesting that only the most enlightened actually understand how and why (and if?) they work. The reality is that zkSNARKs can be reduced to four straightforward techniques, and this blog post aims to explain them. Anyone who understands how the RSA cryptosystem works should also gain a solid understanding of zkSNARKs as currently utilized. Let’s find out if it accomplishes its aim!
As a very short summary, zkSNARKs as currently implemented consist of 4 essential elements (don’t be concerned, we will clarify all the terms in forthcoming sections):
A) Encoding as a polynomial problem
The program to be verified is transformed into a quadratic equation of polynomials: t(x) h(x) = w(x) v(x), where equality holds if and only if the program has been executed correctly. The prover aims to persuade the verifier that this equality is valid.
B) Succinctness by random sampling
The verifier selects a secret evaluation point s to simplify the issue from multiplying polynomials and confirming polynomial function equality to simple multiplication and equality verification on numbers: t(s)h(s) = w(s)v(s)
This drastically reduces both the size of the proof and the verification duration.
C) Homomorphic encoding/encryption
An encoding/encryption function E is utilized that possesses certain homomorphic properties (but is not entirely homomorphic, which remains impractical). This enables the prover to compute E(t(s)), E(h(s)), E(w(s)), E(v(s)) without knowledge of s; she only knows E(s) and a few other beneficial encrypted values.
D) Zero Knowledge
The prover rearranges the values E(t(s)), E(h(s)), E(w(s)), E(v(s)) through multiplication by a number, allowing the verifier to still validate their correct structure without being aware of the actual encoded values.
The very basic notion is that verifying t(s)h(s) = w(s)v(s) is equivalent to validating t(s)h(s) k = w(s)v(s) k for a random secret number k (which is not zero), with the distinction that if you are only provided with the numbers (t(s)h(s) k) and (w(s)v(s) k), it becomes impossible to deduce t(s)h(s) or w(s)v(s).
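Here is a tiny numeric sketch of that blinding idea in Python (all numbers are made up and have nothing to do with a real protocol): the equality check still passes when both sides are multiplied by a secret k, but the blinded products alone reveal nothing about the original values.

```python
import random

# Toy sketch of the blinding idea: checking t(s)h(s)*k == w(s)v(s)*k
# works just as well as checking t(s)h(s) == w(s)v(s), but without k
# the blinded values do not reveal the underlying products.
p = 2**61 - 1                      # a prime modulus, arbitrary toy choice
ts_hs = 123456789                  # pretend this is t(s) * h(s)
ws_vs = 123456789                  # equal, i.e. an honest prover
k = random.randrange(1, p)         # secret non-zero blinding factor

lhs = ts_hs * k % p
rhs = ws_vs * k % p

assert lhs == rhs                  # the equality check still succeeds
# Given only lhs (and not k), ts_hs could be any value in the field.
```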
This was the vague explanation intended so that you could grasp the essence of zkSNARKs, and now we will delve into the specifics.
RSA and Zero-Knowledge Proofs
Let’s begin with a quick overview of how RSA operates, omitting certain meticulous details. Keep in mind that we frequently deal with numbers modulo some other number rather than complete integers. The notation here is “a + b ≡ c (mod n)”, which signifies “(a + b) % n = c % n”. Note that the “(mod n)” segment does not pertain to the right side “c” but is relevant to the “≡” and all other “≡” in the same equation. This may render it somewhat challenging to comprehend, yet I assure you I will use it judiciously. Now back to RSA:
The prover generates the following numbers:
- p, q: two random secret prime numbers
- n := p q
- d: a random number such that 1 < d < (p-1)(q-1) and gcd(d, (p-1)(q-1)) = 1 (so that e below exists)
- e: a number such that d e ≡ 1 (mod (p-1)(q-1)).
The public key is (e, n) and the private key is d. The primes p and q can be disregarded but should not be exposed.
The message m is encrypted via E(m) := m^e % n, and the ciphertext c = E(m) is decrypted via D(c) := c^d % n.
Because c^d ≡ (m^e % n)^d ≡ m^(ed) (mod n) and multiplication in the exponent of m behaves like multiplication in the group modulo (p-1)(q-1), we get m^(ed) ≡ m^1 ≡ m (mod n). Moreover, the security of RSA is predicated on the assumption that n cannot be factored efficiently, thus d cannot be derived from e (if we were aware of p and q, this would be straightforward).
One of the astonishing features of RSA is that it is multiplicatively homomorphic. Typically, two operations are homomorphic if their order can be swapped without impacting the result. In the context of homomorphic encryption, this is the attribute allowing computations on encrypted data. Fully homomorphic encryption, which exists but isn’t yet practical, would facilitate evaluating arbitrary programs on encrypted data. Here, in the case of RSA, we are solely referring to group multiplication. More formally: E(x) E(y) ≡ x^e y^e ≡ (xy)^e ≡ E(x y) (mod n), or put differently: The multiplication of the encryption of two messages equals the encryption of the product of the messages.
This homomorphic property already permits a form of zero-knowledge proof of multiplication: The prover possesses some secret numbers x and y and calculates their product but only transmits the encrypted versions a = E(x), b = E(y), and c = E(x y) to the verifier. The verifier then checks that (a b) % n = c % n and the only information the verifier uncovers is the encrypted version of the product and that the product was computed accurately; however, she does not learn the two factors or the actual product. If you substitute the product with addition, this hints towards a blockchain where the principal operation is to aggregate balances.
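Here is a minimal Python sketch of this, using deliberately tiny (and therefore completely insecure) toy parameters; it only illustrates the multiplicative homomorphism and the verifier’s check, not real RSA usage.

```python
# Minimal RSA homomorphism sketch with tiny, insecure toy parameters.
p, q = 61, 53
n = p * q
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, coprime to phi
d = pow(e, -1, phi)            # private exponent (Python 3.8+)

def E(m):                      # E(m) = m^e % n
    return pow(m, e, n)

# Prover: secret factors x, y and their product, sent only in encrypted form.
x, y = 7, 11
a, b, c = E(x), E(y), E(x * y)

# Verifier: E(x) * E(y) = E(x * y) (mod n), without learning x, y or x*y.
assert a * b % n == c % n
assert pow(c, d, n) == x * y   # only the key holder can decrypt the product
```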
Interactive Verification
Having touched upon the zero-knowledge aspect, let’s now concentrate on another primary feature of zkSNARKs, the succinctness. As you will discover later, succinctness is the considerably more remarkable aspect of zkSNARKs, as the zero-knowledge portion will be provided “for free” due to a specific encoding that allows a limited form of homomorphic encoding.
SNARKs stand for succinct non-interactive arguments of knowledge. In the general setting of so-called interactive protocols, there exists a prover and a verifier, where the prover intends to convince the verifier of a statement (e.g. that f(x) = y) through an exchange of messages. The generally desired properties are that no prover can persuade the verifier of a false statement (soundness) and that there exists a certain strategy for the prover to convince the verifier of any true statement (completeness). The individual parts of the acronym signify the following:
- Succinct: the sizes of the messages are tiny in comparison to the length of the actual computation
- Non-interactive: there is minimal or no interaction. In the case of zkSNARKs, a setup phase is generally present, succeeded by a single communication from the prover to the verifier. Moreover, SNARKs often possess the so-called “public verifier” attribute, allowing anyone to validate without re-engaging, which is crucial for blockchains.
- ARguments: the verifier is only safeguarded against computationally constrained provers. Provers equipped with sufficient computational capabilities can fabricate proofs/arguments regarding false claims (Bear in mind that with adequate computational resources, any public-key encryption can be compromised). This characteristic is referred to as “computational soundness,” in contrast to “perfect soundness.”
- of Knowledge: the prover cannot generate a proof/argument without possessing a specific so-called witness (for instance, the address they intend to spend from, the preimage of a hash function, or the route to a particular Merkle-tree node).
Adding the zero-knowledge prefix imposes the (informal) requirement that during the interaction, the verifier learns nothing apart from the validity of the statement. In particular, the verifier does not learn the witness string – we will see later what that is.
For illustration, let’s examine this transaction validation computation: f(σ1, σ2, s, r, v, ps, pr) = 1 if and only if σ1 and σ2 are the root hashes of account Merkle-trees (the pre- and post-state), s and r are sender and receiver accounts, and ps, pr are Merkle-tree proofs affirming that the balance of s is at least v in σ1, and that they hash to σ2 instead of σ1 if v is transferred from the balance of s to the balance of r.
Verifying the computation of f is relatively straightforward if all entries are known. Consequently, we can convert f into a zkSNARK where only σ1 and σ2 are made public, while (s, r, v, ps, pr) constitutes the witness string. The zero-knowledge attribute now enables the verifier to confirm that the prover is aware of some witness that transforms the root hash from σ1 to σ2 in a manner that does not infringe upon any criteria governing legitimate transactions, but she remains oblivious to who transferred how much money to whom.
The precise definition (still omitting certain specifics) of zero-knowledge stipulates that there exists a simulator that, after producing the setup string but lacking knowledge of the confidential witness, can interact with the verifier — yet an external observer cannot differentiate this interaction from that with the authentic prover.
NP and Complexity-Theoretic Reductions
To determine which issues and computations zkSNARKs are applicable to, we must establish certain concepts from complexity theory. If you’re unconcerned about what a “witness” entails, what you will not comprehend after “reading” a zero-knowledge proof, or why zkSNARKs are exclusively suitable for a particular issue concerning polynomials, you may skip this section.
P and NP
To begin, let’s limit our focus to functions that produce only 0 or 1, labeling such functions as problems. Since each bit of a longer outcome can be queried individually, this is not a substantial limitation but simplifies the theory significantly. Now, we aim to measure how “complex” it is to resolve a given problem (compute the function). For a designated machine implementation M of a mathematical function f, we can consistently count the number of actions required to compute f on a specific input x – this is termed the runtime of M on x. The exact definition of a “step” is not particularly significant in this context. As the program typically requires more time for larger inputs, the runtime is always evaluated in relation to the size or length (in number of bits) of the input. This is where the idea of an “n^2 algorithm” arises – it’s an algorithm that requires at most n^2 steps for inputs of size n. The terms “algorithm” and “program” are largely equivalent here.
Programs with a runtime of at most n^k for a certain k are classified as “polynomial-time programs.”
Two principal categories of problems in complexity theory are P and NP:
- P represents the classification of problems L that have polynomial-time programs.
Even if the exponent k can be considerable for particular problems, P is deemed the class of “manageable” issues and indeed, for non-artificial concerns, k is generally not greater than 4. Validating a bitcoin transaction is a problem within P, as is computing a polynomial (and confining the result to 0 or 1). Broadly speaking, if one merely needs to compute a value and does not require “searching” for anything, the problem is virtually always classified under P. If searching for something is necessary, you primarily fall into a category referred to as NP.
The Class NP
zkSNARKs exist for all problems categorized in NP, and indeed, the current practical zkSNARKs can be generically applied to any problem within NP. It remains uncertain whether zkSNARKs exist for any issue outside of NP.
All problems in NP possess a certain structure, originating from the definition of NP:
- NP is the classification of problems L that possess a polynomial-time program V that can be utilized to verify a fact provided a polynomially-sized so-called witness for that fact. More formally:
L(x) = 1 if and only if there is some polynomially-sized string w (referred to as the witness) such that V(x, w) = 1
An instance of a problem in NP can be illustrated with the issue of boolean formula satisfiability (SAT). For this, we define a boolean formula using an inductive definition:
- any variable x1, x2, x3,… is a boolean formula (we also use any other character to denote a variable)
- if f is a boolean formula, then ¬f is a boolean formula (negation)
- if f and g are boolean formulas, then (f ∧ g) and (f ∨ g) are boolean formulas (conjunction / and, disjunction/ or).
The expression “((x1∧ x2) ∧ ¬x2)” constitutes a boolean formula.
A boolean formula is satisfiable if there’s a method to allocate truth values to the variables so that the formula computes to true (where ¬true is false, ¬false is true, true ∧ false is false and similarly, following standard rules). The satisfiability problem SAT encompasses all satisfiable boolean formulas.
- SAT(f) := 1 if f represents a satisfiable boolean formula and 0 otherwise
The aforementioned example, “((x1 ∧ x2) ∧ ¬x2)”, is not satisfiable and therefore does not fall within SAT. The witness for a given formula is its satisfying assignment, and verifying that a variable assignment is satisfying can be accomplished in polynomial time.
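To make the role of the verifier V concrete, here is a small Python sketch (the tuple encoding of formulas is just an assumption made for this example) that checks a candidate assignment against a formula in time linear in the formula size.

```python
# Sketch of the polynomial-time SAT verifier V(f, w): given a formula f and
# a candidate assignment w (the witness), checking satisfaction is a simple
# recursive evaluation. Formulas are encoded as nested tuples:
# ("var", i), ("not", f), ("and", f, g), ("or", f, g).
def evaluate(f, assignment):
    op = f[0]
    if op == "var":
        return assignment[f[1]]
    if op == "not":
        return not evaluate(f[1], assignment)
    if op == "and":
        return evaluate(f[1], assignment) and evaluate(f[2], assignment)
    if op == "or":
        return evaluate(f[1], assignment) or evaluate(f[2], assignment)
    raise ValueError("unknown operator")

# ((x1 AND x2) AND NOT x2), the unsatisfiable example from above:
f = ("and", ("and", ("var", 1), ("var", 2)), ("not", ("var", 2)))
print(evaluate(f, {1: True, 2: True}))   # False -- no assignment will work
```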
Is P Equal to NP?
If you narrow down the description of NP to witness strings of length zero, you encompass the same problems as those in P. Because of this, every issue in P is also contained in NP. A major goal in complexity theory research is to demonstrate that these two categories are indeed distinct – specifically, that there exists a problem in NP that does not reside in P. It might appear straightforward that this is true, but if you can formally establish it, you could win US$ 1 million. Additionally, if you manage to prove the opposite, that P and NP are the same, besides the same monetary reward, there is a significant likelihood that cryptocurrencies could vanish overnight. This outcome would occur as it will become considerably simpler to discover solutions to proof-of-work puzzles, collisions in hash functions, or the private key associated with an address. These challenges all belong to NP, and since you would have demonstrated that P = NP, a polynomial-time program must exist for them. However, this article is not intended to alarm you; most researchers hold the belief that P and NP are not the same.
NP-Completeness
Let’s return to SAT. The remarkable characteristic of this ostensibly simple problem is that it not only belongs to NP, but it is also NP-complete. The term “complete” in this context is akin to “Turing-complete.” It denotes that it is one of the most challenging problems in NP, but more crucially — and this forms the definition of NP-complete — any problem in NP can be converted to an equivalent input for SAT in the following manner:
For every NP-problem L, there exists a so-called reduction function f, which can be computed in polynomial time, such that:
- L(x) = SAT(f(x))
Such a reduction function can be viewed as a compiler: It accepts source code written in a specific programming language and translates it into an equivalent program in another programming language, typically a machine language that exhibits the same semantic behavior. Given that SAT is NP-complete, such a reduction is available for any conceivable problem in NP, including the issue of verifying whether, for instance, a bitcoin transaction is valid when provided with the appropriate block hash. A reduction function exists that converts a transaction into a boolean formula, such that the formula is satisfiable if and only if the transaction is valid.
Example of Reduction
To illustrate such a reduction, let us analyze the problem of evaluating polynomials. First, we define a polynomial (similar to a boolean formula) as an expression composed of integer constants, variables, addition, subtraction, multiplication, and (correctly structured) parentheses. Now, the problem we aim to consider is
- PolyZero(f) := 1 if f represents a polynomial with a zero where its variables are derived from the set {0, 1}
We will now develop a reduction from SAT to PolyZero, thereby demonstrating that PolyZero is also NP-complete (confirming that it belongs to NP is assigned as an exercise).
It suffices to define the reduction function r based on the structural elements of a boolean formula. The concept here is that for any boolean formula f, the value r(f) is a polynomial with an equivalent number of variables, and f(a1,..,ak) evaluates to true if and only if r(f)(a1,..,ak) equals zero, where true corresponds to 1 and false corresponds to 0, and r(f) only assumes the values 0 or 1 for variables from {0, 1}:
- r(xi) := (1 – xi)
- r(¬f) := (1 – r(f))
- r((f ∧ g)) := (1 – (1 – r(f))(1 – r(g)))
- r((f ∨ g)) := r(f)r(g)
One might have expected that r((f ∧ g)) would be articulated as r(f) + r(g), but this would result in the polynomial stepping outside the {0, 1} set.
Applying r, the formula ((x ∧ y) ∨ ¬x) is transformed to (1 – (1 – (1 – x))(1 – (1 – y)))(1 – (1 – x)).
Note that each of the transformation rules for r achieves the stated objective above, thereby confirming that r effectively conducts the reduction:
- SAT(f) = PolyZero(r(f)) or f is satisfiable if and only if r(f) has a zero in {0, 1}
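The following Python sketch implements r on the tuple encoding from the SAT example above and brute-forces this equivalence on a small formula; it only illustrates the reduction and is not meant to be efficient.

```python
from itertools import product

# Sketch of the reduction r from SAT to PolyZero. r(f) is returned as a
# Python function that evaluates the resulting polynomial on a 0/1 assignment.
def r(f):
    op = f[0]
    if op == "var":
        i = f[1]
        return lambda a: 1 - a[i]
    if op == "not":
        g = r(f[1])
        return lambda a: 1 - g(a)
    if op == "and":
        g, h = r(f[1]), r(f[2])
        return lambda a: 1 - (1 - g(a)) * (1 - h(a))
    if op == "or":
        g, h = r(f[1]), r(f[2])
        return lambda a: g(a) * h(a)
    raise ValueError("unknown operator")

def evaluate(f, a):        # boolean evaluation, same encoding as the SAT sketch
    op = f[0]
    if op == "var": return a[f[1]]
    if op == "not": return not evaluate(f[1], a)
    if op == "and": return evaluate(f[1], a) and evaluate(f[2], a)
    return evaluate(f[1], a) or evaluate(f[2], a)

# Brute-force check of SAT(f) == PolyZero(r(f)) on ((x1 AND x2) OR NOT x1):
f = ("or", ("and", ("var", 1), ("var", 2)), ("not", ("var", 1)))
poly = r(f)
for bits in product([0, 1], repeat=2):
    a = {1: bits[0], 2: bits[1]}
    sat = evaluate(f, {k: bool(v) for k, v in a.items()})
    assert sat == (poly(a) == 0)   # satisfying assignments are exactly the zeros
```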
Preservation of Witness
From this illustration, it’s evident that the reduction function solely specifies how to convert the input; however, upon closer examination (or by reviewing the proof that the reduction is valid), one can uncover a method to also convert a valid witness along with the input. In our scenario, we only described how to convert the formula into a polynomial, but the proof clarifies how to transform the witness, the satisfying assignment. This simultaneous transformation of the witness is not required by the definition of a reduction, yet it is usually performed as well. This is quite significant for zkSNARKs since the primary duty of the prover is to convince the verifier that such a witness exists, without disclosing information regarding the witness.
Quadratic Span Programs
In the preceding section, we examined how computational challenges within NP can be interrelated and, notably, that there are NP-complete problems that are essentially mere reformulations of all other issues in NP – including transaction validation challenges. This enables us to easily devise a universal zkSNARK for all problems in NP: We simply select a suitable NP-complete problem. Thus, if we intend to demonstrate how to validate transactions using zkSNARKs, it’s adequate to illustrate the process for a specific problem that is NP-complete and potentially more straightforward to analyze theoretically.
This section and the one that follows are based on the paper GGPR12 (the referenced technical report provides significantly more information than the journal article), where the authors discovered that the problem known as Quadratic Span Programs (QSP) is particularly advantageous for zkSNARKs. A Quadratic Span Program comprises a collection of polynomials, and the objective is to determine a linear combination of these that is a multiple of a distinct polynomial provided. Additionally, the specific bits of the input string limit the polynomials permitted for use. In particular (the general QSPs exhibit slightly more flexibility, but we define the strong version, as it will be useful later):
A QSP over a field F for inputs of length n includes
- a collection of polynomials v0,…,vm, w0,…,wm over this field F,
- a polynomial t over F (the target polynomial),
- an injective function f: {(i, j) | 1 ≤ i ≤ n, j ∈ {0, 1}} → {1, …, m}
The core task is to multiply the polynomials by coefficients and sum them in such a way that this total (termed a linear combination) is a multiple of t. For each binary input string u, the function f constrains the polynomials that may be employed, or in other words, their coefficients in the linear combinations. Formally:
An input u is validated (verified) by the QSP if and only if there are tuples a = (a1,…,am), b = (b1,…,bm) from the field F such that
- ak = bk = 1 if k = f(i, u[i]) for some i (u[i] is the ith bit of u),
- ak = bk = 0 if k = f(i, 1 – u[i]) for some i, and
- the target polynomial t divides va wb, where va = v0 + a1v1 + … + amvm and wb = w0 + b1w1 + … + bmwm.
It is important to note that there is still some latitude in selecting the tuples a and b if 2n is less than m. This indicates that QSP is only applicable for inputs of a certain size – this issue is addressed by employing non-uniform complexity, a topic we will not explore at this moment; suffice it to say that it is effective for cryptography where inputs are typically small.
As a parallel to the satisfiability of Boolean formulas, the factors a1,…,am, b1,…,bm can be viewed as the assignments to the variables, or generally, the NP witness. To illustrate that QSP resides in NP, note that all the verifier needs to do (once she is aware of the factors) is to verify that the polynomial t divides va wb, which is a polynomial-time task.
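As a concrete illustration of that check, here is a small Python sketch with hand-picked toy polynomials over GF(101) (they are not derived from any real QSP instance); it verifies divisibility by computing the remainder of va·wb modulo t.

```python
# Toy check of "t divides va * wb" over GF(101), with polynomials written
# as coefficient lists (lowest degree first).
P = 101

def poly_mul(a, b):
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out

def poly_rem(num, den):
    """Remainder of num / den over GF(P)."""
    num = num[:]
    inv_lead = pow(den[-1], -1, P)
    for i in range(len(num) - len(den), -1, -1):
        factor = num[i + len(den) - 1] * inv_lead % P
        for j, dcoef in enumerate(den):
            num[i + j] = (num[i + j] - factor * dcoef) % P
    return num[: len(den) - 1]

t  = [1, 0, 1]                   # t(x) = x^2 + 1 = (x - 10)(x + 10) mod 101
va = poly_mul([91, 1], [3, 1])   # (x - 10)(x + 3): not itself a multiple of t
wb = poly_mul([10, 1], [7, 1])   # (x + 10)(x + 7): not a multiple of t either
rem = poly_rem(poly_mul(va, wb), t)
print(all(c == 0 for c in rem))  # True: t divides va * wb
```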
We will not discuss the reduction from general computations or circuits to QSP here, as it does not aid in comprehending the general concept, so you must trust that QSP is NP-complete (or rather complete for some non-uniform analogue like NP/poly). In real-world applications, the reduction constitutes the actual “engineering” aspect – it must be handled wisely so that the resulting QSP remains as compact as feasible and possesses additional advantageous characteristics.
One aspect of QSPs that we can already observe is how to verify them significantly more efficiently: The verification process entails confirming whether one polynomial divides another polynomial. This can be aided by the prover supplying another polynomial h such that t h = va wb, which converts the task into validating a polynomial identity or alternatively, determining that t h – va wb = 0, meaning checking that a specific polynomial is the zero polynomial. This appears rather straightforward, but the polynomials we will use subsequently are quite substantial (the degree is approximately 100 times the number of gates in the original circuit), making the multiplication of two polynomials a challenging task.
Thus, instead of actually calculating va, wb and their product, the verifier selects a secret random point s (this point is part of the “toxic waste” of zCash), computes the values t(s), vk(s) and wk(s) for all k, and from these, va(s) and wb(s) and merely confirms that t(s) h(s) = va(s) wb(s). Consequently, a series of polynomial additions, scalar multiplications, and a polynomial product is simplified to field multiplications and additions.
Verifying a polynomial identity solely at a single point rather than at all points naturally diminishes the security; however, the only way the prover can deceive in the event that t h – va wb is not the zero polynomial is if she manages to hit a root of that polynomial. Since she does not know s and the count of roots is minor (the degree of the polynomials) in comparison to the potential values for s (the number of field elements), this is exceedingly safe in practice.
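The following Python sketch illustrates this random-point shortcut with toy polynomials chosen so that t·h = va·wb holds by construction; a tampered polynomial is caught except with probability roughly degree/|field|.

```python
import random

# Sketch of the random-evaluation shortcut: instead of comparing t*h and
# va*wb coefficient by coefficient, evaluate both sides at one random s.
# Coefficient lists are lowest degree first; all polynomials are toy values.
P = 2**61 - 1                       # a large prime field keeps cheating unlikely

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):      # Horner's rule
        acc = (acc * x + c) % P
    return acc

def spot_check(t, h, va, wb):
    s = random.randrange(P)         # the secret evaluation point
    return poly_eval(t, s) * poly_eval(h, s) % P == \
           poly_eval(va, s) * poly_eval(wb, s) % P

t, h   = [1, 0, 1], [21, 10, 1]     # t = x^2 + 1, h = (x + 3)(x + 7)
va, wb = [3, 1], [7, 1, 7, 1]       # va = x + 3, wb = (x^2 + 1)(x + 7)

assert spot_check(t, h, va, wb)           # honest prover: always passes
wb_bad = [8, 1, 7, 1]                     # tamper with one coefficient
assert not spot_check(t, h, va, wb_bad)   # fails except with prob. ~ degree/P
```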
The zkSNARK in Detail
We shall now elaborate on the zkSNARK for QSP in depth. It commences with a setup phase that must be executed for every individual QSP. In zCash, the circuit (the transaction verifier) is predetermined, and therefore the polynomials for the QSP are also set, which allows the setup to occur only once and be reused for all transactions, which solely differ in the input u. For the setup, which produces the common reference string (CRS), the verifier selects a random and secret field element s and encrypts the values of the polynomials at that specific point. The verifier utilizes a specific encryption E and publishes E(vk(s)) and E(wk(s)) as part of the CRS. The CRS also contains numerous other values that enhance the efficiency of verification and additionally confer the zero-knowledge property. The encryption E employed here possesses a specific homomorphic property, which enables the prover to calculate E(v(s)) without actually knowing vk(s).
How to Evaluate a Polynomial Succinctly and with Zero-Knowledge
Let’s first examine a simpler scenario, specifically the encrypted evaluation of a polynomial at a secret point, without delving into the complete QSP challenge.
For this purpose, we establish a group (an elliptic curve is typically selected here) and a generator g. Recall that a group element g is termed a generator if there exists a number n (the group order) such that the list g^0, g^1, g^2, …, g^(n-1) contains all elements of the group. The encryption is merely defined as E(x) := g^x. Following this, the verifier selects a confidential field element s and makes public (as part of the CRS)
- E(s^0), E(s^1), …, E(s^d) – d represents the highest degree of all polynomials
Subsequently, s can be (and must be) disregarded. This is precisely what zCash refers to as toxic waste, for if an individual can retrieve this and other secret values selected later, they could fabricate proofs at will by locating roots in the polynomials.
Utilizing these values, the prover is able to compute E(f(s)) for any polynomial f without knowledge of s: Suppose our polynomial is f(x) = 4x^2 + 2x + 4 and we wish to compute E(f(s)); we derive E(f(s)) = E(4s^2 + 2s + 4) = g^(4s^2 + 2s + 4) = E(s^2)^4 E(s^1)^2 E(s^0)^4, which can be calculated from the disclosed CRS without knowledge of s.
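Here is a toy Python sketch of this homomorphic evaluation, using exponentiation modulo a prime in place of a real elliptic-curve group (all parameters are illustrative and insecure):

```python
# Toy sketch of computing E(f(s)) = g^f(s) from the published powers
# E(s^0), E(s^1), E(s^2) only. The group "integers modulo a prime" and the
# numbers below are purely illustrative; a real system uses an elliptic
# curve, and s would not be available for the final sanity check.
p = 2**127 - 1                 # a prime modulus (toy choice)
g = 5

def E(x):
    return pow(g, x, p)        # E(x) = g^x

# Setup: choose secret s, publish E(s^0), E(s^1), E(s^2), then discard s.
s = 123456789
crs = [E(s**i) for i in range(3)]

# Prover: evaluate f(x) = 4x^2 + 2x + 4 "in the exponent", using only the CRS.
proof = pow(crs[2], 4, p) * pow(crs[1], 2, p) * pow(crs[0], 4, p) % p

# Sanity check, only possible here because we still happen to know s:
assert proof == E(4 * s**2 + 2 * s + 4)
```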
The sole issue here is that, due to the destruction of s, the verifier cannot confirm that the prover executed the polynomial accurately. To address this, we also select another confidential field element, α, and reveal the following “shifted” values:
- E(αs^0), E(αs^1), …, E(αs^d)
Similar to s, the value α is also discarded post-setup phase and is unknown to both the prover and the verifier. Using these encoded values, the prover can likewise compute E(α f(s)); in our case this is E(4αs^2 + 2αs + 4α) = E(αs^2)^4 E(αs^1)^2 E(αs^0)^4. Thus the prover publishes A := E(f(s)) and B := E(α f(s)) and the verifier must check that these values correspond. She accomplishes this by using another core component: a so-called pairing function e. The elliptic curve and the pairing function must be selected together, ensuring that the following property holds for all x, y:
- e(g^x, g^y) = e(g, g)^(xy)
Employing this pairing function, the verifier checks that e(A, g^α) = e(B, g) — note that g^α is known to the verifier since it is part of the CRS as E(αs^0). To see that this check succeeds if the prover does not cheat, let us examine the following equations:
e(A, g^α) = e(g^f(s), g^α) = e(g, g)^(α f(s))
e(B, g) = e(g^(α f(s)), g) = e(g, g)^(α f(s))
The more crucial question, however, is whether the prover can somehow generate values A, B that satisfy the check e(A, g^α) = e(B, g) but are not E(f(s)) and E(α f(s)), respectively. The answer to this question is “we hope not”. This is referred to as the “d-power knowledge of exponent assumption,” and it remains unknown whether a cheating prover can accomplish such a feat. The assumption is an extension of similar assumptions that underpin the security of other public-key encryption schemes, which are likewise not known to hold with certainty.
In reality, the aforementioned protocol does not genuinely grant the verifier the ability to affirm that the prover evaluated the polynomial f(x) = 4x^2 + 2x + 4; the verifier can solely ascertain that the prover evaluated some polynomial at the point s. The zkSNARK for QSP will include an additional value that enables the verifier to confirm that the prover indeed computed the correct polynomial.
What this example illustrates is that the verifier doesn’t need to assess the entire polynomial to validate this; it is sufficient to evaluate the pairing function. In the following step, we will incorporate the zero-knowledge component so that the verifier cannot reconstruct anything related to f(s), not even E(f(s)) – the encrypted value.
To achieve this, the prover selects a random δ and instead of A := E(f(s)) and B := E(α f(s)), she submits A’ := E(δ + f(s)) and B’ := E(α (δ + f(s))). Assuming the encryption remains unbreakable, the zero-knowledge property becomes quite evident. We now need to verify two aspects: 1. the prover can genuinely compute these values and 2. the verifier’s check is still valid.
For 1., observe that A’ = E(δ + f(s)) = g^(δ + f(s)) = g^δ g^f(s) = E(δ) E(f(s)) = E(δ) A and likewise, B’ = E(α (δ + f(s))) = E(αδ + α f(s)) = g^(αδ + α f(s)) = g^(αδ) g^(α f(s)) = E(α)^δ E(α f(s)) = E(α)^δ B.
For 2., note that the only aspect the verifier checks is whether the values A and B she receives satisfy the equations A = E(a) and B = E(α a) for some value a, which evidently holds true for a = δ + f(s) as it does for a = f(s).
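To make the pairing check and the δ-blinding tangible without pulling in a pairing library, here is a toy Python simulation in which group elements are represented directly by their exponents, so the “pairing” is just a multiplication of exponents. A real system would of course use a pairing-friendly elliptic curve instead, and the prover would obtain A’ and B’ homomorphically from the CRS as in the earlier sketch.

```python
import random

# Toy simulation of the check e(A', g^alpha) = e(B', g) plus delta-blinding.
# Group elements are represented by their exponents modulo the group order,
# so "pairing" reduces to multiplying exponents -- purely for illustration.
q = 2**61 - 1                          # toy group order

def E(x):                              # stands in for g^x
    return x % q

def pairing(a, b):                     # e(g^a, g^b) = e(g, g)^(a*b)
    return a * b % q

s, alpha = 1234567, 7654321            # "toxic waste" values from the setup
f_at_s = (4 * s * s + 2 * s + 4) % q   # f(s) for f(x) = 4x^2 + 2x + 4

delta = random.randrange(q)            # prover's random blinding value
A = E(delta + f_at_s)                  # A' = E(delta + f(s))
B = E(alpha * (delta + f_at_s))        # B' = E(alpha * (delta + f(s)))

# Verifier: g^alpha is known from the CRS (it is E(alpha * s^0)).
assert pairing(A, E(alpha)) == pairing(B, E(1))
```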
Thus, we now possess an understanding of how the prover can derive the encrypted value of a polynomial at an encrypted secret point without the verifier uncovering any information about that value. Let us now apply that to the QSP challenge.
A SNARK for the QSP Challenge
Bear in mind that in the QSP we are provided with polynomials v0,…,vm, w0,…,wm, a target polynomial t (of degree no greater than d) and a binary input string u. The prover discovers a1,…,am, b1,…,bm (which are somewhat limited according to u) and a polynomial h such that
- t h = (v0 + a1v1 + … + amvm) (w0 + b1w1 + … + bmwm).
In the preceding section, we have already elucidated how the common reference string (CRS) is established. We opt for confidential numbers s and α and disclose
- E(s^0), E(s^1), …, E(s^d) and E(αs^0), E(αs^1), …, E(αs^d)
Since we do not have a singular polynomial, but sets of polynomials determined by the problem, we also publish the computed polynomials immediately:
- E(t(s)), E(α t(s)),
- E(v0(s)), …, E(vm(s)), E(α v0(s)), …, E(α vm(s)),
- E(w0(s)), …, E(wm(s)), E(α w0(s)), …, E(α wm(s)),
and we require additional secret values βv, βw, γ (these will be utilized to confirm that the aforementioned polynomials were evaluated and not some random polynomials) and publish
- E(γ), E(βv γ), E(βw γ),
- E(βv v1(s)), …, E(βv vm(s))
- E(βw w1(s)), …, E(βw wm(s))
- E(βv t(s)), E(βw t(s))
This constitutes the complete common reference string. In actual applications, certain components of the CRS may be unnecessary, but this would complicate the explanation.
What action does the prover undertake? She employs the reduction described earlier to identify the polynomial h and the quantities a1,…,am, b1,…,bm. It is crucial to implement a witness-preserving reduction (refer to the prior explanation) because only then can the quantities a1,…,am, b1,…,bm be calculated in conjunction with the reduction and would be exceedingly challenging to ascertain otherwise. To articulate what the prover relays to the verifier as proof, we must revert to the definition of the QSP.
An injective function f: {(i, j) | 1 ≤ i ≤ n, j ∈ {0, 1}} → {1, …, m} existed which constrains the values of a1,…,am, b1,…,bm. Given that m is relatively extensive, there are indices that do not emerge in the output of f for any corresponding inputs. These indices are unconstrained, thus we will designate them as Ifree and define vfree(x) = Σk akvk(x) where k spans all indices in Ifree. For w(x) = b1w1(x) + … + bmwm(x), the proof now comprises of
- Vfree := E(vfree(s)), W := E(w(s)), H := E(h(s)),
- V’free := E(α vfree(s)), W’ := E(α w(s)), H’ := E(α h(s)),
- Y := E(βv vfree(s) + βw w(s))
where the final component is utilized to ascertain that the proper polynomials were employed (this aspect has not yet been addressed in the preceding example). It is essential to note that all of these encrypted values can be produced by the prover possessing solely the CRS.
The verifier’s responsibility is now as follows:
Since the values of ak, where k does not correspond to a “free” index, can be computed directly from the input u (which is also recognized by the verifier, this is what is to be validated), the verifier can derive the missing segment of the complete sum for v:
- E(vin(s)) = E(Σk akvk(s)) where k encompasses all indices not in Ifree.
In doing so, the verifier now validates the following equalities utilizing the pairing function e (do not be alarmed):
- e(V’free, g) = e(Vfree, g^α), e(W’, E(1)) = e(W, E(α)), e(H’, E(1)) = e(H, E(α))
- e(E(γ), Y) = e(E(βv γ), Vfree) e(E(βw γ), W)
- e(E(v0(s)) E(vin(s)) Vfree, E(w0(s)) W) = e(H, E(t(s)))
To understand the overarching notion here, it is crucial to realize that the pairing function facilitates limited computation on encrypted values: Arbitrary additions can be executed but only a single multiplication. The capacity for addition originates from the inherent additive homomorphism of the encryption itself, while the singular multiplication is achievable through the two arguments assigned to the pairing function. Thus, e(W’, E(1)) = e(W, E(α)) essentially multiplies W’ by 1 within the encrypted domain and compares that to W multiplied by α in the encrypted realm. Looking at the values that W and W’ are supposed to have – E(w(s)) and E(α w(s)) – this check passes exactly when the prover supplied the correct values.
If you recall from the section on evaluating polynomials at secret points, these initial three verifications essentially confirm that the prover evaluated some polynomial constructed from the segments of the CRS. The second verification ensures that the prover employed the correct polynomials v and w rather than arbitrary alternatives. The rationale is that the prover has no way to generate the encrypted combination E(βv vfree(s) + βw w(s)) by any means other than from the precise values of E(vfree(s)) and E(w(s)). The reason is that the value βv is not included in the CRS on its own, but only in combination with the values vk(s), and βw is only known in combination with the polynomials wk(s). The sole mechanism for “mixing” them is via the equally encrypted γ.
Assuming the prover has supplied a legitimate proof, let us verify that the equality holds. The left and right sides are, respectively
- e(E(γ), Y) = e(E(γ), E(βv vfree(s) + βw w(s))) = e(g, g)^(γ(βv vfree(s) + βw w(s)))
- e(E(βv γ), Vfree) e(E(βw γ), W) = e(E(βv γ), E(vfree(s))) e(E(βw γ), E(w(s))) = e(g, g)^(βv γ vfree(s)) e(g, g)^(βw γ w(s)) = e(g, g)^(γ(βv vfree(s) + βw w(s)))
The third verification essentially checks that (v0(s) + a1v1(s) + … + amvm(s)) (w0(s) + b1w1(s) + … + bmwm(s)) = h(s) t(s), which is the primary condition for the QSP problem. It’s important to observe that multiplication of the encrypted values corresponds to addition of the unencrypted ones because E(x) E(y) = g^x g^y = g^(x+y) = E(x + y).
Incorporating Zero-Knowledge
As mentioned earlier, the remarkable property of zkSNARKs lies more in their succinctness than in their zero-knowledge aspect. We will now explore how to incorporate zero-knowledge, while the subsequent section will delve further into the succinctness.
The concept involves the prover “shifting” certain values by a randomly chosen secret quantity and compensating for the shift on the equation’s other side. The prover selects random δfree, δw and implements the following substitutions in the proof
- vfree(s) is substituted with vfree(s) + δfree t(s)
- w(s) is substituted with w(s) + δw t(s).
Through these substitutions, the values Vfree and W, which encompass an encoding of the witness components, effectively become indistinguishable from randomness, making it impossible to retrieve the witness. Most of the equality checks remain “insensitive” to these adjustments; the sole value that still requires correction is H or h(s). We must ensure that
- (v0(s) + a1v1(s) + … + amvm(s)) (w0(s) + b1w1(s) + … + bmwm(s)) = h(s) t(s), or in different terms
- (v0(s) + vin(s) + vfree(s)) (w0(s) + w(s)) = h(s) t(s)
remains valid. After the changes, we acquire
- (v0(s) + vin(s) + vfree(s) + δfree t(s)) (w0(s) + w(s) + δw t(s))
and by expanding the multiplication, we observe that substituting h(s) with
- h(s) + δfree (w0(s) + w(s)) + δw (v0(s) + vin(s) + vfree(s)) + (δfree δw) t(s)
will suffice.
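A quick symbolic sanity check of this correction (a sketch using sympy, with v, w, h, t standing for the evaluated sums v0(s)+vin(s)+vfree(s), w0(s)+w(s), h(s), t(s), and d_free, d_w for the shifts δfree, δw) confirms that the shifted left-hand side minus the corrected right-hand side collapses to v·w − h·t, which is zero whenever the original identity holds:

```python
from sympy import symbols, expand

# Symbolic check of the corrected h: if v*w = h*t held before the shifts,
# the shifted identity holds with h replaced by h_corrected.
v, w, h, t, d_free, d_w = symbols("v w h t d_free d_w")

lhs = (v + d_free * t) * (w + d_w * t)
h_corrected = h + d_free * w + d_w * v + d_free * d_w * t
rhs = h_corrected * t

# The difference is exactly v*w - h*t, so lhs = rhs whenever v*w = h*t.
assert expand(lhs - rhs) == expand(v * w - h * t)
print(expand(lhs - rhs))    # prints v*w - h*t (up to term ordering)
```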
Tradeoff between Input and Witness Size
As you have noted in the earlier sections, the proof consists solely of 7 elements from a group (typically an elliptic curve). Moreover, the task that the verifier must undertake involves validating certain equalities incorporating pairing functions and calculating E(vin(s)), an operation that is linear with respect to the input size. Notably, neither the dimension of the witness string nor the computational demand needed to verify the QSP (without SNARKs) plays any significant role in the verification process. This implies that SNARKs can confirm extremely intricate problems and very straightforward problems with the same effort. The principal reason for this is that we verify the polynomial identity for a single point rather than the complete polynomial. Polynomials can grow more complex, but a specific point remains just that. The only factors affecting the verification workload are the level of security (i.e., the group size) and the maximum input dimensions.
It is feasible to diminish the second factor, the input dimension, by shifting some of it into the witness:
Rather than validating the function f(u, w), where u denotes the input and w represents the witness, we employ a hash function h and verify
- f'(H, (u, w)) := f(u, w) ∧ h(u) = H.
This indicates we substitute the input u with a hash of the input h(u) (expected to be considerably shorter) and validate that there exists some value x which hashes to H (and is thereby highly likely equal to u) along with confirming f(x, w). Essentially, this moves the original input u into the witness string, increasing the witness size while reducing the input size to a constant.
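A minimal Python sketch of this tradeoff (with a placeholder predicate f and sha256 standing in for the hash h, both chosen purely for illustration):

```python
import hashlib

# Sketch of the input/witness tradeoff: the public input shrinks to a
# constant-size hash H, and the original input u moves into the witness.
def f(u, w):
    return len(w) > 0            # placeholder for the real statement

def h(u):
    return hashlib.sha256(u).digest()

def f_prime(H, witness):
    u, w = witness               # the original input is now part of the witness
    return f(u, w) and h(u) == H

u, w = b"a long transaction input", b"some witness data"
H = h(u)                         # constant-size public input
print(f_prime(H, (u, w)))        # True
```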
This is noteworthy, as it enables us to authenticate arbitrarily intricate assertions in constant time.
What Does This Mean for Ethereum?
As verifying arbitrary computations is fundamental to the Ethereum blockchain, zkSNARKs are undoubtedly significant for Ethereum. With zkSNARKs, it becomes feasible to conduct secret arbitrary computations that are verifiable by anyone, in addition to doing so efficiently.
Although Ethereum employs a Turing-complete virtual machine, implementing a zkSNARK verifier on Ethereum is currently not viable. While the verification tasks may seem conceptually straightforward, calculating a pairing function is quite complex, thus resulting in a gas usage exceeding what is manageable within a single block. Elliptic curve multiplication is already relatively intricate, and pairings elevate that complexity.
Existing zkSNARK systems like zCash utilize the same problem/circuit/computation for every task. For zCash, this is the transaction verifier. Conversely, on Ethereum, zkSNARKs would not be restricted to a singular computational problem; rather, anyone could establish a zkSNARK system for their specialized computational tasks without the necessity of launching a new blockchain. Each new zkSNARK system introduced to Ethereum necessitates a fresh secret trusted setup phase (some elements can be reused, but not all), meaning a new CRS must be generated. It is also conceivable to develop a zkSNARK system for a “generic virtual machine,” eliminating the need for a new setup for different use cases, similar to not requiring a new blockchain to deploy a new smart contract on Ethereum.
Integrating zkSNARKs into Ethereum
There are various methods to facilitate zkSNARKs for Ethereum. All approaches aim to minimize the actual costs associated with pairing functions and elliptic curve operations (the other necessary operations are already inexpensive), thereby also decreasing the gas costs for these tasks:
- enhance the (guaranteed) performance of the EVM
- boost the performance of the EVM solely for particular pairing functions and elliptic curve multiplications
The first option is naturally the more advantageous one in the long term, but poses greater challenges to accomplish. Efforts are underway to introduce features and limitations to the EVM that would enable improved just-in-time compilation and interpretation without requiring significant alterations to the existing frameworks. Alternatively, one might consider entirely replacing the EVM with something like eWASM.
The second option can be actualized by mandating all Ethereum clients to implement a specific pairing function and multiplication on a particular elliptic curve as a precompiled contract. The advantage of this approach is that it is likely much simpler and quicker to accomplish. However, a disadvantage is that it confines us to a specific pairing function and elliptic curve. Any new Ethereum client would need to reconfigure these precompiled contracts. Additionally, if innovations arise and someone discovers superior zkSNARKs, more effective pairing functions, or enhanced elliptic curves, or if any flaws are identified in the elliptic curve, pairing function, or zkSNARK, we would be required to introduce new precompiled contracts.
