-
Notifications
You must be signed in to change notification settings - Fork 2
Description
Currently we have a "typed" language (it's a GADT at the moment, but we could potentially change that, but I don't see why bother).
I believe the current language already accurately reflects what a source langauge for compilation to the R1CS/Sonic constraint system has to be. I also believe that the design space is really narrow and there aren't many opportunities for design choices.
However @jakzale mentioned one possibility and it's having a single assert storing a boolean expression in CNF.
Pros:
- a very sound foundation. CNF is something that is studied from all possible angles, we now how to produce expressions in that form, how to optimize etc
- hence more optimization opportunities
Cons:
- the prover wants to fail as early as possible when it attempts to produce a witness in an erroneous way. This suggests interleaving
lets andasserts and calling eachassertas early as possible. It's not clear how critical it would be if we deferred all calls toassertuntil every singleletis computed as we don't care about the performance of the prover that much (we mostly care about the performance of the verifier), but it does feel wrong to introduce such an inefficiency just for the sake of having a more minimalistic system. Interleavedlets andasserts do reflect better the nature of the computation that the prover is supposed to perform - optimization opportunities are mostly in the form of common subexpression elimination (CSE). Programmers are already good at this kind of thing. Moreover, the programmer knows better what kind of subexpression they want to share. Consider for example group addition on the Jubjub curve in Edwards form, in R1CS it looks like this:
(A) = (x1 + y1) × (x2 + y2)
(B) = (x1) × (y2)
(C) = (x2) × (y1)
(D) = (dj · B) × (C)
(B + C) = (1 + D) × (x3)
(A − B − C) = (1 − D) × (y3)
But in Zonk I implemented it like this (where / is supposed to be compiled without ensuring that the divisor is zero, because it can't be zero):
let x1y2 = x1 * y2;
let x2y1 = x2 * y1;
let x1x2 = x1 * x2;
let y1y2 = y1 * y2;
let prod = x1x2 * y1y2;
let x1 = (x1y2 + x2y1) / (1 + dj * prod);
let y1 = (y1y2 + x1x2) / (1 - dj * prod);
R1CS and Sonic are extremely similar, almost identical, yet I had to completely rewrite the whole thing in order to make compilation efficient. It looks like we'd need to implement a specific form of CSE for each possible zero knowledge backend, which is a lot of highly non-trivial research and development just to solve a task that nobody asked us to solve, as the programmer writing the actual code in Zonk likely already knows what expressions make bindings.
Note also that sometimes we don't want to perform CSE over certain subexpressions. For example, the following program:
let x = a + b + c;
let y = a + b + d;
let out = x * y;
can be more efficient than if we bind a + b to a variable as that would increase the number of constraints. Although given that we're supposed to produce compressed Sonic vectors it's not really clear if having an additional constraint here will make things slower, but then why commit to a single way of handling things, if it's not clear which option is better. The programmer can decide for themselves, particularly given the fact that efficiency concerns are all backend-specific.
Moreover if we did this single assert thing, the output of the first few stages of compilation would basically be obfuscated and completely indecipherable for any non-hello-world program. The programmer writes a program, introduces certain names and logic and we just inline everything and then come up with our own bindings and their bodies, not related in any way to the original program. That would make any kind of visual inspection of intermediate programs completely infeasible, i.e. we'd be less likely to catch an error if it appears somewhere in the logic of the compiler.
And our language is fairly low-level. The programmer should be given full control over what expressions stay bound to variables, what bindings get inlined. I believe that writing an algorithm that consistently outperforms a dedicated human being requires plenty of research and even if we had one, it'd have to be optional. Efficiency (in terms of the number of constraints in the final result) is critically important as that determines how much time is spent on verification and we do want to allow the programmer to fine-tune the output for their specific needs in a predictable way.
Finally, we simply have a ton of quite more vital stuff to do and don't have enough bandwidth for any non-essential and at the same time non-trivial research.
So I believe we shouldn't even attempt investigating this single assert idea any time soon.