diff --git a/labs/Demos/SAW/Bittwiddling/Bittwiddling.md b/labs/Demos/SAW/Bittwiddling/Bittwiddling.md index 8f2a04f4..76d90a18 100644 --- a/labs/Demos/SAW/Bittwiddling/Bittwiddling.md +++ b/labs/Demos/SAW/Bittwiddling/Bittwiddling.md @@ -187,5 +187,5 @@ https://github.com/weaversa/cryptol-course/issues |||| |-:|:-:|-| || [ ^ SAW Demos](../Demos.md) || -| [< Arithmetic Verifications](../ArithmeticVerifications/ArithmeticVerifications.md) | **Bit Twiddling** || +| [< Arithmetic Verifications](../ArithmeticVerifications/ArithmeticVerifications.md) | **Bit Twiddling** | [Salsa20 (Galois) >](../Salsa20/Salsa20.md) | || [! Bit Twiddling (Answers)](./BittwiddlingAnswers.md) || diff --git a/labs/Demos/SAW/Salsa20/Salsa20.md b/labs/Demos/SAW/Salsa20/Salsa20.md new file mode 100644 index 00000000..b91afcbd --- /dev/null +++ b/labs/Demos/SAW/Salsa20/Salsa20.md @@ -0,0 +1,234 @@ +# Introduction + +This demo adapts Galois, Inc.'s existing [SAW script](https://github.com/GaloisInc/saw-script/blob/v0.9/examples/salsa20/salsa.saw) +that verifies their own +[Salsa20 implementation](https://github.com/GaloisInc/saw-script/blob/v0.9/examples/salsa20/salsa20.c). +This example implementation is straightforwardly derived from their +Cryptol spec, which in turn closely matches Dr. Daniel J. Bernstein's +[original Salsa20 specification](http://cr.yp.to/snuffle/spec.pdf), +lending itself nicely to a _compositional_ verification with +_overrides_ and _uninterpreted functions_ that reflects the +implementation and specification. + +## Prerequisites + +Before starting this module, we recommend perusing Galois, Inc.'s +excellent tutorial on +[_Program Verification with SAW_](https://saw.galois.com/intro/index.html). +That tutorial: + * introduces Cryptol and SAW + (you can skip its setup instructions as their installation test + should work in whichever environment you set up for this course) + * specifies and verifies a simple bitcount implementation + * introduces verification of pointer-based implementations + * describes and motivates a _compositional_ verification of Salsa20 + (feel free to return to our module and revisit the next part later) + * offers an extended exercise on proof maintenance for HMAC + +## Skills You'll Learn + +After completing the above tutorial, this module will: + * demonstrate use of the Clang/LLVM command line tool + `opt` and the `dot` language and tool to generate and analyze a + _callgraph_ + * explain the correspondence between a callgraph and SAW overrides + (described in Galois's _Program Verification with SAW_) + * introduce you to uninterpreted functions; using these in + conjunction with overrides decomposes a Cryptol specification, + which potentially simplifies verification for a solver + +# Uninterpreted Functions + +_Program Verification with SAW_ demonstrates compositional verification +using _overrides_ that reflect an implementation's _callgraph_ to ease +the burden of proof for SAW: + +> The third argument to `crucible_llvm_verify` is a list of +> `CrucibleMethodSpec` objects. While performing verification, the work +> that was done to construct a `CrucibleMethodSpec` is re-used. +> Specifically, instead of recursively symbolically executing a +> verified function, the prior specification is used as an +> axiomatization of its behavior. + +(Note: Those `Crucible` prefixes are now deprecated, e.g. +`CrucibleMethodSpec` is now `LLVMSpec`.) + +The term _override_ comes from SAW's docstring for `llvm_verify`: + +```SAW +sawscript> :h llvm_verify +Description +----------- + + llvm_verify : LLVMModule -> String -> [LLVMSpec] -> Bool -> LLVMSetup () -> ProofScript () -> TopLevel LLVMSpec + +Verify the LLVM function named by the second parameter in the module +specified by the first. The third parameter lists the LLVMSpec +values returned by previous calls to use as overrides. The fourth (Bool) +parameter enables or disables path satisfiability checking. The fifth +describes how to set up the symbolic execution engine before verification. +And the last gives the script to use to prove the validity of the resulting +verification conditions. +``` + +One might wonder whether similar hints can ease the burden of proof by +axiomatizing a similarly complex Cryptol specification. Such hints are +called +[_uninterpreted functions_](https://www21.in.tum.de/teaching/sar/SS20/6.pdf), +which instruct an SMT solver or interface +([SBV](http://leventerkok.github.io/sbv/) or +[What4](https://github.com/GaloisInc/what4)) to an underlying SMT +solver to treat a function (from the specification) as "uninterpreted". +The solver abstracts away these function definitions as arbitrary +functions satisfying the corresponding type signature. This removes any +other constraints on the function, so uninterpretation alone would +typically break a proof. But in conjunction with overrides (which +reintroduce the implementation's constraints on the specification), +this effectively axiomatizes a Cryptol symbol definition, avoiding +recursively expanding its underlying definitions. This is often +useful for implementations that are derived from or closely match a +Cryptol specification, as is the case for Galois's Salsa20 example +(but there are +[tradeoffs](https://es-static.fbk.eu/people/griggio/papers/lpar06_ack.pdf) +for this approach). + +For instance, the following verification of `s20_doubleround` uses both +overrides and uninterpreted functions: + +```SAW +dr <- llvm_verify m "s20_doubleround" [cr,rr] false doubleround_setup w4_unint_z3(['columnround', 'rowround']); +``` + +`cr` and `rr` refer to previous SAW verification results, whereas +`columnround` and `rowround` refer to definitions imported from +`Salsa20.cry`. Though this just shaves a few more tenths of a second +off verification of Salsa20 (and at worst *increases* proof times), +benefits become obvious for more complex verifications such as that of +AES and others in +[LibCrypto](https://github.com/awslabs/aws-lc-verification/blob/master/SAW/proof/AES/AES.saw), +a formally verified cryptographic library for Amazon Web +Services. (This script provides other advanced instructions to aid +SMT solvers that are beyond the scope of this course.) + + +## Practice with Uninterpreted Functions and Overrides + +Now that we have added uninterpreted functions to our repertoire, let's +apply them to Salsa20 --- but first, let's more closely examine the +Galois tutorial's implementation and progress toward verifying it... + +Clang/LLVM includes a command line tool called +[`opt`](https://llvm.org/docs/CommandGuide/opt.html). `opt` is +primarily an _optimizer_ (hence its name), but can also be used for +bitcode analysis. For now we are interested in its feature to generate +a _callgraph_: + +``` +> cd labs/Demos/Salsa20 +> mkdir build +> clang -c -g -emit-llvm -Iinclude -o build/salsa20.bc src/salsa20.c +> opt -dot-callgraph -o dev/null build/salsa20.bc +cryptol-course/labs/Demos/SAW/Salsa20$ opt -dot-callgraph -o /dev/null build/salsa20.bc +Writing 'build/salsa20.bc.callgraph.dot'... +``` + +(Our Salsa20 source and script are a reorganized copy of Galois's +example. Feel free to work from the copy in your Galois tutorial +workspace instead.) + +This produces a [dot](https://graphviz.org/doc/info/lang.html) graph, +which can be converted to an image viewable in a web browser: + +``` +> dot -Tpng -o build/salsa20.png build/salsa20.bc.callgraph.dot +``` + + + salsa20.bc call graph + + +(`llvm.*` are LLVM "intrinsics" that we can ignore for now.) + +We are not aware of any similar tool to graphically depict Cryptol +modules, but here is a manually generated graph for `Salsa20.cry`: + + + Salsa20.cry partial dependency graph + + +(These also call numerous functions from the Cryptol prelude, but we +can ignore most of these for now. `rotl` implements `(<<<)`, so we +include it in the graph.) + +Finally, here is another manually generated graph that shows which SAW +method specifications are verified (or not) against which +implementation functions, using which Cryptol definitions, and which +potential overrides (based on the implementation's callgraph) and +function uninterpretations (based on dependencies among Cryptol +definitions) are used, in `salsa20.saw`: + + + salsa20.saw verification/override graph + + +This assumes that each function in the implementation is specified by +one SAW method specification, which in turn refers to one Cryptol +specification that closely reflects the function being verified. This +is often not the case, but if so, shows a close correspondence between +specification and implementation, simplifying the assurance case. + +Support for +[visualizing a SAW Remote API for Python script](https://github.com/GaloisInc/saw-script/issues/1664) +is in early development. This dashboard dynamically visualizes the +override graph of a SAW Remote API for Python proof script. + +Informed by these visualizations, perhaps we can fill in some gaps... + +**EXERCISE**: Add uninterpreted functions to `llvm_verify` +instructions (replace `abc` with +`{solver}([{uninterpreted functions}])`) (selecting any solver whose +type is `[String] -> ProofScript ()`) in `salsa20.saw` that are +straightforwardly implemented by functions in `salsa20.bc`. +(`s20_quarterround` implements `quarterround`, `s20_hash` implements +`Salsa20`, etc., as depicted by dashed blue edges in the above graph.) +How does this affect proof times? What happens if you remove +overrides? + +**EXERCISE**: Based on the above `salsa20.saw` diagram, are any +potential overrides missing from other `llvm_verify` instructions in +`salsa20.saw`? (Unused potential overrides are depicted as dashed red +edges in the above graph.) Are any implementation functions left +unspecified or unused in the compositional verification? (Dashed red +edges to nodes without an entry for a SAW method specification.) How +does specifying, verifying, and adding corresponding overrides and +uninterpreted functions affect proof times? + +**EXERCISE**: This implementation does not work with 128-bit keys as +described in the specs. Add or redefine functions in `salsa20.c` and +verify them accordingly, remembering to add overrides and +uninterpreted functions to your adapted SAW script to reflect your +modified implementation. + +# Conclusion + +In this demo, you learned about compositional verification and how to +leverage various tools to analyze an implementation and specification, +and produce a "complete" verification script to increase confidence in +not only an implementation, but your understanding of it. You saw how +an implementation based closely on a specification potentially +simplifies and boosts confidence in its verification. How might this +compare to an optimized or third-party implementation that agrees with +a specification but differs substantially in its composition...? + +# Solicitation + +How was your experience with this lab? Suggestions are welcome in the +form of a ticket on the course GitHub page: +https://github.com/weaversa/cryptol-course/issues + +# From here, you can go somewhere! + +|||| +|-:|:-:|-| +|| [ ^ SAW Demos](../Demos.md) || +| [< Bit Twiddling](../Bittwiddling/Bittwiddling.md) | **Salsa20 (Galois)** || diff --git a/misc/SAWDemos.gv b/misc/SAWDemos.gv index ed56b8cd..79fc4cf9 100644 --- a/misc/SAWDemos.gv +++ b/misc/SAWDemos.gv @@ -13,20 +13,21 @@ digraph { ArithmeticVerifications [URL="../labs/Demos/SAW/ArithmeticVerifications/ArithmeticVerifications.html"]; BitTwiddling [URL="../labs/Demos/SAW/Bittwiddling/Bittwiddling.html"]; + Salsa20 [URL="../labs/Demos/SAW/Salsa20/Salsa20.html"]; // branch nodes node [fillcolor="white"]; - // newline/space labels ArithmeticVerifications [label = "Arithmetic\nVerifications"] BitTwiddling [label = "Bit Twiddling"] + Salsa20 [label = "Salsa20"] // recommended flow edge [color=red]; - ArithmeticVerifications -> BitTwiddling; + ArithmeticVerifications -> BitTwiddling -> Salsa20; // branches edge [color=black]; diff --git a/misc/SAWDemos.gv.png b/misc/SAWDemos.gv.png index 3c879684..d285d069 100644 Binary files a/misc/SAWDemos.gv.png and b/misc/SAWDemos.gv.png differ diff --git a/misc/SAWDemos.gv.svg b/misc/SAWDemos.gv.svg index ef5d34e3..df70d12c 100644 --- a/misc/SAWDemos.gv.svg +++ b/misc/SAWDemos.gv.svg @@ -1,34 +1,51 @@ - - - + + -%3 -ArithmeticVerifications + +ArithmeticVerifications - -Arithmetic -Verifications + +Arithmetic +Verifications -BitTwiddling + +BitTwiddling - -Bit Twiddling + +Bit Twiddling -ArithmeticVerifications->BitTwiddling - - + +ArithmeticVerifications->BitTwiddling + + + + + +Salsa20 + + +Salsa20 + + + + + +BitTwiddling->Salsa20 + + diff --git a/misc/Salsa20.cry.dot b/misc/Salsa20.cry.dot new file mode 100644 index 00000000..44d70d43 --- /dev/null +++ b/misc/Salsa20.cry.dot @@ -0,0 +1,25 @@ +digraph "Salsa20.cry" { + label="Salsa20.cry"; + + rotl [label="(<<<)"]; + quarterround; + rowround; + columnround; + doubleround; + littleendian; + littleendian_inverse; + Salsa20; + Salsa20_expansion; + Salsa20_encrypt; + + quarterround -> rotl; + rowround -> quarterround; + columnround -> quarterround; + doubleround -> columnround; + doubleround -> rowround; + Salsa20 -> littleendian; + Salsa20 -> littleendian_inverse; + Salsa20 -> doubleround; + Salsa20_expansion -> Salsa20; + Salsa20_encrypt -> Salsa20_expansion; +} diff --git a/misc/Salsa20.cry.png b/misc/Salsa20.cry.png new file mode 100644 index 00000000..ae51b163 Binary files /dev/null and b/misc/Salsa20.cry.png differ diff --git a/misc/salsa20.bc.callgraph.dot b/misc/salsa20.bc.callgraph.dot new file mode 100644 index 00000000..d71e7bc7 --- /dev/null +++ b/misc/salsa20.bc.callgraph.dot @@ -0,0 +1,26 @@ +strict digraph "salsa20.bc" { + label="salsa20.bc"; + + s20_crypt32; + s20_rev_littleendian; + s20_expand32; + s20_hash; + s20_littleendian; + s20_doubleround; + s20_columnround; + s20_rowround; + s20_quarterround; + rotl; + + s20_crypt32 -> s20_rev_littleendian; + s20_crypt32 -> s20_expand32; + s20_expand32 -> s20_hash; + s20_hash -> s20_littleendian; + s20_hash -> s20_doubleround; + s20_hash -> s20_rev_littleendian; + s20_doubleround -> s20_columnround; + s20_doubleround -> s20_rowround; + s20_columnround -> s20_quarterround; + s20_rowround -> s20_quarterround; + s20_quarterround -> rotl; +} diff --git a/misc/salsa20.bc.png b/misc/salsa20.bc.png new file mode 100644 index 00000000..70eedf1c Binary files /dev/null and b/misc/salsa20.bc.png differ diff --git a/misc/salsa20.saw.dot b/misc/salsa20.saw.dot new file mode 100644 index 00000000..783353a5 --- /dev/null +++ b/misc/salsa20.saw.dot @@ -0,0 +1,111 @@ +digraph "salsa20.saw.dot" { + label="salsa20.saw"; + + node [shape="plaintext"]; + r [label=< + + + + +
rotl(<<<)
+ >]; + + qr [label=< + + + +
s20_quarterroundquarterround_setupquarterround
>]; + + rr [label=< + + + +
s20_rowroundrowround_setuprowround
>]; + + cr [label=< + + + +
s20_columnroundcolumnround_setupcolumnround
>]; + + dr [label=< + + + +
s20_doublerounddoubleround_setupdoubleround
>]; + + le [label=< + + + +
s20_littleendianlittleendian
>]; + + rle [label=< + + + +
s20_rev_littleendianlittleendian_inverse
>]; + + s20 [label=< + + + +
s20_hashsalsa20_setupSalsa20
>]; + + s20e32 [label=< + + + +
s20_expand32salsa20_expansion_32Salsa20_expansion
>]; + + s20enc_63 [label=< + + + +
s20_crypt32s20_encrypt32 63Salsa20_encrypt
>]; + + s20enc_64 [label=< + + + +
s20_crypt32s20_encrypt32 64Salsa20_encrypt
>]; + + s20enc_65 [label=< + + + +
s20_crypt32s20_encrypt32 65Salsa20_encrypt
>]; + + edge [color="red", style="solid"]; + rr:i -> qr:i; + cr:i -> qr:i; + dr:i -> cr:i; + dr:i -> rr:i; + s20:i -> dr:i; + s20e32:i -> s20:i; + s20enc_63:i -> s20e32:i; + s20enc_64:i -> s20e32:i; + s20enc_65:i -> s20e32:i; + + edge [style="dashed"]; + qr:i -> r:i; + s20:i -> le:i; + s20:i -> rle:i; + s20enc_63:i -> rle:i; + s20enc_64:i -> rle:i; + s20enc_65:i -> rle:i; + + edge [color="blue", style="dashed"]; + qr:c -> r:c; + rr:c -> qr:c; + cr:c -> qr:c; + dr:c -> cr:c; + dr:c -> rr:c; + s20:c -> le:c; + s20:c -> rle:c; + s20:c -> dr:c; + s20e32:c -> s20:c; + s20enc_63:c -> s20e32:c; + s20enc_64:c -> s20e32:c; + s20enc_65:c -> s20e32:c; +} diff --git a/misc/salsa20.saw.png b/misc/salsa20.saw.png new file mode 100644 index 00000000..70f0db6c Binary files /dev/null and b/misc/salsa20.saw.png differ