Introduction to program verification

Anton Trunov (Zilliqa Research)

11.03.2021

Course administrivia

  • ~ 10 lectures
  • ~ 10 seminars
  • There will be homework!

Course administrivia

  • You need to install and use Coq and Mathcomp
  • You need to have a laptop to bring to class
  • Prerequisites: it'd be nice if you were familiar with
    • Basic functional programming
    • Basic logic

Communication

  • Let's make this interactive
  • Ask questions as we go
  • Help steer the course
  • Course chat

Course reading

Course outline

  • Proof engineering with just the right amount of theory
  • Focus on verification of functional algorithms
  • SSReflect/Mathcomp architecture

What is formal program verification?

  • A technique for increasing assurance that software is correct: one proves, using formal mathematical methods, that it satisfies a given formal specification
  • Formal ~ has a syntax and can be given a semantics

Why is verification important?

  • Ensure systems meet their specification, ruling out bugs like those behind:

    • Therac-25
    • Ariane 5 Disaster, Mars Climate Orbiter, Mariner 1, Patriot missile
    • The Pentium bug
    • The DAO Attack

Why is verification important?

  • Gain insight into the system at hand

Components of formal verification

  • Specification
  • Implementation
  • Formal proof
  • Checker
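
A minimal sketch of how the four components can look in Coq, written in plain Coq rather than SSReflect. The names `double` and `double_spec` are illustrative and not taken from the course material. The function is the implementation, the lemma statement is the specification, the tactic script builds the formal proof, and Coq's kernel acts as the checker when it accepts `Qed`.

```coq
(* Implementation: an executable function. *)
Fixpoint double (n : nat) : nat :=
  match n with
  | O => O
  | S n' => S (S (double n'))
  end.

(* Specification: the statement of the lemma.
   Formal proof: the tactic script between Proof and Qed.
   Checker: the Coq kernel re-checks the resulting proof term at Qed. *)
Lemma double_spec (n : nat) : double n = n + n.
Proof.
  induction n as [| n' IH].
  - reflexivity.
  - simpl. rewrite IH. rewrite <- plus_n_Sm. reflexivity.
Qed.
```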

Formal specification

  • A means to describe a system
  • Specifying systems is hard and is a form of art!
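
One way to see why specifying is hard: a specification that sounds reasonable can be too weak. In the sketch below (all names are hypothetical), "the output is sorted" is provable for a function that throws its input away, because the specification forgets to require that the output is a permutation of the input.

```coq
Require Import List.
Import ListNotations.

(* A candidate spec for sorting: the result is in non-decreasing order. *)
Inductive is_sorted : list nat -> Prop :=
| sorted_nil : is_sorted []
| sorted_one (x : nat) : is_sorted [x]
| sorted_cons (x y : nat) (l : list nat) :
    x <= y -> is_sorted (y :: l) -> is_sorted (x :: y :: l).

(* A useless "sorting" function that ignores its argument. *)
Definition bogus_sort (l : list nat) : list nat := [].

(* It still satisfies the spec, so the spec alone is too weak. *)
Lemma bogus_sort_sorted (l : list nat) : is_sorted (bogus_sort l).
Proof. unfold bogus_sort. apply sorted_nil. Qed.
```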

Formal proof

  • A formal proof is a proof in which every logical inference has been checked all the way back to the fundamental axioms (A definition by T.C. Hales)
  • All the intermediate logical steps are supplied, without exception
  • No appeal is made to intuition, even if the translation from intuition to logic is routine
  • A formal proof is less intuitive, and yet less susceptible to logical errors

There are lots of formal systems

  • Not all formalisms are created equal
  • E.g. to expand the definition of the number 1 fully in terms of Bourbaki primitives requires over 4 trillion symbols
  • With formal proofs one wants as much help as one can get

Formal methods techniques

The land of formal methods includes

  • Interactive theorem provers (e.g. Coq)
  • Automated theorem provers (SAT/SMT solvers, …)
  • Specification languages & Model checking
  • Program Logics

What is Coq?

Coq is a formal proof management system. It provides

  • a language to write
    • mathematical definitions,
    • executable algorithms,
    • theorems (specifications);
  • an environment for interactive development of machine-checked proofs.

Related systems

  • Lean prover (similar to Coq)
  • F* (used to verify crypto code in Firefox)
  • Isabelle/HOL (simple type theory, seL4)
  • Idris (honed towards programming)
  • Agda

Why Coq?

  • Expressive
  • Industrial adoption
  • Mature and battle-tested
  • Lots of books and tutorials
  • Lots of libraries
  • Excellent community

What do people use Coq for?

  • Formalization of mathematics:
    • Four color theorem
    • Feit-Thompson theorem
    • Homotopy type theory
  • Education: it's a proof assistant.
  • Industry: CompCert (at Airbus)

More examples

  • Coq-generated crypto code in Chrome
  • FSCQ: a file system written and verified in Coq
  • Armada: verifying concurrent storage systems
  • Cryptocurrencies (e.g. Tezos, Zilliqa)

Coq, its ecosystem and community

Coq repo stats (LoC)

Language        Files    Code
OCaml             949  203230
Coq              1970  196057
TeX                26    5270
Markdown           22    3362
Bourne Shell      107    2839

Mathcomp stats (LoC)

Language      Files    Code
HTML :)         377  299260
Coq              92   83726
JavaScript       13   34363
CSS               6    1199

Proofs and Tests

  • @vj_chidambaram: Even verified file systems have unverified parts :)
  • FSCQ had a buggy optimization in the Haskell-C bindings
  • CompCert is also known to have had bugs in its non-verified parts, invalid axioms, and "out of verification scope" bugs

Proofs and Tests

  • QuickChick shows how well randomized testing applies in the context of theorem proving
  • Real-world verification projects have assumptions that might not be true

FSCQ stats (LoC)

Language    Files   Code
Coq            98  81049
C              36   4132
Haskell         8   1091
OCaml          10    687
Python          9    643

CompCert C Compiler stats (LoC)

Language        Files    Code
Coq               223  146226
C                 223   65053
OCaml             147   28381
C/C++ Header       86    7834
Assembly           59    1542

What is Coq based on?

Calculus of Inductive Constructions

Just some keywords:

  • Higher-order constructivist logic
  • Dependent types (expressivity!)
  • Curry-Howard Correspondence

Curry-Howard Correspondence

  • Main idea:
    • propositions are a special case of types
    • a proof is a program of the required type
  • One language to rule 'em all
  • Proof checking = Type checking!
  • Proving = Programming
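
A small sketch of the correspondence in plain Coq. The names below are illustrative, not from the course material. Each proposition is read as a type, and a proof of it is a term (a program) of that type; a tactic script is just another way of building such a term.

```coq
(* Implication is the function type: a proof of B from A and A -> B
   is function application. *)
Definition modus_ponens (A B : Prop) (a : A) (f : A -> B) : B := f a.

(* The same statement proved with tactics; the script constructs a term. *)
Lemma modus_ponens_tac (A B : Prop) : A -> (A -> B) -> B.
Proof. intros a f. apply f. exact a. Qed.

(* Show the proof term that the tactics produced. *)
Print modus_ponens_tac.

(* Conjunction behaves like a pair type: swapping the components
   proves that A /\ B implies B /\ A. *)
Definition and_swap (A B : Prop) (p : A /\ B) : B /\ A :=
  match p with
  | conj a b => conj b a
  end.
```

Type checking these definitions is exactly what checking the proofs amounts to.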

Proving is programming

  • High confidence in your code
  • It is only as strong as your specs are (trust!)
  • It can be extremely hard to come up with a spec (think of browsers)
  • IMHO: the best kind of programming

Coq as Programming Language

  • Functional
  • Dependently-typed
  • Total language
  • Extraction
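
A minimal sketch of the last two points, assuming a standard Coq installation with the extraction plugin. The function name `sum_to` is hypothetical. Coq accepts the definition because the recursive call is on a structurally smaller argument, which is how totality is enforced; the `Extraction` command then prints corresponding OCaml code.

```coq
Require Extraction.

(* Accepted as total: the recursive call is on n', a subterm of n. *)
Fixpoint sum_to (n : nat) : nat :=
  match n with
  | O => O
  | S n' => n + sum_to n'
  end.

(* Print the OCaml code extracted from the definition above. *)
Extraction Language OCaml.
Extraction sum_to.
```

`Recursive Extraction sum_to.` would also print the definitions that `sum_to` depends on, such as `nat` and addition.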

Extraction

Extraction: xmonad

Extraction: toychain

certichain / toychain - A Coq implementation of a minimalistic blockchain-based consensus protocol

Embedding

  • hs-to-coq - Haskell to Coq converter
  • coq-of-ocaml - OCaml to Coq converter
  • goose - Go to Coq conversion
  • clightgen (VST)
  • fiat-crypto - Synthesizing Correct-by-Construction Code for Cryptographic Primitives

hs-to-coq - Haskell to Coq converter

  • part of the CoreSpec component of the DeepSpec project
  • has been applied to verify Haskell’s containers library against specs derived from
    • type class laws;
    • library’s test suite;
    • interfaces from Coq’s stdlib.
  • challenge: partiality
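
To illustrate why partiality is a challenge: Haskell's `head` fails on the empty list, while Coq only accepts total functions, so a direct transcription is rejected. The sketch below shows one common workaround in plain Coq, returning an `option`. It is a generic illustration with a hypothetical name, not a description of how hs-to-coq itself translates partial functions.

```coq
Require Import List.

(* A total version of head: the empty-list case must be handled explicitly. *)
Definition safe_head {A : Type} (l : list A) : option A :=
  match l with
  | nil => None
  | x :: _ => Some x
  end.
```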

Machine Learning

Suggested reading (papers)

  • "Formal Proof" - T.C. Hales (2008)
  • "Position paper: the science of deep specification" - A.W. Appel (2017)
  • "QED at Large: A Survey of Engineering of Formally Verified Software" - T. Ringer, K. Palmskog, I. Sergey, M. Gligoric, Z. Tatlock (2019)