Consider this silly code:
trait MyTrait {
fn foo(&self);
}
struct S1;
impl MyTrait for S1 {
fn foo(&self) {
println!("S1::foo()");
}
}
fn call_foo<T>(t: &T) where T: MyTrait {
t.foo();
}
fn main() {
let s1 = S1{};
call_foo(&s1);
}
This seems fine so far.
Now, let’s suppose we have a collection of MyTraits, like this:
// Previous code not shown.
struct S2;
impl MyTrait for S2 {
fn foo(&self) {
println!("S2::foo()");
}
}
fn main() {
let v: Vec<&dyn MyTrait> = vec![&S1{}, &S2{}];
for x in v {
call_foo(x);
}
}
This produces this compilation error:
Compiling playground v0.0.1 (/playground)
error[E0277]: the size for values of type `dyn MyTrait` cannot be known at compilation time
--> src/main.rs:28:18
|
28 | call_foo(x);
| -------- ^ doesn't have a size known at compile-time
| |
| required by a bound introduced by this call
|
= help: the trait `Sized` is not implemented for `dyn MyTrait`
The problem is that Rust generics are monomorphized: the compiler stamps out a separate copy of call_foo for each concrete T, and every type parameter carries an implicit Sized bound. Monomorphization is not supported for trait objects because dyn MyTrait is unsized, so it can never satisfy that bound.
call_foo is a colored function.
The code doesn’t compile because trait objects are the wrong color.
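Concretely, the implicit Sized bound named in the error is the blocker: relaxing it with ?Sized lets the same generic accept dyn MyTrait. A minimal sketch, with foo returning a string instead of printing (a liberty taken so the result is easy to check):

```rust
trait MyTrait {
    fn foo(&self) -> &'static str;
}

struct S1;
struct S2;

impl MyTrait for S1 {
    fn foo(&self) -> &'static str { "S1::foo()" }
}
impl MyTrait for S2 {
    fn foo(&self) -> &'static str { "S2::foo()" }
}

// `?Sized` removes the implicit `Sized` bound, so `T` may be `dyn MyTrait`.
fn call_foo<T: MyTrait + ?Sized>(t: &T) -> &'static str {
    t.foo()
}

fn main() {
    let v: Vec<&dyn MyTrait> = vec![&S1, &S2];
    for x in v {
        println!("{}", call_foo(x)); // prints S1::foo(), then S2::foo()
    }
}
```

Of course, this only helps when you control the generic function; as the next example shows, that isn't always the case.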
Here’s a real example: The Rust bindings for interacting with the Z3
theorem prover have a trait
z3::ast::Ast to represent terms, constants, and expressions. As
you’re building a theory, you may want to maintain a vector of your
constants in a Vec<Box<dyn z3::ast::Ast>>. Once Z3 has constructed a
model that satisfies your theory, you’ll probably want to query the model for
the values of constants via the method pub fn get_const_interp<T: Ast<'ctx>>(&self, ast: &T) -> Option<T>.
Well, you just shot your foot off. You can’t call this method on a trait object, so now you need to redo the work you just did. And the new code is going to be a whole lot uglier.
In contrast to the orthodox Rust opinion, we should prefer to use trait objects unless we explicitly need to combine multiple trait bounds or dynamic dispatch is a performance issue. Here’s what I mean:
// Previous code not shown.
fn call_foo(x: &dyn MyTrait) {
x.foo();
}
fn main() {
let v: Vec<&dyn MyTrait> = vec![&S1{}, &S2{}];
for x in v {
call_foo(x);
}
}
Note that this trait object is general enough to work with many data
structures. For example, we can still use a Box with this implementation:
// Previous code not shown.
fn main() {
let v2: Vec<std::boxed::Box<dyn MyTrait>> = vec![std::boxed::Box::new(S1{}),
std::boxed::Box::new(S2{})];
for x in &v2 {
call_foo(x.as_ref());
}
call_foo(v2[0].as_ref());
}
And, of course, we can still use call_foo on a specific instance:
// Previous code not shown.
fn main() {
let s = S1{};
call_foo(&s);
}
You should just always implement your traits for trait objects:
// Previous code not shown.
impl MyTrait for &dyn MyTrait {
fn foo(&self) {
(**self).foo();
}
}
fn call_foo<T>(x: &T) where T: MyTrait {
x.foo();
}
fn main() {
let v: Vec<&dyn MyTrait> = vec![&S1{}, &S2{}];
for x in v {
call_foo(&x);
}
}
Note that this code also works on other kinds of trait objects:
// Previous code not shown.
fn main() {
let v2: Vec<std::boxed::Box<dyn MyTrait>> = vec![std::boxed::Box::new(S1{}),
std::boxed::Box::new(S2{})];
for x in &v2 {
call_foo(&x.as_ref());
}
call_foo(&v2[0].as_ref());
let v3: Vec<std::rc::Rc<dyn MyTrait>> = vec![std::rc::Rc::new(S1{}),
std::rc::Rc::new(S2{})];
for x in &v3 {
call_foo(&x.as_ref());
}
call_foo(&v3[0].as_ref());
}
If you create a trait, then you must be the one that implements it for trait objects. Per the coherence rule, a trait can only be implemented for a type by the crate that defines the trait or the crate that defines the type.
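Since the defining crate is the only place this impl can live, it may as well be a single blanket impl over references, which covers &dyn MyTrait and plain &S1 alike. A sketch, again with foo returning a string for testability:

```rust
trait MyTrait {
    fn foo(&self) -> &'static str;
}

struct S1;

impl MyTrait for S1 {
    fn foo(&self) -> &'static str { "S1::foo()" }
}

// Blanket impl: a reference to any MyTrait (sized or not) is itself a MyTrait.
// Coherence permits this here because this crate defines MyTrait.
impl<'a, T: MyTrait + ?Sized> MyTrait for &'a T {
    fn foo(&self) -> &'static str { (**self).foo() }
}

fn call_foo<T: MyTrait>(t: &T) -> &'static str {
    t.foo()
}

fn main() {
    let d: &dyn MyTrait = &S1;
    println!("{}", call_foo(&d)); // T = &dyn MyTrait, resolved via the blanket impl
}
```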
There’s a lot of code in the wild that shares the same pain-point as the Z3 example I mentioned. It shouldn’t be difficult to use generics. Effective Rust does explain the reasons for the current design rather well, but I feel this is an area that could be improved.
Programs communicate – whether with other programs or humans. Software developers write programs with a protocol in mind. Sometimes there’s documentation for the protocol. But there’s no mechanism that keeps implementation and documentation in sync. Bugs occur when protocols diverge.
Many of us already use type systems. But naive approaches to typing
fall short of guaranteeing that an implementation speaks a
protocol. For example: Suppose two threads T1 and T2 communicate
over a channel chan. T1 and T2 play a guessing game. T1
guesses a number (int) and T2 informs T1 if the guess is right
(bool). We might type chan as Chan<std::variant<int,
bool>>. This isn’t helpful, though. If T1 sends a bool the
program should not compile, yet it does.
Session types are a tool that solves this problem. This post discusses an implementation of session types in C++. You’ll learn more about how you can use session types to specify protocols. You’ll also see some features in C++ (concepts and template meta-programming) you might not know how to use today.
All code is available on GitHub.
Instead of two threads playing a guessing game, let’s make a game for humans. First, the computer generates a random number between 1 and 100. Second, the computer prompts the user to guess the number. Then, the user enters a guess. Next, the computer evaluates the user’s guess. If the guess is correct then the program sends a congratulatory message and exits. If the guess is wrong then the program asks the user if they give up. The user keeps guessing the generated number until they get it right or give up.
This listing shows how we might specify this protocol with session types:
template <HasDual P>
using QueryUserProtocol = Send<std::string, Recv<int, P>>;
using KeepPlayingProtocol = Send<std::string, Recv<std::string, Var<Z>>>;
using ExitUserLost = Send<std::string, Send<int, Send<std::string, Send<std::ostream&(std::ostream&), Z>>>>;
using ExitUserWon = Send<std::string, Send<std::ostream&(std::ostream&), Z>>;
using ExitProtocol = Choose<ExitUserLost, ExitUserWon>;
using GuessingGameProtocol =
Rec<Choose<QueryUserProtocol<Choose<KeepPlayingProtocol, Var<Z>>>,
ExitProtocol>>;
Let’s unpack:
- Rec<P> introduces a recursive protocol. It allows the protocol to repeat itself using Var.
- Choose<P1, P2> allows the implementation to make a choice between protocols P1 and P2. Choose<QueryUserProtocol<...>, ExitProtocol> represents a choice between asking the user for another guess and terminating.
- Send<T1, P> represents that the implementation sends a value of type T1 then executes the protocol P. Similarly, Recv receives.
- Var<N> accepts a natural number – either Z or Succ<M> – and returns to the recursive environment N levels out.

Here’s what an implementation of this protocol might look like:
int main() {
std::default_random_engine generator;
generator.seed(time(nullptr));
std::uniform_int_distribution distribution(1, 100);
const auto the_number = distribution(generator);
auto keep_going = true;
auto guess = 0;
Chan<GuessingGameProtocol, decltype(&std::cin), decltype(&std::cout)> chan(&std::cin, &std::cout);
while (keep_going) {
auto c1 = chan.enter().choose1();
auto c2 = c1 << "Guess: ";
auto c3 = c2 >> guess;
keep_going = guess != the_number;
if (keep_going) {
auto c4 = c3.choose1();
auto c5 = c4 << "Incorrect. Keep playing? (y/n) ";
std::string response;
auto c6 = c5 >> response;
keep_going = response != "n";
chan = c6.ret();
} else {
chan = c3.choose2().ret();
}
}
if (guess != the_number) {
auto ce = chan.enter().choose2().choose1();
ce << "You lose. I was thinking of " << the_number << "." << std::endl;
} else {
auto ce = chan.enter().choose2().choose2();
ce << "You win!" << std::endl;
}
}
Some explanations are in order:
- The Chan type represents a session typed communication channel. It encapsulates some other input and output mechanisms. In this case, cin and cout.
- We drive a Chan by calling methods. Following a method call, it is illegal to reuse the Chan – doing so triggers a run time error. Operations return new channels that speak the proper protocol.
- chan.enter() enters a recursive context.
- Chan<Choose<P1, P2>>::choose1() returns a channel that speaks P1. Chan<Choose<P1, P2>>::choose2() returns a channel that speaks P2.
- Chan<Recv<T, P>>::operator>>(T &t) reads a value from the channel’s input stream into t. It returns a channel that speaks P. operator<<(const T &t) behaves similarly.
- Chan<Var<N>>::ret() returns a channel that speaks the Nth recursive protocol defined in the original type.

Combined, this provides a stronger guarantee than what we had before: Programs always send the right shaped data for the protocol, or send nothing.
When two threads communicate over a channel it’s important that they
speak the same protocol. Our intuition tells us that every Send<T, ...>
should have a corresponding Recv<T, ...>, etc. We call this
duality. We desire that our type system only allow two threads to
communicate over the channel if they are each other’s duals.
This next listing shows part of an implementation of a program with two
threads: T1 and T2. T1 sends a value to T2, who responds with
that value doubled.
#include <cstdio>
#include <iostream>
#include <memory>
#include "sesstypes.hh"
#include "concurrentmedium.hh"
using Protocol = Rec<Send<int, Recv<int, Var<Z>>>>;
void log(const std::string &tname, const std::string &action, int val) {
printf("%s %s %d\n", tname.c_str(), action.c_str(), val);
}
void log(const std::string &tname, const std::string &action) {
printf("%s %s\n", tname.c_str(), action.c_str());
}
struct {
template <typename CommunicationMedium>
void operator()(Chan<Protocol, CommunicationMedium, CommunicationMedium> chan) {
auto c = chan.enter();
for (int i = 0; i < 5; i++) {
auto c1 = c << i;
log("T1", "sent", i);
int val;
auto c2 = c1 >> val;
log("T1", "received", val);
c = c2.ret().enter();
}
log("T1", "done", -1);
}
} t1;
// t2, the dual of t1 (it doubles each value it receives), is defined similarly; not shown.
int main() {
auto chan = std::make_shared<ConcurrentMedium<ProtocolTypes<Protocol>>>();
auto threads = connect<Protocol>(t1, t2, chan);
threads.first.join();
threads.second.join();
}
Critically, we are only allowed to call connect<Protocol>(t1, t2, chan) if
t2 is the dual of t1. This requirement is enforced at compile time.
Now that we have a better idea about what session types are, let’s see how they are implemented.
Duality is critical to our concurrent motivating example. The idea
that a type has a dual can be captured using a
concept. Concepts
are named boolean predicates that restrict template parameters.
Take the definition of the Recv type:
template <typename T, HasDual P>
struct Recv {
using dual = Send<T, typename P::dual>;
};
Recv defines dual as its opposite, Send. Since Recv requires
that the protocol P has a dual, we constrain P to types where
HasDual evaluates to true.
Here’s the implementation of HasDual:
template <typename T>
concept HasDual = requires { typename T::dual; };
This introduces another new feature of C++: The requires
expression. requires { typename T::dual; } evaluates to true if
typename T::dual compiles. Otherwise, it evaluates to false. (By
the way, it’s illegal for a requires expression to always fail to
compile.)
Concepts are great because they improve compiler error messages. We’ve all seen the error vomit C++ compilers produce when template expansion fails. Concepts eliminate much of the noise to help us debug.
Remember that Var uses a natural number to decide how many levels of
recursion to return from. Let’s see how our natural numbers are implemented.
Here’s a naive way to implement natural numbers:
struct Z {};
template <typename T>
struct Succ {};
This definition allows us to write real natural numbers like
Succ<Succ<Z>>. The problem is that it also allows us to write things
that aren’t natural numbers, like Succ<int>. Given that this post is
about radical type checking, we should not be satisfied with this.
Instead, we use template metaprogramming to enforce that a type is a natural number. There are two ways for a type to be a natural number:

- It is Z.
- It is Succ<M> and M is a natural number.

Here’s how we define a concept IsNat to check that a type is a natural number:
template <typename T>
struct IsNatImpl : std::false_type {};
template <>
struct IsNatImpl<Z> : std::true_type {};
template <typename M>
struct IsNatImpl<Succ<M>>
: std::conditional_t<
IsNatImpl<M>::value,
std::true_type,
std::false_type
> {};
template <typename T>
concept IsNat = IsNatImpl<T>::value;
The type_traits header provides std::true_type and
std::false_type as canonical representations of true and false
at the type level. The default implementation of IsNatImpl inherits
from false_type, so its value member is false. The Z
specialization inherits from true_type, so its value member is
true.
The last specialization is kind of tricky. conditional_t<Condition, A, B>
is A when Condition is true and B otherwise. So we recursively check that
IsNatImpl<M>::value is true. If so, then Succ<M> is a natural number,
and so we inherit from true_type.
This lets us write a more correct version of natural numbers:
template <typename T>
struct Succ;
// Code for IsNat.
template <>
struct Succ<Z> {};
template <IsNat M>
struct Succ<M> {};
Here we discuss the implementation of the Chan type.
Since recursion is the hardest thing we have to support, we’ll describe it first. It has far-reaching implications.
The idea is to represent a channel as a Chan<Protocol, IT, OT, E>.
Protocol is the protocol type. For example, Recv<int, Send<int, Z>>.
IT and OT are the types of the underlying input and output mechanisms.
E (for environment) is kind of like a stack. Here’s what I mean:
template <HasDual P, typename IT, typename OT, typename E>
class Chan<Rec<P>, IT, OT, E> : ChanBase<IT, OT> {
public:
using ChanBase<IT, OT>::ChanBase;
Chan<P, IT, OT, std::pair<Rec<P>, E>> enter() {
// Implementation not shown.
}
};
So, Chan is specialized on recursive protocols. It provides only one method, enter.
This makes it impossible to try to read from a recursive protocol, for example.
The enter method for a protocol Rec<P> pushes P onto a stack. Since this all occurs in
the type system, we represent the stack as a std::pair.
This allows us to define Var<N>, which pops N levels from the environment:
template <HasDual P, typename IT, typename OT, typename E>
class Chan<Var<Z>, IT, OT, std::pair<P, E>> : ChanBase<IT, OT> {
public:
using ChanBase<IT, OT>::ChanBase;
Chan<P, IT, OT, E> ret() {
// Implementation not shown.
}
};
template <typename T, HasDual P, typename IT, typename OT, typename E>
class Chan<Var<Succ<T>>, IT, OT, std::pair<P, E>> : ChanBase<IT, OT> {
public:
using ChanBase<IT, OT>::ChanBase;
Chan<Var<T>, IT, OT, E> ret() {
// Implementation not shown.
}
};
This is sort of recursive. In the base case, ret returns a channel whose
protocol is the top of the environment stack. Otherwise, for Var<N>, ret returns a channel
that also speaks Var. Only this time, it’s Var<N - 1>.
Chan is specialized for all of the types with duals. For example, here’s Chan<Recv<...>, ...>:
template <typename T, HasDual P, typename IT, typename OT, typename E>
class Chan<Recv<T, P>, IT, OT, E> : ChanBase<IT, OT> {
public:
using ChanBase<IT, OT>::ChanBase;
Chan<P, IT, OT, E> operator>>(T &t) {
if (ChanBase<IT, OT>::used) {
throw ChannelReusedError();
}
ChanBase<IT, OT>::used = true;
(*ChanBase<IT, OT>::input) >> t;
return Chan<P, IT, OT, E>(ChanBase<IT, OT>::input, ChanBase<IT, OT>::output);
}
};
Since it’s specialized, the only thing we can do with a Chan<Recv<...>> is
use operator>>. This prevents a large number of mistakes – we can’t send
an integer at an unexpected time, for example.
The second motivating example uses ConcurrentMedium to create a
Chan, instead of cin and cout. This allows two threads to
communicate over a channel. This section describes the design of ConcurrentMedium.
We store writes in a std::variant, a type-safe union. So, the type
ConcurrentMedium<std::variant<int, std::string>> can communicate
values of type int or std::string.
This listing shows this implementation:
template <typename... Ts>
class ConcurrentMedium<std::variant<Ts...>> {
public:
ConcurrentMedium()
: was_read(true), writers_waiting(0), readers_waiting(0) {}
template <typename T>
ConcurrentMedium& operator<<(const T &value) {
std::unique_lock held_lock(lock);
while (!was_read) {
// Needs to be in a while loop to ignore "spurious wakeups".
// https://en.cppreference.com/w/cpp/thread/condition_variable/wait
writers_waiting++;
writer_cv.wait(held_lock);
writers_waiting--;
}
data = value;
was_read = false;
write_source = std::this_thread::get_id();
if (readers_waiting > 0) {
reader_cv.notify_one();
}
return *this;
}
template <typename T>
ConcurrentMedium& operator>>(T &datum) {
std::unique_lock held_lock(lock);
while (write_source == std::this_thread::get_id() || was_read) {
readers_waiting++;
reader_cv.wait(held_lock);
readers_waiting--;
}
datum = std::get<T>(data);
was_read = true;
if (writers_waiting > 0) {
writer_cv.notify_one();
}
return *this;
}
private:
std::mutex lock;
int readers_waiting;
std::condition_variable reader_cv;
int writers_waiting;
std::condition_variable writer_cv;
std::variant<Ts...> data;
std::thread::id write_source;
bool was_read;
};
You may notice a small problem with operator>> and operator<<:
They accept any type T, but we are only able to read/write T if
it is part of the variant.
The way we’re going to solve this problem is to create a concept
AssignableToVariant<T, V> that is true whenever T can be written
to the variant V. AssignableToVariant is written by using a
template meta-program called OneOf. Here are the implementations:
template <typename T, typename V>
struct OneOf : public std::false_type {};
template <typename T, typename... Ts>
struct OneOf<T, std::variant<Ts...>> : public std::conditional_t<
(std::is_same_v<T, Ts> || ...),
std::true_type,
std::false_type
>
{};
template <typename T, typename V>
concept AssignableToVariant = OneOf<T, V>::value;
This is similar to IsNatImpl. The syntax (std::is_same_v<T, Ts> || ...)
is called a fold expression.
It essentially rewrites the original expression into
(std::is_same_v<T, Ts[0]> || ... || std::is_same_v<T, Ts[N]>),
although Ts[0] is not real syntax.
These are the updated signatures for operator<< and operator>>:
template <AssignableToVariant<std::variant<Ts...>> T>
ConcurrentMedium& operator<<(const T &value);
template <AssignableToVariant<std::variant<Ts...>> T>
ConcurrentMedium& operator>>(T &value);
ConcurrentMedium is hard to use. If we have a protocol Send<int, Recv<std::string, ...>>,
it is time-consuming and error-prone to keep writing ConcurrentMedium<std::variant<int, std::string, ...>>.
Plus, we have to exert effort to keep the variant and the protocol in sync.
To solve this problem, we’ll create another template meta-program called ProtocolTypes.
ProtocolTypes<Send<int, Recv<std::string, ...>>> automatically creates a std::variant<int, std::string, ...>.
Here’s the implementation:
template <typename Variant, typename T>
struct ProtocolTypesImpl;
template <typename... Ts, typename T, HasDual P>
struct ProtocolTypesImpl<std::variant<Ts...>, Recv<T, P>> {
using type = typename ProtocolTypesImpl<std::variant<T, Ts...>, P>::type;
};
template <typename... Ts>
struct ProtocolTypesImpl<std::variant<Ts...>, Z> {
using type = std::variant<Z, Ts...>;
};
template <typename... Ts, typename T, HasDual P>
struct ProtocolTypesImpl<std::variant<Ts...>, Send<T, P>> {
using type = typename ProtocolTypesImpl<std::variant<T, Ts...>, P>::type;
};
template <typename... Ts, HasDual P>
struct ProtocolTypesImpl<std::variant<Ts...>, Rec<P>> {
using type = typename ProtocolTypesImpl<std::variant<Ts...>, P>::type;
};
template <typename... Ts, IsNat N>
struct ProtocolTypesImpl<std::variant<Ts...>, Var<N>> {
using type = std::variant<Ts...>;
};
template <HasDual P>
using ProtocolTypes = ProtocolTypesImpl<std::variant<>, P>::type;
Of course, there’s a small problem with this implementation. Namely,
if we have a protocol Send<int, Recv<int, Z>>, we create
a std::variant<int, int>. Then, std::get<int>(data) is illegal because
the type int does not uniquely index the variant. We need all types to be
unique.
Once again, we use a template meta-program to implement this idea:
template <typename T, typename... Ts>
struct Unique : std::type_identity<T> {};
template <typename... Ts, typename U, typename... Us>
struct Unique<std::variant<Ts...>, U, Us...>
: std::conditional_t<(std::is_same_v<U, Ts> || ...),
Unique<std::variant<Ts...>, Us...>,
Unique<std::variant<Ts..., U>, Us...>> {};
template <typename T>
struct MakeUniqueVariantImpl;
template <typename... Ts>
struct MakeUniqueVariantImpl<std::variant<Ts...>> {
using type = typename Unique<std::variant<>, Ts...>::type;
};
template <typename T>
using MakeUniqueVariant = typename MakeUniqueVariantImpl<T>::type;
And we revise ProtocolTypes:
template <HasDual P>
using ProtocolTypes = MakeUniqueVariant<typename ProtocolTypesImpl<std::variant<>, P>::type>;
Now we can easily type a ConcurrentMedium: ConcurrentMedium<ProtocolTypes<Protocol>>.
There are (at least) two important ways this approach is not sound:
This approach does help us describe communications between exactly two entities. But here are some scenarios that this specific approach doesn’t help:
This writing has three goals. First, I want to showcase how far formal methods have come. Second, there is not a lot of material discussing how to use formal methods, and particularly F*. I hope others are able to learn from my mistakes, and newcomers can pick up some proof-engineering strategies. Finally, I want to draw attention to some current pain-points for the sake of improving current formal methods research.
The F* tutorial has an editor you can interact with in your browser. You can follow along with these examples there, without downloading any additional software.
Contents:
F* is a complex language, and I am but a journeyman. The purpose of this section is only to familiarize you, gentle reader, with enough F* to broadly understand this post’s verification efforts. If you are interested in learning more, check out the F* tutorial.
F* is inspired by ML languages. You can define simple functions like this:
let double (x: int) : int
= x + x
This just defines a function called double that accepts an int as
a parameter, and returns an int. Note that in F* int refers to a
mathematical integer, not a fixed-size integer as in C. This means
that the value of x can be arbitrarily large (or small).
Note that we may want to define double like:
let double (x: int) : int
= x * 2
But this simple definition won’t work because * is reserved by F*
for constructing tuples. F* tells us this fact with an informative
error message:
(Error 189) Expected expression of type "Type"; got expression "x" of
type "Prims.int"
Instead, we have to import a definition that
redefines * to refer to multiplication. We do this by opening a
module. This definition works:
open FStar.Mul
let double (x: int) : int
= x * 2
In a dependently typed programming language, types are permitted to
depend on values. Let’s consider the double example:
let double (x: int) : (result: int{result = x + x})
= x * 2
We changed the return type of double to (result: int{result = x +
x}). This is called a refinement type. This is a dependent type
because the type depends on the value of x (as well as the return
value of double). Note that there is nothing special about the name
result – we just needed a name to refer to the return value of
double in the refinement type. Any name would work.
Interestingly, notice that x + x is not syntactically the same as
x * 2. F* is aware of the semantics of the * operator and the +
operator, and automatically proved that x * 2 = x + x. This
highlights the power of F*: Many facts can be proven with little
effort.
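To see refinements reject bad code as well as accept good code, here is one more tiny example of my own in the same style; F* checks the refinement in both branches automatically:

```fstar
let abs_val (x: int) : (r: int{r >= 0})
  = if x >= 0 then x else -x
```

If we weakened the else branch to return x, the SMT solver could no longer prove r >= 0 and type checking would fail.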
In F*, assert statements check that a condition is true at
proof-time (i.e., before the code runs). This is done by proving the
condition asserted. Here is a simple example:
let _ = assert (true)
Of course, the proposition true is always provable (true is
true).
Here’s an example of a proposition that cannot be proved:
let _ = assert (false)
This produces this error message:
(Error 19) assertion failed; The SMT solver could not prove the query. Use --query_stats for more details.
Of course false cannot be proved (false is never true).
These examples are rather boring. Let’s consider an example that uses more interesting pieces of logic:
let _ = assert (forall (x: nat) (y: nat) .
y >= x ==>
(exists (z: nat) .
y = x + z))
In more familiar logic, we’d write this as $\forall x, y . y >= x \implies \exists z . y = x + z$.
But if we try to verify our assertion with F*, it fails:
(Error 19) assertion failed; The SMT solver could not prove the query. Use --query_stats for more details.
Under the hood, F* uses the Z3 SMT solver to perform proofs. While Z3 is powerful, no theorem prover can automatically prove all theorems. Z3 appears stuck here. Let’s try adding hints to help Z3 get unstuck:
open FStar.Mul
open FStar.Tactics
let _ = assert (forall (x: nat) (y: nat) .
y >= x ==>
(exists (z: nat) .
y = x + z))
by (
let x = forall_intro () in
let y = forall_intro () in
let imp = implies_intro () in
witness (`(`#y - `#x));
dump "after witness"
)
We provide hints by using tactics, which are programs that manipulate proofs. Every proof has one or more goals: statements that we need to show are true. Tactics use known facts to simplify goals. This example shows a few tactics:
- forall_intro introduces the first variable quantified by forall to the set of known facts (i.e., the variable exists and has the specified type). As a really simple example, forall_intro transforms a goal like $\vdash \forall (x: \mathbb{N}) . x = x$ into $(x : \mathbb{N}) \vdash x = x$.
- implies_intro adds the antecedent of an implication to the set of facts known to the theorem prover. To prove $\Gamma \vdash a \implies b$, it is sufficient to show $\Gamma, a \vdash b$.
- witness helps us manipulate existential quantifiers. witness adds a term that shows an object with a given property exists. Here, our witness to the existential quantifier is y - x.
- dump is an extremely useful tactic. It shows the current goals that need to be proved.

dump shows us this message:
Goal 1/2:
(x: Prims.nat), (x'0: Prims.nat), (_: x'0 >= x) |- _ : Prims.squash (x'0 - x >= 0 == true)
Goal 2/2:
(x: Prims.nat), (x'0: Prims.nat), (_: x'0 >= x) |- _ : Prims.squash
(x'0 = x + (x'0 - x))
If you read these goals for a second, they should seem obviously true. F* is quite easy to use: If you think something is obvious, just stop talking and see if F* completes the proof:
open FStar.Mul
open FStar.Tactics
let _ = assert (forall (x: nat) (y: nat) .
y >= x ==>
(exists (z: nat) .
y = x + z))
by (
let x = forall_intro () in
let y = forall_intro () in
let imp = implies_intro () in
witness (`(`#y - `#x))
)
In this case, it does.
The problem we’re going to solve and verify is the Capacity to
Ship Packages within D
Days problem. You’re
given weights (an array of positive numbers representing the weights of
items), and days (the maximum number of days you have to ship all
the items). These items must be loaded onto a ship with a capacity of
capacity. The challenge is to find the smallest value of capacity
so that the number of days required to ship the items is less than or
equal to days. Check out the LeetCode description for more details.
Clearly, the minimum capacity that might work is the maximum element
of weights. For, if the capacity were any smaller, it would be
impossible to ship the largest item. The largest capacity we should
consider is the sum of the item weights. Any larger capacity is
superfluous, since a ship with this capacity can already ship all the
items in 1 day. The correct capacity is therefore somewhere in the
range $[maximum\_element~ weights,~ sum~ weights]$.
The naive approach is to simply check every weight in this range. But this number could be quite large – for instance, when the number of items is large but the maximum weight is small. A smarter approach is to use binary search to find the correct capacity.
To be frank, I find getting the bounds of binary search right a little tricky. For tricky loop bounds, I craft loop invariants to help me write the code. Let $min\_elt$ denote the smallest capacity that could possibly ship the items, and $max\_elt$ denote the largest capacity we should consider. We will maintain two key invariants:
$\forall x . x < min\_elt \implies days\_to\_ship~ weights~ x > days$.
$\forall x . x >= max\_elt \implies days\_to\_ship~ weights~ x <= days$.
Under these invariants, when $min\_elt = max\_elt$, the correct capacity to return is $min\_elt$ (or, equivalently, $max\_elt$).
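Before writing the F* version, it can help to sanity-check the algorithm in an ordinary language. Here is a small Python prototype (my own sketch, not from the post) of the greedy day count and the binary search that maintains the two invariants above:

```python
def days_to_ship(weights, capacity):
    """Greedy day count; assumes capacity >= max(weights)."""
    days, remaining = 1, capacity
    for w in weights:
        if w <= remaining:
            remaining -= w
        else:
            days += 1                # start a new day for this item
            remaining = capacity - w
    return days

def ship_within_days(weights, days):
    """Smallest capacity that ships everything within `days` days."""
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if days_to_ship(weights, mid) > days:
            lo = mid + 1             # mid is too small: first invariant
        else:
            hi = mid                 # mid works: second invariant
    return lo

print(ship_within_days([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))  # -> 15
```

The F* code below follows the same shape, but with the invariants encoded in refinement types instead of comments.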
Let’s start by computing the number of days it takes to ship items
with weights weights given a capacity capacity. We’ll represent
weights as a non-empty list of natural numbers. F*
already provides a theory of lists, so we’ll use that.
module Capacity
open FStar.List
open FStar.List.Tot
open FStar.Tactics
let weight_list = (l:list nat{not (isEmpty l)})
The syntax list nat describes a list of natural numbers. We use a
refinement type to specify that the list is non-empty.
Here’s a function definition that returns the number of days it takes to ship some items:
let rec max_elt (l: weight_list) : nat =
match l with
| [x] -> x
| (x::xs) ->
let max' = max_elt xs in
if x >= max' then x
else max'
let rec days_to_ship' (weights: weight_list)
(capacity: nat{capacity >= (max_elt weights)})
(current_cap: nat)
: (x: nat{x >= 1})
=
match weights with
| [x] ->
if x <= current_cap then 1
else 2
| (x::xs) ->
if x <= current_cap then
days_to_ship' xs capacity (current_cap - x)
else
1 + (days_to_ship' xs capacity capacity)
let days_to_ship (weights: weight_list)
(capacity: nat{capacity >= (max_elt weights)})
: (x: nat{x >= 1})
= days_to_ship' weights capacity capacity
A few notes about these functions:
- Recursive functions are defined with the let rec syntax.
- The match syntax performs pattern-matching. Inside of max_elt, [x] matches a list containing exactly one item. The second match case (x::xs) matches an item consed onto any list. Note that these patterns are exhaustive since a weight list is non-empty. Also note that F* verifies this exhaustivity for us, automatically.
- Note the refinement type on the capacity parameter. This is applying our earlier argument: the minimum capacity we can use to ship the items is the maximum weight of the items.

Here’s the implementation of our solution function in F*:
let nat_sum (a: nat) (b: nat) : nat = a + b
let sum_of_weights (weights: weight_list) : nat =
List.Tot.fold_left nat_sum (hd weights) (tl weights)
let rec lemma_sum_of_weights_is_gte_max (weights: weight_list) :
Lemma (ensures (sum_of_weights weights) >= max_elt weights)
=
match weights with
| [w] -> ()
| (x::xs) ->
FStar.List.Tot.Properties.fold_left_monoid nat_sum 0 xs;
lemma_sum_of_weights_is_gte_max xs
let min_bound (weights: weight_list) : nat = max_elt weights
let max_bound (weights: weight_list) : nat = sum_of_weights weights
// Returns the minimum capacity necessary to ship all the items in `days` days.
// Note that we have to specify that we decrease the difference between max_cap and min_cap.
let rec ship_within_days' (weights: weight_list)
(days: nat{days > 0})
(min_cap: nat{min_cap >= min_bound weights})
(max_cap: nat{max_cap >= min_cap})
: Tot (n:nat{n >= min_cap /\ n <= max_cap}) (decreases max_cap - min_cap)
=
if min_cap = max_cap then
min_cap
else
let middle_cap = (min_cap + max_cap) / 2 in
let total_days = days_to_ship weights middle_cap in
if total_days > days then
ship_within_days' weights days (middle_cap + 1) max_cap
else
ship_within_days' weights days min_cap middle_cap
let ship_within_days (weights: weight_list) (days: nat{days > 0})
: (n:nat{n >= min_bound weights /\ n <= max_bound weights})
= lemma_sum_of_weights_is_gte_max weights;
ship_within_days' weights
days
(max_elt weights)
(sum_of_weights weights)
The heart of the implementation is ship_within_days', so we’ll start
there. This is a fairly simple binary search implementation. Again,
we’re just maintaining the 2 invariants discussed in the Solution
Design subsection. Try to go through the logic and see
why those invariants are maintained.
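For intuition, here is the same binary search sketched in Python. This is illustrative only: `days_to_ship` here is a stand-in greedy simulation of the shipping process, not the F* definition from earlier in the post.

```python
def days_to_ship(weights, capacity):
    # Stand-in greedy simulation: load items in order; start a new day
    # when the next item would overflow the capacity.
    # Assumes capacity >= max(weights).
    days, load = 1, 0
    for w in weights:
        if load + w > capacity:
            days, load = days + 1, 0
        load += w
    return days

def ship_within_days(weights, days):
    # Invariants from the Solution Design: every capacity below lo needs
    # more than `days` days; every capacity at or above hi fits in `days`.
    lo, hi = max(weights), sum(weights)
    while lo < hi:
        mid = (lo + hi) // 2
        if days_to_ship(weights, mid) > days:
            lo = mid + 1
        else:
            hi = mid
    return lo

print(ship_within_days([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 5))  # 15
```

Note how each branch of the `if` preserves one of the two invariants, exactly as in the F* code.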
The first bit of new syntax we’ll discuss is the return type of
ship_within_days'. It returns Tot (n:nat{n >= min_cap /\ n <=
max_cap}) (decreases max_cap - min_cap). In F*, all functions must
be total – meaning they must terminate. So, really, the type of double from
earlier is
val double (x: int) : Tot int
But F* conveniently writes the Tot for us. Unfortunately, F* doesn’t know
why the function ship_within_days' terminates. We must explain it:
max_cap - min_cap decreases on every recursive call. F* can verify that this
statement is true, and then accepts our function as terminating. If we
delete (decreases max_cap - min_cap) from our code, F* produces
this error:
Could not prove termination of this recursive call; The SMT solver could not prove the query. Use --query_stats for more details.
This is our cue to add the decreases expression.
Our primary solution function is ship_within_days. There’s one bit
of magic in it: The application of the lemma
lemma_sum_of_weights_is_gte_max. This is required because we used a
refinement type for max_cap that requires max_cap >=
min_cap. Unfortunately, F* cannot automatically prove that
(sum_of_weights weights) >= (max_elt weights), so type checking
fails if we delete the application of the lemma:
Subtyping check failed; expected type max_cap: Prims.nat{max_cap >= max_elt weights}; got type Prims.nat; The SMT solver could not prove the query.
In general, F* cannot automatically prove propositions that require induction. But once we apply the lemma, F* can easily verify that the types are correct.
Now, let’s discuss our max_bound implementation for a moment. As we
mentioned in the Solution Design, the maximum bound on the
weights is just the sum of all weights. To sum the weights, we use the standard
fold_left function that should be familiar to functional
programmers. Note that we cannot write sum_of_weights like this:
// Error
let sum_of_weights (weights: weight_list) : nat =
List.Tot.fold_left (+) (hd weights) (tl weights)
This is because the type of + is int -> int -> int. While nat is
a subtype of int, F*’s type-checking algorithm does not deduce that applying
a function of type int -> int -> int will produce a nat. To solve this
problem, we explicitly define nat_sum.
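As an aside, both the fold-based sum and the fact the lemma establishes are easy to state concretely (a Python sketch, not F*):

```python
from functools import reduce

def nat_sum(a, b):
    # Mirrors the F* nat_sum: addition on non-negative integers.
    return a + b

def sum_of_weights(weights):
    # Mirrors List.Tot.fold_left nat_sum (hd weights) (tl weights).
    return reduce(nat_sum, weights[1:], weights[0])

ws = [3, 1, 4, 1, 5]
assert sum_of_weights(ws) == 14
# The fact lemma_sum_of_weights_is_gte_max proves: the sum dominates the max.
assert sum_of_weights(ws) >= max(ws)
```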
Finally, lemma_sum_of_weights_is_gte_max proceeds by induction. We
use the Lemma (...) type because the function is a proof. In the
case where there is exactly 1 item in the list, we produce the value
(). This term has the type unit. In F*, the type Lemma (ensures
(sum_of_weights weights) >= max_elt weights) is really just a synonym
for the type u:unit{(sum_of_weights weights) >= max_elt
weights}. So, F* will automatically try (and succeed!) to show our
lemma is true.
In the case when there is more than 1 item in the list, we first apply
FStar.List.Tot.Properties.fold_left_monoid. This
establishes the fact that folding nat_sum over x::xs equals x plus the
fold of nat_sum over xs. The
following line (lemma_sum_of_weights_is_gte_max xs) convinces F*
that the lemma holds by induction. As an exercise: Look at the lemma
fold_left_monoid provides and consider why we didn’t use this
definition:
// Error
let sum_of_weights (weights: weight_list) : nat =
List.Tot.fold_left nat_sum 0 weights
There are two facts we want to prove:
- Our solution can ship all the items within days days.
- No smaller capacity can ship all the items within days days. I.e., our solution is minimal.

In fact, these statements are direct consequences of the 2 invariants we constructed in our design subsection. So, let’s start by writing these invariants in F*:
let min_bound_invariant (weights: weight_list)
(cap: nat{cap >= min_bound weights})
(days: nat{days > 0})
= forall (x : nat) . x >= min_bound weights /\ x < cap ==> days_to_ship weights x > days
let max_bound_invariant (weights: weight_list)
(cap: nat{cap >= min_bound weights})
(days: nat{days > 0})
= forall (x : nat) . x >= cap ==> days_to_ship weights x <= days
Let’s also define the concept of minimality:
let is_minimal (w: weight_list) (c: nat{c >= min_bound w}) (days: nat{days > 0}) =
c = min_bound w \/ (c > min_bound w /\ days_to_ship w (c - 1) > days)
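Before diving into the proof, we can sanity-check these definitions numerically outside F*. In this Python sketch, `days_to_ship` is a hypothetical stand-in greedy simulation, not the F* definition:

```python
def days_to_ship(weights, capacity):
    # Stand-in greedy simulation of the shipping process.
    days, load = 1, 0
    for w in weights:
        if load + w > capacity:
            days, load = days + 1, 0
        load += w
    return days

def min_bound_invariant(weights, cap, days):
    # Every capacity below cap (but at least the min bound) needs too many days.
    return all(days_to_ship(weights, x) > days
               for x in range(max(weights), cap))

def max_bound_invariant(weights, cap, days):
    # cap itself ships in time (by monotonicity, so does anything above it).
    return days_to_ship(weights, cap) <= days

ws, days = [1, 2, 3, 4, 5], 3
# Capacity 6 is the minimal answer for this example: both invariants hold there.
assert min_bound_invariant(ws, 6, days)
assert max_bound_invariant(ws, 6, days)
```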
The proof follows from induction. We’ll start by drawing the outline of the proof, then fill in details until it is complete. To start the proof:
let rec lemma_ship_within_days'_ships_within_days (weights: weight_list)
(days: nat{days > 0})
(min_cap: nat{min_cap >= min_bound weights})
(max_cap: nat{max_cap >= min_cap})
: Lemma
(requires min_bound_invariant weights min_cap days /\
max_bound_invariant weights max_cap days)
(ensures (days_to_ship weights (ship_within_days' weights days min_cap max_cap)) <= days /\
is_minimal weights (ship_within_days' weights days min_cap max_cap) days)
(decreases max_cap - min_cap)
=
if min_cap = max_cap then
()
else
admit ()
Notice the new requires component of the Lemma type. The requires
and ensures clauses of Lemma are
preconditions and
postconditions
respectively. Our strategy is to require that our 2 invariants hold at
each call to lemma_ship_within_days'_ships_within_days. Then, it
is obvious that the postconditions hold. Indeed: Notice that F*
automatically finds a proof when min_cap = max_cap. On the other
hand, we use admit () in the else branch. F* programs that
contain admit () aren’t proofs at all - admit () forces F* to
accept the current goals as true (even if they are false). However, it’s
invaluable when building proofs.
Let’s zoom in further by applying the definition of ship_within_days
in the else branch:
// Error
let rec lemma_ship_within_days'_ships_within_days (weights: weight_list)
(days: nat{days > 0})
(min_cap: nat{min_cap >= min_bound weights})
(max_cap: nat{max_cap >= min_cap})
: Lemma
(requires min_bound_invariant weights min_cap days /\
max_bound_invariant weights max_cap days)
(ensures (days_to_ship weights (ship_within_days' weights days min_cap max_cap)) <= days /\
is_minimal weights (ship_within_days' weights days min_cap max_cap) days)
(decreases max_cap - min_cap)
=
if min_cap = max_cap then
()
else
let middle_cap = (min_cap + max_cap) / 2 in
let total_days = days_to_ship weights middle_cap in
if total_days > days then (
lemma_ship_within_days'_ships_within_days weights days (middle_cap + 1) max_cap
) else (
admit ()
)
Unfortunately, verification fails at this point:
(Error 19) assertion failed; The SMT solver could not prove the query. Use --query_stats for more details.
Frankly, this error message is pretty awful. Hopefully, it is clear
that if lemma_ship_within_days'_ships_within_days can be applied
in the body of if total_days > days then the postcondition holds. This
should lead us to suspect that the problem is that F* cannot prove that
the preconditions of lemma_ship_within_days'_ships_within_days hold
at this point. Let’s add a temporary assert statement to check on the
precondition:
// Error
let rec lemma_ship_within_days'_ships_within_days (weights: weight_list)
(days: nat{days > 0})
(min_cap: nat{min_cap >= min_bound weights})
(max_cap: nat{max_cap >= min_cap})
: Lemma
(requires min_bound_invariant weights min_cap days /\
max_bound_invariant weights max_cap days)
(ensures (days_to_ship weights (ship_within_days' weights days min_cap max_cap)) <= days /\
is_minimal weights (ship_within_days' weights days min_cap max_cap) days)
(decreases max_cap - min_cap)
=
if min_cap = max_cap then
()
else
let middle_cap = (min_cap + max_cap) / 2 in
let total_days = days_to_ship weights middle_cap in
if total_days > days then (
assert (min_bound_invariant weights (middle_cap + 1) days);
lemma_ship_within_days'_ships_within_days weights days (middle_cap + 1) max_cap
) else (
admit ()
)
F* still prints an assertion failed error, but now it points to the
line checking the precondition. So, we know that the problem is that
F* cannot prove min_bound_invariant on (middle_cap + 1). We know
that max_bound_invariant must continue to hold, since max_cap is unchanged.
Observe that min_bound_invariant holds because days_to_ship is
decreasing: If we decrease the capacity, we will increase the days to
ship, and the branch condition total_days > days has already shown that
we cannot ship at the capacity middle_cap. We just need to show F*
these facts are true:
let rec lemma_days_to_ship_is_decreasing'' (weights: weight_list)
(cap: nat{cap >= (max_elt weights)})
(ccap: nat{ccap <= cap})
(ccap1: nat{ccap1 > ccap /\ ccap1 <= cap + 1})
: Lemma (ensures days_to_ship' weights (cap + 1) ccap1 <= (days_to_ship' weights cap ccap))
=
match weights with
| [w] -> ()
| x::xs ->
if x <= ccap && x <= ccap1 then
lemma_days_to_ship_is_decreasing'' xs cap (ccap - x) (ccap1 - x)
else if x > ccap && x <= ccap1 then
lemma_days_to_ship_is_decreasing' xs cap cap (ccap1 - x)
else if x > ccap && x >= ccap1 then
lemma_days_to_ship_is_decreasing'' xs cap cap (cap + 1)
and lemma_days_to_ship_is_decreasing' (weights: weight_list)
(cap: nat{cap >= (max_elt weights)})
(ccap: nat{ccap <= cap})
(ccap1: nat{ccap1 <= cap + 1})
: Lemma (ensures days_to_ship' weights (cap + 1) ccap1 <= 1 + (days_to_ship' weights cap ccap))
=
match weights with
| [w] -> ()
| x::xs ->
if x <= ccap && x <= ccap1 then
lemma_days_to_ship_is_decreasing' xs cap (ccap - x) (ccap1 - x)
else if x > ccap && x > ccap1 then
lemma_days_to_ship_is_decreasing' xs cap cap (cap + 1)
else if x > ccap && x <= ccap1 then
lemma_days_to_ship_is_decreasing' xs cap cap (ccap1 - x)
else
// I.e., x <= ccap && x > ccap1
lemma_days_to_ship_is_decreasing'' xs cap (ccap - x) (cap + 1)
let lemma_days_to_ship_is_decreasing (weights: weight_list)
(cap: nat{cap >= (max_elt weights)})
(c_cap: nat{c_cap <= cap})
: Lemma (ensures days_to_ship' weights (cap + 1) (c_cap + 1) <= days_to_ship' weights cap c_cap)
=
lemma_days_to_ship_is_decreasing'' weights cap c_cap (c_cap + 1)
Despite the mutually recursive proof, this is a simple argument. The theorem
that we are primarily interested in is
lemma_days_to_ship_is_decreasing''. This follows from
induction. There is a wrinkle, though: In the else if x > ccap && x
<= ccap1 branch, the preconditions of
lemma_days_to_ship_is_decreasing'' are no longer met. So, we use
the mutually recursive lemma to show that days_to_ship' weights (cap + 1) ccap1 <= 1 +
(days_to_ship' weights cap ccap). Then, since days_to_ship' weights
cap ccap = 1 + days_to_ship' xs cap cap, F* is automatically
able to cancel the 1s and prove our theorem. A similar argument
applies to lemma_days_to_ship_is_decreasing'.
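The monotonicity fact these lemmas capture, that more capacity never costs more days, is easy to check empirically (a Python sketch with a stand-in greedy `days_to_ship`, not the F* definition):

```python
def days_to_ship(weights, capacity):
    # Stand-in greedy simulation (the post's real definition is in F*).
    days, load = 1, 0
    for w in weights:
        if load + w > capacity:
            days, load = days + 1, 0
        load += w
    return days

ws = [2, 7, 1, 8, 2, 8]
for cap in range(max(ws), sum(ws)):
    # Increasing the capacity by one never increases the days to ship.
    assert days_to_ship(ws, cap + 1) <= days_to_ship(ws, cap)
```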
But even armed with this theorem, F* still can’t prove the precondition. Try it. We’ll have to go even further:
let lemma_days_to_ship_is_decreasing_full (weights: weight_list) (cap: nat{cap >= (max_elt weights)})
: Lemma (ensures days_to_ship weights (cap + 1) <= days_to_ship weights cap)
=
lemma_days_to_ship_is_decreasing weights cap cap
let rec lemma_days_to_ship_is_decreasing2 (weights: weight_list) (c: nat{c >= min_bound weights})
: Lemma (ensures (forall (x : nat) . x >= min_bound weights /\ x < c ==>
days_to_ship weights x >= days_to_ship weights c))
= if c > min_bound weights then (
lemma_days_to_ship_is_decreasing_full weights (c - 1);
lemma_days_to_ship_is_decreasing2 weights (c - 1)
)
let rec lemma_ship_within_days'_ships_within_days (weights: weight_list)
(days: nat{days > 0})
(min_cap: nat{min_cap >= min_bound weights})
(max_cap: nat{max_cap >= min_cap})
: Lemma
(requires min_bound_invariant weights min_cap days /\
max_bound_invariant weights max_cap days)
(ensures (days_to_ship weights (ship_within_days' weights days min_cap max_cap)) <= days /\
is_minimal weights (ship_within_days' weights days min_cap max_cap) days)
(decreases max_cap - min_cap)
=
if min_cap = max_cap then
()
else
let middle_cap = (min_cap + max_cap) / 2 in
let total_days = days_to_ship weights middle_cap in
if total_days > days then (
lemma_days_to_ship_is_decreasing2 weights middle_cap;
lemma_ship_within_days'_ships_within_days weights days (middle_cap + 1) max_cap
) else (
admit ()
)
As you might guess, F* has a similar problem with the
max_bound_invariant. The problem is that the invariant requires
all capacities greater than or equal to max_cap to ship in less than or equal
to days days, but our decreasing lemma only applies to max_cap + 1. Our
proof strategy is to use induction to extend our original decreasing lemma to show
$\forall k \in \mathbb{N}.\ days\_to\_ship~weights~(capacity + k) \leq
days\_to\_ship~weights~capacity$.
This argument convinces F*:
let rec lemma_days_to_ship_is_decreasing3'' (w: weight_list) (c : nat{c >= min_bound w}) (k: nat)
: Lemma (ensures days_to_ship w (c + k) <= days_to_ship w c)
=
if k = 0 then ()
else (
lemma_days_to_ship_is_decreasing_full w (c + k - 1);
lemma_days_to_ship_is_decreasing3'' w c (k - 1)
)
let lemma_days_to_ship_is_decreasing3' (w: weight_list) (c : nat{c >= min_bound w})
: Lemma (ensures forall (k : nat) . days_to_ship w (c + k) <= days_to_ship w c)
=
assert (forall (w: weight_list) (c: nat{c >= min_bound w}) (k : nat) .
days_to_ship w (c + k) <= days_to_ship w c)
by (
let w = forall_intros () in
mapply (`lemma_days_to_ship_is_decreasing3'' )
)
let lemma_add_definition (c:nat)
: Lemma (ensures (forall (x : nat) . x >= c ==> (exists (k : nat) . x = k + c)))
=
assert (forall (x : nat) . x >= c ==> x - c >= 0 /\ x - c + c = x)
let lemma_days_to_ship_is_decreasing3 (weights: weight_list) (c: nat{c >= min_bound weights})
: Lemma (ensures forall (x : nat) . x >= c ==> days_to_ship weights x <= days_to_ship weights c)
=
lemma_days_to_ship_is_decreasing3' weights c;
lemma_add_definition c
let rec lemma_ship_within_days'_ships_within_days (weights: weight_list)
(days: nat{days > 0})
(min_cap: nat{min_cap >= min_bound weights})
(max_cap: nat{max_cap >= min_cap})
: Lemma
(requires min_bound_invariant weights min_cap days /\
max_bound_invariant weights max_cap days)
(ensures (days_to_ship weights (ship_within_days' weights days min_cap max_cap)) <= days /\
is_minimal weights (ship_within_days' weights days min_cap max_cap) days)
(decreases max_cap - min_cap)
=
if min_cap = max_cap then
()
else
let middle_cap = (min_cap + max_cap) / 2 in
let total_days = days_to_ship weights middle_cap in
if total_days > days then (
lemma_days_to_ship_is_decreasing2 weights middle_cap;
lemma_ship_within_days'_ships_within_days weights days (middle_cap + 1) max_cap
) else (
lemma_days_to_ship_is_decreasing3 weights middle_cap;
lemma_ship_within_days'_ships_within_days weights days min_cap middle_cap
)
As an exercise, demonstrate that the min_bound_invariant and
max_bound_invariant hold under the initial conditions set by
ship_within_days.
F* has an amazing Emacs
mode. It uses unicode
symbols to make identifiers like forall and exists render as the
appropriate logic symbols. It also allows you to verify code as you
work inside of Emacs itself. Finally, it provides error squiggles.
F* can automatically find many proofs, more so than similar tools that I’ve experimented with (e.g., Coq and Isabelle). In that sense, F* seems easier to adopt than more mainstream tools.
Error messages are bad. From my experience using Z3, this is because Z3 does not generate very good unsatisfiable cores. To expand: You provide Z3 a set of logical formulae. Z3 attempts to find an interpretation (i.e., a mapping of variables to values) that satisfies the formulae. When Z3 determines that no such interpretation exists, the formulae are unsatisfiable. For the sake of error reporting, you might be interested in why the formulae are unsatisfiable: What is the smallest subset of formulae that is still unsatisfiable? Such a subset is called an unsatisfiable core.
Unfortunately, things are not so simple: computing minimal unsatisfiable cores is expensive, and the cores Z3 returns are not guaranteed to be minimal. Meanwhile, tools that use Z3 have to somehow manage the relationship between Z3 variables and their own semantic domain. This adds to the challenge of making good error messages with Z3.
Z3 is sensitive to a lot more than you may expect. A common idiom in F* is to test if adding a lemma helps you with a proof, like so:
let lemma_a (x: unit) : Lemma (ensures some_formula) =
admit ()
let lemma_b (x: unit) : Lemma (ensures some_formula) =
// Other lemmas not shown.
lemma_a ();
()
Here, lemma_b uses lemma_a in its proof. Now, assume that Z3 is
able to find a proof of lemma_b. So, we proceed to prove
lemma_a. Very rarely, I have noticed that changing the proof of
lemma_a causes Z3 to no longer be able to prove
lemma_b. Obviously, this is surprising because lemma_b does
not logically depend on the specific proof of lemma_a.
Documentation and examples are also lacking. There are not a lot of high quality educational resources available today.
I found F* to be immensely usable. While error messages are not the best, this is really a limitation of the underlying SMT solver. From experience, Z3’s unsatisfiable cores are complex to handle, and moving back and forth between the high-level language F* provides and SMT is challenging. But this is definitely an area that needs improvement.
The ecosystem of F* is young. The resources I’ve used are:
- FStar.List.Tot.Properties.fold_left_monoid.

I hope that this post has inspired you to give F* a try.
Inspired by a recent post where the author used ChatGPT as a virtual machine, I wanted to learn if ChatGPT can be a useful LISP interpreter. To my surprise, ChatGPT understands LISP remarkably well.
Figure 1: Initial prompt and basic LISP functions.
Figure 1 shows the initial prompt I used. It’s very similar to the prompt in Building A Virtual Machine inside ChatGPT. We see a few interesting facts already:
- NIL evaluates to NIL.
- ChatGPT correctly handles (eq (car nil) nil).

Figure 2: CONS‘ing and SETFs.
In LISP, we construct a CONS cell that contains two pointers (called CAR and CDR) with the CONS function. Continuing on to Figure 2, it seems like ChatGPT is aware of how CONS works. LISP also allows us to modify the value stored in a place with the SETF macro. If the first argument to the SETF macro is a symbol (e.g., my-list), then SETF modifies the symbol table to associate the symbol name with the value of the 2nd argument. ChatGPT seems aware of how SETF behaves. The first line of Figure 3 shows that ChatGPT can remember the state of the symbol table.
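As a rough analogy (purely illustrative Python, not LISP), a CONS cell is a mutable pair, and SETF on the place (car my-list) mutates the first slot:

```python
class Cons:
    """A CONS cell: two pointers, CAR and CDR."""
    def __init__(self, car, cdr):
        self.car = car
        self.cdr = cdr

# (setf my-list (cons 1 nil)) -- bind a symbol to a one-element list.
my_list = Cons(1, None)
# (setf (car my-list) 42) -- setf targets a place, not just a symbol.
my_list.car = 42
assert my_list.car == 42 and my_list.cdr is None
```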
Figure 3: Recursive Functions of Symbolic Expressions.
Figure 3 shows the definition of a function named f. Here, f computes the factorial of a number. This might seem challenging, since f is a recursive function. But ChatGPT evaluates the function without any problems. Figure 4 shows f applied to a larger, more challenging input. Once again, ChatGPT correctly evaluates the expression.
Figure 4: The persistence of memory.
Let’s see if ChatGPT still recalls the association between my-list and (42) we introduced in our symbol table. Figure 4 shows the results of evaluating (setf (car my-list) 42). We see that:
- setf works on arbitrary places, not just symbol names.
- ChatGPT still associates my-list with a list containing a single element.

Let’s try another challenge: The Y Combinator. I used this implementation. Figure 5 shows the results.
To my surprise, ChatGPT understands the function definition and correctly evaluates it. This is particularly challenging, since it shows:
- ChatGPT handles FUNCALL and understands what it means to be a LISP-2.

Figure 5: The Y Combinator.
I am very surprised how well ChatGPT handles the task of interpreting LISP code. I am very curious if ChatGPT actually understands the source code, or if it has seen enough examples that it can blindly regurgitate results it has memorized. Since LISP has very simple semantics, it’s a great tool for studying the extent of a large language model’s ability to understand and interpret source code.
At this past year’s ASE, there was a really interesting paper called AST-Probe: Recovering abstract syntax trees from hidden representations of pre-trained language models. One conclusion of this paper is that large language models deeply understand syntax trees. I wonder if we can somehow decide whether large language models understand a language’s operational semantics?
I encountered a fun programming puzzle recently:
You are given a description of a two-lane road in which two strings, L1 and L2, represent the first and the second lane. Each lane consists of N segments of equal length.
The K-th segment of the first lane is represented by L1[K], and the K-th segment of the second lane is represented by L2[K], where “.” denotes a smooth segment of road, and “x” denotes a segment that contains potholes.
Cars can drive over segments with potholes, but it is uncomfortable for passengers. Therefore, a project to repair as many potholes as possible was submitted. At most one contiguous region of each lane may be repaired at a time. The region is closed to traffic while it is under repair.
How many road segments with potholes can be repaired given that the road must be kept open?
For example, if L1 = “..xx.x.” and L2 = “x.x.x..”, the maximum number of potholes we can repair is 4. See Figure 1 for an explanation.
Figure 1: Visualization of the example. Segments without potholes are shown as empty boxes. Segments with potholes are shown as gray boxes. Contiguous regions under repair are highlighted orange. The arrows indicate a path through the road.
This problem has two key requirements:
- At most one contiguous region of each lane may be repaired at a time, and that region is closed to traffic.
- The road must be kept open: a vehicle must always be able to travel from one end to the other.
Figure 2: L1 = “..xx...” and L2 = “x....x.”. The solution shown here is not allowed, since L2’s repair regions are not contiguous.
There are two important observations about the problem. First, a vehicle must be able to travel the road by changing lanes at most once. I give an argument for this point in the next paragraph. Second, no repair can occur at the segment where the vehicle changes lanes. This is because both lanes must be open for the vehicle to change lanes.
A proof by contradiction shows the vehicle can change lanes at most once in an allowed solution. First, assume without loss of generality that a vehicle starts in L1, and changes lanes twice at segments i and j. A repair must occur in the region [0, i-1] in L2, otherwise the vehicle could have started in L2. A repair must start at segment j in L2, otherwise the vehicle need not change lanes back. But the regions [0, i-1] and [j, …] are not contiguous. So, the solution is not allowed. We conclude that a vehicle can change lanes at most once.
Since the vehicle can only change lanes once, we only need to find (1) the segment to change lanes, and (2) the starting lane. Let’s start by characterizing the segment where the vehicle changes lanes. Suppose the vehicle starts in L1. Call the ideal segment to change lanes C. The sum of potholes in L1 in region [C+1, …] and L2 in region […, C-1] is maximal. This is because, since the vehicle doesn’t start in L2, we can repair all segments in L2 until C. The same argument applies to L1 after C.
We can compute C in $O(n)$ time, where n is the number of segments. Maintain two arrays of length n, $avoided_{L1}$ and $avoided_{L2}$. Let $avoided_{L1}[i]$ denote the number of potholes avoided in L1[i+1, …] if the vehicle changes lanes from L1 to L2 at segment i. Similarly, $avoided_{L2}[i]$ denotes the number of potholes avoided in L2[0, i-1] if the vehicle changes lanes from L1 to L2 at segment i. So, $avoided_{L1}$ stores the partial sums of the number of potholes in L1 counting from the end. Meanwhile, $avoided_{L2}$ stores the partial sums of the number of potholes in L2 counting from the start. Computing C is simple: $C = \underset{0 \leq c < n}{\text{argmax}}(avoided_{L1}[c] + avoided_{L2}[c]).$
Finding the starting lane $L$ is also easy. Let $F(A)$ denote the value of $C$ for a vehicle that starts in lane $A$. Then, $L = \underset{l \in \left\{ L1,~ L2 \right\} }{\text{argmax}}(F(l))$.
This solution has a runtime of $O(n)$, since computing $C$ takes $O(n)$ time. Memory usage is $O(n)$, since we create the extra arrays $avoided_{L1}$ and $avoided_{L2}$ to store partial sums.
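Since the search space is tiny, we can sanity-check this reasoning with a brute-force sweep over the starting lane and the lane-change segment (a Python sketch; the function name is ad hoc):

```python
def max_repairable(l1: str, l2: str) -> int:
    # If the car starts in lane a and changes to lane b at segment c, we may
    # repair b[0:c] and a[c+1:]; segment c itself must stay open in both lanes.
    n = len(l1)
    best = 0
    for a, b in ((l1, l2), (l2, l1)):
        for c in range(n):
            best = max(best, b[:c].count('x') + a[c + 1:].count('x'))
    return best

print(max_repairable("..xx.x.", "x.x.x.."))  # 4, matching the worked example
```

This $O(n^2)$ check agrees with the $O(n)$ partial-sum solution on the example from the problem statement.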
import enum
from typing import List

class PotholeState(enum.Enum):
POTHOLE = 1
CLEAN = 2
_STR_TO_STATE = {
'.': PotholeState.CLEAN,
'x': PotholeState.POTHOLE,
}
def read_lanes(l1: str, l2: str) -> List[List[PotholeState]]:
return [[s1, s2] for s1, s2
in zip(
[_STR_TO_STATE[c] for c in l1],
[_STR_TO_STATE[c] for c in l2],
)]
def _max_repairable_helper(l1: List[PotholeState], l2: List[PotholeState]) -> int:
l1_avoided_potholes = [0] * len(l1)
l2_avoided_potholes = [0] * len(l2)
for i in range(len(l1) - 2, -1, -1):
l1_avoided_potholes[i] = l1_avoided_potholes[i + 1]
if l1[i+1] == PotholeState.POTHOLE:
l1_avoided_potholes[i] += 1
for i in range(1, len(l2)):
l2_avoided_potholes[i] = l2_avoided_potholes[i - 1]
if l2[i - 1] == PotholeState.POTHOLE:
l2_avoided_potholes[i] += 1
return max([l1_avoided_potholes[i] + l2_avoided_potholes[i] for i in range(len(l1))])
def max_repairable_segments(road: List[List[PotholeState]]) -> int:
lane1 = [road[i][0] for i in range(len(road))]
lane2 = [road[i][1] for i in range(len(road))]
return max(
_max_repairable_helper(lane1, lane2),
_max_repairable_helper(lane2, lane1),
)