How to write a type-safe unwrap (aka fromJust)

Let's say you've written a programming language in Coq. You have nice inductives for your ASTs; one for untyped terms (UntypedAST) and one for typed terms (TypedAST). You wrote a simple typechecker, and maybe an interpreter, too!

typecheck : UntypedAST -> option TypedAST
interp : TypedAST -> Value

You write a few programs…

Example well_typed := UAdd (UNat 1) (UNat 1).
Example ill_typed := UAdd (UNat 1) (UBool true).

… typecheck them:

Definition tc_good := typecheck well_typed.
= Some {| tau := Nat; ast := TAdd (TNat 1) (TNat 1) |} : option TypedAST
Definition tc_bad := typecheck ill_typed.
= None : option TypedAST

… and attempt to run them:

The term "tc_good" has type "option TypedAST" while it is expected to have type "TypedAST".

D'oh! interp takes a TypedAST, but typecheck returns an option. What do we do?

We can write a simple wrapper though, with a default value for the None case:

Definition unwrap_default (o: option TypedAST) : TypedAST :=
  match o with
  | Some t => t
  | None => {| ast := TNat 1 |}

= {| vtau := Nat; val := 2 |} : Value

But now we silently swallow type errors, which isn't ideal:

= {| vtau := Nat; val := 1 |} : Value

Let's see how we can get a safe but convenient version of unwrap (aka fromJust in the Haskell world and Option.get in OCaml).

Take 1: Pass a proof as an extra argument

The most straightforward way is to generalize unwrap by adding a proof that its argument is not None:

 Definition unwrap {A} (o: option A)
            (not_none: o <> None) : A :=
   match o return _ = o -> A with
   | Some a => fun _ => a
   | None => fun is_none => False_rect _ (not_none is_none)
   end eq_refl.

… it works, but it's not much fun for callers:

= {| vtau := Nat; val := 2 |} : Value

We can improve things slightly with tactics in terms:

= {| vtau := Nat; val := 2 |} : Value
Not a discriminable equality.

… but the generated terms are not pretty, so if you ever store them unreduced anywhere, you're in for all sorts of unpleasantness:

unwrap tc_good (fun H : tc_good = None => let H0 : False := eq_ind tc_good (fun e : option TypedAST => match e with | Some _ => True | None => False end) I None H in False_ind False H0) : TypedAST

Not great. Still, here is another example for comparison, this time using known-good indices into a list:

Definition nth_in_bounds {A} (l: list A) (n: nat)
           (in_bounds: n < List.length l) :=
  unwrap (List.nth_error l n)
         (proj2 (List.nth_error_Some l n) in_bounds).

= 3 : nat

Note that (maybe surprisingly) the computation doesn't block, despite the fact that the definition of nth_in_bounds uses an opaque proof List.nth_error_Some. The reason is that, as we've seen, unwrap doesn't actually look at the proof. In fact, in general, proofs don't tend to block computation, because Coq disallows elimination of informative Props into type (that is, programs that return non-Prop results can't inspect proofs — except non-informative ones, like eq_refl).

Take 2: Use an equality proof

The main pain point in the previous example was the complexity of the proof terms, so let's simplify them. Instead of proving o <> None, we'll prove that is_some o = true, and the proof will always be eq_refl:

 Definition is_some {A} (o: option A) : bool :=
   if o then true else false.

o:option A
is_some o = true -> o <> None
o:option A
is_some o = true -> o <> None
is_some (Some a) = true -> Some a <> None
is_some None = true -> None <> None
true = true -> Some a <> None
false = true -> None <> None
all: congruence. Qed.


In Coq the if _ then _ else _ notation works for any inductive type with two constructors, not just Booleans: the first constructor triggers the true case, and the second one triggers the false case. Conveniently, the definition of the option type in Coq puts Some _ first and None second, so it works intuitively with ifs.

On the other hand, the definitions of nat, list, and String put the base case first, which means that 0, [], and "" are truthy in Coq (Compute (if 0 then true else false) reduces to true), while 42, [1; 2; 3], and "coq" are falsy (Compute (if "coq" then true else false) reduces to false) — a slightly unfortunate result.

Now we can define a new variant of unwrap:

 Definition unwrap_dec {A} (o: option A)
            (is_some_true: is_some o = true) : A :=
   unwrap o (is_some_not_none is_some_true).

= {| vtau := Nat; val := 2 |} : Value

Much nicer! Now the proof is always the same, and we can even define a notation to hide it:

Notation unwrap_dec' o := (unwrap_dec o eq_refl).

Here's how it looks for list indices:

Definition nth_in_bounds_dec {A} (l: list A) (n: nat)
           (lt_true: (n <? List.length l) = true) :=
  nth_in_bounds l n (proj1 (Nat.ltb_lt _ _) lt_true).

= 3 : nat

One significant advantage of this strategy is that we can control the reduction strategy used to check that eq_refl has the right type (ensuring that the application of unwrap_dec is well-typed requires checking that eq_refl: is_some _ = true, which requires reducing is_some _ to unify it with true). Concretely, we can write (@eq_refl bool true : is_some tc_good = true) to using normal unification, (@eq_refl bool true <: …) to call vm_compute, and <<: to call native_compute.

As before, though, the proof term that we're passing is in fact dead code, and the error messages are not ideal:

= {| vtau := Nat; val := 2 |} : Value
The term "eq_refl" has type "is_some None = is_some None" while it is expected to have type "is_some None = true".
= {| vtau := let (tau, _) := match is_some_not_none ?is_some_true eq_refl return TypedAST with end in tau; val := (fix Ffix (x : Tau) (x0 : TypedAST' x) {struct x0} : (fix Ffix0 (x1 : Tau) : Set := match x1 with | Nat => nat | Bool => bool end) x := match x0 in (TypedAST' H) return ((fix Ffix0 (x1 : Tau) : Set := match x1 with | Nat => nat | Bool => bool end) H) with | TNat n => n | TBool b => b | TAdd e1 e2 => (fix Ffix0 (x1 x2 : nat) {struct x1} : nat := match x1 with | 0 => x2 | S x3 => S (Ffix0 x3 x2) end) (Ffix Nat e1) (Ffix Nat e2) | TAnd e1 e2 => if Ffix Bool e1 then Ffix Bool e2 else false end) (let (tau, _) := match is_some_not_none ?is_some_true eq_refl return TypedAST with end in tau) (let (tau, ast) as c return (TypedAST' (let (tau, _) := c in tau)) := match is_some_not_none ?is_some_true eq_refl return TypedAST with end in ast) |} : Value

Take 3: Use a dependent return type

We know that we only intend to call unwrap with arguments that reduce to Some _. We can make this explicit in the return type, instead of changing the arguments:

Inductive error : string -> Type := Err (s: string) : error s.

Definition unwrap_dep {A} (o: option A)
  : if o then A else error _ :=
  match o with
  | Some a => a
  | None => Err "Expecting Some, got None"

= {| vtau := Nat; val := 2 |} : Value

Here we're saying that we'll return an A if given a Some, and an error otherwise. And indeed, the error messages are much nicer:

The term "unwrap_dep tc_bad" has type "if tc_bad then TypedAST else error "Expecting Some, got None"" while it is expected to have type "TypedAST".

Here's how it looks for list indices:

Definition nth_in_bounds_dep {A} (l: list A) (n: nat)
  : if lt_dec n (List.length l) then A else error _ :=
  match lt_dec n (List.length l) as cmp
    return (if cmp then A else error _) with
  | left in_bounds => nth_in_bounds l n in_bounds
  | right _ => Err "Index is out of bounds"
= 3 : if lt_dec 2 (Datatypes.length [1; 2; 3]) then nat else error "Index is out of bounds"
= Err "Index is out of bounds" : if lt_dec 7 (Datatypes.length [1; 2; 3]) then nat else error "Index is out of bounds"

Bonus 1: Using unification

After I wrote this post, my colleague Jason Gross showed me another quite clever implementation of unwrap, leveraging inference:

Notation unwrap_refl o :=
  ((fun v (pf : o = Some v) => v) _ eq_refl) (only parsing).

= {| tau := Nat; ast := TAdd (TNat 1) (TNat 1) |} : TypedAST

The trick here is to force unification to infer the value inside the option: Coq will unify o = Some ?v (the type of pf) with ?a = ?a (the type of eq_refl), and instantiate ?v in passing, which the function then returns. Nifty!

Bonus 2: Using tactics in terms

Here's one final way to proceed with this, using tactics in terms:

Notation unwrap_tac opt :=
  ltac:(match (eval hnf in opt) with
        | Some ?v => exact v
        | ?other => fail "Error:" other "isn't [Some _]"
        end) (only parsing).

In practice, it works OK, but hnf is very slow (it's based on the same code as simpl). The cbv tactic and its faster cousins like vm_compute and native_compute are usually faster, but they get very costly if the terms are large and don't need to be fully normalized to determine whether we're in the Some or None case (think of a case like Some (very large term), where hnf will be free and cbv very slow).

Knowing this, it's a bit easier to understand why the unwrap_dec trick above works well: the type check that ensures that eq_refl has type is_some opt = true is essentially computing the head-normal form of opt and comparing it to Some, but it does that using Coq's fast reduction tactics. In fact, Jason has done a lot of work on exploring alternative strategies that combine reflection and fast full-reduction tactics such as vm_compute or native_compute to give fine-grained control over reduction.


Jason correctly points out that this notation won't give you great error messages if you pass it terms with typos:

The reference opt was not found in the current environment.

One way around this is to tweak the notation to force it to typecheck its argument before passing it into the tactic, like this:

Notation unwrap_tac' opt :=
  (match opt with _ =>
   ltac:(match (eval hnf in opt) with
         | Some ?v => exact v
         | ?other => fail "Error:" other "isn't [Some _]"
         end) end) (only parsing).

The reference Som was not found in the current environment.

Using a match instead of another construct like let _ := opt in ensures that we don't pollute the term (the match will self-reduce without requiring an explicit reduction).