# How to write a type-safe `unwrap`

(aka `fromJust`

)

Posted by Coq. Suggest edits or corrections.

on Fri 19 June 2020 inLet's say you've written a programming language in Coq. You have nice inductives for your ASTs; one for untyped terms (`UntypedAST`

) and one for typed terms (`TypedAST`

). You wrote a simple typechecker, and maybe an interpreter, too!

You write a few programs…

Example well_typed := UAdd (UNat 1) (UNat 1). Example ill_typed := UAdd (UNat 1) (UBool true).

… typecheck them:

Definition tc_good := typecheck well_typed.Definition tc_bad := typecheck ill_typed.

… and attempt to run them:

D'oh! `interp`

takes a `TypedAST`

, but `typecheck`

returns an `option`

. What do we do?

We can write a simple wrapper though, with a default value for the `None`

case:

Definition unwrap_default (o: option TypedAST) : TypedAST := match o with | Some t => t | None => {| ast := TNat 1 |} end.

But now we silently swallow type errors, which isn't ideal:

Let's see how we can get a safe but convenient version of `unwrap`

(aka `fromJust`

in the Haskell world and `Option.get`

in OCaml).

## Take 1: Pass a proof as an extra argument

The most straightforward way is to generalize `unwrap`

by adding a proof that its argument is not `None`

:

Definition unwrap {A} (o: option A) (not_none: o <> None) : A := match o return _ = o -> A with | Some a => fun _ => a | None => fun is_none => False_rect _ (not_none is_none) end eq_refl.

… it works, but it's not much fun for callers:

We can improve things slightly with tactics in terms:

… but the generated terms are not pretty, so if you ever store them unreduced anywhere, you're in for all sorts of unpleasantness:

Not great. Still, here is another example for comparison, this time using known-good indices into a list:

Definition nth_in_bounds {A} (l: list A) (n: nat) (in_bounds: n < List.length l) := unwrap (List.nth_error l n) (proj2 (List.nth_error_Some l n) in_bounds).

Note that (maybe surprisingly) the computation doesn't block, despite the fact that the definition of `nth_in_bounds`

uses an opaque proof `List.nth_error_Some`

. The reason is that, as we've seen, `unwrap`

doesn't actually look at the proof. In fact, in general, proofs don't tend to block computation, because Coq disallows elimination of informative `Prop`

s into type (that is, programs that return non-`Prop`

results can't inspect proofs — except non-informative ones, like eq_refl).

## Take 2: Use an equality proof

The main pain point in the previous example was the complexity of the proof terms, so let's simplify them. Instead of proving `o <> None`

, we'll prove that `is_some o = true`

, and the proof will always be `eq_refl`

:

Definition is_some {A} (o: option A) : bool := if o then true else false.A:Typeo:option Ais_some o = true -> o <> NoneA:Typeo:option Ais_some o = true -> o <> NoneA:Typea:Ais_some (Some a) = true -> Some a <> NoneA:Typeis_some None = true -> None <> Noneall: congruence. Qed.A:Typea:Atrue = true -> Some a <> NoneA:Typefalse = true -> None <> None

Note

In Coq the `if _ then _ else _`

notation works for any inductive type with two constructors, not just Booleans: the first constructor triggers the true case, and the second one triggers the false case. Conveniently, the definition of the `option`

type in Coq puts `Some _`

first and `None`

second, so it works intuitively with `if`

s.

On the other hand, the definitions of `nat`

, `list`

, and `String`

put the base case first, which means that `0`

, `[]`

, and `""`

are truthy in Coq (`Compute (if 0 then true else false)`

reduces to `true`

), while `42`

, `[1; 2; 3]`

, and `"coq"`

are falsy (`Compute (if "coq" then true else false)`

reduces to `false`

) — a slightly unfortunate result.

Now we can define a new variant of `unwrap`

:

Definition unwrap_dec {A} (o: option A) (is_some_true: is_some o = true) : A := unwrap o (is_some_not_none is_some_true).

Much nicer! Now the proof is always the same, and we can even define a notation to hide it:

`Notation unwrap_dec' o := (unwrap_dec o eq_refl).`

Here's how it looks for list indices:

Definition nth_in_bounds_dec {A} (l: list A) (n: nat) (lt_true: (n <? List.length l) = true) := nth_in_bounds l n (proj1 (Nat.ltb_lt _ _) lt_true).

One significant advantage of this strategy is that we can control the reduction strategy used to check that `eq_refl`

has the right type (ensuring that the application of `unwrap_dec`

is well-typed requires checking that `eq_refl: is_some _ = true`

, which requires reducing `is_some _`

to unify it with `true`

). Concretely, we can write `(@eq_refl bool true : is_some tc_good = true)`

to using normal unification, `(@eq_refl bool true <: …)`

to call `vm_compute`

, and `<<:`

to call `native_compute`

.

As before, though, the proof term that we're passing is in fact dead code, and the error messages are not ideal:

## Take 3: Use a dependent return type

We know that we only intend to call `unwrap`

with arguments that reduce to `Some _`

. We can make this explicit in the *return* type, instead of changing the arguments:

Inductive error : string -> Type := Err (s: string) : error s. Definition unwrap_dep {A} (o: option A) : if o then A else error _ := match o with | Some a => a | None => Err "Expecting Some, got None" end.

Here we're saying that we'll return an `A`

if given a `Some`

, and an `error`

otherwise. And indeed, the error messages are much nicer:

Here's how it looks for list indices:

```
Definition nth_in_bounds_dep {A} (l: list A) (n: nat)
: if lt_dec n (List.length l) then A else error _ :=
match lt_dec n (List.length l) as cmp
return (if cmp then A else error _) with
| left in_bounds => nth_in_bounds l n in_bounds
| right _ => Err "Index is out of bounds"
end.
```

## Bonus 1: Using unification

After I wrote this post, my colleague Jason Gross showed me another quite clever implementation of `unwrap`

, leveraging inference:

Notation unwrap_refl o := ((fun v (pf : o = Some v) => v) _ eq_refl) (only parsing).

The trick here is to force unification to infer the value inside the option: Coq will unify `o = Some ?v`

(the type of `pf`

) with `?a = ?a`

(the type of `eq_refl`

), and instantiate `?v`

in passing, which the function then returns. Nifty!

## Bonus 2: Using tactics in terms

Here's one final way to proceed with this, using tactics in terms:

```
Notation unwrap_tac opt :=
ltac:(match (eval hnf in opt) with
| Some ?v => exact v
| ?other => fail "Error:" other "isn't [Some _]"
end) (only parsing).
```

In practice, it works OK, but `hnf`

is very slow (it's based on the same code as `simpl`

). The `cbv`

tactic and its faster cousins like `vm_compute`

and `native_compute`

are usually faster, but they get very costly if the terms are large and don't need to be fully normalized to determine whether we're in the `Some`

or `None`

case (think of a case like `Some (very large term)`

, where `hnf`

will be free and `cbv`

very slow).

Knowing this, it's a bit easier to understand why the `unwrap_dec`

trick above works well: the type check that ensures that `eq_refl`

has type `is_some opt = true`

is essentially computing the head-normal form of opt and comparing it to `Some`

, but it does that using Coq's fast reduction tactics. In fact, Jason has done a lot of work on exploring alternative strategies that combine reflection and fast full-reduction tactics such as `vm_compute`

or `native_compute`

to give fine-grained control over reduction.

Note

Jason correctly points out that this notation won't give you great error messages if you pass it terms with typos:

One way around this is to tweak the notation to force it to typecheck its argument before passing it into the tactic, like this:

Notation unwrap_tac' opt := (match opt with _ => ltac:(match (eval hnf in opt) with | Some ?v => exact v | ?other => fail "Error:" other "isn't [Some _]" end) end) (only parsing).

Using a `match`

instead of another construct like `let _ := opt in …`

ensures that we don't pollute the term (the match will self-reduce without requiring an explicit reduction).