# How to write a type-safe `unwrap`

(aka `fromJust`

)

Posted by Coq. Suggest edits or corrections.

on Fri 19 June 2020 inLet's say you've written a programming language in Coq. You have nice inductives for your ASTs; one for untyped terms (`UntypedAST`

) and one for typed terms (`TypedAST`

). You wrote a simple typechecker, and maybe an interpreter, too!

typecheck : UntypedAST -> option TypedASTinterp : TypedAST -> Value

You write a few programs…

Example well_typed := UAdd (UNat 1) (UNat 1). Example ill_typed := UAdd (UNat 1) (UBool true).

… typecheck them:

Definition tc_good := typecheck well_typed.Definition tc_bad := typecheck ill_typed.= Some {| tau := Nat; ast := TAdd (TNat 1) (TNat 1) |} : option TypedAST= None : option TypedAST

… and attempt to run them:

The term "tc_good" has type "option TypedAST"
while it is expected to have type "TypedAST".

D'oh! `interp`

takes a `TypedAST`

, but `typecheck`

returns an `option`

. What do we do?

We can write a simple wrapper though, with a default value for the `None`

case:

Definition unwrap_default (o: option TypedAST) : TypedAST := match o with | Some t => t | None => {| ast := TNat 1 |} end.= {| vtau := Nat; val := 2 |} : Value

But now we silently swallow type errors, which isn't ideal:

= {| vtau := Nat; val := 1 |}
: Value

Let's see how we can get a safe but convenient version of `unwrap`

(aka `fromJust`

in the Haskell world and `Option.get`

in OCaml).

## Take 1: Pass a proof as an extra argument

The most straightforward way is to generalize `unwrap`

by adding a proof that its argument is not `None`

:

Definition unwrap {A} (o: option A) (not_none: o <> None) : A := match o return _ = o -> A with | Some a => fun _ => a | None => fun is_none => False_rect _ (not_none is_none) end eq_refl.

… it works, but it's not much fun for callers:

= {| vtau := Nat; val := 2 |}
: Value

We can improve things slightly with tactics in terms:

= {| vtau := Nat; val := 2 |} : ValueNot a discriminable equality.

… but the generated terms are not pretty, so if you ever store them unreduced anywhere, you're in for all sorts of unpleasantness:

unwrap tc_good
(fun H : tc_good = None =>
let H0 : False :=
eq_ind tc_good
(fun e : option TypedAST =>
match e with
| Some _ => True
| None => False
end) I None H in
False_ind False H0)
: TypedAST

Not great. Still, here is another example for comparison, this time using known-good indices into a list:

Definition nth_in_bounds {A} (l: list A) (n: nat) (in_bounds: n < List.length l) := unwrap (List.nth_error l n) (proj2 (List.nth_error_Some l n) in_bounds).= 3 : nat

Note that (maybe surprisingly) the computation doesn't block, despite the fact that the definition of `nth_in_bounds`

uses an opaque proof `List.nth_error_Some`

. The reason is that, as we've seen, `unwrap`

doesn't actually look at the proof. In fact, in general, proofs don't tend to block computation, because Coq disallows elimination of informative `Prop`

s into type (that is, programs that return non-`Prop`

results can't inspect proofs — except non-informative ones, like eq_refl).

## Take 2: Use an equality proof

The main pain point in the previous example was the complexity of the proof terms, so let's simplify them. Instead of proving `o <> None`

, we'll prove that `is_some o = true`

, and the proof will always be `eq_refl`

:

Definition is_some {A} (o: option A) : bool := if o then true else false.A:Typeo:option Ais_some o = true -> o <> NoneA:Typeo:option Ais_some o = true -> o <> NoneA:Typea:Ais_some (Some a) = true -> Some a <> NoneA:Typeis_some None = true -> None <> Noneall: congruence. Qed.A:Typea:Atrue = true -> Some a <> NoneA:Typefalse = true -> None <> None

Note

In Coq the `if _ then _ else _`

notation works for any inductive type with two constructors, not just Booleans: the first constructor triggers the true case, and the second one triggers the false case. Conveniently, the definition of the `option`

type in Coq puts `Some _`

first and `None`

second, so it works intuitively with `if`

s.

On the other hand, the definitions of `nat`

, `list`

, and `String`

put the base case first, which means that `0`

, `[]`

, and `""`

are truthy in Coq (`Compute (if 0 then true else false)`

reduces to `true`

), while `42`

, `[1; 2; 3]`

, and `"coq"`

are falsy (`Compute (if "coq" then true else false)`

reduces to `false`

) — a slightly unfortunate result.

Now we can define a new variant of `unwrap`

:

Definition unwrap_dec {A} (o: option A) (is_some_true: is_some o = true) : A := unwrap o (is_some_not_none is_some_true).= {| vtau := Nat; val := 2 |} : Value

Much nicer! Now the proof is always the same, and we can even define a notation to hide it:

`Notation unwrap_dec' o := (unwrap_dec o eq_refl).`

Here's how it looks for list indices:

Definition nth_in_bounds_dec {A} (l: list A) (n: nat) (lt_true: (n <? List.length l) = true) := nth_in_bounds l n (proj1 (Nat.ltb_lt _ _) lt_true).= 3 : nat

One significant advantage of this strategy is that we can control the reduction strategy used to check that `eq_refl`

has the right type (ensuring that the application of `unwrap_dec`

is well-typed requires checking that `eq_refl: is_some _ = true`

, which requires reducing `is_some _`

to unify it with `true`

). Concretely, we can write `(@eq_refl bool true : is_some tc_good = true)`

to using normal unification, `(@eq_refl bool true <: …)`

to call `vm_compute`

, and `<<:`

to call `native_compute`

.

As before, though, the proof term that we're passing is in fact dead code, and the error messages are not ideal:

= {| vtau := Nat; val := 2 |} : ValueThe term "eq_refl" has type "is_some None = is_some None" while it is expected to have type "is_some None = true".= {| vtau := let (tau, _) := match is_some_not_none ?is_some_true eq_refl return TypedAST with end in tau; val := (fix Ffix (x : Tau) (x0 : TypedAST' x) {struct x0} : (fix Ffix0 (x1 : Tau) : Set := match x1 with | Nat => nat | Bool => bool end) x := match x0 in (TypedAST' H) return ((fix Ffix0 (x1 : Tau) : Set := match x1 with | Nat => nat | Bool => bool end) H) with | TNat n => n | TBool b => b | TAdd e1 e2 => (fix Ffix0 (x1 x2 : nat) {struct x1} : nat := match x1 with | 0 => x2 | S x3 => S (Ffix0 x3 x2) end) (Ffix Nat e1) (Ffix Nat e2) | TAnd e1 e2 => if Ffix Bool e1 then Ffix Bool e2 else false end) (let (tau, _) := match is_some_not_none ?is_some_true eq_refl return TypedAST with end in tau) (let (tau, ast) as c return (TypedAST' (let (tau, _) := c in tau)) := match is_some_not_none ?is_some_true eq_refl return TypedAST with end in ast) |} : Value

## Take 3: Use a dependent return type

We know that we only intend to call `unwrap`

with arguments that reduce to `Some _`

. We can make this explicit in the *return* type, instead of changing the arguments:

Inductive error : string -> Type := Err (s: string) : error s. Definition unwrap_dep {A} (o: option A) : if o then A else error _ := match o with | Some a => a | None => Err "Expecting Some, got None" end.= {| vtau := Nat; val := 2 |} : Value

Here we're saying that we'll return an `A`

if given a `Some`

, and an `error`

otherwise. And indeed, the error messages are much nicer:

The term "unwrap_dep tc_bad" has type
"if tc_bad
then TypedAST
else error "Expecting Some, got None""
while it is expected to have type "TypedAST".

Here's how it looks for list indices:

```
Definition nth_in_bounds_dep {A} (l: list A) (n: nat)
: if lt_dec n (List.length l) then A else error _ :=
match lt_dec n (List.length l) as cmp
return (if cmp then A else error _) with
| left in_bounds => nth_in_bounds l n in_bounds
| right _ => Err "Index is out of bounds"
end.
```

= 3 : if lt_dec 2 (Datatypes.length [1; 2; 3]) then nat else error "Index is out of bounds"= Err "Index is out of bounds" : if lt_dec 7 (Datatypes.length [1; 2; 3]) then nat else error "Index is out of bounds"

## Bonus 1: Using unification

After I wrote this post, my colleague Jason Gross showed me another quite clever implementation of `unwrap`

, leveraging inference:

Notation unwrap_refl o := ((fun v (pf : o = Some v) => v) _ eq_refl) (only parsing).= {| tau := Nat; ast := TAdd (TNat 1) (TNat 1) |} : TypedAST

The trick here is to force unification to infer the value inside the option: Coq will unify `o = Some ?v`

(the type of `pf`

) with `?a = ?a`

(the type of `eq_refl`

), and instantiate `?v`

in passing, which the function then returns. Nifty!

## Bonus 2: Using tactics in terms

Here's one final way to proceed with this, using tactics in terms:

```
Notation unwrap_tac opt :=
ltac:(match (eval hnf in opt) with
| Some ?v => exact v
| ?other => fail "Error:" other "isn't [Some _]"
end) (only parsing).
```

In practice, it works OK, but `hnf`

is very slow (it's based on the same code as `simpl`

). The `cbv`

tactic and its faster cousins like `vm_compute`

and `native_compute`

are usually faster, but they get very costly if the terms are large and don't need to be fully normalized to determine whether we're in the `Some`

or `None`

case (think of a case like `Some (very large term)`

, where `hnf`

will be free and `cbv`

very slow).

Knowing this, it's a bit easier to understand why the `unwrap_dec`

trick above works well: the type check that ensures that `eq_refl`

has type `is_some opt = true`

is essentially computing the head-normal form of opt and comparing it to `Some`

, but it does that using Coq's fast reduction tactics. In fact, Jason has done a lot of work on exploring alternative strategies that combine reflection and fast full-reduction tactics such as `vm_compute`

or `native_compute`

to give fine-grained control over reduction.

Note

Jason correctly points out that this notation won't give you great error messages if you pass it terms with typos:

The reference opt was not found in the current
environment.

One way around this is to tweak the notation to force it to typecheck its argument before passing it into the tactic, like this:

Notation unwrap_tac' opt := (match opt with _ => ltac:(match (eval hnf in opt) with | Some ?v => exact v | ?other => fail "Error:" other "isn't [Some _]" end) end) (only parsing).The reference Som was not found in the current environment.

Using a `match`

instead of another construct like `let _ := opt in …`

ensures that we don't pollute the term (the match will self-reduce without requiring an explicit reduction).