A long story about MLF
  1. A long story about MLF, part 1: Introduction, ML
  2. A long story about MLF, part 2: System F and principality
  3. Moderately practical uses of System F over ML
  4. A long story about MLF, part 3: introduction to MLF
  5. A long story about MLF, part 4: gMLF, graphical MLF types

Moderately practical uses of System F over ML

(Gabriel Scherer (gasche) @ 2010-11-11 13:51:41)

In a recent comment, SpiceGuid asked:

Or, more practically, can you exhibit a motivating code example that requires full System F […], or is it still active research?

This is a very good question indeed; I'm sorry I took a while to answer: I was busy writing this post. I'm very glad I can produce such a code example, otherwise it would be quite embarrassing. I have no fewer than three examples, which I hope you'll find motivating, with various degrees of practicality.

Let me clarify a bit: the key difference in expressivity between ML and System F is that System F types can have type quantification (the universal quantifier ∀) arbitrarily deep inside types.

For example, (∀(α)α→α)→int is a System F type that does not exist in ML. Indeed, an ML type scheme is (implicitly) quantified outside the type: an ML approximation of the above type would be ('a → 'a) → int, equivalent to the System F type ∀(α) (α→α)→int: the quantification is on the outside.

Those two types are not equivalent. Consider the following λ-term:

λ(f) if f true then f 0 else f 1

This term can be given the type (∀(α)α→α)→int: it accepts a polymorphic function of type ∀(α)α→α, and uses it on true (returning a boolean), then on 0 and 1 (returning integers). But it cannot be given the type ('a → 'a) → int: inside the function, f is used at type bool but also at type int, so 'a can be neither bool nor int.

# ((fun f -> if f true then f 0 else f 1) : ('a -> 'a) -> int);;  
Error: This expression has type int but an expression was expected of
       type bool
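
For contrast, here is a hedged sketch of how current OCaml can nonetheless express the System F version, by wrapping the polymorphic function in a record with an explicitly polymorphic field (the type name poly_id and the function use_poly are mine; this encoding technique comes up again later in the article):

type poly_id = { id : 'a . 'a -> 'a }

(* use_poly has, morally, the System F type (∀(α)α→α)→int: each access to
   the field f.id gives a fresh instance of the polymorphic type. *)
let use_poly : poly_id -> int =
  fun f -> if f.id true then f.id 0 else f.id 1

let _ = use_poly { id = fun x -> x }   (* evaluates to 0 *)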

So this is a crucial difference between System F and ML. When moving from ML to System F, we gain the ability to put ∀ inside types, inside arrow types (→). The previous example demonstrates that System F is strictly more expressive than ML, but I do not claim it is an interesting or motivating example. Three examples will follow, but they will all use the same pattern: a ∀ inside an arrow type (→).

Actually, we do not gain much from allowing ∀ at the right of arrows: α → ∀(β)β→β is not that different from ∀(β) α→(β→β). All the fun is when we have ∀ at the left of an arrow. This is deeply connected to the fact that (→) is positive (covariant) on its right and negative (contravariant) on its left…
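
To make the innocuous direction concrete, here is a small sketch: in ML, a function returning a polymorphic result is already expressible, because the inner quantifier simply floats out to the front.

(* OCaml infers 'a -> 'b -> 'b, that is ∀(α)∀(β) α→β→β, which serves the
   same purpose as the System F type α → ∀(β) β→β. *)
let const_id (_ : 'a) = fun y -> y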

Object types: values carrying (polymorphic) functions

My first personal encounter with higher-rank polymorphism was when using OCaml object types. It is not directly related to object-oriented programming, but rather to the fact that this coding style promotes packing functions (or methods) together with values. When you want to pass a polymorphic function as a method of an object, you need higher-rank polymorphism.

For example, imagine you have a collection class 'a coll that describes collections (lists, sequences, etc.) of values of type 'a. You have added boring methods such as length : int, iter : ('a → unit) → unit.

class type ['a] collection = object
  method length : int
  method iter : ('a -> unit) -> unit
end

let collection_of_list : 'a list -> 'a collection =
  fun li -> object
    method length = List.length li
    method iter f = List.iter f li
  end
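
A quick sanity check of these definitions (a sketch):

let () =
  let c = collection_of_list [1; 2; 3] in
  assert (c#length = 3)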

Now, suppose you want to add a fold method, in the spirit of folding over a list. A quick reminder on list folding:

fold : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b

In this type (for lists, not for collections), 'a denotes the type of the elements of the list, and 'b the type of an accumulator that is computed by iterating over the list. Classic examples:

let length list = fold (fun elem acc -> 1 + acc) list 0
let iter f list = fold (fun elem () -> f elem) list ()

The type of a fold method for an 'a collection is ('a→'b→'b)→'b→'b. In this type, 'b is locally quantified: it corresponds to ∀(β) (α→β→β)→β→β.

class type ['a] collection = object
  method length : int
  method iter : ('a -> unit) -> unit
  method fold : 'b . ('a -> 'b -> 'b) -> 'b -> 'b
end

But the corresponding implementation doesn’t work as is:

let collection_of_list : 'a list -> 'a collection =
  fun li -> object
    method length = List.length li
    method iter f = List.iter f li
    method fold f init = List.fold_right f li init
  end

Error: This expression has type
       < fold : ('a -> 'b -> 'b) -> 'b -> 'b; iter : ('a -> unit) -> unit;
         length : int >
     but an expression was expected of type 'a collection
     The universal variable 'c would escape its scope

OCaml can’t infer higher-rank polymorphism: we have to be explicit about it.

    method fold : 'b . ('a -> 'b -> 'b) -> 'b -> 'b =
      fun f init -> List.fold_right f li init
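
Putting the pieces together, the working implementation reads as follows (a sketch, using the class type above):

let collection_of_list : 'a list -> 'a collection =
  fun li -> object
    method length = List.length li
    method iter f = List.iter f li
    method fold : 'b . ('a -> 'b -> 'b) -> 'b -> 'b =
      fun f init -> List.fold_right f li init
  end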

Note that OCaml departs from ML here: this is not an ML type, but a System F type. The syntax 'b . ... is the OCaml syntax for ∀(b) ....

If you wanted an ML type, you would have a collection type parametrized by the fold accumulator type:

type ('a, 'b) folder = < fold : ('a -> 'b -> 'b) -> 'b -> 'b >

But this is very restrictive, as you couldn’t run two folds on the same collection returning different result types. In short, it wouldn’t work.
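
With the polymorphic fold method above, by contrast, the same collection can be folded at two different result types (a sketch):

let () =
  let c = collection_of_list [1; 2; 3] in
  (* first fold: 'b instantiated to int *)
  assert (c#fold (+) 0 = 6);
  (* second fold on the same object: 'b instantiated to string list *)
  assert (c#fold (fun x l -> string_of_int x :: l) [] = ["1"; "2"; "3"])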

Note that I didn’t directly show a inside a (→) type yet. It seems that I only used inside the set of methods of the collection class; in other words, a inside a product (record, objet) type. But the arrows come when we write a function using a collection:

let list_of_collection : 'a collection -> 'a list = ...

You don’t see it here, but there is a inside "collection". You can make it explicit:

let list_of_foldable
 : < fold : 'b . ('a -> 'b -> 'b) -> 'b -> 'b; .. > -> 'a list =
   fun c -> c#fold (fun hd tl -> hd::tl) []
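
As a small usage check (a sketch, relying on the fold-enabled collection_of_list above):

let () =
  assert (list_of_foldable (collection_of_list [1; 2; 3]) = [1; 2; 3])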

The important part is not that the ∀ is inside the object; the OCaml type system forces us to use ∀ only in method and record field types, but in full System F we could do something like:

list_of_foldable : ∀(α) (∀(β) < fold : (α→β→β)→β→β; .. >) → α list

It is slightly less precise, but equivalent if you're careful. The important part is that the ∀ is at the left of the arrow.

Those OCaml code examples obviously do not use the ML type system. The OCaml language has been extended with partial support for higher-rank polymorphism, as described in the paper Semi-Explicit First-Class Polymorphism for ML (1999), by Jacques Garrigue and Didier Rémy.

This work is semi-explicit in that we have to be explicit when we introduce higher-rank polymorphism, by giving the quantified type explicitly, but we do not have to give the type at elimination time; we only need to say that we eliminate something, without giving its type. The restriction of such types to record fields and methods is used to hide the elimination construct from the programmer, making the system simpler: it is implicitly embedded in the field access and method access operations.

This earlier work was indeed one of the inspirations for the MLF type system of Didier Rémy.

ST monad: polymorphism means no escape

This is one use of higher-rank polymorphism that I like very much: polymorphism can mean purity, or rather unobservable side effects. The original paper about the ST monad is Lazy Functional State Threads (1994), by John Launchbury and Simon Peyton Jones.

If you don’t know what a monad is, you should try to learn it now. I won’t explain it here, and ST is not a beginner-friendly monad. You will certainly learn more trying to understand monads than reading this section. My favourite introduction to monads (for now) is the beginning of the Monads for functional programming (PDF) (1992) by Philip Wadler.

ST is a monad used to express local side effects. Monads are designed to confine computations: result types stay in the monad, and side effects cannot escape from it. ST is a monad you can escape from, if you're nice enough.

Let's be more precise. ST is actually a family of monads indexed by a type parameter 's: ('s,'a) st is a computation in the ('s,_) st monad returning values of type 'a. The 's parameter is used by the side-effecting operations you allow in your ST monad. For example, if you design an ST monad for references:

type ('s, 'a) st_ref
val st_ref : 'a -> ('s, ('s, 'a) st_ref) st
val get : ('s, 'a) st_ref -> ('s, 'a) st
val set : ('s, 'a) st_ref -> 'a -> ('s, unit) st

The idea is that you link the type of the reference and the type of the st monad which uses it. You have a type ('s,'a) st_ref for references, and each use of a reference is a ('s,_) st computation, with the same 's. I think of 's as a phantom type, but you may find the names region or thread more evocative.

What’s the use of such 's-ification of memory accesses? Here is the reward — beware, this is not directly OCaml syntax, I’m using a System F type here:

val run_st : ∀(α) (∀(s) (s, α) st) → α

If a computation is polymorphic in the ST parameter s, it is allowed to escape the monad and return to the pure world, as if all its side effects, the mutations of references, had been forgotten.

The reason why it works is that « polymorphism means no escape »: if your code is polymorphic in s, the effects inside cannot escape; they will never be observable side effects. We will look at a few attempts at wrongdoing below.

First, here is some OCaml code implementing an ST monad:

module ST : sig
  type ('s, 'a) st
  val return : 'a -> ('s, 'a) st
  val bind : ('a -> ('s, 'b) st) -> ('s, 'a) st -> ('s, 'b) st

  type ('s, 'a) st_ref
  val st_ref : 'a -> ('s, ('s, 'a) st_ref) st
  val get : ('s, 'a) st_ref -> ('s, 'a) st
  val set : ('s, 'a) st_ref -> 'a -> ('s, unit) st

  type 'a secretive = { run : 's . unit -> ('s, 'a) st }
  val run_st : 'a secretive -> 'a
end = struct
  type ('s, 'a) st = 'a
  let return x = x
  let bind f m = f m

  type ('s, 'a) st_ref = 'a ref
  let st_ref, get, set = ref, (!), (:=)

  type 'a secretive = { run : 's . unit -> ('s, 'a) st }
  let run_st st = st.run ()
end

let (>>=) m f = ST.bind f m

An example of unobservable side effects: I create a reference, get its value, change it, get the value again, and return only the sum of the two values, forgetting the reference.

# let test =
    ST.run_st { ST.run = fun () ->
      ST.st_ref 1 >>= fun r ->
      ST.get r >>= fun before ->
      ST.set r 2 >>= fun () ->
      ST.get r >>= fun after ->
      ST.return (before + after)
    };;

val test : int = 3

You may have noticed that I use ∀(s) (unit → (s,'a) st) instead of directly ∀(s) (s,'a) st; this avoids OCaml value restriction issues and does not change the semantics (the thunk is run only once, immediately, in run_st).

Two examples of attempts to break unobservability, by returning the reference directly or by letting it escape through a global reference.

# let test_escape_1 =
    ST.run_st { ST.run = fun () ->
      ST.st_ref 1 >>= fun r ->
      ST.return r
    }
Error: This field value has type unit -> ('a, ('a, int) ST.st_ref) ST.st
  which is less general than 'b. unit -> ('b, 'c) ST.st

# let test_escape_2 =
    let hole = ref None in
    ST.run_st { ST.run = fun () ->
      ST.st_ref 1 >>= fun r ->
      hole := Some r;
      ST.return ()
    };
    !hole
Error: This field value has type unit -> ('a, unit) ST.st
  which is less general than 'b. unit -> ('b, 'c) ST.st

There is another situation where polymorphism is used as a tool to ensure purity: polymorphism over monads. A piece of code may use a monad m while actually being parametric in m, instead of committing to one particular monad. If the code is polymorphic over the monad, there are actually no effects: it is just pure code written in monadic style.

It is related to the ST polymorphism I described, but still quite different. For one, it cannot be expressed directly in System F: monads use higher-order polymorphism; they are not types but type operators. And it is quite easy to see why such code is pure: you just have to instantiate m := Id, the identity monad, to extract the underlying pure computation.
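
In OCaml, polymorphism over monads is naturally expressed with functors, since monads are type operators. Here is a hedged sketch (the MONAD signature, the Sum functor and the Id module are illustrative names of mine):

module type MONAD = sig
  type 'a t
  val return : 'a -> 'a t
  val bind : ('a -> 'b t) -> 'a t -> 'b t
end

(* This code is written against an arbitrary monad M: since it only knows M
   abstractly, it cannot perform any effect of its own. *)
module Sum (M : MONAD) = struct
  let rec sum = function
    | [] -> M.return 0
    | x :: xs -> M.bind (fun acc -> M.return (x + acc)) (sum xs)
end

(* Instantiating M with the identity monad extracts the underlying pure
   computation. *)
module Id = struct
  type 'a t = 'a
  let return x = x
  let bind f m = f m
end

module PureSum = Sum (Id)
let () = assert (PureSum.sum [1; 2; 3] = 6)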

I think the ST monad is more subtle: it is not pure, it is a secretly impure term. The polymorphism seals its secret, so that it can never be observed.

We may however think of an informal explanation similar to polymorphism over monads, where instantiating the quantified variable with the right term leads to purity.

Imagine that, instead of being just a phantom type, the type variable 's described the memory region where the reference is allocated. A ('s,_) st computation is a computation that is allowed to perform effects on all references allocated in 's, and no others. Now suppose our piece of code is polymorphic in 's; to run it, we only have to choose an arbitrary region 's0, or allocate one on the fly. The only way we could have side effects is if another piece of code in our program interacted with the same region 's0. But if we know that another piece of code uses a given region 's0, we can always instantiate the 's variable of the run_st function with a different region 's1. This is similar to how we handle bound names in syntactic term manipulations: we may always choose a fresh name that doesn't conflict with any other name in the program. Here we can choose a fresh memory region, making the mutations unobservable to the outside world, which doesn't know the region we have chosen.

Please note, finally, that the ST monad is not the end of it all, the grail of side-effect tamers. It nicely captures a specific usage pattern for local side effects, one that can be proved observably pure by the type system itself. Note that the type system used here is System F, not ML, as we crucially use the ∀(s) quantification on the left of the arrow in the type of run_st.

Some uses of references don't fit that pattern, and they need more complicated proofs of observational purity; we may be able to present them as well-typedness in more complicated type systems. In this direction, I know of the work of François Pottier, who has done quite a lot of work on regions for separating non-interacting effects, on capabilities to represent fine-grained ownership and sharing of regions, and on various rules attempting to model "hidden state", that is, unobservable local side effects. See for example the slides on Hidden state and the anti-frame rule (PDF) (2009), or the corresponding papers and other publications.

Existential types: locally hidden values

The two examples above were much more verbose than I had originally planned, so I will try to keep this one short. Besides, I haven't found any simple and practical example for this one, though I'm convinced the notion itself is of practical importance.

Existential types are used to hide parts of a type. For example, the type ∃(α) α*(α→string) represents the type of any value packed together with a printing function for it. The real type of the value is hidden by the new quantifier ∃, and the only thing we can do is print it. The interest of hiding parts of a type is that we can mix together types having different hidden parts. For example, we could never put two values of types bool*(bool→string) and int*(int→string) together in a list, but we can pack them as elements of ∃(α) α*(α→string) and put them into the same list.

You could define the type thunk α := ∃(β) β*(β→α) of suspended computations that ultimately return a value of type α. The advantage over the classic unit→α is that you don't have to allocate an extra closure to hide the β value. Existential types are useful for other low-level purposes, like typed compilation of closures, or even compilation to a typed assembly language. They also express hiding and modularity, and allow one to encode abstract data types and some object-oriented techniques dynamically.
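
In more recent versions of OCaml, such existential packing can be written directly with a GADT-style constructor; here is a hedged sketch of the thunk type above (the names thunk, force and the example are mine):

(* thunk α := ∃(β) β * (β → α): the environment type 'b stays hidden. *)
type 'a thunk = Thunk : 'b * ('b -> 'a) -> 'a thunk

let force : type a. a thunk -> a = function Thunk (env, f) -> f env

let t : int thunk = Thunk ((20, 22), (fun (x, y) -> x + y))
let () = assert (force t = 42)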

For an in-depth explanation of existential types and their use in typed closure conversion — which is roughly a technique to compile away all inner functions in a functional language — see the following course notes (2010) by Didier Rémy.

Existential types can be added to a type system. For example, the seminal paper Abstract Types Have Existential Types (PDF) (1988), by John Mitchell and Gordon Plotkin, uses a type system similar to System F, extended with existential types. You can easily see that in this case, the same distinction between ML-style and F-style quantification applies: you really need to pass ∃ inside arrows, as function parameters.

But the importance of System F goes further here: existential types can actually be encoded directly in System F, while this is not possible in ML. The idea is that the existential type ∃(α)T (here T is a metavariable representing a type expression that may contain occurrences of the type variable α) can be encoded as:

∀(β) (∀(α)T→β)→β

We choose β fresh, in particular it does not appear in T. This encoding has a natural explanation: the existential type ∃(α)T means, in some way, that « I have a precise type for α, but I won’t show it to you ». How can I prove that I have such a value for α, without showing it to you? Well, I propose that you give me a challenge: you choose any type β, and give me a computation that needs T to build a β. If I really have some α, I will be able to build your β and pass it back to you.
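
As an illustration, here is a hedged OCaml sketch of this encoding for the ∃(α) α*(α→string) example above, with the pair curried into two arguments and records with polymorphic fields standing for the quantifiers (the names printable, pack and print_it are mine):

(* The "challenge": for any hidden α, given the value and its printing
   function, build some result of type 'r. *)
type 'r printable_user = { use : 'a . 'a -> ('a -> string) -> 'r }

(* The encoding ∀(β) (∀(α) α→(α→string)→β) → β of ∃(α) α*(α→string). *)
type printable = { apply : 'r . 'r printable_user -> 'r }

(* Introduction: pack a value together with its printing function. *)
let pack v print = { apply = fun user -> user.use v print }

(* Elimination: the only thing we can do with the hidden value is print it. *)
let print_it p = p.apply { use = fun v print -> print v }

(* Values with different hidden types now live in the same list. *)
let () =
  assert (List.map print_it [ pack 42 string_of_int; pack true string_of_bool ]
          = ["42"; "true"])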

This encoding crucially uses the ∀(α) quantification on the left of the outer arrow: it cannot be done in ML.

You may have noticed that the encoding above, including its interactive explanation, seems related to double-negation translations. But one should be careful when making such comparisons, as the ∀ and ∃ quantifiers here are not the quantifiers of first-order logic.


Post Scriptum: I would like to thank rz0 for the proof-reading he consistently offers me for each blog article. I ashamedly use him as a spell checker, English grammar reference, and Master of Typography. His insightful questions and remarks also force me to improve and clarify my articles significantly.