Here we are, here we are, with a fully-functional programming language. And
yet, aside from using
val, we have no good way of defining variables.
val has the side-effect of mutating its current environment –
something we would rather avoid (at least for now).
Most functional languages (SML, OCaml, Haskell, every single Lisp) have a
solution to this problem: they have a form of expression called
let that is
used to bind a name to a value only in a given expression. Here’s what it
looks like in two different languages:
(* OCaml *) let x = 4 in let y = 5 in y * x (* => 20 *)
; Lisp (let ((x 4) (y 5)) (* y x)) ; => 20
Let, as it turns out, is a supremely important construct in functional
programming. Without it, the programmer is required to use imperative
begin, which evaluates a bunch of expressions and then
returns the value of the last one:
(begin (val x 4) (val y 5) (* y x)) ; => 20
There’s a problem with that, though: it pollutes the existing environment with
the values of
y had other values beforehand… well,
sorry. They’re gone now. If you’re feeling clever, however, you could instead
lambda, like so:
(((lambda (x) (lambda (y) (* y x))) 3) 4) ; => 12
((lambda (x y) (* y x)) 3 4) ; => 12
If you guessed that using lambda-expressions to model
let was a good idea,
you’re sort of right!
You’re right because it certainly works, and is more elegant than
should have concerns, though, about the amount of stack space that your typical
lambda-expression uses. Right now, every single lambda-expression captures a
copy of the existing environment. That’s a lot of wasted space 1 —
let-expressions are used often.
Another thing you should be wondering about the choice in implementation: is there a difference between the two lambda representations above?
Not really, even considering exceptional cases like two variables with the same name. Let’s have a look:
$ ocaml 13_let.ml > (((lambda (x) (lambda (x) x)) 5) 7) 7 > ((lambda (x x) x) 5 7) 7 > Exception: End_of_file. $
Even though we don’t check for duplicate variable names in lambda-expressions,
the results end up the same — the inner (last)
x is the one that is most
recently bound, and therefore used when evaluating the expression.
Either way, what we’re going to do in our implementation is less clever — just add a new AST constructor and evaluate them differently from other types of expression.
Yes, this is “lame” because it adds more core interpreter features without “needing” to. But it saves some AST transform headache and gives some performance wins. 2
[...] and let_kind = LET | LETSTAR | LETREC and exp = | Literal of value | Var of name | If of exp * exp * exp | And of exp * exp | Or of exp * exp | Apply of exp * exp | Call of exp * exp list | Lambda of name list * exp | Let of let_kind * (name * exp) list * exp (* NEW! *) | Defexp of def
The weird bit is the
LETREC for now — I’ll
touch on them toward the end of this post.
Now of course we have to go and patch all the non-exhaustive warnings that come up.
string_exp, we’ll transform each binding into a string and then put them
let rec string_exp = let spacesep_exp es = spacesep (List.map string_exp es) in let string_of_binding (n, e) = "(" ^ n ^ " " ^ (string_exp e) ^ ")" in function [...] | Lambda (ns, e) -> "(lambda (" ^ spacesep ns ^ ") " ^ string_exp e ^ ")" | Let (kind, bs, e) -> let str = match kind with | LET -> "let" | LETSTAR -> "let*" | LETREC -> "letrec" in let bindings = spacesep (List.map string_of_binding bs) in "(" ^ str ^ " (" ^ bindings ^ ") " ^ string_exp e ^ ")" [...]
I’ve also gone ahead and rewritten the stringify function for
actually print the whole expression.
build_ast, we have to do some “heavy lifting” in transforming the list of
lists to a
(name * exp) list.
exception UniqueError of string let rec assert_unique = function |  -> () | x::xs -> if List.mem x xs then raise (UniqueError x) else assert_unique xs let rec build_ast sexp = [...] let let_kinds = ["let", LET; "let*", LETSTAR; "letrec", LETREC] in let valid_let s = List.mem_assoc s let_kinds in let to_kind s = List.assoc s let_kinds in match sexp with [...] | Pair _ when is_list sexp -> (match pair_to_list sexp with [...] | (Symbol s)::bindings::exp:: when is_list bindings && valid_let s -> let mkbinding = function | Pair (Symbol n, Pair (e, Nil)) -> n, build_ast e | _ -> raise (TypeError "(let bindings exp)") in let bindings = List.map mkbinding (pair_to_list bindings) in let () = assert_unique (List.map fst bindings) in Let (to_kind s, bindings, build_ast exp) [...]
I set up
let_kinds as an association list 3 and then make helper
valid_let, to check if an symbol is “let”, “let*”, or “letrec”;
to_kind to transform a symbol into its equivalent constructor) to use down
This code handles
letrec, converting them into their
Note that I’ve also added a uniqueness constraint on the bindings – each of
the names must be distinct from all others. This gets rid of some nastiness
that we looked at above. Since this is useful, I’ve also gone ahead and called
assert_unique in the
Let AST is, as it turns out, a very convenient form for
evaluation. We can take the ASTs we’ve built, evaluate them, and then put them
neatly into our current environment when evaluating
exp. Speaking of
evaluation, now we’ll handle the
Let case in
let rec evalexp exp env = [...] let rec ev = function [...] | Let (LET, bs, body) -> let evbinding (n, e) = n, ref (Some (ev e)) in evalexp body (extend (List.map evbinding bs) env) | Let (LETSTAR, bs, body) -> failwith "Not yet implemented" | Let (LETREC, bs, body) -> failwith "Not yet implemented" | Defexp d -> raise ThisCan'tHappenError in ev exp
The bit that’s annoying is the
value option ref type, so I’ve added
evbinding to simplify the code. I’ve also made sure that
let* raises an
error because we haven’t made it yet.
Let’s give it a whirl to see if it works:
$ ocaml 13_let.ml > (val x 4) 4 > (let ((x 2)) x) 2 > (let ((x 2) (y 4)) (* x y)) 8 > (let ((x 3) (x 4)) x) Exception: UniqueError "x". $ ocaml 13_let.ml > (let ((x 3) (y x)) y) Exception: NotFound "x". $ ocaml 13_let.ml > (let* ((x 4) (y x)) y) Exception: Failure "Not yet implemented". > (letrec () 'anything-here) Exception: Failure "Not yet implemented". $
Looks like our
Even though I did not specify that this is correct, the expression
(let ((x 3)
(y x)) y) should indeed be an error. This is because each binding expression
x in this case) is evaluated in the pre-let environment, in
isolation. Whenever I get around to formally specifying the behavior for this
language using operational semantics, this will become more clear.
But what if we want to refer to previous bindings? We can totally do that, and
let* comes in. With
let* each binding can refer to all
previous bindings, but no future ones. So it’s kind of like a bunch of nested
(let* ((x 5) (y x)) y) ; is equivalent to (let ((x 5)) (let ((y x)) y))
We’ve already implemented all of the plumbing for
let* except for evaluation.
Just like the implementation decision for
let, we could simply do an AST
transform and make it a bunch of nested
let. We’re going to make a parallel
let rec evalexp exp env = [...] let rec ev = function | Let (LET, bs, body) -> let evbinding (n, e) = n, ref (Some (ev e)) in evalexp body (extend (List.map evbinding bs) env) | Let (LETSTAR, bs, body) -> (* NEW! *) let evbinding acc (n, e) = bind (n, evalexp e acc, acc) in evalexp body (extend (List.fold_left evbinding  bs) env) | Let (LETREC, bs, body) -> failwith "Not yet implemented" | Defexp d -> raise ThisCan'tHappenError in ev exp
EDIT: Turns out there’s a glaring issue in the
let* implementation above.
See 16: Standard Library for corrections.
fold_left, we make sure that we evaluate each of the bindings in order,
and in the environment created by evaluating the previous bindings. Ahh,
functional programming… where would I be without you?
On another note, according to most of the implementations that I looked up when
writing this post,
let* does not normally have a distinct variable
requirement, but lets later bindings shadow previous ones. This allows for
imperative-esque things like:
; Not allowed in our interpreter: (let* ((x 5) (x (factorial x)) (x (sqrt x)) (x (to-string x))) (print x))
Since we handled distinct names at the AST building level and treated all the let-expressions as the same, ours will be a wee bit different. If you feel compelled to change the interpreter so that binding uniqueness is checked at evaluation-time instead of at AST-build-time, go right ahead. It shouldn’t be too much of a hassle or break any code we’re going to write.
Let’s test out
$ ocaml 13_let.ml > (let* ((x 4) (y x)) y) 4 > Exception: End_of_file. $
The third type of let-expression,
letrec, is like
let* but it allows a
binding to reference any (even the current binding!) — not just bindings that
precede it. The caveat, though, is that the values have to be
lambda-expressions. Otherwise we’d run into weird issues with topologically
sorting the variables so they can be properly recursively defined… bunch of
nonsense. It looks like this:
(letrec ((f (lambda (x) (g (+ x 1)))) (g (lambda (x) (+ x 3)))) (f 0)) ; => 4
(letrec ((factorial (lambda (x) (if (< x 2) 1 (* x (factorial (- x 1))))))) (factorial 5)) ; => 120
It’s what we should have used in the last post, instead of adding that ugly
let rec evalexp exp env = let evalapply f vs = match f with | Primitive (_, f) -> f vs | Closure (ns, e, clenv) -> (* vvvvvvv ugly!! wrong!! vvvv *) evalexp e (extend (bindlist ns vs clenv) env) [...]
So let’s revert
evalapply to its normal happy self:
let rec evalexp exp env = let evalapply f vs = match f with | Primitive (_, f) -> f vs | Closure (ns, e, clenv) -> evalexp e (bindlist ns vs clenv) [...]
…and get crackalackin’ on
letrec. First let’s modify
mkloc so that it can
take any argument and just returns a new empty (“unspecified”) reference:
let mkloc () = ref None (* OLD *) let mkloc _ = ref None (* NEW *)
We’re not doing this to save source code space. That would be silly. We’re doing this so that we can more easily use it in mapping functions without wrapping it in an anonymous function. You’ll see what I mean in a second:
let rec evalexp exp env = [...] let rec unzip ls = (List.map fst ls, List.map snd ls) in let rec ev = function [...] | Let (LETREC, bs, body) -> let names, values = unzip bs in let env' = bindloclist names (List.map mkloc values) env in let updates = List.map (fun (n, e) -> n, Some (evalexp e env')) bs in let () = List.iter (fun (n, v) -> (List.assoc n env') := v) updates in evalexp body env' | Defexp d -> raise ThisCan'tHappenError in ev exp
unzip is your standard unzip function. It takes a list of tuples and returns
two lists. The first list is all of the first elements and the second list is
all of the second elements. This implementation makes two passes over the list
but oh well. It’s readable.
letrec works, in neatly enumerated steps 4:
unzipto break apart the bindings given to
letrecinto lists of names and values.
mklocover the values (this is just
List.map mkloc values).
option refbox for them one by one.
letrecin the new environment with all of the values resolved.
At some point when I figure out a good way of making diagrams, or perhaps when I find a good enough one online, I will include a diagram here. Call that a TODO.
You’re probably wondering why not do the let-expressions post first and then do mutually recursive functions in the metacircular evaluator? Yeah, well. I didn’t think of that. That’s part of the fun of this series, I think. I’m a real human and I make mistakes. Why hide them?
I’ve fixed the metacircular evaluator now, and the results don’t look too shabby:
(define eval. (e env) (letrec ( (eval-cond. (lambda (c a) (cond ((null. c) 'error) ((eval. (caar c) a) (eval. (cadar c) a)) (#t (eval-cond. (cdr c) a))))) (map-eval. (lambda (exps env) (cond ((null. exps) '()) (#t (cons (eval. (car exps) env) (map-eval. (cdr exps) env)))))) ) (cond ((sym? e) (lookup. e env)) [...]
You can see we now define
map-eval. as lambda-expressions
eval., instead of as two mutually recursive auxiliary
functions. How wonderful!
Next up, comments.
They use a lot of stack space for now. Later on we’ll optimize lambda-expressions. I can hardly wait! (Seriously, it’s awesome.) ↩
There’s this idea of AST “lowering”, which means taking a more advanced AST
(like one with
let in it) and transforming it into a less advanced AST
(like one with all the
lets transformed into
lambdas). AST lowering is
one of many common compiler stages in the process of producing machine
code. There may even be multiple stages comprised completely of lowering
An association list is a linked list of key-value pairs. It can be used
much the same way as a hash table can (query, add, remove, etc), except for
the time complexity (
O(n) vs a hashtable’s
O(1)). In this case, though,
we’ve got an upper bound on the number of elements in our list — 3. So
technically still constant time! ↩