Writing a Lisp, Part 5: If-Expressions

December 19, 2016

Last time we added environments to our Lisp. This time, we’re going to add if-expressions. If-expressions are an important part of a language, and a key piece in making our Lisp Turing complete.

If-expressions will be our first non-self-evaluating structure. That means we’ll need some additional plumbing to deal with them.

For example, we’ll need to introduce an eval_sexp function. eval_sexp will be the “Eval” in “Read-Eval-Print-Loop”. The final piece! Also notice that we are modifying repl to take into account environments. We do this because the environments will be necessary in the future. How else will, say, variables be evaluated?

let rec repl stm env =
  print_string "> ";
  flush stdout;
  let sexp = read_sexp stm in
  let (result, env') = eval_sexp sexp env in
  print_sexp result;
  print_newline ();
  repl stm env';;

let main =
  let stm = { chr=[]; line_num=1; chan=stdin } in
  repl stm Nil;;

We start off with the empty environment and in fact don’t end up modifying it anywhere. We’ll get there later.

eval_sexp of course doesn’t exist yet, so let’s make it. For now, we’ll say that Symbols, Fixnums, Booleans, and Nil all are self-evaluating – but note that in the future Symbols will be handled differently, as they are also used for variable names.

exception TypeError of string;;

let rec eval_sexp sexp env =
    let eval_if cond iftrue iffalse =
        let (condval, _) = eval_sexp cond env in
        match condval with
        | Boolean(true) -> iftrue
        | Boolean(false) -> iffalse
        | _ -> raise (TypeError "(if bool e1 e2)")
    in
    match sexp with
    | Fixnum(v) -> (Fixnum(v), env)
    | Boolean(v) -> (Boolean(v), env)
    | Symbol(v) -> (Symbol(v), env)
    | Nil -> (Nil, env)
    | Pair(Symbol "if", Pair(cond, Pair(iftrue, Pair(iffalse, Nil)))) ->
            eval_sexp (eval_if cond iftrue iffalse) env
    | _ -> (sexp, env)

So this is a fun function. Let’s take a walk through it step by step. First there’s a definition for eval_if – we’ll get back to that in a second. Skip it for now. Next, there’s all this nonsense so that the simple types can self-evaluate:

[...]
match sexp with
| Fixnum(v) -> (Fixnum(v), env)
| Boolean(v) -> (Boolean(v), env)
| Symbol(v) -> (Symbol(v), env)
| Nil -> (Nil, env)
[...]

Then we get to the really nasty-looking bit that evaluates if-expressions. This clause

[...]
| Pair(Symbol "if", Pair(cond, Pair(iftrue, Pair(iffalse, Nil)))) ->
        eval_sexp (eval_if cond iftrue iffalse) env
[...]

matches against the structure of the given s-expression. If-expressions should have the form (if bool-exp true-exp false-exp). If the condition is true, true-exp gets evaluated. If the condition is false, false-exp gets evaluated.

EDIT: Note that the way that if is implemented here is very much unlike how if is implemented in languages like C. In C, C++, Java, and others, there is no scope leak. That is, if you have a program like:

if (true) {
    int x = 5;
}

x (unless otherwise declared and defined) cannot be used outside of the if. In our Lisp implementation, it’s possible to do something like:

(if #t (val x 5) 8)

And then x can be used after the if-statement. If we wanted to emulate the C-like behavior, we could rewrite if as:

[...]
| Pair(Symbol "if", Pair(cond, Pair(iftrue, Pair(iffalse, Nil)))) ->
        let (expval, _) = eval_sexp (eval_if cond iftrue iffalse) env in
        (expval, env)
[...]

This would discard the environment returned by the if statement and instead use the original environment passed into the if.

/EDIT

If for some reason the condition is of the wrong type, raise an error:

[...]
let eval_if cond iftrue iffalse =
    let (condval, _) = eval_sexp cond env in
    match condval with
    | Boolean(true) -> iftrue
    | Boolean(false) -> iffalse
    | _ -> raise (TypeError "(if bool e1 e2)")
[...]

Last, if the expression is of any other form, just return it as-is and let it get printed. So, self-evaluating. For now.

Throughout all this, though, you’re probably wondering why we’re returning a tuple of the resulting expression and the resulting environment. Good thing to wonder. It turns out that some expressions that we’ll want to implement down the line should have side-effects. Like (val x 5), which binds a variable in the current scope. If we didn’t return the new environment with x in it, we’d have no way of capturing the changes.

Alright. Shall we take it for a spin?

$ ocaml 05_if.ml
> (+ 1 2)
(+ 1 2)
> #t
#t
> (if #t 3 4)
3
> (if #f 3 4)
4
> (if (if #t #f #t) 3 4)
4
> (if 3 4 5)
Exception: TypeError "(if bool e1 e2)".
$

Looks like it works and fails appropriately!

Download the code here if you want to mess with it.

Next up, primitive procedures.