Write Yourself a Scheme in 48 Hours

arianvanp · on March 27, 2015

This was my very first 'big' Haskell project (https://github.com/arianvp/Haskeme)! I had a lot of fun making it, and afterwards I decided to start taking compilers courses at my university.

I'd recommend that anyone in their first days of dabbling with Haskell have a go at this project. You will learn a great deal and, most of all, it's just extremely cool.

It was a lot of fun to build, though I would do it totally differently if I did it again. For example, I would add lexical scoping and handle variables with a Reader or State instead of IORefs. It's quite possible to make an almost fully pure implementation without using I/O for variables.

bitL · on March 27, 2015

Don't forget to make a Prolog as well. 48h should be sufficient as well ;-)

tpush · on March 27, 2015

How would you implement mutable closures without using IORefs, though? In Scheme the environment that is captured by a lambda expression is mutable; how would you accomplish that without IORefs? (Genuine question, as I'm currently implementing an R7RS Scheme interpreter in Haskell and was wondering this before I just went with IORefs.)

tome · on March 27, 2015

A variable that is mutable to Scheme doesn't need to be implemented as mutable in Haskell. Simulate mutable state in the usual way by passing the "before" state into functions and the "after" state out.
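A minimal sketch of that "before in, after out" style, assuming a toy environment stored as an association list. The names `Env`, `setVar`, `getVar`, and `incr` are made up for illustration and are not from the tutorial:

```haskell
-- Hypothetical toy environment: variable names mapped to integer values.
type Env = [(String, Int)]

-- "Mutation" returns a new environment; the old one is left untouched.
setVar :: String -> Int -> Env -> Env
setVar name val env = (name, val) : filter ((/= name) . fst) env

getVar :: String -> Env -> Maybe Int
getVar = lookup

-- An operation that both reads and writes: it takes the "before" state
-- and returns a result together with the "after" state.
incr :: String -> Env -> (Maybe Int, Env)
incr name env = case getVar name env of
  Nothing -> (Nothing, env)
  Just n  -> (Just (n + 1), setVar name (n + 1) env)
```

The caller decides which environment to pass on, so the "mutation" is visible only to code that receives the returned `Env`.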

tpush · on March 27, 2015

How would that work to model mutable closures? During the evaluation of a Scheme lambda expression, the current environment needs to be captured somehow. Assuming the resulting value of the evaluated lambda expression is something like (Haskell pseudocode): (env, (Env -> [Value] -> Cont -> IO ())) where 'env' is the captured environment. Further assuming this tuple is stored in the environment, how would you propagate a change to 'env' without it being an IORef and using writeIORef?

jerf · on March 27, 2015

You end up with something more like

    Env -> [Value] -> Cont -> IO (Env, Value)

Immutability in functional programming is better understood as "the only time something can change is when a function is being called, which can be passed new arguments", rather than "nothing can ever change". This means that things can't change within the execution of a function. So when you produce the new Env, you pass it along to the next opcode's implementation, and it is just as if the Env changed from the point of view of the next opcode implementation. It just didn't change uncontrollably.
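As a rough sketch of that threading, here is a pure analogue of the signature above with `Cont` and `IO` dropped for brevity. `Prim`, `setX`, `addX`, and `demo` are hypothetical names, and the hard-coded variable `x` is only there to keep the example short:

```haskell
type Value = Int
type Env   = [(String, Value)]

-- Pure analogue of the signature above (no Cont, no IO): an assumption
-- made to keep the sketch small.
type Prim = Env -> [Value] -> (Env, Value)

-- Roughly (set! x v): returns the updated Env alongside the value.
setX :: Prim
setX env [v] = (("x", v) : filter ((/= "x") . fst) env, v)
setX env _   = (env, 0)  -- wrong arity: leave Env alone (sketch only)

-- Roughly (+ x y): reads x from whatever Env it is handed.
-- An unbound x is treated as 0 here purely to avoid error handling.
addX :: Prim
addX env [y] = (env, maybe 0 (+ y) (lookup "x" env))
addX env _   = (env, 0)

-- Each call receives the Env produced by the previous call.
demo :: (Value, Value)
demo =
  let (env1, _)  = setX [] [5]
      (env2, r1) = addX env1 [5]
      (env3, _)  = setX env2 [10]
      (_,    r2) = addX env3 [5]
  in (r1, r2)
```

From the point of view of the second `addX` call, `x` has changed, even though no value was ever modified in place.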

Whereas an imperative program works with a lot of loops, I imagine a functional program as passing through endless, endless stack frames, like https://youtu.be/77f0Yj9uMRA?t=2m5s . Stack frames, endless stack frames being created. (Tail-call optimization is often cited as important to avoid blowing out the stack. It's also important just for sheer performance, so we're not literally making calls for every function call, i.e., pushing registers and state.)

Haskell plays immutability to the hilt, as laziness is very difficult to imagine without it. Most other languages with immutability should probably instead have Rust-style mutation control, in my opinion, as it maintains almost all of the interesting properties of immutability in strict languages while being easier to program with, and easier to make high performance.

tpush · on March 27, 2015

I'm sorry, I didn't quite convey what I actually wanted to know.

I'm aware of how to model mutable state by simply passing in and "returning" the state, that's not the problem.

Let me try to explain again, bear with me:

Assuming the evaluation function is

  evalLambda :: Env -> LambdaExpr -> Proc

(I'm purposefully ignoring continuations and possible IO here. Also, a new Env is not returned because evaluating a lambda doesn't mutate the environment.)

The result of evaluating a lambda expression is a procedure that closed over its environment, so let's assume Proc is a function that has a copy of the environment passed to evalLambda. Note that it is not a function that takes an Env, as the closing-over happens when evaluating a lambda and not during its application.

Now, the interesting thing happens when the result of the lambda, the function, is stored somewhere, and after that a variable whose binding was closed over by evalLambda is mutated: The function evaluating the assignment/definition correctly produces an updated environment, but how will the stored Proc know? I can't see how this can be done without IORefs, honestly.

tel · on March 27, 2015

If you're talking about dynamic scoping then the cheapest way to pull that off is just to store the AST for your lambda, unevaluated, in the Env and evaluate it in the new Env which is available at call time.

It's all about controlling what evaluation occurs at what time. Here's another example: a language with both dynamic and lexical bindings. You can see how the time at which the body of the lambda is evaluated varies. I'm being very heavy-handed here, though: this only works for a pure language.

    import qualified Data.Map as Map

    type Name = String
    type Env  = Map.Map Name Exp

    data Exp
      = Var Name
      | LexLam Name Exp
      | DynLam Name Exp
      | App Exp Exp
      | Base

    eval :: Exp -> Exp
    eval = eval0 Map.empty where
      eval0 env e = case e of
        Var nm           -> env Map.! nm -- throws exception on failure
        Base             -> Base
        -- lexical: body is evaluated now, in the defining env
        LexLam name body -> LexLam name (eval0 (Map.insert name (Var name) env) body)
        -- dynamic: body is left unevaluated until call time
        DynLam _ _       -> e
        App l r -> case (eval0 env l, eval0 env r) of
          (LexLam name body, val) -> eval0 (Map.singleton name val) body
          (DynLam name body, val) -> eval0 (Map.insert name val env) body
          _                       -> error "cannot apply a non-function"

tpush · on March 28, 2015

Still not quite my point :-)

What I'm talking about is modeling this aspect of Scheme's behavior:

  (define x 5)
  (define (f y) (+ x y))
  (f 5) => 10
  (set! x 10)
  (f 5) => 15

Scheme's lambda evaluates its body during application, but under the environment in which the lambda expression itself was evaluated.

Your LexLam evaluates its body with the right env, but at the wrong time: a subsequent mutation of env would not alter the result of application.

Conversely, DynLam evaluates its body at the right time (during application), but with the wrong env. Given this code:

  (define (f y) (+ x y))
  (let ((x 5)) (f 5)) => Error, x is unbound in f

Your DynLam would incorrectly evaluate to 10, right?

I just don't see how to model this behavior without IORefs :-)

jerf · on March 28, 2015

It's possible you just need to work with it for a while. In a single-threaded execution context, IORefs can't do anything the State type can't do. Nothing prevents you from implementing, in Haskell, any evaluation strategy you propose that makes sense.

FWIW, I recognize your confusion as something I've had myself. I worked it out in Erlang, not Haskell, but it's the same principles. Again, ignoring threading, functional programming can do anything imperative programming can do with at most a log n penalty because at worst the functional program can simply simulate a raw expanse of memory in a tree and implement mutation on top of that. What you're asking about is ultimately much simpler than that but you may just have to work with it for a while. All I can really do is assure you that no, IORefs are completely unnecessary in this case.

And again let me emphasize my sympathies with your position and that I do not mean this harshly... I am recalling where I myself was in the past. Until I worked with it for long enough for it to click I'm not sure anyone really could have explained it to me. I'd been imperating for a long time.
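One possible way to make the "simulate memory in a tree" idea concrete for the `(set! x 10)` example is a store-passing interpreter. This is only a sketch under stated assumptions: closures capture an immutable map from names to addresses, and a separate store from addresses to values (a `Data.Map`, which is a balanced tree) is threaded through evaluation. All of the names here (`Addr`, `Store`, `alloc`, `define`, `demo`) are invented for illustration:

```haskell
import qualified Data.Map as Map

type Addr  = Int
type Env   = Map.Map String Addr  -- lexical scope: names to addresses (never mutated)
type Store = Map.Map Addr Value   -- simulated memory: addresses to values (threaded)

data Expr
  = Num Int
  | Var String
  | Add Expr Expr
  | Lam String Expr
  | App Expr Expr
  | Set String Expr               -- (set! name expr)

data Value = IntV Int | Closure Env String Expr

-- Allocate a fresh cell. Using the store size as the next address is
-- only valid because this sketch never frees cells.
alloc :: Value -> Store -> (Addr, Store)
alloc v st = let a = Map.size st in (a, Map.insert a v st)

eval :: Env -> Expr -> Store -> (Value, Store)
eval _   (Num n)   st = (IntV n, st)
eval env (Var x)   st = (st Map.! (env Map.! x), st)  -- throws if unbound
eval env (Add a b) st =
  let (va, st1) = eval env a st
      (vb, st2) = eval env b st1
  in case (va, vb) of
       (IntV m, IntV n) -> (IntV (m + n), st2)
       _                -> error "+: expected numbers"
eval env (Lam p body) st = (Closure env p body, st)   -- capture addresses, not values
eval env (App f arg)  st =
  let (vf, st1) = eval env f st
      (va, st2) = eval env arg st1
  in case vf of
       Closure cenv p body ->
         let (a, st3) = alloc va st2
         in eval (Map.insert p a cenv) body st3       -- body runs at call time
       _ -> error "application of a non-procedure"
eval env (Set x e) st =
  let (v, st1) = eval env e st
  in (v, Map.insert (env Map.! x) v st1)              -- overwrite the cell

-- Top-level (define name expr): allocate a cell and bind the name to it.
define :: String -> Expr -> (Env, Store) -> (Env, Store)
define x e (env, st) =
  let (v, st1) = eval env e st
      (a, st2) = alloc v st1
  in (Map.insert x a env, st2)

-- (define x 5) (define (f y) (+ x y)) (f 5) (set! x 10) (f 5)
demo :: (Int, Int)
demo =
  let s0         = define "x" (Num 5) (Map.empty, Map.empty)
      (env, st1) = define "f" (Lam "y" (Add (Var "x") (Var "y"))) s0
      call       = App (Var "f") (Num 5)
      (IntV r1, st2) = eval env call st1
      (_, st3)       = eval env (Set "x" (Num 10)) st2
      (IntV r2, _)   = eval env call st3
  in (r1, r2)
```

The stored closure never needs to be told about the assignment: it holds the address of `x`, and each call looks that address up in whatever store it is handed. Because the closure carries its own `Env`, a caller's local `x` is not visible inside `f`, which matches the "x is unbound in f" case.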

tel · on March 28, 2015

I wrote a number of posts pertaining to that log n trick:

http://jspha.com/posts/mutable_algorithms_in_immutable_langu...

tel · on March 28, 2015

It would, but this can be solved using the same mechanisms by introducing genuine lexical and dynamic environments. My implementation was just a suggestive hack.

dllthomas · on March 28, 2015

"Note that it is not a function that takes an Env, as the closing-over happens when evaluating a lambda and not during its application."

The first part of this is your error. "Env" is the entire runtime environment we are modeling, which includes the values of any mutable variables at the point in time we are considering. Therefore, it must be (in Haskell space) a function of Env. That doesn't mean that it is (explicitly) a function of Env in Scheme space - you typically have no way of reifying Env in Scheme space to talk about that.

tel · on March 27, 2015

Typically you want to talk about a function like

    eval :: Syntax -> Semantics

In Scheme `Syntax` is some kind of tree-like Sexpr thing or possibly a more well-defined AST. So let's focus on `Semantics` instead.

Ultimately, a perfectly fine semantics is `IO ()`. As Haskell's "sin bin", `IO` circumscribes all of the effects you need. So, the entirety of this game is to capture as much effect as you can "purely" so that it is reflected intelligently in the type.

The key pure semantic component for managing changing state is to focus on "changes" instead of points in state space. Thus we need a notion of the initial state (the empty env) and then a series of functions

    Env -> Env

If we let `Semantics = Env -> Env` then we've already captured a meaningful chunk of Scheme semantics purely. We cannot observe Scheme side effects---they would have to be faked and thrown away---but we can observe how the env evolves. To be a bit more clear, this eval "respects" the syntax properly---if `seq :: Syntax -> Syntax -> Syntax` sequences two Sexprs then

    eval (seq sexpr1 sexpr2) = eval sexpr2 . eval sexpr1

e.g. sequencing of syntax is just function composition in the "Env transformer semantics".
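A small sketch of this "Env transformer" semantics, assuming a made-up `Syntax` type with only definitions and sequencing, and an association list standing in for `Env`:

```haskell
type Env = [(String, Int)]

-- Hypothetical syntax: just definitions and sequencing.
data Syntax
  = Define String Int
  | Seq Syntax Syntax

type Semantics = Env -> Env

eval :: Syntax -> Semantics
eval (Define name v) = \env -> (name, v) : filter ((/= name) . fst) env
eval (Seq s1 s2)     = eval s2 . eval s1  -- sequencing is function composition
```

Running a program is then just applying the resulting function to the empty environment, for example `eval prog []`.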

So that's how we avoid IORefs.

From here we just build more. Another fun thing to add would be input and output. Here we note that any given Syntactic fragment no longer merely transforms Envs but may also either attempt to print out some text or want to read it in.

    data Print e = Simple e | Read (Text -> e) | Write (Text, e)

    type Semantics = Env -> Print Env

and now we'd expect things like

    eval "(read)"         env === Read (\input -> env)
    eval "(write \"foo\")" env === Write ("foo", env)
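To show how such a `Print` value might be consumed, here is a sketch of a pure driver that feeds input from a list instead of doing real I/O. `String` stands in for `Text` to avoid extra packages, and `readInto`, `write`, and `runPure` are hypothetical helpers, not part of the comment above:

```haskell
type Text = String  -- assumption: String stands in for Text
type Env  = [(String, Text)]

data Print e = Simple e | Read (Text -> e) | Write (Text, e)

type Semantics = Env -> Print Env

-- Hypothetical semantics for two primitives.
readInto :: String -> Semantics  -- roughly (define name (read))
readInto name env = Read (\input -> (name, input) : env)

write :: Text -> Semantics       -- roughly (write "...")
write msg env = Write (msg, env)

-- Interpret one effect request purely: take input from a list and
-- collect any output. Nothing means a Read was requested with no input left.
runPure :: [Text] -> Print Env -> Maybe (Env, [Text])
runPure _     (Simple env)       = Just (env, [])
runPure (i:_) (Read k)           = Just (k i, [])
runPure []    (Read _)           = Nothing
runPure _     (Write (out, env)) = Just (env, [out])
```

An `IO`-backed driver could interpret the same three constructors with `getLine` and `putStrLn`, so the choice of effects stays outside `eval`.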

tome · on March 27, 2015

If you have many references to mutable closures I guess indeed it would be simplest to use IORefs or STRefs. However, you can simulate most of the functionality of IORefs and STRefs using a state monad so it is possible to avoid them, at least theoretically.

dllthomas · on March 27, 2015

A mutable variable in the hosted language is a getter and setter (or equivalently, a lens) on the hosted environment.
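A rough sketch of that view, assuming a toy association-list environment; `Ref`, `var`, and `modifyRef` are illustrative names only (a real implementation might use a lens library instead):

```haskell
type Env = [(String, Int)]

-- A mutable variable of the hosted language, seen from Haskell:
-- a way to read it out of an Env and a way to produce an updated Env.
data Ref = Ref
  { getRef :: Env -> Maybe Int
  , setRef :: Int -> Env -> Env
  }

-- Build the getter/setter pair for a named variable.
var :: String -> Ref
var name = Ref
  { getRef = lookup name
  , setRef = \v env -> (name, v) : filter ((/= name) . fst) env
  }

-- Read-modify-write expressed through the pair alone.
modifyRef :: Ref -> (Int -> Int) -> Env -> Env
modifyRef r f env = case getRef r env of
  Just v  -> setRef r (f v) env
  Nothing -> env  -- unbound variable: leave the Env unchanged
```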

wldlyinaccurate · on March 27, 2015

If you have never touched Haskell before, I would recommend reading the first 4 chapters of Learn You a Haskell[0] first. LYAH has a much more gentle introduction to Haskell, and really helped me grasp the basics.

After those chapters I would highly recommend working through Write Yourself a Scheme, using LYAH as a reference when you need a more in-depth explanation of something. This is how I taught myself Haskell, but I realise that this approach might not work for everybody -- YMMV.

[0] http://learnyouahaskell.com/chapters

airza · on March 27, 2015

for some reason the URL is messed up; remove the slash at the very end.

rasur · on March 27, 2015

Use the html version, not the pdf that is available - the pdf has (or at least did recently) errors which will throw the unwary (of course, if you enjoy learning and working out the errors, go for it...).

Otherwise this is an extremely useful and informative book, and you will learn a lot about Haskell (possibly even Scheme too)

chrisdew · on March 27, 2015

Has anyone merged Haskell and Scheme, such that I can write strongly typed Haskell code with the (lack of) Scheme syntax, and its macros?

(I hope that Template Haskell would make this possible.)

beering · on March 27, 2015

There is Typed Racket, which is not Haskell but still pretty cool.

S4M · on March 27, 2015

There is Shen[0], which has Haskell's type system with Lisp syntax, but I don't think it's used much. I haven't tried it myself.

[0] http://shenlanguage.org/

pseudonom- · on March 27, 2015

> Haskell's type system

Shen's type system is probably better described as just as expressive.

http://shenlanguage.org/learn-shen/types/types_sequent_calcu... gives a sense of the flavor.

S4M · on March 27, 2015

Thanks for the link. Out of interest, have you played with Shen yourself? Did you build something with it?

pseudonom- · on March 27, 2015

I've only messed around in the REPL. https://www.youtube.com/watch?v=lMcRBdSdO_U gives a fairly practical introduction.

Moyamo · on March 27, 2015

I've heard of something called Liskell and Lisk (two different projects). I haven't tried either of them, so I don't know if they are usable.

akavel · on March 27, 2015

Does anybody know of a good free Haskell tutorial where the theme is "writing a game"?

cosarara97 · on March 27, 2015

> Wikibooks does not have a page with this exact name.

?

cosarara97 · on March 27, 2015

Oh, http://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_H... (without slash at the end)

sctb · on March 27, 2015

Thanks, we updated the link.