This was my very first 'big' haskell project! I had a lot of fun making it and afterwards I decided to start following compilers courses at my university!(https://github.com/arianvp/Haskeme).
I'd recommend anyone dabbling in their first days of haskell to take a try at this project. You will learn a great deal and most of all it's just extremely cool.
Was a lot of fun to build and I would do it totally differently if I would do it again. For example I would've added lexical scoping and handle variables with a Reader or State instead of IORefs . It's quite possible to make an almost fully pure implementation without using I/O for variables.
How would you implement mutable closures without using IORefs, though?
In Scheme the environment that is captured by a lambda expression is mutable; how would you accomplish that without IORefs? (Genuine question, as I'm currently implementing an R7RS Scheme interpreter in Haskell and was wondering this before I just went with IORefs.)
A variable that is mutable to Scheme doesn't need to be implemented as mutable in Haskell. Simulate mutable state in the usual way by passing the "before" state into functions and the "after" state out.
How would that work to model mutable closures?
During the evaluation of a Scheme lambda expression, the current environment needs to be captured somehow. Assuming the resulting value of the evaluated lambda expression is something like (Haskell pseudocode): (env, (Env -> [Value] -> Cont -> IO ()))
where 'env' is the captured environment. Further assuming this tuple is stored in the environment, how would you propagate a change to 'env' without it being an IORef and using writeIORef?
Immutability in functional programming is better understood as "the only time something can change is when a function is being called, which can be passed new arguments", rather than "nothing can ever change". This means that things can't change within the execution of a function. So when you produce the new Env, you pass it along to the next opcode's implementation, and it is just as if the Env changed from the point of view of the next opcode implementation. It just didn't change uncontrollably.
Whereas an imperative program works with a lot of loops, I imagine a functional program as passing through endless, endless stack frames, like https://youtu.be/77f0Yj9uMRA?t=2m5s . Stack frames, endless stack frames being created. (Tail-call optimization is often cited as important to avoid blowing out the stack. It's also important just for sheer performance, so we're not literally making calls for every function call, i.e., pushing registers and state.)
Haskell plays immutability to the hilt, as laziness is very difficult to imagine without it. Most other languages with immutability should probably instead have Rust-style mutation control, in my opinion, as it maintains almost all of the interesting properties of immutability in strict languages while being easier to program with, and easier to make high performance.
I'm sorry, I didn't quite convey what I actually wanted to know.
I'm aware of how to model mutable state by simply passing in and "returning" the state, that's not the problem.
Let me try to explain again, bear with me:
Assuming the evaluation function is
evalLambda :: Env -> LambdaExpr -> Proc
(I'm purposefully ignoring continuations and possible IO here. Also, a new Env is not returned because evaluating a lambda doesn't mutate the environment.)
The result of evaluating a lambda expression is a procedure that closed over its environment, so let's assume Proc is a function that has a copy of the environment passed to evalLambda. Note that it is not a function that takes an Env, as the closing-over happens when evaluating a lambda and not during its application.
Now, the interesting thing happens when the result of the lambda, the function, is stored somewhere, and after that a variable whose binding was closed over by evalLambda is mutated: The function evaluating the assignment/definition correctly produces an updated environment, but how will the stored Proc know? I can't see how this can be done without IORefs, honestly.
If you're talking about dynamic scoping then the cheapest way to pull that off is just to store the AST for your lambda, unevaluated, in the Env and evaluate it in the new Env which is available at call time.
It's all about controlling what evaluation occurs at what time. Here's another example: a language with dynamic and lexical bindings. You can see how the time when evaluation of the body of the lambda varies. I'm being very heavy-handed here, though—this only works for a pure language.
type Name = String
data Exp
= Var Name
| LexLam Name Exp
| DynLam Name Exp
| App Exp Exp
| Base
eval :: Exp -> Exp
eval = eval0 emptyEnv where
eval0 env e = case e of
Var nm -> lookup nm env -- throws exception on failure
Base -> Base
LexLam name body -> LexLam name (eval0 (add env name (Var name)) body)
DynLam _ _ -> e
App l r -> case (eval0 env l, eval0 env r) of
(LexLam name body, val) -> eval0 (singletonEnv name val) body
(DynLam name body, val) -> eval0 (add env name val) body
It's possible you just need to work with it for a while. IORefs can't do anything the State type can't do in a single-threaded execution context. Nothing prevents any evaluation strategy you propose that makes sense from being adopted by Haskell.
FWIW, I recognize your confusion as something I've had myself. I worked it out in Erlang, not Haskell, but it's the same principles. Again, ignoring threading, functional programming can do anything imperative programming can do with at most a log n penalty because at worst the functional program can simply simulate a raw expanse of memory in a tree and implement mutation on top of that. What you're asking about is ultimately much simpler than that but you may just have to work with it for a while. All I can really do is assure you that no, IORefs are completely unnecessary in this case.
And again let me emphasize my sympathies with your position and that I do not mean this harshly... I am recalling where I myself was in the past. Until I worked with it for long enough for it to click I'm not sure anyone really could have explained it to me. I'd been imperating for a long time.
It would, but this can be solved using the same mechanisms by introducing genuine lexical and dynamic environments. My implementation was just a suggestive hack.
"Note that it is not a function that takes an Env, as the closing-over happens when evaluating a lambda and not during its application."
The first part of this is your error. "Env" is the entire runtime environment we are modeling, which includes the values of any mutable variables at the point in time we are considering. Therefore, it must be (in Haskell space) a function of Env. That doesn't mean that it is (explicitly) a function of Env in Scheme space - you typically have no way of reifying Env in Scheme space to talk about that.
In Scheme `Syntax` is some kind of tree-like Sexpr thing or possibly a more well-defined AST. So let's focus on `Semantics` instead.
Ultimately, a perfectly fine semantics is `IO ()`. As Haskell's "sin bin", `IO` circumscribes all of the effects you need. So, the entirety of this game is to capture as much effect as you can "purely" so that it is reflected intelligently in the type.
The key pure semantic component for managing changing state is to focus on "changes" instead of points in state space. This we need a notion of the initial state (the empty env) and then a series of functions
Env -> Env
If we let `Semantics = Env -> Env` then we've already captured a meaningful chunk of Scheme semantics purely. We cannot observe Scheme side effects---they would have to be faked and thrown away---but we can observe how the env evolves. To be a bit more clear, this eval "respects" the syntax properly---if `seq :: Syntax -> Syntax -> Syntax` sequences two Sexprs then
e.g. sequencing of syntax is just function composition in the "Env transformer semantics".
So that's how we avoid IORefs.
From here we just build more. Another fun thing to add would be input and output. Here we note that any given Syntactic fragment no longer merely transforms Envs but may also either attempt to print out some text or want to read it in.
data Print e = Simple e | Read (Text -> e) | Write (Text, e)
type Semantics = Env -> Print Env
If you have many references to mutable closures I guess indeed it would be simplest to use IORefs or STRefs. However, you can simulate most of the functionality of IORefs and STRefs using a state monad so it is possible to avoid them, at least theoretically.
If you have never touched Haskell before, I would recommend reading the first 4 chapters of Learn You a Haskell[0] first. LYAH has a much more gentle introduction to Haskell, and really helped me grasp the basics.
After those chapters I would highly recommend working through Write Youself a Scheme, using LYAH as a reference when you need a more in-depth explanation of something. This is how I taught myself Haskell, but I realise that this approach might not work for everybody -- YMMV.
Use the html version, not the pdf that is available - the pdf has (or at least did recently) errors which will throw the unwary (of course, if you enjoy learning and working out the errors, go for it...).
Otherwise this is an extremely useful and informative book, and you will learn a lot about Haskell (possibly even Scheme too)
I'd recommend anyone dabbling in their first days of haskell to take a try at this project. You will learn a great deal and most of all it's just extremely cool.
Was a lot of fun to build and I would do it totally differently if I would do it again. For example I would've added lexical scoping and handle variables with a Reader or State instead of IORefs . It's quite possible to make an almost fully pure implementation without using I/O for variables.