> The failure to store a representation of the parsed code was disastrous because it meant that in order to evaluate an expression or statement for a second time, the engine had to parse it a second time.
Ha! They must have learned how to write an interpreter from Herbert Schildt:
http://www.drdobbs.com/cpp/building-your-own-c-interpreter/1... [1989]
In this piece of amusement, the Little C program is a giant null-terminated string, and the "instruction pointer" is of type char *. You get the picture.
> When I did come to write a garbage collector I used the mark-and-sweep algorithm. But something puzzled me, and I couldn’t find an answer in any of the textbooks I looked at: how was I supposed to schedule the collections? In a classic description of a garbage collector, you wait until memory runs out and then you collect the world. But this leads to bad behaviour on modern systems, because of swapping, and because other processes need memory too. You need to schedule collections well before memory is full. But when exactly? I still don’t know of a comprehensive solution to this problem.
In a nutshell, you let your run-time pretend that it's running in a small machine until it comes too close to huffing and puffing, and then you say "hey, I lied, you're actually in a bigger machine: have some breathing room". This rubbery constraint keeps it behaving reasonably nicely, rather than "Wee, I have 4GB of free RAM to stomp over with strings and cons cells before ever calling GC!"
What you have to do is pick some heap size (that is typically substantially smaller than the amount of RAM). You let the GC whack against this artificial threshold, and if that gets too excessive, according to some measure, you increase it. E.g. if after a full GC you have less than some fudge threshold free, call the OS for more memory to integrate into the object heap.
The threshold is calculated in such a way that the image doesn't have to execute numerous frequent GCs before it triggers the request for more space (it doesn't whack too hard and wastefully against the artificial limit).
Also, ephemeral GC will help, and it can have its own threshold against frequent ephemeral collections. When an ephemeral collection doesn't liberate enough space, you schedule a full one. Then if that doesn't liberate enough space either, add more. Since ephemeral GC is fast (it doesn't scan the full heap), you can work a bit closer to the heap limit (you can tolerate frequent ephemeral GCs better than frequent full GCs).
And, of course, the parameters controlling these behaviors are exposed in some way so users can tweak them. Command line arguments, env vars, local config file, system config file, run time global variable/API, ...
Shell script interpreters like bash do. You can change a bash script while it's running (append, or truncate and re-write) and it'll keep going from the same offset.
But then again, shell script is in a different class of language I'd say.
Bash reads code from a stream, and doesn't read the entire stream before executing the code.
However, the stream pointer isn't its instruction pointer in the script. A backwards branch (such as the end of a while loop, or a call to a function defined earlier) does not rewind the stream to that piece of text.
> But then again, shell script is in a different class of language I'd say.
Not much different in this regard from how, say, Common Lisp (load "file.lisp") processes the top-level forms in the file.
(defvar *counter* 0)

(defun addmore ()
  (format t "hi ~s~%" (incf *counter*))
  (with-open-file (f "addmore.lisp" :direction :output :if-exists :append)
    (write-line "(addmore)" f)))

(addmore) ; kick off the self-appending loop
$ clisp addmore.lisp
hi 1
WARNING: OPEN: #<INPUT BUFFERED FILE-STREAM CHARACTER #P"addmore.lisp" @8>
already points to file "/home/kaz/test/addmore.lisp", opening the
file again for :OUTPUT may produce unexpected results
Open the file anyway
hi 2
WARNING: OPEN: #<INPUT BUFFERED FILE-STREAM CHARACTER #P"addmore.lisp" @9>
already points to file "/home/kaz/test/addmore.lisp", opening the
file again for :OUTPUT may produce unexpected results
Open the file anyway
hi 3
WARNING: OPEN: #<INPUT BUFFERED FILE-STREAM CHARACTER #P"addmore.lisp" @10>
already points to file "/home/kaz/test/addmore.lisp", opening the
file again for :OUTPUT may produce unexpected results
Open the file anyway
hi 4
WARNING: OPEN: #<INPUT BUFFERED FILE-STREAM CHARACTER #P"addmore.lisp" @11>
already points to file "/home/kaz/test/addmore.lisp", opening the
file again for :OUTPUT may produce unexpected results
Open the file anyway
hi 5