I think the bias towards videos is because of the interactive and graphical nature of Smalltalk. For other programming languages, a simple text post with code snippets is usually enough, but with "environments full of awe and wonder" such as Squeak, I believe only video can fully capture what is going on.
Personally, I like watching good demo videos of systems I am not familiar with, and not everyone is familiar with Smalltalk. It's easy to set up TruffleSqueak on your machine if you'd like to experiment on your own. Apart from that, the README.md includes a list of blog posts and papers.
Anyway, I'd really like to know how we could improve and make it more "practical to get more info". Any thoughts or suggestions?
Thanks for the feedback! We have plans to create a tutorial soon and run it virtually during SPLASH/ECOOP'20 [1] (I have to check why the tutorial page is no longer public).
One of the problems with attempting to propagate Smalltalk culture is that when your interaction with the machine doesn't resemble interacting with a teleprinter, a wall of text becomes a poor medium for capturing it.
> I'd love to just hear more about your experience building this on top of Graal/Truffle.
There are lots of good resources for learning how to implement a language in Truffle. In addition to the official documentation and the GraalVM Slack, I often find myself looking at other GraalVM languages that are open source (e.g. Graal.js, SimpleLanguage, GraalPython). Also, the tooling available to language implementers is quite good (e.g. all debugging and profiling tools for Java, Truffle's language-agnostic tools, Ideal Graph Visualizer for analyzing Graal/Truffle graphs, Graal/Truffle command-line flags, ...).
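To make the "implement a language in Truffle" idea concrete, here is a minimal, dependency-free sketch of the AST-interpreter style Truffle is built around. The class names here are hypothetical; a real Truffle language would extend `com.oracle.truffle.api.nodes.Node` and use the Truffle DSL (e.g. `@Specialization`) so that Graal can partially evaluate the tree.

```java
// Hypothetical sketch of an AST interpreter, the interpreter style that
// Truffle languages are written in. Real Truffle nodes carry annotations
// and profiling state; this only illustrates the basic shape.
abstract class Node {
    abstract long execute();
}

class LiteralNode extends Node {
    private final long value;
    LiteralNode(long value) { this.value = value; }
    long execute() { return value; }
}

class AddNode extends Node {
    private final Node left, right;
    AddNode(Node left, Node right) { this.left = left; this.right = right; }
    long execute() { return left.execute() + right.execute(); }
}

public class AstDemo {
    public static void main(String[] args) {
        // Tree for the expression (1 + 2) + 3
        Node program = new AddNode(
                new AddNode(new LiteralNode(1), new LiteralNode(2)),
                new LiteralNode(3));
        System.out.println(program.execute()); // prints 6
    }
}
```

Under partial evaluation, Graal can inline all the `execute()` calls of such a tree into a single compilation unit, which is where Truffle's peak performance comes from.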
> Any interesting or surprising anecdotes?
Supporting a Smalltalk system on the GraalVM definitely comes with interesting challenges; here are a couple of examples:
- Truffle is designed for building AST interpreters. Squeak is based on the Smalltalk-80 specification, which includes a well-defined bytecode set. For compatibility, you want a bytecode interpreter for Smalltalk. We even wrote a paper about how to do this with Truffle: https://fniephaus.com/2018/icooolps18-graalsqueak.pdf.
- Implementing some core Smalltalk mechanisms (e.g. allInstances, thisContext, becomeForward:, ...) and making them work well with the Graal compiler.
- Smalltalk is not just a language, but also a programming system. TruffleSqueak uses AWT/Swing for rendering the UI on GraalVM, and SDL2 when AOT-compiled with native image. Running UI applications with the Graal compiler, however, can yield interesting results. See for yourself: https://www.youtube.com/watch?v=wuGVyzUsEqE.
- Saving the image without breaking compatibility with the OpenSmalltalkVM and other Smalltalk VMs.
- Most languages are file-based, so Truffle's APIs are designed for files. In Smalltalk, everything -- even code -- is an object.
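The bytecode-interpreter point above can be illustrated with a minimal, hypothetical stack-machine dispatch loop. This is not the Smalltalk-80 bytecode set (which is much richer); it only shows the interpreter structure that, per the linked paper, has to be shaped so that Truffle's partial evaluation can still see through it.

```java
// Hypothetical stack-machine bytecode interpreter. Smalltalk-80 defines a
// far richer bytecode set; this only demonstrates the dispatch-loop
// structure that a bytecode interpreter on Truffle has to work with.
public class BytecodeDemo {
    static final int PUSH = 0, ADD = 1, MUL = 2, RETURN = 3;

    static long run(int[] code) {
        long[] stack = new long[16];
        int sp = 0; // stack pointer
        int pc = 0; // program counter
        while (true) {
            switch (code[pc++]) {
                case PUSH:   stack[sp++] = code[pc++]; break;
                case ADD:    stack[sp - 2] += stack[sp - 1]; sp--; break;
                case MUL:    stack[sp - 2] *= stack[sp - 1]; sp--; break;
                case RETURN: return stack[--sp];
            }
        }
    }

    public static void main(String[] args) {
        // Bytecode for the expression (2 + 3) * 4
        int[] code = {PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, RETURN};
        System.out.println(run(code)); // prints 20
    }
}
```

The difficulty the paper addresses is that Truffle expects a tree of nodes, while a loop like this has data-dependent control flow (`pc`), which a naive partial evaluator cannot unroll.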
That bouncing atoms video just blew my mind. It took whole minutes, but it got so incredibly fast... Only to grind to a halt as soon as the workload changed just a little, haha.
I'd imagine you have a ton of bigger priorities, but since it's so hackable, I wonder if there are easy tools within Graal/Truffle to hit that peak performance sooner and stay more stable. I never expected it to stall so aggressively, probably worse than a full GC.
The Graal compiler is known to be slow in terms of warmup, but at the same time it provides great peak performance. However, as you can see in the video, partial evaluation triggers recompilation if the program is not "stable". And when you're interacting with an IDE, it takes quite some time for the IDE to become stable. Even worse, some things, like debugging sessions, will never be stable.
Of course, it doesn't make much sense to run your IDE at 60+ FPS. Squeak usually throttles the frame rate, which makes the performance cliffs harder to notice.
Nonetheless, the GraalVM team is very much aware of these problems and is actively working on them. We are just the first to visualize them in the form of an IDE. A couple of GraalVM releases ago, they introduced libgraal [1], which improved compilation times significantly. One idea to make this even better is to persist code compiled by the JIT, so that it can be reused the next time you run the program/IDE. Morphic, Squeak's UI framework, is unlikely to change and could be warmed up in advance.
Truffle team lead here. Yes, we are working full steam on improving warmup and the delays caused by going back to the interpreter in unexpected cases. Expect bigger improvements in this area soon.
That being said, we cannot do it all on the Truffle side without the help of the language implementation. Truffle languages speculate on certain aspects of the program data. If they do so, we need to deoptimize and invalidate the optimized code when that speculation is violated. So the stability of the language implementation really is an important factor. Questions like "do we need to speculate on this value being constant, and does it give us enough benefit to justify the deoptimization overhead?" need to be answered by the language implementation, not the Truffle framework. As far as I know, this has not been a priority for TruffleSqueak so far, but it might be in the future. So there are potentially future improvements on the TruffleSqueak side as well.
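The speculate-then-deoptimize pattern described here can be sketched in plain Java. In a real Truffle language this is expressed with `com.oracle.truffle.api.Assumption` and deoptimization of machine code; the class and field names below are hypothetical and the "deoptimization" is just a flag.

```java
// Dependency-free sketch of value speculation as used by Truffle languages.
// A real implementation uses Truffle's Assumption API, and violating the
// speculation throws away compiled machine code; here a boolean stands in.
public class SpeculationDemo {
    private boolean speculationValid = true; // stands in for a Truffle Assumption
    private boolean seeded = false;
    private long cachedValue;

    long read(long actualValue) {
        if (speculationValid) {
            if (!seeded) {                // first execution: record the observed value
                cachedValue = actualValue;
                seeded = true;
            }
            if (actualValue == cachedValue) {
                return cachedValue;       // fast path: the compiler may treat this as a constant
            }
            speculationValid = false;     // "deoptimize": speculation violated, invalidate
        }
        return actualValue;               // generic slow path, no speculation
    }

    public static void main(String[] args) {
        SpeculationDemo d = new SpeculationDemo();
        System.out.println(d.read(42)); // seeds the speculation
        System.out.println(d.read(42)); // stable: fast path
        System.out.println(d.read(7));  // violated: falls back to the generic path
    }
}
```

The trade-off discussed above is visible here: the fast path is only worth it if the value is usually stable, because every violation pays the (in reality, very expensive) invalidation cost.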
Well, it is necessary to expose this to language implementations to reach good performance. If we could reach the same performance otherwise, we would not expose it, as it makes implementing Truffle languages more complicated. Unfortunately, automating the specialization part is an unsolved research question for a method-based compiler (it deserves its own PhD). Trace-based meta-compilation approaches (e.g. PyPy) have an advantage here, but disadvantages in other areas.
> Only to grind to a halt as soon as the workload changed just a little, haha.
What really amazed me is that the second time the mouse moved over the toolbar, towards the end of the video, it lagged less and was able to recover to 200+ FPS much faster than the first time.
Correct, this indicates that recompilation occurred. The first time Graal compiled some of the UI machinery (which is all written in Smalltalk), no input events had been triggered in the IDE. Consequently, the partial evaluator removed those code paths from the compiled code. When we move the mouse over the window or start to click on UI elements, we cause recompilation, now with event handling compiled in. That's why these performance cliffs go away over time.
Would it help to, for example, run a full test suite on your project and use the compilation results of that as a sort of baseline? I remember that, a long time ago, C# had a feature like this.
I vaguely recall some mention, in one of the docs I read a while back, that you were having difficulties with the JIT causing noticeable pauses/latency in interactive environments like Morphic, in comparison to the OpenSmalltalkVM.
Am I recalling correctly, and if so is it still the case?
And more generally, in comparison to the OpenSmalltalkVM, what are the TruffleSqueak downsides? The upsides seem nicely obvious!
> I vaguely recall some mention, in one of the docs I read a while back, that you were having difficulties with the JIT causing noticeable pauses/latency in interactive environments like Morphic, in comparison to the OpenSmalltalkVM.
> Am I recalling correctly, and if so is it still the case?
GraalVM and TruffleSqueak have evolved quite a bit, but it is still the case. With libgraal, TruffleSqueak warms up much faster, and we were able to further improve the performance of TruffleSqueak. If you'd like to give it a try, it should be fairly easy to get started: https://github.com/hpi-swa/trufflesqueak#getting-started
> And more generally, in comparison to the OpenSmalltalkVM, what are the TruffleSqueak downsides? The upsides seem nicely obvious!
TruffleSqueak passes quite a lot of Squeak's SUnit tests (see [2]), but is of course not 100% compatible yet. Proper support for some plugins (e.g. FFI, OSProcess, ...) is still missing.
Apart from that, I'd say that TruffleSqueak has to live with the design decisions made in Truffle, but we can build on and reuse all GraalVM components (e.g. JIT, GC, ...). The OpenSmalltalkVM, on the other hand, is more flexible, but you have to implement everything from scratch.
Cool. That video of the bouncing atoms demo is fascinating.
It seems like a revisiting of the experience of the Self team between the second and third generation of Self VMs[1], though the underlying hardware might be just a little faster :) Do you think this tradeoff of peak performance vs. interactivity is inherent in Truffle's approach? Or can Truffle realistically aim for the best of both worlds?
One interesting aspect of the Self VM is that it can save the JIT generated native machine code together with the Self code in the image/snapshot, so that you can load a pre-warmed set of objects. Is that possible within the Truffle framework?
> It seems like a revisiting of the experience of the Self team between the second and third generation of Self VMs, though the underlying hardware might be just a little faster :) Do you think this tradeoff of peak performance vs interactivity is inherent in Truffle's approach? Or can Truffle realistically aim for the best of both worlds?
> One interesting aspect of the Self VM is that it can save the JIT generated native machine code together with the Self code in the image/snapshot, so that you can load a pre-warmed set of objects. Is that possible within the Truffle framework?
I completely agree. Mario Wolczko, who worked on the Self VM at that time, is part of the extended GraalVM team. So the knowledge and experience are there; it only has to be implemented. The GraalVM team is thinking about ways to support snapshotting and a couple of other things. Their profile-guided optimization [1] is probably the closest to that and definitely a step in the right direction. So I'd say it is possible to make the performance of GraalVM languages much more predictable, maybe to the extent that it no longer matters for interactivity.
> and we were able to further improved the performance of TruffleSqueak
Is there an overview somewhere which allows comparison with fig. 4 of your ACM 2019 publication? Anyway, in that publication, SOMns, which uses the same technology (as far as I understand), showed a completely different performance pattern; does this still apply, and what is/was the reason for this?
> Is there an overview somewhere which allows comparison with fig. 4 of your ACM 2019 publication?
We have set up a benchmarking infrastructure for performance tracking, but it is unfortunately not public (yet). Also, we switched from GraalVM EE/JDK8 to GraalVM CE/JDK11 a couple of months ago, which makes it hard to compare recent with older results. But what I can say is that performance on GraalVM CE/JDK11 is currently better than it was on GraalVM EE/JDK8 a year ago. I expect it to be faster on EE as well, but I'd need to measure by how much.
> Anyway in that publication SOMns using the same technology - as far as I understand - showed a completely different performance pattern; does this still apply, and what is/was the reason for this?
Correct. First, SOMns does not support a real Smalltalk system and lacks some crucial mechanisms (e.g. changing the class of an object). So TruffleSqueak must check the class of an object in quite a few cases, which is a slight, unavoidable overhead. More importantly, SOMns has some primitives that Squeak does not, for example for comparing strings. That's another reason why SOMns is so much faster in the JSON benchmark, for example. If you're curious, here's a screenshot of the benchmark results for a current TruffleSqueak and SOMns (note that they ran on different versions of GraalVM; I think you can somewhat compare the TruffleSqueak numbers to "GraalSqueak-CE" in fig. 4 of the paper):
> What is the main difference between this and other Smalltalk implementations besides the underlying GraalVM?
The implementation tries to be as compatible as possible. Apart from that, it's implemented using a language implementation framework (rather than from scratch), and written in Java (rather than in Smalltalk like the state-of-the-art OpenSmalltalkVM, or some low-level language).
> Is it that you can easily call to Java and JavaScript (and perhaps back)?
> In other words, Why TruffleSqueak?
We are using TruffleSqueak as a research platform for polyglot programming. But, of course, it can do a lot more. The Smalltalk programming system runs on the same level as all other languages and on top of GraalVM. This allows direct interaction from Smalltalk tools with the GraalVM runtime. In the "Live plots with ggplot2" demo [1], for example, we briefly show a visualization of the Graal compilation queue. Usually, you'd have to go through something like JVMTI, but TruffleSqueak has direct access to the runtime.
I guess an alternative answer to your question is "why not?" :)
How deep does the support go? Can I load an existing Squeak/Cuis/Pharo image? Or is it only code-compatible, in that I can copy code from supported languages?
As a Squeak or Cuis user, do I see any difference in the interface, or is it all under the hood, with additional APIs available that I can access from my code?
Yes, you can load recent, 64-bit, stock Squeak or Cuis images. Pharo is not (yet) supported. The user interface should be identical. For the polyglot API to work, some additional image-side code [1] needs to be loaded in your image.
It does work with Cuis, but not with Pharo. The reason for this is that TruffleSqueak makes some assumptions about the layout of certain objects (e.g. instances of Class). Those assumptions no longer hold on Pharo, as it has diverged from the specification. Also, Pharo heavily uses FFI, which is not properly supported yet. Support for Pharo is not a priority, but contributions are welcome.