In case anyone is wondering about why this project puts so much importance on things being inlineable, GPUs often don't support function calls or have an extremely shallow return stack.
In this way it reminds me of http://futhark-lang.org/ somewhat, though Futhark is explicitly targeted at producing extremely efficient GPU code.
As also described by the fourth to sixth paragraphs in the readme:
Abstraction by heap allocation is a dead end. It works moderately well on the current generation of computers where CPU is still the dominant driver of computation.
It cannot work for devices like GPUs and the rest coming down the line. Many of the most important computational devices of the future won't support heap allocation so an alternative is needed to draw out their full power. It is of absolute importance that a language for that task have excellent control over inlining. Inlining therefore must comes as guarantee in the language and be a part of the type system.
Inlining is a trade-off that expresses the exchange of memory for computation. It should be the default instead of heap allocating.
I guess whoever is wondering about the why hasn't read the introduction.
There are multiple perspectives to this. Without heap allocation during kernel execution GPUs might not even become the dominant driver of computation in the first place.
In this way it reminds me of http://futhark-lang.org/ somewhat, though Futhark is explicitly targeted at producing extremely efficient GPU code.