The presentation makes a distinction between mid-stack and leaf inlining, and apparently it was only done on leaf calls before because this is less confusing in backtraces.
The point is, as usual, Go is trying to catch up to what everyone else has had for years. The use of non-standard terminology here is suspicious, and raises the question of whether the Go people are trying to hide the fact that they are playing catch-up.
Comments like these are so depressing. Someone does a bunch of work to improve the Go compiler and writes a presentation to share their approach (primarily so that other people working on Go can understand and extend it), and gives it all away for free. This is textbook open source citizenry, which should be applauded.
But instead, you come along and criticize them for being too specific with their terms (!!) and also accuse them of being deceptive. This is not a marketing exercise. There is no conspiracy here.
> writes a presentation to share their approach (primarily so that other people working on Go can understand and extend it)
Perhaps Ericson2314's gripe is that it's being posted (and upvoted) on HN, rather than just a Go-specific forum (e.g. reddit.com/r/golang), and by implication it's intended to be read by a more general audience.
Nothing about giving your work away for free or wanting to do good necessarily puts one above criticism for misuse (or useless invention) of terminology. It's not a marketing exercise, you are correct -- all the more reason to expect a narrow use of terminology.
If you're going to be pedantic, you might at least be strictly correct. There is no existing term that would distinguish it from Go's prior state, so there is no misuse nor "useless invention". Even if there were and you were correct, this would still be pedantry at its finest.
Why would correct use of terminology be "pedantry at its finest"?
What people are concerned about is intellectual dishonesty. The presentation does not go to much effort to show, for example, why the term "mid-stack inlining" needs to be introduced for Go. Do other languages not have this kind of inlining? Does comparing and contrasting other implementations just not matter?
> Why would correct use of terminology be "pedantry at its finest"?
Because you're debating terminology when the wrong term caused no one confusion, thereby detracting from the actual conversation.
> What people are concerned about is intellectual dishonesty.
There's no cause for this concern.
> The presentation does not go to much effort to show, for example, why the term "mid-stack inlining" needs to be introduced for Go.
Sure it does; see slides 3-6.
> Do other languages not have this kind of inlining? Does comparing and contrasting other implementations just not matter?
Perhaps the author omitted it from his deck because it's impractical to cover the whole breadth of inlining in a deck that's already 35 slides long. Perhaps he simply didn't think to include it. There are a lot of likely explanations for why this wasn't included besides nefarious motives. You sound paranoid.
No hiding or sneaking is necessary to explain anything here. They needed to come up with a way to describe the difference between how it is currently implemented and how they are changing it. That's all.
> The point is, as usual, Go is trying to catch up to what everyone else has had for years.
As far as programming languages go, Go is very new. Of course, there will be a number of areas that still need to be optimized or fully fleshed out. I think the Go compiler has only been written in Go for about a year. I'm sure you you find this pretty lame as well.
I don't think anyone is pretending that Go is a decades-old language. What is the standard term for this kind of inlining which would distinguish it from the inlining already present in the compiler?
"Go inlining improvements..." and then "Conventionally, inlining includes ... but Go has been limited to ... until recently. With ... Go now supports inlining in a broader array of circumstances."
Since I've learned about continuation-passing style (which Go channels could probably be formally transformed into), I've been convinced that there's a better way to do codegen. Better calling convention, better stack representation, better instruction architecture; I'm not yet sure. It nags continuously at the back of my mind, almost as though it's on the tip of my tongue. In this specific case, it must surely be possible to inline a continuation on some foreign architecture. I'd love to see some literature on the more experimental end of this stuff, if anyone has it.
That's an interesting read. One of only a handful of tech books I've read twice. It's thin and doesn't repeat itself much, so even reading it twice is faster than reading most tech books once.
More recently, though, I heard someone proved mathematically that CPS can be transformed one-for-one into one of the more conventional models. That doesn't mean it might not still be easier for humans to deal with, however.
I strongly doubt this.
It doesn't say this in the preso, and ...
1. Compilation is a lot slower, which usually comes from code growth, not debugging-info growth. If the compiler is that much slower from debugging-info growth, they have larger issues :)
2. Usually people do not include debug-info sizes in binary sizes, because DWARF et al. can be stripped and placed alongside the binary (i.e. it doesn't even have to be part of the binary)
It does say this. One of the last slides says 4% of the additional size came from adding more debugging information, excluding anything to do with the new inlining.
"If I'm understanding the increase is mostly due to the "debugging" info that is added, not necessarily due to more code."
vs
"One of the last slides says 4% of the additional size came from adding more debugging information, excluding anything to do with the new inlining"
So no, it doesn't say that it's "mostly due", it says ~25% is due to debugging information.
Increased sizes lead to less effective use of instruction caches. Different use cases will be impacted differently by that, so it's worth profiling your particular application, at least until the size overhead comes down.
What impact would that have on build times? I know a lot of work has gone into getting back to 1.4 build times, but would the added work of inlining prolong builds?
The presentation redacted the stats about how this affects Google performance. I bet it saves enough CPU hours to pay the author's salary many many many times over. Good job!
That said, it's a little disappointing when runtimes require custom algorithms or metadata to walk the stack and construct a stack trace. It makes it harder to build debuggers that grok the state of multiple runtimes (e.g., the Go code and the C code in the same program). This also affects runtime tracing tools like DTrace, which by construction can't rely on runtime support for help.
We plan to expose all of the inlining information in the DWARF tables so debuggers won't have any problems with this. Internally, the runtime uses a different representation just so we can make it more compact and optimized for the runtime's exact needs. This way, you can also strip the debug info without breaking the runtime's own ability to walk stacks.
Go already performs cross-package inlining, so it can already inline library calls. (This is relatively easy to do in Go compared to other languages because packages must form a DAG. Compiling package A writes out enough information in the object file for A that compiling package B that depends on A can inline calls to functions in A.)
> This is relatively easy to do in Go compared to other languages because packages must form a DAG.
That doesn't make sense to me. The complicated part is storing the IR in packages in a form that can be read back into the compiler later. That's needed to do inlining at all. Once you have that done, doing LTO is trivial: you just slurp your IR for all modules linked together into the compiler and emit a single binary. (You can be fancier, like ThinLTO does, but again, the effort needed to do ThinLTO is independent of whether you have cyclic dependencies or not.)
"Compiling package A writes out enough information in the object file for A that compiling package B that depends on A can inline calls to functions in A"
So it records the calling convention, architecture flags, alignment, and other ABI pieces etc?
As well as an estimate of instruction-level inlining cost, summary info about arguments, etc., so you can effectively decide whether inlining will help or hurt without having the IR around to try?
FWIW: Writing out the info is usually not the hard part, actually, and is unrelated to the DAG-ness of the packages.
GCC is just the perennial example here, but they refused to write it out for years for political reasons, not technical ones :)
"So it records the calling convention, architecture flags, alignment, and other ABI pieces etc?"
No. At the moment it records the AST in the object file, because the inliner works at the Go AST level. In the future it may instead record the SSA representation (which would obviously give better cost estimates; the current heuristics are really extremely simple).
"FWIW: Writing out the info is usually not the hard part, actually, and is unrelated to the DAG-ness of the packages."
The DAG-ness means it's always available when compiling the call site, even if it's a cross-package call. It means you don't have to do it at link time.
"In the future it may instead record the SSA representation (which would obviously give better cost estimates; the current heuristics are really extremely simple)."
This would be identical to what others do then :)
"The DAG-ness means it's always available when compiling the call site, even if it's a cross-package call. It means you don't have to do it at link time."
This is unrelated to DAG-ness.
Unless you mean something else by DAG-ness. DAG-ness means it's a directed acyclic graph; that is, it has no cycles.
This is unrelated to the problem.
For example, in other languages/etc., a function could be weakly defined, or overridable in some other form, regardless of whether the graph has cycles or the symbol is actually multiply defined. That requires link-time resolution, because you can't optimize it under an assumption about its callees/callers, or even inline it, and then just hand that version to others, because you may have screwed it up in a way one of those other callers depends on.
That is, the overridability is an attribute of the function, not a problem of how it is used.
Ditto on the ABI, alignment, etc.
None of the interesting problems they have to solve are related to packaging. They occur with DAGs or non-DAGs, are just related to these languages supporting a richer set of things you can do to functions :)
> The DAG-ness means it's always available when compiling the call site, even if it's a cross-package call. It means you don't have to do it at link time.
Why is it any harder to do at link time?
(I've implemented this in a production compiler, and choosing whether to do it at compile time or link time was a trivial decision.)
Traditionally, this required a linker that understands there is IR in the files.
In practice, I don't believe this has been a problem for many years now (and again, it was only a problem in the open-source world, so saying it's related to the language is kind of strange).
Every good production C++ compiler has had some form of link time optimization for many years.
IBM's, for example, has been happily cross-optimizing between C++, java, fortran, PL/IX, etc without any issues, going on at least 15, maybe 25+ years now (I know it's 15 for sure, i suspect it's closer to 25).
Go statically links to all Go libraries, and only dynamically links when interfacing with C code. (As of the last time I used it, a couple years ago; this may have changed since then.)
That doesn't really answer whether it can inline library functions. The fact that it statically links means it potentially could inline functions it finds in them, but I don't think it currently does, since this inlining appears to happen when compiling source.
It can inline library functions, since, at latest, 1.4.2 (this is just what I happened to have handy on this computer). It's pretty easy to test by writing a package containing a simple function like "func Square(x int) int { return x * x }", compiling a main package that imports that package and calls Square, and then disassembling the executable to see whether it calls Square or inlines it.