Go 1.0.2, which is the newest stable release. I will see if I have some extra time to file a report.
Slightly off topic (but relevant to my comment above), do you know why this:
func TraverseInline() {
var ints := make([]int, N)
for i := range ints {
sum += i
}
}
should be slower than:
func TraverseArray(ints []int) int {
sum := 0
for i := range ints {
sum += i
}
return sum
}
func TraverseWithFunc() {
var ints := make([]int, N)
sum := TraverseArray(ints)
}
Seeing a huge difference here. With an array of 300000000 ints, TraverseInline() takes about 750ms, whereas TraverseWithFunc() takes 394ms. With gccgo (GCC 4.7.1), the different is much slighter, but there's still a difference: 196ms compared to 172ms. (These are best-case figures, I'm doing lots of loops.) Go is compiled, not JIT, so I don't see why a function should offer better performance.
I don't have any immediate insight for the difference in performance you're seeing (my hunch is that it's to do with where and how memory is being allocated), but I would suggest you ask about it on the Go-Nuts mailing list: https://groups.google.com/forum/?fromgroups#!forum/golang-nu...
The Go authors should be very receptive and informative on such an issue.
Not sure why the difference in performance between function and non-function, particularly considering this looks inline-able.
2 theories on why gcc is doing better. Maybe its unrolling the loop, and avoids a bunch of jumps. Alternatively, it could be using SSE, but this theory is a much less likely, because I would expect it to be 4 times faster not twice as fast. gc doesn't use SSE instructions yet.
I'm not seeing any difference between the two functions using the 1.0.2 x64 gc compiler. Here's the code I used (slightly modified to compile): http://play.golang.org/p/gcf0BOGncQ
If you want to dig deeper and are not averse to assembly, you can pass -gcflags -S to 'go build' and see the compiler output. Here's what I got for the above code:
The numbers are from Go 1.0 as that's what is available on Ubuntu Precise, my test box. (Getting similar numbers with 1.0.2 on a different box where I don't have gccgo.)
I realized that the first loop was using "i < len(ints)" as a loop test, which turns out to be more expensive than I thought, and the compiler doesn't optimize it into a constant (which would be expecting too much, I guess). After rewriting the test, the function call case is only slightly faster, although it is still significantly faster with gccgo.
Slightly off topic (but relevant to my comment above), do you know why this:
should be slower than: Seeing a huge difference here. With an array of 300000000 ints, TraverseInline() takes about 750ms, whereas TraverseWithFunc() takes 394ms. With gccgo (GCC 4.7.1), the different is much slighter, but there's still a difference: 196ms compared to 172ms. (These are best-case figures, I'm doing lots of loops.) Go is compiled, not JIT, so I don't see why a function should offer better performance.