Microbenchmarks like this are hard to get right in practice, because gccgo can optimize calls to pure functions so aggressively that Go's benchmark loop never actually repeats the work. This is also a special case in which gccgo shines, precisely because the code is completely CPU-bound. In general the performance difference is nowhere near as large.
I've turned this code into a Go benchmark and run it with both the default cmd/compile (1.11) and gccgo (8.2.0).
So yeah, if you are doing heavy-weight math stuff in Go, you might consider switching to gccgo. You might get up to a 20x performance boost. Here is the gist, please tell me if I screwed up somewhere: https://gist.github.com/ainar-g/1bd363d41c441d9ebf05c0c0b9f2....
EDIT: After some disassembly and experimenting, it seems like what we see here is some clever unrolling and memoisation. If I change 46 from a constant to a variable, it becomes twice as slow. Plus there is this in disasm:
What puzzles me is why gcc doesn't do this for the C version, even if I add an explicit __attribute__((const)).
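For anyone who wants to repeat the constant-versus-variable experiment, here is a rough sketch of how it could be set up. It is not the code from the gist: `fib` is a hypothetical stand-in for the function under test (assumed here to be a naive recursive Fibonacci), and the names `constN`, `varN`, `sink`, `benchConst` and `benchVar` are made up for illustration. The argument is 30 rather than 46 only so that the program finishes quickly.

```go
package main

import (
	"fmt"
	"testing"
)

// fib is a hypothetical stand-in for the function being benchmarked.
// Assumption: a naive, side-effect-free recursive Fibonacci.
func fib(n int) int {
	if n < 2 {
		return n
	}
	return fib(n-1) + fib(n-2)
}

// constN is a compile-time constant, so a compiler is free to specialise
// or precompute calls such as fib(constN).
const constN = 30

// varN is a package-level variable, so its value is not a compile-time
// constant from the compiler's point of view.
var varN = 30

// sink receives every result so the calls are not trivially dead code.
var sink int

// benchConst calls fib with a constant argument.
func benchConst(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = fib(constN)
	}
}

// benchVar calls fib with an argument read from a variable.
func benchVar(b *testing.B) {
	for i := 0; i < b.N; i++ {
		sink = fib(varN)
	}
}

func main() {
	// testing.Benchmark runs a benchmark function without "go test".
	c := testing.Benchmark(benchConst)
	v := testing.Benchmark(benchVar)
	fmt.Println("constant argument:", c)
	fmt.Println("variable argument:", v)
}
```

Writing to `sink` and reading `varN` makes it harder for the compiler to throw the call away, but it is no guarantee: an optimizer that knows `fib` is pure may still hoist or fold it, so checking the disassembly is still the only way to be sure what was measured.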