This is just another way of saying "we don't have a compiler, so don't process data at the element level if you want speed".
It is not an advantage of the language, but a disadvantage.
If you do have a compiler, then that advice is only valid for aggressive, machine-specific optimizations. This is articulated by statements like "the library function is marginally faster than an open-coded loop, because it uses some inline assembly that takes advantage of vectorized instructions on CPU X".
If I write an FFT routine in C myself, it will not lose that badly to some BLAS/LAPACK routine.
Forcing your compiler to figure out that you're doing something trivial on a rank-n array is silly. So is writing out all the overhead and logic that goes into a for or while loop (where a typo can break things) instead of two characters: +/
I encourage you to try writing an FFT routine in C yourself and compare it to FFTW, whose authors basically wrote a compiler for generating FFTs. It's also worth writing the same computation in an interpreted language both array-wise and with a for-loop. You should get something like a factor of 100,000 speedup.
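If you want to try the interpreted-language half of that experiment, here is a minimal sketch in plain Python. The names loop_sum and array_sum are mine, made up for illustration, and the built-in sum is only standing in for +/ here; it is not an FFT and not a real array library. The ratio you see depends entirely on the interpreter and the machine, so I'm not going to quote a number in advance.

```python
import timeit


def loop_sum(values):
    """Element-level version: the interpreter executes every iteration."""
    total = 0
    for v in values:
        total += v
    return total


def array_sum(values):
    """Whole-array version: one call, the loop runs inside the built-in."""
    return sum(values)


# Hypothetical test data, just a list of integers.
data = list(range(1_000_000))

if __name__ == "__main__":
    t_loop = timeit.timeit(lambda: loop_sum(data), number=10)
    t_array = timeit.timeit(lambda: array_sum(data), number=10)
    print(f"loop: {t_loop:.3f}s  built-in: {t_array:.3f}s  "
          f"ratio: {t_loop / t_array:.1f}x")
```

Both functions return the same total; the only difference is whether the per-element work happens in interpreted bytecode or inside a single call into compiled code.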