The key feature is abstract iteration (functional maps, filters and folds) and implicit iteration, where operations "penetrate" to the items of a vector or matrix automatically, rather than explicit iteration such as a "for" loop.
Abstract iteration is useful because it results in programs with fewer "moving parts": no loop induction variables to misplace or mutate in the middle of a loop body. The tradeoff is that programs are necessarily expressed in a more rigid manner, and some irregular algorithms can be difficult to express.
Summing a vector with an explicit loop in K (very non-idiomatic!):
r:0;i:0; do[#v;r+:v@i;i+:1]; r
The equivalent using the "over" adverb:
+/v
Both examples perform the same calculation. The latter is more concise and easier to reason about.
Stuff like +/v is easier to reason about, but it's also handled by special code which runs a lot faster.
At some point I need to write a blog post on all the horrible things that happen inside your computer (cache misses, memory being swapped in and out, stacks popping) when you do an interpreted, explicit loop; bytecode, AST or whatever. There are R&D interpreters which claim to remove this overhead for trivial for loops, which are the main kind that end up getting used in numeric code, but none of them ever seem to make it into production (I'm sure someone will correct me if I am wrong; I am pretty sure Art wasn't doing this in K4, though he was probably best positioned to do so).
The real reason we like +/v, besides less typing: it can be handled with special code which runs close to the theoretical machine speed. There are lots of places and languages where this fact can be exploited. R and Matlab basically have a subset of operations you can do +/v type things with. APL is the main family of languages where this sort of thing is built into the semantics of the language. If you're dealing with numerics in an interpreted language, it should be built into the semantics of your language, and that's how you should do things. Really it should be in compiled languages too, and that's how people should reason about code, but that's probably asking too much, since APL has only been around since the 1960s...
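To make the "special code" point concrete: internally, +/ over a vector of floats can dispatch to something like the C sketch below. This is hypothetical (the actual K internals aren't public), but the shape is right: one type check at dispatch time, then a tight loop over contiguous memory that any decent C compiler will unroll and auto-vectorize. An interpreted user-level loop instead pays for bytecode dispatch, boxing and bounds checks on every single element.

/* Hypothetical sketch of the special-cased kernel an array interpreter
   might use for +/v on a vector of doubles: the type is checked once at
   dispatch time, then it's a tight loop over contiguous memory. */
#include <stddef.h>

double sum_doubles(const double *v, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += v[i];
    return acc;
}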
The obvious objection: this is just another way of saying "we don't have a compiler, so don't process data at the element level if you want speed".
It is not an advantage of the language, but a disadvantage.
If you have a compiler, the argument is only valid for aggressive, machine-specific optimizations, articulated by statements like "the library function is marginally faster than an open-coded loop, because it uses some inline assembly that takes advantage of vectorized instructions on CPU X".
If I write an FFT routine in C myself, it will not lose that badly to some BLAS/LAPACK routine.
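For reference, this is the sort of comparison that objection has in mind, sketched in C: an open-coded loop next to the standard CBLAS call (this assumes a CBLAS implementation is installed and linked, e.g. with -lcblas or -lopenblas). For something this simple, a decent optimizing compiler usually keeps the open-coded version within shouting distance of the library.

/* Sketch of "open-coded loop vs. tuned library routine".  Both compute
   a dot product; cblas_ddot is the standard CBLAS interface, typically
   backed by hand-vectorized code for the target CPU. */
#include <stddef.h>
#include <cblas.h>

double dot_open_coded(const double *x, const double *y, size_t n)
{
    double acc = 0.0;
    for (size_t i = 0; i < n; i++)
        acc += x[i] * y[i];
    return acc;
}

double dot_blas(const double *x, const double *y, int n)
{
    return cblas_ddot(n, x, 1, y, 1);  /* stride 1 on both vectors */
}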
Forcing your compiler to figure out you're doing something trivial on a rank-n array is silly. So is writing all the overhead and logic (where a typo can break things) which goes into a for or while loop instead of two characters: +/
I encourage you to try writing an FFT routine in C yourself and compare it to FFTW, where they basically wrote a compiler for doing FFTs. It's also worth doing in an interpreted language in an array-wise fashion versus with a for-loop. You should get something like a factor of 100,000 speedup.
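If you want to try it, a hand-rolled FFT looks something like the sketch below: a textbook recursive radix-2 Cooley-Tukey transform, power-of-two lengths only, no twiddle caching, no cleverness (compile with -lm). FFTW will still beat it comfortably, since they basically wrote a compiler for doing FFTs, but it's instructive to see how far a page of C gets you.

/* Minimal sketch of a hand-rolled FFT: textbook recursive radix-2
   Cooley-Tukey, in-place, power-of-two lengths only.  No twiddle
   caching, no scratch reuse, no attention to memory layout: exactly
   the naive thing FFTW exists to beat.  Error handling omitted. */
#include <complex.h>
#include <math.h>
#include <stdlib.h>

static void fft(double complex *x, size_t n)
{
    if (n < 2) return;

    /* shuffle: even-indexed entries to the front, odd-indexed to the back */
    double complex *odd = malloc((n / 2) * sizeof *odd);
    for (size_t i = 0; i < n / 2; i++) odd[i] = x[2 * i + 1];
    for (size_t i = 0; i < n / 2; i++) x[i] = x[2 * i];
    for (size_t i = 0; i < n / 2; i++) x[n / 2 + i] = odd[i];
    free(odd);

    fft(x, n / 2);          /* transform the even half */
    fft(x + n / 2, n / 2);  /* transform the odd half */

    /* butterfly: combine the two half-size transforms */
    const double pi = acos(-1.0);
    for (size_t k = 0; k < n / 2; k++) {
        double complex t = cexp(-2.0 * I * pi * (double)k / (double)n) * x[n / 2 + k];
        double complex e = x[k];
        x[k]         = e + t;
        x[n / 2 + k] = e - t;
    }
}

The FFTW side of the comparison goes through fftw_plan_dft_1d and fftw_execute; the interpreted-language side is just the array-wise built-in versus an element-at-a-time loop.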