Another class of these optimizations, even harder (for me) to spot, exploits matrix decompositions, multiplications, and the like. If you can manage to recast your problem as a linear algebra system, a fast BLAS can do wonders. And then selectively using some BLAS result (one that has side-effects, as the article defines them) is even harder to spot.
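To make the "recast it as linear algebra" idea concrete, here's a minimal sketch of my own (not code from the article, and the names are made up): the same pairwise dot-product table written as an explicit loop, and then as a single matrix multiply that NumPy hands off to whatever BLAS it was built against.

    import numpy as np

    def gram_loop(x):
        """Readable version: pairwise dot products, one at a time."""
        n = x.shape[0]
        g = np.empty((n, n))
        for i in range(n):
            for j in range(n):
                g[i, j] = np.dot(x[i], x[j])
        return g

    def gram_blas(x):
        """Same result, but one matrix multiply that goes to BLAS."""
        return x @ x.T

    x = np.random.rand(500, 64)
    assert np.allclose(gram_loop(x), gram_blas(x))

The second version is dramatically faster on any decent BLAS, but nothing about `x @ x.T` tells the reader it's "all pairwise dot products" unless they already think in those terms.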
Unfortunately, it's often just as hard to distinguish such code from magic when reading it.
I think this is a poor generalization. It's only valid if you assume that you can only ever do things in languages that do not have idiomatic features or implementations well-suited to the task at hand.
Is a well-structured and readable (to a C programmer) program written in C less readable than a more verbose and DSL-laden program written in Ruby or a purely functional (whether that's relevant to the solution or not) program written in Common Lisp to solve the same problem? It might be... depending on your background and how well those tools fit the problem.
"Optimization" doesn't have to start after the product is working. Choosing a platform that supports your requirements from the start, if that's an option you have, has the potential to significantly mitigate what we'd traditionally call optimization in the sense of "we built it, holy crap, now how do we scale it?"
The statement "more optimized = less readable" isn't true across the entire curve. But it is generally pretty accurate at the limit. Maybe not for every problem, but for the majority of problems. No language can be expressive for anything but a tiny slice of the space of effective optimizations.
One actually nice thing about non-portable optimizations is that the fallback implementation can act as an explanation of what the optimized code does. Clearly commenting optimized code is a difficult art.
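A small illustration of that point (mine, not from the thread; `popcount` is a made-up name): the fast path is gated on the runtime, and the readable fallback doubles as documentation of what the fast path computes.

    import sys

    def popcount(n: int) -> int:
        """Number of set bits in a non-negative integer."""
        if sys.version_info >= (3, 10):
            # Fast path: CPython 3.10+ exposes a native bit count.
            return n.bit_count()
        # Fallback doubles as the explanation: repeatedly strip the
        # lowest set bit and count how many times we can do it.
        count = 0
        while n:
            n &= n - 1
            count += 1
        return count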
I have fully taken the functional Kool-Aid. I like writing what I want, not how to get there.
But it also limits me to writing what I want. In some cases it's more concise to simply write how to get there, especially when you can accomplish two aims at once.
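A quick sketch of what I mean (my own example, hypothetical names): declaratively I'd write two independent reductions, but a single imperative pass gets both aims at once.

    from typing import Iterable, Tuple

    def sum_and_max_declarative(xs: list) -> Tuple[float, float]:
        # "What I want": two reductions, each obvious on its own,
        # but the data is walked twice.
        return sum(xs), max(xs)

    def sum_and_max_imperative(xs: Iterable[float]) -> Tuple[float, float]:
        # "How to get there": one pass accomplishes both aims,
        # and works even if xs can only be consumed once.
        total, biggest = 0.0, float("-inf")
        for x in xs:
            total += x
            if x > biggest:
                biggest = x
        return total, biggest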
Reminds me a bit of The Inner Product, which basically said that context is the biggest hammer in the programmer's toolbox, and that each layer of abstraction just adds more constraints.