one of the requirements for this kind of speed is for a language to be statically typed
No, one of the requirements for static optimization is that a language be statically typed. It does not follow that there are no ways to make dynamic languages fast, just that they mostly have to use different techniques.
Languages can be made fast along two dimensions, static and dynamic. A dynamically typed language can only be optimized along one of them, which is why dynamic languages usually trail statically typed ones in performance (those get optimized along both axes).
I've not seen anything like the dynamic-dispatch optimisations the Self team came up with in the 1990s done in C++, nor problem-domain optimisations expressed as macros as is often done in Lisps, nor the aggressive constant-propagation of closures and their eventual inlining that Factor does.
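As an illustrative sketch only (not Self's actual machinery, which operates on compiled code), the core idea behind inline caching is that a call site remembers the method lookup it did for each receiver type it has seen, so repeated calls on the same types skip the full dynamic lookup:

```python
# Toy polymorphic inline cache: one cache per call site, keyed on the
# receiver's type. The slow path (a full dynamic lookup via getattr)
# runs once per type; every later call on a seen type hits the cache.
class InlineCache:
    def __init__(self, name):
        self.name = name
        self.cache = {}      # receiver type -> looked-up function
        self.lookups = 0     # slow-path lookup count, for demonstration

    def call(self, receiver, *args):
        fn = self.cache.get(type(receiver))
        if fn is None:
            self.lookups += 1                       # slow path
            fn = getattr(type(receiver), self.name) # full dynamic lookup
            self.cache[type(receiver)] = fn
        return fn(receiver, *args)

class Circle:
    def __init__(self, r): self.r = r
    def area(self): return 3.14159 * self.r * self.r

class Square:
    def __init__(self, s): self.s = s
    def area(self): return self.s * self.s

site = InlineCache("area")
shapes = [Circle(1), Square(2), Circle(3)] * 1000
total = sum(site.call(s) for s in shapes)
print(site.lookups)  # 2 slow-path lookups for 3000 calls
```

A JIT like Self's goes further: once a site is monomorphic, it can inline the cached method body directly and guard it with a single type check.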
Perhaps some of them could be applied, but a highly complicated base language with extensive mutation semantics (including pointer aliasing) probably means they would be so limited in applicable scope, and so difficult to implement, that it's not an attractive activity for C++ compiler writers...
Do you know of Synthesis OS (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.29.4...)? The author made the OS specialize (and thus optimize) system calls at runtime (or something like that; it's been a while since I read the paper). The techniques might be applicable in languages higher-level than the assembly used there.
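A very rough sketch of the idea in a high-level language (this is my illustration, not Synthesis's actual mechanism, which generated specialized machine code): when a call path is set up with parameters that are now fixed, build a specialized fast path with those constants folded in, rather than re-checking them on every call:

```python
# Runtime specialization sketch: `make_read` plays the role of "open",
# which knows its parameters are now constant. It picks the branch once
# and closes over the constants, so the returned `read` fast path never
# re-tests `blocking` or recomputes the buffer limit.
def make_read(buffer, blocking, bufsize):
    if blocking:
        def read(n):
            n = min(n, bufsize)
            # ...would wait for data here, then copy n bytes (elided)...
            return buffer[:n]
    else:
        def read(n):
            n = min(n, bufsize)
            return buffer[:n] if buffer else b""
    return read

read = make_read(b"hello world", blocking=False, bufsize=5)
print(read(64))  # b'hello'
```

Synthesis did this at the machine-code level per file descriptor; the win is that the per-call branching and parameter validation happens once at specialization time instead of on every call.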
Yeah! Synthesis is quite excellent. I've been interested in exploring using the FORTH model of easily-accessible-runtime-compiler in that sort of context...