It's been a great many years since I've touched this stuff but two rules of thumb probably haven't changed in the meantime:
1. in general, algorithms tend to slow down on small problems in exchange for the asymptotic speedup -- so you probably don't want to use this algorithm for anything that fits in memory (or perhaps the solar system)
2. in specific, asymptotically fast matrix multiplication algorithms tend to be numerically unstable. So you probably don't want to use this algorithm unless you're working with infinite precision.
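To make point 1 concrete: practical implementations of fast matrix multiplication usually switch back to the ordinary algorithm below some block size, because the extra additions dominate on small inputs. Here's a rough sketch using classic Strassen (not the article's algorithm) with an arbitrary cutoff, for square power-of-two matrices:

```python
import numpy as np

def strassen(A, B, cutoff=64):
    """Strassen multiply for square power-of-two matrices.

    Below `cutoff` (a tuning knob, chosen arbitrarily here), fall back
    to ordinary multiplication -- on small blocks the extra additions
    cost more than the one saved multiplication per recursion level.
    """
    n = A.shape[0]
    if n <= cutoff:
        return A @ B
    h = n // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    # Seven recursive products instead of eight.
    M1 = strassen(A11 + A22, B11 + B22, cutoff)
    M2 = strassen(A21 + A22, B11, cutoff)
    M3 = strassen(A11, B12 - B22, cutoff)
    M4 = strassen(A22, B21 - B11, cutoff)
    M5 = strassen(A11 + A12, B22, cutoff)
    M6 = strassen(A21 - A11, B11 + B12, cutoff)
    M7 = strassen(A12 - A22, B21 + B22, cutoff)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])
```

In real BLAS-grade code the crossover point is found by benchmarking on the target hardware, and it is typically large enough that most everyday matrices never leave the base case.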
> 1. in general, algorithms tend to slow down on small problems in exchange for the asymptotic speedup -- so you probably don't want to use this algorithm for anything that fits in memory (or perhaps the solar system)
This is not necessarily true and depends very much on the specifics of the algorithm. Some improvements are very applicable at quite ordinary tensor sizes. For example, the Winograd algorithm is often a routine choice for 3x3xc convolutions.
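For reference, the 1D building block of that convolution trick, Winograd's F(2,3), computes two outputs of a 3-tap filter with 4 multiplies instead of 6; the 3x3 case nests this transform over both spatial dimensions. A minimal sketch (the matrices are the standard F(2,3) transforms; `winograd_f23` is my own illustrative name):

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two sliding-window dot products of 3-tap filter g over 4-tap input d,
    using only 4 elementwise multiplies (the Hadamard product below)."""
    return AT @ ((G @ g) * (BT @ d))
```

Note that the savings here come at fixed, small sizes, which is exactly why it sees routine production use.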
> 2. in specific, asymptotically fast matrix multiplication algorithms tend to be numerically unstable.
This should be investigated for each particular algorithm, but numerical stability is indeed a major concern for practitioners. I only skimmed the article, but I didn't notice any mention of it.
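The kind of investigation I mean is cheap to run yourself. Here's a sketch using one level of Strassen splitting (again, not the article's algorithm) in float32, measured against a float64 reference; the extra block additions and subtractions are where the rounding error creeps in:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512
A64 = rng.standard_normal((n, n))
B64 = rng.standard_normal((n, n))
C_ref = A64 @ B64  # float64 reference result

def strassen_1level(A, B):
    """One level of Strassen splitting; sub-blocks multiplied directly."""
    h = A.shape[0] // 2
    A11, A12, A21, A22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    B11, B12, B21, B22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]
    M1 = (A11 + A22) @ (B11 + B22)
    M2 = (A21 + A22) @ B11
    M3 = A11 @ (B12 - B22)
    M4 = A22 @ (B21 - B11)
    M5 = (A11 + A12) @ B22
    M6 = (A21 - A11) @ (B11 + B12)
    M7 = (A12 - A22) @ (B21 + B22)
    return np.block([[M1 + M4 - M5 + M7, M3 + M5],
                     [M2 + M4, M1 - M2 + M3 + M6]])

# Same inputs, low precision: compare both methods against the reference.
A32, B32 = A64.astype(np.float32), B64.astype(np.float32)
err_direct = np.abs(A32 @ B32 - C_ref).max()
err_strassen = np.abs(strassen_1level(A32, B32) - C_ref).max()
print(err_direct, err_strassen)
```

The gap grows with every additional recursion level, which is why the deep recursions that deliver the asymptotic speedup are also where stability bites hardest.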