You don't see flicks used all that much because media encoding is incredibly proprietary in the space where it matters most: live broadcast. When you sell into broadcast, every broadcaster has petabytes of video in their own chosen format. You have to support EVERY format to sell broadly, and they're not going to let you transcode everything into your format. Using flicks lets you support as many combinations as possible with the least effort.
It matters very little when users are trained to tolerate slow transitions between videos, formats, etc.
It doesn't matter much for offline transcoding either, since there you can afford the more expensive calculation.
Even for live broadcast, how can a few integer instructions matter? If you have to compare timestamps on every audio sample, that's a few multiplications every 22 microseconds or so at 44.1 kHz (the common denominator can be computed once). Or am I completely off here?
Trying to encode 4K into AVC in real time using pure software encoders takes about 80 cores/hyperthreads. Then you need to meet the tight 70ns? timings of the SMPTE 2022-6/7 protocols (i.e. 12 Gbps without using jumbo frames because reasons).
It's also necessary for clip switching. If you want frame-accurate clip switching (i.e. show->ad->ad->show), you need consistent and precise pointers into your files.
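As a minimal sketch of what "consistent and precise pointers" could look like, the snippet below computes the cut points of a show->ad->ad->show sequence on a single integer timeline measured in flicks, even though the clips run at different frame rates. The playlist layout, clip names, frame counts, and the helpers `frame_duration_flicks` and `cut_points` are all assumptions made up for this example.

```python
FLICKS_PER_SECOND = 705_600_000  # definition of the flick: 1/705,600,000 s


def frame_duration_flicks(fps):
    """Duration of one frame in whole flicks for an integer frame rate."""
    duration, remainder = divmod(FLICKS_PER_SECOND, fps)
    if remainder:
        raise ValueError(f"{fps} fps does not divide evenly into flicks")
    return duration


def cut_points(playlist):
    """Return ([(clip_name, start_in_flicks), ...], total_length_in_flicks).

    `playlist` is a list of (name, fps, frame_count) tuples, a
    hypothetical layout chosen for this example.
    """
    points = []
    cursor = 0
    for name, fps, frame_count in playlist:
        points.append((name, cursor))
        cursor += frame_count * frame_duration_flicks(fps)
    return points, cursor


# Four ten-second clips: the show at 24 fps, the ads at 30 fps.
playlist = [
    ("show_part_1", 24, 240),
    ("ad_1", 30, 300),
    ("ad_2", 30, 300),
    ("show_part_2", 24, 240),
]
points, total = cut_points(playlist)
```

Every switch lands on an exact frame boundary of both the outgoing and incoming clip, with no rounding, because each frame duration is a whole number of flicks.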