"Lossless compression requires identifying the pattern that produced an input as perfectly as possible": no, it requires identifying it absolutely perfectly. This includes all vacuous information as well. E.g., in the Wikipedia example, if there were a 2d scatter plot of a sample from a bivariate uniform distribution, lossless compression would require "memorizing" all of the plotted points.

Predicting perfectly is very different from predicting well. Machine learning is about the latter, while lossless compression is about the former.



This is not true. If you have a good predictor, you only need a few bits to store each piece of information. One way is simply to record the places where your prediction is wrong. The ideal way is to split the set of possibilities so that exactly half of the probability mass falls on each side; every bit then tells you which side to go down.
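Here's a rough sketch of that halving idea (Shannon-Fano-Elias-style; the names and the toy distribution are illustrative, not any particular library's API). Each emitted bit halves the current interval until it fits inside the symbol's probability interval, so a symbol with probability p costs roughly -log2(p) bits:

    def encode(symbol, probs):
        # probs: dict mapping symbols to probabilities summing to 1.
        # Find the symbol's cumulative-probability interval [s_lo, s_hi).
        c = 0.0
        for s, p in sorted(probs.items()):
            if s == symbol:
                s_lo, s_hi = c, c + p
                break
            c += p
        target = (s_lo + s_hi) / 2   # any point inside the interval works
        lo, hi, bits = 0.0, 1.0, []
        # Each bit halves [lo, hi); stop once it sits inside [s_lo, s_hi).
        while lo < s_lo or hi > s_hi:
            mid = (lo + hi) / 2
            if target >= mid:
                bits.append(1)       # go down the upper half
                lo = mid
            else:
                bits.append(0)       # go down the lower half
                hi = mid
        return bits

    # "b" has probability 1/4, so it costs about -log2(1/4) = 2 bits:
    print(encode("b", {"a": 0.5, "b": 0.25, "c": 0.25}))  # -> [1, 0]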

So instead of using 64 bits to specify the x/y coordinates of every point on the plot, you could use a much smaller number to represent how far each point diverges from its predicted location. Each halving narrows down the possible locations the point could be in, so you only need enough bits to specify those remaining possibilities, not all of them.
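For instance, a minimal sketch assuming the predictor's error is bounded by some known max_error (residual_bits and max_error are made-up names for illustration):

    import math

    def residual_bits(actual, predicted, max_error):
        # Bits needed to store `actual` losslessly, given a prediction
        # guaranteed to be within `max_error` of it.
        residual = actual - predicted
        assert abs(residual) <= max_error
        # There are 2*max_error + 1 possible residuals, so ceil(log2)
        # bits are enough to pick out one of them.
        return math.ceil(math.log2(2 * max_error + 1))

    # A predictor accurate to within 7 units needs 4 bits per coordinate
    # instead of 64, since ceil(log2(15)) == 4.
    print(residual_bits(1_000_003, 1_000_000, max_error=7))  # -> 4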



