There is also a class at Stanford that builds up all the theory of the Kalman filter, starting from elementary probability [1]. The slides are all posted, and while they wouldn't be great to learn the material from, they're an excellent reference (and go into more depth on multivariate Gaussians and estimation theory than Probabilistic Robotics does).
[1] http://engr207b.stanford.edu/