Hacker News

This kind of cache (mapping a url to a constant image) seems to have little to do with that kind of cache (mapping a memory address to likely changing memory contents.)

Can you clarify why you think the problems are similar?



If you start from the premise that there are only two hard CS problems, then every hard CS problem must be a special case of one or both of them. So cache in the on-chip-memory sense gets conflated with disk storage, which gets conflated with distributed disk storage. That isn't necessarily bad (Smarty, for example, renders templates to disk and calls it a cache), but it does show off the problem of naming things. (That, and every variable named "data".)

Anyway, I imagine what the GP was getting at is this: if your data distribution isn't fully deterministic (e.g. hard drives get swapped in and out as they fail), you have to verify that any data-changing command (store, delete, whatever) actually propagated to a sufficient portion of the servers (in some cases all of them), and that newly added, changed, or rogue servers don't break that guarantee. The more apt term for this is consistency. Related is the CAP theorem, which says a distributed system can guarantee at most two of consistency, availability, and partition tolerance; in practice, during a partition you must choose between consistency and availability. Though it's often more interesting to talk about atomic operations, where transactions are or might be desired, and whether things get faster or slower as the data grows.
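A minimal sketch of the propagation problem described above, with entirely hypothetical names (Replica, quorum_delete): a delete is only declared successful once a write quorum of replicas has acknowledged it, so a failed or swapped-out server can't silently leave a stale copy behind.

```python
# Hypothetical sketch: propagate a delete to replica servers and require
# acknowledgment from a write quorum before declaring success.
# All names here are illustrative, not from any real library.

class Replica:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive
        self.store = {}

    def delete(self, key):
        if not self.alive:
            raise ConnectionError(f"{self.name} unreachable")
        self.store.pop(key, None)  # idempotent delete
        return True

def quorum_delete(replicas, key, quorum):
    """Return True only if at least `quorum` replicas acknowledged the delete."""
    acks = 0
    for r in replicas:
        try:
            if r.delete(key):
                acks += 1
        except ConnectionError:
            pass  # e.g. a failed drive: no ack from this replica
    return acks >= quorum

replicas = [Replica("a"), Replica("b", alive=False), Replica("c")]
print(quorum_delete(replicas, "image-123", quorum=2))  # True: 2 of 3 acked
print(quorum_delete(replicas, "image-123", quorum=3))  # False: dead replica never acks
```

With quorum = majority, two such operations always overlap in at least one replica, which is the usual starting point for reasoning about consistency here.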


They're similar in that a deletion is a change to the image. Once deletion is possible, the image is no longer constant, so it's the same basic problem again.

The two scenarios have entirely different constraints; that much is undeniable. But it's still an instance of the same basic problem: verifiably deleting the image means invalidating every cache of that image.
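To make that concrete, here is a toy sketch (all names hypothetical) of why "verifiably deleted" reduces to cache invalidation: the delete only holds once every cache holding a copy has been invalidated, and missing even one cache means the image is still served.

```python
# Hypothetical sketch: a delete is only verifiable once every cache
# holding a copy of the image has been invalidated.

class Cache:
    def __init__(self):
        self.entries = {}

    def put(self, url, image):
        self.entries[url] = image

    def get(self, url):
        return self.entries.get(url)

    def invalidate(self, url):
        self.entries.pop(url, None)

def delete_everywhere(origin, caches, url):
    origin.pop(url, None)
    for c in caches:
        c.invalidate(url)  # skip even one cache and the image survives
    return all(c.get(url) is None for c in caches)

origin = {"img/cat.png": b"\x89PNG..."}
caches = [Cache(), Cache()]
for c in caches:
    c.put("img/cat.png", origin["img/cat.png"])

print(delete_everywhere(origin, caches, "img/cat.png"))  # True
```

The hard part in a real system is of course that the set of caches isn't known or reachable like this, which is exactly the invalidation problem.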



