Distributed systems are suited to static content or "append-only" mutable data -...

blurbleblurble · on Oct 1, 2017

Well of course it depends a lot on your specific application.

Applications like Tox or Matrix (which uses servers, but not necessarily "centralized" servers) are great examples of dynamic p2p applications.

Or, for example, applications that use statically distributed JavaScript to facilitate dynamic p2p communications: stuff like together.js, gun.js, freedom.js, etc.

Syncthing and Resilio Sync are also wonderful examples, and Resilio Sync has amazing encryption features: you can give out seed-only links to your data. People who use these links won't have permission to decrypt the content. They will only have permission to echo it. That's a "raspberry pi plugged into the wall at the coffee shop" solution to private, mutable content distribution.

As for the shopping cart example, this is something that could be conducive to a more centralized approach, especially if your physical distribution model is centralized and your payment system is centralized (traditional banks). In that case, you'd want to have a more direct connection with the physical distributor. If you want a direct, instantaneous connection with the shopping cart company's servers, then that's what you need.

But it's possible to have a situation where your product is not physical (like music or video), and you are using a decentralized currency (like bitcoin). There's absolutely no reason you couldn't facilitate that in a completely distributed way.

By the way, Bitcoin is a banking app... Have you ever used a browser based cryptocurrency wallet? Imagine a browser based cryptocurrency wallet that's hosted on IPFS. That's a pretty distributed banking app. If you want privacy too, use zcash or monero.

carolc · on Oct 1, 2017

See the comment below https://news.ycombinator.com/item?id=15376665. It is not true that distributed systems are only good for static content or "append-only" data. "Mutable systems" can be built on top of immutable systems.

vog · on Oct 1, 2017

I agree with you, but your argument is deeply flawed. There's quite a far stretch between "Y can be built on top of X" and "X is good for Y".

To provide an argument that might fill this gap:

Most systems don't actually have a huge amount of data. Look at the data size and data growth of CRMs, special-purpose wikis, and so on: these are mostly smaller than 500 MB (excluding static content like images), and grow by less than 1 MB even on a busy day. And that's the uncompressed size.

Also, most systems, despite being mutable, actually want (or need) an audit trail. So these are really append-only systems which merely have a "mutable look and feel" to the user.
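To illustrate the "mutable look and feel" point, here is a minimal Python sketch of a store whose only write operation is an append, with the current state derived by replaying the log. The names (`Event`, `AppendOnlyStore`, `current_state`, `history`) are hypothetical and not taken from any system mentioned in this thread.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass(frozen=True)
class Event:
    """One immutable log entry. A value of None marks a deletion."""
    seq: int
    key: str
    value: Optional[Any]


@dataclass
class AppendOnlyStore:
    """Hypothetical store: writes only append, reads replay the log."""
    log: list = field(default_factory=list)

    def put(self, key, value):
        self.log.append(Event(len(self.log), key, value))

    def delete(self, key):
        self.log.append(Event(len(self.log), key, None))

    def current_state(self):
        """The 'mutable' view the user sees: last write per key wins."""
        state = {}
        for event in self.log:
            if event.value is None:
                state.pop(event.key, None)
            else:
                state[event.key] = event.value
        return state

    def history(self, key):
        """The audit trail for one key, oldest entry first."""
        return [event for event in self.log if event.key == key]


store = AppendOnlyStore()
store.put("phone", "555-0100")
store.put("phone", "555-0199")   # looks like an update, is really an append
print(store.current_state())
print(len(store.history("phone")))
```

Replaying the whole log on every read is only reasonable at the small data sizes described above; a real system would presumably cache or index the derived state.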

carolc · on Oct 1, 2017

Agreed that many CRMs etc. don't have a lot of data. And that's actually good, it makes the database size very manageable in the context of trustless, distributed networks.

I'm not following the logic of the argument here though, jumping from "X is good for Y" to "...don't actually have huge amount of data", perhaps you can elaborate?

With a Merkleized append-only log (immutable DAG), there's always an audit trail. I agree with your point about "mutable look and feel": in a lot of use cases there's only a limited set of "writers" and updates happen infrequently.
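As a rough sketch of why such a log always carries an audit trail, the Python below builds a linear hash chain, which is the simplest special case of a Merkle DAG (each entry has exactly one parent). It is not the data structure of any particular project; the function names and the entry layout are assumptions made for illustration.

```python
import hashlib
import json


def entry_hash(prev_hash, payload):
    """Hash an entry together with the hash of its predecessor."""
    data = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append(log, payload):
    """Return a new log with one more entry; the old log is left untouched."""
    prev_hash = log[-1]["hash"] if log else None
    entry = {
        "prev": prev_hash,
        "payload": payload,
        "hash": entry_hash(prev_hash, payload),
    }
    return log + [entry]


def verify(log):
    """Check that every entry links to its predecessor and matches its hash."""
    prev_hash = None
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = append([], {"op": "set", "key": "owner", "value": "alice"})
log = append(log, {"op": "set", "key": "owner", "value": "bob"})
print(verify(log))
```

Because each hash covers the previous one, rewriting an old entry changes every later hash, so peers who only remember the latest hash can detect tampering with history.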

Perhaps I should rephrase my previous comment, then, as "immutable systems are good for building mutable systems on top". Does that help to provide a better counter argument?

vog · on Oct 1, 2017

Here's my complete line of reasoning:

You can build mutable systems on top of immutable (append-only) systems. But is that a good idea? Yes, it is, for systems which don't have huge amounts of (non-static) data, and/or systems which need an audit trail anyway. And these are more systems than one may initially think.

carolc · on Oct 1, 2017

"Here's my complete line of reasoning: You can build mutable systems on top of immutable (append-only) systems. But is that a good idea? Yes, it is, for systems which don't have huge amounts of (non-static) data, and/or system which need an audit-trail anyway. And these are more systems than one may initially think."

I disagree that immutability is a negatively defining factor here re. data size or capabilities of the database.

If you look at how many Big Data systems process data, you'll find that at the core of many is an append-only log. For example: Kafka is a log (https://engineering.linkedin.com/distributed-systems/log-wha...), and looking at Apache Samza's architecture, we can see how a log is at the core of it (https://www.confluent.io/blog/turning-the-database-inside-ou...). In less Big Data orientated databases, there's always a log of operations (sometimes also called a transaction log or replication log) to keep track of changes.

blurbleblurble · on Oct 1, 2017

I think git is a great example of bridging the mutable/immutable gap. The "mutable" stuff happens locally in RAM, or on a local filesystem, as someone edits their files, debugs, whatever. A commit represents a save checkpoint. Somebody has decided that this state is worth snapshotting, that it would be a useful reference down the line. At this point an immutable version is made, ready to be shared.

As with git, even if a version (commit) is immutable, it doesn't mean it's worth saving. Lots of times, you might make a temporary branch locally to do some work. Then you'll merge it and push the merged version upstream. Later you might check out a new copy from upstream, not caring that your temporary working branch isn't there.

User friendly versioning is a major challenge for dynamic, distributed applications. How do we gracefully bridge the gap between long term (distributed) memory and short term (local) memory? Each specific application has its own needs and tradeoffs.

And how do applications communicate about which versions are compatible with the applications' needs? About which versions are worth holding onto?

userpass · on Oct 1, 2017

I don't really get it. Sure, it's fine if one p2p app uses 3 GB of data (1 GB for the append-only log, 2 GB for a database with indices that can actually be queried). What if you have several apps? Let's say 10. Then you need 30 GB, and because people only have 32 GB to 64 GB of storage on their phones, the discussion ends right here.

blurbleblurble · on Oct 1, 2017

I didn't downvote you. But your data sizes are arbitrary.

Why would something like a chat or email app need to hang onto that much history?

Imagine a distributed "email" app that uses networks of mutually trusted peers to deliver encrypted messages ("emails") asynchronously. My device doesn't need to hang onto your emails indefinitely. It only needs to hang onto them until they've been received. This could be done via explicitly sending receipts, or probably in most cases by giving stuff simple expiration dates. The sender would have the most incentive to hang onto the original message until it's been delivered.
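The retention idea in the previous paragraph could be sketched roughly as follows in Python. `RelayStore` and its methods are hypothetical names invented for this example, and the `now` parameter exists only so the behaviour can be shown without waiting for real time to pass.

```python
import time


class RelayStore:
    """Hypothetical peer-side buffer: hold a message until it is
    acknowledged by the recipient or its expiration time passes."""

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._held = {}  # message_id -> (ciphertext, expires_at)

    def hold(self, message_id, ciphertext, now=None):
        now = time.time() if now is None else now
        self._held[message_id] = (ciphertext, now + self.ttl_seconds)

    def acknowledge(self, message_id):
        """A delivery receipt arrived, so the copy can be dropped early."""
        self._held.pop(message_id, None)

    def purge_expired(self, now=None):
        """Drop everything past its expiration; return how many were dropped."""
        now = time.time() if now is None else now
        expired = [mid for mid, (_, exp) in self._held.items() if exp <= now]
        for mid in expired:
            del self._held[mid]
        return len(expired)

    def pending(self):
        return sorted(self._held)


store = RelayStore(ttl_seconds=60)
store.hold("m1", b"<encrypted bytes>", now=0)
store.hold("m2", b"<encrypted bytes>", now=30)
store.acknowledge("m1")            # receipt for m1 came back
store.purge_expired(now=90)        # m2 reaches its expiration at t=90
print(store.pending())
```

With either mechanism the relay's storage is bounded by message rate times retention time rather than by total history.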

How this scales in terms of MB and GB is hugely dependent on how your application is configured, how frequently new data is emerging, the limits set by peers for how much they're willing to share, etc. But text is pretty cheap. I can't imagine storing 3 GB of your or someone else's text emails on your phone, short term or long term. The raspberry pi plugged into the wall at your house has much more storage anyway ;)

imtringued · on Oct 1, 2017

I don't really see why a CRM needs to be decentralised. You need to host it yourself to avoid a cloud vendor going out of business, but other than that, what problem do you solve by decentralising it?

OtterCoder · on Oct 1, 2017

Delivery. A cloud provides worldwide availability at the cost of trust. A distributed site can survive any one entity failing, which includes you, and it can serve from anywhere your users want it to.

lgierth · on Oct 1, 2017

You're right - you can model any data as append-only though. Granted, in many cases it will require you to seriously sit down and remodel your data. Nobody's claiming it's going to be SQL and ACID transactions :) It's more likely going to be collaborative append-only logs based on CRDTs.
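To make "collaborative append-only logs based on CRDTs" a little more concrete, here is a minimal Python sketch using the shopping cart from earlier in the thread. Each replica appends uniquely identified quantity changes, and merging is a union of operations, which is commutative, associative, and idempotent. `CartReplica` is a hypothetical name, and this is a deliberately simplified operation-based design, not the API of any real CRDT library.

```python
from collections import defaultdict


class CartReplica:
    """Hypothetical replica: an append-only set of (item, delta) operations,
    each keyed by a globally unique (replica_id, seq) pair."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.ops = {}   # (replica_id, seq) -> (item, delta)
        self._seq = 0

    def add(self, item, delta):
        """Append a quantity change; a negative delta removes items."""
        self._seq += 1
        self.ops[(self.replica_id, self._seq)] = (item, delta)

    def merge(self, other):
        """Union of operations. Merging twice changes nothing."""
        self.ops.update(other.ops)

    def contents(self):
        """Derive the visible cart by summing deltas per item."""
        totals = defaultdict(int)
        for item, delta in self.ops.values():
            totals[item] += delta
        return {item: qty for item, qty in totals.items() if qty > 0}


phone = CartReplica("phone")
laptop = CartReplica("laptop")
phone.add("album", 1)
laptop.add("video", 2)
laptop.add("album", 1)

# Sync in both directions, in any order.
phone.merge(laptop)
laptop.merge(phone)
print(phone.contents() == laptop.contents())
```

Since addition does not depend on order, both replicas converge on the same cart without coordination; what is given up is exactly the SQL-style transactional guarantee mentioned above.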

There are examples of the use cases you mention being built with decentralized technologies. [1][2]

[1] The various cryptocurrency wallets and exchanges

[2] https://openbazaar.org