
Wonderful questions! Actually some of the best in the entire thread I think.

1. See (3) but first read:

A) The upper boundary is defined by the current machine's local clock, which could have skew or drift.

B) The lower boundary is defined by the last known update on an individual record (down to the UUID+field).

2. The expected use case for this conflict resolution algorithm is basic field/value pairs (terms defined here: https://github.com/amark/gun/wiki/semantics, and here: https://github.com/amark/gun/wiki/JSON-Data-Format) within an object identified by a UUID (called a node, as in a node in a graph).

This is what HAM works off of, and these values are considered the lowest-level atomic pieces. In order to sync on collaborative text you need to build an OT layer on top of this (I plan on doing this, possibly integrating with ShareJS as another commenter mentioned). You cannot collaboratively sync on atomic values by themselves; you must define a CRDT for that. Plugins/modules for them will be coming later.

3. Vector Clocks. HAM does not assume what the sort key is for state; it just assumes it is a value it can do <, <=, ===, >=, > comparisons on.

A) Vector clocks have a vulnerability: if you are working with temporary/ephemeral machines, the clocks will constantly get reset and have to play "catch up". On top of that, network partitions are highly likely, so there is no guarantee that two machines won't issue conflicting vector clocks. If this happens, there is no standard way of dealing with it, although there are plenty of workarounds.

B) Timestamps also have a vulnerability: if you set your local clock ahead (say 2 years in the future) then it will "always win", wiping out other peers' valid values. Unfortunately, in an untrusted network you cannot determine whether a peer is being malicious about being 2 years in the future or is actually at a different point in spacetime, like on a GPS satellite or on Mars, or simply went offline in the subway.

C) This is why I combine them together via the boundary function. The upper and lower boundaries of the state machine provide the relative "vector" for the untrusted timestamp in the delta update.
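
To make the boundary idea concrete, here is a minimal sketch in Python. It is not GUN's actual code: the function name `resolve`, the action labels, and especially the tie-break on equal states (comparing the JSON-serialized values) are assumptions made for illustration. The only things taken from the description above are the upper boundary (the local machine clock), the lower boundary (the last known update on the field), and the requirement that the state be comparable.

```python
import json


def resolve(machine_state, incoming_state, current_state,
            incoming_value, current_value):
    """Decide what to do with an incoming update for one field.

    machine_state  -- upper boundary: this machine's local clock
    current_state  -- lower boundary: state of the last known update
    incoming_state -- untrusted state carried by the delta update
    Returns one of "defer", "discard", "accept", "keep" (hypothetical labels).
    """
    # Beyond the upper boundary: from our point of view the update is
    # "in the future", so it is set aside rather than applied now.
    if incoming_state > machine_state:
        return "defer"
    # Below the lower boundary: older than what we already have.
    if incoming_state < current_state:
        return "discard"
    # Strictly inside the boundaries: newer than what we have, not in the future.
    if incoming_state > current_state:
        return "accept"
    # Equal states. ASSUMPTION: break the tie deterministically by comparing
    # the serialized values, so every peer picks the same winner.
    incoming_key = json.dumps(incoming_value, sort_keys=True)
    current_key = json.dumps(current_value, sort_keys=True)
    return "accept" if incoming_key > current_key else "keep"


if __name__ == "__main__":
    # A peer whose clock is far ahead does not "always win": it is deferred.
    print(resolve(100, 150, 90, "from the future", "present"))
    # An update between the two boundaries is accepted.
    print(resolve(100, 95, 90, "newer", "older"))
```

Because the decision depends only on the three states and the two values, applying the same update twice gives the same result, which is what makes the resolution idempotent.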

The benefits of this technique are twofold:

1) You get deterministic and idempotent resolution within a special-relativity timeframe in a decentralized system without gossip (consensus).

2) If you do run GUN within your own trusted network, you can use the timestamps to calculate drift between machines and then readjust the boundary function of the state machine. This gives you a highly accurate "objective" view of your data across peers, which, if the latency is low enough, could indicate it is worth creating locks (at the cost of sacrificing Availability).
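
As a rough illustration of point 2, the sketch below estimates the clock offset of a trusted peer from observed timestamps and widens the upper boundary accordingly. Everything here is an assumption for the sake of example: the function names, the use of a median, and the choice to ignore network latency (so the estimate mixes latency with true drift). It is not how GUN does it.

```python
from statistics import median


def estimate_offset(samples):
    """Estimate how far a peer's clock runs ahead of ours.

    samples -- list of (peer_timestamp, local_receive_time) pairs.
    ASSUMPTION: latency is small compared to drift, so it is ignored.
    The median keeps a single outlier from dominating the estimate.
    """
    return median(peer - local for peer, local in samples)


def adjusted_upper_boundary(local_now, peer_offsets):
    """Widen the upper boundary by the largest positive trusted-peer offset.

    Peers whose clocks run behind ours need no adjustment, so the boundary
    never moves below the local clock.
    """
    return local_now + max(0, max(peer_offsets, default=0))


if __name__ == "__main__":
    offset = estimate_offset([(105, 100), (207, 200), (306, 300)])
    print(offset)
    print(adjusted_upper_boundary(1000, [offset, -3, 2]))
```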

Hope this was clear enough! Any questions? I'm going to be reposting this in the rest of the thread.


