Hacker News

anyone thinking about hanging their hat on writing decentralized, local-first, crdt-driven software should also consider the complexity of changing your schema when there are potentially thousands of clients with different opinions of what that schema should be. some interesting research has been done[1] in this domain, but i haven't seen any library of this ilk that supports real-world schema evolution in a way that doesn't make me really, really wish i had a proper backend to lay down the law. the fact that migrations are in a "coming soon" state makes me wary of using jazz, and i wonder how they would approach this problem with the library as it is now.

[1] https://www.inkandswitch.com/cambria/



It’s a tough problem! I think 90% of it is solved with GraphQL/protobuf-like rules (“only ever introduce new fields”)

There are some edge cases where you might lose easy access to some data if two clients migrate concurrently, but we’re hoping to provide patterns to mitigate these

Edit: Right now it all depends on you to implement migrations “the right way”, but we hope to provide guardrails soon
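For anyone unfamiliar with the rule, here's a minimal sketch of what "only ever introduce new fields" looks like in practice (illustrative types, not Jazz's actual API): v2 may only add optional fields, never rename or remove v1 ones, so upgrading becomes a lazy, idempotent default-fill on read.

```typescript
interface TaskV1 { id: string; title: string }
// v2 only adds optional fields; every v1 field is untouched
interface TaskV2 extends TaskV1 { dueDate?: string; tags?: string[] }

// Fill defaults on read instead of rewriting stored data. This is
// idempotent, so two clients "migrating" the same doc concurrently
// converge on the same result.
function readAsV2(raw: TaskV1 & Partial<TaskV2>): TaskV2 {
  return { tags: [], ...raw };
}

const legacy: TaskV1 = { id: "t1", title: "write docs" };
const upgraded = readAsV2(legacy);
```

Because no field is ever removed, a v1 client can still read data written by a v2 client; it simply never looks at the fields it doesn't know about.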


I've been thinking about this as well, and wondering if one possible approach to avoid the eventual messy database state (where fields are "append-only" and never deleted) might be to include the up/down migration logic as part of the payload.

This approach may require a central authority (with no access to user data) responsible solely for providing the schema and migration patterns as code transformations.

Since running arbitrary code from the payload introduces potential security risks, the migration code could be cryptographically signed, ensuring that only valid, trusted transformation code is executed. An additional security layer would be to execute the transformation code in a sandbox that can only output JSON data. (Keeping a full pre-migration version as a backup, in case something goes wrong, would always be a good idea.)
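A rough sketch of the signed-migration idea, using Node's built-in Ed25519 signing. Everything here is hypothetical: the function names are made up, and the `Function`-based "sandbox" is only a stand-in — a real deployment would run the code in an isolated process that can do nothing but emit JSON.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from "crypto";

// Refuse to run migration code from a payload unless it was signed
// by the trusted schema authority.
function runSignedMigration(
  source: string,
  signature: Buffer,
  authorityKey: KeyObject,
  doc: unknown
): unknown {
  if (!verify(null, Buffer.from(source), authorityKey, signature)) {
    throw new Error("migration code not signed by schema authority");
  }
  // Crude isolation stand-in; see caveat in the lead-in above.
  const migrate = new Function("doc", `"use strict"; return (${source})(doc);`);
  // Force JSON-only output, as suggested above.
  return JSON.parse(JSON.stringify(migrate(doc)));
}

// The authority signs the migration source once, offline:
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
const src = "doc => ({ ...doc, schemaVersion: 2 })";
const sig = sign(null, Buffer.from(src), privateKey);
const migrated = runSignedMigration(src, sig, publicKey, { title: "hi" });
```

Any tampered or unsigned source fails verification before it ever executes, which is the property doing the security work here.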

Another option would be to use a transformation library for migrations, but in this case, the approach would only describe (as JSON) the functions and parameters needed to transition the schema from one version to another.
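The declarative alternative might look something like this: migrations are data, not code, so there's nothing to sign or sandbox, at the cost of a fixed step vocabulary. The step names below are invented for illustration.

```typescript
// A migration shipped as plain JSON: a list of declarative steps.
type MigrationStep =
  | { op: "addField"; name: string; default: unknown }
  | { op: "renameField"; from: string; to: string }
  | { op: "removeField"; name: string };

function applyMigration(
  doc: Record<string, unknown>,
  steps: MigrationStep[]
): Record<string, unknown> {
  const out = { ...doc };
  for (const step of steps) {
    switch (step.op) {
      case "addField": // idempotent: never clobber an existing value
        if (!(step.name in out)) out[step.name] = step.default;
        break;
      case "renameField":
        if (step.from in out) {
          out[step.to] = out[step.from];
          delete out[step.from];
        }
        break;
      case "removeField":
        delete out[step.name];
        break;
    }
  }
  return out;
}

// Example v1 -> v2 migration, serializable as JSON:
const v1toV2: MigrationStep[] = [
  { op: "renameField", from: "name", to: "title" },
  { op: "addField", name: "tags", default: [] },
];
```

The trade-off is expressiveness: anything beyond field-level reshaping (splitting a string, merging two fields) needs either a richer vocabulary or a fallback to the signed-code approach.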


> I think 90% of it is solved with GraphQL/ protobuf like rules (“only ever introduce new fields”)

Agreed, that’s the only sensible thing to do. Not sure it’s 90% though.

> but we’re hoping to provide patterns to mitigate these

Hope is not confidence-inspiring for the most difficult problem at the heart of the system. That doesn’t mean it has to be impossible, but it needs to be taken seriously, not treated as an afterthought.

Another thing you have to think about is what happens when data of a new schema is sent to a client on an older schema. Does the “merging” work with unknown fields? Does it ignore and drop them? Or do you enforce clients are up to date in some way so that you don’t have new-data-old-software?
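One answer to the new-data-old-software question is protobuf-style unknown-field retention: an old client carries fields it doesn't recognize through a merge untouched, so a round-trip through old software doesn't destroy new-schema data. A sketch, where `KNOWN_V1` and the local-wins rule are illustrative stand-ins (a real CRDT would merge each field by its own rules):

```typescript
// Fields a v1 client knows how to merge.
const KNOWN_V1 = new Set(["id", "title"]);

function mergeAsV1(
  local: Record<string, unknown>,
  remote: Record<string, unknown>
): Record<string, unknown> {
  const merged: Record<string, unknown> = {};
  // Known fields: crude "local wins" stand-in for real per-field merging.
  for (const k of KNOWN_V1) {
    if (k in local) merged[k] = local[k];
    else if (k in remote) merged[k] = remote[k];
  }
  // Unknown (newer-schema) fields: preserve rather than silently drop.
  for (const [k, v] of Object.entries(remote)) {
    if (!KNOWN_V1.has(k) && !(k in merged)) merged[k] = v;
  }
  return merged;
}
```

Without the second loop, an old client that merges and re-syncs would quietly erase every field added by newer schemas, which is the failure mode the comment above is asking about.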


You’re right, I was just being short. Will give you a longer and more concrete answer tomorrow


Couldn't you just use API/schema versioning?


You could of course effectively create a new database whenever you make a new schema.

But that's not exactly convenient for the users.


I think he meant up/down data transformation migrations, not entire new dbs or collections.



