More

brendoncarroll · 2026-01-27T16:42:33 1769532153

Me too. Version control is great, it should get more use outside of software.

Notable differences: E2E encryption, parallel imports (Got will light up all your cores), and a data structure that supports large files and directories.

rtkwe · 2026-01-27T18:54:05 1769540045

The problem is when you move beyond text files it gets hard to tell what changes between two versions without opening both versions in whatever program they come from and comparing.

brendoncarroll · 2026-01-27T19:01:38 1769540498

> The problem is when you move beyond text files it gets hard to tell what changes between two versions without opening both versions in whatever program they come from and comparing.

Yeah, totally agree. Got has not solved conflict resolution for arbitrary files. However, we can tell the user where the files differ, and that the file has changed.

There is still value in being able to import files and directories of arbitrary sizes, and having the data encrypted. This is the necessary infrastructure to be able to do distributed version control on large amounts of private data. You can't do that easily with Git. It's very clunky even with remote helpers and LFS.

I talk about that in the Why Got? section of the docs.

https://github.com/gotvc/got/blob/master/doc/1.1_Why_Got.md

DASD · 2026-01-27T17:57:11 1769536631

Nice! Not sure if you're aware of Got(Game of Trees) that appears to pre-date your Got.

https://gameoftrees.org/index.html

brendoncarroll · 2026-01-27T20:25:08 1769545508

Yes the author reached out. There has not yet been a confusion among real users that I am aware of.

https://github.com/gotvc/got/issues/20

brendoncarroll · 2026-01-15T15:26:04 1768490764

I also wrote a tool for doing this[0], after one of these agents edited a config file outside of the repo it was supposed to work within. I only realized the edit because I have my dotfiles symlinked to a git repository, and git status showed it when I was committing another change. It's likely that the agents are making changes that I (and others) are not aware of because there is no easy way to detect them.

The approach I started taking is mounting the directory, that I want the agent to work on, into a container. I use `/_` as the working directory, and have built up some practices around that convention; that's the only directory that I want it to make changes to. I also mount any config it might need as read-only.

The standard tools like claude code, goose, charm, whatever else, should really spawn the agent (or MCP server?) in another process in a container, and pipe context in and out over stdin/stdout. I want a tool for managing agents, and I want each agent to be its own process, in its own container. But just locking up the whole mess seems to work for now.

I see some people in the other comments iterating on what the precise arguments to bubblewrap should be. nnc lets you write presets in Jsonnet, and then refer them by name on the command line, so you can version and share the set of resources that you give to an agent or subprocess.

[0] https://github.com/brendoncarroll/nnc

brendoncarroll · 2026-01-14T19:14:41 1768418081

https://brendoncarroll.net

brendoncarroll · 2025-12-26T15:55:56 1766764556

I personally think that this is the future, especially since such an architecture allows for E2E encryption of the entire database. The protocol should just be a transaction layer for coordinating changes of opaque blobs.

All of the complexity lives on the client. That makes a lot of sense for a package manager because it's something lots of people want to run, but no one really wants to host.

brendoncarroll · 2025-12-15T22:24:12 1765837452

I recently released Blobcache v0.0.2. https://github.com/blobcache/blobcache

Blobcache is a content-addressed data store for holding application state, and buiding E2EE applications. This most recent release includes a git remote so you can push and fetch Git data into and out of Blobcache.

brendoncarroll · 2025-12-14T23:45:56 1765755956

I'm a happy bcachefs user. Haven't had any issues on a simple mirrored array, which I've been running since before it was in (and out) of the kernel. It's the best filesystem in 2025. Thank you for all your work.

What is the status of scrub? Are there any technical barriers to implementing it, or is it just prioritization at this point? FWIW I think there are probably a lot of sysadmin types who would move over to bcachefs if scrub was implemented. I know there are other cooler features like RS and send/receive, but those probably aren't blocking many from switching over.

koverstreet · 2025-12-14T23:55:19 1765756519

Scrub went in in 6.15. I think 6.17 might've been when it was fully solid; it took a bit for some bugs to shake out in the self healing paths.

brendoncarroll · 2025-11-23T20:19:48 1763929188

I work on a project Blobcache, a content addressed store for exposing and consuming storage over the network. It supports full end to end encryption, and offers a minimal API to prevent applications from leaking data.

https://github.com/blobcache/blobcache/blob/master/doc/0.2_W...

You can persist arbitrary hash-linked data structures in Blobcache volumes. One such data structure is the Git-Like Filesystem, which supports the usual files and trees.

https://github.com/blobcache/blobcache/blob/master/doc/8.5_G...

brendoncarroll · 2025-10-05T14:06:20 1759673180

I work on a FOSS project in this space, Blobcache.

https://github.com/blobcache/blobcache

Trusting a server to store an application's state is a different thing from trusting it to author changes or to read the data. Servers should become dumber, and clients should become smarter. When I use an app, I want the app to load E2E encrypted state from storage (possibly on another machine, possibly not owned by me) make whatever changes and produce new encrypted data to send back to the server. The server should just be trusted for durability, and to prevent unauthorized access, but not to tell the truth about doing either of those things. Blobcache provides an API to facilitate transactions on E2EE state between a dumb storage server and any smart client.

Blobcache can be installed on old hardware along with a VPN like Tailscale and then loaded up with data from other devices. Configuration is like SSH, drop a key in a configuration file to grant access. It removes most of the friction associated with consuming and producing storage as a resource.

I'm using it to build E2EE version control like Git, but for your whole home directory.

https://github.com/gotvc/got

ianopolous · 2025-10-05T16:47:26 1759682846

We should talk. This very similar to how apps use E2EE data in Peergos. Maybe we can join forces. https://peergos.org/posts/a-better-web

brendoncarroll · 2025-10-05T18:31:32 1759689092

I couldn't find an email in your bio. You can reach me via the email at the bottom of my website (in my HN bio).

Looking through the docs on Peergos, it looks like it's built on top of IPFS. I've been meaning to write some documentation for Blobcache comparing it to IPFS. I can give a quick gist here.

Blobcache Volumes are similar to an IPNS name, and the set of IPFS blocks that can be transitively reached from it. A significant difference is that Blobcache Volumes expose a transaction API with serializable isolation semantics. IPFS provides distributed, available-but-inconsistent, cryptographically signed cells. IPFS chooses availability, and Blobcache chooses consistency. A Blobcache Volume corresponds to a specific entity maintained and controlled by a specific Node. An IPFS name exists as a distributed entity on the network.

Most applications need some sort of consistent transactional cell (even if they don't realize it), but in order to be useful, inconsistent-but-available cells have to be used carefully in an application specific way. I blame this required application-specific care for the lack of adoption of CRDTs.

There's a long tail of other differences too. IPFS was pretty badly behaved the last time I used it, trying to configure my router, and creating lots of connections to other nodes. Blobcache is more like a web browser; it creates transient connections in immediate response to user actions.

That whole ecosystem is filled with complicated abstractions. Just as an example, the Multihash format is pervasive. It amounts to a tag for the algorithm used to create a hash, and then the hash output. I'd rather not have that indirection. All the hashes in Blobcache are 256 bits, and you set the algorithm per Volume. In Go that means the hashes can just be `[32]byte` instead of a slice and a tag and a table of algorithms.

I haven't used IPFS in a while, but I became pretty familiar with it awhile ago. Had I been able to build any of the stuff I was interested in on top of it, I probably wouldn't have written Blobcache.

ianopolous · 2025-10-05T18:55:11 1759690511

Thanks! I'll send you an email.

The good news is Peergos also has serializable transactional modifications. This comes from us storing signed roots in a db on your home server (not ipns). We also have our own minimal ipfs implementation that uses 1000x fewer resources than kubo, aka go-ipfs.

attila-lendvai · 2025-10-06T07:29:41 1759735781

you two should also be aware of

https://www.ethswarm.org/

(censorship resistant distributed storage with the same API)

brendoncarroll · 2025-10-07T14:25:40 1759847140

The same API part isn't surprising, content addressed stores are the most natural way to accept encrypted data.

The public storage networks are targeting a different use case than Blobcache though, which I think of as a private or web-of-trust storage network. To use a cryptocurrency backed storage solution, one must manage accounts, or a wallet of transaction outputs, connect to unknown parties on the internet, and pay for the increased redundancy. There's also legal risk, depending on the jurisdiction, when allowing untrusted parties to store arbitrary information on one's devices.

I don't want to consult the global economy in order to make use of my extra hard drives, which would otherwise be idle.

attila-lendvai · 2025-10-08T04:47:48 1759898868

re legal risks: no one knows what their machines are storing in swarm without also holding a key and a hash. the pieces are distributed based on the hash of the encrypted value.

apitman · 2025-10-06T01:47:00 1759715220

> Configuration is like SSH, drop a key in a configuration file to grant access. It removes most of the friction associated with consuming and producing storage as a resource.

What's the story for people who don't know what an SSH hey is?

hofo · 2025-10-06T17:52:02 1759773122

They either exercise intellectual curiosity through Google and YouTube, or they wait until someone decides it’s worth making a commercial offering.

g4k · 2025-10-05T18:07:41 1759687661

There is also https://remotestorage.io/ for per-user storage.

brendoncarroll · 2025-09-30T13:52:09 1759240329

I'm working on Blobcache. https://github.com/blobcache/blobcache

Blobcache is content addressed storage, available over the network. Blobcache allow nodes to securely produce and consume storage. Configuration in similar to SSH, drop a public key in the configuration, and you're done. Blobcache is a universal backend for E2E encrypted applications.

Docs - https://github.com/blobcache/blobcache/blob/master/doc/0.0_B...

I'm also working on Got Version Control https://github.com/gotvc/got

Got uses Blobcache for storing file data.

Got is like Git, if you fixed all the problems with storing large files and directories in Git. There's no "large files provider" to configure separately. All the data for a commit goes to the same place. Got also encrypts all the data you put in it, E2E. If you've run into problems putting your whole home directory in Git, you might have more luck with Got.

Both projects are GPL licensed, FOSS. Contributions welcome.

brendoncarroll · 2025-08-08T21:49:08 1754689748

All of those issues can be solved by doing an import of the changed file into the build system's content addressed store, and creating a new version of the entire input tree. You also don't need to choose between cancelling, waiting, or dropping. You can do 2 builds simultaneously, and anything consuming results can show the user the first one until a more recent one is available. If the builds are at all similar, then the similar components can be deduplicated at runtime.

These techniques are used in a build system that I work on[0]. Although it does not do automatic rebuilds like Poltergeist.

[0] https://github.com/wantbuild/want

dataflow · 2025-08-08T23:48:43 1754696923

> All of those issues can be solved

I've yet to see any build system solve these.

> by doing an import of the changed file into the build system's content addressed store, and creating a new version of the entire input tree.

That's going to be unusably slow and heavyweight for automatic rebuilds on a large repo. Maybe if you optimize it for a specific COW filesystem implementation that overlays things cleverly, it'd be able to scale. Or if your build tree avoids large directories and all your build tools handle symlinks fine, then you could symlink most things that don't change quickly. But I absolutely do not see this working on a large repo with the everyday filesystems people use. Not for a generic build system that allows arbitrary commands, anyway.

> You also don't need to choose between cancelling, waiting, or dropping. You can do 2 builds simultaneously

Do you have infinite CPU and RAM and money and time or something? Or are you just compiling Hello World programs? In my universe with limited resources this would not work... at all.

> These techniques are used in a build system that I work on[0].

And how exactly do you scale it the way you're describing with automatic rebuilds?

> Although it does not do automatic rebuilds like Poltergeist.

...ah.