Anecdotally, the team I was in in AWS needed to build a complicated component, one that, should it get it wrong, would be disastrous for the service.
They'd estimated about 4-6 months for a two person team, made from some of the best engineers in the service, focused entirely on it to get it written, tested and out to production.
They decided to use TLA+ to model, despite neither engineer having used it before. They lost about a week to getting up and running with it (one engineer's only complaint was how tied in to Eclipse it was), and then spent the res of the month working on and modelling the whole task.
It found problems. A whole bunch of them. The fixed the model until finally TLA+ gave them an all clear.
Then came the coding. Well... that didn't take very long at all. The TLA+ model effectively outlined all the code and methods for them. The actual programming ended up being almost a cookie cutter simple code.
In total, the new and complicated component went from drawing board to tested and ready for production in about 2 months. Despite having had to learn TLA+, it ended up taking less time than if they'd not written the TLA+ models in the first place.
Ive read a lot of industrial use of formal methods. If the method is easy, then almost all of them say what you said about the coding phase. I'll add that competing methods have code generators: Abstract, State Machines and B Method. ASM's look a lot like PlusCal with usability for about anything. Asmeta is one of tools with code generator. I think most doing it for B commercially use Atelier B and ProB. ProB does TLA+, too.
Now, far as TLA+-like stuff, tell them to check out SPIN model checker. It's seen more commercial use than most tools. Outperformed TLA+ in one paper but dont know if that extrapolates. Port some of your stuff to it to see what happens.
Hillel Wayne[1] has written a bunch about TLA+ for lay programmers like me. I've found his blog posts pretty good, I also bought his book Practical TLA+ which I like. He also wrote the free web book Learn TLA+[2].
When you decide to start, collect mail adresses of interested readers. Don't just assume people will stumble upon your book, seek out readers long before the book is finished.
I recently told this on Twitter to an Academic who had just finished writing a book on Cryptography in middle-age Venice (how cool is that?), but with publication still many months off.
She had never thought about collecting people's mail addresses, even though dozens (if not hundreds) of people replied on Twitter that they'd be interested in reading the book.
> Can anyone advise on whether it’s worth modelling general critical systems in TLA+ where they are not distributed?
Absolutely, and not just critical systems. Any system that is either complex or may have some non-obvious subtleties can benefit from a specification.
> In broad strokes, how does one make sure that your real system indeed behaves like your model?
In general, TLA+ can be used to specify very large and very complex systems. It is currently infeasible to mechanically verify with absolute certainty that systems of such size conform to a specification, regardless of tools used. The only systems that can be verified to such an extreme extent (called end-to-end verification, namely interesting global properties are verified all the way down to the code, and even to the machine-code level) are very, very small (no more than about 10KLOC), and even then require a tremendous amount of effort by experts.
Specifications should be relatively short and clear. They are therefore useful for stating your assumptions about the system, and then checking the consequences of those assumptions. Whether the assumptions are accurate, approximate or wrong can then be verified by inspection -- this is certainly feasible and commonly done in practice. There are also relatively cheap mechanical ways to check that the system conforms to the spec, but not with absolute certainty. One is called trace checking, and its possible use with TLA+ is described here: https://pron.github.io/files/Trace.pdf
It's hard to answer because using TLA+ in general is far from common. But among users, I would say that most cases are where they have an existing system. That was certainly how I first used TLA+ (when I realized there could be deep design issues with a system I worked on; TLA+ showed me that there indeed were two serious problems, and it also verified that my fix was correct).
In my experience this is how a lot of people practice learning TLA+: taking systems they already know and applying TLA+ to them. The advantage of this is you already know the system pretty well and can use that to check you modeled it correctly.
Also, it finds serious bugs scarily often. I've regularly received messages from people like "Yeah I tried modeling the system and it turns out we need to throw out six months of work due to a requirement bug".
Akka has testkit and multi node testkit which you can use to test such systems. If you need to explore state space, you can use data driven and or combinatorial testing attributes from *unit libraries.
TLA+ and any kind of testing serve different purposes. I don't know how good of an analogy this is, but tests are more like strength tests on a structure, while a TLA+ is like a blueprint.