There is a public distributed effort happening for Go right now: http://zero.sjeng.org/. They've been doing a fantastic job, and just recently fixed a big training bug that has resulted in a large strength increase.
I ported GCP's Go implementation over to chess: https://github.com/glinscott/leela-chess. The distributed part isn't ready to go yet; we're still working out the bugs using supervised training, but we will be launching soon!
We are using data from both human grandmaster games and self-play games of a recent Stockfish version. Both have resulted in networks that play reasonable openings, but we had some issues with the value head not understanding good positions. We think we have a line on why this is happening (too few weights in the final stage of the network), but this is exactly the purpose of the supervised learning debugging phase :).
This is really cool! The chart on that page makes it look like Leela Zero is already much much better than AlphaZero (~7400 Elo vs ~5200 Elo). I suspect I'm misinterpreting something though, could you clarify?
Leela Zero's Elo graph anchors 0 Elo at completely random play, as a simple reference point.
AlphaGo, on the other hand, uses the more common Elo scale where 0 is roughly a beginner who knows the rules, so you can't directly compare the two numbers.
I've been having fun following along with Leela Zero, it's a great way to understand how a project like this goes at significant scale. Good luck with Leela Chess, I'm excited for it!
Sequential probability ratio test: essentially, a statistical test that decides between two hypotheses while keeping the probability of a wrong decision low.
In the case of Leela Zero the idea is to train new networks continuously and have them fight against the current best network, which is replaced only when a new network is statistically stronger, according to SPRT.