In a way, any GAN (https://en.wikipedia.org/wiki/Generative_adversarial_network) has aspects of a Society of Mind: two different networks communicate with each other, with the discriminator trying to find flaws in the generator's ongoing output.
And https://scholar.google.com/scholar?hl=en&as_sdt=0%2C31&as_vi... shows many attempts to generalize this to multiple adversarial agents, each specializing in a different type of critique.
One of the challenges, I think, is that while some of these agents could interact with the real world, training is far faster if they instead query their own (imperfect) models of the relevant subset of the world, which answer instantaneously. Bridging this to increasingly dynamic physical environments and arbitrary tasks is a fascinating research topic.