We’ve been experimenting with a similar idea, but in a browser-native environment: real containers, a WebSocket terminal, and multi-agent workflows. GPT-5.1 (Codex Max especially) seems to handle multi-step refactors a lot more cleanly, and chaining it through CLI agents has been surprisingly reliable.

Curious if anyone else is trying agent orchestration beyond the editor itself?