OpenCode was the first open source agent I used, and my main workhorse after experimenting briefly with Claude Code and realizing the potential of agentic coding. Due to that, and because it's a popular an open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people that are working on it are suboptimal at best; they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things (or even build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.
More than that, it's an extremely large and complex TypeScript code base — probably larger and more complex than it needs to be — and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).
On top of that, at least I personally find the TUI to be overbearing and a little bit buggy, and the agent to be so full of features that I don't really need — also mildly buggy — that it sort of becomes hard to use and remember how everything is supposed to work and interact.
I am more concerned about their, umm, gallant approach to security. Not only that OpenCode is permissive by default in what it is allowed to do, but that it apparently tries to pull its config from the web (provider-based URL) by default [1]. There is also this open GitHub issue [2], which I find quite concerning (worst case, it's an RCE vulnerability).
It also sends all of your prompts to Grok's free tier by default, and the free tier trains on your submitted information, X AI can do whatever they want with that, including building ad profiles, etc.
You need to set an explicit "small model" in OpenCode to disable that.
This. I work on projects that warrant a self hosted model to ensure nothing is leaked to the cloud. Imagine my surprise when I discovered that even though the only configured model is local, all my prompts are sent to the cloud to... generate a session title. Fortunately caught during testing phase.
If you're using software someone else wrote, you'd have to repeat this testing phase any time an update is installed, right?
(I do mean this as a general principle, but also it was pointed out elsewhere in the thread that this is a particularly "high velocity" project as far as unexpected changes go.)
I’m curious if there’s a reason you’re not just coding in a container without access to the internet, or some similar setup? If I was worried about things in my dev chain accessing any cloud service, I’d be worried about IDE plugins, libraries included in imports, etc. and probably not want internet access at all.
Yeah — you can develop in a container that’s configured to only allow local access. Your machine is connected to the Internet as usual, so you can access any docs you want or whatever, but the actual execution environment running on your machine can’t. This is pretty easy to set up in Docker, for example. It’s also useful because you can have the same exact dev environment no matter what machine you’re on, OS you’re running, etc.
The small_model option configures a separate model for lightweight tasks like title generation. By default, OpenCode tries to use a cheaper model if one is available from your provider, otherwise it falls back to your main model.
I would expect that if you set a local model it would just use the same model. Or if for example you set GPT as main model, it would use something else from OpenAI. I see no mentions of Grok as default
i ran it through mitmproxy, i am using pinned version 1.2.20, 6 march 2026, set up with local chat completions.
on that version, it does not fall back to the main model. it silently calls opencode zen and uses gpt-5-nano, which is listed as having 30 day retention plus openai policy, which is plain text human review by openai AND 3rd party contractors.
They're talking about before it's configured by the user. It defaults to 'free' models so that the user can ask a question immediately on startup. Once you configure a provider, the default models aren't used.
It depends. For a lot of hardware it's actually easier to get working on linux, because the driver is just part of the kernel and you don't have to do anything special, including manually installing drivers, to get it working.
There are some cases where hardware support on Linux is suboptimal, such as Nvidia cards and many fingerprint readers, but things are a LOT better now than they used to be. Most consumer laptops and desktops will run linux just fine.
In 2022 we got new zen 5 amd cpus almost when they came out, windows did not recognize a bunch of stuff and had to find and download individual drivers. In linux (ubuntu) everything worked out of the box, except only that the LTS release did not support the kernel that supported the new mobos yet and had to install the rolling release instead.
I liked the apple II, and the TRS 80 as I rather like basic. And then I didn’t hate DOS, and then I actively hated the graphical shell of Windows 3, but could not afford a Macintosh -so suffered through it where I had to, but mainly used DOS. Then I discovered UNIX, and did almost all of my work on a timeshare - in the early 90s!
Then Windows 95 came out and I actively hated it, but did think it was amazingly pretty - somehow this was the impetus for me to get a pc again, which I put Windows NT on. Which was profitable for freelance gigs in college. Soon after that, I dual booted it to Linux and spent most of my time in Slackware.
After that, I graduated and had enough money to buy a second rig, which I installed OS/2 warp on - which was good for side gigs. And I really liked. A lot. But my day job required that I have a Windows NT box to shell into the Solaris servers as we ran. Then I got a better class of employer and the next several let me run a Linux box to connect to our solaris (or Aix) servers.
Next my girlfriend at the time got a PowerBook G4 and installed OS X on it. It was obviously amazing. Windows XP came out, and it was once again so much worse than Windows NT - and crashed so much more - which was odd as it was based on Windows NT. (yes 98 was before this but it was really bad). Anyhow, right about here the Linux box I was running at home, died. And it was obvious that I was not going to buy an XP box, so I bought my first Mac.
And it’s been the same for the last 25 years - every time I look at a Windows box it’s horrible. I pretty much always have a Linux box headless somewhere in the house, and one rented in the cloud, and a Mac for interacting with the world.
And like the parent I actively dislike windows. And that’s interesting because I’ve liked most other operating systems I’ve used in my life, including MS-DOS. Modern windows is uniquely bad.
I use windows and absolutely hate the mac UI. Having the current window title bar always at the top of the screen doesn't make any sense when you have a very big monitor. It only made sense with the tiny monitors available when the mac UI was originally created.
Yeah, that is an annoyance for me too but for a different reason. I have set the menu bar to be only in the internal display (to avoid issues with my OLED external monitor) so when I have a window in the external monitor, I have to move the mouse to the internal monitor screen space if I want to open something that is in the app's title bar.
On the other hand, it is actually useful that there is mostly a specific place you find settings etc, as in windows/linux it tends to vary depending on the app where to find those (is there a bar on top of the window? Is there a button to expand a menu somewhere? Something else? Who knows).
No, it is still configurable. You can specify in your opencode.json config that it should be able to run everything. I think they just argued that it shouldn't be the default. Which I agree with.
No, the problem is that when logging in, the provider's website can provide an authentication shell command that OpenCode will send to the shell sight unseen, even if it is "rm -rf /home". This "feature" is completely unnecessary for the agent to function as an agent, or even for authentication. It's not about it being the default, it's about it being there at all and being designed that way.
> and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).
That's (one of the reasons) why I'm favoring Codex over Claude Code.
Claude Code is an... Electron app (for a TUI? WTH?) and Codex is Rust. The difference is tangible: the former feels sluggish and does some odd redrawing when the terminal size changes, while the latter definitely feels more snappy to me (leaving aside that GPT's responses also seem more concise). At some point, I had both chewing concurrently on the same machine and same project, and Claude Code was using multiple GBs of RAM and 100% CPU whereas Codex was happy with 80 MB and 6%.
Performance _is_ a feature and I'm afraid the amounts of code AI produces without supervision lead to an amount of bloat we haven't seen before...
I think you’re confusing capital c Claude Code, the desktop Electron app, and lowercase c `claude`, the command line tool with an interactive TUI. They’re both TypeScript under the hood, but the latter is React + Ink rendered into the terminal.
The redraw glitches you’re referring to are actually signs of what I consider to be a pretty major feature, a reason to use `claude` instead of `codex` or `opencode`: `claude` doesn’t use the alternate screen, whereas the other two do. Meaning that it uses the standard screen buffer, meaning that your chat history is in the terminal (or multiplexer) scrollback. I much prefer that, and I totally get why they’ve put so much effort into getting it to work well.
In that context handling SIGWINCH has some issues and trickiness. Well worth the tradeoff, imo.
Codex is using its app server protocol to build a nice client/server separation that I enjoy on top of the predictable Rust performance.
You can run a codex instance on machine A and connect the TUI to it from machine B. The same open source core and protocol is shared between the Codex app, VS Code and Xcode.
That's the same reason I don't like Opencode, but Codex doesn't use the alternate screen. I remember it did when it was very very new, but now it doesn't.
Ah nice, good to know. I hadn’t used codex in a while. I actually really like opencode and its ui, just wish it didn’t clear the screen on exit. It could at least redraw whatever was last in the chat, that would be better than nothing.
I had a nasty slow claude code startup time at one point something like 8s, a clean install sorts it all out. Back up your mcp config and skills and you're good.
I think Go might be a better choice but not for that reason at all.
Go could implement something like this with no dependencies outside the standard library. It would make sense to take on a few, but a comparable Rust project would have at least several dozens.
Also, Go can deliver a single binary that works on every Linux distribution right out of the box. In Rust, its possible but you have to static compile with muslc and that is a far less well-trodden path with some significant differences to the glibc that most Rust libraries have been tested with.
My personal opinion is that I like Rust much more than Go, but I can’t deny that Rust is a big, and more dauntingly to newcomers, pretty unopinionated language compared to Go.
There are more syntax features, more and more complex semantics, and while rustc and clippy do a great job of explaining like 90% of errors, the remaining 10% suuuuuck.
There’s also some choices imposed by the build system (like cargo allowing multiple versions of the same dep in a workspace) and by the macro system (axum has some unintuitive extractor ordering needs that you won’t find unless you know to look for them), and those things and the hurdles they present become intuitive after a time but just while getting started? Oof
LLMs should know that, for maybe a CRUD app, there should be taken care of security at various layers, i.e. input validation in controllers. Knowledge from popular frameworks that communicate security boundaries should be transferable for them, even if everything is custom code. Very confusing to me how they manage to completely ignore so much of it. I guess they are too good following suit of a productivity minded vibe coder.
Frankly I don't think one even needs to learn it, if you know a bunch of other languages and the codebase is good. I was able to just make a useful change to an open source project by just doing it, without having written any lines of Go before. Granted the MR needed some revisions.
Rust is my favorite, though. There are values beyond ease of contribution. I can't replicate the experience with a Rust project anymore, but I suspect it would have been tougher.
agents don't really care and they're doing anywhere between 90-100% of the work on CC. if anything, rust is better as it has more built-in verification out of the box.
Rust is accessible to everyone now that Claude Code and Opus can emit it at a high proficiency level.
Rust is designed so the error handling is ergonomic and fits into the flow of the language and the type system. Rust code will be lower defect rate by default.
Plus it's faster and doesn't have a GC.
You can use Rust now even if you don't know the language. It's the best way to start learning Rust.
The learning curve is not as bad as people say. It's really gentle.
Java (incl. Scala, Closure, Groovy, Jython, etc.) is better suited to running as a server. Let agents write clean readable code and leave performance concerns to the JIT compiler. If you really want you can let agents rewrite components at runtime without losing context.
Erlang would offer similar benefits, because what we're doing with these things is more message passing than processing.
Rust is what I'd want agents writing for edge devices, things I don't want to have to monitor. Granted, our devices are edge devices to Anthropic, but they're more tightly coupled to their services.
There is literally no reason to write it in a JVM language in 2026 when better options exists. Either Go for simplicity and maintaininability or Rust to get the most out of the machine works.
Also, it'll be hard for them to lure good people to work on that thing. Absolutely no one is getting excited to write, vibe, or maintain Java.
I am not thrilled to use java, but it really does what it says on the tin. A customer copied the jar file I sent them to their as400 and it just worked. There is nothing quite like it.
Hi go binary, unfortunately you don't exist, because there is no cross compiler for that platform. Also please don't crash if you ever do get cross compiled, since the target system doesn't understand your utf8 strings.
I run many instances of Claude Code simultaneously and have not experienced what you are seeing. It sounds like you have a bias of Rust over Typescript.
No, they are describing a typical experience with the two apps. Just open both apps, run a few queries, and take a look at the difference in resource management yourself. It sounds like you have a bias of Claude Code over Codex.
Uh, it sounds like you're having trouble understanding that people in this thread are talking about two wildly different "claude code" applications. Those who are claiming the resources issues don't apply to them are referring to the cli application, ie: `claude` and those are saying things like "Just open both apps..." are surely referring to their GUI versions.
No, I've never used the GUI version. I literally just had to close and reopen the terminal running the Claude Code CLI on my Mac yesterday because it was taking too many resources. It generally happens when I ask Claude to use multiple sub agents. It's an obvious memory leak.
> The problem for me is that the development practices of the people that are working on it are suboptimal at best; they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things (or even build a proper list of changes for each release), and they add, remove, refine, change, fix, and break features constantly at that accelerated pace.
this is what i notice with openclaw as well. there have been releases where they break production features. unfortunately this is what happens when code becomes a commidity, everyone thinks that shipping fast is the moat but at the expense of suboptimality since they know a fix can be implemented quickly on the next release.
Openclaw has 20k commits, almost 700k lines of code, and it is only four months old. I feel confident that that sort of code base would have a no coherent architecture at all, and also that no human has a good mental model of how the various subsystems interact.
I’m sure we’ll all learn a lot from these early days of agentic coding.
> I’m sure we’ll all learn a lot from these early days of agentic coding.
So far what I am learning (from watching all of this) is that our constant claims that quality and security matter seem to not be true on average. Depressingly.
I think what we're seeing is a phase transition. In the early days of any paradigm shift, velocity trumps stability because the market rewards first movers.
But as agents move from prototypes to production, the calculus changes. Production systems need:
- Memory continuity across sessions
- Predictable behavior across updates
- Security boundaries that don't leak
The tools that prioritize these will win the enterprise market. The ones that don't will stay in the prototype/hobbyist space.
We're still in the "move fast" phase, but the "break things" part is starting to hurt real users. The pendulum will swing back.
This makes sense. Development velocity is bought by having a short product life with few users. As you gain users that depend on your product, velocity must drop by definition.
The reason for this is that product development involves making decisions which can later be classified as good or bad decisions.
The good decisions must remain stable, while the bad decisions must remain open to change and therefore remain unstable.
The AI doesn't know anything about the user experience, which means it will inevitably change the good decisions as well.
> So far what I am learning (from watching all of this) is that our constant claims that quality and security matter seem to not be true on average.
Only for the non-pro users. After all, those users were happy to use excel to write the programs.
What we're seeing now is that more and more developers find they are happy with even less determinism than the Excel process.
Maybe they're right; maybe software doesn't need any coherence, stability, security or even correctness. Maybe the class of software they produce doesn't need those things.
I still use excel to write programs. I use officescript and power query. I shy away from via but have also used it.. I’m not sure what your point is. The people stopping citizens’ development could ease off the job security lines and the deferral to lockdown
20 for me, and let's not exaggerate. We've given lip service to it this entire time. Hell look at any of the corps we're talking about (including where I work) and they're demanding "velocity without lowering the quality bar", but it's a lie: they don't care about the quality bar in the slightest.
One of my main lessons after a decent long while in security, is that most orgs care about security, *as long as it doesn't get in the way of other priorities* like shipping new features. So when we get something like Agentic LLM tooling where everything moves super fast, security is inevitably going to suffer.
I’m learning that projects, developed with the help of agents, even when developers claim that they review and steer everything, ultimately are not fully understood or owned by the developers, and very soon turns into a thousand reinvented wheels strapped together by tape.
> very soon turns into a thousand reinvented wheels strapped together by tape.
Also most of the long running enterprise projects I’ve seen - there was one that had been around for like 10 years and like about 75% of the devs I hadn’t even heard of and none of the original ones were in the project at all.
The thing had no less than three auditing mechanisms, three ways of interacting with the database, mixed naming conventions, like two validation mechanisms none of which were what Spring recommended and also configurations versioned for app servers that weren’t even in use.
This was all before AI, it’s not like you need it for projects to turn into slop and AI slop isn’t that much different from human slop (none of them gave a shit about ADRs or proper docs on why things are done a certain way, though Wiki had some fossilized meeting notes with nothing actually useful) except that AI can produce this stuff more quickly.
When encountered, I just relied on writing tests and reworking the older slop with something newer (with better AI models and tooling) and the overall quality improved.
Claude Code breaks production features and doesn't say anything about it. The product has just shifted gears with little to no ceremony.
I expect that from something guiding the market, but there have been times where stuff changes, and it isn't even clear if it is a bug or a permanent decision. I suspect they don't even know.
We're still in the very early days of generative AI, and people and markets are already prioritizing quality over quantity. Quantity is irrelevant when it comes value.
All code is not fungible, "irreverent code that kinda looks okay at first glance" might be a commodity, but well-tested, well-designed and well-understood code is what's valuable.
and once you've got your wish: ugly code without tests or a way to comprehend it, but cheap!
How much value are you going to be able to extract over its lifetime once your customers want to see some additional features or improvements?
How much expensive maintenance burden are you incurring once any change (human or LLM generated) is likely to introduce bugs you have no better way of identifying than shipping to your paying customers?
Maybe LLM+tooling is going to get there with producing a comprehensible and well tested system but my anectodal experience is not promising. I find that AI is great until you hit its limit on a topic and then it will merrily generate tokens in a loop suggesting the same won't-work-fix forever.
What you wrote aligns with my experience so far.
It's fast and easy to get something working, but in a number of cases it (Opus) just gets stuck 'spinning' and no number of prompts is going to fix that.
Moreover - when creating things from scratch it tends to use average/insecure/ inefficient approaches that later take a lot of time to fix.
The whole thing reminds me a bit of the many RAD tools that were supposed to 'solve' programming. While it was easy to start and produce something with those tools, at some point you started spending way too much time working around the limitations and wished you started from scratch without it.
I'm of the opinion that the diligence of experts is part of what makes code valuable assets, and that the market does an alright job of eventually differentiating between reliable products/brands and operations that are just winging it with AI[1].
I would think that the better the code is designed and factored and refactored, the easier it is to maintain and evolve, detect and remove bugs and security vulnerabilties from it. The ease of maintenance helps both AI and humans.
There are limits to what even AI can do to code, within practical time-limits. Using AI also costs money. So, easier it is to maintain and evolve a piece of software, the cheaper it will be to the owners of that application.
It's understandable and even desirable that a new piece of code rapidly evolves as they iterate and fix bugs. I'd only be concerned if they keep this pattern for too long. In the early phases, I like keeping up with all the cutting edge developments. Projects where dev get afraid to ship because of breaking things end up becoming bloated with unnecessary backward compatibility.
I recently listened to this episode from the Claude Code creator (here is the video version: https://www.youtube.com/watch?v=PQU9o_5rHC4) and it sounded like their development process was somewhat similar - he said something like their entire codebase has 100% churn every 6 months. But I would assume they have a more professional software delivery process.
I would (incorrectly) assume that a product like this would be heavily tested via AI - why not? AI should be writing all the code, so why would the humans not invest in and require extreme levels of testing since AI is really good at that?
I feel like our industry goes through these phases where there's an obvious thought leader that everyone's copying because they are revolutionary.
Like Rails/DHH was one phase, Git/GitHub another.
And right now it's kinda Claude Code. But they're so obviously really bad at development that it feels like a MLM scam.
I'm just describing the feeling I'm getting, perhaps badly. I use Claude, I recommended Claude for the company I worked at. But by god they're bloody awful at development.
It feels like the point where someone else steps in with a rock solid, dependable, competitor and then everyone forgets Claude Code ever existed.
I use Claude Code because Anthropic requires me to in order to get the generous subscription tokens. But better tools exist. If I was allowed to use Cursor with my Claude sub I would in a heartbeat.
I mean, I'm slowly trying to learn lightweight formal methods (i.e. what stuff like Alloy or Quint do), behavior driven development, more advanced testing systems for UIs, red-green TDD, etc, which I never bothered to learn as much before, precisely because they can handle the boilerplate aspects of these things, so I can focus on specifying the core features or properties I need for the system, or thinking through the behavior, information flow, and architecture of the system, and it can translate that into machine-verifiable stuff, so that my code is more reliable! I'm very early on that path, though. It's hard!
I heard from somebody inside Anthropic that it's really two companies, one which are using AI for everything and the other which spends all their time putting out fires.
OpenCode's creator acknowledged that the ease of shipping has let them ship prototype features that probably weren't worth shipping and that they need to invest more time cleaning up and fixing things.
Uff. This is exactly what Casey Muratori and his friend was talking about in of their more recent podcast. Features that would never get implemented because of time constraints now do thanks to LLMs and now they have a huge codebase to maintain
I'm still trying to figure out how "open" it really is; There are reports that it phones home a lot[0], and there is even a fork that claims to remove this behavior[1]:
I think there’s a conflict between “open” as in “open source”, and “open” as in “open about the practice” paired with the fact we usually don’t review software’s source scrupulously enough to spot unwanted behaviors”.
so how is telemetry not open? If you don't like telemetry for dogmatic reasons then don't use it. Find the alternative magical product whose dev team is able to improve the software blindfolded
> Find the alternative magical product whose dev team is able to improve the software blindfolded
The choice isn't "telemetry or you're blindfolded", the other options include actually interacting with your userbase. Surveys exist, interviews exist, focus groups exist, fostering communities that you can engage is a thing, etc.
For example, I was recruited and paid $500 to spend an hour on a panel discussing what developers want out of platforms like DigitalOcean, what we don't like, where our pain points are. I put the dollar amount there only to emphasize how valuable such information is from one user. You don't get that kind of information from telemetry.
> Surveys exist, interviews exist, focus groups exist, fostering communities that you can engage is a thing, etc.
We all know it’s extremely, extremely hard to interact with your userbase.
> For example I was paid $500 an hour
+the time to find volunteers doubled that, so for $1000 an hour x 10 user interviews, a free software can have feedback from 0.001% of their users. I dislike telemetry, but it’s a lie to say it’s optional.
—a company with no telemetry on neither of our downloadable or cloud product.
> We all know it’s extremely, extremely hard to interact with your userbase.
On the contrary, your users will tell you what you need to know, you just have to pay attention.
> I dislike telemetry, but it’s a lie to say it’s optional.
The lie is believing it’s necessary. Software was successful before telemetry was a thing, and tools without telemetry continue to be successful. Plenty of independent developers ship zero telemetry in their products and continue to be successful.
The biggest reason is I don't like being locked into an ecosystem. I can use whatever I want with OpenCode, not so much with Codex and Claude Code. Right now I'm only using GPT with it, but I like the option.
CC I have the least experience with. It just seemed buggy and unpolished to me. Codex was fine, but there was something about it that just didn't feel right. It seemed fined for code tasks but just as often I want to do research or discuss the code base, and for whatever reason I seemed to get terse less useful answers using Codex even when it's backed by the same model.
OpenCode works well, I haven't had any issues with bugs or things breaking, and it just felt comfortable to use right from the jump.
Probably all describe problems stem from the developers using agent coding; including using TypeScript, since these tools are usually more familiar with Js/Js adjacent web development languages.
Perhaps the use of coding agents may have encouraged this behavior, but it is perfectly possible to do the opposite with agents as well — for instance, to use agents to make it easier to set up and maintain a good testing scaffold for TUI stuff, a comprehensive test suite top to bottom, in a way maintainers may not have had the time/energy/interest to do before, or to rewrite in a faster and more resource efficient language that you may find more verbose, be less familiar with, or find annoying to write — and nothing is forcing them to release as often as they are, instead of just having a high commit velocity. I've personally found AIs to be just as good at Go or Rust as TypeScript, perhaps better, as well, so I don't think there was anything forcing them to go with TypeScript. I think they're just somewhat irresponsible devs.
> I think they're just somewhat irresponsible devs.
Before coding agents it took quite a lot more experience before most people could develop and ship a successful product. The average years of experience of both core team and contributors was higher and this reflected in product and architecture choices that really have an impact, especially on non-functional requirements.
They could have had better design and architecture in this project if they had asked the AI for more help with it, but they did not even know what to ask or how to validate the responses.
Of course, lots of devs with more years of experience would do just as badly or worse. What we are seeing here though is a filter removed that means a lot of projects now are the first real product everyone the team has ever developed.
The value of having (and executing) a coherent product vision is extremely undervalued in FOSS, and IMO the difference between a successful project in the long-term and the kind of sploogeware that just snowballs with low-value features.
> The value of having (and executing) a coherent product vision is extremely undervalued in FOSS
Interesting you say this because I'd say the opposite is true historically, especially in the systems software community and among older folks. "Do one thing and do it well" seems to be the prevailing mindset behind many foundational tools. I think this why so many are/were irked by systemd. On the other hand newer tools that are more heavily marketed and often have some commercial angle seem to be in a perpetual state of tacking on new features in lieu of refining their raison d'etre.
You must never rely on AI itself for authorization… don’t let it run on an environment where it can do that. I can’t believe this needs to be said but everyone seems to have lost their mind and decided to give all their permissions away to a non deterministic thing that when prompted correctly will send it all out to whoever asks it nicely.
Yeah every time I want to like it, scrolling is glitched vs codex and Claude. And other various things like: why is this giant model list hard coded for ollama or other local methods vs loading what I actually have...
On top of that. Open code go was a complete scam. It was not advertised as having lower quality models when I paid and glm5 was broken vs another provider, returning gibberish and very dumb on the same prompt
Is there a name for these types of "overbearing" and visually busy "TUIs"? It seems like all the other agents have the same aesthetic and it is unlike traditional nurses or plain text interfaces in a bad way IMO. The constant spinners, sidebars and needless margins are a nuisance to me. Especially over an ssh connection in a tmux session it feels wrong.
I’m a little surprised by your description of constant releases and instability. That matches how I would describe Claude Code, and has been one of the main reasons I tend to use OpenCode more than Claude Code.
OpenCode has been much more stable for me in the 6 months or so that I’ve been comparing the two in earnest.
Currently on the hunt for OpenCode alternatives as it's become disastrously unstable in recent weeks between SSE timeout issues, broken Desktop UI and a very, very unhealthy looking GitHub PR queue, I'm out.
Since I'm on a Kotlin stack, Junie CLI in ACP server mode is looking attractive.
I use Droid specifically because Claude Code breaks too often for me. And then Droid broke too (but rarely), and I just stuck to not upgrading (like I don't upgrade WebStorm. Dev tools are so fragile)
I’ve been testing opencode and it feels TUI in appearance only. I prefer commandline and TUIs and in my mind TUI idea is to be low level, extremely portable interface and to get out of the way. Opencode does not have low color, standard terminal theme so had to switch to a proper terminal program. Copy paste is hijacked so I need to write code out to file in order to get a snippet. The enter key (as in the return by the keypad) does not work for sending a line. I have not tested but don’t think this would work over SSH even. I have been googling around to find if I am holding it wrong but it feels to break expectations of a terminal app in a way that I wish they would have made it a gui. Makes me sad because I think the goods are there and it’s otherwise good.
FWIW, in Kitty on Linux, SHIFT + mouse-select copies and SHIFT + middle-mouse-button pastes. This use of SHIFT and otherwise using standard Unix style copy/paste is common in a lot of TUIs (eg, weechat).
I don’t think good TUI’s are the same as good command line programs. Great tui apps would to me be things like Norton/midnight commander, borlands turbo pascal, vim, eMacs and things like that
Yes cli and tui are not the same, but I expect TUI to work decent in general terminal emulator and not acitvely block copying and pasting. Having to install supported terminal emulator goes against the vibe.
> Due to that, and because it's a popular an open source alternative, I want to be able to recommend it and be enthusiastic about it. The problem for me is that the development practices of the people that are working on it are suboptimal at best;
This is my experience with most AI tools that I spend more than a few weeks with. It's happening so often it's making me question my own judgement: "if everything smells of shit, check your own shoes." I left professional software engineering a couple of years ago, and I don't know how much of this is also just me losing touch with the profession, or being an old man moaning about how we used to do it better.
It reminds me of social media: there was a time where social media platforms were defined by their features, Vine was short video, snapchat was disappearing pictures, twitter was short status posts etc. but now they're all bloated messes that try do everything.
The same looks to be happening with AI and agent software. They start off as defined by one features, and then become messes trying to implement the latest AI approach (skills, or tools, or functions, or RAG, or AGENTS.md, or claws etc. etc.)
I can fully see pi doing the same. It talks about a whole bunch of stuff it proudly doesn't do. New techniques will come around still, and if pi adopts them, it becomes bloated/unusable/unusable like opencode, and if it doesn't, then it falls behind SoTA and gets overtaken by next-big-thing.
I think shitty AI software is a product of being in a bubble and the pressure to move fast and stay relevant. Just like there was a bunch of shitty blockchain software, and a bunch of shitty VR software, and a bunch of shitty mobile app software when they were booming.
I don't think pi has been around long enough to prove it's immune to this yet.
I tried running Opencode on my 7$/yr 512mb vps but it had the OOM issue and yes it needs 1GB of ram or more.
I then tried running other options like picoclaw/picocode etc but they were all really hard to manage/create
The UI/UX I want is that I can just put my free openrouter api key in and then I am ready to go to get access to free models like Arcee AI right now
After reading your comments/I read this thread, I tried crush by charmbracelet again and it gives the UI/UX that I want.
I am definitely impressed by crush/ the charm team. They are on HN and they work great for me, highly recommended if you want something which can work on low constrained devices
I do feel like Charm's TUI's are too beautiful in the sense that running a connection over SSH can delay so when I tried to copy some things, the delay made things less copy-able but overall, I think that I am using Crush and I am happy for the most part :-)
Edit: That being said, just as I was typing this, Crush took all the Free requests from Openrouter that I get for free so it might be a bit of minor issue but overall its not much of an issue from Crush side, so still overall, my point is that Crush is worth checking out
Kudos to the CharmBracelet team for making awesome golang applications!
> they're constantly releasing at an extremely high cadence, where they don't even spend the time to test or fix things
Tbf, this seems exactly like Claude Code, they are releasing about one new version per day, sometimes even multiple per day. It’s a bit annoying constantly getting those messages saying to upgrade cc to the latest version
The hombrew version doesn’t say anything about the native installer, but still bugs you to update after every minor version (which is every day, sometimes multiple times per day)
The npm version tells you there’s a native installer and to use it instead
The native installer never says anything. Not sure how it gets updated
I'm quite impressed by how a team managed to wangle the mental model of React and JSX into a terminal interface and in fact I can only imagine that that itself is a product of AI.
That said, the runtime is so resource heavy that, even though the heavy computational workload is given to AI on a remote cluster of servers, it will bring an old-ish laptop to a stall.
I do wonder though...highly interactive TUIs are not novel. I would wager that AI + the attention of frontend devs have created an environment where you can make fancy terminal UIs without concern for how terminals generally work and if Electron is sitting in the background, it proves it.
I agree that Opencodr is using a lot of RAM, but regarding the features, I am ak only using the built in features and I wouldn't say they are too many, they are just enough for a complete workflow. If you need more you can install plugins, which I haven't done yet and it's my daily driver for four months.
Isn't this pretty much the standard across projects that make heavy use of AI code generation?
Using AI to generate all your code only really makes sense if you prioritize shipping features as fast as possible over the quality, stability and efficiency of the code, because that's the only case in which the actual act of writing code is the bottleneck.
I don't think that's true at all. As I said, in a response to another person blaming it on agentic coding above, there are a very large number of ways to use coding agents to make your programs faster, more efficient, more reliable, and more refined that also benefit from agents making the code writing research, data piping, and refactoring process quicker and less exhausting. For instance, by helping you set up testing scaffolding, handling the boilerplate around tests while you specify some example features or properties you want to test and expands them, rewriting into a more efficient language, large-scale refactors to use better data structures or architectures, or allowing you to use a more efficient or reliable language that you don't know as well or find to have too much boilerplate or compiler annoyance to otherwise deal with yourself. Then there are sort of higher level more phenomenological or subjective benefits, such as helping you focus on the system architecture and data flow, and only zoom in on particular algorithms or areas of the code base that are specifically relevant, instead of forever getting lost in the weeds of thinking about specific syntax and compiler errors or looking up a bunch of API documentation that isn't super important for the core of what you're trying to do and so on.
Personally, I find this idea that "coding isn't the bottleneck" completely preposterous. Getting all of the API documentation, the syntax, organizing and typing out all of the text, finding the correct places in the code base and understanding the code base in general, dealing with silly compiler errors and type errors, writing a ton of error handling, dealing with the inevitable and inoraticable boilerplate of programming (unless you're one of those people that believe macros are actually a good idea and would meaningfully solve this), all are a regular and substantial occurrence, even if you aren't writing thousands of lines of code a day. And you need to write code in order to be able to get a sense for the limitations of the technology you're using and the shape of the problem you're dealing with in order to then come up with and iterate on a better architecture or approach to the problem. And you need to see your program running in order to evaluate whether it's functionality and design a satisfactory and then to iterate on that. So coding is actually the upfront costs that you need to pay in order to and even start properly thinking about a problem. So being able to get a prototype out quickly is very important. Also, I find it hard to believe that you've never been in a situation where you wanted to make a simple change or refactor that would have resulted in needing to update 15 different call sites to do properly in a way that was just slightly variable enough or complex enough that editor macros or IDE refactoring capabilities wouldn't be capable of.
That's not to mention the fact that if agentic coding can make deploying faster, then it can also make deploying the same amount at the same cadence easier and more relaxing.
You're both right. AI can be used to do either fast releases or well designed code. Don't say both, as you're not making time, you're moving time between those two.
Which one you think companies prefer? Or if you're a consulting business, which one do you think your clients prefer?
> AI can be used to do either fast releases or well designed code
I have yet to actually see a single example of the latter, though. OpenCode isn't an isolated case - every project with heavy AI involvement that I've personally examined or used suffers from serious architectural issues, tons of obvious bugs and quirks, or both. And these are mostly independent open source projects, where corporate interests are (hopefully) not an influence.
I will continue to believe it's not actually possible until I am proven wrong with concrete examples. The incentives just aren't there. It's easy to say "just mindlessly follow X principle and your software will be good", where X is usually some variation of "just add more tests", "just add more agents", "just spend more time planning" etc. but I choose to believe that good software cannot be created without the involvement of someone who has a passion for writing good software - someone who wouldn't want to let an LLM do the job for them in the first place.
> It's easy to say "just mindlessly follow X principle and your software will be good", where X is usually some variation of "just add more tests", "just add more agents", "just spend more time planning" etc
That's a complete strawman of what I — or others trying to learn how to use coding agents to increase quality, like Simon Willison or the Oxide team — am saying.
> but I choose to believe that good software cannot be created without the involvement of someone who has a passion for writing good software - someone who wouldn't want to let an LLM do the job for them in the first place.
This is just a no true Scotsman. I prefer to use coding agents because they don't forget details, or get exhausted, or overwhelmed, or lazy, or give up, ever — whereas I might. Therefore, they allow me to do all of the things that improve code and software quality more extensively and thoroughly, like refactors, performance improvements, and tests among other things (because yes, there is no single panacea). Furthermore, I do still care about the clarity, concision, modularity, referential transparency, separation of concerns, local reasonability, cognitive load, and other good qualities of the code, because if those aren't kept up a) I can't review the code effectively or debug things as easily when they go wrong, b) the agent itself will struggle to male changes without breaking other things, and struggle to debug, c) those things often eventually effect the quality of the end state software.
Additionally, what you say is empirically false. Many people who do deeply value quality software and code quality, such as the creators of Flask, Redis, and SerenityOS/Ladybird, all use and value agentic coding.
Just because you haven't seen good quality software with a large amount of agentic influence doesn't mean it isn't possible. That's very close minded.
Show me an example then. I want to see an example of quality software that makes heavy use of AI generated code (as in, basically written entirely by AI similar to OpenCode), led by developer(s) who care deeply about software quality but still choose to not write code themselves.
You don't get it, a bunch of thought leaders/bloggers jumped on the next bandwagon, so you're dumb and lack skills if you don't take their word for it.
No, you are not allowed to see the excellent code that their masterful use of AI generated. You're just going to have to read their blog posts.
In theory it should give the LLM better tools to explore and edit the code. So instead of working with the code as text, it's a tree of stuff it can read and edit.
I don't know how much it works in practice though.
That is very disappointing coz I've been wanting to try an alternative to Gemini CLI for exactly these reasons. The AI is great but the actual software is a buggy, slow, bloated blob of TypeScript (on a custom Node runtime IIUC!) that I really hate running. It takes multiple seconds to start, requires restarting to apply settings, constantly fucks up the terminal, often crashes due to JS heap overflows, doesn't respect my home dir (~/.gemini? Come on folks are we serious?), has an utterly unusable permission system, etc etc. Yet they had plenty of energy to inject silly terminal graphics and have dumb jokes and tips scroll across the screen.
Is Claude Code like this too? I wonder if Pi is any better.
A big downside would be paying actual cost price for tokens but on the other hand, I wouldn't be tied to Google's model backend which is also extremely flaky and unable to meet demand a lot of the time. If I could get real work done with open models (no idea if that's the case yet) and switch providers when a given provider falls over, that would be great.
Note as of yesterday, they retired the lite coding plan you're talking about. New buy-in is $50/month for the pro plan unless you were already on the lite plan.
Claude will also happily write a huge pile of junk into your home directory, I am sad to report. The permissions are idiotic as well, but I always use it in a container anyway. But I have not had it crash and it hasn't been slow starting for me.
I tried it briefly and the practice - argued for strategy for operation actually - to override my working folder seelction and altering to the parent root git folder is a no go.
For serious coding work I use the Zed Agent; for everything else I use pi with a few skills. Overall, though, I'd recommend Pi plus a few extensions for any features you miss extremely highly. It's also TypeScript, but doesn't suffer from the other problems OC has IME. It's a beautiful little program.
Big +1 to Pi[1]. The simplicity makes it really easy to extend yourself too, so at this point I have a pretty nice little setup that's very specific to my personal workflows. The monorepo for the project also has other nice utilities like a solid agent SDK. I also use other tools like Claude Code for "serious" work, but I do find myself reaching for Pi more consistently as I've gotten more confident with my setup.
I've been building VT Code (https://github.com/vinhnx/vtcode), a Rust-based semantic coding agent. Just landed Codex OAuth with PKCE exchange, credentials go into the system keyring.
I build VT Code with Tree-sitter for semantic understanding and OS-native sandboxing. It's still early but I confident it usable. I hope you'll give it a try.
pi.dev is worth checking out. The basic idea is they provide a minimalist coding agent that's designed to be easy to extend, so you can tailor the harness to suit your needs without any bloat.
One of the best features is they haven't been noticed by Anthropic yet so you can still use your Claude subscription.
Yeah I tried using it when oh-my-opencode (now oh-my-openagent) started popping off and found it had highly unstable. I just stick with internal tooling now.
I also want to recommend it, but only to those who enjoy figuring out why their installation is broken after what was a harmless minor or even patch release.
The amount of configuration updates, broken plugins… on top of what was already a difficult product to customise; it’s simply too much.
Why isn’t `opencode-workspace` the default, given that the base product is barely usable? Bah. I just reinstalled AGY and Mistral’s Vibe and got on with the work.
I’m old. It’s an open-source, gratis thing. I’m grateful for projects like OpenCode, but it was infuriating to configure a good set of plugins and prompts for spec-driven development, only to have it all stop working a few times because of something hard to debug.
This is why I'm taking a wait-and-see approach to these tools on HN myself. My month with Claude Code (the TUI, not the GUI) was amazing from an IT POV, just slop-generating niche tools I could quickly implement and audit (not giant-ass projects), but I ain't outsourcing that to another company when Qwen et al are right there for running on my M1 Pro or RTX 3090.
I'm looking forward to more folks building these kinds of tools with a stronger focus on portability via API or loading local models, as means of having a genuinely useful assistant or co-programmer rather than paying some big corp way too much money (and letting them use my data) for roughly the same experience.
More than that, it's an extremely large and complex TypeScript code base — probably larger and more complex than it needs to be — and (partly as a result) it's fairly resource inefficient (often uses 1GB of RAM or more. For a TUI).
On top of that, at least I personally find the TUI to be overbearing and a little bit buggy, and the agent to be so full of features that I don't really need — also mildly buggy — that it sort of becomes hard to use and remember how everything is supposed to work and interact.