It's the only model where an explicit instruction at the end of my message is sometimes ignored. This doesn't happen with any of the gpts, kimis, glms, qwen, etc. Just a deepseek problem.
I have also noticed this with Sonnet, funnily enough - it's not as strong, but it's still there. But yeah, I haven't seen this with any other model so far (although I mostly use the stronger ones - maybe it's a function of intelligence?).
Have you tried DeepSeek V4 Flash? It's very competent and extremely cheap.
I think Gemma 4 is also a good example of a capable small model.
I mention these not only because they're cheap but because they can run on consumer devices. The "every year a bigger and more capable SOTA model" trend is mirrored by the "every year a smaller and more capable open source model" trend.
256GB is what DeepSeek V4 Flash with Q4 requires, I believe. It is really still very far from “running locally on your device”. And it’s getting further away every day, looking at how electronics market prices are surging.
I need to find stats on average RAM of personal devices, but I expect it will be so low, we are light years away from running a frontier model (from today) locally on a smartphone, let’s stop dreaming (and I really would love having it).
I do agree local models are progressing and I am to this day in awe at what a 50GB file can do – it still feels like black magic to me.
Also granted, something like Gemma 2 2B seems to have similar performance to ChatGPT 3.5 and only requires 2GB of RAM. But I think the RAM/performance ratio curve over time is logarithmic and not linear; it’s moving slower and slower.
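For a rough sense of the RAM numbers above: weight memory is roughly parameter count times bits per weight, divided by 8. Here is a back-of-the-envelope sketch in Python. It assumes only the weights count (no KV cache, activations, or runtime overhead) and that Q4 means about 4 bits per weight, which real quantization formats exceed slightly. The function names are mine, and the parameter counts are illustrative, not official figures for any model.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def max_params_for_budget(ram_gb: float, bits_per_weight: float) -> float:
    """Largest parameter count whose weights fit in ram_gb (weights only)."""
    return ram_gb * 1e9 * 8 / bits_per_weight


if __name__ == "__main__":
    # A hypothetical 2-billion-parameter model at 8 bits per weight.
    print(weight_memory_gb(2e9, 8))  # 2.0

    # The same hypothetical model at 4 bits per weight.
    print(weight_memory_gb(2e9, 4))  # 1.0

    # How many parameters fit in 256 GB at 4 bits per weight, in billions.
    # This is arithmetic only, not a claim about any specific model.
    print(max_params_for_budget(256, 4) / 1e9)  # 512.0
```

Real usage will be higher once context length and runtime overhead are added, so treat these as lower bounds.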
I use it through my opencode go subscription and it's exactly how you described. Very pragmatic and not too ambitious. It's similar to Kimi 2.5/6 in that regard.
But...AWS is a platform too, no? Seems like you're in the same category of risk; you just moved to a more well-known name. Granted, Amazon is the most reliable even if they have their own quirks.
I was looking at this from Railway’s perspective. I really wonder what caused their account to be flagged, and they hint at more accounts being erroneously flagged as well.
Showdead is quite a disheartening experience - there’s just so much LLM-generated crap. The dead internet theory doesn’t feel as fringe as it once did.
And backups. SQLite makes it easier but no backup process is easy. You always have to back up and restore at least once to have the confidence to rely on it.
It's another (big) point towards paying someone else to host it.
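For what it's worth, the "back up and restore at least once" step can be small with SQLite. A minimal sketch using Python's standard `sqlite3` module, whose `Connection.backup()` copies a live database; the function name and paths are hypothetical, and a real setup would also want off-machine copies and a schedule:

```python
import sqlite3


def backup_and_verify(src_path: str, dest_path: str) -> bool:
    """Copy src_path to dest_path, then check that the copy opens cleanly."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Online backup: safe to run while the source database is in use.
        src.backup(dest)
        # Open-and-check is the cheapest form of a restore test.
        result = dest.execute("PRAGMA integrity_check").fetchone()[0]
        return result == "ok"
    finally:
        dest.close()
        src.close()
```

Something like `backup_and_verify("app.db", "app-backup.db")` only proves the copy is a valid database. Actually pointing the application at the copy once is what gives you the confidence.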
It's the only model where an explicit instruction at the end of my message is sometimes ignored. This doesn't happen with any of the gpts, kimis, glms, qwen, etc. Just a deepseek problem.
Hope it improves!