More

scrlk · 2026-04-08T16:56:56 1775667416

LG never really recovered after releasing several phones with bootlooping issues between 2015-16: https://en.wikipedia.org/wiki/LG_smartphone_bootloop_issues

Plus their software support was poor, even for the era.

stuxnet79 · 2026-04-08T21:11:43 1775682703

My G4 succumbed to this issue, and I was never able to revive it. I had some important documents and images there that I hadn't yet backed up to cloud that disappeared along with it. Still very sour about that. Other than that I enjoyed the phone, felt the dimensions were perfect and the camera was good for its time. But a defect of that nature is too serious to overlook so that was the last LG phone I ever owned.

Lammy · 2026-04-09T00:35:12 1775694912

At least they gave us https://www.youtube.com/watch?v=RbiEESkyaeM

scrlk · 2026-04-02T16:38:18 1775147898

Comparison of Gemma 4 vs. Qwen 3.5 benchmarks, consolidated from their respective Hugging Face model cards:

    | Model          | MMLUP | GPQA  | LCB   | ELO  | TAU2  | MMMLU | HLE-n | HLE-t |
    |----------------|-------|-------|-------|------|-------|-------|-------|-------|
    | G4 31B         | 85.2% | 84.3% | 80.0% | 2150 | 76.9% | 88.4% | 19.5% | 26.5% |
    | G4 26B A4B     | 82.6% | 82.3% | 77.1% | 1718 | 68.2% | 86.3% |  8.7% | 17.2% |
    | G4 E4B         | 69.4% | 58.6% | 52.0% |  940 | 42.2% | 76.6% |   -   |   -   |
    | G4 E2B         | 60.0% | 43.4% | 44.0% |  633 | 24.5% | 67.4% |   -   |   -   |
    | G3 27B no-T    | 67.6% | 42.4% | 29.1% |  110 | 16.2% | 70.7% |   -   |   -   |
    | GPT-5-mini     | 83.7% | 82.8% | 80.5% | 2160 | 69.8% | 86.2% | 19.4% | 35.8% |
    | GPT-OSS-120B   | 80.8% | 80.1% | 82.7% | 2157 |  --   | 78.2% | 14.9% | 19.0% |
    | Q3-235B-A22B   | 84.4% | 81.1% | 75.1% | 2146 | 58.5% | 83.4% | 18.2% |  --   |
    | Q3.5-122B-A10B | 86.7% | 86.6% | 78.9% | 2100 | 79.5% | 86.7% | 25.3% | 47.5% |
    | Q3.5-27B       | 86.1% | 85.5% | 80.7% | 1899 | 79.0% | 85.9% | 24.3% | 48.5% |
    | Q3.5-35B-A3B   | 85.3% | 84.2% | 74.6% | 2028 | 81.2% | 85.2% | 22.4% | 47.4% |

    MMLUP: MMLU-Pro
    GPQA: GPQA Diamond
    LCB: LiveCodeBench v6
    ELO: Codeforces ELO
    TAU2: TAU2-Bench
    MMMLU: MMMLU
    HLE-n: Humanity's Last Exam (no tools / CoT)
    HLE-t: Humanity's Last Exam (with search / tool)
    no-T: no think

kpw94 · 2026-04-02T17:05:17 1775149517

Wild differences in ELO compared to tfa's graph: https://storage.googleapis.com/gdm-deepmind-com-prod-public/...

(Comparing Q3.5-27B to G4 26B A4B and G4 31B specifically)

I'd assume Q3.5-35B-A3B would performe worse than the Q3.5 deep 27B model, but the cards you pasted above, somehow show that for ELO and TAU2 it's the other way around...

Very impressed by unsloth's team releasing the GGUF so quickly, if that's like the qwen 3.5, I'll wait a few more days in case they make a major update.

Overall great news if it's at parity or slightly better than Qwen 3.5 open weights, hope to see both of these evolve in the sub-32GB-RAM space. Disappointed in Mistral/Ministral being so far behind these US & Chinese models

culi · 2026-04-02T18:54:24 1775156064

You're conflating lmarena ELO scores.

Qwen actually has a higher ELO there. The top Pareto frontier open models are:

  model                        |elo  |price
  qwen3.5-397b-a17b            |1449 |$1.85
  glm-4.7                      |1443 | 1.41
  deepseek-v3.2-exp-thinking   |1425 | 0.38
  deepseek-v3.2                |1424 | 0.35
  mimo-v2-flash (non-thinking) |1393 | 0.24
  gemma-3-27b-it               |1365 | 0.14
  gemma-3-12b-it               |1341 | 0.11
  gpt-oss-20b                  |1318 | 0.09
  gemma-3n-e4b-it              |1318 | 0.03

https://arena.ai/leaderboard/text?viewBy=plot

What Gemma seems to have done is dominate the extreme cheap end of the market. Which IMO is probably the most important and overlooked segment

coder543 · 2026-04-02T20:30:28 1775161828

That Pareto plot doesn't seem include the Gemma 4 models anywhere (not just not at the frontier), likely because pricing wasn't available when the chart was generated. At least, I can't find the Gemma 4 models there. So, not particularly relevant until it is updated for the models released today.

coder543 · 2026-04-07T03:28:10 1775532490

Gemma 4 31B has now wiped out several of those models from the pareto frontier, now that it has pricing. Gemma 4 26B A4B has an Elo, but no pricing, so it still isn't on that chart. The Gemma 4 E2B/E4B models still aren't on the arena at all, but I expect them to move the pareto frontier as well if they're ever added, based on how well they've performed in general.

coder543 · 2026-04-02T17:19:56 1775150396

> Wild differences in ELO compared to tfa's graph

Because those are two different, completely independent Elos... the one you linked is for LMArena, not Codeforces.

nateb2022 · 2026-04-02T17:31:00 1775151060

> Very impressed by unsloth's team releasing the GGUF so quickly, if that's like the qwen 3.5, I'll wait a few more days in case they make a major update.

Same here. I can't wait until mlx-community releases MLX optimized versions of these models as well, but happily running the GGUFs in the meantime!

Edit: And looks like some of them are up!

FullyFunctional · 2026-04-03T03:44:09 1775187849

absolute n00b here is very confused about the many variations; it looks like the Mac optimized MX versions aren’t available in Ollama yet (I mostly use claude code with this)

gigatexal · 2026-04-02T18:15:20 1775153720

the benchmarks showing the "old" Chinese qwen models performing basically on par with this fancy new release kinda has me thinking the google models are DOA no? what am I missing?

bachmeier · 2026-04-02T18:05:23 1775153123

So is there something I can take from that table if I have a 24 GB video card? I'm honestly not sure how to use those numbers.

GistNoesis · 2026-04-02T18:25:15 1775154315

I just tried with llama.cpp RTX4090 (24GB) GGUF unsloth quant UD_Q4_K_XL You can probably run them all. G4 31B runs at ~5tok/s , G4 26B A4B runs at ~150 tok/s.

You can run Q3.5-35B-A3B at ~100 tok/s.

I tried G4 26B A4B as a drop-in replacement of Q3.5-35B-A3B for some custom agents and G4 doesn't respect the prompt rules at all. (I added <|think|> in the system prompt as described (but have not spend time checking if the reasoning was effectively on). I'll need to investigate further but it doesn't seem promising.

I also tried G4 26B A4B with images in the webui, and it works quite well.

I have not yet tried the smaller models with audio.

kpw94 · 2026-04-02T20:17:27 1775161047

> I'll need to investigate further but it doesn't seem promising.

That's what I meant by "waiting a few days for updates" in my other comment. Qwen 3.5 release, I remember a lot of complaints about: "tool calling isn't working properly" etc.

That was fixed shortly after: there was some template parsing work in llama.cpp. and unsloth pulled out some models and brought back better one for improving something else I can't quite remember, better done Quantization or something...

coder543 pointed out the same is happening regarding tool calling with gemma4: https://news.ycombinator.com/item?id=47619261

GistNoesis · 2026-04-02T20:34:36 1775162076

The model does call tools successfully giving sensible parameters but it seems to not picking the right ones in the right order.

I'll try in a few days. It's great to be able to test it already a few hours after the release. It's the bleeding edge as I had to pull the last from main. And with all the supply chain issues happening everywhere, bleeding edge is always more risky from a security point of view.

There is always also the possibility to fine-tune the model later to make sure it can complete the custom task correctly. But the code for doing some Lora for gemma4 is probably not yet available. The 50% extra speed seems really tempting.

amarshall · 2026-04-02T20:25:39 1775161539

If you are running on 4090 and get 5 t/s, then you exceeded your VRAM and are offloading to the CPU (or there is some other serious perf. issue)

mrinterweb · 2026-04-03T00:28:46 1775176126

Thank you. I have the same card, and I noticed the same ~100 TPS when I ran Q3.5-35B-A3B. G4 26B A4B running at 150TPS is a 50% performance gain. That's pretty huge.

refulgentis · 2026-04-02T19:15:19 1775157319

Reversing the X and Y axis, adding in a few other random models, and dropping all the small Qwens makes this worse than useless as a Qwen 3.5 comparison, it’s actively misleading. If you’re using AI, please don’t rush to copy paste output :/

EDIT: Lordy, the small models are a shadow of Qwen's smalls. See https://huggingface.co/Qwen/Qwen3.5-4B versus https://www.reddit.com/r/LocalLLaMA/comments/1salgre/gemma_4...

scrlk · 2026-04-02T19:44:10 1775159050

I transposed the table so that it's readable on mobile devices.

I should have mentioned that the Qwen 3.5 benchmarks were from the Qwen3.5-122B-A10B model card (which includes GPT-5-mini and GPT-OSS-120B); apologies for not including the smaller Qwen 3.5 models.

refulgentis · 2026-04-02T19:50:36 1775159436

It’s not readable on a phone either. Text wraps. unless you’re testing on foldable?

BloondAndDoom · 2026-04-02T22:38:56 1775169536

Small qwen models are magical

refulgentis · 2026-04-02T23:46:56 1775173616

It's so so good.

I have an app I've been working on for 2.5 years and felt kinda stupid making sure llama.cpp worked everywhere, including Android and iOS.

The 0.8B beats every <= 7B model I've used on tool use and can do RAG. Like you could ship it to someone who didn't know AI and it can do all the basics and leave UX intact.

scrlk · 2026-03-29T20:03:28 1774814608

TI started production at their SM1 fab back in December 2025, which focuses on 28 nm to 130 nm.

magicalhippo · 2026-03-30T11:50:27 1774871427

Because TI has a ton of microcontrollers, power management ICs, opamps and so on that doesn't need or is even desirable to produce on smaller processes.

scrlk · 2026-03-16T00:15:10 1773620110

An interesting read related to this bug from Joel Spolsky - My First BillG Review: https://www.joelonsoftware.com/2006/06/16/my-first-billg-rev...

scrlk · 2026-03-12T23:12:57 1773357177

Plus, having $50M sitting in a mobile hot wallet.

scrlk · 2026-03-11T23:00:04 1773270004

The irony is that, on a technicality, the hereditary peers were the only members of the Lords who had to win an election to get their seats.

> Under the reforms of the House of Lords Act 1999, the majority of hereditary peers lost the right to sit as members of the House of Lords, the upper house of the Parliament of the United Kingdom. Section 2 of the Act, however, provides an exception from this general exclusion of membership for up to 92 hereditary peers: 90 to be elected by the House, as well as the holders of two royal offices, the Earl Marshal and the Lord Great Chamberlain, who sit as ex officio members.

https://en.wikipedia.org/wiki/List_of_excepted_hereditary_pe...

cm2187 · 2026-03-11T23:14:03 1773270843

Yeah, the assumption is that the non hereditary peers are somehow more representative, but all they represent is being friends of the PM of the time. It's a historical oddity of questionable usefulness. Meanwhile the house of commons can wipe out any civil liberty with a majority of 50% plus one vote. It is remarkable how a system that seems so unstable and prone to abuses of power has served the longest continuously running democracy for so long.

skissane · 2026-03-12T00:10:26 1773274226

> Yeah, the assumption is that the non hereditary peers are somehow more representative, but all they represent is being friends of the PM of the time

There is an informal understanding that the government gives a certain number of life peerages to the opposition and minor parties, subject to the government being able to veto individual appointments they find objectionable. So it literally isn’t true that everyone gets one by being friends with the PM-although it certainly helps

Some parties reject their entitlement-the only reason why there are no SNP life peers, is the SNP has a longstanding policy to refuse to appoint any. There are currently 76 LibDem peers, 6 DUP, 3 UUP, 2 Green and 2 Plaid Cymru. SNP would very quickly get some too if they ever changed their mind about refusing the offer. The Northern Ireland nationalist parties (Sinn Fein and SDLP) likewise have a policy against nominating life peers.

cm2187 · 2026-03-12T05:38:22 1773293902

So the correction is “friends of the PM, and a few other key politicians”. Still a club of people who represent no one. And more problematic, are accountable to no one.

scrlk · 2026-03-11T23:30:43 1773271843

As Walter Bagehot wrote in The English Constitution: "An ancient and ever-altering constitution is like an old man who still wears with attached fondness clothes in the fashion of his youth: what you see of him is the same; what you do not see is wholly altered."

Absent ideological capture, it is perhaps one of the best forms of government ever created due to its pragmatic nature and its Lindyness is proof.

tehjoker · 2026-03-11T23:49:01 1773272941

50% + 1 is called democracy. Civil liberties are more liable to be swept away by minorities that come to power. In the US, the republicans often do this because they have minority popular support but a disproportionate representation in government. So the key is to make sure that it's 50% + 1 but also representative of the real population.

The nobility is another example of a minority with disproportionate power. It's important that they are reduced to ensure civil liberties.

cm2187 · 2026-03-12T00:08:42 1773274122

All other democracies have safeguards against the tyranny of the majority. Whether it is representativity by state in the US or in the EU, a constitution requiring a large consensus to change in the US, or the senate being elected by the elected officials of small cities in France, it is not true that democracy is just 50% + 1 vote.

hunterpayne · 2026-03-12T06:37:33 1773297453

What you describe is called a Republic. Pure democracy is precisely 50% + 1 vote.

troad · 2026-03-13T01:18:52 1773364732

Worth noting that the distinction between democracy and republic that you're clearly advocating here is a usage particular to Americans. It doesn't have much currency elsewhere.

Countries like the Netherlands, Denmark etc all have safeguards the dilute the power of 50% + 1, and yet they are clearly not republics, being monarchies.

Political scientists tend to talk more of 'liberal democracy' (whether republican or monarchical) v 'electoral autocracy' etc. This depends on the classical use of the term 'liberal' of course, which is another word that Americans tend to use differently from everyone else.

> The nobility is another example of a minority with disproportionate power. It's important that they are reduced to ensure civil liberties.

Alexis de Tocqueville would disagree - he believed that intermediate institutions (churches, professions, elites, etc) blunt the power of the state before it reaches average people. A society without intermediate institutions is one where you have an all-powerful state on the one hand, and a largely un-coordinated mass of average people on the other. He thought this was the highway to democratic despotism. (Worth noticing that totalitarian governments focus a lot of their energy on destroying alternative centres of power such as these.)

scrlk · 2026-03-03T23:28:58 1772580538

Incremental. If you want a major change, the M6 MBP is rumoured to launch towards the end of the year. It's expected to bring a new design and an OLED touchscreen.

https://www.bloomberg.com/news/articles/2026-02-24/apple-s-t... (https://archive.ph/qT3QV)

scrlk · 2026-02-19T17:39:16 1771522756

IME, they definitely nerf models. gemini-2.5-pro-exp-03-25 through AI Studio was amazing at release and steadily degraded. The quality started tanking around the time they hid CoT.

scrlk · 2026-02-14T16:57:24 1771088244

> Claude was used to do things their guidelines prohibit (facilitate violence, develop weapons or conduct surveillance)

There's Claude Gov models for this:

> U.S. national security customers may choose to use our AI systems for a wide range of applications from strategic planning and operational support to intelligence analysis and threat assessment. Claude Gov models deliver enhanced performance for critical government needs and specialized tasks. This includes:

> * Improved handling of classified materials, as the models refuse less when engaging with classified information

> * Greater understanding of documents and information within the intelligence and defense contexts

> * Enhanced proficiency in languages and dialects critical to national security operations

> * Improved understanding and interpretation of complex cybersecurity data for intelligence analysis

https://www.anthropic.com/news/claude-gov-models-for-u-s-nat...

scrlk · 2026-02-02T15:05:39 1770044739

Cf. -2000 Lines Of Code:

https://www.folklore.org/Negative_2000_Lines_Of_Code.html