I often read something on HN and come back a few days or weeks later to find it, and the current keyword-based search has consistently given me a hard time. So I played around with LLMs as an alternative way of searching and finding information on HN.
That's a fairly specialized chip and requires a bunch of custom software. The only way it can run apps unmodified is if the math libraries have been customized for this chip. If the performance is there, people will buy it.
For a minute I thought maybe it was RISC-V with a big vector unit, but it's way different from that.
The quote at the end of the posted Reuters article (not the one you’re responding to) says that it doesn’t require extensive code modifications. So is the “custom software” standard for the target customers of NextSilicon?
Companies often downplay the amount of software modifications necessary to benefit from their hardware platform's strengths because quite often, platforms that cannot run software out of the box lose out compared to those that can.
In the past, by the time special chips were completed and mature, the developers of "mainstream" CPUs had typically caught up speed-wise, which is why we no longer see "transputers" (e.g. Inmos T800), LISP machines (Symbolics XL1200, TI Explorer II), or other odd architectures like the Connection Machine CM-2 around.
For example, when Richard Feynman was hired to work on the Connection Machine, he had to write a parallel version of BASIC first before he could write any programs for the computer they were selling:
https://longnow.org/ideas/richard-feynman-and-the-connection...
It's a bit more complicated: you need to use their compiler (an LLVM fork with clang+fortran). This in itself is not that special, as most accelerators already require a vendor compiler (icc, nvcc, aoc).
Modifications are likely on the level of: does this clang support my required C++ version? Actual work is only required when you want to bring something else, like Rust (AFAIK not supported).
However, to analyze the efficiency of the code and how it is interpreted by the card, you need their special toolchain. Debugging also becomes less convenient.
>> says that it doesn’t require extensive code modifications
If they provide a compiler port and update things like BLAS to support their hardware, then higher-level applications should not require much, if any, code modification.
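To make that concrete, here's a minimal sketch using numpy, which dispatches matrix multiplication to whatever BLAS it was built against (the sizes are arbitrary). Nothing in this application code is hardware-specific, so a vendor-supplied BLAS port would accelerate it without source changes:

```python
import numpy as np

# Plain application code: no hardware-specific calls anywhere.
# The matmul below dispatches to the BLAS (dgemm) that numpy was
# built against, so porting BLAS to new hardware speeds this up
# with zero changes to the application source.
a = np.random.rand(4096, 4096)
b = np.random.rand(4096, 4096)
c = a @ b
print(c.shape)  # (4096, 4096)
```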
Yeah, it's an unfortunate overlap.
In NextSilicon terminology, the Mill-Core is the software-defined "configuration" of the chip, so to speak: it represents the swaths of the application that are deemed worthy of acceleration, as expressed on the custom HW.
So really, the Mill-Core is in a way the expression of the customer's code.
A framework for optimizing LLM agents, including but not limited to RL. You can even do fine-tuning; they have an example with unsloth in there.
The design of this is pretty nice: it's based on very simple instrumentation you add to your agent, and the rest happens in parallel while your workload runs, which is awesome.
You can probably also do what DSPy does for optimizing prompts, but without having to rewrite against the DSPy API, which can be a big win.
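For flavor, here is a purely hypothetical sketch of the general pattern being described: decorator-based tracing of agent steps, with the optimizer consuming traces on a background thread while the workload runs. This is generic Python, not this framework's actual API; every name in it is made up:

```python
import functools
import json
import queue
import threading

# Traces flow from the instrumented agent to a background optimizer.
trace_queue: queue.Queue = queue.Queue()

def instrument(step_name: str):
    """Hypothetical decorator: record each agent step's inputs and
    outputs so an optimizer can learn from them later."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            result = fn(*args, **kwargs)
            trace_queue.put({"step": step_name,
                             "args": repr(args),
                             "result": repr(result)})
            return result
        return inner
    return wrap

@instrument("summarize")
def summarize(text: str) -> str:
    return text[:100]  # stand-in for an actual LLM call

def optimizer_loop():
    # Runs in parallel with the workload, consuming traces.
    while True:
        print("collected trace:", json.dumps(trace_queue.get()))

threading.Thread(target=optimizer_loop, daemon=True).start()
summarize("The agent keeps running; optimization happens off to the side.")
```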
We are very excited about this integration with HF datasets. Datasets have huge potential to deliver some much-needed developer experience when it comes to working with data and LLMs/agentic architectures. Happy to answer any questions, and also to hear what the community thinks.
LLMs have the potential to compress the cost of learning new programming models. The current moats built around that cost will start to dissolve and that's a good thing.
That’s the whole reason for the existence of Iceberg, Delta, and Hudi, right?
Not as easy as just appending metadata to a parquet file, but on the other hand, parquet was never designed with that functionality in mind, and probably shouldn't be.
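For what it's worth, a minimal sketch of the "metadata in the file" approach with pyarrow (the file name and metadata key are made up). Key-value metadata is baked into the Parquet footer at write time, with no transactional append afterwards, which is exactly the gap Iceberg/Delta/Hudi fill with an external commit log:

```python
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})

# Key-value metadata lives in the file footer and is fixed when
# the file is written; there is no safe in-place "append" later.
table = table.replace_schema_metadata({b"snapshot": b"v1"})
pq.write_table(table, "data.parquet")  # hypothetical file name

print(pq.read_schema("data.parquet").metadata[b"snapshot"])  # b'v1'
```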
The valuable lesson from what Cloudflare claims is that if you want to make an LLM perform as you expect, you have to build around its strengths and weaknesses.
You can see the same behavior if you ask an LLM to code against an API that is not commonly used.
When it comes to MCP tooling, I followed a different path but with similar assumptions.
There are tools that LLMs have been RL'd to death to use, so I'm modeling my tools after them.
Specifically, I try to have a “glob” tool to let the LLM figure out structure, plus a search tool and a read tool, and I use regexps as much as possible for passing parameters.
It has been working well, at least in terms of the model knowing how to invoke and use the tools.
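As a rough sketch of that tool shape, using the official Python MCP SDK's FastMCP server (the server name and tool bodies here are my own illustration, not the commenter's actual implementation):

```python
import re
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("code-search")  # hypothetical server name

@mcp.tool()
def glob(pattern: str) -> list[str]:
    """List files matching a glob pattern, e.g. 'src/**/*.py'.
    Mirrors the glob tool coding agents are heavily trained on."""
    return [str(p) for p in Path(".").glob(pattern)]

@mcp.tool()
def search(pattern: str, path: str = ".") -> list[str]:
    """Grep-style regexp search; returns 'file:line: text' hits."""
    hits = []
    for f in Path(path).rglob("*"):
        if not f.is_file():
            continue
        try:
            for i, line in enumerate(f.read_text().splitlines(), 1):
                if re.search(pattern, line):
                    hits.append(f"{f}:{i}: {line.strip()}")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    return hits

@mcp.tool()
def read(path: str) -> str:
    """Return the full contents of a file."""
    return Path(path).read_text()

if __name__ == "__main__":
    mcp.run()
```

Keeping the names and argument conventions close to the built-in tools of coding agents is the point: the model has already learned when and how to call them.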
I have to say, though, that each model is different. I see differences between Claude Code and Codex when I use the MCP for development, at least in how good they are at retrieving the information they need.
Maybe I should run some benchmarks and compare more formally.
We had to update the instructions in the README to fix some issues with running the server locally. Apologies for any pain caused to people who might have tried already.
The updated README now has instructions that will work!
It's great to see content coming from people with real experience of the impact of AI.
The balance between more productivity and fewer people is probably something that each team and company has to figure out, but there has to be some kind of limit on both.
It would be great to understand that part better; it would make it easier to reason about the longer-term impact of AI on the market.
There are other formats it can be compared to, though.
The Lance columnar format is one: https://github.com/lancedb/lancedb
And Nimble from Meta is another: https://github.com/facebookincubator/nimble
Parquet is so core to data infra and so widespread that removing it from its throne is a really, really hard task.
The people behind these projects who are willing to try have my total respect.