I often read something on HN and come back a few days or weeks later to find it, and the current keyword-based search has consistently given me a hard time. So I played around with LLMs as an alternative way of searching and finding information on HN.
That's a fairly specialized chip and requires a bunch of custom software. The only way it can run apps unmodified is if the math libraries have been customized for this chip. If the performance is there, people will buy it.
For a minute I thought maybe it was RISC-V with a big vector unit, but it's way different from that.
The quote at the end of the posted Reuters article (not the one you’re responding to) says that it doesn’t require extensive code modifications. So is the “custom software” standard for the target customers of NextSilicon?
Companies often downplay the amount of software modifications necessary to benefit from their hardware platform's strengths because quite often, platforms that cannot run software out of the box lose out compared to those that can.
In the past, by the time special chips were completed and mature, the developers of "mainstream" CPUs had typically caught up speed-wise, which is why we no longer see "transputers" (e.g. Inmos T800), LISP machines (Symbolics XL1200, TI Explorer II), or other odd architectures like the Connection Machine CM-2 around.
For example, when Richard Feynman was hired to work on the Connection Machine, he had to write a parallel version of BASIC first before he could write any programs for the computer they were selling:
https://longnow.org/ideas/richard-feynman-and-the-connection...
It's a bit more complicated: you need to use their compiler (an LLVM fork with clang+fortran). This in itself is not that special, as most accelerators already require a vendor compiler (icc, nvcc, aoc).
Modifications are likely on the level of: does this clang support my required C++ version? Actual work is only required when you want to bring something else, like Rust (AFAIK not supported).
However, to analyze the efficiency of the code and how it is interpreted by the card, you need their special toolchain. Debugging also becomes less convenient.
>> says that it doesn’t require extensive code modifications
If they provide a compiler port and update things like BLAS to support their hardware, then higher-level applications should not require much, if any, code modification.
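To make that concrete, here's a minimal sketch using numpy, which dispatches matrix multiplication to whatever BLAS it was built against (the sizes are arbitrary). Nothing in this application code is hardware-specific, so a vendor-supplied BLAS port would accelerate it without source changes:

```python
import numpy as np

# Plain application code: no hardware-specific calls anywhere.
# The matmul below dispatches to the BLAS (dgemm) that numpy was
# built against, so porting BLAS to new hardware speeds this up
# with zero changes to the application source.
a = np.random.rand(4096, 4096)
b = np.random.rand(4096, 4096)
c = a @ b
print(c.shape)  # (4096, 4096)
```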
Yeah, it's an unfortunate overlap.
In NextSilicon terminology, the Mill-Core is the software-defined "configuration" of the chip, so to speak: it represents the swaths of the application that are deemed worthy of acceleration, as expressed on the custom HW.
So really, the Mill-Core is in a way the expression of the customer's code.
A framework for optimizing LLM agents, including but not limited to RL. You can even do fine-tuning; they have an example with unsloth in there.
The design of this is pretty nice: it's based on very simple instrumentation you add to your agent, and the rest happens in parallel while your workload runs, which is awesome.
You can probably also do what DSPy does for optimizing prompts, but without having to rewrite against the DSPy API, which can be a big win.
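For flavor, here is a purely hypothetical sketch of the general pattern being described: decorator-based tracing of agent steps, with the optimizer consuming traces on a background thread while the workload runs. This is generic Python, not this framework's actual API; every name in it is made up:

```python
import functools
import json
import queue
import threading

# Traces flow from the instrumented agent to a background optimizer.
trace_queue: queue.Queue = queue.Queue()

def instrument(step_name: str):
    """Hypothetical decorator: record each agent step's inputs and
    outputs so an optimizer can learn from them later."""
    def wrap(fn):
        @functools.wraps(fn)
        def inner(*args, **kwargs):
            result = fn(*args, **kwargs)
            trace_queue.put({"step": step_name,
                             "args": repr(args),
                             "result": repr(result)})
            return result
        return inner
    return wrap

@instrument("summarize")
def summarize(text: str) -> str:
    return text[:100]  # stand-in for an actual LLM call

def optimizer_loop():
    # Runs in parallel with the workload, consuming traces.
    while True:
        print("collected trace:", json.dumps(trace_queue.get()))

threading.Thread(target=optimizer_loop, daemon=True).start()
summarize("The agent keeps running; optimization happens off to the side.")
```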
We are very excited about this integration with HF datasets. Datasets have huge potential to deliver some much-needed developer experience when it comes to working with data and LLMs/agentic architectures. Happy to answer any questions, and also to hear what the community thinks.
LLMs have the potential to compress the cost of learning new programming models. The current moats built around that cost will start to dissolve and that's a good thing.
That’s the whole reason for the existence of Iceberg, Delta, and Hudi, right?
Not as easy as just appending metadata to a parquet file, but on the other hand, parquet was never designed with that functionality in mind, and probably shouldn't be.
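For what it's worth, a minimal sketch of the "metadata in the file" approach with pyarrow (the file name and metadata key are made up). Key-value metadata is baked into the Parquet footer at write time, with no transactional append afterwards, which is exactly the gap Iceberg/Delta/Hudi fill with an external commit log:

```python
import pyarrow as pa
import pyarrow.parquet as pq

table = pa.table({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})

# Key-value metadata lives in the file footer and is fixed when
# the file is written; there is no safe in-place "append" later.
table = table.replace_schema_metadata({b"snapshot": b"v1"})
pq.write_table(table, "data.parquet")  # hypothetical file name

print(pq.read_schema("data.parquet").metadata[b"snapshot"])  # b'v1'
```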
The valuable lesson from what Cloudflare claims is that if you want to make an LLM perform as you expect, you have to build around its strengths and weaknesses.
You can see the same behavior if you ask an LLM to code against an API that is not commonly used.
When it comes to MCP tooling, I followed a different path but with similar assumptions.
There are tools that LLMs have been RL'd to death to use, so I'm modeling my tools after them.
Specifically, I try to have a “glob” tool to let the LLM figure out structure, plus a search tool and a read tool, and I use regexps as much as possible for passing parameters.
It has been working well, at least in terms of the model knowing how to invoke and use the tools.
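As a rough sketch of that tool shape, using the official Python MCP SDK's FastMCP server (the server name and tool bodies here are my own illustration, not the commenter's actual implementation):

```python
import re
from pathlib import Path

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("code-search")  # hypothetical server name

@mcp.tool()
def glob(pattern: str) -> list[str]:
    """List files matching a glob pattern, e.g. 'src/**/*.py'.
    Mirrors the glob tool coding agents are heavily trained on."""
    return [str(p) for p in Path(".").glob(pattern)]

@mcp.tool()
def search(pattern: str, path: str = ".") -> list[str]:
    """Grep-style regexp search; returns 'file:line: text' hits."""
    hits = []
    for f in Path(path).rglob("*"):
        if not f.is_file():
            continue
        try:
            for i, line in enumerate(f.read_text().splitlines(), 1):
                if re.search(pattern, line):
                    hits.append(f"{f}:{i}: {line.strip()}")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
    return hits

@mcp.tool()
def read(path: str) -> str:
    """Return the full contents of a file."""
    return Path(path).read_text()

if __name__ == "__main__":
    mcp.run()
```

Keeping the names and argument conventions close to the built-in tools of coding agents is the point: the model has already learned when and how to call them.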
I have to say, though, that each model is different. I see differences between Claude Code and Codex when I use the MCP for development, at least in how good they are at retrieving the information they need.
Maybe I should run some benchmarks and compare more formally.
We had to update the instructions in the README to fix some issues with running the server locally. Apologies for any pain caused to people who might have tried already.
The updated README now has instructions that will work!
It's great to see content coming from people with real experience of the impact of AI.
The balance between more productivity and fewer people is probably something that each team and company has to figure out, but there has to be some kind of limit on both.
It would be great to understand that part better; it would make it easier to reason about the longer-term impact of AI on the market.
There are other formats it can be compared to, though.
The Lance columnar format is one: https://github.com/lancedb/lancedb
And Nimble from Meta is another: https://github.com/facebookincubator/nimble
Parquet is so core to data infra and so widespread that removing it from its throne is a really, really hard task.
The people behind these projects who are willing to try have my total respect.