
There were initial difficulties with finetuning that made it less appealing early on, and that has snowballed a bit into a broader focus on RAG.

Some of the issues still exist, of course:

* Finetuning takes time and compute; for one-off queries, in-context learning is vastly more efficient (i.e., look it up with RAG).

* Early finetuning efforts had trouble reliably memorizing information. We now have a much better idea of how to add information to a model, though it takes more training data.

* Full finetuning is very VRAM-intensive; optimizations like LoRA were initially good at transferring style but not content. Today, LoRA content training is viable, but it requires training code that supports it [1] (see the sketch after this list).

* If you need a very specific memorized result and it's costly to get it wrong, good RAG is pretty much always going to be more efficient, since it injects the exact text into the context. (Bad RAG makes the problem worse, of course.)

* Finetuning requires more technical knowledge: you've got to understand the hyperparameters, avoid underfitting and overfitting, evaluate the results, etc.

* Finetuning requires more data. RAG works with a handful of datapoints; finetuning requires at least three orders of magnitude more.

* Finetuning requires extra effort to avoid forgetting what the model already knows.

* RAG works pretty well when the task you are trying to perform is well-represented in the training data.

* RAG works when you don't have direct control over the model (i.e., API use).

* You can't finetune most of the closed models.

* Big, general models have outperformed specialized models over the past couple of years; if it doesn't work now, just wait for OpenAI to make their next model better at your particular task.
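To make the LoRA point concrete, here's a minimal sketch of content-oriented LoRA finetuning using Hugging Face transformers + peft (the post in [1] uses Unsloth's own API; this is the generic equivalent). The base model, every hyperparameter, and the general-text "replay" mix are illustrative assumptions to tune, not recommendations:

    # Minimal LoRA content-finetuning sketch. Assumptions: model name,
    # hyperparameters, and replay ratio are all placeholders.
    from datasets import Dataset, concatenate_datasets
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    model_name = "meta-llama/Llama-3.1-8B"  # assumption: any causal LM works
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    tokenizer.pad_token = tokenizer.eos_token
    model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

    # For content (not just style), target more modules and use a higher rank
    # than a typical style LoRA -- this mirrors the advice in [1].
    model = get_peft_model(model, LoraConfig(
        r=64, lora_alpha=64, lora_dropout=0.0,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                        "gate_proj", "up_proj", "down_proj"]))

    my_domain_corpus = ["...your documents, chunked into passages..."]
    general_corpus = ["...a slice of general text, mixed in as 'replay'..."]

    def tokenize(batch):
        return tokenizer(batch["text"], truncation=True, max_length=2048)

    # Mixing some general text back in is one cheap guard against catastrophic
    # forgetting (the "extra effort" bullet above); the ratio is a knob.
    domain = Dataset.from_dict({"text": my_domain_corpus}).map(tokenize, batched=True)
    general = Dataset.from_dict({"text": general_corpus}).map(tokenize, batched=True)
    train = concatenate_datasets([domain, general]).shuffle(seed=0)

    trainer = Trainer(
        model=model,
        args=TrainingArguments(output_dir="lora-out", num_train_epochs=2,
                               per_device_train_batch_size=2,
                               gradient_accumulation_steps=8,
                               learning_rate=1e-4, bf16=True,
                               logging_steps=10, save_strategy="epoch"),
        train_dataset=train,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    )
    trainer.train()

The hyperparameter/eval bullet still applies: you'd want to watch loss on a held-out slice of the domain corpus, plus a general benchmark, before trusting the result.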

On the other hand:

* Finetuning generalizes better.

* Finetuning has more influence on token distribution.

* Finetuning is better at learning new tasks that aren't as present in the pretraining data.

* Finetuning can change the style of output (e.g., instruction training).

* When finetuning pays off, it gives you a bigger moat (no one else has that particular model).

* You control which tasks you are optimizing for, without having to wait for other companies to maybe fix your problems for you.

* You can run a much smaller, faster specialized model because it's been optimized for your tasks.

* Finetuning + RAG outperforms RAG alone. Not by a lot, admittedly, but there are some advantages (a rough sketch of the combination follows this list).
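Mechanically, the combination is simple: serve the adapter from the earlier sketch and still inject retrieved text at inference time. The model name and "lora-out" path carry over from that sketch; the retrieved snippet is a placeholder for a real document-store lookup:

    # Sketch: LoRA-finetuned model + retrieved context at inference time.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "meta-llama/Llama-3.1-8B"  # same base as the finetune sketch
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    base = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")
    model = PeftModel.from_pretrained(base, "lora-out")  # adapter from above

    retrieved = "...exact snippet pulled from your document store..."
    prompt = f"Context:\n{retrieved}\n\nQuestion: <your question>\nAnswer:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=128)
    print(tokenizer.decode(out[0], skip_special_tokens=True))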

Plus, RL training for reasoning has been delivering unexpectedly large improvements from relatively small amounts of data and compute.

So there are reasons to do both, but the larger investment that finetuning requires means that RAG has generally been more popular. The past couple of years have been won by the bigger models scaling fast, but as the difficulty of finetuning drops, there's a bit more reason to do your own.

That said, for the moment the expertise + expense + time of finetuning makes it a tough business proposition unless you have a very well-defined task to perform, a large dataset to leverage, or some other way to get an advantage over the multi-billion-dollar investment in the big models.

[1] https://unsloth.ai/blog/contpretraining



So is this a good summary:

1. If you have a large corpus of valuable data that the big corporations don't have access to, you can benefit from fine-tuning on that data.

2. Otherwise, just use RAG.


That summary's not wrong; it's just reductionist.

Fine-tuning makes sense when you need behavioral shifts (style, tone, bias) or are training on data unavailable at runtime.

RAG excels when you want factual augmentation without retraining the whole damn brain (minimal sketch below).
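Concretely, the whole trick fits in a few lines: embed, retrieve, inject. The embedding model and prompt template here are illustrative assumptions, not the one true way to do it:

    # Minimal RAG sketch: retrieve the nearest snippets and inject the exact
    # text into the prompt -- no weight updates anywhere.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumption: any embedder
    docs = ["...snippet one...", "...snippet two...", "...snippet three..."]
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)

    def retrieve(query: str, k: int = 2) -> list[str]:
        # Cosine similarity reduces to a dot product on normalized vectors.
        q = embedder.encode([query], normalize_embeddings=True)[0]
        return [docs[i] for i in np.argsort(doc_vecs @ q)[::-1][:k]]

    def build_prompt(query: str) -> str:
        context = "\n\n".join(retrieve(query))
        return (f"Answer using only the context below.\n\n"
                f"Context:\n{context}\n\nQuestion: {query}")

build_prompt() then goes to whatever model you're calling, local or API, which is exactly why RAG works when you can't touch the weights.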

It's not either/or — it's about cost, latency, use case, and update cycles. But hey, binaries are easier to pitch on a slide.


Thanks for the detailed comment.

I had no idea that fine-tuning for adding information is viable now. Last I checked (a year+ back), it seemed not to work well.



