
Those sound like the sort of issues which could be caused by your server silently truncating the middle of your prompts.

By default, Ollama uses a context window size of 2048 tokens.
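For anyone hitting this: the context window can be raised per session or baked into a derived model. A sketch of the usual workarounds (the model name `gemma3` is illustrative; substitute whatever tag you pulled):

```shell
# Inspect the model's metadata, including its trained context length
ollama show gemma3

# Raise the window for one interactive session
ollama run gemma3
>>> /set parameter num_ctx 8192

# Or persist it in a derived model via a Modelfile containing:
#   FROM gemma3
#   PARAMETER num_ctx 8192
ollama create gemma3-8k -f Modelfile
```

API callers can pass the same setting per request as `"options": {"num_ctx": 8192}`. Note that a larger window costs proportionally more memory.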



I checked this; the whole conversation was only about 1,000 tokens.

I suspect the Ollama version may be using incorrect default settings, such as the conversation delimiters (chat template). The experience with Gemma 3 in AI Studio is completely different.
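For context, Gemma's published chat format wraps each turn in `<start_of_turn>`/`<end_of_turn>` control tokens; if a serving layer substitutes different delimiters, the model sees malformed conversations and quality degrades. A minimal sketch of a correctly delimited prompt (the function name is illustrative, not an Ollama API):

```python
def format_gemma_chat(turns):
    """Render (role, text) pairs using Gemma-style control tokens.

    Gemma uses the roles 'user' and 'model', each turn wrapped in
    <start_of_turn>...<end_of_turn> markers, with a trailing open
    model turn to cue the response.
    """
    out = []
    for role, text in turns:
        out.append(f"<start_of_turn>{role}\n{text}<end_of_turn>\n")
    out.append("<start_of_turn>model\n")  # cue the model to respond
    return "".join(out)

prompt = format_gemma_chat([("user", "Hello!")])
```

Comparing the template a given server actually applies against this format is a quick way to rule delimiters in or out as the cause.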



