
Can someone chime in on how much VRAM is needed for fine-tuning vs. inference? I have a GTX 1080 (8 GB), which can fit flan-t5-large or DeBERTa for inference, but how much more memory does training need? I suppose the gradients (and the optimizer state) have to be kept somewhere too. Also, how many examples are enough for simple classification? And does a multilingual model transfer what it learns if my fine-tuning data covers only some of its languages?
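For what it's worth, here is my own back-of-the-envelope attempt, as a rough sketch rather than a real answer. It assumes fp32 weights, full fine-tuning with Adam (two extra fp32 values per parameter), and roughly 780M parameters for flan-t5-large. It ignores activations, batch size, and framework overhead entirely, so real usage will be higher. The function name and its defaults are just made up for illustration.

```python
def estimate_vram_gib(n_params, bytes_per_param=4, training=False,
                      optimizer_states_per_param=2):
    """Rough lower bound on memory for weights (+ gradients and optimizer state).

    Assumptions (mine, not measured): fp32 by default, Adam-style optimizer
    keeping two fp32 values per parameter. Activations are NOT counted.
    """
    total_bytes = n_params * bytes_per_param  # model weights
    if training:
        total_bytes += n_params * bytes_per_param  # one gradient per weight
        total_bytes += n_params * optimizer_states_per_param * 4  # fp32 optimizer state
    return total_bytes / 2**30  # bytes -> GiB


# Assumed parameter count for flan-t5-large (approximate).
FLAN_T5_LARGE_PARAMS = 780_000_000

inference_gib = estimate_vram_gib(FLAN_T5_LARGE_PARAMS)
training_gib = estimate_vram_gib(FLAN_T5_LARGE_PARAMS, training=True)

print(f"inference (weights only): ~{inference_gib:.1f} GiB")
print(f"full fine-tuning (weights + grads + Adam): ~{training_gib:.1f} GiB")
```

If that arithmetic is even roughly right, weights alone fit comfortably in 8 GB, but full fine-tuning with Adam would not before activations are even counted, which is why I'm asking whether people get around this in practice.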

