vLLM inference of Mixtral in fp16 is a real workload. I guess the details are specified because a different inference engine is used. You want the compute tasks to be as similar as possible, but the compute kernels can't be identical, since in the end they have to run on different hardware.