Midjourney unquestionably has heavy data set curation and uses RLHF from users.
You don't have to speculate on this: custom models for SDXL, for instance, perform vastly better than vanilla SDXL at the same number of parameters. It's all data set and tagging.
That is technically true, but when the base model is wasting parameter capacity on poorly tagged, watermarked stock art and other garbage images, it's not really a meaningful distinction. Better data makes for better models; nobody cares how well a model outputs trash.
Ok, but you're severely misrepresenting the importance of things. Base SDXL is a fine model. Base SDXL is going to be much better than a materially smaller model that you've retrained with "good data".