>You will be able to detect very small effects, but if there is no difference, you are more likely to draw the correct conclusion with 1,000,000 samples than with 100 samples.
I believe this is the tweet author's point: they stated that in the real world a null hypothesis never exactly holds true, so with a large enough sample you can detect even the tiniest of differences.
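A quick simulation shows the idea (a minimal sketch; the 0.01-standard-deviation shift is my arbitrary stand-in for a "real but practically meaningless" difference):

```python
# Sketch: a tiny true difference (0.01 SD) is invisible at n=100
# but reliably "statistically significant" at n=1,000,000.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
for n in (100, 1_000_000):
    a = rng.normal(0.00, 1.0, n)  # control group
    b = rng.normal(0.01, 1.0, n)  # tiny, practically meaningless shift
    result = stats.ttest_ind(a, b)
    print(f"n={n:>9,}  p={result.pvalue:.4g}")
# Typical output: p well above 0.05 at n=100, p far below 0.05 at n=1,000,000.
```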
>given no actual difference
I think the error here is assuming a priori knowledge of no difference, whereas the author is stating that in real-life scenarios there will always be some difference, even if it's too minute to be of practical significance. We can fabricate "no difference" in simulations, but in real experiments there will almost always be some variance, even if it's just an artifact of the measurement process rather than caused by the independent variable(s). Whether or not the differences are statistically significant depends on the experimental design, including sample size.
So while I understand your valid point, I think the author's claim was more about the practical application of statistics than about the mathematical precision of simulated examples. But I could be misinterpreting.
If you're performing a rigorous experiment, then you have a control group, and your null hypothesis is that the experimental group will be the same. In actuality you will find that many null hypotheses hold true and the treatment has no effect on the outcome at all. Of course in some fields, like psych, most everything has some effect. But it's absolutely not correct to blame p-values for helping you distinguish between no effect and astonishingly small effects. They're functioning correctly. Separating trivial effects from meaningful ones is a secondary problem to solve, usually by stating a minimum effect size of interest.
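A sketch of that "functioning correctly" claim: when the null is exactly true, the false-positive rate stays pinned near the chosen alpha no matter how large the sample gets (assuming the test's assumptions hold; the sample sizes here are arbitrary):

```python
# Sketch: under a true null, larger samples do NOT inflate the
# false-positive rate; it stays near alpha (5%) by construction.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
alpha, trials = 0.05, 1000
for n in (100, 10_000):
    rejections = 0
    for _ in range(trials):
        a = rng.normal(0, 1, n)
        b = rng.normal(0, 1, n)  # same distribution: the null is true
        if stats.ttest_ind(a, b).pvalue < alpha:
            rejections += 1
    print(f"n={n:>6}  false-positive rate ~ {rejections / trials:.3f}")
# Both rates land near 0.05 regardless of n.
```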
A far greater problem is the false discovery rate: you test 20 different things at once and by chance identify one of them as significant even though the true effect size is zero. This is another area where increasing your sample size can help, but even then you need to acknowledge that your tools are imperfect.
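The 20-tests problem is easy to demonstrate, and corrections like Benjamini-Hochberg exist precisely for it (a sketch using statsmodels' multipletests; all 20 simulated metrics are true nulls):

```python
# Sketch: run 20 tests where every null is true. Uncorrected, the chance
# of at least one p < 0.05 is about 1 - 0.95**20, roughly 64%. An FDR
# correction (Benjamini-Hochberg) pulls the family-wide error back down.
import numpy as np
from scipy import stats
from statsmodels.stats.multitest import multipletests

rng = np.random.default_rng(2)
pvals = []
for _ in range(20):  # 20 metrics, none actually affected
    a = rng.normal(0, 1, 200)
    b = rng.normal(0, 1, 200)
    pvals.append(stats.ttest_ind(a, b).pvalue)

naive_hits = sum(p < 0.05 for p in pvals)
reject, _, _, _ = multipletests(pvals, alpha=0.05, method="fdr_bh")
print(f"uncorrected 'significant' results: {naive_hits}")
print(f"after Benjamini-Hochberg:          {reject.sum()}")
```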
>In actuality you will find that many null hypotheses hold true
I'm assuming you mean within the confines of the experiment, correct? I agree. The tweet author was alluding to the fact that "IRL" the null hypothesis is almost never true at the population level, meaning that if you grab a large enough sample you will detect very, very small differences. (This was her Lucky Charms ~ blood type example in the tweet.) I also agree with that. I don't think the two claims are mutually exclusive, and the fact that they can coexist is (I believe) precisely her point about sample size.
I mean, there are many real-world examples where A obviously has no effect on B. The number I'm thinking of has, truly, no effect on the time since you last blinked. No sample size will change that.
Lucky Charms probably does relate to blood type in some impossibly small way; it makes sense that a biological trait has some relation to dietary consumption. I don't think any non-garbage-tier peer-reviewed journal would publish it, but good on p-values for helping us detect it, and good on sample size for making it possible to discern that effect with a high degree of confidence.
It's worth noting that we also have many other tools to help us. For example, you can compute, given an expected effect size and a sample size, the probability of getting a statistically significant result, a non-significant result, or a significant result that erroneously goes in the wrong direction. Or the range of likely true effect sizes given an observed significant sample difference.
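For instance, a standard prospective power analysis answers the "probability of a significant result" question directly (a sketch with statsmodels; the effect size and alpha here are illustrative, not from the thread):

```python
# Sketch: power analysis for a two-sample t-test. Given an expected
# effect size (Cohen's d) and alpha, how does the probability of
# detecting the effect change with sample size?
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
d, alpha = 0.2, 0.05  # illustrative "small" effect, conventional alpha
for n in (50, 200, 1000):
    power = analysis.power(effect_size=d, nobs1=n, alpha=alpha)
    print(f"n={n:>4} per group  power={power:.2f}")

# Or invert it: how many samples per group for 80% power?
n_needed = analysis.solve_power(effect_size=d, alpha=alpha, power=0.8)
print(f"n per group for 80% power: {n_needed:.0f}")
```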
We want large samples; they enhance confidence in findings. The author's premise seems to be that it's better not to know small rocks exist if we're only looking for big rocks, but that ignores the fact that the tools which find small rocks also help us identify big rocks with more clarity.