[old scientist, pointing at some data] After decades of research, thousands of experiments, a massive amount of peer reviewing, we can finally confidently conclude…
[smug dude with a ridiculous hairstyle] Uh yeah, but this TikTok by PatriotEagle1776 says your research is wrong
Hence the need for a larger sample to get something statistically significant, which might not be practical due to cost.
Some methods suck no matter how much data you throw at them.
The study I was referencing had thousands of people taking its survey, and the data quality was still terrible, because that’s what you get when you ask people to recall what they ate over the past 20-30 years. Adding yet more people won’t clean up the data, and would add enough cost that it’d be cheaper to run close observational studies of 100 people, which would actually produce usable results.
The general guideline for epidemiological studies (which both of my examples are) is that you cannot draw conclusions from a relative risk increase of less than 100%.
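To make that rule of thumb concrete, here’s a rough sketch (the counts are entirely made up for illustration, not from any study mentioned here) of computing relative risk from a 2x2 exposure/outcome table. An RR below 2.0, i.e. less than a 100% increase in risk, wouldn’t clear the bar:

```python
# Hypothetical 2x2 table: all counts invented for illustration.
exposed_cases, exposed_total = 30, 1000      # 3.0% risk among exposed
unexposed_cases, unexposed_total = 20, 1000  # 2.0% risk among unexposed

risk_exposed = exposed_cases / exposed_total
risk_unexposed = unexposed_cases / unexposed_total
relative_risk = risk_exposed / risk_unexposed

# Percent increase in risk relative to the unexposed group:
increase = (relative_risk - 1) * 100
print(f"RR = {relative_risk:.2f}, increase = {increase:.0f}%")
# Here RR = 1.50, a 50% increase: below the 100% rule of thumb,
# so no conclusion, no matter how many more rows you add.
```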
So please stop with the blanket statement that “more data means better results”. It’s not true, and it’s the same claim that AI tech bros keep making to fleece gullible investors.
More data does mean better results.
So when I can’t get a useful trendline on a graph of % of redheads born per number of bananas eaten by the mother, you’re saying it’s because I didn’t collect enough data? Why didn’t I think of that?
No trend is also a result; more data means more confidence in it.
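A quick simulation of that point (purely hypothetical data, no bananas involved): when there’s no true relationship, fitting a line to more noise doesn’t conjure a trend, but it does shrink the confidence interval on the slope toward zero, which is exactly what “more confidence in a null result” means:

```python
import numpy as np

rng = np.random.default_rng(0)

def slope_ci_width(n):
    """Fit a line to pure noise and return the width of a ~95% CI on the slope."""
    x = rng.uniform(0, 10, n)
    y = rng.normal(0, 1, n)  # no true trend: y is independent of x
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (slope * x + intercept)
    # Standard error of the slope in simple linear regression:
    se = np.sqrt(np.sum(resid**2) / (n - 2) / np.sum((x - x.mean())**2))
    return 2 * 1.96 * se

for n in (50, 500, 5000):
    print(n, round(slope_ci_width(n), 4))
```

The fitted slope stays near zero at every sample size; what changes is how tightly the data pin it there.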