Frequentist statistics

Frequentist probability defines the “probability” of an event as the long-run relative frequency with which it is observed over many repeated trials. Frequentist statistics applies that interpretation to inference from data.

So replace every instance of the words “chance” and “probability” with the word “frequency” (adjusting the sentence as necessary to make logical sense) and you’re golden. [1] This advice assumes you know you’re characterizing your data, not your hypothesis.

Alternatively, because this adjustment tends to lead to stilted sentences, replace “x% chance” or “0.x probability” with “x% of”.

Examples (replacing chance/probability with frequency):

  • Misleading: The probability that this 95% confidence interval contains the true parameter value is 95%.
  • Accurate: The frequency with which 95% confidence intervals (over a number of repeated trials) contain the true parameter value is 95%.
  • Misleading: The p-value for this test was 0.03, so the probability of obtaining data at least as extreme as those observed, assuming there was no difference between treatments, was 0.03.
  • Accurate: The p-value for this test was 0.03, so the frequency with which one would obtain data at least as extreme as those observed (over a number of repeated trials), assuming there was no difference between treatments, was 3%.
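The "frequency over repeated trials" reading of a confidence interval can be checked by simulation. The sketch below is only an illustration under stated assumptions: the data are normal with a known standard deviation, the interval is the textbook z-interval for a mean, and the function name and all default values (true_mean, sigma, n, trials, seed) are made up for this example.

```python
import random
import statistics


def coverage_of_95_intervals(true_mean=10.0, sigma=2.0, n=30, trials=10_000, seed=0):
    """Return the fraction of simulated 95% intervals that contain true_mean.

    Assumption: sigma is known, so each interval is mean +/- z * sigma / sqrt(n).
    """
    rng = random.Random(seed)
    z = statistics.NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # One "repeated trial": draw a fresh sample and build its interval.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        mean = statistics.fmean(sample)
        if mean - half_width <= true_mean <= mean + half_width:
            hits += 1
    return hits / trials


print(coverage_of_95_intervals())
```

Any single interval either contains the true mean or it does not; the 95% describes the fraction returned here, which should land close to 0.95 when the assumptions hold.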

Examples (replacing x% chance or 0.x probability with x% of):

  • Misleading: There is a 95% chance that this 95% confidence interval contains the true parameter value.
  • Accurate: 95% of 95% confidence intervals contain the true parameter value.
  • Misleading: There was a 3% chance (p = 0.03) of seeing observations at least this extreme, assuming there was no difference between treatments.
  • Accurate: 3% of repeated trials (p = 0.03) would see observations at least this extreme, assuming there was no difference between treatments.
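The "x% of repeated trials" reading of a p-value can also be simulated. The following is a rough sketch, not a general testing procedure: it assumes two treatment groups of equal size with normal noise of known standard deviation, and the function names, the observed difference of 0.686, and the other defaults are hypothetical values chosen so that the p-value comes out near 0.03.

```python
import random
import statistics


def fraction_at_least_as_extreme(observed_diff, sigma=1.0, n=20, trials=10_000, seed=0):
    """Fraction of repeated trials, simulated with no treatment difference,
    whose difference in group means is at least as large as observed_diff."""
    rng = random.Random(seed)
    count = 0
    for _ in range(trials):
        # Under the hypothesis, both groups come from the same distribution.
        a = [rng.gauss(0.0, sigma) for _ in range(n)]
        b = [rng.gauss(0.0, sigma) for _ in range(n)]
        diff = statistics.fmean(a) - statistics.fmean(b)
        if abs(diff) >= abs(observed_diff):
            count += 1
    return count / trials


def z_test_p_value(observed_diff, sigma=1.0, n=20):
    """Two-sided p-value for a difference in means when sigma is known."""
    standard_error = sigma * (2 / n) ** 0.5
    z = abs(observed_diff) / standard_error
    return 2 * (1 - statistics.NormalDist().cdf(z))


observed = 0.686  # hypothetical observed difference in group means
print(z_test_p_value(observed))
print(fraction_at_least_as_extreme(observed))
```

The two printed numbers should roughly agree. Both describe how often data like these turn up when the hypothesis is assumed true; neither says how often the hypothesis is true.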


[1] Frequentist statistics only helps you draw conclusions about your data, not your hypothesis. It makes more sense to talk about the frequency of observing some data than the frequency of observing some hypothesis. If you fit your statement into the template

We'd expect [frequency] of repeated trials to have [data], assuming [hypothesis],

your new statement is probably accurate. Swapping the data and the hypothesis leads to incorrect or misleading statements.

Examples (statements that sound like you're testing your hypothesis are misleading; ones that sound like you're testing your data are correct):

  • Misleading (characterizing the hypothesis): There was a 3% chance (p = 0.03) that there was no difference between treatments (because we observed these data).
    • Rephrased (still misleading; data and hypothesis are swapped): We'd expect 3% of repeated trials (p = 0.03) to have no difference between treatments, assuming the data were as observed.
    • Rephrased (accurate): We'd expect 3% of repeated trials (p = 0.03) to have data at least as extreme as these, assuming there was no difference between treatments.
  • Accurate (characterizing the data): 3% of repeated trials would observe data at least as extreme as these, assuming there was no difference between treatments.
    • Rephrased (still accurate): We'd expect 3% of repeated trials to have data at least as extreme as these, assuming there was no difference between treatments.
