Statistical Significance

Book
In: StatPearls [Internet]. Treasure Island (FL): StatPearls Publishing; 2024 Jan.

Excerpt

In research, statistical significance is assessed by comparing the P value, the probability of obtaining results at least as extreme as those observed if the null hypothesis were true, with a prespecified acceptable level of uncertainty. We can better understand statistical significance if we break apart a study design.

When creating a study, the researcher has to start with a hypothesis; that is, they must have some idea of what they think the outcome may be. For example, a study is researching a new medication to lower blood pressure. The researcher hypothesizes that the new medication lowers systolic blood pressure by at least 10 mm Hg compared to not taking the new medication. The hypothesis can be stated: "Taking the new medication will lower systolic blood pressure by at least 10 mm Hg compared to not taking the medication." In science, researchers can never prove a statement, as there are infinite alternative explanations for why the outcome may have occurred. They can only try to disprove a specific hypothesis. The researcher must therefore formulate a statement that can be disproven and whose disproof supports the conclusion that the new medication lowers systolic blood pressure. The hypothesis to be disproven is the null hypothesis, which is typically the inverse of the research hypothesis. Thus, the null hypothesis for our researcher would be, "Taking the new medication will not lower systolic blood pressure by at least 10 mm Hg compared to not taking the new medication." The researcher now has the null hypothesis for the research and must specify the significance level, or level of acceptable uncertainty.

Even when disproving a hypothesis, the researcher cannot be 100% certain of the outcome. The researcher must then settle for some level of confidence that their finding is correct. The significance level is denoted by the Greek letter alpha and specified as the probability of being incorrect that the researcher is willing to accept. More precisely, alpha is the probability of rejecting the null hypothesis when it is actually true. Generally, a researcher wants to be correct about their outcome 95% of the time, so the researcher is willing to be incorrect 5% of the time. Probabilities are expressed as decimals between 0 (0%, the event never occurs) and 1.0 (100%, the event always occurs). The alpha is the decimal expression of how often the researcher is willing to be incorrect. For the current example, the alpha is 0.05, or a 5% chance of being incorrect about the study's outcome.

Now, the researcher can perform the research. In this example, a prospective randomized controlled study is conducted in which the researcher gives some individuals the new medication and others a placebo. The researcher then evaluates the blood pressure of both groups after a specified time and performs a statistical analysis of the results to obtain a P value (probability value). Several different tests can be performed depending on the type of variable being studied and the number of subjects. The specific test is outside the scope of this review, but its output would be a P value. Using the correct statistical analysis tool when calculating the P value is imperative. If the researchers use the wrong test, the P value will not be accurate, and this result can mislead the researcher. A P value is the probability, under a specified statistical model (typically one in which the null hypothesis is true), that a statistical summary of the data (eg, the sample mean difference between 2 compared groups) would be equal to or more extreme than its observed value.
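The definition of a P value can be illustrated with a permutation test, one of many possible ways to obtain a P value. The sketch below is only an illustration under stated assumptions: the blood pressure readings are invented, the function name permutation_p_value is hypothetical, and the null hypothesis tested is the simpler "no difference between groups" rather than the 10 mm Hg threshold in this example. It shuffles the group labels many times and counts how often the shuffled mean difference is at least as extreme as the observed one.

```python
import random


def permutation_p_value(treated, control, n_permutations=10_000, seed=0):
    """One-sided permutation test for 'treated values are lower than control'.

    Returns the estimated probability, assuming no difference between groups,
    of a mean difference at least as large as the one observed.
    """
    rng = random.Random(seed)
    n_treated = len(treated)
    n_control = len(control)
    # Positive when the treated group has the lower mean.
    observed = sum(control) / n_control - sum(treated) / n_treated

    pooled = list(treated) + list(control)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # reassign group labels at random
        shuffled_treated = pooled[:n_treated]
        shuffled_control = pooled[n_treated:]
        diff = sum(shuffled_control) / n_control - sum(shuffled_treated) / n_treated
        if diff >= observed:
            at_least_as_extreme += 1

    # Adding 1 to both counts keeps the estimate from being exactly 0.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical systolic blood pressure readings (mm Hg), invented for illustration.
treated = [128, 131, 125, 135, 122, 130, 127, 133]
control = [142, 139, 147, 136, 150, 141, 138, 145]

p_value = permutation_p_value(treated, control)
print(f"Estimated P value: {p_value:.4f}")
```

A permutation test is used here only because it needs no distributional assumptions and no external library; a real analysis would choose the test to match the study design and variable type.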

In this example, the researcher hypothetically found blood pressure tended to decrease after taking the new medication, with an average decrease of 15 mm Hg in the group taking the new medication. The researcher then used the help of their statistician to perform the correct analysis and arrived at a P value of 0.02 for a decrease in blood pressure in those taking the new medication versus those not taking the new medication. This researcher now has the 3 required pieces of information to look at statistical significance: the null hypothesis, the significance level, and the P value.

The researcher can finally assess the statistical significance of the new medication. A study result is statistically significant if the P value of the data analysis is less than the prespecified alpha (significance level). In this example, the P value is 0.02, which is less than the prespecified alpha of 0.05. The researcher therefore rejects the null hypothesis, which is considered disproven at the predetermined level of confidence, and accepts the hypothesis, concluding that the finding that the new medication lowers blood pressure is statistically significant.

What does this mean? The P value is not the probability of the null hypothesis itself. It is the probability that, if the null hypothesis were true and the study were repeated an infinite number of times, the findings would be as extreme as, or more extreme than, the one calculated in this test. Therefore, the P value of 0.02 signifies that, if the null hypothesis were true, only 2% of the infinite studies would find a result at least as extreme as the one in this study. Because such a result would be unlikely under the null hypothesis, the researcher treats the null hypothesis as false. However, as the P value implies, there is a chance that this conclusion is wrong and the medication truly does not have the hypothesized effect on blood pressure. Because the researcher prespecified an acceptable level of uncertainty with an alpha of 0.05, and the P value of 0.02 is less than that alpha, the researcher rejects the null hypothesis. By rejecting the null hypothesis, the researcher accepts the alternative hypothesis. The researcher rejects the idea that the new medication does not lower systolic blood pressure by at least 10 mm Hg and accepts that it does.
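The "repeated an infinite number of times" reading of the P value can be approximated with a simulation. The following is a minimal sketch, not the analysis behind the P value of 0.02 in this example: the function name simulate_null_studies, the group size of 30, the standard deviation of 15 mm Hg, and the observed difference of 8 mm Hg are all assumptions chosen for illustration, and the simulated null hypothesis is that the medication has no effect at all. Each simulated study draws both groups from the same distribution, so any difference between them is due to chance alone.

```python
import random


def simulate_null_studies(observed_diff, n_per_group=30, sd=15.0,
                          n_studies=10_000, seed=1):
    """Fraction of simulated no-effect studies whose mean difference
    (control minus treated) is at least as large as observed_diff."""
    rng = random.Random(seed)
    at_least_as_extreme = 0
    for _ in range(n_studies):
        # Under the null hypothesis, both groups come from the same distribution.
        treated = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        control = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diff = sum(control) / n_per_group - sum(treated) / n_per_group
        if diff >= observed_diff:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_studies


# Hypothetical observed difference (mm Hg), invented for illustration.
fraction = simulate_null_studies(observed_diff=8.0)
print(f"Share of no-effect studies at least this extreme: {fraction:.4f}")
```

The returned fraction plays the role of a one-sided P value: it describes how often chance alone would produce a result this extreme, not how likely the null hypothesis is to be true.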

If the researcher had prespecified an alpha of 0.01, implying they wanted to be 99% sure the new medication lowered the blood pressure by at least 10 mm Hg, the P value of 0.02 would be greater than the prespecified alpha of 0.01. The researcher would conclude the study did not reach statistical significance, as the P value is equal to or greater than the prespecified alpha. The researcher would then not be able to reject the null hypothesis.
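The decision rule used throughout this example (reject the null hypothesis only when the P value is less than the prespecified alpha) can be written as a short sketch. The function name is_statistically_significant is hypothetical; the P value of 0.02 and the alphas of 0.05 and 0.01 are the values from the example.

```python
def is_statistically_significant(p_value: float, alpha: float) -> bool:
    """Return True when the P value is strictly less than the prespecified alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    return p_value < alpha


p_value = 0.02  # P value from the example study
for alpha in (0.05, 0.01):
    if is_statistically_significant(p_value, alpha):
        decision = "reject the null hypothesis"
    else:
        decision = "cannot reject the null hypothesis"
    print(f"alpha = {alpha}: {decision}")
```

The comparison is strict, so a P value exactly equal to alpha does not reach statistical significance. The alpha must be chosen before the data are analyzed.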

Publication types

  • Study Guide