ABSTRACT
Rationale
The p value has long been used as the primary criterion for statistical significance; however, its dichotomous interpretation has been increasingly criticized for oversimplifying uncertainty and distorting scientific inference, particularly in health and sports sciences.
Aims and Objectives
This study aimed to critically analyze the limitations of using the p value as the central criterion of statistical significance and to discuss more robust methodological alternatives for statistical inference.
Methods
A critical review was conducted using the PubMed/MEDLINE database covering the period from 2015 to 2025, complemented by citation tracking. Reviews, editorials, guidelines, and methodological essays that directly addressed the interpretation of p values and complementary metrics were included. A total of 46 articles were selected and evaluated using a self-developed critical appraisal checklist.
Results
Among the 46 included articles, 38 (82.6%) explicitly criticized the isolated or dichotomous use of the p value, whereas the remaining 8 (17.4%) adopted a more moderate position, supporting its use only when combined with confidence intervals, effect sizes, or Bayesian approaches. No article defended the p value as a standalone criterion for scientific decision-making. The most frequent recommendations were to abandon the term “statistically significant,” to prioritize the estimation of effect magnitude and precision, and to promote the use of compatibility intervals, effect sizes, and Bayesian methods.
Conclusion
Overcoming the binary logic of p < 0.05 is essential to enhance transparency, reduce bias, and better align statistical practice with the scientific and clinical relevance of research findings, particularly in the health and sports sciences.
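Illustrative sketch
The recommendation to report effect magnitude and precision rather than a binary verdict can be illustrated with a minimal Python sketch. It is not drawn from any of the reviewed articles: the function name `effect_summary`, the two data sets, and the use of a normal approximation for the interval are assumptions made for illustration only. With small samples, a t-based quantile would be more appropriate.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def effect_summary(group_a, group_b, level=0.95):
    """Summarize a two-group comparison by magnitude and precision.

    Hypothetical helper: returns the mean difference, Cohen's d, and an
    approximate compatibility interval for the mean difference.
    """
    n_a, n_b = len(group_a), len(group_b)
    diff = mean(group_a) - mean(group_b)
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2

    # Cohen's d: mean difference scaled by the pooled standard deviation.
    pooled_sd = sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    cohens_d = diff / pooled_sd

    # Standard error of the difference, without assuming equal variances.
    se = sqrt(var_a / n_a + var_b / n_b)

    # Assumption: normal approximation for the interval quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    interval = (diff - z * se, diff + z * se)

    return {"diff": diff, "cohens_d": cohens_d, "interval": interval}


# Hypothetical measurements, invented for this example only.
training = [12.1, 13.4, 11.8, 14.0, 12.9, 13.7, 12.5, 13.1]
control = [11.2, 12.0, 11.5, 12.8, 11.9, 12.3, 11.1, 12.6]

summary = effect_summary(training, control)
low, high = summary["interval"]
print(f"Mean difference: {summary['diff']:.2f}")
print(f"Cohen's d: {summary['cohens_d']:.2f}")
print(f"Approximate 95% compatibility interval: {low:.2f} to {high:.2f}")
```

Reporting these three quantities together shows how large the effect is and which values are reasonably compatible with the data, which a statement of p < 0.05 alone does not convey.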