Selected Statistical Topics of Regulatory Importance
Demissie Alemayehu, Birol Emir, Michael Gaffney in Interface between Regulation and Statistics in Drug Development, 2020
Statistical hypothesis testing in a regulatory setting involves the calculation under the null hypothesis of the probability that the observed treatment effect on a specific variable is due to chance alone. In a randomized study, if this probability (p-value) is low, the null hypothesis (usually that the treatment effect is 0) is rejected and a treatment effect is established. If only one primary variable is used to establish a treatment effect, then the requirement that p < α controls the probability of incorrectly concluding a treatment effect at α. The issue of multiplicity of endpoints refers to the clinical trial setting where more than one variable is used to establish a treatment effect. The chances of obtaining at least one p-value below α increase with the number of endpoints. For example, the probability under the null hypothesis that at least one p-value is less than 0.05 for three independent hypotheses is 1 − (0.95)³ ≈ 0.14. Thus, regulators cannot accept a level α test for each variable if the goal is to rule out incorrectly concluding a treatment effect (Type I error) at an overall probability of α.
Statistics for Genomics
Altuna Akalin in Computational Genomics with R, 2020
In this case, we will be fitting a plane rather than a line. However, the fitting process, which we will describe in later sections, will not change for our gene expression problem. We can introduce one more histone modification, H3K27me3. We will then have a linear model with two explanatory variables, and the fitted plane will look like the one in Figure 3.13. The gene expression values are shown as dots below and above the fitted plane. Linear regression and its extensions that make use of other distributions (generalized linear models) are central to statistical testing in computational genomics. We will see more of how regression is used in statistical hypothesis testing for computational genomics in Chapters 8 and 10.
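Fitting such a plane is ordinary least squares with two explanatory variables. A minimal pure-Python sketch under our own assumptions (the book works in R, and the predictor names x1 and x2 here are hypothetical stand-ins for the two histone-modification signals); it solves the normal equations (XᵀX)b = Xᵀy directly:

```python
# Least-squares fit of a plane y = b0 + b1*x1 + b2*x2 by solving the
# 3x3 normal equations with Gaussian elimination (no external libraries).

def fit_plane(x1, x2, y):
    n = len(y)
    X = [[1.0, x1[i], x2[i]] for i in range(n)]
    # Build A = X^T X (3x3) and rhs = X^T y (length 3)
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(3)]
         for r in range(3)]
    rhs = [sum(X[i][r] * y[i] for i in range(n)) for r in range(3)]
    # Forward elimination with partial pivoting
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        rhs[col], rhs[piv] = rhs[piv], rhs[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            rhs[r] -= f * rhs[col]
    # Back-substitution
    coef = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        coef[r] = (rhs[r] - sum(A[r][c] * coef[c]
                                for c in range(r + 1, 3))) / A[r][r]
    return coef  # [intercept, slope for x1, slope for x2]

# Toy data lying exactly on the plane y = 1 + 2*x1 + 3*x2,
# so the fit recovers the coefficients (no residual noise).
x1 = [0, 1, 0, 1, 2, 1]
x2 = [0, 0, 1, 1, 1, 2]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]
b0, b1, b2 = fit_plane(x1, x2, y)
```

With real expression data the dots would scatter above and below the fitted plane, as in the figure; here the residuals are zero by construction.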
Concluding Remarks
Song S. Qian, Mark R. DuFour, Ibrahim Alameddine in Bayesian Applications in Environmental and Ecological Studies with R and Stan, 2023
We use a one-sample t-test problem to illustrate the severe testing concept. In a t-test contrasting the null hypothesis H0: μ = μ0 against the alternative hypothesis Ha: μ > μ0, we decide which one is supported by the data by first assuming that H0 is true. Under H0, the t-test assumption is that the observed data follow a normal distribution with mean μ0 (i.e., y ~ N(μ0, σ²)), which implies that the test statistic t = (ȳ − μ0)/(s/√n) follows a t-distribution with n − 1 degrees of freedom, where ȳ is the sample average and s is the sample standard deviation. The t-distribution has a range of (−∞, ∞). Although any value of the test statistic is possible under the null hypothesis, the likelihood of observing a value of the t-statistic decreases under H0 and increases under Ha as the observed value increases. Statistical hypothesis testing is, then, a means to weigh the evidence for and against H0. The evidence is in the form of the p-value.
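The test statistic above can be computed with only the Python standard library. A minimal sketch with hypothetical data (converting t to a p-value requires the t-distribution's tail function, available in e.g. scipy.stats, so only the statistic is shown here):

```python
import math
import statistics

def one_sample_t(data, mu0):
    """t = (ybar - mu0) / (s / sqrt(n)) for a one-sample t-test."""
    n = len(data)
    ybar = statistics.mean(data)
    s = statistics.stdev(data)  # sample SD, n - 1 in the denominator
    return (ybar - mu0) / (s / math.sqrt(n))

# Hypothetical sample with ybar = 5.0; test H0: mu = 4.8
t = one_sample_t([5.1, 4.9, 5.0, 5.2, 4.8], mu0=4.8)
```

Larger observed values of t carry more evidence against H0 in the direction of Ha, which is the weighing of evidence the excerpt describes; the p-value is then the probability, under H0, of a t at least this large.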
Antioxidant and Anti-Diabetic Functions of a Polyphenol-Rich Sugarcane Extract
Published in Journal of the American College of Nutrition, 2019
Jin Ji, Xin Yang, Matthew Flavel, Zenaida P.-I. Shields, Barry Kitchen
A statistical analysis was performed for all the study results. First, a correlation analysis was carried out to determine whether there is a relationship between the paired variables (x, y) of the study results. Then, statistical hypothesis testing was performed. Because of the small sample sizes, the Kruskal–Wallis test, a nonparametric test, was used. The Kruskal–Wallis test is the nonparametric alternative to a one-way analysis of variance and does not require normal distributions. The null hypothesis of this test is that all the medians are equal; the alternative hypothesis is that not all medians are equal. If the Kruskal–Wallis test is significant, it indicates that at least two concentrations have significantly different medians. The statistical analysis was performed using the SAS® software, version 9 (SAS Institute, Inc.).
Identifying a motivational process surrounding adherence to exercise and diet among adults with type 2 diabetes
Published in The Physician and Sportsmedicine, 2020
Manon Laroche, Peggy Roussel, Francois Cury
Once the reliability of the measurements was verified, the descriptive statistics (mean, standard deviation, distribution) and correlations of the key variables were examined. Then, a path model for evaluating the combined contribution (direct and indirect effects) of each variable – SOC strategy, promotion focus, prevention focus – on exercise, general diet, fruit and vegetable consumption, high-fat food consumption, and spacing of carbohydrates was run. In this model, age, gender, number of comorbidities and educational level were included as control variables. This path analysis was conducted using Lisrel 9.1. The .05 level of significance was used for all statistical hypothesis testing. Beta represents the standardized regression coefficient. As for previous analyses, the recommendations of Meyers et al. [27] were applied to assess the adequacy of the model (CFI and GFI ≥ .90; RMSEA ≤ .08). Finally, using SPSS software 18.0, a bootstrapping method [29] with the resample size set at 5000 samples and bias-corrected 95% confidence intervals was employed to test the significance of the indirect effects. Point estimates of indirect effects are considered significant when zero is not contained in the 95% confidence intervals [29].
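The bootstrap test of an indirect effect can be sketched in plain Python. This is a simplified illustration under our own assumptions, not the authors' SPSS procedure: it uses a single mediator, estimates the indirect effect as the product of two simple regression slopes (a full mediation model would regress the outcome on both mediator and predictor), and reports a percentile interval rather than the bias-corrected interval cited in [29]:

```python
import random

def slope(x, y):
    """OLS slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def bootstrap_indirect(x, m, y, n_boot=5000, seed=0):
    """Percentile bootstrap CI for the indirect effect a*b, where
    a = slope of mediator on predictor, b = slope of outcome on mediator."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases
        xb = [x[i] for i in idx]
        mb = [m[i] for i in idx]
        yb = [y[i] for i in idx]
        estimates.append(slope(xb, mb) * slope(mb, yb))
    estimates.sort()
    return estimates[int(0.025 * n_boot)], estimates[int(0.975 * n_boot)]

# Hypothetical data with a clear mediation path x -> m -> y
x = list(range(20))
m = [2 * v + (0.1 if i % 2 else -0.1) for i, v in enumerate(x)]
y = [1.5 * v + (0.2 if i % 3 else -0.2) for i, v in enumerate(m)]
lo, hi = bootstrap_indirect(x, m, y)
# The indirect effect is judged significant when 0 lies outside [lo, hi].
```

The decision rule at the end is exactly the one the excerpt states: a point estimate of an indirect effect is considered significant when zero is not contained in the 95% confidence interval.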
The role of the p-value in the multitesting problem
Published in Journal of Applied Statistics, 2020
P. Martínez-Camblor, S. Pérez-Fernández, S. Díaz-Coto
Statistical procedures are frequently performed routinely, without first checking the required assumptions. The derived conclusions are sometimes misunderstood and are not treated with appropriate caution. These practices contribute to the reproducibility crisis and to the erosion of science's credibility [2]. The problem gets worse when the study involves thousands of statistical hypotheses, which cannot be handled individually or carefully. Such is the case in most studies in the so-called -omic sciences, in which, commonly, thousands or even hundreds of thousands of null hypotheses are tested simultaneously and, once a threshold is computed, a subset of them are declared to be effects. But, of course, statistical analysis cannot replace rational thinking, and the derived conclusions should be carefully considered [35]. Knowing the real implications of the selected threshold and the risks (limitations) of decisions based on statistical hypothesis testing is crucial to a good understanding of the observed results.
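The threshold computation described here, over thousands of simultaneous null hypotheses, is commonly done with the Benjamini–Hochberg false discovery rate procedure; the choice of that procedure is our illustration, not something the paper prescribes. A minimal sketch:

```python
# Benjamini-Hochberg step-up procedure: sort the p-values, find the
# largest rank k with p_(k) <= k * q / m, and reject the k smallest.

def benjamini_hochberg(pvals, q=0.05):
    """Return the set of indices rejected at false discovery rate q."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value clears the BH line
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= rank * q / m:
            k = rank
    return set(order[:k])

# Five hypothetical tests: three small p-values, two clear nulls
rejected = benjamini_hochberg([0.010, 0.020, 0.030, 0.50, 0.60])
# -> {0, 1, 2}; a Bonferroni cut-off of 0.05 / 5 = 0.01 would keep only one
```

The contrast with Bonferroni in the final comment illustrates the paper's point: what the computed threshold controls (false discovery rate versus familywise error) determines which subset is declared to be effects, so the threshold's real implications must be understood before interpreting that subset.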
Related Knowledge Centers
- P-Value
- Pearson's Chi-Squared Test
- Null Hypothesis
- Analysis of Variance
- Statistical Significance
- Type I & Type II Errors
- Fiducial Inference
- Inductive Reasoning
- Detection Theory
- Test Statistic