Environmental Monitoring and Assessment – Normal Response Models
Published in Song S. Qian, Mark R. DuFour, Ibrahim Alameddine, Bayesian Applications in Environmental and Ecological Studies with R and Stan, 2023
Song S. Qian, Mark R. DuFour, Ibrahim Alameddine
Although the classical ANOVA is focused on testing the null hypothesis of no difference, the ANOVA test itself is almost always only the starting point of an analysis. When the ANOVA null hypothesis is rejected, we are naturally interested in the nature of the treatment effects, which is a multiple comparisons problem. In the classical ANOVA framework, multiple comparisons are conducted by adjusting the significance level of each comparison to achieve a family-wise type I error probability of 0.05. When the null hypothesis is not rejected, we cannot readily conclude that there is no treatment effect, because failing to reject the null does not establish that it is true. In many cases, we have reason to believe that the null hypothesis of no difference is unlikely to be true (which is why we were able to write a convincing proposal to win the research funding for the study). Consequently, the objective of the experiment or data collection is largely to quantify the differences among multiple treatment levels. This is why we find the Bayesian hierarchical modeling (BHM) approach (e.g., equations (4.1) and (4.2)) the most logical choice. BHM is an estimation method, and its shrinkage effect automatically reduces the magnitude of any given difference. Rather than inflating the confidence interval to address the multiple comparisons problem, a shrinkage estimator such as BHM adjusts the estimated differences based on the relative balance between the within- and among-group variances.
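To make the shrinkage idea concrete, here is a minimal Python sketch of the partial-pooling estimator that underlies a normal hierarchical model for group means. The simulated data, group structure, and method-of-moments variance estimates are illustrative assumptions, not the book's equations (4.1) and (4.2).

```python
# Minimal sketch of the shrinkage (partial pooling) estimator behind a
# normal hierarchical model for group means; all values are simulated.
import numpy as np

rng = np.random.default_rng(42)

# Simulated experiment: 6 treatment groups, 8 observations each
true_means = np.array([0.0, 0.2, 0.2, 0.5, 0.5, 1.0])
n_per_group = 8
y = true_means[:, None] + rng.normal(0.0, 1.0, size=(6, n_per_group))

ybar = y.mean(axis=1)                      # group sample means
s2_within = y.var(axis=1, ddof=1).mean()   # pooled within-group variance
se2 = s2_within / n_per_group              # sampling variance of each ybar

mu_hat = ybar.mean()                       # grand mean (hyper-mean estimate)
# Method-of-moments estimate of the among-group variance tau^2
tau2_hat = max(ybar.var(ddof=1) - se2, 0.0)

# Shrinkage factor: balance of within- vs. among-group variance
B = se2 / (se2 + tau2_hat)
theta = (1 - B) * ybar + B * mu_hat        # partially pooled estimates

print("sample means    :", np.round(ybar, 3))
print("shrunk estimates:", np.round(theta, 3))
print(f"shrinkage factor B = {B:.2f}")
```

Because every group mean is pulled toward the grand mean by the same factor B, each pairwise difference shrinks by the factor 1 − B; this is how the hierarchical model tempers multiple comparisons without widening the intervals.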
Neurohype
Published in L. Syd M Johnson, Karen S. Rommelfanger, The Routledge Handbook of Neuroethics, 2017
Scott O. Lilienfeld, Elizabeth Aslinger, Julia Marshall, Sally Satel
Examining brain activity at multiple time points across thousands of voxels generates the multiple-comparisons problem: when numerous statistical tests are conducted, the proportion of false-positive (spurious) results increases substantially. A hilarious yet powerful example of this problem in action involves brain imaging and, surprisingly, a dead salmon. The recipients of the 2012 Ig Nobel Prize (Bennett et al., 2009), not to be confused with the Nobel Prize, placed a dead salmon in a scanner, asked the fish to "look at" different emotional stimuli, and then performed the typical set of statistical analyses on the fish. Without correction for multiple comparisons, the analyses yielded a nonsensical finding: the dead fish exhibited neural activity in response to the stimuli. These absurd results emerged because the researchers, aiming to demonstrate the hazards of not correcting for multiple comparisons, had conducted more than a thousand statistical tests (as is common in some neuroimaging research), some of which were bound to come out statistically significant merely by chance. The multiple-comparisons problem, which is by no means unique to brain-imaging research, can be remedied with analytic tools that we need not discuss in detail here.
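The arithmetic behind the salmon result is easy to reproduce. The following Python simulation is an illustration, not the actual salmon analysis or its data: it runs a t-test on pure noise at thousands of hypothetical "voxels" and counts how many clear the usual 0.05 threshold by chance alone.

```python
# Illustrative simulation (not the actual salmon analysis): run many tests
# on pure-noise data and count how many reach p < 0.05 by chance alone.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

n_voxels, n_timepoints = 5000, 30
# Pure noise "activity" for each voxel; there is no real signal anywhere
noise = rng.normal(size=(n_voxels, n_timepoints))

# One-sample t-test per voxel against a true mean of zero
t_stats, p_values = stats.ttest_1samp(noise, popmean=0.0, axis=1)

alpha = 0.05
print("uncorrected 'significant' voxels:", int((p_values < alpha).sum()))
print("expected by chance:", int(alpha * n_voxels))
print("Bonferroni-corrected significant:", int((p_values < alpha / n_voxels).sum()))
```

With 5,000 null tests, roughly 250 voxels come out "significant" uncorrected, while the Bonferroni-corrected count is essentially always zero.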
Factorial Designs with Time-to-Event Endpoints
Published in John Crowley, Antje Hoering, Handbook of Statistics in Clinical Oncology, 2012
The multiple comparisons problem is one of the issues that must be considered in factorial designs. If each treatment is tested at level α, as is typical for factorial designs (Gail et al. 1998), then the experiment-wide level, defined as the probability that at least one comparison is significant under the null hypothesis, is greater than α. There is disagreement over whether each primary question should be tested at level α or whether the experiment-wide level across all primary questions should be held at α; but clearly, if the probability of at least one false-positive result is high, a single positive result from the experiment will be difficult to interpret and may well be dismissed by many as inconclusive. A common approach to limiting the probability of false-positive results is to begin with a global test and perform pairwise tests only if the global test is significant. A Bonferroni approach, in which each of T primary tests is performed at level α/T, is another option. For comprehensive discussions of testing strategies in multiple testing settings, see Dmitrienko et al. (2010).
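As a quick illustration of the experiment-wide level, the snippet below computes 1 − (1 − α)^T, the family-wise error probability when the T tests are independent (tests in a factorial design may be correlated, so this figure is only indicative), along with the Bonferroni per-test level α/T.

```python
# Experiment-wide type I error for T independent tests, each at level alpha:
# P(at least one false positive) = 1 - (1 - alpha)^T
alpha = 0.05
for T in (2, 3, 4):
    familywise = 1 - (1 - alpha) ** T
    print(f"T={T}: experiment-wide level = {familywise:.3f}, "
          f"Bonferroni per-test level = {alpha / T:.4f}")
```

Even with only three primary tests, the experiment-wide level rises to about 0.14, nearly three times the nominal 0.05.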
The Volumetric Changes of the Pineal Gland with Age: An Atlas-based Structural Analysis
Published in Experimental Aging Research, 2022
Minoo Sisakhti, Lida Shafaghi, Seyed Amir Hossein Batouli
The statistical analyses of the estimated volumes were performed in MATLAB. The mean and standard deviation of the volumes were computed for each age group. The correlation of the volumes with age was also estimated, both within each age group and in the total sample. Next, the similarity between the pattern of PG volume change across the 295 participants and that of the other 48 volumetric measures was estimated using Pearson's correlation coefficient; this analysis was performed to identify the brain areas whose aging profile most closely resembled that of the PG. Finally, to further characterize the aging profile of the pineal gland, both a linear and a nonlinear (second-order polynomial) curve were fitted to the PG volumes, and the fit with the lower RMSE was selected as the best fit. For drawing the boxplots, we used open-source MATLAB code (Campbell, 2020). All hypotheses were tested with the multiple comparisons problem taken into account, and the alpha significance levels were corrected using the FWER (Bonferroni) method.
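For readers who want to reproduce the model-selection step, here is a sketch in Python (the study itself used MATLAB) of fitting linear and second-order polynomial curves and comparing their RMSE. The age/volume data below are simulated stand-ins, not the study's measurements.

```python
# Sketch (in Python; the study used MATLAB) of choosing between a linear
# and a second-order polynomial aging profile by comparing RMSE.
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical age/volume data standing in for the PG measurements
age = rng.uniform(20, 80, size=295)
volume = 120 - 0.015 * (age - 50) ** 2 + rng.normal(0, 5, size=295)

def rmse(degree):
    coeffs = np.polyfit(age, volume, degree)     # least-squares polynomial fit
    residuals = volume - np.polyval(coeffs, age)
    return np.sqrt(np.mean(residuals ** 2))

rmse_linear, rmse_quadratic = rmse(1), rmse(2)
best = "second-order polynomial" if rmse_quadratic < rmse_linear else "linear"
print(f"linear RMSE = {rmse_linear:.2f}, quadratic RMSE = {rmse_quadratic:.2f}")
print("best fit:", best)
```

One caveat worth noting: because the quadratic model nests the linear one, its in-sample RMSE can never be higher, so criteria that penalize complexity (or held-out error) are often preferred for this comparison.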
Identification of critical chemical modifications by size exclusion chromatography of stressed antibody-target complexes with competitive binding
Published in mAbs, 2021
Rachel Liuqing Shi, Gang Xiao, Thomas M. Dillon, Arnold McAuley, Margaret S. Ricci, Pavel V. Bondarenko
To minimize the chance of false discovery, the number of tested modifications, or a scientifically relevant fraction of it, should be included in the evaluation of statistical significance. The false discovery rate (FDR), rather than the false positive rate (FPR), was introduced for the list of modifications detected in peptide mapping. This approach is similar to that of genome-wide studies, in which a few genes involved in a pathway or a disease need to be selected with high confidence from a list of thousands of genes.51,52 The FDR is the expected fraction of false positives in a list of modifications. Statisticians have proposed several procedures for adjusting p-values to correct for the multiple comparisons problem. The oldest is the Bonferroni correction, in which the corrected significance threshold (p*) accounts for the total number of tested modifications.52 p* is determined by dividing the single-test significance level by N, the number of modifications tested. As there were 420 modifications in the peptide mapping results, the corrected p-value threshold should be set to 1.19 × 10−4 (0.05 divided by 420; top horizontal dashed line in Figure 6), or equivalently, the -log10 p-value should be larger than 3.9. The list can be further reduced by removing modifications in remote domains, which lowers N and relaxes the corrected threshold or FDR cutoff. For an antibody or Fc-fusion protein, modifications in the Fc region could be removed from the list, since the CDRs are far from the Fc domain. To summarize, the number of tested modifications in the list, or a scientifically relevant fraction of it, should be used as the divisor when setting the p-value threshold.
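The threshold arithmetic is sketched below, together with the Benjamini-Hochberg step-up procedure, one standard FDR-controlling method of the kind the passage alludes to (the passage does not specify which FDR procedure was used, and the demo p-values here are invented).

```python
# Threshold arithmetic from the passage, plus the Benjamini-Hochberg
# step-up procedure as one common FDR-controlling alternative to Bonferroni.
import numpy as np

alpha, N = 0.05, 420                       # 420 tested modifications
bonferroni_threshold = alpha / N
print(f"Bonferroni threshold: {bonferroni_threshold:.2e}")             # 1.19e-04
print(f"-log10 threshold:     {-np.log10(bonferroni_threshold):.1f}")  # 3.9

def benjamini_hochberg(p_values, q=0.05):
    """Return a boolean mask of discoveries at FDR level q (BH step-up)."""
    p = np.asarray(p_values)
    order = np.argsort(p)
    ranked = p[order]
    m = len(p)
    below = ranked <= q * (np.arange(1, m + 1) / m)
    mask = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = np.nonzero(below)[0].max()  # largest k with p_(k) <= qk/m
        mask[order[: cutoff + 1]] = True
    return mask

# Hypothetical p-values for a handful of modifications
p_demo = np.array([1e-6, 2e-4, 0.003, 0.04, 0.2])
print("BH discoveries:", benjamini_hochberg(p_demo))
```

Note how BH admits the borderline p = 0.04 here because it controls the expected fraction of false positives among discoveries, whereas Bonferroni controls the chance of any false positive and is correspondingly stricter.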
In silico comparison of whole pelvis intensity-modulated photon versus proton therapy for the postoperative management of prostate cancer
Published in Acta Oncologica, 2023
Emile Gogineni, Ian K. Cruickshank, Hao Chen, Aditya Halthore, Heng Li, Curtiland Deville
Paired 2-sided Wilcoxon signed-rank tests were used to compare plans. Given the large number of endpoints and the multiple comparisons problem, Bonferroni corrections were applied to minimize the likelihood of false-positive results. All statistical analyses were conducted using MATLAB (MathWorks, Natick, MA) and Excel (Microsoft, Redmond, WA).
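A minimal sketch of this testing pipeline in Python follows (the study itself used MATLAB and Excel); the endpoint names and paired dose metrics below are hypothetical stand-ins, not the study's data.

```python
# Sketch (in Python; the study used MATLAB/Excel): paired two-sided Wilcoxon
# signed-rank tests over several dosimetric endpoints, Bonferroni-corrected.
import numpy as np
from scipy.stats import wilcoxon

rng = np.random.default_rng(7)

n_patients = 30
endpoints = ["bladder_V40", "rectum_V40", "bowel_V30"]  # hypothetical names
# Hypothetical paired dose metrics for photon vs. proton plans
photon = {e: rng.normal(50, 10, n_patients) for e in endpoints}
proton = {e: rng.normal(45, 10, n_patients) for e in endpoints}

alpha = 0.05
corrected_alpha = alpha / len(endpoints)   # Bonferroni per-endpoint level

for e in endpoints:
    stat, p = wilcoxon(photon[e], proton[e], alternative="two-sided")
    flag = "significant" if p < corrected_alpha else "n.s."
    print(f"{e}: p = {p:.4f} ({flag} at corrected alpha = {corrected_alpha:.4f})")
```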