Common Statistical Approach
Published in Atsushi Kawaguchi, Multivariate Analysis for Neuroimaging Data, 2021
In standard hypothesis testing, the test statistic is constructed under the assumption that the null hypothesis is true, and the distribution that the statistic then follows is determined theoretically. This distribution is called the null distribution. A threshold or p-value for deciding whether to reject the null hypothesis is then computed from the test statistic using the null distribution.
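As a minimal sketch of this recipe (the z-test for a normal mean with known variance, together with all numerical values below, is assumed for illustration and is not taken from the excerpt):

```python
import numpy as np
from scipy import stats

# Under H0: data ~ Normal(mu0, sigma), the standardized sample mean
# follows N(0, 1) -- this is the theoretically determined null distribution.
rng = np.random.default_rng(0)
mu0, sigma, n = 0.0, 1.0, 50
sample = rng.normal(loc=0.4, scale=sigma, size=n)  # illustrative data

# Test statistic constructed assuming H0 is correct.
z = (sample.mean() - mu0) / (sigma / np.sqrt(n))

# Two-sided p-value computed from the null distribution N(0, 1).
p_value = 2 * stats.norm.sf(abs(z))

# Equivalent threshold formulation: reject H0 if |z| exceeds the cutoff.
threshold = stats.norm.ppf(1 - 0.05 / 2)
print(z, p_value, abs(z) > threshold)
```

The threshold and the p-value are two views of the same decision rule: |z| exceeds the cutoff exactly when the p-value falls below the chosen level.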
Research Design
Published in Shyama Prasad Mukherjee, A Guide to Research Methodology, 2019
While detailed analyses of data have to be taken up only after the intended data have been collected and peculiarities in the data, such as incompleteness, missingness, dubious values and false responses, have been noted, research design should give broad hints about the types of analysis as well as the manner in which hypotheses will be verified and results of verification exercises recorded. In the context of testing statistical hypotheses, it may be noted that the formulation of ‘alternative’ hypotheses – one-sided or two-sided – is not mandated in all research investigations. We may simply remain interested in the ‘null’ hypotheses that we have framed and wish to know whether the ‘sample’ of evidence that we have is strong enough against the null hypothesis concerned. Thus either we are able to reject the null hypothesis or we fail to do so.
In fact, the p-value advocated by Fisher is a (negative) measure of the strength of evidence against the null hypothesis contained in the sample. It is obtained as the probability P0 that a sample from the underlying population(s) will yield a value of the test statistic exceeding the value obtained from the data, on the assumption that the null hypothesis is true. We require the null distribution (the distribution under the null hypothesis) of the test statistic, which is a function of the sample observations and takes into account the null hypothesis to be verified. The test statistic T(x) may be looked upon as a measure of discrepancy between the null hypothesis and the sample observation x, with a larger value implying a greater discrepancy. The probability of T exceeding the observed value, say t0 = T(x0), is called the level of significance, the tail probability or simply the p-value. Usually, the observed value t0 is taken as significant or highly significant according to whether P0 < 0.05 or P0 < 0.01.
Fisher suggested that values of P be interpreted as a sort of “rational and well-defined measure of reluctance to the acceptance of hypotheses”.
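The tail probability P0 = P(T ≥ t0 | H0) described above can be approximated by simulation when the null distribution is not available in closed form. A small Monte Carlo sketch, in which the discrepancy measure T(x) = |mean(x)|, the sample size, and the observed value t0 are all assumed for illustration:

```python
import numpy as np

# Monte Carlo approximation of Fisher's P0 = P(T >= t0 | H0).
# Assumed setup: T(x) = |mean(x)| for samples of size n drawn from N(0, 1)
# under H0; t0 is a hypothetical observed value T(x0).
rng = np.random.default_rng(1)
n, n_sim = 30, 100_000
t0 = 0.45

# Simulate the null distribution of T by repeatedly sampling under H0.
null_T = np.abs(rng.normal(0.0, 1.0, size=(n_sim, n)).mean(axis=1))

# Proportion of null draws at least as discrepant as the observed value.
p0 = (null_T >= t0).mean()
print(p0)  # small P0 -> strong evidence against H0
```

With these assumed numbers, t0 lies about 2.5 null standard deviations out, so P0 falls below the conventional 0.05 mark and t0 would be reported as significant.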
Biostatistics and Bioaerosols
Published in Harriet A. Burge, Bioaerosols, 2020
Lynn Eudey, H. Jenny Su, Harriet A. Burge
Sections D and E give examples of testing the location, or center, of a population, and also examples of comparing the centers of two populations. Within each of the examples the basic steps of hypothesis testing are followed:
Step 0. Determine the research question. The research question needs to be stated in terms of a population parameter in order to use hypothesis testing. Often this step is the most difficult.
Step 1. State H0 and HA. Recall that H0 claims a specific value for the parameter being tested. Generally, HA is the claim that the researcher would like to prove and H0 is the currently accepted claim or the conservative claim.
Step 2. Determine the significance level. This is sometimes a field-specific level (e.g., in medical statistics α = 0.05 is commonly used).
Step 3. Choose an appropriate test statistic for the parameter being tested. This choice will depend on whether a parametric or nonparametric test is used, on what assumptions can be made about the population distribution, and on the sample size.
Step 4. Determine, by the choice of the test statistic and the choice of α, how far the sample statistic can stray from the claimed value of the parameter (under the null hypothesis) before the decision is made to reject the null hypothesis and believe that the sample statistic is more consistent with the claim of the alternative hypothesis. This is done by determining the distribution of the test statistic under the assumption that the null hypothesis is true; this distribution will be referred to as the null distribution.
Step 5. Collect the random sample and calculate the value of the test statistic using the sample data. Make a decision according to the guideline set up in Step 4. This is the statistical decision. The decision is either “Do not reject the null hypothesis” (the sample data were consistent with the claim of H0) or “Reject the null hypothesis” (the sample data give significant evidence that the claim of HA is more likely). Sometimes this decision is based on the “observed significance level” (or “p-value”), which is a measure of how likely it would be to observe the sample test statistic (or one more extreme) under the null distribution. The null hypothesis is rejected if the observed significance level is less than α.
Step 6. Rephrase the statistical decision in terms of the research question. What conclusion (if any) can be drawn about the research question based on the hypothesis test?
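The steps above can be sketched with a one-sample t test of location; the data, the H0 value, and α below are invented for illustration:

```python
import numpy as np
from scipy import stats

# Step 1: H0: mu = 5.0 (hypothetical claimed value), HA: mu != 5.0.
mu0 = 5.0
# Step 2: significance level.
alpha = 0.05
# Steps 3-4: parametric choice -> one-sample t statistic, whose null
# distribution is the t distribution with n - 1 degrees of freedom.
# Step 5: collect the sample and compute the statistic (invented data).
sample = np.array([5.8, 6.1, 4.9, 5.6, 6.3, 5.2, 5.9, 6.0])
t_stat, p_value = stats.ttest_1samp(sample, popmean=mu0)

# Statistical decision via the observed significance level.
decision = "Reject H0" if p_value < alpha else "Do not reject H0"
# Step 6: rephrase the decision in terms of the research question.
print(t_stat, p_value, decision)
```

For this invented sample the mean (5.725) sits well above the claimed value relative to its standard error, so the observed significance level falls below α and H0 is rejected.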
Some Multivariate Tests of Independence Based on Ranks of Nearest Neighbors
Published in Technometrics, 2018
For each i = 1, 2, …, n, first we arrange the other zj's (j ≠ i) according to their x-distances from zi. For k = 1, 2, …, n − 1, let zi(k) denote the kth nearest x-neighbor of zi. We define Ri, 1 as the rank of the y-distance corresponding to the nearest x-neighbor of zi in the set of all n − 1 y-distances from zi. For k = 2, …, n − 2, Ri, k is defined as the rank of the y-distance corresponding to the kth nearest x-neighbor of zi in the set containing the n − k remaining y-distances. We repeat this procedure for all values of i = 1, 2, …, n. We also define reverse ranks Rri, k = n − k + 1 − Ri, k for i = 1, 2, …, n and k = 1, 2, …, n − 2. For any suitable monotone function ϕ defined on (0, 1], two statistics T1 and T2, constructed by applying ϕ to the normalized ranks Ri, k and the normalized reverse ranks Rri, k respectively, measure positive and negative associations between x-distances and y-distances. In this article, we use ϕ(t) = log (t), and finally T = max {T1, T2} is used as the test statistic. Naturally, H0 is rejected for large values of T. Note that for any fixed i, the Ri, k's are independent and, under H0, Ri, k follows the discrete uniform distribution with mass points {1, 2, …, n − k} for k = 1, 2, …, n − 2. But, for i ≠ j, since the joint distribution of Ri, k and Rj, k′ may depend on the underlying distribution of Z, our method is not distribution-free. Therefore, we calculate the cut-off using the permutation principle. We consider a random permutation π of {1, 2, …, n} and use the permuted pairs (xi, yπ(i)) as new sample observations. The test statistic is computed using these new observations. This procedure is repeated several times, and the resulting empirical distribution of the test statistic is used to approximate the null distribution (see Puri and Sen 1971). The (1 − α)th quantile (0 < α < 1) of this empirical distribution is used as the cut-off for a level α test.
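The permutation principle in the last few sentences can be sketched as follows. For brevity this uses |Pearson correlation| as a stand-in statistic rather than the article's nearest-neighbor rank statistic T; the permutation machinery (permute y against x, recompute the statistic, take the (1 − α)th quantile as cut-off) is the same whichever statistic is plugged in:

```python
import numpy as np

rng = np.random.default_rng(2)
n, n_perm, alpha = 100, 999, 0.05

# Illustrative dependent data (assumed, not from the article).
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

def stat(x, y):
    # Stand-in for the test statistic T; large values indicate dependence.
    return abs(np.corrcoef(x, y)[0, 1])

t_obs = stat(x, y)

# Permuting y against x breaks any dependence, so the permuted statistics
# are draws from an approximation of the null distribution of the statistic.
null_T = np.array([stat(x, rng.permutation(y)) for _ in range(n_perm)])

# The (1 - alpha)th quantile of the empirical null distribution is the
# cut-off for a level-alpha test; reject H0 for large observed values.
cutoff = np.quantile(null_T, 1 - alpha)
print(t_obs, cutoff, t_obs > cutoff)
```

Because the cut-off is recomputed from the data at hand, the resulting test holds its level approximately regardless of the underlying distribution of Z, which is exactly why the article resorts to permutation rather than a fixed theoretical cut-off.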