Example 8: Power of Nonstandard Significance Tests in the Analysis of Variance
Traditionally, major hypothesis tests in the analysis of variance have
been performed to assess whether main effects, interactions, or simple
main effects exist at all. The traditional null hypothesis F-test
is equivalent to a test that the RMSSE is equal to zero.
Many writers (see, for example, several of the chapters in Harlow, Mulaik,
& Steiger, 1997) have expressed dissatisfaction with tests of the
"nil hypothesis," i.e., tests that the effects are absolutely
zero. One solution to this problem is to test hypotheses of "small
effect" rather than hypotheses of zero effect. As examples of this
strategy, consider the test of close fit and test of not-close fit proposed
in structural modeling by MacCallum, Browne, and Sugawara (1996), or the
tests of minimal effect discussed by Murphy and Myors (1998) in their
recent, very accessible monograph on power analysis.
Testing hypotheses about close fit or minimal effect complements the interval
estimation approach advocated by Steiger and Fouladi (1997), who suggested
computing and examining a confidence interval on standardized effect
size. The confidence interval approach allows one to test any hypothesis
about effect size: simply examine whether the confidence interval excludes
a given value. In addition, the width of the confidence interval conveys
information about the precision with which the data determine the size
of the effects. Hence, noncentrality-based confidence interval estimates
of effect size offer all the benefits of nonstandard hypothesis tests,
and more. For an extended discussion of this point, with numerous examples,
see Steiger & Fouladi (1997).
Power and sample size analysis in conjunction with hypotheses of minimal
effect offers some important advantages when used in combination with
the noncentrality interval estimation approach, because it assures, in
advance, that precision of estimation will be sufficient to make the confidence
interval usefully narrow.
In this exercise, we sketch our approach to tests of minimal effect
in the analysis of variance, compare it to the approach advocated by Murphy
and Myors (1998), and demonstrate how the calculations can be duplicated
easily with the Noncentral F Probability Calculator.
Relation between Measures of Effect in ANOVA. There are several closely related measures
of effect size that are employed in the context of fixed-effect factorial
ANOVA designs. For notational convenience, define $\Sigma_{\text{effect}}$
as the sum of squared effects in an ANOVA. For example, in a 1-Way ANOVA,

$$\Sigma_{\alpha} = \sum_{j=1}^{J} \alpha_j^2 \tag{7}$$
Define $\sigma_{\text{effect}}^2$, the effect variance, as

$$\sigma_{\text{effect}}^2 = \frac{\Sigma_{\text{effect}}}{\text{cells}_{\text{effect}}} \tag{8}$$

where $\text{cells}_{\text{effect}}$ is the number of cells involved in the effect. For a main effect, it is
the number of levels in that factor. For an interaction, it is the product
of the number of levels in the factors involved in the interaction.
The "signal to noise ratio," $f^2$, is defined as

$$f_{\text{effect}}^2 = \frac{\sigma_{\text{effect}}^2}{\sigma_e^2} \tag{9}$$
where $\sigma_e^2$ is the error variance. The "proportion of variance accounted for
by the effect, with other main effects and interactions partialled out,"
$\omega^2$, is given by

$$\omega_{\text{effect(partialled)}}^2 = \frac{\sigma_{\text{effect}}^2}{\sigma_{\text{effect}}^2 + \sigma_e^2} \tag{10}$$
(For simplicity of notation, we will refer to this as $\omega^2$
in what follows.) Consequently, $f^2$ and $\omega^2$
share the very simple relationships

$$f^2 = \frac{\omega^2}{1 - \omega^2} \tag{11}$$

and

$$\omega^2 = \frac{f^2}{1 + f^2} \tag{12}$$
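The RMSSE is defined as

$$\text{RMSSE}_{\text{effect}} = \sqrt{\frac{\Sigma_{\text{effect}}}{\text{df}_{\text{effect}}\,\sigma_e^2}} \tag{13}$$

Hence

$$\text{RMSSE}^2 = \frac{\text{cells}_{\text{effect}}}{\text{df}_{\text{effect}}}\, f^2 \tag{14}$$

or

$$\text{df}_{\text{effect}}\,\text{RMSSE}^2 = \text{cells}_{\text{effect}}\, f^2 \tag{15}$$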
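The step from Equation 13 to Equation 14 can be spelled out. The sketch below uses only the definitions already given: square the RMSSE, substitute the sum of squared effects from Equation 8, and recognize the ratio of variances as the signal to noise ratio of Equation 9.

```latex
\begin{aligned}
\text{RMSSE}_{\text{effect}}^2
  &= \frac{\Sigma_{\text{effect}}}{\text{df}_{\text{effect}}\,\sigma_e^2}
     && \text{(Equation 13, squared)} \\
  &= \frac{\text{cells}_{\text{effect}}\,\sigma_{\text{effect}}^2}{\text{df}_{\text{effect}}\,\sigma_e^2}
     && \text{(Equation 8)} \\
  &= \frac{\text{cells}_{\text{effect}}}{\text{df}_{\text{effect}}}\, f^2
     && \text{(Equation 9)}
\end{aligned}
```

For instance, for the main effect in a 1-Way ANOVA with four groups, the number of cells is 4 and the effect has 3 degrees of freedom, so the squared RMSSE is 4/3 times the signal to noise ratio.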
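However, it is also the case that

$$\begin{aligned}
\delta_{\text{effect}} &= n_{\text{effect}}\,\text{df}_{\text{effect}}\,\text{RMSSE}_{\text{effect}}^2 \\
&= n_{\text{effect}}\,\text{cells}_{\text{effect}}\, f_{\text{effect}}^2 \\
&= n_{\text{effect}}\,\text{cells}_{\text{effect}}\,\frac{\omega^2}{1 - \omega^2}
\end{aligned} \tag{16}$$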
Since RMSSE has a monotonic functional relationship with the noncentrality
parameter of the distribution of the F-statistic,
so must $f^2$ and $\omega^2$,
because each of these quantities can be transformed monotonically into
any of the others.
The implication of these results is that hypothesis tests on quantities
like $\omega^2$ can be re-expressed as hypotheses about the noncentrality parameter $\delta$, and vice versa.
Suppose, for example, we want to test the hypothesis that $\omega^2$,
the proportion of variance accounted for by the treatment effect, is less
than or equal to .01, in a 1-Way ANOVA with four groups and a sample
size of N = 25 in each group. We now address three questions concerning
such a situation.
1. How would one perform such a test as a hypothesis test of the noncentrality parameter $\delta$?
2. What would be the power of such a hypothesis test, if the actual value of $\omega^2$ is 0.10?
3. Suppose we observe a value of 5.65 for the F-statistic in this analysis. What is the 90% confidence interval for $\omega^2$?
To answer the first question, recall that, in Example 6, we learned how to test
a hypothesis about the noncentrality parameter $\delta$. (The reader may
wish to review this example briefly.) Equation 16 expresses the relation
between $\delta$ and $\omega^2$.
In a 1-Way ANOVA with four groups and N = 25 in each group, the degrees of freedom
are 3 and 96, with $n_{\text{effect}} = 25$ and $\text{cells}_{\text{effect}} = 4$,
and so

$$\delta_{\text{null}} = 4(25)\left(\frac{.01}{1 - .01}\right) = 1.0101 \tag{17}$$

Hence, the hypothesis that $\omega^2 \le 0.01$
is equivalent to the hypothesis that $\delta \le 1.0101$.
Example 6 gives a detailed discussion of how to test this hypothesis.
To answer the second question, we convert an $\omega^2$
value of 0.10 into an equivalent value of $\delta$.
We have

$$\delta_{\text{alternative}} = 4(25)\left(\frac{.10}{1 - .10}\right) = \frac{10}{.9} = 11.1111 \tag{18}$$
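The two conversions above are easy to check by hand, but a small script makes them repeatable for other designs. The following is a minimal sketch of Equations 11, 12, and 16 in Python; the function and parameter names are illustrative assumptions rather than anything provided by the program, and `n_effect` is taken to be the per-group sample size, as in this example.

```python
def f2_from_omega2(omega2):
    """Equation 11: signal to noise ratio f^2 from omega^2."""
    return omega2 / (1.0 - omega2)


def omega2_from_f2(f2):
    """Equation 12: omega^2 from the signal to noise ratio f^2."""
    return f2 / (1.0 + f2)


def delta_from_omega2(omega2, n_effect, cells_effect):
    """Equation 16: noncentrality parameter delta from omega^2."""
    return n_effect * cells_effect * f2_from_omega2(omega2)


# Design from the text: 1-Way ANOVA, four groups, N = 25 per group.
delta_null = delta_from_omega2(0.01, n_effect=25, cells_effect=4)
delta_alt = delta_from_omega2(0.10, n_effect=25, cells_effect=4)

print(round(delta_null, 4), round(delta_alt, 4))  # 1.0101 11.1111
```

Because each conversion is monotonic, an inequality on one quantity carries over unchanged to the others, which is what allows the hypothesis about the proportion of variance to be restated as a hypothesis about the noncentrality parameter.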
To compute the power, we use the Noncentral F Probability Calculator. Select
Power
Analysis from the Statistics menu to display the Power Analysis and Interval Estimation
Startup Panel. From the Startup Panel, select Probability
Distributions and Noncentral
F Distribution.
Now, click the OK button to
display the Noncentral F Probability Calculator.
Next, compute the critical value of F
for testing the hypothesis that $\omega^2 \le 0.01$.
Enter 3 in the Numerator
df box, 96 in the Denom. df box, and 1.0101
in the Delta box. Next, select
the (1 - Cumulative p) check
box and make sure the 1 - Cum. p value
is .05. Choose F
as the quantity to compute by clicking the F
option button under Compute.
Finally, click the Compute button.
The Observed F is the critical
value of F needed to test the
hypothesis that $\omega^2 \le 0.01$.
The critical value of F is 3.5352.
To compute the power of the test against the alternative that $\omega^2 = 0.10$,
we compute the power of the F-test
when $\delta = 11.1111$. Simply
leave the Observed F value in
place, and change Delta to 11.1111. Select 1 - p
as the quantity to Compute,
and then click the Compute button.
This will compute the probability of obtaining an F-statistic
greater than the Observed F when
$\delta = 11.1111$, which
is the power of the test when $\omega^2 = 0.10$.
We see that the power is only .649.
Hence, it seems that at this sample size, the design lacks sufficient
precision to discriminate between minimal and medium size effects.
To answer the final question, we first utilize the method of Example
7 to construct a 90% confidence interval for $\delta$,
then use the results of Equation 16 to convert this confidence interval
into a confidence interval for $\omega^2$.
Enter 5.65 in the Observed F box and 0
in the Delta box, and then clear
the (1 - Cumulative p) check
box. Next, click the Compute
button to compute the cumulative probability of the observed F.
The cumulative probability is above .95, so we know that the lower limit
of the confidence interval will be above zero. To compute the lower limit,
we solve for a value of the noncentrality parameter that will give the
observed F a cumulative probability
of .95. Enter .95 as Cum.
p, and select Delta under
Compute. Clicking the Compute
button yields 4.157486. To calculate
the upper confidence limit, set Cum.
p to .05, and repeat the
process, obtaining an upper limit of 31.54681.
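The three calculator operations used so far (solving for F, for 1 - p, and for Delta) can also be sketched outside the program. The Python below uses only the standard library: it evaluates the noncentral F cumulative probability as a Poisson-weighted sum of central F probabilities, and inverts it by bisection. This is a swapped-in textbook technique, not necessarily the algorithm the calculator uses, and the helper names (`betainc`, `ncf_cdf`, `bisect`) are hypothetical. The printed results can be compared with the values quoted in the text.

```python
from math import exp, lgamma, log


def _betacf(a, b, x, max_iter=300, eps=1e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step, then odd step, of the continued fraction.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def ncf_cdf(f, dfn, dfd, delta, terms=400):
    """P(F <= f) for a noncentral F with noncentrality parameter delta."""
    if f <= 0.0:
        return 0.0
    y = dfn * f / (dfd + dfn * f)
    half = delta / 2.0
    weight = exp(-half)  # Poisson(delta / 2) probability of j = 0
    total = 0.0
    for j in range(terms):
        if j > half and weight < 1e-18:  # past the mode, terms negligible
            break
        total += weight * betainc(dfn / 2.0 + j, dfd / 2.0, y)
        weight *= half / (j + 1)
    return total


def bisect(func, lo, hi, iters=50):
    """Root of a monotonic function that changes sign on [lo, hi]."""
    f_lo = func(lo)
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if (func(mid) > 0) == (f_lo > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


DFN, DFD = 3, 96
delta_null = 100 * 0.01 / 0.99   # Equation 17
delta_alt = 100 * 0.10 / 0.90    # Equation 18

# Critical F: upper-tail probability .05 when delta = delta_null.
f_crit = bisect(lambda f: ncf_cdf(f, DFN, DFD, delta_null) - 0.95, 0.0, 50.0)
# Power: probability of exceeding the critical F when delta = delta_alt.
power = 1.0 - ncf_cdf(f_crit, DFN, DFD, delta_alt)

# 90% limits for delta: the observed F sits at the .95 and .05 quantiles.
f_obs = 5.65
delta_lower = bisect(lambda d: ncf_cdf(f_obs, DFN, DFD, d) - 0.95, 0.0, 100.0)
delta_upper = bisect(lambda d: ncf_cdf(f_obs, DFN, DFD, d) - 0.05, 0.0, 100.0)

print(f"critical F = {f_crit:.4f}, power = {power:.3f}")
print(f"90% limits for delta: {delta_lower:.4f} to {delta_upper:.4f}")
```

Bisection is slower than the root finders found in numerical libraries, but it needs no derivative and cannot leave the bracketing interval, which keeps the sketch short and predictable.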
These confidence limits may be converted readily into confidence limits
for $\omega^2$
by combining the results in Equations 16 and 12. Specifically, Equation
16 relates $f^2$ to $\delta$, and
Equation 12 expresses $\omega^2$ as a function of $f^2$.
So first, we obtain a confidence interval for $f^2$
from the endpoints of the confidence interval for $\delta$.
Using Equation 16, we have

$$f_{\text{effect}}^2 = \frac{\delta_{\text{effect}}}{n_{\text{effect}}\,\text{cells}_{\text{effect}}} \tag{19}$$

In this case, $n_{\text{effect}} = 25$ and $\text{cells}_{\text{effect}} = 4$, so to convert
the confidence interval for $\delta$
to one for $f^2$,
we simply divide the endpoints by 100, yielding a 90% confidence interval
from .04157486 to .3154681.
Next, we convert these endpoints using Equation 12. For the lower endpoint,
we have

$$\frac{.04157486}{1 + .04157486} = 0.0399$$

For the upper endpoint, we have

$$\frac{.3154681}{1 + .3154681} = 0.2398$$
Note how, although the observed F
has a probability level of .0013, and would be termed "highly significant"
by some, the percentage of variance accounted for has not been determined
with a high degree of precision. The 90% confidence interval for $\omega^2$
ranges from about 4% to about 24%.
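The endpoint conversion can be sketched in a few lines of Python. The function name `omega2_limits` and its parameters are illustrative assumptions; the limits for the noncentrality parameter are simply the values reported above. Mapping the endpoints one at a time is legitimate because both transformations are monotonic increasing.

```python
def omega2_limits(delta_limits, n_effect, cells_effect):
    """Convert confidence limits for delta into limits for omega^2."""
    limits = []
    for delta in delta_limits:
        f2 = delta / (n_effect * cells_effect)  # Equation 19
        limits.append(f2 / (1.0 + f2))          # Equation 12
    return tuple(limits)


# Limits for delta taken from the calculator results quoted in the text.
lower, upper = omega2_limits((4.157486, 31.54681), n_effect=25, cells_effect=4)

print(f"{lower:.4f} {upper:.4f}")  # 0.0399 0.2398
```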
See also Power Analysis - Index.