GRM Syntax - Example 3: Best-Subset Regression with Categorical Predictors
This example illustrates the best-subset regression facilities of GRM
and how they can be applied to experimental designs. The FORCE keyword
forces all five main effects into the model; GRM then searches for the
best subset containing up to 5 additional two-way interactions
(i.e., START = 6, STOP = 10). A unique feature of GRM is that when
categorical predictor variables or effects have more than a single degree
of freedom (as in this example), the stepwise and best-subset procedures
move the coded (sigma-restricted) variables that represent a categorical
predictor in or out of the model as a block, so that effects with multiple
degrees of freedom are always included in or excluded from the final model
in their entirety. You can run the example shown below using the example
data file Tomatoes.sta.
GRM;
{ Dependent variable (list): }
DEPENDENT = POUNDS;
{ Specification of grouping variables (factors); note that no codes
(values) are specified, so the program will by default take all
grouping codes found in the data file. }
GROUPS = 'SOIL CONDITION' POTSIZE VARIETY 'PRODUCTION METHOD' LOCATION;
{ Here the bar operator and the @ operator are used to construct the
factorial design to degree 2; the bar operator will evaluate to all main
effects and interactions up to the number specified after the @ operator. }
DESIGN = 'SOIL CONDITION' | POTSIZE | VARIETY | 'PRODUCTION METHOD' | LOCATION @2;
{ Best-subset regression is requested as the model building method. }
MBUILD = BESTSUBSET;
{ Here the first 5 effects, i.e., the main effects, are "forced" into the model. }
FORCE = 5;
{ Mallows' Cp index will be used to evaluate the subsets. }
BESTCRIT = MALLOWSCP;
{ The search for the subsets will begin with subsets of size 6,
up to subsets of size 10. }
START = 6;
STOP = 10;
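
To make the size of the search concrete, the following Python sketch (not GRM syntax) reproduces the bookkeeping implied by the statements above: it expands the five factors to a degree-2 factorial design, treats the five main effects as forced, and enumerates the candidate subsets of 6 to 10 effects. The function names are hypothetical, and both the effect ordering (main effects first) and the exhaustive enumeration are assumptions made for illustration; they are not a description of how GRM works internally. The mallows_cp helper uses the textbook formula Cp = SSE_p / MSE_full - n + 2p, where p counts estimated parameters (coded columns including the intercept, not effects); GRM's exact computation is not given here.

```python
from itertools import combinations

# Factor names as they appear in the GROUPS statement of the example.
factors = ["SOIL CONDITION", "POTSIZE", "VARIETY", "PRODUCTION METHOD", "LOCATION"]


def expand_factorial(factors, degree):
    """Return all effects up to the given degree, lowest order first.

    Mirrors the idea of A | B | ... @degree; the ordering is an assumption.
    """
    effects = []
    for order in range(1, degree + 1):
        effects.extend(combinations(factors, order))
    return effects


def candidate_subsets(forced, candidates, start, stop):
    """Yield every subset of size start..stop that contains all forced effects."""
    for size in range(start, stop + 1):
        for extra in combinations(candidates, size - len(forced)):
            yield list(forced) + list(extra)


def mallows_cp(sse_subset, mse_full, n_obs, n_params):
    """Textbook Mallows' Cp: SSE_p / MSE_full - n + 2p."""
    return sse_subset / mse_full - n_obs + 2 * n_params


effects = expand_factorial(factors, 2)   # 5 main effects + 10 two-way interactions
forced = effects[:5]                     # FORCE = 5
candidates = effects[5:]                 # the two-way interactions
subsets = list(candidate_subsets(forced, candidates, 6, 10))  # START = 6, STOP = 10

print(len(effects), len(candidates), len(subsets))  # prints: 15 10 637
```

Under these assumptions the search covers 637 candidate models, each containing all five main effects plus one to five two-way interactions. Each subset is a list of whole effects, which corresponds to the block-wise handling of multiple-degree-of-freedom effects described at the start of this example.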
For more examples, see GRM Syntax - Examples.