PROPHET StatGuide: Examining t test results to detect assumption violations
All the following results are provided as part of a PROPHET two-sample (unpaired)
t test analysis.
Results for sample values:
- Normality tests:
- If the assumptions for the t test hold, the values from each sample should
come from a normal distribution. Departures from normality can suggest the
presence of outliers in the data, or of a nonnormal distribution in one or
more of the samples. The normality test will give an indication of whether
the populations from which the samples were drawn appear to be normally
distributed, but will not indicate the cause(s) of the nonnormality. The
smaller the sample size, the less likely the normality test is to detect
nonnormality.
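The kind of check a normality test performs can be sketched in a few lines of Python. This guide does not say which normality test PROPHET uses, so the sketch swaps in the Jarque-Bera statistic, chosen only because it needs nothing beyond the standard library. The function name and both samples are hypothetical, and the p-value relies on a large-sample chi-square approximation that is rough for small samples.

```python
import math
from statistics import NormalDist


def jarque_bera(values):
    """Return (statistic, approximate p-value) for the Jarque-Bera test."""
    n = len(values)
    mean = sum(values) / n
    # Central moments with divisor n, the convention this statistic uses.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    statistic = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Under normality the statistic is asymptotically chi-square with
    # 2 degrees of freedom, whose upper tail probability is exp(-x / 2).
    p_value = math.exp(-statistic / 2.0)
    return statistic, p_value


# Hypothetical samples: evenly spaced normal quantiles, and a small
# sample containing one extreme value.
near_normal = [NormalDist().inv_cdf((i - 0.5) / 30) for i in range(1, 31)]
with_outlier = [1, 1, 1, 2, 2, 2, 3, 3, 4, 50]

for label, sample in (("near normal", near_normal),
                      ("with outlier", with_outlier)):
    stat, p = jarque_bera(sample)
    print(f"{label}: JB = {stat:.2f}, approximate p = {p:.4f}")
```

A small p-value says only that the sample looks nonnormal. As noted above, it does not say whether an outlier, skewness, or heavy tails is responsible.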
- Histograms:
- The histogram for each sample is drawn with a reference normal distribution
curve that has the same mean and variance as the sample. This provides a
reference for detecting gross nonnormality when the sample sizes are large.
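The comparison between a histogram and its reference curve can be approximated numerically. The sketch below counts the values in each bin and sets them beside the count a normal distribution with the sample's mean and variance would predict. The function name, the number of bins, the data, and the use of the n - 1 sample variance are all assumptions made for illustration, not a description of how PROPHET draws its plots.

```python
from statistics import NormalDist, mean, stdev


def histogram_with_reference(values, bins=5):
    """Return (left edge, right edge, observed, expected) for each bin."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    # Reference normal distribution matching the sample mean and variance.
    reference = NormalDist(mean(values), stdev(values))
    rows = []
    for b in range(bins):
        left = lo + b * width
        right = lo + (b + 1) * width
        if b == bins - 1:
            # The last bin is closed on the right so the maximum is counted.
            observed = sum(left <= v <= hi for v in values)
        else:
            observed = sum(left <= v < right for v in values)
        expected = len(values) * (reference.cdf(right) - reference.cdf(left))
        rows.append((left, right, observed, expected))
    return rows


# Hypothetical sample values.
sample = [4.1, 4.8, 5.0, 5.2, 5.3, 5.5, 5.6, 5.9, 6.1, 6.4, 6.8, 7.5]

for left, right, observed, expected in histogram_with_reference(sample):
    bar = "#" * observed
    print(f"{left:5.2f} to {right:5.2f}  {bar:<6} "
          f"observed {observed}, expected {expected:.1f}")
```

With only a dozen values the observed and expected counts can differ by chance alone, which is why the comparison is most useful for large samples.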
- Boxplots:
- Suspected outliers appear in a boxplot as individual points o or x outside
the box. If these appear on both sides of the box, they also suggest the
possibility of a heavy-tailed distribution. If they appear on only one side,
they also suggest the possibility of a skewed distribution. Skewness is also
suggested if the mean (+) does not lie on or near the central line of the
boxplot, or if the central line of the boxplot does not evenly divide the box.
Examples of these plots will help illustrate the various situations.
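The reading of a boxplot described above can be sketched in code. This guide does not state the rule PROPHET uses to mark points outside the box, so the sketch assumes the common convention of flagging values more than 1.5 interquartile ranges beyond the quartiles. The function names and the data are hypothetical.

```python
from statistics import mean, quantiles


def boxplot_summary(values, k=1.5):
    """Quartiles, mean, and suspected outliers under the assumed fence rule."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return {
        "q1": q1, "median": median, "q3": q3, "mean": mean(values),
        "low_outliers": [v for v in values if v < low_fence],
        "high_outliers": [v for v in values if v > high_fence],
    }


def describe(summary):
    """Translate the pattern of suspected outliers into a plain reading."""
    low, high = summary["low_outliers"], summary["high_outliers"]
    if low and high:
        return "points on both sides: possibly heavy-tailed"
    if low or high:
        return "points on one side: possibly skewed"
    return "no suspected outliers"


# Hypothetical samples.
one_sided = [1, 1, 1, 2, 2, 2, 3, 3, 4, 50]
two_sided = [-40, 1, 1, 2, 2, 2, 3, 3, 4, 50]

for sample in (one_sided, two_sided):
    summary = boxplot_summary(sample)
    print(describe(summary),
          "| mean", summary["mean"], "median", summary["median"])
```

In the one-sided sample the mean also sits well above the median, the second sign of skewness mentioned above.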
- Normal probability plot:
- For values sampled from a normal distribution, the normal probability plot
(normal Q-Q plot) has the points all lying on or near the straight line drawn
through the middle half of the points. Scattered points lying away from the
line are suspected outliers.
Examples of these plots will help illustrate the various situations.
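The construction behind a normal probability plot can be sketched as follows. Two details are assumptions: the plotting positions (i - 0.5)/n, and the reading of "the straight line drawn through the middle half of the points" as the line through the first- and third-quartile points. The function names and data are hypothetical.

```python
from statistics import NormalDist, quantiles


def qq_points(values):
    """Pair each sorted value with a standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    theoretical = [NormalDist().inv_cdf((i - 0.5) / n)
                   for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))


def quartile_line(values):
    """Slope and intercept of the line through the quartile points."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    z1 = NormalDist().inv_cdf(0.25)
    z3 = NormalDist().inv_cdf(0.75)
    slope = (q3 - q1) / (z3 - z1)
    intercept = q1 - slope * z1
    return slope, intercept


def deviations(values):
    """Return (value, vertical distance from the reference line) pairs."""
    slope, intercept = quartile_line(values)
    return [(y, y - (intercept + slope * z)) for z, y in qq_points(values)]


# Hypothetical sample containing one extreme value.
sample = [1, 1, 1, 2, 2, 2, 3, 3, 4, 50]

worst_value, worst_gap = max(deviations(sample), key=lambda d: abs(d[1]))
print(f"largest departure from the line: value {worst_value}, "
      f"gap {worst_gap:.1f}")
```

Basing the line on the quartiles keeps it from being pulled toward the extreme points it is meant to expose.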
- Normality test for residuals:
- If the assumptions for the t test hold, all the residuals (from both
samples) should come from the same normal distribution with mean 0.
Departures from normality can suggest the presence of outliers in the data,
or of a nonnormal distribution in one or more of the populations from which
the samples were drawn. The normality test will give an indication of whether
the populations from which the samples were drawn appear to be normally
distributed, but will not indicate the cause(s) of the nonnormality. The
smaller the sample size, the less likely the normality test is to detect
nonnormality.
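Since the fitted value for each observation is the mean of its own sample, the residuals can be computed directly. The sketch below uses a hypothetical function name and made-up data. It subtracts each sample's mean from its values and pools the results, which is the set of residuals the diagnostics in this part of the guide examine.

```python
from statistics import mean


def pooled_residuals(sample_a, sample_b):
    """Subtract each sample's own mean and pool the results."""
    mean_a, mean_b = mean(sample_a), mean(sample_b)
    return ([v - mean_a for v in sample_a]
            + [v - mean_b for v in sample_b])


# Hypothetical samples.
group_a = [5.1, 4.8, 5.6, 5.0, 5.3]
group_b = [6.9, 7.4, 7.0, 7.8, 7.2]

residuals = pooled_residuals(group_a, group_b)
print("number of residuals:", len(residuals))
print("mean of residuals (zero up to rounding):", sum(residuals) / len(residuals))
```

The pooled residuals always average to zero by construction, so only their shape and spread carry diagnostic information.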
- Histogram for residuals:
- The histogram for residuals is drawn with a reference normal distribution
curve that has the same mean and variance as the residuals. This provides a
reference for detecting gross nonnormality when the sample sizes are large.
- Boxplot for residuals:
- Suspected outliers appear in a boxplot as individual points o or x outside
the box. If these appear on both sides of the box, they also suggest the
possibility of a heavy-tailed distribution. If they appear on only one side,
they also suggest the possibility of a skewed distribution. Skewness is also
suggested if the mean (+) does not lie on or near the central line of the
boxplot, or if the central line of the boxplot does not evenly divide the box.
Examples of these plots will help illustrate the various situations.
- Normal probability plot for residuals:
- For data sampled from a normal distribution, the normal probability plot
(normal Q-Q plot) has the points all lying on or near the straight line drawn
through the middle half of the points. Scattered points lying away from the
line are suspected outliers.
Examples of these plots will help illustrate the various situations.
- Residuals plotted against fitted values:
- If the fitted model under the assumption of two populations with equal
variance is correct, the plot of residuals against fitted values should
suggest a horizontal band across the graph. Because there are only two unique
fitted values, the mean of each of the two samples, the graph of residuals
against fitted values will consist of two vertical "stacks" of residuals; the
stacks should be about the same length and at about the same level.
Outliers may appear as anomalous points in the graph (although an outlier may
not turn up in the residuals plot, because it can pull the mean of its sample
toward itself so that its fitted value lies near it).
A fan pattern like the profile of a megaphone, with a noticeable flare either
to the right or to the left as shown in the picture (one of the "stacks" of
residuals is much longer than the other), suggests that the variance in the
values increases in the direction the fan pattern widens (often to the right),
and this in turn suggests that a transformation may be needed.
Other systematic patterns in the residuals (like a linear trend) suggest
either that there is another factor that should be considered in analyzing
the data, or that a transformation is needed.
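The comparison of the two "stacks" can be sketched numerically. The sketch below builds one stack of residuals per fitted value and takes the ratio of the longer stack's length (its range) to the shorter one's. It then repeats the calculation after a log transformation. The log is only one possible transformation, chosen here as an assumption because the made-up data have spread proportional to their level. The function names are hypothetical.

```python
import math
from statistics import mean


def residual_stacks(sample_a, sample_b):
    """Return one (fitted value, residuals) pair per sample."""
    stacks = []
    for sample in (sample_a, sample_b):
        fitted = mean(sample)  # the fitted value is the sample mean
        stacks.append((fitted, [v - fitted for v in sample]))
    return stacks


def stack_length_ratio(sample_a, sample_b):
    """Ratio of the longer stack's range to the shorter stack's range."""
    lengths = [max(res) - min(res)
               for _, res in residual_stacks(sample_a, sample_b)]
    return max(lengths) / min(lengths)


# Hypothetical samples whose spread grows with their level.
low = [1, 2, 3, 4, 5]
high = [10, 20, 30, 40, 50]

raw_ratio = stack_length_ratio(low, high)
log_ratio = stack_length_ratio([math.log(v) for v in low],
                               [math.log(v) for v in high])
print(f"stack length ratio, raw values: {raw_ratio:.2f}")
print(f"stack length ratio, log values: {log_ratio:.2f}")
```

A ratio far from 1 corresponds to the fan pattern described above, and a ratio near 1 after transforming suggests the transformation has evened out the spread. The range is used only because it matches the visual length of a stack; it is sensitive to single extreme points.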
Last modified: March 14, 1997
©1996 BBN Corporation All rights reserved.