Note that the two values that make up each paired difference need not be independent; in fact, they are expected to be correlated, as with before and after measurements on the same subject. If you treat paired data as coming from two independent samples, for example by running a Mann-Whitney rank-sum test where a paired sign test is appropriate, then you may sacrifice power.
Because it requires only the sign of each paired difference, the sign test is quite resistant to outliers. However, it is often not the most powerful test available, and this could mean the difference between detecting a true difference and missing it. This is particularly true if the underlying distribution of the paired differences is symmetric, or if the differences in fact come from a normal distribution. In such cases, another nonparametric test (such as the Wilcoxon signed rank test), the paired t test, or a transformation of the data may result in a more powerful test.
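The resistance described above can be illustrated with a minimal sketch of an exact two-sided sign test. The function name sign_test and the sample values are hypothetical, chosen only for illustration; the sketch assumes that zero differences are dropped and that the P value comes from a Binomial(n, 1/2) null distribution.

```python
from math import comb

def sign_test(before, after):
    """Exact two-sided sign test on paired data (illustrative sketch).

    Returns (number of positive differences, number of negative
    differences, two-sided P value). Zero differences are dropped.
    """
    diffs = [a - b for b, a in zip(before, after)]
    n_pos = sum(d > 0 for d in diffs)
    n_neg = sum(d < 0 for d in diffs)
    n = n_pos + n_neg
    if n == 0:
        return n_pos, n_neg, 1.0
    k = min(n_pos, n_neg)
    # P(X <= k) for X ~ Binomial(n, 1/2), doubled for a two-sided test
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return n_pos, n_neg, min(1.0, 2 * tail)

# Hypothetical before/after measurements on 8 subjects
before = [10.1, 9.8, 11.2, 10.5, 9.9, 10.7, 10.3, 11.0]
after = [10.9, 10.4, 11.9, 11.3, 10.2, 11.5, 10.1, 11.8]

# Same data, but the first "after" value is replaced by an extreme value
after_outlier = [109.0] + after[1:]

print(sign_test(before, after))
print(sign_test(before, after_outlier))
```

Both calls return the same counts and the same P value, because the extreme value changes only the size of one difference and not its sign.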
Often, the effect of an assumption violation on the sign test result depends on the extent of the violation. Some small violations may have little practical effect on the analysis, while other violations may render the sign test result uselessly incorrect or uninterpretable. In particular, small sample sizes can increase vulnerability to assumption violations.
Because the statistic for the sign test is resistant, it will not be substantially affected by the presence of outliers. However, you should remain alert to the possibility that outliers may represent recording errors in the data.
The boxplot and normal probability plot (normal Q-Q plot) may suggest the presence of outliers in the data.
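As a rough numerical counterpart to the boxplot, the sketch below flags paired differences that fall outside the conventional boxplot fences. The 1.5 x IQR multiplier, the "inclusive" quartile method, the function name boxplot_outliers, and the data are all assumptions made for illustration; statistical packages differ in how they compute quartiles.

```python
from statistics import quantiles

def boxplot_outliers(diffs, k=1.5):
    """Return the values lying beyond k * IQR from the quartiles."""
    q1, _, q3 = quantiles(diffs, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [d for d in diffs if d < lower or d > upper]

# Hypothetical paired differences; the last value is suspiciously large
diffs = [2, 3, 3, 4, 4, 5, 5, 6, 40]
print(boxplot_outliers(diffs))
```

A flagged value is only a candidate for checking against the original records; the rule by itself does not establish that the value is a recording error.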
If you find outliers in your data that are not due to correctable errors, you may wish to consult a statistician as to how to proceed.
Even if none of the test assumptions are violated, a sign test with small sample sizes may not have sufficient power to detect a significant difference between the median of the paired differences and 0, even if the true median is in fact different from 0. Power decreases as the significance level is decreased (i.e., as the test is made more stringent), and increases as the sample size increases. With very small samples, even paired differences whose median is far from 0 may not produce a significant sign test statistic. If a statistical significance test with small sample sizes produces a surprisingly non-significant P value, then a lack of power may be the reason. The best time to avoid such problems is in the design stage of an experiment, when appropriate minimum sample sizes can be determined, perhaps in consultation with a statistician, before data collection begins.
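The dependence of power on sample size and significance level can be sketched with an exact binomial calculation. This is an illustration under stated assumptions, not a substitute for a proper design calculation: the function name sign_test_power is hypothetical, there are assumed to be no zero differences, the test is two-sided with a symmetric rejection region, and p_pos is the assumed true probability that a paired difference is positive.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def sign_test_power(n, p_pos, alpha=0.05):
    """Exact power of a two-sided sign test with n nonzero differences."""
    # Find the largest c such that rejecting when the smaller sign count
    # is <= c keeps the two-sided level at or below alpha under H0 (p = 1/2).
    c = -1
    tail = 0.0
    for k in range(n + 1):
        tail += binom_pmf(k, n, 0.5)
        if 2 * tail <= alpha:
            c = k
        else:
            break
    if c < 0:
        return 0.0  # no outcome can reach significance at this n and alpha
    # Reject when the number of positive differences is <= c or >= n - c
    reject = list(range(0, c + 1)) + list(range(n - c, n + 1))
    return sum(binom_pmf(k, n, p_pos) for k in reject)

for n in (5, 6, 10, 30):
    print(n, round(sign_test_power(n, p_pos=0.8), 4))
```

Under these assumptions, a two-sided test at the 0.05 level with only 5 nonzero differences can never reject, since the smallest attainable P value is 2/32 = 0.0625, so its power is 0 whatever the true effect.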
©1996 BBN Corporation All rights reserved.