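Observed values of 0 may be either sampling zeroes (the category can occur but happened not to appear in this sample) or structural zeroes (the category cannot occur at all). If they are structural zeroes, the chi-square test is not appropriate.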
A standard (and conservative) rule of thumb is to avoid using the chi-square test when any expected cell frequency is less than 1, or when more than 20% of the table cells have expected cell frequencies less than 5.
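The rule of thumb above can be checked mechanically once the expected frequencies are known. The following is a minimal sketch in Python, not part of Prophet; the function name `expected_counts_ok` and its parameter names are hypothetical and chosen only for illustration.

```python
def expected_counts_ok(expected, min_any=1.0, min_most=5.0, max_fraction_small=0.20):
    """Check the conservative rule of thumb on a list of expected cell frequencies.

    Hypothetical helper: returns True when no expected frequency is below
    `min_any` and at most `max_fraction_small` of the cells fall below `min_most`.
    """
    if not expected:
        raise ValueError("expected must contain at least one cell")
    # Condition 1: no cell may have an expected frequency less than 1.
    if any(e < min_any for e in expected):
        return False
    # Condition 2: no more than 20% of the cells may have an expected
    # frequency less than 5.
    n_small = sum(1 for e in expected if e < min_most)
    return n_small / len(expected) <= max_fraction_small


if __name__ == "__main__":
    print(expected_counts_ok([10, 10, 10, 10, 4]))  # one small cell out of five
    print(expected_counts_ok([10, 10, 4, 4]))       # half of the cells are small
```

The thresholds are exposed as keyword arguments so that a less conservative variant can be tried without changing the function body.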
Another rule of thumb (due to Roscoe and Byars) is that the average expected cell frequency should be at least 1 when the expected cell frequencies are close to equal, and 2 when they are not. (If the chosen significance level is 0.01 instead of 0.05, then double these numbers.)
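As a rough illustration of the Roscoe and Byars rule, the sketch below computes the minimum average expected frequency and compares it with the actual average. The text does not define "close to equal", so the caller must supply that judgment through the hypothetical `nearly_equal` flag; the function names are likewise assumptions made for this example.

```python
def min_average_expected(nearly_equal, alpha=0.05):
    """Minimum average expected cell frequency under the Roscoe and Byars rule.

    `nearly_equal` is the caller's own judgment of whether the expected
    frequencies are close to equal (the rule does not quantify this).
    Only the two significance levels named in the rule are accepted.
    """
    if alpha not in (0.05, 0.01):
        raise ValueError("the rule is stated only for alpha = 0.05 or 0.01")
    threshold = 1.0 if nearly_equal else 2.0
    # At the 0.01 level the required average is doubled.
    if alpha == 0.01:
        threshold *= 2
    return threshold


def average_expected_ok(expected, nearly_equal, alpha=0.05):
    """Return True if the mean expected frequency meets the threshold."""
    mean_expected = sum(expected) / len(expected)
    return mean_expected >= min_average_expected(nearly_equal, alpha)
```

For example, `average_expected_ok([1.5, 1.5, 1.5, 1.5], nearly_equal=True)` passes at the 0.05 level, while the same table would fail if the expected frequencies were judged unequal.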
Koehler and Larntz suggest that if the total number of observations is at least 10, the number of categories is at least 3, and the square of the total number of observations is at least 10 times the number of categories, then the chi-square approximation should be reasonable.
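The three Koehler and Larntz conditions translate directly into a single boolean check. This is a minimal sketch; the function name `koehler_larntz_ok` and its argument names are hypothetical.

```python
def koehler_larntz_ok(n_total, n_categories):
    """Return True if all three Koehler and Larntz conditions hold.

    n_total      -- total number of observations
    n_categories -- number of categories (cells)
    """
    return (
        n_total >= 10                          # at least 10 observations
        and n_categories >= 3                  # at least 3 categories
        and n_total ** 2 >= 10 * n_categories  # n^2 at least 10 times k
    )
```

With 10 observations in 3 categories the check passes (100 is at least 30), whereas 10 observations spread over 11 categories fail the third condition (100 is less than 110).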
The table of expected values will reveal whether any of these conditions is true, and Prophet will also generate an appropriate warning in the test results.
The standardized residuals are the signed square roots of the individual cell contributions to the chi-square statistic, that is, (observed - expected) / sqrt(expected) for each cell. Positive residuals indicate that the observed cell frequency is greater than the expected cell frequency, and negative residuals indicate that the observed cell frequency is less than the expected cell frequency.
If there are standardized residuals greater than 2 or less than -2, those cells are not being fitted very well by the hypothesized distribution. A large residual may also mean that a particular cell is an outlier.
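To make the residual check concrete, the sketch below assumes that each standardized residual is (observed - expected) / sqrt(expected), the signed square root of that cell's contribution to the chi-square statistic. That formula is an assumption about what is intended here, and the function names are hypothetical; this is not Prophet's own code.

```python
import math


def standardized_residuals(observed, expected):
    """Signed square roots of the per-cell chi-square contributions.

    Assumes every expected frequency is strictly positive.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [(o - e) / math.sqrt(e) for o, e in zip(observed, expected)]


def poorly_fitted_cells(observed, expected, cutoff=2.0):
    """Indices of cells whose residual is greater than 2 or less than -2."""
    residuals = standardized_residuals(observed, expected)
    return [i for i, r in enumerate(residuals) if abs(r) > cutoff]


if __name__ == "__main__":
    obs = [30, 10, 20]
    exp = [20, 20, 20]
    print(standardized_residuals(obs, exp))
    print(poorly_fitted_cells(obs, exp))
```

Under this assumption the squared residuals sum to the chi-square statistic, so the list shows which cells contribute most to a significant result.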
When the categories have a natural order, a pattern in the residuals (e.g., large negative ones at one end of the table, with large positive ones at the other end of the table) may indicate an implicit factor.
©1996 BBN Corporation All rights reserved.