- The linear function **Y[i] = b0 + b1*X1[i] + b2*X2[i] + ... + bk*Xk[i] + e[i]** is the correct model, where **Y[i]** is the *ith* observed value of Y, **Xj[i]** is the *ith* observed value of the *jth* X variable, and **e[i]** is the error term. Equivalently, the expected value of Y for a given set of X values is **E(Y) = b0 + b1*X1 + b2*X2 + ... + bk*Xk**. The **intercept** is **b0**, the expected value of Y when the value of each X variable is 0.
- The Xj variable (**predictor variable**) values are fixed (i.e., none of the Xj is a random variable).
- The **e[i]** are independent and identically normally distributed, with mean 0 and the same variance.
- The Y variable (**response variable**) observations are independent.
- For a given set of X variable values, the variable Y is normally distributed, with constant mean and the same variance as the **e[i]**.
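As an illustration of the model and assumptions above, the following is a minimal sketch that generates data from fixed X values plus independent normal errors (mean 0, common variance) and then estimates the coefficients by least squares through the normal equations. The coefficient values, sample size, design, and the helper names `solve_linear_system` and `fit_least_squares` are assumptions made for this example only.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(x_rows, y):
    """Return [b0, b1, ..., bk] minimizing the sum of squared residuals."""
    design = [[1.0] + list(row) for row in x_rows]  # leading 1 for the intercept
    p = len(design[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Hypothetical true coefficients b0, b1, b2 and error standard deviation.
TRUE_B = [2.0, 1.5, -0.5]
SIGMA = 1.0

rng = random.Random(0)
# Fixed (non-random) predictor values: a 10 x 5 grid repeated four times.
x_rows = [(i % 10, (i // 10) % 5) for i in range(200)]
# Y[i] = b0 + b1*X1[i] + b2*X2[i] + e[i], with e[i] ~ Normal(0, SIGMA^2)
y = [TRUE_B[0] + TRUE_B[1] * x1 + TRUE_B[2] * x2 + rng.gauss(0.0, SIGMA)
     for x1, x2 in x_rows]

estimates = fit_least_squares(x_rows, y)
print("estimated b0, b1, b2:", [round(b, 3) for b in estimates])
```

With 200 observations the estimates should land close to the hypothetical true values; solving the normal equations directly is adequate for a small, well-conditioned example like this one, though numerical libraries typically use more stable decompositions.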

The normality assumption is required for
hypothesis tests, but not for estimation.

The X variables are also known as the **independent** variables.

The Y variable is also known as the **dependent** variable.

The **coefficients** are **bj**, the amount by which the expected
value of Y increases when Xj increases by a unit amount,
*when all the other X variables are held constant*.
This interpretation of the coefficients does not hold
if some of the X variables are functions of the others,
such as an interaction term Xj*Xk.
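The interpretation of a coefficient, and the way it breaks down with an interaction term, can be checked with a small worked example. The coefficient values and the function names `expected_y` and `expected_y_with_interaction` are hypothetical, chosen only for illustration.

```python
def expected_y(x1, x2, b0=1.0, b1=2.0, b2=3.0):
    """Expected Y under a model with no interaction: b0 + b1*X1 + b2*X2."""
    return b0 + b1 * x1 + b2 * x2


def expected_y_with_interaction(x1, x2, b0=1.0, b1=2.0, b2=3.0, b3=0.5):
    """Expected Y when a third X variable is the product X1*X2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * (x1 * x2)


# Without an interaction, a unit increase in X1 changes the expected Y
# by b1 = 2.0, whatever value X2 is held at.
effects_plain = [expected_y(5.0, x2) - expected_y(4.0, x2)
                 for x2 in (0.0, 1.0, 10.0)]

# With the interaction, X1*X2 cannot be held constant while X1 changes,
# so the change is b1 + b3*X2 and depends on the value of X2.
effects_interaction = [
    expected_y_with_interaction(5.0, x2) - expected_y_with_interaction(4.0, x2)
    for x2 in (0.0, 1.0, 10.0)
]

print(effects_plain)        # [2.0, 2.0, 2.0]
print(effects_interaction)  # [2.0, 2.5, 7.0]
```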

Note that it is *not* assumed that the X variables are
independent of each other.
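Although correlated X variables are permitted, strong correlation makes the coefficient estimates less precise. As a sketch for the two-predictor case: the variance of each estimated coefficient is larger by the factor 1 / (1 - r^2) than it would be for uncorrelated predictors with the same spread, where r is the sample correlation between X1 and X2. The data values and the function names `pearson_r` and `variance_inflation` are hypothetical.

```python
import math


def pearson_r(u, v):
    """Sample correlation coefficient between two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    return suv / math.sqrt(suu * svv)


def variance_inflation(r):
    """Variance inflation factor for a two-predictor model: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r * r)


x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_uncorrelated = [2.0, 1.0, 0.0, 0.0, 1.0, 2.0]  # symmetric about the centre of x1
x2_correlated = [1.1, 1.9, 3.2, 3.8, 5.1, 5.9]    # tracks x1 closely

r_uncorrelated = pearson_r(x1, x2_uncorrelated)
r_correlated = pearson_r(x1, x2_correlated)

print("uncorrelated: r =", round(r_uncorrelated, 3),
      "inflation =", round(variance_inflation(r_uncorrelated), 1))
print("correlated:   r =", round(r_correlated, 3),
      "inflation =", round(variance_inflation(r_correlated), 1))
```

The model remains valid in both cases; only the precision of the individual coefficient estimates differs.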

- **Ways to detect** before performing the multiple linear regression whether your data violate any assumptions.
- **Ways to examine** multiple linear regression results to detect assumption violations.
- **Possible alternatives** if your data or multiple linear regression results indicate assumption violations.

To properly analyze and interpret the
results of *multiple linear regression*, you should be familiar with the following terms and
concepts:

- simple linear regression
- correlation
- linear functions
- method of least squares
- residuals
- Gaussian (normal) distribution assumption
- equality of variance (homoscedasticity) assumption
- violation of assumptions
- transformations
- multiple regression
- leverage
- multicollinearity
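Several of the terms above (method of least squares, residuals, leverage) can be illustrated with simple linear regression, where closed-form expressions exist. This is a minimal sketch; the data are made up and the function name `simple_regression_diagnostics` is hypothetical.

```python
def simple_regression_diagnostics(x, y):
    """Least-squares fit of y = b0 + b1*x, with residuals and leverages."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx                 # slope
    b0 = mean_y - b1 * mean_x      # intercept
    # Residual: observed value minus fitted value.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Leverage of observation i: 1/n + (x_i - mean_x)^2 / Sxx
    leverages = [1.0 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    return b0, b1, residuals, leverages


# Made-up data; the last point has an X value far from the others.
x = [1.0, 2.0, 3.0, 4.0, 10.0]
y = [1.2, 1.9, 3.2, 3.8, 10.1]

b0, b1, residuals, leverages = simple_regression_diagnostics(x, y)
print("intercept:", round(b0, 3), "slope:", round(b1, 3))
print("leverages:", [round(h, 3) for h in leverages])
```

The residuals sum to zero and the leverages sum to the number of estimated coefficients (two here); the observation with the extreme X value carries by far the largest leverage.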



©1997 BBN Corporation All rights reserved.