Interpreting Coefficients for Qualitative IVs

The boxplot shows the distribution of SAT scores for two types of high school: public and Catholic. The x-axis shows the type of high school, and the y-axis shows the SAT score. Each box spans the interquartile range of SAT scores for that school type, and the red line in the middle of the box marks the median. The whiskers extend to the minimum and maximum non-outlier values, and any points outside the whiskers are considered outliers. The text annotation at the top right shows the unstandardized regression coefficient (b), which estimates the difference in average SAT score between Catholic and public high schools.

Because a 1-unit increase in a qualitative variable doesn't make any substantive sense, we instead interpret the coefficient as a difference relative to a reference category. The reference category is the category left out of the regression; each regression coefficient is compared against it.

The significance test compares the specified group with the reference group. In this case, we cannot be confident that public and Catholic school seniors score differently on the SAT (p > .05).

Because the reference category is not included in the regression, the y-intercept is the expected value of Y for the reference category. 
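A minimal sketch of why the intercept works this way, using made-up SAT scores (not the data behind the boxplot) and a hypothetical helper, fit_simple_ols, that applies the closed-form least-squares formulas for a single predictor. Coding public = 1 and Catholic = 0 makes Catholic the reference category.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Closed-form least squares for y = intercept + slope * x (one predictor)."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Made-up scores for illustration only (an assumption, not the slide's data).
catholic = [1250, 1190, 1230, 1210]
public = [1150, 1200, 1170, 1180]

# Dummy coding: Catholic = 0 (reference category), public = 1.
x = [0] * len(catholic) + [1] * len(public)
y = catholic + public

intercept, slope = fit_simple_ols(x, y)

print("intercept:", intercept, "| Catholic mean:", mean(catholic))
print("slope:", slope, "| public mean - Catholic mean:", mean(public) - mean(catholic))
```

With these made-up numbers the intercept equals the Catholic group mean (1220) and the slope equals the public mean minus the Catholic mean (1175 - 1220 = -45). This holds when the dummy variable is the only predictor in the model.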

For binary qualitative variables, the reference category is often implicit or unstated. For example, the regression output on the left doesn't show the reference category "Catholic."

Relative to Catholic school seniors, the model predicts that public school seniors score 35.75 points lower on the SAT.

Catholic school seniors are estimated to score 1215. 

Changing the reference category to "public school seniors" would give the same substantive conclusion. Relative to public school seniors, the model predicts that Catholic school seniors score 35.75 points higher on the SAT.

Public school seniors are expected to score 1179.
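The arithmetic behind switching the reference category can be sketched with the values reported above. The variable names are illustrative, and the sketch assumes the reported intercept of 1215 is exact rather than rounded.

```python
# Values reported above, with Catholic as the reference category.
intercept_catholic_ref = 1215.0  # predicted score for Catholic school seniors
b_public = -35.75                # public relative to Catholic

# Prediction for public school seniors under the original coding (public = 1).
public_pred = intercept_catholic_ref + b_public * 1

# Re-code with public as the reference category: the intercept becomes the
# public prediction and the coefficient keeps its size but flips its sign.
intercept_public_ref = public_pred
b_catholic = -b_public

# Prediction for Catholic school seniors under the new coding (Catholic = 1).
catholic_pred = intercept_public_ref + b_catholic * 1

print("public prediction:", public_pred)
print("Catholic prediction:", catholic_pred)
```

Under that assumption the public prediction is 1179.25, which rounds to the 1179 reported above, and the Catholic prediction is again 1215. The predictions and the size of the gap are the same under either coding; only the sign of b and the meaning of the intercept change.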