Regression Wisdom
Chapter 9 Regression Wisdom
"Prediction is difficult, especially about the future."
Niels Bohr, Danish physicist
Sifting Residuals for Groups
No regression analysis is complete without a display of the residuals
Is the linear model reasonable?
Residuals are what is left over after the model describes the relationship
Residuals can reveal subtleties: additional details that help confirm our understanding
Residuals can reveal violations of the regression conditions
Look at a histogram of the residuals
Subsets
Another condition for fitting models
All data must come from the same group
If you discover that there is more than one group in a regression, either:
Analyze the groups separately, using a different model for each group
OR use the original model and note the different groups
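The two options above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the data are made up, and `fit_line` and `unzip` are hypothetical helpers written here for the example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def unzip(points):
    """Split a list of (x, y) pairs into a list of x's and a list of y's."""
    return [x for x, _ in points], [y for _, y in points]


# Made-up data (an assumption): two groups with the same slope
# but different heights.
group_a = [(x, 2 * x + 1) for x in range(1, 6)]
group_b = [(x, 2 * x + 6) for x in range(1, 6)]

# Option 1: one model for all the data, noting the groups in the residuals.
all_x, all_y = unzip(group_a + group_b)
slope, intercept = fit_line(all_x, all_y)
resid_a = [y - (intercept + slope * x) for x, y in group_a]
resid_b = [y - (intercept + slope * x) for x, y in group_b]
print("pooled residuals, group A:", resid_a)
print("pooled residuals, group B:", resid_b)

# Option 2: a separate model for each group.
fit_a = fit_line(*unzip(group_a))
fit_b = fit_line(*unzip(group_b))
print("group A (slope, intercept):", fit_a)
print("group B (slope, intercept):", fit_b)
```

With these made-up numbers, the pooled residuals are all negative for one group and all positive for the other, which is the kind of two-cluster pattern a histogram of residuals would show.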
Getting the Bends
Fundamental assumption for working with linear models: the relationship modeled is in fact linear
Sometimes it is hard to tell from the scatterplot
Plot the regression residuals against the predicted values to see if there is a bend
The residual plot should have NO pattern if a linear model is appropriate
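As a rough illustration of the residual check, the sketch below fits a line to made-up data that actually follows a curve and prints the residuals next to the predicted values. The data and the `fit_line` helper are assumptions made for this example, not part of the original slides.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data (an assumption) that follows a curve: y = x squared.
xs = [1, 2, 3, 4, 5, 6, 7]
ys = [x ** 2 for x in xs]

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
residuals = [y - p for y, p in zip(ys, predicted)]

# Residuals listed against predicted values: look for a pattern.
for p, r in zip(predicted, residuals):
    print(f"predicted {p:6.2f}   residual {r:6.2f}")
```

Here the residuals are positive at both ends and negative in the middle. That U shape is the bend, and it suggests a straight line is not an appropriate model for these data.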
Extrapolation Reaching Beyond the Data
Linear models give a predicted value for each case in the data
Put a new x value into the equation, and the equation gives a predicted y value
The farther the new x value lies from the data we used to build the regression, the less we should trust the predicted y value
Extrapolation: the use of a regression line for prediction outside the range of x values used to fit it
Extrapolation requires the assumption that the relationship between x and y never changes, even at extreme values of x
Extrapolated predictions are NOT to be trusted
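A small sketch of why extrapolation is risky, under stated assumptions: the "true" relationship below is a made-up curve that levels off, the line is fit only to x values from 1 to 5, and `fit_line` and `true_curve` are hypothetical names introduced for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def true_curve(x):
    """Made-up relationship (an assumption): rises, levels off, then falls."""
    return 10 * x - x ** 2


# The regression only ever sees x values from 1 to 5.
xs = [1, 2, 3, 4, 5]
ys = [true_curve(x) for x in xs]
slope, intercept = fit_line(xs, ys)


def predict(x):
    return intercept + slope * x


# Compare the line with the curve inside and outside the data.
error_inside = abs(predict(3) - true_curve(3))
error_outside = abs(predict(10) - true_curve(10))
print(f"x = 3  (inside the data):  off by {error_inside:.1f}")
print(f"x = 10 (outside the data): off by {error_outside:.1f}")
```

Inside the data the line is a fair summary of this curve. Far outside it, the line keeps climbing while the made-up relationship has changed direction, so the prediction misses badly.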
Predicting the Future
When the x variable in a linear model is time, extrapolation is an attempt to predict the future
If you must extrapolate into the future, be wary
Don't believe that the prediction will come true
Outliers
Outliers strongly influence a regression
Outlier: any point that stands away from the other data points
Model outliers: points that fall far from the regression line
x outliers: points whose x values lie far from the rest; they have high leverage and can especially influence the regression model
Compare the regression models with and without the outlier
Influential Points
Influential points can hide in residual plots
Points with high leverage pull the line close to them, so they often have small residuals
To find influential points: look at the original scatterplot, and find the regression model with and without the point(s)
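The with-and-without comparison can be sketched as follows. The data are made up for illustration (five points with no trend plus one point far out in x), and `fit_line` is a hypothetical helper, so treat this as one possible way to run the check rather than a prescribed method.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Made-up data (an assumption): five points with no trend...
base_x = [1, 2, 3, 4, 5]
base_y = [3, 2, 3, 2, 3]
# ...plus one point far from the others in x (high leverage).
far_point = (20, 12)

# Fit the regression without and with the far point.
slope_without, intercept_without = fit_line(base_x, base_y)
slope_with, intercept_with = fit_line(
    base_x + [far_point[0]], base_y + [far_point[1]]
)
print(f"slope without the point: {slope_without:.2f}")
print(f"slope with the point:    {slope_with:.2f}")

# Residuals from the model that includes the far point.
far_residual = far_point[1] - (intercept_with + slope_with * far_point[0])
other_residuals = [
    y - (intercept_with + slope_with * x) for x, y in zip(base_x, base_y)
]
print(f"residual of the far point: {far_residual:.2f}")
print(f"largest other residual:    {max(abs(r) for r in other_residuals):.2f}")
```

In this example the single far point changes the slope from essentially zero to clearly positive, yet its own residual is smaller than the largest residual among the other points. That is how an influential point can hide in a residual plot.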
Lurking Variables and Causation
With observational data, there is no way to be sure that a lurking variable is not the cause of any apparent association
Do NOT infer causality from a regression
Resist the temptation to conclude that x causes y from a regression, no matter how obvious that conclusion seems to you
Working with Summary Values
Scatterplots of statistics summarized over groups tend to show less variability than would be seen if we measured the same variable on individuals
Summary values show less scatter
This can give a false impression of how well a line summarizes the data
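To make the effect concrete, the sketch below compares the correlation computed on individuals with the correlation computed on group means. The data are made up for illustration, and `correlation` is a hypothetical helper written for this example.

```python
from math import sqrt


def correlation(xs, ys):
    """Pearson correlation coefficient r for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Made-up data (an assumption): three individuals in each of four groups,
# scattered around the line y = 2x.
groups = {x: [2 * x - 3, 2 * x, 2 * x + 3] for x in (1, 2, 3, 4)}

# One (x, y) pair per individual.
indiv_x = [x for x, ys in groups.items() for _ in ys]
indiv_y = [y for ys in groups.values() for y in ys]

# One (x, mean y) pair per group.
group_x = list(groups)
group_mean_y = [sum(ys) / len(ys) for ys in groups.values()]

r_individuals = correlation(indiv_x, indiv_y)
r_group_means = correlation(group_x, group_mean_y)
print(f"r for individuals: {r_individuals:.2f}")
print(f"r for group means: {r_group_means:.2f}")
```

With these numbers the group means fall exactly on a line, while the individuals scatter noticeably around it. Judging the fit from the means alone would overstate how well the line describes individuals.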
What Can Go Wrong
Make sure the relationship is straight: check the residual plot
Beware of extrapolating: be cautious of extrapolating beyond the x values of the data
Be especially wary of extrapolated predictions when the x variable is time
Be on guard for different groups in your regression: look at a histogram of the residual values
What Can Go Wrong
Look for outliers
Beware of high leverage points, especially influential ones
Consider comparing two regressions: one with the outlier and one without
Treat outliers honestly
Beware of lurking variables
Watch out for summary data












