Regression Wisdom

    Chapter 9: Regression Wisdom
    "Prediction is difficult, especially about the future."
    Niels Bohr, Danish physicist

    Sifting Residuals for Groups


    No regression analysis is complete without a display of the residuals.

        Is the linear model reasonable?

    Residuals are what is left over after the model describes the relationship. They can reveal subtleties:

        Additional details that confirm the model
        Violations of the regression conditions

    Look at a histogram of the residuals.
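
    As a rough illustration of the residual histogram check, the sketch below fits a least-squares line to made-up data containing two offset groups and bins the residuals. The data, the helper names (`fit_line`, `residuals`, `text_histogram`) and the bin width are assumptions chosen for this example, not part of the slides.

```python
from collections import Counter

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def residuals(xs, ys):
    """Observed y minus predicted y for every case."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

def text_histogram(values, bin_width):
    """Count values per bin; keys are the left edges of the bins."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

# Made-up data: two groups with the same slope, offset by 10 units in y.
xs = [1, 2, 3, 4, 5, 6, 1, 2, 3, 4, 5, 6]
ys = [2, 4, 6, 8, 10, 12, 12, 14, 16, 18, 20, 22]

res = residuals(xs, ys)
hist = text_histogram(res, 2.0)
for left_edge, count in hist.items():
    print(f"{left_edge:6.1f} | {'#' * count}")
```

    With this data the residuals fall into two separate clumps (all near -5 or +5) with nothing in between, which is the kind of gap that suggests more than one group.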

    Subsets


    Another condition for fitting models: all data must come from the same group.

    If you discover that there is more than one group in a regression:

        Analyze the groups separately, using a different model for each group,
        OR use the original model and note the different groups.
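
    One way to sketch the "different model for each group" option is to split the cases by a group label and fit a line to each subset. The records, the labels "A" and "B", and the helper `fit_by_group` are hypothetical and exist only for this example.

```python
from collections import defaultdict

def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_by_group(records):
    """Fit one line per group label; records are (x, y, label) tuples."""
    groups = defaultdict(lambda: ([], []))
    for x, y, label in records:
        groups[label][0].append(x)
        groups[label][1].append(y)
    return {label: fit_line(gx, gy) for label, (gx, gy) in groups.items()}

# Made-up records: group A follows y = 1 + 2x, group B follows y = 10 + 0.5x.
records = [
    (1, 3.0, "A"), (2, 5.0, "A"), (3, 7.0, "A"), (4, 9.0, "A"), (5, 11.0, "A"),
    (1, 10.5, "B"), (2, 11.0, "B"), (3, 11.5, "B"), (4, 12.0, "B"), (5, 12.5, "B"),
]

pooled = fit_line([r[0] for r in records], [r[1] for r in records])
per_group = fit_by_group(records)

print("pooled slope, intercept:", pooled)
for label, (slope, intercept) in sorted(per_group.items()):
    print(f"group {label}: slope={slope:.2f}, intercept={intercept:.2f}")
```

    Here the pooled line has a slope that matches neither group, which is why a single model for mixed groups can mislead.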

    Getting the Bends


    Fundamental assumption for working with linear models: the relationship modeled is, in fact, linear.

    Sometimes it is hard to tell from the scatterplot.

        Plot the regression residuals against the predicted values to see if there is a bend.
        The residual plot should have NO pattern if a linear model is appropriate.
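
    A minimal sketch of the residuals-versus-predicted check, using made-up curved data (y = x squared) so that the bend is easy to see. The data and helper names are assumptions for illustration; the output is printed as a table rather than drawn as a plot.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predicted_and_residuals(xs, ys):
    """Return (predicted, residual) pairs, in the order of the input cases."""
    slope, intercept = fit_line(xs, ys)
    pairs = []
    for x, y in zip(xs, ys):
        y_hat = intercept + slope * x
        pairs.append((y_hat, y - y_hat))
    return pairs

# Made-up data with a bend: y = x ** 2
xs = [0, 1, 2, 3, 4, 5, 6]
ys = [x ** 2 for x in xs]

pairs = predicted_and_residuals(xs, ys)
for y_hat, resid in pairs:
    print(f"predicted={y_hat:6.1f}  residual={resid:6.1f}")
```

    For this data the residuals run positive, then negative, then positive again as the predicted value increases. That U shape is a pattern, so a straight line is not an appropriate model here.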

    Extrapolation: Reaching Beyond the Data


    Linear models give a predicted value for each case in the data.

        Put a new x value into the equation.
        The equation gives a predicted y value.

    The farther a new x value lies from the data we used to build the regression, the less we should trust the predicted y value.

    Extrapolation


    Extrapolation: the use of a regression line for prediction outside the domain of values of x.

        Extrapolation requires the assumption that the x-y relationship never changes, even at extreme values of x.
        NOT to be trusted.
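
    The idea of "outside the domain of values of x" can be made concrete with a small check that flags any prediction whose x lies beyond the observed range. This is only a sketch: the data are made up, and the function `predict` and its returned flag are hypothetical names for this example.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(x_new, xs, ys):
    """Return (predicted y, True if x_new is outside the observed x range)."""
    slope, intercept = fit_line(xs, ys)
    y_hat = intercept + slope * x_new
    is_extrapolation = x_new < min(xs) or x_new > max(xs)
    return y_hat, is_extrapolation

# Made-up data observed only for x between 1 and 5
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.0]

for x_new in (3, 6, 50):
    y_hat, flagged = predict(x_new, xs, ys)
    note = "EXTRAPOLATION, treat with caution" if flagged else "within observed x range"
    print(f"x={x_new:>3}  predicted y={y_hat:7.2f}  ({note})")
```

    The equation returns a number for any x, including x = 50. The flag is a reminder that nothing in the data supports the prediction that far out.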


    Predicting the Future




    When the x variable in a linear model is time, extrapolation is an attempt to predict the future.

    If you must extrapolate into the future, be wary.

        Don't believe that the prediction will come true.

    Outliers


    Outliers strongly influence a regression.

    Outlier: any point that stands away from the other data points.

        Model outliers: points that fall far from the regression line.
        x outliers: points with high leverage, which can especially influence the regression model.

    Compare the regression models with and without the outlier.
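
    Comparing the models with and without an outlier can be sketched by refitting after dropping the suspect point. The data below are made up (five points on the line y = x plus one point far to the right), and `fit_without` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def fit_without(xs, ys, index):
    """Refit the line after dropping the case at position `index`."""
    kept_x = xs[:index] + xs[index + 1:]
    kept_y = ys[:index] + ys[index + 1:]
    return fit_line(kept_x, kept_y)

# Made-up data: the last point (x=10, y=2) sits far from the others.
xs = [1, 2, 3, 4, 5, 10]
ys = [1, 2, 3, 4, 5, 2]

slope_with, intercept_with = fit_line(xs, ys)
slope_without, intercept_without = fit_without(xs, ys, 5)

print(f"with outlier:    slope={slope_with:.3f}, intercept={intercept_with:.3f}")
print(f"without outlier: slope={slope_without:.3f}, intercept={intercept_without:.3f}")
```

    In this example one point changes the slope from 1 to roughly 0.08, so the two fits tell very different stories and both are worth reporting.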


    Influential Points


    Influential points can hide in residual plots.

        Points with high leverage pull the line close to them, so they can have small residuals.

    To find influential points:

        Look at the original scatterplot.
        Find the regression model with and without the point(s).
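
    The sketch below shows how a high-leverage point can hide in the residuals. It uses the standard leverage formula for simple linear regression, h = 1/n + (x - mean of x)^2 / Sxx, which is not stated in the slides and is brought in here as an assumption for illustration. The data are made up.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def leverages(xs):
    """Leverage of each case in a simple (one-predictor) linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [1 / n + (x - mean_x) ** 2 / sxx for x in xs]

# Made-up data: the last point is far from the others in x.
xs = [1, 2, 3, 4, 5, 10]
ys = [1, 2, 3, 4, 5, 2]

slope, intercept = fit_line(xs, ys)
resids = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
levs = leverages(xs)

for x, r, h in zip(xs, resids, levs):
    print(f"x={x:>2}  residual={r:6.2f}  leverage={h:.2f}")
```

    The point at x = 10 has by far the highest leverage (about 0.84), yet its residual is not the largest in absolute value, because it has pulled the line toward itself. A residual plot alone would not single it out.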


    Lurking Variables and Causation




    With observational data, there is no way to be sure that a lurking variable is not the cause of any apparent association.

        Do NOT infer causality from a regression.
        Resist the temptation to conclude that x causes y from a regression, no matter how obvious that conclusion seems to you.

    Working with Summary Values


    Scatterplots of statistics summarized over groups tend to show less variability than would be seen if we measured the same variable on individuals


        They show less scatter.
        They can give a false impression of how well a line summarizes the data.
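
    To see how summary values can overstate the strength of a relationship, the sketch below computes the correlation once on made-up individual measurements and once on the group means of the same data. The data and the helper `correlation` are assumptions for this example.

```python
import math
from collections import defaultdict

def correlation(xs, ys):
    """Pearson correlation coefficient r."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

# Made-up individuals: three cases at each x, widely scattered in y.
individuals = [
    (1, -2), (1, 1), (1, 4),
    (2, -1), (2, 2), (2, 5),
    (3, 0), (3, 3), (3, 6),
]

# Summarize: one mean y per x value.
by_x = defaultdict(list)
for x, y in individuals:
    by_x[x].append(y)
group_x = sorted(by_x)
group_mean_y = [sum(by_x[x]) / len(by_x[x]) for x in group_x]

r_individuals = correlation([x for x, _ in individuals], [y for _, y in individuals])
r_means = correlation(group_x, group_mean_y)

print(f"r on individuals: {r_individuals:.2f}")
print(f"r on group means: {r_means:.2f}")
```

    The individual data give a weak correlation (about 0.32), while the three group means fall exactly on a line. The averaging removed the scatter, not the uncertainty about individuals.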

    What Can Go Wrong


    Make sure the relationship is straight.

        Check the residual plot.

    Beware of extrapolating.

        Be cautious of extrapolating beyond the x values of the data.
        Don't trust extrapolated predictions when the x variable is time.

    Be on guard for different groups in your regression.

        Look at a histogram of the residual values.

    What Can Go Wrong


    Look for outliers.

    Beware of high-leverage points, especially influential ones.

    Consider comparing two regressions: one with the outlier and one without.

    Treat outliers honestly.

    Beware of lurking variables.

    Watch out for summary data.