* Regression diagnostics.do (forked from Ronggui/SOCI620004)
use Prestige
* the regression of prestige on income and education
regress prestige income education
* standardised coefficients
regress prestige income education, beta
regress prestige income
** outliers and influential observations
** the first argument of predict is the new variable's name; choose any name you like
predict fitted, xb
predict hval, hat
* equivalently: predict hval, leverage (leverage is a synonym for hat)
predict resid, residual
predict rstd, rstandard
predict rstud, rstudent
predict cookdistance, cooksd
gen id=_n
twoway (dropline rstud id, mlabel(id))
* inspect leverage and cooksd
gen cookdistancesqrt = sqrt(cookdistance)
twoway (scatter rstud hval [aweight=cookdistancesqrt], msymbol(oh))
* leverage-versus-squared-residuals plot
lvr2plot, mlabel(id)
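* A rough numeric screen to go with the plots (a sketch, not in the original file).
* The cutoffs |rstudent| > 2, leverage > 2(k+1)/n and Cook's D > 4/n are
* common rules of thumb, assumed here for illustration only.
* k and n are read from the most recent regress results.
local k = e(df_m)
local n = e(N)
list id rstud hval cookdistance ///
    if !missing(rstud, hval, cookdistance) ///
    & (abs(rstud) > 2 | hval > 2*(`k' + 1)/`n' | cookdistance > 4/`n')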
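** added-variable plots (aka partial regression plots)
** useful for spotting the joint influence of multiple obs
* refit the two-predictor model so the plots partial out education
regress prestige income education
avplot income
avplots
* you can construct the avplot for income manually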
reg prestige education
predict yr, residual
reg income education
predict xr, residual
twoway (scatter yr xr) (lfit yr xr)
* normality
hist resid
kdensity resid
qnorm resid
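* Formal tests to complement the plots (a sketch, not in the original file):
* skewness/kurtosis test and Shapiro-Wilk test on the stored residuals.
* Small p-values suggest departure from normality.
sktest resid
swilk resid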
regress prestige income education
* heteroskedasticity (Breusch-Pagan / Cook-Weisberg test)
estat hettest
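* standardized residuals
* resid and fitted already exist from the earlier predict calls, so drop them first
drop resid fitted
predict resid, rstandard
predict fitted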
gen residsq = resid*resid
gen absresid = abs(resid)
twoway (scatter resid fitted)
* or use the shorthand rvfplot (it plots raw, not standardized, residuals)
rvfplot
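* absresid is generated above but never plotted. One plausible use (an
* assumption about the intent, not the author's stated method) is a
* spread-versus-fitted plot: an upward or downward lowess trend in the
* absolute residuals hints at non-constant error variance.
twoway (scatter absresid fitted) (lowess absresid fitted)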
* robust std err
regress prestige income education, robust
* linearity
twoway (scatter prestige income)
graph matrix income education prestige
* examine linearity and suggest alternative functional form
cprplot income, lowess
* perfect multicollinearity
gen income_n = income
regress prestige income income_n
* high correlation
drop income_n
gen income_n = 0.9*income + rnormal()
* vif to detect multicollinearity
cor prestige income education
regress prestige income education
estat vif
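* income_n is generated above but never enters a model. Assuming it was
* meant to illustrate near-collinearity (a guess, not confirmed by the file),
* a sketch: include it alongside income and compare the VIFs with those above.
regress prestige income income_n education
estat vif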
* demonstrate the effect of centering
gen incomesq=income*income
reg prestige education income incomesq
estat vif
sum income
gen incomeC=income-r(mean)
gen incomeCsq=incomeC^2
reg prestige education incomeC incomeCsq
estat vif
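* Why centering changes the VIFs (a sketch, not in the original file):
* compare the correlation between the linear and squared terms before and
* after centering. The raw pair is typically more strongly correlated.
* correlate leaves the regress results in e() untouched.
correlate income incomesq
correlate incomeC incomeCsq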
* omitted variable test (Ramsey RESET)
estat ovtest
* demonstrate the effect of omitted variables
* incomesq was already generated above, so it is not generated again here
gen prestige_sim = 2.456574 + 1.08039*income + 0.8*incomesq + rnormal()
reg prestige_sim income
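* A sketch of how the omission shows up (not in the original file):
* RESET on the misspecified fit above should reject, because the simulated
* outcome contains incomesq. Refitting with the squared term should then
* recover coefficients close to the simulated 1.08039 and 0.8.
estat ovtest
reg prestige_sim income incomesq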
regress prestige income education women
* wald test / F test
test income education
* likelihood-ratio test of the same restriction (income and education jointly zero)
regress prestige income education women
estimates store ful
regress prestige women
estimates store constr
lrtest constr ful