Commit 691f309

committed
Added panel materials.
1 parent 1252cd7 commit 691f309

2 files changed

+351
-0
lines changed

_utils/panel.do

+29
@@ -0,0 +1,29 @@
// Week 7b: Panel regression (an aside)
// Adapted from the Stata docs, so we have
// a dataset that's publicly available.

use http://www.stata-press.com/data/r13/nlswork

xtset idcode year


local controls ///
    grade ///
    age ///
    ttl_exp ///
    tenure

local ivs ///
    not_smsa ///
    south


reg ln_wage `controls' `ivs'

xtreg ln_wage `controls' `ivs', fe
estimates store fe

xtreg ln_wage `controls' `ivs', re
estimates store re

hausman fe re
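
The `fe` option in the do-file above fits the model on within-person deviations from each person's own mean. As a rough illustration only, the Python sketch below applies that demeaning step to a small made-up panel. The `within_transform` helper and the toy values are hypothetical, and `grade` is assumed to be constant within each `idcode`. It shows why a regressor that never changes within a person carries no information once unit means are removed.

```python
from collections import defaultdict


def within_transform(rows, unit_key, var):
    """Subtract each unit's mean of `var` from that unit's observations."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[unit_key]] += row[var]
        counts[row[unit_key]] += 1
    return [row[var] - totals[row[unit_key]] / counts[row[unit_key]] for row in rows]


# Hypothetical toy panel: two people observed in three years each.
# grade is fixed within a person; tenure changes from year to year.
panel = [
    {"idcode": 1, "year": 70, "grade": 12, "tenure": 1.0},
    {"idcode": 1, "year": 71, "grade": 12, "tenure": 2.0},
    {"idcode": 1, "year": 72, "grade": 12, "tenure": 3.0},
    {"idcode": 2, "year": 70, "grade": 16, "tenure": 0.5},
    {"idcode": 2, "year": 71, "grade": 16, "tenure": 1.5},
    {"idcode": 2, "year": 72, "grade": 16, "tenure": 4.0},
]

grade_within = within_transform(panel, "idcode", "grade")
tenure_within = within_transform(panel, "idcode", "tenure")

print(grade_within)   # every entry is 0.0: no within-person variation left
print(tenure_within)  # nonzero deviations remain, so a coefficient can be estimated
```

After demeaning, the time-invariant column is identically zero, so it is perfectly collinear with the unit effects and cannot be given a coefficient.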

notebooks/7a_panel.ipynb

+322
@@ -0,0 +1,322 @@
1+
{
2+
"cells": [
3+
{
4+
"cell_type": "markdown",
5+
"metadata": {},
6+
"source": [
7+
"# Regression with panel data (an aside)\n",
8+
"\n",
9+
"In many studies in strategy and OT, we use text analysis as part of econometric models with panel data.\n",
10+
"Since we do not cover it elsewhere in the curriculum, we will take a small aside to discuss some of these models.\n",
11+
"\n",
12+
"**Note:** I'm using Stata here, so none of this content is interactive.\n",
13+
"\n",
14+
"This is partially adapted from the Stata `xtreg` docs, because we are covering it very quickly.\n",
15+
"You can find more detail [here](https://www.stata.com/manuals13/xtxtreg.pdf)."
16+
]
17+
},
18+
{
19+
"cell_type": "markdown",
20+
"metadata": {},
21+
"source": [
22+
"# Read data"
23+
]
24+
},
25+
{
26+
"cell_type": "raw",
27+
"metadata": {},
28+
"source": [
29+
". do panel.do\n",
30+
"\n",
31+
". // Week 7b: Panel regression (an aside)\n",
32+
". // Adapted from the Stata docs, so we have\n",
33+
". // a dataset that's publicly available.\n",
34+
". \n",
35+
". use http://www.stata-press.com/data/r13/nlswork\n",
36+
"(National Longitudinal Survey. Young Women 14-26 years of age in 1968)"
37+
]
38+
},
39+
{
40+
"cell_type": "markdown",
41+
"metadata": {},
42+
"source": [
43+
"In Stata, the `use` command reads data, including from URLs."
44+
]
45+
},
46+
{
47+
"cell_type": "markdown",
48+
"metadata": {},
49+
"source": [
50+
"# Setting the panel variables\n",
51+
"\n",
52+
"To help the model commands understand the panel structure, we use the `xtset` command. \n",
53+
"Note that `xtset` does not add year indicators to your models, so you would need to include `i.year` in the regression command to have Stata create and use them for you.\n",
54+
"\n",
55+
"`xtset idcode year`"
56+
]
57+
},
58+
{
59+
"cell_type": "raw",
60+
"metadata": {},
61+
"source": [
62+
" panel variable: idcode (unbalanced)\n",
63+
" time variable: year, 68 to 88, but with gaps\n",
64+
" delta: 1 unit"
65+
]
66+
},
67+
{
68+
"cell_type": "markdown",
69+
"metadata": {},
70+
"source": [
71+
"The output of `xtset` tells us about the panel variables."
72+
]
73+
},
74+
{
75+
"cell_type": "markdown",
76+
"metadata": {},
77+
"source": [
78+
"# Using local macros for collecting variable names\n",
79+
"\n",
80+
"A good practice with Stata is using a local macro to collect variable names.\n",
81+
"That way, if we're running multiple models, we can keep them in sync.\n",
82+
"It's especially helpful when we decide to add a control or other variable, and we want the change to apply to all models."
83+
]
84+
},
85+
{
86+
"cell_type": "markdown",
87+
"metadata": {},
88+
"source": [
89+
"```stata\n",
90+
"local controls ///\n",
91+
" grade ///\n",
92+
" age ///\n",
93+
" ttl_exp ///\n",
94+
" tenure\n",
95+
"\n",
96+
"\n",
97+
"local ivs ///\n",
98+
" not_smsa ///\n",
99+
" south\n",
100+
"```"
101+
]
102+
},
103+
{
104+
"cell_type": "markdown",
105+
"metadata": {},
106+
"source": [
107+
"Note that we're using Stata's line continuation sentinel, `///`. \n",
108+
"This allows us to tell Stata that it should ignore the end of the line and process the next one as if there is no line break.\n",
109+
"\n",
110+
"This has two practical benefits. \n",
111+
"First, we can avoid having a command that is one very long line that is hard to read and edit.\n",
112+
"Second, we can put `///` in front of one of these variables: Stata ignores the rest of that line and continues on the next, so the variable is skipped, allowing us to easily \"turn off\" a variable in our analyses.\n",
113+
"\n",
114+
"**Note:** Stata recognizes `///` only in do-files and ado-files, so line continuations do not work when typed into the Command window."
115+
]
116+
},
117+
{
118+
"cell_type": "markdown",
119+
"metadata": {},
120+
"source": [
121+
"# Regressions compared"
122+
]
123+
},
124+
{
125+
"cell_type": "raw",
126+
"metadata": {},
127+
"source": [
128+
". \n",
129+
". reg ln_wage `controls' `ivs'\n",
130+
"\n",
131+
" Source | SS df MS Number of obs = 28,091\n",
132+
"-------------+---------------------------------- F(6, 28084) = 2626.73\n",
133+
" Model | 2305.54089 6 384.256816 Prob > F = 0.0000\n",
134+
" Residual | 4108.32299 28,084 .14628696 R-squared = 0.3595\n",
135+
"-------------+---------------------------------- Adj R-squared = 0.3593\n",
136+
" Total | 6413.86388 28,090 .228332641 Root MSE = .38247\n",
137+
"\n",
138+
"------------------------------------------------------------------------------\n",
139+
" ln_wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]\n",
140+
"-------------+----------------------------------------------------------------\n",
141+
" grade | .0670419 .0010237 65.49 0.000 .0650355 .0690483\n",
142+
" age | -.0038303 .0005265 -7.28 0.000 -.0048622 -.0027984\n",
143+
" ttl_exp | .0287283 .0009252 31.05 0.000 .0269148 .0305417\n",
144+
" tenure | .0195421 .0008321 23.48 0.000 .017911 .0211731\n",
145+
" not_smsa | -.1637396 .0051791 -31.62 0.000 -.1738909 -.1535883\n",
146+
" south | -.1135945 .0047533 -23.90 0.000 -.1229112 -.1042777\n",
147+
" _cons | .8004553 .0173735 46.07 0.000 .7664024 .8345081\n",
148+
"------------------------------------------------------------------------------"
149+
]
150+
},
151+
{
152+
"cell_type": "markdown",
153+
"metadata": {},
154+
"source": [
155+
"The model above is simply an OLS model.\n",
156+
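"As we'll see below, several of these parameter estimates are noticeably larger in magnitude than they are when we account for the non-independence of repeated observations on the same person (for example, `not_smsa` is -0.164 here but -0.092 with fixed effects).\n",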
157+
"\n",
158+
"Note the syntax for using the local macros we created earlier: we wrap the name with a backtick `` ` `` on the left (the key to the left of the number 1 on a US keyboard) and an apostrophe `'` on the right (the key to the right of the semicolon key on a US keyboard)."
159+
]
160+
},
161+
{
162+
"cell_type": "raw",
163+
"metadata": {},
164+
"source": [
165+
". xtreg ln_wage `controls' `ivs', fe\n",
166+
"note: grade omitted because of collinearity\n",
167+
"\n",
168+
"Fixed-effects (within) regression Number of obs = 28,091\n",
169+
"Group variable: idcode Number of groups = 4,697\n",
170+
"\n",
171+
"R-sq: Obs per group:\n",
172+
" within = 0.1491 min = 1\n",
173+
" between = 0.3526 avg = 6.0\n",
174+
" overall = 0.2517 max = 15\n",
175+
"\n",
176+
" F(5,23389) = 819.94\n",
177+
"corr(u_i, Xb) = 0.2348 Prob > F = 0.0000\n",
178+
"\n",
179+
"------------------------------------------------------------------------------\n",
180+
" ln_wage | Coef. Std. Err. t P>|t| [95% Conf. Interval]\n",
181+
"-------------+----------------------------------------------------------------\n",
182+
" grade | 0 (omitted)\n",
183+
" age | -.0026787 .000863 -3.10 0.002 -.0043703 -.0009871\n",
184+
" ttl_exp | .0287709 .0014474 19.88 0.000 .0259339 .0316079\n",
185+
" tenure | .0114355 .0009229 12.39 0.000 .0096265 .0132445\n",
186+
" not_smsa | -.0921689 .0096641 -9.54 0.000 -.1111112 -.0732266\n",
187+
" south | -.0633396 .0110819 -5.72 0.000 -.0850608 -.0416184\n",
188+
" _cons | 1.591678 .0186849 85.19 0.000 1.555054 1.628302\n",
189+
"-------------+----------------------------------------------------------------\n",
190+
" sigma_u | .36167618\n",
191+
" sigma_e | .29477563\n",
192+
" rho | .60086475 (fraction of variance due to u_i)\n",
193+
"------------------------------------------------------------------------------\n",
194+
"F test that all u_i=0: F(4696, 23389) = 6.63 Prob > F = 0.0000\n",
195+
"\n",
196+
". estimates store fe"
197+
]
198+
},
199+
{
200+
"cell_type": "markdown",
201+
"metadata": {},
202+
"source": [
203+
"This is a fixed effects model.\n",
204+
"Note that grade does not vary within units, so the model drops it.\n",
205+
"Also, note that it reports the within, between, and overall R-squared for us, along with some panel stats in the header.\n",
206+
"\n",
207+
"It also has an F test that the unit effects are zero, which is rejected in this case.\n",
208+
"Note that, when using robust standard errors (as we often do), that test is suppressed.\n",
209+
"\n",
210+
"The command at the bottom, `estimates store fe`, stores the model estimates under the name `fe`.\n",
211+
"We could have named it anything, but `fe` is descriptive."
212+
]
213+
},
214+
{
215+
"cell_type": "raw",
216+
"metadata": {},
217+
"source": [
218+
". xtreg ln_wage `controls' `ivs', re\n",
219+
"\n",
220+
"Random-effects GLS regression Number of obs = 28,091\n",
221+
"Group variable: idcode Number of groups = 4,697\n",
222+
"\n",
223+
"R-sq: Obs per group:\n",
224+
" within = 0.1483 min = 1\n",
225+
" between = 0.4701 avg = 6.0\n",
226+
" overall = 0.3569 max = 15\n",
227+
"\n",
228+
" Wald chi2(6) = 8304.62\n",
229+
"corr(u_i, X) = 0 (assumed) Prob > chi2 = 0.0000\n",
230+
"\n",
231+
"------------------------------------------------------------------------------\n",
232+
" ln_wage | Coef. Std. Err. z P>|z| [95% Conf. Interval]\n",
233+
"-------------+----------------------------------------------------------------\n",
234+
" grade | .0691836 .0017689 39.11 0.000 .0657166 .0726506\n",
235+
" age | -.0038386 .0006544 -5.87 0.000 -.0051212 -.0025559\n",
236+
" ttl_exp | .0301313 .0011215 26.87 0.000 .0279331 .0323294\n",
237+
" tenure | .0134656 .0008442 15.95 0.000 .011811 .0151202\n",
238+
" not_smsa | -.128591 .0072246 -17.80 0.000 -.142751 -.114431\n",
239+
" south | -.0932646 .007231 -12.90 0.000 -.107437 -.0790921\n",
240+
" _cons | .7544109 .0273445 27.59 0.000 .7008168 .8080051\n",
241+
"-------------+----------------------------------------------------------------\n",
242+
" sigma_u | .26027808\n",
243+
" sigma_e | .29477563\n",
244+
" rho | .43808743 (fraction of variance due to u_i)\n",
245+
"------------------------------------------------------------------------------\n",
246+
"\n",
247+
". estimates store re"
248+
]
249+
},
250+
{
251+
"cell_type": "markdown",
252+
"metadata": {},
253+
"source": [
254+
"This is a random effects model.\n",
255+
"Note how the estimates change when we assume the unit effects are uncorrelated with the regressors (the `corr(u_i, X) = 0 (assumed)` line in the model output reminds us of that assumption)."
256+
]
257+
},
258+
{
259+
"cell_type": "markdown",
260+
"metadata": {},
261+
"source": [
262+
"# Testing whether the RE model is consistent\n",
263+
"\n",
264+
"A Hausman test checks whether the FE and RE coefficient estimates differ systematically. \n",
265+
"If they do not, the RE estimates are consistent and we can use the more efficient RE model.\n",
266+
"\n",
267+
"**Note:** Using this test assumes that a fixed-effects model would be appropriate.\n",
268+
"If you want a time-invariant variable in the regression, it will be dropped by FE.\n",
269+
"If you want a nearly time-invariant variable, almost all of the variance will be wiped out, but the model will still give you a parameter estimate.\n",
270+
"Reviewers often ask for this test, and you may need to argue smartly if FE isn't appropriate for your study."
271+
]
272+
},
273+
{
274+
"cell_type": "raw",
275+
"metadata": {},
276+
"source": [
277+
". \n",
278+
". hausman fe re\n",
279+
"\n",
280+
" ---- Coefficients ----\n",
281+
" | (b) (B) (b-B) sqrt(diag(V_b-V_B))\n",
282+
" | fe re Difference S.E.\n",
283+
"-------------+----------------------------------------------------------------\n",
284+
" age | -.0026787 -.0038386 .0011599 .0005626\n",
285+
" ttl_exp | .0287709 .0301313 -.0013603 .000915\n",
286+
" tenure | .0114355 .0134656 -.0020301 .0003729\n",
287+
" not_smsa | -.0921689 -.128591 .0364221 .0064187\n",
288+
" south | -.0633396 -.0932646 .029925 .0083977\n",
289+
"------------------------------------------------------------------------------\n",
290+
" b = consistent under Ho and Ha; obtained from xtreg\n",
291+
" B = inconsistent under Ha, efficient under Ho; obtained from xtreg\n",
292+
"\n",
293+
" Test: Ho: difference in coefficients not systematic\n",
294+
"\n",
295+
" chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)\n",
296+
" = 121.50\n",
297+
" Prob>chi2 = 0.0000"
298+
]
299+
}
300+
],
301+
"metadata": {
302+
"kernelspec": {
303+
"display_name": "Python 3",
304+
"language": "python",
305+
"name": "python3"
306+
},
307+
"language_info": {
308+
"codemirror_mode": {
309+
"name": "ipython",
310+
"version": 3
311+
},
312+
"file_extension": ".py",
313+
"mimetype": "text/x-python",
314+
"name": "python",
315+
"nbconvert_exporter": "python",
316+
"pygments_lexer": "ipython3",
317+
"version": "3.7.5"
318+
}
319+
},
320+
"nbformat": 4,
321+
"nbformat_minor": 4
322+
}
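
The `rho` rows in the two `xtreg` outputs above can be reproduced by hand from the reported `sigma_u` and `sigma_e`, because rho is the share of total variance attributed to the unit effects: sigma_u^2 / (sigma_u^2 + sigma_e^2). Since the notebook declares a Python 3 kernel, a minimal check might look like the sketch below. The `rho` helper name is only for illustration, and the inputs are copied from the printed output, so agreement is limited by rounding in those printed values.

```python
def rho(sigma_u, sigma_e):
    """Fraction of total variance due to the unit effects u_i."""
    return sigma_u ** 2 / (sigma_u ** 2 + sigma_e ** 2)


# Values copied from the fixed-effects and random-effects output tables.
fe_rho = rho(0.36167618, 0.29477563)
re_rho = rho(0.26027808, 0.29477563)

print(round(fe_rho, 4))  # about 0.6009, matching the FE table
print(round(re_rho, 4))  # about 0.4381, matching the RE table
```

Both models report the same `sigma_e`. The lower rho under random effects comes entirely from its smaller estimate of `sigma_u`.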
