Added panel materials.

jtkiley · jtkiley · commit 691f3094e12d · 2020-03-05T12:23:08.000-06:00
diff --git a/_utils/panel.do b/_utils/panel.do
@@ -0,0 +1,29 @@
+// Week 7b: Panel regression (an aside)
+// Adapted from the Stata docs, so we have
+// a dataset that's publicly available.
+
+use http://www.stata-press.com/data/r13/nlswork
+
+xtset idcode year
+
+
+local controls ///
+    grade ///
+    age ///
+    ttl_exp ///
+    tenure
+
+local ivs ///
+    not_smsa ///
+    south
+
+
+reg ln_wage `controls' `ivs'
+
+xtreg ln_wage `controls' `ivs', fe
+estimates store fe
+
+xtreg ln_wage `controls' `ivs', re
+estimates store re
+
+hausman fe re
diff --git a/notebooks/7a_panel.ipynb b/notebooks/7a_panel.ipynb
@@ -0,0 +1,322 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Regression with panel data (an aside)\n",
+    "\n",
+    "In many studies in strategy and OT, we use text analysis as part of econometric models with panel data.\n",
+    "Since we do not cover it elsewhere in the curriculum, we will take a small aside to discuss some of these models.\n",
+    "\n",
+    "**Note:** I'm using Stata here, so none of this content is interactive.\n",
+    "\n",
+    "This is partially adapted from the Stata `xtreg` docs, because we are covering it very quickly.\n",
+    "You can find more detail [here](https://www.stata.com/manuals13/xtxtreg.pdf)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Read data"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    ". do panel.do\n",
+    "\n",
+    ". // Week 7b: Panel regression (an aside)\n",
+    ". // Adapted from the Stata docs, so we have\n",
+    ". // a dataset that's publicly available.\n",
+    ". \n",
+    ". use http://www.stata-press.com/data/r13/nlswork\n",
+    "(National Longitudinal Survey.  Young Women 14-26 years of age in 1968)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In Stata, the `use` command reads data, including from URLs."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Setting the panel variables\n",
+    "\n",
+    "To help the model commands understand the panel structure, we use the `xtset` command. \n",
+    "Do note that the year variables are not automatically added, so you would need to add `i.year` to have Stata create and use indicators for you.\n",
+    "\n",
+    "`xtset idcode year`"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    "     panel variable:  idcode (unbalanced)\n",
+    "      time variable:  year, 68 to 88, but with gaps\n",
+    "              delta:  1 unit"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The output of `xtset` tells us about the panel variables."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Using local macros for collecting variable names\n",
+    "\n",
+    "A good practice with Stata is using a local macro to collect variable names.\n",
+    "That way, if we're running multiple models, we can keep them in sync.\n",
+    "It's especially helpful when we decide to add a control or other variable, and we want the change to apply to all models."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "```stata\n",
+    "local controls ///\n",
+    "   grade ///\n",
+    "   age ///\n",
+    "   ttl_exp ///\n",
+    "   tenure\n",
+    "\n",
+    "\n",
+    "local ivs ///\n",
+    "    not_smsa ///\n",
+    "    south\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Note that we're using Stata's line continuation sentinel, `///`. \n",
+    "This allows us to tell Stata that it should ignore the end of the line and process the next one as if there is no line break.\n",
+    "\n",
+    "There are two forms of practical significance here. \n",
+    "First, we can avoid having a command that is one very long line that is hard to read and edit.\n",
+    "Second, we can add a line continuation in front of one of these variables, and that one will be skipped, allowing us to easily \"turn off\" a variable in our analyses.\n",
+    "\n",
+    "**Note:** For some reason, the Stata app does not properly handle line continuations when entered in the command window."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Regressions compared"
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    ". \n",
+    ". reg ln_wage `controls' `ivs'\n",
+    "\n",
+    "      Source |       SS           df       MS      Number of obs   =    28,091\n",
+    "-------------+----------------------------------   F(6, 28084)     =   2626.73\n",
+    "       Model |  2305.54089         6  384.256816   Prob > F        =    0.0000\n",
+    "    Residual |  4108.32299    28,084   .14628696   R-squared       =    0.3595\n",
+    "-------------+----------------------------------   Adj R-squared   =    0.3593\n",
+    "       Total |  6413.86388    28,090  .228332641   Root MSE        =    .38247\n",
+    "\n",
+    "------------------------------------------------------------------------------\n",
+    "     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
+    "-------------+----------------------------------------------------------------\n",
+    "       grade |   .0670419   .0010237    65.49   0.000     .0650355    .0690483\n",
+    "         age |  -.0038303   .0005265    -7.28   0.000    -.0048622   -.0027984\n",
+    "     ttl_exp |   .0287283   .0009252    31.05   0.000     .0269148    .0305417\n",
+    "      tenure |   .0195421   .0008321    23.48   0.000      .017911    .0211731\n",
+    "    not_smsa |  -.1637396   .0051791   -31.62   0.000    -.1738909   -.1535883\n",
+    "       south |  -.1135945   .0047533   -23.90   0.000    -.1229112   -.1042777\n",
+    "       _cons |   .8004553   .0173735    46.07   0.000     .7664024    .8345081\n",
+    "------------------------------------------------------------------------------"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The model above is simply an OLS model.\n",
+    "As we'll see below, some of these parameter estimates are a lot higher than they are when we account for the non-independence.\n",
+    "\n",
+    "Note the syntax for using the local macros we created earlier: we use the name with a backtick `` ` `` on the left (the key to the left of the number 1 on a US keyboard) and an apostrophe `'` (the key to the right of the semicolon key on a US keyboard)."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    ". xtreg ln_wage `controls' `ivs', fe\n",
+    "note: grade omitted because of collinearity\n",
+    "\n",
+    "Fixed-effects (within) regression               Number of obs     =     28,091\n",
+    "Group variable: idcode                          Number of groups  =      4,697\n",
+    "\n",
+    "R-sq:                                           Obs per group:\n",
+    "     within  = 0.1491                                         min =          1\n",
+    "     between = 0.3526                                         avg =        6.0\n",
+    "     overall = 0.2517                                         max =         15\n",
+    "\n",
+    "                                                F(5,23389)        =     819.94\n",
+    "corr(u_i, Xb)  = 0.2348                         Prob > F          =     0.0000\n",
+    "\n",
+    "------------------------------------------------------------------------------\n",
+    "     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]\n",
+    "-------------+----------------------------------------------------------------\n",
+    "       grade |          0  (omitted)\n",
+    "         age |  -.0026787    .000863    -3.10   0.002    -.0043703   -.0009871\n",
+    "     ttl_exp |   .0287709   .0014474    19.88   0.000     .0259339    .0316079\n",
+    "      tenure |   .0114355   .0009229    12.39   0.000     .0096265    .0132445\n",
+    "    not_smsa |  -.0921689   .0096641    -9.54   0.000    -.1111112   -.0732266\n",
+    "       south |  -.0633396   .0110819    -5.72   0.000    -.0850608   -.0416184\n",
+    "       _cons |   1.591678   .0186849    85.19   0.000     1.555054    1.628302\n",
+    "-------------+----------------------------------------------------------------\n",
+    "     sigma_u |  .36167618\n",
+    "     sigma_e |  .29477563\n",
+    "         rho |  .60086475   (fraction of variance due to u_i)\n",
+    "------------------------------------------------------------------------------\n",
+    "F test that all u_i=0: F(4696, 23389) = 6.63                 Prob > F = 0.0000\n",
+    "\n",
+    ". estimates store fe"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This is a fixed effects model.\n",
+    "Note that grade does not vary within units, so the model drops it.\n",
+    "Also, note that it splits out the within, between, and overall effects for us, and reports some panel stats in the header.\n",
+    "\n",
+    "It also has an F test that the unit effects are zero, which is rejected in this case.\n",
+    "Note that, when using robust standard errors (as we often do), that test is suppressed.\n",
+    "\n",
+    "The command at the bottom, `estimates store fe` stores the model estimates with the name `fe`.\n",
+    "We could have named it anything, but `fe` is descriptive."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    ". xtreg ln_wage `controls' `ivs', re\n",
+    "\n",
+    "Random-effects GLS regression                   Number of obs     =     28,091\n",
+    "Group variable: idcode                          Number of groups  =      4,697\n",
+    "\n",
+    "R-sq:                                           Obs per group:\n",
+    "     within  = 0.1483                                         min =          1\n",
+    "     between = 0.4701                                         avg =        6.0\n",
+    "     overall = 0.3569                                         max =         15\n",
+    "\n",
+    "                                                Wald chi2(6)      =    8304.62\n",
+    "corr(u_i, X)   = 0 (assumed)                    Prob > chi2       =     0.0000\n",
+    "\n",
+    "------------------------------------------------------------------------------\n",
+    "     ln_wage |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]\n",
+    "-------------+----------------------------------------------------------------\n",
+    "       grade |   .0691836   .0017689    39.11   0.000     .0657166    .0726506\n",
+    "         age |  -.0038386   .0006544    -5.87   0.000    -.0051212   -.0025559\n",
+    "     ttl_exp |   .0301313   .0011215    26.87   0.000     .0279331    .0323294\n",
+    "      tenure |   .0134656   .0008442    15.95   0.000      .011811    .0151202\n",
+    "    not_smsa |   -.128591   .0072246   -17.80   0.000     -.142751    -.114431\n",
+    "       south |  -.0932646    .007231   -12.90   0.000     -.107437   -.0790921\n",
+    "       _cons |   .7544109   .0273445    27.59   0.000     .7008168    .8080051\n",
+    "-------------+----------------------------------------------------------------\n",
+    "     sigma_u |  .26027808\n",
+    "     sigma_e |  .29477563\n",
+    "         rho |  .43808743   (fraction of variance due to u_i)\n",
+    "------------------------------------------------------------------------------\n",
+    "\n",
+    ". estimates store re"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This is a random effects model.\n",
+    "Note the differences when we assume no correlation (and the model output reminds us of that fact)."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Testing whether the RE model is consistent\n",
+    "\n",
+    "A Hausman test can test whether the FE and RE estimates are consistent. \n",
+    "If they are, we can use use the more efficient RE model.\n",
+    "\n",
+    "**Note:** Using this test assumes that a fixed-effects model would be appropriate.\n",
+    "If you want a time-invariant variable in the regression, it will be dropped be FE.\n",
+    "If you want a nearly time-invariant variable, almost all of the variance will be wiped out, but the model will still give you a parameter estimate.\n",
+    "Reviewers often ask for this test, and you may need to argue smartly if FE isn't appropriate for your study."
+   ]
+  },
+  {
+   "cell_type": "raw",
+   "metadata": {},
+   "source": [
+    ". \n",
+    ". hausman fe re\n",
+    "\n",
+    "                 ---- Coefficients ----\n",
+    "             |      (b)          (B)            (b-B)     sqrt(diag(V_b-V_B))\n",
+    "             |       fe           re         Difference          S.E.\n",
+    "-------------+----------------------------------------------------------------\n",
+    "         age |   -.0026787    -.0038386        .0011599        .0005626\n",
+    "     ttl_exp |    .0287709     .0301313       -.0013603         .000915\n",
+    "      tenure |    .0114355     .0134656       -.0020301        .0003729\n",
+    "    not_smsa |   -.0921689     -.128591        .0364221        .0064187\n",
+    "       south |   -.0633396    -.0932646         .029925        .0083977\n",
+    "------------------------------------------------------------------------------\n",
+    "                           b = consistent under Ho and Ha; obtained from xtreg\n",
+    "            B = inconsistent under Ha, efficient under Ho; obtained from xtreg\n",
+    "\n",
+    "    Test:  Ho:  difference in coefficients not systematic\n",
+    "\n",
+    "                  chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)\n",
+    "                          =      121.50\n",
+    "                Prob>chi2 =      0.0000"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.7.5"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}