Data issue: problems with county-level export of NY congressional elections #9

Open
42 tasks
lmullen opened this issue Aug 17, 2016 · 3 comments

Comments

lmullen commented Aug 17, 2016

This is a running list of the problems with the county-level data for NY congressional elections:

  • ny.uscongress3.1822.xml does not have counties: it has only New York City.
  • Running the check aggregates script for NY 1822 fails, probably because of district 3.
  • ny.uscongress3.1824.xml does not have counties: it has only New York City.
  • Running the check aggregates script for NY 1824 fails, probably because of district 3.
  • ny.uscongress1.1789: no county-level vote totals for Kings, Queens, Richmond, and Suffolk counties, though there are district-level totals.
  • ny.uscongress6.1789: no county-level vote totals for Albany and Montgomery counties, though there are district-level totals.
  • ny.uscongress3.1789: no county-level vote totals for Dutchess and Westchester counties, though there are district-level totals.
  • ny.uscongress7.1796: Discrepancy in whether 29 votes or 167 votes are counted for Henry Glenn in district 7. The district uses the larger number, but the county has the smaller number with a reference to the larger number in a note. (Question for Phil.)
  • ny.uscongress8.1798: No county-level data is available. (An unopposed election.)
  • ny.uscongress8.1798: Scattering is not recorded in the county total.
  • ny.uscongress3.1798: New York City needs to be treated like its own county.
  • ny.uscongress4.1798: Minor discrepancy of 47 votes for Lucas Elmendorf in district 4. Probably can't be resolved, but note that John Hathorn received exactly 47 votes in Delaware county. Perhaps that county reported the same votes for both Elmendorf and Hathorn, since they were both Republicans? (Question for Phil.)
  • ny.uscongress6.1798: Vote total for Elisha Jenkins differs by 11 (< 1%). Probably can't be resolved.
  • ny.uscongress1.1800: Vote total for Silas Wood differs by 100. Arithmetic error or just a difference in the newspapers? (Question for Phil.)
  • ny.uscongress4.1800: Elmendorf is listed under two different names. In addition, Elmendorf Jr (the same person) has a discrepancy between his county total and his district total. Probably can't be resolved. Minor differences for insignificant candidates.
  • ny.uscongress6.1800: Woodworth got 52 votes in one county, which is recorded in the county results but not in the district results. That is most likely an error in the XML file.
  • ny.uscongress9.1800: Minor discrepancy for insignificant candidate Nathaniel King.
  • ny.uscongress10.1800: Minor discrepancy for a few insignificant candidates.
  • ny.uscongress11.1804: The district total for Adam Comstock is probably wrong, and the county total is probably correct.
  • ny.uscongress15.1804: Probably the county total is correct and the district total is wrong for Henry Huntington.
  • ny.uscongress17.1804: Peter Hughes is listed twice. Should be combined and the totals checked.
  • ny.uscongress11.1804: Probably county total for John Thompson is correct.
  • ny.uscongress16.1804: Probably the county total for Jedediah Peck is correct.
  • ny.uscongress17.1804: Probably the county totals for Jedediah Peck, Jedediah Sanger, John Sayre, and Silas Kent are correct.
  • ny.uscongress17.1806: Silas Halsey has a discrepancy of 300 votes, and it's not clear why. Daniel Lewis has a similar discrepancy of 103 votes, John Harris has a discrepancy of 50 votes, and Septimus Evans has a discrepancy of 22 votes. In all cases the district total is less than the county total.
  • ny.uscongress15.1808: District totals are not reported for this election, which is an error in the NNV data.
  • ny.uscongress6.1810: Roger Skinner has a discrepancy of 39 votes; it's not clear which is correct.
  • ny.uscongress14.1810: A discrepancy of 30 votes for Daniel Avery seems to be based on different newspaper reports.
  • ny.uscongress13.1810: Joshua Forman's county total is probably correct, not the district total.
  • ny.uscongress14.1812: District totals are not reported for this election, which is an error in NNV.
  • ny.uscongress12.1812: Now this one is curious. The discrepancy of -368 for William Livingstone exactly offsets the combined discrepancies of 361 for Melancton Smith and 7 for Roger Skinner. Was there a transcription error in the XML, with votes being assigned to the wrong person?
  • ny.uscongress21.1812: NNV reports two people as elected. Is that right? A discrepancy of 100 for Samuel Hopkins looks like a transcription error.
  • ny.uscongress5.1814: Difference of 100 votes for Edward Livingstone looks like a transcription error.
  • ny.uscongress21.1814: Hard to say why there is a discrepancy of 64 votes for Michah Brooks, except there is a note about him getting 64 votes under a different name. Perhaps those votes were not counted in the newspaper reporting the final totals?
  • ny.uscongress3.1814: Discrepancies for most of the minor candidates.
  • ny.uscongress1.1816: Unlike in previous elections, New York City is treated as a city under a New York county, but no votes are reported for the county. So the city votes should count for the county, and that will fix the discrepancies.
  • ny.uscongress20.1816: Almost all the candidates have discrepancies which I can't seem to account for.
  • ny.uscongress21.1818: The reason for the discrepancies for Nathaniel Allen, Albert Tracey, and Hastings Bender is not clear.
  • ny.uscongress5.1818: 10-vote discrepancy for James Strong, perhaps a transcription error.
  • ny.uscongress1.1821: Minor discrepancies, but it is not clear why.
  • ny.uscongress4.1821: Minor discrepancies, but it is not clear why.
  • ny.uscongress3.1821: Minor discrepancies, but it is not clear why.
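
Most of the items above come down to the same comparison: add up a candidate's county totals and compare the sum with the reported district total. A minimal sketch of that comparison follows. It is not the actual check aggregates script; the function names, the tuple layout, and the sample numbers are all assumptions made for illustration.

```python
from collections import defaultdict


def county_district_discrepancies(county_votes, district_totals):
    """Return {candidate: county_sum - district_total} for every mismatch.

    county_votes: iterable of (county, candidate, votes) tuples (assumed layout).
    district_totals: dict mapping candidate -> reported district total.
    """
    county_sums = defaultdict(int)
    for _county, candidate, votes in county_votes:
        county_sums[candidate] += votes

    discrepancies = {}
    for candidate in set(county_sums) | set(district_totals):
        diff = county_sums.get(candidate, 0) - district_totals.get(candidate, 0)
        if diff != 0:
            discrepancies[candidate] = diff
    return discrepancies


def offsetting(discrepancies):
    """True when the mismatches cancel out, which hints at misassigned votes."""
    return bool(discrepancies) and sum(discrepancies.values()) == 0


# Made-up numbers for illustration only; they are not taken from the NNV data.
county_votes = [
    ("County A", "Candidate X", 500),
    ("County B", "Candidate X", 300),
    ("County A", "Candidate Y", 200),
    ("County B", "Candidate Y", 100),
]
district_totals = {"Candidate X": 750, "Candidate Y": 350}

result = county_district_discrepancies(county_votes, district_totals)
print(result)
print(offsetting(result))
```

A positive value means the county sum exceeds the district total. The `offsetting` helper flags cases like ny.uscongress12.1812, where the differences cancel out and a transcription error between candidates is a plausible explanation.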

lmullen commented Aug 17, 2016

This is a list of the combination elections in a given year that need to be checked for NY. (Checked does not necessarily imply that the problems have been fixed.)

  • 1789
  • 1790
  • 1793
  • 1794
  • 1796
  • 1798
  • 1800
  • 1802
  • 1804
  • 1806: Minor discrepancies of less than 0.5% of candidate totals.
  • 1808: Minor discrepancies of a single vote for some candidates.
  • 1810: Minor discrepancies of a single vote for candidates who got either a lot of votes or only one vote.
  • 1812: Scattering of one-vote differences.
  • 1814: Scattering of one-vote differences.
  • 1816: Scattering of one-vote differences.
  • 1818: Scattering of one-vote differences.
  • 1821: Scattering of one-vote differences.
  • 1822
  • 1824
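
The notes above separate minor discrepancies (a single vote, or less than 0.5% of a candidate's total) from ones that need a closer look. One way to apply that rule consistently is sketched below. The function name, the labels, and both default tolerances are assumptions taken from the wording of this list, not from any existing script.

```python
def classify_discrepancy(county_sum, district_total, rel_tol=0.005, abs_tol=1):
    """Label a county-vs-district comparison as 'match', 'minor', or 'major'.

    rel_tol: largest share of the district total still treated as minor
             (0.005 mirrors the 0.5% mentioned for 1806; an assumption).
    abs_tol: largest absolute vote difference treated as minor regardless of
             share (1 mirrors the one-vote differences; an assumption).
    """
    diff = abs(county_sum - district_total)
    if diff == 0:
        return "match"
    if diff <= abs_tol:
        return "minor"
    if district_total == 0:
        # No district total to compare against, so a share cannot be computed.
        return "major"
    return "minor" if diff / district_total < rel_tol else "major"


print(classify_discrepancy(1000, 1000))
print(classify_discrepancy(1003, 1000))
print(classify_discrepancy(2, 1))
print(classify_discrepancy(1100, 1000))
```

The absolute tolerance is there because a one-vote difference for a candidate who received only one vote is a 100% relative difference, yet the list above still treats it as minor.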

lmullen commented Aug 17, 2016

Some thoughts on the missing county-level data, when we know the district total for the race. We could just map the counties as NA values. But that's not quite accurate if we do have some idea of the number of votes. We could instead estimate each county's vote by dividing the district total in proportion to county population. That is likely to be an oversimplification, and it only works if we know the county populations.
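
To make the population-based option concrete, here is a minimal sketch that splits a known district total across counties in proportion to population. It uses largest-remainder rounding so the estimates still add up to the district total. The function name, the county names, and the population figures are hypothetical, and any output would be an estimate rather than a reported count.

```python
def allocate_by_population(district_total, county_populations):
    """Split district_total across counties in proportion to population.

    Uses integer arithmetic and largest-remainder rounding so that the
    per-county estimates sum exactly to district_total.
    """
    total_pop = sum(county_populations.values())
    if total_pop <= 0:
        raise ValueError("county populations must sum to a positive number")

    floors = {}
    remainders = {}
    for county, pop in county_populations.items():
        quota = district_total * pop
        floors[county] = quota // total_pop
        remainders[county] = quota % total_pop

    # Hand the votes lost to rounding down to the largest remainders first.
    leftover = district_total - sum(floors.values())
    ranked = sorted(remainders, key=remainders.get, reverse=True)
    for county in ranked[:leftover]:
        floors[county] += 1
    return floors


# Hypothetical populations, for illustration only.
populations = {"County A": 3000, "County B": 2000, "County C": 1000}
estimate = allocate_by_population(1000, populations)
print(estimate)
```

Keeping these values in a separate column from the reported totals would let a map distinguish estimated counties from counties with actual returns.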

lmullen commented Aug 17, 2016

What is to be done about New York City? It is geographically too small and demographically too big.
