Commit fd38383

Tyler Coles committed
Multi-strata simulation refactor.
Includes class-based refactor for initializer and parameter functions. Add spec to StaticGeo. Regenerate geos. Update vignettes for API changes. New initializers vignette.
1 parent 89fd46c commit fd38383

106 files changed, +6690 -4018 lines changed


.vscode/settings.json

+1-1
@@ -26,7 +26,7 @@
 
 "python.formatting.provider": "none",
 "python.analysis.autoImportCompletions": true,
-"python.analysis.typeCheckingMode": "basic",
+"python.analysis.typeCheckingMode": "standard",
 "python.analysis.diagnosticMode": "workspace",
 
 "python.testing.pytestEnabled": false,

README.md

+10-70
@@ -1,14 +1,18 @@
 # epymorph
 
-Prototype EpiMoRPH system written in Python. It is usable as a CLI program and also as a code library; for instance you may want to use it from within a Jupyter Notebook (see: `USAGE.ipynb`).
+Prototype EpiMoRPH system written in Python. It is usable as a code library, for instance, from within a Jupyter Notebook (see: `USAGE.ipynb`).
 
-The `epymorph/data` directory is the model library, containing named implementations of IPMs, MMs, and GEOs. Ultimately our goal is to allow users to bring-their-own models by loading specification files, but for now they need to be registered in the model library.
+The `epymorph/data` directory is the model library, containing named implementations of IPMs, MMs, and GEOs. Ultimately our goal is to allow users to bring their own models by loading specification files, but for now they need to be registered in the model library.
 
 The `doc/devlog` directory contains Jupyter Notebooks demonstrating features of epymorph and general development progress.
 
 Beware: much of this code structure is experimental and subject to change!
 
-## Project setup
+## Basic usage
+
+See the `USAGE.ipynb` Notebook for a simple usage example.
+
+## Development setup
 
 For starters, you should have Python 3.11 installed and we'll assume it's accessible via the command `python3.11`.
 

@@ -18,7 +22,7 @@ You may need to install additional system packages for virtual environments and
 sudo apt install python3.11-venv python3.11-tk
 ```
 
-Using VS Code, install the project's recommended IDE extensions. Then use the "Python - Create Environment" command (`Ctrl+Shift+P`) to create a Venv environment and install all dependencies (including `dev`).
+If you are using VS Code, install the project's recommended IDE extensions. Then use the "Python - Create Environment" command (`Ctrl+Shift+P`) to create a Venv environment and install all dependencies (including `dev`).
 
 Or you can set up from the command line:
 

@@ -29,64 +33,14 @@ cd $PROJECT_DIRECTORY
 python3.11 -m venv .venv
 
 # activate it (after which `python` should be bound to the venv python)
+# NOTE: activating venv on Windows is different; see documentation
 source .venv/bin/activate
 
 # then install the project in editable mode
 python -m pip install --editable ".[dev]"
 ```
 
-(The quotes in the last command are necessary on some platforms, but if the quotes give you issues you can also try without them.)
-
-## Running from the command line
-
-The most basic task epymorph can perform is to run a spatial, compartmental disease simulation and output the time-series data of compartment populations (prevalence) as well as new events (incidence).
-
-A commonly-cited model was proposed by [Sen Pei, et al. in 2018](https://www.pnas.org/doi/10.1073/pnas.1708856115), modeling influenza in six
-southern US states. epymorph has an intra-population model (IPM), movement model (MM), and geographic model (GEO) that closely mimics Pei's experiment.
-
-To run a simulation, first we need an input file that describes the simulation and all its parameters. Thankfully epymorph has a subcommand to help us create such a file. This example will store files in a `scratch` folder within the project (this is just for convenience, you can opt to put the input files anywhere you like).
-
-```bash
-cd $PROJECT_DIRECTORY
-
-# Activate the venv (if it's not already):
-source .venv/bin/activate
-
-# scratch is a convenient place to put all sorts of temp files because our .gitignore excludes it
-mkdir scratch
-
-# Prepare the simulation input file:
-python -m epymorph prepare --ipm pei --mm pei --geo pei ./scratch/my-experiment.toml
-
-# Now we need to edit the input file to specify the parameters needed by our combo of IPM and MM:
-# (I'll use `cat` for this but you can use any text editor of course.)
-cat << EOF >> ./scratch/my-experiment.toml
-theta = 0.1
-move_control = 0.9
-infection_duration = 4.0
-immunity_duration = 90.0
-EOF
-
-# Now we can run the simulation:
-python -m epymorph run ./scratch/my-experiment.toml --out ./scratch/output.csv
-
-# Now if I open that csv file I see:
-# - for each time-step (t) and population (p)
-# - prevalence data by compartment (c0, c1, c2)
-# - incidence data by event (e0, e1, e2)
-
-# You can also run to display a chart:
-python -m epymorph run ./scratch/my-experiment.toml --chart p0
-```
-
-To learn more about these and all other commands, you can always consult the CLI help:
-
-```bash
-python -m epymorph --help
-
-# or for a specific subcommand
-python -m epymorph run --help
-```
+Make sure you have correctly configured auto-formatting in your development environment. We're currently using autopep8 and isort. These formatting tools should run every time you save a file.
 
 ### Other command-line tasks
 

@@ -95,17 +49,3 @@ Run all unit tests:
 ```bash
 python -m unittest discover -v -s ./epymorph -p '*_test.py'
 ```
-
-Run a simulation with the pdb debugger:
-
-```bash
-python -m pdb -m epymorph run ./scratch/my-experiment.toml
-```
-
-Profile the simulation and show the results in `snakeviz`:
-
-```bash
-TMP=$(mktemp /tmp/py-XXXXXXXX.prof)
-python -m cProfile -o $TMP -m epymorph run ./scratch/my-experiment.toml --profile
-snakeviz $TMP
-```
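
The unit-test command retained in the README (`python -m unittest discover -v -s ./epymorph -p '*_test.py'`) collects any module under `./epymorph` whose filename ends in `_test.py`. As a rough illustration of what such a module might look like, here is a minimal sketch. The `clamp` helper and the suggested filename `clamp_test.py` are hypothetical and are not part of epymorph.

```python
import unittest


def clamp(value: float, low: float, high: float) -> float:
    """Hypothetical helper under test (not an epymorph function)."""
    return max(low, min(value, high))


class ClampTest(unittest.TestCase):
    """unittest discovery runs every method whose name starts with `test`."""

    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(0.5, 0.0, 1.0), 0.5)

    def test_value_outside_range_is_clipped(self):
        self.assertEqual(clamp(-1.0, 0.0, 1.0), 0.0)
        self.assertEqual(clamp(2.0, 0.0, 1.0), 1.0)
```

Saved as, say, `epymorph/clamp_test.py`, a file like this should match the `-p '*_test.py'` pattern and be run by the discovery command without any extra registration.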

USAGE.ipynb

+109-70
Large diffs are not rendered by default.

doc/demo/01-SIRH-IPM.ipynb

+16-14
Large diffs are not rendered by default.

doc/demo/02-states-GEO.ipynb

+35-36
Large diffs are not rendered by default.

doc/demo/03-counties-GEO.ipynb

+37-36
Large diffs are not rendered by default.

doc/demo/04-time-varying-beta.ipynb

+29-24
Large diffs are not rendered by default.

doc/demo/05-visualizing-mm.ipynb

+4-4
Large diffs are not rendered by default.

doc/devlog/2023-06-30.ipynb

+46-39
Large diffs are not rendered by default.

doc/devlog/2023-07-06.ipynb

+12-7
@@ -26,17 +26,22 @@
 "\n",
 "from epymorph.data_shape import Shapes\n",
 "from epymorph.data_type import CentroidDType\n",
-"from epymorph.geo.spec import StaticGeoSpec, Year, attrib\n",
+"from epymorph.geo.spec import StaticGeoSpec, Year\n",
+"from epymorph.geography.us_census import StateScope\n",
+"from epymorph.simulation import AttributeDef\n",
 "\n",
 "spec = StaticGeoSpec(\n",
 " attributes=[\n",
-" attrib('label', str, Shapes.N),\n",
-" attrib('geoid', str, Shapes.N),\n",
-" attrib('centroid', CentroidDType, Shapes.N),\n",
-" attrib('population', int, Shapes.N),\n",
-" attrib('commuters', int, Shapes.NxN),\n",
-" attrib('humidity', float, Shapes.TxN),\n",
+" AttributeDef('label', str, Shapes.N),\n",
+" AttributeDef('geoid', str, Shapes.N),\n",
+" AttributeDef('centroid', CentroidDType, Shapes.N),\n",
+" AttributeDef('population', int, Shapes.N),\n",
+" AttributeDef('commuters', int, Shapes.NxN),\n",
+" AttributeDef('humidity', float, Shapes.TxN),\n",
 " ],\n",
+" # critically: these states are listed here in GEOID order,\n",
+" # and we maintain that order when entering data below\n",
+" scope=StateScope.in_states_by_code(['FL', 'GA', 'MD', 'NC', 'SC', 'VA'], year=2015),\n",
 " time_period=Year(2015),\n",
 ")"
 ]
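
The comment added in this notebook says the six states are listed in GEOID order and that hand-entered data must follow the same order. A small standalone check of that invariant might look like the sketch below. It does not use epymorph's `StateScope`. The `STATE_FIPS` table is a hand-written subset of the standard Census state FIPS codes, and the helper names and sample values are hypothetical.

```python
# Standard Census FIPS codes for the six states named in the notebook.
STATE_FIPS = {"FL": "12", "GA": "13", "MD": "24", "NC": "37", "SC": "45", "VA": "51"}


def is_geoid_ordered(state_codes: list[str]) -> bool:
    """Return True if the postal codes are already sorted by their GEOID."""
    geoids = [STATE_FIPS[code] for code in state_codes]
    return geoids == sorted(geoids)


def reorder_by_geoid(state_codes: list[str], values: list) -> tuple[list[str], list]:
    """Sort two parallel lists (codes and per-state values) into GEOID order."""
    pairs = sorted(zip(state_codes, values), key=lambda pair: STATE_FIPS[pair[0]])
    return [code for code, _ in pairs], [value for _, value in pairs]


# The order used in the notebook passes the check.
assert is_geoid_ordered(["FL", "GA", "MD", "NC", "SC", "VA"])

# Placeholder values (not real data) entered out of order get realigned.
codes, values = reorder_by_geoid(["VA", "FL", "MD"], ["va-value", "fl-value", "md-value"])
```

Because these GEOIDs are fixed-width digit strings, plain string sorting agrees with numeric order, which is why `sorted` is enough here.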

doc/devlog/2023-07-07.ipynb

+24-14
@@ -29,23 +29,33 @@
 "from epymorph.data_shape import Shapes\n",
 "from epymorph.data_type import CentroidDType\n",
 "from epymorph.error import GeoValidationException\n",
-"from epymorph.geo.spec import LABEL, StaticGeoSpec, Year, attrib\n",
+"from epymorph.geo.spec import LABEL, StaticGeoSpec, Year\n",
 "from epymorph.geo.static import StaticGeo\n",
 "from epymorph.geo.static import StaticGeoFileOps as F\n",
+"from epymorph.geography.us_census import StateScope\n",
+"from epymorph.simulation import AttributeDef\n",
 "\n",
 "YEAR = 2015\n",
 "NUM_STATES = 52\n",
 "NUM_COUNTIES = 3220\n",
 "\n",
-"spec = StaticGeoSpec(\n",
+"state_scope = StateScope.all(year=YEAR)\n",
+"county_scope = state_scope.lower_granularity()\n",
+"\n",
+"# Both state and county geo will have the same attributes, just different scope.\n",
+"state_spec = StaticGeoSpec(\n",
 " attributes=[\n",
 " LABEL,\n",
-" attrib('geoid', str, Shapes.N),\n",
-" attrib('centroid', CentroidDType, Shapes.N),\n",
-" attrib('population', int, Shapes.N),\n",
-" attrib('commuters', int, Shapes.NxN)\n",
+" AttributeDef('geoid', str, Shapes.N),\n",
+" AttributeDef('centroid', CentroidDType, Shapes.N),\n",
+" AttributeDef('population', int, Shapes.N),\n",
+" AttributeDef('commuters', int, Shapes.NxN),\n",
 " ],\n",
-" time_period=Year(YEAR))\n",
+" scope=state_scope,\n",
+" time_period=Year(YEAR),\n",
+")\n",
+"\n",
+"county_spec = dataclasses.replace(state_spec, scope=county_scope)\n",
 "\n",
 "# Initialize Census API\n",
 "census = Census(os.environ['CENSUS_API_KEY'])"
@@ -91,7 +101,7 @@
 " \"wrk_state\",\n",
 " \"wrk_county\",\n",
 " \"workers\",\n",
-" \"moe\"\n",
+" \"moe\",\n",
 " ],\n",
 " dtype=str\n",
 ")\n",
@@ -142,7 +152,7 @@
 "d = pd.DataFrame.from_records(state_data).astype({\n",
 " 'NAME': np.str_,\n",
 " 'B01003_001E': np.int64,\n",
-" 'state': np.str_\n",
+" 'state': np.str_,\n",
 "})\n",
 "d.rename(columns={\n",
 " 'NAME': 'label',\n",
@@ -170,7 +180,7 @@
 " 'geoid': d['geoid'].to_numpy(dtype=np.str_),\n",
 " 'centroid': d['centroid'].to_numpy(dtype=CentroidDType),\n",
 " 'population': d['population'].to_numpy(dtype=np.int64),\n",
-" 'commuters': c\n",
+" 'commuters': c,\n",
 "}\n",
 "\n",
 "num_states = len(states_values['label'])\n",
@@ -186,7 +196,7 @@
 "source": [
 "geofile = Path('epymorph/data/geo') / F.to_archive_filename('us_states_2015')\n",
 "try:\n",
-" states_geo = StaticGeo(dataclasses.replace(spec), states_values)\n",
+" states_geo = StaticGeo(state_spec, states_values)\n",
 " states_geo.validate()\n",
 " states_geo.save(geofile)\n",
 "except GeoValidationException as e:\n",
@@ -201,7 +211,7 @@
 {
 "data": {
 "text/plain": [
-"<epymorph.geo.static.StaticGeo at 0x7ff85c5ebc50>"
+"<epymorph.geo.static.StaticGeo at 0x7f0c8a6dc390>"
 ]
 },
 "execution_count": 6,
@@ -286,7 +296,7 @@
 "source": [
 "geofile = Path('epymorph/data/geo') / F.to_archive_filename('us_counties_2015')\n",
 "try:\n",
-" counties_geo = StaticGeo(dataclasses.replace(spec), counties_values)\n",
+" counties_geo = StaticGeo(county_spec, counties_values)\n",
 " counties_geo.validate()\n",
 " counties_geo.save(geofile)\n",
 "except GeoValidationException as e:\n",
@@ -301,7 +311,7 @@
 {
 "data": {
 "text/plain": [
-"<epymorph.geo.static.StaticGeo at 0x7ff833500290>"
+"<epymorph.geo.static.StaticGeo at 0x7f0c8f1db850>"
 ]
 },
 "execution_count": 10,

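In the 2023-07-07 notebook diff above, `county_spec` is derived from `state_spec` with `dataclasses.replace`, so both specs share one attribute list and differ only in scope. The sketch below shows that standard-library pattern on its own. `ExampleSpec` is a hypothetical stand-in with made-up fields, not epymorph's `StaticGeoSpec`.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class ExampleSpec:
    """Hypothetical stand-in for a geo spec (not epymorph's StaticGeoSpec)."""

    attributes: tuple[str, ...]
    scope: str
    year: int


state_spec = ExampleSpec(
    attributes=("label", "geoid", "population"),
    scope="state",
    year=2015,
)

# replace() builds a new instance: named fields are overridden and every
# other field is copied from the original, which is left untouched.
county_spec = dataclasses.replace(state_spec, scope="county")

print(county_spec)
```

Because `replace` constructs a fresh object, this approach also works with frozen dataclasses, and the shared fields cannot drift apart the way two hand-written specs might.
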
doc/devlog/2023-07-12.ipynb

+60-24
@@ -26,29 +26,31 @@
 "from epymorph.data_shape import Shapes\n",
 "from epymorph.data_type import CentroidDType\n",
 "from epymorph.error import GeoValidationException\n",
-"from epymorph.geo.spec import LABEL, StaticGeoSpec, Year, attrib\n",
+"from epymorph.geo.spec import LABEL, StaticGeoSpec, Year\n",
 "from epymorph.geo.static import StaticGeo\n",
 "from epymorph.geo.static import StaticGeoFileOps as F\n",
+"from epymorph.geography.scope import ScopeFilter\n",
+"from epymorph.geography.us_census import BlockGroupScope\n",
+"from epymorph.simulation import AttributeDef\n",
 "\n",
 "YEAR = 2019\n",
 "NUM_COUNTIES = 2494\n",
 "\n",
-"spec = StaticGeoSpec(\n",
-" attributes=[\n",
-" LABEL,\n",
-" attrib('geoid', str, Shapes.N),\n",
-" attrib('centroid', CentroidDType, Shapes.N),\n",
-" attrib('population', int, Shapes.N),\n",
-" attrib('population_by_age', int, Shapes.NxA(3)),\n",
-" attrib('population_by_age_x6', int, Shapes.NxA(6)),\n",
-" attrib('median_age', float, Shapes.N),\n",
-" attrib('median_income', int, Shapes.N),\n",
-" attrib('average_household_size', float, Shapes.N),\n",
-" attrib('pop_density_km2', float, Shapes.N),\n",
-" attrib('tract_gini_index', float, Shapes.N),\n",
-" attrib('tract_median_income', int, Shapes.N),\n",
-" ],\n",
-" time_period=Year(YEAR))\n",
+"attributes: list[AttributeDef] = [\n",
+" LABEL,\n",
+" AttributeDef('geoid', str, Shapes.N),\n",
+" AttributeDef('centroid', CentroidDType, Shapes.N),\n",
+" AttributeDef('population', int, Shapes.N),\n",
+" # AttributeDef('population_by_age', int, Shapes.NxA(3)),\n",
+" # AttributeDef('population_by_age_x6', int, Shapes.NxA(6)),\n",
+" AttributeDef('median_age', float, Shapes.N),\n",
+" AttributeDef('median_income', int, Shapes.N),\n",
+" AttributeDef('average_household_size', float, Shapes.N),\n",
+" AttributeDef('pop_density_km2', float, Shapes.N),\n",
+" AttributeDef('tract_gini_index', float, Shapes.N),\n",
+" AttributeDef('tract_median_income', int, Shapes.N),\n",
+"]\n",
+"\n",
 "\n",
 "AGE_VARS = [\n",
 " \"B01001_003E\", # Population (Male) 0-4 years\n",
@@ -234,6 +236,26 @@
 "execution_count": 4,
 "metadata": {},
 "outputs": [
+{
+"data": {
+"text/plain": [
+"68 040139411001\n",
+"870 040139805001\n",
+"1223 040139804001\n",
+"1268 040131167331\n",
+"1444 040131138021\n",
+"1523 040131134001\n",
+"1661 040139807001\n",
+"1808 040137233061\n",
+"2054 040137233031\n",
+"2056 040139801001\n",
+"2273 040130610171\n",
+"Name: geoid, dtype: object"
+]
+},
+"metadata": {},
+"output_type": "display_data"
+},
 {
 "data": {
 "text/html": [
@@ -383,10 +405,14 @@
 "cbgs.drop(columns=['tract_geoid'], inplace=True)\n",
 "\n",
 "# Filter CBGs\n",
-"cbgs.drop(cbgs[\n",
+"dropped_cbgs = cbgs[\n",
 " (cbgs['median_income'] == 0) &\n",
 " (cbgs['tract_median_income'] == 0)\n",
-"].index, inplace=True)\n",
+"]\n",
+"\n",
+"display(dropped_cbgs['geoid'])\n",
+"\n",
+"cbgs.drop(dropped_cbgs.index, inplace=True)\n",
 "\n",
 "cbgs.sort_values(by='geoid', inplace=True)\n",
 "cbgs.reset_index(drop=True, inplace=True)\n",
@@ -547,8 +573,8 @@
 " 'geoid': cbgs['geoid'].to_numpy(dtype=np.str_),\n",
 " 'centroid': cbgs['centroid'].to_numpy(dtype=CentroidDType),\n",
 " 'population': cbgs['population'].to_numpy(dtype=np.int64),\n",
-" 'population_by_age': cbgs_age_1.to_numpy(dtype=np.int64),\n",
-" 'population_by_age_x6': cbgs_age_2.to_numpy(dtype=np.int64),\n",
+" # 'population_by_age': cbgs_age_1.to_numpy(dtype=np.int64),\n",
+" # 'population_by_age_x6': cbgs_age_2.to_numpy(dtype=np.int64),\n",
 " 'median_age': cbgs['median_age'].to_numpy(dtype=np.float64),\n",
 " 'median_income': cbgs['median_income'].to_numpy(dtype=np.int64),\n",
 " 'average_household_size': cbgs['average_household_size'].to_numpy(dtype=np.float64),\n",
@@ -568,6 +594,16 @@
 "metadata": {},
 "outputs": [],
 "source": [
+"spec = StaticGeoSpec(\n",
+" attributes=attributes,\n",
+" # Maricopa County, AZ is GEOID 04013\n",
+" scope=ScopeFilter(\n",
+" parent=BlockGroupScope.in_counties(['04013'], year=YEAR),\n",
+" remove=dropped_cbgs['geoid'].to_numpy(np.str_),\n",
+" ),\n",
+" time_period=Year(YEAR),\n",
+")\n",
+"\n",
 "try:\n",
 " geo = StaticGeo(spec, values)\n",
 " geo.validate()\n",
@@ -601,7 +637,6 @@
 "label: No diffs!\n",
 "population: No diffs!\n",
 "median_age: No diffs!\n",
-"population_by_age_x6: No diffs!\n",
 "median_income: No diffs!\n",
 "average_household_size: No diffs!\n",
 "tract_gini_index: No diffs!\n",
@@ -650,8 +685,9 @@
 "diff('population', d1['population'], d2['population'], np.int_.__eq__)\n",
 "diff('median_age', d1['median_age'], d2['median_age'], np.isclose)\n",
 "# there's no equivalent in the old geo to our new 'population_by_age'\n",
-"diff('population_by_age_x6', d1['population_by_age_x6'],\n",
-" d2['pop_by_age'], np.array_equal)\n",
+"# NOTE: we can't check population_by_age_x6 anymore...\n",
+"# diff('population_by_age_x6', d1['population_by_age_x6'],\n",
+"# d2['pop_by_age'], np.array_equal)\n",
 "diff('median_income', d1['median_income'], d2['median_income'], np.int_.__eq__)\n",
 "diff('average_household_size', d1['average_household_size'],\n",
 " d2['average_household_size'], np.isclose)\n",
