Commit 214e746

committed
docs(updated-workshop-for-DCCEEW): updated workshop for DCEEW
1 parent 94be735 commit 214e746

20 files changed

+1275
-331
lines changed

_episodes/03-Query-Template.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -39,7 +39,7 @@ A query can be as simple as the following:
3939
```
4040
```output
4141
totalRecords
42-
0 130570372
42+
0 150408171
4343
```
4444

4545
This returns the total number of records in the Atlas of Living Australia. This will change weekly, so this number will likely be higher for you!

_episodes/04-Building-An-Example-Query.md

+5-5
Original file line numberDiff line numberDiff line change
@@ -79,7 +79,7 @@ galah.atlas_counts(
7979
```
8080
```output
8181
totalRecords
82-
0 95886
82+
0 88184
8383
```
8484

8585
### Filter our data by year
@@ -137,10 +137,10 @@ galah.atlas_counts(
137137
```
138138
```output
139139
totalRecords
140-
0 69468
140+
0 62273
141141
```
142142

143-
As this is less than the 95844 records that were shown above, we can see we have already filtered the data.
143+
As this is less than the 88184 records that were shown above, we can see we have already filtered the data.
144144

145145
### Adding other filters: Australian States
146146

@@ -199,7 +199,7 @@ galah.atlas_counts(
199199
```
200200
```output
201201
totalRecords
202-
0 61984
202+
0 47743
203203
```
204204

205205
### Adding other filters: Data Resources
@@ -284,5 +284,5 @@ galah.atlas_counts(
284284
```
285285
```output
286286
totalRecords
287-
0 27969
287+
0 39840
288288
```

_episodes/05-Group-By-Example.md

+151
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,151 @@
1+
---
2+
title: "Grouping counts to gain a deeper understanding of the data"
3+
start: true
4+
teaching: 10
5+
exercises: 10
6+
questions:
7+
- "What does \"grouping counts\" mean?"
8+
- "How can I use it to give me a better understanding of the data?"
9+
objectives:
10+
- "Understand what \"grouping counts\" means"
11+
- "Learn how to group ALA data and interpret it"
12+
keypoints:
13+
- "Grouping data can provide valuable insights into what kind of data is available on the ALA"
14+
- "This grouping can also serve to better filter your queries"
15+
---
16+
17+
# Group counts by fields
18+
19+
When looking into data such as species occurrences, there may be angles that are hidden by the raw counts of records in the ALA. For example, we could see in our previous query that the number of records for *Litoria peronii* since 2018 in NSW dropped from 47743 to 39840 when we specified that we only want records documented by FrogID. But what other data resources are we leaving out, and how many records is each of them responsible for?
20+
21+
To do this, we will use the `group_by` option in `atlas_counts()`. Any of the fields specified for `filters` can be used in `group_by`. To group your counts, add `group_by="dataResourceName"` to your query, as well as `expand=False` (the `expand` argument will be explained in detail below):
22+
23+
```python
24+
galah.atlas_counts(
25+
taxa="litoria peronii",
26+
filters=["year>=2018",
27+
"cl22=New South Wales"],
28+
group_by="dataResourceName",
29+
expand=False
30+
)
31+
```
32+
```output
33+
dataResourceName count
34+
0 FrogID 39840
35+
1 NSW BioNet Atlas 4882
36+
2 iNaturalist Australia 2578
37+
3 NatureMapr 249
38+
4 Earth Guardians Weekly Feed 151
39+
5 ALA species sightings and OzAtlas 16
40+
6 Victorian Biodiversity Atlas 10
41+
7 FrogWatch SA 6
42+
8 Australian Museum provider for OZCAM 4
43+
9 BowerBird 3
44+
10 Melbourne Water Frog Census 2
45+
11 SA Fauna 2
46+
```
47+
48+
We can see that 12 data resources have provided the ALA with observations of *Litoria peronii*, and that FrogID provides by far the most observations!
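To put the grouped counts in perspective, it can help to turn them into shares of the total. The following is a minimal sketch in plain Python, not a `galah` feature: the counts are copied by hand from the output above, and the `shares` helper is a hypothetical name introduced only for this illustration.

```python
# Counts copied by hand from the grouped output above
# (data resource name -> number of records).
counts = {
    "FrogID": 39840,
    "NSW BioNet Atlas": 4882,
    "iNaturalist Australia": 2578,
    "NatureMapr": 249,
    "Earth Guardians Weekly Feed": 151,
    "ALA species sightings and OzAtlas": 16,
    "Victorian Biodiversity Atlas": 10,
    "FrogWatch SA": 6,
    "Australian Museum provider for OZCAM": 4,
    "BowerBird": 3,
    "Melbourne Water Frog Census": 2,
    "SA Fauna": 2,
}


def shares(counts):
    """Return each group's fraction of the total record count (hypothetical helper)."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


total = sum(counts.values())
print(f"Total records across all data resources: {total}")
for name, share in shares(counts).items():
    print(f"{name:<40}{share:8.2%}")
```

Because the ALA is updated regularly, the counts you get from your own query will likely differ from the ones hard-coded here.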
49+
50+
Now, in the query above, we specified that we want records since 2018. However, we can also see how many records came from each year by adding `year` to the `group_by` arguments.
51+
52+
```python
53+
galah.atlas_counts(
54+
taxa="litoria peronii",
55+
filters=["year>=2018",
56+
"cl22=New South Wales"],
57+
group_by=["dataResourceName","year"],
58+
expand=False
59+
)
60+
```
61+
```output
62+
dataResourceName year count
63+
0 FrogID - 39840
64+
1 NSW BioNet Atlas - 4882
65+
2 iNaturalist Australia - 2578
66+
3 NatureMapr - 249
67+
4 Earth Guardians Weekly Feed - 151
68+
5 ALA species sightings and OzAtlas - 16
69+
6 Victorian Biodiversity Atlas - 10
70+
7 FrogWatch SA - 6
71+
8 Australian Museum provider for OZCAM - 4
72+
9 BowerBird - 3
73+
10 Melbourne Water Frog Census - 2
74+
11 SA Fauna - 2
75+
12 - 2018 5200
76+
13 - 2019 5469
77+
14 - 2020 13358
78+
15 - 2021 14469
79+
16 - 2022 7506
80+
17 - 2023 817
81+
18 - 2024 762
82+
19 - 2025 162
83+
```
84+
85+
Now, we not only have the data resources providing observations of *Litoria peronii*, we can also see how many observations there were per year.
86+
87+
But what if you wanted to know, for each year, how many records each data resource provided?
88+
89+
This is where the `expand=True` option comes in. This option tells `galah-python` that you want to see the number of observations for each data resource in each year specified.
90+
91+
#### Note: `expand=True` is the default, and it only works when `group_by` contains more than one field; otherwise, you will get an error.
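One way to avoid that error is to decide on `expand` before building the query. The sketch below assumes the behaviour described in the note; `count_kwargs` is a hypothetical helper written for this illustration, not part of `galah-python`, and it only assembles keyword arguments without contacting the ALA.

```python
def count_kwargs(group_by, **query):
    """Assemble keyword arguments for a counts query (hypothetical helper).

    Following the note above, `expand` is switched off whenever only one
    field is given in `group_by`, and left at its default otherwise.
    """
    # Accept either a single field name or a list of field names.
    fields = [group_by] if isinstance(group_by, str) else list(group_by)
    kwargs = dict(query)
    kwargs["group_by"] = fields if len(fields) > 1 else fields[0]
    if len(fields) < 2:
        kwargs["expand"] = False
    return kwargs


single = count_kwargs("dataResourceName", taxa="litoria peronii")
double = count_kwargs(["dataResourceName", "year"], taxa="litoria peronii")
print(single)
print(double)
```

The resulting dictionaries could then be passed on as `galah.atlas_counts(**single)`, assuming `galah` has been imported and configured as in the earlier episodes.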
92+
93+
```python
94+
galah.atlas_counts(
95+
taxa="litoria peronii",
96+
filters=["year>=2018",
97+
"cl22=New South Wales"],
98+
group_by=["dataResourceName","year"],
99+
)
100+
```
101+
```output
102+
dataResourceName year count
103+
0 FrogID 2018 4154
104+
1 FrogID 2019 4382
105+
2 FrogID 2020 12248
106+
3 FrogID 2021 12851
107+
4 FrogID 2022 6205
108+
5 NSW BioNet Atlas 2018 850
109+
6 NSW BioNet Atlas 2019 872
110+
7 NSW BioNet Atlas 2020 808
111+
8 NSW BioNet Atlas 2021 1244
112+
9 NSW BioNet Atlas 2022 840
113+
10 NSW BioNet Atlas 2023 205
114+
11 NSW BioNet Atlas 2024 63
115+
12 iNaturalist Australia 2018 108
116+
13 iNaturalist Australia 2019 113
117+
14 iNaturalist Australia 2020 227
118+
15 iNaturalist Australia 2021 321
119+
16 iNaturalist Australia 2022 409
120+
17 iNaturalist Australia 2023 576
121+
18 iNaturalist Australia 2024 665
122+
19 iNaturalist Australia 2025 159
123+
20 NatureMapr 2018 37
124+
21 NatureMapr 2019 48
125+
22 NatureMapr 2020 47
126+
23 NatureMapr 2021 24
127+
24 NatureMapr 2022 27
128+
25 NatureMapr 2023 33
129+
26 NatureMapr 2024 30
130+
27 NatureMapr 2025 3
131+
28 Earth Guardians Weekly Feed 2018 30
132+
29 Earth Guardians Weekly Feed 2019 43
133+
30 Earth Guardians Weekly Feed 2020 24
134+
31 Earth Guardians Weekly Feed 2021 27
135+
32 Earth Guardians Weekly Feed 2022 22
136+
33 Earth Guardians Weekly Feed 2023 1
137+
34 Earth Guardians Weekly Feed 2024 4
138+
35 ALA species sightings and OzAtlas 2018 7
139+
36 ALA species sightings and OzAtlas 2019 5
140+
37 ALA species sightings and OzAtlas 2020 1
141+
38 ALA species sightings and OzAtlas 2022 3
142+
39 Victorian Biodiversity Atlas 2018 5
143+
40 Victorian Biodiversity Atlas 2019 5
144+
41 FrogWatch SA 2019 1
145+
42 FrogWatch SA 2020 3
146+
43 FrogWatch SA 2023 2
147+
44 Australian Museum provider for OZCAM 2018 4
148+
45 BowerBird 2018 3
149+
46 Melbourne Water Frog Census 2018 2
150+
47 SA Fauna 2021 2
151+
```
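The expanded table and the earlier grouped table describe the same records, so summing the expanded rows for one data resource should recover its earlier total. The following is a minimal sketch in plain Python that checks this for two of the data resources; the rows are copied by hand from the output above, and `totals_by` is a hypothetical helper introduced for this illustration.

```python
from collections import defaultdict

# A subset of (dataResourceName, year, count) rows copied from the
# expanded output above: all FrogID rows and all NatureMapr rows.
rows = [
    ("FrogID", 2018, 4154),
    ("FrogID", 2019, 4382),
    ("FrogID", 2020, 12248),
    ("FrogID", 2021, 12851),
    ("FrogID", 2022, 6205),
    ("NatureMapr", 2018, 37),
    ("NatureMapr", 2019, 48),
    ("NatureMapr", 2020, 47),
    ("NatureMapr", 2021, 24),
    ("NatureMapr", 2022, 27),
    ("NatureMapr", 2023, 33),
    ("NatureMapr", 2024, 30),
    ("NatureMapr", 2025, 3),
]


def totals_by(rows, index):
    """Sum the count column, keyed by the column at position `index` (hypothetical helper)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[index]] += row[2]
    return dict(totals)


per_resource = totals_by(rows, 0)  # collapse the year column
per_year = totals_by(rows, 1)      # collapse the data resource column
print(per_resource)
print(per_year)
```

The per-resource sums match the FrogID and NatureMapr counts in the `expand=False` output. The per-year sums here cover only these two data resources, so they are smaller than the yearly totals shown earlier.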

_episodes/05-Taxonomy-Examples.md

Whitespace-only changes.

_episodes/08-Make-a-Map.md _episodes/06-Make-a-Map.md

+12-3
Original file line numberDiff line numberDiff line change
@@ -110,15 +110,24 @@ states = gpd.read_file("STE_2021_AUST_GDA94.shp")
110110

111111
# Change Coordinate Reference System (CRS) of the shape file and plot New South Wales
112112
states = states.to_crs(4326)
113-
states[states["STE_NAME21"] == "New South Wales"].plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white", figsize = (24,10))
113+
states[states["STE_NAME21"] == "New South Wales"].plot(edgecolor = "#5A5A5A", linewidth = 1.0, facecolor = "white", figsize = (24,10))
114114
```
115115

116116
![](../fig/states_map.png)
117117

118118
```python
119119
# Add occurrence records to the map
120-
ax = states[states["STE_NAME21"] == "New South Wales"].plot(edgecolor = "#5A5A5A", linewidth = 0.5, facecolor = "white", figsize = (24,10))
120+
ax = states[states["STE_NAME21"] == "New South Wales"].plot(edgecolor = "#5A5A5A", linewidth = 1.0, facecolor = "white", figsize = (24,10))
121121
plt.scatter(frogs['decimalLongitude'],frogs['decimalLatitude'], c = "#6fab3f", alpha = 0.5)
122122
```
123123

124-
![](../fig/frog_occurrences.png)
124+
![](../fig/frog_occurrences.png)
125+
126+
```python
127+
# Add final touches to figure
128+
plt.suptitle("Peron's tree frog",fontsize=24)
129+
plt.title("FrogID observations in New South Wales since 2018",fontsize=16)
130+
plt.axis('off')
131+
```
132+
133+
![](../fig/frog_occurrences_labels.png)

_episodes/07-Group-By-Example.md

-147
This file was deleted.
