vignettes/mixed_reconciliation.Rmd
+20 -20 (lines changed: 20 additions & 20 deletions)
Original file line number
Diff line number
Diff line change
@@ -37,15 +37,15 @@ Sect. 5 of the paper presents the results for 10 stores, each reconciled
37
37
The M5 competition [@MAKRIDAKIS20221325] is about daily time series of sales data referring to 10 different stores.
38
38
Each store has the same hierarchy: 3049 bottom time series (single items) and 11 upper time series, obtained by aggregating the items by department, product category, and store; see the figure below.
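As a rough illustration of how such a hierarchy can be encoded, the sketch below builds an aggregation matrix for a toy store with four items. The item, department, and category labels are hypothetical, and the layout (one row per upper series, one column per bottom series) is an assumption made for this example; it is not the M5 data shipped with the package.

```r
# Toy hierarchy (hypothetical labels): 4 items, 3 departments, 2 categories, 1 store
dept  <- c("D1", "D1", "D2", "D3")  # department of each item
categ <- c("C1", "C1", "C1", "C2")  # category of each item

# One indicator row per group: 1 if the item belongs to the group, 0 otherwise
indicator_rows <- function(groups) {
  t(sapply(unique(groups), function(g) as.integer(groups == g)))
}

# Stack store, category, and department rows into a 6 x 4 aggregation matrix
A_toy <- rbind(store = rep(1L, length(dept)),
               indicator_rows(categ),
               indicator_rows(dept))

# Upper series are obtained by aggregating the bottom ones
items <- c(1, 2, 3, 4)
upper <- as.numeric(A_toy %*% items)
```

With 3049 items and 11 upper series the same construction would give an 11 x 3049 matrix.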
39
39
40
-
```{r out.width = '100%', echo = FALSE}
40
+
```{r M5hier, fig.cap="**Figure 1**: graph of the M5 hierarchy.", out.width = '100%', echo = FALSE}
41
41
knitr::include_graphics("img/M5store_hier.png")
42
42
```
43
43
44
44
We reproduce the results of the store "CA_1". The base forecasts (for h=1) of the bottom and upper time series are stored in `M5_CA1_basefc`, available as data in the package.
45
45
The base forecasts are computed using ADAM [@svetunkov2023iets], implemented in the R package smooth [@smooth_pkg].
46
46
47
47
48
-
```{r}
48
+
```{r InitializeHierarchy}
49
49
# Hierarchy composed of 3060 time series: 3049 bottom and 11 upper
50
50
n_b <- 3049
51
51
n_u <- 11
@@ -74,7 +74,7 @@ It assumes all forecasts to be Gaussian, even though the bottom base forecasts
74
74
We assume the upper base forecasts to be multivariate Gaussian and estimate their covariance matrix from the in-sample residuals. We also assume the bottom base forecasts to be independent Gaussians.
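To make the idea of a shrinkage covariance estimate concrete, here is a minimal base-R sketch on simulated residuals. The data, the diagonal shrinkage target, and the fixed intensity `lambda` are all assumptions made for illustration; a data-driven estimator would choose the intensity from the residuals instead.

```r
set.seed(42)
# Simulated in-sample residuals: 50 time points for 3 upper series (toy data)
res <- matrix(rnorm(50 * 3), nrow = 50, ncol = 3)
res[, 2] <- res[, 2] + 0.5 * res[, 1]  # induce some correlation

S      <- cov(res)        # sample covariance of the residuals
target <- diag(diag(S))   # diagonal target: keeps the variances only
lambda <- 0.2             # hypothetical fixed shrinkage intensity in [0, 1]

# Convex combination: variances are unchanged, covariances shrink towards zero
Sigma_shr <- lambda * target + (1 - lambda) * S
```

Shrinking towards a diagonal target leaves the variances untouched and keeps the estimate positive definite whenever the sample variances are positive and `lambda > 0`.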
75
75
76
76
77
-
```{r}
77
+
```{r readBaseForecasts}
78
78
# Parameters of the upper base forecast distributions
79
79
mu_u <- unlist(lapply(base_fc_upper, "[[", "mu")) # upper means
80
80
# Compute the shrinkage estimate of the covariance matrix of the residuals
@@ -107,7 +107,7 @@ We reconcile using the function `reconc_gaussian()`, which takes as input:
107
107
108
108
The function returns the reconciled mean and covariance for the bottom time series.
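For intuition, Gaussian reconciliation on a tiny hierarchy can be written in closed form. The sketch below is plain base R and does not call the package; the numbers are invented, and it assumes, as in the text, that the upper and bottom base forecasts are independent.

```r
# Toy hierarchy: one upper series that is the sum of two bottom series
A <- matrix(c(1, 1), nrow = 1)

# Hypothetical Gaussian base forecasts
mu_b    <- c(2, 3)          # bottom means
Sigma_b <- diag(c(1, 1))    # independent bottom forecasts
mu_u    <- 7                # upper mean
Sigma_u <- matrix(2)        # upper variance

# Condition the bottom forecasts on the coherence constraint u = A b
Q    <- Sigma_u + A %*% Sigma_b %*% t(A)    # variance of the incoherence u - A b
gain <- Sigma_b %*% t(A) %*% solve(Q)

mu_b_rec    <- as.numeric(mu_b + gain %*% (mu_u - A %*% mu_b))
Sigma_b_rec <- Sigma_b - gain %*% A %*% Sigma_b

# Reconciled upper forecast follows by aggregation
mu_u_rec    <- as.numeric(A %*% mu_b_rec)
Sigma_u_rec <- A %*% Sigma_b_rec %*% t(A)
```

Here the bottom-up mean (5) and the upper mean (7) disagree, and the reconciled upper mean lands between them, with a variance smaller than that of either base forecast.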
@@ -136,7 +136,7 @@ The algorithm is implemented in the function `reconc_MixCond()`. The function ta
136
136
137
137
The function returns the reconciled forecasts in the form of probability mass functions for both the upper and bottom time series. The function parameter `return_type` can be changed to `samples` or `all` to obtain the IS samples.
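The following sketch illustrates the importance-sampling idea on a toy hierarchy. It is not the package's implementation: the Poisson bottom forecasts, the Gaussian upper forecast, and all variable names are assumptions chosen for the example.

```r
set.seed(1)
N <- 1e4                          # number of importance-sampling draws (toy value)
A <- matrix(c(1, 1), nrow = 1)    # one upper series = sum of two bottom series

# 1. Sample the bottom series from their (discrete) base forecasts
B <- cbind(rpois(N, lambda = 2), rpois(N, lambda = 3))

# 2. Weight each draw by the Gaussian upper base forecast at the aggregated value
u_from_b <- as.numeric(B %*% t(A))
w <- dnorm(u_from_b, mean = 7, sd = sqrt(2))
w <- w / sum(w)

# 3. Resample with the normalised weights to get reconciled bottom samples
idx   <- sample.int(N, size = N, replace = TRUE, prob = w)
B_rec <- B[idx, , drop = FALSE]

# Empirical pmf of the first bottom series (support starts at 0)
pmf_b1 <- tabulate(B_rec[, 1] + 1) / N

# Effective sample size: a small value signals unreliable weights
ess <- 1 / sum(w^2)
```

The effective sample size is a useful diagnostic: it tends to shrink as the number of bottom series grows, because fewer draws agree with the upper forecasts.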
138
138
139
-
```{r}
139
+
```{r MixedCondReconciliation}
140
140
seed <- 1
141
141
N_samples_IS <- 5e4
142
142
@@ -169,7 +169,7 @@ Moreover, forecasts for count time series are usually biased and their sum tends
169
169
Top-down conditioning (TD-cond; see @zambon2024mixed, Sect. 4) is a more reliable approach for reconciling mixed variables in high dimensions.
170
170
The algorithm is implemented in the function `reconc_TDcond()`; it takes the same arguments as `reconc_MixCond()` and returns reconciled forecasts in the same format.
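A heavily simplified sketch of the top-down idea, for a hierarchy with a single upper series: first draw the upper total, then split each draw among the bottom series conditionally on that total. The Poisson bottom forecasts are an assumption that makes the conditional split binomial; the package's algorithm for generic probability mass functions is not reproduced here.

```r
set.seed(1)
N <- 1e4                       # number of samples (toy value)
lambda <- c(2, 3)              # hypothetical Poisson rates of the two bottom series

# 1. Sample the upper series from its Gaussian forecast, rounded to counts >= 0
u <- pmax(round(rnorm(N, mean = 7, sd = sqrt(2))), 0)

# 2. Split each upper draw: for independent Poissons, b1 | (b1 + b2 = u)
#    is Binomial(u, lambda1 / (lambda1 + lambda2))
b1 <- rbinom(N, size = u, prob = lambda[1] / sum(lambda))
b2 <- u - b1

# The samples are coherent by construction
B_td <- cbind(b1, b2)
```

Because the upper distribution is fixed before the split, the bottom forecasts cannot pull the upper ones away from their own reconciled values.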
171
171
172
-
```{r}
172
+
```{r TDcondReconciliation}
173
173
N_samples_TD <- 1e4
174
174
175
175
# TDcond reconciliation
@@ -182,7 +182,7 @@ stop <- Sys.time()
182
182
The algorithm TD-cond raises a warning regarding the incoherence between the joint bottom-up and the upper base forecasts.
183
183
We will see that this warning does not impact the performance of TD-cond.
184
184
185
-
```{r}
185
+
```{r print computational time TD cond}
186
186
rec_fc$TD_cond <- list(
187
187
bottom = td$bottom_reconciled$pmf,
188
188
upper = td$upper_reconciled$pmf
@@ -206,7 +206,7 @@ For each time series in the hierarchy, we compute the following scores for each
206
206
207
207
- RPS: Ranked Probability Score
208
208
209
-
```{r}
209
+
```{r InitializeMetrics}
210
210
# Parameters for computing the scores
211
211
alpha <- 0.1 # MIS uses 90% coverage intervals
212
212
jitt <- 1e-9 # jitter for numerical stability
@@ -236,7 +236,7 @@ The following functions are used for computing the scores:
236
236
237
237
The implementation of these functions is available in the source code of the vignette but not shown here.
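For readers who want a feel for two of these scores, here is a minimal sketch for forecasts given as probability mass functions on 0, 1, 2, and so on. The function names are hypothetical and these are not the vignette's own implementations.

```r
# Ranked Probability Score of a pmf on 0, 1, ..., length(pmf) - 1
rps_pmf <- function(pmf, y) {
  k   <- seq_along(pmf) - 1
  cdf <- cumsum(pmf)
  sum((cdf - as.numeric(y <= k))^2)
}

# Mean Interval Score of the central (1 - alpha) interval of a pmf
mis_pmf <- function(pmf, y, alpha = 0.1) {
  k   <- seq_along(pmf) - 1
  cdf <- cumsum(pmf)
  l <- k[which(cdf >= alpha / 2)[1]]        # lower quantile
  u <- k[which(cdf >= 1 - alpha / 2)[1]]    # upper quantile
  (u - l) +
    (2 / alpha) * (l - y) * (y < l) +       # penalty if y falls below the interval
    (2 / alpha) * (y - u) * (y > u)         # penalty if y falls above the interval
}

# Usage: a point mass at 2 and a uniform forecast on 0..3
point_mass <- c(0, 0, 1)
uniform    <- rep(0.25, 4)
rps_pmf(point_mass, y = 0)
mis_pmf(uniform, y = 1)
```

Both scores are negatively oriented: lower values indicate better forecasts.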
The mean MASE skill score is positive only for the TD-cond reconciliation. Both Mix-cond and Gauss achieve lower scores than the base forecasts, although Mix-cond degrades the base forecasts less than Gauss does.
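To read these numbers, it helps to recall how a skill score relates a reconciled forecast to the base forecast. The sketch below assumes a symmetric percentage skill score, which is bounded between -200 and 200; this definition and the function name are assumptions for illustration, not taken from the text above.

```r
# Symmetric skill score (in %): positive when the reconciled score improves
# on the base score, for negatively oriented scores such as MASE, MIS, RPS
skill_score_sketch <- function(base, rec) {
  (base - rec) / ((base + rec) / 2) * 100
}

skill_score_sketch(base = 2, rec = 1)   # reconciliation halves the error
skill_score_sketch(base = 1, rec = 1)   # no change
skill_score_sketch(base = 0, rec = 1)   # worst possible degradation
```

Under this definition a value of zero means the reconciled and base forecasts score identically.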
379
379
380
-
```{r}
380
+
```{r PrintMIStable}
381
381
knitr::kable(mean_skill_scores$mis,digits = 2,caption = "Mean skill score on MIS.")
382
382
```
383
383
The mean MIS score of TD-cond is slightly above that of the base forecasts. Mix-cond achieves slightly higher scores than the base forecasts only on the bottom variables. Gauss strongly degrades the base forecasts according to this metric.
384
384
385
-
```{r}
385
+
```{r printRPStable}
386
386
knitr::kable(mean_skill_scores$rps,digits = 2,caption = "Mean skill score on RPS.")
387
387
```
388
388
The mean RPS skill score for TD-cond is positive for both upper and bottom time series. Mix-cond slightly improves the base forecasts on the bottom variables, but it degrades the upper base forecasts. Gauss strongly degrades both upper and bottom base forecasts.
@@ -391,7 +391,7 @@ The mean RPS skill score for TD-cond is positive for both upper and bottom time
391
391
392
392
Finally, we show the boxplots of the skill scores for each method, divided into upper and bottom levels.
393
393
394
-
```{r,fig.width=7,fig.height=8}
394
+
```{r MASEboxplots, fig.cap="**Figure 2**: boxplot of MASE skill scores for upper and bottom time series.", fig.width=7,fig.height=8}
395
395
custom_colors <- c("#a8a8e4",
396
396
"#a9c7e4",
397
397
"#aae4df")
@@ -405,13 +405,13 @@ boxplot(skill_scores$mase$bottom, main = "MASE bottom time series",
405
405
col = custom_colors, ylim = c(-200,200))
406
406
abline(h=0,lty=3)
407
407
```
408
-
```{r,eval=TRUE,include=FALSE}
408
+
```{r setupParams,eval=TRUE,include=FALSE}
409
409
par(mfrow = c(1, 1))
410
410
```
411
411
412
412
Neither Mix-cond nor TD-cond improves the bottom MASE over the base forecasts (boxplot flattened at zero); however, TD-cond provides a slight improvement over the upper base forecasts (boxplot above the zero line).
413
413
414
-
```{r,fig.width=7,fig.height=8}
414
+
```{r MISboxplots, fig.cap="**Figure 3**: boxplot of MIS skill scores for upper and bottom time series.", fig.width=7,fig.height=8}
415
415
# Boxplots of MIS skill scores
416
416
par(mfrow = c(2, 1))
417
417
boxplot(skill_scores$mis$upper, main = "MIS upper time series",
@@ -428,7 +428,7 @@ par(mfrow = c(1, 1))
428
428
Mix-cond and TD-cond neither improve nor degrade the bottom base forecasts in terms of MIS, as shown by the small boxplots centered around zero. On the upper variables, instead, only TD-cond does not degrade the MIS of the base forecasts.
429
429
430
430
431
-
```{r,fig.width=7,fig.height=8}
431
+
```{r RPSboxplots,fig.cap="**Figure 4**: boxplot of RPS skill scores for upper and bottom time series.", fig.width=7,fig.height=8}
432
432
# Boxplots of RPS skill scores
433
433
par(mfrow = c(2,1))
434
434
boxplot(skill_scores$rps$upper, main = "RPS upper time series",