## Metadata

- Author: Kamarthi, Harshavardhan, Alexander Rodríguez, and B. Aditya Prakash #people/kamarthi-harshavardhan
- Full Title: Back2Future: Leveraging Backfill Dynamics for Improving Real-Time Predictions in Future
- Category: #papers
- Document Tags: #machine-learning #neural-networks #nowcasting
- DOI: [10.48550/arXiv.2106.04420](https://doi.org/10.48550/arXiv.2106.04420)

## Highlights

- Traditional mechanistic epidemiological models (Shaman & Karspeck, 2012; Zhang et al., 2017), and the fairly newer statistical approaches (Brooks et al., 2018; Adhikari et al., 2019; Osthus et al., 2019b) including deep learning models (Page 1)
  - Note: Missing semi-mechanistic models.
- While previous works have addressed anomalies (Liu et al., 2017), missing data (Yin et al., 2020), and data delays (Žliobaite, 2010) in general time-series problems, the backfill problem has not been addressed. In contrast, the topic of revisions has not received as much attention, with few exceptions. For example in epidemic forecasting, a few papers have either (a) mentioned about the ‘backfill (Page 1)
  - Note: If this is to be used by epidemiologists, it needs stronger grounding in the literature; "nowcasting" is missing here, for example.
- However, they focus only on revisions in the target and typically study in the context of influenza forecasting, which is substantially less noisy and more regular than the novel COVID-19 pandemic or assume access to stable values for some features which is not the case for COVID-19. (Page 2)
  - Note: The only benefit to considering targets and features jointly is uncertainty handling. Is that resolved here? Influenza is not the only area this is used in; it is just the only area cited here.
- We introduce the multi-variate backfill problem using real-time epidemiological forecasting as the primary motivating example.
(Page 2)
  - Note: Nice to frame the fact that multiple targets are delayed.
- In this challenging setting, which generalizes (the limited) prior work, the forecast targets, as well as exogenous features, are subject to retrospective revision (Page 2)
  - Note: This isn't really true given there is no handling of uncertainty, so joint fitting only gives the benefits of shared learning (which is novel) plus the use of covariates for post-processing forecasts.
- Another useful way we propose to look at backfill is by focusing on revisions of a single value. Let’s focus on the value of signal i at an observation week t′. For this observation week, the value of the signal can be revised at any t > t′, which induces a sequence of revisions. We refer to revision week r ≥ 0 as the relative amount of time that has passed since the observation week t′. (Page 3)
- We collected important publicly available signals from a variety of trusted sources that are relevant to COVID-19 forecasting to form the COVID-19 Surveillance Dataset (CoVDS). See Table 1 for the list of 20 features (|Feat| = 21, including Deaths). Our revisions dataset contains signals that we collected every week since April 2020 and ends on July 2021. (Page 3)
  - Note: Available for use by others?
- Backfill sequence BSEQ patterns. There is significant similarity among BSEQs. We cluster BSEQs via K-means using Dynamic Time Warping (DTW) as the pair-wise distance (as DTW can handle sequences of varying magnitude and length). We found five canonical categories of behaviors (see Figure 2), each of size roughly 11.58% of all BSEQs.
Also, each cluster is not defined only by signal… [Figure 2 residue removed; the five cluster labels are Early Decline, Early Increase, Steady/Spike, Late rise, Mid Decrease] (Page 4)
- To study the relationship between model performance (via Mean Absolute Error, MAE, of a prediction) and BERR, we use REVDIFFMAE: the difference between the MAE computed against the real-time target value and the one computed against the stable target value (Page 4)
- Back2Future (B2F), a deep-learning model that uses revision information from BSEQ to refine predictions (Page 5)
  - Note: Trains a model on the difference between forecasts (point forecasts?) and the ultimately observed data. Essentially a calibration step.
- Simultaneously model all BSEQ available till current week t using spatial and signal similarities in the temporal dynamics of BSEQ. (Page 5)
- …mation of that feature as well as features that have shown similar revision patterns in the past. Due to the large number of signals that cover all regions, we cannot model the relations between every pair using fully connected modules or attention similar to (Jin et al., 2020). Therefore, we first construct a sparse graph between signals based on past BSEQ similarities. Then we inject this similarity information using Graph Convolutional Networks (GCNs) and combine it with deep sequential models to model the temporal dynamics of the BSEQ of each signal while combining information from BSEQs of signals in the neighborhood of the graph. (Page 5)
- Our training process, which involves pre-training on a model-agnostic auxiliary task, greatly improves training time for refining any given model M. (Page 6)
  - Note: Evidence for this claim?
- We leveraged observations (Section 2) from BSEQ for the period June 2020 - Dec. 2020 to design B2F. (Page 7)
- We tuned the model hyperparameters using data from June 2020 to Aug. 2020 (Page 8)
- tested it on the rest of the dataset, including completely unseen data from Jan. 2021 to June 2021.
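The REVDIFFMAE diagnostic quoted above can be made concrete with a short sketch. This is my own stdlib-only illustration (function names are mine, not the paper's), assuming point forecasts and scalar weekly targets:

```python
def mae(preds, targets):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(p - t) for p, t in zip(preds, targets)) / len(preds)

def revdiff_mae(preds, realtime_targets, stable_targets):
    """REVDIFFMAE as described in the highlight: MAE against the
    real-time (unrevised) target minus MAE against the stable
    (fully revised) target. A positive value means real-time
    evaluation overstates the error relative to the stable target."""
    return mae(preds, realtime_targets) - mae(preds, stable_targets)
```

For example, a forecast of 2.0 for a week whose real-time count is 1.0 but whose stable count later settles at 2.0 gets a REVDIFFMAE of +1.0: the model looked wrong in real time but was right in hindsight.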
(Page 8)
- % improvements in MAE and MAPE scores averaged over all regions from Jan 2021 to June 2021 (Page 8)
  - Note: How much of the performance difference is driven by correcting for backfilling, and how much by using a flexible model to calibrate forecasts?
- Due to the novel problem, there are no standard baselines. (Page 8)
  - Note: Not true. Sequential nowcasting could be used; since only point estimates are considered, this is relatively trivial. Any baseline would be better than none from a user's perspective.
- (a) FFNREG: train an FFN for a regression task that takes as inputs the model's prediction and the real-time target to predict the stable target. (b) PREDRNN: use the MODELPREDENC architecture and append a linear layer that takes encodings from MODELPREDENC and the model's prediction, and train it to refine the prediction. (c) BSEQREG: similarly, only use the BSEQENC architecture and append a linear layer that takes encodings from BSEQENC and the model's prediction to predict the stable target. (d) BSEQREG2: similar to BSEQREG but remove the graph convolutional layers and retain only RNNs. Note that FFNREG and PREDRNN do not use revision data. BSEQREG and BSEQREG2 don't use past predictions of the model. (Page 8)
  - Note: So many acronyms; a legibility pass would really help the reader understand the comparators. I think this is implying that post-model calibration in the absence of backfilling information performs worse than doing nothing, which seems surprising.
- Impressive avg. improvements (Page 9)
  - Note: Is it? I would let the reader judge.
- Specifically, ENSEMBLE’s predictions are refined to be up to 74.2% closer to stable target. (Page 9)
  - Note: From this I assume that very large anomaly corrections drive the majority of the performance improvement?
- As described in Section 4, we evaluated our model over 5 runs with different random seeds to show the robustness of our results to randomization.
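The FFNREG baseline quoted above is essentially a learned post-hoc calibration: fit a map from (model prediction, real-time target) to the stable target. A minimal stdlib-only sketch using a linear model fit by ordinary least squares stands in for the paper's feed-forward network; all names here are mine:

```python
def fit_linear_refiner(preds, realtime, stable):
    """Fit stable ~ w0 + w1*pred + w2*realtime by solving the 3x3
    normal equations with Gauss-Jordan elimination (stdlib only).
    A stand-in for FFNREG's learned regression; not the paper's FFN."""
    rows = [[1.0, p, r] for p, r in zip(preds, realtime)]
    # Build the augmented system [X^T X | X^T y].
    XtX = [[sum(a[i] * a[j] for a in rows) for j in range(3)] for i in range(3)]
    Xty = [sum(a[i] * y for a, y in zip(rows, stable)) for i in range(3)]
    M = [XtX[i] + [Xty[i]] for i in range(3)]
    for c in range(3):
        # Partial pivoting, then eliminate column c from all other rows.
        piv = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(3):
            if r != c and M[c][c] != 0.0:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][3] / M[i][i] for i in range(3)]

def refine(w, pred, realtime):
    """Apply the fitted refiner to a new (prediction, real-time) pair."""
    return w[0] + w[1] * pred + w[2] * realtime
```

At test time the refiner only needs the raw prediction and the currently observed (real-time) target, which matches the setting described in the highlights: stable values are never available when the forecast is made.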
We also provide a more extensive description of hyperparameters and data pre-processing in the Appendix. The code for B2F and the CoVDS dataset are attached in the appendix. The dataset for GDP forecasting is publicly available as described in Appendix Section B. Please refer to the README file in the code folder for more details on reproducing the results. On acceptance, we will also make the code and datasets publicly available to encourage reproducibility and allow for further exploration and research. (Page 11)
  - Note: Code: https://github.com/AdityaLab/Back2Future
    Data?: https://github.com/AdityaLab/Back2Future/tree/master/covid_data
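Earlier highlights describe clustering BSEQs with K-means using Dynamic Time Warping as the pairwise distance, chosen because it handles sequences of varying length. A minimal sketch of the classic DTW recurrence (stdlib only; my own illustration, not the authors' implementation):

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences,
    usable as the pairwise distance when clustering backfill
    sequences (BSEQs) of different lengths."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = cost of the best alignment of a[:i] with b[:j].
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Extend a match, an insertion, or a deletion, whichever is cheapest.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[n][m]
```

A K-means-style procedure over these distances (e.g. K-medoids, or a library such as tslearn's DTW-based `TimeSeriesKMeans`) would then recover the canonical backfill behaviors the paper reports.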