Some references
Methods for big spatial data – from statistics
Sun, Y., Li, B., & Genton, M. G. (2012). Geostatistics for large datasets. In Advances and challenges in space-time modelling of natural events (pp. 55-77). Springer Berlin Heidelberg.
Bradley, J. R., Cressie, N., & Shi, T. (2016). A comparison of spatial predictors when datasets could be very large.
Methods for big spatial data – from ML/AI
- Apply the proposed method to regression tasks on various datasets: Combined Cycle Power Plant, Ailerons from LIAD (Laboratory of Artificial Intelligence and Decision), Elevators from LIAD, California Housing from LIAD, Airline Delay from DOT, Household Power Consumption from UCI, and the sinc function.
- Compare to other methods such as Sparse Gaussian Process and Stochastic Variational Gaussian Process.
Jakkala, K. (2021). Deep Gaussian processes: A survey. arXiv preprint arXiv:2106.12135.
- The results showed that proper statistical models (BCL, GpGp, GpGp0, SPDE, and NNGP) consistently outperformed other (i.e., deep learning and algorithmic) approaches (DeepGP, DeepKriging, Gap-fill, FRK) on all five simulated competition datasets and also the total precipitable water (TPW) dataset.
Methods for spatial data – from statistics
Methods for spatial data – from ML/AI
- Provide a systematic review of the principles and methods of spatial prediction.
- Provide a taxonomy of existing spatial prediction methods, as shown in Fig. 1.
GP (Gaussian process) models
Variants of GPs
NNGPs (Nearest-neighbor Gaussian process)
Datta, A., Banerjee, S., Finley, A. O., & Gelfand, A. E. (2016). Hierarchical nearest-neighbor Gaussian process models for large geostatistical datasets. Journal of the American Statistical Association, 111(514), 800-812.
Finley, A. O., Datta, A., Cook, B. D., Morton, D. C., Andersen, H. E., & Banerjee, S. (2019). Efficient algorithms for Bayesian nearest neighbor Gaussian processes. Journal of Computational and Graphical Statistics, 28(2), 401-414.
NNMPs (Nearest-neighbor mixture process)
Deep learning for spatial data
Deep GPs
Jakkala, K. (2021). Deep Gaussian processes: A survey. arXiv preprint arXiv:2106.12135.
Deep kriging
Convolutional Gaussian Processes
DGMRF
- Establish a formal connection between GMRFs and convolutional neural networks (CNNs).
- Refer to Lindgren et al., 2011 for the link between GPs and GMRFs.
- DGMRF outperforms all the methods from the competition on all criteria (e.g. MAE, RMSE, CRPS, INT) except CVG, where it is slightly worse than NNGP (Nearest-neighbor Gaussian process).
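The criteria named in the last note can be sketched in a few lines. This is a minimal illustration, not the competition's evaluation code. It assumes that INT denotes the interval score of a central 95% prediction interval and that CVG denotes the empirical coverage of that interval. It also assumes Gaussian predictive distributions, so that CRPS has a closed form. All function names and the toy numbers are hypothetical.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used for the closed-form CRPS


def mae(y, mu):
    """Mean absolute error of point predictions mu against observations y."""
    return sum(abs(yi - mi) for yi, mi in zip(y, mu)) / len(y)


def rmse(y, mu):
    """Root mean squared error of point predictions mu against observations y."""
    return math.sqrt(sum((yi - mi) ** 2 for yi, mi in zip(y, mu)) / len(y))


def crps_gaussian(y, mu, sigma):
    """Mean CRPS, assuming each predictive distribution is N(mu_i, sigma_i^2)."""
    total = 0.0
    for yi, mi, si in zip(y, mu, sigma):
        z = (yi - mi) / si
        total += si * (
            z * (2.0 * _STD_NORMAL.cdf(z) - 1.0)
            + 2.0 * _STD_NORMAL.pdf(z)
            - 1.0 / math.sqrt(math.pi)
        )
    return total / len(y)


def interval_score(y, lower, upper, alpha=0.05):
    """Mean interval score of central (1 - alpha) prediction intervals.

    Interval width plus a penalty of 2/alpha times the miss distance
    whenever the observation falls outside the interval.
    """
    total = 0.0
    for yi, lo_i, up_i in zip(y, lower, upper):
        total += up_i - lo_i
        if yi < lo_i:
            total += (2.0 / alpha) * (lo_i - yi)
        elif yi > up_i:
            total += (2.0 / alpha) * (yi - up_i)
    return total / len(y)


def coverage(y, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    hits = sum(1 for yi, lo_i, up_i in zip(y, lower, upper) if lo_i <= yi <= up_i)
    return hits / len(y)


# Toy example (hypothetical numbers): three held-out locations.
y_obs = [1.0, 2.0, 3.0]
pred_mean = [1.0, 2.5, 2.0]
pred_sd = [1.0, 1.0, 1.0]
lower_95 = [m - 1.96 * s for m, s in zip(pred_mean, pred_sd)]
upper_95 = [m + 1.96 * s for m, s in zip(pred_mean, pred_sd)]

scores = {
    "MAE": mae(y_obs, pred_mean),
    "RMSE": rmse(y_obs, pred_mean),
    "CRPS": crps_gaussian(y_obs, pred_mean, pred_sd),
    "INT": interval_score(y_obs, lower_95, upper_95),
    "CVG": coverage(y_obs, lower_95, upper_95),
}
print(scores)
```

Lower is better for MAE, RMSE, CRPS, and the interval score, while coverage is judged by how close it is to the nominal level (0.95 here), which is why a method can win on the first four and still trail slightly on CVG.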
Deep learning
Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61, 85-117.
LeCun, Y., Bengio, Y., & Hinton, G. (2015). Deep learning. Nature, 521(7553), 436-444.
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.
- Discuss, from a statistical point of view: What is deep learning? What characteristics of deep learning are new compared with classical methods? What are its theoretical foundations?
Methods for spatiotemporal data
Generative mixture models
Mixture-based models
Harshvardhan, G. M., Gourisaria, M. K., Pandey, M., & Rautaray, S. S. (2020). A comprehensive survey and analysis of generative models in machine learning. Computer Science Review, 38, 100285.
Copula-based models
Copula
Copula-based models for non-Gaussian time series
Escarela, G., Mena, R. H., & Castillo-Morales, A. (2006). A flexible class of parametric transition regression models based on copulas: application to poliomyelitis incidence. Statistical Methods in Medical Research, 15(6), 593-609.
Huang, X. W., & Emura, T. (2022). Computational methods for a copula-based Markov chain model with a binomial time series. Communications in Statistics-Simulation and Computation, 1-18.
Copula-based models for non-Gaussian spatial data
Methods for time series (temporal data)
Hassan, M. Y. (2021). The deep learning LSTM and MTD models best predict acute respiratory infection among under-five-year old children in Somaliland. Symmetry, 13(7), 1156.
- The study found that no single model dominated the other two, but the deep learning LSTM (long short-term memory) model and the EM-algorithm-based MTD (mixture transition distribution) model slightly outperformed the classical SARIMA (seasonal autoregressive integrated moving average) model in predicting ARI (acute respiratory infection) values.