Large-scale assessment of Prophet for multi-step ahead forecasting of monthly streamflow
Air Force Support Command, Hellenic Air Force, Elefsina, 192 00,
Greece
Georgia A. Papacharalampous
Department of Water Resources and Environmental Engineering,
National Technical University of Athens, Zografou, 157 80, Greece
Related authors
Thomas Dimopoulos, Hristos Tyralis, Nikolaos P. Bakas, and Diofantos Hadjimitsis
Adv. Geosci., 45, 377–382, https://doi.org/10.5194/adgeo-45-377-2018, 2018
Short summary
The paper compares a machine learning algorithm (Random Forests) with Multivariate Linear Regression on a data set of 3500 transactions of residential apartments in Nicosia District, Cyprus. The suggested methodology indicated high accuracy for the Random Forests method, which can be applied in automated valuation models and CAMA systems.
Georgia A. Papacharalampous and Hristos Tyralis
Adv. Geosci., 45, 201–208, https://doi.org/10.5194/adgeo-45-201-2018, 2018
Short summary
The predictive performance of random forests (a machine learning algorithm)
and three configurations of Prophet (a method largely implemented at
Facebook) is assessed in daily streamflow forecasting for a river in the US.
Random forests outperform the benchmarks used, i.e. a naïve method and a
multiple linear regression model, while Prophet's performance leaves room
for improvement. Random forests are recommended for daily streamflow
forecasting.
Short summary
We use the CAMELS dataset to compare two approaches to multi-step ahead forecasting of monthly streamflow. The first approach uses past monthly streamflow information only, while the second additionally uses past information about monthly precipitation and/or temperature (exogenous information). Exogenous information is incorporated using Prophet, a model largely implemented at Facebook. The findings suggest that the compared approaches are equally useful.