Cite as:
Vorndran, M.; Schütz, A.; Bendix, J. & Thies, B. (2022): Current training and validation weaknesses in classification-based radiation fog nowcast using machine learning algorithms. Artificial Intelligence for the Earth Systems 1(2), e210006.

Resource Description

Title: Current training and validation weaknesses in classification-based radiation fog nowcast using machine learning algorithms
FOR816dw ID: 481
Publication Date: 2022-05-18
License and Usage Rights:
Resource Owner(s):
Individual: Michaela Vorndran
Individual: Adrian Schütz
Individual: Joerg Bendix
Individual: Boris Thies
Large inaccuracies still exist in predicting fog formation, dissipation, and duration. To address these deficiencies, machine learning (ML) algorithms are increasingly used in nowcasting in addition to numerical fog forecasts because of their computational speed and their ability to learn the nonlinear interactions between the variables. Although a powerful tool, ML models require precise training and thorough evaluation to prevent misinterpretation of the scores. In addition, a fog dataset’s temporal order and the autocorrelation of the variables must be considered. Therefore, this study demonstrates pitfalls of classification-based ML in fog forecasting using an XGBoost fog forecasting model. By also using two baseline models that simulate guessing and persistence behavior, we establish two independent evaluation thresholds that make the ML model’s performance easier to grade. We show that, despite high validation scores, the model could still fail in operational application. If the model merely simulates persistence behavior, commonly used scores are insufficient to measure its performance. We demonstrate this through a separate analysis of fog formation and dissipation, because these transitions are crucial for a good fog forecast. We also show that commonly used blockwise and leave-many-out cross-validation methods might inflate the validation scores and are therefore less suitable than a temporally ordered expanding window split. The presented approach provides an evaluation score that closely reflects not only the performance on the training and test dataset but also the operational model’s fog forecasting abilities.
| Radiation fog | Machine learning | Model evaluation | Decision Trees | Classification | Nowcasting |
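The abstract's two central ideas, a temporally ordered expanding window split and a persistence baseline scored separately on fog formation and dissipation, can be illustrated with a minimal sketch. This is not the authors' code: the function names `expanding_window_splits`, `persistence_forecast`, and `transition_hits`, the equal fold size, the one-step lead time, and the toy fog series are all assumptions made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple


def expanding_window_splits(n_samples: int, n_splits: int) -> Iterator[Tuple[range, range]]:
    """Yield (train, test) index ranges in which training data always precede test data.

    Hypothetical helper: the training window grows by one fold per split,
    so no future observation leaks into training.
    """
    fold = n_samples // (n_splits + 1)
    if fold == 0:
        raise ValueError("not enough samples for the requested number of splits")
    for k in range(1, n_splits + 1):
        train_end = k * fold
        # The last test fold absorbs any remainder samples.
        test_end = n_samples if k == n_splits else train_end + fold
        yield range(0, train_end), range(train_end, test_end)


def persistence_forecast(labels: Sequence[int], lead: int = 1) -> List[int]:
    """Persistence baseline: the forecast for step t is the observation at step t - lead.

    The first `lead` steps have no earlier observation and reuse the first label.
    """
    return [labels[max(t - lead, 0)] for t in range(len(labels))]


def transition_hits(observed: Sequence[int], forecast: Sequence[int]) -> Tuple[int, int]:
    """Count fog formation/dissipation events and how many were forecast correctly."""
    events = hits = 0
    for t in range(1, len(observed)):
        if observed[t] != observed[t - 1]:  # 0 -> 1 formation or 1 -> 0 dissipation
            events += 1
            hits += int(forecast[t] == observed[t])
    return events, hits


if __name__ == "__main__":
    # Toy, strongly autocorrelated fog series (1 = fog, 0 = no fog); assumed data.
    fog = [0] * 6 + [1] * 5 + [0] * 5 + [1] * 4

    for train, test in expanding_window_splits(len(fog), n_splits=3):
        print(f"train 0..{train[-1]}, test {test[0]}..{test[-1]}")

    baseline = persistence_forecast(fog)
    accuracy = sum(p == o for p, o in zip(baseline, fog)) / len(fog)
    events, hits = transition_hits(fog, baseline)
    print(f"persistence accuracy: {accuracy:.2f}, transitions hit: {hits}/{events}")
```

On this toy series the persistence baseline reaches an overall accuracy of 0.85 but catches none of the three transitions, which illustrates why an overall score alone can hide poor skill on fog formation and dissipation.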
Literature type specific fields:
Journal: Artificial Intelligence for the Earth Systems
Volume: 1
Issue: 2
Page Range: e210006
Publisher: American Meteorological Society
Metadata Provider:
Individual: Michaela Vorndran
Online Distribution:
Download File:
