Datameteo.com®, a certified brand of LRC SERVIZI active since 2000, is a leading Weather Provider in meteorological and climatic data processing and calculation, meteorological consulting, and the development of services based on weather data.
Datameteo operates in multiple sectors, including professional climate consulting, environment, road safety, renewable energy and insurance, agriculture, industry, and institutions, providing customized and cutting-edge solutions. From data collection (satellites, radar, stations, buoys, radiosondes), subjected to rigorous quality controls (BIGDATAMETEO), to analysis and validation by expert meteorologists, the company provides strategic and operational information accompanied by state-of-the-art APIs and Custom Web Applications for every market.
The Challenge
The Emilia Romagna transport authority needs to manage operations for its electric monorail mass rapid transit system during adverse weather conditions. Ice and snow on the rail require de-icing and cleaning operations to keep the line safe and in service.
The requirement was a high-resolution meteorological monitoring system that combines weather bulletins managed by a team of expert meteorologists, advanced sensors covering the route for comprehensive data collection, and a predictive system for the route built on a completely custom ICING index, so that these critical situations can be managed in advance and disruption to users minimized.
Operations are threatened by potential interruptions due to the conditions of the elevated track, which would require costly de-icing interventions. Standard meteorological modeling techniques are not effective in such a limited spatial domain, where microclimatic variations are significant between urban and peripheral areas.
Solutions
The solution designed was a TBI service built on a predictive approach based on Machine Learning. Through a technological partner, five thermo-hygrometric sensors were installed along the rail, together with two weather stations at intermediate stops. The meteorological variables are sampled at 15-minute intervals, producing historical data series that start in March 2021.
Although the total volume of collected data may appear significant, the effective quantity is limited for Artificial Intelligence applications, partly due to temporal gaps in measurements. An additional critical issue is the unbalanced distribution of icing events, which occur rarely, making predictive analysis complex.
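The imbalance described above is often handled by weighting the rare class more heavily during training. The source does not say which technique was used in the project, so the following is only a minimal sketch of one common option: weighting each class by n_samples / (n_classes * class_count). The function name and the label series are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes (here: icing) receive proportionally larger weights.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical label series: 1 = icing observed, 0 = no icing.
labels = [0] * 95 + [1] * 5
weights = balanced_class_weights(labels)
print(weights)
```

With 5 icing samples out of 100, an error on an icing sample counts roughly 19 times as much as an error on a non-icing sample. Weights of this kind can usually be passed to a classifier as per-class or per-sample weights.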
The first phase of the project therefore focused on identifying the most suitable predictive model, taking into account both business needs and data availability and quality.
In an initial experimental phase, classical machine learning models were tested, such as the Random Forest Classifier (RFC) and Artificial Neural Network (ANN), using the detected icing condition as the target variable. The initial approach involved binary classification on a single aggregated icing class.
One of the main limitations was missing data. This was partially mitigated through interpolation or by selecting shorter but continuous time intervals, although working with less data carries the risk of model overfitting.
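The two mitigations mentioned here, interpolation and the selection of continuous intervals, can be illustrated as follows. This is a sketch under stated assumptions, not the project's actual pipeline: the series is assumed to sit on a regular 15-minute grid with `None` marking missing samples, and the function names, the `max_gap` threshold, and the sample values are hypothetical.

```python
def fill_short_gaps(values, max_gap=4):
    """Linearly interpolate runs of None no longer than max_gap samples.

    Longer runs, and runs touching either end of the series, stay None.
    """
    filled = list(values)
    n = len(filled)
    i = 0
    while i < n:
        if filled[i] is not None:
            i += 1
            continue
        start = i
        while i < n and filled[i] is None:
            i += 1
        end = i  # first index after the gap
        gap = end - start
        if start > 0 and end < n and gap <= max_gap:
            left, right = filled[start - 1], filled[end]
            step = (right - left) / (gap + 1)
            for k in range(gap):
                filled[start + k] = left + step * (k + 1)
    return filled


def continuous_segments(values, min_len=1):
    """Return (start, end) index pairs, end exclusive, of gap-free runs."""
    segments = []
    start = None
    for idx, v in enumerate(values):
        if v is not None and start is None:
            start = idx
        elif v is None and start is not None:
            if idx - start >= min_len:
                segments.append((start, idx))
            start = None
    if start is not None and len(values) - start >= min_len:
        segments.append((start, len(values)))
    return segments


# Hypothetical temperature samples (degrees Celsius) on a 15-minute grid.
temps = [1.0, None, 3.0, None, None, None, None, None, None, 0.5, 0.0, -0.5]
filled = fill_short_gaps(temps, max_gap=4)
segments = continuous_segments(filled, min_len=3)
print(filled)
print(segments)
```

The one-sample gap is interpolated, while the six-sample gap (90 minutes) exceeds `max_gap` and is left open, so the series splits into two continuous segments that could be used for training separately.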
A key operational factor was the forecast horizon: to effectively activate preventive safety procedures against icing, it was necessary to have a forecast at least 15 hours in advance, initially reduced to 12 hours for a first test phase.
The decision was made to train two separate models, each referring to one of the two installed weather stations, with the aim of calculating the probability of ice presence as an aggregated value over the next 12 hours. To introduce temporal information into the model, the measurement time was included among the input variables, appropriately aggregating data to capture chronological patterns.
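As an illustration of the target and time inputs described above, the sketch below builds an aggregated label ("does icing occur at any point in the next 12 hours?") and encodes the measurement time. The exact aggregation rule and time encoding used in the project are not given in the source; the any-icing rule, the sine/cosine encoding, and all names here are assumptions.

```python
import math

SAMPLES_PER_HOUR = 4  # 15-minute sampling, as stated in the text


def aggregated_icing_labels(icing_flags, horizon_hours=12):
    """Label each time step 1 if icing occurs anywhere in the next horizon.

    Steps whose look-ahead window runs past the end of the series get None.
    """
    window = horizon_hours * SAMPLES_PER_HOUR
    labels = []
    for t in range(len(icing_flags)):
        future = icing_flags[t + 1 : t + 1 + window]
        if len(future) < window:
            labels.append(None)
        else:
            labels.append(1 if any(future) else 0)
    return labels


def cyclic_hour_features(hour_of_day):
    """Encode the hour of day as a point on the unit circle (assumed encoding)."""
    angle = 2 * math.pi * hour_of_day / 24
    return math.sin(angle), math.cos(angle)


# Hypothetical series of 100 samples with a single icing observation.
flags = [0] * 100
flags[60] = 1
labels = aggregated_icing_labels(flags, horizon_hours=12)
print(labels[11], labels[12], labels[51], labels[52])
print(cyclic_hour_features(6))
```

The circular encoding keeps 23:45 and 00:00 close together in feature space, which a raw hour value would not.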
After a phase of parameter tuning and comparison with other algorithms, the choice initially fell on an RNN (Recurrent Neural Network) model, which is particularly suitable for handling time sequences, and subsequently on a simpler and more robust SVC (Support Vector Classifier) model as an alternative benchmark.
The second phase of the project focused on improving the selected model, with the aim of increasing both its predictive quality and operational timeliness. To this end, it was decided to initially reduce the forecast time horizon to just 6 hours, obtaining promising results, partly due to the integration of data collected during the last winter, which enriched the database and improved the representation of rare events, such as icing.
Results and Benefits
In the most recent project iterations, several significant optimizations and simplifications were introduced at both modeling and operational levels:
The model output has been structured to provide an hourly forecast for the first 6 hours and an aggregated output for the following 6 hours, thus meeting the short- and medium-term operational planning needs.
Non-aggregated inputs were used, maintaining the original detail of the measurements.
A reduced, optimal set of meteorological variables has been identified, sufficient for effective forecasting.
Weather forecasts from external models have been integrated as new input variables, enriching the informational context available for inference.
The system has been made flexible in its configuration, allowing processing with just one weather station and the single thermo-hygrometric sensor closest or most correlated to the point to be monitored.
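The output layout described in the first item of the list above (hourly values for the first 6 hours, then one aggregated value for the following 6 hours) can be sketched as a 7-element target vector. This is an illustrative assumption about how such targets might be built from observed icing flags, not the project's implementation; the function name and sample data are hypothetical.

```python
SAMPLES_PER_HOUR = 4  # 15-minute sampling, as stated in the text


def forecast_targets(icing_flags, t, hourly_hours=6, aggregated_hours=6):
    """Build the target vector for time step t.

    Returns hourly_hours flags (one per hour ahead) followed by a single flag
    covering the following aggregated_hours, or None if the series is too short.
    """
    total = (hourly_hours + aggregated_hours) * SAMPLES_PER_HOUR
    future = icing_flags[t + 1 : t + 1 + total]
    if len(future) < total:
        return None
    hourly = [
        int(any(future[h * SAMPLES_PER_HOUR : (h + 1) * SAMPLES_PER_HOUR]))
        for h in range(hourly_hours)
    ]
    tail = future[hourly_hours * SAMPLES_PER_HOUR :]
    return hourly + [int(any(tail))]


# Hypothetical series: icing observed at samples 10 and 40.
flags = [0] * 60
flags[10] = 1
flags[40] = 1
targets = forecast_targets(flags, t=0)
print(targets)
```

In this example the icing at sample 10 falls in the third hour ahead, and the icing at sample 40 falls in the aggregated 7-to-12-hour block.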