Forecasting the Number of Road Accidents in Poland by Road Type

A staggering number of people die on Polish roads each year. Although the figure declines year after year, it remains high. The pandemic has markedly reduced the number of traffic accidents, but the number is still very high. To reduce it, it is necessary to identify the roads where most accidents occur and to know the predicted number of accidents in the coming years. The aim of the article is to forecast the number of accidents on Polish roads by road type. To this end, monthly accident data for Poland from police statistics for 2007–2021 were analyzed, and a forecast for 2022–2024 was produced. The results show that the number of accidents is either rising or stabilizing, which is mostly caused by the growth in automobile traffic. The forecasts also indicate that, under current conditions, a significant rise in the number of accidents on Polish roads may be anticipated. This is especially evident on motorways, whose number is growing in the country. It should be remembered that the pandemic distorts the findings. Selected time series models were used, and the analysis was carried out in Statistica.


Introduction
Road accidents are events that result in property damage as well as injury or death to road users. The World Health Organization (WHO) estimates that 1.3 million people lose their lives in traffic accidents each year. Road accidents cost most countries roughly 3% of their gross domestic product (GDP). For children and young adults aged 5 to 29, traffic accidents are the leading cause of death [1]. The United Nations (UN) General Assembly aims to halve the number of people killed and injured in traffic accidents by 2030.
A factor in judging the severity of a traffic collision is its scope. For the appropriate authorities to develop road safety policies aimed at preventing accidents and minimizing injuries, fatalities, and property damage, it is vital to predict accident severity [2,3]. Adopting countermeasures to prevent accidents and reduce their severity requires identifying the key factors that influence accident severity [4]. Yang et al. [5] proposed a multi-node Deep Neural Network (DNN) structure for forecasting various degrees of injury, death, and property loss. It allows for a thorough and precise investigation of the severity of road accidents.
Accident data come from several sources. Most frequently, they are gathered and evaluated by the relevant government agencies. Police reports, insurance company databases, and hospital records are all used for data collection. Part of the traffic accident data is then processed on a wider scale by the transportation industry [6].
Intelligent transportation systems are currently the most significant source of information for the analysis and prediction of traffic accidents. These data can be collected with vehicle-mounted GPS devices [7]. Roadside microwave vehicle detection systems can continuously capture information about moving vehicles, such as speed, traffic volume, and vehicle type [8]. A license plate recognition system can additionally gather a large amount of traffic data over a monitored period [9]. Social media is another possible source of information on traffic and accidents, although its accuracy may be insufficient owing to the inexperience of those reporting [10].
To make accident data useful, several data sources must be used and properly cross-checked. The accuracy of analytical findings can be improved by combining diverse data sources and merging heterogeneous traffic accident data [11].
Vilaca et al. [12] conducted a statistical analysis to evaluate accident severity and determine the relationship between road users and accidents. The study recommends raising the standard of traffic safety regulations and implementing more traffic safety measures.
Bak et al. [13] conducted a statistical analysis of traffic safety in a selected Polish region based on the number of traffic accidents, a measure used in research on accident causes. The study employed multivariate statistical analysis to investigate safety factors related to those who cause accidents.
The type of traffic problem being addressed determines the source of accident data to be used for analysis. Combining statistical models with naturalistic driving data or other data obtained from intelligent traffic systems improves the accuracy of accident prediction and supports accident elimination [14].
The literature describes a number of techniques for predicting the number of accidents. The most popular are time series approaches [15,16], whose drawbacks are that the forecast's accuracy cannot be evaluated on the basis of previous predictions and that the residual component is frequently autocorrelated [17]. Sunny & Nithya [18] employed the Holt-Winters exponential smoothing approach, while Procházka et al. [19] used a multi-seasonality model for forecasting. One of its drawbacks is that exogenous variables cannot be added to the model [20].
The vector autoregressive model, which has the drawback of requiring many observations of the variables to estimate its parameters accurately [21], has also been used to predict the number of traffic accidents, as have the curve-fitting regression models of Al-Madani [22] and Monederoa et al. [23] for analyzing the number of fatalities. These, in turn, require only simple linear relationships [24] and an order of autoregression [25] (assuming that the series is already stationary).
Biswas et al. [26] used Random Forest regression to forecast the number of traffic accidents. With this approach, when the data contain groups of correlated features of similar importance, smaller groups are favored over larger ones [27], and the method and its peak predictions are unstable [28]. For the given forecasting problem, Chudy-Laskowska & Pisula [29] employed an autoregressive model with a quadratic trend, a univariate periodic trend model, and an exponential smoothing model. A moving average model can also be used for this problem, but it has the drawbacks of poor prediction accuracy, loss of data within the series, and inability to account for trends and seasonal effects [30].
To ensure that the process is stationary, Procházka & Camaj [31] employed the GARMA (generalized autoregressive moving average) approach, which places restrictions on the parameter space. Forecasting frequently uses the ARMA (autoregressive moving average) model for a stationary process, or the ARIMA (autoregressive integrated moving average) or SARIMA (seasonal autoregressive integrated moving average) model for a non-stationary process [18,19,32,33]. These models offer a great deal of flexibility, but they demand more expertise from the researcher than, say, regression analysis [30]. The linearity of the ARIMA model is another drawback [34].
Chudy-Laskowska & Pisula [35] employed ANOVA (analysis of variance) to forecast the number of traffic accidents. This method's drawback is that it relies on additional assumptions, particularly the assumption of sphericity, the violation of which may lead to incorrect conclusions [36]. Artificial neural networks (ANNs) are also used to forecast the number of road accidents. The drawbacks of ANNs include the requirement for prior knowledge in this area [35,37], the dependence of the final result on the network's initial conditions, and the inability to interpret results in a conventional manner, because ANNs are typically regarded as "black boxes": input data are entered and the model outputs results without revealing how they were obtained [38].
Kumar et al. [39] used the Hadoop model as a novel prediction technique. This method's drawback is that it cannot handle small data sets [40]. Karlaftis & Vlahogianni [33] employed the GARCH model for prediction. Its drawback is the complex form of the model [41,42]. McIlroy et al. employed the ADF (augmented Dickey-Fuller) test [43], which has the drawback of insufficient power in detecting autocorrelation of the random component [44].
Data-mining approaches, which often have the drawback of producing enormous sets of generic descriptions [45], have also been employed for prediction [46]. Combinations of models, such as the one proposed by Sebego et al. [47], are another option. Bloomfield [48] also suggests parametric models.
Taking into account the above analysis, the authors forecast the number of road accidents in Poland by road type. Selected time series models and exponential models were used to forecast the number of accidents. Forecasts of the number of road accidents that account for factors influencing this value, and that use other forecasting methods (e.g., neural networks, linear regression), can be found in the authors' other work in this area [49–61]. In this study, the authors used police statistical data and did not take into account other factors affecting the occurrence of traffic accidents.

Materials and Methods
Every year the number of new cars on Polish roads increases. Currently, there are nearly 750 vehicles per 1000 inhabitants in Poland. This translates into an increase or stabilization of the number of accidents on the roads (Figure 1). Poland covers an area of 312,705 km² and has the following road categories: motorway, expressway, 2 one-way carriageway, single carriageway, and 1 carriageway 2 directions. The Kruskal-Wallis test was used to examine whether the number of accidents differs between road types. The test statistic has a value of 830 with a test probability of p < 0.001. The hypothesis that the mean level of traffic accidents is equal across road types is therefore rejected (Figure 2).
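As an illustration of the test named above, the following minimal Python sketch computes the Kruskal-Wallis H statistic with the usual tie correction. It is not the Statistica procedure used in the study; the function name `kruskal_wallis_h` and the sample values are hypothetical and serve only to show the mechanics.

```python
from itertools import chain


def kruskal_wallis_h(groups):
    """Return the Kruskal-Wallis H statistic (tie-corrected) for a list of samples."""
    pooled = sorted(chain.from_iterable(groups))
    n = len(pooled)
    rank_of = {}   # value -> average rank in the pooled sample
    tie_term = 0   # sum of (t^3 - t) over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j
    # Sum of squared rank sums, each divided by its group size.
    weighted = sum(sum(rank_of[x] for x in g) ** 2 / len(g) for g in groups)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)
    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


# Hypothetical counts for three road types (NOT the police data).
samples = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print(kruskal_wallis_h(samples))
```

A large H relative to the chi-square distribution with (number of groups − 1) degrees of freedom leads to rejecting the hypothesis that all groups come from the same distribution.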

Selected Statistical Methods
In the article, the authors used several statistical methods, among them the Brown, Holt, and Winters methods. The Brown method is an exponential smoothing method. It is most often used for time series with no trend, i.e., series that show no development tendency and whose fluctuations are due to random factors, as may occur when forecasting, for example, the number of traffic accidents. The model for the forecast number of traffic accidents $y_t^*$ in the analyzed time series takes the form:

$$y_t^* = \alpha\, y_{t-1} + (1 - \alpha)\, y_{t-1}^*$$

where: $y_{t-1}^*$ - the forecast value of the number of traffic accidents for the optimal value of the smoothing parameter $\alpha$; $y_{t-1}$ - the observed number of traffic accidents in the preceding period; $t$ - subsequent periods (1, 2, ..., n + 1); $\alpha$ - constant value of the process smoothing parameter, taking a value in the range $\alpha \in [0, 1]$.
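The recursion of the Brown method can be sketched in a few lines of Python. This is a minimal illustration, not the Statistica implementation; the function name `brown_forecast` and the choice to initialise the first forecast with the first observation are assumptions made here.

```python
def brown_forecast(y, alpha):
    """Simple exponential smoothing: y*_t = alpha*y_{t-1} + (1-alpha)*y*_{t-1}.

    Returns forecasts for periods 1, 2, ..., n + 1, where n = len(y).
    Assumption: the forecast for period 1 equals the first observation.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    forecasts = [y[0]]
    for observed in y:
        # The next forecast blends the latest observation with the previous forecast.
        forecasts.append(alpha * observed + (1 - alpha) * forecasts[-1])
    return forecasts


# Hypothetical series, for illustration only.
print(brown_forecast([10, 20, 30], alpha=0.5))
```

The last element of the returned list is the forecast for the first period beyond the data; because the model has no trend term, forecasts for later periods would stay at that same level.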

Holt Method
Another of the exponential smoothing methods presented is the Holt method. It assumes that the number of traffic accidents shows no seasonal fluctuations, only a trend and random fluctuations that can change over time. The model in the case under consideration takes the form:

$$F_t = \alpha\, y_t + (1 - \alpha)(F_{t-1} + S_{t-1})$$

$$S_t = \beta\,(F_t - F_{t-1}) + (1 - \beta)\, S_{t-1}$$

where: $F_t$ - the smoothed value of the forecast variable at time $t$, $S_t$ - the smoothed value of the trend increment at time $t$, $\alpha$ - constant value of the process smoothing parameter, taking a value in the range $[0, 1]$, $\beta$ - constant value of the trend smoothing parameter, taking a value in the range $[0, 1]$. Based on the above relationships, the forecast equation at time $t > n$ takes the form:

$$y_t^* = F_n + (t - n)\, S_n$$

where: $y_t^*$ - the forecast value of the number of traffic accidents at time $t$, $F_n$ - the smoothed value of the forecast variable at time $n$, $S_n$ - the value of the trend increment at time $n$, $n$ - the number of terms in the time series. The initial value $F_1$ is taken to be the first value of the forecast variable, $y_1$, while the initial value $S_1$ is taken to be the difference $y_2 - y_1$. The values of the coefficients $\alpha$ and $\beta$ are chosen so as to give the smallest error of expired forecasts. Expired forecasts are determined from the following formula:

$$y_t^* = F_{t-1} + S_{t-1}$$
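A minimal Python sketch of the Holt recursion follows, using the initial values described above (first observation for the level, first difference for the trend). The function names `holt_fit` and `holt_forecast` are hypothetical, and the sketch is not the Statistica routine used in the study.

```python
def holt_fit(y, alpha, beta):
    """Run Holt's linear-trend smoothing over y.

    Returns the final level F_n, the final trend S_n, and the expired
    (one-step-ahead) forecasts for periods 2..n.
    """
    if len(y) < 2:
        raise ValueError("at least two observations are required")
    level, trend = y[0], y[1] - y[0]  # F_1 = y_1, S_1 = y_2 - y_1
    expired = []
    for observed in y[1:]:
        expired.append(level + trend)  # forecast made before seeing the observation
        new_level = alpha * observed + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        level = new_level
    return level, trend, expired


def holt_forecast(level, trend, steps):
    """Forecasts for periods n+1 .. n+steps: F_n + (t - n) * S_n."""
    return [level + h * trend for h in range(1, steps + 1)]


# Hypothetical, perfectly linear series: the forecasts continue the line.
level, trend, expired = holt_fit([2, 4, 6, 8], alpha=0.3, beta=0.1)
print(holt_forecast(level, trend, steps=3))
```

Comparing the `expired` list with the observations from period 2 onward gives the forecast errors from which the smoothing parameters can be tuned.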

Winters Method
Another of the proposed methods for determining the number of traffic accidents over a certain period is the Winters method.This method is used for time series in which there is a development trend and seasonal and random fluctuations.
In the method in question, the following versions of the model are distinguished: • Additive-occurs when the effect of a season is constant over time.
In this case, the smoothed value of the forecast variable $F_t$ at time $t$, after removing seasonal effects, is determined from the formula:

$$F_t = \alpha\,(y_t - C_{t-r}) + (1 - \alpha)(F_{t-1} + S_{t-1})$$

The trend increment $S_t$ at time $t$ takes the form:

$$S_t = \beta\,(F_t - F_{t-1}) + (1 - \beta)\, S_{t-1}$$

The parameter $C_t$, the seasonality index at time $t$, is determined by the formula:

$$C_t = \gamma\,(y_t - F_t) + (1 - \gamma)\, C_{t-r}$$

where: $r$ - length of the seasonal cycle, taking the value $1 \le r \le n$; $\alpha$, $\beta$, $\gamma$ - model parameters with values in the interval [0,1], determined by minimizing the forecast error.
• Multiplicative-occurs if, in a time series, the relative amplitude of seasonal changes is constant over time, i.e., seasonal fluctuations are proportional to the trend.
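For the multiplicative model, the variables $F_t$, $S_t$ and $C_t$ take the form, respectively:

$$F_t = \alpha\,\frac{y_t}{C_{t-r}} + (1 - \alpha)(F_{t-1} + S_{t-1})$$

$$S_t = \beta\,(F_t - F_{t-1}) + (1 - \beta)\, S_{t-1}$$

$$C_t = \gamma\,\frac{y_t}{F_t} + (1 - \gamma)\, C_{t-r}$$

The forecast $y_t^*$ at time $t > n$ is determined from the following relationships:
• For a version of the additive model:

$$y_t^* = F_n + S_n\,(t - n) + C_{t-r}$$

• For a version of the multiplicative model:

$$y_t^* = \left[F_n + S_n\,(t - n)\right] C_{t-r}$$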
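The additive version of the Winters method can be sketched as follows. The text does not state how the level, trend, and seasonal indices are initialised, so the scheme below (based on the averages of the first two seasonal cycles) is an assumption, as is the function name `winters_additive`. The multiplicative version would replace the subtractions involving the seasonal index with divisions and the final addition with a multiplication.

```python
def winters_additive(y, r, alpha, beta, gamma, steps):
    """Additive Winters smoothing with seasonal cycle length r.

    Returns forecasts for the `steps` periods following the series.
    Assumed initialisation (not taken from the source): the level is the mean
    of the first cycle, the trend is the per-period change between the means
    of the first two cycles, and the seasonal indices are the deviations of
    the first cycle from its mean.
    """
    if len(y) < 2 * r:
        raise ValueError("at least two full seasonal cycles are required")
    first = sum(y[:r]) / r
    second = sum(y[r:2 * r]) / r
    level = first
    trend = (second - first) / r
    season = [y[i] - first for i in range(r)]
    for t in range(r, len(y)):
        c_old = season[t - r]  # seasonal index from one cycle earlier
        new_level = alpha * (y[t] - c_old) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season.append(gamma * (y[t] - new_level) + (1 - gamma) * c_old)
        level = new_level
    n = len(y)
    # Each forecast reuses the latest seasonal index available for its season.
    return [level + h * trend + season[n - r + (h - 1) % r]
            for h in range(1, steps + 1)]


# Hypothetical series with a two-period season and no trend.
print(winters_additive([10, 20, 10, 20, 10, 20], r=2,
                       alpha=0.3, beta=0.1, gamma=0.2, steps=3))
```

For monthly data such as those analyzed in the article, `r` would typically be 12.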

Forecasting the Number of Road Accidents
For each type of road, the number of accidents was forecast using selected exponential smoothing models. The essence of these methods is that the time series of the forecast variable is smoothed using a weighted moving average whose weights are determined by an exponential function. The Statistica program used to conduct the analysis selected these weights optimally.
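How such a weight might be selected can be illustrated with a simple grid search over the smoothing parameter that minimises the MAPE of expired forecasts. The optimisation procedure used by Statistica is not described in the text, so the grid search, the function names, and the grid resolution below are assumptions for illustration only.

```python
def ses_expired_forecasts(y, alpha):
    """One-step-ahead simple exponential smoothing forecasts, one per observation."""
    forecasts = [y[0]]  # assumption: the first forecast equals the first observation
    for observed in y[:-1]:
        forecasts.append(alpha * observed + (1 - alpha) * forecasts[-1])
    return forecasts


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def best_alpha(y, grid_size=100):
    """Return the alpha in {1/grid_size, ..., 1} with the smallest MAPE."""
    candidates = [k / grid_size for k in range(1, grid_size + 1)]
    # The first period is skipped because its forecast is only the initial value.
    return min(candidates,
               key=lambda a: mape(y[1:], ses_expired_forecasts(y, a)[1:]))


# Hypothetical steadily rising series: the most recent value is the best predictor.
print(best_alpha([100, 110, 120, 130, 140]))
```

A finer grid or a numerical optimiser would give a more precise parameter; the grid is used here only because it is easy to follow.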
A weighted average of recent and historical observations was used to forecast the number of accidents for each type of road under study. The forecasts obtained with these methods depend on the model chosen and on the optimal values of its parameters. Using selected time series models, forecasts of the number of accidents on Polish roads by road type were produced.
Measures of forecast accuracy were calculated from the errors of expired forecasts, using Equations (17)-(21):
• ME-mean error.

$$ME = \frac{1}{n}\sum_{i=1}^{n}\left(Y_i - Y_{p,i}\right)$$

where: $n$ - the length of the forecast horizon, $Y_i$ - observed value of road accidents, $Y_{p,i}$ - the forecasted value of road accidents. The mean absolute percentage error was reduced to compare the number of accidents during the pandemic.
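The error measures named in the article (ME, the mean absolute error, and MAPE) can be computed as in the sketch below. The formulas are the standard textbook definitions and are assumed to match the article's equations; the function name `forecast_errors` and the sample values are hypothetical.

```python
def forecast_errors(actual, predicted):
    """Return ME, MAE and MAPE (in percent) for paired observations and forecasts.

    Errors are defined as observed minus forecast. Observed values must be
    non-zero, otherwise the percentage error is undefined.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    return {
        "ME": sum(errors) / n,
        "MAE": sum(abs(e) for e in errors) / n,
        "MAPE": 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n,
    }


# Hypothetical observed and forecast values, for illustration only.
print(forecast_errors([100, 200], [110, 190]))
```

ME can be close to zero even when individual errors are large, because positive and negative errors cancel; MAE and MAPE avoid this by using absolute values.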
Using data from the Polish police for 2007–2021, the number of accidents on the road types studied was forecast. A number of different methods can be used for forecasting. In this study, the authors use a fast exponential smoothing method.
Figures 3-7 show the results of the forecasts by road type. The forecasting techniques used in the study are denoted by M1, M2, ..., M15 and are listed in Table 1. The results lead to the conclusion that not all of the methods used are effective in the case under review. To determine the best forecasting method, the mean absolute percentage error, MAPE (20), was used; the method with the smallest value was selected. The best forecasting techniques for each road type were determined to be the following: • motorway-M6; • expressway-M4; • 2 one-way carriageway-M10; • single carriageway-M4; • 1 carriageway 2 directions-M10.
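The selection rule described above, choosing for each road type the method with the smallest MAPE, can be expressed compactly. The function name `select_best_methods` is hypothetical, and the MAPE values in the example are invented placeholders, not results from the study.

```python
def select_best_methods(mape_table):
    """Map each road type to the method label with the smallest MAPE."""
    return {road: min(scores, key=scores.get)
            for road, scores in mape_table.items()}


# Placeholder MAPE values (percent), invented for illustration only;
# they are NOT the values reported in the study.
example = {
    "road type A": {"M1": 4.2, "M2": 3.1, "M3": 5.0},
    "road type B": {"M1": 2.4, "M2": 2.9, "M3": 2.6},
}
print(select_best_methods(example))
```

If two methods tie for a road type, `min` returns the one listed first, so a tie-breaking rule would need to be stated separately.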
The data obtained show that the appropriate forecasting method depends on the type of road being examined. The exponential smoothing without trend and exponential smoothing approaches consistently produced the smallest MAPE. On this basis, a forecast of the number of accidents by road type was created, which is depicted in Figure 8; the forecast errors obtained are reported in Table 2 and Figure 9. The findings indicate that there will likely be more accidents in the future. This is particularly important on motorways. It should be highlighted that the results have been affected by the pandemic. A maximum error of about 1% indicates that an effective forecasting method was selected.

Conclusions
Using the Statistica application, exponential smoothing methods were applied to forecast the number of accidents in Poland. The program selected the weights so as to minimize the mean absolute error and the mean absolute percentage error.
The study's findings suggest that the number of traffic accidents will likely be about the same as before the pandemic. The exception is motorways, whose number is growing along with the number of accidents that occur on them. It should be highlighted that the results have been affected by the pandemic. An error of no more than 1.06% indicates that an effective forecasting method was selected.
In addition, according to the study, the largest projected increase in traffic accidents is on motorways. By the end of 2024, the number could exceed 50,000, which would be a large increase. For the remaining road types (expressway, 2 one-way carriageways, 1 one-way carriageway, and 1 two-way carriageway), the number of accidents can be expected to stabilize over the analyzed period. A minimal increase is observed for roads with 2 one-way carriageways, where the number of accidents could exceed 10,000 by the end of 2024.
The forecasts presented in the article can be used to develop future measures to reduce the number of accidents on roads, particularly on motorways. One example of such measures is the imposition of harsher fines for traffic offenses on Polish roads, starting on 1 January 2022.
In future research, the authors intend to consider further variables affecting accident rates in Poland, such as traffic volume, the day of the week, or the age of the person who caused the accident.

Figure 2. Comparison of the average number of road accidents in Poland by road type from 2007 to 2021.

Figure 3. Forecasting the number of road accidents on the motorway between 2022 and 2024.

Figure 4. Forecasting the number of road traffic accidents on the expressway from 2022 to 2024.

Figure 5. Forecasting the number of road accidents on a 2 carriageway road between 2022 and 2024.

Figure 6. Forecasting the number of road accidents on a one-way road between 2022 and 2024.

Figure 7. Forecasting the number of road accidents on a 1 carriageway 2 direction road between 2022 and 2024.

Figure 8. Optimal projected number of road accidents according to the analyzed road types in the period 2022-2024.