Forecasting the Crude Oil Prices Volatility With Stochastic Volatility Models

In this article, the stochastic volatility model is introduced to forecast crude oil volatility by using data from the West Texas Intermediate (WTI) and Brent markets. Not only can the model capture the stylized facts of multiscaling, extended memory, and structural breaks in volatility, but it is also more frugal in its parameterization. The Euler-Maruyama scheme was applied to approximate the Heston model. In contrast, the root mean square error (RMSE) and the mean absolute error (MAE) were used to evaluate the generalized autoregressive conditional heteroskedasticity (GARCH)-type models (symmetric and asymmetric). Based on the approximation results obtained, the study established that the stochastic volatility model fits oil return data better than the traditional GARCH-class models.


Introduction
The forecasting of crude oil price volatility provides valuable information to market players about the uncertainty or risk in the market. For oil-dependent countries like Russia, the United States, China, and Saudi Arabia, variability in the oil price implies huge losses or gains and hence lower revenue or higher reserves to meet developmental goals (Nakajima, 2009). Therefore, given the importance of oil price variations for all economies, especially those whose fiscal budget and policy depend on oil production and revenues, price volatility forecasting remains a matter of great concern globally.
Moreover, fluctuations in the crude oil price are increasingly gaining attention among academics and practitioners because they affect government planning and policy-making decisions for the revenue system and subsidies in the traditional and renewable energy sectors, as well as in the agricultural industry. Furthermore, the instability in oil prices is associated with inflation and its effect on the cost of consumer goods and products in the industrial sector. Intuitively, this may critically impact the political order of oil-intensive economies. In addition, the behavior of agricultural commodity prices and its association with the price of oil matter for global food policy and for the various economic agents for whom oil is the primary input in the production process.
Therefore, from the investors' point of view, forecasting future returns and volatility in oil markets is essential for determining asset prices, hedging, derivative pricing, and risk control. Considering its vital importance worldwide, and with a view to minimizing the negative impact of oil price fluctuations, a vast literature has, in recent years, paid attention to modeling and forecasting the volatility of oil prices. The crude oil market at times appears relatively calm and at other times extremely volatile (Charles & Darné, 2017).
Several studies have examined the modeling and forecasting of crude oil price volatility, with most employing improved generalized autoregressive conditional heteroskedasticity (GARCH) models. Although there are various works on finding the most suitable model that offers the best out-of-sample forecasting performance, no model has consistently outperformed the others (Charles & Darné, 2017). For instance, the autoregressive conditional heteroskedasticity (ARCH) model, which is a powerful tool for describing historical conditional volatilities, was introduced by Engle (Cont, 2001; Lee, 2000; Sadorsky, 2005). Subsequently, the GARCH model was proposed as an extension of the ARCH model by Bollerslev (1986). Furthermore, various researchers developed variants of GARCH models, for instance, integrated GARCH (IGARCH), exponential GARCH (EGARCH), asymmetric power GARCH (APGARCH), and fractionally integrated GARCH (FIGARCH; Cont, 2001; Lee, 2000; Sadorsky, 2006). These extensions seek to improve the GARCH model so that it captures the characteristics of the return series.
The improved GARCH-type models are capable of capturing the most important stylized facts regarding crude oil returns (Cont, 2001; Laurent & Neely, 2012; Wang et al., 2016). The stylized facts include heavy-tailed distributions, volatility clustering, asymmetry, and extended memory in volatility (Charles & Darné, 2017; Kulikova & Taylor, 2013; Lee, 2000; Nakajima, 2009; Sadorsky, 2005; Salisu & Fasanya, 2013). These models are mainly used to model time-varying conditional volatility as a deterministic function of lagged variance and lagged squared residuals, thereby incorporating past observations into future volatility (Charles & Darné, 2017; Vo, 2009). Moreover, these models are parametric and typically estimate the daily, weekly, or monthly volatility using data sampled at the same frequency. However, these models sometimes fail to capture the fat-tail property of financial data.
The existing alternative models for accurately capturing stylized facts are stochastic volatility (SV) models (Sadorsky, 2005). These models are nonparametric and are typically based on a continuous-time stochastic process, which is handled in practice through a discrete-time approximation of the diffusion process. Moreover, the variance in SV models is modeled as an unobserved component that follows a stochastic process, typically a logarithmic first-order autoregressive process. Generally, the SV model extends the geometric Brownian motion model by considering one more source of uncertainty. Thus, by using a stochastic model to describe the behavior of volatility, the estimation of crude oil volatility can be more realistic (Crisostomo, 2015). Although the SV model is theoretically attractive, it is empirically challenging because the unobserved volatility process enters the model nonlinearly, which makes the likelihood function depend on high-dimensional integrals (Sadorsky, 2005, 2006).
This article's new contribution to the literature is the introduction of the Heston stochastic volatility model to forecast crude oil price volatility. The Heston model extends the Black and Scholes model (BSM) by considering stochastic volatility driven by a Cox-Ingersoll-Ross (CIR) process (Heston, 1993; Sircar, 1999). Moreover, this model can capture the heavy-tailed nature of the return distributions, the leverage effect, and volatility clustering. In the Heston model, the volatility of variance controls the kurtosis of the underlying asset return distribution, and the correlation sets its asymmetry. Finally, this model is analytically tractable (Heston, 1993). Despite the benefits mentioned above, the Heston model's fundamental challenge is the complexity of the estimation process.
Compared with other models like BSM, the Heston model's implementation involves more sophisticated mathematics, and it also requires a more challenging process for model calibration. As closed-form solutions are rarely available for nonlinear stochastic differential equations (SDEs), or are too challenging to obtain, numerical approximations are very important. To overcome this challenge, this study applies a numerical method to approximate the crude oil price volatility model.
In a nutshell, the main objective of this article is to introduce the stochastic volatility model for forecasting crude oil price volatility. The novelties of this article are introducing the Heston model to measure crude oil price volatility and applying the Euler-Maruyama method to simulate the Heston model. Furthermore, the study compares the efficiency of the Heston model and GARCH-type models in capturing crude oil volatilities. To examine the accuracy of the model, the study compares the performance of the Heston model with that of the improved GARCH-type models by performing an error analysis. Intuitively, the smallest error implies the best approximation. For the improved GARCH models, this study employs two error measures to evaluate forecasting performance: the root mean square error (RMSE) and the mean absolute error (MAE). For the Heston model, this study applies the Euler-Maruyama method to obtain the errors.
The rest of this article is organized as follows. The mathematical model is presented in the "Mathematical Model" section. The "Data and Preliminary Analysis" section briefly describes the data and preliminary analysis. The numerical illustration is given in the "Numerical Illustrations" section. The "Conclusion" section concludes the study.

Mathematical Model
We start by introducing the crude oil market structure to describe price dynamics. We assume that the crude oil market is complete, frictionless, and continuously open over the fixed time interval $[0, T]$. The uncertainties involved in the financial market are modeled by a complete filtered probability space $(\Omega, \mathbb{F}, P)$. Here, $\Omega$ is the sample space, $\mathbb{F} = (\mathcal{F}_t)_{t \ge 0}$ denotes the information available at time $t$, and $P$ denotes the historical probability measure.

The Heston Model
We assume that the dynamics of the crude oil price follow the Heston model, which is a stochastic volatility model: the asset's volatility is a random process rather than a constant or deterministic quantity. Thus, the asset price at time $t$ follows the diffusion process

$$dX(t) = \mu X(t)\,dt + \sqrt{V(t)}\,X(t)\,dW_1(t), \tag{1}$$

where $X(t)$, $\mu$, and $V(t)$ stand for the asset price, the expected return of the asset, and the instantaneous variance (squared volatility), respectively. Moreover, $W_1(t)$ is a Wiener process. Assuming further that the volatility $\sqrt{V(t)}$ follows an Ornstein-Uhlenbeck process with constants $\alpha$ and $\delta$, we have

$$d\sqrt{V(t)} = -\alpha\sqrt{V(t)}\,dt + \delta\,dW_2(t). \tag{2}$$

Applying Itô's lemma to Equation 2 shows that the variance $V(t)$ satisfies

$$dV(t) = \left(\delta^2 - 2\alpha V(t)\right)dt + 2\delta\sqrt{V(t)}\,dW_2(t). \tag{3}$$

Hence, the volatility process simplifies to

$$dV(t) = \beta\left(\theta - V(t)\right)dt + \sigma\sqrt{V(t)}\,dW_2(t), \tag{4}$$

where $\beta = 2\alpha$, $\theta = \delta^2/(2\alpha)$, and $\sigma = 2\delta$ are the reversion rate, the long-run variance, and the volatility of volatility, respectively. Therefore, the Heston volatility model is given by a system of two correlated stochastic differential equations, one for the asset price and the other for the variance, as in Equations 1 and 4, respectively. Also, we must have $2\beta\theta > \sigma^2$ (the Feller condition) to guarantee that $V(t)$ remains positive almost surely. Moreover, $W_1(t)$ and $W_2(t)$ are two correlated standard Brownian motions with a nonzero correlation $\rho$, that is, $dW_1(t)\,dW_2(t) = \rho\,dt$.
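As a quick illustration of the Feller condition, $2\beta\theta > \sigma^2$, the following Python sketch checks whether a given parameter set satisfies it. The function name and the numerical values are hypothetical choices made for illustration; they are not the calibrated parameters of this study.

```python
def feller_condition_holds(beta, theta, sigma):
    """Return True when 2 * beta * theta > sigma**2 (the Feller condition).

    beta  : reversion rate of the variance process
    theta : long-run variance
    sigma : volatility of volatility
    """
    return 2.0 * beta * theta > sigma ** 2


# Illustrative values only (assumptions, not estimates from the data).
print(feller_condition_holds(beta=2.0, theta=0.09, sigma=0.3))   # 0.36 > 0.09
print(feller_condition_holds(beta=0.5, theta=0.04, sigma=0.3))   # 0.04 > 0.09 fails
```

A parameter set that fails this check can still be simulated, but the discretized variance is then more likely to reach or cross zero, which a numerical scheme has to handle explicitly.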

The Euler-Maruyama Method
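The simplest and most useful scheme for the numerical approximation of solutions of stochastic differential equations is the Euler-Maruyama method. It is a generalization of the Euler method for ordinary differential equations to stochastic differential equations (Cao et al., 2004). Consider the SDE below:

$$dX(t) = \mu(X(t), t)\,dt + \sigma(X(t), t)\,dW(t), \quad X(0) = X_0, \tag{5}$$

where $\mu$ and $\sigma$ are scalar functions and the initial condition $X(0)$ is a random variable, whereas $W(t)$ denotes the Wiener process. The solution to Equation 5 is given by the process $X(t)$ satisfying

$$X(t) = X(0) + \int_0^t \mu(X(s), s)\,ds + \int_0^t \sigma(X(s), s)\,dW(s).$$

To approximate the solution of the SDE on the time interval $[0, T]$, we discretize time into $N$ equal subintervals of width $\Delta t = T/N$. Therefore, the Euler-Maruyama approximation to the actual solution of Equation 5 is the Markov chain $X_n$ of the form

$$X_{n+1} = X_n + \mu(X_n, t_n)\,\Delta t + \sigma(X_n, t_n)\,\Delta W_n,$$

with $X_{n+1}$ standing for an approximation of $X(t_{n+1})$, $t_n = n\,\Delta t$, and $\Delta W_n = W(t_{n+1}) - W(t_n)$.

Proposition 1: Let $X_{n+1}$ and $V_{n+1}$ denote discrete-time approximations of $X$ and $V$, respectively. The corresponding Euler-Maruyama approximate solution applied to the Heston model is defined by

$$X_{n+1} = X_n + \mu X_n\,\Delta t + \sqrt{V_n}\,X_n\,\Delta W_{1,n},$$

$$V_{n+1} = V_n + \beta\left(\theta - V_n\right)\Delta t + \sigma\sqrt{V_n}\,\Delta W_{2,n},$$

where $X_{n+1}$ and $V_{n+1}$ are the approximate crude oil price and its variance, respectively, and $\Delta W_{i,n} = W_i(t_{n+1}) - W_i(t_n)$ for $i = 1, 2$.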
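The Euler-Maruyama recursion for the Heston model can be sketched in a few lines of code. The example below is a minimal Python illustration using only the standard library (the study itself reports using MATLAB). The function name, the parameter values, and the rule of truncating negative variance at zero inside the drift and the square root are assumptions made for this sketch; the text does not state how negative variance values are handled.

```python
import math
import random


def simulate_heston_em(x0, v0, mu, beta, theta, sigma, rho, T=1.0, n=100, seed=None):
    """Simulate one Heston path with the Euler-Maruyama scheme.

    Returns two lists of length n + 1: prices X_0..X_n and variances V_0..V_n.
    """
    rng = random.Random(seed)
    dt = T / n
    sqrt_dt = math.sqrt(dt)
    xs, vs = [x0], [v0]
    for _ in range(n):
        # Two independent N(0, 1) draws combined into Brownian increments
        # whose correlation is rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        dw1 = sqrt_dt * z1
        dw2 = sqrt_dt * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)

        x, v = xs[-1], vs[-1]
        # Assumption: negative variance is replaced by zero in the drift and
        # diffusion terms, because the discrete scheme can dip below zero.
        v_pos = max(v, 0.0)
        xs.append(x + mu * x * dt + math.sqrt(v_pos) * x * dw1)
        vs.append(v + beta * (theta - v_pos) * dt + sigma * math.sqrt(v_pos) * dw2)
    return xs, vs


# Illustrative parameters only (assumptions, not the study's estimates).
prices, variances = simulate_heston_em(
    x0=60.0, v0=0.09, mu=0.05, beta=2.0, theta=0.09, sigma=0.3, rho=-0.5, seed=42
)
print(len(prices), prices[-1], variances[-1])
```

Repeating the call with different seeds yields a sample of terminal prices and variances from which summary statistics can be computed.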

Numerical Approximation of the Model
Simulation methods are essential when approximating the numerical solutions of SDEs. To assess the efficiency of the model, we focus on the error measured at the point $t = T$ by the quantity

$$E\,\bigl|X(T) - \hat{X}(T)\bigr|.$$

This quantity enables us to determine whether the method converges to the exact solution. The Euler-Maruyama scheme is both strongly and weakly convergent (Broadie & Kaya, 2006; Gerstner & Kloeden, 2013; Rosa, 2016). The strong error measures the error of the approximate sample paths $\hat{X}$ on average: the method converges strongly with order $\gamma$ if there is a constant $C$ such that, for sufficiently small $\Delta t$, the pathwise error satisfies

$$E\,\bigl|X(T) - \hat{X}(T)\bigr| \le C\,\Delta t^{\gamma}.$$

In the above expression, $E$ stands for the expected value, whereas $\hat{X}(T)$ is the approximation of $X(t)$ at time $t = T$ calculated with constant step $\Delta t$. In contrast, the method is weakly convergent with order $\gamma$ if

$$\bigl|E\,f(X(T)) - E\,f(\hat{X}(T))\bigr| \le C\,\Delta t^{\gamma}$$

for all polynomials $f$.

GARCH-Type Models and Approximation
This study uses GARCH-type models as a benchmark for the Heston model. These are the symmetric GARCH model and the asymmetric GARCH models (EGARCH and threshold GARCH [TGARCH]). These models are presented and extensively described in, among others, Charles and Darné (2017), Lux et al. (2016), Manera et al. (2016), Sadorsky (2005, 2006), and Salisu and Fasanya (2013). Moreover, to evaluate the forecasting performance of these models, this study uses the RMSE and MAE, as these are the two most widely used measures of model accuracy (Charles & Darné, 2017). To sum up, the model with the smallest values of these statistics is regarded as the best one.
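For orientation, the symmetric benchmark can be illustrated with the textbook GARCH(1,1) recursion, in which the conditional variance $h_t$ depends on the previous squared return and the previous conditional variance: $h_t = \omega + a\,r_{t-1}^2 + b\,h_{t-1}$. The Python sketch below assumes zero-mean returns and uses hypothetical parameter values; the symbols $\omega$, $a$, and $b$ are introduced here for illustration, are unrelated to the Heston parameters, and are not the specification or estimates reported in this study.

```python
def garch11_variance(returns, omega, a, b, initial_variance=None):
    """Conditional variances h_0..h_{n-1} from h_t = omega + a*r_{t-1}**2 + b*h_{t-1}.

    Assumes the returns have zero mean, so squared returns act as squared residuals.
    """
    if initial_variance is None:
        # Assumption: start the recursion at the sample mean of squared returns.
        initial_variance = sum(r * r for r in returns) / len(returns)
    h = [initial_variance]
    for r in returns[:-1]:
        h.append(omega + a * r * r + b * h[-1])
    return h


# Illustrative returns and parameters (assumptions, not estimates from the data).
example_returns = [0.010, -0.020, 0.015, -0.030, 0.005]
print(garch11_variance(example_returns, omega=1e-6, a=0.1, b=0.8))
```

The asymmetric variants (EGARCH and TGARCH) modify this recursion so that negative and positive returns of the same size affect the conditional variance differently.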
RMSE. RMSE is the most favored measure among practitioners and is preferred even more strongly among academics, although it is not unit free. It is expressed as

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n} e_t^{2}},$$

where $e_t$ is the forecast error at time $t$ and $n$ is the number of forecasts. Like the MSE, the RMSE gives equal weight to all errors, regardless of the period.

MAE. It is expressed as

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n} |e_t|,$$

where $|e_t|$ is the absolute error computed on the fitted values for a specific forecasting method and $n$ is the number of forecasts.
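Both error measures are straightforward to compute. The sketch below is a minimal Python illustration that takes the error as the difference between an observed value and its forecast; the function names and the sample numbers are hypothetical.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between two equally long sequences."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def mae(actual, forecast):
    """Mean absolute error between two equally long sequences."""
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Made-up volatility proxies and forecasts, for illustration only.
observed = [0.020, 0.035, 0.015, 0.040]
predicted = [0.025, 0.030, 0.020, 0.030]
print(rmse(observed, predicted), mae(observed, predicted))
```

Because the RMSE squares the errors before averaging, a single large miss raises it more than it raises the MAE, which is one reason the two measures are usually reported together.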

Data and Preliminary Analysis
The data used are from the U.S. Energy Information Administration (EIA). To carry out the numerical illustration on crude oil price volatility, this study employs daily spot price data for Brent and West Texas Intermediate (WTI), which are two of the most commonly used benchmark prices for crude oil in the literature. The sample covers the period from January 4, 2009, to December 31, 2019, making a total of 5,549 observations. Logarithmic returns are more analytically tractable when linking subperiod returns to form returns over longer intervals. Thus, in this study, we transform the crude oil price of each series into logarithmic returns. We focus on the behavior of the logarithmic crude oil returns, perform simulations with the Heston model, and apply the Euler-Maruyama numerical method to approximate the model.

Preliminary Analysis
The preliminary analysis shows evidence of significant differences in the price trends of the two markets over the period covered. Likewise, the statistical distribution of the crude oil prices, for WTI and Brent, reveals evidence of positive skewness, which indicates that the right tail is particularly extreme. The returns are nonnormal, with significant positive skewness and excess kurtosis, as expected from daily returns. We used MATLAB for computations and simulations.
With regard to kurtosis, the crude oil price is leptokurtic for both markets, meaning fatter tails than the normal distribution, as shown in Table 1. Specifically, Table 1 presents the descriptive statistics for WTI and Brent crude oil returns, together with the unit root test, normality test, and ARCH effects test. Results show that the mean return for Brent is higher than that for WTI and that the WTI returns are slightly more volatile, as measured by their standard deviation. Moreover, all returns are leptokurtic (i.e., fat-tailed), and hence, the variance in crude oil prices primarily reflects occasional but extreme deviations. Likewise, both return series are positively skewed, which suggests that each series has a longer right tail than left tail.
In addition, Table 1 shows that the WTI and Brent returns display similar statistical characteristics; this phenomenon is well explained by the high degree of integration in world crude oil markets.
The other statistical test results for the crude oil data included in Table 1 are those of the normality test, the unit root test, and the ARCH effects test. For the unit root test, the Augmented Dickey-Fuller (ADF; Dickey & Fuller, 1979) and Phillips-Perron (Phillips & Perron, 1988) tests examine the existence of a unit root. In general, this is a test of the stationarity of the data. The approach tests the hypothesis H0: the series is nonstationary, against H1: the series is stationary. The unit root test results, at both the 0.01 and 0.05 levels, reject the null hypothesis of a unit root in the returns.
Moreover, the ARCH effect test examines the presence of heteroscedasticity in the crude oil data. The test shows that crude oil prices exhibit strong conditional heteroscedasticity, which is common for financial data. Thus, there is enough evidence to reject the null hypothesis of no ARCH effect. Correspondingly, the Jarque-Bera (JB) statistic, which uses information from the skewness and kurtosis to test for normality, shows nonnormality for both markets, hence rejecting the null hypothesis of normality.
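For reference, the Jarque-Bera statistic combines the sample skewness $S$ and kurtosis $K$ of $n$ observations as $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$. A minimal Python sketch using only the standard library is given below; the function name is hypothetical and the return series is made up for illustration, so it does not reproduce the values in Table 1.

```python
def jarque_bera(sample):
    """Jarque-Bera statistic n/6 * (S**2 + (K - 3)**2 / 4) for a list of numbers."""
    n = len(sample)
    mean = sum(sample) / n
    # Central moments with divisor n, as used in the usual JB definition.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Made-up daily returns, for illustration only.
illustrative_returns = [0.012, -0.008, 0.003, -0.051, 0.007, 0.044, -0.002, 0.001]
print(jarque_bera(illustrative_returns))
```

Under the null hypothesis of normality, the statistic is asymptotically chi-square distributed with two degrees of freedom, so large values lead to rejection.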

Approximation of Volatilities and the Model Parameters
Taking $X_t$ as the crude oil price on day $t$, we express the daily spot price return as

$$S_t = \ln\!\left(\frac{X_t}{X_{t-1}}\right),$$

where $S_t$ is the daily crude oil return, $X_t$ is the current day's crude oil price, and $X_{t-1}$ is the previous day's crude oil price. Moreover, the squared daily returns are taken as a proxy for the actual volatilities. By using the transformed series, we can determine the annual and daily volatilities with their respective variances (see Tables 2 and 3). We approximate daily volatilities by applying the Parkinson extreme value method, which is a better estimator of volatility than the traditional volatility measure (Parkinson, 1980). Hence, by using this method, we find the daily and annual crude oil price volatilities ($\sigma$) with their respective variances ($\sigma^2$) for each year.
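The return transformation and the two volatility measures mentioned above can be sketched as follows. This Python example is only an illustration: the function names are hypothetical, the annualization factor of 252 trading days is an assumption (the text does not state the factor used), and the Parkinson estimator assumes that daily high and low prices are available, which the text does not specify.

```python
import math


def log_returns(prices):
    """Daily logarithmic returns ln(X_t / X_{t-1}) from a list of prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def close_to_close_volatility(returns, periods_per_year=252):
    """Sample standard deviation of returns, daily and annualized."""
    n = len(returns)
    mean = sum(returns) / n
    variance = sum((r - mean) ** 2 for r in returns) / (n - 1)
    daily = math.sqrt(variance)
    return daily, daily * math.sqrt(periods_per_year)


def parkinson_volatility(highs, lows, periods_per_year=252):
    """Parkinson (1980) range-based volatility, daily and annualized."""
    n = len(highs)
    sum_sq = sum(math.log(h / l) ** 2 for h, l in zip(highs, lows))
    daily = math.sqrt(sum_sq / (4.0 * n * math.log(2.0)))
    return daily, daily * math.sqrt(periods_per_year)


# Made-up prices, for illustration only.
closes = [60.0, 61.2, 59.8, 60.5, 62.0]
highs = [60.9, 61.8, 60.7, 61.0, 62.4]
lows = [59.5, 60.1, 59.2, 59.9, 60.8]
print(close_to_close_volatility(log_returns(closes)))
print(parkinson_volatility(highs, lows))
```

The range-based estimator uses the intraday high and low rather than only the closing price, which is why it can be more efficient than the close-to-close measure when such data exist.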

Numerical Illustrations
The daily crude oil return observations for the interval between 02.01.2019 and 31.12.2019 are used for the real data application in the simulation; these amount to 250 and 256 days for WTI and Brent, respectively. We describe and forecast crude oil price volatility dynamics with the Heston volatility model. Figures 1 to 6 show the crude oil spot prices, returns, and volatilities for the markets considered. Results show that the behavior of crude oil prices and their returns follows an asymmetrical pattern. That is to say, the trends in returns suggest evidence of volatility clustering. Intuitively, periods of relatively low volatility follow periods of high volatility. The unusual spikes in these figures are evidence of significantly unsteady patterns in crude oil price returns. Figures 1 and 2 present the trend in the crude oil spot price. In particular, Figures 1 and 2 show that during the period from 2014 to 2016, crude oil prices declined sharply, which reveals the heightened uncertainty in the crude oil market. Figures 3 and 4 illustrate the dynamics of crude oil returns for each market. The results imply that some aggregate supply and demand shocks lead to large fluctuations in crude oil markets. Generally speaking, crude oil prices appear to exhibit high uncertainty over time. Thus, considering the essential role played by crude oil in the world economy, the problem of managing oil price risk is of great interest to economists, policymakers, and other market participants. In particular, for oil-producing countries that do not have a well-consolidated and technologically upgraded manufacturing industry, declines in oil income lead either to borrowing or to reduced expenditure on public services.

Simulation Results
We perform simulations with the Heston stochastic volatility model for crude oil data by focusing on the behavior of the logarithmic crude oil price. The Heston model accommodates the fat-tailed properties of daily price return distributions under various market circumstances. Initially, the parameters were computed from the observations from January 2009 to January 2019. To forecast, we used the data from 02.01.2019 to 31.12.2019. That is, the observations for t = 1 to N are used to estimate the parameters, which are then used to forecast observation N + 1. The Euler-Maruyama numerical method is employed in this study to simulate the Heston model. The computed parameters are presented in Table 4. Moreover, the study used the following settings for the simulations: t = 0 (the initial time), T = 1 (the terminal time), n = 100 (the number of discretization points between 0 and T), ∆t = 0.01 (the uniform mesh size), and N = 1,000 (the number of paths). Figures 5 and 6 display the simulated crude oil volatilities for both markets. The simulated results show that crude oil prices tend to attain an asymmetric structure with lower kurtosis.

Error Analysis
To assess the efficiency of the model, we focus on the error defined as $E\,|X(T) - \hat{X}(T)|$ at the point $t = T$, where $\hat{X}(T)$ denotes the Euler-Maruyama approximation of $X(T)$. To examine the strong and weak convergence of the Euler-Maruyama method, we use $N = 2^9$ discretized Brownian paths over $[0, 1]$ and five different step sizes, $\Delta t = 2^{p-1}\,\delta t$ for $1 \le p \le 5$, where $\delta t$ is the step size of the underlying discretized Brownian path. The model with the smallest error is regarded as the most accurate. Table 5 presents the error analysis for the Heston model for both markets, obtained with the Euler-Maruyama method. Similarly, Table 6 gives the overall evaluation of the three GARCH-type models for both markets using two error measures, the RMSE and the MAE. The results in Table 6 indicate that the GARCH model provides a more accurate forecast than the EGARCH and TGARCH models. Comparing Tables 5 and 6, the Heston model approximation generally yields smaller errors than the improved GARCH models. Thus, it can be concluded that the Heston model forecasts crude oil price volatility better than its counterpart models.
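A strong-error experiment of this kind needs a reference value for $X(T)$. Because the Heston model has no simple closed-form path solution, the Python sketch below swaps in geometric Brownian motion, $dX = \lambda X\,dt + \nu X\,dW$, whose exact solution $X(T) = X_0 \exp\bigl((\lambda - \tfrac{1}{2}\nu^2)T + \nu W(T)\bigr)$ is known. It illustrates the procedure only and does not reproduce Table 5. The function name, the coefficients $\lambda = 2$ and $\nu = 1$, the fine step $\delta t = 2^{-9}$, and the number of paths are all assumptions of the sketch.

```python
import math
import random


def em_strong_errors(x0=1.0, drift=2.0, vol=1.0, T=1.0, n_fine=2 ** 9, n_paths=1000, seed=0):
    """Estimate E|X(T) - X_hat(T)| for Euler-Maruyama applied to GBM.

    Returns a list of (step size, mean absolute error at T) for the five
    step sizes 2**(p - 1) * (T / n_fine), p = 1..5.
    """
    rng = random.Random(seed)
    dt_fine = T / n_fine
    multipliers = [2 ** (p - 1) for p in range(1, 6)]
    abs_error_sums = [0.0] * len(multipliers)
    for _ in range(n_paths):
        # One Brownian path on the fine grid, reused for every step size.
        dw = [math.sqrt(dt_fine) * rng.gauss(0.0, 1.0) for _ in range(n_fine)]
        x_exact = x0 * math.exp((drift - 0.5 * vol ** 2) * T + vol * sum(dw))
        for k, r in enumerate(multipliers):
            step = r * dt_fine
            x = x0
            for j in range(0, n_fine, r):
                # Coarse increment = sum of r consecutive fine increments.
                increment = sum(dw[j:j + r])
                x += drift * x * step + vol * x * increment
            abs_error_sums[k] += abs(x - x_exact)
    return [(r * dt_fine, s / n_paths) for r, s in zip(multipliers, abs_error_sums)]


for step, error in em_strong_errors(n_paths=200, seed=1):
    print(step, error)
```

Plotting the estimated errors against the step sizes on log-log axes gives a line whose slope estimates the strong order of convergence.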

Conclusion
In this study, the Heston stochastic volatility model is introduced to forecast crude oil price volatility. The preliminary analysis shows evidence of differences in the price trends of the two markets over the period covered. The Euler-Maruyama method was used to approximate the Heston volatility model. Moreover, the simulation results of the Heston model were compared with the approximation results of the improved GARCH models. The results revealed that the Heston stochastic volatility model can forecast crude oil volatility better than the improved GARCH models. Given the strategic role of crude oil price volatility and its effects on all countries in the world, both oil-producing and non-oil-producing, the availability of different forecasting methods is essential. For future studies, it is suggested to compare the performance of jump-diffusion models with that of stochastic volatility models in modeling the behavior of crude oil price volatility.

Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research and/or authorship of this article.