Forecasting residential building costs in New Zealand using a univariate approach

The construction cost index is widely used to prepare cost estimates, budgets, and bids for construction projects. It can also be regarded as an indicator of cost level, which makes it valuable to public authorities for understanding conditions in the construction industry. Accurate forecasting of the future construction cost index is essential for the construction industry at both the micro and macro levels. To improve the accuracy of cost forecasting, time series modeling techniques are adopted in this study. The performance of exponential smoothing models and seasonal autoregressive integrated moving average (ARIMA) models is compared for forecasting the building cost of five categories of residential building (one-story house, two-story house, town house, apartment, and retirement village building) in New Zealand. Exponential smoothing models produce more accurate forecasts for the cost series of the one-story house and two-story house, while seasonal ARIMA models outperform exponential smoothing models for the cost series of the town house, apartment, and retirement village building. This study contributes to the current state of knowledge on cost index forecasting for New Zealand and provides insights that should be valuable to practitioners.


Introduction
2][3][4] Accurate cost estimates can lead to more reliable bids and help to achieve goals since reliable and reasonable cost forecasting can improve business strategy. Over the past few decades, many projects have suffered huge losses due to the considerable difference between the final project cost and the initial cost estimate. 57][8] Moreover, contractors may suffer high cost variation, insufficient cash flow, and delayed progress associated with high variation in the cost index. Although the cost index increases in the long term, it has short-term variations, which make it difficult to obtain accurate cost estimates.
The construction industry is essential to New Zealand's economy, contributing significantly to business, GDP, and employment. 9 The construction industry is highly integrated across the economy and uses outputs from a number of key industries across New Zealand. 9 This integration means that the construction industry can significantly influence other industries in New Zealand's economy. It is an important sector, and its economic performance and productivity play an important role in the overall health of the economy. 10 Economic growth is reliant on contributions from the construction industry. However, the construction industry is a resource-based sector. If the sector consumes an excessive amount of unrecoverable natural resources, those resources will cost more in the future. Monitoring the construction industry and allocating resources properly are essential to long-term economic development. Given the importance of this industry to the overall economy, proper monitoring of the industry is essential to policy makers and strategy developers at the industry and national levels. 11 However, identifying an appropriate indicator for monitoring the construction industry is difficult. Some studies suggest that productivity may be a suitable indicator of the construction industry. However, according to Harrison 12 and Rojas and Aramvareekul, 13 productivity is an inaccurate indicator of the construction industry since errors exist in the measurements and calculations. Also, Sui 14 used quality to measure the construction industry; however, quality is a qualitative indicator that needs other quantified factors to measure it.
The cost index is an effective tool for capturing trends and provides insight into complex and dynamic environments. 15 The cost index reflects the unit price of basic materials, which makes it a good indicator for comparing input and output. Moreover, compared with economic indicators that only reflect economic activity in the construction industry, 11 the construction cost index also incorporates other factors such as the relationship between demand and supply. Cost index forecasts can help the government plan its investment in order to smooth the boom-bust cycle inherent to the construction industry. For example, countercyclical investment by the government may reduce volatility in the industry and reduce the magnitude of busts. Better planning can support certainty of work and better resource planning in the industry, improving industry performance and productivity.
Given the importance of cost forecasting and of predicting costs accurately, this study uses time series modeling techniques to forecast the building cost of five categories of residential building (one-story house, two-story house, town house, apartment, and retirement village building) in New Zealand. Building cost time series usually exhibit strong trends, which presents challenges in developing useful models. How to model building cost series effectively and how to enhance forecasting performance are still open questions. The two most widely used time series forecasting methods are exponential smoothing and autoregressive integrated moving average (ARIMA). The performance of the two forecasting techniques was evaluated in terms of error measures.
The exponential smoothing method was originally introduced by Holt, 16 Brown, 17 and Winters 18 for short-term sales forecasting in support of supply chain management and production planning. Its widespread use is mainly due to the fact that it is a relatively simple forecasting method that requires only a small sample and has a comprehensible statistical framework and model parameters. Exponential smoothing models are based on the trend and seasonality in a time series, while ARIMA models are intended to describe the autocorrelations in the time series. A framework for exponential smoothing methods was developed based on state-space models. 19 The ARIMA approach is a renowned and widely used linear method. 20 It offers more flexibility by representing various components of a time series, including autoregressive (AR), moving average (MA), and combined AR and MA components. ARIMA can model the underlying changes in cost data to make structural forecasts. 21 It is the most efficient approach for short-term forecasting with rapid changes. ARIMA models predict the future by modeling the serial correlation between the observations of the time series. Predictions from ARIMA models can be explained by previous (lagged) values and stochastic error terms. 22 Time series forecasting techniques such as exponential smoothing models and ARIMA models have not yet been examined for residential building cost forecasting in New Zealand. The original contribution of this study to the existing literature is therefore to evaluate, for the first time, the forecasting performance of these models for residential building cost in New Zealand. Moreover, based on the comparison of the forecasting techniques, industry practitioners can obtain a general understanding of forecasting techniques for building cost and thereby improve forecasting accuracy.
The rest of this study is organized as follows. The second section reviews previous work on forecasting methods for the construction cost index. The third section describes the exponential smoothing method and ARIMA models in detail. Both exponential smoothing and ARIMA models for the cost series of New Zealand are presented in the fourth section. In the fifth section, the forecasting performance of the proposed models is compared based on error measures. The results are discussed in the sixth section. The final section presents the conclusions.

Research background
Two of the most widely used types of forecasting model for the construction cost index have been published in previous works: the causal method and the time series method. 5,21,23 Causal methods usually adopt regression models to estimate construction costs. Trost and Oberlender 24 introduced a mathematical model for investigating the accuracy of early cost estimates by using principal component analysis and regression analysis. Ng et al. 25 integrated regression analysis and ARIMA techniques to predict a tender price index for Hong Kong building projects. Moreover, Hwang 26 used a dynamic model that includes past values of the cost index and explanatory variables to forecast the construction cost index of the United States. Hwang and Liu 27 used a dynamic regression model to examine the relationship between economic conditions in the market and construction cost. Although these methods are effective in incorporating explanatory variables to obtain accurate cost estimates, they have difficulty handling time-varying variables and reflecting time-lag effects. Since many time-related variables are autocorrelated, 28 time-related techniques can be adopted to overcome these limitations.
To address time-related problems, time series techniques, which estimate future values of a variable from its own past values and random shock factors, have been adapted to cost forecasting in construction projects. For example, Fellows 29 used time series models to provide reliable forecasts of building costs, tender prices, and the impacts of economic inflation on building projects. According to Wong et al., 30 it is possible to make accurate predictions based on historical patterns. Several studies have been conducted using the time series method. For example, Ashuri and Lu 21 illustrated a time series method that estimates future values from past values and corresponding random errors and produces a reliable prediction of construction cost. These time series techniques provide systematic and time-related models to forecast future values. Hwang 2 used a time series model to estimate the construction cost index. Xu and Moon 4 employed a vector model to forecast construction cost trends.
Certain artificial intelligence (AI) methods have also been used to forecast the construction cost index. For example, Williams 31 explained a way of applying neural networks to forecast changes in the construction cost index. Kim et al. 32 used three forecasting methods to estimate the construction cost index: multiple regression analysis, neural networks, and case-based reasoning. Cheng et al. 33 developed a hybrid model, the evolutionary least squares support vector machine, to forecast the Taiwanese construction cost index. The results indicated that an AI method can be used effectively to forecast the Taiwanese construction cost index. Also, Cao et al. 34 used a hybrid computation model that included multivariate adaptive regression splines, a radial basis function neural network, and an artificial bee colony to forecast the Taiwanese construction cost index.
The forecasting methods described above generate accurate forecasts of the construction cost index. However, no studies provide an effective tool for forecasting the construction cost index of New Zealand. This study provides time series models that can capture the characteristics of the construction cost index of New Zealand and generate accurate cost forecasts. [38]

Research methods

Data
The building cost index is useful for construction professionals to quantify cost variations. 39 The index can provide information on cost changes caused by a combination of changes in material, labor, and equipment. 27 Hence, the cost index has been used widely in the industry for cost estimation. 3 The cost index provided by QV costbuilder has been accepted in the Architectural, Engineering and Construction (AEC) industry in New Zealand. Since 2000, QV has published a construction cost index for four major cities of New Zealand. For more information about the construction cost index, readers can refer to the QV website. 40 The construction cost index covers costs for a wide range of materials and buildings. QV costbuilder carries out various surveys on construction economics, including construction material, labor, and equipment costs, to provide comprehensive statistical information. As for many other industries and sectors, QV costbuilder compiles historical data to guide construction organizations and industry professionals and to identify cost fluctuations in the construction industry.
This study uses the quarterly construction cost index (CCI) for the period from the first quarter of 2001 to the fourth quarter of 2018. The available data set consists of quarterly cost data over a period of 18 years (72 observations) for five categories of residential buildings in New Zealand. It is usual to separate the data into two sections: in-sample data and out-of-sample data. The in-sample data are used for model fitting, and the out-of-sample data are used to evaluate the forecasting performance of the model. 41 The 72 observations were accordingly split into a training part for model fitting and a testing part for evaluating forecasting performance by comparing forecasts with observations. 42 There is no clear rule for this division; in this study, about 72% of the data (2001: Q1-2013: Q4, 52 observations) were used for model fitting and the remaining 28% (2014: Q1-2018: Q4, 20 observations) were used for out-of-sample forecast evaluation. The quarterly average building costs for the five categories of residential building in New Zealand from 2001: Q1 to 2018: Q4 are depicted in Figure 1.
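The in-sample and out-of-sample split described above can be sketched in Python as follows. This is only an illustration of the bookkeeping: the helper names `quarter_labels` and `split_by_label` are hypothetical, and the actual QV index values are not reproduced here, so the sketch works on quarter labels alone.

```python
def quarter_labels(start_year, end_year):
    """Build labels such as '2001:Q1' for every quarter in the inclusive year range."""
    return [f"{year}:Q{q}" for year in range(start_year, end_year + 1) for q in range(1, 5)]


def split_by_label(labels, last_training_label):
    """Split chronologically: everything up to and including the given label is training data."""
    cut = labels.index(last_training_label) + 1
    return labels[:cut], labels[cut:]


labels = quarter_labels(2001, 2018)
train, test = split_by_label(labels, "2013:Q4")

# Shares of the data in each part, rounded to whole percentages.
train_share = round(100 * len(train) / len(labels))
test_share = round(100 * len(test) / len(labels))
print(len(labels), len(train), len(test), train_share, test_share)
```

The split is chronological rather than random because forecasts must be evaluated on observations that come after the fitting period.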

Exponential smoothing model
Exponential smoothing, which has been in use since the 1950s, is one of the most effective forecasting methods when a time series has a trend that changes over time. 43 It weights the observed time series values unequally: more recently observed values are weighted more heavily than more remote observations. The weights for the observed values decrease exponentially as one moves further into the past. A smoothing constant determines the rate at which the weights of older observed values decrease. Exponential smoothing techniques include simple exponential smoothing, linear trend corrected exponential smoothing, Holt-Winters methods, and damped trend exponential smoothing. 43 According to Hyndman et al., 44 exponential smoothing models have been widely used in many research fields and industry practices because of their relative simplicity and good overall forecasting performance, and because they account for trends, seasonality, and other features of the data. A large number of existing studies also indicate their extensive industrial applications. 45,46 In this study, the Holt-Winters exponential smoothing method was adopted.
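The exponentially decreasing weights can be made concrete with a minimal Python sketch of simple exponential smoothing. The smoothing constant of 0.5 and the four input values are made-up illustrative numbers, not data from this study, and the function names are hypothetical.

```python
def smoothing_weights(alpha, n_lags):
    """Weight placed on the observation k periods back: alpha * (1 - alpha) ** k."""
    return [alpha * (1 - alpha) ** k for k in range(n_lags)]


def simple_exponential_smoothing(series, alpha):
    """Return the smoothed level after each observation, starting from the first value."""
    level = series[0]
    levels = [level]
    for y in series[1:]:
        # New level is a weighted average of the newest observation and the old level.
        level = alpha * y + (1 - alpha) * level
        levels.append(level)
    return levels


# Illustrative values only (assumed, not taken from the cost series).
weights = smoothing_weights(alpha=0.5, n_lags=4)
levels = simple_exponential_smoothing([100.0, 102.0, 101.0, 105.0], alpha=0.5)
print(weights)
print(levels)
```

A larger smoothing constant makes the weights fall off faster, so the smoothed level reacts more quickly to recent changes.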
Holt-Winters method. The Holt-Winters method can be applied to time series data displaying trend and seasonality; it has level and trend smoothing parameters ($\alpha$ and $\beta$) in addition to a seasonal smoothing parameter ($\gamma$). Although there is no strong evidence of seasonality in the time series of residential building costs in New Zealand, the Holt-Winters method is used to evaluate whether including a seasonal parameter improves the model. Holt-Winters methods are designed for time series that exhibit a linear trend and seasonal variation, and they include the additive and the multiplicative Holt-Winters methods. 43 An advantage of these methods is that they model seasonality directly, without first transforming the data to stationarity. If a time series has a linear trend and an additive seasonal pattern, the additive Holt-Winters method is appropriate. The time series can then be described by equation (1), where $b_1$ is the growth rate, $S_t$ is the seasonal pattern, and $E_t$ is the error term. For such a time series, the mean, the growth rate, and the seasonal variation may change over time. A state-space model for these changing components is given in equations (2) to (5). To begin the estimation, initial values for the level, growth rate, and seasonal variation are needed. First, a least squares regression model is fitted to the available data; the regression model is expressed in equation (6), and the initial values $l_0$ and $b_0$ are obtained from it. Estimated values for each time period are then computed from this regression model. The initial seasonal factor for each of the $L$ seasons is calculated with equation (7), where $S_{Li}$ are the seasonal factors, $L$ is the number of seasons in a year, and $i$ is the number of time periods in each season.
After the seasonal factors have been found, the state-space model is used to obtain the smoothing parameters that minimize the sum of squared errors. Future values of the time series are predicted with the state-space model in equation (5).
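The additive procedure just described (regression-based initial values, recursive updating, and a search for the parameters that minimize the sum of squared errors) can be sketched in Python. This is a minimal illustration using the textbook additive Holt-Winters recursions, which may differ in detail from equations (1) to (7); the function names are hypothetical, the coarse grid search is an assumed stand-in for whatever optimizer the authors' software used, and the input series is made up.

```python
def fit_line(y):
    """Ordinary least squares fit of y_t = l0 + b0 * t for t = 1..n."""
    n = len(y)
    t_mean = (n + 1) / 2
    y_mean = sum(y) / n
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y, start=1))
    sxx = sum((t - t_mean) ** 2 for t in range(1, n + 1))
    b0 = sxy / sxx
    return y_mean - b0 * t_mean, b0


def initial_seasonals(y, l0, b0, L):
    """Average detrended value for each of the L seasons, centred to sum to zero."""
    sums, counts = [0.0] * L, [0] * L
    for t, v in enumerate(y, start=1):
        sums[(t - 1) % L] += v - (l0 + b0 * t)
        counts[(t - 1) % L] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    mean = sum(raw) / L
    return [r - mean for r in raw]


def additive_holt_winters(y, L, alpha, beta, gamma, horizon):
    """Run the additive recursions; return (forecasts, sum of squared one-step errors)."""
    level, trend = fit_line(y)
    seasonals = initial_seasonals(y, level, trend, L)
    sse = 0.0
    for t, v in enumerate(y):
        s = t % L
        sse += (v - (level + trend + seasonals[s])) ** 2
        new_level = alpha * (v - seasonals[s]) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        seasonals[s] = gamma * (v - new_level) + (1 - gamma) * seasonals[s]
        level = new_level
    n = len(y)
    forecasts = [level + h * trend + seasonals[(n + h - 1) % L] for h in range(1, horizon + 1)]
    return forecasts, sse


def grid_search(y, L, horizon, grid=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """Assumed coarse search for the (alpha, beta, gamma) that minimizes the SSE."""
    best = None
    for a in grid:
        for b in grid:
            for g in grid:
                forecasts, sse = additive_holt_winters(y, L, a, b, g, horizon)
                if best is None or sse < best[0]:
                    best = (sse, (a, b, g), forecasts)
    return best


# Made-up quarterly series with a pure linear trend (no seasonality).
series = [10.0 + 2.0 * t for t in range(1, 9)]
forecasts, sse = additive_holt_winters(series, L=4, alpha=0.3, beta=0.2, gamma=0.1, horizon=2)
best_sse, best_params, best_forecasts = grid_search(series, L=4, horizon=2)
print(forecasts, sse, best_params)
```

On a purely linear series the fitted line is exact, so the seasonal factors are zero and the forecasts simply continue the line; real cost series would leave nonzero errors for the search to minimize.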
Multiplicative Holt-Winters method. If a time series has a linear trend with multiplicative seasonal variation, the multiplicative Holt-Winters method is appropriate. The state-space model for this method is described in equations (8) to (11), and the seasonal factors are computed with equation (12).

ARIMA model
There are four steps to select an appropriate model for the time series data in the Box-Jenkins approach. 47 The development process of an ARIMA model is shown in Figure 2. Seasonality implies that a pattern repeats itself over a fixed time interval. 48 In this study, the quarterly data have a seasonal period of four quarters. The autocorrelation function (ACF) and partial autocorrelation function (PACF) were used to assess the stationarity of the data set. Seasonal differencing was used to transform the nonstationary seasonal data into a stationary series by taking the difference between the current observation and the corresponding observation from the previous year. A seasonal ARIMA model is shown in equation (13), where $B$ is the backshift operator; $L$ is the number of seasons in a year ($L = 4$ for quarterly data and $L = 12$ for monthly data); $\delta$ is a constant term; $a_t, a_{t-1}, \ldots$ are random shocks; $\phi_1, \phi_2, \ldots, \phi_p$ are nonseasonal AR parameters; $\phi_{1,L}, \phi_{2,L}, \ldots, \phi_{P,L}$ are seasonal AR parameters; $\theta_1, \theta_2, \ldots, \theta_q$ are nonseasonal MA parameters; and $W_{1,L}, W_{2,L}, \ldots, W_{Q,L}$ are seasonal MA parameters.
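For orientation, the general multiplicative seasonal ARIMA model in its standard Box-Jenkins formulation is sketched below. This is the textbook form and is not necessarily written exactly as in equation (13); the capital symbols $\Phi$ and $\Theta$ for the seasonal polynomials, and $d$ and $D$ for the orders of regular and seasonal differencing, are labels assumed here for readability.

```latex
\begin{align*}
\phi_p(B)\,\Phi_P(B^L)\,(1-B)^d\,(1-B^L)^D\, y_t &= \delta + \theta_q(B)\,\Theta_Q(B^L)\, a_t \\
\phi_p(B) &= 1 - \phi_1 B - \phi_2 B^2 - \cdots - \phi_p B^p \\
\Phi_P(B^L) &= 1 - \Phi_1 B^{L} - \Phi_2 B^{2L} - \cdots - \Phi_P B^{PL} \\
\theta_q(B) &= 1 - \theta_1 B - \theta_2 B^2 - \cdots - \theta_q B^q \\
\Theta_Q(B^L) &= 1 - \Theta_1 B^{L} - \Theta_2 B^{2L} - \cdots - \Theta_Q B^{QL}
\end{align*}
```

As a worked instance under these conventions, a model of order $(0,1,1)(0,1,0)_4$ has $d = D = 1$, $L = 4$, and a single nonseasonal MA term, so it reduces to $(1-B)(1-B^4)\,y_t = \delta + (1 - \theta_1 B)\,a_t$.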
Stationarity checking. Classical ARIMA models are usually used to describe stationary time series. Thus, in order to identify an appropriate ARIMA model, the stationarity of the time series should be determined first. If the time series is not stationary, it should be transformed into a stationary series. A time series is stationary if its statistical properties (e.g. the mean and the variance) are essentially constant over time. 43 The differences of the time series values are shown in equation (16).
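The regular and seasonal differencing used to obtain a stationary series can be sketched in a few lines of Python. The helper name `difference` is hypothetical and the quarterly series is made up (a linear trend plus a fixed seasonal pattern), chosen so the effect of each differencing step is easy to see.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag]; lag=1 is regular, lag=4 is seasonal for quarterly data."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


# Made-up quarterly series: linear trend of 2 per quarter plus a repeating seasonal pattern.
seasonal_pattern = [3, -1, 0, -2]
series = [100 + 2 * t + seasonal_pattern[t % 4] for t in range(12)]

seasonal_diff = difference(series, lag=4)     # removes the repeating seasonal pattern
both_diff = difference(seasonal_diff, lag=1)  # removes the remaining trend component
print(seasonal_diff)
print(both_diff)
```

Each differencing step shortens the series by its lag, which is one reason a minimum number of observations is needed before fitting.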
The parameters are usually estimated by the least squares method, in which the model parameters are chosen to minimize the sum of squared errors. The sum of squared errors is computed with equation (19), where $y_t$ is the real value of the time series and $\hat{y}_t$ is the value estimated by the tentative model.

Diagnostic checking.
The fitted models should be checked to confirm that the ARIMA assumptions are satisfied. As a formal test, the Ljung-Box test is usually used to examine whether the autocorrelations of the residuals are statistically different from those expected of a white noise process. If the p value is greater than 0.05, there is no significant autocorrelation in the residuals and the model is adequate. 49
Forecasting error measure. Error measures are widely used for examining the accuracy of a forecasting model and take many forms, including root mean square error (RMSE), 50 mean squared error (MSE), 51 and mean absolute percentage error (MAPE). 52 This study evaluated the accuracy of the forecasts by the MAPE between the actual and predicted values of building cost. The lower the value, the better the forecasting performance of the proposed model. Denote the real observations of the time series by $y_i$ and the forecast values for the same series by $\hat{y}_i$. MAPE is computed with equation (20).

Models for cost series

Holt-Winters models for building cost
Both additive and multiplicative Holt-Winters models were applied to the five cost series. Following the methods outlined in the third section, the model parameters were estimated. The results of the exponential smoothing models for the cost of the five categories of residential building are displayed in Table 1. The p values of the model parameters are less than 0.05, indicating that the parameters are significant and can remain in the model. Moreover, the model fit $R^2$ and error measures including RMSE, MAPE, and mean absolute error (MAE) were also generated. In addition, the model parsimony measure, the Bayesian information criterion (BIC), was obtained. The results are shown in Table 3. The BIC values for the exponential smoothing models are similar, which indicates that the models are parsimonious. The error measures indicate that Holt-Winters models fit the cost series fairly well.

Seasonal ARIMA models
Model selection. Box et al. 20 suggested that a time series with at least 30 observations is required to implement the ARIMA method properly. In this study, for each cost series, a total of 52 observations from 2001: Q1 to 2013: Q4 were used to obtain the proposed models. ACF and PACF were used for the stationarity analysis of the five cost series; the results are shown in Figure 3. The ACF and PACF graphs for the five building cost series show that the ACFs decay very slowly at both nonseasonal and seasonal lags. For each cost series, the appropriate order of differencing should be determined. It is therefore reasonable to transform four of the cost series to stationary series by taking four-quarter (seasonal) differences to remove seasonality and regular differences to remove trends. The exception is the cost series for the two-story house in New Zealand, which required only regular differencing to become stationary. After the differencing, the ACFs and PACFs for the five cost series are shown in Figure 4. The seasonal ARIMA models for the five cost series are shown in Table 2.
Based on the approach provided by Box et al., 47 the model parameters, model fit, and error measures were estimated for the five cost series. In order to select proper seasonal ARIMA models, different models with various combinations of regular orders (p and q) and seasonal orders (P and Q) were evaluated. The model parameters of the five cost series are presented in Table 2. Furthermore, the model fit and error measures of the ARIMA models are provided in Table 3. As seen from Table 3, the values of $R^2$ indicate that the ARIMA models fit the cost data fairly well.
Model validation. The Ljung-Box Q test was employed to examine the autocorrelation of the model residuals. If the p value is greater than 0.05, the null hypothesis that the residuals are not autocorrelated cannot be rejected. 20 To examine the normality of the residuals, the Shapiro-Wilk test was applied. If the p value of the test is greater than 0.05, there is no evidence to reject the null hypothesis that the data follow a normal distribution. 53 As seen from Table 3, the residuals of all the models pass the tests, indicating that the proposed models are adequate.
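The Ljung-Box Q statistic used in this check can be computed directly from the residual autocorrelations, as the following minimal Python sketch shows. The function names are hypothetical and the residuals are a made-up alternating sequence, deliberately chosen to be strongly autocorrelated. The sketch computes only the statistic; the p value comes from a chi-square distribution, for which a statistics library would normally be used, and for residuals of a fitted ARIMA model the degrees of freedom are usually reduced by the number of estimated AR and MA parameters.

```python
def autocorrelation(residuals, lag):
    """Sample autocorrelation of the residuals at the given lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    denom = sum((e - mean) ** 2 for e in residuals)
    numer = sum((residuals[t] - mean) * (residuals[t - lag] - mean) for t in range(lag, n))
    return numer / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic: Q = n (n + 2) * sum over k of r_k^2 / (n - k)."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Made-up residuals that flip sign every period (strong negative lag-1 autocorrelation).
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
r1 = autocorrelation(residuals, 1)
q_stat = ljung_box_q(residuals, max_lag=1)

# The 5% critical value of a chi-square distribution with 1 degree of freedom is about 3.841,
# so a Q statistic above it rejects the white noise hypothesis for this toy sequence.
print(r1, q_stat, q_stat > 3.841)
```

For well-behaved model residuals the statistic would fall below the critical value, which corresponds to the p value above 0.05 reported for the fitted models.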

Comparisons of the models in the out-of-sample period
Although a model can fit the data fairly well, this does not mean that it will produce better forecasts. 54 The forecasting accuracy of a model is affected by many factors, such as the number of observations in the time series, the number of forecast time origins examined, and the number of forecast lead times considered. 55 The forecasting performance of the univariate methods was evaluated by MAPE statistics. The AR1 results presented in Table 4 suggest that the exponential smoothing models generate better results than the seasonal ARIMA models. In particular, the additive Holt-Winters model produces better forecasts based on the MAPE measurement. For AR2, the additive Holt-Winters model also outperforms the other models based on the MAPE measurement. For AR3, the ARIMA model has the best forecasting performance among the four proposed models. The AR4 results suggest that ARIMA(0,1,1)(0,1,0)$_4$ is the best forecasting model. For AR5, ARIMA(0,1,0)(0,1,1)$_4$ produced the best forecasts among the proposed models. These results therefore suggest that the ARIMA approach is better than the exponential smoothing method for the building cost of the town house, apartment, and retirement village in New Zealand, while exponential smoothing models are superior in cost forecasting for both the one-story house and the two-story house. This outcome may be due to the relative stability of the cost series for the one-story house and the two-story house.
From the above results, it can be seen that both the exponential smoothing method and the ARIMA approach can produce good forecasts for residential building cost in New Zealand. Which method is better depends on the characteristics of the data. For example, the ARIMA approach produces better forecasts for the building costs of the town house, apartment, and retirement village, which indicates that these costs have a random walk characteristic. The MAPE values of the proposed models for all five cost series are presented in Table 4. Bold type identifies the lowest MAPE among the proposed models for each cost series. As the results show, no single forecasting method is better for all data series. This confirms the generally accepted idea that no individual forecasting approach can describe all situations. 56
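The MAPE-based comparison can be sketched in Python as follows. The MAPE is written in its usual form, the mean of the absolute percentage errors, which is assumed to match equation (20). The function names are hypothetical, and the actual values and the two sets of forecasts are made-up numbers used only to show the mechanics; they are not results from Table 4.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def best_model(actual, forecasts_by_model):
    """Score every model by MAPE and return the name of the lowest-scoring one with all scores."""
    scores = {name: mape(actual, fc) for name, fc in forecasts_by_model.items()}
    return min(scores, key=scores.get), scores


# Illustrative out-of-sample values and forecasts (assumed, not from the study).
actual = [100.0, 200.0]
candidates = {
    "model_a": [110.0, 190.0],
    "model_b": [100.0, 180.0],
}
winner, scores = best_model(actual, candidates)
print(winner, scores)
```

Because MAPE is scale-free, it allows the same comparison to be repeated for each of the five cost series even though their cost levels differ.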

Results and discussion
During 2001: Q1-2018: Q4, residential building cost followed an increasing trend. The exponential smoothing approach and the ARIMA technique are both effective time series forecasting methods, as both describe trend movement in the time series fairly well, but each has strengths and weaknesses. For example, the ARIMA approach is more readily extended to model interventions, outliers, variations, and variance changes in time series, but it is a relatively sophisticated technique. Because of different data patterns and limited sample size, it would be unfair to try to determine whether one time series forecasting method is better than the other. Therefore, both the exponential smoothing method and the ARIMA approach should be given a chance to demonstrate their maximum potential in any empirical case study. While the exponential smoothing method is based on describing the trend and seasonality in the time series, the ARIMA approach focuses on describing the autocorrelation in the data. There is a view that the ARIMA approach is more advanced than the exponential smoothing method since the former has fewer parameters to be estimated. Although ARIMA models are more general, exponential smoothing models can provide a framework that is sufficient to capture the dynamics in the data series. ARIMA models are excellent for short-term forecasting. When they are used for long-term forecasting, the models need to be refitted with updated data incorporated into the model training process. The exponential smoothing method can be very competitive because it automatically incorporates updated information into the model, and it can then produce better long-term forecasts. An advantage of the ARIMA technique is that only a few parameters need to be estimated to generate good forecasting results. However, extreme values in the data set are difficult for ARIMA models to estimate because of the univariate nature of the model and the lack of a specific ability to simulate unexpected events.
Table 2 (fragment). ARIMA(0,1,0)(0,1,1)$_L$: SMA(1) = 0.554; ARIMA(0,1,0)(0,1,0)$_L$. ARIMA: autoregressive integrated moving average; AR: autoregressive; MA: moving average; SAR: seasonal autoregressive; SMA: seasonal moving average.
In this study, the exponential smoothing method is an effective tool for forecasting future values of building cost for the one-story house and two-story house, while seasonal ARIMA models produce more accurate forecasts for the building cost series of the town house, apartment, and retirement village in New Zealand. In practice, the fluctuations in building costs differ across the five categories of residential building in New Zealand. For example, the building cost series for the one-story house and two-story house show a very consistent repetitive pattern, while current observations of the cost series for the town house, apartment, and retirement village are affected by previous observations.

Conclusions
In this study, quarterly building cost data (over an 18-year range from 2001: Q1 to 2018: Q4) for five categories of residential building (one-story house, two-story house, town house, apartment, and retirement village building) in New Zealand were analyzed and the major characteristics of these data were explored. It was found that time series data of residential building costs are nonstationary and autocorrelated and do not display a very strong seasonal pattern. Based on the identified characteristics, two time series forecasting techniques, the exponential smoothing method and the ARIMA approach, were adopted to take into account variations of residential building costs in predicting their future trends. It was concluded that both methods can produce reliable forecasts. The analysis of model residuals showed that the underlying modeling assumptions hold. For a given data set, it is almost impossible to know in advance which forecasting method will perform better than others. This is supported by the generally accepted principle that no forecasting approach is better for all situations under all circumstances. From this point of view, the empirical findings of this study suggest that exponential smoothing models can be confidently used to forecast building cost for one-story and two-story houses in New Zealand, while seasonal ARIMA models can produce more accurate cost forecasts for town house, apartment, and retirement village buildings in New Zealand.
The findings of this study can help construction professionals prepare more accurate cost estimates, improve project cost management, and reduce the risk of cost variations in residential building projects. Residential projects of all types can benefit from the proposed models, which improve the accuracy of cost estimates. The study also benefits stakeholders and organizations by improving their understanding of cost management. Moreover, the results can guide policy makers in formulating strategic plans by providing an effective monitoring indicator for the construction industry. Additionally, the forecasting models provide an initial foundation for cost index estimates in New Zealand, which may serve as a basis for researchers to forecast other cost indices in New Zealand.
It is important to emphasize that residential building costs in New Zealand are difficult to forecast because many factors impose pressures on them. Future work may include an analysis of the predictors affecting building cost specifically in New Zealand, the application of these forecasting techniques to other building costs such as those of commercial and industrial buildings, and an analysis of the evolution of building cost by economic sector. Future work could also investigate whether combined forecasting methods and multivariate methods can improve the accuracy of cost forecasting. Although this study used QV's residential building cost index, the proposed methods can be used for similar data sets in other cities as well as globally. Moreover, although the models can provide accurate cost forecasts, the specific features of a construction project should be thoroughly examined in order to generate accurate cost estimates.

Figure 1. Building cost time series for five categories of residential building in New Zealand.

Table 1. Estimated parameter values with significance tests for exponential smoothing models.

Table 2. Estimated parameter values for seasonal ARIMA models.

Table 3. Model fit statistics and residual statistics.

Table 4. Forecast values for building cost of one-story house in New Zealand.
ARIMA: autoregressive integrated moving average; MAPE: mean absolute percentage error; AHW: additive Holt-Winters; MHW: multiplicative Holt-Winters. The bold-faced values indicate the minimal MAPE of the four models for each cost series.