ARIMA-Based Forecasting of S&P BSE SENSEX Returns

Investment in the stock market requires a delicate balance between profitability and risk management, with risk aversion playing a vital role. This study explores the ARIMA forecasting method to predict S&P BSE SENSEX returns, providing valuable insights for investors and financial experts. Using a 3-year dataset, the ARIMA (3,1,1) model was identified as the optimal choice. Diagnostic checks confirmed its reliability, ensuring unbiased and accurate forecasts. In static forecasting, the model exhibited high-quality performance with low error rates. Dynamic forecasting further revealed precision in predicting future values. While the ARIMA model aids in making informed financial decisions, it's crucial to acknowledge its limitations. This research contributes to the understanding of stock market forecasting methodologies, benefiting investors and analysts in navigating this dynamic landscape.


I. INTRODUCTION
Investment in the stock market requires an optimal balance between profitability and risk. To achieve this, a comprehensive understanding of market dynamics, particularly of risk prediction and management, is indispensable. Risk aversion matters to a range of stakeholders, including investors, policymakers, researchers, and financial experts, because of its impact on portfolio diversification and market stability.
The interplay between investment returns and risks exerts a significant influence on decision-making. Successful investors strive to turn each action into substantial returns by relying on effective and rational strategies, as underscored by Kaufman (1995). Empirical research offers evidence of a positive correlation between stock markets and economic growth (Guptha & Rao, 2018; Kim et al., 2011; Mallikarjuna & Rao, 2019), which underscores the role of investment decisions in attaining desired financial outcomes.
Nevertheless, stock markets are inherently dynamic, intricate, and volatile. Predicting stock prices and returns in such an environment poses a formidable challenge. In this context, this study delves into ARIMA (Auto-Regressive Integrated Moving Average) forecasting, an analytical tool that offers insights into the future movements of the S&P BSE SENSEX and thereby assists investors as they navigate the ever-changing landscape of the stock market.

II. LITERATURE REVIEW
Neely et al. (2014) emphasized the significance of employing technical indicators to forecast stock returns, illustrating their relevance from both an economic and a statistical perspective. These findings align with a wider body of research on the predictability of stock returns, as demonstrated by the studies of Zhu & Zhu (2013) and Jiahan & Ilias (2017).
Chari & Henry (2004) provided insights into the reduction of systematic risk in stock market liberalizations. They argued that, as the global market assumes the role of the primary source of systematic risk, these liberalizations introduce an exogenous change that allows theoretical predictions to be tested.

IV. DATA AND RESEARCH METHODOLOGY
The ARIMA methodology, also known as the Box-Jenkins approach, proceeds through four sequential stages: identification, estimation, diagnostic checking, and forecasting. The data were gathered from the official BSE website and cover a span of 2 years, 11 months and 4 weeks, comprising 744 trading-day observations. The study analyses the closing value of the S&P BSE SENSEX. The time series analysis was carried out using EViews software, version 10.

Identification
When constructing an ARIMA model, the first step is to evaluate the stationarity of the data using informal techniques such as graphs and correlograms, as well as formal tests such as the ADF and PP tests. If non-stationarity is detected, the data are transformed to remove underlying trend patterns. Once stationarity is achieved, potential models are identified using ACF and PACF plots. The PACF helps select the AR component, while the ACF guides the selection of the MA component. Model orders can be inferred from the lags whose values exceed the confidence band on the plot. However, it is important to choose a parsimonious model to avoid unnecessary complexity.
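As a rough illustration of the correlogram check described above, the following sketch computes sample autocorrelations and the approximate 95% confidence band. It uses hypothetical simulated data (a random walk and its first difference) rather than the SENSEX series, and the function name `sample_acf` is introduced here for illustration only.

```python
import math
import random


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a sequence of floats."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


# Hypothetical data: a random walk (non-stationary) and its first difference.
random.seed(0)
walk, level = [], 0.0
for _ in range(744):
    level += random.gauss(0.0, 1.0)
    walk.append(level)
diff = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

# Approximate 95% band for the autocorrelations of white noise.
band = 1.96 / math.sqrt(len(diff))

acf_level = sample_acf(walk, 10)
acf_diff = sample_acf(diff, 10)
print("level:", [round(r, 2) for r in acf_level])
print("diff: ", [round(r, 2) for r in acf_diff])
print("band: +/-", round(band, 3))
```

In a sketch like this, the autocorrelations of the level series stay close to one and decay slowly, while those of the differenced series mostly fall inside the band; lags that exceed the band are the candidates for the AR and MA orders.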

Estimation
Among the candidate ARIMA models, six criteria are evaluated to determine the most suitable one: significance of the coefficients, SIGMASQ, adjusted R², the Akaike Information Criterion (AIC), the Schwarz Information Criterion (SIC), and the Hannan-Quinn Criterion (HQC).
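The three information criteria can be sketched as follows. This assumes the per-observation definitions commonly reported in EViews-style output (each criterion divided by the number of observations); the candidate names, log-likelihoods and parameter counts are hypothetical and are not taken from the study.

```python
import math


def information_criteria(loglik, n_params, n_obs):
    """Per-observation AIC, SIC and HQC (smaller is better)."""
    base = -2.0 * loglik / n_obs
    return {
        "AIC": base + 2.0 * n_params / n_obs,
        "SIC": base + n_params * math.log(n_obs) / n_obs,
        "HQC": base + 2.0 * n_params * math.log(math.log(n_obs)) / n_obs,
    }


# Hypothetical (log-likelihood, number of estimated parameters) pairs.
candidates = {
    "model_A": (2400.0, 4),
    "model_B": (2404.0, 4),
    "model_C": (2404.5, 6),
}
n_obs = 743  # assumed sample size: 744 observations minus one lost to differencing

scores = {
    name: information_criteria(ll, k, n_obs)
    for name, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])
print(best_by_aic, scores[best_by_aic])
```

Because the SIC penalty per parameter grows with the sample size, it tends to favour more parsimonious models than the AIC, which is one reason for reporting all three criteria side by side.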

Diagnostic Check
The diagnostic phase of the Box-Jenkins method focuses on three aspects. Firstly, the residuals of the chosen model must be free of autocorrelation, which is checked by examining the Ljung-Box Q-statistic. Secondly, the residuals of the time series regression must be stationary; non-stationary residuals imply an unreliable model and the potential for misinterpretation of results. Thirdly, the stability of the ARIMA model is ascertained by checking two conditions: a) the estimated model is covariance stationary, which is indicated by the inverse AR roots lying within the unit circle; and b) the estimated process is invertible, which is indicated by the inverse MA roots lying inside the unit circle. If these diagnostic assumptions are not met, a more appropriate model must be sought by engaging in overfitting, which involves adding parameters to the AR or MA components of the model.

Forecasting
Once the model has passed the diagnostic checks and has been confirmed, it is employed for forecasting.

V. RESULTS AND ANALYSIS

Identification
An initial plot of the data shows that the time series exhibits an overall positive trend (Figure 1). Figure 2 further shows that the ACF decays gradually and almost linearly, indicating that the series is non-stationary at level. Unit root tests are therefore performed. The hypotheses for testing the stationarity of the S&P BSE SENSEX series using the ADF and PP tests are:
H0: The time series has a unit root, i.e. it is not stationary.
H1: The time series does not have a unit root, i.e. it is stationary.
Table 1 shows that the p-value exceeds the 0.05 significance level for both tests under the test equation labelled "None". Consequently, the null hypothesis cannot be rejected, signifying that the data are non-stationary at level. To address this, the series is first transformed into logarithms and then differenced (DLCLOSE). Figure 3 indicates that the transformed series achieves weak stationarity, so it is appropriate to proceed with the estimation of a model.
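The log-difference transformation that produces DLCLOSE can be sketched in a few lines. The closing values below are hypothetical and serve only to show the arithmetic; the function name `log_returns` is likewise an illustrative choice.

```python
import math


def log_returns(closes):
    """First difference of log prices: log(P_t) - log(P_{t-1})."""
    return [
        math.log(closes[t]) - math.log(closes[t - 1])
        for t in range(1, len(closes))
    ]


# Hypothetical closing values, for illustration only (not the study's data).
closes = [38000.0, 38500.0, 38200.0, 39000.0]
dlclose = log_returns(closes)
print([round(r, 5) for r in dlclose])
```

The transformed series has one observation fewer than the original, and its values sum to the log of the ratio of the last to the first closing value, which is why it is interpreted as a series of continuously compounded returns.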

Estimation
In selecting the most suitable ARIMA model from the set of candidates, significant coefficients (p-values below 0.05) are sought for both the AR and MA terms, ensuring that the included terms are statistically significant. A lower SIGMASQ, the estimated variance of the residuals, is also desired, since it indicates less unexplained variation. The adjusted R² should be as high as possible, indicating a better fit of the model to the data. Finally, the AIC, SIC, and HQC should be as low as possible. Together, these criteria guide the selection of the ARIMA model that best suits the analysis.

Diagnostic Check
After selecting the best model, it is important to ensure that it meets the criteria for accurate forecasting. Two important elements come into focus during the diagnostic phase of the Box-Jenkins method:

I. Absence of Autocorrelation
The model's residuals should exhibit the characteristics of white noise, free from any discernible pattern of autocorrelation. This is tested with the Ljung-Box Q-statistic.
H0: The data (here, the residuals) are independently distributed.
H1: The data are not independently distributed.
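The Ljung-Box Q-statistic can be computed directly from the residual autocorrelations as Q = n(n+2) Σ r_k² / (n − k), summed over lags k = 1..m. The sketch below uses hypothetical simulated residuals; the function name `ljung_box_q` and the choice of 12 lags are assumptions made for illustration.

```python
import random


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q-statistic over lags 1..max_lag."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    denom = sum(d * d for d in dev)
    total = 0.0
    for k in range(1, max_lag + 1):
        # Sample autocorrelation of the residuals at lag k.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k * r_k / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals drawn as white noise, for illustration only.
random.seed(1)
resid = [random.gauss(0.0, 0.01) for _ in range(743)]
q_stat = ljung_box_q(resid, 12)
print(round(q_stat, 3))
```

Under H0, Q is approximately chi-square distributed; for residuals of an ARMA(p, q) fit the degrees of freedom are usually taken as the number of lags minus p + q, and H0 is retained when the corresponding p-value exceeds 0.05.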

II. Stationarity Check of the Residuals
Evaluating the stationarity of the residuals in a time series regression model is crucial for confirming the model's credibility and the dependability of its results. Non-stationary residuals indicate potential issues with the model's validity and raise the risk of misinterpreting the findings.
H0: The residual series has a unit root, i.e. it is non-stationary.
H1: The residual series has no unit root, i.e. it is stationary.
Table 5 shows that all p-values are lower than the significance level of 0.05, resulting in the rejection of H0. Consequently, the residuals are stationary at level, supporting the model's validity, the reliability of its forecasts, and the accuracy of statistical inference.

Forecasting

I) Static Forecasting within the Sample
The ARIMA (3,1,1) model has been used to forecast the closing price returns of the S&P BSE SENSEX from October 01, 2020, to September 29, 2023. Figure 7 shows the forecast and actual values with a confidence interval, together with the forecast performance metrics. The Theil inequality coefficient (0.000433) is very close to zero, which points to a high-quality fit, and Figure 7 shows a strong alignment between actual and forecasted values.
An ideal forecast should be unbiased, accurate, and free from systematic error. The bias proportion (0.000013) in Figure 7 is close to zero, indicating that the mean of the forecasts is very close to the mean of the actual series and that there is no material systematic bias.
The variance proportion (0.0003399) in Figure 7 is also close to zero, indicating that the variability of the forecasts closely matches that of the actual series. The covariance proportion (0.996508) is close to one, meaning that almost all of the remaining forecast error is unsystematic, which reflects satisfactory performance.
With satisfactory bias and variance proportion values, the model is suitable for forecasting. The Mean Absolute Percentage Error (MAPE) of 0.06% indicates a level of accuracy commonly considered acceptable in practical scenarios.
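The evaluation statistics discussed above can be sketched as follows. The definitions are the standard ones for the Theil inequality coefficient and its bias, variance and covariance proportions (assumed here to match what EViews reports); the data and the function name `forecast_metrics` are hypothetical.

```python
import math


def forecast_metrics(actual, forecast):
    """Forecast evaluation statistics for two equal-length sequences.

    Assumes no actual value is zero (MAPE) and that the forecast is not
    perfect (the proportions divide by the mean squared error).
    """
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / a) for e, a in zip(errors, actual)) / n
    theil = rmse / (
        math.sqrt(sum(f * f for f in forecast) / n)
        + math.sqrt(sum(a * a for a in actual) / n)
    )
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    # Population (divide-by-n) standard deviations and covariance.
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    sd_f = math.sqrt(sum((f - mean_f) ** 2 for f in forecast) / n)
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast)) / n
    return {
        "RMSE": rmse,
        "MAE": mae,
        "MAPE": mape,
        "Theil": theil,
        "bias": (mean_f - mean_a) ** 2 / mse,
        "variance": (sd_f - sd_a) ** 2 / mse,
        "covariance": 2.0 * (sd_a * sd_f - cov) / mse,
    }


# Hypothetical actual and forecast values, for illustration only.
actual = [100.0, 102.0, 101.0, 105.0]
forecast = [101.0, 101.0, 102.0, 104.0]
metrics = forecast_metrics(actual, forecast)
for name, value in metrics.items():
    print(name, round(value, 6))
```

By construction the three proportions sum to one, so a forecast with small bias and variance proportions necessarily has a covariance proportion close to one.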

II) Dynamic Forecasting Out of the Sample
The process of dynamic forecasting begins by determining the period for future predictions. A 29-day horizon was chosen for this analysis. The ARIMA (3,1,1) model is used to generate the forecasts, along with 95% confidence intervals to account for uncertainty. Figure 8 shows a visual representation of the forecasts, with the shaded region indicating the confidence interval.
The RMSE and MAE metrics indicate that the forecast errors are minimal. The MAPE shows a relative error of 0.043%, demonstrating precision. The Theil inequality coefficient is also very low, indicating high-quality performance.
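The difference between static and dynamic forecasting lies in what is fed back into the model: a dynamic forecast replaces unknown future values with earlier forecasts and sets future shocks to their expected value of zero. The sketch below illustrates this recursion using the coefficients of the fitted equation reported with Table 4, taken as written there (constant, third AR lag and first MA lag). The starting values and the function name `dynamic_forecast` are hypothetical.

```python
def dynamic_forecast(history, last_resid, horizon,
                     c=0.000714, phi3=-0.067677, alpha=-0.073156):
    """Multi-step forecasts of Y_t = c + phi3*Y_{t-3} + alpha*U_{t-1} + U_t.

    history:    at least the last three observed values of the differenced series
    last_resid: the last in-sample residual U_T
    """
    path = list(history[-3:])
    forecasts = []
    resid = last_resid
    for _ in range(horizon):
        y_hat = c + phi3 * path[-3] + alpha * resid
        forecasts.append(y_hat)
        path.append(y_hat)  # later steps reuse earlier forecasts
        resid = 0.0         # future shocks have expected value zero
    return forecasts


# Hypothetical last three returns and last residual, for illustration only.
history = [0.010, -0.005, 0.002]
fc = dynamic_forecast(history, last_resid=0.001, horizon=29)
print([round(v, 6) for v in fc[:5]])
print(round(fc[-1], 6))
```

As the horizon grows, the forecasts in this sketch settle towards the constant level c / (1 − phi3), which is typical of dynamic ARMA forecasts: the information in the most recent observations fades and the forecast reverts to the long-run mean.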

VI. CONCLUSION
The ARIMA methodology, also known as Box-Jenkins, is widely employed for stock market forecasting. It helps traders, investors, portfolio managers, and financial institutions to build financial models, manage risks, and make well-informed decisions. This study focused on forecasting S&P BSE SENSEX returns using ARIMA, for which an ARIMA (3,1,1) model was adopted. The diagnostics, stability, and forecasting performance of the model were assessed and consistently produced positive results. While ARIMA does not guarantee profits, it does provide useful information for decision-making. Other methodologies, such as GARCH models, can be useful for volatility modelling and forecasting.

Figure 1: Graphical representation of close value of S&P BSE SENSEX

Figure 2: Correlogram of close value of S&P BSE SENSEX

Figure 3: Graphical representation of DLCLOSE of S&P BSE SENSEX Returns

Next, the correlogram of the log-differenced series is examined in order to ascertain the values of p and q for possible models. The Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) provide insights into the potential models.

Figure 4: Correlogram of close value of log-differenced S&P BSE SENSEX

Figure 5: Correlogram of residuals

In assessing the stability condition of the ARIMA model, two aspects are examined: a) covariance stationarity, for which the inverse AR roots should lie within the unit circle; and b) invertibility, for which the inverse MA roots should also lie inside the unit circle.
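The two stability conditions can be checked numerically by computing the inverse roots of the AR and MA lag polynomials. The sketch below assumes NumPy is available and uses the coefficients of the fitted equation reported with Table 4, taken as written there (only the third AR lag and the first MA lag are non-zero); the function name `inverse_roots` is an illustrative choice.

```python
import numpy as np


def inverse_roots(ar_coefs, ma_coefs):
    """Inverse roots of the AR and MA lag polynomials.

    AR polynomial: 1 - phi_1 z - ... - phi_p z^p
    MA polynomial: 1 + alpha_1 z + ... + alpha_q z^q
    The inverse roots w = 1/z solve the reversed polynomials.
    """
    ar_inv = np.roots([1.0] + [-phi for phi in ar_coefs])
    ma_inv = np.roots([1.0] + list(ma_coefs))
    return ar_inv, ma_inv


# Coefficients as written in the fitted equation (phi_1 = phi_2 = 0 assumed).
ar_inv, ma_inv = inverse_roots([0.0, 0.0, -0.067677], [-0.073156])
ar_moduli = np.abs(ar_inv)
ma_moduli = np.abs(ma_inv)

print("inverse AR root moduli:", np.round(ar_moduli, 4))
print("inverse MA root moduli:", np.round(ma_moduli, 4))
print("covariance stationary:", bool(np.all(ar_moduli < 1.0)))
print("invertible:", bool(np.all(ma_moduli < 1.0)))
```

With these coefficients all inverse roots have modulus well below one, which is consistent with the stability conclusion reported in the text.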

Figure 8 visually represents the dynamic forecasts by overlaying actual historical values with forecasted values. The shaded region within the plot represents the 95% confidence interval. This approach allows for assessing the model's performance and understanding forecasted trends and variations. The data and forecast information from September 29, 2023 to October 28, 2023 are presented in ANNEXURE A1, using the ARIMA (3,1,1) model.

Table 1: Stationarity of the data set

Table 2: Stationarity of the data set (Unit Root Test at 1st Difference)

Table 3: Evolution of the Best Fit Model

The ARIMA (3,1,1) model has been identified as the optimal choice, with further details provided in Table 4.

Table 4: ARIMA (3,1,1) model

The model can be written as:

$$Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \phi_3 Y_{t-3} + \alpha U_{t-1} + U_t \qquad (1)$$

where $Y_t$ is the value of the differenced series at time $t$, $c$ is a constant (intercept), $\phi_1$, $\phi_2$ and $\phi_3$ are the autoregressive coefficients corresponding to the lagged values $Y_{t-1}$, $Y_{t-2}$ and $Y_{t-3}$ respectively, $\alpha$ is the moving average (MA) coefficient, and $U_t$ is a white noise error term at time $t$.

Applying the ARIMA (3,1,1) coefficients from Table 4, the model takes the following form:

$$\mathrm{DLCLOSE}_t = 0.000714 - 0.067677\,Y_{t-3} - 0.073156\,U_{t-1} \qquad (2)$$

Table 5: Stationarity of the Residuals