1. Mathematical Formulation of a GARCH(1,1) Model
The Generalized Autoregressive Conditional Heteroskedasticity (GARCH) model, introduced by Bollerslev (1986), extends Engle's (1982) ARCH model to capture time-varying volatility in financial returns. A GARCH(1,1) model is specified by two main equations:
Mean Equation:
$r_t = \mu + \epsilon_t$
where:
- $r_t$ is the return of the equity index at time $t$.
- $\mu$ is the conditional mean return (often assumed to be zero or modeled with an ARMA process for financial returns).
- $\epsilon_t$ is the error term, which is conditionally heteroskedastic, meaning its variance changes over time.
Variance Equation:
$\sigma_t^2 = \omega + \alpha_1 \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$
where:
- $\sigma_t^2$ is the conditional variance of the error term $\epsilon_t$ at time $t$. This is the quantity we want to forecast.
- $\omega$ (omega) is a constant term (intercept). It is not the long-run variance itself; the long-run (unconditional) variance is $\omega / (1 - \alpha_1 - \beta_1)$.
- $\alpha_1$ (alpha) is the ARCH term coefficient, which measures the impact of the previous period's squared error (news about volatility) on the current conditional variance. It captures the short-run persistence of volatility.
- $\epsilon_{t-1}^2$ is the squared error (or squared residual) from the previous period, representing the impact of past "shocks" on current volatility.
- $\beta_1$ (beta) is the GARCH term coefficient, which measures the impact of the previous period's conditional variance on the current conditional variance. It captures the long-run persistence of volatility.
- $\sigma_{t-1}^2$ is the conditional variance from the previous period.
For the variance to be positive and the process to be stationary (mean-reverting volatility), the following conditions typically apply:
- $\omega > 0$
- $\alpha_1 \ge 0$
- $\beta_1 \ge 0$
- $\alpha_1 + \beta_1 < 1$ (for weak stationarity of the variance process)
The error term $\epsilon_t$ is typically assumed to be conditionally normally distributed, i.e., $\epsilon_t | \mathcal{F}_{t-1} \sim N(0, \sigma_t^2)$, where $\mathcal{F}_{t-1}$ represents the information set available up to time $t-1$. However, other distributions like Student's t-distribution or the Generalized Error Distribution (GED) are often used to account for fat tails observed in financial data.
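These two equations can be simulated directly, which is a useful sanity check on the specification. A minimal NumPy sketch with illustrative parameter values (not estimates from any data):

```python
import numpy as np

def simulate_garch_1_1(n, mu=0.0, omega=0.05, alpha=0.1, beta=0.85, seed=0):
    """Simulate returns from a GARCH(1,1) with conditionally normal errors."""
    rng = np.random.default_rng(seed)
    # Start the recursion at the unconditional (long-run) variance
    long_run_var = omega / (1.0 - alpha - beta)
    sigma2 = np.empty(n)
    eps = np.empty(n)
    sigma2[0] = long_run_var
    eps[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
    for t in range(1, n):
        # Variance equation: sigma_t^2 = omega + alpha*eps_{t-1}^2 + beta*sigma_{t-1}^2
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
        eps[t] = np.sqrt(sigma2[t]) * rng.standard_normal()
    return mu + eps, sigma2

returns, sigma2 = simulate_garch_1_1(10_000)
# With alpha + beta < 1, the sample variance should hover near omega / (1 - alpha - beta)
```

The simulated series exhibits the volatility clustering discussed below, even though the innovations are i.i.d. standard normal.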
2. Key Assumptions and Their Relevance to Financial Time Series
Conditional Normality of Errors: The standard GARCH(1,1) assumes that the standardized errors ($\epsilon_t / \sigma_t$) are independently and identically distributed (i.i.d.) as a standard normal distribution.
- Relevance: Financial returns, even after accounting for heteroskedasticity, often exhibit leptokurtosis (fat tails) and skewness. This means large shocks are more frequent than predicted by a normal distribution, and positive/negative returns may not be symmetrical. Using conditional Student's t-distribution or GED often provides a better fit.
Stationarity of the Volatility Process: The condition $\alpha_1 + \beta_1 < 1$ ensures that the conditional variance process is weakly stationary, implying mean reversion of volatility to a long-run average. If $\alpha_1 + \beta_1 = 1$, the model becomes an Integrated GARCH (IGARCH), implying infinite persistence of shocks.
- Relevance: Financial volatility is generally considered mean-reverting. However, periods of extreme market stress or structural breaks can lead to near-unit root behavior or even non-stationarity, challenging this assumption.
Symmetry of Shocks: A GARCH(1,1) model assumes that positive and negative shocks of the same magnitude have the same impact on future volatility. The squared error term $\epsilon_{t-1}^2$ doesn't distinguish between positive and negative returns.
- Relevance: This assumption is often violated in financial markets due to the leverage effect. Negative shocks (bad news) typically lead to a larger increase in future volatility than positive shocks (good news) of the same magnitude. This is because a drop in stock price increases the financial leverage of a company, making its equity riskier.
No Structural Breaks: The parameters $\omega, \alpha_1, \beta_1$ are assumed to be constant over the entire sample period.
- Relevance: Financial markets are subject to significant structural changes (e.g., policy shifts, financial crises, technological disruptions) that can alter volatility dynamics, leading to shifts in model parameters. Ignoring these can lead to biased estimates and poor forecasts.
3. Practical Steps for Implementation, Estimation, and Evaluation
Data Collection and Preprocessing:
- Obtain daily closing prices for the S&P 500 index for a sufficient historical period (e.g., 5-10 years).
- Calculate log returns: $r_t = \ln(P_t / P_{t-1})$.
- Visualize returns: Inspect for volatility clustering, outliers, and trends.
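The log-return step is a one-liner in NumPy; the price array below is hypothetical, not actual S&P 500 closes:

```python
import numpy as np

prices = np.array([4700.0, 4725.5, 4691.2, 4710.8, 4730.1])  # hypothetical closing prices
log_returns = np.diff(np.log(prices))  # r_t = ln(P_t / P_{t-1})
# One fewer observation than prices; returns are often scaled by 100 before estimation
# to improve numerical stability of the likelihood optimization
```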
Preliminary Analysis:
- Test for ARCH Effects: Perform an ARCH-LM test on the residuals of a simple mean model (e.g., ARMA(p,q) on returns) to confirm the presence of conditional heteroskedasticity. If no ARCH effects are found, a GARCH model might not be necessary.
- ACF/PACF of Returns and Squared Returns: Analyze autocorrelation functions (ACF) and partial autocorrelation functions (PACF) of the returns and squared returns. Significant autocorrelation in squared returns suggests ARCH/GARCH effects.
- Descriptive Statistics: Calculate kurtosis and skewness of returns to gauge deviation from normality.
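The ARCH-LM test is an auxiliary regression of squared residuals on their own lags, with $n R^2$ compared against a $\chi^2(q)$ critical value. A self-contained sketch (the two simulated series are purely illustrative):

```python
import numpy as np

def arch_lm_test(resid, lags=5):
    """Engle's ARCH-LM statistic: regress e_t^2 on its own lags; n*R^2 ~ chi2(lags) under H0."""
    e2 = np.asarray(resid) ** 2
    y = e2[lags:]
    n = len(y)
    # Design matrix: constant plus `lags` lagged squared residuals
    X = np.column_stack([np.ones(n)] + [e2[lags - j : len(e2) - j] for j in range(1, lags + 1)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid_aux = y - X @ coef
    r2 = 1.0 - resid_aux @ resid_aux / np.sum((y - y.mean()) ** 2)
    return n * r2  # compare to the chi2(lags) critical value (~11.07 at 5% for 5 lags)

rng = np.random.default_rng(1)
iid = rng.standard_normal(2000)  # no ARCH effects

# A series with strong ARCH(1) effects for contrast (illustrative parameters)
z = rng.standard_normal(2000)
arch = np.empty(2000)
arch[0] = z[0]
for t in range(1, 2000):
    arch[t] = z[t] * np.sqrt(0.5 + 0.5 * arch[t - 1] ** 2)

stat_iid, stat_arch = arch_lm_test(iid), arch_lm_test(arch)
# stat_arch should far exceed the critical value; stat_iid will typically be small
```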
Model Specification and Estimation:
- Mean Equation: First, fit an appropriate ARMA(p,q) model to the returns $r_t$ to capture any conditional mean dynamics and obtain initial residuals. Often, for daily equity index returns, a simple mean of zero is assumed if autocorrelations are negligible.
- GARCH(1,1) Estimation: Use Maximum Likelihood Estimation (MLE) to estimate the parameters ($\mu, \omega, \alpha_1, \beta_1$) of the GARCH(1,1) model, assuming a conditional distribution (e.g., normal, Student's t). This is typically done numerically using optimization algorithms.
- Parameter Constraints: Ensure estimated parameters satisfy the non-negativity and stationarity conditions.
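A minimal sketch of the Gaussian quasi-likelihood and its numerical maximization, assuming SciPy's `minimize` as the optimizer and a constant-mean equation; the demo data are placeholder i.i.d. draws, not S&P 500 returns:

```python
import numpy as np
from scipy.optimize import minimize

def garch11_neg_loglik(params, r):
    """Gaussian negative log-likelihood for GARCH(1,1); params = (mu, omega, alpha, beta)."""
    mu, omega, alpha, beta = params
    eps = r - mu
    n = len(r)
    sigma2 = np.empty(n)
    sigma2[0] = np.var(eps)  # a common initialization: the sample variance
    for t in range(1, n):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi) + np.log(sigma2) + eps ** 2 / sigma2)

def fit_garch11(r):
    """Numerical MLE; the bounds enforce omega > 0 and alpha, beta >= 0."""
    x0 = np.array([r.mean(), 0.1 * r.var(), 0.05, 0.90])
    bounds = [(None, None), (1e-8, None), (0.0, 1.0), (0.0, 1.0)]
    res = minimize(garch11_neg_loglik, x0, args=(r,), method="L-BFGS-B", bounds=bounds)
    return res.x

# Demo on placeholder data; in practice r would be the (scaled) S&P 500 log returns
rng = np.random.default_rng(0)
r = 0.01 * rng.standard_normal(500)
mu_hat, omega_hat, alpha_hat, beta_hat = fit_garch11(r)
```

Note that the stationarity condition $\alpha_1 + \beta_1 < 1$ is not imposed by these bounds; it should be checked on the estimates afterwards, as the text describes.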
Diagnostic Checking:
- Standardized Residuals: Calculate the standardized residuals: $z_t = \epsilon_t / \sigma_t$. These should ideally be i.i.d. with mean zero and variance one.
- Ljung-Box Test: Apply the Ljung-Box test to the standardized residuals and the squared standardized residuals to check for remaining autocorrelation; if none is found, the model has adequately captured the conditional mean and variance dynamics.
- ARCH-LM Test: Run an ARCH-LM test on the squared standardized residuals to confirm no remaining ARCH effects.
- Goodness-of-Fit Tests: Compare the empirical distribution of standardized residuals to the assumed conditional distribution (e.g., QQ plots, Jarque-Bera test for normality).
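The Ljung-Box Q statistic is simple to compute by hand. A sketch, applied to illustrative series (an i.i.d. one standing in for well-behaved standardized residuals, and a strongly autocorrelated one for contrast):

```python
import numpy as np

def ljung_box(x, lags=10):
    """Ljung-Box Q statistic; compare to chi2(lags) under H0 of no autocorrelation."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    xc = x - x.mean()
    denom = xc @ xc
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = np.sum(xc[k:] * xc[:-k]) / denom  # sample autocorrelation at lag k
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q

rng = np.random.default_rng(3)
z = rng.standard_normal(1500)        # stand-in for standardized residuals
q_z, q_z2 = ljung_box(z), ljung_box(z ** 2)
# Both should be modest for i.i.d. residuals (chi2(10) 5% critical value ~ 18.3)

ar = np.empty(1500)                  # a strongly autocorrelated series, for contrast
ar[0] = 0.0
for t in range(1, 1500):
    ar[t] = 0.9 * ar[t - 1] + z[t]
q_ar = ljung_box(ar)
```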
Forecasting:
- To forecast the daily conditional variance for the next week (5 trading days), we use the iterative nature of the GARCH equation.
- 1-step ahead: $\sigma_{t+1}^2 = \omega + \alpha_1 \epsilon_t^2 + \beta_1 \sigma_t^2$. Here, $\epsilon_t^2$ and $\sigma_t^2$ are known from the last observation.
- 2-steps ahead: $\sigma_{t+2}^2 = \omega + \alpha_1 E[\epsilon_{t+1}^2 | \mathcal{F}_t] + \beta_1 \sigma_{t+1}^2$. Since $E[\epsilon_{t+1}^2 | \mathcal{F}_t] = \sigma_{t+1}^2$, this simplifies to $\sigma_{t+2}^2 = \omega + (\alpha_1 + \beta_1) \sigma_{t+1}^2$.
- k-steps ahead: Generalizing, for $k > 1$, $\sigma_{t+k}^2 = \omega \frac{1 - (\alpha_1 + \beta_1)^{k-1}}{1 - (\alpha_1 + \beta_1)} + (\alpha_1 + \beta_1)^{k-1} \sigma_{t+1}^2$. As $k \to \infty$, $\sigma_{t+k}^2 \to \omega / (1 - \alpha_1 - \beta_1)$, which is the long-run variance.
- Calculate forecasts for $k=1, 2, ..., 5$.
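The recursion above translates directly into code. A sketch with hypothetical fitted parameter values (not estimates from any data):

```python
import numpy as np

def garch11_forecast(omega, alpha, beta, eps_t, sigma2_t, horizon=5):
    """Iterative k-step-ahead variance forecasts from a fitted GARCH(1,1)."""
    persistence = alpha + beta
    forecasts = np.empty(horizon)
    # 1-step ahead uses the last observed shock and conditional variance directly
    forecasts[0] = omega + alpha * eps_t ** 2 + beta * sigma2_t
    for k in range(1, horizon):
        # For k > 1: sigma^2_{t+k} = omega + (alpha + beta) * sigma^2_{t+k-1}
        forecasts[k] = omega + persistence * forecasts[k - 1]
    return forecasts

# Hypothetical fitted values: omega=0.05, alpha=0.10, beta=0.85
f = garch11_forecast(0.05, 0.10, 0.85, eps_t=1.2, sigma2_t=0.9, horizon=5)
# The forecasts drift monotonically toward omega / (1 - alpha - beta) = 1.0
```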
Model Evaluation:
- Out-of-sample forecasting: Split the data into training and testing sets. Estimate the model on the training data and forecast volatility on the test set.
- Compare with Realized Volatility: Compare the forecasted conditional variances with a proxy for actual (realized) volatility, often calculated from intraday data (e.g., sum of squared intraday returns) or squared daily returns (though noisy).
- Loss Functions: Use appropriate loss functions to assess forecast accuracy, such as Mean Squared Error (MSE), Mean Absolute Error (MAE), or the QLIKE loss function, which is particularly robust to noise in realized volatility proxies. ($QLIKE = \frac{1}{N} \sum_{t=1}^N (\log(\sigma_t^2) + \frac{R_t^2}{\sigma_t^2})$ where $R_t^2$ is the realized variance proxy).
- Backtesting VaR/ES: If the purpose is risk management, backtest Value-at-Risk (VaR) or Expected Shortfall (ES) based on the GARCH forecasts to check their accuracy in capturing tail risk.
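The MSE and QLIKE losses can be sketched as follows; the forecast and realized-variance arrays are made-up numbers for illustration:

```python
import numpy as np

def qlike(forecast_var, realized_var):
    """QLIKE loss: robust to noise in the realized-variance proxy."""
    return np.mean(np.log(forecast_var) + realized_var / forecast_var)

def mse(forecast_var, realized_var):
    """Mean squared error between variance forecasts and a realized-variance proxy."""
    return np.mean((forecast_var - realized_var) ** 2)

f = np.array([1.0, 1.2, 0.9])    # hypothetical variance forecasts
rv = np.array([1.1, 1.0, 1.3])   # hypothetical realized-variance proxies
# Each QLIKE term log(h) + rv/h is minimized at h = rv, so a perfect forecast
# attains the lowest possible loss observation by observation
```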
4. Major Strengths and Limitations of GARCH(1,1)
Strengths:
- Volatility Clustering: GARCH models effectively capture the empirically observed phenomenon of volatility clustering, where large price changes tend to be followed by large price changes, and small by small.
- Mean Reversion: When $\alpha_1 + \beta_1 < 1$, the model implies that volatility reverts to a long-run average level, which is consistent with financial market behavior (periods of high volatility don't last forever).
- Parsimony: The GARCH(1,1) is a very parsimonious model, requiring only a few parameters ($\omega, \alpha_1, \beta_1$) to capture complex volatility dynamics. This reduces the risk of overfitting compared to higher-order ARCH models.
- Widely Accepted: It is a foundational and well-established model in financial econometrics, making it a standard benchmark for volatility forecasting.
Limitations:
- Symmetry of Shocks (No Leverage Effect): As discussed, GARCH(1,1) cannot distinguish between positive and negative shocks. It assumes that good news and bad news have the same impact on future volatility, which contradicts the observed leverage effect in equity markets.
- Conditional Normality Assumption: Financial returns often exhibit fat tails and skewness even after accounting for time-varying volatility. Assuming conditional normality leads to underestimation of extreme events (VaR, ES).
- Limited Long Memory: While GARCH captures short-to-medium term persistence, it struggles with very long memory processes sometimes observed in volatility (e.g., fractional integration).
- Reaction to Large Shocks: The model can be slow to react to very large, sudden shifts in volatility (e.g., during crises) and might take too long to revert to its mean.
- Difficulty with Regime Shifts: GARCH models assume constant parameters over time, making them less effective when there are abrupt structural breaks or changes in market regimes that alter volatility dynamics.
- Requires High-Quality Data: Its effectiveness depends on the quality and frequency of historical data. Issues like missing data or erroneous observations can significantly impact parameter estimates.
5. Alternative or Extended GARCH-type Models / Approaches
EGARCH (Exponential GARCH) or GJR-GARCH (Glosten, Jagannathan, Runkle GARCH):
- Why it's an improvement: These models are designed to capture the leverage effect. EGARCH models the logarithm of the conditional variance, allowing for asymmetric responses to positive and negative shocks. GJR-GARCH introduces an additional term that activates only for negative shocks ($\epsilon_{t-1} < 0$), allowing a different impact on volatility. For instance, the GJR(1,1) variance equation is:
$\sigma_t^2 = \omega + (\alpha_1 + \gamma_1 I_{t-1}) \epsilon_{t-1}^2 + \beta_1 \sigma_{t-1}^2$
where $I_{t-1}$ is an indicator function that is 1 if $\epsilon_{t-1} < 0$ and 0 otherwise. If $\gamma_1 > 0$, negative shocks increase volatility more than positive shocks of the same magnitude.
- Relevance: Given the known leverage effect in equity markets, these models are often superior for forecasting equity index volatility as they provide a more realistic representation of market dynamics.
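The GJR(1,1) recursion can be sketched to show the asymmetry; the parameter values below are illustrative:

```python
import numpy as np

def gjr_variance(omega, alpha, gamma, beta, eps, sigma2_0):
    """GJR-GARCH(1,1) variance recursion with leverage term gamma."""
    n = len(eps)
    sigma2 = np.empty(n)
    sigma2[0] = sigma2_0
    for t in range(1, n):
        neg = 1.0 if eps[t - 1] < 0 else 0.0  # indicator I_{t-1}
        sigma2[t] = omega + (alpha + gamma * neg) * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return sigma2

# Same-magnitude shocks of opposite sign: with gamma > 0, the negative shock
# raises next-period variance more than the positive one
s_neg = gjr_variance(0.05, 0.05, 0.10, 0.85, np.array([-1.0, 0.0]), 1.0)
s_pos = gjr_variance(0.05, 0.05, 0.10, 0.85, np.array([1.0, 0.0]), 1.0)
```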
GARCH with Student's t-distribution (or GED):
- Why it's an improvement: Instead of assuming conditional normality for the standardized residuals, using a Student's t-distribution or a Generalized Error Distribution (GED) allows the model to accommodate the fat tails (leptokurtosis) commonly observed in financial returns. The Student's t-distribution has an additional parameter (degrees of freedom, $\nu$) that dictates the kurtosis of the distribution, with lower $\nu$ indicating fatter tails.
- Relevance: This leads to more accurate estimation of tail risk, which is crucial for applications like Value-at-Risk (VaR) or Expected Shortfall (ES) calculations, as it better captures the likelihood of extreme price movements.
Completely Different Approach (if GARCH-type is insufficient):
Realized Volatility Models (e.g., HAR-RV):
- Why it's an improvement: Instead of modeling conditional variance indirectly from daily returns, realized volatility (RV) models directly use high-frequency intraday data (e.g., 5-minute returns) to construct a much more precise, model-free estimate of daily volatility. The Heterogeneous Autoregressive (HAR-RV) model, for instance, models future RV as a linear combination of past daily, weekly, and monthly RVs, capturing different persistence components.
- Relevance: RV models generally produce more accurate and robust volatility forecasts than GARCH models because they incorporate significantly more information. They are particularly useful when high-frequency data is available and the focus is purely on forecasting volatility with minimal distributional assumptions on the underlying returns.
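A sketch of the HAR-RV design matrix and OLS fit, using a made-up realized-variance series in place of one constructed from intraday data:

```python
import numpy as np

def har_design(rv):
    """HAR-RV regressors: lagged daily, weekly (5-day), and monthly (22-day) average RV."""
    n = len(rv)
    daily = rv[21 : n - 1]
    weekly = np.array([rv[t - 4 : t + 1].mean() for t in range(21, n - 1)])
    monthly = np.array([rv[t - 21 : t + 1].mean() for t in range(21, n - 1)])
    y = rv[22:]  # next-day RV is the regression target
    X = np.column_stack([np.ones(len(y)), daily, weekly, monthly])
    return X, y

# Hypothetical positive realized-variance series (real RV would come from intraday returns)
rng = np.random.default_rng(0)
rv = np.abs(rng.standard_normal(300)) + 0.5
X, y = har_design(rv)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
# Coefficients on the daily/weekly/monthly components capture layered persistence
```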
Stochastic Volatility (SV) Models:
- Why it's an improvement: Unlike GARCH where volatility is a deterministic function of past errors and past volatility, SV models treat volatility itself as an unobserved (latent) stochastic process. This provides greater flexibility and a more realistic representation of financial markets, where volatility is not perfectly predictable.
- Relevance: SV models can better capture complex features like jumps in volatility and unobserved market factors. However, they are computationally more intensive to estimate, often requiring Bayesian methods like Markov Chain Monte Carlo (MCMC), making them more challenging to implement than GARCH models but potentially more accurate in capturing the true data generating process of volatility.