
Chapter 5: Stochastic Processes for Aggregate Shocks


AR(1) Processes and Macroeconomic Uncertainty

“The theory of stochastic processes is the theory of time-indexed families of random variables. In macroeconomics, every variable of interest is one.”

Cross-reference: Principles Ch. 6 (time-series properties, HP filter, unit roots, data vintages); Ch. 15 (rational expectations, information sets); Ch. 27 (RBC calibration — TFP as AR(1)) [P:Ch.6, P:Ch.15, P:Ch.27]


5.1 Why Stochastic Models? The Centrality of Uncertainty

The deterministic dynamic models of Chapters 3 and 4 assume that once initial conditions and parameters are specified, the entire future path of the economy is known. This is an obviously wrong description of the world. GDP fluctuates for reasons that are not fully predictable in advance. Monetary policy decisions are made under uncertainty about the current state of the economy. Firms’ investment decisions depend on uncertain future profitability. Consumer spending responds to news about future income.

Modern macroeconomics — from the rational expectations revolution onward — treats uncertainty not as an afterthought but as a fundamental feature of the economic environment [P:Ch.15]. The stochastic process is to modern macroeconomics what the differential equation is to classical physics: the natural language for describing how systems evolve over time when driven by unpredictable forces.

This chapter introduces the key concepts of probability and stochastic process theory that underlie the macroeconomic models of Parts V–IX. We develop: the formal probabilistic framework (probability spaces, filtrations, conditional expectations); stationary stochastic processes and their characterization by autocovariance functions and spectra; the AR(1) process in depth; and the Wold decomposition, which guarantees that every covariance-stationary process can be represented as a moving average of white noise — the fundamental result underlying all time-series econometrics.


5.2 Probability Spaces and Information Sets

Definition 5.1 (Probability Space). A probability space is a triple $(\Omega, \mathcal{F}, P)$ where:

  • $\Omega$ is the sample space — the set of all possible states of the world.

  • $\mathcal{F}$ is a sigma-algebra on $\Omega$ — a collection of subsets of $\Omega$ (events) closed under countable unions, countable intersections, and complementation.

  • $P: \mathcal{F} \to [0,1]$ is a probability measure: $P(\Omega) = 1$ and $P$ is countably additive.

A random variable $X: \Omega \to \mathbb{R}$ is a measurable function from $\Omega$ to $\mathbb{R}$.

For macroeconomics, the key interpretation is: $\Omega$ is the space of all possible histories of the economy (sequences of TFP realizations, policy shocks, preference shocks); an event $A \in \mathcal{F}$ is a statement about the economy that is either true or false in any realized history; and $P(A)$ is the prior probability of that event.

5.2.1 Filtrations and Information

In dynamic models, information accumulates over time. At date $t$, agents have observed the history of the economy up to $t$ but not the future.

Definition 5.2 (Filtration). A filtration $\{\mathcal{F}_t\}_{t=0}^\infty$ is a non-decreasing sequence of sigma-algebras: $\mathcal{F}_s \subseteq \mathcal{F}_t$ for all $s \leq t$. The sigma-algebra $\mathcal{F}_t$ represents the information available at date $t$.

A stochastic process $\{X_t\}$ is adapted to the filtration $\{\mathcal{F}_t\}$ if $X_t$ is $\mathcal{F}_t$-measurable for each $t$ — that is, the value of $X_t$ is known at date $t$. In macroeconomics, every variable we observe (GDP, inflation, interest rates) is adapted to the information filtration of agents.

Definition 5.3 (Conditional Expectation). The conditional expectation $\mathbb{E}[X|\mathcal{F}_t]$ is the $\mathcal{F}_t$-measurable random variable that best predicts $X$ given information $\mathcal{F}_t$ in the mean-squared sense. It satisfies:

$$\mathbb{E}\big[\mathbb{E}[X|\mathcal{F}_t]\,\big|\,\mathcal{F}_s\big] = \mathbb{E}[X|\mathcal{F}_s] \quad \text{for all } s \leq t.$$

This is the law of iterated expectations — a key tool in rational expectations models. It says: the best forecast of the future best forecast of $X$ is the current best forecast of $X$.

The rational expectations hypothesis [P:Ch.15] states that agents’ subjective expectations equal the conditional expectation given their information set: $\mathbb{E}_t^{\text{agents}}[X_{t+h}] = \mathbb{E}[X_{t+h}|\mathcal{F}_t]$. Agents use the correct probability model; they are not systematically wrong.
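The law of iterated expectations is easy to verify numerically. A minimal Monte Carlo sketch (parameter values and variable names are illustrative) for an AR(1) forecast two periods ahead: the date-$t+1$ forecast, averaged back to the date-$t$ information set, reproduces the direct date-$t$ forecast.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.9, 200_000

# X_{t+2} = rho^2 X_t + rho e_{t+1} + e_{t+2}
x_t = rng.normal(size=n)
e1, e2 = rng.normal(size=n), rng.normal(size=n)

# Forecast made at t+1: E[X_{t+2} | F_{t+1}] = rho * X_{t+1}
inner = rho * (rho * x_t + e1)

# Averaging the inner forecast over the t+1 shock recovers the date-t
# forecast rho^2 * x_t; the mean discrepancy is zero up to sampling error
print(np.mean(inner - rho**2 * x_t))   # ≈ 0
```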


5.3 Stationarity, Autocovariance, and the Autocorrelation Function

5.3.1 Stationarity

Definition 5.4 (Covariance Stationarity). A stochastic process $\{X_t\}$ is covariance-stationary (or weakly stationary) if:

  1. $\mathbb{E}[X_t] = \mu$ for all $t$ (constant mean).

  2. $\text{Var}(X_t) = \sigma^2 < \infty$ for all $t$ (finite, constant variance).

  3. $\text{Cov}(X_t, X_{t-k}) = \gamma_k$ depends only on the lag $k$, not on $t$ (time-invariant autocovariances).

Definition 5.5 (Strict Stationarity). A process is strictly stationary if the joint distribution of $(X_{t_1}, \ldots, X_{t_n})$ equals the joint distribution of $(X_{t_1+h}, \ldots, X_{t_n+h})$ for any $h$ — the distribution is invariant to time translation. Strict stationarity plus finite second moments implies covariance stationarity.

Most empirical time-series methods require only covariance stationarity, which is weaker and easier to verify.

5.3.2 The Autocovariance and Autocorrelation Functions

Definition 5.6 (Autocovariance Function). For a covariance-stationary process $\{X_t\}$ with mean $\mu$:

$$\gamma_k = \text{Cov}(X_t, X_{t-k}) = \mathbb{E}[(X_t - \mu)(X_{t-k} - \mu)], \quad k = 0, 1, 2, \ldots$$

Note $\gamma_0 = \text{Var}(X_t) = \sigma^2$ and $\gamma_k = \gamma_{-k}$ (symmetry).

Definition 5.7 (Autocorrelation Function, ACF). The autocorrelation function is:

$$\rho_k = \frac{\gamma_k}{\gamma_0} \in [-1, 1].$$

The ACF $\{\rho_k\}_{k=0}^\infty$ characterizes the temporal persistence of the process. For macroeconomic variables:

  • U.S. log real GDP (HP-filtered) has $\rho_1 \approx 0.87$, $\rho_4 \approx 0.52$ — high persistence.

  • U.S. TFP growth (quarterly) has $\rho_1 \approx 0.70$, falling off gradually.

  • White noise has $\rho_k = 0$ for all $k \geq 1$.

Bochner’s theorem: A sequence $\{\rho_k\}$ is a valid ACF if and only if it is positive semi-definite. Equivalently, the Fourier transform of the ACF (the spectral density, defined below) must be non-negative everywhere.
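The positive semi-definiteness condition can be checked directly: stack the candidate autocorrelations into a Toeplitz matrix and inspect its eigenvalues. A small numpy sketch (lag counts and values illustrative):

```python
import numpy as np

# AR(1) ACF rho_k = 0.9^|k| over 20 lags: its Toeplitz matrix is PSD
k = np.arange(20)
R = 0.9 ** np.abs(k[:, None] - k[None, :])
print(np.linalg.eigvalsh(R).min())     # >= 0 (up to rounding): a valid ACF

# The sequence 1, 0.9, 0, 0, ... is NOT a valid ACF: its Toeplitz matrix
# has a negative eigenvalue, so no stationary process can generate it
bad = np.array([[1.0, 0.9, 0.0],
                [0.9, 1.0, 0.9],
                [0.0, 0.9, 1.0]])
print(np.linalg.eigvalsh(bad).min())   # negative
```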


5.4 The AR(1) Process: The Workhorse of Macroeconomic Dynamics

The autoregressive process of order 1, or AR(1), is the single most important stochastic process in macroeconomics. It appears as:

  • The technology shock in every RBC model [P:Ch.27]: $\ln A_t = \rho\ln A_{t-1} + \varepsilon_t$.

  • The monetary policy shock in NK models [P:Ch.23]: $\hat{m}_t = \rho_m\hat{m}_{t-1} + \varepsilon_t^m$.

  • The approximation to any persistent but stationary shock process.

Definition 5.8 (AR(1) Process). An AR(1) process satisfies:

$$X_t = \mu + \rho(X_{t-1} - \mu) + \varepsilon_t, \quad \varepsilon_t \sim \text{WN}(0, \sigma_\varepsilon^2),$$

where $\rho \in (-1, 1)$ is the persistence parameter, $\mu$ is the unconditional mean, and $\varepsilon_t$ is white noise — serially uncorrelated innovations with zero mean and constant variance.

Definition 5.9 (White Noise). $\{\varepsilon_t\}$ is white noise, written $\varepsilon_t \sim \text{WN}(0, \sigma^2)$, if:

  • $\mathbb{E}[\varepsilon_t] = 0$ for all $t$.

  • $\mathbb{E}[\varepsilon_t^2] = \sigma^2$ for all $t$.

  • $\mathbb{E}[\varepsilon_t\varepsilon_s] = 0$ for all $t \neq s$.

If additionally $\varepsilon_t \sim \mathcal{N}(0, \sigma^2)$ i.i.d., we have Gaussian white noise — the standard assumption in DSGE models.

5.4.1 Moments of the AR(1)

Taking expectations of the AR(1) definition (assuming stationarity):

$$\mathbb{E}[X_t] = \mu + \rho(\mathbb{E}[X_t] - \mu) + 0 \implies \mathbb{E}[X_t] = \mu.$$

For the variance, set $\mu = 0$ without loss of generality, so $X_t = \rho X_{t-1} + \varepsilon_t$. Since $\varepsilon_t$ is uncorrelated with $X_{t-1}$:

$$\text{Var}(X_t) = \rho^2\text{Var}(X_{t-1}) + \sigma_\varepsilon^2 \implies \sigma_X^2 = \frac{\sigma_\varepsilon^2}{1 - \rho^2}.$$

This requires $|\rho| < 1$ for the variance to be finite. The autocovariances:

Theorem 5.1 (AR(1) Autocovariance Function). For the zero-mean AR(1) $X_t = \rho X_{t-1} + \varepsilon_t$ with $|\rho| < 1$:

$$\gamma_k = \frac{\sigma_\varepsilon^2}{1-\rho^2}\rho^{|k|} = \sigma_X^2\rho^{|k|}, \quad \rho_k = \rho^{|k|}.$$

Proof. Multiply both sides of $X_t = \rho X_{t-1} + \varepsilon_t$ by $X_{t-k}$ and take expectations:

$$\mathbb{E}[X_tX_{t-k}] = \rho\mathbb{E}[X_{t-1}X_{t-k}] + \mathbb{E}[\varepsilon_tX_{t-k}].$$

For $k \geq 1$, $\varepsilon_t$ is uncorrelated with $X_{t-k}$ (since $t > t-k$), so $\mathbb{E}[\varepsilon_tX_{t-k}] = 0$. Therefore $\gamma_k = \rho\gamma_{k-1}$ for $k \geq 1$, with $\gamma_0 = \sigma_\varepsilon^2/(1-\rho^2)$. Solving the recursion: $\gamma_k = \rho^k\gamma_0 = \sigma_X^2\rho^k$. $\square$

The ACF $\rho_k = \rho^k$ decays geometrically. High $|\rho|$ (near 1) means slow decay — the process has long memory and is highly persistent. Low $|\rho|$ (near 0) means rapid decay — shocks dissipate quickly.
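These moment formulas are easy to confirm by simulation; a quick sketch (parameter values arbitrary) comparing the sample variance and first autocorrelation of a long simulated path with $\sigma_\varepsilon^2/(1-\rho^2)$ and $\rho$:

```python
import numpy as np

rng = np.random.default_rng(1)
rho, sig, T = 0.9, 1.0, 500_000

eps = rng.normal(0, sig, T)
x = np.empty(T)
x[0] = 0.0
for t in range(1, T):
    x[t] = rho * x[t-1] + eps[t]
x = x[1_000:]                                  # drop burn-in

print(x.var(), sig**2 / (1 - rho**2))          # sample vs. theoretical variance
print(np.corrcoef(x[1:], x[:-1])[0, 1])        # first autocorrelation ≈ rho
```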

5.4.2 Moving Average Representation

The AR(1) can be written as an infinite moving average (MA):

$$X_t = \varepsilon_t + \rho\varepsilon_{t-1} + \rho^2\varepsilon_{t-2} + \cdots = \sum_{j=0}^\infty \rho^j\varepsilon_{t-j}.$$

Proof. Iterate $X_t = \rho X_{t-1} + \varepsilon_t$ backward:

$$X_t = \varepsilon_t + \rho(\rho X_{t-2} + \varepsilon_{t-1}) + \cdots = \sum_{j=0}^T\rho^j\varepsilon_{t-j} + \rho^{T+1}X_{t-T-1} \to \sum_{j=0}^\infty\rho^j\varepsilon_{t-j}$$

since $\rho^{T+1}X_{t-T-1} \to 0$ in mean square when $|\rho| < 1$. $\square$

The MA representation is the impulse response function of the AR(1): the coefficient on $\varepsilon_{t-j}$ gives the effect of a shock $j$ periods ago on $X_t$ today. For the AR(1), the IRF is $\psi_j = \rho^j$ — a geometrically declining function of the lag $j$.

In DSGE models, the IRF of any variable to any shock is a sum of such geometrically declining terms, weighted by the eigenvalues of the transition matrix. Chapter 17 computes these IRFs for the RBC model; Chapter 28 does so for the full NK model.
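The equivalence between the recursion and the MA representation can be checked mechanically: simulate the recursion from a zero pre-sample state and rebuild each point of the path from the (finite, since the shocks start at date 0) MA sum. A sketch:

```python
import numpy as np

rng = np.random.default_rng(2)
rho, T = 0.95, 300
eps = rng.normal(size=T)

# AR(1) recursion with x_{-1} = 0, so x_0 = eps_0
x = np.zeros(T)
x[0] = eps[0]
for t in range(1, T):
    x[t] = rho * x[t-1] + eps[t]

# MA representation: x_t = sum_{j=0}^{t} rho^j eps_{t-j}
psi = rho ** np.arange(T)                     # IRF coefficients psi_j = rho^j
x_ma = np.array([psi[:t+1] @ eps[t::-1] for t in range(T)])

print(np.max(np.abs(x - x_ma)))               # ≈ 0: identical paths
```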


5.5 ARMA Processes and the Wold Decomposition

5.5.1 ARMA(p,q) Processes

Definition 5.10 (ARMA Process). An ARMA($p$,$q$) process satisfies:

$$X_t = \mu + \sum_{i=1}^p\phi_i(X_{t-i}-\mu) + \varepsilon_t + \sum_{j=1}^q\theta_j\varepsilon_{t-j}, \quad \varepsilon_t \sim \text{WN}(0, \sigma^2),$$

where the AR polynomial $\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$ and MA polynomial $\theta(L) = 1 + \theta_1 L + \cdots + \theta_q L^q$ are defined in terms of the lag operator $L$: $LX_t = X_{t-1}$.

The process is stationary iff all roots of $\phi(z) = 0$ lie outside the unit circle ($|z| > 1$). It is invertible iff all roots of $\theta(z) = 0$ lie outside the unit circle.
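The root condition is a one-liner to check numerically. A hedged sketch (the helper `is_stationary` and its example coefficients are illustrative, not from the text):

```python
import numpy as np

def is_stationary(phi):
    """True iff all roots of phi(z) = 1 - phi_1 z - ... - phi_p z^p
    lie strictly outside the unit circle."""
    # np.roots wants coefficients from the highest power down:
    # [-phi_p, ..., -phi_1, 1]
    coeffs = np.r_[-np.array(phi, dtype=float)[::-1], 1.0]
    return np.all(np.abs(np.roots(coeffs)) > 1.0)

print(is_stationary([0.95]))        # True:  root at 1/0.95 ≈ 1.05
print(is_stationary([1.2, -0.3]))   # True:  both roots outside the unit circle
print(is_stationary([1.1]))         # False: explosive AR(1)
```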

5.5.2 The Wold Decomposition

The Wold theorem is arguably the most important theorem in time-series analysis. It says that any covariance-stationary process can be decomposed into a predictable component and an unpredictable MA($\infty$) component.

Theorem 5.2 (Wold Decomposition). Let $\{X_t\}$ be a zero-mean covariance-stationary process. Then:

$$X_t = \sum_{j=0}^\infty \psi_j\varepsilon_{t-j} + V_t,$$

where:

  1. $\varepsilon_t = X_t - \mathbb{E}[X_t|\mathcal{F}_{t-1}]$ is the one-step-ahead forecast error (innovation), which is white noise with $\sigma^2 = \gamma_0 - \text{Var(best linear predictor)}$.

  2. $\psi_0 = 1$ and $\sum_{j=0}^\infty\psi_j^2 < \infty$.

  3. $V_t$ is the deterministic component — perfectly predictable from its own past.

  4. $\varepsilon_t \perp V_s$ for all $t, s$ (the components are uncorrelated).

For purely non-deterministic processes (those without a deterministic component), $V_t = 0$ and the entire process is its MA($\infty$) representation. The impulse response coefficients $\psi_j$ are the dynamic multipliers: a unit innovation at $t$ affects $X_{t+j}$ by $\psi_j$.

Economic significance. The Wold decomposition underpins the structural VAR methodology [P:Appendix B]. By identifying which innovations $\varepsilon_t$ correspond to which structural shocks (monetary policy, technology, demand), empirical macroeconomists can trace the dynamic effects of each shock through the economy. Chapter 19 develops this methodology in full.
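A quick way to see the Wold innovations at work: estimate the best linear one-step predictor of a simulated AR(1) and check that the resulting forecast errors are (sample) white noise. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
rho, T = 0.8, 100_000

eps = rng.normal(size=T)
x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t-1] + eps[t]

# Best linear one-step predictor is rho_hat * x_{t-1}; the residuals
# are the (estimated) Wold innovations
rho_hat = (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])
e = x[1:] - rho_hat * x[:-1]

# White-noise check: sample autocorrelations of e are all near zero
acf = [float(np.corrcoef(e[k:], e[:-k])[0, 1]) for k in range(1, 6)]
print(np.round(acf, 3))   # all ≈ 0
```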


5.6 VAR Processes and the Companion Form

A vector autoregression (VAR) of order $p$ generalizes the scalar AR($p$) to a vector of variables:

$$\mathbf{X}_t = \mathbf{c} + \Phi_1\mathbf{X}_{t-1} + \Phi_2\mathbf{X}_{t-2} + \cdots + \Phi_p\mathbf{X}_{t-p} + \boldsymbol{\varepsilon}_t,$$

where $\mathbf{X}_t \in \mathbb{R}^n$, $\Phi_i \in \mathbb{R}^{n\times n}$, and $\boldsymbol{\varepsilon}_t \sim \text{WN}(\mathbf{0}, \Sigma)$.

The VAR($p$) can be rewritten as a VAR(1) in the companion form by stacking:

$$\underbrace{\begin{pmatrix}\mathbf{X}_t \\ \mathbf{X}_{t-1} \\ \vdots \\ \mathbf{X}_{t-p+1}\end{pmatrix}}_{\tilde{\mathbf{X}}_t} = \underbrace{\begin{pmatrix}\Phi_1 & \Phi_2 & \cdots & \Phi_p \\ I & 0 & \cdots & 0 \\ \vdots & \ddots & & \vdots \\ 0 & \cdots & I & 0\end{pmatrix}}_{\tilde{A}} \underbrace{\begin{pmatrix}\mathbf{X}_{t-1} \\ \mathbf{X}_{t-2} \\ \vdots \\ \mathbf{X}_{t-p}\end{pmatrix}}_{\tilde{\mathbf{X}}_{t-1}} + \begin{pmatrix}\boldsymbol{\varepsilon}_t \\ \mathbf{0} \\ \vdots \\ \mathbf{0}\end{pmatrix}.$$

Stationarity of the VAR($p$): The VAR($p$) is covariance-stationary iff all eigenvalues of the companion matrix $\tilde{A}$ lie inside the unit circle.

The companion form shows that any DSGE model’s solution — which takes the form $\mathbf{y}_t = C\mathbf{y}_{t-1} + D\mathbf{z}_t$ after solving (Chapter 28) — is a first-order VAR with companion matrix $C$. The stationarity condition for the DSGE model solution is exactly the Blanchard–Kahn condition applied to $C$.
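As a concrete check of the eigenvalue condition, one can build the companion matrix for a scalar AR(2) (coefficient values illustrative) and inspect its spectrum:

```python
import numpy as np

phi1, phi2 = 1.2, -0.3               # illustrative AR(2) coefficients
A = np.array([[phi1, phi2],          # companion matrix: first row holds the
              [1.0,  0.0 ]])         # AR coefficients, I below shifts the state
moduli = np.abs(np.linalg.eigvals(A))
print(moduli)                        # both < 1, so the AR(2) is stationary
```

The eigenvalues of the companion matrix are the reciprocals of the roots of the AR polynomial $\phi(z)$, so "eigenvalues inside the unit circle" and "roots outside the unit circle" are the same condition.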


5.7 The Spectral Density

The spectral density decomposes the variance of a process by frequency, revealing which periodicities are most important.

Definition 5.11 (Spectral Density). For a covariance-stationary process $\{X_t\}$ with absolutely summable autocovariances ($\sum_{k=-\infty}^\infty|\gamma_k| < \infty$), the spectral density is:

$$S_X(\omega) = \frac{1}{2\pi}\sum_{k=-\infty}^\infty\gamma_k e^{-ik\omega}, \quad \omega \in [-\pi, \pi].$$

$S_X(\omega)\,d\omega$ is the contribution to variance from cyclical components with frequencies in $[\omega, \omega+d\omega]$. The spectral density is the Fourier transform of the autocovariance function; by the inverse transform:

$$\gamma_k = \int_{-\pi}^\pi S_X(\omega)e^{ik\omega}\,d\omega, \quad \sigma^2 = \gamma_0 = \int_{-\pi}^\pi S_X(\omega)\,d\omega.$$

Theorem 5.3 (Spectral Density of an AR(1)). For $X_t = \rho X_{t-1} + \varepsilon_t$ with $\varepsilon_t \sim \text{WN}(0,\sigma_\varepsilon^2)$:

$$S_X(\omega) = \frac{\sigma_\varepsilon^2/2\pi}{|1 - \rho e^{-i\omega}|^2} = \frac{\sigma_\varepsilon^2/2\pi}{1 + \rho^2 - 2\rho\cos\omega}.$$

Proof. Using the MA($\infty$) representation $X_t = \sum_{j=0}^\infty\rho^j\varepsilon_{t-j}$ and the spectral density of white noise $S_\varepsilon(\omega) = \sigma_\varepsilon^2/(2\pi)$:

$$S_X(\omega) = \left|\sum_{j=0}^\infty\rho^je^{-ij\omega}\right|^2 S_\varepsilon(\omega) = \frac{\sigma_\varepsilon^2/2\pi}{|1-\rho e^{-i\omega}|^2}. \quad\square$$

For $\rho > 0$, $S_X(\omega)$ is largest at $\omega = 0$ (low frequency, long cycles) — confirming that high persistence generates power at business cycle frequencies. The HP filter exploits this by attenuating the low-frequency components; its frequency response function is closely related to the spectral density of a smooth trend.
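The inverse-transform identity $\gamma_0 = \int_{-\pi}^{\pi} S_X(\omega)\,d\omega$ can be verified numerically for the calibrated TFP process; a sketch using a simple Riemann sum on a fine grid:

```python
import numpy as np

rho, sig = 0.95, 0.0072
omega = np.linspace(-np.pi, np.pi, 200_001)
S = sig**2 / (2*np.pi * (1 + rho**2 - 2*rho*np.cos(omega)))

dw = omega[1] - omega[0]
var_from_spectrum = np.sum(S) * dw            # Riemann sum of the spectral density
var_theory = sig**2 / (1 - rho**2)
print(var_from_spectrum, var_theory)          # both ≈ 5.32e-4
```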


5.8 Martingales and the Innovation Representation

Definition 5.12 (Martingale). A stochastic process $\{M_t\}$ adapted to $\{\mathcal{F}_t\}$ is a martingale if:

$$\mathbb{E}[M_{t+1}|\mathcal{F}_t] = M_t \quad \text{for all } t.$$

Equivalently, $\mathbb{E}[M_{t+h}|\mathcal{F}_t] = M_t$ for all $h \geq 0$: the best forecast of future values is the current value.

A martingale difference sequence (MDS) $\{D_t\}$ satisfies $\mathbb{E}[D_t|\mathcal{F}_{t-1}] = 0$ — innovations are unpredictable from past information. The Wold innovations $\varepsilon_t$ are an MDS: $\mathbb{E}[\varepsilon_t|\mathcal{F}_{t-1}] = 0$.

Hall’s (1978) random walk hypothesis for consumption [P:Ch.11.3] states that consumption is a martingale: $\mathbb{E}_t[c_{t+1}] = c_t$. This follows from the Euler equation $u'(c_t) = \beta(1+r)\mathbb{E}_t[u'(c_{t+1})]$ under quadratic utility ($u(c) = -(c-c^*)^2/2$, so $u'(c) = c^* - c$) and $\beta(1+r) = 1$: marginal utility is then linear in consumption, and the Euler equation reduces to $c_t = \mathbb{E}_t[c_{t+1}]$. Therefore $c_{t+1} - c_t = \varepsilon_{t+1}$, an MDS — changes in consumption are unforecastable.

This is an empirically testable restriction: no variable in $\mathcal{F}_t$ should help predict $c_{t+1} - c_t$. The widespread violation of this restriction (excess sensitivity of consumption to predictable income changes [P:Ch.11.4]) is one of the major puzzles in consumption economics, motivating the buffer-stock and liquidity-constraint models.
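Under the null, Hall's restriction is straightforward to check on simulated data — consumption is a random walk and lagged changes have no predictive power. A minimal sketch (simulated data, not the FRED series used in Exercise 5.6):

```python
import numpy as np

rng = np.random.default_rng(4)
T = 50_000
c = np.cumsum(rng.normal(size=T))   # martingale consumption: c_{t+1} = c_t + eps

dc = np.diff(c)
y, xlag = dc[1:], dc[:-1]

# Hall's orthogonality test: regress current changes on lagged changes;
# under the martingale null the slope is zero
beta = (y @ xlag) / (xlag @ xlag)
print(round(beta, 3))               # ≈ 0: consumption changes are unforecastable
```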


5.9 Worked Example: Fitting an AR(1) to U.S. TFP

Cross-reference: Principles Ch. 27 (RBC calibration) [P:Ch.27]

The standard RBC calibration [P:Ch.27] sets the TFP shock as $\ln A_t = \rho_A\ln A_{t-1} + \varepsilon_t^A$. We estimate $\rho_A$ and $\sigma_A$ from the Solow residual.

Data: Quarterly U.S. TFP index, 1953Q1–2019Q4, obtained from the San Francisco Fed’s TFP database. Define $\hat{a}_t = \ln A_t - \ln\bar{A}$ (log-deviation from mean).

OLS estimation: Regress $\hat{a}_t$ on $\hat{a}_{t-1}$:

$$\hat{a}_t = \rho_A\hat{a}_{t-1} + \varepsilon_t^A.$$

The OLS estimator:

$$\hat{\rho}_A = \frac{\sum_{t=2}^T\hat{a}_t\hat{a}_{t-1}}{\sum_{t=2}^T\hat{a}_{t-1}^2} = \frac{\hat{\gamma}_1}{\hat{\gamma}_0} = \hat{\rho}_1.$$

The OLS estimator of the AR(1) coefficient is the first sample autocorrelation. Typical estimates: $\hat{\rho}_A \approx 0.95$, $\hat{\sigma}_A \approx 0.0072$ (quarterly).

Interpretation: A TFP shock of $+1\%$ decays geometrically: after $t$ quarters, a fraction $0.95^t$ remains. After 5 years (20 quarters), $0.95^{20} = 0.358$ — about 36% of the original shock remains. The half-life is $\ln 2/|\ln 0.95| = 0.693/0.051 \approx 13.5$ quarters, or about 3.4 years.

⎕IO←0 ⋄ ⎕ML←1

rho_A ← 0.95 ⋄ sigma_A ← 0.0072 ⋄ T ← 200

⍝ 1. Innovation draws: approximate N(0,1) as the sum of 12 uniforms minus 6
rnorm ← { (+/ ?12⍴0) - 6 }
eps ← sigma_A × rnorm¨ ⍳T

⍝ 2. AR(1) simulation via the MA representation X_t = Σ ρ^j ε_{t-j}
⍝ (APL's scan reduces right-to-left, so {(rho_A×⍺)+⍵}\eps would NOT give the AR recursion)
ar1_path ← { +/ (rho_A * ⍳⍵+1) × ⌽(⍵+1)↑eps }¨ ⍳T

⍝ 3. Sample autocorrelation at lag 1 — the OLS estimator of ρ
demean ← {⍵ - (+/⍵) ÷ ≢⍵}
a ← demean ar1_path

⍝ rho_hat = (Y'X) ÷ (X'X), where Y is the lead and X is the lag
Y ← 1 ↓ a
X ← ¯1 ↓ a
rho_hat ← (Y +.× X) ÷ (X +.× X)
rho_hat

⍝ 4. Sample autocovariances Σ a_t a_{t-k} for k = 0..9;
⍝ dividing by the k=0 term (the variance) gives the sample ACF
gamma ← { (⍵ ↓ a) +.× (-⍵) ↓ a }¨ ⍳10
acf ← gamma ÷ ⊃gamma
acf
# Python — AR(1) estimation and diagnostics
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
import matplotlib.pyplot as plt

np.random.seed(42)
rho_true, sigma_true, T = 0.95, 0.0072, 200

# Simulate AR(1)
eps = np.random.normal(0, sigma_true, T)
a = np.zeros(T)
for t in range(1, T):
    a[t] = rho_true * a[t-1] + eps[t]

# OLS estimation: first autocorrelation
rho_hat = np.corrcoef(a[1:], a[:-1])[0,1]
sigma_hat = np.std(a[1:] - rho_hat * a[:-1])
print(f"True: ρ={rho_true}, σ={sigma_true}")
print(f"OLS:  ρ̂={rho_hat:.4f}, σ̂={sigma_hat:.6f}")

# Impulse response
h = np.arange(40)
irf = rho_hat ** h
plt.plot(h, irf); plt.axhline(0, c='k', lw=0.5)
plt.title('AR(1) Impulse Response Function'); plt.xlabel('Quarters'); plt.show()
# Julia — AR(1) moments and simulation
using Random, Statistics

rho_true, sigma_true, T = 0.95, 0.0072, 200
Random.seed!(42)
eps = randn(T) .* sigma_true
a = zeros(T)
for t in 2:T
    a[t] = rho_true * a[t-1] + eps[t]
end

rho_hat = cor(a[2:end], a[1:end-1])
sigma_hat = std(a[2:end] .- rho_hat .* a[1:end-1])
println("OLS ρ̂ = $(round(rho_hat, digits=4)),  σ̂ = $(round(sigma_hat, digits=6))")

# Theoretical vs. simulated ACF
lags = 0:20
acf_theory = rho_hat .^ lags
acf_sample = [cor(a[1+k:end], a[1:end-k]) for k in lags]
println("ACF comparison (lags 0-5):")
display([acf_theory[1:6] acf_sample[1:6]])
# R — AR(1) estimation
set.seed(42)
rho_true <- 0.95; sigma_true <- 0.0072; T <- 200
eps <- rnorm(T, 0, sigma_true)
a <- numeric(T)
for(t in 2:T) a[t] <- rho_true * a[t-1] + eps[t]

# Yule-Walker / OLS estimate
rho_hat <- cor(a[-1], a[-T])
sigma_hat <- sd(a[-1] - rho_hat * a[-T])
cat(sprintf("OLS: rho_hat = %.4f, sigma_hat = %.6f\n", rho_hat, sigma_hat))

# Theoretical spectral density
omega <- seq(-pi, pi, length=500)
spec_theory <- sigma_hat^2 / (2*pi * (1 + rho_hat^2 - 2*rho_hat*cos(omega)))
plot(omega, spec_theory, type='l', main='AR(1) Spectral Density',
     xlab='Frequency ω', ylab='S(ω)')

5.10 The Tauchen (1986) Discretization

Continuous-state AR(1) processes must be discretized for numerical dynamic programming (Chapters 15–17). The Tauchen (1986) method constructs a discrete Markov chain that approximates the AR(1).

Algorithm 5.1 (Tauchen Discretization).

Given the AR(1): $z_{t+1} = \rho z_t + \sigma_\varepsilon\varepsilon_{t+1}$, $\varepsilon_t \sim \mathcal{N}(0,1)$:

  1. Choose the number of grid points $N$ and the number of standard deviations $m$ (typically $m = 3$).

  2. Set grid endpoints: $z_{min} = -m\sigma_z$, $z_{max} = m\sigma_z$, where $\sigma_z = \sigma_\varepsilon/\sqrt{1-\rho^2}$.

  3. Construct an evenly spaced grid $\{z_1, \ldots, z_N\}$ with $z_1 = z_{min}$, $z_N = z_{max}$, spacing $\Delta z = (z_{max}-z_{min})/(N-1)$.

  4. Compute transition probabilities: for grid points $z_i$ and $z_j$:

    $$P_{ij} = \Phi\!\left(\frac{z_j + \Delta z/2 - \rho z_i}{\sigma_\varepsilon}\right) - \Phi\!\left(\frac{z_j - \Delta z/2 - \rho z_i}{\sigma_\varepsilon}\right),$$

    with boundary corrections for $j = 1$ (lower tail) and $j = N$ (upper tail). Here $\Phi$ is the standard normal CDF.

The transition matrix $P$ is an $N\times N$ row-stochastic matrix: each row sums to 1.
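Algorithm 5.1 translates directly into a short Python sketch using the exact normal CDF (function names here are illustrative); a useful sanity check is that every row of $P$ sums to 1:

```python
import numpy as np
from math import erf, sqrt

def ncdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def tauchen(rho, sig_e, N=5, m=3):
    sig_z = sig_e / np.sqrt(1 - rho**2)
    z = np.linspace(-m * sig_z, m * sig_z, N)
    dz = z[1] - z[0]
    P = np.empty((N, N))
    for i in range(N):
        mu = rho * z[i]
        P[i, 0]  = ncdf((z[0] + dz/2 - mu) / sig_e)       # lower tail absorbs
        P[i, -1] = 1 - ncdf((z[-1] - dz/2 - mu) / sig_e)  # upper tail absorbs
        for j in range(1, N - 1):
            P[i, j] = (ncdf((z[j] + dz/2 - mu) / sig_e)
                       - ncdf((z[j] - dz/2 - mu) / sig_e))
    return z, P

z, P = tauchen(0.95, 0.007)
print(P.sum(axis=1))   # each row sums to 1 (row-stochastic)
```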

In APL, the Tauchen discretization exploits the outer product ∘.- to compute all differences simultaneously:

⎕IO←0 ⋄ ⎕ML←1

⍝ 1. Parameters
rho ← 0.95 ⋄ sig_e ← 0.007 ⋄ N ← 5 ⋄ m ← 3
sig_z ← sig_e ÷ (1 - rho*2)*0.5

⍝ 2. Evenly spaced grid on [-m×sig_z, m×sig_z]
z ← sig_z × (¯1 × m) + (⍳N) × (2 × m) ÷ N - 1
dz ← z[1] - z[0]

⍝ 3. Logistic approximation to the standard normal CDF
Phi ← { 1 ÷ 1 + * ¯1.702 × ⍵ }

⍝ 4. Transition matrix via the outer product ∘.-
⍝ U[i;j] = z[j] + dz÷2 - ρ×z[i]; the transpose puts the origin state i on rows
E ← rho × z
U ← ⍉ (z + dz ÷ 2) ∘.- E
L ← ⍉ (z - dz ÷ 2) ∘.- E
P ← (Phi U ÷ sig_e) - Phi L ÷ sig_e

⍝ 5. Boundary corrections: the edge columns absorb the tail mass
P[;0]   ← Phi ((z[0] + dz ÷ 2) - E) ÷ sig_e
P[;N-1] ← 1 - Phi ((z[N-1] - dz ÷ 2) - E) ÷ sig_e

⍝ 6. Clip tiny negatives from the CDF approximation and renormalize rows
P ← 0 ⌈ P
P ← P ÷[0] +/ P    ⍝ divide each row by its sum, so rows sum to exactly 1

ret ← 2 2 ⍴ 'Grid' z 'Transition Matrix' P
ret
┌→─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
↓             "Grid"                        ┌→────────────────────────────────────────────────────────────────────────────────────┐│
│                                           │-0.06725382459813659 -0.03362691229906829 0.0 0.03362691229906829 0.06725382459813659││
│                                           └─────────────────────────────────────────────────────────────────────────────────────┘│
│"Transition Matrix" ┌→───────────────────────────────────────────────────────────────────────────────────────────────────────────┐│
│                    ↓   0.9886881947148289  0.011108002114551237 4.640807064280832E-6 1.9623425921396166E-9 8.740061983817253E-13││
│                    │ 0.011310180040574639    0.9722564556760945 0.016226473749243887  6.976309931736691E-6  3.107195155267467E-9││
│                    │ 4.841378538597081E-6   0.02481454595410667   0.9515460873954106   0.02419098747695417 1.1046327375907569E-5││
│                    │2.0495559094083867E-9   7.15613463715283E-6 0.016226473749243794    0.9478248680087559  0.037833730037353706││
│                    │8.676577532913112E-13 2.0129248288344396E-9 4.640807064250625E-6   0.01082887192633153    0.9886022229108655││
│                    └────────────────────────────────────────────────────────────────────────────────────────────────────────────┘│
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

The Rouwenhorst (1995) method is an alternative that better approximates highly persistent processes ($|\rho|$ near 1) and is preferred for calibrated DSGE models where $\rho \approx 0.95$.
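For reference, a compact sketch of the Rouwenhorst construction as it is usually stated — grid endpoints $\pm\sigma_z\sqrt{N-1}$, and a transition matrix built recursively from the $2\times 2$ case with $p = q = (1+\rho)/2$. Treat this as a sketch of the standard recursion under those assumptions, not as this book's own implementation:

```python
import numpy as np

def rouwenhorst(rho, sigma_eps, N):
    """Discretize z' = rho*z + sigma_eps*eps as an N-state Markov chain."""
    p = (1 + rho) / 2
    Theta = np.array([[p, 1 - p],
                      [1 - p, p]])          # N = 2 base case (p = q here)
    for n in range(3, N + 1):
        T = np.zeros((n, n))
        T[:-1, :-1] += p * Theta            # embed the (n-1)-state matrix in
        T[:-1, 1:]  += (1 - p) * Theta      # the four corners of the n-state one
        T[1:, :-1]  += (1 - p) * Theta
        T[1:, 1:]   += p * Theta
        T[1:-1, :]  /= 2                    # interior rows were added twice
        Theta = T
    sigma_z = sigma_eps / np.sqrt(1 - rho**2)
    psi = sigma_z * np.sqrt(N - 1)          # grid endpoints ±psi
    z = np.linspace(-psi, psi, N)
    return z, Theta

z, P = rouwenhorst(0.95, 0.0072, 7)
print(P.sum(axis=1))                        # rows sum to 1
```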


5.11 Programming Exercises

Exercise 5.1 (APL — ACF)

Write a dfn acf ← {n ← ⍺ ⋄ x ← ⍵-+/⍵÷≢⍵ ⋄ ...} that computes the sample autocorrelation function at lags $0, 1, \ldots, n-1$ for a time series passed as right argument. The computation should use +.× (inner product) for each lag. Test on the simulated AR(1) from Section 5.9 and verify the ACF matches $\hat{\rho}^k$.

⎕IO←0 ⋄ ⎕ML←1

acf ← {
    n ← ⍺
    x ← ⍵ - (+/⍵) ÷ ≢⍵              ⍝ demean
    v ← x +.× x                     ⍝ sum of squares (unnormalized variance)
    ({(⍵↓x) +.× (-⍵)↓x}¨ ⍳n) ÷ v    ⍝ autocovariance at each lag, normalized
}

⍝ Test on a simulated AR(1)
T ← 500 ⋄ rho ← 0.8
eps ← 0.1 × ¯6 + +⌿ ? 12 T ⍴ 0      ⍝ approx N(0,0.1): sum of 12 uniforms minus 6
ar1 ← { +/ (rho * ⍳⍵+1) × ⌽(⍵+1)↑eps }¨ ⍳T
20 acf ar1                          ⍝ first 20 autocorrelations, ≈ 0.8*⍳20

Exercise 5.2 (Python — Spectral Density)

Plot the theoretical and estimated spectral density of the AR(1) TFP process calibrated with $\rho = 0.95$ and $\sigma_\varepsilon = 0.0072$. Estimate the empirical spectrum from a long simulation (e.g., with scipy's periodogram), and overlay the theoretical formula from Theorem 5.3. Shade the business-cycle frequency band (cycles of 6–32 quarters).

import numpy as np; import matplotlib.pyplot as plt
rng = np.random.default_rng(0)
rho, sigma, T = 0.95, 0.0072, 10000
eps = rng.normal(0, sigma, T)
a = np.zeros(T)
for t in range(1, T): a[t] = rho*a[t-1] + eps[t]

omega = np.linspace(0, np.pi, 1000)
spec_theory = sigma**2 / (2*np.pi * (1 + rho**2 - 2*rho*np.cos(omega)))

from scipy.signal import periodogram
f, Pxx = periodogram(a, window='hann')
omega_data = 2*np.pi*f
# periodogram returns a one-sided density per unit cycle; dividing by 4π
# converts it to the two-sided density per radian used in Theorem 5.3
Pxx_rad = Pxx / (4*np.pi)

fig, ax = plt.subplots()
ax.semilogy(omega_data[1:], Pxx_rad[1:], alpha=0.5, label='Empirical (periodogram)')
ax.semilogy(omega, spec_theory, 'r-', linewidth=2, label='Theoretical')
ax.axvspan(2*np.pi/32, 2*np.pi/6, alpha=0.1, color='green', label='BC frequencies')
ax.legend(); ax.set_xlabel('Frequency ω'); plt.show()

Exercise 5.3 (Julia — Tauchen vs. Rouwenhorst)

using Distributions, LinearAlgebra

function tauchen(rho, sigma_eps, N, m=3)
    sigma_z = sigma_eps / sqrt(1-rho^2)
    z = range(-m*sigma_z, m*sigma_z, length=N)
    dz = step(z)
    P = zeros(N, N)
    d = Normal(0, sigma_eps)
    for i in 1:N
        mu = rho * z[i]
        P[i,1] = cdf(d, z[1] + dz/2 - mu)
        P[i,N] = 1 - cdf(d, z[N] - dz/2 - mu)
        for j in 2:N-1
            P[i,j] = cdf(d, z[j]+dz/2-mu) - cdf(d, z[j]-dz/2-mu)
        end
    end
    return collect(z), P
end

z_grid, P = tauchen(0.95, 0.0072, 7)
println("Grid: ", round.(z_grid, digits=4))
println("Row sums (should be 1): ", round.(sum(P, dims=2), digits=6))

Exercise 5.4 (R — Wold Representation)

# Verify Wold MA coefficients for AR(1) match IRF
rho <- 0.8; sigma <- 0.1; T <- 500
set.seed(1)
a <- arima.sim(list(ar=rho), T, sd=sigma)

# Fit AR(1)
fit <- arima(a, order=c(1,0,0))
rho_hat <- as.numeric(coef(fit)[1])

# Wold MA coefficients: psi_j = rho^j
lags <- 0:20
psi <- rho_hat^lags

# Verify via impulse response from ARMAtoMA
psi_check <- ARMAtoMA(ar=rho_hat, lag.max=20)
cat("Max discrepancy:", max(abs(psi[-1] - psi_check)), "\n")

Exercise 5.5 — Persistence and the HP Filter ($\star$)

The Hodrick–Prescott (HP) filter with parameter $\lambda = 1600$ (quarterly) is commonly used to isolate business-cycle components with periods of roughly 6–32 quarters. For an AR(1) with $\rho = 0.95$, approximately what fraction of the total variance falls in the business-cycle band $[2\pi/32, 2\pi/6]$? Compute $\int_{2\pi/32}^{2\pi/6} 2S_X(\omega)\,d\omega \,/\, \sigma_X^2$ using numerical integration (Gaussian quadrature). How does this fraction change as $\rho$ varies from 0 to 0.99?

Exercise 5.6 — Hall’s Test ($\star\star$)

Using quarterly U.S. non-durable consumption growth data from FRED (series DNDGRD3Q086SBEA): (a) Test whether lagged consumption growth $\Delta\ln c_{t-1}$ helps predict current consumption growth $\Delta\ln c_t$ (Hall’s test for the martingale property). (b) Test whether lagged income growth $\Delta\ln y_{t-1}$ has predictive power (excess sensitivity test). (c) Use HAC (Newey–West) standard errors for both tests. What do your results imply about the extent of liquidity constraints in the data?


5.12 Chapter Summary

Key results:

  • A probability space $(\Omega, \mathcal{F}, P)$ with filtration $\{\mathcal{F}_t\}$ formalizes the structure of information in dynamic economies. Rational expectations = conditional expectation given $\mathcal{F}_t$.

  • A process is covariance-stationary if its mean, variance, and autocovariances are all time-invariant. The autocovariance function (ACF) and spectral density are its complete second-moment characterizations.

  • The AR(1) $X_t = \rho X_{t-1} + \varepsilon_t$ has ACF $\rho_k = \rho^k$, variance $\sigma_\varepsilon^2/(1-\rho^2)$, and spectral density $\sigma_\varepsilon^2/(2\pi|1-\rho e^{-i\omega}|^2)$.

  • The Wold decomposition guarantees that any covariance-stationary process has an MA($\infty$) representation in terms of its own innovations — the foundation of structural VAR identification.

  • The Tauchen (1986) algorithm discretizes an AR(1) into a finite Markov chain using the ∘.- outer product in APL, enabling numerical dynamic programming.

  • Hall’s random walk — consumption is a martingale under rational expectations and quadratic utility — is the testable implication of the consumption Euler equation; its widespread empirical failure points to liquidity constraints.

Connections forward: Chapter 18 develops the full rational expectations solution methodology using these stochastic concepts. Chapter 19 uses ARMA and VAR processes for structural identification of macroeconomic shocks. Chapter 20 applies the Kalman filter to estimate unobserved state variables (potential output, the natural rate) from noisy data. Chapter 26 uses Monte Carlo simulation of AR(1) shock processes to generate the empirical moments of calibrated DSGE models.


Next: Part II — Static Macroeconomic Models and Comparative Statics