THE S&P 500, HONESTLY · A DATA-DRIVEN OVERVIEW

What 155 years of the U.S. stock market can — and can’t — tell you

155 years · 30+ models · 1 honest verdict on forecasting.

You cannot time the S&P. We prove it below, AI included. But valuation predicts the long run, and risk is forecastable. That’s where the signal lives.

Built from 30-plus models on the index’s full daily history (1871 → mid-2026), with Robert Shiller’s public valuation data merged in. We sell no predictions. The job is to separate what the data supports from wishful thinking. Observation, not advice.

THE WHOLE PICTURE

A century and a half up, through seven brutal crashes.

The index went from about 4 to roughly 7,400. Every chart here uses a log scale, the only honest way to show percentage moves evenly across that range.
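
A small sketch of why the log scale matters: on a log axis, the height between two levels depends only on their ratio. The function name `log_distance` is hypothetical and the level pairs are round numbers picked from the index's range, not data points.

```python
import math

def log_distance(low: float, high: float) -> float:
    """Vertical distance between two index levels on a log10 axis."""
    return math.log10(high) - math.log10(low)

# A doubling takes the same height on the chart whether it starts at 4 or at 3,700.
early = log_distance(4, 8)
late = log_distance(3_700, 7_400)
print(round(early, 4), round(late, 4))

# On a linear axis the second doubling would be drawn 925 times taller than the first.
linear_ratio = (7_400 - 3_700) / (8 - 4)
print(linear_ratio)
```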

The long run is up. The crashes are savage and recurring — 1929 (−86%), 1937, 1973–74, 1987, 2000–02 (−49%), 2008–09 (−57%), 2020 (−34%). Both are permanent features. Neither cancels the other.

overview

The 1929 crash wiped out 86% and took until 1954 to get back to even — more than two decades underwater. Volume only starts in 1950; nothing was recorded before.
PART 1 — THE VERDICT

Can anyone predict the index? No.

We tested every forecasting model the honest way: train on the past, predict the future it hadn’t seen, roll forward, repeat. The benchmark to beat is the dumbest forecast there is — “tomorrow’s level = today’s level.”

Almost nothing beats it. We put this first on purpose.
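
The test procedure described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline behind the charts: the skill score (1 minus the ratio of mean absolute errors), the `trend_chaser` model, and the synthetic random-walk series are all assumptions made for the example.

```python
import random

def walk_forward_skill(levels, forecaster, horizon=1, min_train=50):
    """Rolling-origin backtest: skill = 1 - MAE(model) / MAE(naive).

    Positive skill beats "future level = today's level"; zero or negative does not.
    """
    model_err = naive_err = 0.0
    for t in range(min_train, len(levels) - horizon):
        history = levels[: t + 1]          # only data up to and including day t
        actual = levels[t + horizon]       # the future the model has not seen
        model_err += abs(forecaster(history, horizon) - actual)
        naive_err += abs(history[-1] - actual)
    return 1.0 - model_err / naive_err

def naive(history, horizon):
    """The benchmark: the future level equals today's level."""
    return history[-1]

def trend_chaser(history, horizon):
    """Hypothetical model: extend the average daily move of the last 20 days."""
    recent = history[-21:]
    avg_step = (recent[-1] - recent[0]) / (len(recent) - 1)
    return history[-1] + avg_step * horizon

# Synthetic random-walk "index" (not real data), seeded for repeatability.
rng = random.Random(0)
levels = [100.0]
for _ in range(2000):
    levels.append(levels[-1] * (1 + rng.gauss(0.0003, 0.01)))

naive_skill = walk_forward_skill(levels, naive)          # zero by construction
trend_skill = walk_forward_skill(levels, trend_chaser)   # measured against the benchmark
print(naive_skill, round(trend_skill, 3))
```

The key design point is the slice `levels[: t + 1]`: at every step the forecaster sees nothing later than day t, which is what keeps the comparison honest.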

backtest_skill_heatmap

A wall of slate. The more slate, the worse the model did than just using today’s level; only a blue cell beats it. No model shows reliable price skill at any horizon, simple or AI.

backtest_error_by_horizon

Every forecaster’s error, plotted against how far out it looks. The lines sit on top of the benchmark and on top of each other. Clustered lines mean no edge.

model_gbm_direction

A model trained to call next month’s direction is right 56% of the time — worse than just saying “up” every month (61%), and barely above a coin flip.
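
The comparison in that caption is simple arithmetic: a hit rate only means something next to the share of months that went up anyway. A minimal sketch, with a hypothetical helper name `direction_edge`:

```python
def direction_edge(hit_rate: float, up_share: float) -> float:
    """Edge of a direction model over always guessing the more common direction."""
    baseline = max(up_share, 1.0 - up_share)
    return hit_rate - baseline

# Figures from the caption above: 56% hit rate, 61% of months up.
edge = direction_edge(0.56, 0.61)
print(f"{edge:+.2f}")  # negative: the model trails "always up"
```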

forecast_foundation_ensemble

Two independent AI forecasters, handed the exact same history, land on surprisingly close middle paths — about 17,000 and 21,000 by 2034, eight years out. But each model’s own honest range is enormous: one alone spans roughly 6,100 to 63,400, about a 10x spread. Even agreeing on the center, they admit they barely know.

backtest_turns_scorecard

Did the models call the famous turns (1929, 2000, 2007, the 2009 bottom, 2020) using only data from before each one? Mostly empty cells. Almost nothing flagged a top before it happened.
PART 2 — THE LONG RUN

Where the long run points, honestly bounded.

Short-horizon prediction has no skill. The long run is different. For nearly a century the index has climbed along one steady trend line, about 7% a year once you strip out the noise.

Independent methods agree on that central path. They also agree the spread around it is enormous, and the timing is unknowable. A direction is not a price target.

compare_all

History plus an honest forward range, built by replaying the index’s real daily moves from the last ~30 years. The middle rides the 7%-a-year trend; the range from a calm decade to a rough one spans more than 3x by 2036.

forecast_bayes_powerlaw

The same trend, projected with honest uncertainty: middle near 8,000 by 2036, inside a likely range of roughly 4,400 to 14,700. Where it lands, and when, nobody knows.

montecarlo_v2

Replay the index’s recent daily moves forward ten years and you get a range from about 7,500 in a rough decade to 27,000 in a strong one. About 9 in 100 replays even finish below today.
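
A "replay" of this kind is commonly done as a bootstrap: draw past daily returns at random, with replacement, and chain them into a future path. The sketch below assumes that approach, 252 trading days a year, and a synthetic return history; it is not the model behind the chart, and its printed numbers are not the chart's numbers.

```python
import random

def replay_paths(daily_returns, start_level, years=10, n_paths=500, seed=1):
    """Bootstrap: build each path by drawing past daily returns with replacement."""
    rng = random.Random(seed)
    days = 252 * years  # assumed trading days per year
    finals = []
    for _ in range(n_paths):
        level = start_level
        for r in rng.choices(daily_returns, k=days):
            level *= 1.0 + r
        finals.append(level)
    return sorted(finals)

def percentile(sorted_values, q):
    """Nearest-rank percentile of an already sorted list, q in [0, 1]."""
    idx = min(len(sorted_values) - 1, int(q * len(sorted_values)))
    return sorted_values[idx]

# Synthetic stand-in for ~30 years of daily returns (not the real series).
hist_rng = random.Random(42)
history = [hist_rng.gauss(0.0004, 0.011) for _ in range(252 * 30)]

finals = replay_paths(history, start_level=7_400)
low, mid, high = (percentile(finals, q) for q in (0.10, 0.50, 0.90))
below_start = sum(f < 7_400 for f in finals) / len(finals)
print(round(low), round(mid), round(high), f"{below_start:.0%} finish below the start")
```

Drawing days independently like this throws away volatility clustering, so a plain bootstrap will tend to understate how bad the rough decades can get.
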
WHAT U.S. STOCKS HAVE ACTUALLY RETURNED — AFTER INFLATION, DIVIDENDS REINVESTED
+-6%Best 10-year window (per year) — but −6% at worst
+0%Average 30-year window (per year)
3030-year windows that finished behind inflation
+0%Best 30-year window (per year)

forecast_real_return

Hold 30 years and, in the modern record, you have never finished behind inflation. The 10-year result swings far wider, and from today’s expensive starting point, history leans toward the weak, low end of that range.

decomp_secular_cyclical

Split into a growth spine and the swing around it. The 1966–82 and 2000–13 flat decades were just the market sitting below its spine for a while — ordinary cycle, not a breakdown.
PART 3 — WHAT IT MOVES THROUGH

The market lives in moods, not patterns.

Price level is unforecastable. The state the market is in is not. It runs calm for years, then flips into crisis and stays there. Those moods are sticky, and crises cluster around the famous crashes.

There are faint calendar and cycle rhythms too. They are real and they are tiny, and several are fading. None is a clock you can set a trade by.

regime_hmm

Every day colored by its mood — calm, choppy, or crisis. The crisis stretches cluster around 1929–32, 1987, 2000–02, 2008–09 and 2020. Right now it reads choppy.

regime_switching

A simpler two-mode read: the index spends about 80% of its days in a calm bull mode and 20% in a turbulent bear mode where moves are far bigger. It flags every famous crash as a bear stretch.
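
One way to picture "sticky moods" is a two-state Markov chain, where each day the market either stays in its current mode or switches. The sketch assumes that framing; the daily switching probabilities are hypothetical, chosen only so the long-run split comes out at 80/20, and are not estimates from the model above.

```python
def stationary_share(p_calm_to_bear: float, p_bear_to_calm: float):
    """Long-run share of days in (calm, bear) for a two-state Markov chain."""
    total = p_calm_to_bear + p_bear_to_calm
    return p_bear_to_calm / total, p_calm_to_bear / total

def expected_spell_days(p_leave: float) -> float:
    """Average length of a stay in a state that is left with daily probability p_leave."""
    return 1.0 / p_leave

# Hypothetical daily switching probabilities, picked only to reproduce an 80/20 split.
p_cb, p_bc = 0.005, 0.02
calm, bear = stationary_share(p_cb, p_bc)
print(round(calm, 2), round(bear, 2))
print(round(expected_spell_days(p_cb)), round(expected_spell_days(p_bc)))
```

Small switching probabilities are what make the modes persistent: with these made-up values a calm stretch lasts about 200 trading days on average and a turbulent one about 50.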

wavelet_scalogram

The strongest repeating rhythm runs about 8 years — the business cycle. It is real but it wanders, drifting in length and strength across eras. Never a fixed clock.

seasonality

The “stronger November-to-April” pattern, quantified: under half a percent a month (+0.46%), and it has faded from its mid-century peak. Real, honestly tiny, not a way to beat the market.

hurst

A “memory” read on daily moves. It has hovered near a coin-flip 0.50 for a century and sits at about 0.47 today. If anything, a faint tendency for moves to reverse, not to trend.
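
There are several ways to estimate this kind of memory number, and the page does not say which one it uses. The sketch below uses a variance-scaling estimate as a stand-in: if the standard deviation of k-day moves grows like k to the power H, then H is the slope of log(std) against log(k). The function name and the synthetic input are assumptions for illustration.

```python
import math
import random

def hurst_from_scaling(returns, lags=(1, 2, 4, 8, 16)):
    """Estimate H from how the std of k-day summed returns grows with k (std ~ k**H)."""
    xs, ys = [], []
    for k in lags:
        # Non-overlapping k-day sums of the daily returns.
        sums = [sum(returns[i : i + k]) for i in range(0, len(returns) - k + 1, k)]
        mean = sum(sums) / len(sums)
        var = sum((s - mean) ** 2 for s in sums) / (len(sums) - 1)
        xs.append(math.log(k))
        ys.append(0.5 * math.log(var))  # log of the standard deviation
    # Ordinary least-squares slope of log(std) on log(k).
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)

# Synthetic memoryless returns (not real data): H should come out near 0.5.
rng = random.Random(7)
noise = [rng.gauss(0.0, 0.01) for _ in range(20_000)]
h_noise = hurst_from_scaling(noise)
print(round(h_noise, 2))
```

A reading below 0.5 means multi-day moves are smaller than independent days would produce, which is the faint reversal tendency the caption describes.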
PART 4 — THE FORECASTABLE PART

Risk is the one thing you can model.

Price is unforecastable. Volatility is not. It clusters: a wild stretch tends to be followed by more wild days, a calm one by more calm days. That makes the near-term risk level genuinely predictable, unlike anything in price.

And the risk that matters most — crash risk — is far worse than a normal bell curve admits.

garch

Every spike lines up with a real crisis, and quiet stretches stay quiet for years. That clustering is what makes near-term risk, unlike price, genuinely predictable.
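
The chart's name suggests a GARCH-type model. A minimal sketch of the textbook GARCH(1,1) recursion shows how clustering turns into a forecast: tomorrow's variance is a constant plus a share of today's squared return plus a share of today's variance. The parameter values and the return series are hypothetical, not fitted to the index.

```python
import math

def garch_next_variance(ret, prev_var, omega=2e-6, alpha=0.10, beta=0.88):
    """One GARCH(1,1) step: tomorrow's variance from today's return and variance.

    omega, alpha, beta are hypothetical illustrative values, not fitted estimates.
    """
    return omega + alpha * ret**2 + beta * prev_var

def forecast_vol_path(returns, start_var=1e-4):
    """Run the recursion over a return series; return each next-day volatility forecast."""
    var, vols = start_var, []
    for r in returns:
        var = garch_next_variance(r, var)
        vols.append(math.sqrt(var))
    return vols

# Ten quiet days, one -7% shock, then ten quiet days (made-up returns).
returns = [0.002] * 10 + [-0.07] + [0.002] * 10
vols = forecast_vol_path(returns)
print(f"before shock: {vols[9]:.2%}  day after: {vols[10]:.2%}  ten days later: {vols[20]:.2%}")
```

The forecast jumps after the shock and then decays slowly, so it is still elevated ten quiet days later. That slow decay is the clustering.
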

evt

The real crashes (1987, 2008, 2020) land right on the orange fat-tail line and far below the bell curve, which treats those same days as all but impossible. Plan for the fat-tail number.
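
To see how strongly a bell curve rules out crash days, the sketch below computes the normal-distribution probability of a one-day fall of a given size and converts it to an expected wait. The 1% daily standard deviation and 252 trading days a year are assumptions for illustration, and the helper names are hypothetical. It covers only the bell-curve side of the comparison, not the fat-tail fit.

```python
import math

def normal_tail_prob(z: float) -> float:
    """P(Z <= -z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def years_between(prob_per_day: float, trading_days: int = 252) -> float:
    """Expected wait, in years, between days with the given daily probability."""
    return 1.0 / (prob_per_day * trading_days)

# Assumption for illustration: a typical daily standard deviation of about 1%.
daily_sigma = 0.01
for drop in (0.03, 0.05, 0.10):
    z = drop / daily_sigma
    p = normal_tail_prob(z)
    print(f"-{drop:.0%} day: z = {z:.0f}, bell-curve wait of about {years_between(p):.3g} years")
```

Under these assumptions a −10% day is a ten-sigma event that a bell curve would not expect within the age of the universe, yet such days are in the record.
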

model_drawdown

The deepest falls are also the slowest to heal. The bottom panel counts the years spent below the old high before a new record, so a tall bar there is a decade you waited just to break even.
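
Both quantities in that chart, the depth of a fall and the time spent below the old high, can be computed in one pass over a price series. A minimal sketch with a hypothetical function name and made-up levels:

```python
def drawdown_stats(levels):
    """Return (deepest drawdown as a negative fraction, longest stretch below a prior high)."""
    peak = levels[0]
    worst = 0.0
    underwater = longest = 0
    for level in levels:
        if level >= peak:
            peak = level        # new record (or back to even): the clock resets
            underwater = 0
        else:
            underwater += 1     # still below the old high
            longest = max(longest, underwater)
            worst = min(worst, level / peak - 1.0)
    return worst, longest

# Made-up yearly closes: a fall of half, then a slow climb back.
levels = [100, 120, 60, 70, 90, 110, 125, 130]
worst, longest = drawdown_stats(levels)
print(f"deepest fall {worst:.0%}, {longest} periods below the old high")
```
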
PLAN-FOR-IT CRASH RISK — THE WORST SINGLE DAY
0%Once a decade (fat-tail model)
−20.5% · Worst on record (Black Monday, Oct 1987)
0xWorse than a bell curve implies
−86% · 1929 peak-to-trough fall (recovered by 1954)

sim_jump_diffusion

An honest forward range must include the rare crash and spike days a normal bell curve forbids. This model leans deliberately crash-prone: about 74 in 100 ten-year paths see a down-day worse than −10% — more than the real record, which had only about four such days in a century.
PART 5 — THE ONE REAL FORECAST

Valuation is where the signal actually lives.

There is one place equities are genuinely forecastable, and it is not timing. It is valuation. Measure how expensive stocks are — price against the last ten years of earnings — and history has reliably pointed to the next decade’s real return.
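
The gauge described here matches Robert Shiller's cyclically adjusted price-to-earnings ratio: price divided by the ten-year average of inflation-adjusted earnings. A minimal sketch follows, with a hypothetical function name and made-up inputs rather than actual index earnings.

```python
def cape(real_price: float, real_monthly_earnings) -> float:
    """Price divided by the average of the last ten years of inflation-adjusted earnings.

    real_monthly_earnings holds annualized earnings per share, one value per month,
    oldest first; only the most recent 120 months are used.
    """
    if len(real_monthly_earnings) < 120:
        raise ValueError("need at least ten years (120 months) of earnings")
    window = real_monthly_earnings[-120:]
    return real_price / (sum(window) / len(window))

# Made-up inputs: earnings rising steadily from 40 to 60 over ten years, price 1,000.
earnings = [40 + 20 * i / 119 for i in range(120)]
ratio = cape(1_000, earnings)
print(round(ratio, 1))
```

Averaging over ten years is the point of the measure: it keeps a single boom or bust year in earnings from whipping the ratio around.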

Buy expensive and the next ten years have tended to be weak. Buy cheap and they have been strong. This is the report’s one forecast we can partly check against history — though these ten-year windows overlap, so there are far fewer truly independent decades than it looks. Read the tilt, not a precise law. And it is mute on timing.
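
The overlap caveat is easy to put in numbers. A sketch, assuming monthly data over roughly 145 years and a hypothetical helper `window_counts`:

```python
def window_counts(n_months: int, window_months: int = 120):
    """Count rolling monthly windows versus back-to-back, non-overlapping ones."""
    overlapping = max(0, n_months - window_months + 1)
    independent = n_months // window_months
    return overlapping, independent

# Roughly 145 years of monthly valuation data (1881 to mid-2026).
overlapping, independent = window_counts(145 * 12)
print(overlapping, independent)
```

More than 1,600 ten-year windows fit in the record, but only 14 of them can be laid end to end without sharing months, which is why the scatter looks more convincing than the evidence strictly is.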

valuation_cape_overlay

The same gauge across 145 years spikes into every famous top and sinks into every great bottom. Read it as context, not a timer: it ran expensive for years before both the 2000 and 2007 peaks.

valuation_cape

Each dot is one month since 1881: how expensive stocks were, against the inflation-adjusted total return (dividends reinvested) the next decade then delivered. The cloud slopes down. Start expensive, and the following ten years have tended to be weaker.

valuation_conditional_dist

Sort every month into ten valuation groups. The most expensive tenth still returned about +5% a year after inflation over the next decade. But today sits at that group’s very top edge (valuation 41, in a group that starts at 25), and the rare months as pricey as today did clearly worse: roughly −3% a year, in line with the downward slope of the cloud above.

valuation_erp

Stocks now yield about 2 percentage points a year less than bonds (a 2.4% earnings yield against a 4.5% bond yield), versus a long-run gap of +1.4 points in favor of stocks. A gap this low has shown up in only about one month in eight of history, mostly clustered near past market peaks.
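
The arithmetic behind that gap can be checked in a few lines. This sketch assumes the 2.4% earnings yield is simply the inverse of the valuation gauge (1/41), which the page does not state outright; the function names are hypothetical.

```python
def earnings_yield_pct(cape_ratio: float) -> float:
    """Earnings yield in percent, taken here as the inverse of the valuation ratio."""
    return 100.0 / cape_ratio

def yield_gap_pp(cape_ratio: float, bond_yield_pct: float) -> float:
    """Stock earnings yield minus bond yield, in percentage points."""
    return earnings_yield_pct(cape_ratio) - bond_yield_pct

# Figures quoted on this page: valuation 41, bond yield 4.5%.
stock_yield = earnings_yield_pct(41)
gap = yield_gap_pp(41, 4.5)
print(round(stock_yield, 1), round(gap, 1))
```
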
WHAT TODAY'S VALUATION IMPLIES — NEXT 10 YEARS, AFTER INFLATION
41 · Valuation gauge today (long-run average ≈ 18)
0%Of all months since 1881 were cheaper than now
≈ −3%/yr · What the closest analogs imply for the next decade
−5 to +0%Honest range around it (a range, not a number)
PUTTING IT TOGETHER · MID-2026

The honest map.

This is not advice to buy or sell. It is a map of where prediction works and where it is wishful thinking. Observation, not advice.

Short run

Unforecastable. Today's level is the best guess of tomorrow's. The AI models did no better. Ignore confident price targets.

Long run

Up, at roughly 7% a year through history, but the spread around that is huge and the timing is unknowable. A direction, not a target.

Where we are

A choppy market mood, near the high end of valuation. Flat decades are an ordinary part of the cycle, not a break.

Risk

The forecastable part. Volatility clusters; plan for a worse-than-10% down day in most decades and a multi-year drawdown when crashes hit.

Valuation

The one genuine long-horizon signal. Long-run real returns have averaged about +7% a year, but today's level is near its most expensive ever. From valuations this stretched, the next decade has historically run far weaker, with the closest analogs clearly negative (around −3% a year after inflation).

LIMITATIONS — READ THESE
  • One price series plus one valuation series (Robert Shiller's public data). No options, breadth, or sector data.
  • Return figures are real total returns — dividends reinvested, then adjusted for inflation — so they reflect what a holder actually earned, not just the index's price change. The index level charts elsewhere are price only.
  • The long-horizon forward cones can't be fully validated; with 155 years you have only a handful of independent multi-decade windows. Those cones are scenarios, not validated forecasts.
  • The valuation forecast is the one long-horizon claim we can partly check. It predicts the next decade's return, not its timing — valuation was 'expensive' for years before both the 2000 and 2007 tops.
  • The trend assumes the past pattern continues. Past growth doesn't guarantee future growth.
  • Descriptive, not predictive. These models describe what happened; they don't tell you what's next.

Descriptive research, not financial advice. The S&P 500 can fall by half in a downturn and stay underwater for years — as it has more than once in this dataset.

Appendix — every chart & model inventory →