Retroactive Consensus Markets

Part I · The Engine — Chapter 2

Conventional prediction markets bind three things that need not be bound: (i) eliciting a forecast, (ii) declaring what actually happened, and (iii) paying out. This mechanism decouples elicitation from resolution and reward.

Forecasters merely publish timestamped probability distributions. Each capital holder ("resolver") privately declares, whenever they like, what they believe happened, and an automated proper-scoring rule retroactively scores every forecaster against that resolver's own ground truth. Forecasters are therefore paid for predicting the future consensus of resolvers — which, under a shared-reality assumption, is a proxy for predicting future common knowledge. The same engine doubles as a skill ranking of forecasters, which can later weight votes in a governance system.

The primitives

  • A set of questions $q \in Q$, each with an outcome space $\Omega_q$.
  • Each forecaster $i$ publishes, at times $t$, a timestamped distribution $p_{i,t}^q \in \Delta(\Omega_q)$: their belief about $q$ at time $t$. (Timestamps must be tamper-evident; see A4.) Forecasters stake information, not money.
  • Each resolver $r$ may, at a time $\tau_r^q$ of their choosing, declare a subjective resolution $y_r^q \in \Omega_q$: their own private verdict on what happened. Declaration is optional and revisable; a resolver may simply discard any question they find ambiguous or manipulated.
  • A strictly proper scoring rule $S(p, y)$ (canonically the log score $S(p, y) = \log p(y)$), whose defining property is that reporting one's true belief uniquely maximizes expected score.
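
The properness property in the last bullet can be checked numerically. The following is a minimal sketch, not part of the mechanism itself: the belief values and the helper names `log_score` and `expected_score` are illustrative assumptions. It compares the expected log score of an honest report against two shaded reports and then scans a grid of alternatives.

```python
import math

def log_score(report, outcome):
    """Log score S(p, y) = log p(y) for a report given as {outcome: probability}."""
    return math.log(report[outcome])

def expected_score(report, belief):
    """Expected log score of `report` when the outcome is drawn from `belief`."""
    return sum(prob * log_score(report, y) for y, prob in belief.items() if prob > 0)

# Hypothetical forecaster who truly believes P(yes) = 0.7.
belief = {"yes": 0.7, "no": 0.3}

honest = expected_score(belief, belief)
overconfident = expected_score({"yes": 0.9, "no": 0.1}, belief)
hedged = expected_score({"yes": 0.5, "no": 0.5}, belief)

# Scan a grid of alternative reports; the best one should coincide with the belief.
grid = [k / 100 for k in range(1, 100)]
best_report = max(grid, key=lambda x: expected_score({"yes": x, "no": 1 - x}, belief))

print(honest > overconfident, honest > hedged, best_report)
```

Under these assumed numbers the honest report beats both the overconfident and the hedged one, which is the behavior a strictly proper rule is supposed to guarantee.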

The reward rule

Score forecaster $i$ against resolver $r$'s verdict $y_r^q$ on question $q$, relative to a reference belief $\bar{p}_t^q$ (e.g. the consensus belief just before $i$'s update at time $t$):

$$ s_{i,r}^q = \sum_t w(t) \left[ S(p_{i,t}^q, y_r^q) - S(\bar{p}_t^q, y_r^q) \right] $$

where $p_{i,t}^q$ is the distribution $i$ published at time $t$, $S$ is the scoring rule, and $w(t)$ is a time weight.

  • The reference subtraction is Hanson's market scoring rule: a forecaster is paid for the marginal movement of belief toward the eventual verdict. You profit only by being right and different from the crowd: the formal version of "surprising and true." Echoing consensus earns nothing; moving belief the wrong way costs.
  • $w(t)$ can up-weight earlier forecasts, further rewarding being right before it was common knowledge.

Forecaster $i$'s reputation with respect to resolver $r$ is $R_{i,r} = \sum_q s_{i,r}^q$. Resolver $r$ distributes a reward budget $B_r$ across forecasters as a monotone function of $R_{i,r}$. Equivalently: an automated market maker retroactively "places bets" on each forecaster's behalf using their published probabilities, then settles against $r$'s verdict.
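
The reward rule can be sketched for a single resolver and a single question as follows. This is an illustration under stated assumptions, not a reference implementation: the forecaster names, the 50/50 reference belief, the constant default weight, and the proportional `allocate` function (one monotone payout choice among many) are all hypothetical.

```python
import math

def log_score(report, outcome):
    """Log score of a report given as {outcome: probability}."""
    return math.log(report[outcome])

def marginal_score(forecasts, verdict, weight=lambda t: 1.0):
    """Weighted sum of (own score - reference score) over one forecaster's updates.

    `forecasts` is a list of (t, report, reference) tuples, where `reference`
    is the consensus belief just before the update at time t.
    """
    return sum(
        weight(t) * (log_score(report, verdict) - log_score(reference, verdict))
        for t, report, reference in forecasts
    )

def allocate(budget, reputation):
    """Assumed payout rule: split the budget in proportion to positive reputation."""
    positive = {i: max(r, 0.0) for i, r in reputation.items()}
    total = sum(positive.values())
    if total == 0:
        return {i: 0.0 for i in reputation}
    return {i: budget * r / total for i, r in positive.items()}

consensus = {"yes": 0.5, "no": 0.5}  # hypothetical reference belief
updates = {
    "alice": [(0, {"yes": 0.8, "no": 0.2}, consensus)],  # right and different
    "bob":   [(0, {"yes": 0.5, "no": 0.5}, consensus)],  # echoes consensus
    "carol": [(0, {"yes": 0.2, "no": 0.8}, consensus)],  # moves the wrong way
}

verdict = "yes"  # this resolver's private verdict
reputation = {i: marginal_score(f, verdict) for i, f in updates.items()}
payouts = allocate(100.0, reputation)
print(reputation, payouts)
```

With these inputs the forecaster who moved belief toward the verdict gains reputation, the one who echoed consensus scores zero, and the one who moved belief away scores negative. A decreasing `weight` function, such as `lambda t: 1.0 / (1 + t)`, would be one way to favor earlier forecasts.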

The structural facts that make this work:

  • Resolution is subjective (per-resolver), retroactive (scored after the fact), optional, and revisable. No global instant of truth exists to attack.
  • Forecasters are ranked relative to each resolver's worldview — no claim of objective truth, only skill at predicting a given resolver's eventual verdicts.

What it incentivizes — the consensus-proxy claim

Proposition (informal). Suppose a forecaster is risk-neutral, cannot influence resolutions, and maximizes total expected reward $\sum_r B_r \, \mathbb{E}[S(p, y_r^q)]$, where $B_r$ is resolver $r$'s reward budget and $y_r^q$ is $r$'s eventual verdict on question $q$. Because $S$ is strictly proper and the objective is linear in $S$, the uniquely optimal report is the budget-weighted predictive distribution of resolvers' eventual verdicts:

$$ p^{*}(\omega) = \frac{\sum_r B_r \, \Pr(y_r^q = \omega)}{\sum_r B_r}, \qquad \omega \in \Omega_q $$

In words: the money-optimal strategy is to forecast what the capital-weighted population of resolvers will eventually conclude. Under shared reality (A1) that population converges, so this is a proxy for future common knowledge. Forecasters keep updating, because stale beliefs decay their reputation relative to peers.
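
The proposition can be sanity-checked on a binary question. The sketch below assumes the log score and two hypothetical resolvers whose budgets and predicted verdict probabilities are made up for illustration. It searches a grid of reports for the one that maximizes budget-weighted expected score and compares it with the budget-weighted mixture.

```python
import math

# Hypothetical numbers: two resolvers with budgets B_r, and the forecaster's
# predictive probability that each will eventually declare "yes".
budgets = {"r1": 3.0, "r2": 1.0}
p_yes = {"r1": 0.9, "r2": 0.5}

def expected_reward(x):
    """Budget-weighted expected log score of reporting P(yes) = x."""
    return sum(
        budgets[r] * (p_yes[r] * math.log(x) + (1 - p_yes[r]) * math.log(1 - x))
        for r in budgets
    )

# Budget-weighted predictive distribution of verdicts.
mixture = sum(budgets[r] * p_yes[r] for r in budgets) / sum(budgets.values())

# Brute-force search over reports 0.01, 0.02, ..., 0.99.
grid = [k / 100 for k in range(1, 100)]
best_report = max(grid, key=expected_reward)

print(mixture, best_report)
```

Under these assumptions the grid optimum lands on the mixture: the better-funded resolver pulls the optimal report toward its expected verdict in proportion to its budget.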

This yields two outputs at once:

  1. A live, aggregated forecast of future resolver consensus on every question — a shared, evolving world-model.
  2. A skill ranking identifying who reliably predicts that consensus, usable downstream as voting weight.

Properties

  • No exploitable resolution. Each resolver settles privately, retroactively, and can discard bad questions — so there is no single oracle to bribe, no point-in-time ambiguity to exploit, and no forced verdict on genuinely unresolved questions. This defeats failure modes (2) and (3) from the previous chapter.
  • Optimization power without real-money risk. Reward is real but paid retroactively by resolvers who chose to pay, decoupled from a zero-sum betting pool — addressing failure mode (1) without importing the fragility of (2)–(3).
  • Manufactures common knowledge. Given existing common knowledge of resolutions, the mechanism computes a new common knowledge of resolution-forecasts — and identifies who is best at producing it.
  • Predicting ≈ deciding. Those who best predict future consensus are often those best positioned to say what is worth doing. The skill ranking can therefore weight a voting system, biased toward demonstrated forecasters to whatever degree a resolver desires.
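
One way the last property could be made concrete is a tunable blend between equal votes and skill-weighted votes. The sketch below is purely illustrative: the `skill_bias` parameter, the blending formula, and the reputation values are assumptions, not something the mechanism prescribes.

```python
def voting_weights(reputation, skill_bias):
    """Blend equal voting weights with reputation-proportional weights.

    skill_bias = 0.0 gives one-person-one-vote; skill_bias = 1.0 gives weights
    proportional to positive reputation. Both endpoints are assumed conventions.
    """
    if not 0.0 <= skill_bias <= 1.0:
        raise ValueError("skill_bias must lie in [0, 1]")
    n = len(reputation)
    positive = {i: max(r, 0.0) for i, r in reputation.items()}
    total = sum(positive.values())
    weights = {}
    for i in reputation:
        # Fall back to an equal share if nobody has positive reputation.
        skill_share = positive[i] / total if total > 0 else 1.0 / n
        weights[i] = (1.0 - skill_bias) / n + skill_bias * skill_share
    return weights

# Hypothetical reputations with respect to one resolver.
reputation = {"alice": 3.0, "bob": 1.0, "carol": -2.0}

equal = voting_weights(reputation, 0.0)
blended = voting_weights(reputation, 0.5)
skill_only = voting_weights(reputation, 1.0)
print(equal, blended, skill_only)
```

The weights sum to one for any `skill_bias` in the allowed range, so a resolver can move the dial without changing the total voting power.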

To see all of this in motion on a single concrete question, read the worked example. The premises it leans on are catalogued in The Five Assumptions.

Open problems

These are the mechanism's load-bearing risks. Each is taken up later in the book and either answered or reduced to a measurable quantity; a status tag follows each item.

  1. Reflexivity / beauty contest (A5). If the skill ranking feeds back into the votes that constrain resolvers, forecasters may be predicting an equilibrium they help create. This breaks properness and admits self-fulfilling but false focal points. Whether the system converges to truth or to a stable falsehood is the central theoretical gap. → Answered in principle: Reflexivity and the Dark Room / Mathematical Core §2. Linear herding is harmless; the danger is a bifurcation at a critical deference slope, which is measurable and controllable.
  2. Resolver honesty is unmodeled. Nothing forces honest resolution once verdicts gate public money. → Reduced to an inequality: Mathematical Core §2.4.
  3. Manufactured surprise. A poorly chosen reference can reward contrarianism for its own sake. → Mitigated by construction (lagged-consensus reference, frozen budgets): Mathematical Core §2.5.
  4. Collusion / Sybil. Resolver–forecaster collusion, or Sybil forecasters farming the reference subtraction. → Reframed and substantially answered: Sybil Resistance Is Independence / Mathematical Core §3.
  5. "Reward by extrapolated intention/utility." Rewarding the extrapolated value of information when verdicts are late — a desideratum, not yet an operator. → Defined: Mathematical Core §2.5.
  6. Causal resolution. Verdicts must be ex-post counterfactual contrasts, not raw conditional outcomes, or the scoring rule lends authority to a confounded judgment (futarchy's flaw). → Futarchy and Causality.