The Dither Project
What is it?
Dither is a project with the ultimate goal of creating a decentralized and privacy-respecting Internet. It is a repository of libraries, tools, ideas, and applications that allow people to communicate privately, distribute data, manage accounts, and much more. It is currently being developed by @Zyansheep and (for the time being) hosted via GitHub under the libdither organization.
The aim for Dither is to replace existing centralized applications with decentralized alternatives that are unified through their use of a singular, modular protocol.
See the application document for outlines of various applications that could be built using Dither.
Core Design Tenets
It seems helpful for projects to have guidelines to aid design and collaboration. These are the ones I've chosen for now. As with everything, they are subject to change.
Dither should be useful
- Dither is a project. Projects should be useful.
- This one probably doesn't even need to be said, but while thinking about design it is always important to keep in mind that you eventually want to create something useful.
Dither should be modular and modelable
- Once you build something, it often gets harder and harder to add features to it unless it is clear how the parts of the system connect and you can separate your concerns. To ensure extensibility and comprehensibility for future Dither developers, modularity and modelability are key.
Dither should be interoperable
- The goal of Dither is to replace existing services and standards. This is very hard to do (see: xkcd #927). To try to avoid this fate and make transition as easy as possible, Dither should do anything necessary to make the experience the same, or better, than existing platforms and services.
- A real-world example of this working is PipeWire, a single program that unifies nearly all existing audio APIs on Linux while remaining backwards compatible with programs that use those other audio APIs.
- See Dither Interoperability for more details.
Dither should rely on itself
- This tenet is simply a reminder of the end-goal of Dither: to replace the centralized internet. So long as the other tenets are satisfied, this is the ultimate goal. (Because who doesn't want to reinvent the wheel!)
Structure
Following the modularity tenet of Dither, in the future the lines between these layers will blur and everything will be a module. However, the design of current operating systems doesn't easily allow for shared code and data, requiring a more formal structure for now. In the future this layered structure will be replaced with a more flexible system.
Core Process
The Core Dither Process handles all operating-system-facing operations, such as data storage and establishing peer-to-peer connections with other computers, and manages all Dither services and the connections between them.
This "Core Process" provides a few core APIs that only certain services running in the "Service Swarm" are allowed to use for security purposes. The idea behind a "Core Process" is to create a sandboxed environment for services to run in a safe manner.
The transport layer establishes peer-to-peer connections with other computers running Dither. Currently, in the dither-sim program, this is implemented via a simple TCP stream. In the future this layer will be implemented using existing libraries such as libp2p transports or Pluggable Transports, something else entirely, or some amalgamation of all three. The idea behind this layer is to provide as many methods of communication as is feasible.
Service Swarm
The service layer provides all functionality related to routing, encryption, data storage, user management, and everything else. Each of these services is split into separate modules, each of which runs its own processes and communicates with other services through inter-process communication.
All these processes are managed as child processes under one "main process". The main process contains the Transport Layer implementation, the routing protocol API and APIs for managing the child services as well as managing inter-process communication between child processes.
User Interface
The application layer contains services, just like the service layer, that are registered under the main process. These registered services' APIs can be used by other applications.
The application layer also refers to the application's core API, which is used by the interface layer. This "core application API" can be built into the application's executable, or it can run as a service under the main process and be used by multiple interfaces. This is left up to the application developers. Applications with multiple interfaces should prefer to register application APIs under the main process.
Existing planned applications may be found here.
Interface Layer
The final layer is the interface layer. This just refers to standalone applications that provide some kind of interface to the user using the services running under Dither.
These interfaces can be implemented in any way, but it is recommended that they follow Dither's application design philosophy for some level of standardization.
Other Links
Inspirations for Dither
As with any creative endeavor, Dither takes inspiration from many other projects. This is a list of which parts of Dither were inspired by which projects.
Structure
This document outlines the major structure of Dither.
System Manager
At the core of Dither is the system manager. This will be written in Rust and should only be run once on any given user account. The system manager provides a sandbox for Dither protocols and is built in a modular fashion to support any kind of setup or platform Dither might run on.
Core Services
In order to sandbox Dither services, the system manager provides certain core services such as access to storage, network, or other Dither services.
- Service Manager
- Gives any service that has access the ability to organize other services' permissions, manage storage, and stop and start services at will.
- Network Service
- Gives any service that has access the ability to establish arbitrary TCP or UDP connections.
- Storage Service
- Gives any service that has access the ability to fetch and store data unique to that service.
Other services may be added as needed.
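As a minimal sketch of how access to these core services could be gated, the Rust below keeps a per-service grant table and checks it before a request would be forwarded. The names `CoreService`, `SystemManager`, `grant`, `check`, and `AccessDenied` are hypothetical and only illustrate the idea; they are not Dither's actual API.

```rust
use std::collections::{HashMap, HashSet};

/// The three core services named above. Any further variants are hypothetical.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum CoreService {
    ServiceManager,
    Network,
    Storage,
}

/// Hypothetical error returned when a service lacks a grant.
#[derive(Debug, PartialEq, Eq)]
pub struct AccessDenied {
    pub service: String,
    pub requested: CoreService,
}

/// Toy stand-in for the system manager's permission table.
#[derive(Default)]
pub struct SystemManager {
    grants: HashMap<String, HashSet<CoreService>>,
}

impl SystemManager {
    /// Grant `service` access to one core service.
    pub fn grant(&mut self, service: &str, core: CoreService) {
        self.grants.entry(service.to_string()).or_default().insert(core);
    }

    /// Check a request before the sandbox would forward it to the core service.
    pub fn check(&self, service: &str, core: CoreService) -> Result<(), AccessDenied> {
        let allowed = self
            .grants
            .get(service)
            .map_or(false, |set| set.contains(&core));
        if allowed {
            Ok(())
        } else {
            Err(AccessDenied { service: service.to_string(), requested: core })
        }
    }
}
```

In this sketch a service with no entry in the table is denied everything, which matches the sandboxing intent described above.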
Service Swarm
The system manager acts as a kind of sandbox for all services running on Dither and facilitates communication between different services.
Services
This is a list of planned Dither services and their dependencies. Note that these may be split up into smaller sub-services.
- Distance-Based Routing, DBR (Network, Storage)
- Directional-Trail Search, DTS (Distance-Based Routing, Storage)
- Reverse-hash-lookup, RHL (Directional-Trail Search)
- User Manager (DBR, DTS, RHL, Storage)
- Dither Chat (DBR, DTS, RHL, User Manager, Storage)
- Dithca (DTS, RHL, User Manager, Storage)
- Protocol of Truth (DTS, RHL)
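Because each service lists its dependencies, the system manager could derive a start order from the list. The following is a sketch under that assumption, using a depth-first topological sort; `start_order` and `planned_services` are hypothetical helper names, and Network and Storage are modeled as dependency-free core services.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Returns an order in which every service appears after its dependencies,
/// or `None` if a dependency is unknown or the graph contains a cycle.
pub fn start_order(deps: &BTreeMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    fn visit(
        name: &str,
        deps: &BTreeMap<&str, Vec<&str>>,
        done: &mut BTreeSet<String>,
        in_progress: &mut BTreeSet<String>,
        order: &mut Vec<String>,
    ) -> bool {
        if done.contains(name) {
            return true;
        }
        // Reaching a service that is still being visited means a cycle.
        if !in_progress.insert(name.to_string()) {
            return false;
        }
        let children = match deps.get(name) {
            Some(children) => children,
            None => return false, // unknown service
        };
        for child in children {
            if !visit(child, deps, done, in_progress, order) {
                return false;
            }
        }
        in_progress.remove(name);
        done.insert(name.to_string());
        order.push(name.to_string());
        true
    }

    let mut done = BTreeSet::new();
    let mut in_progress = BTreeSet::new();
    let mut order = Vec::new();
    for name in deps.keys() {
        if !visit(name, deps, &mut done, &mut in_progress, &mut order) {
            return None;
        }
    }
    Some(order)
}

/// The dependency list above, written with the abbreviations it uses.
pub fn planned_services() -> BTreeMap<&'static str, Vec<&'static str>> {
    BTreeMap::from([
        ("Network", vec![]),
        ("Storage", vec![]),
        ("DBR", vec!["Network", "Storage"]),
        ("DTS", vec!["DBR", "Storage"]),
        ("RHL", vec!["DTS"]),
        ("User Manager", vec!["DBR", "DTS", "RHL", "Storage"]),
        ("Dither Chat", vec!["DBR", "DTS", "RHL", "User Manager", "Storage"]),
        ("Dithca", vec!["DTS", "RHL", "User Manager", "Storage"]),
        ("Protocol of Truth", vec!["DTS", "RHL"]),
    ])
}
```

Calling `start_order(&planned_services())` yields one valid order; any order that starts dependencies first would do equally well.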
Applications
Applications in Dither will be external programs that communicate with specific services in the Dither System Manager, e.g. Dither Chat will use the Dither Chat service and Dithca will use the Dithca service.
Or applications can use a multitude of different services as needed.
Dither Anonymous Routing
Dither Anonymous Routing (DAR) is a peer-to-peer protocol for efficiently and flexibly obfuscating connections between computers.
It improves on the speed and versatility of existing solutions (e.g. I2P and Tor) by incorporating latency and bandwidth estimation techniques so that intermediate relays may (optionally) be chosen optimally. It exposes this option to applications built on DAR, allowing developers and users to explicitly dictate the trade-off they prefer between speed and anonymity for each networked application they use.
In addition, to align itself with the philosophy of Dither, DAR aims to be as generic as possible, so that different encryption schemes and transport types may be used for different connections to hedge against the risk of one of them breaking, as well as easily allowing for novel schemes that may not be rigorously tested, but have desirable properties such as the stateless encryption protocol described in HORNET.
Network Modeling
To figure out which paths through the network are ideal for a given application and user (trading off latency, bandwidth, anonymity, and cost), we need to be able to model the effects of connecting to and routing through different nodes: "Will connecting to 245.14.973.23 to hide my connection still reach my latency or bandwidth targets?", "To what degree will it gain me anonymity under various threat models?", "What will it cost?", etc. Users should be able to set QOL goals that they are happy with (minimum performance and anonymity guarantees on a per-application basis) and have the protocol automatically figure out which nodes to select and route through. This is a general RL task, however, so solving it will likely require some general world-modeling algorithms.
A world model should be able to:
- Take two IP addresses, or a series of IP addresses and a timestamp, and guess the latency and bandwidth between them.
- Take a desired latency, bandwidth, and cost and generate likely single or multiple IP chains that satisfy them, or that come as close to satisfying them as possible.
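To make the second capability concrete, here is a sketch of selecting among candidate relay chains once a world model has produced estimates for them. The world model itself is not shown. `PathEstimate`, `Targets`, `shortfall`, and `select_path` are hypothetical names, and using the relay count as a stand-in for anonymity is a simplifying assumption rather than part of the design.

```rust
/// Hypothetical output of a world model for one candidate relay chain.
#[derive(Debug, Clone, PartialEq)]
pub struct PathEstimate {
    pub relays: Vec<String>, // IP addresses, in order
    pub latency_ms: f64,     // predicted end-to-end latency
    pub bandwidth_mbps: f64, // predicted bottleneck bandwidth
    pub cost: f64,           // predicted total price
}

/// Per-application goals set by the user.
pub struct Targets {
    pub max_latency_ms: f64,
    pub min_bandwidth_mbps: f64,
    pub max_cost: f64,
}

impl Targets {
    /// Sum of relative violations; 0.0 means every target is met.
    pub fn shortfall(&self, p: &PathEstimate) -> f64 {
        let over = |value: f64, limit: f64| ((value - limit) / limit).max(0.0);
        over(p.latency_ms, self.max_latency_ms)
            + over(self.min_bandwidth_mbps, p.bandwidth_mbps)
            + over(p.cost, self.max_cost)
    }
}

/// Pick the candidate closest to satisfying the targets. Among equally good
/// candidates, prefer more relays (a crude, assumed proxy for anonymity).
pub fn select_path<'a>(
    targets: &Targets,
    candidates: &'a [PathEstimate],
) -> Option<&'a PathEstimate> {
    candidates.iter().min_by(|a, b| {
        targets
            .shortfall(a)
            .total_cmp(&targets.shortfall(b))
            .then(b.relays.len().cmp(&a.relays.len()))
    })
}
```

A chain that violates a target is still returned when nothing better exists, which mirrors the "as close to satisfying them as possible" requirement.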
Ideally this world model should be trainable in a federated / decentralized fashion. It is an open problem for how this could be done without divulging local information. (Perhaps it could be incorporated into the loss function for it to be bad at predicting certain "sensitive" metrics, at least for versions of the model sent to other nodes. This would somehow have to be balanced well with the game theory of the other nodes.)
Once you do have a world model, you need to RL it. What are the rewards we are maximizing?
- Predictive accuracy of metrics outside of yourself. (Predict results of actions)
- Measurements, cost requests
- Path selection metrics
What about the incentive layer here? Let's assume we have an out-of-band currency exchange system that is anonymous.
Nodes are trying to maximize their own income from acting as proxies. Cost is similar to latency or bandwidth: it's a metric you receive from pinging a node, and it can be predicted.
Nodes receive proxy requests and send back a price. They have some baseline resource usage and want to maximize cumulative income over time. When a node gets a proxy request, it can either accept or send back its own price, which the sending node can either accept or reject. RL algorithms will then need to learn how to bargain with each other automatically within their constraints.
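The exchange described above (request, then accept or counter-offer, then accept or walk away) can be sketched as a tiny message protocol. The fixed policies below are placeholders for whatever a learned agent would decide; `ProxyReply`, `relay_reply`, `negotiate`, and the reserve-price rule are all hypothetical.

```rust
/// A relay's reply to a proxy request that offers `offered` per unit of traffic.
#[derive(Debug, PartialEq)]
pub enum ProxyReply {
    Accept,
    CounterOffer(u64),
}

/// Hypothetical relay policy: accept any offer that covers the reserve price,
/// otherwise answer with the reserve price.
pub fn relay_reply(offered: u64, reserve_price: u64) -> ProxyReply {
    if offered >= reserve_price {
        ProxyReply::Accept
    } else {
        ProxyReply::CounterOffer(reserve_price)
    }
}

/// Hypothetical sender policy: take a counter-offer only if it fits the budget.
/// Returns the agreed price, or `None` if the sender walks away.
pub fn negotiate(offered: u64, budget: u64, reserve_price: u64) -> Option<u64> {
    match relay_reply(offered, reserve_price) {
        ProxyReply::Accept => Some(offered),
        ProxyReply::CounterOffer(price) if price <= budget => Some(price),
        ProxyReply::CounterOffer(_) => None,
    }
}
```

An RL agent would replace both policy functions while keeping the same message shapes.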
Peer Discovery
To form a network, there needs to be a process for new nodes to connect to existing nodes. Dither Anonymous Routing aims to allow peers to be as anonymous as possible, and thus the peer discovery method aims to expose as little as possible about nodes on the network. Specifically, as little information as possible about nodes that don’t want to be discovered, and nodes far away (latency-wise) from the new node.
TLDR: Peer Discovery must expose information about some nodes, but should only expose information about nodes that are nearby and want to be known.
For more information, check out the Discovery section.
To assign routing coordinates to nodes, there is a process of peer-discovery that functions as follows. This process happens whenever a new node joins the network.
1. The new node bootstraps onto the network by initiating connections to one or more existing nodes.
2. The new node tests response times (latency) to its connected nodes (peers).
3. The new node asks some subset of its lowest-latency (closest) peers for more peers.
4. The new node’s peers notify some slice of their own peers that a new node would like more connections.
5. Notified nodes that are configured to do so initiate connections with the new node, and both sides measure latency to their new peers.
6. The new node takes note of the smallest latencies among its peers and goes back to step 3 until there are no closer nodes that want to peer.
7. Once a certain number of closest nodes with stable latency measurements have been found, the new node calculates its routing coordinates and records its currently connected nodes so that it can reconnect after going offline.
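The iterative part of this process (ask the closest peers for introductions, re-rank by latency, repeat until nothing closer appears) can be sketched as a small simulation. Real connections and pings are replaced by lookup tables, and coordinate calculation is left out. `ToyNetwork`, `discover`, and the parameter `k` are hypothetical.

```rust
use std::collections::{BTreeMap, BTreeSet};

pub type NodeId = u32;

/// Toy network: which peers each node is willing to introduce, and the
/// latency (in ms) the new node would measure to every node. Both tables are
/// assumptions standing in for real connections and real pings.
pub struct ToyNetwork {
    pub introductions: BTreeMap<NodeId, Vec<NodeId>>,
    pub latency_ms: BTreeMap<NodeId, u32>,
}

/// Returns the `k` lowest-latency peers found, closest first.
/// Panics if a discovered node has no entry in `latency_ms`.
pub fn discover(net: &ToyNetwork, bootstrap: &[NodeId], k: usize) -> Vec<NodeId> {
    let mut known: BTreeSet<NodeId> = bootstrap.iter().copied().collect();
    let mut closest: Vec<NodeId> = Vec::new();
    loop {
        // Rank every peer known so far by its measured latency.
        let mut ranked: Vec<NodeId> = known.iter().copied().collect();
        ranked.sort_by_key(|id| (net.latency_ms[id], *id));
        ranked.truncate(k);
        if ranked == closest {
            return closest; // no closer peers turned up in the last round
        }
        closest = ranked;
        // Ask the current closest peers to introduce some of their own peers.
        for peer in &closest {
            if let Some(intros) = net.introductions.get(peer) {
                known.extend(intros.iter().copied());
            }
        }
    }
}
```

The loop terminates because the set of known peers only grows and the network is finite, so the closest set must eventually stop changing.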
Through this process, a distributed network is formed that reflects the physical topology of the relative orientations of the nodes.
Process
A packet with an RRC can be routed to its destination via the following process:
- Node chooses the peer that will receive the packet next by comparing RRC directions
- Node subtracts next peer’s RRC from packet’s RRC
- Node forwards modified packet to next peer
The process continues until the packet’s RRC is all zeroes and the last node it reaches either is the destination node, knows the destination node, or is the wrong node in which case the packet is dropped or sent back depending on the packet type.
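One forwarding step of this process might look like the sketch below. It assumes an RRC is an integer vector, that each peer's RRC is expressed relative to the current node, and that "comparing directions" means picking the highest cosine similarity. None of these details are fixed by the text above, and the names `Rrc`, `forward`, and `is_done` are hypothetical.

```rust
/// A relative routing coordinate, assumed here to be an integer vector.
pub type Rrc = Vec<i64>;

fn dot(a: &[i64], b: &[i64]) -> i64 {
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Cosine of the angle between two coordinates; -1.0 if either is all zeroes.
fn direction_match(a: &[i64], b: &[i64]) -> f64 {
    let norm = ((dot(a, a) * dot(b, b)) as f64).sqrt();
    if norm == 0.0 {
        -1.0
    } else {
        dot(a, b) as f64 / norm
    }
}

/// One forwarding step: pick the peer whose coordinate points most nearly in
/// the packet's direction, then subtract that peer's coordinate.
/// Returns the chosen peer's index and the packet's new RRC.
pub fn forward(packet: &[i64], peers: &[Rrc]) -> Option<(usize, Rrc)> {
    let (best, peer) = peers.iter().enumerate().max_by(|(_, a), (_, b)| {
        direction_match(packet, a).total_cmp(&direction_match(packet, b))
    })?;
    let remaining: Rrc = packet.iter().zip(peer).map(|(p, q)| p - q).collect();
    Some((best, remaining))
}

/// Forwarding stops once the packet's RRC is all zeroes.
pub fn is_done(packet: &[i64]) -> bool {
    packet.iter().all(|&c| c == 0)
}
```

For example, a packet with RRC `[3, 4]` and peers at `[1, 0]`, `[0, 2]`, and `[-1, -1]` is handed to the second peer and leaves with RRC `[3, 2]`.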
Usage
Compared to the global routing tables and complicated peering and address space allocation protocols that the existing internet uses, Routing Coordinates are much better for peer-to-peer applications because they are pretty much infinitely scalable.
That said, RCs in some ways give away more information than traditional IP addresses do. Since the self-organized networks that use RCs reflect real-world network topologies, just knowing someone’s routing coordinate relative to you could be akin to knowing roughly where they live. This is an acceptable risk because it is much easier to do efficient onion routing on networks with RCs than on those without, meaning that there is no reason not to have all connections onion-routed to some degree, providing better privacy overall.
Another benefit of routing coordinates is that they have the potential to almost completely prevent denial-of-service attacks. To even attempt such an attack, the attacker must find the routing coordinate of their target. Disregarding user error, this is essentially impossible since everything is onion-routed by default. Even if the attacker does have the target’s routing coordinate, trying to DoS a routing coordinate is like trying to DoS the entire expanse of network between the attacker and the target: the attacker(s) will be ineffective or will be blocked automatically by other nodes for overuse of the network. Even distributed denial-of-service attacks can be mitigated with additions to the protocol that allow the victim to notify the network that they are being attacked and to rate limit the attackers.
Anonymous Routing (Onions, Garlic, and all the others…)
Conventionally, anonymous routing is an incredibly slow ordeal because of how intermediate peers are selected from the network. Due to this inefficiency, onion routing protocols have been somewhat limited in the kind of privacy they can provide, because low data rates and high latency were a concern. This is no longer the case with DBR, which may support all kinds of anonymous routing schemes:
- Onion Routing
- The simplest routing of them all. Simply select a list of peers and establish a route from beginning to end.
- Garlic Routing
- Similar to onion routing, but when sending packets to multiple peers at once, send them together for them to be split apart at some mid-point in the path.
- Multi-path routing
- Maintain multiple Onion routes throughout the network and randomly send packets along all or some subset of them.
- Pool Routing
- Create a group that nodes can join. All nodes randomly send randomly-sized data packets to all other nodes at random intervals, sending real data (padded) if there is data to send and sending random bytes if not.
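For pool routing, real and cover traffic have to look alike on the wire. One way to sketch that is a padded frame with a length prefix, as below. The frame layout, the names `build_frame` and `read_frame`, and the `filler` callback (which a real node would back with a random number generator) are assumptions. Encryption of the whole frame, which would be needed to hide the length field, is not shown.

```rust
/// Build one pool-routing frame of exactly `frame_len` bytes.
/// Assumed layout: 4-byte big-endian payload length, payload, filler bytes.
/// A length of zero marks pure cover traffic.
pub fn build_frame(
    payload: Option<&[u8]>,
    frame_len: usize,
    mut filler: impl FnMut() -> u8,
) -> Option<Vec<u8>> {
    let data = payload.unwrap_or(&[]);
    if data.len() > u32::MAX as usize || frame_len < 4 + data.len() {
        return None; // payload does not fit in a frame of this size
    }
    let mut frame = Vec::with_capacity(frame_len);
    frame.extend_from_slice(&(data.len() as u32).to_be_bytes());
    frame.extend_from_slice(data);
    while frame.len() < frame_len {
        frame.push(filler()); // random bytes in a real node
    }
    Some(frame)
}

/// Recover the real payload, if any, from a received frame.
pub fn read_frame(frame: &[u8]) -> Option<&[u8]> {
    if frame.len() < 4 {
        return None;
    }
    let len = u32::from_be_bytes([frame[0], frame[1], frame[2], frame[3]]) as usize;
    if len == 0 {
        return None; // cover traffic carries no payload
    }
    frame.get(4..4 + len)
}
```

The caller picks `frame_len` at random for each frame, so frame size says nothing about whether real data is inside.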
DBR plans to use a modification of the HORNET protocol for setting up fast onion-routed links.
Preventing Network Abuse
Routing protocols that rely solely on people voluntarily hosting nodes typically have only a relatively small number of peers willing to route packets through themselves (e.g. Tor). This is why protocols like BitTorrent, I2P and IPFS have systems in place that incentivize peers who use the network to contribute back for the benefit of all.
To accomplish this for DBR, there must be some way to limit packets going through nodes that either don’t use the network much or don’t have a lot of bandwidth capacity, and to speed up packets through nodes that contribute greatly to the network. The management of nodes with inconsistent uptime or inconsistent routing must also be taken into consideration.
When talking about incentives, we are talking about game theory. So let's analyze the game-theoretic situation at the level of an individual node.
Constraints:
- Each node is directly connected to a fixed number of other nodes at varying latencies and bandwidths.
- Each node wants to send traffic through other nodes to use the network.
- Each node wants to establish onion proxies with other nodes for privacy.
- Each node has a set of parameters that may change over time:
- Percentage of the time it will immediately respond and route a packet.
- Amount of traffic per unit time it is willing to route on average.
- Max amount of traffic per unit time it can route.
The goal is to allow for unrelated nodes to route and establish proxies through each other in proportion to how much each node contributes in some way to the network.
Ideas:
- Each node keeps track of the amount of traffic (bytes) flowing through itself from directly connected nodes.
- Each node only sends traffic through direct nodes it knows it has received traffic from.
- There is more to theorize about here for future research :)
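The first two ideas can be sketched as a per-peer byte ledger. The rule used here (never send more through a peer than has been received from it) is a stricter variant chosen for illustration; the names `TrafficLedger`, `record_received`, `allowance`, and `try_send` are hypothetical.

```rust
use std::collections::HashMap;

pub type PeerId = u32;

/// Per-node accounting of bytes received from and sent through direct peers.
#[derive(Default)]
pub struct TrafficLedger {
    received: HashMap<PeerId, u64>,
    sent: HashMap<PeerId, u64>,
}

impl TrafficLedger {
    /// Record traffic that arrived from a directly connected peer.
    pub fn record_received(&mut self, peer: PeerId, bytes: u64) {
        *self.received.entry(peer).or_insert(0) += bytes;
    }

    /// Bytes we may still push through `peer`. Assumed rule: never send more
    /// through a peer than we have received from it.
    pub fn allowance(&self, peer: PeerId) -> u64 {
        let received = self.received.get(&peer).copied().unwrap_or(0);
        let sent = self.sent.get(&peer).copied().unwrap_or(0);
        received.saturating_sub(sent)
    }

    /// Try to send `bytes` through `peer`; refuse if it exceeds the allowance.
    pub fn try_send(&mut self, peer: PeerId, bytes: u64) -> bool {
        if bytes > self.allowance(peer) {
            return false;
        }
        *self.sent.entry(peer).or_insert(0) += bytes;
        true
    }
}
```

Taken literally, this rule leaves a brand-new node unable to send anything, so a real design would need some initial credit or another bootstrap mechanism.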
Conflicts with ISP Load Balancing
(This section is WIP)
Implementing an alternative routing protocol on top of regular IP routing may pose issues for ISP routing (e.g. forcing too much utilisation of certain links, causing major slowdowns). ISPs don't optimize for latency or bandwidth; they optimize for load balancing to prevent excessive link utilisation. TODO: Dither should take this into account by implementing its own second-layer load balancing system that makes sure ISP links aren't overloaded.
Questions for this problem:
- How might peer-to-peer overlay networks (of various kinds) affect an ISP’s ability to do load balancing well?
- ISPs simply have to do more buffering, which lengthens queue times for specific links and makes those links unattractive, shifting route prioritization to less desirable ones -> the problem is that this makes those routes unusable for regular (nearby) people because they are instead hijacked by through-traffic.
Research
- Dawn - Selfish overlay compensate for careless underlay
- If overlay networks can or do conflict with ISP load balancing effort, how can that conflict be reduced via design of the protocol?
- Maybe reimplement load balancing into the network protocol?
- What might this look like in a peer-to-peer setting?
Some set of nodes in the network:
- Sending along certain local paths (need to be careful to not overwhelm bandwidth along local link: not likely)
- Sending along mid-range path, some room for path diversity, but if everyone is doing it, it may create hotspot: need
Main issue: A is sending a large file to B, anonymously. If they route through one path, this will be a problem; if they route along multiple paths, ISPs have more ability to distribute the load.
- More paths = more chance for paths to be surveilled: is it obvious who is chatting with whom? -> More obvious than with just one path
- Mixnets ? -> How fast would these be for large data transfers?
Research, Inspirations & Similar Work
(List in rough order of reading of notable articles I am using to implement Dither Routing. If anyone knows of any similar work not listed here, please let me know on Matrix!)
[1] HORNET: High-speed Onion Routing at the Network Layer
- Stateless Onion Routing, improves establishment of onion routes as well as speed of forwarding.
[2] Vivaldi: A Decentralized Network Coordinate System
- The paper that pretty much started the distributed Network Coordinate System field of research.
[3] Coordinate-Based Routing for High Performance Anonymity
- Applying Vivaldi to Anonymous Routing
[4] Phoenix: A Weight-Based Network Coordinate System Using Matrix Factorization
- Improvements on Vivaldi, uses Matrix Factorization instead of Euclidean Embedding
[5] NCShield: Protecting Decentralized, Matrix Factorization-Based Network Coordinate Systems
- Threat modeling on Network Coordinate Systems. Prevents Frog-Boiling attacks.
[6] DMFSGD: A Decentralized Matrix Factorization Algorithm for Network Distance Prediction
- Improvements on Phoenix paper’s algorithm.
[7] Application-Aware Anonymity, Sherr et al.
Directional Trail Search
Directional Trail Search (DTS) is a protocol for efficiently fetching a piece of stored data from a network given its hash.
It is intended to be a vast improvement over existing protocols like IPFS: it removes the use of poor-latency Distributed Hash Tables (DHTs), gives up some amount of data persistence in exchange for fast anonymous retrieval of popular documents and flexibly anonymous hosting, and fixes the scalability issues to create a true "inter-planetary" file system.
The Main Idea
DTS works via two forces:
- The desire of people (nodes) who want to host information and want their information to be quickly accessible.
- The desire of people (nodes) who want to access information and want to access it as fast as possible.
These two forces are corralled by DTS to create an efficient system to host and find data. First let us talk about the desire of a node who wants to host some data. This desire manifests in the form of a "data trail".
A data trail in DTS is a trail left in the network with the sole purpose of leading to a specific piece of data. Specifically, a trail is a chain of peered nodes that store a mapping between the hash of a specific piece of data, and the id of the next peer in the chain.
A data trail is formed with the following process:
- A node that wants to host information broadcasts a "trail-laying" packet that travels to a specific relative coordinate on the network.
- All the nodes that this "trail-laying" packet encounters on the way to its destination will do one of the following:
- Refuse to be a part of the trail, sending the packet back to the previous node in the chain. This will reflect poorly on the node if it wants to host data of its own.
- Register a connection between the hash contained within the packet and the id of the node the packet came from, and forward the packet on to another node a consistent distance away that it thinks will agree to be a part of the chain.
- Allowing nodes to either host data and be a part of trails, or not host data and not be a part of trails, makes sure packets flow along computers that actually host data and are likely to be relatively stable and have resources to spare.
Once a data trail is formed, it may be encountered by a "trail-tracing" packet which, once on the trail, is routed directly to the root of the trail.
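The trail-laying and trail-tracing behaviour can be sketched with one map per node from data hash to "the peer the packet came from". This is a toy, in-memory model: `Network`, `lay_trail`, and `trace` are hypothetical names, `u64` stands in for a real multihash, and nodes never refuse to join a trail.

```rust
use std::collections::HashMap;

pub type NodeId = u32;
pub type DataHash = u64; // stand-in for a real multihash

/// Each node's trail table: data hash -> id of the peer one step closer to the host.
#[derive(Default)]
pub struct Network {
    trails: HashMap<NodeId, HashMap<DataHash, NodeId>>,
}

impl Network {
    /// Lay a trail along `path`, where `path[0]` is the hosting node and each
    /// later node registers the node the trail-laying packet came from.
    pub fn lay_trail(&mut self, hash: DataHash, path: &[NodeId]) {
        for pair in path.windows(2) {
            let (came_from, node) = (pair[0], pair[1]);
            self.trails.entry(node).or_default().insert(hash, came_from);
        }
    }

    /// Follow the trail from `start` back to its root. Returns the hosting
    /// node, or `None` if `start` holds no trail entry for `hash`.
    pub fn trace(&self, hash: DataHash, start: NodeId) -> Option<NodeId> {
        let mut current = *self.trails.get(&start)?.get(&hash)?;
        // Bound the walk so a malformed (cyclic) trail cannot loop forever.
        for _ in 0..self.trails.len() {
            match self.trails.get(&current).and_then(|t| t.get(&hash)) {
                Some(&next) => current = next,
                None => return Some(current), // no onward mapping: this is the root
            }
        }
        None
    }
}
```

After laying a trail along nodes 10, 11, 12, 13 (with 10 as the host), a trace started at node 13 or node 11 ends at node 10.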
However, this trail of nodes may be very thin, and thus trail-searching packets may have a hard time finding a trail because they skip over too many nodes. To fix this issue, trail-laying packets will broadcast out to all known peers within a certain range that they are a part of a trail related to a specific hash. Peers receiving this broadcast will use a counting bloom filter to make a record that they are near a trail for a specific hash. "Trail-searching" packets that come across these "nearby-trail" nodes can ask the nodes to ask all their peers within a certain radius whether those peers are a part of a real trail. This casts a larger net and makes it easier to find trails without putting undue burden on too many nodes.
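A counting bloom filter differs from a plain bloom filter in that each slot is a counter, so a "nearby trail" record can be removed again when a trail disappears. Below is a small sketch using only the standard library. The slot count, the number of hash functions, and the seeding scheme are arbitrary choices for illustration.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A small counting Bloom filter: each slot is a counter rather than a bit,
/// so entries can be removed as well as added.
pub struct CountingBloom {
    counters: Vec<u8>,
    hashes: u64,
}

impl CountingBloom {
    pub fn new(slots: usize, hashes: u64) -> Self {
        assert!(slots > 0 && hashes > 0);
        CountingBloom { counters: vec![0; slots], hashes }
    }

    /// Slot indices for `item`, derived by seeding the hasher with 0..hashes.
    fn slots<T: Hash>(&self, item: &T) -> Vec<usize> {
        (0..self.hashes)
            .map(|seed| {
                let mut hasher = DefaultHasher::new();
                seed.hash(&mut hasher);
                item.hash(&mut hasher);
                (hasher.finish() % self.counters.len() as u64) as usize
            })
            .collect()
    }

    pub fn insert<T: Hash>(&mut self, item: &T) {
        for slot in self.slots(item) {
            self.counters[slot] = self.counters[slot].saturating_add(1);
        }
    }

    /// Only call this for items that were inserted; otherwise counts are corrupted.
    pub fn remove<T: Hash>(&mut self, item: &T) {
        for slot in self.slots(item) {
            self.counters[slot] = self.counters[slot].saturating_sub(1);
        }
    }

    /// `false` means definitely absent; `true` means probably present.
    pub fn may_contain<T: Hash>(&self, item: &T) -> bool {
        self.slots(item).into_iter().all(|slot| self.counters[slot] > 0)
    }
}
```

False positives are possible, which is acceptable here because a "nearby-trail" hit only triggers a follow-up query to peers.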
WIP: Section about the requester's role in the network and the network's expectation that requesters contribute to hosting for some time (like BitTorrent or IPFS).
Specific Structure (WIP)
Note: This is an insanely hard problem to solve well: finding the node hosting a piece of data that matches a given hash on a network. The protocol here is a formulation of a protocol that might work to solve this problem.
Every Dither node implementing DTS contains the following state:
- A hashmap that maps multihashes to their corresponding data stored on disk, so that the node may retrieve its own data.
- The size of this map should be primarily up to the node owner's discretion, depending on how much data they would like to store.
- Data may be temporarily added to this hashmap to help with the caching of other nodes' data.
- A hashmap that maps multihashes to peers.
- A bloom filter containing a set of multihashes of which there are trails nearby.
Every node will store the following information about their peers:
- WIP
Downsides
While Directional Trail Search is in theory much faster and much more efficient than DHTs, it is likely not as good when considering rare data. With a DHT, as long as there is at least one node hosting the data, it will be found eventually. With DTS, there is no guarantee that a piece of data will be found (i.e. if the data trails are too far away from the requesting node to be encountered by a searching packet).
There are multiple potential solutions in order of feasibility:
- Use a DHT in addition to DTS by default (this may have privacy implications)
- Figure out how to get DTS to work better for rare files, perhaps by making it so that trails form circles around the earth and thus are nearly impossible not to encounter.
- Store routing coordinates / routing areas on the Reverse Hash Lookup.
- Implement a Network Coordination feature that tells all nodes in the network to notify a node when they find the requested data. (This introduces a denial-of-service vector, so it is probably a bad idea.)
Reverse Hash Lookup (WIP)
Directed Acyclic Graph data structures (DAGs) may be elegant for storing and linking pieces of data, but they don't provide any kind of mutability on their own. This is the purpose of the Reverse Hash Lookup (RHL). RHL allows you to create a piece of data that links to two or more other pieces of data (via hash-linking) and then lookup the link given one of the linked pieces of data.
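In its simplest local form, the lookup direction described above is a reverse index from a target hash to the links that mention it. The sketch below shows only that data structure. Distribution and search across the network are not modeled, `u64` stands in for a real multihash, and `Link`, `ReverseIndex`, `publish`, and `links_to` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

pub type DataHash = u64; // stand-in for a real multihash

/// An immutable piece of data that hash-links two or more other pieces.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Link {
    pub targets: Vec<DataHash>,
}

fn hash_of<T: Hash>(value: &T) -> DataHash {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Local reverse index: target hash -> hashes of the links that mention it.
#[derive(Default)]
pub struct ReverseIndex {
    links: HashMap<DataHash, Link>,
    by_target: HashMap<DataHash, Vec<DataHash>>,
}

impl ReverseIndex {
    /// Store a link and index it under every object it points to.
    /// Publishing the same link twice has no further effect.
    pub fn publish(&mut self, link: Link) -> DataHash {
        let id = hash_of(&link);
        if self.links.insert(id, link.clone()).is_none() {
            for target in &link.targets {
                self.by_target.entry(*target).or_default().push(id);
            }
        }
        id
    }

    /// The reverse lookup: all known links that reference `target`.
    pub fn links_to(&self, target: DataHash) -> Vec<&Link> {
        self.by_target
            .get(&target)
            .into_iter()
            .flatten()
            .filter_map(|id| self.links.get(id))
            .collect()
    }
}
```

Mutability then comes from publishing newer links: the linked data never changes, but the set of links found for a given hash grows over time.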
Structure
RHL has many different ways to solve the two problems of distributing links and finding links. (Links here being pieces of data containing the hash of some other object that is being "linked to"). Links could be shared only with friends or trusted individuals, links could be broadcast all over the network and re-stored by other nodes, links could be stored in blockchains or link maps maintained by centralized servers, etc. All these different use-cases may be useful for different applications, so RHL tries to generalize over all possible use-cases.
Ideas:
- A Link defines its own methods of distribution and search. (Perhaps embedded as a hashtype).
Methods of Distribution & Search
- A link can be broadcast to some set of trusted nodes (friend group), queries are done by asking friends if they have registered any links to an object.
- A link can be registered in some sort of global consensus
- Central server(s) store any links that are uploaded and respond to queries
- Blockchain storing links (all "full" nodes store all links), queries done locally
- Global Binary Tree mapping hashes to links (less space, more network activity when searching)
- A link can be broadcast via a publish-subscribe system and exist ephemerally; links are found by constantly listening on a topic (or to the entire network).
Potential Applications
This kind of system is useful as a basis for other systems that need to link disparate pieces of data together for the purposes of querying or consensus. Here are just a few examples of the systems that RHL could enable:
- Comment systems. Each comment is a piece of data signed by some individual. Links can be created over the comments that allow for querying some subset of all the comments, e.g. "top comments", "set of all comments", or "comments from friends". These links themselves may be immutable, but can be joined together via other links in a chain where the newest link in the chain is the most up-to-date view of the comment system.
- Web of Beliefs. A justification / rule links two beliefs together, and beliefs may be private or publicly shared and distributed. Beliefs justified by some rule sets may be propagated more than others (e.g. scientific beliefs). Or beliefs can be accepted or rejected based on arbitrary social conditions like "what proportion of my friends hold belief x?" (such as is the case with names). Multiple conflicting beliefs can be held at once, for instance if it is not clear which one should be accepted as the default (or if there are uses for both beliefs in different contexts, such as with definitions).
- Anonymous Interactions. Links could be generated ephemerally to prove that a public key in some trusted set of public keys interacted with a piece of content in some manner, and then added to some hyperloglog or statistical counter.
Specification Ideas
Disp type Link<Type> is implemented for whatever hash type is used in the specific link needed. i.e.
For a link to be type-valid, the objects it links to must also be the correct type. Zero-knowledge proofs could be used for this application in the future to avoid unnecessary fetching.
struct Link<T: Type> {
hash: Multihash,
valid_link: (fetch(hash) : T)
}
User Management
Desirable Properties
- Flexible Provenance
- A user should be able to choose how strong the link between the data they publish and themselves is. (i.e. optional / zk proof / ring signature signing)
- Flexible Privacy
- A user should be able to choose to a reasonable extent who can see the data they publish. (i.e. one-to-many group-based publishing)
- Multi-device support & Realistic recovery mechanisms from compromised devices.
- If a user loses access to their devices or to a mechanism of authentication (e.g. a password), or a device gets compromised, there should be realistic mechanisms to repair the damage, both publicly and privately.
- Anonymous Collective Feedback
- For mechanisms of collective feedback (like/dislike counters, analytics), the data associated with a feedback event should not in general be attributable to a particular member of the group, but it should be verifiable that a unique member of the group gave the feedback.
- Flexible Storage Permissions
- Encrypted data associated with a user, stored across many different computers, should support permission hierarchies based on the sensitivity of the data, e.g. by requiring more factors of authentication for more sensitive data.
- Web-of-trust Identity
- Data associated with a user's identity should be vouchable by other users (in a web-of-trust) or by mechanistic processes. The identifying data should
Inspiration
- Scuttlebutt
- Polycentric
Identity
Identification is needed when one party must store and retrieve data associated with another party for the purpose of specific interaction.
Formally, identification is the process by which one party (identifier) under some assumptions internally associates some external stimulus with the known identity of another party, thus changing the identifying party's behavior.
For Example:
- A service (identifying party) must identify a user (identified party) using a username and password (external stimulus) to allow the user to log in (changed behavior).
- Main Assumption: The user's password or computer they are using has not been compromised.
- An individual must identify a fellow human being using their senses before treating them not as a stranger.
- Main Assumption: The fellow human is not attempting to disguise themselves as someone else.
- When receiving an encrypted message from a friend, the receiver must identify the friend by verifying the signature of the message using the friend's known public key.
- Main Assumption: The public key truly does belong to the friend and the friend's private key has not been compromised.
Design
In Dither, there will be two general categories of identification.
- Cryptographic identification
- Characteristic-based identification.
Cryptographic identification will occur when one needs to identify a party on the other end of a specific communication channel.
Characteristic-based identification will occur when one needs to find publicly declared associated data (e.g. public keys) of another party while knowing only certain characteristics of that party.
Disp
The key to decentralization is effective coordination of collaborating computational components.
Disp is a programming language where programs, types, and the type checker itself are all represented as regular programs in a self-reflective combinator calculus called the tree calculus. It is being developed to solve some of the fundamental problems with Dither, and in doing so should be pretty much the end-game of programming language design.
The problems to solve are:
- For a distributed system you need to be absolutely sure your code has no bugs and never will. To do so you need formal verification against the most rigorous set of bug-denying constraints you can come up with, ideally with a compiler that is itself verified in the same way.
- We need a language that can allow users to prove arbitrary equivalences and for these equivalences to be actively applied as optimizations, ideally directly to assembly.
- For a language to avoid ossifying over time, new ideas and designs must be able to be invented and then adopted automatically and safely. That means it must be possible to formally prove that a new design applies, and to safely rewrite old code with it, dynamically and at each individual user's discretion.
- As a subpoint here, the surface level details of the language should be user-customizable and fit to their preferred language syntax styles, whether that be block coding, ML-syntax, C-style, pythonic or what have you.
- In order to create things fast, it is generally infeasible to spend copious amounts of time worrying about specific algorithms and it would be ideal if programmers could simply write constraints and have an optimizer satisfy them. i.e. programming-language-native program synthesis.
The Core Idea: Types as predicate functions
The Core™ idea of disp is that type systems (the feature of compilers that handle whether or not a given program you've written fits some constraints) should be definable in the language itself. Thinking about type systems for a second, you essentially have a program in the meta-language that looks like this: check(term: Term, typ: Typ) where Term and Typ are some special datastructures. If you think about this though, specifically the case where you partially-apply typ to check, you get a function that just checks if a given term is of a particular type. But what if you could just define this as its own function?
This is what disp does and it makes it so that types are fundamentally first-class objects in the language, i.e. simply just functions that inspect some encoding of some data, and return true or false. More on this in Universal System of Types.
The key phrase is "inspect some encoding of some data", and this is where the backend combinator calculus that disp is built on comes in. It's called the tree calculus (invented by Barry Jay, with development help from Johannes Bader), and it allows you to directly inspect arbitrary trees (which can include datatypes and functions), similar to quote in Lisp.
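The step from check(term, typ) to "a type is just a function" can be sketched in Python. This is only an illustration of the idea, not Disp code: the tag-based check function, the string tags, and the use of plain Python values as terms are all assumptions made for the example.

```python
from functools import partial

# Hypothetical meta-level checker: `typ` is a string tag, `term` is a plain Python value.
def check(term, typ):
    if typ == "Bool":
        return isinstance(term, bool)
    if typ == "Nat":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(term, int) and not isinstance(term, bool) and term >= 0
    raise ValueError(f"unknown type tag: {typ}")

# Partially applying the type yields a one-argument predicate on terms.
is_nat_via_check = partial(check, typ="Nat")

# The types-as-predicates move: skip the tag and define the predicate directly.
def Nat(term):
    return isinstance(term, int) and not isinstance(term, bool) and term >= 0

def Even(term):
    # A stricter type is just a stricter predicate built from an existing one.
    return Nat(term) and term % 2 == 0
```

Here is_nat_via_check and Nat accept exactly the same values; the second form simply drops the special Typ data structure and lets the function itself be the type.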
The Solutions
Each of the problems above can be pretty easily solved downstream of types-as-predicates plus the tree calculus:
- Formal verification: A type can encode any constraint you can compute, up to and including full specifications ("this function sorts its input"). And because the type checker is itself disp code, it can be checked by the same machinery it implements, instead of being a pile of trusted compiler internals.
- Provable optimization: Equality between programs is just another type, so "these two programs always produce the same result" is a proposition you can prove, package up, and share. Combined with hardware modeling to judge when a rewrite is actually faster, optimizations become a library anyone can contribute to rather than a compiler release.
- No ossification: Since the type system is a library, new disciplines can be adopted (or not) per user without forking the language. And since programs are nameless trees identified by hash, code deduplicates naturally across Dither, names and syntax become personal rendering layers on top, and proven equivalences let new designs safely rewrite old code.
- Program synthesis: Partially apply the type checker to a specification-type and you get a function that accepts or rejects programs. Add scoring functions for speed and size and you have exactly the thing an optimizer can search against.
Current State
There is a working prototype at github.com/libdither/disp. A small TypeScript runtime evaluates trees; everything else (the kernel, the standard types, the test suite) is written in Disp itself. See Implementation for what exists and what doesn't yet.
Inspirations
- Tree calculus (Barry Jay) for a substrate where programs can inspect programs.
- Lisp for treating programs-as-data in the first place.
- Idris, Coq, Agda, Lean & Friends for dependent types.
- Unison & IPFS for content-addressed, deduplicated code.
- Rust for its safe & fast zero-cost abstractions.
Universal System of Types
a type system is a logical system comprising a set of rules that assigns a property called a type to every "term"
Who chooses the rules that assign types to terms? What makes one set of rules better than another? Some type systems are more general than others, some are more easily compiled into efficient bytecode, some allow for writing complex mathematical proofs. Exactly how large is this design space?
Disp captures the entire design space of possible type systems by generalizing the concept of types:
To make a long story short, a type is simply a program that returns true, false, or loops forever when given the source code of another program.
This allows the programmer to define their own type system in Disp. Or just use a pre-made one.
The Implementation
In Disp, every program is a tree, and trees are data that other programs can inspect. So a type can literally be a function: give it a program, and it tells you whether that program belongs to the type. Bool is a function that accepts exactly two trees (true and false). Nat accepts the trees that represent counting numbers. Checking x : Nat means running Nat on x and seeing what comes back.
The tricky part is function types. To check that some f has type Nat -> Nat, you can't just run f on every number (there are a lot of them). Instead, the checker mints a hypothetical number: a sealed token that answers "I am a Nat" when asked its type, but reveals nothing else. It feeds this token to f and watches what happens. If f tries to peek inside the token, checking fails. If f produces something that checks as a Nat anyway, then f works for any number, because it never depended on which one it got.
Since types are just programs, things that would normally require new language features come for free:
- A dependent type (a type that depends on a value) is just a function that returns a type.
- A refinement like "even numbers only" is just a stricter checker.
- A proposition is a type whose inhabitants are proofs, so the same machinery checks programs and theorems.
The standard library comes with all the familiar types (Bool, Nat, function types, pairs, equality, etc.), but none of them are built in. They are ordinary definitions, and you can read them, swap them out, or write your own.
The mechanics of how checking stays sound (what the sealed tokens are, and what programs are allowed to do with them) are sketched in Implementation.
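The hypothetical-value trick for function types can be sketched roughly as follows. This is a toy model in Python, not the Disp kernel: the Hypothetical class, the Peeked exception, and the helper names succ, is_zero, and arrow are all made up for illustration, and real Disp enforces the "no peeking" rule with a watcher rather than with cooperative helper functions.

```python
class Hypothetical:
    """Sealed stand-in for 'some value of this type'; it carries no payload."""
    def __init__(self, typ):
        self.typ = typ

class Peeked(Exception):
    """Raised when a function tries to inspect a hypothetical value."""

def Nat(term):
    if isinstance(term, Hypothetical):
        return term.typ is Nat  # the token only answers "what type am I?"
    return isinstance(term, int) and not isinstance(term, bool) and term >= 0

def succ(n):
    # A constructor can build on a hypothetical Nat without looking inside it.
    if isinstance(n, Hypothetical):
        if n.typ is not Nat:
            raise Peeked("succ applied to a non-Nat hypothetical")
        return Hypothetical(Nat)
    return n + 1

def is_zero(n):
    # Branching on the value requires peeking, which a token forbids.
    if isinstance(n, Hypothetical):
        raise Peeked("is_zero inspected a hypothetical")
    return n == 0

def arrow(dom, cod):
    """Build the predicate for the function type dom -> cod."""
    def checker(f):
        try:
            # Feed f a sealed token and check whatever comes back against cod.
            return bool(cod(f(Hypothetical(dom))))
        except Peeked:
            return False
    return checker
```

With nat_to_nat = arrow(Nat, Nat), a function like lambda n: succ(succ(n)) is accepted without ever being run on a concrete number, while lambda n: 0 if is_zero(n) else n is rejected because it depends on which number it received. The toy checker is deliberately conservative: it rejects some functions that are in fact fine.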
Syntax
Disp currently has one concrete syntax, and there isn't much of it. A .disp file is a sequence of definitions, tests, and imports. Here is a representative chunk:
open use "lib/prelude.disp"
// The identity function. {x} -> x is a function literal.
id := {x} -> x
test id t = t
// A typed definition: addition on natural numbers.
add : Nat -> Nat -> Nat := {n, m} -> nat_rec ({_} -> Nat) m ({pred, ih} -> succ ih) n
test add 2 3 = 5
The pieces:
- name := expr defines and exports name.
- let name = expr defines something private to the file.
- name : Type := expr attaches a type. The annotation is not just a comment for the reader, it makes the compiler run the type (which is a program) against the definition when the file loads.
- {x, y} -> body is a function of two arguments. Binders can carry types, as in {x : Nat} -> body, and a function type is written the same way: {x : Nat} -> Nat, or just Nat -> Nat when the result doesn't depend on the argument. This one form covers ordinary functions, generic functions, and dependent types.
- test expr = expr is an assertion checked every time the file loads. The test suite for the language is mostly files like this.
- use "path" loads another file as a record of its exported names; open dumps those names into scope.
- { a : Nat, b := double a } is a record type with a derived field: give it an a and b is filled in for you.
There is also match for case analysis, if/then/else, and a hole marker _ for "let the compiler figure this out". That is most of the language. There are no keywords for classes, interfaces, modules, or macros, because records and functions end up covering those jobs.
Fair warning: the syntax is still in flux. It exists to get trees into the computer, and the trees are the real program, so syntax decisions are intentionally low-stakes. The authoritative grammar lives in SYNTAX.typ in the repo.
Syntax Agnosticism
Syntax is one of the most visible barriers separating different languages from each other. It is easy to distinguish between Lisp and C, BASIC and APL, Haskell and Fortran. All these languages can pretty much do the same things, but their different syntax plays a big role in preventing programmers experienced in one from trying out another. Syntax is also where language communities waste the most energy: arguments about braces and keywords (bikeshedding) stall features that everyone otherwise agrees on.
Disp's position is that syntax is not what a program is. A Disp program's identity is not its text. Source text is parsed into a tree, and the tree is the program: it is what gets type checked, what gets evaluated, and what gets hashed for naming and sharing. Format the same definition differently, or rename every variable in it, and you get the same tree, byte for byte. The text is just one rendering.
Today Disp has just the one concrete syntax described above, so in practice you are looking at one particular rendering. But nothing downstream of the parser knows or cares what the source looked like, which leaves the door open for things that are very hard to retrofit onto text-based languages:
- Alternative grammars. A different surface syntax that parses to the same trees is a frontend, not a fork. Code written in one style stays usable from any other.
- Personal rendering. Since names live outside the trees, your editor could show you the same library with your preferred naming conventions, or in your native language.
- No more bikeshedding. When people disagree about how something should look, both spellings can coexist as long as they mean the same tree. Defaults can be picked (and re-picked) by something like persistent voting without breaking anyone's code.
The eventual goal is for Disp code to be stored and shared as trees, with text generated on demand in whatever style the reader prefers. Switching between styles should be as simple as a click of a button.
Names
A Natural Idea
A word in a language is a label to which we assign a definition. These definitions may be physical or abstract, but in a programming language, a definition is a piece of code.
id := {x} -> x - We are fitting the label "id" to the program {x} -> x.
Words in natural languages may have more than one definition, yet in conversation we usually intend only one of them, and the listener works out which from the context. If we are talking about garnishing food, you mean adding something, whereas if we are talking about garnishing wages, you mean taking something away. Nobody has to announce which dictionary entry they intend.
Contexts also exist in programming languages: namespaces, modules, classes. But unlike real conversation, the computer can't just "figure it out" (yet). The programmer has to spell out exactly where in a meticulously organized hierarchy of modules every name lives. This puts a special strain on the programmer, who must keep that whole pre-defined hierarchy in their head just to program effectively.
The goal: remove this burden by resolving names from the context of the program, the way a listener does.
Structure
In Disp, names and programs are kept strictly separate. A program compiles to a tree that contains no names at all: variables are compiled away into the tree's structure, so two definitions that differ only in naming (or formatting, or syntax style) produce the identical tree. The tree's hash is the program's true identity.
Names are then just labels that people attach to hashes. A .disp file is, in effect, a record mapping labels to trees. This has a few nice consequences:
- The same program may be named differently by different people, in different styles or different human languages, without anyone forking anything.
- Renaming is free. It touches the label layer, never the code.
- When code is shared over Dither, identical definitions deduplicate automatically, no matter who wrote them or what they called them.
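A small Python sketch of "names are labels attached to hashes". It uses de Bruijn indices as a stand-in for a nameless representation; Disp itself compiles variables away into tree calculus structure, so the tuple encoding, to_nameless, and tree_hash below are illustrative assumptions rather than Disp's actual scheme.

```python
import hashlib

# Hypothetical named syntax: ("var", name) | ("lam", name, body) | ("app", f, x)
def to_nameless(term, scope=()):
    """Replace variable names by de Bruijn indices (distance to the binder)."""
    kind = term[0]
    if kind == "var":
        return ("var", scope.index(term[1]))  # innermost binder is index 0
    if kind == "lam":
        return ("lam", to_nameless(term[2], (term[1],) + scope))
    if kind == "app":
        return ("app", to_nameless(term[1], scope), to_nameless(term[2], scope))
    raise ValueError(f"unknown node: {kind}")

def tree_hash(term):
    """Hash the nameless form, so the hash ignores naming choices."""
    return hashlib.sha256(repr(to_nameless(term)).encode()).hexdigest()

# Two people write the identity function with different variable names.
id_en = ("lam", "x", ("var", "x"))
id_de = ("lam", "wert", ("var", "wert"))
const = ("lam", "x", ("lam", "y", ("var", "x")))

# Each person keeps their own label layer on top of the same hash.
names_alice = {"id": tree_hash(id_en)}
names_bob = {"identitaet": tree_hash(id_de)}
```

Both label maps point at the same hash, so renaming only ever touches the dictionaries, and identical definitions collapse to one entry when keyed by hash.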
Resolving from Context
A label-to-hash map is the simple version. A perfect way to model the richer version is a knowledge graph: a name may resolve to many different programs, and the scope of possibilities is narrowed down by the surrounding context (the types in play, the file's imports, specifically declared contexts), much like a conversation does. Type information makes this more tractable than it sounds, since most candidate meanings of a name simply won't fit where it is being used.
Programs from Specifications
The long-term goal behind Disp: I want to be able to synthesize the best possible program given a specification.
Unpacking that sentence: a specification is a type, and in Disp a type is a program that checks other programs. So the type checker can be turned into a function that takes a candidate program and returns 1 or 0. Combine that with functions that measure other things you care about (speed, memory usage, code size) and you have a scoring function: a single program that looks at another program and tells you how good it is.
Once "how good is this program?" is itself a program, writing software can be turned inside out. Instead of writing the program, you write the scorer, and an optimizer searches for a program that scores well: one that is formally correct (the type checker accepts it) and efficient along whatever axes you measured.
For this to work, a few things have to hold:
- Programs must be data, so that the scorer can inspect candidates and the optimizer can build them. This is what tree calculus gives Disp.
- The specification language must be strong enough to say what you actually mean. That is what the dependent type system is for: "a sorting function" can be specified precisely, not just "a function from lists to lists".
- Performance must be measurable without guesswork, which is what the rest of this page is about.
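To make "write the scorer, let an optimizer find the program" concrete, here is a deliberately tiny Python sketch. Everything in it is assumed for illustration: the three-primitive candidate language, the spec function (which tests samples rather than proving anything, unlike a real Disp type), and brute-force enumeration as the "optimizer".

```python
from itertools import product

# Hypothetical candidate language: a program is a tuple of primitive steps on an int.
PRIMS = {
    "inc": lambda n: n + 1,
    "dbl": lambda n: n * 2,
    "sqr": lambda n: n * n,
}

def run(program, n):
    for step in program:
        n = PRIMS[step](n)
    return n

def spec(program):
    """Stand-in for a type: 1 if the program computes 2n + 2 on the samples, else 0."""
    return int(all(run(program, n) == 2 * n + 2 for n in range(10)))

def score(program):
    """Correctness gates everything; among correct programs, shorter is better."""
    return spec(program) * (1.0 / (1 + len(program)))

def synthesize(max_len=3):
    """Enumerate every program up to max_len steps and keep the best-scoring one."""
    candidates = (p for k in range(max_len + 1) for p in product(PRIMS, repeat=k))
    best = max(candidates, key=score)
    return best if score(best) > 0 else None
```

Calling synthesize() searches all programs of up to three steps and settles on the two-step ("inc", "dbl"), even though the longer ("dbl", "inc", "inc") also satisfies the specification. The interesting design question, which this sketch does not touch, is how to search when the space is far too large to enumerate.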
Provable Optimization
The whole business of computing faster in general comes down to finding new structures and algorithms that contain the same information as the old ones, but can do certain computations faster.
For example: multiplying by 2 can be done two ways on a CPU, through the multiplication instruction (slow) or bitshifting (very fast!). Same answer, different cost. A bigger example might be swapping one data structure for an equivalent one with better asymptotic complexity, which only pays off past a certain dataset size. Real programs are towers of choices like these.
Today, these substitutions happen in two ways. Compilers apply a fixed bag of rewrites that their authors hand-picked and trust. Programmers do the rest incrementally by hand, using benchmarks as a form of proof. Ideally all optimization could be done by the compiler, but that compiler would have to have knowledge of every possible implementation of every algorithm and when each one wins.
Disp aims to solve this by making optimization a library instead of a compiler internal. Since Disp programs are data, and equality between programs is something the type system can state and check, an optimization can be packaged up as: here are two programs, here is a proof that they always produce the same result, and here is the evidence about when one is faster. Such a package can be:
- applied manually, like importing any other dependency,
- found automatically, by a compiler searching the library for rewrites that apply to your code (with results recorded for subsequent runs),
- shared safely, because the equivalence proof is checked on arrival. You don't have to trust whoever sent it.
The "when is it faster" half is the harder part, since benchmarks are noisy and machine-specific. That is what hardware modeling is for.
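The shape of such an optimization package might look like the following Python sketch. It is an assumption-heavy toy: the Rewrite record, its field names, and the cycle counts are invented, and the "proof check" is replaced by exhaustive comparison over a finite domain, which is much weaker than the equivalence proofs Disp aims for.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Rewrite:
    """Hypothetical package: two programs, claimed equivalent, plus cost evidence."""
    name: str
    slow: Callable[[int], int]
    fast: Callable[[int], int]
    cost_slow: int  # made-up cycle counts from some hardware model
    cost_fast: int

def check_on_arrival(rw, domain):
    """Stand-in for proof checking: compare both programs on a finite domain."""
    return all(rw.slow(x) == rw.fast(x) for x in domain)

def accept(rw, domain):
    """Adopt the rewrite only if it is equivalent and the model says it is cheaper."""
    return check_on_arrival(rw, domain) and rw.cost_fast < rw.cost_slow

# The multiply-by-2 example, and a bogus rewrite that a receiver should refuse.
double_by_shift = Rewrite("mul2-to-shl", lambda x: x * 2, lambda x: x << 1, 3, 1)
bogus = Rewrite("mul2-to-add1", lambda x: x * 2, lambda x: x + 1, 3, 1)
```

A receiver that runs accept(double_by_shift, range(-256, 256)) adopts the rewrite, while the same call on bogus rejects it without needing to know or trust who sent it.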
Hardware Modeling
How do we generate fast code?
Optimizing compilers generate fast code through intermediate representations and various techniques for figuring out what code is needed and what code is not. These strategies are mostly arbitrary and not easily improvable: a programmer has to go through and figure out which optimizations are worth the compiler computation cost and which aren't, and there are an immeasurable number of trade-offs and potential improvements for any given piece of hardware.
How do we create fast algorithms?
We typically write algorithms we think will run fast and then compare them to other algorithms using benchmarks. However, benchmarks have so many confounding factors that it is difficult to compare two of them and make definitive conclusions. The same algorithm may run faster or slower depending on a bazillion different factors, down to the layout of the compiled object file the compiler happened to pick.
To create efficient code, we must have a model of how that code is run. Typically this model is held in the programmer's mind and expressed through compiler and algorithm design. Instead of holding it in our forgetful, biased brains, it might be a good idea to have a software-defined model of our hardware, so that we can prove (or at least have really good heuristics that check whether) our optimizations and algorithms are actually faster.
What such a model should account for:
- Modern CPU features: pipelining, superscalar and out-of-order execution, branch prediction.
- Differences between generations of the same architecture.
- Everything else that moves the needle: RAM latency, cache behavior, GPU parallelism.
Simple models could be hand-built from published data like Agner Fog's instruction tables. For open hardware (RISC-V, say), more sophisticated models could be derived from the actual logic design.
Benefits of this approach:
- No more annoying benchmarks. A faster algorithm can be provably faster for a given model. (Benchmarks still get used to compare different models of a hardware system, but those benchmarks can be much more sophisticated and account for more confounding factors.)
- Faster optimization adoption. With models and equivalence proofs, optimizations can spread through a public codebase and be pushed to users without a compiler release in between:
- Bob finds a new optimization for a certain pattern of expressions.
- Bob proves it is faster for a well-trusted model of x86_64 CPUs.
- Bob broadcasts the optimization to everyone.
- Everyone's computers check the proof.
- Everyone's programs get faster.
Closing the Loop
The really interesting part is what happens once all of this feeds back into itself. The type checker is a Disp program, so it too can be scored and optimized. The optimizer can be pointed at its own components. The hardware models can be refined by the very search they accelerate. And a large enough library of proven-equivalent programs is a perfect dataset for training an AI to find new equivalences. Every piece of the system is a candidate for the system to improve, which is why so much of Disp's design is about keeping everything (checker included) inside the language rather than bolted onto it.
This is all a long way off, to be sure. What exists today is the foundation it requires: a calculus where programs are data, and a self-hosted type checker on top of it. The optimizer is future work, and an external search process (combinatorial, neural-guided, or both) will probably have to bootstrap it before it can take over its own improvement.
Implementation
Disp has a working prototype, developed at github.com/libdither/disp. This page is a snapshot of how it is built and how far along it is. There is also an interactive walkthrough that builds everything below from scratch with runnable examples, though fair warning: it describes an older version of Disp, and it was largely AI-generated, so it reads like it.
The Substrate
Everything bottoms out in tree calculus: programs are binary trees, and there is a single reduction rule for applying one tree to another. A small TypeScript runtime implements this rule, along with a parser for .disp source files and a compiler that turns parsed definitions into trees.
One implementation detail matters a lot: trees are hash-consed, meaning structurally identical trees are stored exactly once, and comparing two trees for equality is a single pointer comparison. This is what makes "same code = same hash" cheap enough to use as the language's notion of identity.
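Hash-consing itself is a general technique and easy to sketch. The Python below is a minimal illustration under assumed names (Tree, leaf, fork); it is not the TypeScript runtime's code, and it ignores concerns such as evicting unused nodes.

```python
class Tree:
    """Binary tree node; construct only through `leaf` and `fork`."""
    __slots__ = ("left", "right")

    def __init__(self, left, right):
        self.left, self.right = left, right

_LEAF = Tree(None, None)
_table = {}  # (id(left), id(right)) -> the unique node with those children

def leaf():
    return _LEAF

def fork(left, right):
    # Children are already unique, so their identities form a complete key.
    # The table keeps every node alive, so ids are never reused.
    key = (id(left), id(right))
    node = _table.get(key)
    if node is None:
        node = _table[key] = Tree(left, right)
    return node
```

Because every structurally identical tree is built exactly once, `a is b` (a pointer comparison) answers "are these the same program?" without walking either tree.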
The Kernel
The type system is not in the TypeScript. It is written in Disp, as a small kernel of .disp files, under a strict discipline:
The in-language code is the specification. Host code is only allowed to be an optimization of it.
Since types are programs that check other programs, most of the type system is plain library code. The kernel proper is just the machinery that plain code can't be trusted to build: it mints the sealed "hypothetical value" tokens used to check functions, and runs candidate programs under a watcher that keeps them from peeking inside those tokens. A program that checks out on a hypothetical input is sound for every input, and the watcher is what makes that argument hold. Everything else (Bool, Nat, function types, records, equality and its proof rules) is ordinary library code built on top, and the test suite is itself written in Disp.
What Exists / What Doesn't
Working today: the runtime, the parser and compiler, the kernel, a standard library (numbers, lists, options, results, pairs, sets), dependent records with derived fields, and a few hundred in-language tests, including ones that pin down the soundness boundary by checking that known attacks on the checker fail.
Not yet built: erasing checks from verified code so it runs at full speed, a proper effects story for talking to the outside world, better error messages, and the optimizer. The full design, including the parts that are still on paper, lives in the spec documents in the repo (TYPE_THEORY.typ, SYNTAX.typ, COMPILATION.typ).
Dither Application Index
List of application Ideas
- Dither Chat - Community Chat application aiming to replace Discord. Provides end-to-end encrypted DMs, voice chat, servers, voting, and integration with most other chat protocols.
- Dithca - Comprehensive & Versatile decentralized comment system where anyone can comment on any type of data structure on Dither. Can interface with most other centralized comment systems and deal with misinformation & crediting using a comprehensive community flagging system.
- Can be used to create Reddit / Twitter Replacement. Can be integrated into other Dither applications or ported to web.
- Dithgit - GitHub on Dither
- Dithix - Dither Resource Manager: Manage and cache any kind of resource, interface between the Merkle Tree and the Filesystem.
- Nomia on Dither - Nomia on Dither,
- Tree of Math - Directed Acyclic Graph linking a standardized data structure for definitions and proofs together based on set theory, creating a comprehensive tree of knowledge.
- Dither Coin - Cryptocurrency that solves all the current problems and creates a complete digital replication of cash (being decentralized, anonymous, non-volatile, and difficult to trace).
- Dither DEX - An all-faceted decentralized exchange to facilitate trade of any kind of real or virtual asset. Supports meeting up in real life or exchanging other virtual assets anonymously and securely. Also supports moderation, karmic filtering
- Protocol of Truth
Other Application Ideas:
- Manga & Reading App
- UI will be similar to Tachiyomi, but will also support book reading. Pulls content from various websites and stores on Dither. Written in Flutter, desktop & mobile versions. Supports comments through Dithca. Has built-in feature for paying for translation & replacing bad translations with better translations if they are made.
- YouTube replacement
- pulls and stores videos in a decentralized, uncensorable manner from any site that youtube-dl supports
- Built-in chat (Using Dithca protocol)
- Community-generated captions, sections, sponsor segments (pulled from sponsorblock) flagging etc.
- Support for likes, view counting (and congregating), and "Hearts" (method of giving Dither Coin to creators).
- Automatic community flagging of stolen/used content (including music, other people's videos, meme origins, pretty much anything)
- Community misinformation flagging of content (interfaces with Dithca)
- Peer-to-Peer Exchange
- Allow for the exchange
Dither Chat
What is it
Dither Chat is a decentralized communication application using the Dither protocol. Servers are communally hosted with local consensus. Bots and plugins will be supported and also communally hosted. tl;dr Discord but decentralized and better.
Users & Sync
- Each user may have multiple Peers (devices that Dither is installed on)
- Chat Event history can be optionally synced across Peers.
- A peer may host multiple users
- Each UserId must have at least 1 peer that hosts it.
Chat Events
- All Events are signed with the private key of the person who sent them (signatures are verified on receipt, with a config option to either let through or ignore unsigned or incorrectly signed events)
- Chat Message structure
- Date sent, last edited, markdown data / embed json, UserID mentions, emoji reactions
- Rich Presence (updating custom statuses and online/offline status)
- Optional storage - can store presence update history (off by default)
- Optional sending, extracts information about what you are currently doing and updates your friends. (on by default)
- Customization options to only share with certain friends
Event Storage - Storage of a sequence of events in memory or storage
- Stored as a hash-linked local blocktree. Messages are added to the current block, and a new block is created when a certain amount of time has elapsed since the last message was sent or the max block size is exceeded. Block size can be set to 1 to prevent messages from being ordered out-of-order.
- Indexing can be done on a by-block level (TODO: more customization options needed)
- Block structure can allow for thread branching & thread conversation movement across users. (e.g. create a group dm on top of an ongoing conversation)
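The block-creation rule above can be sketched as a simple hash-linked log. This is a linear chain only (no branching), and the class names, the JSON encoding, SHA-256, and the default thresholds are assumptions for illustration rather than part of the Dither Chat design.

```python
import hashlib
import json
from dataclasses import dataclass, field

@dataclass
class Block:
    prev_hash: str
    events: list = field(default_factory=list)

    def hash(self):
        # Hash covers the link to the previous block and every event in this one.
        payload = json.dumps({"prev": self.prev_hash, "events": self.events}, sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

class EventLog:
    def __init__(self, max_gap_s=300, max_block_size=50):
        self.max_gap_s = max_gap_s
        self.max_block_size = max_block_size
        self.blocks = [Block(prev_hash="")]
        self.last_ts = None

    def append(self, ts, body):
        head = self.blocks[-1]
        gap_exceeded = self.last_ts is not None and ts - self.last_ts > self.max_gap_s
        full = len(head.events) >= self.max_block_size
        if head.events and (gap_exceeded or full):
            # Seal the current block by linking the new one to its hash.
            head = Block(prev_hash=head.hash())
            self.blocks.append(head)
        head.events.append({"ts": ts, "body": body})
        self.last_ts = ts

    def verify(self):
        """Check that every block still points at the hash of its predecessor."""
        return all(b.prev_hash == a.hash() for a, b in zip(self.blocks, self.blocks[1:]))
```

Setting max_block_size=1 puts every message in its own block, which matches the note above about fixing message order; editing any sealed event changes that block's hash and makes verify() fail.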
Trusted Friends Application API
- Option to rank friends manually or by how much you chat with them
- Can mark friends as “Trusted, Neutral or Untrusted”
- Friend rank can be used by other applications
- e.g. for Stellar Consensus Protocol quorum selection
Chat Interface
- Built-in markdown formatting + advanced chat box (similar to discord)
- Link Displaying
- Metadata can be sent so receiver doesn’t have to send request to web pages
- TODO: Do we need to worry about invalid metadata being sent, tricking the user? Perhaps just scanning for suspicious domains is enough.
- Receivers can choose if they want to fetch link data, only fetch commonly used sites (e.g. youtube, twitter, soundcloud etc.) or not fetch anything at all and only display sent link metadata
Direct Messaging
- Simply sending JSON-encoded message/other events to UserID on Dither
Group messaging
- Messages are broadcast over gossipsub and conflicting blocks are ordered by time.
Servers
- Servers are communally hosted by the moderators' computers. However, the owner has full control over the server and can choose who can assist
- Red Nodes are hosting nodes, blue nodes are members. Blue node with yellow stroke is proxying its connection to the server
Image of a possible dither-chat server arrangement
Roles, Tags & Colors
- To distinguish people in a server, there are roles, name colors, and tags
- Tags are shown next to the tag owner's name as a small icon (like in Discord)
- By default there is only 1 tag enabled for a server: the Owner tag.
- There are other tag presets such as Donator, Moderator & Member
- Custom tags can be created and be attached to a specific role.
- Tags should be displayed in order of importance
- Roles can be made for anyone and can have permissions attached to them
- Roles can also be organized into hierarchies where users can be ranked up to possess a higher role (usually with more permissions)
- Each role has a color and color priority assigned
- This is used to determine what color a user's username should be assigned
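Resolving a username color from role priorities is a one-liner once the data is modeled. In this Python sketch the Role record, the fallback color, and the convention that a higher color_priority number wins are all assumptions; the text above does not specify the ordering.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Role:
    name: str
    color: str
    color_priority: int  # assumption: a higher number takes precedence

DEFAULT_COLOR = "#ffffff"  # hypothetical fallback for users with no roles

def username_color(roles):
    """Pick the color of the role with the highest color priority."""
    if not roles:
        return DEFAULT_COLOR
    return max(roles, key=lambda r: r.color_priority).color
```

For a user holding both a Member role (priority 1) and a Moderator role (priority 10), username_color returns the Moderator color regardless of the order in which the roles were assigned.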
Protocol of Truth
This is a protocol of back and forth debate with the aim to evaluate the quality and truthfulness of content posted to the internet.
If you have ever seen a photoshopped post on social media, or a video citing a badly done study you will know that it is incredibly hard to figure out how well-researched a piece of content is at first glance. This Dither protocol aims to inform through discourse the truthfulness of a given piece of media to anyone who might come across it.
The Process
The process for the Protocol of Truth is carried out by trusted individuals (chosen by a karma system). The steps are as follows:
- The Media is examined for a set of assertions it makes as well as deductive and inductive reasoning made using the assertions. This creates an "Assertion Graph".
- Once the graph is constructed and properly cited (i.e. each assertion should provide some link to the part of the media it was extracted from) assertions may be supported or criticized.
- Assertions in "Assertion Graphs" may be shared between different media and thus the supporting evidence presented by one media may be used in the justification of another. This supporting evidence may be found independently or through the author's cited sources.
- Each Assertion in the graph may be labeled through a persistent voting scheme under the following categories of deduction type, truthfulness and standard of evidence.
- Deductive / Inductive - What kind of assertion is it.
- Truthfulness - How the trusted community rates the assertion in terms of truthfulness.
- Inherited - The assertion is deduced from the assumptions, therefore its truthfulness is inherited from the truthfulness of the assumptions.
- Likely True - The assertion is likely true
- Likely False - The assertion is likely false
- Ambiguous - The truth of the assertion cannot be satisfactorily determined either way.
- Standard of Evidence
- Statistically Significant - The assertion is supported by sound scientific observation or experiment.
- Personal Experience - The assertion is a personal observation, it is subject to individual bias.
- Common Knowledge - The assertion is generally held to be true, but there is no specific evidence to support it. i.e. stuff like "thing X exists" or "hammers are generally used to hammer in nails". There is no hard evidence for it, but it is generally held to be true. This is the least powerful standard of evidence and should be replaced with a more powerful form if at all possible.
Fractional Funding
This idea aims to solve the issue of effectively funding peer-to-peer creative endeavors.
Rough Description
One really big idea which I think would improve the content economy (i.e. remove dependence on ads or sponsorships and make it more feasible for small creators to make a living) would be to have a system similar to YouTube Premium where you pay a monthly subscription and the money gets distributed among the creators you watch. But instead of being distributed among everyone you watch, it's distributed only to people who make videos that you click a certain button on (let's say it's a heart-shaped button next to the "like" button), and you can click it multiple times to give multiple "shares" of your monthly subscription. These shares are distributed to all these videos at the end of the month, but instead of simply going to the uploader of the video, a system tries to identify (via user feedback, AI, or something else) which parts of the video/content were made by the original uploader and which were taken from elsewhere, to create a "fair" monetary distribution for that video (i.e. to pay the artists, thumbnail drawers, and anyone else). Then, depending on each user's preferences, the money from each video can either go to everyone equally (weighted by distribution) or the user can add further weights to prioritize people who have less income, or things like that.
Formal Description
Users
- People who consume content and set aside some fixed amount of money to give to creators.
Content
- Pieces of data (videos, images, writing, etc.) for which credit for creation can be apportioned to some set of humans.
Creators
- People who upload/remix content.
Goal: Take money from users and give it to creators in some fair manner.
Idea for Process to achieve Goal:
- Each user watches a video and gives to it some fixed number of shares of monthly money they've set aside.
- At the end of each month, the money is sent to creators according to the user's preferences on how it should be distributed for each video.
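The two steps can be sketched as a pair of proportional splits: first the monthly budget across videos by shares, then each video's amount across its creators by credit. This is only an illustration; the function name, the dictionary shapes, and the idea that credit arrives as numeric weights are assumptions, and how those weights are produced is left open.

```python
from collections import defaultdict


def distribute(budget, shares_per_video, credit_per_video):
    """Split one user's monthly budget among creators.

    budget            -- money the user set aside this month
    shares_per_video  -- {video_id: number of shares the user gave it}
    credit_per_video  -- {video_id: {creator: credit weight}}
                         (hypothetical attribution data)
    """
    total_shares = sum(shares_per_video.values())
    payouts = defaultdict(float)
    if total_shares == 0:
        return dict(payouts)  # no shares given, nothing is paid out
    for video, shares in shares_per_video.items():
        video_amount = budget * shares / total_shares
        credits = credit_per_video[video]
        total_credit = sum(credits.values())
        for creator, weight in credits.items():
            payouts[creator] += video_amount * weight / total_credit
    return dict(payouts)


payouts = distribute(
    10.0,
    {"v1": 3, "v2": 1},
    {"v1": {"alice": 1.0}, "v2": {"alice": 1.0, "bob": 1.0}},
)
```

Per-user weighting preferences (for example prioritizing lower-income creators) would enter as an adjustment to the credit weights before the second split.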
The Decentralization Stack
🚧 Draft — restructure in progress. This problem-map reorganization supersedes the earlier prediction-market-first draft. Part A (core ideas) is a sketch; the in-depth Part B chapters are not yet written.
This section is the conceptual core of Dither: the full set of problems that stand between us and a decentralized, privacy-respecting Internet, a short core idea for each, and — in depth — how the solutions lean on one another.
The goal
Dither's end-goal is to replace the centralized Internet with decentralized alternatives unified by one modular protocol, under four design tenets: it should be useful, modular & modelable, interoperable, and ultimately self-reliant. This section is about the hard part of "modelable": showing that the pieces form one coherent system rather than a pile of unrelated mechanisms.
There is no single root
It is tempting to pick one idea — a currency, or a prediction market — and call it the seed everything grows from. That framing is wrong and it misleads. The stack is a set of interdependent problems; the closest thing to a root is boundary integrity (knowing who is a distinct participant), and the closest thing to a recurring principle is prediction (agents persist by predicting their environment). But neither is "the thing the rest is built on." The honest picture is a dependency graph, shown below.
The problem map
Eight problems, each stated in one line. (Part A gives a short core idea for each; Part B develops them in depth.)
| # | Problem | In one line |
|---|---|---|
| 1 | Boundary & Identity | Tell distinct participants apart (Sybil resistance) with no central registry and without breaking anonymity. The problem under all the others. |
| 2 | Ordering & Timestamps | Agree "this happened before that" — and that a record wasn't backdated — without a global clock or blockchain. |
| 3 | A Verifiable Substrate | Express computation as portable, content-addressed data whose results anyone can independently check (disp). |
| 4 | Anonymous Routing & Retrieval | Move data between nodes and find who holds a value, without revealing who wants what. |
| 5 | One Resource Market | Buy and sell storage, bandwidth, and compute as a single priced flow, not three markets. |
| 6 | Non-Concentrating Money | A medium of exchange that resists the wealth concentration which wrecks aggregation. |
| 7 | Aggregating Truth | Combine dispersed expert knowledge into a shared world-model with no corruptible oracle. |
| 8 | Deciding Together | Combine people's preferences with that world-model into collective decisions that resist capture. |
Plus the composition question — how these eight become one living system — and the cross-cutting concerns that ride on top of it: bootstrapping new users, exit/competition between networks, and the threat model.
How they depend on each other
                       ┌──────────────────────────┐
        ROOT           │ 1 · Boundary & Identity  │   every weighted/counted thing needs it
                       └────────────┬─────────────┘
                                    │ distinct participants
              ┌──────────────┬──────┴──────┬──────────────────┐
              ▼              ▼             ▼                  ▼
        2 · Ordering & 3 · Verifiable 4 · Anonymous   6 · Non-Concentr.
          Timestamps     Substrate      Routing             Money
              │              │             │                  │
              │              └──────┬──────┘                  │ prices / denominates
              │                     ▼                         │
              │          5 · One Resource Market ◄────────────┘
              │                     │
              └─────────┬───────────┘
                        ▼
              7 · Aggregating Truth ◄────► 6 · Money
                        │   (δ-dial ⇄ dispersion: co-designed cycle)
                        ▼
              8 · Deciding Together
                        │
                        ▼
        Composition: one living system (+ bootstrapping · exit · threat model)
Read it top-down: identity is the root; timestamps, the substrate, routing, and money are near-primitives; the resource market is the first place they combine; the truth machine sits on top of money and timestamps; governance sits on the truth machine. The one genuine cycle — money ⇄ truth machine — is why those two must be co-designed, not stacked.
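One way to keep this graph honest is to write it down as data and let a topological sort produce the build order. The sketch below takes its edges from the "Leans on" lines of the Part A chapters; the node labels and helper names are made up for illustration. The money ⇄ truth-machine cycle is deliberately left out of the edge list, since it is a co-design constraint, and adding it back shows why the two cannot simply be stacked.

```python
from graphlib import CycleError, TopologicalSorter

# problem -> the problems it leans on (from the Part A "Leans on" lines)
LEANS_ON = {
    "1 identity": [],
    "2 timestamps": ["1 identity"],
    "3 substrate": [],
    "4 routing": ["1 identity"],
    "5 resource market": ["1 identity", "3 substrate", "4 routing", "6 money"],
    "6 money": ["1 identity", "2 timestamps"],
    "7 truth": ["1 identity", "2 timestamps", "6 money"],
    "8 governance": ["1 identity", "6 money", "7 truth"],
}


def build_order(leans_on):
    """Return the problems in an order where dependencies come first."""
    return list(TopologicalSorter(leans_on).static_order())


def has_cycle(leans_on):
    """True if the dependency map cannot be ordered at all."""
    try:
        build_order(leans_on)
    except CycleError:
        return True
    return False


order = build_order(LEANS_ON)

# Making money depend on the truth machine as well closes the loop.
with_cycle = dict(LEANS_ON)
with_cycle["6 money"] = LEANS_ON["6 money"] + ["7 truth"]
```

Here `has_cycle(with_cycle)` is true while `has_cycle(LEANS_ON)` is not, so any linear reading order has to treat problems 6 and 7 as a single co-designed unit.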
How to read this section
- Part A · Core Ideas — eight short chapters, one seed solution per problem. Read these and you'll hold the whole shape in your head, including where each idea is still thin.
- Part B · In Depth (drafting) — the same eight, developed in dependency order at engineering depth, each ending with a "where it could break in implementation" section. This is also the stack-wide threat model the roadmap flags as missing.
- Reference — the formal Mathematical Core, the Futarchy & Causality deep dive, a glossary, the engineering roadmap, and open questions.
Conventions. Honest caveats are marked ⚠️ inline, next to the claim they qualify. Knowing exactly where this design might break is a feature, not an embarrassment: it's what Part B is for.
1 · Boundary & Identity
🚧 Draft (core-idea sketch). Part A · Core Ideas. The problem under all the others.
The problem. Almost everything below — voting, consensus, fair pricing, basic income — silently assumes you can tell distinct participants apart. A central identity registry is out (centralization, privacy); a cheap "proof of personhood" is forgeable; a strong one is privacy-destroying. This is the Sybil problem, and it's the precondition for the system to have a well-defined boundary at all.
The core idea. Stop asking "is this a distinct person?" and ask "how much independent information does this participant contribute?" Look at each account's behavioral residuals — forecast errors, transaction timing, verdicts — after conditioning on public information. Honest distinct agents are roughly independent; one controller's puppets stay correlated because they share private state. An effective population n_eff collapses a tightly-correlated cluster of k sock-puppets toward 1, while genuine independents each count fully. Issue all weight (votes, income, influence) per unit n_eff, and the Sybil attack yields nothing. Identity becomes an accumulated signature of real work — faking k identities costs about as much as being k real contributors.
Leans on: nothing — it's the root. Enables: money (6), the resource market (5), the truth machine (7), governance (8) — everything counted or weighted.
⚠️ Where it's thin. It's an arms race: how much behavioral monitoring is "enough" is empirical, estimating the correlation structure at scale is unsolved, and a fresh account is weakest exactly at birth (handled later via local vouching). Spec depth + failure modes: Part B.
2 · Ordering & Timestamps
🚧 Draft (core-idea sketch). Part A · Core Ideas.
The problem. Money must order a sender's transactions; the truth machine must prove a forecast wasn't backdated; a few decisions (who won an auction for a unique item) need a single agreed winner. The reflex is "use a blockchain" — but a global total order is expensive, centralizing, and far more than most of these need.
The core idea. Most of it needs no global consensus at all.
- Payments from single-owner accounts are consensus number 1: only the sender needs to order their own spends, and they provide that order. Byzantine reliable broadcast suffices — no global ledger.
- Timestamps come from causal entanglement: once a record's hash is cited inside other participants' signed messages, it is sandwiched in time (it existed before everything that cites it and after everything it cites). Backdating means rewriting the signed history of independent witnesses (and "independent" is again n_eff, from problem 1).
- Only genuinely contested allocation (auctions for the same scarce item, name registries) needs ordering, handled by a small zonal BFT quorum.
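The "sandwiched in time" idea can be illustrated with plain hash citations. The sketch below is an assumption-heavy toy: signatures are omitted, each record carries its author's own clock reading in a hypothetical `local_time` field, and witnesses are simply trusted. It only shows how the citation graph brackets a record between what it cites and what cites it.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    author: str
    payload: str
    cites: tuple = ()        # digests of records this one references
    local_time: float = 0.0  # author's own clock reading (hypothetical)

    @property
    def digest(self) -> str:
        body = "|".join([self.author, self.payload, repr(self.local_time), *self.cites])
        return hashlib.sha256(body.encode()).hexdigest()


def time_bounds(target, log):
    """Return (earliest, latest) times consistent with the citation graph.

    earliest: the newest record that `target` cites must already exist.
    latest:   the oldest record citing `target` proves it existed by then.
    """
    by_digest = {r.digest: r for r in log}
    cited = [by_digest[d].local_time for d in target.cites if d in by_digest]
    citing = [r.local_time for r in log if target.digest in r.cites]
    return max(cited, default=float("-inf")), min(citing, default=float("inf"))


a = Record("alice", "hello", local_time=1.0)
forecast = Record("bob", "forecast", cites=(a.digest,), local_time=0.0)  # claims an earlier time
c = Record("carol", "reply", cites=(forecast.digest,), local_time=5.0)
bounds = time_bounds(forecast, [a, forecast, c])
```

Bob's claimed time of 0.0 falls outside the resulting window of 1.0 to 5.0, so the backdating is visible without any global clock. How narrow the window gets depends on how quickly others cite the record.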
Leans on: identity (1) for witness independence. Enables: money (6), the truth machine (7, the no-backdating assumption).
⚠️ Where it's thin. Timestamp precision is set by gossip rate; the narrow "contested allocation" carve-out still needs a real BFT design and a story for cross-zone disputes. Part B.
3 · A Verifiable Substrate
🚧 Draft (core-idea sketch). Part A · Core Ideas.
The problem. To sell computation you must trust the result. To share code and data across a decentralized system you need it portable, deduplicated, and addressable by what it is rather than where it lives. General-purpose code doesn't give you any of this for free.
The core idea. disp: programs are content-addressed trees, computation is deterministic reduction (tree calculus), and types are predicates the result must satisfy. Because reduction is deterministic and confluent, two honest executors of the same program agree bit-for-bit — so a buyer can verify a result by re-execution, by sampling k independent executors, or (when the result-predicate is cheap) by just checking the predicate. Reduction-step counts give a machine-independent "gas" unit for metering work. Moving a tree preserves its hash, so storage and routing of code/data become the same operation.
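Verification by sampling can be sketched without any disp machinery. In the toy below, an "executor" is just a Python callable and a "program" is a byte string; both are stand-ins for illustration, not the disp API. The only property used is the one named above: deterministic reduction means honest executors must return byte-identical results.

```python
import hashlib
import random


def content_hash(value: bytes) -> str:
    """Address a value by what it is."""
    return hashlib.sha256(value).hexdigest()


def verify_by_sampling(program, executors, k, rng=random):
    """Run `program` on k sampled executors; accept only unanimous results.

    Any disagreement proves at least one sampled executor is faulty,
    because honest executors of a deterministic program cannot differ.
    Returns the agreed result, or None on disagreement.
    """
    sample = rng.sample(executors, k)
    results = [run(program) for run in sample]
    hashes = {content_hash(r) for r in results}
    return results[0] if len(hashes) == 1 else None


honest = lambda program: program[::-1]  # stand-in for deterministic reduction
liar = lambda program: b"bogus"

ok = verify_by_sampling(b"abc", [honest, honest, honest], 3)
rejected = verify_by_sampling(b"abc", [honest, honest, liar], 3)
```

Unanimity only means something if the sampled executors are independent. If they are puppets of one controller they can agree on a wrong answer, which is where the n_eff weighting from problem 1 comes back in.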
Leans on: nothing — it's foundational tech. Enables: the resource market (5) — verifiable compute, the value algebra, and metering all come from here; and machine-resolvable questions for the truth machine (7).
⚠️ Where it's thin. disp is a working prototype, but effects, erasure, the optimizer, and a self-hosted parser are pending — and crucially there is no networking/serialization story yet. "Nothing connects disp to the network today beyond intent" is the single largest gap between layers. Part B.
4 · Anonymous Routing & Retrieval
🚧 Draft (core-idea sketch). Part A · Core Ideas.
The problem. Move data between nodes, and find who holds a given value, without revealing who wants what or exposing the network's structure to observers. Ordinary routing leaks exactly this; ordinary content-discovery (a DHT) broadcasts your interests.
The core idea. Three cooperating pieces:
- Distance-aware anonymous routing (DAR) — nodes embed in a latency coordinate space (Vivaldi-style) so "closer" means "cheaper to reach," and traffic is onion-wrapped (HORNET-style) so relays learn neither source nor destination.
- Data-trail search (DTS) — follow breadcrumbs left by previous retrievals to locate rare data without a global index.
- Reverse-hash-lookup (RHL) — find providers of a content hash.
The latency coordinate space is reused everywhere else as the system's "body map": it also organizes currency zones, consensus tiers, and birth-time vouching (problem 1).
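For readers unfamiliar with Vivaldi-style coordinates, the core update is small enough to show. The sketch below is the simplified constant-timestep form: each latency sample nudges a node along the line to its peer until embedded distance matches measured round-trip time. The full algorithm also adapts the step size using per-node error estimates, which is omitted here, and the function name and `step` default are illustrative choices.

```python
import math


def vivaldi_step(xi, xj, rtt, step=0.25):
    """Move node i's coordinate so its distance to node j better matches rtt."""
    diff = [a - b for a, b in zip(xi, xj)]
    dist = math.sqrt(sum(d * d for d in diff))
    if dist == 0.0:
        # Coincident nodes: pick an arbitrary direction to separate them.
        unit = [1.0] + [0.0] * (len(xi) - 1)
    else:
        unit = [d / dist for d in diff]
    error = rtt - dist  # positive means too close in the embedding
    return [a + step * error * u for a, u in zip(xi, unit)]


# Two nodes that repeatedly measure a 10 ms round trip to each other.
a, b = [0.0, 0.0], [1.0, 0.0]
for _ in range(100):
    a = vivaldi_step(a, b, 10.0)
    b = vivaldi_step(b, a, 10.0)
embedded = math.dist(a, b)
```

After enough samples `embedded` settles at the measured latency, which is what lets "closer" stand in for "cheaper to reach" elsewhere in the stack.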
Leans on: identity (1) for relay reputation under pseudonyms. Enables: the resource market (5) — this is its move-through-space edge — and general data availability.
⚠️ Where it's thin. The routing-coordinate + HORNET design has no simulation results yet; the relay-incentive game (who pays whom for bandwidth, and how bargaining settles) is unresolved; DTS's rare-data case and RHL are sketches. Reviving the simulator is the cheapest de-risking step. Part B.
5 · One Resource Market
🚧 Draft (core-idea sketch). Part A · Core Ideas. The first place the primitives combine.
The problem. A decentralized system seems to need three separate markets — storage, bandwidth, compute — each with its own protocol, pricing, and incentives, that must somehow interoperate. That's three hard designs and three integration problems.
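The core idea. They are one operation, MATERIALIZE(value, spacetime-region): make a value available at a place and time. It has three elementary edges:
- TRANSPORT-TIME: hold a value at one place over time (storage).
- TRANSPORT-SPACE: move a value from one place to another (routing).
- TRANSFORM: reduce a program to produce a new value (compute).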
Any request is satisfied by a min-cost DAG over these edges, and the same value can be produced many ways — "fetch a cached copy vs. recompute it" is just route choice in one graph. So caching = replication = memoization = one threshold rule. disp is the verifiable value algebra that lets a buyer trustlessly mix fetched and computed sub-results; prices act as the system's shared prediction-error signal (predict demand well → pre-position → get paid).
Leans on: identity (1), routing (4), substrate (3), money (6). Enables: the first real paid economic loop in the network.
⚠️ Where it's thin. The unification is abstraction-level: the three edge costs differ by orders of magnitude and by risk type, so three real cost models are still owed. The clean price↔incentive alignment assumes price-taking and convex costs — lumpy storage commitments and relays with market power break it. Part B.
6 · Non-Concentrating Money
🚧 Draft (core-idea sketch). Part A · Core Ideas.
The problem. Both the resource market and the truth machine degrade as wealth concentrates — in fact "dispersed capital" is a named assumption the truth machine stands on. But money, left alone, tends to concentrate. We need a medium of exchange that structurally pushes the other way.
The core idea. There is no global "value" — only value to a constituent (the capacity to do useful work for someone), exactly as there is no global truth, only per-resolver verdicts. Currency tracks that value as it flows:
- Demurrage — currency decays unless it circulates; idle stock is reclaimed.
- The reclaimed flow is recycled as a universal basic drip to every participant (weighted by n_eff, so it isn't farmable).
Steady-state wealth becomes equal baseline + (your net flow) / δ: unbounded stock inequality turns into bounded flow inequality. The decay rate δ is a dial that directly enforces a chosen concentration bound — i.e. monetary policy is the truth machine's dispersion guarantee.
Leans on: identity (1, Sybil-proof drip), timestamps (2, transaction order). Enables: the resource market (5), the truth machine (7, dispersed budgets), governance (8, expenditure). Co-designed with: the truth machine — this is the one dependency cycle in the stack.
⚠️ Where it's thin. It bounds stock, not bursts. Inter-zone exchange rates, speculation against a deliberately-decaying currency, and the (probably unsound) external USD peg all need real macroeconomic analysis. Part B.
7 · Aggregating Truth
🚧 Draft (core-idea sketch). Part A · Core Ideas. One organ — perception — not the root.
The problem. Combine the dispersed knowledge of many specialists into a shared, evolving world-model — without a single oracle that can be bribed, and without a point-in-time resolution that wealthy actors can push around. (This is the problem behind funding science, grants, and policy.)
The core idea. The retroactive consensus market, which decouples eliciting a forecast from declaring what happened from paying out:
- Forecasters publish timestamped probability distributions.
- Capital-holding resolvers privately and retroactively declare what they believe happened — optional, revisable, discardable.
- A reference-relative proper scoring rule pays each forecaster for moving belief toward that resolver's eventual verdict. The money-optimal forecast becomes the capital-weighted prediction of future resolver consensus.
Output: a live shared world-model and a skill ranking of forecasters — with no exploitable point-in-time oracle. It is the system's perception; it is the most worked-out instance of the recurring "pay for prediction" principle, but it sits on top of identity, timestamps, and money — it is not what they're built on.
Leans on: identity (1), timestamps (2), money (6), optionally the substrate (3) for machine-resolvable questions. Enables: governance (8) — the fact-finding half.
⚠️ Where it's thin. Two central, unproven theory gaps: reflexivity (if resolvers defer to the forecast, does the loop converge to truth or to a self-fulfilling fiction?) and causal validity (futarchy's conditional-vs-causal flaw). Both are reduced to measurable conditions in the Reference, but neither is closed. Part B.
8 · Deciding Together
🚧 Draft (core-idea sketch). Part A · Core Ideas.
The problem. Turn many people's conflicting preferences — plus the shared world-model — into collective decisions that resist capture by the wealthy or the well-organized. Pure preference-voting ignores facts; pure expert rule ignores values.
The core idea. Liquid-democracy quadratic voting over expenditure (a proxy for all policy, since policy flows through spending), with one decisive split:
- Facts are delegated to the truth machine (problem 7) — "to what degree does policy X advance goal Health?" is a market question.
- Preferences stay quadratic and personal — people vote on goals, then back policies structured around the market's estimates.
A Health-prioritizing voter can thus support high-level policy confident that its funded sub-policies are well-designed, and forecasting skill can weight the fact estimates. The natural first testbed is Dither governing itself: allocating contribution funding by QV, informed by the market's estimate of "which roadmap item most advances goal X."
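The quadratic half of this can be sketched on its own. The toy below uses the usual quadratic-voting cost rule (casting v votes on one option costs v² voice credits) and a flat per-voter credit budget. Liquid delegation, credit issuance, and any link to the market's estimates are left out, and all names here are illustrative.

```python
import math


def vote_cost(votes: int) -> int:
    """Quadratic cost: the marginal vote gets more expensive each time."""
    return votes * votes


def max_votes(credits: int) -> int:
    """Largest number of votes one option can receive from a given budget."""
    return math.isqrt(credits)


def tally(ballots, credits_per_voter):
    """Sum signed votes per option, rejecting any ballot that overspends.

    ballots -- {voter: {option: signed number of votes}}
    """
    totals = {}
    for voter, ballot in ballots.items():
        spent = sum(vote_cost(v) for v in ballot.values())
        if spent > credits_per_voter:
            raise ValueError(f"{voter} overspent: {spent} > {credits_per_voter}")
        for option, votes in ballot.items():
            totals[option] = totals.get(option, 0) + votes
    return totals


result = tally(
    {"alice": {"health": 3, "roads": 1}, "bob": {"roads": 2}},
    credits_per_voter=10,
)
```

Alice spends 9 + 1 = 10 credits and Bob spends 4, so both ballots are valid. Each voter counts once here, which is exactly the assumption that needs distinct persons from problem 1.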
Leans on: the truth machine (7), identity (1, QV needs distinct persons), money (6, expenditure). Enables: the system to fund and steer its own development (the self-reliant tenet).
⚠️ Where it's thin. The hard rule: never wire "allocate ∝ market estimate" mechanically — that recreates futarchy's flaw; the market must stay evidence feeding human judgment. And the liquid/quadratic delegation mechanics, credit issuance, and collusion resistance are currently a TLDR, not a design. Part B.
Sybil Resistance Is Independence
Part II · The Recurring Pattern — Chapter 7
Every layer so far quietly assumed something it has no right to: that we can tell distinct participants apart.
The problem under all the others
A handful of mechanisms break completely if one actor can cheaply pretend to be many (a Sybil attack):
- Quadratic voting is meaningless if you can't count distinct persons.
- Consensus is meaningless if "independent" resolvers are one puppeteer.
- Contribution-pricing is meaningless if a "contributor" is the requester in disguise.
- A UBI drip is infinitely farmable.
This is why Sybil resistance keeps surfacing everywhere: it is the precondition for the system to have a well-defined boundary — to know which participants are genuinely inside and distinct — at all.
The classic approach is a personhood proof: verify each account is a unique human. But that collides head-on with two other goals — it's privacy-destroying, and any cheap proof is forgeable. Demanding a yes/no "is this a real distinct person?" classifier asks an unanswerable question.
The reframe: from personhood to precision
Stop asking "is this a distinct person?" Ask instead:
How much independent information does this participant contribute?
Look at each agent's behavioral residuals — forecast errors, transaction timing, verdicts — after conditioning on public information. Honest distinct agents have residuals that are roughly independent. Puppets of one controller stay correlated even after conditioning on what's public, because they share private state.
So measure the correlation structure and compute an effective population: for a cluster of $k$ accounts with mutual correlation $\rho$, the number of genuinely-independent voices is

$$n_{\text{eff}} = \frac{k}{1 + (k-1)\,\rho}$$

Perfect Sybils ($\rho = 1$) collapse to $n_{\text{eff}} = 1$ no matter how many accounts they spin up; genuine independents ($\rho = 0$) each count fully. Issue all weight (votes, UBI, aggregation influence, witness power) per unit $n_{\text{eff}}$, not per account, and the Sybil attack yields nothing.
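A quick numeric feel for this, assuming the standard effective-sample-size form for k equally correlated accounts, n_eff = k / (1 + (k − 1)ρ). That formula is the assumption this sketch rests on; the function names are illustrative, and estimating ρ from real behavior is the hard part and is not attempted here.

```python
def n_eff(k: int, rho: float) -> float:
    """Effective number of independent voices in a cluster of k accounts.

    Assumes every pair in the cluster shares the same correlation rho.
    """
    if k < 1:
        raise ValueError("a cluster needs at least one account")
    if not 0.0 <= rho <= 1.0:
        raise ValueError("this sketch only handles 0 <= rho <= 1")
    return k / (1.0 + (k - 1) * rho)


def weight_per_account(k: int, rho: float) -> float:
    """Share of one full voice that each account in the cluster receives."""
    return n_eff(k, rho) / k


puppets = n_eff(100, 1.0)       # one controller, 100 accounts
independents = n_eff(100, 0.0)  # 100 genuinely separate agents
partial = n_eff(10, 0.5)        # a tightly knit but not identical group
```

The puppet cluster is worth a single voice however large it grows, the independents keep all 100, and the half-correlated group of 10 counts for a little under 2.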
Why this composes with "real work"
The remaining attack is adversarial decorrelation: a puppeteer makes accounts that simulate independence on every monitored dimension. But the monitored dimensions are exactly the ones the system pays for — forecast accuracy, storage served, bytes routed, compute delivered. Producing genuinely-decorrelated, individually-rewarded performances requires separately-maintained positions of real work. In the limit:
The cheapest way to fake $k$ agents is to be $k$ agents, at which point, from the network's view, they are $k$ agents. Identity becomes an accumulated signature of real work, and faking the boundary requires doing the metabolism we wanted anyway.
The autoimmune dilemma, dissolved
The feared false positive, "an honest local community that genuinely agrees gets punished as colluders", stops being an error. Agents correlated through a shared private channel really do contribute less independent information; weighting them by $n_{\text{eff}}$ rather than by head count is accurate inference, not injustice. There's no binary verdict to get wrong: weight is graduated, continuous, and recoverable (diverge behaviorally and your weight grows back). The dilemma was an artifact of demanding a yes/no classifier; precision accounting never asks the unanswerable question.
⚠️ Honest caveat. This is an asymptotic/arms-race bound, not a finite guarantee. How much monitored behavioral diversity suffices in practice (the arms-race floor) is empirical, and estimating the correlation structure at scale needs the geographic/vouching graph as a prior on its structure.
📐 Formal version: The Mathematical Core §3: $n_{\text{eff}}$, GLS down-weighting, and the mimicry-cost proposition. This single number is the stack's one security parameter.
The Substrate: Storage = Routing = Compute
Part II · The Recurring Pattern — Chapter 5
Part I built an engine for aggregating knowledge. Part II shows the same shapes — prediction, independence, value-as-flow — reappearing at every layer of a decentralized system. Start with the layer that looks least like a market: the raw substrate that stores, moves, and transforms data.
The problem: three services, or one?
A decentralized network seems to need three separate subsystems — a storage market, a bandwidth market, and a compute market — each with its own protocol, pricing, and incentives. That's three hard designs that must somehow interoperate. The first simplification of the stack is noticing they are not three things.
- Storage is transport of information through time: the same bits, same place, later.
- Routing is transport through space: the same bits, elsewhere, ~now.
- Compute is transformation: different bits (a function of the input), here or there, later.
Storage and routing are the identity function applied across a displacement; compute is an arbitrary function across a displacement. All three are special cases of one operation:
MATERIALIZE(V, R): make value V (given by content hash, or by a predicate P acceptable values must satisfy) available at spacetime region R.
One flow problem
Define three elementary edges over nodes (value, location, time):
| Edge | Action | Cost ∝ |
|---|---|---|
| TRANSPORT-TIME | hold V at x from t to t' | size × duration (storage) |
| TRANSPORT-SPACE | move V from x to x' | size × distance (routing) |
| TRANSFORM | apply a function to V | reduction steps (compute) |
Any request is satisfied by some DAG of these edges, and the same value can be produced by many different DAGs — fetch a cached copy (TIME + SPACE), recompute from inputs (TRANSFORM + SPACE), or a hybrid. The network is not running three services; it is solving one min-cost flow problem in a spacetime-value graph: given a demand field, find the cheapest DAG of edges that satisfies it.
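"Fetch vs. recompute is just route choice" can be made concrete with an ordinary shortest-path search. The sketch below simplifies heavily: nodes are (value, location) pairs with time left out, every edge has a single input (so plans are paths, not general DAGs), and the costs are made-up numbers. It is meant to show the shape of the decision, not a real pricing model.

```python
import heapq


def cheapest_plan(edges, sources, goal):
    """Dijkstra over (value, location) states.

    edges   -- {node: [(next_node, edge_kind, cost), ...]}
    sources -- nodes where something is already materialized (cost 0)
    Returns (total_cost, [edge_kind, ...]) or None if the goal is unreachable.
    """
    queue = [(0.0, i, src, []) for i, src in enumerate(sources)]
    heapq.heapify(queue)
    counter = len(queue)  # tie-breaker so the heap never compares nodes
    done = set()
    while queue:
        cost, _, node, plan = heapq.heappop(queue)
        if node == goal:
            return cost, plan
        if node in done:
            continue
        done.add(node)
        for nxt, kind, edge_cost in edges.get(node, []):
            heapq.heappush(queue, (cost + edge_cost, counter, nxt, plan + [kind]))
            counter += 1
    return None


edges = {
    ("input", "A"): [(("input", "B"), "TRANSPORT-SPACE", 2.0)],
    ("input", "B"): [(("result", "B"), "TRANSFORM", 5.0)],
    ("result", "C"): [(("result", "B"), "TRANSPORT-SPACE", 4.0)],
}
sources = [("input", "A"), ("result", "C")]
plan = cheapest_plan(edges, sources, ("result", "B"))
```

With these numbers the cached result at C is fetched (cost 4.0) instead of shipping the input and recomputing (cost 7.0). Raise the fetch cost above 7.0 and the same search picks the recompute path with no other change.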
Why collapsing them pays
- Caching, replication, and memoization become one decision: adding a `TRANSPORT-TIME` edge (a value held near expected demand) whenever `expected_demand × recompute_or_refetch_cost > storage_cost`. CDN behavior, DHT replication, and function memoization are one optimization with different constants.
- Routing already solved the geometry. Routing coordinates (Vivaldi) embed `TRANSPORT-SPACE` cost as distance. Extend that space with a storage axis and a compute-distance axis and the same shortest-path machinery prices all three.
- disp is the value algebra this needs. In disp, programs are content-addressed trees, compute is reduction, and hash-identity makes storage and routing equivalent (moving a tree preserves its hash). "Fetch vs. recompute" is "transport the normal form vs. transport-and-reduce." Because results are checkable by hash (or by a disp predicate), you can trustlessly mix fetched and computed sub-results: the buyer verifies the answer regardless of which DAG produced it. This is the concrete bridge from the language layer to the resource market.
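The threshold rule in the first bullet is simple enough to write out. In this sketch the storage cost follows the edge table (size × duration, times a unit price), and the avoided cost is whichever of refetching or recomputing would be cheaper. The parameter names and the unit price are assumptions for illustration.

```python
def should_hold(expected_requests, refetch_cost, recompute_cost,
                size, duration, unit_storage_price):
    """Decide whether adding a TRANSPORT-TIME edge (keeping a copy) pays off.

    Keep the value when the cost it is expected to save exceeds the
    cost of storing it for the period.
    """
    storage_cost = size * duration * unit_storage_price
    avoided_per_request = min(refetch_cost, recompute_cost)
    return expected_requests * avoided_per_request > storage_cost


popular = should_hold(10, refetch_cost=3.0, recompute_cost=5.0,
                      size=100, duration=10, unit_storage_price=0.01)
rare = should_hold(2, refetch_cost=3.0, recompute_cost=5.0,
                   size=100, duration=10, unit_storage_price=0.01)
```

The same function describes a CDN edge cache, a DHT replica, and a memoized function result; only the constants differ, which is the point of the unification.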
⚠️ Honest caveat. The unification is real at the level of abstraction (min-cost flow), but the edge weights span many orders of magnitude and carry different risk (a lost stored byte ≠ a lost expensive computation ≠ a missed latency deadline). The abstraction tells you what to optimize; it does not flatten the three very different cost models you must still supply.
📐 Formal version: The Mathematical Core §1 — MATERIALIZE as priced multicommodity flow, the local caching threshold, and the disp ↔ network effect algebra.
Value as Flow
Part II · The Recurring Pattern — Chapter 9
The last recurring primitive is money — but reframed. The currency layer isn't bolted on; it's the same idea (value relative to whose need it serves) plus one physical principle (value must flow to persist).
The problem: what is "value" with no global oracle?
The truth machine refused a global truth — there is only per-resolver verdict. Currency needs the same honesty. There is no global "value," only value to a constituent: the capacity to do useful work for someone — a cached value near demand, committed disk, a relay's bandwidth. Value is inherently relative to whose need it reduces, exactly as truth is relative to whose verdict it predicts.
A second problem is sharper: assumption A2 (dispersed capital) is load-bearing for the entire truth machine, and nothing so far enforces it. Wealth tends to concentrate; concentration captures the world-model. We need a currency that structurally resists concentration.
Demurrage: value that must circulate
A living structure persists only by throughput — value that stops flowing dissipates. Translate that into monetary policy:
- Currency is the token that tracks value flowing between participants.
- Demurrage is the rule that currency decays unless circulated — the economic second law. Idle stock is reclaimed.
- The reclaimed flow is recycled as a baseline drip to everyone — a UBI.
This single mechanism does three jobs at once:
- It enforces dispersion (A2). With a demurrage rate δ, steady-state wealth converges to equal baseline plus net-contribution-flow scaled by 1/δ. You can only be richer than baseline by your sustained net flow over δ. Unbounded stock inequality becomes bounded flow inequality, and δ becomes a dial that directly enforces a target concentration bound on resolver budgets. Monetary policy and truth-machine validity are the same knob.
- It funds newcomers. Hoarders (stagnant stock) fund the baseline drip to fresh participants (flow). The stock-rich subsidize the flow-poor — which is also the dispersion the truth machine needs.
- It makes capital a weak lock-in. Wealth melts if you sit on it; what persists is flow and portable reputation. That keeps exit cheap, which (per the reflexivity chapter) is how the system stays coupled to reality.
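The steady-state claim can be checked with a toy simulation. The model below is an assumption made for illustration: time is discrete, every balance decays by δ per step, the reclaimed total is split equally (no n_eff weighting), and each participant has a constant net flow with the flows summing to zero so total money is conserved.

```python
def simulate(wealth, net_flow, delta, steps):
    """Apply decay, equal redistribution of the reclaimed amount, and net flow."""
    n = len(wealth)
    w = list(wealth)
    for _ in range(steps):
        drip = sum(delta * x for x in w) / n  # reclaimed stock, shared equally
        w = [x - delta * x + drip + f for x, f in zip(w, net_flow)]
    return w


def steady_state(total_money, net_flow, delta):
    """Equal baseline plus (net flow) / delta for each participant."""
    n = len(net_flow)
    return [total_money / n + f / delta for f in net_flow]


start = [100.0, 0.0, 0.0]  # one hoarder, two newcomers
flows = [1.0, 0.0, -1.0]   # net contribution per step, summing to zero
final = simulate(start, flows, delta=0.1, steps=500)
target = steady_state(sum(start), flows, delta=0.1)
```

The hoarder's head start disappears entirely. What remains is a baseline of one third of the money each, plus or minus ten units for the participants with nonzero net flow. Doubling δ halves that spread, which is the sense in which δ acts as the dial.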
Bootstrapping = perfusion, gated by local integrity
A new participant is like a new cell: it must be perfused (given a baseline so it can transact and have an experience before contributing) and then differentiate (develop a reputation/behavioral signature). But the moment of weakest boundary integrity is exactly birth — a newcomer has no history, so Sybil resistance is weakest precisely when we want to hand out resources.
Geography resolves this. Physical proximity is hard to forge in bulk: locally-present members can cheaply vouch for a newcomer, and faking many local identities requires physical presence in many places. So a newcomer enters through a local zone, vouched by neighbors, with a baseline endowment bounded by the strength of that local personhood proof. The same coordinate space then does quadruple duty — latency, birth-time vouching, currency locality, and consensus locality.
⚠️ Honest caveat. This bounds steady-state stock, not instantaneous flow — a high-flow actor can still spend in bursts (closed by burst-spend caps). And inter-zone exchange rates, speculation against a decaying currency, and any external peg all need real macroeconomic analysis the design doesn't yet have.
📐 Formal version: The Mathematical Core §4 — the wealth dynamics, the egalitarian-attractor proposition, and the δ-dial that turns A2 into a theorem.
Retroactive Consensus Markets
Part I · The Engine — Chapter 2
Conventional prediction markets bind three things that need not be bound: (i) eliciting a forecast, (ii) declaring what actually happened, and (iii) paying out. This mechanism decouples elicitation from resolution and reward.
Forecasters merely publish timestamped probability distributions. Each capital holder ("resolver") privately declares, whenever they like, what they believe happened, and an automated proper-scoring rule retroactively scores every forecaster against that resolver's own ground truth. Forecasters are therefore paid for predicting the future consensus of resolvers — which, under a shared-reality assumption, is a proxy for predicting future common knowledge. The same engine doubles as a skill ranking of forecasters, which can later weight votes in a governance system.
The primitives
- A set of questions $Q$, each question $q \in Q$ with an outcome space $\Omega_q$.
- Each forecaster $i$ publishes, at times $t$, a timestamped distribution $p_{i,q,t}$ over $\Omega_q$: their belief about $q$ at time $t$. (Timestamps must be tamper-evident; see A4.) Forecasters stake information, not money.
- Each resolver $r$ may, at a time of their choosing, declare a subjective resolution $\omega_{r,q} \in \Omega_q$: their own private verdict on what happened. Declaration is optional and revisable; a resolver may simply discard any question they find ambiguous or manipulated.
- A strictly proper scoring rule $S(p, \omega)$ (canonically the log score $S(p, \omega) = \log p(\omega)$), whose defining property is that reporting one's true belief uniquely maximizes expected score.
The reward rule
Score forecaster $i$ against resolver $r$'s verdict on question $q$, relative to a reference belief $\bar{p}_{q,t}$ (e.g. the consensus belief just before $i$'s update):

$$s_{i,r,q} = \sum_{t} w(t)\,\bigl[\, S(p_{i,q,t},\, \omega_{r,q}) - S(\bar{p}_{q,t},\, \omega_{r,q}) \,\bigr]$$

- The reference subtraction is Hanson's market scoring rule: a forecaster is paid for the marginal movement of belief toward the eventual verdict. You profit only by being right and different from the crowd, the formal version of "surprising and true." Echoing consensus earns nothing; moving belief the wrong way costs.
- $w(t)$ can up-weight earlier forecasts, further rewarding being right before it was common knowledge.
Forecaster $i$'s reputation with respect to resolver $r$ is $R_{i,r} = \sum_{q} s_{i,r,q}$. Resolver $r$ distributes a reward budget $B_r$ across forecasters as a monotone function of $R_{i,r}$. Equivalently: an automated market maker retroactively "places bets" on each forecaster's behalf using their published probabilities, then settles against $r$'s verdict.
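The reward rule is easiest to see with the log score on a single question. The sketch below assumes forecasts are dictionaries mapping outcomes to probabilities and that the reference belief is supplied by the caller. The function names and the optional `weight` argument (standing in for the time weighting) are illustrative.

```python
import math


def log_score(forecast, outcome):
    """Log score of a forecast against a declared outcome."""
    return math.log(forecast[outcome])


def marginal_score(forecast, reference, verdict, weight=1.0):
    """Score relative to the reference belief the forecaster moved away from."""
    return weight * (log_score(forecast, verdict) - log_score(reference, verdict))


def reputation(scored_forecasts):
    """Sum of marginal scores with respect to one resolver.

    scored_forecasts -- iterable of (forecast, reference, verdict, weight)
    """
    return sum(marginal_score(f, r, v, w) for f, r, v, w in scored_forecasts)


crowd = {"yes": 0.5, "no": 0.5}
right_and_different = marginal_score({"yes": 0.8, "no": 0.2}, crowd, "yes")
echo = marginal_score(crowd, crowd, "yes")
wrong_way = marginal_score({"yes": 0.2, "no": 0.8}, crowd, "yes")
```

Moving belief toward the verdict earns log(0.8 / 0.5), echoing the crowd earns exactly zero, and moving the wrong way is negative. A forecast that puts probability zero on the declared outcome raises an error here, the code-level echo of the log score's unbounded penalty.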
The structural facts that make this work:
- Resolution is subjective (per-resolver), retroactive (scored after the fact), optional, and revisable. No global instant of truth exists to attack.
- Forecasters are ranked relative to each resolver's worldview — no claim of objective truth, only skill at predicting a given resolver's eventual verdicts.
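The reward rule above can be sketched in a few lines of Python. This is a minimal illustration, not the project's implementation: the function names, the dictionary layout, and the particular payout rule (budget shared in proportion to positive reputation) are all assumptions chosen for the example. The text only requires that the payout be some monotone function of reputation.

```python
import math

def log_score(belief, outcome):
    """Log score: log-probability the forecast assigned to the declared outcome."""
    return math.log(belief[outcome])

def relative_score(belief, reference, verdict, weight=1.0):
    """Score relative to the reference belief, scaled by an earliness weight."""
    return weight * (log_score(belief, verdict) - log_score(reference, verdict))

def reputation(forecasts, verdicts):
    """Sum relative scores over the questions this resolver actually resolved.

    forecasts: {question: (belief, reference, weight)}   (hypothetical layout)
    verdicts:  {question: outcome}; discarded questions are simply absent.
    """
    return sum(
        relative_score(belief, reference, verdicts[q], weight)
        for q, (belief, reference, weight) in forecasts.items()
        if q in verdicts
    )

def split_budget(budget, reputations):
    """One possible monotone payout: share the budget by positive reputation."""
    positive = {name: max(rep, 0.0) for name, rep in reputations.items()}
    total = sum(positive.values())
    if total == 0:
        return {name: 0.0 for name in reputations}
    return {name: budget * rep / total for name, rep in positive.items()}

# Made-up reputations for three forecasters with one resolver.
shares = split_budget(1000.0, {"alice": 1.2, "bob": 0.0, "carol": -0.4})
```

Note that a question the resolver never resolves contributes nothing to anyone's reputation, which is how "optional" resolution shows up in the arithmetic.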
What it incentivizes — the consensus-proxy claim
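Proposition (informal). Suppose a forecaster is risk-neutral, cannot influence resolutions, and maximizes total expected reward $\sum_r B_r\,\mathbb{E}[\rho(p, \omega_r)]$, where $B_r$ is resolver $r$'s reward budget, $\omega_r$ is that resolver's eventual verdict, and $\rho$ is the scoring rule. Because $\rho$ is strictly proper and the objective is linear in $\rho$, the uniquely optimal report is the budget-weighted predictive distribution of resolvers' eventual verdicts:
$$p^{*}(\omega) = \frac{\sum_r B_r \Pr[\omega_r = \omega]}{\sum_r B_r}$$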
In words: the money-optimal strategy is to forecast what the capital-weighted population of resolvers will eventually conclude. Under shared reality (A1) that population converges, so this is a proxy for future common knowledge. Forecasters keep updating, because stale beliefs decay their reputation relative to peers.
This yields two outputs at once:
- A live, aggregated forecast of future resolver consensus on every question — a shared, evolving world-model.
- A skill ranking identifying who reliably predicts that consensus, usable downstream as voting weight.
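The consensus-proxy claim can be checked numerically on a binary question. The sketch below is illustrative only: the budgets and verdict probabilities are made-up numbers, and the reference subtraction is omitted because it does not depend on the forecaster's own report. It compares a brute-force search over reports with the budget-weighted average of the resolvers' verdict probabilities.

```python
import math

def expected_reward(p_yes, resolvers):
    """Budget-weighted expected log score of reporting p_yes.

    resolvers: list of (budget, probability this resolver eventually says "yes").
    """
    return sum(
        budget * (q * math.log(p_yes) + (1 - q) * math.log(1 - p_yes))
        for budget, q in resolvers
    )

def budget_weighted_forecast(resolvers):
    """Closed form: verdict probabilities averaged with budgets as weights."""
    total = sum(budget for budget, _ in resolvers)
    return sum(budget * q for budget, q in resolvers) / total

# Hypothetical population: one large resolver, one medium, one small dissenter.
resolvers = [(100.0, 0.9), (50.0, 0.6), (10.0, 0.1)]

grid = [i / 1000 for i in range(1, 1000)]
best_on_grid = max(grid, key=lambda p: expected_reward(p, resolvers))
closed_form = budget_weighted_forecast(resolvers)
```

The grid optimum lands within one grid step of the closed form, and the small dissenting resolver moves the optimum only slightly, which is the sense in which the forecast tracks capital-weighted consensus.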
Properties
- No exploitable resolution. Each resolver settles privately, retroactively, and can discard bad questions — so there is no single oracle to bribe, no point-in-time ambiguity to exploit, and no forced verdict on genuinely unresolved questions. This defeats failure modes (2) and (3) from the previous chapter.
- Optimization power without real-money risk. Reward is real but paid retroactively by resolvers who chose to pay, decoupled from a zero-sum betting pool — addressing failure mode (1) without importing the fragility of (2)–(3).
- Manufactures common knowledge. Given existing common knowledge of resolutions, the mechanism computes a new common knowledge of resolution-forecasts — and identifies who is best at producing it.
- Predicting ≈ deciding. Those who best predict future consensus are often those best positioned to say what is worth doing. The skill ranking can therefore weight a voting system, biased toward demonstrated forecasters to whatever degree a resolver desires.
To see all of this in motion on a single concrete question, read the worked example. The premises it leans on are catalogued in The Five Assumptions.
Open problems
The mechanism's load-bearing risks. Each is taken up and either answered or reduced to a measurable quantity later in the book — status tags below.
- Reflexivity / beauty contest (A5). If the skill ranking feeds back into the votes that constrain resolvers, forecasters may be predicting an equilibrium they help create. This breaks properness and admits self-fulfilling but false focal points. Convergence to truth vs. to a stable falsehood is the central theoretical gap. → Answered in principle: Reflexivity and the Dark Room / Mathematical Core §2. Linear herding is harmless; the danger is a bifurcation once the deference slope exceeds 1, and that slope is measurable and controllable.
- Resolver honesty is unmodeled. Nothing forces honest resolution once verdicts gate public money. → Reduced to an inequality: Mathematical Core §2.4.
- Manufactured surprise. A poorly chosen reference can reward contrarianism for its own sake. → Mitigated by construction (lagged-consensus reference, frozen budgets): Mathematical Core §2.5.
- Collusion / Sybil. Resolver–forecaster collusion, or Sybil forecasters farming the reference subtraction. → Reframed and substantially answered: Sybil Resistance Is Independence / Mathematical Core §3.
- "Reward by extrapolated intention/utility." Rewarding the extrapolated value of information when verdicts are late — a desideratum, not yet an operator. → Defined: Mathematical Core §2.5.
- Causal resolution. Verdicts must be ex-post counterfactual contrasts, not raw conditional outcomes, or the scoring rule lends authority to a confounded judgment (futarchy's flaw). → Futarchy and Causality.
The Problem
Part I · The Engine — Chapter 1
Start with a question that has no good answer today:
How does a group with money but no expertise buy the knowledge of experts who have no money — and turn it into good decisions — without trusting anyone to be the judge of what's true?
This is the problem under scientific funding, public policy, grant-making, open-source bounties, and venture investing. In every case three populations exist and rarely overlap:
- Resolvers (F): they hold capital and want to allocate it well, but lack the domain knowledge to know what's worth funding. They are the ultimate arbiters of "what turned out to be true / what mattered."
- Forecasters (P): they hold domain knowledge but little capital. They have views on which projects, claims, or directions will pan out.
- Candidates (Q): the projects, papers, claims, or events being evaluated.
We want a mechanism that lets F buy the dispersed knowledge of P and convert it into good allocations over Q, paying P in proportion to how informative their contribution was.
Why this is hard
The objective is underdetermined. Metrics are contested, the relevant knowledge is opaque and held by specialists, and reasonable people pursue different directions. We cannot write down the target function in advance. We can only specify a process for discovering it.
The obvious tool — a prediction market — almost works. Markets are the best knowledge-aggregators we have. But every existing form fails in a way that matters here.
Why existing prediction markets are insufficient
1. Play-money markets (Metaculus-style) let anyone open a market but give forecasters no financial stake. Without skin in the game they attract little optimization power: effort isn't compensated in proportion to informativeness, so the best specialists don't show up.
2. Real-money markets need a single, objective, point-in-time resolution. Someone, whether a trusted party (Kalshi) or a game-theoretic oracle (Polymarket), must declare at one instant what definitively happened. That instant is a central point of exploitation: with real money on the line, ambiguous questions invite contested or fraudulent resolutions.
3. Real-money markets degrade under inequality and ambiguity. A wealthy actor indifferent to losses can push the price of an ambiguous market. More generally, a market's ability to aggregate decentralized information falls as wealth concentration rises, which is exactly the regime (frontier science, policy) where we most need it.
The common root
All three failures share one cause:
Resolution is a single, objective, irreversible event coupled directly to payout.
Remove that coupling — separate eliciting a forecast from declaring what happened from paying out — and all three problems become tractable. That decoupling is the whole mechanism, and it's the subject of the next chapter.
Worked Example — Impact Markets for Science
Part I · The Engine — Chapter 3
The mechanism is abstract; let's run one question all the way through it. The setting is the one the idea was born in: deciding which scientific work matters, so funding can follow.
The setting
A top-tier venue does two separable things:
- Dissemination — spreading the work. Increasingly served by other channels.
- Credentialing — the prestige of acceptance, which is the real prize. Prestige is society's signal for "high odds of future impact," used to route funding and resources.
But prestige is a retroactive, lossy proxy: things become prestigious because they turn out useful. So why not estimate the underlying quantity — future impact — directly, with the engine from the previous chapter, and keep prestige only as a fallback?
Map the roles:
| Role | In this example |
|---|---|
| Candidates | This year's papers / results |
| Forecasters | Researchers who read them and have a view on what will matter |
| Resolvers | A funding body that will, years later, judge realized impact and pay out |
One question, start to finish
Question $q$: "Will resolver $r$ judge paper X to be high-impact by 2030?" Outcome space $\{\text{yes}, \text{no}\}$.
2025: forecasts accumulate. The crowd is skeptical; the consensus (reference) belief sits at $p_{\text{ref}}(\text{yes}) = 0.30$. Three forecasters act:
- Alice has read the paper closely and thinks it's a sleeper hit. She publishes $p(\text{yes}) = 0.80$, early.
- Bob just echoes the crowd: $p(\text{yes}) = 0.30$.
- Carol is a reflexive contrarian and bets it flops: $p(\text{yes}) = 0.10$.
All three forecasts are timestamped and tamper-evident, so "Alice said this in 2025" is later unforgeable.
2030: a resolver decides. Resolver $r$, now with five years of citations, downstream results, and comparison papers, privately declares the verdict: yes, high-impact. (Had the evidence been hopelessly muddy, $r$ could have simply discarded the question and paid no one.)
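Scoring. Using the log score (natural logarithm) with outcome = yes and the reference belief of 0.30, each forecaster's reward contribution is $\ln p(\text{yes}) - \ln p_{\text{ref}}(\text{yes})$:
| Forecaster | Belief $p(\text{yes})$ | $\ln p(\text{yes})$ | minus $\ln p_{\text{ref}}(\text{yes})$ | Reward |
|---|---|---|---|---|
| Alice | 0.80 | -0.223 | +1.204 | +0.981 |
| Bob | 0.30 | -1.204 | +1.204 | 0.000 |
| Carol | 0.10 | -2.303 | +1.204 | -1.099 |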
Alice is paid handsomely for being right and different from the crowd. Bob, who only echoed consensus, earns nothing — he added no information. Carol, who was different and wrong, is penalized.
The penalty is symmetric and keeps confidence honest. Had the verdict been no, Alice's confident 0.80 would have scored $\ln 0.20 - \ln 0.70 \approx -1.25$ against the 0.30 reference, a large loss. You are only rewarded for moving belief toward what the resolver eventually concludes.
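The arithmetic of this example is easy to reproduce. The sketch below assumes natural logarithms and takes the reference belief to be 0.30, the crowd value that Bob echoes. The function name and constants are illustrative, not part of any Dither API.

```python
import math

REFERENCE = 0.30  # assumed consensus belief in "yes" before the three forecasts

def reward(p_yes, verdict_yes, ref=REFERENCE):
    """Log score of the forecast minus log score of the reference belief."""
    p = p_yes if verdict_yes else 1 - p_yes
    r = ref if verdict_yes else 1 - ref
    return math.log(p) - math.log(r)

beliefs = {"Alice": 0.80, "Bob": 0.30, "Carol": 0.10}

for name, p in beliefs.items():
    print(f"{name}: verdict yes {reward(p, True):+.3f}, "
          f"verdict no {reward(p, False):+.3f}")
```

Running both verdicts side by side shows the trade each forecaster accepted: Alice's confidence earns the most if the verdict is yes and loses the most if it is no, while Bob's echo scores zero either way.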
Reputation and payout. Summed over many questions, Alice's reputation with this resolver is the total of her per-question scores. The resolver then splits its budget across forecasters in proportion to their reputation: Alice gets a large share, Bob ~none, Carol less than she started with.
Two phases, because reputation has a cold start
Before anyone has a track record, there is nothing to weight. So the system runs in two phases:
- Bootstrap. A play-money market with tokens handed to reviewers, to seed forecasts and begin accumulating reputation before real budgets are at stake.
- Allocation. Once reputations exist, retroactive scoring against resolvers' eventual impact verdicts converts forecaster reputation into real credentialing and funding weight.
What this example deliberately does not claim
The construction's own non-goals, stated honestly:
- It does not measure non-observable impact — only impact that is, for the targeted cases, publicly and quantitatively observable.
- It does not settle whether prestige should exist.
- It does not claim to align science with societal good.
It assumes only that impact is observable for the cases it targets, and that prestige ought to track it. That narrow scope is what makes it tractable.
The same three roles and the same scoring reappear in governance (resolvers = voters' delegated fact-finders, candidates = policies) and, more surprisingly, inside the network itself — which is where Part II goes.
The Five Assumptions
Part I · The Engine — Chapter 4
The mechanism is only as sound as its premises. Five assumptions hold it up. Naming them precisely does two things: it tells you exactly when the mechanism applies, and it turns each "what if this fails?" into a concrete design requirement that the rest of the book addresses.
A1 — Shared reality
Claim. On most questions, a large fraction of resolvers eventually agree; their verdicts are positively correlated.
Why it's needed. Forecasters are paid for predicting resolver consensus. If there is no consensus to predict — if every resolver's verdict is idiosyncratic — then "consensus" is undefined and the target disappears.
What breaks if it fails. Forecasting degenerates into modeling each resolver separately; the shared world-model evaporates.
Where it's addressed. The architecture keeps the perception layer (forecasts, verdicts, world-model) global so the shared-reality population is as large as possible — see The Living System and Mathematical Core §5.
A2 — Dispersed capital
Claim. Reward budgets are not extremely concentrated.
Why it's needed. If one whale holds most of the budget, the money-optimal forecast is "predict the whale," not "predict consensus" — and aggregation collapses to the same failure as real-money markets under inequality.
What breaks if it fails. Capture: the world-model tracks one actor's beliefs.
Where it's addressed. This is the deepest cross-layer link in the stack: a demurrage currency structurally enforces dispersion, turning A2 from an assumption into a tunable inequality. See Value as Flow and the δ-dial in Mathematical Core §4.2.
A3 — Repeated game
Claim. Forecasters value a future stream of reward, so they keep their published beliefs current.
Why it's needed. Reputation only means something if stale beliefs cost you. A one-shot game gives no reason to update.
What breaks if it fails. Forecasts go stale; the world-model lags reality.
Where it's addressed. Inherent in the scoring design — reputation decays relative to peers who keep updating (Mechanism).
A4 — Tamper-evident timestamps
Claim. Forecasts cannot be backdated.
Why it's needed. The reward for being "surprising and early" is only meaningful if "early" can't be forged after the outcome is known.
What breaks if it fails. Anyone can claim, post hoc, to have predicted everything — the entire skill ranking becomes fiction.
Where it's addressed. A minimal timestamping primitive built from causal entanglement — no global blockchain required. See Mathematical Core §4.4 and the roadmap's shared-primitive analysis.
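To make "tamper-evident" concrete, here is a minimal hash-chain sketch. It is not the causal-entanglement primitive referred to above; it is a simpler, swapped-in stand-in that shows the property A4 needs: once later entries have been seen by others, an earlier forecast cannot be altered or inserted without changing every hash after it. All names and the payload layout are hypothetical.

```python
import hashlib
import json

def entry_hash(prev_hash, payload):
    """Hash of this entry, bound to the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + body).hexdigest()

def append(chain, payload):
    """Add a forecast to the end of the chain."""
    prev = chain[-1]["hash"] if chain else "genesis"
    chain.append({"payload": payload, "hash": entry_hash(prev, payload)})

def verify(chain):
    """Recompute every link; any edited or inserted entry breaks the chain."""
    prev = "genesis"
    for entry in chain:
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True

chain = []
append(chain, {"forecaster": "alice", "question": "paper-X", "p_yes": 0.80})
append(chain, {"forecaster": "bob", "question": "paper-X", "p_yes": 0.30})
```

A backdating attempt amounts to editing an old payload after the fact; `verify` then fails unless every later hash is also rewritten, which witnesses holding the old hashes would notice.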
A5 — Non-reflexivity (the fragile one)
Claim. Forecasters' reports do not themselves change resolvers' verdicts.
Why it's needed. Proper scoring assumes you are predicting an outcome you can't move. If publishing a forecast causes the verdict it predicts, the rule rewards self-fulfilling prophecy instead of truth.
What breaks if it fails. The Keynesian beauty contest: forecasters predict an equilibrium they help create, and the system can converge on a stable, comfortable fiction.
Where it's addressed. This is the single most important open problem, and it gets a chapter of its own: Reflexivity and the Dark Room, formalized in Mathematical Core §2. The headline result is reassuring — linear influence is harmless; the danger is a sharp threshold that turns out to be measurable and controllable.
Reflexivity and the Dark Room
Part II · The Recurring Pattern — Chapter 8
This is the fragile assumption (A5) made into a chapter, because it is the design's central theoretical risk — and the place where the math delivers its most reassuring surprise.
The problem: a market that believes its own lies
The truth machine pays forecasters for predicting resolver consensus. But what if resolvers, when deciding their verdicts, look at the published forecast? Then the forecast is partly predicting itself. Push that far enough and the system can lock onto a self-fulfilling fiction: a belief everyone holds because everyone holds it, decoupled from reality.
The Free Energy Principle has a name for this failure: the dark room. An agent can trivially minimize "surprise" by sitting in a dark, predictable room doing nothing — perfectly predicting a trivial world. For us the dark room appears at three scales as one failure:
the truth machine's stable self-fulfilling fiction = governance's captured commons = the social echo chamber.
The reassuring result
Model a resolver's verdict as a mix: a fraction $\alpha$ driven by deference to the published forecast $f$, and the rest by their own private signal. Forecasters then publish the fixed point $f^{*} = (1 - \alpha)\,\pi + \alpha\, r(f^{*})$, where $\pi$ is the honest posterior and $r(\cdot)$ is how resolvers respond to the forecast.
If deference is linear ($r(f) = f$) and $\alpha < 1$, the fixed point is exactly $f^{*} = \pi$. The influence passes straight through and cancels; the forecast remains the honest posterior. Reflexivity does not bias the forecast at all in the linear regime.
So mere influence is harmless. The beauty-contest pathology requires non-linearity: specifically, a conformity response strong enough that the deference slope
$$\kappa = \alpha\, r'(f^{*})$$
exceeds 1. Below that threshold the honest belief is the unique outcome; above it, the system bifurcates and fiction equilibria appear. The whole reflexivity question collapses to one measurable scalar and one safety condition: $\kappa < 1$.
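A small simulation makes the threshold visible. The model here is an assumption made for illustration: the published forecast is iterated as `f <- (1 - alpha) * honest + alpha * response(f)`, with either a linear response or an S-shaped "conformist" response whose steepness is a free parameter. None of these names come from the Mathematical Core.

```python
import math

def settle(honest, alpha, response, start, steps=2000):
    """Iterate the forecast update until it has effectively stopped moving."""
    f = start
    for _ in range(steps):
        f = (1 - alpha) * honest + alpha * response(f)
    return f

def linear(f):
    """Resolvers defer to the forecast exactly as published."""
    return f

def conformist(steepness):
    """S-shaped response centred on 0.5; its slope at 0.5 is steepness / 4."""
    return lambda f: 1 / (1 + math.exp(-steepness * (f - 0.5)))

alpha = 0.8

# Linear deference: the forecast returns to the honest posterior.
linear_result = settle(honest=0.3, alpha=alpha, response=linear, start=0.9)

# Mild conformity: overall slope 0.8 * 2 / 4 = 0.4 < 1, one outcome.
mild = settle(honest=0.5, alpha=alpha, response=conformist(2), start=0.9)

# Strong conformity: overall slope 0.8 * 10 / 4 = 2 > 1, outcome depends on start.
high = settle(honest=0.5, alpha=alpha, response=conformist(10), start=0.6)
low = settle(honest=0.5, alpha=alpha, response=conformist(10), start=0.4)
```

In the strong-conformity case the honest belief of 0.5 is still a fixed point, but it is unstable: a small initial lean in either direction is amplified into a confident forecast that the private signals never supported.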
Three levers that control the deference slope
- Blind resolution (information design). Don't show the current consensus to a resolver while they record a verdict. You can't make deference zero (knowledge leaks), but you remove the highest-bandwidth channel from forecast to verdict. A UI choice with a theorem behind it.
- Reality coupling (incentive design). Pay resolvers a small bonus on their own verdicts scored against later-arriving ground truth. This rewards using the private signal, directly lowering the deference slope. At the largest scale, the reality-coupling mechanism is exit: a self-deluding network loses members to one that isn't deluded (The Living System).
- Epistemic value (reward routing). Steer reward budget toward questions where resolving uncertainty matters, so optimization power flows to live questions instead of comfortable ones — without touching the scoring rule's properness.
⚠️ Honest caveat. The deference slope is measurable and controllable, but the shape of the human deference curve in this setting is unknown until a real pilot measures it, and the coupled, whole-system stability proof is still open.
📐 Formal version: The Mathematical Core §2, covering the convergence propositions, the bifurcation, continuous degradation below threshold, and perturbation audits that estimate the deference slope directly. The closely related causal version of this trap is in Futarchy and Causality.
Pay for Prediction
Part II · The Recurring Pattern — Chapter 6
The substrate is a flow problem. But who decides which value to hold where, and why would they decide well? The answer reveals the pattern that ties the whole stack together.
A node is a predictor
Consider a single node deciding what to cache and what to serve. To do this well it must predict: where will demand for this value appear? what will it pay? which routes will be cheap? A node maintains a boundary — the messages it receives (its senses) and the messages it sends (its actions) — and everything inside (its stored data, keys, local compute) is hidden behind that boundary. (This boundary has a precise name, the Markov blanket; we use it fully in The Living System.)
The node persists by predicting the demand/price/topology field and pre-positioning materializations (the caching decision from the previous chapter) to serve requests cheaply:
A node that predicts well serves fast, earns, and survives; a node that predicts badly wastes resources and dies. Economic survival is accurate prediction.
This isn't a metaphor bolted on afterward. The routing design's RL world-models — agents predicting latency, bandwidth, and cost to choose relays — are already doing exactly this. We just hadn't named it as the same thing the truth machine does.
The pattern, everywhere
Once you look for it, "pay agents for predictive accuracy" is at every layer:
| Layer | What's predicted | Reward for accuracy |
|---|---|---|
| Routing | latency / bandwidth / cost of relays | better relay selection, more traffic served |
| Caching / storage | future demand for a value | profitable pre-positioning |
| Truth machine | future resolver consensus | reputation + payout (Mechanism) |
| Governance | which policy advances a goal | influence weighted by track record |
The same primitive, a scoring/staking framework that rewards calibrated prediction, could serve all four. The truth-machine engine of Part I is the most fully worked-out instance, but it is one instance of the pattern, not a one-off.
Prices are prediction errors
There is a precise sense in which this all collapses to one quantity. If the network prices delivery at the marginal cost of unmet demand, then a node's profit from any local action equals the reduction in total system-wide unmet demand that the action causes. Profit-seeking becomes distributed gradient descent on a single global error signal: the price field is the prediction-error field. Predict well → reduce error → get paid.
⚠️ Honest caveat. This clean alignment needs convex costs and price-taking nodes. Lumpy storage commitments and relays with market power break it; those need real mechanism design, not just the abstraction.
📐 Formal version: The Mathematical Core §1.3 (prices as prediction errors; the alignment proposition) and §2 (the truth-machine fixed point).
The Living System
Part III · The Synthesis — Chapter 10
You have now met every piece separately: a knowledge-aggregating engine, a substrate that unifies storage/routing/compute, nodes that survive by predicting, identity as independence, belief kept honest against reflexivity, and value that must flow. They were introduced as separate engineering problems. They are not separate.
They are the organs of a single kind of thing: a self-producing, predictive, dissipative system — an artificial organism — recursively instantiated at the scale of bits, nodes, zones, and whole networks. Every layer is the same act at a different scale: a system maintaining its own existence across a boundary by predicting its environment well enough to keep paying its maintenance cost.
The design kept feeling incomplete because we were specifying organs without having named the animal. Three established theories give it an anatomy:
- Autopoiesis (Maturana & Varela) — what it is: a system that continuously produces the components, and the boundary, that produce it.
- Active inference / the Free Energy Principle (Friston) — how it runs: anything that persists maintains a Markov blanket and acts to minimize surprise about what crosses it.
- Multi-level selection / major evolutionary transitions (Maynard Smith & Szathmáry; Michod) — how it competes and evolves: cooperation at one level is stabilized by selection at the level above.
The through-line connecting all three — and the reason it surfaces everywhere — is boundary integrity, which in our setting is exactly Sybil resistance (Chapter 7).
The blanket nests
A node is a Markov blanket; a zone (a latency/geographic cluster) is a higher-order blanket whose internal states are nodes; the whole network is a blanket whose internal states are zones. Each level minimizes surprise at its own scale. That is the rigorous content of "Dither as a higher organism": a hierarchical Markov blanket that persists by maintaining an accurate shared model and acting to keep its constituents' predictions satisfied. And it tells you which layer is which organ:
| Organ (living system) | Stack layer |
|---|---|
| Membrane + immune system (self/non-self) | Identity + Sybil resistance (Ch. 7) |
| Metabolism (free-energy transport) | Storage / compute / routing (Ch. 5) |
| Bloodstream / ATP | Currency; demurrage = turnover; UBI = perfusion (Ch. 9) |
| Perception (generative model) | Truth machine (Ch. 2) |
| Will / motor system (action) | Governance / liquid QV |
| Body map / anatomy | Latency-trust coordinate space |
| Reproduction & evolution | Network forking + between-network selection |
| Morphogenesis / development | New-user bootstrapping (Ch. 9) |
| Cancer | Sybil / collusion / commons-defection |
Perception and action are the same calculus pointed opposite ways: the truth machine is perception (change beliefs to match the world); governance is action (change the world to match beliefs). Reflexivity — forecasters predicting an equilibrium they help create — is not an exotic bug but the motor itself; the dark-room chapter is the condition under which the motor stays coupled to reality.
The commons and competing networks
The tragedy of the commons is a multipolar trap. Evolution's answer is a major evolutionary transition: a higher-level individual emerges whose own fitness depends on the commons being maintained, and which can discipline its constituents. The stability conditions are exactly this design's spec:
- (a) Align constituent fitness with the whole. A node survives precisely by doing work the network demands; currency is the fitness-coupling; demurrage+UBI keeps alignment from drifting into capture.
- (b) Suppress within-group defection. Independence accounting is the immune system; a Sybil/colluder/free-rider is cancer — a constituent defecting on the shared metabolism for private replication.
- (c) Enable between-group selection. Multiple competing networks. If there's only one network it faces no selection on itself and can ossify or turn despotic. Between-network selection — via cheap exit — is what keeps the higher organism honest: badly-governed networks shed members and die. Exit is the reality-coupling of Chapter 8 at societal scale.
The central tension this exposes: shared reality wants global; commons-discipline wants local-and-exitable. The truth machine needs a large shared-reality population (A1/A2) to define consensus; discipline needs small exitable units so selection works and tyranny is escapable. The resolution is the nested blanket: reality is shared at the higher (global) levels, while governance and exit operate at lower, exitable levels. The depth of nesting is set by this tension — a design principle, not a slogan: it tells you how to size zones.
The smallest generating set
If the whole design comes from a few seeds, they are these:
- One substrate operation: `MATERIALIZE(value, spacetime-region)`; the network solves min-cost flow over it; disp is the verifiable value algebra. (Ch. 5)
- One agent principle: persist by minimizing surprise across your Markov blanket. (Ch. 6)
- One recursion: blankets nest — node ⊂ zone ⊂ network — with the truth machine as collective perception and governance as collective action. (this chapter)
- One boundary condition: Sybil resistance = blanket integrity = conditional-independence among constituents; identity is a metabolic signature. (Ch. 7)
- One flow: currency = value accounting; demurrage = turnover; UBI = perfusion; geography makes the nesting physical. (Ch. 9)
- One selection: align fitness, suppress cancer, allow between-network exit. (this chapter)
Compressed to a sentence: build one living system and the layers are just its organs — "value" is whatever reduces a constituent's surprise, "currency" tracks its flow, "trade" is transport across spacetime, "perception" is the truth machine, "action" is governance, "identity" is the membrane, and "evolution" is competition between whole networks.
Where the elegance could deceive us
A beautiful frame is dangerous precisely because it explains everything. Guardrails:
- The FEP is slippery. "Minimize free energy" can be retrofit to almost anything; its value here is as organizing language and design heuristic, not free predictive math. The real content lives in the specific instantiations — the min-cost flow, the independence detector, demurrage-funded UBI, the epistemic-value term. If we ever catch ourselves deriving a mechanism from "free energy" alone, we've started decorating instead of designing. (The Mathematical Core §1.3 earns the frame back by choosing the functional and deriving the alignment.)
- The metabolism unification is abstraction-level, not operational — three real cost models are still owed.
- Two open problems are irreducible and central: reflexivity/dark-room collapse, and the autoimmune-vs-tolerance limit of independence accounting.
- The identity trilemma is the true bottleneck: a personhood/boundary proof that is simultaneously strong, cheap, and privacy-preserving.
- Exit has a dark side: between-network selection can become a race to the bottom, and network effects can raise exit costs until selection stops working. Keeping exit genuinely cheap is itself a first-class design requirement.
These are not reasons to abandon the frame; they are the frame earning its keep by naming exactly which problems are load-bearing, which The Mathematical Core formalizes as the viability inequalities V1–V5.
What this changes about the build
The frame re-prioritizes the engineering roadmap:
- Promote the latency-trust coordinate space to a first-class, shared primitive — routing, consensus tiers, identity-vouching, and currency zones should all read from one coordinate system.
- Specify identity as accumulated behavioral history, not a separate credential — collapsing identity and collusion-defense into one mechanism.
- Add an epistemic-value term and a reality-coupling requirement as explicit defenses against dark-room collapse — testable in the centralized pilot.
- Co-design demurrage + UBI + birth-time local vouching as one mechanism ("perfusion gated by local integrity").
- Make exit cheap by design (portable identity/reputation/data).
- Build the substrate market on MATERIALIZE/min-cost-flow with disp as the value algebra — the single largest simplification available.
The Mathematical Core
Part III · The Synthesis — Chapter 11. The formal reference for the whole stack.
This is the rigorous counterpart to the intuitive chapters of Part II and the synthesis of The Living System. Goal: take every open question raised earlier in the book and either answer it, reduce it to a measurable quantity, or state precisely why it is open. The intuitive chapters forward-link here; each section below is the formal version of one of them. Notation follows the mechanism; every symbol is in the glossary.
0. The object, defined
A Dither organism is a tree of agents (node ⊂ zone ⊂ network), where each agent a maintains a boundary and solves one control problem:
minimize F_a = E[ priced unmet demand inside a's boundary
+ maintenance cost of a's internal states
− revenue from serving demand across a's boundary ]
Agents are coupled by exactly three operators, and the entire stack is these three operators applied at every scale:
| Operator | Couples across | Implements | Layer name |
|---|---|---|---|
| Price field `p(v, x, t)` | space (within a scale) | the substrate market | storage/compute/routing |
| Scoring rule `ρ` | time (present ↔ future verdicts) | the epistemic engine | truth machine |
| Independence weight `n_eff` | identity (who counts, how much) | the immune system | Sybil/identity/voting |
plus one vertical flow: weight issuance downward (credits, UBI, witness power, all gated by n_eff) and aggregation upward (world-models, preference tallies, all weighted by n_eff).
The claim made precise in this document: the organism is viable iff four inequalities hold (§7), and each of the four corresponds to one of the load-bearing open problems from The Living System §8.
1. The substrate: MATERIALIZE as priced flow
1.1 The optimization problem
Demand is a field D(v, x, t) — intensity of requests for value v at location x (locations = points in the latency coordinate space). The network chooses a flow over three edge types:
TRANSPORT-TIME(v, x, [t,t']): cost = c_s(x) · |v| · (t'−t) (storage)
TRANSPORT-SPACE(v, x→y, t): cost = c_b(x,y) · |v| (routing)
TRANSFORM({vᵢ} ↦ f({vᵢ}), x): cost = c_c(x) · work(f) (compute)
subject to: every demanded (v, x, t) is satisfied by some DAG of edges terminating in a materialization of v at (x, t) within its latency bound. This is a min-cost multicommodity flow problem in a spacetime-value graph; "fetch vs. recompute" is route choice within one graph.
work(f) has a canonical unit. Tree calculus is deterministic and confluent, so "number of kernel reduction steps" is a machine-independent work measure — the natural gas metric. disp gives the substrate market its metering for free.
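The three edge costs and the "fetch vs. recompute" route choice can be sketched directly. The rates and sizes below are made-up numbers, and the real problem is a multicommodity flow over a whole graph; this toy only compares three single-edge plans for materializing one value at one place and time.

```python
def storage_cost(c_s, size, duration):
    """TRANSPORT-TIME: hold `size` units of value for `duration` at rate c_s."""
    return c_s * size * duration

def routing_cost(c_b, size):
    """TRANSPORT-SPACE: move `size` units of value over a link with rate c_b."""
    return c_b * size

def compute_cost(c_c, work):
    """TRANSFORM: run `work` kernel reduction steps at rate c_c."""
    return c_c * work

def cheapest_plan(plans):
    """Pick the plan name with the lowest total cost."""
    return min(plans, key=plans.get)

size = 1000  # hypothetical |v|
plans = {
    "keep cached": storage_cost(c_s=0.001, size=size, duration=60),
    "fetch": routing_cost(c_b=0.05, size=size),
    "recompute": compute_cost(c_c=0.0002, work=400_000),
}
choice = cheapest_plan(plans)
```

Because all three options are priced in the same unit, caching, fetching, and recomputing compete as routes in one graph, and the answer changes as soon as any of the three rates does.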
1.2 The local decision rule (caching = replication = memoization)
For a price-taking node at x, holding value v is profitable iff
λ_v(x) · p(v, x) > c_s(x) · |v|
where λ_v(x) is local demand intensity and p(v,x) the posted price of delivering v at x. One inequality; instantiated with v = a file it is a CDN/replication rule, with v = a function result it is memoization, with v = a hot route's session state it is caching. This is the formal content of "storage, compute, routing are one thing": one flow problem, one threshold rule.
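As a quick illustration, the threshold rule is a one-line predicate. The three example values and all of their numbers are hypothetical; the point is only that the same function decides replication, memoization, and caching.

```python
def should_hold(demand_rate, price, storage_rate, size):
    """Hold v iff expected revenue per unit time exceeds storage cost per unit time."""
    return demand_rate * price > storage_rate * size

def break_even_demand(price, storage_rate, size):
    """Demand intensity at which holding v exactly pays for itself."""
    return storage_rate * size / price

candidates = {
    "file (replication)": dict(
        demand_rate=5.0, price=0.02, storage_rate=1e-5, size=4000),
    "function result (memoization)": dict(
        demand_rate=0.1, price=0.05, storage_rate=1e-5, size=1000),
    "session state (caching)": dict(
        demand_rate=50.0, price=0.001, storage_rate=1e-5, size=200),
}
decisions = {name: should_hold(**args) for name, args in candidates.items()}
```

With these numbers the file and the session state are worth holding, while the rarely requested function result is cheaper to recompute on demand.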
1.3 Prices are prediction errors (de-metaphorizing the FEP)
Define network free energy as total expected priced shortfall plus maintenance:
F = Σ_{v,x} D(v,x) · shortfall_cost(v,x) + Σ_nodes maintenance
Proposition 1 (alignment). If delivery is priced at marginal shortfall cost and nodes are price-takers, then a node's profit from any local action equals the decrease in F that the action causes. Individual profit-seeking is distributed gradient descent on F.
Sketch. Marginal-cost pricing makes each served request transfer exactly its social shortfall-saving to the server; maintenance is borne locally; sum over actions. This is the first welfare theorem specialized to a convex flow problem. ∎
This answers the "FEP is slippery" objection from The Living System §8 by choosing the free-energy functional and deriving the alignment, rather than asserting it: the price field is the prediction-error field. (Honest scope: Proposition 1 needs convex costs and no market power; lumpy storage and monopoly relays break it — see §8, Open-5.)
1.4 Verification (the layer-1 ↔ layer-4 bridge, made exact)
Because tree-calculus reduction is deterministic and confluent, two honest executors of f(x) agree bit-for-bit. Hence:
- k-replication: sample k executors from a pool with honest fraction h; undetected fraud requires all k sampled to collude on the same wrong hash: P[fraud] ≤ (1−h)^k. Exponential security for linear cost.
- Bisection disputes: hash-consed reduction traces are Merklizable; a referee resolves a disputed run by binary search over the trace in O(log T) checked steps (Truebit-style), so the expensive path is only taken on disagreement.
- Predicate checking: when the disp result-contract P is cheaper than f (NP-style asymmetry), the buyer verifies directly and no replication is needed.
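To make the k-replication bound concrete, the following sketch evaluates (1−h)^k and searches for the smallest k that meets a target fraud probability. The function names are hypothetical, and the bound assumes executors are sampled independently from the pool, as in the bullet above.

```python
def fraud_bound(honest_fraction: float, k: int) -> float:
    """Upper bound on undetected fraud: all k sampled executors are dishonest."""
    return (1.0 - honest_fraction) ** k


def replicas_needed(honest_fraction: float, target: float) -> int:
    """Smallest k with (1 - h)^k <= target, found by direct search."""
    if not 0.0 < honest_fraction <= 1.0:
        raise ValueError("honest_fraction must be in (0, 1]")
    if not 0.0 < target < 1.0:
        raise ValueError("target must be in (0, 1)")
    k = 1
    while fraud_bound(honest_fraction, k) > target:
        k += 1
    return k


# Even a half-dishonest pool reaches one-in-a-million with 20 replicas.
print(replicas_needed(0.5, 1e-6))
```

The replica count grows only logarithmically in 1/target, which is the "exponential security for linear cost" claim read in the other direction.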
1.5 The disp ↔ network boundary (roadmap Q4 — answered)
The language's effect signature and the network's service API are the same three-operation algebra:
Eff_net ::= store : Tree → Duration → Eff Receipt (TRANSPORT-TIME)
| send : Tree → Coord → Eff Receipt (TRANSPORT-SPACE)
| eval : Tree → Tree → Eff Tree (TRANSFORM)
| price : Query → Eff PriceQuote (read the field)
Serialization is canonical: hash-consed subtree encoding with content-addressed chunking (dedup is free, since equal subtrees are pointer-equal already). The "TC-Net backend" in disp's EVALUATOR_PLAN is precisely an implementation of this algebra. There is no separate "integration layer" to design; the substrate market is disp's evaluator.
2. The epistemic engine: reflexivity solved-in-principle
This section addresses the gating question (roadmap Q1, truth-markets §8.1): does the perception↔action loop converge to truth or fiction?
2.1 The model
One binary question, true state θ. Forecaster's honest posterior given evidence: π. Resolvers see private signals of quality q and may also defer to the published consensus forecast p. Model resolver verdicts as
r = λ · g(p) + (1−λ) · σ(θ)
where σ(θ) is the signal-driven verdict probability, g(p) the deference response, and λ ∈ [0,1] the deference weight — the fraction of resolution behavior driven by the forecast itself rather than by independent perception. Forecasters paid by proper scoring against r will publish the fixed point
p* = λ·g(p*) + (1−λ)·π
2.2 The convergence results
Proposition 2 (linear herding is harmless). If g(p) = p (resolvers defer linearly) and λ < 1, the unique fixed point is p* = π. The forecast remains exactly the honest posterior; properness is preserved.
Proof. p* = λp* + (1−λ)π ⟹ (1−λ)p* = (1−λ)π ⟹ p* = π. ∎
This is genuinely surprising and important: reflexivity does not bias the forecast at all in the linear regime — the influence passes through and cancels. The beauty-contest pathology requires nonlinearity, not mere influence.
Proposition 3 (bifurcation threshold). The fixed point is unique for any g with λ · sup|g′| < 1. If λ · sup|g′| > 1 (e.g. conformity response g(p) = sigmoid(κ(p−½)) with λκ/4 > 1), multiple stable fixed points exist — including fiction equilibria where p* is near certainty while π is not. This is the dark room, as a bifurcation.
Sketch. Banach contraction for uniqueness; for the sigmoid case, plot p ↦ λg(p)+(1−λ)π: slope > 1 at the crossing creates the classic S-curve with three intersections, outer two stable. ∎
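Propositions 2 and 3 can be checked numerically by iterating the map p ↦ λ·g(p) + (1−λ)·π from different starting points. This is only an illustrative sketch: the parameter values are arbitrary, and `fixed_point`, `linear`, and `conformity` are hypothetical helper names.

```python
import math
from typing import Callable


def fixed_point(g: Callable[[float], float], lam: float, pi: float,
                p0: float, steps: int = 500) -> float:
    """Iterate p <- lam * g(p) + (1 - lam) * pi and return the final value."""
    p = p0
    for _ in range(steps):
        p = lam * g(p) + (1.0 - lam) * pi
    return p


def linear(p: float) -> float:
    """Linear deference, g(p) = p (Proposition 2)."""
    return p


def conformity(kappa: float) -> Callable[[float], float]:
    """Conformity response g(p) = sigmoid(kappa * (p - 1/2)); sup|g'| = kappa / 4."""
    return lambda p: 1.0 / (1.0 + math.exp(-kappa * (p - 0.5)))


# Linear herding: every start converges to the honest posterior pi.
lin_low = fixed_point(linear, lam=0.6, pi=0.7, p0=0.05)
lin_high = fixed_point(linear, lam=0.6, pi=0.7, p0=0.95)

# Supercritical sigmoid (lam * kappa / 4 = 4.5 > 1): two stable fiction equilibria.
fic_low = fixed_point(conformity(20.0), lam=0.9, pi=0.5, p0=0.05)
fic_high = fixed_point(conformity(20.0), lam=0.9, pi=0.5, p0=0.95)

# Subcritical sigmoid (lam * kappa / 4 = 0.5 < 1): unique fixed point again.
sub_low = fixed_point(conformity(20.0), lam=0.1, pi=0.5, p0=0.05)
sub_high = fixed_point(conformity(20.0), lam=0.1, pi=0.5, p0=0.95)

print(lin_low, lin_high, fic_low, fic_high, sub_low, sub_high)
```

In the supercritical case the honest posterior is 0.5, yet the published forecast settles near certainty in whichever direction it started, which is the dark-room outcome the proposition describes.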
Proposition 4 (continuous degradation below threshold). Even when unique, the verdict's information about θ degrades: verdict precision scales as (1−λ)², and deference correlates resolvers through the common channel, so m resolvers are worth only
m_eff = m / (1 + (m−1)·ρ_λ)
independent ones, where ρ_λ is the deference-induced verdict correlation. Aggregation quality decays smoothly in λ long before the bifurcation.
2.3 The design consequences (now concrete mechanisms)
The whole reflexivity question reduces to one measurable scalar: the deference slope L = λ·sup|g′|, with safety condition L < 1. Three mechanisms control it:
- Blind resolution (information design). The resolution interface does not display the current consensus p when a resolver records a verdict. You cannot make λ = 0 (public knowledge leaks), but you remove the highest-bandwidth channel from p to r. This is a UI decision with a theorem behind it.
- Reality coupling (incentive design). Pay resolvers a small properness bonus on their own verdicts scored against later-arriving information; this rewards using the private signal, directly lowering λ.
- Perturbation audits (measurement). On a random subset of questions, randomize what consensus value is displayed to resolvers (or randomize blind vs. shown). The regression of verdicts on displayed consensus estimates λ·g′ directly. The pilot's primary measurement target is L̂, the estimate of the system's distance from the bifurcation.
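The perturbation audit amounts to an ordinary regression slope. The sketch below is a deliberately noise-free toy: verdict probabilities follow the §2.1 model with linear g (so g′ = 1), displayed consensus values are assigned on a balanced grid so they are uncorrelated with the private signal, and `TRUE_LAMBDA`, `ols_slope`, and the signal values are all hypothetical. A real audit would work with noisy, possibly binary verdicts and would need standard errors.

```python
def ols_slope(xs: list[float], ys: list[float]) -> float:
    """Least-squares slope of y on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


TRUE_LAMBDA = 0.3                         # hidden deference weight to recover
displayed = [i / 10 for i in range(1, 10)]  # randomized consensus shown to resolvers
signals = [0.3, 0.7]                      # signal-driven verdict probabilities σ(θ)

shown, verdicts = [], []
for d in displayed:
    for s in signals:  # balanced design: each signal appears at every display value
        shown.append(d)
        verdicts.append(TRUE_LAMBDA * d + (1.0 - TRUE_LAMBDA) * s)

slope_hat = ols_slope(shown, verdicts)
print(round(slope_hat, 6))
```

Because the displayed value is assigned independently of the signal, the slope isolates the deference channel; without that randomization the regression would also pick up the correlation between consensus and truth.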
Proposition 5 (effort routing preserves properness). Modifying the score with an uncertainty bonus breaks properness (it pays for confidence per se). But scaling each question's reward budget by any forecast-independent factor u_q preserves per-question properness. Choosing u_q ∝ expected value of information (decision-relevance × current entropy under a lagged consensus model) routes optimization power toward uncertainty that matters — the "epistemic value" requirement from the living-network essay, implemented without touching the scoring rule. (Lagged/frozen u_q so no individual forecast can move its own question's budget.)
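A small numerical check of Proposition 5, assuming the binary log score as the proper rule: the expected-score-maximizing report is found by grid search under (a) the plain score, (b) the score scaled by a forecast-independent budget factor, and (c) the score plus a report-dependent bonus. The bonus form `0.5 * |2p − 1|` is a hypothetical example of "paying for confidence", not something specified in the text.

```python
import math
from typing import Callable

GRID = [i / 100 for i in range(1, 100)]  # candidate reports 0.01 .. 0.99


def expected_log_score(report: float, belief: float) -> float:
    """Expected log score of a report when the forecaster believes P(yes) = belief."""
    return belief * math.log(report) + (1.0 - belief) * math.log(1.0 - report)


def best_report(payoff: Callable[[float, float], float], belief: float) -> float:
    """Report on the grid that maximizes expected payoff under the given belief."""
    return max(GRID, key=lambda p: payoff(p, belief))


belief = 0.7
u_q = 3.5  # forecast-independent budget factor for this question (hypothetical value)

honest = best_report(expected_log_score, belief)
scaled = best_report(lambda p, b: u_q * expected_log_score(p, b), belief)
with_bonus = best_report(
    lambda p, b: expected_log_score(p, b) + 0.5 * abs(2.0 * p - 1.0), belief
)
print(honest, scaled, with_bonus)
```

Scaling leaves the optimum at the honest belief because a positive constant does not move an argmax; the additive bonus shifts it toward the extreme, which is the properness failure the proposition warns about.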
2.4 Resolver honesty (truth-markets §8.2 — reduced to an inequality)
Per-resolver scoring contains corruption: resolver j lying corrupts only R_{i,j}, no one else's channel (unlike a global oracle, where one corruption poisons everything). The remaining question is whether j lies in their own channel. But j consumes their own rankings: their allocations are guided by forecasters selected via R_{i,j}. Mis-resolving toward a lie r̃ selects forecasters skilled at predicting r̃-shaped verdicts rather than θ, degrading j's own future allocation utility. Honesty is dominant when
∂U_j/∂(ranking fidelity) > side-payment available for lying
— a skin-in-the-game inequality satisfied for resolvers who actually allocate the budgets they score with (and not for pure influence-buyers, who should be down-weighted in the public aggregate exactly as §3 prescribes — the same independence machinery applies to the resolver side: correlated resolver blocs get GLS-down-weighted in p*).
2.5 The "extrapolated intention/utility" operator (truth-markets §8.5 — defined)
Define the retroactive value of forecaster i's information to resolver j as decision value:
EU(i→j) = E_j[ U_j(allocation with p_i) − U_j(allocation without p_i) ]
evaluated under j's own posterior world-model. This is implementable by self-application of the engine: "will j, in hindsight, judge that i's forecast improved j's allocation?" is itself a question with a per-resolver retroactive verdict. The regress terminates because its base case — j's realized allocation outcomes — is directly observed by j. The previously-undefined desideratum is thus an ordinary question type inside the same market.
3. The immune system: independence accounting
This section addresses Sybil resistance, collusion, the autoimmune dilemma, and the QV attack at once (roadmap Q2, truth-markets §8.4, living-network §4 & §8).
3.1 The reframe: from personhood to precision
Stop asking "is this a distinct person?" Ask: how much independent information does this constituent contribute? Let x_i be agent i's behavioral residual stream — forecast errors, transaction timing, verdicts — after conditioning on public information c. For honest distinct agents, residuals are (approximately) independent; for puppets of one controller, they remain coupled given c, because they share private state.
Let Σ be the residual correlation matrix. Define the effective population:
n_eff = 1ᵀ Σ⁻¹ 1
— the precision of the optimally-weighted (GLS) average of the streams. For an equicorrelated cluster of k accounts with correlation ρ: n_eff = k/(1+(k−1)ρ). Perfect Sybils (ρ→1) collapse to n_eff = 1 no matter how many accounts; genuine independents (ρ→0) count fully.
All weight in the system is issued per unit n_eff, not per account:
- Aggregation (world-model, resolver consensus): GLS weights w = Σ⁻¹1 / 1ᵀΣ⁻¹1, so correlated blocs are automatically discounted.
- Voting credits: QV is √k-vulnerable to Sybils (split budget c across k accounts: votes go from √c to √(kc)). Issue credits proportional to each agent's marginal n_eff contribution and the attack yields zero: a fully-correlated cluster receives one agent's credits regardless of account count.
- UBI / perfusion: newcomer drip ramps with established marginal n_eff (a farm of fresh accounts shares ≈ one UBI until the accounts behaviorally diverge; see §4.3).
- Witness power (timestamping, §4.4): attestations weighted by n_eff, so "many witnesses" means many independent witnesses.

n_eff is the single security parameter of the entire stack. Votes, aggregation quality, UBI farming, witness security, and resolver-bloc resistance are all the same number. "Boundary integrity," quantified.
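The effective population and the GLS weights both come from one linear solve, Σy = 1. The sketch below uses a small pure-Python Gaussian elimination so it has no dependencies; the population (three puppet accounts correlated at 0.9 plus two independents) is a made-up example, and a production estimator would need the regularization discussed in §3.2.

```python
def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve a @ y = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular correlation matrix; regularize first")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    y = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * y[c] for c in range(r + 1, n))
        y[r] = (m[r][n] - tail) / m[r][r]
    return y


def n_eff_and_weights(sigma: list[list[float]]) -> tuple[float, list[float]]:
    """Return (1ᵀ Σ⁻¹ 1, Σ⁻¹ 1 / 1ᵀ Σ⁻¹ 1)."""
    y = solve(sigma, [1.0] * len(sigma))
    total = sum(y)
    return total, [v / total for v in y]


def population(cluster_size: int, rho: float, independents: int) -> list[list[float]]:
    """Correlation matrix: one equicorrelated cluster plus uncorrelated agents."""
    n = cluster_size + independents
    sigma = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in range(cluster_size):
        for j in range(cluster_size):
            if i != j:
                sigma[i][j] = rho
    return sigma


n_eff, weights = n_eff_and_weights(population(cluster_size=3, rho=0.9, independents=2))
print(round(n_eff, 4), [round(w, 4) for w in weights])
```

Five accounts count as roughly three effective agents here, and each puppet receives about a third of the weight of an independent account, matching the closed form k/(1+(k−1)ρ) for the cluster.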
3.2 The autoimmune dilemma — dissolved, not solved
The feared false positive — "an honest local community that genuinely agrees gets punished as colluders" — is not an error under this accounting. Agents correlated through a private shared channel genuinely contribute less independent information; weighting them as n_eff < k is accurate inference, not injustice. There is no binary verdict to get wrong: weight is graduated, continuous, and recoverable (diverge behaviorally and your weight grows). The autoimmune problem was an artifact of demanding a yes/no Sybil classifier; precision accounting never asks the unanswerable question.
What remains open is estimation, not semantics: Σ is n×n and needs regularization. The geographic/vouching graph is the prior on Σ's sparsity structure — locality's fourth job (after latency, birth-vouching, currency zones) is statistical: it tells the estimator where correlation is expected, so deviations are informative. (Privacy: residual correlations are computable over pseudonymous streams via secure aggregation; no real-world identity required — the trilemma's privacy corner holds.)
3.3 The mimicry bound (why this composes with metabolic identity)
The remaining attack is adversarial decorrelation: a puppeteer runs k accounts that simulate independence on every monitored dimension while coordinating on the payload.
Proposition 6 (mimicry cost). As the monitored behavioral dimensions approach the dimensions in which the system pays for work (forecast accuracy on diverse questions, storage/relay/compute service, transaction patterns), maintaining k streams that pass conditional-independence tests requires ≈ k independent streams of real predictive/metabolic work. In that limit, the cheapest way to fake k agents is to be k agents — at which point, from the network's perspective, they are k agents.
Sketch. Passing independence tests on paid dimensions means producing k decorrelated, individually-rewarded performances; decorrelated competent forecasting requires k separately-maintained information positions; decorrelated service requires k resource commitments. The controller's preference correlation can survive — but preferences are exactly where weight is √-dampened (QV) and capped by credits ∝ n_eff earned on the paid dimensions. ∎
This is the formal content of "identity = metabolic signature": deterrence = making mimicry-cost ≥ honest-cost, and the inequality tightens as more of the economy's real work feeds the monitor. Honest scope: this is an asymptotic/arms-race bound, not a closed-form guarantee at any finite monitoring richness (§8, Open-1).
4. Money: demurrage as a control law
4.1 The wealth dynamics
Money supply M, population N (measured in n_eff!), demurrage rate δ. Each wallet decays continuously; the entire decay flow is recycled as the UBI drip δM/N per capita. Agent i earns e_i and spends s_i per unit time:
ẇ_i = −δ·w_i + δM/N + e_i − s_i
Proposition 7 (egalitarian attractor). The steady state is
w_i* = M/N + (e_i − s_i)/δ
with relaxation time 1/δ. Wealth converges to equal plus net-contribution-flow scaled by 1/δ. Demurrage converts unbounded stock inequality into bounded flow inequality: you can only be richer than baseline by (your sustained net flow)/δ.
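The attractor can be seen by integrating the wallet equation directly. This sketch uses a plain forward-Euler step with invented parameters (δ = 0.1, M/N = 100, net flow e − s = 2) and treats M/N as constant; `simulate_wallet` and `steady_state` are hypothetical helper names.

```python
def simulate_wallet(w0: float, delta: float, ubi_base: float, net_flow: float,
                    dt: float = 0.01, t_end: float = 200.0) -> float:
    """Euler-integrate dw/dt = -delta*w + delta*(M/N) + (e - s); ubi_base is M/N."""
    w = w0
    for _ in range(int(round(t_end / dt))):
        w += dt * (-delta * w + delta * ubi_base + net_flow)
    return w


def steady_state(delta: float, ubi_base: float, net_flow: float) -> float:
    """Closed-form attractor w* = M/N + (e - s)/delta."""
    return ubi_base + net_flow / delta


delta, ubi_base, net_flow = 0.1, 100.0, 2.0
target = steady_state(delta, ubi_base, net_flow)
from_poor = simulate_wallet(0.0, delta, ubi_base, net_flow)
from_rich = simulate_wallet(10_000.0, delta, ubi_base, net_flow)
print(target, round(from_poor, 3), round(from_rich, 3))
```

A wallet starting at zero and one starting at 10,000 end at the same balance after about twenty relaxation times, because initial stock decays away and only the sustained net flow survives.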
4.2 The δ-dial: assumption A2 becomes a theorem
The truth machine's aggregation quality requires dispersed resolver budgets (assumption A2). If budgets are proportional to wealth and net earning advantages are bounded by ē, then the largest steady-state budget share is
max_j B_j/ΣB ≤ 1/N + ē/(δM)
To enforce a target concentration bound β, set δ ≥ ē / (M·(β − 1/N)). The demurrage rate is the knob that enforces the epistemic layer's soundness condition. This is the deepest cross-layer coupling in the stack, now an equation: monetary policy ⇄ truth-machine validity. (Caveat: this bounds steady-state stock, not instantaneous flow-through; a high-flow actor can still spend heavily in bursts — burst-spend caps on resolver budgets close that gap.)
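The δ-dial is a direct rearrangement of the share bound, which the following sketch computes and round-trips. The figures (M = 1,000,000, N = 1,000, ē = 50, β = 1%) are invented for illustration, and the function names are hypothetical.

```python
def max_budget_share(delta: float, e_bar: float, money_supply: float,
                     population: float) -> float:
    """Steady-state bound on the largest budget share: 1/N + e_bar / (delta * M)."""
    return 1.0 / population + e_bar / (delta * money_supply)


def min_demurrage(e_bar: float, money_supply: float, population: float,
                  beta: float) -> float:
    """Smallest delta that enforces max share <= beta: e_bar / (M * (beta - 1/N))."""
    headroom = beta - 1.0 / population
    if headroom <= 0.0:
        raise ValueError("beta must exceed the equal share 1/N")
    return e_bar / (money_supply * headroom)


e_bar, money_supply, population, beta = 50.0, 1_000_000.0, 1_000.0, 0.01
delta_min = min_demurrage(e_bar, money_supply, population, beta)
print(round(delta_min, 6), max_budget_share(delta_min, e_bar, money_supply, population))
```

The guard on `headroom` reflects that no demurrage rate can push the largest share below the equal share 1/N.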
4.3 Perfusion (new-user bootstrapping, quantified)
UBI to account i is δM/N · ν_i where ν_i is i's marginal n_eff (§3.1), with the geographic vouching graph supplying the newcomer's prior ν. Consequences:
- A genuine newcomer, locally vouched, starts with modest nonzero perfusion and ramps to full share as their behavioral signature individuates. Saturates at 1 — a ramp, not rich-get-richer.
- A k-account farm shares ≈ one UBI until (per Prop. 6) it performs k agents' worth of real independent work.
- Zones run their own (M_z, δ_z) with exchange at boundaries: local cost-of-living tracking, organ-level metabolic autonomy.
4.4 What needs consensus (roadmap Q3 — answered, with citations)
- Payments need no consensus. Asset transfer with single-owner accounts has consensus number 1 (Guerraoui, Kuznetsov, Monti, Pavlović, Seredinschi, The Consensus Number of a Cryptocurrency, PODC 2019): only the sender's own transaction order matters, and the sender provides it. Byzantine consistent broadcast suffices (implemented in FastPay et al.). Shared k-owner accounts need consensus only among the k owners.
- Demurrage needs no consensus at all. w(t) = w(t₀)·e^{−δ(t−t₀)} + flows is locally verifiable arithmetic.
- Timestamps need only causal entanglement. A forecast hash h, once referenced inside other agents' signed messages, is sandwiched: h existed before everything that cites it and after everything it cites. Backdating requires rewriting the signed causal cone of independent witnesses, and witness independence is, again, n_eff. Gossip rate sets timestamp precision; minutes-level precision is ample for scoring weights w_t. No total order required.
- Only contested allocation needs ordering: auctions for the same scarce item, name registries, zone-level governance execution. These run on small zonal BFT quorums (or, for slow decisions, on the truth machine itself). The stack needs a blockchain nowhere; it needs zonal BFT in one narrow place.
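The timestamp "sandwich" can be sketched as a pair of bounds read off the citation graph. Everything concrete here is an assumption made for illustration: the `cites` map, the `signed_at` field (a time at which some independent witness signed each message), and the restriction to direct citations only; the text does not specify a data model.

```python
from typing import Optional


def sandwich(target: str, cites: dict[str, list[str]],
             signed_at: dict[str, float]) -> tuple[Optional[float], Optional[float]]:
    """Bound when `target` came into existence.

    Lower bound: latest signing time among the messages `target` itself cites.
    Upper bound: earliest signing time among the messages that cite `target`.
    Only direct citations are considered in this sketch.
    """
    cited_times = [signed_at[c] for c in cites.get(target, []) if c in signed_at]
    citer_times = [
        signed_at[m] for m, refs in cites.items() if target in refs and m in signed_at
    ]
    lower = max(cited_times) if cited_times else None
    upper = min(citer_times) if citer_times else None
    return lower, upper


# Hypothetical gossip: forecast hash "h" cites a and b, and is cited by m1 and m2.
cites = {"h": ["a", "b"], "m1": ["h"], "m2": ["h", "a"]}
signed_at = {"a": 10.0, "b": 12.0, "m1": 20.0, "m2": 17.0}
print(sandwich("h", cites, signed_at))
```

The width of the interval shrinks as gossip gets denser, which is the sense in which gossip rate sets timestamp precision; weighting the citing witnesses by n_eff is left out of this sketch.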
5. Architecture theorem: perception global, action zonal
Proposition 8 (reputation is intrinsically portable). R_{i,j} is a pure deterministic function of two public, self-certifying logs: i's signed timestamped forecast stream and j's signed verdict stream. Therefore any network, zone, or fork can recompute any reputation from portable data. No platform can custody reputation; epistemic switching cost ≈ 0 by construction.
This resolves the shared-reality vs. exit-discipline tension of The Living System §6 with an architectural rule:
The perception layer (signed forecast/verdict logs, the world-model) is global, content-addressed, and portable. The action layers (currency, governance, resource allocation) are zonal and exitable. Reality stays shared at the top (assumption A1 gets its large population); discipline and exit operate at the bottom (between-zone and between-network selection stays live); demurrage already makes currency stock a weak lock-in (wealth melts; what persists is flow and portable reputation).
Exit cost is then dominated by social re-coupling, which the portable logs minimize. This is "make exit cheap" as a design invariant rather than an aspiration.
6. What the centralized pilot must measure (roadmap Q5 — specified)
The pilot is now an experiment with defined estimands, not a demo:
- L̂, the deference slope (the bifurcation distance, §2.3). Randomize, per question × resolver, whether current consensus is displayed at verdict time (blind vs. shown), and at what displayed value (within honest jitter). Regression of verdicts on display estimates λ·g′. Success criterion: L̂ measurably < 1 under blind-resolution UI; dose-response visible when shown.
- Σ̂, the forecaster residual correlation structure (feasibility of independence accounting): do residuals (conditional on public info) actually separate known-distinct individuals from deliberately planted sock-puppet pairs (plant some, pre-registered)?
- Properness in practice: do participants report calibrated beliefs under the reference-relative log score (calibration curves, sharpness)?
- VOI routing: A/B question-budget weighting u_q (flat vs. VOI-weighted; Prop. 5, §2.3): does optimization power follow the subsidy without distorting calibration?
Scale for signal: on the order of 30–100 resolvers, 100–300 questions with staggered horizons (science-impact claims fit: 6–24-month resolvability), every forecast and verdict signed and hash-entangled from day one so the timestamping layer (§4.4) is exercised by the same pilot.
7. The viability envelope
The four load-bearing problems from The Living System §8 are now four inequalities. The organism is viable iff:
(V1) Contraction: L = λ·sup|g′| < 1 — perception dominates action
(else: dark room / fiction equilibria)
(V2) Immunity: cost(mimic k agents) ≥ cost(be k agents)
— boundary integrity; n_eff sound
(V3) Dispersion: δ ≥ ē/(M·(β−1/N)) — metabolic turnover enforces A2
(else: whale capture of the epistemic layer)
(V4) Exit: switching cost < tyranny premium
— between-network selection stays live;
held by Prop. 8 (portable perception layer)
plus zonal action layers
(V5) Causal resolution: verdicts are ex-post counterfactual contrasts, not raw
conditional outcomes — else the rule lends proper-scoring
authority to a confounded judgment (futarchy's flaw).
See futarchy-causality.md. Dual to V1: both keep the
market an evidence instrument, never a mechanical decision rule.
These are not independent: V2 (n_eff) is a parameter inside V1's resolver count, V3's population N, and V4's witness security. Boundary integrity is the load-bearing wall, exactly as the anatomy predicted.
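For simulation work it can help to evaluate the four quantitative conditions as one record. The sketch below is only a bookkeeping aid under assumed scalar inputs; `EnvelopeInputs` and `viability` are hypothetical names, the cost and premium figures are placeholders with no prescribed measurement procedure, and V5 is omitted because it is a qualitative design rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvelopeInputs:
    deference_weight: float     # λ
    max_deference_slope: float  # sup|g'|
    mimic_cost: float           # cost to fake k agents
    honest_cost: float          # cost to be k agents
    delta: float                # demurrage rate δ
    e_bar: float                # bound on net earning advantage ē
    money_supply: float         # M
    population: float           # N, measured in n_eff
    beta: float                 # target concentration bound β
    switching_cost: float
    tyranny_premium: float


def viability(x: EnvelopeInputs) -> dict[str, bool]:
    """Evaluate V1-V4 as stated in the envelope; True means the condition holds."""
    headroom = x.beta - 1.0 / x.population
    return {
        "V1": x.deference_weight * x.max_deference_slope < 1.0,
        "V2": x.mimic_cost >= x.honest_cost,
        "V3": headroom > 0.0 and x.delta >= x.e_bar / (x.money_supply * headroom),
        "V4": x.switching_cost < x.tyranny_premium,
    }


healthy = EnvelopeInputs(0.3, 2.0, 10.0, 10.0, 0.01, 50.0, 1e6, 1000.0, 0.01, 1.0, 5.0)
dark_room = EnvelopeInputs(0.9, 5.0, 10.0, 10.0, 0.01, 50.0, 1e6, 1000.0, 0.01, 1.0, 5.0)
print(viability(healthy))
print(viability(dark_room))
```

An agent-based simulation could sweep these inputs and record where each flag flips, which is one way to locate the phase boundaries.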
The grand conjecture (the system's existence theorem, still open): the coupled dynamics — price field (§1), scoring fixed point (§2), weight estimation (§3), wealth flow (§4) — possess a stable joint fixed point whenever V1–V4 hold strictly, and lose it when any is violated. Each subsystem's result above is a lemma toward this; the coupled proof (or agent-based demonstration) is the centerpiece of the Phase-1 theory track.
8. Scorecard: every open question, dispositioned
From roadmap.md "Open questions before Phase 2":
| # | Question | Status |
|---|---|---|
| Q1 | Does the reflexive loop converge? | Answered in principle (§2): linear influence is harmless (Prop. 2); danger is a bifurcation at deference slope L = 1 (Prop. 3); L is measurable (perturbation audits) and controllable (blind resolution, reality-coupling). Remaining: empirical g shape, coupled-system proof (§7). |
| Q2 | Minimal identity primitive: Sybil-resistant + anonymous? | Reframed and substantially answered (§3): no personhood classifier; weight ∝ marginal n_eff from pseudonymous behavioral residuals; geography = prior on Σ; mimicry-cost bound (Prop. 6). Remaining: Σ estimation at scale, monitoring-richness arms race. |
| Q3 | Is global consensus avoidable? | Answered: yes (§4.4). Payments are consensus-number-1 (cited theorem); demurrage is local arithmetic; timestamps = causal entanglement with n_eff-weighted witnesses; only contested allocation needs zonal BFT. |
| Q4 | disp ↔ network boundary? | Answered (§1.5): the network service API is disp's effect algebra — store/send/eval/price = the three MATERIALIZE edges + the price field; canonical hash-consed serialization. |
| Q5 | Smallest meaningful pilot? | Specified (§6): estimands L̂, Σ̂, calibration, VOI-routing; blind/shown randomization; planted sock-puppets; ~30–100 resolvers, ~100–300 questions. |
From the truth-markets synthesis §8:
| # | Problem | Status |
|---|---|---|
| 1 | Reflexivity / beauty contest | → Q1 above. |
| 2 | Resolver honesty unmodeled | Reduced to an inequality (§2.4): per-resolver containment + self-consumption of rankings; honest iff allocation stake > bribe; influence-buyers handled by resolver-side GLS. |
| 3 | Manufactured surprise / bad reference | Mitigated by construction: reference = lagged consensus (the genuine frontier); budgets u_q frozen so they can't be self-moved (Prop. 5). |
| 4 | Collusion / Sybil | → Q2; one mechanism (§3) covers forecaster collusion, resolver blocs, Sybils, and QV's √k attack. |
| 5 | "Extrapolated intention/utility" undefined | Defined (§2.5): decision-value-of-information, computed by self-application of the engine; regress terminates at realized allocation outcomes. |
Genuinely still open (the honest residue):
- The arms race floor (V2 at finite richness). Prop. 6 is asymptotic; how much monitored behavioral diversity suffices in practice is empirical.
- The shape of g. L < 1 is measurable, but we don't know human deference response curves in this setting until the pilot runs.
- Preference-side depth. n_eff-gated QV fixes Sybil/collusion, and the fact/preference split (delegate facts to the market, keep preferences quadratic and personal) is justified, but full liquid preference delegation semantics under independence accounting (delegation is voluntary correlation) needs its own treatment. Current recommendation: delegation operates on the epistemic side only.
- Open-economy monetary dynamics. Inter-zone exchange rates, speculative pressure on a demurrage currency, and the (dubious) external peg all need real macro analysis.
- Substrate market under non-convexity. Prop. 1 assumes price-taking and convex costs; lumpy storage commitments and relay market power need mechanism design (posted-price vs. auction hybrids).
- The grand conjecture (§7): the coupled fixed-point/stability proof — the actual "does the organism live" theorem. The Phase-1 agent-based simulation should target exactly the V1–V4 phase boundaries.
9. The whole design in five equations
(1) Substrate: cache/replicate/memoize iff λ_v(x)·p(v,x) > c_s(x)·|v| ;
profit = −∇F under marginal-cost prices. (§1)
(2) Perception: p* = λ·g(p*) + (1−λ)·π ; truth-tracking iff L < 1. (§2)
(3) Identity: n_eff = 1ᵀΣ⁻¹1 ; all weight (votes, UBI, witness,
aggregation) issued per unit marginal n_eff. (§3)
(4) Metabolism: ẇ = −δw + δM/N + e − s ; w* = M/N + (e−s)/δ ;
δ ≥ ē/(M(β−1/N)) enforces dispersion (A2). (§4)
(5) Architecture: perception global & portable (R = pure fn of logs);
action zonal & exitable. (§5)
Five equations, four viability inequalities (V1–V4), one conjecture (§7). That is the mathematical core.
Futarchy and Causality
Part III · The Synthesis — Chapter 12. A deep dive on the deepest objection.
Evaluates the retroactive consensus market (and its formal core, The Mathematical Core) against Dynomight's "Futarchy's fundamental flaw" (dynomight.net/futarchy). Question: does the conditional-vs-causal critique of futarchy sink our design, and if not, exactly which parts survive? This is the causal twin of the reflexivity problem.
1. Dynomight's flaw, stated precisely
Futarchy = make decisions with conditional prediction markets: run P(value | do action A) and P(value | do action B), take the action with the better conditional. The flaw:
Conditional markets reveal probabilistic relationships P(Y | X=x), but decisions need causal ones P(Y | do(X=x)). These differ whenever there is reverse causality or confounding.
Five objections, in increasing depth:
- Reverse causality — the conditioning event may be an effect, not a cause (a falling stock causes the firing).
- Confounding via revelation — taking action A reveals information about the decision-maker (the board that fires Musk reveals hostility → predicts other value-destroying acts), so the conditional price reflects the action's signal, not its effect.
- Real-policy confounding — same structure for policy (a no-fly-zone declaration prices in what it reveals about the leader's temperament, not just the policy's direct effect).
- Pre-commitment doesn't save you — when a market activates conditionally on the decision, bidders condition on activation and order is not preserved (the trick-coin example: you bid more on the branch that cancels-and-refunds when unfavorable, because you're insured).
- Impossibility theorem — no payout function of (bid, final price, outcome) can both force truthful bids and preserve causal information, given conditional cancellation.
Dynomight's own prescription: conditional markets are not worthless — treat them "like observational statistics: one piece of evidence, considered skeptically," never a standalone causal decision rule. The flaw can be fixed with causal-inference machinery, but "none are free."
2. Why our base mechanism is not the object the theorem is about
The impossibility theorem (objections 4–5) is a theorem about decision markets with conditional cancellation: one branch is taken, the other is refunded, and the selection-on-activation distorts bids. Three features of our design break that frame at the base layer:
- Decoupling. Forecasters publish timestamped distributions and are scored by a proper rule against resolver verdicts. They do not bet in a pool with refunds. There is no zero-sum activation branch to be insured against. For any unconditional question ("will Health be high in 2030?") this is pure forecasting; proper scoring is truthful; the theorem is simply irrelevant.
- Retroactive, optional, per-resolver resolution. There is no single point-in-time mechanical resolution. Resolution happens later, by a human, who may decline.
- The target is resolver consensus, not an observed outcome. The market predicts what thoughtful resolvers will eventually judge, not a mechanical readout of a price or statistic.
So the base layer (elicitation + aggregation) is not a decision market and does not inherit the impossibility result. What it produces is exactly the object Dynomight endorses: an aggregated observational/expert-judgment instrument. The flaw can only re-enter where we close the loop — wire a conditional forecast mechanically into a decision. That is the governance application, §4.
3. The key move: retroactivity converts an impossibility into a competence requirement
Dynomight assumes resolution is mechanical and ex ante: the market resolves on an observed outcome (the realized stock price, the realized death rate), and that observed outcome is confounded. Against mechanical resolution, the flaw is a theorem — provably unfixable by any payout function.
Our resolution is judgmental and ex post: a resolver, with hindsight and data, declares a verdict. The decisive consequence:
The causal question is relocated from ex-ante market pricing (where Dynomight proves it cannot be solved) to ex-post human judgment (where it is merely hard — and where causal inference tools actually work: RCTs, natural experiments, synthetic controls, difference-in-differences all operate ex post).
Concretely, the resolver is asked a counterfactual contrast, not a raw conditional: not "was Health high after X?" but "relative to the no-X counterfactual, did X advance Health?" A resolver in 2030, with five years of data and comparison jurisdictions, can attempt P(Health | do(X)). A forward market pricing P(Health | X) in 2025 cannot. Retroactivity is not incidental to the mechanism; it is the structural feature that sidesteps the impossibility result, by deferring the causal question to the time and the agent where evidence exists.
The price of this move is honest and specific: we trade a provable impossibility for a load-bearing assumption — that resolvers judge counterfactually and competently. Impossible becomes hard. That is a good trade, but it must be named as a soundness condition (§6).
4. Where the flaw fully survives: the governance coupling
The Ultimate Governance application says voters back "policies that require sub-policies to be structured around the market's estimates" — e.g. "to what degree does policy X advance goal Health?" Read mechanically — fund X iff the conditional market says X advances Health — this is futarchy and inherits objections 1–5 in full, including the impossibility theorem, because:
- It is a conditional question (Health given X).
- For the untaken branch (Health given ¬X, when X is enacted) the counterfactual world is never observed, so that branch is structurally unresolvable-by-observation — exactly the cancellation asymmetry the theorem formalizes. Our design does not refund it (so the trick-coin insurance distortion is removed), but it must resolve it by resolver best-guess, importing resolver judgment error in place of a clean order-violation.
The remedy is a hard design rule, and it is Dynomight's own prescription:
The market output is an observational instrument feeding human causal judgment, never a mechanical decision rule. Voters set preferences (QV over goals) and use market estimates as skeptically-weighted evidence; the human preference/judgment loop stays open. The moment governance hard-wires "allocate ∝ conditional estimate," we recreate the flaw.
This is the same loop-closure danger as reflexivity (§5), and the same fix: keep perception (market) and action (allocation) coupled through human judgment, not through a mechanical identity.
5. Dynomight's flaw and our reflexivity condition are the same shape
Both critiques say: a market measures what it is scored against, which need not be what you want.
- Dynomight (against observed outcomes): you get P(Y|X), you wanted P(Y|do X).
- Ours (against resolver verdicts, mathematical-core.md §2): you get predicted resolver consensus, you wanted truth — and if resolvers defer to the market (deference slope L→1), the loop self-fulfills.
They compound. Our causal validity is upper-bounded by resolver causal competence and degraded below it by reflexivity:
causal validity ≤ (resolver counterfactual competence) − (reflexivity loss, growing in L)
The system can faithfully aggregate and forecast expert causal judgment (valuable — a continuous, incentivized, n_eff-weighted panel of careful retrospective evaluators); it cannot exceed the causal quality of those evaluators, and it can fall short of it if the deference loop closes. It is not a causal oracle, and we should never market it as one.
6. New soundness condition: causal resolution (V5)
The viability envelope (mathematical-core.md §7) gains a fifth inequality, dual to V1:
(V5) Causal resolution: verdicts are counterfactual contrasts judged ex post with evidence,
not raw conditional outcomes.
Failure mode: the rule perfectly incentivizes predicting a CONFOUNDED judgment,
lending false proper-scoring authority to a correlation. Worse than no system.
V5 has concrete, testable mechanisms — and several fall out of features the design already has:
- Counterfactual question framing. Resolution prompts ask "effect relative to the counterfactual," and the resolution UI supplies the counterfactual scaffolding (comparison units, pre-trends).
- Optionality as causal hygiene (already in the mechanism). Resolvers may discard questions they judge hopelessly confounded (truth-markets §3.1). Forecasters won't be paid for predicting confounded conditionals because thoughtful resolvers won't resolve them → causally-hopeless questions are endogenously deprioritized. Double-edged: the system goes quiet exactly on the hardest causal questions, which may be the most decision-relevant.
- Per-resolver independence as robustness (already in the mechanism). No single confounded resolution; you predict a distribution of independent counterfactual judgments, n_eff-weighted (§3 of the math core). Independent errors wash out; shared confounders (every resolver fooled the same way) do not — so V5 failures are correlated-error failures, and the n_eff immune machinery partially detects them (a confounded consensus looks like a low-independence bloc).
- Slow payout buys causal evidence (the user's "distributed slowly over time"). Because resolution is not pinned to an instant, the resolver can wait for causal evidence to arrive — the natural experiment to mature, the RCT to publish — before declaring. Slow distribution does not fix confounding at the mechanism level; it grants the time and removes the single attackable resolution instant that makes ex-ante markets fragile.
7. Scorecard: each objection against our design
| # | Objection | Verdict on our design |
|---|---|---|
| 1 | Reverse causality | Relocated, not eliminated. Bites the resolver's judgment, not the mechanism. Mitigated by ex-post counterfactual framing (V5); resolver in hindsight can see which way causation ran. |
| 2 | Confounding via revelation (decision-maker signal) | Relocated to resolver. Resolver can be asked to isolate X's own effect ("ignoring what else the administration did"). Whether they can is the V5 competence assumption — honestly hard. |
| 3 | Real-policy confounding | Same as 2. |
| 4 | Activation-selection distortion (trick coin) | Base layer: does not apply (no conditional refund; decoupled proper scoring). Governance layer: structurally present for counterfactual branches, but handled by resolver best-guess rather than refund — removes the insurance distortion, imports judgment error. |
| 5 | Impossibility theorem | Base layer: out of scope (not a decision market). Governance layer: converted from impossibility to V5 competence requirement by ex-post human resolution (§3). The mechanical version remains impossible; we don't run the mechanical version. |
8. Bottom line
Dynomight is right, and the critique improves our design rather than refuting it:
- Vanilla futarchy resolves mechanically and closes the loop → it hits the impossibility wall. Our design resolves judgmentally ex post and keeps the loop open → it lands, by construction, in the "useful observational instrument, considered skeptically" regime that Dynomight explicitly endorses. In a real sense the retroactive consensus market is futarchy built the only way Dynomight says it could work.
- The single most important design rule that follows: never let governance convert a conditional market estimate into a mechanical allocation. Market = perception (evidence); humans = action (preference + final causal judgment). This is identical to the V1 anti-reflexivity rule and should be enforced as one principle.
- The single new soundness condition (V5): the system's causal validity is capped by resolver counterfactual competence. The proper-scoring machinery is dangerous precisely because it can lend rigorous-looking authority to a perfectly-predicted confounded judgment. Resolution methodology (counterfactual framing, discard rights, waiting for evidence, independence weighting) is therefore not a detail — it is a first-class soundness layer, co-equal with the scoring rule.
What we cannot claim: that the market produces causal knowledge. What we can claim: that it incentivizes, aggregates, and forecasts the counterfactual judgments of an independent panel of careful ex-post evaluators, at scale, with no single attackable resolution instant — and that this is the best a market can do, given that Dynomight proved the mechanical alternative impossible.
Notation & Glossary
Appendix A. Every symbol used in the book, defined once.
The chapters introduce symbols where they're first needed; this is the single place they're all collected. A few letters are reused with different meanings in different layers — those collisions are flagged explicitly at the bottom, because they trip up first-time readers.
Populations and objects
| Symbol | Meaning | First used |
|---|---|---|
| `F` | the set of resolvers: capital holders who retroactively declare verdicts | Mechanism |
| `P` | the set of forecasters: specialists who publish predictions | Mechanism |
| `Q` | the set of candidates / questions under evaluation | Mechanism |
| `q` | a single question, with outcome space Ω_q | Mechanism |
| `i`, `j` | index a forecaster (i ∈ P) and a resolver (j ∈ F) | Mechanism |
The scoring engine
| Symbol | Meaning | First used |
|---|---|---|
| `p_{i,q,t}` | forecaster i's probability distribution on q at time t | Mechanism |
| `r_{j,q}` | resolver j's subjective verdict on q | Mechanism |
| `S(p, ω)` | strictly proper scoring rule; canonically log score ln p(ω) | Mechanism |
| `p_ref` | reference belief (consensus just before an update); the subtraction baseline | Mechanism |
| `w_t` | time weight (can up-weight earlier forecasts) | Mechanism |
| `ρ_{i,j,q}` | reward to i from j on q: marginal movement toward the verdict | Mechanism |
| `R_{i,j}` | forecaster i's reputation with resolver j = Σ_q ρ_{i,j,q} | Mechanism |
| `B_j` | resolver j's reward budget | Mechanism |
| `p*` | the money-optimal forecast (= capital-weighted predicted consensus) | Mechanism |
| `EU(i→j)` | decision-value of i's information to j ("extrapolated utility") | Math Core §2.5 |
| `u_q` | per-question reward-budget factor, set ∝ value-of-information | Math Core §2.5 |
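The scoring-engine symbols can be tied together in a few lines. This is a sketch under stated assumptions: the glossary only says the reward is "marginal movement toward the verdict" relative to `p_ref`, so the exact form `w_t * (S(p, r) - S(p_ref, r))` and all function names below are hypothetical, chosen to illustrate reference subtraction with the canonical log score.

```python
import math


def log_score(p: dict, outcome: str) -> float:
    """Log score S(p, ω) = ln p(ω)."""
    return math.log(p[outcome])


def reward(p_new: dict, p_ref: dict, verdict: str, w_t: float = 1.0) -> float:
    """ASSUMED form of ρ_{i,j,q}: time-weighted score gain over the reference belief."""
    return w_t * (log_score(p_new, verdict) - log_score(p_ref, verdict))


def expected_reward(report: dict, belief: dict, p_ref: dict) -> float:
    """Expected reward when verdicts are drawn from the forecaster's own belief."""
    return sum(belief[o] * reward(report, p_ref, o) for o in belief)


def reputation(rewards: list) -> float:
    """R_{i,j} = Σ_q ρ_{i,j,q}: sum of per-question rewards from one resolver."""
    return sum(rewards)


p_ref = {"yes": 0.5, "no": 0.5}
belief = {"yes": 0.7, "no": 0.3}
honest = expected_reward(belief, belief, p_ref)
exaggerated = expected_reward({"yes": 0.9, "no": 0.1}, belief, p_ref)
```

Two properties are visible here: repeating the reference earns exactly zero, and reporting one's actual belief has a higher expected reward than exaggerating it, which is what strict properness means for the log score.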
Reflexivity
| Symbol | Meaning | First used |
|---|---|---|
| `θ` | the true state of a question | Math Core §2 |
| `π` | a forecaster's honest posterior given evidence | Reflexivity |
| `λ` | deference weight: fraction of a verdict driven by the published forecast | Reflexivity |
| `g(p)` | the resolver's deference-response function | Reflexivity |
| `L` | deference slope = λ · sup\|g'\|; safety condition L < 1 (= V1) | Reflexivity |
| `σ(θ)` | signal-driven verdict probability (the non-deferring part) | Math Core §2 |
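The role of the threshold L < 1 can be seen in a small fixed-point experiment. This is a sketch only: the verdict model `(1 − λ)·σ + λ·g(p)`, the logistic shape chosen for `g`, and every parameter value are assumptions made for illustration, not the Math Core's definitions. The forecast is iterated against the verdict probability it induces.

```python
import math


def g(p: float, s: float) -> float:
    """Hypothetical deference-response curve: logistic centred at 0.5, sup|g'| = s / 4."""
    return 1.0 / (1.0 + math.exp(-s * (p - 0.5)))


def deference_slope(lam: float, s: float) -> float:
    """L = λ · sup|g'| for the logistic g above."""
    return lam * s / 4.0


def verdict_prob(p: float, sigma: float, lam: float, s: float) -> float:
    """ASSUMED verdict model: signal-driven part plus deference to the forecast p."""
    return (1.0 - lam) * sigma + lam * g(p, s)


def iterate(p0: float, sigma: float, lam: float, s: float, steps: int = 500) -> float:
    """Repeatedly set the forecast to the verdict probability it induces."""
    p = p0
    for _ in range(steps):
        p = verdict_prob(p, sigma, lam, s)
    return p


# L = 0.5: both starting forecasts are pulled to the same signal-driven value.
safe_hi = iterate(0.9, sigma=0.5, lam=0.5, s=4.0)
safe_lo = iterate(0.1, sigma=0.5, lam=0.5, s=4.0)

# L = 1.6: the same signal now supports two self-fulfilling outcomes.
loop_hi = iterate(0.6, sigma=0.5, lam=0.8, s=8.0)
loop_lo = iterate(0.4, sigma=0.5, lam=0.8, s=8.0)
```

With L below 1 the map is a contraction, so the starting forecast is forgotten. Above 1, where the forecast starts determines where the verdict ends up, which is the self-fulfilling loop V1 is meant to exclude.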
Independence (the immune system)
| Symbol | Meaning | First used |
|---|---|---|
| `Σ` | residual-correlation matrix across agents (after conditioning on public info) | Independence |
| `n_eff` | effective population = 1ᵀ Σ⁻¹ 1; the stack's one security parameter | Independence |
| `ρ` (here) | pairwise correlation within a Sybil cluster, not the reward ρ above | Independence |
| `k` | number of accounts in a cluster (or replication count, §1.4) | Independence |
| `h` | honest fraction of an executor pool | Math Core §1.4 |
| `ν_i` | agent i's marginal n_eff, gating its UBI share | Math Core §4.3 |
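The formula n_eff = 1ᵀ Σ⁻¹ 1 is easy to compute directly. The sketch below uses a small hand-written linear solver so it stays dependency-free. The helper names and the equicorrelated test matrix are assumptions introduced for illustration. For a cluster of k accounts with pairwise correlation ρ, the formula reduces to k / (1 + (k − 1)ρ), which gives a convenient check.

```python
def solve(a: list, b: list) -> list:
    """Solve a·x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("correlation matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        pv = m[col][col]
        m[col] = [v / pv for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] for i in range(n)]


def n_eff(sigma: list) -> float:
    """Effective population 1ᵀ Σ⁻¹ 1, computed as the sum of the solution of Σ·x = 1."""
    return sum(solve(sigma, [1.0] * len(sigma)))


def equicorrelated(k: int, rho: float) -> list:
    """Hypothetical test case: k agents, every pair correlated at rho."""
    return [[1.0 if i == j else rho for j in range(k)] for i in range(k)]


independent = n_eff(equicorrelated(4, 0.0))     # four independent agents count as four
half = n_eff(equicorrelated(4, 0.5))            # four half-correlated agents count as 1.6
sybil_farm = n_eff(equicorrelated(100, 0.99))   # 100 near-copies count as about one
```

The last case shows why the quantity works as a Sybil defence: adding near-duplicate accounts barely moves n_eff.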
Money (the metabolism)
| Symbol | Meaning | First used |
|---|---|---|
| `δ` | demurrage rate: decay-unless-circulated; the dispersion dial | Value as Flow |
| `M` | total money supply | Math Core §4 |
| `N` | population, measured in n_eff units | Math Core §4 |
| `w_i` | wallet balance of agent i | Math Core §4 |
| `e_i`, `s_i` | agent i's earn / spend rates | Math Core §4 |
| `ē` | bound on net earning advantage | Math Core §4.2 |
| `β` | target concentration bound on resolver budgets | Math Core §4.2 |
The substrate
| Symbol | Meaning | First used |
|---|---|---|
| `MATERIALIZE(V, R)` | make value V available at spacetime region R | Substrate |
| `D(v, x, t)` | demand field: request intensity for value v at location x, time t | Math Core §1 |
| `p(v, x)` | price field: posted price of delivering v at x (the substrate's p) | Math Core §1 |
| `λ_v(x)` | local demand intensity for v at x, not the deference weight λ | Math Core §1.2 |
| `c_s`, `c_b`, `c_c` | storage / bandwidth / compute cost coefficients | Math Core §1.1 |
| `work(f)` | reduction-step count of f (the canonical compute unit, from tree calculus) | Math Core §1.1 |
Viability inequalities
The system is viable iff these hold (see Math Core §7):
| | Condition | Plain meaning | Chapter |
|---|---|---|---|
| V1 | L < 1 | perception dominates action (no dark room) | Reflexivity |
| V2 | cost(mimic k) ≥ cost(be k) | boundary integrity; n_eff is sound | Independence |
| V3 | δ ≥ ē/(M(β−1/N)) | turnover keeps capital dispersed (enforces A2) | Value as Flow |
| V4 | switching cost < tyranny premium | exit keeps between-network selection alive | Living System |
| V5 | verdicts are ex-post counterfactual contrasts | the market stays evidence, never a mechanical rule | Futarchy and Causality |
The five assumptions A1–A5 (shared reality, dispersed capital, repeated game, tamper-evident timestamps, non-reflexivity) are defined in The Five Assumptions.
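As a worked instance of V3, the inequality δ ≥ ē/(M(β − 1/N)) can be turned into a one-line calculator. This is a sketch: the function names and the sample numbers are assumptions, and the units of ē, M and δ are left as whatever Math Core §4 fixes them to be. The only structural point it relies on is that the denominator must be positive, which requires β > 1/N.

```python
def min_demurrage(e_bar: float, money_supply: float, beta: float, population: float) -> float:
    """Smallest δ satisfying V3: δ ≥ ē / (M · (β − 1/N))."""
    slack = beta - 1.0 / population
    if slack <= 0:
        raise ValueError("V3 needs beta > 1/N; otherwise the bound is undefined")
    return e_bar / (money_supply * slack)


def v3_holds(delta: float, e_bar: float, money_supply: float, beta: float, population: float) -> bool:
    """True when the chosen demurrage rate meets the V3 floor."""
    return delta >= min_demurrage(e_bar, money_supply, beta, population)


# Illustrative numbers only: ē = 10, M = 1000, β = 0.05, N = 100.
floor = min_demurrage(10.0, 1000.0, 0.05, 100.0)
```

With these numbers the floor comes out at about 0.25. Tightening the concentration target β toward the equal share 1/N drives the required rate up without bound.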
⚠️ Notation collisions
The same letter carries different meanings across layers. The four to watch:
- `F` = the resolver set in the engine, but the free-energy functional `F_a` in Math Core §0. (Context always disambiguates: a set of people vs. a quantity to minimize.)
- `ρ` = the reward contribution `ρ_{i,j,q}` in the engine, but a pairwise correlation in the `n_eff` formula.
- `λ` = the deference weight in reflexivity, but `λ_v(x)` is local demand intensity in the substrate.
- `p` = a forecast distribution `p_{i,q,t}` in the engine, but `p(v,x)` is the price field in the substrate.
These overlaps are inherited from each layer's own conventional notation; they were kept rather than replaced with invented, non-standard symbols.
Roadmap: The Decentralization Stack (DRAFT)
Appendix B. The engineering / build-plan view. For the conceptual development, read the chapters in order starting from the overview.
Working draft. The goal: map the path from programming language → routing → currency/incentives → data & compute trading → governance, identify how the layers depend on each other, and call out what's missing at each step.
The stack at a glance
| # | Layer | Project / doc | Maturity |
|---|---|---|---|
| 1 | Language & verification | disp (disp/disp.md, libdither/disp) | Working prototype. Kernel + elaboration stages 0–3 self-hosted; effects, erasure, optimizer pending. |
| 2 | Routing & data | Dither: DAR (dither/02-routing.md), DTS, RHL, identity | Detailed design (DAR) → sketches (DTS/RHL/identity). No implementation; simulator work exists (dither-sim). |
| 3 | Currency & incentives | applications/dither-currency.md, routing incentives, applications/fractional-funding.md | Sketch. Economics ideas, no consensus design, no threat model. |
| 4 | Data & compute trading | dither/decentralized-data-ideas.md | Research agenda only. No protocol design. |
| 5 | Governance / truth machine | Retroactive consensus markets + hierarchical liquid quadratic voting (Retroactive Consensus Markets); applications/protocol-of-truth | Rigorous mechanism synthesis now in repo; central theory gap (reflexivity) reduced to a measurable threshold (mathematical-core.md §2). |
The truth machine in one line: forecasters publish timestamped probability distributions; capital-holding resolvers privately and retroactively declare what they believe happened; a reference-relative proper scoring rule pays forecasters for moving belief toward eventual resolver consensus. Output: a live world-model and a skill ranking, with no exploitable point-in-time oracle. Governance bolts this onto liquid quadratic voting: voters set goals, the market estimates which policies advance them, skill rankings can weight estimates.
How the layers interact
These interactions are the actual argument for building this as one stack rather than five projects:
1. The truth machine fills the currency's biggest hole. dither-currency.md proposes a periodic redistribution from all wallets into a pot managed by an "intelligent democratic mechanism", which is left undefined. Liquid QV + retroactive consensus markets is that mechanism. Conversely, the governance layer needs a native unit for resolver budgets `B_j`, QV credits, and forecaster payouts.
2. The currency's anti-hoarding mechanic supports the market's key assumption. The mechanism's aggregation quality degrades as capital concentrates (assumption A2, dispersed capital). A demurrage/redistribution currency structurally pushes against concentration. These two designs reinforce each other and should be co-designed, not bolted together.
3. Identity is upstream of everything quadratic. QV is meaningless without Sybil resistance; the prediction market's reference-subtraction can be farmed by Sybil forecasters; routing incentives and data markets need persistent pseudonyms for reputation. Dither's web-of-trust identity layer (dither/user-management.md) is currently an idea sketch but is a hard dependency of layers 3–5. It deserves promotion from "application concern" to core protocol.
4. Tamper-evident timestamps are a shared primitive. Forecasts must not be backdatable (assumption A4); the currency needs transaction ordering; RHL needs some consensus on link sets. One minimal ordering/timestamping/data-availability primitive serves all three, and it is likely the only global consensus the stack needs. This reframes the consensus question (old_ideas/consensus/): we don't need general smart-contract consensus, we need cheap verifiable timestamping plus a payments ledger.
5. disp is the verification substrate for compute trading. Selling computation requires the buyer to trust the result. disp's program-as-data + types-as-predicates + provable-equivalence story is exactly a verifiable-compute story: a compute offer is a content-addressed program plus a disp predicate the result must satisfy; disputes resolve by re-execution or proof checking. No design for this exists yet; it is the main missing bridge between layers 1 and 4.
6. disp can make markets machine-resolvable. A question's outcome space and resolution criterion can be a disp predicate over published data. Resolvers can then delegate verdicts to programs for the objective subset of questions, reserving human judgment for ambiguous ones. This also unifies the quantitative truth machine with the qualitative Protocol of Truth assertion graphs: assertions become market questions; debate structure becomes evidence attached to them.
7. Per-resolver subjectivity matches Dither's polycentric philosophy. Reputation `R_{i,j}` is relative to each resolver's worldview: no global truth oracle, just per-perspective rankings. Same shape as web-of-trust identity and the polycentric model. The stack is philosophically coherent: subjective-but-aggregable all the way down.
8. "Pay for predictive accuracy" recurs at every layer. DAR's network world-models reward predicting latency/bandwidth; data markets reward predicting demand for caching; the truth machine rewards predicting consensus. A shared scoring/staking framework could serve all three.
What's missing — per layer
Layer 1 — disp
- Effects (`Eff` monad) and the erasure (strip) pass, needed before disp programs can do real I/O efficiently.
- Self-hosted parser + module resolution (elaborator stages 4–5), the last host-trusted pieces.
- Optimizer / synthesis engine (the long-term payoff; not on the critical path for the stack).
- Stack-specific gap: serialization + content-addressed code distribution format, and any networking story at all. Nothing connects disp to Dither today beyond intent.
Layer 2 — routing & data
- DAR implementation; incentive game theory unresolved (pricing/bargaining for relays).
- DTS rare-data problem; RHL is three sentences of architecture.
- Validation gap: no simulation results backing the routing-coordinate + HORNET design. Reviving `dither-sim` against the current spec is the cheapest de-risking step.
Layer 3 — currency
- No concrete consensus/ledger design (Stellar+IOTA hybrid is a preference, not a spec).
- Demurrage/redistribution mechanics, zone design, and the proof-of-external-destruction USD peg all need real economic analysis — the peg in particular is likely unsound and should be re-examined.
- No threat model (eclipse attacks, zone capture, fee-less spam).
Layer 4 — data & compute trading
- Essentially everything: storage market protocol, bandwidth settlement (micropayments per relayed byte?), verifiable compute protocol (see interaction 5), pricing discovery, comparison against Filecoin/Arweave/Golem to know what to copy vs. reject.
Layer 5 — governance / truth machine
- Reflexivity (A5) is the central theoretical gap: when skill rankings feed back into the votes that constrain resolvers, does the loop converge to truth or to a stable fiction? Needs agent-based simulation and/or a formal treatment before any high-stakes deployment.
- Resolver honesty is unmodeled once verdicts gate public money.
- "Retroactive reward by extrapolated intention/utility" is a desideratum, not a defined operator.
- Hierarchical/liquid structure of the QV layer is a TLDR, not a design (delegation mechanics, credit issuance, collusion resistance).
- ~~The truth-markets notes live outside this repo and should be merged as a spec chapter.~~ Done: merged as Retroactive Consensus Markets (mechanism, assumptions A1–A5, open problems §8).
Cross-cutting
- Identity/Sybil-resistance primitive (blocks 3, 4, 5).
- Timestamping/ordering primitive (blocks 3, 5, RHL).
- A stack-wide threat model document.
- A dependency-ordered build plan — which this document attempts below.
Sequencing principle
Two observations drive the ordering:
- The truth machine does not need the rest of the stack to be tested. A centralized pilot (a web service with signed, timestamped forecasts and a handful of resolvers) tests the mechanism, the scoring rule, and empirically probes reflexivity — long before Dither routing or currency exist. The science impact-market is the natural pilot (low stakes, measurable outcomes, motivated forecasters).
- Each layer should ship something independently useful (the "Useful" tenet), while producing the primitive the next layer needs.
Phased roadmap (draft)
Phase 0 — Consolidation (weeks)
- ✅ Merge truth-markets notes into this spec (`governance/` chapter): mechanism, assumptions A1–A5, open problems. See Retroactive Consensus Markets.
- Write the cross-layer dependency map (this doc) and a first threat-model outline.
- Decide the identity and timestamping primitives' requirements (consumers: currency, markets, governance).
Phase 1 — Independent kernels (months, parallelizable)
- disp: land effects + erasure + stages 4–5 → a language someone can actually write a tool in. Then: content-addressed module distribution format.
- Routing: revive `dither-sim`; validate routing coordinates + DAR relay selection in simulation; publish results. Implementation only after simulation says the design works.
- Truth machine: build the centralized MVP (forecast publication, per-resolver retroactive scoring, skill rankings). Run the science impact-market pilot. Instrument it to study reflexivity and reference-belief gaming.
- Theory: agent-based simulation of the reflexive governance loop; formalize or refute convergence.
Phase 2 — Shared primitives (after Phase 1 signals)
- Web-of-trust identity protocol (design + prototype) — promoted to core.
- Minimal timestamping/data-availability layer (this is "the consensus decision," scoped down).
- Currency testnet on top of both; routing incentives denominated in it (first real economic loop: pay relays).
- Truth machine v2: forecasts anchored to the timestamping layer; identity-backed forecaster accounts (Sybil-resistant reputation).
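One cheap way to picture "forecasts anchored to the timestamping layer" is a hash chain, where every entry commits to its predecessor so that no earlier forecast can be altered or back-inserted without breaking every later hash. This is a minimal sketch under loud assumptions: the entry layout, `GENESIS`, and all function names are hypothetical, and a bare hash chain gives tamper-evident ordering only, not the data-availability or decentralization properties the real primitive would need.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical anchor for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append(log: list, payload: dict) -> None:
    """Append a forecast; its hash commits to the whole history so far."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"prev": prev, "payload": payload, "hash": entry_hash(prev, payload)})


def verify(log: list) -> bool:
    """Recompute every link; any edit or reordering breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev:
            return False
        if entry["hash"] != entry_hash(entry["prev"], entry["payload"]):
            return False
        prev = entry["hash"]
    return True


log = []
append(log, {"forecaster": "alice", "question": "q1", "p_yes": 0.6})
append(log, {"forecaster": "bob", "question": "q1", "p_yes": 0.7})
intact = verify(log)

# Backdating attempt: alice rewrites her earlier forecast after the fact.
log[0]["payload"]["p_yes"] = 0.95
tampered_detected = not verify(log)
```

In a deployed system the latest hash would itself need to be published somewhere others can witness it. That witnessing step is what this sketch leaves out.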
Phase 3 — Markets
- Storage/bandwidth trading over DAR + currency (DTS trail hosting becomes a paid service).
- Verifiable compute protocol: disp predicates as result contracts; dispute resolution by re-execution.
- Machine-resolvable market questions (disp predicates as resolution criteria) feeding the truth machine.
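The verifiable-compute item can be sketched end to end, with Python standing in for disp. Everything here is an assumption made for illustration: the offer layout, the function names, and the use of `exec` (which is not sandboxed and would never be acceptable for untrusted code). The shape is the one described above: an offer is a content-addressed program plus a predicate on the result, and a dispute is settled by re-execution.

```python
import hashlib
from collections import Counter


def content_address(source: str) -> str:
    """Content address of a program: the SHA-256 of its source text."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def run_program(source: str, arg):
    """Execute a program that defines `run(x)`. Illustration only: no sandboxing."""
    namespace = {}
    exec(source, namespace)
    return namespace["run"](arg)


def make_offer(source: str, predicate) -> dict:
    """A compute offer: which program to run, and what its result must satisfy."""
    return {"program": content_address(source), "predicate": predicate}


def accept(offer: dict, source: str, arg, claimed) -> bool:
    """Cheap check by the buyer: right program, and the result contract holds."""
    return content_address(source) == offer["program"] and offer["predicate"](arg, claimed)


def dispute(offer: dict, source: str, arg, claimed) -> bool:
    """Expensive check: re-execute and compare. True means the claim is upheld."""
    return content_address(source) == offer["program"] and run_program(source, arg) == claimed


def sorted_permutation(inp: list, out: list) -> bool:
    """Result contract for a sort: output is ordered and has the same elements."""
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    return ordered and Counter(inp) == Counter(out)


SORT_SRC = "def run(xs):\n    return sorted(xs)\n"
offer = make_offer(SORT_SRC, sorted_permutation)
honest_ok = accept(offer, SORT_SRC, [3, 1, 2], [1, 2, 3])
cheat_ok = accept(offer, SORT_SRC, [3, 1, 2], [1, 2, 2])
```

The predicate is what makes acceptance cheaper than re-running the work. Re-execution remains the fallback when a result passes the predicate but is still contested.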
Phase 4 — Governance
- Liquid QV pilot in a small, real community — the obvious candidate is the Dither project itself: fractional-funding contributions allocated by QV, informed by truth-machine estimates of "which roadmap item most advances goal X." Recursive self-governance satisfies the "Self-reliant" tenet and is the cheapest honest test.
- Scale outward (open-source funding, DAO treasuries, scientific funding bodies) only after the reflexivity question has empirical answers.
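For readers new to quadratic voting, the core arithmetic is that casting v votes on one option costs v² credits. The sketch below covers only that flat rule. The ballot layout, the credit budget, and the function names are assumptions, and the hierarchical/liquid delegation layer is deliberately not modeled, since the text above notes it is not yet designed.

```python
import math


def vote_cost(votes: int) -> int:
    """Quadratic cost: v votes (for or against) cost v^2 credits."""
    return votes * votes


def max_votes(credits: int) -> int:
    """Most votes a voter can put on a single option with the given credits."""
    return math.isqrt(credits)


def ballot_cost(ballot: dict) -> int:
    """Total credits a ballot spends across all options."""
    return sum(vote_cost(v) for v in ballot.values())


def tally(ballots: dict, budget: int) -> dict:
    """Sum signed votes per option, rejecting any ballot that overspends."""
    totals = {}
    for voter, ballot in ballots.items():
        if ballot_cost(ballot) > budget:
            raise ValueError(f"{voter} spent more than {budget} credits")
        for option, votes in ballot.items():
            totals[option] = totals.get(option, 0) + votes
    return totals


# Hypothetical roadmap vote with 10 credits per voter.
ballots = {
    "alice": {"routing": 3, "currency": -1},     # 9 + 1 = 10 credits
    "bob": {"routing": 1, "truth-machine": 2},   # 1 + 4 = 5 credits
}
result = tally(ballots, budget=10)
```

The quadratic price is what lets a voter express intensity while making each extra vote on the same option more expensive than the last.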
Open questions to resolve before Phase 2
Status update: all five are dispositioned — answered, reduced to measurable quantities, or precisely scoped — in mathematical-core.md (see its §8 scorecard).
- Does the reflexive loop converge? → Answered in principle (math core §2): linear influence is harmless; danger is a bifurcation at deference slope `L = 1`; `L` is measurable and controllable (blind resolution, reality-coupling).
- Minimal identity primitive for Sybil-resistant QV + anonymity? → Reframed (math core §3): weight ∝ effective independence `n_eff = 1ᵀΣ⁻¹1` over pseudonymous behavioral residuals; no personhood classifier needed.
- Is global consensus avoidable? → Yes (math core §4.4): payments are consensus-number-1; demurrage is local arithmetic; timestamps by causal entanglement; only contested allocation needs zonal BFT.
- disp ↔ network boundary? → Answered (math core §1.5): the network API is disp's effect algebra (`store`/`send`/`eval`/`price` = the three MATERIALIZE edges + price field).
- Smallest meaningful pilot? → Specified (math core §6): estimands `L̂`, `Σ̂`, calibration, VOI-routing; ~30–100 resolvers, ~100–300 questions.
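The claim that demurrage is local arithmetic can be illustrated with a wallet that applies its own decay lazily, whenever it is next touched, with no global clock tick or ledger-wide sweep. This is a sketch under assumptions: geometric decay of `(1 − δ)` per period, integer period indices, and the class and function names are all hypothetical. Where the decayed value goes (the redistribution pot) and the "unless circulated" exemption are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Wallet:
    balance: float
    last_touched: int  # period index at which the balance was last settled

    def settle(self, now: int, delta: float) -> float:
        """Apply all decay owed since the last touch; purely local state."""
        elapsed = now - self.last_touched
        if elapsed < 0:
            raise ValueError("cannot settle into the past")
        self.balance *= (1.0 - delta) ** elapsed
        self.last_touched = now
        return self.balance


def pay(sender: Wallet, receiver: Wallet, amount: float, now: int, delta: float) -> None:
    """Settle both parties to the present, then move funds."""
    sender.settle(now, delta)
    receiver.settle(now, delta)
    if amount > sender.balance:
        raise ValueError("insufficient funds after demurrage")
    sender.balance -= amount
    receiver.balance += amount


idle = Wallet(balance=100.0, last_touched=0)
after_two_periods = idle.settle(now=2, delta=0.1)  # 100 * 0.9 * 0.9
```

Because the decay owed depends only on the wallet's own last-touched time, two parties can settle a payment without consulting anyone else about demurrage.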
The remaining genuinely-open list (arms-race floor, deference-curve empirics, preference-delegation semantics, open-economy monetary dynamics, non-convex market design, and the coupled stability proof) is in math core §8.
Open Questions — Research Scratchpad
Raw questions that seeded the analytical notes. Each is tagged with its current status; several were taken up and answered (or reduced to measurable quantities) in living-system.md and mathematical-core.md. Kept here as a provenance trail and a list of what genuinely remains.
- Unification vs. anonymization. How does the unification of storage/compute/routing square with the anonymization of routing? What is a node in this context? Is there a connection to cryptography here? → Partially addressed. "What is a node" is answered: a node is a Markov blanket / agent solving one control problem (living-system.md §2, mathematical-core.md §0), and the unification is formalized as min-cost flow over the three `MATERIALIZE` edges (mathematical-core.md §1). Still open: the tension between that unification and anonymous routing (anonymization deliberately hides the spacetime path that the min-cost-flow optimizer wants to expose), and whether there is a clean cryptographic framing of it.
- Connections to compiler theory. Is there a relation between the unification and optimizing compilers / superoptimization? What is the existing mathematical research (topological / categorical connections)? Can we flesh this out mathematically, perhaps in disp syntax? → Still open. Gestured at (tree-calculus reduction steps give a machine-independent work metric and disp is the value algebra; mathematical-core.md §1.1, §1.5), but the superoptimization / categorical connection is not developed.
- When are Markov blankets a useful abstraction? Node-level blankets (inputs/outputs) feel natural, but the others feel more constructed / drawable in many ways. When is the abstraction load-bearing rather than decorative? → Partially addressed. The honest answer is that the blanket framing is a design heuristic and organizing language, not free predictive math (living-system.md §8, "FEP is slippery"). mathematical-core.md §1.3 de-metaphorizes it by choosing the free-energy functional and deriving alignment (Prop. 1). The non-arbitrary blankets are exactly the three nesting levels node ⊂ zone ⊂ network (§0); other drawings are not claimed to be load-bearing.
- The dark-room problem. Is there LessWrong / FEP literature on the dark-room issue, with obvious solutions / regularizers? Or does it solve itself by running into the real world? → Resolved (in principle). The dark room is the FEP failure mode where an agent minimizes surprise by predicting a trivial niche. It is formalized as a bifurcation at deference slope `L = 1` (mathematical-core.md §2, Prop. 3; viability inequality V1) and countered by epistemic-value rewards and reality-coupling (living-system.md §3). "Solves itself by hitting reality" is exactly the reality-coupling / between-network-selection answer (living-system.md §6).
- Is Sybil resistance elegantly solvable? Is there a theoretically clean solution, or only best-approximation? If approximate, who wins, and are there provable bounds? → Largely resolved / reframed. Stop asking "is this a person?"; ask how much independent information the constituent contributes: `n_eff = 1ᵀΣ⁻¹1` (mathematical-core.md §3, viability inequality V2). Provable content: the mimicry-cost bound (Prop. 6), under which faking `k` agents costs ≈ being `k` agents, but it is asymptotic. Genuinely-open residue: the arms-race floor at finite monitoring richness (mathematical-core.md §8, Open-1).
- A framework for coupling Markov blankets. Is there a mathematical framework for the general nature of coupling between blankets, i.e. correspondences between the modeled ("demiurge") world and real life? → Partially addressed. Cooperation is formalized as a continuous coupling field over the latency-trust coordinate space (living-system.md §6, mathematical-core.md §5): edge weight = degree of fate-sharing. A general theory of coupled blankets, and the model-vs-reality correspondence the "demiurge" phrasing was reaching for, is not yet developed; it overlaps with the still-open grand-conjecture stability proof (mathematical-core.md §7).
Application Design Philosophy
Application APIs should be future-proofed as much as possible
Application GUIs should generally be as cross-platform as possible.
There should be different modes of use for different user skill levels, to accommodate as many people as possible.
- These modes should enable/disable various aspects of configurability in the application
- Simple mode should be super easy to use for anyone and come with a quick tutorial
- Configuration options should be preset to sensible defaults as much as possible.
- Default mode should come with a tutorial and explain how modes work and how to change them
- Advanced mode assumes the user already knows how to use the app or can figure it out on their own. All configurability functionality should be enabled.
- Application developers should strive to implement as much configurability as possible into every aspect of their application for complex users.
- For desktop applications, configuration on a small level (for specific contextual buttons or features) could be shown through right-click menus.
- For mobile applications these configuration items can be shown through a long-press.
Why Copyright is Cancer
Copyright is ineffective in the internet age. Any piece of content can be pirated on a massive scale without the original creator even knowing. The only thing copyright does is suppress derivative creation on forums with strict copyright adherence. See these videos on why the current copyright system is broken and why it makes no sense. This document outlines an alternate system of funding creation that Dither aims to create.
Have a decentralized & democratized forum of communication and publication that no institution or individual can meaningfully obstruct. Dither is a protocol for creating decentralized applications.
Provide a direct connection of support between the consumers and the creators allowing people to directly support creators for the content they produce (Paying for Production, not Distribution). Even huge projects like Marvel movies or video games can be supported through community fundraising. Each production raises the reputation of the artist(s), allowing them to raise more money from their fans for the next project.
Preventing artists from having exclusive control over their art allows all artists to use, adapt, and recreate the work of other artists. Imagine all the games and movies that were super hyped up but totally flopped. If anyone were allowed to remake anyone else's work, but better, artists would have an incentive to make good stories so that they can raise money for their next project, instead of making something crappy and making all their money off of preorders. This would also prevent the fragmentation of distribution services, as no one entity can have exclusive control over a piece of content.
But What IS a Monad?
This is an article that defines what a Monad is while trying to stay as close as possible to its roots in Category Theory.
If you've ever heard of monads in the context of functional programming, you've likely also heard that they are just "Monoids in the Category of Endofunctors". But what does that even mean?
Well, for one, this quote could use some more detail. The more descriptive quote is: "a monad in $\mathcal{C}$ is just a monoid in the category of endofunctors of $\mathcal{C}$, with product $\times$ replaced by composition of endofunctors and unit set by the identity endofunctor."
Throughout this article, we will define "Category", "Endofunctor" and "Monoid" in a fictional programming language, and put these definitions together to define a Monad.
A Category (designated using curly letters, e.g. $\mathcal{C}$), for the purposes of this article, is simply some defined set of mathematical objects (numbers, values, types, terms, or even categories themselves) as well as some set of "morphisms" defined between the objects. A category also has some conventions such as:
- Every object has a morphism to itself (called the identity morphism)
- Morphisms can be combined together into other morphisms much like function composition.
- Morphism composition is associative.
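These three conventions can be spot-checked in an ordinary language by treating types as objects and functions as morphisms. This is a minimal Python sketch, separate from the fictional language introduced later. The helper names and the three sample functions are assumptions, and checking a few inputs illustrates the laws without proving them.

```python
def identity(x):
    """The identity morphism: every object has one."""
    return x


def compose(g, f):
    """Morphism composition: (g ∘ f)(x) = g(f(x))."""
    return lambda x: g(f(x))


# Three sample morphisms: str -> int, int -> int, int -> str.
f = len
g = lambda n: n * 2
h = str

samples = ["", "a", "monad"]

# Identity laws: composing with identity on either side changes nothing.
left_identity = all(compose(identity, f)(s) == f(s) for s in samples)
right_identity = all(compose(f, identity)(s) == f(s) for s in samples)

# Associativity: the grouping of compositions does not matter.
associative = all(
    compose(h, compose(g, f))(s) == compose(compose(h, g), f)(s) for s in samples
)
```

For example, both groupings send `"monad"` to the string `"10"`: length 5, doubled, then rendered as text.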
Functors in category theory are simply morphisms in a "category of categories", where categories are the objects and functors are the morphisms that map one category to another. This can be visualized at a high level of abstraction, where the functor is just one arrow between two categories ($F : \mathcal{C} \to \mathcal{D}$), or at a lower level, where a functor is represented as many parallel arrows mapping every object and morphism in category $\mathcal{C}$ to an object or morphism in category $\mathcal{D}$.
Endofunctors are just the special case where the two categories $\mathcal{C}$ and $\mathcal{D}$ are the same category. Endofunctors are essentially just a mapping between each object and morphism in a category to another object/morphism in the same category.
Endofunctors come up often when working within a specific category. For example, all the types in a programming language form a category, where the morphisms are functions defined between them. This category is called Type, and can be thought of as the "type of types".
There are many programming languages that have endofunctors. Option in Rust, Maybe in Haskell, or really any kind of wrapper type in any language is technically an endofunctor. While most languages have endofunctors, most don't have a type system powerful enough to formalize the definition of an endofunctor itself. (For Rustaceans, think of it like a trait that you can implement for type constructors like Option or Vec: not Option<T>, just Option by itself. This was not yet possible in Rust at the time this article was written.)
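To make "a wrapper type is an endofunctor" concrete, here is a small Python sketch of a `Maybe` wrapper. The class and helper names are assumptions for illustration, and Python cannot express the abstraction over type constructors discussed above. It only shows the two halves of the mapping: objects go from `A` to `Maybe[A]`, and morphisms go from `f` to `fmap(f, ·)`, which must preserve identity and composition.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Maybe:
    """Object mapping: wraps a value of any type A into Maybe[A]."""
    value: Any = None
    present: bool = False


def just(value) -> Maybe:
    return Maybe(value, True)


def nothing() -> Maybe:
    return Maybe()


def fmap(f, m: Maybe) -> Maybe:
    """Morphism mapping: lifts f : A -> B to Maybe[A] -> Maybe[B]."""
    return just(f(m.value)) if m.present else nothing()


def identity(x):
    return x


def compose(g, f):
    return lambda x: g(f(x))


double = lambda n: n * 2
cases = [just(21), nothing()]

# Functor law 1: mapping the identity morphism gives the identity.
preserves_identity = all(fmap(identity, m) == m for m in cases)

# Functor law 2: mapping a composition equals composing the mapped morphisms.
preserves_composition = all(
    fmap(compose(str, double), m) == fmap(str, fmap(double, m)) for m in cases
)
```

Both laws hold on these samples, including the empty case, which is what distinguishes a functor from an arbitrary wrapper with a `map` method.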
When defining what it means to be an Endofunctor formally within a given Category of objects (i.e. a Type System), one needs a powerful dependently-typed language. For the purposes of this article, we will be using a fictitious dependently-typed language. The following is an outline of some of the general features of the language.
Pseudocode Definitions (skip this if you think you can grok the pseudocode first try):
    // This binds some name to a value and an optional type in some context.
    // The value may be a type (like Vec(A)) or a value (like `3`).
    let <name> [: <type>] := <value>

    // A name bound to a value may be given a valid type, or one will be inferred.
    // This will have an inferred type of `Nat` because 3 is most likely to be a natural number.
    let three := 3
    // The type of `four` is specified to be `Int` (i.e. an integer). This is valid because
    // while 4 is inferred to be Nat, Nat is a subset of Int (integers).
    let four : Int := 4

    // This is a type declaration. `Likelyhood` is defined to be a set of two numbers of type
    // `Nat` and `Float`. Optionally, two names have been assigned to these numbers: `count`
    // and `probability`. Curly braces represent unordered sets of objects.
    // The inferred type here is the type of all types: `Type`.
    let Likelyhood := { count: Nat, probability: Float }
    // The above definition is semantically equivalent to this definition:
    let Likelyhood := { probability: Float, count: Nat }
    // But the above two expressions are semantically different from this
    // (brackets represent an ordered set, i.e. a list):
    let LikelyhoodOrdered := [ count: Nat, probability: Float ]

    # If you are used to Haskell-style currying, the first two definitions of `Likelyhood` would be
    # similar to providing two constructors for the same object, like the following:
    Likelyhood_a : Nat -> Float -> Likelyhood
    Likelyhood_b : Float -> Nat -> Likelyhood
    # But the 3rd definition would only be equivalent to Likelyhood_a

    // Type constructors can be defined like this, with an implicit `->` between
    // `{A:Type}` and `{ x:A, ... }` for structure types.
    let Vector3 := { A : Type } { x : A, y : A, z : A } -> { x, y, z }
    // or like this using the `Struct` constructor.
    let Vector3 := { A : Type } -> Struct { x : A, y : A, z : A }

    // Type constructors may also be defined for enum types.
    let Option := { A : Type } -> Type;
    let Some : {A : Type} { a : A } -> Option(A);
    let None : {A : Type} -> Option(A);
    // or like this using the `Enum` constructor.
    let Option := { A : Type } -> Enum {
        Some { a : A },
        None,
    }

    // Terms can also be named and typed, but left undefined (like how functions in
    // Rust traits don't need to be defined immediately).
    type add_one : Nat -> Nat
    // This defines add_one in accordance with its previously declared type. It would be an
    // error to define add_one twice with different implementations in the same context,
    // or for the definition to fail to typecheck against the declared type.
    let add_one := { a } -> a + 1
Okay, now that we have the pseudocode out of the way, we can get to defining an endofunctor :D
In this language, an Endofunctor can be defined as a dependent collection of two functions: obj_map : A -> F(A), which maps the objects of the category, and func_map, which takes an arbitrary function A -> B (where B can be any type) and returns a function F(A) -> F(B), thereby mapping the morphisms. When this definition is used as a "type class" (think trait or interface), it allows the user to abstract over a specific endofunctor definition and refer to all endofunctors collectively.
    // An Endofunctor is a class of type functions that takes a type constructor F of type
    // `Type -> Type` (i.e. a type constructor / wrapper). Classes are essentially partially
    // defined functions where the return value (implementation) is only valid if it has
    // been implemented.
    let Endofunctor := { F : Type -> Type } -> {
        // Specifies how the endofunctor maps objects of the category `Type` to other
        // objects in `Type` using the type constructor `F`
        type obj_map : { A : Type } -> { obj : A } -> F(A);
        // Specifies how the endofunctor maps functions between objects in the category
        // `Type` to other functions in the "extended" F(Type) category.
        type func_map : { A : Type, B : Type } { func : A -> B } -> F(A) -> F(B);
    }
Examples of various type constructors and specifications (implementations) of the Endofunctor class.
    // Option is a generic enum, i.e. a function of type `Type -> Enum`
    let Option := { T : Type } -> Enum {
        Some[T],
        None,
    }
    // Defines the Endofunctor "function" for a specific type constructor.
    let impl_option_endofunctor : Endofunctor(Option) := {
        let obj_map := {A} Option{A}::Some
        // Returns Option{A} -> Option{B} from `func : A -> B`
        let func_map := {A, B} {func} -> (
            // Here F is Option, so the argument has type Option(A).
            { fa : Option(A) } -> match fa {
                Some[t] => Some(func(t))
                None => None
            }
        )
    }
    // List is a generic list, i.e. a function of type `Type -> Enum`.
    let List := { T : Type } -> Enum {
        Cons { list : List(T), value : T }, // Recursion defined implicitly
        Nil,
    }
    // Defines a term of type Endofunctor(List).
    let impl_list_endofunctor : Endofunctor(List) := {
        // Creates a constructor that takes a `value : T` by partially applying the List::Cons variant.
        let obj_map := {A} List{A}::Cons { list: List{A}::Nil }
        let func_map := {A, B} {func} -> (
            // Here F is List, so the argument has type List(A).
            { fa : List(A) } -> match fa {
                // func_map is recursively defined for list
                Cons{list, value} => Cons(func_map{A,B}{func}{list}, func(value))
                Nil => Nil
            }
        )
    }
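For readers who want something they can compile, here is a rough Rust approximation of the Endofunctor class and its Option and List implementations. Rust still cannot take `Option` itself as a type parameter, so this sketch works around that with marker types (`OptionF`, `ListF`) and a generic associated type (stable since Rust 1.65). All names here are hypothetical, `func_map` is written in uncurried form, and `Vec` stands in for the cons-list.

```rust
// `Wrapped` stands in for the type constructor `F : Type -> Type`.
trait Endofunctor {
    type Wrapped<A>;
    // Maps an object `A` into `F(A)`.
    fn obj_map<A>(a: A) -> Self::Wrapped<A>;
    // Maps a function `A -> B` over an `F(A)`, producing an `F(B)`.
    fn func_map<A, B, F: Fn(A) -> B>(fa: Self::Wrapped<A>, func: F) -> Self::Wrapped<B>;
}

// Marker types naming the type constructors `Option` and `Vec`.
struct OptionF;
struct ListF;

impl Endofunctor for OptionF {
    type Wrapped<A> = Option<A>;
    fn obj_map<A>(a: A) -> Option<A> {
        Some(a)
    }
    fn func_map<A, B, F: Fn(A) -> B>(fa: Option<A>, func: F) -> Option<B> {
        match fa {
            Some(t) => Some(func(t)),
            None => None,
        }
    }
}

impl Endofunctor for ListF {
    type Wrapped<A> = Vec<A>;
    fn obj_map<A>(a: A) -> Vec<A> {
        vec![a]
    }
    fn func_map<A, B, F: Fn(A) -> B>(fa: Vec<A>, func: F) -> Vec<B> {
        fa.into_iter().map(func).collect()
    }
}

// Code written once against the class works for every implementation.
fn lift_and_map<E: Endofunctor, A, B, F: Fn(A) -> B>(a: A, func: F) -> E::Wrapped<B> {
    E::func_map::<A, B, F>(E::obj_map::<A>(a), func)
}

fn main() {
    assert_eq!(lift_and_map::<OptionF, _, _, _>(3, |n: i32| n + 1), Some(4));
    assert_eq!(lift_and_map::<ListF, _, _, _>(3, |n: i32| n + 1), vec![4]);
    assert_eq!(OptionF::func_map(None, |n: i32| n + 1), None);
}
```

The marker-type indirection is the price of the workaround: generic code names `OptionF` rather than `Option`, which is less direct than the pseudocode's `Endofunctor(Option)`.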
    // Generic Identity Function Constructor, we need this to define unit for the Monad.
    let identity : [T : Type] [T] T := [T] [v] v
NOTICE: New syntax / convention: identity is the name assigned to the expression [T] [v] v, but that expression has also been associated with a type ([T : Type] [T] T). The type of identity is automatically assigned a name, translated from snake_case to PascalCase: ^Identity.
There is analogous syntax for finding a value defined for a given type. This may not always resolve because multiple values could be defined for a given type.
Example: to refer to impl_option_endofunctor, you can instead infer the definition via: _ : Endofunctor(Option).
This also works when defining "implementations". You can use _ to not define a name, and instead let type inference do its job.
    let _ : Endofunctor(Identity) := {
        // object map returns the identity function on `A` (i.e. A -> A).
        let obj_map := {A} identity(A)
        // func_map returns the function as-is because identity does nothing.
        let func_map := {A, B} {func} -> func;
    }

    // The above is somewhat equivalent to this in Rust :)
    impl Endofunctor for Identity {
        fn obj_map<A>() -> impl Fn(A) -> A { |a| a }
        fn func_map<A, B>(func: impl Fn(A) -> B) -> impl Fn(A) -> B { func }
    }
OKAY, now we can get back to defining the Monad! According to category theory, a monad in a category is a monoid in the category of endofunctors of that category. We know what endofunctors are, but what is a Monoid?
Wikipedia describes a monoid (the algebraic version) as a set ($M$) equipped with a binary operation, which we will call "multiplication" ($\cdot$), and a particular member of the set called the "unit" (written $e$ here).
To translate this into a typed programming language, we will define a monoid as a type class (a.k.a. trait / interface) that can be implemented for objects. The definition of the type class is parameterized on the set $M$ and includes two morphisms, unit and multiplication, as well as the two monoid laws of associativity and identity: $(a \cdot b) \cdot c = a \cdot (b \cdot c)$, and $a \cdot e = e \cdot a = a$.
    let Monoid := [ M : Type ] {
        type unit : M,
        type multiplication : [a : M, b : M] -> M
        // (a * b) * c = a * (b * c)
        type associativity_proof : { a : M, b : M, c : M } multiplication[multiplication[a, b], c] = multiplication[a, multiplication[b, c]]
        // a * unit = unit * a = a
        type identity_proof : { a : M } multiplication[a, unit] = multiplication[unit, a] = a
    }
Now that we have a monoid, we can talk about types that are monoids! (Assume Nat is the type of natural numbers)
    // The natural numbers under addition form a monoid
    let naturals_addition_monoid : Monoid(Nat) := {
        let unit := 0,
        let multiplication := add,
        // The actual type theory proofs will be left out because I haven't figured out how they work yet lol.
        // addition is associative
        let associativity_proof := { a, b, c } <proof of associativity>
        // 0 + anything = anything + 0 = anything
        let identity_proof := { a } <proof of identity>
    }
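The same idea can be approximated in Rust as a trait. This is only a sketch: Rust cannot carry the law proofs in the type, so they are spot-checked with assertions instead, and because a Rust type can implement a trait only once, a hypothetical `Sum` newtype is used to name "naturals under addition". The helper `combine_all` is likewise an illustrative addition.

```rust
// A monoid: a unit element plus an associative binary operation.
trait Monoid: Sized {
    fn unit() -> Self;
    fn multiplication(self, other: Self) -> Self;
}

// Newtype selecting addition as the monoid operation on u64.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Sum(u64);

impl Monoid for Sum {
    fn unit() -> Self {
        Sum(0)
    }
    fn multiplication(self, other: Self) -> Self {
        Sum(self.0 + other.0)
    }
}

// Folds any list of monoid values into one, starting from the unit.
fn combine_all<M: Monoid>(items: Vec<M>) -> M {
    items.into_iter().fold(M::unit(), M::multiplication)
}

fn main() {
    let (a, b, c) = (Sum(2), Sum(3), Sum(4));

    // Associativity, checked on sample values only.
    assert_eq!(
        a.multiplication(b).multiplication(c),
        a.multiplication(b.multiplication(c))
    );
    // Identity: the unit is neutral on both sides.
    assert_eq!(a.multiplication(Sum::unit()), a);
    assert_eq!(Sum::unit().multiplication(a), a);

    assert_eq!(combine_all(vec![a, b, c]), Sum(9));
}
```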
Alright, let's do another one! This time on a type we've actually seen before (I'll add type annotations for unit and multiplication so it's clearer what is going on here).
    // This is a generic monoid definition for all possible Option types, i.e. `impl<A> Monoid for Option<A>`.
    let impl_option_monoid : {A : Type} Monoid(Option(A)) := {
        // The `unit` must satisfy the identity law with respect to `multiplication`
        let unit : Option(A) := Option(A)::None
        // The combination is essentially `Option::or`: it combines two Options and returns
        // the first if it contains a value, or the second otherwise.
        let multiplication : [Option(A), Option(A)] -> Option(A) := [a, b] -> match a {
            Some(a) => Some(a),
            None => b
        }
        let associativity_proof := { a, b, c } <proof of associativity>
        let identity_proof := { a } <proof of identity>
    }
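In real Rust, this Option monoid can be sketched with the standard `Option::or` method as the multiplication. The `Monoid` trait and the `first_some` helper below are hypothetical names for this example, and the laws are checked on a handful of sample values rather than proven.

```rust
// A monoid: a unit element plus an associative binary operation.
trait Monoid: Sized {
    fn unit() -> Self;
    fn multiplication(self, other: Self) -> Self;
}

// `None` is the unit; `or` keeps the first value that is present.
impl<A> Monoid for Option<A> {
    fn unit() -> Self {
        None
    }
    fn multiplication(self, other: Self) -> Self {
        self.or(other)
    }
}

// Combining a whole list yields the first `Some`, if any.
fn first_some<A>(items: Vec<Option<A>>) -> Option<A> {
    items.into_iter().fold(
        <Option<A> as Monoid>::unit(),
        <Option<A> as Monoid>::multiplication,
    )
}

fn main() {
    let samples = [Some(1), Some(2), None];

    for a in samples {
        // Identity: `None` is neutral on both sides.
        assert_eq!(a.multiplication(Option::unit()), a);
        assert_eq!(<Option<i32> as Monoid>::unit().multiplication(a), a);
        for b in samples {
            for c in samples {
                // Associativity over every combination of the samples.
                assert_eq!(
                    a.multiplication(b).multiplication(c),
                    a.multiplication(b.multiplication(c))
                );
            }
        }
    }

    assert_eq!(first_some(vec![None, Some(2), Some(3)]), Some(2));
}
```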
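Oh wait, since Option is an endofunctor, isn't this just a Monad? (A monad is a monoid defined on an endofunctor, after all.) Not quite: the monoid above combines values of type Option(A), whereas a monad's unit and multiplication act on the endofunctor itself. Let's try to pinpoint a more exact definition of a Monad :)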
    let Monad := [ M : Type -> Type, M_is_functor : Endofunctor(M) ] {
        // unit wraps a value in M
        type unit : { A : Type } A -> M(A)
        // multiplication flattens two layers of M into one
        type multiplication : { A : Type } M(M(A)) -> M(A)
        type associativity_proof : { A : Type } { a : M(M(M(A))) } multiplication(multiplication(a)) = multiplication(M_is_functor.func_map(multiplication)(a))
        type identity_proof : { A : Type } { a : M(A) } multiplication(unit(a)) = multiplication(M_is_functor.func_map(unit)(a)) = a
    }
If you followed all that, you should now have a good understanding of what a Monad is! (Its specification has the same shape as the Monoid above: a unit, a multiplication, and proofs of associativity and identity.)
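To see a monad in action in real Rust, here is a small sketch for Option. It assumes the usual category-theory reading, in which unit has type A -> M(A) and multiplication flattens M(M(A)) into M(A); in the standard library these correspond to `Some` and `Option::flatten`. The free functions `unit`, `multiplication` and `func_map` are names chosen for this example, and the laws are checked on sample values only.

```rust
// unit : A -> Option<A>
fn unit<A>(a: A) -> Option<A> {
    Some(a)
}

// multiplication : Option<Option<A>> -> Option<A>
fn multiplication<A>(mma: Option<Option<A>>) -> Option<A> {
    mma.flatten()
}

// The endofunctor's morphism map, needed to state the laws.
fn func_map<A, B>(ma: Option<A>, func: impl Fn(A) -> B) -> Option<B> {
    ma.map(func)
}

fn main() {
    // Identity laws: wrapping on the outside or the inside, then
    // flattening, gives back the original value.
    for ma in [Some(5), None] {
        assert_eq!(multiplication(unit(ma)), ma);
        assert_eq!(multiplication(func_map(ma, unit)), ma);
    }

    // Associativity: flattening the outer two layers first or the
    // inner two layers first gives the same result.
    let nested = [Some(Some(Some(1))), Some(Some(None)), Some(None), None];
    for mmma in nested {
        assert_eq!(
            multiplication(multiplication(mmma)),
            multiplication(func_map(mmma, multiplication))
        );
    }
}
```

Rust programmers more often meet this structure as `Option::and_then`, which is equivalent to mapping a function and then flattening the result.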
How to do Node Discovery while Limiting Structural Data Leakage?
Debate: When requesting peers from a connected node, should the node send back a list of publicly available IPs? Or should the node forward the request to the peers in question for them to initiate the connection?
- Sending back a list of public IPs
- Pros:
- Faster to implement (kinda)
- Used by most networks
- Possibly easier NAT tunneling? (because it doesn't inherently require an open listening port)
- Cons:
- Allows for easy enumeration of all nodes in the network by just requesting lists and connecting to nodes.
- Notifying peers of the request for them to connect.
- Does this strategy actually prevent peer enumeration?
- Someone with a single computer can just send a peer request to a given node and ask for new connections to get IPs. If they repeatedly do this from different IPs, they can get a good sense of all the peers of a given node.
- Fix: Instead of forwarding the request to all peers at once, forward it to one peer at a time and have that peer report back the measured latency. Forward the request to another peer if the first peer reported greater latency than the request receiver. If the peer was in fact closer to the requester, that peer connects to the requester and the process starts anew.
- Even if the peer selection is random and only ~2-3 peers are contacted for any given peer request, you can still repeatedly ask the node to eventually get a list of all peers.
- Fix: The request is only forwarded to peers of a similar latency to the requesting node. This likely also makes for quicker convergence to a "local group" and prevents nearby peers from needlessly connecting to requesters.
- Also: any node after the initial requested node can use network coordinates to more accurately recommend closer and closer nodes to forward the peer request to.
- If someone can place nodes relatively close to a given node (<30ms), they can do peer requests at various distances and eventually suss out all the peers.
- One way to lessen this attack would be for some peers to simply not participate in the discovery process. It's not a process that needs a whole lot of participants anyway. These peers could initialize connections only if arbitrary (user-defined based on threat profile) conditions are met, such as:
- Predefined trust between the requester and the peer (via any kind of trust-system like web of trust, trusted certificates, or simply a set of trusted public keys)
- A lighter requirement might be that the node must be connected to some subset of their peers.
- A third requirement could be that the node must be within a certain latency radius as predicted by routing coordinates.
- No matter how many times unknown IPs request for peers from a given node, it will never forward the request to these peers.
- Another way to lessen this kind of attack might be to require new nodes to have knowledge of multiple bootstrap nodes before new peers may be requested. These bootstrap nodes could then collude to decide which of their peers is closest to the requesting node and forward the request on to them (passing along all the latency measurements and routing coordinate estimate data for even more accurate positioning).
- At this point, I think only an ISP would be able to figure out peer relationships via faking nodes and doing peer requests, and it would probably be easier for them to just do direct traffic analysis...
- Pros
- Provides wayy more protection from even relatively advanced adversaries trying to probe the structure of the network.
- Cons
- wayyy more complex
- Slower for new nodes to bootstrap depending on design (although it's likely fine once they have established connections to nearby peers). This could also potentially be sped up by web-of-trust stuff (i.e. bootstrapping off multiple friends).
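The forwarding rules discussed above (only forward to peers at a similar latency, contact only a few peers per request, and let some peers opt out unless the requester is trusted) can be sketched as a small selection function. This is a rough illustration under stated assumptions, not Dither's actual protocol: the `Peer` and `DiscoveryPolicy` types, their fields, and the idea of identifying requesters by a numeric ID are all hypothetical.

```rust
use std::collections::HashSet;

// Hypothetical view of a connected peer from the receiving node.
struct Peer {
    id: u32,
    latency_ms: u32,    // measured latency from this node to the peer
    discoverable: bool, // false: peer opted out of open discovery
}

// Hypothetical, user-defined policy based on threat profile.
struct DiscoveryPolicy {
    latency_window_ms: u32,           // how "similar" latencies must be
    max_forwards: usize,              // upper bound on contacted peers
    trusted_requesters: HashSet<u32>, // requesters that opted-out peers accept
}

// Chooses which peers a request should be forwarded to.
fn select_forward_targets(
    peers: &[Peer],
    requester_id: u32,
    requester_latency_ms: u32,
    policy: &DiscoveryPolicy,
) -> Vec<u32> {
    let trusted = policy.trusted_requesters.contains(&requester_id);
    let mut candidates: Vec<&Peer> = peers
        .iter()
        // Opted-out peers are only considered for trusted requesters.
        .filter(|p| p.discoverable || trusted)
        // Only peers at a similar latency to the requester are considered.
        .filter(|p| p.latency_ms.abs_diff(requester_latency_ms) <= policy.latency_window_ms)
        .collect();
    // Prefer the closest latency match, then cap the number of forwards.
    candidates.sort_by_key(|p| p.latency_ms.abs_diff(requester_latency_ms));
    candidates.into_iter().take(policy.max_forwards).map(|p| p.id).collect()
}

fn main() {
    let peers = vec![
        Peer { id: 1, latency_ms: 10, discoverable: true },
        Peer { id: 2, latency_ms: 48, discoverable: true },
        Peer { id: 3, latency_ms: 55, discoverable: false },
        Peer { id: 4, latency_ms: 200, discoverable: true },
    ];
    let policy = DiscoveryPolicy {
        latency_window_ms: 10,
        max_forwards: 2,
        trusted_requesters: HashSet::from([99]),
    };

    // An unknown requester at ~50ms only reaches the one open, similar-latency peer.
    assert_eq!(select_forward_targets(&peers, 7, 50, &policy), vec![2]);
    // A trusted requester additionally reaches the opted-out peer.
    assert_eq!(select_forward_targets(&peers, 99, 50, &policy), vec![2, 3]);
}
```

In this sketch the selection is deterministic for a given requester latency, so repeating the same request from the same vantage point returns the same targets. An attacker probing from many different latencies could still learn more, which is the case the opt-out and trust conditions are meant to limit.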