Notice: this article contains unconventional ideas. My describing an unconventional idea should NOT be interpreted as suggesting that (i) I am convinced the idea is accurate or feasible, (ii) I assign it a greater than 50% probability of being accurate or feasible, or (iii) that “Ethereum” approves of any of this in any capacity.
One prevalent inquiry that many within the crypto 2.0 domain have regarding the notion of decentralized autonomous organizations is straightforward: what are the benefits of DAOs? What intrinsic advantage would an organization gain from its governance and operations being anchored to hard-coded protocols on a public blockchain, that could not be achieved by opting for a more conventional approach? What benefits do blockchain-based contracts offer over traditional shareholder agreements? Specifically, even if public-interest arguments supporting transparent and assuredly non-malicious governance can be made, what motivates a particular organization to willingly compromise its own strength by exposing its core source code, allowing rivals to observe its every action, taken or planned, while they themselves operate in secrecy?
Numerous pathways exist for addressing this inquiry. In the particular case of non-profit organizations actively committed to philanthropic endeavors, one might reasonably point to the absence of personal incentive: they are already focused on bettering the world with little or no financial benefit to themselves. For private enterprises, one can make the information-theoretic argument that a governance mechanism will function more effectively if, all else being equal, everyone can contribute their insights and intellect to the deliberation – a fairly logical proposition given the documented findings from machine learning that significantly greater performance gains can be secured by expanding the data than by tweaking the algorithm. In this piece, however, we will take a distinct and more targeted path.
What is Superrationality?
In game theory and economics, it is a well-recognized outcome that various classes of scenarios exist where a group of individuals can either “cooperate” or “defect” against each other, resulting in a situation where collective cooperation yields better outcomes for all, yet each individual would benefit more by acting alone through defection. Consequently, the narrative suggests that everyone ultimately defects, leading to individuals’ rational behavior culminating in the least favorable outcome for the group. The most frequently cited example is the famed Prisoner’s Dilemma game.
As many readers may have already encountered the Prisoner’s Dilemma, I will add an interesting twist by presenting Eliezer Yudkowsky’s rather outlandish interpretation of the game:
Imagine that four billion humans – not the entire human race, but a significant fraction of it – are currently battling a terminal illness that can only be treated by substance S.
However, substance S can only be synthesized by collaborating with [a peculiar AI from another dimension whose only objective is to maximize the production of paperclips] – substance S can also be utilized to create paperclips. The paperclip maximizer is solely concerned with the quantity of paperclips in its own universe, not ours, thus we have no means to produce or threaten to eliminate paperclips here. We have never engaged with the paperclip maximizer previously and will never do so afterwards.
Both humanity and the paperclip maximizer will have a singular opportunity to secure some additional portion of substance S for themselves, just before the dimensional nexus collapses; however, the acquisition process diminishes some of substance S.
The payoff matrix is outlined below:
| | Humans cooperate | Humans defect |
| --- | --- | --- |
| AI cooperates | 2 billion lives preserved, 2 paperclips gained | 3 billion lives, 0 paperclips |
| AI defects | 0 lives, 3 paperclips | 1 billion lives, 1 paperclip |
From our perspective, it is evidently sensible from both a practical and, in this instance, an ethical standpoint that we should defect; there is no conceivable value for a paperclip in an alternate universe that could compare to a billion lives. From the AI’s vantage point, defecting consistently yields an additional paperclip, and its programming assigns a value of precisely zero to human existence; consequently, it will choose to defect. Nevertheless, the outcome produced by this decision is indisputably worse for both parties than if the humans and AI had opted to cooperate – yet, had the AI been inclined to cooperate, we could have saved even more lives by defecting ourselves, and the same applies for the AI if we decided to collaborate.
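For readers who want the dominance argument spelled out, here is a minimal sketch in Python, using exactly the payoffs from the table above, checking that defection is the better move for each party no matter what the other does:

```python
# A minimal sketch checking that "defect" is a dominant strategy for both
# players, using exactly the payoffs from the table above. Keys are
# (human_move, ai_move); values are (billions of lives saved, paperclips).
payoffs = {
    ("cooperate", "cooperate"): (2, 2),
    ("defect",    "cooperate"): (3, 0),
    ("cooperate", "defect"):    (0, 3),
    ("defect",    "defect"):    (1, 1),
}

# Whatever the AI does, humans save more lives by defecting...
for ai_move in ("cooperate", "defect"):
    assert payoffs[("defect", ai_move)][0] > payoffs[("cooperate", ai_move)][0]

# ...and whatever humans do, the AI gains one more paperclip by defecting,
# even though mutual defection (1, 1) is worse for both parties than
# mutual cooperation (2, 2).
for human_move in ("cooperate", "defect"):
    assert payoffs[(human_move, "defect")][1] > payoffs[(human_move, "cooperate")][1]
```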
In reality, many small-scale two-party dilemmas are resolved through trade and the legal system’s capacity to uphold contracts and regulations; in this scenario, if a deity existed possessing absolute authority over both universes but only concerned with adherence to prior agreements, the humans and the AI could formalize a contract to collaborate and solicit the deity to simultaneously prevent any defection. When pre-contracting capabilities are absent, regulations penalize unilateral defection. Nevertheless, numerous instances persist, especially when multiple parties are involved, where chances for defection arise:
- Alice is selling lemons in a marketplace, but she knows that her current batch is substandard and that, once customers try to use them, they will have to discard them immediately. Should she proceed with the sale? (Note that this marketplace has so many sellers that reputation effects are negligible.) Expected profit for Alice: $5 revenue per lemon minus $1 shipping/store fees = $4. Expected gain to society: $5 revenue minus $1 expenses minus $5 wasted by the customer = -$1. Alice sells the lemons.
- Should Bob contribute $1000 to Bitcoin development? Expected gain to society: $10 * 100000 individuals – $1000 = $999000; anticipated gain to Bob: $10 – $1000 = -$990, thus Bob does not contribute.
- Charlie discovered another person’s wallet, which contains $500. Should he return it? Anticipated gain to society: $500 (to the recipient) – $500 (loss for Charlie) + $50 (intangible benefit to society from easing worries about the safety of wallets) = $50. Anticipated gain to Charlie: -$500, leading him to keep the wallet.
- Should David reduce expenses in his factory by discharging hazardous waste into a river? Anticipated gain to society: $1000 savings minus $10 average increased healthcare expenses * 100000 individuals = -$999000, anticipated gain to David: $1000 – $10 = $990, resulting in David polluting.
- Eve innovated a treatment for a specific type of cancer which costs $500 per unit to manufacture. She can sell it for $1000, allowing 50,000 cancer sufferers to afford it, or for $10000, allowing 25,000 cancer sufferers to afford it. Should she price it higher? Anticipated gain to society: -25,000 lives (including Eve’s profit, which merely offsets the wealthier buyers’ extra payments). Anticipated gain to Eve: $237.5 million profit instead of $25 million = $212.5 million, prompting Eve to set the elevated price. (The arithmetic behind these examples is spelled out in the sketch below.)
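To make the pattern explicit, here is a minimal Python sketch of the arithmetic in these examples; every figure is copied from the list above, and the point is simply that the private payoff from defecting and the net payoff to society point in opposite directions:

```python
# Arithmetic from the examples above: private gain from "defecting" vs. net
# gain to society. All numbers are taken directly from the list; nothing new.

# Alice: selling the substandard lemons
alice_private = 5 - 1               # $5 revenue - $1 shipping/store fees = $4
alice_social  = 5 - 1 - 5           # minus the $5 the customer wastes = -$1

# Bob: contributing $1000 to Bitcoin development
bob_social  = 10 * 100_000 - 1_000  # +$999,000 to society if he contributes
bob_private = 10 - 1_000            # -$990 to Bob himself, so he does not

# Charlie: returning the found wallet
charlie_social  = 500 - 500 + 50    # +$50 to society if he returns it
charlie_private = -500              # -$500 to Charlie, so he keeps it

# David: dumping waste to save $1000
david_social  = 1_000 - 10 * 100_000    # -$999,000 to society
david_private = 1_000 - 10              # +$990 to David, so he pollutes

# Eve: pricing the cure at $10,000 instead of $1,000
eve_low_price_profit  = (1_000 - 500) * 50_000    # $25 million
eve_high_price_profit = (10_000 - 500) * 25_000   # $237.5 million
eve_private = eve_high_price_profit - eve_low_price_profit  # +$212.5 million
# Social cost of the high price: 25,000 patients who can no longer afford the cure.
```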
Certainly, in many of these scenarios, individuals sometimes behave morally and cooperate, even when doing so hurts their personal outcomes. But what drives them to act this way? Evolution has shaped us to be, by and large, self-serving optimizers. Multiple explanations exist; the one we will emphasize here involves the notion of superrationality.
Superrationality
Consider the following explanation of virtue, courtesy of David Friedman:
I begin with two insights about humans. Firstly, there is a significant relationship between internal thoughts and external actions. Facial cues, body language, and various other indicators offer us at least some understanding of our companions’ thoughts and feelings. Secondly, we possess limited intellectual capacity—we cannot, within the timeframe available to make a choice, contemplate every possibility. We are, in the jargon of computers, machines of limited computing power operating in real time.
Imagine I desire others to perceive me as possessing certain traits—honesty, kindness, helpfulness toward my acquaintances. If I genuinely have those traits, expressing them is straightforward—I simply act and speak in a manner that feels natural, without heavily considering how I present myself to outsiders. They will witness my words, actions, and expressions, leading to reasonably accurate inferences.
Now, suppose I lack these traits. I am not (for instance) honest. I often behave honestly because it generally serves my interests, yet I remain willing to bypass honesty if it brings me benefit. Consequently, I must perform a dual assessment in many real choices. First, I determine my actions—whether, for example, this presents a suitable opportunity to steal without being discovered. Second, I must consider what thoughts and behaviors I would display, what expressions would manifest on my face, whether I would feel elated or upset if I were genuinely the person I pretend to be.
When a computer is tasked with double the number of calculations, it slows down. Humans do as well. Most individuals are not particularly adept at dishonesty.
If this reasoning holds, it suggests I might be better off in strictly material terms—possess a higher income—if I truly embody honesty (and kindness and …) rather than merely feigning it, as authentic virtues are more convincing than fabricated ones. This leads to the conclusion that, as a narrowly self-interested individual, I might, for purely selfish motives, seek to improve myself—become more virtuous in those aspects valued by others.
The concluding aspect of this argument recognizes that we can enhance ourselves—through our own efforts, through our upbringing, perhaps even through our genetic makeup. Individuals can and do strive to instill good habits—such as the automatic practices of honesty, refraining from theft, and kindness to peers. With sufficient training, such habits evolve into preferences—behaving “badly” leads to discomfort, even in the absence of witnesses, thus individuals refrain from engaging in them. Over time, one may not even need to decide against such actions. This process can be described as cultivating a conscience.
Essentially, it is cognitively challenging to convincingly feign virtue while being greedy whenever possible, which makes it more logical to genuinely embody virtue. Much ancient philosophy aligns with similar reasoning, perceiving virtue as a cultivated habit; David Friedman conveniently transformed this intuition into more easily analyzable terms. Now, let us condense this conceptual framework even further. To summarize, the central assertion here is that humans are leaky agents – with each moment of our activity, we effectively, albeit indirectly, reveal elements of our underlying programming. If we are genuinely intending to be benevolent, we behave in one manner, and if we are only pretending to be kind while secretly planning to take advantage of vulnerable friends, we behave differently, and others frequently detect this difference.
This might appear to be a disadvantage; however, it facilitates a form of cooperation that simpler game-theoretic agents cannot achieve. Suppose two agents, A and B, each possess the capability to “detect” to what extent the other is “virtuous” and are engaged in a symmetrical Prisoner’s Dilemma. In such a scenario, the agents could adopt the following strategy, which we take to be the virtuous strategy:
- Attempt to ascertain if the other party is virtuous.
- If the other party is virtuous, cooperate.
- If the other party is not virtuous, defect.
When two virtuous agents interact, both will collaborate and reap greater rewards. If a virtuous agent encounters a non-virtuous one, the virtuous agent will defect. Therefore, in every situation, the virtuous agent performs at least as well, and often better, than the non-virtuous agent. This encapsulates the essence of superrationality.
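To make this concrete, here is a minimal sketch of the virtuous strategy in code, reusing the paperclip payoffs from earlier and assuming, purely for illustration, that each agent can perfectly inspect whether the other is running the virtuous strategy:

```python
# A minimal sketch of the "virtuous" strategy in a one-shot Prisoner's Dilemma,
# assuming each agent can inspect the other's strategy perfectly. The payoff
# numbers are the ones from the paperclip example above.
PAYOFFS = {  # (my_move, their_move) -> my payoff
    ("cooperate", "cooperate"): 2,
    ("cooperate", "defect"):    0,
    ("defect",    "cooperate"): 3,
    ("defect",    "defect"):    1,
}

def virtuous(opponent_is_virtuous):
    # Cooperate only with agents that are themselves running this strategy.
    return "cooperate" if opponent_is_virtuous else "defect"

def selfish(opponent_is_virtuous):
    # A plain game-theoretic agent: always defect.
    return "defect"

def play(a, b):
    move_a = a(b is virtuous)
    move_b = b(a is virtuous)
    return PAYOFFS[(move_a, move_b)], PAYOFFS[(move_b, move_a)]

print(play(virtuous, virtuous))  # (2, 2): both cooperate and both do better
print(play(virtuous, selfish))   # (1, 1): the virtuous agent defects in self-defense
print(play(selfish, selfish))    # (1, 1): ordinary mutual defection
```

Under the (strong) mutual-inspection assumption, the virtuous agent is never exploited and captures the gains from cooperation whenever its counterpart is also virtuous, which is exactly the “at least as well, and often better” claim above.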
As contrived as this strategy seems, human societies possess several deeply rooted mechanisms for executing it, especially regarding distrust toward agents who strive to make their intentions less transparent – consider the common saying that one should never place trust in someone who does not partake in drinking. Naturally, there exists a class of individuals who can convincingly feign friendliness while secretly intending to defect at all times – these individuals are referred to as sociopaths, and they may arguably represent the chief flaw of this system when utilized by humans.
Centralized Manual Organizations…
Such superrational collaboration has indeed been a crucial foundation of human cooperation for the past ten millennia, enabling individuals to be truthful with one another even in instances where mere market incentives might push towards betrayal. Nevertheless, perhaps one of the major regrettable consequences of the contemporary emergence of large centralized entities is that they enable individuals to effectively defeat others’ capacity to read their minds, making this form of cooperation more difficult.
Most individuals in modern society have benefited from, and have also at least indirectly supported, instances of people in various developing nations discharging hazardous waste into waterways to produce goods more cheaply for us; yet we remain largely unaware that we are indirectly complicit in such defection, because corporations perform the unpleasant tasks on our behalf. The market is so formidable that it can arbitrage even our personal ethics, relegating the most undesirable and objectionable jobs to those willing to suppress their conscience at the lowest cost while effectively concealing this from the general public. Corporations themselves can easily cultivate a cheerful public persona through their marketing teams, leaving it to an entirely different division to charm potential consumers; this second division may not even realize that the unit creating the product is any less upright and appealing than they are.
The internet has frequently been celebrated as a remedy for many of these organizational and political dilemmas, and indeed it does excel in reducing information mismatches and providing transparency. However, regarding the diminishing feasibility of superrational cooperation, it can sometimes exacerbate issues. Online, individuals are much less “leaky,” and thus it becomes easier to project an image of virtue while harboring intentions to deceive. This is part of why scams in digital spaces and cryptocurrencies are more prevalent than those in the offline world, and serves as one of the key arguments against transitioning all economic interactions to the internet a la cryptoanarchism (the other argument being that cryptoanarchism eliminates the capacity to impose disproportionately large penalties, thereby weakening a significant category of economic mechanisms).
A significantly higher level of transparency, one could argue, provides a solution. Individuals are moderately leaky, current centralized entities are less leaky, but organizations where information is continuously and randomly disseminated to the public are even more leaky than individuals. Picture a scenario where, if you contemplate cheating your friend, business partner, or spouse, there’s a 1% chance that the left side of your hippocampus will rebel and transmit a complete recording of your thoughts to your intended target in exchange for a $7500 reward. That is what it “feels” like to be the decision-making board of a leaky organization.
This essentially reiterates the founding philosophy behind Wikileaks, and more recently an incentivized alternative to Wikileaks, slur.io, emerged to advance the cause further. Nevertheless, while Wikileaks exists, shadowy centralized entities also continue to exist and in many cases remain quite opaque. It’s possible that incentivization, combined with prediction-like mechanisms for individuals to profit from exposing their employers’ wrongdoings, is what will lead to greater transparency; however, we can also pursue a different approach: provide a means for organizations to voluntarily, and drastically, become leaky and superrational to a degree previously unseen.
… and DAOs
Decentralized autonomous organizations, as a notion, stand out in that their governance mechanisms are not only leaky, but entirely public. That is, while even transparent centralized organizations allow outsiders to form a rough picture of the organization’s nature, in a DAO outsiders can indeed access the complete source code of the organization. They do not view the “source code” of the individuals behind the DAO, but there are ways to design a DAO’s source code so that it is heavily inclined towards a specific goal, irrespective of who its members are. A futarchy focused on maximizing average human lifespan will function very differently than a futarchy centered on maximizing the output of paperclips, even if the same individuals are at the helm. Thus, it is not only the case that the organization will make it evident to all if they begin to deceive, but it is also not even feasible for the organization’s “mind” to engage in deceit.
Now, what might superrational cooperation using DAOs look like? First, we would need to see some DAOs actually emerge. There are a few use cases where it seems reasonable to predict their success: gambling, stablecoins, decentralized file storage, one-ID-per-person data provision, SchellingCoin, etc. We can label these DAOs type I DAOs: they possess some internal state, but little autonomous governance. They can at most adjust a few of their own parameters to maximize some utility metric via PID controllers, simulated annealing, or other simple optimization algorithms. Hence, they are in a weak sense superrational, but they are also fairly limited and unintelligent, and so they will often depend on external, non-superrational processes for their upgrades.
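As one illustration of what such limited, type I self-adjustment might look like, here is a minimal sketch of a DAO nudging a single parameter, a hypothetical transaction fee, toward a target usage level with a bare-bones PID controller; the names, numbers, and update rule are purely illustrative assumptions, not part of any real protocol:

```python
# A minimal, hypothetical sketch of "type I" self-adjustment: a DAO that tunes
# one parameter (a fee) toward a target usage metric with a simple PID loop.
class FeeController:
    def __init__(self, target_usage, kp=0.5, ki=0.1, kd=0.2):
        self.target = target_usage      # desired usage level (assumed metric)
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, observed_usage, fee):
        # If usage is below target, the error is positive and the fee drops.
        error = self.target - observed_usage
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        adjustment = self.kp * error + self.ki * self.integral + self.kd * derivative
        return max(0.0, fee - 0.0001 * adjustment)

controller = FeeController(target_usage=1000)
fee = 0.05
for observed in [800, 900, 1050, 1200]:   # made-up usage readings per period
    fee = controller.update(observed, fee)
    print(f"usage {observed} -> new fee {fee:.4f}")
```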
To progress further, we require type II DAOs: DAOs equipped with a governance algorithm capable of making theoretically unlimited decisions. Futarchy, various democratic formats, and various forms of subjective extra-protocol governance (i.e., in the event of significant disagreement, the DAO duplicates itself into several forks, with one fork following each proposed policy, and everyone selects which version to interact with) are the only ones currently known, though other foundational strategies and innovative combinations of these will likely emerge. Once DAOs can make unlimited decisions, they will be able to engage in superrational commerce not only with their human clientele but potentially with each other as well.
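As one illustration of how a type II governance algorithm could reach an open-ended decision, here is a minimal futarchy-style sketch: it assumes, hypothetically, that each proposal has an associated prediction market estimating the DAO’s chosen welfare metric under that proposal, and the proposals and forecast numbers below are invented for illustration:

```python
# A minimal futarchy-style decision rule: adopt whichever proposal the
# corresponding prediction market expects to maximize the DAO's welfare
# metric. Proposals and market forecasts below are illustrative assumptions.
def choose_policy(market_forecasts):
    """market_forecasts maps proposal -> market-implied expected welfare."""
    return max(market_forecasts, key=market_forecasts.get)

forecasts = {
    "keep current fee schedule": 102.0,   # hypothetical market estimates
    "halve fees, add usage tax": 107.5,
    "fund protocol research":    104.3,
}
print(choose_policy(forecasts))   # -> "halve fees, add usage tax"
```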
What sorts of market failures can superrational cooperation address that traditional cooperation cannot? Public goods problems may regrettably be beyond reach; none of the mechanisms described here resolve the massively-multiparty incentive problem. In this framework, the reason organizations make themselves decentralized/leaky is to earn greater trust from others, and organizations that neglect to do so are excluded from the economic advantages of this “circle of trust.” With public goods, the core issue is that there is no way to exclude anyone from the benefits, so the strategy fails. However, anything involving information asymmetries falls squarely within the domain, and that domain is vast; as society becomes more and more complex, cheating will in many ways become easier to carry out and harder to monitor or even understand; the contemporary financial system is merely one example. Perhaps the genuine promise of DAOs, if there is any promise at all, lies precisely in helping with this.