Bringing modern development ecosystems to smart contracts.

Introduction

Ethereum and similar blockchains are facing a scalability crisis. Scaling Ethereum is crucial for the network’s usefulness and adoption, and current transaction fees actively work against that end. Building anything except the simplest of DApps is impossible.

This infamous problem has proven to be very hard to solve without compromising on security and decentralization. Piet Hein said a problem worthy of attack proves its worth by fighting back, and blockchain scalability has certainly been a mean bruiser.

However, current discussions on scalability generally miss a key idea, one that is perhaps as important as increasing transaction throughput and reducing fees. The blockchain’s processing limitations are not the only constraint making complex DApps impossible. Focusing only on computational scalability is a mistake; to create these currently impossible DApps, we must do more than that.

Abstractions and Content Scalability

There’s a second constraint present in Ethereum which must be addressed to truly tackle the scalability problem: that of content scalability [1]. To explain what content scalability means, we’ll quote Friedrich Hayek’s seminal article "The Use of Knowledge in Society":

We make constant use of formulas, symbols, and rules whose meaning we do not understand and through the use of which we avail ourselves of the assistance of knowledge which individually we do not possess.

Friedrich Hayek, "The Use of Knowledge in Society"

This idea, although written about civilizations, should ring uncannily familiar to programmers. As software developers, we are masters of abstraction [2]. We create abstractions by compositionally [3] building upon other abstractions that have already proved successful at their own level, without having to think about how they were implemented. This ecosystem of layered abstractions, built bottom-up, contains the collective knowledge of the entire field of computer science and is the foundation upon which software is developed. In this sense, abstraction in computer science can be thought of as an abstraction of content: it is a process of information hiding, wherein the underlying implementation is hidden from the user and accessed through an interface. Through this process of information hiding, abstraction in computer science consists in the enlargement of content [4].

Alfred Whitehead said that "civilization advances by extending the number of important operations which we can perform without thinking about them", and the same can be said about technology of all kinds. For example, take something as simple as driving a car, which we can do thoughtlessly. Now try to imagine the amount of technology that exists inside that car. Imagine the number of collective human hours it has taken to design a car engine and the decades of accumulated knowledge it contains. And we can apply this thought process recursively to the engine’s sub-components and the tools used to build them; each hides information about the underlying subsystem and enables the next layer to perform more operations without thinking about them.

This is a crucial characteristic of human-made complex systems. They are compositionally built on top of other subsystems, structured in a way where each layer hides information about what’s underneath. These layers make up what we are calling content. And through this process of layering abstractions, one enlarges content.

In software, one can observe this layered abstraction structure in the existing ecosystem. In this stack, made up of libraries, tools, operating systems, compilers, interpreters, and many other components, each layer abstracts away the underlying system and extends what one can do without thinking. Modern software is very intricate; to write it, quoting Hayek, we must avail ourselves of the assistance of knowledge which individually we do not possess. To this end, we leverage tried-and-tested content that has been iterated on for decades, without having to write it or understand its inner workings ourselves.

Blockchain and Novel Machines

Current blockchain technologies do not have this characteristic. The computers on which smart contracts are executed are novel, in all the bad ways. None of the software developed over the past forty years can actually run on them. There’s no ecosystem to speak of and no content of any kind. To create complex systems on Ethereum, one would have to build all the abstractions from the ground up, requiring knowledge no single individual holds.

Imagine the hypothetical scenario of going to a computer store to buy a new computer. You’re offered a super-fast computer that was just released, but it cannot run anything that exists today: applications, libraries, tools, operating systems, compilers, interpreters, you name it. It can only run its own flavor of machine code. The vendor could try to persuade you by saying “ah, but new software can be developed, from the ground up, for this specific computer”. Which, in many ways, is how software used to be developed before high-level compilers and operating systems. Performance notwithstanding, such a computer would be useless. There’s a reason why we’ve outgrown old software practices. Without any of the abstractions painstakingly built and iterated over for the past decades, no one in their right mind would even consider buying such a computer. There’s no content for this computer at any level.

Ethereum smart contracts are much like this computer, except they are also super slow. Scalability solutions are poised to break the processing constraints present in these smart contracts. However, without also tackling content scalability, their usefulness is rather limited; the emphasis solely on computational scalability is misplaced.

Imagine running a compression and decompression algorithm. On the computers we use every day, this is a trivial task. We just import the relevant, mature, battle-tested library and make a single function call to it, benefiting from decades of accumulated knowledge no single individual holds. On the blockchain, however, such a task is impossible. There’s no computational power to do it, and there are no implementations for it. Addressing just the first issue is not enough; we’d have a fast computer but no content. How can we address these two constraints?
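On an ordinary computer, that single call really is this small. Here is a quick sketch in Python (the language choice is incidental) using the standard library's zlib module:

```python
import zlib

data = b"blockchains need content scalability, not just throughput " * 1000

# Decades of accumulated compression research, available in one call.
compressed = zlib.compress(data, level=9)
restored = zlib.decompress(compressed)

assert restored == data
print(f"{len(data)} bytes -> {len(compressed)} bytes")
```

No understanding of DEFLATE, Huffman coding, or LZ77 is required of the caller; that knowledge lives inside the abstraction.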

Rollups and Well-Supported Machines

Currently, the most important scalability solution for Ethereum is rollups, with many competing designs and implementations. Vitalik Buterin has written a great guide on the subject. The basic idea of rollups is to shift the bulk of the computation from the blockchain to a layer-2 protocol, using the blockchain to verify proofs that what was executed off-chain followed the rules. This shift from layer-1 to layer-2 greatly reduces the cost of computation. But to truly address the scalability issue, we must also tackle content scalability. To create impossible DApps, one must be able to tap into the accumulated knowledge contained in modern development ecosystems.
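As a rough mental model only, not any particular rollup's protocol, the division of labor can be sketched in a few lines of Python; `apply_tx`, `state_root`, and the replay-based check below stand in for a real state-transition function, Merkle commitments, and fraud or validity proofs:

```python
import hashlib
import json

def apply_tx(state: dict, tx: dict) -> dict:
    """Toy state-transition function: move `amount` between two accounts."""
    new_state = dict(state)
    new_state[tx["from"]] = new_state.get(tx["from"], 0) - tx["amount"]
    new_state[tx["to"]] = new_state.get(tx["to"], 0) + tx["amount"]
    return new_state

def state_root(state: dict) -> str:
    """Stand-in for a Merkle root: hash of a canonical state encoding."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

# Layer 2: the heavy computation happens off-chain.
def execute_batch_off_chain(state, txs):
    for tx in txs:
        state = apply_tx(state, tx)
    return state, state_root(state)  # only the succinct root is posted on-chain

# Layer 1: the chain checks a succinct claim about that computation.
def layer1_accepts(prev_state, txs, claimed_root):
    # Real rollups use fraud proofs or validity proofs instead of full replay.
    _, recomputed = execute_batch_off_chain(prev_state, txs)
    return recomputed == claimed_root

genesis = {"alice": 10, "bob": 0}
batch = [{"from": "alice", "to": "bob", "amount": 3}]
_, root = execute_batch_off_chain(genesis, batch)
assert layer1_accepts(genesis, batch, root)
```

The only point of the sketch is the split: expensive execution stays on layer-2, while layer-1 verifies a small claim about it.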

Projects like Arbitrum and Optimism are developing rollup solutions that answer the computational scalability problem, but still target the EVM. Projects like Cartesi are taking a different route, bringing both computational scalability and content scalability to the table. The key insight for scaling content is bringing into Ethereum the same computer we work with daily and using it as the infrastructure for previously impossible DApps. Inscribed in this computer are decades of rich, mature, and battle-tested content; with it, developers can run the entire modern development stack inside the blockchain.

Bottom line, choosing a well-supported machine like RISC-V instead of the EVM enables putting Linux and all of its toolchains inside Ethereum. This way, developers are no longer limited to Solidity, confined by an extraordinarily costly computer; now they can use a modern development ecosystem running on a fast computer, with decades of content, inside the blockchain. One can really just import a compression library and use it, within one's favorite programming language to boot. Or run Doom.

It doesn't even have to be Linux. Other operating systems, such as seL4, an open-source security-focused kernel with an end-to-end proof of implementation correctness, can be leveraged by well-supported machines. We developers can write smart contracts in Python, Rust, OCaml, JavaScript, C++, Java, or all of them, along with all their combined ecosystems and existing libraries. We can even use actual databases. The benefits of this cannot be overstated: we are making use of knowledge that we individually do not possess, inscribed in trustworthy compilers and interpreters, preexisting mature libraries, and battle-tested operating systems.
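As a hypothetical illustration of what "actual databases" buys us, here is ordinary application logic written against Python's standard-library sqlite3 module; the schema and the transfer helper are invented for the example, but nothing here would need to be reimplemented from scratch on a Linux-capable machine:

```python
import sqlite3

# An in-memory SQLite database; on a Linux-capable machine this could just as
# well be a file that persists the DApp's state between inputs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (account TEXT PRIMARY KEY, amount INTEGER)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?)",
    [("alice", 10), ("bob", 0)],
)

def transfer(sender: str, receiver: str, amount: int) -> None:
    # Ordinary SQL and real transactions instead of hand-rolled storage layouts.
    with conn:  # both updates commit together, or neither does
        conn.execute(
            "UPDATE balances SET amount = amount - ? WHERE account = ?",
            (amount, sender),
        )
        conn.execute(
            "UPDATE balances SET amount = amount + ? WHERE account = ?",
            (amount, receiver),
        )

transfer("alice", "bob", 3)
print(conn.execute("SELECT * FROM balances ORDER BY account").fetchall())
# [('alice', 7), ('bob', 3)]
```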

Imagine trying to play a decentralized game of Texas Hold’em poker. There is well-researched cryptography that allows this, with robust implementations [5]. However, running intensive algorithms of this sort on layer-1 is not possible: the computer is just too slow. And just moving it to a faster layer-2 is not enough. The complexity of understanding the intricacies of mental poker cryptography and then writing a robust implementation of it in Solidity makes this an impossible DApp. With a well-supported machine, however, we could just import already existing libraries in our preferred programming language and use them normally, running everything on a fast computer.
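To give a flavor of the cryptography involved, here is a toy sketch of the commutative, SRA-style encryption that classic mental poker protocols build on, with deliberately small parameters so the snippet runs; a real DApp would import a vetted library instead, which is exactly the point:

```python
import random

# Toy SRA-style commutative encryption: E_k(m) = m^k mod p, where each key k
# is invertible modulo p - 1, so E_a(E_b(m)) == E_b(E_a(m)).
P = 2**61 - 1  # a small Mersenne prime; real protocols use far larger primes

def keygen():
    while True:
        k = random.randrange(3, P - 1)
        try:
            return k, pow(k, -1, P - 1)  # (encryption exp, decryption exp)
        except ValueError:
            continue  # k is not coprime with P - 1, try again

def enc(m: int, k: int) -> int:
    return pow(m, k, P)

alice_enc, alice_dec = keygen()
bob_enc, bob_dec = keygen()

card = 42  # an encoded card
# Both players encrypt; the order does not matter, enabling oblivious shuffles.
double = enc(enc(card, alice_enc), bob_enc)
assert double == enc(enc(card, bob_enc), alice_enc)
# Removing both layers, in either order, recovers the card.
assert enc(enc(double, alice_dec), bob_dec) == card
```

Real protocols also have to handle shuffle proofs, information leakage, and collusion, which is precisely the kind of complexity one wants to import rather than rewrite in Solidity.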

DApps need proper abstractions. Running well-supported machines enables developers to tap into a rich, mature, and battle-tested ecosystem decades in the making, containing knowledge no single individual holds. It’s not that it’s hard to build remarkable DApps without content; it’s simply not possible. Developers are hamstrung before they write a single line of code, and the spectacular potential of blockchain is curbed at every attempt to make intricate ideas concrete. Scaling content breaks the fetters holding us back, unleashing the full range of possibilities of blockchains; the promises of blockchain cannot be fulfilled otherwise.

References

[1] Researcher Nick Szabo has used the term social scalability with a similar meaning, but in a different context. As such, I’ve decided against using the same term, as I feel it doesn’t capture the essence of what I’m trying to communicate.

[2] Colburn, Timothy & Gary Shute, 2007, "Abstraction in Computer Science", Minds and Machines, 17(2): 169–184. doi:10.1007/s11023-007-9061-7

[3] Compositionality is the principle that a system should be designed by composing smaller subsystems, and reasoning about the system should be done recursively on its structure.

[4] Colburn, Timothy R., 1999, "Software, Abstraction, and Ontology", The Monist, 82(1): 3–19. doi:10.5840/monist19998215

[5] Schindelhauer, Christian, 1998, "A Toolbox for Mental Card Games".


This is an adapted version of an article I wrote on Medium.