Towards more accurate and faster metering

Introduction

It's trivial to write programs that take forever (literally and figuratively) to execute. As well-behaved developers, we take care that our code finishes in a timely manner, and we take extra care when we know our code will be executed on others' computers. It would be rude otherwise.

On Ethereum, however, rudeness is no deterrent to bad behaviour. Since the network is like a shared computer (anyone can submit arbitrary code to it), a malicious developer could submit an application that takes forever to execute and clogs the entire chain. It's the public restroom problem, transferred to computers: the tragedy of the commons, vandalism and indifference all bundled together.

To solve this, we need some form of metering. It's a fool's errand to estimate the cost of executing a Turing-complete program a priori; it's undecidable whether an arbitrary program will even halt, and even a program that does halt may take extraordinarily long to do so. Consequently, Ethereum and its virtual machine (the EVM) implemented a mechanism called gas.

Preemptive Metering

Ethereum transactions cannot take forever to execute, and applications cannot be trusted to behave. As such, the network must cut off misbehaving transactions. Modern operating systems do something similar when scheduling processes on a shared CPU: looking at the wall clock, the OS suspends the running process after a time slice and schedules the next one, trying to distribute computational resources fairly.

However, using the wall clock to judge resources spent is not possible for the EVM: it wouldn't be deterministic. Instead, the EVM assigns a gas cost to each opcode beforehand, as a proxy for the computational resources spent executing that single instruction. Then, at run time, it uses these individual measurements to deterministically estimate how much computational resource a program is spending. The gas cost of each opcode is tuned to make this estimate as close as possible to reality; some costs even depend on the operands and on the state of the program.

Each Ethereum transaction begins with a limited allowance of compute units it can spend, and the virtual machine decrements this allowance as it runs. If the program exceeds its allowance, the VM preemptively halts the execution. There is significant overhead to all this: besides executing the program, the VM has the added responsibility of tracking gas at every instruction.
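To make this concrete, here's a minimal sketch of a metered interpreter loop (my own illustration, in Rust). The opcodes and their costs are made up; the real schedule lives in the Ethereum Yellow Paper and in the client implementations.

```rust
#[derive(Clone, Copy)]
enum Op {
    Add,   // cheap arithmetic
    Sload, // storage read, priced much higher
    Stop,  // terminate execution
}

/// Hypothetical flat gas costs per opcode (illustrative, not the real schedule).
fn cost(op: Op) -> u64 {
    match op {
        Op::Add => 3,
        Op::Sload => 2100,
        Op::Stop => 0,
    }
}

/// Run a program against a gas allowance, halting preemptively if it runs out.
fn execute(program: &[Op], mut gas: u64) -> Result<u64, &'static str> {
    for &op in program {
        let c = cost(op);
        if gas < c {
            return Err("out of gas"); // preemptive halt
        }
        gas -= c; // metering overhead paid on every single instruction
        if let Op::Stop = op {
            break;
        }
        // ... the instruction's actual semantics would run here ...
    }
    Ok(gas) // leftover gas is refunded to the sender
}

fn main() {
    let program = [Op::Add, Op::Sload, Op::Add, Op::Stop];
    println!("{:?}", execute(&program, 3_000)); // Ok(894)
    println!("{:?}", execute(&program, 100));   // Err("out of gas")
}
```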

This allowance is related to an important concept in Ethereum: blockspace. Since the underlying Ethereum nodes can only process so much at a time, limited by the capacity of Ethereum's reference CPU, there are only so many allowances (i.e. gas units) the nodes can safely process per second. Validators fill up a block with transactions until the sum of their allowances reaches the block's gas limit. This guarantees the network is never overtasked with computational effort.
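As a sketch, block filling could look like the greedy loop below. This is a simplification I'm assuming for illustration (real block builders also order by fees, track the gas actually used, and so on); the 30 million figure is in the ballpark of mainnet's block gas limit.

```rust
/// A transaction and the gas allowance it declares.
struct Tx {
    id: u32,
    gas_limit: u64,
}

/// Greedily include transactions until the next one would exceed the block's
/// total allowance.
fn fill_block(pool: &[Tx], block_gas_limit: u64) -> Vec<u32> {
    let mut used = 0;
    let mut included = Vec::new();
    for tx in pool {
        if used + tx.gas_limit > block_gas_limit {
            continue; // skip: would push the block past its gas limit
        }
        used += tx.gas_limit;
        included.push(tx.id);
    }
    included
}

fn main() {
    let pool = vec![
        Tx { id: 1, gas_limit: 21_000 },
        Tx { id: 2, gas_limit: 25_000_000 },
        Tx { id: 3, gas_limit: 8_000_000 },
    ];
    println!("{:?}", fill_block(&pool, 30_000_000)); // [1, 2]: tx 3 no longer fits
}
```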

Analysis

Now, all these descriptions make total sense, but they don't survive an encounter with the real world. Real CPUs are sophisticated beasts, and were not designed to run things in constant time. The same CPU instruction may take up to three orders of magnitude longer to execute depending on the context in which it runs. After all, optimizations often work by making the happy path faster and then dealing with the fallout of the unhappy path (e.g. cache misses, branch mispredictions, IO). Timing attacks like Spectre and Meltdown exploit exactly this behaviour.
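You can observe this variance first-hand with a small experiment (my own, not from any Ethereum client; the exact ratio depends on your machine). It issues the very same load instruction over the same array, first sequentially and then in a pseudo-random order; the random pass is typically many times slower because nearly every access misses the cache.

```rust
use std::time::Instant;

fn main() {
    let n: usize = 1 << 24; // 16M u64s = 128 MiB, far larger than any cache
    let data: Vec<u64> = (0..n as u64).collect();

    // Pass 1: sequential loads. The hardware prefetcher keeps the cache warm.
    let t = Instant::now();
    let mut sum = 0u64;
    for &x in &data {
        sum = sum.wrapping_add(x);
    }
    let sequential = t.elapsed();

    // Pass 2: the same load instruction, but in a pseudo-random order (a
    // simple LCG walk). Nearly every access is a cache miss.
    let t = Instant::now();
    let mut idx: u64 = 1;
    let mut sum2 = 0u64;
    for _ in 0..n {
        idx = idx.wrapping_mul(6364136223846793005).wrapping_add(1);
        sum2 = sum2.wrapping_add(data[(idx % n as u64) as usize]);
    }
    let random = t.elapsed();

    println!("sequential: {sequential:?}  random: {random:?}  ({sum} {sum2})");
}
```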

Back to Ethereum, this makes it a nightmare to estimate the cost of an EVM opcode. In practice, the gas cost has to be chosen pessimistically, at the worst possible cost: the unhappy path of the underlying CPU. Otherwise, an attacker could submit a malicious application that deliberately triggers these paths. If the metering system underestimates the cost of any opcode, one could exploit it and stall the network [1].

The consequence of pessimistic metering is that, by adjusting to the worst case, it overestimates the common case. This in turn decreases the network's capacity. Ethereum blocks that are full (representing the maximum computational effort the nodes can deal with) may actually demand very little effort. The metering says the network is running at balls-to-the-wall speed, while in reality the computer is quite idle.

Correctly pricing the common case would indeed reduce this idleness, but it would mean underpricing the worst case, which opens the possibility of stalling the network by overexerting it.
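A back-of-the-envelope calculation, with numbers invented purely for illustration, shows how much capacity this slack can eat:

```rust
fn main() {
    // Made-up figures: an opcode is priced at its worst observed cost,
    // but usually runs along the happy path.
    let worst_case_ns = 500.0; // unhappy path the gas cost must assume
    let common_case_ns = 5.0;  // what the opcode usually takes in practice
    // A block that is "full" at worst-case pricing only keeps the CPU busy
    // for the common-case fraction of the time budgeted for it:
    let utilization = common_case_ns / worst_case_ns;
    println!("effective CPU utilization: {:.0}%", utilization * 100.0); // 1%
}
```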

On a shared machine, this pessimistic metering is inevitable. A machine that is not shared, however, as in an application-specific rollup, opens up the design space for much better metering.

Optimistic Metering (for application-specific rollups)

Using our public restroom analogy, shifting the responsibility of cleaning a shared space to its users doesn't work because of the tragedy of the commons: I may behave perfectly, yet still be negatively affected by others' selfish behaviour. The same would happen on a shared machine if we shifted the responsibility of metering to its applications.

However, people are perfectly capable of keeping their own restrooms clean, just as the applications running on my machine are perfectly capable of not hogging the entire CPU. Back to the blockchain: as long as one misbehaving application cannot ruin it for the others, as in application-specific rollups, we can shift the responsibility of metering to each individual application. Who's going to vandalize their own restroom? And if they do, who cares?

This is optimistic metering: making applications in application-specific rollups responsible for their own metering. It works because, although the same CPU instruction may take vastly different times to execute, the same program run with the same inputs should finish in roughly the same time.

This shift means the machine no longer needs to keep track of gas, which makes it much faster. At this coarser granularity of metering (spans of code instead of individual instructions), developers can estimate gas accurately enough, and in a much more performant way.
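Here's a minimal sketch of what span-level metering could look like, assuming a hypothetical Meter API where the application charges a single, developer-estimated cost for a whole span of code (the names and numbers are mine):

```rust
struct Meter {
    gas: u64,
}

impl Meter {
    /// Charge a whole span up front. The span's cost was estimated offline by
    /// the application developer (e.g. by benchmarking), not by the VM.
    fn charge_span(&mut self, span_cost: u64) -> Result<(), &'static str> {
        self.gas = self.gas.checked_sub(span_cost).ok_or("out of gas")?;
        Ok(())
    }
}

fn transfer(meter: &mut Meter) -> Result<(), &'static str> {
    meter.charge_span(40_000)?; // one check for the whole function,
                                // instead of one check per instruction
    // ... hundreds of unmetered instructions run at native speed here ...
    Ok(())
}

fn main() {
    let mut meter = Meter { gas: 100_000 };
    println!("{:?}", transfer(&mut meter)); // Ok(())
    println!("gas left: {}", meter.gas);    // 60000
}
```

One checked subtraction per span replaces one per instruction, which is where the performance win comes from.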

But most importantly, honest applications will hit the happy CPU paths most of the time, since they're not actively trying to break stuff. Dishonest applications may try whatever they want, but the worst they can do is vandalize themselves; a well-behaved application won't be fouled by misbehaving ones.

There's no need to overestimate anything: the metering will be tight for that particular application, and full blocks will truly mean balls-to-the-wall speed. No more idleness, since there's no more metering slack.

Now, of course, applications may estimate their effort completely wrong. This would be a bug, analogous to an ordinary exploitable bug in a smart contract. In these cases, the application would cease to function and validators would drop it as needed. Such bugs wouldn't affect any other application, nor the validators.
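One possible enforcement story, and this is my assumption rather than anything specified above, is for validators to bound execution with a wall-clock budget and simply stop serving an application whose self-metering proves wrong:

```rust
use std::time::{Duration, Instant};

/// Run an application's state transition under a wall-clock budget.
/// In a real node this would be interruptible (e.g. run on a worker thread);
/// here we just check the elapsed time afterwards.
fn run_with_budget(app: impl Fn(), budget: Duration) -> bool {
    let start = Instant::now();
    app();
    start.elapsed() <= budget
}

fn main() {
    let within_budget = run_with_budget(
        || { /* the app's state transition would run here */ },
        Duration::from_millis(50),
    );
    if !within_budget {
        println!("dropping misbehaving app"); // no other app is affected
    }
}
```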

We cannot do this on the base layer, nor on shared rollups like EVM-equivalent L2s; it would be the tragedy of the commons. In application-specific rollups, however, we greatly scale the computational capacity of each application.

Not being pessimistic increases computational capacity by orders of magnitude. So in these cases, let's be optimists :)

References

[1] Perez, Daniel, and Benjamin Livshits. "Broken metre: Attacking resource metering in EVM." arXiv preprint arXiv:1909.07220 (2019).