The Execution Model and the EVM - Ethereum Yellow Paper Walkthrough (6/7)

Posted Feb 22, 2020

By Lucas Saldanha

Hi everyone! We’re now getting to one of my favourite parts of the Yellow Paper: the Execution Model. This is the section where the Yellow Paper describes the Ethereum Virtual Machine (EVM), the abstract machine that runs smart contract code in the network.

After reading this post, you should have a mental model of what the EVM is, what its inputs and internal state look like, and how it executes one instruction at a time. We’ll be covering section 9 of the Yellow Paper.

If you missed any of the previous posts, here they are:

(DISCLAIMER: this post is based on the Byzantium version of the Yellow Paper)

What is the EVM?

The EVM is the part of Ethereum that actually runs smart contract code. Every Ethereum client (Geth, Besu, Nethermind, and so on) has an EVM implementation somewhere inside. When you write a Solidity contract and deploy it, what ends up on chain is EVM bytecode: a sequence of instructions that any client can execute to reach the exact same result.

The Yellow Paper describes the EVM as:

“a quasi-Turing-complete machine; the quasi qualification comes from the fact that the computation is intrinsically bounded through a parameter, gas, which limits the total amount of computation done.”

In other words: ignore gas and the EVM could run any algorithm. But every step costs gas, and gas is finite, so every execution has a hard upper bound.

A few things to know about the EVM:

It is stack-based, like the JVM. There are no general-purpose registers.
The native word size is 256 bits (32 bytes). Most operations work on 256-bit values.
The stack is at most 1024 elements deep. Trying to push past this triggers an exceptional halt.
It is deterministic. Given the same inputs and the same world state, every implementation must produce the same output.
It is isolated. EVM code can’t access the filesystem, the network, the system clock, or anything else outside the protocol-defined inputs.

That last point is what makes EVM execution reproducible across thousands of independent nodes. There is nothing non-deterministic for them to disagree about.

The three views of state inside the EVM

When the EVM executes code, it can read and write to three different storage areas. Knowing how they differ matters a lot when you’re writing smart contracts.

Stack

The stack is where operations happen. Almost every opcode takes its operands from the top of the stack and pushes the result back on top. The stack holds up to 1024 256-bit words, and operations like DUP1..DUP16, SWAP1..SWAP16, and POP let you reorganise it.

The stack is volatile: it lives only for the duration of a single call. As soon as the call returns, it’s gone.
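To make the stack rules concrete, here is a minimal Python sketch of a 1024-deep stack of 256-bit words. The class and method names (Stack, StackError, dup, swap) are my own for illustration and are not taken from any client; a real implementation would also charge gas.

```python
STACK_LIMIT = 1024          # maximum stack depth
WORD_MASK = (1 << 256) - 1  # stack items are 256-bit words


class StackError(Exception):
    """Stands in for an exceptional halt caused by the stack."""


class Stack:
    def __init__(self):
        self._items = []

    def push(self, value):
        if len(self._items) >= STACK_LIMIT:
            raise StackError("stack overflow")
        # Values are truncated to 256 bits.
        self._items.append(value & WORD_MASK)

    def pop(self):
        if not self._items:
            raise StackError("stack underflow")
        return self._items.pop()

    def dup(self, n):
        """DUPn: copy the n-th item from the top (1-based) onto the top."""
        if n > len(self._items):
            raise StackError("stack underflow")
        self.push(self._items[-n])

    def swap(self, n):
        """SWAPn: exchange the top item with the item n positions below it."""
        if n + 1 > len(self._items):
            raise StackError("stack underflow")
        top, other = self._items[-1], self._items[-1 - n]
        self._items[-1], self._items[-1 - n] = other, top

    def __len__(self):
        return len(self._items)
```

Pushing 1 and then 2 and calling swap(1) leaves 1 on top, and pushing a value larger than 256 bits silently wraps it, which is the behaviour the arithmetic opcodes rely on.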

Memory

Memory is a linear, byte-addressable region that grows in 32-byte chunks. It’s also volatile, just like the stack, and exists only during a single call.

Memory has a quirk that catches people out: it is zero-initialised and gas-priced quadratically as it grows. The Yellow Paper specifies a memory cost formula:

$C_{mem}(a) \equiv G_{memory} \cdot a + \lfloor a^2 / 512 \rfloor$

Where a is the number of 32-byte words of memory the contract has touched. The square term means that doubling memory usage more than doubles the gas cost. In practice this stops contracts from allocating gigabytes of memory cheaply.

You read and write memory with MLOAD, MSTORE and MSTORE8. Higher-level languages like Solidity use memory for things like function arguments and return values.
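As a rough illustration of how the memory formula behaves, the sketch below evaluates it in Python. It assumes the Yellow Paper fee schedule value of 3 gas for G_memory; the helper names (words_for, memory_cost, expansion_cost) are mine. Only the difference between the old and the new total is charged when memory grows.

```python
G_MEMORY = 3  # gas per word, the G_memory fee from the Yellow Paper's fee schedule


def words_for(byte_count):
    """Round a byte count up to whole 32-byte words."""
    return (byte_count + 31) // 32


def memory_cost(words):
    """C_mem(a) = G_memory * a + floor(a^2 / 512)."""
    return G_MEMORY * words + words * words // 512


def expansion_cost(old_words, new_words):
    """Gas charged when an instruction grows memory from old_words to new_words."""
    if new_words <= old_words:
        return 0  # touching already-active memory costs nothing extra
    return memory_cost(new_words) - memory_cost(old_words)


print(memory_cost(words_for(32 * 1024)))  # 32 KiB -> 5120
print(memory_cost(words_for(64 * 1024)))  # 64 KiB -> 14336
```

Going from 32 KiB to 64 KiB takes the total from 5120 to 14336 gas, so doubling the memory costs well over double.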

Storage

Storage is the only one of the three that survives across calls. It’s a persistent mapping from 256-bit keys to 256-bit values, and it lives inside the account state of the contract being executed (remember the storageRoot we discussed in part 2?).

Storage is expensive. The SSTORE opcode keeps coming up in gas-repricing proposals because writing to disk in a state trie is genuinely costly for every node on the network. Reading is also expensive compared to memory or stack access. The rule of thumb for a contract author is simple: if you don’t have to put it in storage, don’t.
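A toy model can help here. The sketch below treats one contract's storage as a Python dictionary; this is an assumption made purely for illustration, since real clients back storage with the account's storage trie and charge gas for every access. The Storage class and its method names are hypothetical.

```python
WORD_MASK = (1 << 256) - 1


class Storage:
    """Toy model of one contract's storage: 256-bit keys to 256-bit values."""

    def __init__(self):
        self._slots = {}

    def sload(self, key):
        # Slots that were never written read as zero.
        return self._slots.get(key & WORD_MASK, 0)

    def sstore(self, key, value):
        key &= WORD_MASK
        value &= WORD_MASK
        if value == 0:
            # Writing zero clears the slot, so it is dropped from the mapping.
            self._slots.pop(key, None)
        else:
            self._slots[key] = value

    def used_slots(self):
        return len(self._slots)
```

Unlike the stack and memory, an object like this would have to outlive the call: the next transaction that reaches the same contract sees whatever the previous one stored.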

The execution environment (I)

When the EVM starts running, it doesn’t appear out of nowhere. It receives an execution environment: a set of read-only inputs that describe the call. The Yellow Paper calls this $I$, and it includes:

$I_a$ — the address of the account whose code is being executed. This is what ADDRESS returns.
$I_o$ — the original transaction sender (tx.origin).
$I_p$ — the gas price of the original transaction.
$I_d$ — the input data (CALLDATA).
$I_s$ — the sender of this particular call (msg.sender). Note this can differ from $I_o$ when contracts call each other.
$I_v$ — the value sent with the call (msg.value).
$I_b$ — the bytecode being executed.
$I_H$ — the block header (so opcodes like NUMBER, TIMESTAMP, COINBASE, DIFFICULTY work).
$I_e$ — the call depth.
$I_w$ — the permission to modify state (false inside a STATICCALL).

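Most of those values map to an opcode that contracts can use to read them. If you look at the EVM instruction table in Appendix H, you’ll see the “environmental information” group: ADDRESS, ORIGIN, CALLER, CALLVALUE, CALLDATALOAD, CALLDATASIZE, GASPRICE, CODESIZE, and so on. The exceptions are the call depth $I_e$ and the state-modification permission $I_w$, which have no opcode of their own and are only used by the EVM internally.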
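One way to picture $I$ is as an immutable record. The sketch below is a simplification under my own assumptions: the field names are mine, addresses are plain strings, and the block header is a dict. The hypothetical for_nested_call helper shows why msg.sender and tx.origin can differ once contracts call each other with a plain CALL.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ExecutionEnvironment:
    address: str            # I_a, what ADDRESS returns
    origin: str             # I_o, tx.origin
    gas_price: int          # I_p
    data: bytes             # I_d, the call data
    sender: str             # I_s, msg.sender
    value: int              # I_v, msg.value
    code: bytes             # I_b
    block_header: dict      # I_H, simplified to a plain dict here
    depth: int              # I_e
    can_modify_state: bool  # I_w


def for_nested_call(parent, to, code, data, value):
    """Environment seen by a contract that `parent` invokes with a plain CALL."""
    return replace(
        parent,
        address=to,
        sender=parent.address,  # the calling contract becomes msg.sender
        data=data,
        value=value,
        code=code,
        depth=parent.depth + 1,  # origin and gas_price are carried over as-is
    )
```

If a user calls contract A and A calls contract B, then inside B the sender is A while the origin is still the user. The frozen dataclass also mirrors the fact that code cannot change its own environment while it runs.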

Machine state (μ)

While $I$ never changes during execution, the EVM keeps a separate piece of state that does change with every instruction. The Yellow Paper calls this the machine state $\mu$ and it contains:

$\mu_g$ — the gas remaining.
$\mu_{pc}$ — the program counter (the index into the bytecode of the next instruction).
$\mu_m$ — the contents of memory.
$\mu_i$ — the number of active words in memory (used to charge memory expansion).
$\mu_s$ — the stack.

Every instruction the EVM executes is a transformation of $\mu$ (and possibly of the world state $\sigma$ too, for opcodes that write to storage or move value).
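The machine state can be sketched as a small mutable record. The version below is illustrative only: the field and function names are mine, and the mstore helper deliberately skips gas charging so that it only shows how writing to memory moves the active word count $\mu_i$.

```python
from dataclasses import dataclass, field


@dataclass
class MachineState:
    gas: int                                              # mu_g
    pc: int = 0                                           # mu_pc
    memory: bytearray = field(default_factory=bytearray)  # mu_m
    active_words: int = 0                                 # mu_i
    stack: list = field(default_factory=list)             # mu_s


def mstore(mu, offset, value):
    """Write a 32-byte word at `offset`, growing memory in 32-byte steps."""
    end = offset + 32
    # mu_i only ever grows: it tracks the highest word touched so far.
    mu.active_words = max(mu.active_words, (end + 31) // 32)
    missing = mu.active_words * 32 - len(mu.memory)
    if missing > 0:
        mu.memory.extend(bytes(missing))  # new memory is zero-initialised
    mu.memory[offset:end] = value.to_bytes(32, "big")
```

Writing one word at offset 0 makes one word active. Writing at offset 40 touches bytes 40 to 71, so three words become active even though the write does not start on a word boundary.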

The accrued sub-state (A)

There is a third piece of state worth knowing about: the sub-state $A$. Think of it as a side ledger that accumulates information across the entire transaction execution and is finalised at the end. It contains:

$A_s$ — the set of accounts marked for self-destruction (via SELFDESTRUCT).
$A_l$ — the log entries emitted (LOG0 to LOG4).
$A_t$ — the touched accounts set (used to clear “empty” accounts after the transaction).
$A_r$ — the refund balance, which accumulates gas refunds both for storage writes that clear a slot and for self-destructed accounts.

The sub-state is the answer to “how do logs end up in the transaction receipt?” and “when does a self-destructed account actually disappear?”. These changes aren’t applied as the contract runs. They get queued in $A$ and applied when the transaction ends.
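Here is a hedged sketch of the sub-state as a plain record, with a merge helper for folding an inner call's sub-state into its caller's when the inner call succeeds. The names are hypothetical, and the merge is my simplified reading of how the Yellow Paper combines sub-states; when an inner call fails, its sub-state is simply dropped.

```python
from dataclasses import dataclass, field


@dataclass
class SubState:
    self_destructs: set = field(default_factory=set)  # A_s
    logs: list = field(default_factory=list)          # A_l
    touched: set = field(default_factory=set)         # A_t
    refund: int = 0                                   # A_r


def merge(parent, child):
    """Fold the sub-state of a successful inner call into its caller's."""
    return SubState(
        self_destructs=parent.self_destructs | child.self_destructs,
        logs=parent.logs + child.logs,  # log order is preserved
        touched=parent.touched | child.touched,
        refund=parent.refund + child.refund,
    )
```

Because merge returns a new object and leaves its arguments alone, a caller can keep its own sub-state untouched when the inner call reverts or halts exceptionally.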

The execution cycle

With all the pieces in place, the EVM cycle is surprisingly simple. The Yellow Paper defines an iterator function $X$ that repeatedly applies a step function until one of three things happens:

The current operation triggers an exceptional halt (out of gas, invalid opcode, stack underflow, stack overflow, invalid jump destination, attempt to modify state inside a static call). The state is rolled back and all remaining gas is consumed.
The execution reaches a normal halt via STOP, RETURN, or SELFDESTRUCT. Any remaining gas is returned to the caller. Sub-state and any state changes are kept.
A REVERT is hit. State is rolled back like an exceptional halt, but unlike one, the remaining gas is returned to the caller and return data is preserved. This is the EVM equivalent of throwing an exception with a message.
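The three outcomes differ only in what happens to gas, state and return data, which the small table-like function below summarises. The Outcome enum and the settle function are names I made up for this sketch.

```python
from enum import Enum


class Outcome(Enum):
    NORMAL = "normal halt"
    REVERT = "revert"
    EXCEPTIONAL = "exceptional halt"


def settle(outcome, gas_remaining):
    """Return (gas_back_to_caller, state_changes_kept, return_data_available)."""
    if outcome is Outcome.NORMAL:
        return gas_remaining, True, True
    if outcome is Outcome.REVERT:
        # State is rolled back, but gas and return data survive.
        return gas_remaining, False, True
    # Exceptional halt: everything is lost, including the remaining gas.
    return 0, False, False
```

With 500 gas left, a normal halt and a REVERT both hand 500 back to the caller, while an exceptional halt hands back nothing.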

Each step does roughly the following:

Read the opcode at $\mu_{pc}$.
Compute its gas cost (some opcodes have dynamic costs depending on inputs, e.g. SSTORE, memory expansion, calls).
Check the stack has enough items and won’t overflow.
Deduct the gas. If there isn’t enough, exceptional halt.
Execute the opcode: pop operands from the stack, do the operation, push results back, optionally update memory, storage, sub-state.
Advance $\mu_{pc}$ (by 1 for most opcodes, or by 1 plus the number of pushed bytes for PUSHn; JUMP, and JUMPI when its condition is non-zero, set it to the jump destination instead).
Loop.

Section 9.4 of the Yellow Paper formalises all of the above and points to Appendix H, which has the full opcode table with every gas cost and stack effect.
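The steps above can be sketched as a tiny interpreter. This is a toy under stated assumptions, not how any client is written: it supports only five opcodes (STOP, ADD, MUL, POP, PUSH1) with their static gas costs from the fee schedule, has no memory, storage or sub-state, and the names run and ExceptionalHalt are mine.

```python
WORD = 1 << 256
STACK_LIMIT = 1024

# opcode -> (name, gas cost, items popped, items pushed); a small subset only
OPCODES = {
    0x00: ("STOP", 0, 0, 0),
    0x01: ("ADD", 3, 2, 1),
    0x02: ("MUL", 5, 2, 1),
    0x50: ("POP", 2, 1, 0),
    0x60: ("PUSH1", 3, 0, 1),
}


class ExceptionalHalt(Exception):
    pass


def run(code, gas):
    """Execute `code` and return (stack, gas_remaining) on a normal halt."""
    pc, stack = 0, []
    while True:
        # Read the opcode; running off the end of the code behaves like STOP.
        op = code[pc] if pc < len(code) else 0x00
        if op not in OPCODES:
            raise ExceptionalHalt("invalid opcode")
        name, cost, pops, pushes = OPCODES[op]
        # Check the stack has enough items and will not overflow.
        if len(stack) < pops:
            raise ExceptionalHalt("stack underflow")
        if len(stack) - pops + pushes > STACK_LIMIT:
            raise ExceptionalHalt("stack overflow")
        # Deduct the gas, halting if there is not enough.
        if gas < cost:
            raise ExceptionalHalt("out of gas")
        gas -= cost
        # Execute the opcode.
        if name == "STOP":
            return stack, gas
        if name == "ADD":
            stack.append((stack.pop() + stack.pop()) % WORD)
        elif name == "MUL":
            stack.append((stack.pop() * stack.pop()) % WORD)
        elif name == "POP":
            stack.pop()
        elif name == "PUSH1":
            # The pushed byte follows the opcode; bytes past the end read as 0.
            stack.append(code[pc + 1] if pc + 1 < len(code) else 0)
            pc += 1  # skip the pushed byte
        # Advance the program counter and loop.
        pc += 1


# PUSH1 2, PUSH1 3, ADD, PUSH1 4, MUL, STOP  computes (2 + 3) * 4
print(run(bytes.fromhex("600260030160040200"), 100))  # ([20], 83)
```

The program spends 3 + 3 + 3 + 3 + 5 = 17 gas, so starting with 100 leaves 83. Starting it with only 10 gas raises ExceptionalHalt at the third PUSH1 instead.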

Opcode groups

I won’t walk through all 100+ opcodes (Appendix H does that better than I ever could), but it’s useful to know how they cluster:

Stop and arithmetic: STOP, ADD, SUB, MUL, DIV, MOD, EXP, ADDMOD, MULMOD, etc. All work on 256-bit unsigned integers (with signed variants for some).
Comparison and bitwise: LT, GT, EQ, ISZERO, AND, OR, XOR, NOT, BYTE.
SHA3: KECCAK256 (named SHA3 in the Yellow Paper for historical reasons). This is what every “hash a value” call ends up using.
Environmental information: the opcodes that read $I$.
Block information: BLOCKHASH, COINBASE, TIMESTAMP, NUMBER, DIFFICULTY, GASLIMIT.
Stack, memory, storage and flow: POP, MLOAD, MSTORE, SLOAD, SSTORE, JUMP, JUMPI, PC, MSIZE, GAS, JUMPDEST.
Push, dup, swap: PUSH1..PUSH32, DUP1..DUP16, SWAP1..SWAP16.
Logging: LOG0..LOG4 (the suffix is the number of topics attached to the log entry).
System: CREATE, CALL, CALLCODE, RETURN, DELEGATECALL, STATICCALL, REVERT, SELFDESTRUCT, INVALID.
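In Appendix H these groups line up with ranges of the opcode byte, so the high nibble is enough to tell which group a byte falls in. The lookup below is a small sketch of that idea with names of my own choosing; it classifies by range only and does not check that a given byte is actually an assigned opcode.

```python
# High nibble of the opcode byte -> group name
GROUPS = {
    0x0: "Stop and arithmetic",
    0x1: "Comparison and bitwise",
    0x2: "SHA3",
    0x3: "Environmental information",
    0x4: "Block information",
    0x5: "Stack, memory, storage and flow",
    0x6: "Push",
    0x7: "Push",
    0x8: "Dup",
    0x9: "Swap",
    0xA: "Logging",
    0xF: "System",
}


def group_of(opcode):
    """Name the opcode group for a byte value, based on its range alone."""
    return GROUPS.get(opcode >> 4, "unassigned")


print(group_of(0x55))  # SSTORE -> Stack, memory, storage and flow
print(group_of(0xF1))  # CALL   -> System
```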

Jumps and JUMPDEST

One detail worth flagging is jumps. The EVM uses JUMP and JUMPI (conditional jump) to implement loops and function calls inside compiled bytecode. The destination of a jump must be a JUMPDEST opcode, otherwise the jump is invalid and the EVM halts exceptionally. A byte that merely looks like JUMPDEST but sits inside the data of a PUSH instruction doesn’t count. This restriction makes the set of legal jump targets statically enumerable: you can scan the code, skipping over PUSH data, find every JUMPDEST, and know up front which positions a jump could legally land at (even if which one is taken at runtime depends on stack values).
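The scan for legal jump targets is short enough to sketch in full. The function name is mine; the opcode values (0x5b for JUMPDEST, 0x60 to 0x7f for PUSH1 to PUSH32) come from the instruction table. The important detail is that the bytes following a PUSH are data, so a 0x5b among them is not a valid destination.

```python
JUMPDEST = 0x5B
PUSH1, PUSH32 = 0x60, 0x7F


def valid_jump_destinations(code):
    """Return the set of positions in `code` that a JUMP may legally target."""
    dests = set()
    pc = 0
    while pc < len(code):
        op = code[pc]
        if op == JUMPDEST:
            dests.add(pc)
        if PUSH1 <= op <= PUSH32:
            # PUSHn is followed by n bytes of data; skip them.
            pc += op - PUSH1 + 1
        pc += 1
    return dests


# PUSH1 0x5b, JUMPDEST: only position 2 is a real JUMPDEST
print(valid_jump_destinations(bytes.fromhex("605b5b")))  # {2}
```

Clients typically compute this set once per contract and reuse it, since it depends only on the code and not on anything that happens at runtime.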

Putting it all together

Zooming out, the EVM is a piece of machinery that takes:

The world state $\sigma$,
An execution environment $I$,
An initial machine state $\mu_0$ (no memory, empty stack, gas equal to whatever the caller forwarded),
An initial sub-state $A_0$ (empty),

and produces:

A new world state $\sigma'$ (possibly equal to $\sigma$ if nothing changed),
A final machine state $\mu'$ (mostly useful for reading the remaining gas),
A final sub-state $A'$,
A status (normal halt, exceptional halt, revert),
Return data (for normal halts and reverts).

That’s the whole EVM in one paragraph. Everything else is bookkeeping around it.

Conclusion

In this post, we covered section 9 of the Yellow Paper: the Execution Model. We saw that the EVM is a stack-based, deterministic, quasi-Turing-complete virtual machine. It has three storage surfaces visible to the bytecode (stack, memory, storage) and three pieces of state the protocol tracks during execution (machine state $\mu$, sub-state $A$, and the world state $\sigma$ underneath). We went through the inputs the EVM receives, the cycle it runs, and the groups of opcodes it understands.

If this post left you wanting more detail, my honest recommendation is to go read a real EVM implementation. The Besu EVM module is a good entry point if you read Java. Pyethereum and Geth are great too if you prefer Python or Go. Reading the code side by side with section 9 of the Yellow Paper was the single biggest mental-model upgrade I had while learning Ethereum.

In the next and final post of the series, we’ll look at how all the pieces come together at the chain level: how a block becomes part of the canonical chain (section 10) and how block finalisation works (section 11). That includes ommer (uncle) handling, mining rewards, the proof-of-work problem and the difficulty adjustment.

See you in the next one!

This post is licensed under CC BY 4.0 by the author.