Coverage-guided fuzzing with revm

I have been building raptor, a fast parallel coverage-guided stateful fuzzer for Solidity. It is my private security tool. I use it to grind smart contract contests and to check my own code.

Coverage-guided fuzzing is the idea that the fuzzer should only keep inputs that cause the program to do something it has never done before. In C++ or Rust code, the compiler inserts counters at every branch. In Solidity, there is no compiler instrumentation. The EVM bytecode is already compiled. We have to observe coverage at runtime by hooking the interpreter.

This post is about the inspector I built to do that. It does not cover the full fuzzing lifecycle or the corpus. It only covers the five signals that tell me whether a sequence of calls is exploring new interesting execution path.

Signal 1: New instruction hit

The simplest signal is whether a bytecode instruction was executed for the first time. Solidity compiles to a flat list of opcodes. The Program Counter (PC) is the index of the current opcode. If a sequence reaches a PC that no previous sequence ever touched, that is new behavior.

In revm, the Inspector trait has a step hook that fires before every opcode. I also use initialize_interp to discover the contract bytecode hash so I can allocate per-contract coverage buffers.

Rust
use revm::{
    inspector::Inspector,
    interpreter::{Interpreter, interpreter::EthInterpreter},
};
use alloy_primitives::B256;
use std::collections::HashMap;

#[derive(Debug, Default)]
struct LocalContractCoverage {
    edges: Vec<u8>,
}

#[derive(Debug, Default)]
struct LocalCoverage {
    contracts: HashMap<B256, LocalContractCoverage>,
}

#[derive(Debug, Default)]
struct CoverageInspector {
    local: LocalCoverage,
    current_contract: Option<B256>,
}

impl<CTX> Inspector<CTX, EthInterpreter> for CoverageInspector {
    fn initialize_interp(
        &mut self,
        interp: &mut Interpreter<EthInterpreter>,
        _context: &mut CTX,
    ) {
        let hash = interp.bytecode.hash_slow();
        if !hash.is_zero() && !interp.bytecode.is_empty() {
            let id = B256::from(hash);
            self.current_contract = Some(id);
            self.local
                .contracts
                .entry(id)
                .or_insert_with(|| LocalContractCoverage {
                    edges: vec![0u8; interp.bytecode.len()],
                });
        }
    }

    fn step(&mut self, interp: &mut Interpreter<EthInterpreter>, _context: &mut CTX) {
        let pc = interp.bytecode.pc();
        let Some(id) = self.current_contract else { return };
        let Some(cov) = self.local.contracts.get_mut(&id) else { return };
        if pc < cov.edges.len() {
            cov.edges[pc] = cov.edges[pc].saturating_add(1);
        }
    }
}

Every step increments edges[pc]. The value is a saturated u8 so it never wraps to zero. I reset the LocalCoverage buffer for every sequence, so the counter reflects how many times that edge was hit within a single execution.

Signal 2: New branch direction

A JUMPI opcode is the EVM’s if statement. The condition is a stack value. If it is zero, execution falls through to the next instruction. If it is non-zero, execution jumps to a destination PC. Two sequences that reach the same JUMPI but take different directions are exploring different behavior.

I track this by recording the source PC and the destination PC as a pair. I use the same encoding as Medusa: rotate_left(src, 32) ^ dst. This packs the pair into a single u64 that I can use as a HashMap key.

Rust
use revm::bytecode::opcode::{JUMP, JUMPI};

fn edge_marker(src_pc: usize, dst_pc: usize) -> u64 {
    (src_pc as u64).rotate_left(32) ^ (dst_pc as u64)
}

impl<CTX> Inspector<CTX, EthInterpreter> for CoverageInspector {
    fn step(&mut self, interp: &mut Interpreter<EthInterpreter>, _context: &mut CTX) {
        let pc = interp.bytecode.pc();
        // ... PC hit tracking from above ...

        let opcode = interp.bytecode.opcode();
        if opcode == JUMP || opcode == JUMPI {
            let stack = interp.stack.data();
            let dest = stack.last().copied()
                .and_then(|v| v.try_into().ok());
            let taken = if opcode == JUMPI {
                match stack.get(stack.len().saturating_sub(2)) {
                    Some(cond) => !cond.is_zero(),
                    None => false,
                }
            } else {
                true
            };
            if taken && let Some(dst) = dest {
                let marker = edge_marker(pc, dst);
                let Some(id) = self.current_contract else { return };
                let Some(cov) = self.local.contracts.get_mut(&id) else { return };
                cov.jump_edges
                    .entry(marker)
                    .and_modify(|c| *c = c.saturating_add(1))
                    .or_insert(1);
            }
        }
    }
}

I only record the marker when the jump is actually taken. For JUMPI, I peek at the condition from the stack. For JUMP, the condition is always true. The jump_edges HashMap stores a saturated u8 counter for each unique edge.

Signal 3: New call depth

The same instruction can behave differently at different call depths. A reentrancy guard is the classic example: at depth 1 it succeeds, at depth 2 it reverts. I track depth as a 64-bit bitset per PC. Each bit represents whether the instruction was hit at that depth.

I increment current_call_depth on every call and create, and decrement it on call_end and create_end. I also push and pop the contract stack so that nested calls are attributed to the correct contract.

Rust
#[derive(Debug, Default)]
struct CoverageInspector {
    local: LocalCoverage,
    current_call_depth: u64,
    current_contract: Option<B256>,
    contract_stack: Vec<Option<B256>>,
}

impl<CTX> Inspector<CTX, EthInterpreter> for CoverageInspector {
    fn step(&mut self, interp: &mut Interpreter<EthInterpreter>, _context: &mut CTX) {
        let pc = interp.bytecode.pc();
        let Some(id) = self.current_contract else { return };
        let Some(cov) = self.local.contracts.get_mut(&id) else { return };

        if pc < cov.edges.len() {
            cov.edges[pc] = cov.edges[pc].saturating_add(1);
        }
        if pc < cov.depths.len() {
            let depth = self.current_call_depth.min(63);
            cov.depths[pc] |= 1u64 << depth;
        }
    }

    fn call(
        &mut self,
        _context: &mut CTX,
        _inputs: &mut revm::interpreter::CallInputs,
    ) -> Option<revm::interpreter::CallOutcome> {
        self.contract_stack.push(self.current_contract);
        self.current_call_depth += 1;
        None
    }

    fn call_end(
        &mut self,
        _context: &mut CTX,
        _inputs: &revm::interpreter::CallInputs,
        _outcome: &mut revm::interpreter::CallOutcome,
    ) {
        self.current_call_depth = self.current_call_depth.saturating_sub(1);
        self.current_contract = self.contract_stack.pop().flatten();
    }

    fn create(
        &mut self,
        _context: &mut CTX,
        _inputs: &mut revm::interpreter::CreateInputs,
    ) -> Option<revm::interpreter::CreateOutcome> {
        self.contract_stack.push(self.current_contract);
        self.current_call_depth += 1;
        None
    }

    fn create_end(
        &mut self,
        _context: &mut CTX,
        _inputs: &revm::interpreter::CreateInputs,
        _outcome: &mut revm::interpreter::CreateOutcome,
    ) {
        self.current_call_depth = self.current_call_depth.saturating_sub(1);
        self.current_contract = self.contract_stack.pop().flatten();
    }
}

The contract_stack is important because revm calls initialize_interp for every contract. When contract A calls contract B, the interpreter switches to B’s bytecode. I need to remember that when B finishes, I should resume A’s coverage tracking. Pushing the old contract ID onto a stack is the simplest way.

Signal 4: New revert path

A revert at a specific PC is a different path than a success at the same PC. I track this by recording the last PC before a call frame ends. If the frame reverts, I set a bit in the revert bitset for that PC.

I also track the last PC on every step so that I know where the revert happened.

Rust
#[derive(Debug, Default)]
struct CoverageInspector {
    local: LocalCoverage,
    current_call_depth: u64,
    current_contract: Option<B256>,
    contract_stack: Vec<Option<B256>>,
    last_pc: usize,
}

impl<CTX> Inspector<CTX, EthInterpreter> for CoverageInspector {
    fn step(&mut self, interp: &mut Interpreter<EthInterpreter>, _context: &mut CTX) {
        let pc = interp.bytecode.pc();
        self.last_pc = pc;
        // ... depth and edge tracking ...
    }

    fn call_end(
        &mut self,
        _context: &mut CTX,
        _inputs: &revm::interpreter::CallInputs,
        outcome: &mut revm::interpreter::CallOutcome,
    ) {
        if outcome.result.result.is_revert() {
            self.record_revert();
        }
        self.current_call_depth = self.current_call_depth.saturating_sub(1);
        self.current_contract = self.contract_stack.pop().flatten();
    }

    fn create_end(
        &mut self,
        _context: &mut CTX,
        _inputs: &revm::interpreter::CreateInputs,
        outcome: &mut revm::interpreter::CreateOutcome,
    ) {
        if outcome.result.result.is_revert() {
            self.record_revert();
        }
        self.current_call_depth = self.current_call_depth.saturating_sub(1);
        self.current_contract = self.contract_stack.pop().flatten();
    }
}

impl CoverageInspector {
    fn record_revert(&mut self) {
        let Some(id) = self.current_contract else { return };
        let Some(cov) = self.local.contracts.get_mut(&id) else { return };
        let word = self.last_pc / 64;
        let bit = self.last_pc % 64;
        if word < cov.reverts.len() {
            cov.reverts[word] |= 1u64 << bit;
        }
    }
}

The revert bitset is packed: one u64 covers 64 PCs. A contract with 256 bytes of bytecode needs only 4 u64 words. This is compact and fast to merge.

Signal 5: Deeper execution

The first four signals are binary. An instruction is either hit or not. A branch is either taken or not. But what about a loop? A sequence that triggers 3 iterations of a loop is genuinely different from one that triggers 7.

I use AFL-style hit count bucketing to capture this. Instead of treating every counter change as new coverage, I bucket the raw hit count into coarse classes:

Raw countBucket
00
11
22
34
4-78
8-1516
16-3132
32-12764
128-255128

The inspector collects raw counters. The bucketing happens later when I merge the local coverage into the global map.

Rust
fn afl_bucket(raw: u8) -> u8 {
    match raw {
        0 => 0,
        1 => 1,
        2 => 2,
        3 => 4,
        4..=7 => 8,
        8..=15 => 16,
        16..=31 => 32,
        32..=127 => 64,
        _ => 128,
    }
}

fn merge_coverage(global: &mut [u8], local: &[u8]) -> bool {
    let mut new_coverage = false;
    for (curr, hist) in std::iter::zip(local, global) {
        if *curr == 0 {
            continue;
        }
        let bucket = afl_bucket(*curr);
        if *hist < bucket {
            *hist = bucket;
            new_coverage = true;
        }
    }
    new_coverage
}

This is the same algorithm that Foundry uses in its invariant tests. The idea is that 5 hits and 6 hits are the same coverage, but 7 hits and 8 hits cross a threshold and count as novel. For smart contracts, this matters most when a target function iterates over an array or a dynamic data structure.

What I do not track

The inspector does not distinguish between different transaction outcomes beyond “reverted or not”. It does not care whether a call returned true or false, whether it stopped with STOP or RETURN, or whether it ran out of gas. This is sufficient for raptor because invariants are checked via assert panic, not via return value.

Echidna tracks 17 distinct transaction outcomes per PC. Medusa tracks return vs. revert as separate markers. I chose to keep it simple because the other signals already give me enough guidance.

Why this works

The inspector is fast because it does almost no work per opcode. The hot path is:

  1. Increment edges[pc] (array index + saturating add).
  2. Set a bit in depths[pc] (bitwise OR with a constant).
  3. For JUMP/JUMPI, compute a hash and increment a HashMap entry.

All of this happens inside revm’s step hook, which is already called for every opcode. There is no additional interpreter overhead. The inspector does not allocate on the hot path because the per-contract buffers are pre-allocated in initialize_interp.

The merge step is called once per sequence, not once per opcode. A typical sequence is 1-32 calls, so the merge cost is negligible compared to the EVM execution time.

That is the whole inspector. Five signals, one Inspector implementation, and a merge function. The rest of the fuzzer (corpus, mutation, parallel execution) is built on top of this.