LibAFL #
The
LibAFL fuzzer implements features from
AFL-based fuzzers like AFL++. Similarly to AFL++, LibAFL provides better fuzzing performance and more advanced features over libFuzzer. However, with LibAFL, all functionality is provided in a modular and customizable way—in fact, LibAFL is a library that can be used to implement custom fuzzers. Because LibAFL is a library, there is no single-line command to install LibAFL like there is with libFuzzer (apt install clang
) and AFL++ (apt install afl++
). However, LibAFL can be used as a drop-in replacement for libFuzzer. Another way to use LibAFL is to create a Rust project and then use the library to create a fuzzer. This section about LibAFL covers both approaches: 1) libFuzzer drop-in and 2) LibAFL as Rust library.
Setup #
First, install LibAFL’s two major dependencies: Clang/LLVM and Rust. If your system has a version of Clang greater than 14, you may install it directly from the distribution.
apt install clang
If you want to use a specific version of Clang, then install Clang through apt.llvm.org:
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh 15
Next, we need to tell Rust to use this version of Clang. Rust depends on the linker cc
being available on the path. We can set the RUSTFLAGS
variable to change the version of the linker binary:
export RUSTFLAGS="-C linker=/usr/bin/clang-15"
Also, the
cc crate depends on the cc
program. We can change the compiler used in that crate by setting the CC
and CXX
variables:
export CC="clang-15"
export CXX="clang++-15"
Please note that you may want to unset these environment variables after compiling the LibAFL Rust project, as they may interfere with compiling your SUT.
The LLVM version requirement can change without warning. At the time of writing, you should use LLVM 15 through 18. Refer to the README if in doubt or if you experience problems.
Next, install Rust using rustup:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
You may need to install additional system dependencies:
apt install libssl-dev pkg-config
LibAFL as libFuzzer drop-in replacement #
The libFuzzer compatibility layer depends on a nightly Rust version. Install the latest nightly version, including LLVM tools:
rustup toolchain install nightly --component llvm-tools
Next, clone LibAFL’s source code and build the libFuzzer drop-in replacement. Note that we are using the main branch here, not a specific version. While there are releases of LibAFL, the fuzzer is still in active development, so it is generally advisable to use the main branch.
git clone https://github.com/AFLplusplus/LibAFL
cd LibAFL/libafl_libfuzzer/libafl_libfuzzer_runtime
./build.sh
We now have a static binary called libFuzzer.a
in the current working directory.
Compile a fuzz test #
Similarly to the
libFuzzer section, we can now compile our harness.cc
and main.cc
from the
introduction. Instead of using the default libFuzzer with the flag -fsanitize=fuzzer
, we are providing our own libFuzzer runtime. Make sure that the harness.cc
and main.cc
are in your working directory.
clang++ -DNO_MAIN -g -O2 -fsanitize=fuzzer-no-link libFuzzer.a harness.cc main.cc -o fuzz
Note that you will need to recompile if you are changing the SUT or harness.
The flag -fsanitize=fuzzer-no-link
enables several compiler options, but does not link in the fuzzer runtime as it would if you use -fsanitize=fuzzer
. For more details about -DNO_MAIN
, -g
, or -O2
, refer to the
libFuzzer section.
Usage #
Start fuzzing by running ./fuzz <corpus_dir>
, where the corpus directory can be an empty directory. Ideally, you provide seed test cases, like small PNG images if you are fuzzing a PNG parser.
Just like classical libFuzzer, the drop-in replacement also stops fuzzing after finding a first crash. This can be changed by using the -fork=1
and -ignore_crashes=1
flags, which, compared to the default libFuzzer implementation, are not experimental in LibAFL. The -fork=1
flag makes LibAFL fork the process and restart child processes using
SimpleRestartingEventManager
. Note that the flags -fork
and -jobs
are synonyms in LibAFL. If using a value larger than one for these flags, then the more capable
LlmpRestartingEventManager
, which supports multi-processing, is used. The following command is recommended for long-running fuzzing campaigns:
./fuzz -fork=1 -ignore_crashes=1 <corpus_dir>
Because the example is relatively simple, an empty corpus directory is sufficient:
mkdir corpus/
It is also possible to omit the corpus directory. In that case, only crashes are persisted to disk and not the corpus itself. Therefore, the corpus is lost after a fuzzing campaign finishes.
From there, we can execute the fuzzer:
./fuzz corpus/
You will observe a crash quickly because of the simplicity of the example. The output contains statistics about the current executions per second and the corpus size.
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 33.333%
(CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 3/9 (33%)
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 33.333%, size_edges: 33.333%
(CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 3/9 (33%), size_edges: 3/9 (33%)
[Testcase #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 33.333%, size_edges: 33.333%
(CLIENT) corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 3/9 (33%), size_edges: 3/9 (33%)
We imported 1 inputs from disk.
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, stability: 100.000%, edges: 33.333%, size_edges: 33.333%
(CLIENT) corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, stability: 9/9 (100%), edges: 3/9 (33%), size_edges: 3/9 (33%)
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 44.444%, size_edges: 33.333%, stability: 100.000%
(CLIENT) corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 4/9 (44%), size_edges: 3/9 (33%), stability: 9/9 (100%)
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 44.444%, size_edges: 44.444%, stability: 100.000%
(CLIENT) corpus: 1, objectives: 0, executions: 1, exec/sec: 0.000, edges: 4/9 (44%), size_edges: 4/9 (44%), stability: 9/9 (100%)
[Testcase #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 2, objectives: 0, executions: 30, exec/sec: 0.000, edges: 44.444%, size_edges: 44.444%, stability: 100.000%
(CLIENT) corpus: 2, objectives: 0, executions: 30, exec/sec: 0.000, edges: 4/9 (44%), size_edges: 4/9 (44%), stability: 9/9 (100%)
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 2, objectives: 0, executions: 30, exec/sec: 0.000, edges: 55.556%, size_edges: 44.444%, stability: 100.000%
(CLIENT) corpus: 2, objectives: 0, executions: 30, exec/sec: 0.000, edges: 5/9 (55%), size_edges: 4/9 (44%), stability: 9/9 (100%)
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 2, objectives: 0, executions: 30, exec/sec: 0.000, edges: 55.556%, size_edges: 55.556%, stability: 100.000%
(CLIENT) corpus: 2, objectives: 0, executions: 30, exec/sec: 0.000, edges: 5/9 (55%), size_edges: 5/9 (55%), stability: 9/9 (100%)
[Testcase #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 3, objectives: 0, executions: 354, exec/sec: 0.000, edges: 55.556%, size_edges: 55.556%, stability: 100.000%
(CLIENT) corpus: 3, objectives: 0, executions: 354, exec/sec: 0.000, edges: 5/9 (55%), size_edges: 5/9 (55%), stability: 9/9 (100%)
[2024-06-06T14:39:11Z ERROR libafl::executors::hooks::unix::unix_signal_handler] Crashed with SIGABRT
[2024-06-06T14:39:11Z ERROR libafl::executors::hooks::unix::unix_signal_handler] Child crashed!
[2024-06-06T14:39:11Z ERROR libafl::executors::hooks::unix::unix_signal_handler] input: "8ac0f08adee16e49"
━━━━━━━━━━━━ CRASH ━━━━━━━━━━━━━━━━
Received signal SIGABRT at 0x00761319e969fc, fault address: 0x00000000000000
━━━━━━━━━━━━ REGISTERS ━━━━━━━━━━━━
r8 : 0x007ffff7d61830, r9 : 0x00000000000000, r10: 0x00000000000008, r11: 0x00000000000246,
r12: 0x00000000000006, r13: 0x00000000000016, r14: 0x00761316080cc2, r15: 0x00000000000025,
rdi: 0x00000000005f2c, rsi: 0x00000000005f2c, rbp: 0x00000000005f2c, rbx: 0x0076131a441940,
rdx: 0x00000000000006, rax: 0x00000000000000, rcx: 0x00761319e969fc, rsp: 0x007ffff7d61760,
rip: 0x00761319e969fc, efl: 0x00000000000246,
━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━
0: libafl_bolts::minibsod::generate_minibsod
at libafl_bolts/src/minibsod.rs:1080:30
1: libafl::executors::hooks::unix::unix_signal_handler::inproc_crash_handler
at libafl/src/executors/hooks/unix.rs:208:32
2: libafl::executors::hooks::unix::unix_signal_handler::<impl libafl_bolts::os::unix_signals::Handler for libafl::executors::hooks::inprocess::InProcessExecutorHandlerData>::handle
3: libafl_bolts::os::unix_signals::handle_signal
at libafl_bolts/src/os/unix_signals.rs:436:5
4: <unknown>
5: pthread_kill
6: gsignal
7: abort
8: _Z9check_bufPcm
at libafl_libfuzzer/libafl_libfuzzer_runtime/main.cc:8:17
9: LLVMFuzzerTestOneInput
at libafl_libfuzzer/libafl_libfuzzer_runtime/harness.cc:7:3
10: libafl_libfuzzer_test_one_input
at
...
libafl_libfuzzer/libafl_libfuzzer_runtime/src/lib.rs:500:9
17: LLVMFuzzerRunDriver
at libafl_libfuzzer/libafl_libfuzzer_runtime/src/lib.rs:641:32
18: main
at libafl_targets/src/libfuzzer.c:37:10
19: <unknown>
20: __libc_start_main
21: _start
━━━━━━━━━━━━ MAPS ━━━━━━━━━━━━
7fff7000-8fff7000 rw-p 00000000 00:00 0
8fff7000-2008fff7000 ---p 00000000 00:00 0
2008fff7000-10007fff8000 rw-p 00000000 00:00 0
5596786b1000-559678764000 r--p 00000000 08:01 524035 libafl_libfuzzer/libafl_libfuzzer_runtime/fuzz
559678764000-559678d69000 r-xp 000b3000 08:01 524035
...
76131a4a8000-76131a4aa000 rw-p 00039000 08:01 16134 /usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
7ffff7d46000-7ffff7d67000 rw-p 00000000 00:00 0 [stack]
7ffff7dea000-7ffff7dee000 r--p 00000000 00:00 0 [vvar]
7ffff7dee000-7ffff7df0000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0 [vsyscall]
[Objective #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 3, objectives: 1, executions: 548, exec/sec: 0.000, edges: 55.556%, size_edges: 55.556%, stability: 100.000%
(CLIENT) corpus: 3, objectives: 1, executions: 548, exec/sec: 0.000, edges: 5/9 (55%), size_edges: 5/9 (55%), stability: 9/9 (100%)
Lines starting with [UserStats #0]
or [Testcase #0]
correspond to events being logged. The UserStats
event (also sometimes called the Client Heartbeat
event) is regularly issued to provide updates on the current fuzzing campaign. The Testcase
event is issued whenever a new test case is discovered. Note that it is normal to occasionally not see new logs for at least some time if no new test cases or crashes are found.
Alongside the events, additional information is printed, such as how long the fuzzer is already running (run time
), how many parallel processes are fuzzing (clients
), how many corpus entries exist (corpus
), how many crashes or out-of-memory bugs have been found (objectives
), how many times the SUT has been invoked (executions
), and the current executions per second (exec/sec
). If you are fuzzing on multiple cores (using -fork
or -jobs
flags), then you will see statistics being printed for each individual fuzzing process as well as global aggregated statistics.
[UserStats #0] (GLOBAL) run time: 0h-0m-0s, clients: 1, corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 100.000%, size_edges: 100.000%
(CLIENT) corpus: 0, objectives: 0, executions: 0, exec/sec: 0.000, edges: 2/2 (100%), size_edges: 2/2 (100%)
The flag -tui=1
enables a graphical text-based statistics view. It also implicitly enables fuzzing in a forked child process instead of the main process. The logic for forking is implemented in the
fuzz function in LibAFL.

LibAFL as Rust library #
The more flexible way to use LibAFL is as a library. As the name LibAFL suggests, the project is primarily a fuzzer. The method using the libFuzzer compatibility layer described in the previous section is not part of the core library and is mostly a very easy way to get started. If you want to get the most out of LibAFL, you should consider implementing your own fuzzer using the library.
Creating a LibAFL-based fuzzer #
This section aims to compile the harness harness.cc
, the target main.cc
, and our custom-configured LibAFL fuzzer runtime. We will then bootstrap a new Rust project that uses LibAFL. Then, the resulting binary is linked to our harness and target program.
Create a new Rust project called appsec_guide and add the LibAFL dependency.
cargo init --lib appsec_guide
cd appsec_guide
cargo add libafl@0.13 libafl_targets@0.13 libafl_bolts@0.13 libafl_cc@0.13 --features "libafl_targets@0.13/libfuzzer,libafl_targets@0.13/sancov_pcguard_hitcounts"
The feature selection is an essential part of getting your LibAFL fuzzer right:
libafl_targets/libfuzzer
: Adds several helper functions that mimic the behavior of libFuzzer in a lightweight way. For instance, it provides a wrapper function to callLLVMFuzzerTestOneInput
by callinglibafl_targets::libfuzzer_test_one_input
. It also defines the main function that calls thelibafl_main
function we are going to define further below.libafl_targets/sancov_pcguard_hitcounts
: Adds a global map that maintains coverage statistics. It also defines functions that can be used by the LLVM coverage instrumentation to populate the coverage map.
PRO TIP: Previously, we discussed that LibAFL is still a work in progress. If you instead want to use a specific commit of LibAFL, use the following command:
cargo add --git https://github.com/AFLplusplus/LibAFL --rev 2356ba575 libafl libafl_targets libafl_bolts libafl_cc --features "libafl_targets/libfuzzer,libafl_targets/sancov_pcguard_hitcounts"
Please note that you might need to update Rust to the latest version, as a bug has been discovered that results in an error in the above command.
Our example fuzzer will use command-line options. Therefore, we need to add the clap dependency:
cargo add clap@4 --features "derive"
The library crate should be compiled into a static library. To achieve this, we need to set the crate-type
to staticlib
in the Cargo.toml
file:
[package]
name = "appsec_guide"
version = "0.1.0"
edition = "2021"
[lib]
crate-type = ["staticlib"]
[dependencies]
clap = { version = "4", features = ["derive"] }
libafl = "0.13.1"
libafl_bolts = "0.13.1"
libafl_cc = "0.13.1"
libafl_targets = { version = "0.13.1", features = ["libfuzzer", "sancov_pcguard_hitcounts"] }
Cargo.toml
file of the fuzzerAfter setting up the basic project, clear the src/lib.rs
file. The
full source code for the following fuzzer can be found on GitHub.
The following figure showcases a basic skeleton for a LibAFL fuzzer. The skeleton includes imports, a struct including parsed command line options, and the libafl_main
function. Within the libafl_main
function, we parse the command-line options, define the closure run_client
, and finally set up everything using a LibAFL
Launcher. To set up the launcher instance, we define several options:
.shmem_provider()
sets a shared-memory provider. Shared memory is used to synchronize fuzzing clients. The implementation is chosen based on the operating system..configuration()
gives the fuzzer a name, which in this case is “default.”.monitor()
sets the monitor records aggregated statistics in a TOML file called./fuzzer_stats.toml
and logs current statistics to the standard output..run_client()
defines the lambda that creates the fuzzer and clients..cores()
defines which and how many cores are used. Generally, you use theCores::from_cmdline
function to create aCores
instance from a string. For example, the string “0” creates a single client fuzzer that runs on the core with ID0
. The string ”0,8-15” launches nine clients that run on core0
and on the cores ranging from ID 8 until 15 (inclusive)..broker_port()
sets the port that the main broker process is using. Initial communication between a client and the main broker process happens over TCP..remote_broker_addr()
can be used to set a broker process to connect to in case one already exists. Otherwise, this can be set toNone
..stdout_file()
redirects the standard output of the SUT. You may want to set this to/dev/null
if the output does not matter and the output is noisy. We comment this option here, as it may hide issues with the fuzzer because it also hides output printed in therun_client
closure.
use libafl::{
...
};
use libafl_bolts::{
...
};
use libafl_targets::{libfuzzer_initialize, libfuzzer_test_one_input, std_edges_map_observer};
...
/// The commandline args this fuzzer accepts
#[derive(Debug, Parser)]
#[command(
name = "appsec_guide",
about = "A libfuzzer-like fuzzer with llmp-multithreading support and a launcher"
)]
struct Opt {
...
}
/// The main fn, `no_mangle` as it is a C symbol
#[no_mangle]
pub extern "C" fn libafl_main() {
let opt = Opt::parse();
println!(
"Workdir: {:?}",
env::current_dir().unwrap().to_string_lossy().to_string()
);
let mut run_client = |state: Option<_>, mut restarting_mgr, _core_id| {
...
Ok(())
};
let shmem_provider = StdShMemProvider::new().expect("Failed to init shared memory");
let monitor = OnDiskTOMLMonitor::new(
"./fuzzer_stats.toml",
MultiMonitor::new(|s| println!("{s}")),
);
let broker_port = opt.broker_port;
let cores = opt.cores;
match Launcher::builder()
.shmem_provider(shmem_provider)
.configuration(EventConfig::from_name("default"))
.monitor(monitor)
.run_client(&mut run_client)
.cores(&cores)
.broker_port(broker_port)
.remote_broker_addr(opt.remote_broker_addr)
// .stdout_file(Some("/dev/null"))
.build()
.launch()
{
Ok(()) => (),
Err(Error::ShuttingDown) => println!("Fuzzing stopped by user. Good bye."),
Err(err) => panic!("Failed to run launcher: {err:?}"),
}
}
The skeleton code does not yet fuzz any target. The actual fuzzer setup is handled in the run_client
closure; because LibAFL natively supports multiprocessing, it can handle multiple clients by executing the closure multiple times.
The following figures show how to set up a fuzzer client in the run_client
closure step by step.
// Create an observation channel using the coverage map
let edges_observer =
HitcountsMapObserver::new(unsafe { std_edges_map_observer("edges") }).track_indices();
// Create an observation channel to keep track of the execution time
let time_observer = TimeObserver::new("time");
// Feedback to rate the interestingness of an input
// This one is composed of two feedbacks in OR
let mut feedback = feedback_or!(
// New maximization map feedback linked to the edges observer
MaxMapFeedback::new(&edges_observer),
// Time feedback
TimeFeedback::new(&time_observer)
);
The fuzzer we are creating is guided by coverage and time. Coverage is gathered from a global map. After every execution of the harness, LibAFL uses data from the map to determine whether coverage has increased and how many new edges have been discovered. The code above first defines two observers, which are then used as feedback, essentially defining whether an execution of a test case has been interesting. The feedback_or!
macro specifies that either feedback suffices to deem the test case interesting.
The observers will become important again when setting up the scheduler and executor. Note here that the edges_observer
can track indices (see the call to the track_indices
function).
// A feedback to choose if an input is a solution or not
let mut objective = feedback_or_fast!(CrashFeedback::new(), TimeoutFeedback::new());
The fuzzer not only needs feedback to guide the fuzzer, but also to establish a goal. This is defined similarly to feedback; in this instance, the fuzzer has the goal of finding crashing inputs and inputs that time out. Note that neither type of feedback requires an observer because the core LibAFL engine natively supports both types. By using the feedback_or_fast!
macro instead of feedback_or!
, the timeout feedback is not evaluated if a crash occurs.
// If not restarting, create a State from scratch
let mut state = state.unwrap_or_else(|| {
StdState::new(
// RNG
StdRand::new(),
// Corpus that will be evolved, we keep it in memory for performance
InMemoryCorpus::new(),
// Corpus in which we store solutions (crashes in this example),
// on disk so the user can get them after stopping the fuzzer
OnDiskCorpus::new(&opt.output).unwrap(),
// States of the feedbacks.
// The feedbacks can report the data that should persist in the State.
&mut feedback,
// Same for objective feedbacks
&mut objective,
)
.unwrap()
});
println!("We're a client, let's fuzz :)");
Next, we either create a new fuzzer state or, if we are restarting this client after a crash, reuse an existing state. When creating a new state, we can specify a random number generator (RNG), the corpora, and include the feedback definition. StdRand
is sufficient for most applications for the RNG. The RNG is used to mutate test cases or generate new ones.
The fuzzer uses two corpora: one for storing interesting test cases and one for storing test cases that fulfill the objective (test cases that time out or crash). The current example uses an InMemoryCorpus
for test cases, which means that discovered test cases that increase coverage will be lost after a restart. InMemoryCorpus
can be replaced with an
InMemoryOnDiskCorpus
to keep test cases primarily in memory while also storing a backup on disk.
// Setup a basic mutator with a mutational stage
let mutator = StdScheduledMutator::new(havoc_mutations());
let mut stages = tuple_list!(StdMutationalStage::new(mutator));
From here, we define the mutations the fuzzer should use. We use the havoc mutations that are known from fuzzers like AFL++. LibAFL has the concept of stages that are executed for every test case.
// A minimization+queue policy to get test cases from the corpus
let scheduler =
IndexesLenTimeMinimizerScheduler::new(&edges_observer, QueueScheduler::new());
Scheduling which test cases to mutate and execute is an important part of fuzzing. The simplest strategy is to pick a random test case that is implemented in
RandScheduler
. The above code defines a schedule that picks favorable test cases in a queue-like fashion. To achieve this, the fuzzer maintains a state that describes which test cases are short and take a short time to execute. Internally, LibAFL attaches metadata to the fuzzer state (
TopRatedsMetadata
) and test cases (
IsFavoredMetadata
and
MapIndexesMetadata
). The important part to note here is that this specific scheduler depends on having an observer who can track indices. This is true for our edges_observer
because we used the track_indices
function when setting it up.
For more information, refer to the
source code.
// A fuzzer with a corpus scheduler, feedback, and objective
let mut fuzzer = StdFuzzer::new(scheduler, feedback, objective);
We are now ready to set up the fuzzer and link the scheduler with the feedback and objective.
// The wrapped harness function, calling out to the LLVM-style harness
let mut harness = |input: &BytesInput| {
let target = input.target_bytes();
let buf = target.as_slice();
libfuzzer_test_one_input(buf);
ExitKind::Ok
};
let mut executor = InProcessExecutor::with_timeout(
&mut harness,
tuple_list!(edges_observer, time_observer),
&mut fuzzer,
&mut state,
&mut restarting_mgr,
opt.timeout,
)?;
A simple but important step is to define the harness. The harness above calls libfuzzer_test_one_input
, which is a wrapper around the LLVMFuzzerTestOneInput
function. Our harness is libFuzzer compatible, so this function is ideal for our use case.
// The actual target run starts here.
// Call LLVMFUzzerInitialize() if present.
let args: Vec<String> = env::args().collect();
if libfuzzer_initialize(&args) == -1 {
println!("Warning: LLVMFuzzerInitialize failed with -1");
}
We are not currently using the LLVMFuzzerInitialize
function in our harness, but it is a good practice to call this function if it exists in case the harness requires initialization. LibAFL links and calls it if it exists; otherwise, nothing happens.
// In case the corpus is empty (on first run), reset
if state.must_load_initial_inputs() {
state
.load_initial_inputs(&mut fuzzer, &mut executor, &mut restarting_mgr, &opt.input)
.unwrap_or_else(|_| panic!("Failed to load initial corpus at {:?}", &opt.input));
println!("We imported {} inputs from disk.", state.corpus().count());
}
fuzzer.fuzz_loop(&mut stages, &mut executor, &mut state, &mut restarting_mgr)?;
The final setup is to load the initial corpus when the client starts for the first time and then start the fuzzing loop by calling the fuzz_loop
function. An important thing to note here is that the fuzzer will not start and run if none of the test cases in the corpus are interesting, which could be the case if the coverage instrumentation of the target fails.
We can now build the fuzzer:
cargo build --release
The resulting binary is located at target/release/target/debug/libappsec_guide.a
.
Compile a fuzz test #
In the previous section, we created a custom LibAFL-based fuzzer. Now we are going to link the compiled fuzzer runtime, called libappsec_guide.a
, to our harness and target. We are reusing the harness.cc
and main.cc
from the
introduction. The resulting binary will use the harness and the libFuzzer runtime.
This section presents two options for creating the final binary. The first one shows how to link in a verbose way: it details all LLVM features that are required to create the fuzzer. The second option is the standard one and is recommended for new users of LibAFL. However, it is important to understand what happens behind the scenes, so we will start with the verbose way.
Verbose instrumentation and linking #
If using the Clang compiler, the following command produces a binary called fuzz
in the current working directory:
clang++-15 -DNO_MAIN -g -O2 -fsanitize-coverage=trace-pc-guard -fsanitize=address -Wl,--whole-archive target/release/libappsec_guide.a -Wl,--no-whole-archive main.cc harness.cc -o fuzz
Here are the descriptions for each argument:
clang++-15
: This is the compiler used for instrumenting the SUT. To use a specific version of LLVM, use, for example,clang++-15
instead ofclang++
.-DNO_MAIN -g -O2
: These arguments prevent compiling of themain
function for main.cc, add debug symbols and use the most common production optimization level.-fsanitize-coverage=trace-pc-guard
: This argument enables LLVM coverage instrumentation. This is required, or else the fuzzer won’t get any feedback and won’t add any test cases to the corpus.-fsanitize=address
: This argument enables AddressSanitizer.-Wl,--whole-archive target/release/libappsec_guide.a -Wl,--no-whole-archive
: These arguments link the complete fuzzer runtime. Without--whole-archive
, the linker will discard the fuzzing runtime completely and set thelibafl_main
external symbol to null. This behavior is linked to weak symbols and is also described in this StackOverflow issue. The other workaround is to add-u libafl_main
instead of using the whole archive flags.main.cc harness.cc
: These are the SUT and harness to compile, instrument, and link.-o fuzz
- This is the argument for the output binary.
The resulting binary fuzz
can now be executed to start the fuzzing campaign.
Standard linking #
The standard way in LibAFL to link the runtime, SUT, and harness is to create a compiler wrapper. When compiling the SUT and harness, instead of using the Clang directly from LLVM, we invoke a custom wrapper that parses the compilation flags; modifies, injects, or removes flags; and then invokes the LLVM Clang.
The LibAFL project provides a library called libafl_cc
which allows us to easily create wrappers. The following figure shows a compiler wrapper. Add this as libafl_cc.rs
to your Rust crate under src/bin
(the default path for Rust binaries).
use std::env;
use libafl_cc::{ClangWrapper, CompilerWrapper, Configuration, ToolWrapper};
pub fn main() {
let args: Vec<String> = env::args().collect();
if args.len() > 1 {
let mut dir = env::current_exe().unwrap();
let wrapper_name = dir.file_name().unwrap().to_str().unwrap();
let is_cpp = match wrapper_name[wrapper_name.len()-2..].to_lowercase().as_str() {
"cc" => false,
"++" | "pp" | "xx" => true,
_ => panic!("Could not figure out if c or c++ wrapper was called. Expected {dir:?} to end with c or cxx"),
};
dir.pop();
let mut cc = ClangWrapper::new();
if let Some(code) = cc
.cpp(is_cpp)
.parse_args(&args)
.expect("Failed to parse the command line")
.link_staticlib(&dir, "appsec_guide")
.add_args(&Configuration::GenerateCoverageMap.to_flags().unwrap())
.add_args(&Configuration::AddressSanitizer.to_flags().unwrap())
.run()
.expect("Failed to run the wrapped compiler")
{
std::process::exit(code);
}
} else {
panic!("LibAFL CC: No Arguments given");
}
}
The wrapper binary name determines whether the SUT is written in C or C++. Therefore, we added another binary in src/bin/libafl_ccx.rs
to the project.
pub mod libafl_cc;
fn main() {
libafl_cc::main();
}
We can now compile both wrappers by invoking:
cargo build --release
After that, we can compile the final fuzzer binary:
target/release/libafl_cxx -DNO_MAIN -g -O2 main.cc harness.cc -o fuzz
This creates several binaries, each with a different suffix. Because we need coverage instrumentation for our fuzzer, we need to use the fuzz
binary. You can create more binaries with different instrumentations by adding more configurations using add_configuration
. Note that the LibAFL fuzzer needs to be compatible with the instrumentations. For instance, adding the cmplog instrumentation in the wrapper script will not make the LibAFL fuzzer aware of the new instrumentation. You will need to adjust the observers and feedback in the fuzzer setup.
Usage #
To fuzz on a single core with an initial corpus stored in the corpus/
directory, execute the following command:
./fuzz --cores 0 --input corpus
The fuzzer supports a few more options like --timeout
, which finds inputs that exceed a certain execution time, and --output
, which sets the directory where test cases are stored that either crash or cause a timeout.
The default output will look identical to the output observed in the section
LibAFL as libFuzzer drop-in replacement. Additionally, the fuzzer we created in this section logs statistics to a hard-coded file called fuzzer_stats.toml
.
The fuzzer will quickly find the bug from main.cc
; however, by default, the fuzzer will not exit after finding a bug. Instead, it will continue fuzzing, and the objectives
counter, which you can see in the terminal, will increase.
The following sections show additional APIs that demonstrate how extensible LibAFL is.
Extending the fuzzer: Deduplicate crashes by hashing backtraces #
The fuzzer we introduced above will store test cases in the output directory that crash because of the same bug. For instance, suppose that a test case causes a buffer overflow if the first four bytes contain an integer larger than the test case size. A mutation in the other bytes is a different test case, but they will likely trigger the same memory corruption. The output directory will quickly be filled with test cases that all reach the same memory corruption. To avoid this, we can use a LibAFL feature that analyzes the crash backtrace, and will store the test case only if the backtrace is novel.
This LibAFL feature requires two components:
- An observer that will look at backtraces when a crash occurs, and
- A feedback that is evaluated when a crash happens. The feedback checks if the crash backtrace is novel. If the crash has not been observed before, the feedback returns true.
First, we create the backtrace observer. Our example fuzzer needs to set the harness type to InProcess
because each of our fuzzer clients uses the
InProcessExecutor
. Change this if you are using a fork server.
let backtrace_observer = libafl::prelude::BacktraceObserver::owned("BacktraceObserver", libafl::observers::HarnessType::InProcess);
Next, we add the observer to the observer tuple in the executor declaration. If we do not add it here, then the observer will not be executed.
let mut executor = InProcessExecutor::with_timeout(
&mut harness,
tuple_list!(edges_observer, time_observer, backtrace_observer),
&mut fuzzer,
&mut state,
&mut restarting_mgr,
opt.timeout,
)?;
Finally, we update the objective definition with a NewHashFeedback
. We use the feedback_and
macro, so an objective is added only if a new backtrace is discovered. The feedback fetches the hash of the latest backtrace and stores it in a set. The feedback returns true only if the hash has not yet been observed.
let mut objective = feedback_and!(feedback_or_fast!(CrashFeedback::new(), TimeoutFeedback::new()), libafl::feedbacks::new_hash_feedback::NewHashFeedback::new(&backtrace_observer));
PRO-TIP: We recommend running the fuzzer at least once without the deduplication after fixing the found crashes or if no crashes are found. Because LibAFL did not yet have a stable release, there might be bugs in the deduplication, leading to missed crashes. This feature is still helpful when you are fuzzing a target for the first time and have easy-to-reach crashes. After fixing the first batch of crashes rerun the fuzzer though without deduplication.
Extending the fuzzer: Dictionary fuzzing #
To fuzz with information from a dictionary, we first add a command-line option to the example fuzzer. We are using the
clap library to parse arguments. Here we add a tokens argument that uses the short flag -x,
just as we did with AFL++. LibAFL uses the term ”set of tokens” to refer to a dictionary. The
dictionary format from AFL++ is compatible with the token files we are loading here.
/// The commandline args this fuzzer accepts
#[derive(Debug, Parser)]
#[command(
name = "appsec_guide",
about = "A libfuzzer-like fuzzer with llmp-multithreading support and a launcher"
)]
struct Opt {
...
#[arg(
short = 'x',
long,
help = "Path to a tokens file",
name = "TOKENS"
)]
tokenfile: Option<PathBuf>,
}
Next, we will use the path to the token file to load tokens. First, we create a Tokens
struct and then fill it from the token file. After that, the tokens are added to the fuzzer state.
let mut state = state.unwrap_or_else(|| {
...
});
let mut tokens = Tokens::new();
if let Some(tokenfile) = &opt.tokenfile {
tokens.add_from_file(tokenfile)?;
}
println!("Using tokens: {:?}", &tokens);
if !tokens.is_empty() {
state.add_metadata(tokens);
}
Now that the fuzzer has access to the tokens, we need to add a mutation to the mutator that actually uses the metadata from the state. We extend the list of havoc mutations with token mutations.
let mutator = StdScheduledMutator::new(havoc_mutations().merge(tokens_mutations()));
We are now done with the basic setup of a fuzzer. If we have a very purpose-built fuzzer, then we may choose not to allow configuring the tokens through the command-line interface but instead hard-code them. For the PNG file format, for example, we could create the following tokens:
state.add_metadata(Tokens::from([
vec![137, 80, 78, 71, 13, 10, 26, 10], // PNG header
"IHDR".as_bytes().to_vec(),
"IDAT".as_bytes().to_vec(),
"PLTE".as_bytes().to_vec(),
"IEND".as_bytes().to_vec(),
]));
Auto token fuzzing #
AFL++ originally introduced the AUTODICTIONARY feature, which uses a Clang compilation pass to extract magic values and checksums from the program. The fuzzer can then use these values to improve fuzzing coverage.
First, we want to update the check_buf
function in the main.cc
file to include a call to the memcmp
function that the auto tokens feature can use to extract a value.
void check_buf(char *buf, size_t buf_len) {
if (buf_len >= 4) {
if (memcmp(buf, "buf", 4) == 0) {
abort();
}
}
if(buf_len > 0 && buf[0] == 'a') {
if(buf_len > 1 && buf[1] == 'b') {
if(buf_len > 2 && buf[2] == 'c') {
abort();
}
}
}
}
The pass can be enabled in the libafl_cc.rs
file by including the auto tokens pass.
let mut cc = ClangWrapper::new();
if let Some(code) = cc
.cpp(is_cpp)
// silence the compiler wrapper output, needed for some configure scripts.
.silence(true)
.parse_args(&args)
.expect("Failed to parse the command line")
.link_staticlib(&dir, "appsec_guide")
.add_args(&Configuration::GenerateCoverageMap.to_flags().unwrap())
.add_args(&Configuration::AddressSanitizer.to_flags().unwrap())
.add_pass(LLVMPasses::AutoTokens)
.run()
.expect("Failed to run the wrapped compiler")
{
std::process::exit(code);
}
To verify that the Clang pass succeeded, you can use gdb
to check whether the section has been populated.
echo "p (uint8_t *)__token_start" | gdb fuzz
If the global variable is null, then the pass has not been executed.
When we compile our SUT and harness with libafl_cc
, a new section will be added to the fuzzer. We can read the compiled-in tokens during runtime and add them to the Tokens
struct using the following code.
tokens += libafl_targets::autotokens()?;
Extending the fuzzer: Debugging #
Sometimes you need to debug your LibAFL fuzzer. Its multiprocessing nature can make it difficult to attach with a debugger like GDB. A simple trick is to make the fuzzer run in a single process; instead of building a Launcher and then starting it, we can run the run_client
closure directly. Simply comment on the launcher code and directly invoke the closure.
run_client(None, libafl::prelude::SimpleEventManager::new(monitor), 0).unwrap();
// match Launcher::builder()
// .shmem_provider(shmem_provider)
// .configuration(EventConfig::from_name("default"))
// .monitor(monitor)
// .run_client(&mut run_client)
// .cores(&cores)
// .broker_port(broker_port)
// .remote_broker_addr(opt.remote_broker_addr)
// .stdout_file(Some("/dev/null"))
// .build()
// .launch()
// {
// Ok(()) => (),
// Err(Error::ShuttingDown) => println!("Fuzzing stopped by user. Good bye."),
// Err(err) => panic!("Failed to run launcher: {err:?}"),
// }
After that, you may run the fuzzer in GDB:
gdb --args ./fuzz --cores 0 --input corpus
Real-world examples #
libpng #
If you are fuzzing C projects that produce static libraries, you can follow this recipe:
- Read the
INSTALL
file in the project’s codebase (or other appropriate documentation) and find out how to create a static library. - Set the compiler to your LibAFL-based fuzzer wrapper, and pass required flags to the compiler during compilation.
- Build the static library by using the LibAFL compiler wrapper. The build step will create an instrumented static library, which we will refer to as
$static_library
. - Find the compiled static library from step 3 and call:
target/release/libafl-cxx $static_library harness.cc -o fuzz
. - You can start fuzzing by calling
./fuzz --input seeds/ --cores 0-8
.
Let’s go through these instructions for the well-known libpng library. First, we get the source code:
curl -L -O https://downloads.sourceforge.net/project/libpng/libpng16/1.6.37/libpng-1.6.37.tar.xz
tar xf libpng-1.6.37.tar.xz
cd libpng-1.6.37/
Before we can compile libpng, we have to install dependencies for it:
apt install zlib1g-dev
Now, set up a variable that holds the path to the Cargo project:
export FUZZER_CARGO_DIR="/some/path"
Next, we configure and compile libpng as a static library without linking libFuzzer.
export CC=$FUZZER_CARGO_DIR/target/release/libafl_cc
export CXX=$FUZZER_CARGO_DIR/target/release/libafl_cxx
./configure --enable-shared=no # Configure to compile a static library
make # Run compilation
By default, the configuration script sets the optimization level to -O2
, which is what we recommend in the
Compile a fuzz test section.
Next, we download a harness from GitHub. Usually, you would have to write a harness yourself. However, for this example, an existing one suffices.
curl -O https://raw.githubusercontent.com/glennrp/libpng/f8e5fa92b0e37ab597616f554bee254157998227/contrib/oss-fuzz/libpng_read_fuzzer.cc
Finally, we link together the instrumented libpng, the harness, and the libFuzzer runtime.
$CXX libpng_read_fuzzer.cc .libs/libpng16.a -lz -o fuzz
Before we can launch the campaign, we need to prepare the seeds because AFL++ cannot start from an empty set of seeds. We do this by downloading a small example PNG file.
mkdir seeds/
curl -o seeds/input.png https://raw.githubusercontent.com/glennrp/libpng/acfd50ae0ba3198ad734e5d4dec2b05341e50924/contrib/pngsuite/iftp1n3p08.png
If you added the -x
option to use a token file as explained in the previous section, then we can also download a
dictionary for the PNG format to better guide the fuzzer. A dictionary provides the fuzzer with some initial clues about the file format, such as which magic bytes PNG uses.
curl -O https://raw.githubusercontent.com/glennrp/libpng/2fff013a6935967960a5ae626fc21432807933dd/contrib/oss-fuzz/png.dict
The fuzzing campaign can be launched by running:
./fuzz --input seeds/ --cores 0 -x png.dict
CMake-based project #
Let’s assume we are using CMake to build the program mentioned in the
introduction. We add a CMake target that builds the main.cc
and harness.cc
and links the target together with AFL++. Note that we are excluding the main function through the NO_MAIN
flag; otherwise, the program would have two main functions.
project(BuggyProgram)
cmake_minimum_required(VERSION 3.0)
add_executable(buggy_program main.cc)
add_executable(fuzz main.cc harness.cc)
target_compile_definitions(fuzz PRIVATE NO_MAIN=1 )
target_compile_options(fuzz PRIVATE -g -O2)
The non-instrumented binary can be built with the following commands:
cmake -DCMAKE_C_COMPILER=clang -DCMAKE_CXX_COMPILER=clang++ .
cmake --build . --target buggy_program
Next, set up a variable that holds the path to the Cargo project:
export FUZZER_CARGO_DIR="/some/path"
From there, configure and compile libpng as a static library without linking libFuzzer.
The fuzzer can be built by choosing the fuzz target and changing the compiler:
cmake -DCMAKE_C_COMPILER=$FUZZER_CARGO_DIR/target/release/libafl_cc -DCMAKE_CXX_COMPILER=$FUZZER_CARGO_DIR/target/release/libafl_cxx .
cmake --build . --target fuzz
The fuzzing campaign can be launched by running:
./fuzz --input seeds/ --cores 0