Fuzzing dictionary #
A dictionary can be used to guide the fuzzer. A dictionary is usually passed as a file to the fuzzer. The simplest input accepted by libFuzzer is an ASCII text file where each line consists of a quoted string. Strings can contain escaped byte sequences like "\xF7\xF8". Optionally, a key-value pair like hex_value="\xF7\xF8" can be used for documentation purposes. Comments are supported by starting a line with #. See the following example:
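A minimal sketch of such a file, written through a shell heredoc so it can be pasted into a terminal. The file name dictionary.dict matches the commands used elsewhere in this section. The entries themselves (a PNG chunk name and the PNG file signature) and the key png_magic are assumptions chosen for illustration, not taken from a real project.

```shell
# Write a small example dictionary; every entry below is illustrative only.
cat > dictionary.dict <<'EOF'
# Lines starting with '#' are comments.
# A bare quoted string:
"IHDR"
# A string with escaped bytes:
"\xF7\xF8"
# Optional key="value" form; the key only documents the entry:
png_magic="\x89PNG\x0d\x0a\x1a\x0a"
EOF
```

The quoted heredoc delimiter ('EOF') keeps the shell from interpreting the backslashes, so the escape sequences reach the file unchanged.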
Dictionaries are compatible between the libFuzzer, cargo-fuzz, and AFL++ fuzzers. They can be used according to the following table:
Fuzzer | Command |
libFuzzer | ./fuzz -dict=./dictionary.dict ... |
AFL++ | afl-fuzz -x ./dictionary.dict ... |
cargo-fuzz | cargo fuzz run fuzz_target -- -dict=./dictionary.dict |
Generating a dictionary #
There are several ways to generate a dictionary.
- LLMs (large language models): Tools like OpenAI’s ChatGPT are helpful in generating a dictionary for your fuzzing task. However, be aware of LLM hallucinations. If the LLM proposes a feature not mentioned in this handbook, check first whether it really exists. Try the following LLM prompt with the task PNG parser:
A dictionary can be used to guide the fuzzer. A dictionary is passed as a file to the fuzzer usually. The simplest input accepted by libFuzzer is an ASCII text file where each line consists of a quoted string. Strings can contain escaped byte sequences like "\xF7\xF8". Optionally, a key-value pair can be used like hex_value="\xF7\xF8" for documentation purposes. Comments are supported by starting a line with #. Write me an example dictionary file for a <fuzzing task>:
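- Header files: If you find a C header file that contains relevant strings, you can extract them using the following command. The pattern "[^"]*" matches each quoted string separately, so a line with several string literals yields one entry per literal:
grep -o '"[^"]*"' header.h > header.dict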
- Man pages: If the project you are fuzzing has man pages, then you can use these to generate a dictionary. This is especially helpful when fuzzing a CLI.
man curl | grep -oP '^\s*(--|-)\K\S+' | sed 's/[,.]$//' | sed 's/^/"&/; s/$/&"/' | sort -u > man.dict
- AFL++ AUTODICTIONARIES: If you are using afl-clang-lto, then AFL++ will automatically generate a dictionary based on the binary that is being fuzzed.
- If you are not using AFL++, then you might want to use the strings binary to generate a dictionary:
strings ./binary | sed 's/^/"&/; s/$/&"/' > strings.dict
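Text extracted by the commands above can itself contain double quotes or backslashes, which need to be escaped as \" and \\ inside a dictionary entry. The following is a minimal sketch of a more defensive quoting step. The helper name to_dict_entries and the output file escaped.dict are hypothetical, and the demo feeds two fixed lines instead of real strings ./binary output.

```shell
# Hypothetical helper: turn each input line into a quoted dictionary entry.
# Backslashes are escaped first so that the backslashes added in front of
# double quotes are not doubled afterwards.
to_dict_entries() {
  sed 's/\\/\\\\/g; s/"/\\"/g; s/^/"/; s/$/"/'
}

# Demo input standing in for the output of `strings ./binary`.
printf '%s\n' 'say "hi"' 'C:\path' | to_dict_entries | sort -u > escaped.dict
```

With a real target, the printf line would be replaced by strings ./binary piped into the same helper. The sort -u step drops duplicate entries.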