Where fuzzing fits in the attack chain
In MITRE ATT&CK terms, fuzzing is a reconnaissance and resource development tool. Vulnerability Scanning (T1595.002, Reconnaissance) is the automated search for weaknesses in public-facing applications. A bug found through fuzzing turns into an exploit: Exploits (T1587.004, Resource Development), which is then used for Exploit Public-Facing Application (T1190, Initial Access).
On a real pentest, the chain looks like this:
1. Information gathering - determine the target application stack: language, framework, native parsers, libraries used
2. Writing the harness - create a wrapper that feeds the fuzzer's data to the target parser in the format it expects
3. Fuzzing with sanitizers - run AFL++ or LibFuzzer with AddressSanitizer and collect crashes
4. Triage - crash analysis, root cause determination, exploitability assessment
5. Exploit development - if the bug is exploitable, build a PoC
6. Application - on a pentest, in bug bounty, or in a CVE report
Context of application: coverage-guided fuzzing of the server components of web applications is relevant for internal audits (white box, grey box), when there is access to the source code or a binary. On an external pentest without access to the code, black-box API fuzzing runs over HTTP. The two approaches find fundamentally different classes of vulnerabilities, and below we look at when to apply each.
Black-box vs coverage-guided fuzzing: choosing an approach
Before reaching for specific tools, decide on the approach. The choice depends on two things: whether source code is available and what the goal is.

Key limitation of the black-box approach: fuzzing through HTTP requests almost never finds memory corruption in server code. The WAF or the web server itself drops the malformed request before the data reaches a vulnerable parser. The coverage-guided approach works at the function level: the data goes directly into the target code, bypassing all the intermediate layers.
According to Dharmaadi et al. (2024, arXiv), perhaps the most complete survey of fuzzing for server-side web applications, the main problem of web API fuzzing is that HTTP requests must be valid, otherwise the web server rejects them at an early stage. This is a fundamental difference from binary fuzzing, where you can feed arbitrary garbage on stdin. The same survey names "ineffectiveness of instrumentation" as one of the key unsolved problems: instrumenting a server-side web application is technically harder than instrumenting a compiled binary.
When the coverage-guided approach does not apply: an external SaaS pentest without access to the source or binary; testing of cloud-native APIs through public endpoints; audits of legacy systems where rebuilding is not possible (no source, no build system). In those cases black-box is the only option.
AFL++ for web components: a practical guide
Environment requirements
• OS: Linux (Ubuntu 22.04+ or Debian 12+), macOS (limited support for some features)
• RAM: at least 4 GB, 8+ GB recommended when running parallel instances
• AFL++ 4.x (the repository is actively maintained, with weekly commits)
• Compilers: clang/LLVM 14+ for LTO instrumentation
• AddressSanitizer: included in LLVM, no separate installation required
• Network: not required, everything runs locally (offline-compatible)
What to fuzz in a web application
A web application is not a monolith. It is a set of parsers, serializers and handlers, each of which accepts untrusted data. The most productive targets for coverage-guided fuzzing:
• Format parsers: JSON, XML, YAML, protobuf, MessagePack. OWASP category A08:2021 (Software and Data Integrity Failures) lists insecure deserialization as a critical risk
• HTTP parsers: header processing, multipart form-data, chunked encoding. Often written in C/C++ underneath Python/Node.js stacks, as native extensions
• File uploaders: image parsing (libpng, libjpeg), PDF, XLSX - a classic target for mutation-based fuzzing
• Validators: regex engines (ReDoS, CWE-1333), email parsers, URL parsers, custom DSLs
• Cryptographic operations: signature verification, certificate parsing (ASN.1), JWT processing
Harness: anatomy of a wrapper for web parsers
A harness is a function that takes raw bytes from the fuzzer and passes them to the target code. The quality of the harness determines whether the fuzzer finds a bug in 20 minutes or wastes a day. I have seen a case where two harnesses for the same library, one written in 10 minutes and the other in 2 hours after analyzing real call sites, differed in coverage by 4x over the first hour.
A harness for AFL++ must: read data through shared memory (or stdin), call the target function, and avoid unnecessary I/O (network and disk access slow fuzzing down by an order of magnitude).
C:
// harness_json.c - example to demonstrate the concept
#include <stdio.h>
#include <stdlib.h>
#include "target_json_parser.h"

__AFL_FUZZ_INIT();

int main(void) {
    __AFL_INIT();
    // Shared-memory test case buffer; must be taken after __AFL_INIT()
    unsigned char *buf = __AFL_FUZZ_TESTCASE_BUF;
    while (__AFL_LOOP(10000)) {
        // Length of the current test case, re-read on every iteration
        int len = __AFL_FUZZ_TESTCASE_LEN;
        json_parse(buf, len);
    }
    return 0;
}
__AFL_LOOP(10000) enables persistent mode: the process is reused instead of doing a fork-exec for each test case. The speedup is 10-20x. __AFL_FUZZ_TESTCASE_BUF is the shared-memory buffer, which is faster still than reading from stdin.
Compilation with instrumentation and AddressSanitizer: AFL_USE_ASAN=1 AFL_USE_UBSAN=1 afl-clang-lto -o harness harness_json.c -ltarget_json. AFL_USE_ASAN=1 activates ASan, which catches heap-buffer-overflow, use-after-free, stack-buffer-overflow and double-free. Without ASan, the fuzzer sees only hard crashes (SIGSEGV/SIGABRT), and subtle memory corruption slips by unnoticed.
Launch: afl-fuzz -i corpus/ -o findings/ -m none -- ./harness. The corpus/ directory should contain seed files: minimal valid examples. For JSON: {"key":"value"}, [], "", 0, null. A diverse seed corpus speeds up the fuzzer's progress into the deep branches of the code.
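The seed directory can be prepared by hand, but a few lines of script keep it reproducible. The sketch below writes the five JSON seeds listed above into separate files; the file naming scheme (seed_000.json and so on) and the function name write_seeds are illustrative choices, not something AFL++ requires.

```python
from pathlib import Path

# Minimal valid JSON documents, taken from the list in the text above.
JSON_SEEDS = [
    b'{"key":"value"}',
    b"[]",
    b'""',
    b"0",
    b"null",
]


def write_seeds(corpus_dir, seeds=JSON_SEEDS):
    """Write each seed to its own file and return the created paths."""
    corpus = Path(corpus_dir)
    corpus.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, seed in enumerate(seeds):
        path = corpus / f"seed_{index:03d}.json"
        path.write_bytes(seed)
        paths.append(path)
    return paths


if __name__ == "__main__":
    for created in write_seeds("corpus"):
        print(created)
```

One seed per file matters: the fuzzer treats every file in the input directory as a separate starting point for mutation.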
CmpLog: solving the magic bytes problem
A standard coverage-guided fuzzer copes poorly with conditions of the form if (header == 0xDEADBEEF): the probability of guessing 4 bytes by mutation is negligible. AFL++ solves this with CmpLog (similar to RedQueen): it instruments comparisons and substitutes the values from the right-hand side of the condition into the test input. For web parsers this is critical: XML begins with <?xml, JSON schemas contain mandatory keys, HTTP headers have a fixed syntax.
Setup: build two binaries. The main one with ASan: AFL_USE_ASAN=1 afl-clang-lto -o harness harness_json.c -ltarget_json. An auxiliary CmpLog binary without ASan (combining ASan + CmpLog is not recommended because of the overhead): AFL_LLVM_CMPLOG=1 afl-clang-lto -o harness.cmplog harness_json.c -ltarget_json. Launch: afl-fuzz -i corpus/ -o findings/ -c ./harness.cmplog -- ./harness. CmpLog significantly speeds up progress through parsers with rigidly structured input.
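To make the magic bytes problem concrete, here is a toy model in Python. It is not how AFL++ implements CmpLog internally; it only contrasts blind single-byte mutation with copying an operand observed at a comparison into the input. The function names and the 8-byte zero seed are assumptions made for the illustration.

```python
import random

MAGIC = 0xDEADBEEF


def target(data: bytes) -> bool:
    """Toy parser: the interesting branch sits behind a 4-byte magic value."""
    return len(data) >= 4 and int.from_bytes(data[:4], "big") == MAGIC


def blind_mutation(data: bytes, rng: random.Random) -> bytes:
    """Overwrite one random byte with a random value (plain mutation)."""
    out = bytearray(data)
    out[rng.randrange(len(out))] = rng.randrange(256)
    return bytes(out)


def comparison_guided(data: bytes, observed_operand: int) -> bytes:
    """Copy the operand observed at the comparison into the input."""
    return observed_operand.to_bytes(4, "big") + data[4:]


# Chance that 4 uniformly random bytes equal the magic value.
P_RANDOM_GUESS = 1 / 2**32

if __name__ == "__main__":
    rng = random.Random(0)
    seed = b"\x00" * 8
    hits = sum(target(blind_mutation(seed, rng)) for _ in range(100_000))
    print("blind mutation hits:", hits)
    print("guided input passes:", target(comparison_guided(seed, MAGIC)))
    print("random guess probability:", P_RANDOM_GUESS)
```

A single-byte mutation of an all-zero seed can never produce four specific non-zero bytes at once, which is why the fuzzer needs the comparison operand handed to it rather than hoping to stumble on it.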
Limits of AFL++ fuzzing in the web context
• Requires recompiling the target code, which is not possible for SaaS without access to the source
• Interpreted languages (Python, Ruby, PHP) need specialized wrappers
• Persistent mode is not always correct: if the target function uses global state that is not reset between iterations, false positives appear
• Does not see logic bugs (IDOR, auth bypass), only memory errors and crashes
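The persistent-mode caveat in the list above is easy to reproduce in miniature. The following Python toy emulates a persistent loop around a target that keeps a global list between calls; parse_request, reset_state and run_loop are hypothetical names, and the "crash" is just an exception.

```python
_seen_headers = []  # global state that survives between iterations


def parse_request(data: bytes) -> None:
    """Toy target: fails once more than three headers have accumulated."""
    _seen_headers.extend(line for line in data.split(b"\n") if line)
    if len(_seen_headers) > 3:
        raise RuntimeError("header table overflow")


def reset_state() -> None:
    """What a correct persistent-mode harness does before each iteration."""
    _seen_headers.clear()


def run_loop(inputs, reset: bool):
    """Emulate a persistent loop and return the inputs that 'crashed'."""
    reset_state()  # every emulated process starts clean
    crashes = []
    for data in inputs:
        if reset:
            reset_state()
        try:
            parse_request(data)
        except RuntimeError:
            crashes.append(data)
    return crashes


if __name__ == "__main__":
    inputs = [b"A: 1\nB: 2", b"C: 3\nD: 4"]
    print("without reset:", run_loop(inputs, reset=False))
    print("with reset:   ", run_loop(inputs, reset=True))
```

Without the reset, the second input is blamed for a failure it cannot trigger on its own, so the saved crash file will not reproduce when replayed in a fresh process.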
LibFuzzer: fast in-process vulnerability search
LibFuzzer is an in-process coverage-guided fuzzer built into LLVM. Unlike AFL++, it works within a single process without fork: faster for fuzzing individual functions, but less stable, since one crash ends the session.
Applicability to web applications: LibFuzzer is ideal for fuzzing native extensions. If a Python/Node.js application uses a C library for parsing (libxml2, rapidjson, zlib, openssl; almost all of them do), LibFuzzer is the fuzzer for that native layer.
C:
// fuzz_xml.c - harness for libxml2
#include <stddef.h>
#include <stdint.h>
#include <libxml/parser.h>

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    // Parse the fuzzer-provided bytes as an in-memory XML document
    xmlDocPtr doc = xmlReadMemory(
        (const char *)data, (int)size, "noname.xml", NULL, 0);
    if (doc != NULL) xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}
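Compilation: clang -g -fsanitize=fuzzer,address -o fuzz_xml fuzz_xml.c $(xml2-config --cflags --libs). Launch: ./fuzz_xml corpus_xml/ -max_len=65536 -jobs=4. The -jobs=4 parameter runs four fuzzing jobs in separate processes to use a multi-core machine; how many of them run at the same time is set by -workers, which by default is limited to half of the available cores.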
AFL++ vs LibFuzzer for fuzzing web applications: AFL++ is more robust for long campaigns (days, weeks), because a crash does not kill the whole process. LibFuzzer is faster for short sessions and easier to integrate into CI/CD (one binary, no external dependencies). In practice, for a serious hunt for 0-day vulnerabilities I run both: LibFuzzer as a fast first pass, AFL++ as a long-running campaign with advanced mutation strategies (MOpt, CmpLog).
Atheris: fuzzing Python components of a REST API
A significant share of modern web applications is written in Python, JavaScript or Go. They cannot be fuzzed with AFL++ directly: there is no native binary to instrument. For Python there is Atheris, a coverage-guided fuzzer from Google built on LibFuzzer.
Atheris collects coverage at the level of CPython bytecode. Where it is useful:
• Django/Flask/FastAPI functions that process user input
• Custom validators and serializers in a REST API
• Parsers of specific formats (CSV with custom logic, proprietary protocols on top of HTTP)
Python:
# fuzz_api_validator.py - example to demonstrate the concept
import sys

import atheris

# Import the target under instrumentation so Atheris can collect coverage
with atheris.instrument_imports():
    from myapp.validators import parse_user_input


def TestOneInput(data):
    try:
        parse_user_input(data.decode("utf-8", errors="ignore"))
    except (ValueError, KeyError):
        pass  # Skip expected exceptions


atheris.Setup(sys.argv, TestOneInput)
atheris.Fuzz()
Launch: python fuzz_api_validator.py -max_len=4096 corpus/. Atheris will find unhandled exceptions, infinite loops and excessive memory consumption (Endpoint Denial of Service, T1499 in MITRE ATT&CK; resource exhaustion via ReDoS/memory falls under T1499.004, Application or System Exploitation).
Speed limit: Atheris delivers 100-1000 exec/sec versus 10000-100000 for AFL++/LibFuzzer on native code. For a deep search for memory corruption in native Python packages (Pillow, lxml, cryptography), Atheris requires rebuilding the C code with LibFuzzer instrumentation (atheris_no_libfuzzer_main), which is essentially equivalent to using LibFuzzer directly on the C layer. Atheris is about the logic of Python code, not about memory bugs.
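Because each Atheris execution is relatively expensive, it can pay off to check the harness on the seed corpus in plain Python before starting a long run. The sketch below does not use Atheris at all: it calls a harness-shaped function on every file in a directory and reports anything that is not swallowed as an expected exception. The parse_user_input defined here is a hypothetical stand-in with a deliberate bug (division by zero), not the real myapp.validators function.

```python
from pathlib import Path


def parse_user_input(text: str) -> dict:
    """Hypothetical stand-in for myapp.validators.parse_user_input."""
    key, _, value = text.partition("=")
    if not key:
        raise ValueError("empty key")
    # int() raises ValueError on bad digits; the division hides a bug.
    return {key: 100 // int(value)}


def TestOneInput(data: bytes) -> None:
    """Same shape as a fuzz harness: expected errors are swallowed."""
    try:
        parse_user_input(data.decode("utf-8", errors="ignore"))
    except (ValueError, KeyError):
        pass


def replay(corpus_dir):
    """Run every corpus file through the harness; return unexpected failures."""
    failures = []
    for path in sorted(Path(corpus_dir).iterdir()):
        try:
            TestOneInput(path.read_bytes())
        except Exception as exc:  # anything not swallowed is a finding
            failures.append((path.name, type(exc).__name__))
    return failures
```

If replay already reports failures on hand-written seeds, either the list of expected exceptions in the harness is too narrow or the target has a bug that does not need a fuzzer to find.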
Fuzzing a REST API: from specification to server crashes
The coverage-guided approach requires access to the code. On an external pentest, when only HTTP endpoints are available, stateful API fuzzing takes over: generating sequences of HTTP requests based on an OpenAPI/Swagger specification.
API fuzzing tools: trade-offs

The same survey by Dharmaadi et al. (2024) shows that most web API fuzzers use the OpenAPI specification to generate request templates. This solves the HTTP validity problem, since servers reject malformed requests. But the approach has a serious gap: the specification describes only documented endpoints. Hidden APIs, debug routes and internal endpoints remain outside the coverage.
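The idea of turning a specification into request templates can be sketched in a few lines of Python. The SPEC dictionary below is a hypothetical, heavily trimmed OpenAPI document, and the ordering rule (create, then read, then delete) is a crude stand-in for the dependency inference that real stateful fuzzers perform; none of this reflects the internals of a specific tool.

```python
SPEC = {  # hypothetical, heavily trimmed OpenAPI document
    "paths": {
        "/users": {"post": {"operationId": "createUser"}},
        "/users/{id}": {
            "get": {"operationId": "getUser"},
            "delete": {"operationId": "deleteUser"},
        },
    }
}

HTTP_METHODS = ("get", "post", "put", "patch", "delete")


def request_templates(spec):
    """Flatten the spec into (METHOD, path) request templates."""
    templates = []
    for path, operations in spec["paths"].items():
        for method in HTTP_METHODS:
            if method in operations:
                templates.append((method.upper(), path))
    return templates


def order_for_state(templates):
    """Creators first, readers next, deletions last."""
    rank = {"POST": 0, "PUT": 1, "PATCH": 1, "GET": 2, "DELETE": 3}
    return sorted(templates, key=lambda t: (rank[t[0]], t[1]))


def fill(path, resource_id):
    """Substitute a concrete identifier into a templated path."""
    return path.replace("{id}", str(resource_id))


if __name__ == "__main__":
    for method, path in order_for_state(request_templates(SPEC)):
        print(method, fill(path, 1))
```

Only paths present in the spec ever appear in the output, which is exactly the gap described above: an undocumented route never becomes a template.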
What I do on an external pentest: first ffuf -u https://target/FUZZ -w api-wordlist.txt -mc 200,301,403 to detect hidden endpoints, then RESTler or Schemathesis for deep stateful fuzzing based on the discovered and documented routes.
In the context of OWASP A03:2021 (Injection), API fuzzers find SQL injection, NoSQL injection and OS command injection by mutating parameters. Even more valuable are logic bugs: race conditions under parallel requests, IDOR through identifier enumeration, business logic violations triggered by non-standard call sequences.
Integrating fuzzing into CI/CD: automated vulnerability search
Fuzzing brings the most benefit when it runs continuously, not as a one-off. A single run is a lottery. Regular runs are statistics.
Corpus-based fuzzing in a pipeline
A minimal CI scheme for fuzzing binary applications and web components:
1. On each PR - a short run (10-30 minutes) with the regression corpus. Purpose: check that the new code has not reintroduced previously found crashes
2. Nightly - a long campaign (4-8 hours) with advanced mutations (MOpt, CmpLog). Purpose: search for new bugs
3. Weekly - full fuzzing with an updated corpus (extended with real API requests)
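The per-PR regression step in the scheme above can be approximated without running the fuzzer at all: replay every saved crash input against the freshly built harness and fail the job if any of them still crashes. The sketch assumes the harness reads its test case from stdin and signals a crash with a non-zero exit status; the script name, argument order and function name are illustrative.

```python
import subprocess
import sys
from pathlib import Path


def replay_regressions(harness_cmd, corpus_dir, timeout=10):
    """Feed every saved crash input to the harness on stdin.

    Returns the names of inputs that still crash (non-zero exit or timeout).
    """
    still_crashing = []
    for path in sorted(Path(corpus_dir).iterdir()):
        try:
            result = subprocess.run(
                harness_cmd,
                input=path.read_bytes(),
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout,
            )
            failed = result.returncode != 0
        except subprocess.TimeoutExpired:
            failed = True  # a hang counts as a regression too
        if failed:
            still_crashing.append(path.name)
    return still_crashing


if __name__ == "__main__":
    # Usage: python replay.py regressions/ ./harness
    if len(sys.argv) >= 3:
        bad = replay_regressions(sys.argv[2:], sys.argv[1])
        for name in bad:
            print("regression:", name)
        sys.exit(1 if bad else 0)
```

A replay like this finishes in seconds, so it can gate every PR, while the timed fuzzing run explores the new code.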
Google OSS-Fuzz is a free continuous fuzzing infrastructure for open-source projects. If you maintain an open-source library, integration automates fuzzing, triage and notifications. For proprietary code, build your own pipeline with GitHub Actions or GitLab CI and AFL++ in a Docker container.
Corpus management
The corpus is what separates effective fuzzing from useless noise:
• Seeds from production logs: real HTTP requests to the API (anonymized) are the best starting corpus for fuzzing a REST API
• Minimization: afl-cmin -i corpus/ -o corpus_min/ -- ./harness for AFL++, ./fuzz_target -merge=1 corpus_merged corpus_raw for LibFuzzer - removes inputs that duplicate coverage
• Versioning: the corpus is stored in Git LFS or in CI artifacts. Losing the corpus between runs means losing everything the fuzzer has learned so far
• Manual enrichment: empty strings, maximum-length inputs, Unicode, null bytes, boundary values. For the web context: malformed Content-Type, JSON nested 100+ levels deep, multipart with a million boundaries
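What "duplicates by coverage" means in the minimization step can be shown with a toy greedy reduction. This is not the algorithm afl-cmin or -merge=1 actually use; it assumes coverage has already been measured and is given as a mapping from input name to a set of edge IDs, both of which are made up for the example.

```python
def minimize_by_coverage(coverage):
    """Greedy reduction: keep an input only if it adds unseen coverage.

    `coverage` maps an input name to the set of edge IDs it reaches.
    Inputs with more coverage are considered first; ties break by name.
    """
    kept = []
    seen = set()
    ordered = sorted(coverage, key=lambda name: (-len(coverage[name]), name))
    for name in ordered:
        new_edges = coverage[name] - seen
        if new_edges:
            kept.append(name)
            seen |= new_edges
    return kept


if __name__ == "__main__":
    measured = {
        "a.json": {1, 2, 3},
        "b.json": {1, 2},
        "c.json": {3, 4},
        "d.json": {1, 2, 3},
    }
    print(minimize_by_coverage(measured))
```

In this example b.json and d.json reach nothing that a.json does not already reach, so they are dropped, while c.json stays because of one extra edge.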
Summary table of approaches to fuzzing web applications

From crash to CVE: a workflow for finding 0-day vulnerabilities
A crash in an open-source library is a potential 0-day if the library is used in production web applications. Process:
1. Verification - confirm the crash on the latest stable version
2. Root cause analysis - determine the CWE. Heap-buffer-overflow: CWE-122. Use-after-free: CWE-416. Null dereference: CWE-476. Integer overflow: CWE-190
3. Exploitability - a WRITE primitive with controlled size = probable RCE. READ-only = information disclosure. Null dereference = DoS
4. Responsible disclosure - report to the maintainer via security/ or GitHub Security Advisory. Standard deadline: 90 days
5. CVE request - via the MITRE CVE form or the GitHub CNA
6. PoC - after the patch is released, a minimal reproducible example
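The first pass over steps 2 and 3 can be partly automated by reading the AddressSanitizer report. The sketch below pulls the bug class and the access type out of the report text and applies the CWE and impact mapping from the list above. It is a rough pre-sort under stated assumptions: the regular expressions cover only the common report lines, the triage function name is illustrative, integer overflows (which come from UBSan) are not handled, and the result does not replace manual root cause analysis.

```python
import re

# Bug class -> CWE, following the mapping in the list above.
CWE_BY_BUG = {
    "heap-buffer-overflow": "CWE-122",
    "heap-use-after-free": "CWE-416",
    "null-dereference": "CWE-476",
}

ERROR_RE = re.compile(r"ERROR: AddressSanitizer: (\S+)")
ACCESS_RE = re.compile(r"^(READ|WRITE) of size (\d+)", re.MULTILINE)
NULL_SEGV_RE = re.compile(r"SEGV on unknown address 0x0+\b")


def triage(report: str) -> dict:
    """Classify an ASan report into bug class, CWE, access type and impact."""
    error = ERROR_RE.search(report)
    bug = error.group(1) if error else "unknown"
    if bug == "SEGV" and NULL_SEGV_RE.search(report):
        bug = "null-dereference"
    access = ACCESS_RE.search(report)
    kind = access.group(1) if access else None
    if bug == "null-dereference":
        impact = "DoS"
    elif kind == "WRITE":
        impact = "possible RCE (check whether size is attacker-controlled)"
    elif kind == "READ":
        impact = "information disclosure"
    else:
        impact = "needs manual analysis"
    return {"bug": bug, "cwe": CWE_BY_BUG.get(bug), "access": kind, "impact": impact}
```

Sorting a findings directory by the returned impact field gives a reasonable order in which to start manual analysis, with WRITE primitives first.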
Legal context: fuzzing open-source projects is a legal activity. Fuzzing other people's production systems without permission is not. For a pentest you need a scope, for bug bounty a program, for research your own test environment.