Developers, and even testers, can't always predict in advance what kind of invalid input their web application will receive. Traditional testing methods (manual, integration, and even unit tests) are good at handling expected scenarios. Testing unpredictable ones, however, requires a different approach.
Hello, Habr! My name is Alexey Lomay, and I'm a Junior Systems Engineer in the DevOps department at IBS. In this article, I'll discuss fuzzing as a powerful tool for web application security testing. We'll cover the history and development of this approach, the available tools, and two CI/CD implementation options.
What is fuzzing?
Fuzz testing is a security testing method in which a system is exposed to random or specially generated input data in order to identify vulnerabilities.
Fuzzers automatically generate various data types, be they strings, numbers, symbols, patterns, and so on, and feed them into the system under test, analyzing the results. This allows for the detection of errors that could be exploited by attackers.
Fuzz testing consists of five main stages:
- Input data generation. The fuzzer creates test data based on dictionaries, random values, or specific attack patterns.
- Sending data. Test data is sent to the system under test via HTTP requests, network packets, or other interaction mechanisms.
- System response monitoring. The fuzzer monitors system responses, analyzing unexpected behavior (processing errors, crashes, and information leaks).
- Results analysis. If the system reacts unexpectedly—crashes, produces errors, or generates unwanted information—the fuzzer registers this as a potential vulnerability.
- Repeating the process. The steps are repeated until the initially identified vulnerabilities are no longer present in the system.
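The five stages can be sketched as a small loop. This is a minimal illustration, not the implementation of any particular fuzzer: `parse_quantity` is a hypothetical system under test standing in for a real web endpoint, and the dictionary entries are made up. The random generator is seeded so that the same run can be repeated after a fix, which is the fifth stage.

```python
import random
import string


def generate_inputs(seed, count, dictionary):
    """Stage 1: mix dictionary entries with random printable strings."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    inputs = list(dictionary)
    for _ in range(count):
        length = rng.randint(0, 40)
        inputs.append("".join(rng.choice(string.printable) for _ in range(length)))
    return inputs


def parse_quantity(raw):
    """Hypothetical system under test: parses a quantity field from a form."""
    value = int(raw)       # raises ValueError on non-numeric input
    return 100 // value    # raises ZeroDivisionError on "0"


def fuzz(target, inputs):
    """Stages 2-4: send each input, watch the outcome, record anomalies."""
    findings = []
    for data in inputs:
        try:
            target(data)                                  # stage 2: send the data
        except ValueError:
            continue                                      # expected rejection of bad input
        except Exception as exc:                          # stage 3: unexpected failure
            findings.append((data, type(exc).__name__))   # stage 4: register it
    return findings


if __name__ == "__main__":
    corpus = generate_inputs(seed=1, count=200, dictionary=["0", "-1", "1e9", ""])
    for data, error in fuzz(parse_quantity, corpus):
        print(f"input {data!r} caused {error}")
```

Treating `ValueError` as the expected way to reject bad input is a design choice of this sketch; anything else that escapes the target is counted as a potential vulnerability.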
History of fuzzing
The history of fuzzing began in 1985, when Bill Pelton began experimenting with random data to test network protocols. In 1990, Bart Miller and his team at the University of Wisconsin-Madison published the paper "An Empirical Study of the Reliability of UNIX Utilities," in which they found that about a quarter of Unix utilities failed when processing invalid input. By 1996, Microsoft had begun actively using and developing fuzzing for testing its products, work that later produced the SAGE tool. In 2006, fuzzers began actively using code coverage analysis methods to find vulnerabilities more effectively. The now-famous American Fuzzy Lop (AFL) tool, first released in 2013, grew out of this approach.
In recent years, there has been a trend toward integrating AI and machine learning technologies into fuzzing. Tools like Driller and Vuzzer, as well as cloud platforms like OSS-Fuzz, have emerged for fuzzing open source code. Fuzzing has begun to be integrated into CI/CD pipelines, enabling automated security testing at every stage of development.
Classification of tools
Fuzzing tools can be classified according to various criteria.
The first is the approach to generating input data for fuzzers:
- Random data. The fuzzer generates random strings, numbers, characters, and other types of data to test how the system reacts to unexpected input.
- Specially crafted data. Fuzzers can rely on pre-prepared patterns, such as SQL injections, XSS attacks, and buffer overflows, to target particular vulnerabilities.
- HTTP requests. For web applications, fuzzers can send HTTP requests with various parameters, headers, and request bodies.
- Network packets. For network protocols, fuzzers can send specially crafted packets.
- File operations. For file systems, fuzzers can create, modify, and delete files with various names and contents.
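To make the first two approaches concrete, here is a hedged sketch that produces random and pattern-based payloads and places each one into the parts of an HTTP request mentioned above. Everything specific in it is an assumption for illustration: the `ATTACK_PATTERNS` list, the `X-Fuzz` header name, and the `app.example.test` host. Requests are only built as dictionaries; nothing is sent over the network.

```python
import random
import string
from urllib.parse import urlencode

# Pattern-based payloads: a tiny illustrative set, not a real attack dictionary.
ATTACK_PATTERNS = [
    "' OR '1'='1",                  # SQL injection probe
    "<script>alert(1)</script>",    # XSS probe
    "A" * 1024,                     # oversized value
]

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def random_payloads(rng, count, max_len=32):
    """Random data: strings of random length drawn from ALPHABET."""
    return [
        "".join(rng.choice(ALPHABET) for _ in range(rng.randint(1, max_len)))
        for _ in range(count)
    ]


def as_http_requests(base_url, param, payload):
    """Place one payload into a query string, a header, and a form body."""
    encoded = urlencode({param: payload})
    return [
        {"method": "GET", "url": f"{base_url}?{encoded}", "headers": {}, "body": None},
        {"method": "GET", "url": base_url, "headers": {"X-Fuzz": payload}, "body": None},
        {"method": "POST", "url": base_url, "headers": {}, "body": encoded},
    ]


if __name__ == "__main__":
    rng = random.Random(7)
    for payload in ATTACK_PATTERNS[:2] + random_payloads(rng, 2):
        for request in as_http_requests("https://app.example.test/search", "q", payload):
            print(request["method"], request["url"], request["headers"], request["body"])
```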
Another classification method is based on vulnerabilities identified during fuzzing:
- Processing errors. The system does not process input data correctly.
- Information leak. The system returns data it was not expected to return.
- Unexpected behavior. The system does not behave as expected.
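One way to picture this classification is a function that maps a server response to one of the three categories. The sketch below is an assumption-laden illustration: the `Response` type, the leak signatures, and the set of "expected" status codes are all hypothetical and would need tuning for a real application.

```python
import re
from dataclasses import dataclass


@dataclass
class Response:
    status: int
    body: str


# Illustrative signatures only; a real scanner would use a much larger set.
LEAK_PATTERNS = [
    re.compile(r"Traceback \(most recent call last\)"),
    re.compile(r"(?i)\b(api[_-]?key|token|password)\b\s*[:=]"),
]


def classify(response, expected_statuses=(200, 400, 404)):
    """Map a response to one of the three finding types, or None if it looks normal."""
    if any(pattern.search(response.body) for pattern in LEAK_PATTERNS):
        return "information leak"
    if response.status >= 500:
        return "processing error"
    if response.status not in expected_statuses:
        return "unexpected behavior"
    return None


if __name__ == "__main__":
    samples = [
        Response(500, "Internal Server Error"),
        Response(200, "token = abc123"),
        Response(302, ""),
        Response(200, "ok"),
    ]
    for sample in samples:
        print(sample.status, classify(sample))
```

Leaks are checked first because a leaked stack trace often arrives together with a 5xx status, and the leak is usually the more serious finding.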
Where can fuzzing tools be applied?
Fuzzing can be applied to web applications. This is probably its primary and key purpose. This can include testing forms, APIs, URL parameters, headers, and other elements.
You can also use fuzzing to test network protocols, that is, services that use DNS, HTTP, FTP, and other protocols.
You can test file systems, in particular file operations, including creation, reading, writing, deletion, and overflow.
Fuzzers can be used to test software more broadly, including applications, libraries, and drivers.
The main objectives of fuzzing
There are three main goals of fuzzing.
The first is identifying vulnerabilities in code. A good example is the sqlmap tool, which can automatically test web applications for SQL injection vulnerabilities. It sends specially crafted SQL queries and analyzes the server responses.
The second is detecting errors and system crashes. The previously mentioned American Fuzzy Lop tool can test binaries and complex protocols by sending specially crafted data and analyzing program behavior to find buffer overflows and other errors.
It is worth noting that AFL is suitable for a wider range of purposes, not just for detecting errors and failures.
The third is testing user input processing. Feroxbuster, which I'll discuss later, can test a web application for hidden directories and files by sending requests with various paths and analyzing the server responses.
Classification according to knowledge of the internal structure of the system
Fuzzers can be classified based on their knowledge of the system's internal structure. There are three main levels:
- White-box testing involves complete knowledge of the system's structure and fuzzing based on this knowledge. This is the most detailed approach to security testing, allowing the fuzzer to fully access the source code, architecture, and internal mechanisms of the system under test. This method allows the fuzzer to analyze data flows, entry points, processing algorithms, and other internal components to create the most effective test scenarios.
- Grey-box testing is a technique that allows us to test a specific part of a web application with partial knowledge of the system structure. It's a hybrid approach that combines the advantages of black-box and white-box testing. This method uses profiling data, code coverage, execution metrics, and performance statistics to optimize the fuzzing process.
- Black-box testing is when we completely ignore the internal structure of the system under test, analyzing only the execution results and its responses. In this approach, the fuzzer generates random or specially selected input data, working only with external interfaces.
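The difference between black-box and grey-box fuzzing is easiest to see in code. In the sketch below, the hypothetical `target` reports which branches it executed, standing in for real coverage instrumentation. The grey-box loop keeps only mutated inputs that reach new branches, whereas a black-box fuzzer would discard this signal and rely on outputs alone. All names here are illustrative assumptions.

```python
import random


def target(data):
    """Hypothetical instrumented target: returns the set of branches it executed."""
    covered = {"entry"}
    if data.startswith("F"):
        covered.add("F")
        if data.startswith("FU"):
            covered.add("FU")
            if data.startswith("FUZ"):
                covered.add("FUZ")
                if data.startswith("FUZZ"):
                    raise RuntimeError("bug reached")
    return covered


def mutate(rng, data):
    """Replace one character, or append one when the position is past the end."""
    alphabet = "FUZabc"
    pos = rng.randrange(len(data) + 1)
    return data[:pos] + rng.choice(alphabet) + data[pos + 1:]


def grey_box_fuzz(seed_input, iterations, rng):
    """Coverage-guided loop: keep only mutants that reach unseen branches."""
    corpus = [seed_input]
    seen = set(target(seed_input))
    for _ in range(iterations):
        candidate = mutate(rng, rng.choice(corpus))
        try:
            covered = target(candidate)
        except RuntimeError:
            return candidate, corpus      # crashing input found
        if not covered <= seen:           # new coverage: keep this input
            seen |= covered
            corpus.append(candidate)
    return None, corpus


if __name__ == "__main__":
    crash, corpus = grey_box_fuzz("a", 5000, random.Random(0))
    print("crashing input:", crash)
    print("corpus:", corpus)
```

Because each kept input is one step closer to the bug, the loop can build the crashing prefix incrementally instead of having to guess all four characters at once.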
Implementation in CI/CD
I set myself the goal of finding the most suitable fuzzer for implementation in the Jenkins pipeline, and simultaneously searching for vulnerabilities in applications developed or used within the IBS infrastructure. To this end, I created two pipelines using Jenkins to automate the scanning process. For the first, I chose Feroxbuster and Nuclei, and for the second, FFUF. I deployed the tools and ran the scanners in Docker containers.
It is worth noting that, although both approaches are fully automated using Jenkins, they have different operating logic, goals, and vulnerability detection methods.
Feroxbuster & Nuclei
The first pipeline works as follows:
- scans for hidden resources using Feroxbuster;
- analyzes found resources for known vulnerabilities using Nuclei;
- generates a report with found vulnerabilities and unwanted information.
It's worth noting that Feroxbuster also works independently, without a dictionary. It supports multiple threads to speed up the process, which is beneficial when integrated into CI/CD. It can use substitutions for dynamic paths and supports filtering.
Feroxbuster can also be combined with Subfinder to scan subdomains, although I personally haven't used this feature.
Nuclei, used in conjunction with Feroxbuster, is an open-source tool with highly flexible configuration options. It scans the identified URLs for known vulnerabilities, targeting predefined vulnerability patterns (XSS, SQL injection, SSRF, LFI, and so on), and has a large, rapidly growing community that keeps expanding its functionality. Nuclei can scan URLs, IP addresses, and domains. It is currently considered one of the best and most versatile tools in its class.
Nuclei supports a variety of templates and allows you to add your own when a more detailed vulnerability analysis of your own project is required. It sends its own HTTP requests to the discovered endpoints. In my implementation, it sends requests to the paths identified by Feroxbuster, analyzes the server responses, and searches for vulnerability signatures.
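The hand-off between the two tools can be sketched as follows. This is not the Jenkinsfile from my setup, which isn't reproduced in this article, but an illustration of the three steps under stated assumptions: the file names, thread count, severity list, and `app.example.test` host are hypothetical, and the commands are only built as argument lists rather than executed. The sample text stands in for tool output and assumes one URL per line.

```python
import json
import re

URL_RE = re.compile(r"https?://\S+")


def feroxbuster_cmd(target_url, wordlist, threads=20, output="ferox.txt"):
    """Step 1: command line for the discovery scan (built here, not executed)."""
    return ["feroxbuster", "--url", target_url, "--wordlist", wordlist,
            "--threads", str(threads), "--silent", "--output", output]


def extract_urls(discovery_output):
    """Collect unique URLs from the discovery output, preserving first-seen order."""
    urls = []
    for line in discovery_output.splitlines():
        match = URL_RE.search(line)
        if match and match.group(0) not in urls:
            urls.append(match.group(0))
    return urls


def nuclei_cmd(url_list, output="nuclei.txt", severity="medium,high,critical"):
    """Step 2: command line that scans the discovered URLs (built, not executed)."""
    return ["nuclei", "-l", url_list, "-severity", severity, "-o", output]


def build_report(urls, scanner_output):
    """Step 3: combine both results; finding lines are kept as opaque text."""
    findings = [line for line in scanner_output.splitlines() if line.strip()]
    return {"scanned_urls": len(urls), "findings": findings}


if __name__ == "__main__":
    print(" ".join(feroxbuster_cmd("https://app.example.test", "words.txt")))
    # Made-up stand-in for the discovery output, one URL per line.
    sample = "https://app.example.test/admin\nhttps://app.example.test/api/docs\n"
    urls = extract_urls(sample)
    print(" ".join(nuclei_cmd("urls.txt")))
    print(json.dumps(build_report(urls, "finding A\nfinding B\n"), indent=2))
```

In a Jenkins pipeline, each command would typically run in its own stage inside a Docker container, with the URL list passed between stages as a file.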
FFUF
In the second pipeline, I used only one tool: FFUF. It's the easiest to configure and integrate into CI/CD. FFUF works by "polling" the scanned URL using pre-prepared dictionaries and dynamic parameter substitution. It's probably safe to say it's a purebred fuzzer that:
- performs fuzzing of URL paths, headers, and the request body;
- supports various fuzzing modes (FUZZ: replacing a word in a URL; HEADERS, BODY, PARAMS: fuzzing of headers, request body, and parameters);
- uses internal templates to detect vulnerabilities (unlike Feroxbuster, it will not run without a template);
- supports filtering of results.
The FFUF development team has released several dictionaries of varying sizes. They are quite versatile and can be used for a wide range of web applications. In one of my tests, I used the developers' largest dictionary, which contains 600 million entries.
During scanning, FFUF sequentially inserts values from dictionaries into the specified templates, creates HTTP requests with various parameters, sends them to the target resource, and analyzes the server responses, comparing the results with the specified filtering criteria. This reveals unexpected responses, server errors, and information leaks, which point to potential vulnerabilities.
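The substitute, send, and filter cycle described above can be imitated offline in a few lines. This is a simplified model of the idea, not FFUF's actual code: `fake_server` is a hypothetical stand-in for the target, and the URLs, status codes, and sizes in it are made up.

```python
from dataclasses import dataclass


@dataclass
class Result:
    url: str
    status: int
    size: int


def expand(template, words, keyword="FUZZ"):
    """Substitute each dictionary word for the keyword in the URL template."""
    if keyword not in template:
        raise ValueError(f"template must contain the {keyword} keyword")
    return [template.replace(keyword, word) for word in words]


def fake_server(url):
    """Stand-in for the real target so that the sketch runs offline."""
    pages = {
        "https://app.example.test/admin": (403, 120),
        "https://app.example.test/backup.zip": (200, 50_000),
    }
    status, size = pages.get(url, (404, 0))
    return Result(url, status, size)


def scan(template, words, send, filter_statuses=(404,)):
    """Send every expanded URL and drop responses whose status is filtered out."""
    results = (send(url) for url in expand(template, words))
    return [r for r in results if r.status not in filter_statuses]


if __name__ == "__main__":
    words = ["admin", "index", "backup.zip"]
    for result in scan("https://app.example.test/FUZZ", words, fake_server):
        print(result.status, result.size, result.url)
```

With the real tool, the same idea corresponds roughly to passing the template through `-u`, the dictionary through `-w`, and the status filter through `-fc`.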
Comparison of pipelines
I have compiled the main differences in a table.
The main difference lies in purpose. While Feroxbuster and Nuclei are used to first find hidden paths and then analyze them for known vulnerabilities, FFUF is a straightforward fuzzer. We feed data into the tool and find vulnerabilities directly, without checking the results against any templates or databases. The first pipeline provides broader coverage, while the second performs a more targeted search.
Two CI/CD pipelines are needed to:
- Use different scanning levels. Feroxbuster and Nuclei are better suited for general scanning and identifying web resource issues, while FFUF identifies data entry vulnerabilities.
- Achieve different goals. Feroxbuster and Nuclei, when paired, are good for detecting vulnerabilities in existing paths. FFUF, on the other hand, is suitable for finding vulnerabilities in request parameters that may be hidden from standard scanning.
Vulnerabilities found
To analyze the two pipelines, I chose an internal resource, the Nexus artifact repository. After scanning with the two methods described, I found a number of vulnerabilities. I'll list the most interesting ones:
- A vulnerability exists in Apache Struts, a framework for developing Java web applications. It falls under the Remote Code Execution (RCE) category, meaning an attacker can execute arbitrary code on the server by sending specially crafted data. This vulnerability was discovered by Nuclei after running template tests. Importantly, the developer is aware of this bug, and it has already been patched in a more recent version.
- One of the initialization files exposes the nginx version (1.18), which could allow an attacker to target the resource through a vulnerability known for that version. This is not the most recent nginx release, so there are plenty of known vulnerabilities to choose from.
- The resource's API documentation was found publicly available. Publicly publishing this data could allow an attacker to obtain the full API map and conduct further attacks using it.
- FFUF found a page with a Docker token for accessing a Docker image repository, which could allow attackers to obtain sensitive information related to the repository's contents.
- FFUF also found a URL that could hypothetically execute PHP code on the page (if the server executes the string passed as a function call).
- An attempt to exploit a JavaScript prototype pollution vulnerability was also detected. If the site uses a JavaScript library with this vulnerability, it could become an entry point for attackers.
Furthermore, the fuzzer corrupted the Docker artifact repository during its operation. This points to publicly exposed vulnerabilities that attackers could have exploited to harm the company.
Conclusions
When a web application is used in mission-critical systems, a vulnerability can have serious consequences. Fuzz scanning is a powerful tool for identifying hidden vulnerabilities that can be exploited by an attacker.
The implemented tools are quite flexible and easy to configure within the Jenkins pipeline (this was an important factor in the selection process). They work without any additional setup steps, yet offer extensive functionality. Jenkins and Docker with basic images are sufficient for implementation.
Automating the scanning process with Jenkins has significantly improved the speed and reliability of security testing. Using Jenkins Pipeline specifically has integrated fuzzing into the standard development process, making it part of regular security audits. FFUF testing on a moderately large dictionary takes approximately 5-7 minutes, meaning code can be checked more frequently.
There's just one caveat: I scanned only externally accessible resources. If a more complex authentication system such as SSO is used, the fuzzer won't work out of the box. This will require either manual testing or time-consuming configuration of the scanner. In general, though, fuzzing is typically used during the development phase; there's no point in integrating it into a secure production environment. The approach isn't limited to web applications, either: each stack has its own fuzzing tools, and the methodology is very broad. Even GOST R 56939 requires fuzzing.