Hello, I am a Ph.D. candidate at Wuhan University under the supervision of Prof.
My research interests include software security and program analysis.
Feel free to contact me via email.
-
RIMFuzz: real-time impact-aware mutation for library API fuzzing
Xiaoke Wang and Lei Zhao*.
(* Corresponding author)
In Journal of King Saud University Computer and Information Sciences.
As libraries merely expose APIs to developers rather than directly handling user input, applying fuzzing to libraries requires fuzz drivers to help process fuzzer-provided input and invoke APIs. To reduce manual effort and avoid reliance on additional samples, some techniques generate fuzz drivers during fuzzing by modeling the test cases to describe API calls and permitting the mutation on the execution sequence as well as argument values of API calls. However, such techniques schedule the sequence and value mutation via inflexible thresholds and randomly select the objects for mutators, which fails to consider the importance of sequence and value mutation in varying stages of fuzzing and the inherent differences between APIs.
In this work, we present RIMFuzz, which employs a real-time impact-aware mutation strategy for library API fuzzing. Specifically, RIMFuzz infers the real-time impact of APIs on coverage during fuzzing, while capturing the benefits of mutations on the impact. Based on the dynamic feedback that sequence and value mutation bring to the impact, RIMFuzz adjusts the probability of selecting them accordingly. Moreover, both the activated impact of each API and the number of times the API has been selected are considered to determine which object is to be operated by distinct mutators. The experimental results show that RIMFuzz outperforms baselines in code coverage and can be applied to test real-world libraries at a minor development cost. With the help of RIMFuzz, we reported 11 new bugs to the corresponding maintainers, of which 9 have been fixed.
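The feedback-driven scheduling described above can be illustrated with a minimal sketch. It is not RIMFuzz's actual formula: the class `MutationScheduler`, the helper `api_weights`, the decay factor, the probability floor, and the weight `impact / (1 + picks)` are all assumptions chosen only to show how coverage feedback could shift the probability between sequence and value mutation, and how an API's impact and selection count could be combined.

```python
import random


class MutationScheduler:
    """Hypothetical scheduler: favors the mutation kind with more recent gain."""

    def __init__(self, decay=0.9, floor=0.1):
        self.decay = decay    # how quickly old feedback fades (assumed value)
        self.floor = floor    # minimum probability so neither kind starves
        self.gain = {"sequence": 1.0, "value": 1.0}

    def probabilities(self):
        total = sum(self.gain.values())
        clipped = {k: max(g / total, self.floor) for k, g in self.gain.items()}
        norm = sum(clipped.values())
        return {k: p / norm for k, p in clipped.items()}

    def choose(self, rng=random):
        probs = self.probabilities()
        kinds = list(probs)
        return rng.choices(kinds, weights=[probs[k] for k in kinds], k=1)[0]

    def update(self, kind, new_edges):
        # Fade all past feedback, then credit the kind that was just used.
        for k in self.gain:
            self.gain[k] *= self.decay
        self.gain[kind] += new_edges


def api_weights(impact, picks):
    """Assumed weighting: high-impact, rarely selected APIs get more weight."""
    return {api: impact[api] / (1 + picks.get(api, 0)) for api in impact}


scheduler = MutationScheduler()
scheduler.update("value", new_edges=5)   # a value mutation found 5 new edges
print(scheduler.probabilities())
print(api_weights({"png_read": 4.0, "png_write": 4.0}, {"png_read": 3}))
```

In this sketch, the mutation kind that recently produced new coverage is chosen more often, while the floor keeps the other kind from being dropped entirely. The API names in the usage lines are placeholders.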
@article{wang2025rimfuzz,
title={RIMFuzz: real-time impact-aware mutation for library API fuzzing},
author={Wang, Xiaoke and Zhao, Lei},
journal={Journal of King Saud University Computer and Information Sciences},
volume={37},
number={4},
pages={1--17},
year={2025},
publisher={Springer}
}
-
Input-Driven Dynamic Program Debloating for Code-Reuse Attack Mitigation
Xiaoke Wang, Tao Hui, Lei Zhao*, and Yueqiang Cheng.
(* Corresponding author)
In Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023).
Modern software is bloated, especially libraries. The unnecessary code not only introduces severe vulnerabilities but also helps attackers construct exploits. To mitigate the damage of bloated libraries, researchers have proposed several debloating techniques to remove or restrict the invocation of unused code in a library. However, existing approaches either statically keep code for all expected inputs, which leaves unused code for each concrete input, or rely on runtime context to dynamically determine the necessary code, which could be manipulated by attackers.
In this paper, we propose Picup, a practical approach that dynamically customizes libraries for each input. Based on the observation that the behavior of a program mainly depends on the given input, we design Picup to predict the necessary library functions immediately after we get the input, which erases the unused code before attackers can affect the decision-making data. To achieve an effective prediction, we adopt a convolutional neural network (CNN) with attention mechanism to extract key bytes from the input and map them to library functions. We evaluate Picup on real-world benchmarks and popular applications. The results show that we can predict the necessary library functions with 97.56% accuracy, and reduce the code size by 87.55% on average with low overheads. These results indicate that Picup is a practical solution for secure and effective library debloating.
@inproceedings{wang2023picup,
title={Input-Driven Dynamic Program Debloating for Code-Reuse Attack Mitigation},
author={Wang, Xiaoke and Hui, Tao and Zhao, Lei and Cheng, Yueqiang},
booktitle={Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering},
pages={934--946},
year={2023}
}
-
APICAD: Augmenting API Misuse Detection through Specifications from Code and Documents
Xiaoke Wang and Lei Zhao*.
(* Corresponding author)
In Proceedings of the 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE).
API usage should follow the API's specifications. Otherwise, functionality can be damaged and security issues can arise. To detect API misuse, we need to know what the specifications are. Besides relying on manually provided specifications, current tools usually mine the majority usage in an existing codebase as specifications, or capture specifications from relevant natural-language texts. However, the former depends on the quality of the codebase itself, while the latter is limited by the irregularity of the text. In this work, we observe that the information carried by code and documents can complement each other. To mitigate the demand for a high-quality codebase and reduce the pressure to capture valid information from texts, we present APICAD to detect API misuse bugs in C/C++ code by combining the specifications mined from code and documents. On the one hand, we effectively build the contexts for API invocations and mine specifications from them through a frequency-based method. On the other hand, we acquire the specifications from documents by using lightweight keyword-based and NLP-assisted techniques. Finally, the combined specifications are generated for bug detection. Experiments show that APICAD can handle diverse API usage semantics to deal with different types of API misuse bugs. With the help of APICAD, we report 153 new bugs in Curl, Httpd, OpenSSL and Linux kernel, 145 of which have been confirmed and for 126 of which our patches have been applied.
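As a rough illustration of the frequency-based idea, the sketch below treats a usage property as a specification when most call sites of an API satisfy it, merges in document-derived specifications, and flags call sites that violate the result. The function names, the property label `ret_checked`, the threshold, and the input shape are hypothetical and are not taken from APICAD's implementation.

```python
from collections import Counter, defaultdict


def mine_specs(call_sites, threshold=0.75):
    """Keep a property as a spec if at least `threshold` of call sites hold it."""
    total = Counter()
    holds = defaultdict(Counter)
    for api, props in call_sites:
        total[api] += 1
        for prop in props:
            holds[api][prop] += 1
    return {
        api: {p for p, n in holds[api].items() if n / total[api] >= threshold}
        for api in total
    }


def combine_specs(code_specs, doc_specs):
    """Union of specs mined from code and specs extracted from documents."""
    merged = {api: set(props) for api, props in code_specs.items()}
    for api, props in doc_specs.items():
        merged.setdefault(api, set()).update(props)
    return merged


def find_misuses(call_sites, specs):
    """Report (index, api, missing properties) for call sites violating a spec."""
    misuses = []
    for index, (api, props) in enumerate(call_sites):
        missing = specs.get(api, set()) - set(props)
        if missing:
            misuses.append((index, api, sorted(missing)))
    return misuses


# Four call sites check malloc's return value, one does not.
sites = [("malloc", {"ret_checked"})] * 4 + [("malloc", set())]
specs = combine_specs(mine_specs(sites), {"fopen": {"ret_checked"}})
print(find_misuses(sites, specs))
```

The union in `combine_specs` is the simplest possible merge. It lets a documented rule cover an API that has too few call sites for frequency mining to be reliable.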
@inproceedings{wang2023apicad,
author={Wang, Xiaoke and Zhao, Lei},
booktitle={2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE)},
title={APICAD: Augmenting API Misuse Detection through Specifications from Code and Documents},
pages={245--256},
year={2023}
}
-
Alphuzz: Monte carlo search on seed-mutation tree for coverage-guided fuzzing
Yiru Zhao, Xiaoke Wang, Lei Zhao*, Yueqiang Cheng, and Heng Yin.
(* Corresponding author)
In Proceedings of the 38th Annual Computer Security Applications Conference.
Coverage-based greybox fuzzing (CGF) has been proven effective in finding security vulnerabilities. Seed scheduling, the process of selecting an input as the seed from the seed pool for the next fuzzing iteration, plays a central role in CGF. Although numerous seed scheduling strategies have been proposed, most of them treat these seeds independently and do not explicitly consider the relationships among seeds.
In this study, we make a key observation that the relationships among seeds are valuable for seed scheduling. We design and propose a “seed mutation tree” by investigating and leveraging the mutation relationships among seeds. With the “seed mutation tree”, we further model the seed scheduling problem as a Monte-Carlo Tree Search (MCTS) problem. That is, we select the next seed for fuzzing by walking this “seed mutation tree” through an optimal path, based on the estimation of MCTS. We implement two prototypes, Alphuzz on top of AFL and Alphuzz++ on top of AFL++. The evaluation results on three datasets (the UniFuzz dataset, the CGC binaries, and 12 real-world binaries) show that Alphuzz and Alphuzz++ outperform state-of-the-art fuzzers with higher code coverage and more discovered vulnerabilities. In particular, Alphuzz discovers 3 new vulnerabilities with CVEs.
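The tree-walking selection described above can be sketched with a generic UCB1 rule, a standard choice in MCTS. This is a simplified illustration, not Alphuzz's actual scoring. The `SeedNode` class, the exploration constant `c`, the reward definition, and the choice to always descend to a leaf are assumptions.

```python
import math


class SeedNode:
    """A seed in the mutation tree; children are seeds mutated from it."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.visits = 0
        self.reward = 0.0
        if parent is not None:
            parent.children.append(self)


def ucb1(node, parent_visits, c=1.4):
    """Average reward plus an exploration bonus; unvisited nodes go first."""
    if node.visits == 0:
        return math.inf
    exploit = node.reward / node.visits
    explore = c * math.sqrt(math.log(parent_visits) / node.visits)
    return exploit + explore


def select_seed(root):
    """Walk from the root, taking the child with the best UCB1 score each step."""
    node = root
    while node.children:
        parent_visits = max(node.visits, 1)
        node = max(node.children, key=lambda child: ucb1(child, parent_visits))
    return node


def backpropagate(node, reward):
    """Credit the fuzzing outcome to the chosen seed and all of its ancestors."""
    while node is not None:
        node.visits += 1
        node.reward += reward
        node = node.parent


root = SeedNode("initial")
a = SeedNode("a", root)
b = SeedNode("b", root)
backpropagate(select_seed(root), 1.0)   # seed "a" found new coverage
backpropagate(select_seed(root), 0.0)   # seed "b" found nothing
print(select_seed(root).name)
```

In a real fuzzer, a seed that yields a new interesting input would gain a child node, and the reward would come from coverage feedback. Both are left out here to keep the sketch short.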
@inproceedings{zhao2022alphuzz,
title={Alphuzz: Monte carlo search on seed-mutation tree for coverage-guided fuzzing},
author={Zhao, Yiru and Wang, Xiaoke and Zhao, Lei and Cheng, Yueqiang and Yin, Heng},
booktitle={Proceedings of the 38th Annual Computer Security Applications Conference},
pages={534--547},
year={2022}
}