Exploring ChatGPT's Capabilities on Vulnerability Management

Authors: 

Peiyu Liu and Junming Liu, Zhejiang University NGICS Platform; Lirong Fu, Hangzhou Dianzi University; Kangjie Lu, University of Minnesota; Yifan Xia, Zhejiang University NGICS Platform; Xuhong Zhang, Zhejiang University and Jianghuai Advance Technology Center; Wenzhi Chen, Zhejiang University; Haiqin Weng, Ant Group; Shouling Ji, Zhejiang University; Wenhai Wang, Zhejiang University NGICS Platform

Abstract: 

ChatGPT has recently attracted considerable attention in the code analysis domain. Prior work shows that ChatGPT can handle foundational code analysis tasks, such as abstract syntax tree generation, which indicates its potential to comprehend code syntax and static behaviors. However, it is unclear whether ChatGPT can complete more complicated real-world vulnerability management tasks, such as predicting security relevance and patch correctness, which require a comprehensive understanding of code syntax, program semantics, and related manual comments.

In this paper, we explore ChatGPT's capabilities on 6 tasks covering the complete vulnerability management process, using a large-scale dataset containing 70,346 samples. For each task, we compare ChatGPT against state-of-the-art (SOTA) approaches, investigate the impact of different prompts, and explore the difficulties. The results suggest promising potential in leveraging ChatGPT to assist vulnerability management. One notable example is ChatGPT's proficiency in tasks such as generating titles for software bug reports. Furthermore, our findings reveal the difficulties encountered by ChatGPT and shed light on promising future directions. For instance, directly providing random demonstration examples in the prompt does not consistently guarantee good performance in vulnerability management. By contrast, leveraging ChatGPT in a self-heuristic way (having it extract expertise from the demonstration examples and then integrating the extracted expertise into the prompt) is a promising research direction. In addition, ChatGPT may misunderstand and misuse the information in the prompt. Consequently, effectively guiding ChatGPT to focus on helpful information rather than irrelevant content is still an open problem.
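
The contrast the abstract draws between pasting demonstration examples into the prompt and the self-heuristic approach can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the prompt wording, and the `ask_model` callable (a stand-in for whatever chat model client is used) are all assumptions made for illustration.

```python
from typing import Callable, Sequence, Tuple

# A demonstration example: (input text, known answer). Hypothetical type alias.
Demo = Tuple[str, str]


def _format_demos(demos: Sequence[Demo]) -> str:
    """Render demonstration examples as plain-text input/answer pairs."""
    return "\n\n".join(f"Input: {x}\nAnswer: {y}" for x, y in demos)


def few_shot_prompt(task: str, demos: Sequence[Demo], query: str) -> str:
    """Baseline: place the demonstration examples directly in the prompt."""
    return f"{task}\n\n{_format_demos(demos)}\n\nInput: {query}\nAnswer:"


def self_heuristic_prompt(
    task: str,
    demos: Sequence[Demo],
    query: str,
    ask_model: Callable[[str], str],
) -> str:
    """Two-step variant (illustrative wording, not taken from the paper).

    Step 1 asks the model to summarize expertise from the demonstrations.
    Step 2 builds the task prompt from that summary instead of the raw examples.
    """
    extraction_prompt = (
        f"Task: {task}\n\n"
        f"Solved examples:\n\n{_format_demos(demos)}\n\n"
        "Summarize the general rules that explain these answers."
    )
    expertise = ask_model(extraction_prompt)
    return (
        f"{task}\n\n"
        f"Expertise to apply:\n{expertise}\n\n"
        f"Input: {query}\nAnswer:"
    )
```

In this sketch, the final prompt produced by `self_heuristic_prompt` carries only the summarized expertise, so its length does not grow with the number of demonstration examples. Whether that summary helps on a given task is an empirical question of the kind the paper evaluates.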

BibTeX
@inproceedings {299549,
author = {Peiyu Liu and Junming Liu and Lirong Fu and Kangjie Lu and Yifan Xia and Xuhong Zhang and Wenzhi Chen and Haiqin Weng and Shouling Ji and Wenhai Wang},
title = {Exploring {ChatGPT{\textquoteright}s} Capabilities on Vulnerability Management},
booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
year = {2024},
isbn = {978-1-939133-44-1},
address = {Philadelphia, PA},
pages = {811--828},
url = {https://www.usenix.org/conference/usenixsecurity24/presentation/liu-peiyu},
publisher = {USENIX Association},
month = aug
}
