Samiha Shimmi, Ashiqur Rahman, and Mohan Gadde, Northern Illinois University; Hamed Okhravi, MIT Lincoln Laboratory; Mona Rahimi, Northern Illinois University
Despite decades of research in vulnerability detection, vulnerabilities in source code remain a growing problem, and more effective techniques are needed in this domain. To enhance software vulnerability detection, in this paper, we first show that various vulnerability classes in the C programming language share common characteristics, encompassing semantic, contextual, and syntactic properties. We then leverage this knowledge to enhance the learning process of Deep Learning (DL) models for vulnerability detection when only sparse data is available. To achieve this, we extract multiple dimensions of information from the available, albeit limited, data. We then consolidate this information into a unified space, allowing for the identification of similarities among vulnerabilities through nearest-neighbor embeddings. The combination of these steps allows us to improve the effectiveness and efficiency of vulnerability detection using DL models. Evaluation results demonstrate that our approach surpasses existing State-of-the-art (SOTA) models and exhibits strong performance on unseen data, thereby enhancing generalizability.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Samiha Shimmi and Ashiqur Rahman and Mohan Gadde and Hamed Okhravi and Mona Rahimi},
title = {{VulSim}: Leveraging Similarity of {Multi-Dimensional} Neighbor Embeddings for Vulnerability Detection},
booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
year = {2024},
isbn = {978-1-939133-44-1},
address = {Philadelphia, PA},
pages = {1777--1794},
url = {https://www.usenix.org/conference/usenixsecurity24/presentation/shimmi},
publisher = {USENIX Association},
month = aug
}