IntJect: Vulnerability Intent Bug Seeding

Benjamin PETIT, Ahmed Khanfir, Ezekiel Soremekun, Gilles Perrouin, Michail Papadakis

Research output: Contribution to a book/catalogue/report/conference proceedings > Article in conference proceedings


Abstract

Studying and exposing software vulnerabilities is important to ensure software security, safety, and reliability. Software engineers often inject vulnerabilities into their programs to test the reliability of their test suites, vulnerability detectors, and security measures. However, state-of-the-art vulnerability injection methods capture only code syntax and patterns; they do not learn the intent of the vulnerability and are limited to the syntax of the original dataset. To address this challenge, we propose the first intent-based vulnerability injection method that learns both the program syntax and the vulnerability intent. Our approach applies a combination of NLP methods and semantic-preserving program mutations (at the bytecode level) to inject code vulnerabilities. Given a dataset of known vulnerabilities (containing benign and vulnerable code pairs), our approach first employs semantic-preserving program mutations to transform the existing dataset into semantically similar code. Then, it learns the intent of the vulnerability via neural machine translation (Seq2Seq) models. The key insight is to employ Seq2Seq to learn the intent (context) of the vulnerable code in a manner that is agnostic to the specific program instance. We evaluate the performance of our approach using 1275 vulnerabilities belonging to five (5) CWEs from the Juliet test suite. We examine the effectiveness of our approach in producing compilable and vulnerable code. Our results show that IntJECT is effective: almost all (99%) of the code produced by our approach is vulnerable and compilable. We also demonstrate that the vulnerable programs generated by IntJECT are semantically similar to the withheld original vulnerable code. Finally, we show that our mutation-based data transformation approach outperforms its alternatives, namely data obfuscation and using the original data.

Original language: English
Title of host publication: Proceedings - 2022 IEEE 22nd International Conference on Software Quality, Reliability and Security, QRS 2022
Pages: 19-30
Number of pages: 12
ISBN (Electronic): 9781665477048
Publication status: Published - 2022

Publication series

Name: IEEE International Conference on Software Quality, Reliability and Security, QRS
Volume: 2022-December
ISSN (Print): 2693-9177
