Intelligent Code Completion Using Distributed Representation of Code

Martin WEYSSOW

Student thesis: Master types › Master in Computer Science with a Specialised Focus in Data Science

Abstract

Code completion is an important feature of integrated development environments (IDEs). It allows developers to produce code faster, especially novices who are not fully familiar with APIs and with other people's code. Previous work on code completion has mainly exploited the static type systems of programming languages, or the code history of the project under development or of other projects using common APIs. In this work, we present a novel approach for improving current function-call completion tools by learning from independent code repositories, using well-known natural language processing models that can learn vector representations of source code (code embeddings). Our models are not trained on the historical data of specific projects. Instead, our approach learns high-level concepts and the relationships among them across thousands of projects. As a consequence, the resulting system is able to provide general suggestions that are not specific to particular projects or APIs. Additionally, by taking into account the context of the call to complete, our approach suggests function calls relevant to that context. We evaluated our approach on a set of open-source projects unseen during training. The results show that using the trained model together with a code suggestion plug-in based on static type analysis significantly improves the correctness of the completion suggestions.
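
As a rough illustration of the idea described in the abstract, the sketch below re-ranks a list of candidate function calls by how close their embeddings are to the embedding of the surrounding context. It is not the thesis's implementation: the toy vectors in `EMBEDDINGS`, the averaging of context vectors, the cosine-similarity scoring, and the names `mean_vector` and `rank_candidates` are all assumptions made for this example. The candidate list stands in for what a static-type-based plug-in might propose.

```python
import math

# Hypothetical toy embeddings (assumption): a real system would learn
# these vectors from a large corpus of independent code repositories.
EMBEDDINGS = {
    "open": [0.9, 0.1, 0.0],
    "read": [0.8, 0.2, 0.1],
    "close": [0.7, 0.3, 0.0],
    "sort": [0.0, 0.1, 0.9],
    "append": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mean_vector(tokens):
    """Average the embeddings of the known tokens; None if none are known."""
    vectors = [EMBEDDINGS[t] for t in tokens if t in EMBEDDINGS]
    if not vectors:
        return None
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def rank_candidates(context_tokens, candidates):
    """Order candidates by similarity to the context, most similar first.

    `candidates` plays the role of the suggestions produced by a
    type-based completion engine (assumption for this sketch).
    """
    context = mean_vector(context_tokens)
    if context is None:
        # No usable context: keep the original order.
        return list(candidates)

    def score(name):
        vec = EMBEDDINGS.get(name)
        # Candidates without an embedding are pushed to the end.
        return cosine(context, vec) if vec is not None else -1.0

    return sorted(candidates, key=score, reverse=True)


# Calls already seen before the completion point (the context).
ranked = rank_candidates(["open", "read"], ["sort", "close", "append"])
print(ranked)
```

With these toy vectors, `close` is ranked first after a context of `open` and `read`, because its vector points in nearly the same direction as the averaged context vector.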

Date of award	1 Sep 2020
Original language	English
Awarding institution	Université de Namur
Supervisor	Benoit Vanderose (Supervisor) & Benoît Frénay (Co-supervisor)

Documents

2020_WeyssowM_memoire
File: application/pdf, 2.25 MB
Type: Thesis