Neural Code Search Revisited: Enhancing Code Snippet Retrieval through Natural Language Intent

27 Aug 2020 · Geert Heyman, Tom Van Cutsem

In this work, we propose and study annotated code search: the retrieval of code snippets paired with brief descriptions of their intent using natural language queries. On three benchmark datasets, we investigate how code retrieval systems can be improved by leveraging descriptions to better capture the intents of code snippets. Building on recent progress in transfer learning and natural language processing, we create a domain-specific retrieval model for code annotated with a natural language description. We find that our model yields significantly more relevant search results (with absolute gains up to 20.6% in mean reciprocal rank) compared to state-of-the-art code retrieval methods that do not use descriptions but attempt to compute the intent of snippets solely from unannotated code.
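The retrieval setup the abstract describes — embedding a natural language query and the natural language descriptions attached to code snippets into a shared space, then ranking snippets by similarity — can be illustrated with a minimal sketch. The sketch below uses the off-the-shelf Universal Sentence Encoder from TensorFlow Hub and cosine similarity; the paper's USE-tuned model additionally fine-tunes such an encoder on code-search data, so this is an illustration of the idea, not the authors' implementation.

```python
# Minimal sketch (not the authors' implementation): rank annotated snippets by the
# cosine similarity between an embedded query and the snippets' embedded descriptions.
# Assumes the pre-trained Universal Sentence Encoder from TensorFlow Hub.
import numpy as np
import tensorflow_hub as hub

encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

# Each candidate pairs a code snippet with a brief natural language description of its intent.
candidates = [
    {"description": "sort a list of tuples by the second element",
     "code": "sorted(pairs, key=lambda p: p[1])"},
    {"description": "read a file into a list of lines",
     "code": "with open(path) as f: lines = f.readlines()"},
]

def rank(query, candidates):
    """Return (similarity, candidate) pairs, most similar description first."""
    vectors = encoder([query] + [c["description"] for c in candidates]).numpy()
    q, docs = vectors[0], vectors[1:]
    sims = docs @ q / (np.linalg.norm(docs, axis=1) * np.linalg.norm(q) + 1e-8)
    return sorted(zip(sims.tolist(), candidates), key=lambda pair: -pair[0])

for score, cand in rank("how do I sort tuples by their second value", candidates):
    print(f"{score:.3f}  {cand['code']}")
```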


Results from the Paper


Task                   Dataset        Model                   Metric  Value  Global Rank
Annotated Code Search  PACS-CoNaLa    Ensemble:USE-tuned+NCS  MRR     0.351  # 1
Annotated Code Search  PACS-CoNaLa    USE-tuned               MRR     0.340  # 2
Annotated Code Search  PACS-CoNaLa    USE                     MRR     0.181  # 3
Annotated Code Search  PACS-CoNaLa    NCS                     MRR     0.167  # 4
Annotated Code Search  PACS-SO-DS     Ensemble:USE-tuned+NCS  MRR     0.323  # 1
Annotated Code Search  PACS-SO-DS     USE-tuned               MRR     0.304  # 2
Annotated Code Search  PACS-SO-DS     USE                     MRR     0.244  # 3
Annotated Code Search  PACS-SO-DS     NCS                     MRR     0.113  # 4
Annotated Code Search  PACS-StaQC-py  Ensemble:USE-tuned+NCS  MRR     0.126  # 1
Annotated Code Search  PACS-StaQC-py  USE-tuned               MRR     0.117  # 2
Annotated Code Search  PACS-StaQC-py  USE                     MRR     0.104  # 3
Annotated Code Search  PACS-StaQC-py  NCS                     MRR     0.030  # 4
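All results above are reported as mean reciprocal rank (MRR): for each query, take the reciprocal of the rank at which the first relevant snippet appears, then average over all queries. A small illustrative computation (the ranks below are made up, not taken from the paper):

```python
# Illustrative only: mean reciprocal rank (MRR), the metric used in the table above.
def mean_reciprocal_rank(first_relevant_ranks):
    """Average of 1/rank of the first relevant result over all queries."""
    return sum(1.0 / r for r in first_relevant_ranks) / len(first_relevant_ranks)

# Example: the relevant snippet is ranked 1st, 3rd, and 10th for three queries.
print(mean_reciprocal_rank([1, 3, 10]))  # (1 + 1/3 + 1/10) / 3 ≈ 0.478
```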

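The strongest system on all three benchmarks is an ensemble of the fine-tuned description retriever (USE-tuned) and the code-based retriever (NCS). This page does not spell out how the two models' scores are combined; the sketch below shows one plausible score-level combination, where the min-max normalization and equal weighting are assumptions for illustration rather than the paper's scheme.

```python
# Hypothetical score-level ensemble of a description-based retriever and a
# code-based retriever (e.g., NCS). Normalization and weighting are assumptions.
import numpy as np

def minmax(scores):
    """Rescale scores to [0, 1] so the two retrievers are comparable."""
    scores = np.asarray(scores, dtype=float)
    span = scores.max() - scores.min()
    return (scores - scores.min()) / span if span > 0 else np.zeros_like(scores)

def ensemble_scores(desc_scores, code_scores, weight=0.5):
    """Combine per-candidate scores from the two retrievers into one ranking score."""
    return weight * minmax(desc_scores) + (1 - weight) * minmax(code_scores)

# Example: three candidate snippets scored by each retriever.
combined = ensemble_scores([0.82, 0.40, 0.65], [3.1, 4.0, 1.2])
print(np.argsort(-combined))  # candidate indices, best first
```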