Code Search
49 papers with code • 5 benchmarks • 10 datasets
The goal of Code Search is to retrieve code fragments from a large code corpus that most closely match a developer’s intent, which is expressed in natural language.
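The retrieval step described above can be sketched with a simple lexical baseline: represent the query and each code fragment as bags of words and rank fragments by cosine similarity. This is a minimal illustrative sketch, not any cited paper's method; the function and corpus names are invented for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    # Naive tokenizer: lowercase and split on non-alphanumeric characters.
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    # Cosine similarity between two bag-of-words Counters.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query, corpus):
    # Rank code fragments by similarity to the natural-language query,
    # dropping fragments that share no tokens with it.
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(code))), code) for code in corpus]
    return [code for score, code in sorted(scored, reverse=True) if score > 0]

corpus = [
    "def read_file(path): return open(path).read()",
    "def sort_list(xs): return sorted(xs)",
    "def http_get(url): return requests.get(url).text",
]
```

Neural code search models replace the bag-of-words representation with learned embeddings of code and text, but the ranking step works the same way.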
Libraries
Use these libraries to find Code Search models and implementations
Datasets
Latest papers with no code
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search
In code search, the Generation-Augmented Retrieval (GAR) framework generates exemplar code snippets to augment natural language queries. It has emerged as a promising strategy for the principal challenge of modality misalignment between code snippets and queries, particularly given the demonstrated code generation capabilities of Large Language Models (LLMs).
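The GAR idea can be sketched as follows. The `generate_exemplar` function is a hypothetical stand-in for an LLM call, stubbed with a canned snippet so the sketch runs offline; it is not the paper's implementation.

```python
def generate_exemplar(query):
    # Stand-in for an LLM prompt such as "Write code that <query>".
    # Stubbed with a canned snippet so the example runs offline.
    canned = {
        "read a file": "def read_file(path):\n    return open(path).read()",
    }
    return canned.get(query, "")

def augment_query(query):
    # GAR appends generated code to the NL query, so a downstream
    # retriever compares code against code instead of text against code.
    exemplar = generate_exemplar(query)
    return query + "\n" + exemplar if exemplar else query
```

The augmented query is then fed to an ordinary retriever; when no exemplar can be generated, the original query is used unchanged.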
Code Search Debiasing: Improve Search Results beyond Overall Ranking Performance
To mitigate biases, we develop a general debiasing framework that employs reranking to calibrate search results.
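The retrieve-then-rerank pattern behind such a framework can be sketched as follows. Both toy scorers here are illustrative assumptions; the paper's calibration model is not reproduced.

```python
def first_stage(query, corpus, k=2):
    # Toy first-stage retriever: rank documents by the number of
    # lowercase words they share with the query, keep the top k.
    qwords = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: -len(qwords & set(d.lower().split())))
    return ranked[:k]

def rerank(query, candidates, score):
    # Reorder first-stage candidates with a (stronger) second-stage scorer.
    return sorted(candidates, key=lambda d: -score(query, d))

corpus = [
    "parse json from a string",
    "parse an xml document",
    "sort a list of numbers",
]
candidates = first_stage("parse json", corpus)
# Toy second-stage scorer: total occurrences of query words in the document.
reranked = rerank("parse json", candidates,
                  lambda q, d: sum(d.count(w) for w in q.split()))
```

In a debiasing setting, the second-stage scorer would be a calibrated model that corrects systematic biases in the first-stage ranking rather than a simple word-count heuristic.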
GenCodeSearchNet: A Benchmark Test Suite for Evaluating Generalization in Programming Language Understanding
Language models can serve as a valuable tool for software developers to increase productivity.
Noisy Pair Corrector for Dense Retrieval
Most dense retrieval models contain an implicit assumption: the training query-document pairs are exactly matched.
Contrastive Prompt Learning-based Code Search based on Interaction Matrix
However, existing code search methods still suffer from two performance constraints: inadequate semantic representation and the semantic gap between natural language (NL) and programming language (PL).
Code Representation Pre-training with Complements from Program Executions
The test cases are obtained with the assistance of a customized fuzzer and are only required during pre-training.
Laminar: A New Serverless Stream-based Framework with Semantic Code Search and Code Completion
This paper introduces Laminar, a novel serverless framework based on dispel4py, a parallel stream-based dataflow library.
Evaluating and Optimizing the Effectiveness of Neural Machine Translation in Supporting Code Retrieval Models: A Study on the CAT Benchmark
Our NMT models that learn the ASTTrans representation can boost the Mean Reciprocal Rank of these state-of-the-art code search processes by up to 3.08% and improve the results of 23.08% of queries on the CAT benchmark.
CCT-Code: Cross-Consistency Training for Multilingual Clone Detection and Code Search
We consider the clone detection and information retrieval problems for source code, well-known tasks important for any programming language.
Searching by Code: a New SearchBySnippet Dataset and SnippeR Retrieval Model for Searching by Code Snippets
Code search is an important task that has seen many developments in recent years.