1 code implementation • 9 May 2023 • Dung Nguyen Manh, Nam Le Hai, Anh T. V. Dau, Anh Minh Nguyen, Khanh Nghiem, Jin Guo, Nghi D. Q. Bui
We present The Vault, a dataset of high-quality code-text pairs in multiple programming languages for training large language models to understand and generate code.
1 code implementation • 2 May 2023 • Thang Nguyen-Duc, Hoang Thanh-Tung, Quan Hung Tran, Dang Huu-Tien, Hieu Ngoc Nguyen, Anh T. V. Dau, Nghi D. Q. Bui
Influence functions (IFs) are a powerful tool for detecting anomalous examples in large-scale datasets.
no code implementations • 25 May 2022 • Anh T. V. Dau, Thang Nguyen-Duc, Hoang Thanh-Tung, Nghi D. Q. Bui
Despite the recent trend of developing and applying neural source code models to software engineering tasks, the quality of such models is insufficient for real-world use.