Search Results for author: Wasi Uddin Ahmad

Found 34 papers, 28 papers with code

IllusionVQA: A Challenging Optical Illusion Dataset for Vision Language Models

1 code implementation • 23 Mar 2024 • HAZ Sameen Shahgir, Khondker Salman Sayeed, Abhik Bhattacharjee, Wasi Uddin Ahmad, Yue Dong, Rifat Shahriyar

GPT4V, the best-performing VLM, achieves 62.99% accuracy (4-shot) on the comprehension task and 49.7% on the localization task (4-shot and Chain-of-Thought).

Ranked #1 on Object Localization on IllusionVQA

Common Sense Reasoning In-Context Learning +3

Paper
Code

Repoformer: Selective Retrieval for Repository-Level Code Completion

no code implementations • 15 Mar 2024 • Di Wu, Wasi Uddin Ahmad, Dejiao Zhang, Murali Krishna Ramanathan, Xiaofei Ma

Recent advances in retrieval-augmented generation (RAG) have initiated a new era in repository-level code completion.

Code Completion Retrieval +1

Paper

On Leveraging Encoder-only Pre-trained Language Models for Effective Keyphrase Generation

1 code implementation • 21 Feb 2024 • Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

This study addresses the application of encoder-only Pre-trained Language Models (PLMs) in keyphrase generation (KPG) amidst the broader availability of domain-tailored encoder-only models compared to encoder-decoder models.

Keyphrase Generation

Paper
Code

Rethinking Model Selection and Decoding for Keyphrase Generation with Pre-trained Sequence-to-Sequence Models

1 code implementation • 10 Oct 2023 • Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

DeSel improves greedy search by an average of 4.7% semantic F1 across five datasets.

Keyphrase Generation Model Selection

Paper
Code

CoCoMIC: Code Completion By Jointly Modeling In-file and Cross-file Context

no code implementations • 20 Dec 2022 • Yangruibo Ding, Zijian Wang, Wasi Uddin Ahmad, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, Bing Xiang

While pre-trained language models (LM) for code have achieved great success in code completion, they generate code conditioned only on the contents within the file, i.e., in-file context, but ignore the rich semantics in other files within the same project, i.e., cross-file context, a critical source of information that is especially useful in modern modular software development.

Code Completion

Paper

PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

1 code implementation • 20 Dec 2022 • Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

Privacy policies provide individuals with information about their rights and how their personal information is handled.

Language Modelling Natural Language Understanding

Paper
Code

Pre-trained Language Models for Keyphrase Generation: A Thorough Empirical Study

1 code implementation • 20 Dec 2022 • Di Wu, Wasi Uddin Ahmad, Kai-Wei Chang

However, a systematic study of how the two types of approaches compare and how different design choices can affect the performance of PLM-based models is lacking.

Keyphrase Extraction Keyphrase Generation

Paper
Code

Multi-lingual Evaluation of Code Generation Models

2 code implementations • 26 Oct 2022 • Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang

Using these benchmarks, we are able to assess the performance of code generation models in a multi-lingual fashion, and discovered generalization ability of language models on out-of-domain languages, advantages of multi-lingual models over mono-lingual, the ability of few-shot prompting to teach the model new languages, and zero-shot translation abilities even on mono-lingual settings.

Code Completion Code Translation +1

Paper
Code

ContraCLM: Contrastive Learning For Causal Language Model

no code implementations • 3 Oct 2022 • Nihal Jain, Dejiao Zhang, Wasi Uddin Ahmad, Zijian Wang, Feng Nan, Xiaopeng Li, Ming Tan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Xiaofei Ma, Bing Xiang

Specifically, we attain 44% relative improvement on the Semantic Textual Similarity tasks and 34% on Code-to-Code Search tasks.

Code Generation Code Search +4

Paper

FixEval: Execution-based Evaluation of Program Fixes for Programming Problems

1 code implementation • 15 Jun 2022 • Md Mahim Anjum Haque, Wasi Uddin Ahmad, Ismini Lourentzou, Chris Brown

To address this issue, we introduce FixEval, a benchmark comprising buggy code submissions to competitive programming problems and their corresponding fixes.

Bug fixing

Paper
Code

Summarize and Generate to Back-translate: Unsupervised Translation of Programming Languages

1 code implementation • 23 May 2022 • Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

In code generation, the model learns to do the opposite.

Code Generation Code Summarization +2

Paper
Code

BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla

2 code implementations • 23 May 2022 • Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Rifat Shahriyar

This work presents BanglaNLG, a comprehensive benchmark for evaluating natural language generation (NLG) models in Bangla, a widely spoken yet low-resource language.

Conditional Text Generation Dialogue Generation +1

Paper
Code

Retrieval Enhanced Data Augmentation for Question Answering on Privacy Policies

no code implementations • 19 Apr 2022 • Md Rizwan Parvez, Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, Kai-Wei Chang

Prior studies in privacy policies frame the question answering (QA) task as identifying the most relevant text segment or a list of sentences from a policy document given a user query.

Data Augmentation Question Answering +1

Paper

Representation Learning for Resource-Constrained Keyphrase Generation

1 code implementation • 15 Mar 2022 • Di Wu, Wasi Uddin Ahmad, Sunipa Dev, Kai-Wei Chang

State-of-the-art keyphrase generation methods generally depend on large annotated datasets, limiting their performance in domains with limited annotated data.

Denoising Domain Adaptation +4

Paper
Code

CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs

1 code implementation • 16 Dec 2021 • Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Yuan-Fang Li, Yong-Bin Kang, Rifat Shahriyar

LaSE is strongly correlated with ROUGE and, unlike ROUGE, can be reliably measured even in the absence of references in the target language.

Abstractive Text Summarization Cross-Lingual Abstractive Summarization +1

Paper
Code

Retrieval Augmented Code Generation and Summarization

1 code implementation • Findings (EMNLP) 2021 • Md Rizwan Parvez, Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

To mimic developers' code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models.

Ranked #1 on Code Generation on CodeXGLUE - CodeSearchNet (using extra training data)

Code Generation Code Summarization +1

Paper
Code

AVATAR: A Parallel Corpus for Java-Python Program Translation

1 code implementation • 26 Aug 2021 • Wasi Uddin Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, Kai-Wei Chang

Automating program translation is of paramount importance in software migration, and recently researchers explored unsupervised approaches due to the unavailability of parallel corpora.

Translation

Paper
Code

Syntax-augmented Multilingual BERT for Cross-lingual Transfer

1 code implementation • ACL 2021 • Wasi Uddin Ahmad, Haoran Li, Kai-Wei Chang, Yashar Mehdad

In recent years, we have seen a colossal effort in pre-training multilingual text encoders using large-scale corpora in many languages to facilitate cross-lingual transfer learning.

Cross-Lingual Transfer named-entity-recognition +7

Paper
Code

CoDesc: A Large Code-Description Parallel Dataset

1 code implementation • 29 May 2021 • Masum Hasan, Tanveer Muttaqueen, Abdullah Al Ishtiaq, Kazi Sajeed Mehrab, Md. Mahim Anjum Haque, Tahmid Hasan, Wasi Uddin Ahmad, Anindya Iqbal, Rifat Shahriyar

In this study, we present CoDesc -- a large parallel dataset composed of 4.2 million Java methods and natural language descriptions.

Ranked #1 on Code Search on CoDesc

Code Search Code Summarization +2

Paper
Code

Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

1 code implementation • EMNLP 2021 • Kuan-Hao Huang, Wasi Uddin Ahmad, Nanyun Peng, Kai-Wei Chang

Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer.

Sentence text-classification +4

Paper
Code

Text2App: A Framework for Creating Android Apps from Text Descriptions

2 code implementations • 16 Apr 2021 • Masum Hasan, Kazi Sajeed Mehrab, Wasi Uddin Ahmad, Rifat Shahriyar

We overcome this limitation by transforming natural language into an abstract intermediate formal language representing an application with a substantially smaller number of tokens.

Code Generation Language Modelling

Paper
Code

Unified Pre-training for Program Understanding and Generation

1 code implementation • NAACL 2021 • Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models.

Clone Detection Code Summarization +6

Paper
Code

BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla

1 code implementation • Findings (NAACL) 2022 • Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Kazi Samin, Md Saiful Islam, Anindya Iqbal, M. Sohel Rahman, Rifat Shahriyar

In this work, we introduce BanglaBERT, a BERT-based Natural Language Understanding (NLU) model pretrained in Bangla, a widely spoken yet low-resource language in the NLP literature.

Document Classification Language Modelling +5

Paper
Code

Intent Classification and Slot Filling for Privacy Policies

1 code implementation • ACL 2021 • Wasi Uddin Ahmad, Jianfeng Chi, Tu Le, Thomas Norton, Yuan Tian, Kai-Wei Chang

We refer to predicting the privacy practice explained in a sentence as intent classification and identifying the text spans sharing specific information as slot filling.

General Classification intent-classification +3

Paper
Code

Simple or Complex? Learning to Predict Readability of Bengali Texts

1 code implementation • 9 Dec 2020 • Susmoy Chakraborty, Mir Tafseer Nayeem, Wasi Uddin Ahmad

Determining the readability of a text is the first step to its simplification.

Sentence

Paper
Code

PolicyQA: A Reading Comprehension Dataset for Privacy Policies

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Wasi Uddin Ahmad, Jianfeng Chi, Yuan Tian, Kai-Wei Chang

Prior studies in this domain frame the QA task as retrieving the most relevant text segment or a list of sentences from the policy document given a question.

Question Answering Reading Comprehension

Paper
Code

GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction

1 code implementation • 6 Oct 2020 • Wasi Uddin Ahmad, Nanyun Peng, Kai-Wei Chang

Recent progress in cross-lingual relation and event extraction uses graph convolutional networks (GCNs) with universal dependency parses to learn language-agnostic sentence representations such that models trained on one language can be applied to other languages.

Event Extraction Graph Attention +2

Paper
Code

Select, Extract and Generate: Neural Keyphrase Generation with Layer-wise Coverage Attention

no code implementations • ACL 2021 • Wasi Uddin Ahmad, Xiao Bai, Soomin Lee, Kai-Wei Chang

Natural language processing techniques have demonstrated promising results in keyphrase generation.

Keyphrase Extraction Keyphrase Generation

Paper

A Transformer-based Approach for Source Code Summarization

9 code implementations • ACL 2020 • Wasi Uddin Ahmad, Saikat Chakraborty, Baishakhi Ray, Kai-Wei Chang

Generating a readable summary that describes the functionality of a program is known as source code summarization.

Code Summarization Position +1

Paper
Code

Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages

1 code implementation • CONLL 2019 • Wasi Uddin Ahmad, Zhisong Zhang, Xuezhe Ma, Kai-Wei Chang, Nanyun Peng

We conduct experiments on cross-lingual dependency parsing where we train a dependency parser on a source language and transfer it to a wide range of target languages.

Cross-Lingual Transfer Dependency Parsing +2