Search Results for author: Rifat Shahriyar

Found 20 papers, 15 papers with code

Inceptive Transformers: Enhancing Contextual Representations through Multi-Scale Feature Learning Across Domains and Languages

no code implementations • 26 May 2025 • Asif Shahriar, Rifat Shahriyar, M Saifur Rahman

Conventional transformer models typically compress the information from all tokens in a sequence into a single [CLS] token to represent global context, an approach that can lead to information loss in tasks requiring localized or hierarchical cues.

Emotion Recognition

ChakmaNMT: A Low-resource Machine Translation On Chakma Language

no code implementations • 14 Oct 2024 • Aunabil Chakma, Aditya Chakma, Soham Khisa, Chumui Tripura, Masum Hasan, Rifat Shahriyar

Thus, we have worked on MT between CCP-BN (Chakma-Bangla) by introducing a novel dataset of 15,021 parallel samples and 42,783 monolingual samples of the Chakma Language.

Benchmarking, Translation, +1

ConVerSum: A Contrastive Learning-based Approach for Data-Scarce Solution of Cross-Lingual Summarization Beyond Direct Equivalents

no code implementations • 17 Aug 2024 • Sanzana Karim Lora, M. Sohel Rahman, Rifat Shahriyar

Cross-lingual summarization (CLS) is a sophisticated branch of Natural Language Processing that requires models to accurately translate and summarize articles from different source languages.

Articles, Contrastive Learning

An Empirical Study of Gendered Stereotypes in Emotional Attributes for Bangla in Multilingual Large Language Models

1 code implementation • 8 Jul 2024 • Jayanta Sadhu, Maneesha Rani Saha, Rifat Shahriyar

All of our resources including code and data are made publicly available to support future research on Bangla NLP.

Fairness

An Empirical Study on the Characteristics of Bias upon Context Length Variation for Bangla

1 code implementation • 25 Jun 2024 • Jayanta Sadhu, Ayan Antik Khan, Abhik Bhattacharjee, Rifat Shahriyar

Specifically, in this study, we (1) create a dataset for intrinsic gender bias measurement in Bangla, (2) discuss necessary adaptations to apply existing bias measurement methods for Bangla, and (3) examine the impact of context length variation on bias measurement, a factor that has been overlooked in previous studies.

BanglaParaphrase: A High-Quality Bangla Paraphrase Dataset

1 code implementation • 11 Oct 2022 • Ajwad Akil, Najrin Sultana, Abhik Bhattacharjee, Rifat Shahriyar

In this work, we present BanglaParaphrase, a high-quality synthetic Bangla Paraphrase dataset curated by a novel filtering pipeline.

Diversity, Vocal Bursts Intensity Prediction

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

1 code implementation • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking, Text Generation

BanglaNLG and BanglaT5: Benchmarks and Resources for Evaluating Low-Resource Natural Language Generation in Bangla

2 code implementations • 23 May 2022 • Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Rifat Shahriyar

This work presents BanglaNLG, a comprehensive benchmark for evaluating natural language generation (NLG) models in Bangla, a widely spoken yet low-resource language.

Conditional Text Generation, Dialogue Generation, +2

CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs

1 code implementation • 16 Dec 2021 • Abhik Bhattacharjee, Tahmid Hasan, Wasi Uddin Ahmad, Yuan-Fang Li, Yong-Bin Kang, Rifat Shahriyar

LaSE is strongly correlated with ROUGE and, unlike ROUGE, can be reliably measured even in the absence of references in the target language.

Abstractive Text Summarization, Articles, +2

XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages

2 code implementations • Findings (ACL) 2021 • Tahmid Hasan, Abhik Bhattacharjee, Md Saiful Islam, Kazi Samin, Yuan-Fang Li, Yong-Bin Kang, M. Sohel Rahman, Rifat Shahriyar

XL-Sum induces competitive results compared to those obtained using similar monolingual datasets: with multilingual training, we show ROUGE-2 scores higher than 11 on the 10 languages we benchmark on, with some of them exceeding 15.

Abstractive Text Summarization

Text2App: A Framework for Creating Android Apps from Text Descriptions

2 code implementations • 16 Apr 2021 • Masum Hasan, Kazi Sajeed Mehrab, Wasi Uddin Ahmad, Rifat Shahriyar

We overcome this limitation by transforming natural language into an abstract intermediate formal language representing an application with a substantially smaller number of tokens.

Code Generation, Language Modeling, +1

BERT2Code: Can Pretrained Language Models be Leveraged for Code Search?

no code implementations • 16 Apr 2021 • Abdullah Al Ishtiaq, Masum Hasan, Md. Mahim Anjum Haque, Kazi Sajeed Mehrab, Tanveer Muttaqueen, Tahmid Hasan, Anindya Iqbal, Rifat Shahriyar

In this work, we leverage the efficacy of these embedding models using a simple, lightweight 2-layer neural network in the task of semantic code search.

Code Search, Natural Language Queries

Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation

1 code implementation • EMNLP 2020 • Tahmid Hasan, Abhik Bhattacharjee, Kazi Samin, Masum Hasan, Madhusudan Basak, M. Sohel Rahman, Rifat Shahriyar

With the segmenter and the two methods combined, we compile a high-quality Bengali-English parallel corpus comprising 2.75 million sentence pairs, more than 2 million of which were not available before.

Machine Translation, Sentence, +2

Early Prediction for Merged vs Abandoned Code Changes in Modern Code Reviews

1 code implementation • 7 Dec 2019 • Md. Khairul Islam, Toufique Ahmed, Rifat Shahriyar, Anindya Iqbal, Gias Uddin

In our empirical study on the 146,612 code changes from the three software projects, we find that (1) the new features introduced in PredCR, such as reviewer dimensions, are the most informative.

Management
