no code implementations • 11 Apr 2025 • Leo Kampen, Carlos Rabat Villarreal, Louis Yu, Santu Karmaker, Dongji Feng
In this paper, we conducted a Multi-Perspective Comparative Narrative Analysis (CNA) on three prominent LLMs: GPT-3.5, PaLM2, and Llama2.
no code implementations • 3 Apr 2025 • Zishuo Liu, Carlos Rabat Villarreal, Mostafa Rahgouy, Amit Das, Zheng Zhang, Chang Ren, Dongji Feng
Comparative experiments confirmed this hypothesis, demonstrating that LLMs performed better on standard FPs in terms of both accuracy and efficiency.
no code implementations • 2 Mar 2025 • Yan Wang, Lingfei Qian, Xueqing Peng, Jimin Huang, Dongji Feng
The evaluation of ranking tasks remains a significant challenge in natural language processing (NLP), particularly due to the lack of direct labels for results in real-world scenarios.
no code implementations • 19 Feb 2025 • Wenwen Xie, Gray Gwizdz, Dongji Feng
While Large Language Models (LLMs) have emerged as promising tools for evaluating Natural Language Generation (NLG) tasks, their effectiveness is limited by their inability to appropriately weigh the importance of different topics, often overemphasizing minor details while undervaluing critical information, leading to misleading assessments.
1 code implementation • 12 Aug 2024 • Mostafa Rahgouy, Hamed Babaei Giglou, Mehnaz Tabassum, Dongji Feng, Amit Das, Taher Rahgooy, Gerry Dozier, Cheryl D. Seals
However, current AA benchmarks commonly overlook this uniqueness and frame the problem as a closed-world classification, assuming a fixed number of authors throughout the system's lifespan and neglecting the inclusion of emerging new authors.
3 code implementations • 17 Jun 2024 • Amit Das, Zheng Zhang, Najib Hasan, Souvika Sarkar, Fatemeh Jamshidi, Tathagata Bhattacharya, Mostafa Rahgouy, Nilanjana Raychawdhary, Dongji Feng, Vinija Jain, Aman Chadha, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals
This paper serves as a crucial resource, guiding researchers and practitioners in harnessing the potential of LLMs for data annotation, thereby fostering advancements in this critical field.
1 code implementation • 4 Mar 2024 • Amit Das, Mostafa Rahgouy, Dongji Feng, Zheng Zhang, Tathagata Bhattacharya, Nilanjana Raychawdhary, Fatemeh Jamshidi, Vinija Jain, Aman Chadha, Mary Sandage, Lauramarie Pope, Gerry Dozier, Cheryl Seals
Firstly, the existing datasets primarily rely on the collection of texts containing explicit offensive keywords, making it challenging to capture implicitly offensive content that is devoid of these keywords.
1 code implementation • 23 Feb 2024 • Eftekhar Hossain, Sanjeev Kumar Sinha, Naman Bansal, Alex Knipper, Souvika Sarkar, John Salvador, Yash Mahajan, Sri Guttikonda, Mousumi Akter, Md. Mahadi Hassan, Matthew Freestone, Matthew C. Williams Jr., Dongji Feng, Santu Karmaker
One of the most important yet onerous tasks in the academic peer-reviewing process is composing meta-reviews, which involves assimilating diverse opinions from multiple expert peers, formulating one's self-judgment as a senior expert, and then summarizing all these perspectives into a concise holistic overview to make an overall recommendation.
no code implementations • 19 May 2023 • Shubhra Kanti Karmaker Santu, Dongji Feng
However, conducting such benchmarking studies is challenging because of the large variations in LLMs' performance when different prompt types/styles are used and different degrees of detail are provided in the prompts.
no code implementations • 14 Apr 2023 • Souvika Sarkar, Dongji Feng, Shubhra Kanti Karmaker Santu
Sentence encoders have indeed been shown to achieve superior performances for many downstream text-mining tasks and, thus, claimed to be fairly general.
no code implementations • 12 Sep 2022 • Shubhra Kanti Karmaker Santu, Dongji Feng
Experiments on two different data-sets with eight Learning-to-Rank (LETOR) methods demonstrate the following properties of the new LB normalized metric: 1) Statistically significant differences (between two methods) in terms of the original metric no longer remain statistically significant in terms of the Upper Lower (UL) Bound normalized version and vice-versa, especially for uninformative query-sets.