no code implementations • COLING 2022 • Huanli Gong, Liangming Pan, Hengchang Hu
Designing in-depth educational questions is a time-consuming and cognitively demanding task.
no code implementations • NAACL (BEA) 2022 • Bowei Zou, Pengfei Li, Liangming Pan, Ai Ti Aw
In the field of teaching, true/false questioning is an important educational method for assessing students’ general understanding of learning materials.
no code implementations • 22 Dec 2024 • Jundong Xu, Hao Fei, Meng Luo, Qian Liu, Liangming Pan, William Yang Wang, Preslav Nakov, Mong-Li Lee, Wynne Hsu
In the context of large language models (LLMs), current advanced reasoning methods have made impressive strides in various reasoning tasks.
1 code implementation • 18 Dec 2024 • Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, William Yang Wang
Data contamination hinders fair LLM evaluation by introducing test data into newer models' training sets.
no code implementations • 15 Dec 2024 • Shengqiong Wu, Hao Fei, Liangming Pan, William Yang Wang, Shuicheng Yan, Tat-Seng Chua
Our framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge, ensuring more reliable outputs.
1 code implementation • 12 Dec 2024 • Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang
This paper introduces RuleArena, a novel and challenging benchmark designed to evaluate the ability of large language models (LLMs) to follow complex, real-world rules in reasoning.
1 code implementation • 22 Oct 2024 • Longxuan Yu, Delin Chen, Siheng Xiong, Qingyang Wu, Qingzhen Liu, Dawei Li, Zhikai Chen, Xiaoze Liu, Liangming Pan
In this survey, we provide a comprehensive review of research aimed at enhancing LLMs for causal reasoning.
1 code implementation • 12 Oct 2024 • Yuxi Xie, Anirudh Goyal, Xiaobao Wu, Xunjian Yin, Xiao Xu, Min-Yen Kan, Liangming Pan, William Yang Wang
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally during the generation process.
1 code implementation • 10 Oct 2024 • Sitao Cheng, Liangming Pan, Xunjian Yin, Xinyi Wang, William Yang Wang
To support this investigation, we introduce ECHOQA, a benchmark spanning scientific, factual, and commonsense knowledge.
2 code implementations • 6 Oct 2024 • Xunjian Yin, Xinyi Wang, Liangming Pan, Xiaojun Wan, William Yang Wang
The rapid advancement of large language models (LLMs) has significantly enhanced the capabilities of AI-driven agents across various tasks.
1 code implementation • 18 Sep 2024 • Xinyuan Lu, Liangming Pan, Yubo Ma, Preslav Nakov, Min-Yen Kan
Current Large Language Models (LLMs) exhibit limited ability to understand table structures and to apply precise numerical reasoning, which is crucial for tasks such as table question answering (TQA) and table-based fact verification (TFV).
1 code implementation • 1 Jul 2024 • Yubo Ma, Yuhang Zang, Liangyu Chen, Meiqi Chen, Yizhu Jiao, Xinze Li, Xinyuan Lu, Ziyu Liu, Yan Ma, Xiaoyi Dong, Pan Zhang, Liangming Pan, Yu-Gang Jiang, Jiaqi Wang, Yixin Cao, Aixin Sun
Moreover, 33.2% of the questions are cross-page questions requiring evidence across multiple pages.
1 code implementation • 27 Jun 2024 • Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Weichuan Liu, Lei Hou, Juanzi Li
This paper introduces Self-aware Knowledge Retrieval (SeaKR), a novel adaptive RAG model that extracts self-aware uncertainty of LLMs from their internal states.
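SeaKR's core signal is an uncertainty estimate read directly from the model's internal states. Below is a minimal sketch of one plausible realization, an eigenscore-style measure over the hidden states of several sampled answers; the function name, the (k, d) input layout, and the threshold-based retrieval rule are illustrative assumptions, not the paper's exact recipe.

```python
import numpy as np

def self_aware_uncertainty(hidden_states: np.ndarray, alpha: float = 1e-3) -> float:
    """Eigenscore-style uncertainty from a model's internal states.

    `hidden_states` is a (k, d) array holding one hidden-state vector
    (e.g., the last-token state) for each of k sampled answers. The more
    the samples spread out in representation space, the larger the score,
    i.e., the less certain the model is about its answer.
    """
    k = hidden_states.shape[0]
    centered = hidden_states - hidden_states.mean(axis=0, keepdims=True)
    gram = centered @ centered.T + alpha * np.eye(k)  # regularized k x k Gram matrix
    eigvals = np.linalg.eigvalsh(gram)                # all positive after +alpha*I
    return float(np.mean(np.log(eigvals)))

# Adaptive retrieval rule (tau is a hypothetical threshold):
# if self_aware_uncertainty(states) > tau: trigger retrieval, else answer directly.
```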
no code implementations • 21 Jun 2024 • Kyle Wong, Alfonso Amayuelas, Liangming Pan, William Yang Wang
To explain this behavior, we perform further analysis and find that, contrary to pre-existing beliefs, the correlation between reasoning ability and code-correction ability is weak.
no code implementations • 20 Jun 2024 • Alfonso Amayuelas, Xianjun Yang, Antonis Antoniades, Wenyue Hua, Liangming Pan, William Wang
Large Language Models (LLMs) have shown exceptional results on current benchmarks when working individually.
2 code implementations • 28 May 2024 • Xiaobao Wu, Xinshuai Dong, Liangming Pan, Thong Nguyen, Anh Tuan Luu
However, existing models suffer from repetitive and unassociated topics, failing to reveal topic evolution and hindering further applications.
1 code implementation • 28 May 2024 • Jundong Xu, Hao Fei, Liangming Pan, Qian Liu, Mong-Li Lee, Wynne Hsu
Technically, building upon an LLM, SymbCoT (1) first translates the natural language context into a symbolic format, (2) then derives a step-by-step plan to solve the problem with symbolic logical rules, and (3) finally applies a verifier to check the translation and the reasoning chain.
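A minimal sketch of that three-stage pipeline, assuming only a generic llm() completion helper; the prompts are illustrative placeholders rather than the paper's actual templates.

```python
def symbolic_cot(context: str, question: str, llm) -> str:
    # 1) Translator: natural language -> symbolic representation
    #    (e.g., first-order logic), per the first stage.
    symbolic = llm(f"Translate to first-order logic:\n{context}\n{question}")
    # 2) Planner + solver: derive a step-by-step plan, then execute it
    #    with symbolic logical rules, entirely within the LLM.
    plan = llm(f"Give a step-by-step derivation plan for:\n{symbolic}")
    answer = llm(f"Follow the plan using logical rules and answer:\n{plan}")
    # 3) Verifier: check both the translation and the reasoning chain;
    #    fall back to the unverified answer if no correction is produced.
    verdict = llm(f"Verify the translation and each step:\n{symbolic}\n{answer}")
    return verdict or answer
```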
1 code implementation • 29 Feb 2024 • Xiaobao Wu, Liangming Pan, William Yang Wang, Anh Tuan Luu
Knowledge editing injects knowledge updates into language models to keep them correct and up-to-date.
1 code implementation • 26 Feb 2024 • Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang
A major factor in the recent success of large language models is the use of enormous and ever-growing text datasets for unsupervised pre-training.
no code implementations • 18 Feb 2024 • Yubo Ma, Zhibin Gou, Junheng Hao, Ruochen Xu, Shuohang Wang, Liangming Pan, Yujiu Yang, Yixin Cao, Aixin Sun, Hany Awadalla, Weizhu Chen
To make this task more practical and solvable for LLMs, we introduce a new task setting named tool-augmented scientific reasoning.
1 code implementation • 18 Feb 2024 • Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei LI, William Yang Wang
We discovered that this contradiction stems from LLMs' bias in evaluating their own outputs.
1 code implementation • 5 Feb 2024 • Xinyi Wang, Alfonso Amayuelas, Kexun Zhang, Liangming Pan, Wenhu Chen, William Yang Wang
To understand how pre-training with a next-token prediction objective contributes to the emergence of such reasoning capability, we propose that we can view an LM as deriving new conclusions by aggregating indirect reasoning paths seen at pre-training time.
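One way to write this intuition as a formula (a hedged paraphrase, not necessarily the paper's exact formulation): the model's belief in a conclusion $B$ given a premise $A$ behaves like a length-discounted aggregation over indirect paths seen in pre-training,

$$\Pr(B \mid A)\;\propto\;\sum_{p\,\in\,\mathcal{P}(A\to B)} \lambda^{|p|} \prod_{(u,v)\,\in\,p} \Pr(v \mid u), \qquad 0<\lambda<1,$$

where $\mathcal{P}(A\to B)$ is the set of reasoning paths from $A$ to $B$, each edge $(u,v)$ is a directly observed deduction step in the pre-training data, and the discount $\lambda^{|p|}$ downweights longer chains.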
no code implementations • 24 Jan 2024 • Iain Xie Weissburg, Mehir Arora, Xinyi Wang, Liangming Pan, William Yang Wang
As the number of accepted papers at AI and ML conferences reaches into the thousands, it has become unclear how researchers access and read research publications.
1 code implementation • 5 Dec 2023 • Alon Albalak, Liangming Pan, Colin Raffel, William Yang Wang
The data used to pretrain large language models has a decisive impact on a model's downstream performance, which has led to a large body of work on data selection methods that aim to automatically determine the most suitable data to use for pretraining.
2 code implementations • 15 Nov 2023 • Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, Arnav Arora, Aleksandr Rubashevskii, Jiahui Geng, Osama Mohammed Afzal, Liangming Pan, Nadav Borenstein, Aditya Pillai, Isabelle Augenstein, Iryna Gurevych, Preslav Nakov
The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs.
1 code implementation • 24 Oct 2023 • Xianjun Yang, Liangming Pan, Xuandong Zhao, Haifeng Chen, Linda Petzold, William Yang Wang, Wei Cheng
The burgeoning capabilities of advanced large language models (LLMs) such as ChatGPT have led to an increase in synthetic content generation with implications across a variety of sectors, including media, cybersecurity, public discourse, and education.
1 code implementation • 19 Oct 2023 • Deepak Nathani, David Wang, Liangming Pan, William Yang Wang
Language Models (LMs) have shown impressive performance in various natural language tasks.
1 code implementation • 11 Oct 2023 • Liangming Pan, Xinyuan Lu, Min-Yen Kan, Preslav Nakov
Fact-checking real-world claims often requires complex, multi-step reasoning due to the absence of direct evidence to support or refute them.
1 code implementation • 9 Oct 2023 • Xinze Li, Yixin Cao, Liangming Pan, Yubo Ma, Aixin Sun
Despite their great success, Large Language Models (LLMs) often produce hallucinations that undermine their reliability.
1 code implementation • 18 Sep 2023 • Liangming Pan, Yunxiang Zhang, Min-Yen Kan
In this paper, we explore zero- and few-shot generalization for fact verification (FV), which aims to generalize the FV model trained on well-resourced domains (e.g., Wikipedia) to low-resourced domains that lack human annotations.
no code implementations • 10 Sep 2023 • Yan Meng, Liangming Pan, Yixin Cao, Min-Yen Kan
We introduce the task of real-world information-seeking follow-up question generation (FQG), which aims to generate follow-up questions seeking a more in-depth understanding of an initial question and answer.
1 code implementation • 6 Aug 2023 • Liangming Pan, Michael Saxon, Wenda Xu, Deepak Nathani, Xinyi Wang, William Yang Wang
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks.
1 code implementation • 23 May 2023 • Alfonso Amayuelas, Kyle Wong, Liangming Pan, Wenhu Chen, William Wang
This paper investigates the capabilities of Large Language Models (LLMs) in understanding their own knowledge and in estimating their uncertainty over questions.
1 code implementation • 23 May 2023 • Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min-Yen Kan, William Yang Wang
In this paper, we comprehensively investigate the potential misuse of modern Large Language Models (LLMs) for generating credible-sounding misinformation and its subsequent impact on information-intensive applications, particularly Open-Domain Question Answering (ODQA) systems.
2 code implementations • 23 May 2023 • Wenda Xu, Danqing Wang, Liangming Pan, Zhenqiao Song, Markus Freitag, William Yang Wang, Lei LI
By harnessing both explicit human instruction and the implicit knowledge of GPT-4, we fine-tune a text evaluation metric based on LLaMA, producing both a score for generated text and a human-readable diagnostic report.
1 code implementation • 23 May 2023 • Shuo Zhang, Liangming Pan, Junzhou Zhao, William Yang Wang
Large language models often require grounding in external knowledge to generate faithful and reliable answers.
1 code implementation • 22 May 2023 • Xinyuan Lu, Liangming Pan, Qian Liu, Preslav Nakov, Min-Yen Kan
Current scientific fact-checking benchmarks exhibit several shortcomings, such as biases arising from crowd-sourced claims and an over-reliance on text-based evidence.
2 code implementations • 22 May 2023 • Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov
Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning.
1 code implementation • 20 May 2023 • Liangming Pan, Alon Albalak, Xinyi Wang, William Yang Wang
We also introduce a self-refinement module, which utilizes the symbolic solver's error messages to revise symbolic formalizations.
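A minimal sketch of such a refinement loop, assuming a hypothetical solver object with a run() method and a SolverError exception; the real system's solver interface and prompts will differ.

```python
class SolverError(Exception):
    """Hypothetical wrapper for any error raised by the symbolic solver."""

def solve_with_refinement(problem, llm, solver, max_rounds=3):
    """Formalize the problem, run the solver, and on failure feed the
    solver's error message back to the LLM to revise the formalization."""
    program = llm(f"Formalize as a symbolic program:\n{problem}")
    for _ in range(max_rounds):
        try:
            return solver.run(program)   # solved: return the result
        except SolverError as err:
            # The error message (e.g., a syntax or predicate mismatch)
            # tells the LLM what to repair in the next attempt.
            program = llm(
                f"The solver failed with: {err}\n"
                f"Revise this program:\n{program}"
            )
    return None  # still unsolvable after max_rounds
```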
1 code implementation • 4 May 2023 • Xuan Long Do, Bowei Zou, Shafiq Joty, Anh Tai Tran, Liangming Pan, Nancy F. Chen, Ai Ti Aw
In addition, we propose Conv-Distinct, a novel evaluation metric for CQG, to evaluate the diversity of the generated conversation from a context.
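Conv-Distinct's exact definition is given in the paper; as background, here is a minimal sketch of the classic distinct-n diversity score that the metric's name points to, with the conversation-from-context adaptation left to the paper.

```python
def distinct_n(turns, n=2):
    """Classic distinct-n: unique n-grams divided by total n-grams.

    `turns` is a list of generated utterance strings. Conv-Distinct (per
    the paper) adapts diversity scoring to conversations generated from a
    shared context; only the underlying distinct-n computation is shown.
    """
    unique, total = set(), 0
    for turn in turns:
        tokens = turn.split()
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        unique.update(grams)
        total += len(grams)
    return len(unique) / total if total else 0.0

# Example: a higher score indicates less repetitive generations.
# distinct_n(["where was she born", "what did she study"], n=2)
```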
2 code implementations • 7 Apr 2023 • Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Chaoqun Liu, Liangming Pan, Anh Tuan Luu
Instead of the direct alignment in previous work, we propose a topic alignment with mutual information method.
1 code implementation • 20 Feb 2023 • Shizhe Diao, Sedrick Scott Keh, Liangming Pan, Zhiliang Tian, Yan Song, Tong Zhang
Social media classification tasks (e.g., tweet sentiment analysis, tweet stance detection) are challenging because social media posts are typically short, informal, and ambiguous.
1 code implementation • 23 Sep 2022 • Hengchang Hu, Liangming Pan, Yiding Ran, Min-Yen Kan
Prerequisites can play a crucial role in users' decision-making, yet recommendation systems have not fully utilized such contextual background knowledge.
1 code implementation • COLING 2022 • Xuan Long Do, Bowei Zou, Liangming Pan, Nancy F. Chen, Shafiq Joty, Ai Ti Aw
While previous studies mainly focus on how to model the flow and alignment of the conversation, there has been no thorough study to date on which parts of the context and history are necessary for the model.
1 code implementation • 15 Oct 2021 • Liangming Pan, Wenhu Chen, Min-Yen Kan, William Yang Wang
We curate both human-written and model-generated false documents that we inject into the evidence corpus of QA models and assess the impact on the performance of these systems.
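A minimal sketch of that attack-and-measure protocol, assuming a qa_system callable that answers a question against a corpus; the names and the exact-match scoring are illustrative assumptions.

```python
def em_accuracy(qa_system, corpus, questions, answers):
    # Exact-match accuracy of the QA system over a given evidence corpus.
    preds = [qa_system(q, corpus) for q in questions]
    return sum(p == a for p, a in zip(preds, answers)) / len(questions)

def misinformation_damage(qa_system, corpus, false_docs, questions, answers):
    """Evaluate on the clean corpus, inject the fabricated documents,
    re-evaluate, and report the accuracy drop caused by the attack."""
    clean = em_accuracy(qa_system, corpus, questions, answers)
    attacked = em_accuracy(qa_system, corpus + false_docs, questions, answers)
    return clean - attacked   # larger = more damage from misinformation
```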
no code implementations • Findings (ACL) 2022 • Yunxiang Zhang, Liangming Pan, Samson Tan, Min-Yen Kan
In this work, we test the hypothesis that the extent to which a model is affected by an unseen textual perturbation (robustness) can be explained by the learnability of the perturbation (defined as how well the model learns to identify the perturbation with a small amount of evidence).
1 code implementation • ACL 2021 • Liangming Pan, Wenhu Chen, Wenhan Xiong, Min-Yen Kan, William Yang Wang
However, for each new domain that requires fact verification, creating a dataset by manually writing claims and linking them to their supporting evidence is expensive.
1 code implementation • COLING 2020 • Yuxi Xie, Liangming Pan, Dongzhe Wang, Min-Yen Kan, Yansong Feng
Recent question generation (QG) approaches often utilize the sequence-to-sequence framework (Seq2Seq) to optimize the log-likelihood of ground-truth questions using teacher forcing.
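For concreteness, a minimal PyTorch sketch of the teacher-forced log-likelihood objective this sentence refers to, assuming a generic encoder-decoder model(src, tgt) interface and padding id 0; both are assumptions for illustration.

```python
import torch.nn.functional as F

def qg_training_loss(model, passage_ids, question_ids, pad_id=0):
    """One Seq2Seq QG training step with teacher forcing: the decoder is
    fed the gold question prefix at every step, and the loss is the
    negative log-likelihood of the next ground-truth token."""
    decoder_input = question_ids[:, :-1]        # gold prefix (teacher forcing)
    targets = question_ids[:, 1:]               # next-token targets
    logits = model(passage_ids, decoder_input)  # (batch, tgt_len, vocab)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
        ignore_index=pad_id,                    # don't score padding positions
    )
```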
1 code implementation • NAACL 2021 • Liangming Pan, Wenhu Chen, Wenhan Xiong, Min-Yen Kan, William Yang Wang
Obtaining training data for multi-hop question answering (QA) is time-consuming and resource-intensive.
1 code implementation • EMNLP 2020 • Yixin Cao, Liangming Pan, Juanzi Li, Zhiyuan Liu, Tat-Seng Chua
GNN-based entity alignment (EA) methods achieve promising performance by modeling the KG structure defined by relation triples.
no code implementations • 20 Aug 2020 • Liangming Pan, Jingjing Chen, Jianlong Wu, Shaoteng Liu, Chong-Wah Ngo, Min-Yen Kan, Yu-Gang Jiang, Tat-Seng Chua
Understanding a food recipe requires anticipating the implicit causal effects of cooking actions, so that the recipe can be converted into a graph describing its temporal workflow.
2 code implementations • ACL 2022 • Shulin Cao, Jiaxin Shi, Liangming Pan, Lunyiu Nie, Yutong Xiang, Lei Hou, Juanzi Li, Bin He, Hanwang Zhang
To this end, we introduce KQA Pro, a dataset for Complex KBQA including ~120K diverse natural language questions.
no code implementations • ACL 2020 • Yixin Cao, Ruihao Shui, Liangming Pan, Min-Yen Kan, Zhiyuan Liu, Tat-Seng Chua
The curse of knowledge can impede communication between experts and laymen.
1 code implementation • ACL 2020 • Liangming Pan, Yuxi Xie, Yansong Feng, Tat-Seng Chua, Min-Yen Kan
This paper proposes the problem of Deep Question Generation (DQG), which aims to generate complex questions that require reasoning over multiple pieces of information of the input passage.
no code implementations • 22 May 2019 • Liangming Pan, Wenqiang Lei, Tat-Seng Chua, Min-Yen Kan
Emerging research in Neural Question Generation (NQG) has started to integrate a larger variety of inputs and to generate questions requiring higher levels of cognition.
no code implementations • 21 Nov 2018 • Ya-Hui An, Liangming Pan, Min-Yen Kan, Qiang Dong, Yan Fu
We propose the novel problem of learning resource mention identification in MOOC forums.
no code implementations • IJCNLP 2017 • Liangming Pan, Xiaochen Wang, Chengjiang Li, Juanzi Li, Jie Tang
Massive Open Online Courses (MOOCs), offering a new way to study online, are revolutionizing education.
no code implementations • ACL 2017 • Liangming Pan, Chengjiang Li, Juanzi Li, Jie Tang
What prerequisite knowledge should students master before moving forward to learn subsequent coursewares?