Search Results for author: Yew Ken Chia

Found 24 papers, 18 papers with code

The Jumping Reasoning Curve? Tracking the Evolution of Reasoning Performance in GPT-[n] and o-[n] Models on Multimodal Puzzles

2 code implementations 3 Feb 2025 Vernon Y. H. Toh, Yew Ken Chia, Deepanway Ghosal, Soujanya Poria

Our results reveal that the o-[n] series, particularly later iterations like o3 and o4-mini, significantly outperforms the GPT-[n] series and shows strong scalability in multimodal reasoning.

ARC, Multimodal Reasoning

Can-Do! A Dataset and Neuro-Symbolic Grounded Framework for Embodied Planning with Large Multimodal Models

no code implementations 22 Sep 2024 Yew Ken Chia, Qi Sun, Lidong Bing, Soujanya Poria

Large multimodal models have demonstrated impressive problem-solving abilities in vision and language tasks, and have the potential to encode extensive world knowledge.

World Knowledge

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

2 code implementations 29 Jul 2024 Wenxuan Zhang, Hou Pong Chan, Yiran Zhao, Mahani Aljunied, Jianyu Wang, Chaoqun Liu, Yue Deng, Zhiqiang Hu, Weiwen Xu, Yew Ken Chia, Xin Li, Lidong Bing

Large Language Models (LLMs) have shown remarkable abilities across various tasks, yet their development has predominantly centered on high-resource languages like English and Chinese, leaving low-resource languages underserved.

Diversity, Instruction Following +2

Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions

1 code implementation 30 May 2024 Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Weiwen Xu, Deli Zhao, Lidong Bing

During the peer battles, we observe intriguing scenarios where the LLM candidates display competitive behaviors and even learn from the opponents.

Chatbot, Fairness

PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns

2 code implementations 20 Mar 2024 Yew Ken Chia, Vernon Toh Yan Han, Deepanway Ghosal, Lidong Bing, Soujanya Poria

To diagnose the reasoning challenges in large multimodal models, we progressively guide the models with our ground truth reasoning explanations for visual perception, inductive reasoning, and deductive reasoning.

Multimodal Reasoning

SeaLLMs -- Large Language Models for Southeast Asia

2 code implementations 1 Dec 2023 Xuan-Phi Nguyen, Wenxuan Zhang, Xin Li, Mahani Aljunied, Zhiqiang Hu, Chenhui Shen, Yew Ken Chia, Xingxuan Li, Jianyu Wang, Qingyu Tan, Liying Cheng, Guanzheng Chen, Yue Deng, Sen Yang, Chaoqun Liu, Hang Zhang, Lidong Bing

Despite the remarkable achievements of large language models (LLMs) in various tasks, there remains a linguistic bias that favors high-resource languages, such as English, often at the expense of low-resource and regional languages.

Instruction Following

Contrastive Chain-of-Thought Prompting

1 code implementation 15 Nov 2023 Yew Ken Chia, Guizhen Chen, Luu Anh Tuan, Soujanya Poria, Lidong Bing

Compared to the conventional chain of thought, our approach provides both valid and invalid reasoning demonstrations, to guide the model to reason step-by-step while reducing reasoning mistakes.

Language Modeling +1

Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning

1 code implementation 5 Jul 2023 Deepanway Ghosal, Yew Ken Chia, Navonil Majumder, Soujanya Poria

Interestingly, despite being introduced four years ago, T5-based LLMs, such as FLAN-T5, continue to outperform the latest decoder-based LLMs, such as LLAMA and VICUNA, on tasks that require general problem-solving skills.

Decoder, Language Modelling +1

M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models

1 code implementation NeurIPS 2023 Wenxuan Zhang, Sharifah Mahani Aljunied, Chang Gao, Yew Ken Chia, Lidong Bing

M3Exam exhibits three unique characteristics: (1) multilingualism, encompassing questions from multiple countries that require strong multilingual proficiency and cultural knowledge; (2) multimodality, accounting for the multimodal nature of many exam questions to test the model's multimodal understanding capability; and (3) multilevel structure, featuring exams from three critical educational periods to comprehensively assess a model's proficiency at different levels.

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

2 code implementations 7 Jun 2023 Yew Ken Chia, Pengfei Hong, Lidong Bing, Soujanya Poria

Instruction-tuned large language models have revolutionized natural language processing and have shown great potential in applications such as conversational agents.

Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction

no code implementations 23 May 2023 Yew Ken Chia, Hui Chen, Wei Han, Guizhen Chen, Sharifah Mahani Aljunied, Soujanya Poria, Lidong Bing

Through comprehensive experiments involving multiple tasks, settings, and models, we demonstrate that CASE can serve as a general decoding strategy for complex sentiment tasks.

Aspect-Based Sentiment Analysis (ABSA) +4

Is GPT-3 a Good Data Annotator?

1 code implementation 20 Dec 2022 Bosheng Ding, Chengwei Qin, Linlin Liu, Yew Ken Chia, Shafiq Joty, Boyang Li, Lidong Bing

In this paper, we evaluate the performance of GPT-3 as a data annotator by comparing it with traditional data annotation methods and analyzing its output on a range of tasks.

Language Modeling

A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

1 code implementation 18 Nov 2022 Yew Ken Chia, Lidong Bing, Sharifah Mahani Aljunied, Luo Si, Soujanya Poria

Hence, we propose CubeRE, a cube-filling model that is inspired by table-filling approaches and explicitly considers the interaction between relation triplets and qualifiers.

graph construction, Hyper-Relational Extraction +2

Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction

2 code implementations ACL 2021 Lu Xu, Yew Ken Chia, Lidong Bing

Aspect Sentiment Triplet Extraction (ASTE) is the most recent subtask of ABSA which outputs triplets of an aspect target, its associated sentiment, and the corresponding opinion term.

Aspect-Based Sentiment Analysis (ABSA), Aspect Sentiment Triplet Extraction +3

Red Dragon AI at TextGraphs 2020 Shared Task: LIT : LSTM-Interleaved Transformer for Multi-Hop Explanation Ranking

1 code implementation 28 Dec 2020 Yew Ken Chia, Sam Witteveen, Martin Andrews

Explainable question answering for science questions is a challenging task that requires multi-hop inference over a large set of fact sentences.

Question Answering, Re-Ranking

Red Dragon AI at TextGraphs 2019 Shared Task: Language Model Assisted Explanation Generation

1 code implementation WS 2019 Yew Ken Chia, Sam Witteveen, Martin Andrews

The TextGraphs-13 Shared Task on Explanation Regeneration asked participants to develop methods to reconstruct gold explanations for elementary science questions.

Explanation Generation, Language Modeling +1

Scene Graph Parsing by Attention Graph

no code implementations 13 Sep 2019 Martin Andrews, Yew Ken Chia, Sam Witteveen

Scene graph representations, which form a graph of visual object nodes together with their attributes and relations, have proved useful across a variety of vision and language applications.
