no code implementations • ACL 2022 • Moxin Li, Fuli Feng, Hanwang Zhang, Xiangnan He, Fengbin Zhu, Tat-Seng Chua
Neural discrete reasoning (NDR) has shown remarkable progress in combining deep models with discrete reasoning.
no code implementations • 7 Mar 2025 • Fengbin Zhu, Junfeng Li, Liangming Pan, Wenjie Wang, Fuli Feng, Chao Wang, Huanbo Luan, Tat-Seng Chua
Financial decision-making often relies on in-depth analysis of data from various sources, including financial tables, news articles, and stock prices.
no code implementations • 4 Feb 2025 • Yi Fang, Wenjie Wang, Yang Zhang, Fengbin Zhu, Qifan Wang, Fuli Feng, Xiangnan He
To tackle this task, we then introduce the Deliberative User Preference Alignment framework, which enhances reasoning capabilities by utilizing verbalized user feedback in a step-wise manner.
no code implementations • 23 Dec 2024 • Chengbing Wang, Yang Zhang, Fengbin Zhu, Jizhi Zhang, Tianhao Shi, Fuli Feng
Leveraging Large Language Models (LLMs) to harness user-item interaction histories for item generation has emerged as a promising paradigm in generative recommendation.
no code implementations • 8 Dec 2024 • Zhiguang Wu, Fengbin Zhu, Xuequn Shang, Yupei Zhang, Pan Zhou
In the first stage, agents analyze their respective schema and communicate with each other to collect the schema information relevant to the question.
1 code implementation • 30 Oct 2024 • Youcheng Huang, Fengbin Zhu, Jingkun Tang, Pan Zhou, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua
With the new RADAR dataset, we further develop a novel and effective iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector distilled from the hidden states of VLMs, termed the attacking direction, to distinguish adversarial images from benign ones in the input.
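The idea of detecting adversarial inputs via a single direction in hidden-state space can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: the function names, the mean-difference estimate of the attacking direction, and the projection threshold are all assumptions.

```python
import numpy as np

def attacking_direction(benign_hidden, adv_hidden):
    """Estimate a single 'attacking direction' vector as the mean difference
    between adversarial and benign hidden states (an assumed simplification)."""
    return adv_hidden.mean(axis=0) - benign_hidden.mean(axis=0)

def is_adversarial(hidden, direction, threshold=0.0):
    """Flag an input whose hidden state projects positively onto the direction."""
    return float(hidden @ direction) > threshold

# Synthetic hidden states: adversarial ones are shifted along a common direction.
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(100, 16))
adv = benign + 2.0

d = attacking_direction(benign, adv)
flagged = is_adversarial(adv[0], d)          # shifted input projects positively
clean = is_adversarial(benign[0] - 2.0, d)   # input shifted the other way does not
```

The appeal of such a linear probe is its cost: detection at inference time is a single dot product per input.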
no code implementations • 25 Oct 2024 • Fengbin Zhu, Ziyang Liu, Xiang Yao Ng, Haohui Wu, Wenjie Wang, Fuli Feng, Chao Wang, Huanbo Luan, Tat-Seng Chua
Large Vision-Language Models (LVLMs) have achieved remarkable performance in many vision-language tasks, yet their capabilities in fine-grained visual understanding remain insufficiently evaluated.
no code implementations • 22 Oct 2024 • Hongru Cai, Yongqi Li, Wenjie Wang, Fengbin Zhu, Xiaoyu Shen, Wenjie Li, Tat-Seng Chua
To overcome this limitation, we first formulate the task of LLM-empowered personalized Web agents, which integrate personalized data and user instructions to personalize instruction comprehension and action execution.
1 code implementation • 8 Aug 2024 • Junbin Xiao, Nanxin Huang, Hangyu Qin, Dongyang Li, Yicong Li, Fengbin Zhu, Zhulin Tao, Jianxing Yu, Liang Lin, Tat-Seng Chua, Angela Yao
Video Large Language Models (Video-LLMs) are flourishing and have advanced many video-language tasks.
1 code implementation • 17 Jun 2024 • Boyi Deng, Wenjie Wang, Fengbin Zhu, Qifan Wang, Fuli Feng
To address this issue, we explore the task of "credibility-aware RAG", in which LLMs automatically adjust the influence of retrieved documents based on their credibility scores to counteract misinformation.
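Adjusting a retrieved document's influence by its credibility can be sketched as a re-ranking step before prompting. This is a minimal illustration under assumed names and scoring, not the authors' method: it simply combines the retriever's relevance score with an external credibility score.

```python
def rerank(docs, alpha=0.5):
    """Re-rank retrieved documents by a convex combination of relevance and
    credibility, so low-credibility sources exert less influence on the prompt.

    docs: list of (text, relevance, credibility), both scores in [0, 1].
    alpha: assumed weighting between relevance and credibility.
    """
    return sorted(docs, key=lambda d: alpha * d[1] + (1 - alpha) * d[2], reverse=True)

docs = [
    ("claim from a vetted source", 0.6, 0.9),   # less relevant, highly credible
    ("claim from a dubious source", 0.8, 0.1),  # more relevant, low credibility
]
ranked = rerank(docs)
```

With equal weighting, the credible document outranks the more relevant but dubious one; the credibility-aware RAG setting described above additionally requires the LLM itself to weigh documents this way when generating.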
no code implementations • 15 Mar 2024 • Moxin Li, Wenjie Wang, Fuli Feng, Fengbin Zhu, Qifan Wang, Tat-Seng Chua
Self-detection for Large Language Models (LLMs) seeks to evaluate the trustworthiness of the LLM's output by leveraging its own capabilities, thereby alleviating the issue of output hallucination.
no code implementations • 24 Jan 2024 • Fengbin Zhu, Ziyang Liu, Fuli Feng, Chao Wang, Moxin Li, Tat-Seng Chua
In this work, we address question answering (QA) over a hybrid of tabular and textual data, which is very common on the Web (e.g., SEC filings) and where discrete reasoning capabilities are often required.
1 code implementation • 3 May 2023 • Fengbin Zhu, Chao Wang, Fuli Feng, Zifeng Ren, Moxin Li, Tat-Seng Chua
Discrete reasoning over table-text documents (e.g., financial reports) has gained increasing attention in recent years.
no code implementations • 25 Jul 2022 • Fengbin Zhu, Wenqiang Lei, Fuli Feng, Chao Wang, Haozhou Zhang, Tat-Seng Chua
Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language, which is an emerging research topic for both Natural Language Processing and Computer Vision.
no code implementations • 14 Jun 2022 • Fengbin Zhu, Chao Wang, Wenqiang Lei, Ziyang Liu, Tat-Seng Chua
Key Information Extraction (KIE) aims to extract structured information (e.g., key-value pairs) from form-style documents (e.g., invoices), an important step towards intelligent document understanding.
1 code implementation • ACL 2021 • Fengbin Zhu, Wenqiang Lei, Youcheng Huang, Chao Wang, Shuo Zhang, Jiancheng Lv, Fuli Feng, Tat-Seng Chua
In this work, we extract samples from real financial reports to build a new large-scale QA dataset containing both Tabular And Textual data, named TAT-QA, where numerical reasoning is usually required to infer the answer, such as addition, subtraction, multiplication, division, counting, comparison/sorting, and their compositions.
Ranked #1 on Question Answering on TAT-QA
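The kind of numerical reasoning TAT-QA requires can be illustrated with a toy example. The table values below are invented for illustration and the code is not the paper's model; it only shows the discrete operations (subtraction, division, and their composition) a system must perform over table cells.

```python
# Hypothetical revenue column from a financial-report table, in $M.
revenue = {"2019": 1200.0, "2018": 1000.0}

# "How much did revenue change from 2018 to 2019?" -> subtraction over two cells
change = revenue["2019"] - revenue["2018"]

# "What was the percentage change in revenue?" -> composition: subtraction, then division
pct_change = change / revenue["2018"] * 100  # 20.0
```

Answering such questions requires grounding each operand in the right table cell or text span before composing the arithmetic, which is what makes the hybrid table-text setting harder than text-only QA.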
no code implementations • 4 Jan 2021 • Fengbin Zhu, Wenqiang Lei, Chao Wang, Jianming Zheng, Soujanya Poria, Tat-Seng Chua
Open-domain Question Answering (OpenQA) is an important task in Natural Language Processing (NLP), which aims to answer a question in the form of natural language based on large-scale unstructured documents.
Machine Reading Comprehension • Open-Domain Question Answering