no code implementations • ACL 2022 • Moxin Li, Fuli Feng, Hanwang Zhang, Xiangnan He, Fengbin Zhu, Tat-Seng Chua
Neural discrete reasoning (NDR) has shown remarkable progress in combining deep models with discrete reasoning.
no code implementations • EMNLP 2020 • Wenqiang Lei, Weixin Wang, Zhixin Ma, Tian Gan, Wei Lu, Min-Yen Kan, Tat-Seng Chua
By providing a schema linking corpus based on the Spider text-to-SQL dataset, we systematically study the role of schema linking.
1 code implementation • 13 Jan 2025 • Han Liu, Yinwei Wei, Fan Liu, Wenjie Wang, Liqiang Nie, Tat-Seng Chua
In this paper, we develop a novel meta-learning-based multimodal fusion framework called Meta Multimodal Fusion (MetaMMF), which dynamically assigns parameters to the multimodal fusion function for each micro-video during its representation learning.
no code implementations • 10 Jan 2025 • Chen Huang, Yang Deng, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua, Jimmy Xiangji Huang
With the advancement of large language models (LLMs), intelligent models have evolved from mere tools to autonomous agents with their own goals and strategies for cooperating with humans.
no code implementations • 10 Jan 2025 • Zheqi Lv, Keming Ye, Zishu Wei, Qi Tian, Shengyu Zhang, Wenqiao Zhang, Wenjie Wang, Kun Kuang, Tat-Seng Chua, Fei Wu
The integrated model can be used directly for inference or for further fine-tuning.
1 code implementation • 2 Jan 2025 • Yong Zhao, Yang Deng, See-Kiong Ng, Tat-Seng Chua
In this work, we propose a novel framework, named Alignment for Faithful Integrity with Confidence Estimation (AFICE), which aims to align the LLM responses with faithful integrity.
1 code implementation • 24 Dec 2024 • Xiaohao Liu, Xiaobo Xia, Zhuo Huang, Tat-Seng Chua
Multi-modal learning has achieved remarkable success by integrating information from various modalities, achieving superior performance in tasks like recognition and retrieval compared to uni-modal approaches.
no code implementations • 19 Dec 2024 • Yuxuan Gu, Wenjie Wang, Xiaocheng Feng, Weihong Zhong, Kun Zhu, Lei Huang, Tat-Seng Chua, Bing Qin
Large language models (LLMs) have demonstrated impressive instruction following capabilities, while still struggling to accurately manage the length of the generated text, which is a fundamental requirement in many real-world applications.
no code implementations • 18 Dec 2024 • Jinghan He, Kuan Zhu, Haiyun Guo, Junfeng Fang, Zhenglin Hua, Yuheng Jia, Ming Tang, Tat-Seng Chua, Jinqiao Wang
Large vision-language models (LVLMs) have made substantial progress in integrating large language models (LLMs) with visual inputs, enabling advanced multimodal reasoning.
1 code implementation • 18 Dec 2024 • YiPeng Zhang, Yifan Liu, Zonghao Guo, Yidan Zhang, Xuesong Yang, Chi Chen, Jun Song, Bo Zheng, Yuan YAO, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
To address this issue, we present LLaVA-UHD v2, an advanced MLLM centered around a Hierarchical window transformer that enables capturing diverse visual granularity by constructing and integrating a high-resolution feature pyramid.
no code implementations • 17 Dec 2024 • Moxin Li, Yong Zhao, Yang Deng, Wenxuan Zhang, Shuaiyi Li, Wenya Xie, See-Kiong Ng, Tat-Seng Chua
Although large language models (LLMs) store vast amount of knowledge in their parameters, they still have limitations in the memorization and utilization of certain knowledge, leading to undesired behaviors such as generating untruthful and inaccurate responses.
no code implementations • 15 Dec 2024 • Shengqiong Wu, Hao Fei, Liangming Pan, William Yang Wang, Shuicheng Yan, Tat-Seng Chua
Our framework systematically addresses potential issues in both visual and textual inputs by verifying and integrating perception-level information with cognition-level commonsense knowledge, ensuring more reliable outputs.
no code implementations • 8 Dec 2024 • Leigang Qu, Haochuan Li, Wenjie Wang, Xiang Liu, Juncheng Li, Liqiang Nie, Tat-Seng Chua
To adapt SILMM to LMMs with continuous features, we propose a diversity mechanism to obtain diverse representations and a kernel-based continuous DPO for alignment.
no code implementations • 29 Nov 2024 • Haiyi Qiu, Minghe Gao, Long Qian, Kaihang Pan, Qifan Yu, Juncheng Li, Wenjie Wang, Siliang Tang, Yueting Zhuang, Tat-Seng Chua
Video Large Language Models (Video-LLMs) have recently shown strong performance in basic video understanding tasks, such as captioning and coarse-grained question answering, but struggle with compositional reasoning that requires multi-step spatio-temporal inference across object relations, interactions, and events.
no code implementations • 28 Nov 2024 • Shuo Xu, Haokai Ma, Yunshan Ma, Xiaohao Liu, Lei Meng, Xiangxu Meng, Tat-Seng Chua
The inherent popularity bias in the pre-extracted user feedback features and the insufficient utilization of other popularity-independent knowledge may force the conventional bundling methods to find more popular items, thereby struggling with this long-tail bundling scenario.
1 code implementation • 18 Nov 2024 • Chenhang Cui, Gelei Deng, An Zhang, Jingnan Zheng, Yicong Li, Lianli Gao, Tianwei Zhang, Tat-Seng Chua
Recent advances in Large Vision-Language Models (LVLMs) have showcased strong reasoning abilities across multiple modalities, achieving significant breakthroughs in various real-world applications.
no code implementations • 7 Nov 2024 • Ruiyang Ren, Yuhao Wang, Kun Zhou, Wayne Xin Zhao, Wenjie Wang, Jing Liu, Ji-Rong Wen, Tat-Seng Chua
Large language models (LLMs), with advanced linguistic capabilities, have been employed in reranking tasks through a sequence-to-sequence approach.
1 code implementation • 31 Oct 2024 • Ke Li, Fuyu Dong, Di Wang, Shaofeng Li, Quan Wang, Xinbo Gao, Tat-Seng Chua
Furthermore, we present VisTA, a simple yet effective baseline method that unifies the tasks of question answering and grounding by delivering both visual and textual answers.
1 code implementation • 30 Oct 2024 • Yang Zhang, Juntao You, Yimeng Bai, Jizhi Zhang, Keqin Bao, Wenjie Wang, Tat-Seng Chua
Recent advancements in recommender systems have focused on leveraging Large Language Models (LLMs) to improve user preference modeling, yielding promising outcomes.
1 code implementation • 30 Oct 2024 • Youcheng Huang, Fengbin Zhu, Jingkun Tang, Pan Zhou, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua
With the new RADAR dataset, we further develop a novel and effective iN-time Embedding-based AdveRSarial Image DEtection (NEARSIDE) method, which exploits a single vector that distilled from the hidden states of VLMs, which we call the attacking direction, to achieve the detection of adversarial images against benign ones in the input.
no code implementations • 22 Oct 2024 • Hongru Cai, Yongqi Li, Wenjie Wang, Fengbin Zhu, Xiaoyu Shen, Wenjie Li, Tat-Seng Chua
To overcome the limitation, we first formulate the task of LLM-empowered personalized Web agents, which integrate personalized data and user instructions to personalize instruction comprehension and action execution.
no code implementations • 18 Oct 2024 • Chenhang Cui, An Zhang, Yiyang Zhou, Zhaorun Chen, Gelei Deng, Huaxiu Yao, Tat-Seng Chua
The recent advancements in large language models (LLMs) and pre-trained vision models have accelerated the development of vision-language large models (VLLMs), enhancing the interaction between visual and linguistic modalities.
1 code implementation • 17 Oct 2024 • Shuo Liu, An Zhang, Guoqing Hu, Hong Qian, Tat-Seng Chua
Recommender systems predict personalized item rankings based on user preference distributions derived from historical behavior data.
no code implementations • 17 Oct 2024 • Kangkang Lu, Yanhua Yu, Zhiyong Huang, Jia Li, Yuling Wang, Meiyu Liang, Xiting Qin, Yimeng Ren, Tat-Seng Chua, Xidian Wang
Specifically, we propose a Heterogeneous Heterophilic Spectral Graph Neural Network (H2SGNN), which employs a dual-module approach: local independent filtering and global hybrid filtering.
no code implementations • 15 Oct 2024 • Man Liu, Huihui Bai, Feng Li, Chunjie Zhang, Yunchao Wei, Meng Wang, Tat-Seng Chua, Yao Zhao
Generalized zero-shot learning (GZSL) endeavors to identify the unseen categories using knowledge from the seen domain, necessitating the intrinsic interactions between the visual features and attribute semantic features.
no code implementations • 8 Oct 2024 • Hao Fei, Shengqiong Wu, Hanwang Zhang, Tat-Seng Chua, Shuicheng Yan
Recent developments of vision large language models (LLMs) have seen remarkable progress, yet still encounter challenges towards multimodal generalists, such as coarse-grained instance-level understanding, lack of unified support for both images and videos, and insufficient coverage across various vision tasks.
no code implementations • 7 Oct 2024 • Xinyu Lin, Chaoqun Yang, Wenjie Wang, Yongqi Li, Cunxiao Du, Fuli Feng, See-Kiong Ng, Tat-Seng Chua
To alleviate this, we consider 1) boosting top-K sequence alignment between the draft model and the target LLM, and 2) relaxing the verification strategy to reduce trivial LLM calls.
no code implementations • 7 Oct 2024 • Kelvin J. L. Koa, Yunshan Ma, Ritchie Ng, Huanhuan Zheng, Tat-Seng Chua
Stock portfolios are often exposed to rare consequential events (e. g., 2007 global financial crisis, 2020 COVID-19 stock market crash), as they do not have enough historical information to learn from.
2 code implementations • 3 Oct 2024 • Junfeng Fang, Houcheng Jiang, Kun Wang, Yunshan Ma, Xiang Wang, Xiangnan He, Tat-Seng Chua
To address this, we introduce AlphaEdit, a novel solution that projects perturbation onto the null space of the preserved knowledge before applying it to the parameters.
no code implementations • 1 Oct 2024 • Yu Zhao, Hao Fei, Shengqiong Wu, Meishan Zhang, Min Zhang, Tat-Seng Chua
Grammar Induction could benefit from rich heterogeneous signals, such as text, vision, and acoustics.
1 code implementation • 29 Sep 2024 • Jingnan Zheng, Jiaohao Liu, An Zhang, Jun Zeng, Ziqi Yang, Zhenkai Liang, Tat-Seng Chua
Among the various tools employed in malware detection, graph representations (e. g., function call graphs) have played a pivotal role in characterizing the behaviors of Android apps.
1 code implementation • 22 Sep 2024 • Sheng Zhou, Junbin Xiao, Xun Yang, Peipei Song, Dan Guo, Angela Yao, Meng Wang, Tat-Seng Chua
In this paper, we propose to study Grounded TextVideoQA by forcing models to answer questions and spatio-temporally localize the relevant scene-text regions, thus decoupling QA from scenetext recognition and promoting research towards interpretable QA.
1 code implementation • 22 Sep 2024 • Peixin Qin, Chen Huang, Yang Deng, Wenqiang Lei, Tat-Seng Chua
With the aid of large language models, current conversational recommender system (CRS) has gaining strong abilities to persuade users to accept recommended items.
no code implementations • 4 Sep 2024 • Xing Lan, Jian Xue, Ji Qi, Dongmei Jiang, Ke Lu, Tat-Seng Chua
Specifically, we have designed the CoT mechanism from three key perspectives: key observations, overall emotional interpretation, and conclusion.
Facial Expression Recognition Facial Expression Recognition (FER)
1 code implementation • Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics 2024 • Haohao Luo, Yang Deng, Ying Shen, See-Kiong Ng, Tat-Seng Chua
Multiple-choice questions (MCQs) are important in enhancing concept learning and student engagement for educational purposes.
1 code implementation • 8 Aug 2024 • Junbin Xiao, Nanxin Huang, Hangyu Qin, Dongyang Li, Yicong Li, Fengbin Zhu, Zhulin Tao, Jianxing Yu, Liang Lin, Tat-Seng Chua, Angela Yao
Video Large Language Models (Video-LLMs) are flourishing and has advanced many video-language tasks.
1 code implementation • 8 Aug 2024 • Haoxuan Li, Zhengmao Yang, Yunshan Ma, Yi Bin, Yang Yang, Tat-Seng Chua
We study an emerging and intriguing problem of multimodal temporal event forecasting with large language models.
no code implementations • 24 Jul 2024 • Yongqi Li, Hongru Cai, Wenjie Wang, Leigang Qu, Yinwei Wei, Wenjie Li, Liqiang Nie, Tat-Seng Chua
Despite its great potential, existing generative approaches are limited due to the following issues: insufficient visual information in identifiers, misalignment with high-level semantics, and learning gap towards the retrieval target.
no code implementations • 16 Jul 2024 • Xiaohao Liu, Jie Wu, Zhulin Tao, Yunshan Ma, Yinwei Wei, Tat-Seng Chua
Recent advances in product bundling have leveraged multimodal information through sophisticated encoders, but remain constrained by limited semantic understanding and a narrow scope of knowledge.
no code implementations • 16 Jul 2024 • He Chang, Chenchen Ye, Zhulin Tao, Jie Wu, Zhengmao Yang, Yunshan Ma, Xianglin Huang, Tat-Seng Chua
However, the reasoning capability of LLMs on temporal event forecasting has been under-explored.
1 code implementation • 10 Jul 2024 • An Zhang, Han Wang, Xiang Wang, Tat-Seng Chua
Domain Generalization (DG), designed to enhance out-of-distribution (OOD) generalization, is all about learning invariance against domain shifts utilizing sufficient supervision signals.
1 code implementation • 7 Jul 2024 • Leheng Sheng, An Zhang, Yi Zhang, Yuxin Chen, Xiang Wang, Tat-Seng Chua
Contrary to prevailing understanding that LMs and traditional recommenders learn two distinct representation spaces due to the huge gap in language and behavior modeling objectives, this work re-examines such understanding and explores extracting a recommendation space directly from the language representation space.
no code implementations • 27 Jun 2024 • Hao Fei, Shengqiong Wu, Meishan Zhang, Min Zhang, Tat-Seng Chua, Shuicheng Yan
Then, an SG-based framework is built, where the textual SG (TSG) is encoded with a graph Transformer, while the video dynamic SG (DSG) and the HSG are modeled with a novel recurrent graph Transformer for spatial and temporal feature propagation.
1 code implementation • 20 Jun 2024 • Rebecca Salganik, Xiaohao Liu, Yunshan Ma, Jian Kang, Tat-Seng Chua
As online music consumption increasingly shifts towards playlist-based listening, the task of playlist continuation, in which an algorithm suggests songs to extend a playlist in a personalized and musically cohesive manner, has become vital to the success of music streaming.
1 code implementation • 18 Jun 2024 • Xuan Zhang, Yang Deng, Zifeng Ren, See-Kiong Ng, Tat-Seng Chua
In this work, we introduce a new task, Proactive Agent Planning, which requires language agents to predict clarification needs based on user-agent conversation and agent-environment interaction, invoke external tools to collect valid information, and generate a plan to fulfill the user's demands.
1 code implementation • 13 Jun 2024 • Yuxin Chen, Junfei Tan, An Zhang, Zhengyi Yang, Leheng Sheng, Enzhi Zhang, Xiang Wang, Tat-Seng Chua
Specifically, we incorporate multiple negatives in user preference data and devise an alternative version of DPO loss tailored for LM-based recommenders, which is extended from the traditional full-ranking Plackett-Luce (PL) model to partial rankings and connected to softmax sampling strategies.
no code implementations • 9 Jun 2024 • Leigang Qu, Haochuan Li, Tan Wang, Wenjie Wang, Yongqi Li, Liqiang Nie, Tat-Seng Chua
Subsequently, we unify generation and retrieval in an autoregressive generation way and propose an autonomous decision module to choose the best-matched one between generated and retrieved images as the response to the text query.
1 code implementation • 9 Jun 2024 • Hao Li, Chenghao Yang, An Zhang, Yang Deng, Xiang Wang, Tat-Seng Chua
Crucial to addressing this real-world need are event summary and persona management, which enable reasoning for appropriate long-term dialogue responses.
no code implementations • 7 Jun 2024 • Shengqiong Wu, Hao Fei, Xiangtai Li, Jiayi Ji, Hanwang Zhang, Tat-Seng Chua, Shuicheng Yan
The resulting vision tokens effectively preserve semantic integrity and capture both low-frequency and high-frequency visual features.
Ranked #72 on Visual Question Answering on MM-Vet
no code implementations • 5 Jun 2024 • Man Liu, Huihui Bai, Feng Li, Chunjie Zhang, Yunchao Wei, Tat-Seng Chua, Yao Zhao
Zero-shot learning (ZSL) endeavors to transfer knowledge from seen categories to recognize unseen categories, which mostly relies on the semantic-visual interactions between image and attribute tokens.
1 code implementation • 4 Jun 2024 • Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua
We refer to the complex events composed of many news articles over an extended period as Temporal Complex Event (TCE).
no code implementations • 31 May 2024 • Shengyu Zhang, Ziqi Jiang, Jiangchao Yao, Fuli Feng, Kun Kuang, Zhou Zhao, Shuo Li, Hongxia Yang, Tat-Seng Chua, Fei Wu
The emerging causal recommendation methods achieve this by modeling the causal effect between user behaviors, however potentially neglect unobserved confounders (\eg, friend suggestions) that are hard to measure in practice.
3 code implementations • 27 May 2024 • Tianyu Yu, Haoye Zhang, Qiming Li, Qixin Xu, Yuan YAO, Da Chen, Xiaoman Lu, Ganqu Cui, Yunkai Dang, Taiwen He, Xiaocheng Feng, Jun Song, Bo Zheng, Zhiyuan Liu, Tat-Seng Chua, Maosong Sun
Traditional feedback learning for hallucination reduction relies on labor-intensive manual labeling or expensive proprietary models.
Ranked #1 on Visual Question Answering on AMBER
1 code implementation • 23 May 2024 • Jingnan Zheng, Han Wang, An Zhang, Tai D. Nguyen, Jun Sun, Tat-Seng Chua
Systematic analysis also validates that the generated test scenarios represent meaningful use cases, as well as integrate enhanced measures to probe long-tail risks.
1 code implementation • 23 May 2024 • Zhiyuan Liu, Yaorui Shi, An Zhang, Sihang Li, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
To resolve the challenges above, we propose a new pretraining method, ReactXT, for reaction-text modeling, and a new dataset, OpenExp, for experimental procedure prediction.
no code implementations • 22 May 2024 • Weixiang Zhao, Yulin Hu, Zhuojun Li, Yang Deng, Jiahe Guo, Xingyu Sui, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu
Safety alignment of large language models (LLMs) has been gaining increasing attention.
1 code implementation • 21 May 2024 • Zhiyuan Liu, An Zhang, Hao Fei, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
ProtT3 empowers an LM to understand protein sequences of amino acids by incorporating a PLM as its protein understanding module, enabling effective protein-to-text generation.
1 code implementation • 20 May 2024 • Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, dingnan jin, Hongru Liang, Tat-Seng Chua
To this end, we introduce CLAMBER, a benchmark for evaluating LLMs using a well-organized taxonomy.
no code implementations • 20 May 2024 • Yue Chen, Chen Huang, Yang Deng, Wenqiang Lei, dingnan jin, Jia Liu, Tat-Seng Chua
However, they still struggle to deliver promising performance on unseen domains, struggling to achieve effective domain transferability.
no code implementations • 16 May 2024 • Chen Huang, Xinwei Yang, Yang Deng, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua
However, successful legal case matching requires the tacit knowledge of legal practitioners, which is difficult to verbalize and encode into machines.
1 code implementation • 12 May 2024 • Wenjie Wang, Honghui Bao, Xinyu Lin, Jizhi Zhang, Yongqi Li, Fuli Feng, See-Kiong Ng, Tat-Seng Chua
Utilizing powerful Large Language Models (LLMs) for generative recommendation has attracted much attention.
no code implementations • 10 May 2024 • Wenqi Fan, Yujuan Ding, Liangbo Ning, Shijie Wang, Hengyun Li, Dawei Yin, Tat-Seng Chua, Qing Li
Given the powerful abilities of RAG in providing the latest and helpful auxiliary information, Retrieval-Augmented Large Language Models (RA-LLMs) have emerged to harness external and authoritative knowledge bases, rather than solely relying on the model's internal knowledge, to augment the generation quality of LLMs.
1 code implementation • 3 May 2024 • Kaihang Pan, Siliang Tang, Juncheng Li, Zhaoyu Fan, Wei Chow, Shuicheng Yan, Tat-Seng Chua, Yueting Zhuang, Hanwang Zhang
For multimodal LLMs, the synergy of visual comprehension (textual output) and generation (visual output) presents an ongoing challenge.
1 code implementation • 27 Apr 2024 • Chen Xu, Xiaopeng Ye, Wenjie Wang, Liang Pang, Jun Xu, Tat-Seng Chua
From a taxation perspective, we theoretically demonstrate that most previous fair re-ranking methods can be reformulated as an item-level tax policy.
no code implementations • 25 Apr 2024 • Yongqi Li, Xinyu Lin, Wenjie Wang, Fuli Feng, Liang Pang, Wenjie Li, Liqiang Nie, Xiangnan He, Tat-Seng Chua
With the information explosion on the Web, search and recommendation are foundational infrastructures to satisfying users' information needs.
no code implementations • 19 Apr 2024 • Yang Deng, Lizi Liao, Zhonghua Zheng, Grace Hui Yang, Tat-Seng Chua
Recent research on proactive conversational agents (PCAs) mainly focuses on improving the system's capabilities in anticipating and planning action sequences to accomplish tasks and achieve goals before users articulate their requests.
no code implementations • 17 Apr 2024 • Minghe Gao, Shuang Chen, Liang Pang, Yuan YAO, Jisheng Dang, Wenqiao Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang, Tat-Seng Chua
Their ability to execute intricate compositional reasoning tasks is also constrained, culminating in a stagnation of learning progression for these models.
1 code implementation • 4 Apr 2024 • Chen Huang, Peixin Qin, Yang Deng, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua
The conversational recommendation system (CRS) has been criticized regarding its user experience in real-world scenarios, despite recent significant progress achieved in academia.
no code implementations • 2 Apr 2024 • Yunshan Ma, Yingzhi He, Wenjun Zhong, Xiang Wang, Roger Zimmermann, Tat-Seng Chua
However, the cross-item relations have been under-explored in the current multimodal pre-train models.
1 code implementation • 22 Mar 2024 • Changmeng Zheng, Dayong Liang, WengYu Zhang, Xiao-Yong Wei, Tat-Seng Chua, Qing Li
The study addresses two key challenges: the trivialization of opinions resulting from excessive summarization and the diversion of focus caused by distractor concepts introduced from images.
1 code implementation • 18 Mar 2024 • Ruyi Xu, Yuan YAO, Zonghao Guo, Junbo Cui, Zanlin Ni, Chunjiang Ge, Tat-Seng Chua, Zhiyuan Liu, Maosong Sun, Gao Huang
To address the challenges, we present LLaVA-UHD, a large multimodal model that can efficiently perceive images in any aspect ratio and high resolution.
Ranked #5 on Long-Context Understanding on MMNeedle
no code implementations • 15 Mar 2024 • Moxin Li, Wenjie Wang, Fuli Feng, Fengbin Zhu, Qifan Wang, Tat-Seng Chua
Self-detection for Large Language Models (LLMs) seeks to evaluate the trustworthiness of the LLM's output by leveraging its own capabilities, thereby alleviating the issue of output hallucination.
no code implementations • 11 Mar 2024 • Tong Zhang, Chen Huang, Yang Deng, Hongru Liang, Jia Liu, Zujie Wen, Wenqiang Lei, Tat-Seng Chua
We investigate non-collaborative dialogue agents, which are expected to engage in strategic conversations with diverse users, for securing a mutual agreement that leans favorably towards the system's objectives.
1 code implementation • 11 Mar 2024 • Yujuan Ding, Yunshan Ma, Wenqi Fan, Yige Yao, Tat-Seng Chua, Qing Li
Fashion analysis refers to the process of examining and evaluating trends, styles, and elements within the fashion industry to understand and interpret its current state, generating fashion reports.
no code implementations • CVPR 2024 • Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang, Liqiang Nie, Tat-Seng Chua
We present a discriminative adapter built on T2I models to probe their discriminative abilities on two representative tasks and leverage discriminative fine-tuning to improve their text-image alignment.
no code implementations • 5 Mar 2024 • Zixuan Li, Lizi Liao, Yunshan Ma, Tat-Seng Chua
In this work, we delve into deep session data understanding via scrutinizing the various clues inside the rich information in user sessions.
1 code implementation • 5 Mar 2024 • Wenjie Wang, Changsheng Wang, Fuli Feng, Wentao Shi, Daizong Ding, Tat-Seng Chua
UBA estimates the treatment effect on each target user and optimizes the allocation of fake user budgets to maximize the attack performance.
no code implementations • 5 Mar 2024 • Zixuan Li, Lizi Liao, Tat-Seng Chua
In this paper, we propose a dual-learning model that hybrids the best from both implicit session feedback and proactively clarifying with users on the most critical questions.
no code implementations • CVPR 2024 • Jianwu Fang, Lei-Lei Li, Junfei Zhou, Junbin Xiao, Hongkai Yu, Chen Lv, Jianru Xue, Tat-Seng Chua
This model involves a contrastive interaction loss to learn the pair co-occurrence of normal, near-accident, accident frames with the corresponding text descriptions, such as accident reasons, prevention advice, and accident categories.
no code implementations • 28 Feb 2024 • Shasha Guo, Lizi Liao, Cuiping Li, Tat-Seng Chua
In this survey, we present a detailed examination of the advancements in Neural Question Generation (NQG), a field leveraging neural network techniques to generate relevant questions from diverse inputs like knowledge bases, texts, and images.
1 code implementation • 28 Feb 2024 • Jizhi Zhang, Keqin Bao, Wenjie Wang, Yang Zhang, Wentao Shi, Wanhong Xu, Fuli Feng, Tat-Seng Chua
Additionally, we prospect the evolution of Rec4Agentverse and conceptualize it into three stages based on the enhancement of the interaction and information exchange among Agent Items, Agent Recommender, and the user.
1 code implementation • 23 Feb 2024 • Yang Deng, Xuan Zhang, Wenxuan Zhang, Yifei Yuan, See-Kiong Ng, Tat-Seng Chua
Web agents powered by Large Language Models (LLMs) have demonstrated remarkable abilities in planning and executing multi-step interactions within complex web-based environments, fulfilling a wide range of web navigation tasks.
1 code implementation • 23 Feb 2024 • Yang Deng, Yong Zhao, Moxin Li, See-Kiong Ng, Tat-Seng Chua
Despite the remarkable abilities of Large Language Models (LLMs) to answer questions, they often display a considerable level of overconfidence even when the question does not have a definitive answer.
1 code implementation • 23 Feb 2024 • Zirui Guo, Lianghao Xia, Yanhua Yu, Yuling Wang, Zixuan Yang, Wei Wei, Liang Pang, Tat-Seng Chua, Chao Huang
Graph Structure Learning (GSL) focuses on capturing intrinsic dependencies and interactions among nodes in graph-structured data by generating novel graph structures.
1 code implementation • 18 Feb 2024 • Long Qian, Juncheng Li, Yu Wu, Yaobo Ye, Hao Fei, Tat-Seng Chua, Yueting Zhuang, Siliang Tang
Large Language Models (LLMs) demonstrate remarkable proficiency in comprehending and handling text-based tasks.
no code implementations • 16 Feb 2024 • Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua
Building upon this capability, we propose to enable multimodal large language models (MLLMs) to memorize and recall images within their parameters.
1 code implementation • 16 Feb 2024 • Yongqi Li, Zhen Zhang, Wenjie Wang, Liqiang Nie, Wenjie Li, Tat-Seng Chua
Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target.
no code implementations • 15 Feb 2024 • Jujia Zhao, Wenjie Wang, Chen Xu, Zhaochun Ren, See-Kiong Ng, Tat-Seng Chua
Nevertheless, applying Fed4Rec to LLM-based recommendation presents two main challenges: first, an increase in the imbalance of performance across clients, affecting the system's efficiency over time, and second, a high demand on clients' computational and storage resources for local training and inference of LLMs.
1 code implementation • 6 Feb 2024 • Kelvin J. L. Koa, Yunshan Ma, Ritchie Ng, Tat-Seng Chua
The training samples for the PPO trainer are also the responses generated during the reflective process, which eliminates the need for human annotators.
1 code implementation • 30 Jan 2024 • Xinyu Lin, Wenjie Wang, Yongqi Li, Shuo Yang, Fuli Feng, Yinwei Wei, Tat-Seng Chua
To pursue the two objectives, we propose a novel data pruning method based on two scores, i. e., influence score and effort score, to efficiently identify the influential samples.
no code implementations • 28 Jan 2024 • Kangkang Lu, Yanhua Yu, Hao Fei, Xuan Li, Zixuan Yang, Zirui Guo, Meiyu Liang, Mengran Yin, Tat-Seng Chua
Moreover, we theoretically establish that the number of distinguishable eigenvalues plays a pivotal role in determining the expressive power of spectral graph neural networks.
1 code implementation • 25 Jan 2024 • Sihang Li, Zhiyuan Liu, Yanchen Luo, Xiang Wang, Xiangnan He, Kenji Kawaguchi, Tat-Seng Chua, Qi Tian
Through 3D molecule-text alignment and 3D molecule-centric instruction tuning, 3D-MoLM establishes an integration of 3D molecular encoder and LM.
no code implementations • 24 Jan 2024 • Fengbin Zhu, Ziyang Liu, Fuli Feng, Chao Wang, Moxin Li, Tat-Seng Chua
In this work, we address question answering (QA) over a hybrid of tabular and textual data that are very common content on the Web (e. g. SEC filings), where discrete reasoning capabilities are often required.
no code implementations • 16 Jan 2024 • Lidong Zeng, Zhedong Zheng, Yinwei Wei, Tat-Seng Chua
This paper delves into the text-guided image editing task, focusing on modifying a reference image according to user-specified textual feedback to embody specific attributes.
1 code implementation • 13 Jan 2024 • Jujia Zhao, Wenjie Wang, Yiyan Xu, Teng Sun, Fuli Feng, Tat-Seng Chua
To achieve this target, the key lies in offering appropriate guidance to steer the reverse denoising process and providing a proper starting point to start the forward-reverse process during inference.
1 code implementation • 10 Jan 2024 • Luzhi Wang, Dongxiao He, He Zhang, Yixin Liu, Wenjie Wang, Shirui Pan, Di Jin, Tat-Seng Chua
To identify and reject OOD samples with GNNs, recent studies have explored graph OOD detection, often focusing on training a specific model or modifying the data on top of a well-trained GNN.
1 code implementation • CVPR 2024 • Yicong Li, Na Zhao, Junbin Xiao, Chun Feng, Xiang Wang, Tat-Seng Chua
With this regard we propose a novel task Language-guided Affordance Segmentation on 3D Object (LASO) which challenges a model to segment a 3D object's part relevant to a given affordance question.
1 code implementation • 15 Dec 2023 • Xinyu Lin, Wenjie Wang, Jujia Zhao, Yongqi Li, Fuli Feng, Tat-Seng Chua
They learn a feature extractor on warm-start items to align feature representations with interactions, and then leverage the feature extractor to extract the feature representations of cold-start items for interaction prediction.
no code implementations • 3 Dec 2023 • Yang Deng, Zifeng Ren, An Zhang, Wenqiang Lei, Tat-Seng Chua
In this work, we investigate a new task, named Goal-oriented Intelligent Tutoring Systems (GITS), which aims to enable the student's mastery of a designated concept by strategically planning a customized sequence of exercises and assessment.
1 code implementation • 2 Dec 2023 • Yunshan Ma, Chenchen Ye, Zijian Wu, Xiang Wang, Yixin Cao, Liang Pang, Tat-Seng Chua
Temporal complex event forecasting aims to predict the future events given the observed events from history.
3 code implementations • CVPR 2024 • Tianyu Yu, Yuan YAO, Haoye Zhang, Taiwen He, Yifeng Han, Ganqu Cui, Jinyi Hu, Zhiyuan Liu, Hai-Tao Zheng, Maosong Sun, Tat-Seng Chua
Multimodal Large Language Models (MLLMs) have recently demonstrated impressive capabilities in multimodal understanding, reasoning, and interaction.
Ranked #1 on Visual Question Answering on VQA v2
1 code implementation • 28 Nov 2023 • Yunshan Ma, Yingzhi He, Xiang Wang, Yinwei Wei, Xiaoyu Du, Yuyangzi Fu, Tat-Seng Chua
It does, however, have two limitations: 1) the two-view formulation does not fully exploit all the heterogeneous relations among users, bundles and items; and 2) the "early contrast and late fusion" framework is less effective in capturing user preference and difficult to generalize to multiple views.
2 code implementations • 21 Nov 2023 • Meng Chu, Zhedong Zheng, Wei Ji, Tingyu Wang, Tat-Seng Chua
Navigating drones through natural language commands remains challenging due to the dearth of accessible multi-modal datasets and the stringent precision requirements for aligning visual and textual data.
1 code implementation • 13 Nov 2023 • Chen Xu, Wenjie Wang, Yuxin Li, Liang Pang, Jun Xu, Tat-Seng Chua
Worse still, in this paper, we identify a subtler form of discrimination in LLMs, termed \textit{implicit ranking unfairness}, where LLMs exhibit discriminatory ranking patterns based solely on non-sensitive user profiles, such as user names.
1 code implementation • 8 Nov 2023 • Ao Zhang, Yuan YAO, Wei Ji, Zhiyuan Liu, Tat-Seng Chua
The development of large language models (LLMs) has greatly advanced the field of multimodal understanding, leading to the emergence of large multimodal models (LMMs).
1 code implementation • 1 Nov 2023 • Yang Deng, Wenxuan Zhang, Wai Lam, See-Kiong Ng, Tat-Seng Chua
Proactive dialogues serve as a practical yet challenging dialogue problem in the era of large language models (LLMs), where the dialogue policy planning is the key to improving the proactivity of LLMs.
1 code implementation • NeurIPS 2023 • An Zhang, Leheng Sheng, Zhibo Cai, Xiang Wang, Tat-Seng Chua
To bridge the gap, we delve into the reasons underpinning the success of contrastive loss in CF, and propose a principled Adversarial InfoNCE loss (AdvInfoNCE), which is a variant of InfoNCE, specially tailored for CF methods.
1 code implementation • 28 Oct 2023 • Yunshan Ma, Xiaohao Liu, Yinwei Wei, Zhulin Tao, Xiang Wang, Tat-Seng Chua
Specifically, we use self-attention modules to combine the multimodal and multi-item features, and then leverage both item- and bundle-level contrastive learning to enhance the representation learning, thus to counter the modality missing, noise, and sparsity problems.
1 code implementation • NeurIPS 2023 • Zhiyuan Liu, Yaorui Shi, An Zhang, Enzhi Zhang, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua
Our results show that a subgraph-level tokenizer and a sufficiently expressive decoder with remask decoding have a large impact on the encoder's representation learning.
1 code implementation • 19 Oct 2023 • Zhiyuan Liu, Sihang Li, Yanchen Luo, Hao Fei, Yixin Cao, Kenji Kawaguchi, Xiang Wang, Tat-Seng Chua
MolCA enables an LM (e. g., Galactica) to understand both text- and graph-based molecular contents via the cross-modal projector.
Ranked #5 on Molecule Captioning on ChEBI-20
1 code implementation • 18 Oct 2023 • Ruihao Shui, Yixin Cao, Xiang Wang, Tat-Seng Chua
Large language models (LLMs) have demonstrated great potential for domain-specific applications, such as the law domain.
1 code implementation • 16 Oct 2023 • An Zhang, Yuxin Chen, Leheng Sheng, Xiang Wang, Tat-Seng Chua
Recommender systems are the cornerstone of today's information dissemination, yet a disconnect between offline metrics and online performance greatly hinders their development.
1 code implementation • 16 Oct 2023 • An Zhang, Wenchang Ma, Jingnan Zheng, Xiang Wang, Tat-Seng Chua
The popularity shortcut tricks are good for in-distribution (ID) performance but poorly generalized to out-of-distribution (OOD) data, i. e., when popularity distribution of test data shifts w. r. t.
1 code implementation • 11 Oct 2023 • Liang Chen, Yang Deng, Yatao Bian, Zeyu Qin, Bingzhe Wu, Tat-Seng Chua, Kam-Fai Wong
Large language models (LLMs) outperform information retrieval techniques for downstream knowledge-intensive tasks when being prompted to generate world knowledge.
no code implementations • 10 Oct 2023 • Xinyu Lin, Wenjie Wang, Yongqi Li, Fuli Feng, See-Kiong Ng, Tat-Seng Chua
Harnessing Large Language Models (LLMs) for recommendation is rapidly emerging, which relies on two fundamental steps to bridge the recommendation item space and the language space: 1) item indexing utilizes identifiers to represent items in the language space, and 2) generation grounding associates LLMs' generated token sequences to in-corpus items.
1 code implementation • 26 Sep 2023 • Han Yi, Zhedong Zheng, Xiangyu Xu, Tat-Seng Chua
We aspire for our work to pave the way for automatic 3D prototyping via natural language descriptions.
1 code implementation • 11 Sep 2023 • Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, Tat-Seng Chua
While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides, they mostly fall prey to the limitation of only input-side multimodal understanding, without the ability to produce content in multiple modalities.
no code implementations • CVPR 2024 • Hao Fei, Shengqiong Wu, Wei Ji, Hanwang Zhang, Tat-Seng Chua
In this work, we investigate strengthening the awareness of video dynamics for DMs, for high-quality T2V generation.
no code implementations • 19 Aug 2023 • Kaihang Pan, Juncheng Li, Wenjie Wang, Hao Fei, Hongye Song, Wei Ji, Jun Lin, Xiaozhong Liu, Tat-Seng Chua, Siliang Tang
Recent studies indicate that dense retrieval models struggle to perform well on a wide variety of retrieval tasks that lack dedicated training data, as different retrieval tasks often entail distinct search intents.
1 code implementation • 18 Aug 2023 • Kelvin J. L. Koa, Yunshan Ma, Ritchie Ng, Tat-Seng Chua
The hierarchical VAE allows us to learn the complex and low-level latent variables for stock prediction, while the diffusion probabilistic model trains the predictor to handle stock price stochasticity by progressively adding random noise to the stock data.
1 code implementation • 12 Aug 2023 • Yunshan Ma, Chenchen Ye, Zijian Wu, Xiang Wang, Yixin Cao, Tat-Seng Chua
The task of event forecasting aims to model the relational and temporal patterns based on historical events and makes forecasting to what will happen in the future.
1 code implementation • 9 Aug 2023 • Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua
Afterward, we propose a fine-grained object-interaction diffusion method to synthesize high-faithfulness images conditioned on the prompt and the automatically generated layout.
no code implementations • 9 Aug 2023 • Yu Zhao, Hao Fei, Yixin Cao, Bobo Li, Meishan Zhang, Jianguo Wei, Min Zhang, Tat-Seng Chua
A scene-event mapping mechanism is first designed to bridge the gap between the underlying scene structure and the high-level event semantic structure, resulting in an overall hierarchical scene-event (termed ICE) graph structure.
no code implementations • 8 Aug 2023 • Bobo Li, Hao Fei, Lizi Liao, Yu Zhao, Chong Teng, Tat-Seng Chua, Donghong Ji, Fei Li
On the other hand, during the feature fusion stage, we propose a Contribution-aware Fusion Mechanism (CFM) and a Context Refusion Mechanism (CRM) for multimodal and context integration, respectively.
Ranked #8 on Emotion Recognition in Conversation on IEMOCAP
1 code implementation • 8 Aug 2023 • Juncheng Li, Kaihang Pan, Zhiqi Ge, Minghe Gao, Wei Ji, Wenqiao Zhang, Tat-Seng Chua, Siliang Tang, Hanwang Zhang, Yueting Zhuang
This shortcoming results in MLLMs' underperformance in comprehending demonstrative instructions consisting of multiple, interleaved, and multimodal instructions that demonstrate the required context to complete a task.
no code implementations • 7 Aug 2023 • Yicong Li, Xun Yang, An Zhang, Chun Feng, Xiang Wang, Tat-Seng Chua
This paper identifies two kinds of redundancy in the current VideoQA paradigm.
2 code implementations • 3 Aug 2023 • Keyu Duan, Qian Liu, Tat-Seng Chua, Shuicheng Yan, Wei Tsang Ooi, Qizhe Xie, Junxian He
More recently, with the rapid development of language models (LMs), researchers have focused on leveraging LMs to facilitate the learning of TGs, either by jointly training them in a computationally intensive framework (merging the two stages), or designing complex self-supervised training tasks for feature extraction (enhancing the first stage).
Ranked #1 on Node Property Prediction on ogbn-arxiv
no code implementations • 3 Aug 2023 • Hao Fei, Meishan Zhang, Min Zhang, Tat-Seng Chua
Structured Natural Language Processing (XNLP) is an important subset of NLP that entails understanding the underlying semantic or syntactic structure of texts, which serves as a foundational component for many downstream applications.