2 code implementations • EMNLP 2020 • Xiaobao Wu, Chunping Li, Yan Zhu, Yishu Miao
Topic models have prevailed for many years in discovering latent semantics while modeling long documents.
1 code implementation • 10 Jun 2025 • Fengjun Pan, Anh Tuan Luu, Xiaobao Wu
Building on these textual descriptions, we further incorporate targeted, interpretable human-crafted guidelines to guide models' reasoning under zero-shot CoT prompting.
no code implementations • 5 Jun 2025 • Yubo Ma, Jinsong Li, Yuhang Zang, Xiaobao Wu, Xiaoyi Dong, Pan Zhang, Yuhang Cao, Haodong Duan, Jiaqi Wang, Yixin Cao, Aixin Sun
We evaluate two token-reduction strategies: token pruning and token merging.
1 code implementation • 20 May 2025 • Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu
Process Reward Models (PRMs) have demonstrated promising results in mathematical reasoning, but existing process annotation approaches, whether through human annotations or Monte Carlo simulations, remain computationally expensive.
1 code implementation • 5 May 2025 • Xiaobao Wu
In this survey, we present a comprehensive overview of the paradigm of learning from rewards.
no code implementations • 17 Apr 2025 • Yichao Feng, Shuai Zhao, Yueqiu Li, Luwei Xiao, Xiaobao Wu, Anh Tuan Luu
To address these challenges, in this paper, we propose a novel framework for aspect-based summarization: Self-Aspect Retrieval Enhanced Summary Generation.
1 code implementation • 27 Mar 2025 • Haoran Luo, Haihong E, Guanting Chen, Yandan Zheng, Xiaobao Wu, Yikai Guo, Qika Lin, Yu Feng, Zemin Kuang, Meina Song, Yifan Zhu, Luu Anh Tuan
To retrieve and generate over hypergraphs, we introduce a complete pipeline with a hypergraph construction method, a hypergraph retrieval strategy, and a hypergraph-guided generation mechanism.
no code implementations • 20 Feb 2025 • Huimin Xu, Xin Mao, Feng-Lin Li, Xiaobao Wu, Wang Chen, Wei Zhang, Anh Tuan Luu
Direct Preference Optimization (DPO) often struggles with long-chain mathematical reasoning.
no code implementations • 18 Feb 2025 • Cong-Duy Nguyen, Xiaobao Wu, Duc Anh Vu, Shuai Zhao, Thong Nguyen, Anh Tuan Luu
Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal reasoning capabilities, but they remain susceptible to hallucination, particularly object hallucination where non-existent objects or incorrect attributes are fabricated in generated descriptions.
no code implementations • 17 Feb 2025 • Delvin Ce Zhang, Menglin Yang, Xiaobao Wu, Jiasheng Zhang, Hady W. Lauw
We thus propose a Hierarchical Graph Topic Modeling Transformer to integrate both topic hierarchy within documents and graph hierarchy across documents into a unified Transformer.
1 code implementation • 31 Jan 2025 • Haoran Luo, Haihong E, Yikai Guo, Qika Lin, Xiaobao Wu, Xinyu Mu, Wenhao Liu, Meina Song, Yifan Zhu, Luu Anh Tuan
Moreover, it employs MCTS, a heuristic search method driven by policy and reward models, to balance agentic exploration's performance and search space.
no code implementations • 24 Jan 2025 • Cong-Duy Nguyen, Xiaobao Wu, Thong Nguyen, Shuai Zhao, Khoi Le, Viet-Anh Nguyen, Feng Yichao, Anh Tuan Luu
Previous research on multimodal entity linking (MEL) has mainly employed contrastive learning as the primary objective.
1 code implementation • 18 Dec 2024 • Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, William Yang Wang
Data contamination hinders fair LLM evaluation by introducing test data into newer models' training sets.
1 code implementation • 12 Dec 2024 • Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang
This paper introduces RuleArena, a novel and challenging benchmark designed to evaluate the ability of large language models (LLMs) to follow complex, real-world rules in reasoning.
no code implementations • 10 Dec 2024 • Thong Thanh Nguyen, Yi Bin, Xiaobao Wu, Zhiyuan Hu, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
To resolve this problem, we propose a contrastive learning framework to capture salient semantics among video moments.
no code implementations • 10 Dec 2024 • Thong Thanh Nguyen, Xiaobao Wu, Yi Bin, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
To overcome this limitation, we introduce a contrastive representation learning framework that focuses on motion patterns for temporal scene graph generation.
no code implementations • 27 Nov 2024 • Duc Anh Vu, Nguyen Tran Cong Duy, Xiaobao Wu, Hoang Minh Nhat, Du Mingzhe, Nguyen Thanh Thong, Anh Tuan Luu
Large Language Models (LLMs) have shown strong in-context learning (ICL) abilities with a few demonstrations.
1 code implementation • 19 Oct 2024 • Fengjun Pan, Xiaobao Wu, Zongrui Li, Anh Tuan Luu
To elicit fallacy-related knowledge and reasoning abilities of LLMs, we propose diverse single-round and multi-round prompting schemes, applying different task-specific instructions such as extraction, summarization, and Chain-of-Thought reasoning.
1 code implementation • 18 Oct 2024 • Shuai Zhao, Xiaobao Wu, Cong-Duy Nguyen, Yanhao Jia, Meihuizi Jia, Yichao Feng, Luu Anh Tuan
Then, this teacher model guides the large-scale poisoned student model in unlearning the backdoor, leveraging PEFT.
1 code implementation • 12 Oct 2024 • Yuxi Xie, Anirudh Goyal, Xiaobao Wu, Xunjian Yin, Xiao Xu, Min-Yen Kan, Liangming Pan, William Yang Wang
Our approach models multiple token dependencies within manageable context windows, enabling the model to perform iterative refinement internally during the generation process.
no code implementations • 26 Sep 2024 • Shuai Zhao, Leilei Gan, Zhongliang Guo, Xiaobao Wu, Luwei Xiao, Xiaoyu Xu, Cong-Duy Nguyen, Luu Anh Tuan
Despite being widely applied due to their exceptional capabilities, Large Language Models (LLMs) have been proven to be vulnerable to backdoor attacks.
no code implementations • 19 Sep 2024 • Chaoqun Liu, Qin Chao, Wenxuan Zhang, Xiaobao Wu, Boyang Li, Anh Tuan Luu, Lidong Bing
We iteratively prompt LLMs to annotate unlabeled data and retain high-quality labels by filtering.
1 code implementation • 4 Jul 2024 • Thong Nguyen, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi Le, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
To address these problems, we propose MAMA, a new approach to learning video-language representations by utilizing a contrastive objective with a subtractive angular margin to regularize cross-modal representations in their effort to reach perfect similarity.
no code implementations • 10 Jun 2024 • Shuai Zhao, Meihuizi Jia, Zhongliang Guo, Leilei Gan, Xiaoyu Xu, Xiaobao Wu, Jie Fu, Yichao Feng, Fengjun Pan, Luu Anh Tuan
Large Language Models (LLMs), which bridge the gap between human language understanding and complex problem-solving, achieve state-of-the-art performance on several NLP tasks, particularly in few-shot and zero-shot settings.
1 code implementation • 30 May 2024 • Thong Thanh Nguyen, Zhiyuan Hu, Xiaobao Wu, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
Seeking answers effectively for long videos is essential to build video question answering (videoQA) systems.
2 code implementations • 28 May 2024 • Xiaobao Wu, Thong Nguyen, Delvin Ce Zhang, William Yang Wang, Anh Tuan Luu
We further propose a novel Embedding Transport Plan (ETP) method.
2 code implementations • 28 May 2024 • Xiaobao Wu, Xinshuai Dong, Liangming Pan, Thong Nguyen, Anh Tuan Luu
However, existing models suffer from repetitive topic and unassociated topic issues, failing to reveal the evolution and hindering further applications.
1 code implementation • 26 Mar 2024 • Cong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu
Previous work on multimodal sentence embedding has proposed multimodal contrastive learning and achieved promising results.
1 code implementation • 29 Feb 2024 • Xiaobao Wu, Liangming Pan, William Yang Wang, Anh Tuan Luu
Knowledge editing injects knowledge updates into language models to keep them correct and up-to-date.
no code implementations • 12 Feb 2024 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
Secondly, we explicitly cast contrastive topic modeling as a gradient-based multi-objective optimization problem, with the goal of achieving a Pareto stationary solution that balances the trade-off between the ELBO and the contrastive objective.
2 code implementations • 27 Jan 2024 • Xiaobao Wu, Thong Nguyen, Anh Tuan Luu
In this paper, we present a comprehensive survey on neural topic models concerning methods, applications, and challenges.
2 code implementations • 25 Jan 2024 • Xiaobao Wu, Fengjun Pan, Thong Nguyen, Yichao Feng, Chaoqun Liu, Cong-Duy Nguyen, Anh Tuan Luu
Hierarchical topic modeling aims to discover latent topics from a corpus and organize them into a hierarchy to understand documents with desirable semantic granularity.
1 code implementation • 12 Dec 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Khoi Le, Zhiyuan Hu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video-language summarization.
no code implementations • 5 Dec 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Temporal Language Grounding seeks to localize video moments that semantically correspond to a natural language query.
1 code implementation • 13 Sep 2023 • Xiaobao Wu, Fengjun Pan, Anh Tuan Luu
Topic models have a rich history with various applications and have recently been reinvigorated by neural topic modeling.
2 code implementations • 7 Jun 2023 • Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Anh Tuan Luu
Topic models have been prevalent for decades with various applications.
1 code implementation • 22 May 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Anh Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
Multimodal Review Helpfulness Prediction (MRHP) aims to rank product reviews based on predicted helpfulness scores and has been widely applied in e-commerce via presenting customers with useful reviews.
2 code implementations • 22 May 2023 • Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min-Yen Kan, Preslav Nakov
Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning.
1 code implementation • 19 May 2023 • Chaoqun Liu, Wenxuan Zhang, Guizhen Chen, Xiaobao Wu, Anh Tuan Luu, Chip Hong Chang, Lidong Bing
In this work, we propose a new paradigm based on self-supervised learning to solve zero-shot text classification tasks by tuning the language models with unlabeled data, called self-supervised tuning.
2 code implementations • 7 Apr 2023 • Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Chaoqun Liu, Liangming Pan, Anh Tuan Luu
Instead of the direct alignment in previous work, we propose a topic alignment with mutual information method.
2 code implementations • 23 Nov 2022 • Xiaobao Wu, Anh Tuan Luu, Xinshuai Dong
To overcome the data sparsity issue in short text topic modeling, existing methods commonly rely on data augmentation or the data characteristic of short texts to introduce more word co-occurrence information.
1 code implementation • 7 Nov 2022 • Thong Nguyen, Xiaobao Wu, Anh-Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
To overcome the aforementioned issues, we propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations.
1 code implementation • 5 Jul 2022 • Thong Nguyen, Cong-Duy Nguyen, Xiaobao Wu, See-Kiong Ng, Anh Tuan Luu
Moreover, a list of training datasets and downstream tasks is supplied to further polish the perspective into V&L pretraining.