1 code implementation • 19 May 2025 • Thong Nguyen, Zhiyuan Hu, Xu Lin, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
In this work, we conduct a thorough empirical study to demystify crucial components that influence the temporal understanding of LVLMs.
no code implementations • 30 Apr 2025 • Franco Maria Nardini, Thong Nguyen, Cosimo Rulli, Rossano Venturini, Andrew Yates
In this paper, we conduct an extended evaluation of regularization approaches for LSR where we discuss their effectiveness, efficiency, and out-of-domain generalization capabilities.
no code implementations • 25 Apr 2025 • Jingfen Qiao, Thong Nguyen, Evangelos Kanoulas, Andrew Yates
Learned Sparse Retrieval (LSR) has traditionally focused on small-scale encoder-only transformer architectures.
no code implementations • 9 Mar 2025 • Khang H. N. Vo, Duc P. T. Nguyen, Thong Nguyen, Tho T. Quan
This paper focuses on multimodal alignment within the realm of Artificial Intelligence, particularly in text and image modalities.
no code implementations • 18 Feb 2025 • Cong-Duy Nguyen, Xiaobao Wu, Duc Anh Vu, Shuai Zhao, Thong Nguyen, Anh Tuan Luu
Large Vision-Language Models (LVLMs) have demonstrated impressive multimodal reasoning capabilities, but they remain susceptible to hallucination, particularly object hallucination where non-existent objects or incorrect attributes are fabricated in generated descriptions.
no code implementations • 24 Jan 2025 • Cong-Duy Nguyen, Xiaobao Wu, Thong Nguyen, Shuai Zhao, Khoi Le, Viet-Anh Nguyen, Feng Yichao, Anh Tuan Luu
Previous research on multimodal entity linking (MEL) has mainly employed contrastive learning as the primary objective.
1 code implementation • 10 Oct 2024 • Thong Nguyen, Shubham Chatterjee, Sean MacAvaney, Iain Mackie, Jeff Dalton, Andrew Yates
Learned Sparse Retrieval (LSR) models use vocabularies from pre-trained transformers, which often split entities into nonsensical fragments.
no code implementations • 25 Sep 2024 • Thong Nguyen, Truc-My Nguyen
Hence, we consider the problem of counterfactual detection (CFD) and seek to enhance the CFD models.
1 code implementation • 4 Jul 2024 • Thong Nguyen, Yi Bin, Xiaobao Wu, Xinshuai Dong, Zhiyuan Hu, Khoi Le, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
To address these problems, we propose MAMA, a new approach to learning video-language representations by utilizing a contrastive objective with a subtractive angular margin to regularize cross-modal representations in their effort to reach perfect similarity.
1 code implementation • 9 Jun 2024 • Thong Nguyen, Yi Bin, Junbin Xiao, Leigang Qu, Yicong Li, Jay Zhangjie Wu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Humans use multiple senses to comprehend the environment.
2 code implementations • 28 May 2024 • Xiaobao Wu, Thong Nguyen, Delvin Ce Zhang, William Yang Wang, Anh Tuan Luu
We further propose a novel Embedding Transport Plan (ETP) method.
2 code implementations • 28 May 2024 • Xiaobao Wu, Xinshuai Dong, Liangming Pan, Thong Nguyen, Anh Tuan Luu
However, existing models suffer from repetitive topic and unassociated topic issues, failing to reveal the evolution and hindering further applications.
1 code implementation • 26 Mar 2024 • Cong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu
Previous work on multimodal sentence embedding has proposed multimodal contrastive learning and achieved promising results.
1 code implementation • 27 Feb 2024 • Thong Nguyen, Mariya Hendriksen, Andrew Yates, Maarten de Rijke
Our proposed approach efficiently transforms dense vectors from a frozen dense model into sparse lexical vectors.
no code implementations • 12 Feb 2024 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy T Nguyen, See-Kiong Ng, Anh Tuan Luu
Secondly, we explicitly cast contrastive topic modeling as a gradient-based multi-objective optimization problem, with the goal of achieving a Pareto stationary solution that balances the trade-off between the ELBO and the contrastive objective.
no code implementations • 12 Feb 2024 • Thong Nguyen, Mariya Hendriksen, Andrew Yates
Motivated by this, in this work, we explore the application of LSR in the multi-modal domain, i.e., we focus on Multi-Modal Learned Sparse Retrieval (MLSR).
2 code implementations • 27 Jan 2024 • Xiaobao Wu, Thong Nguyen, Anh Tuan Luu
In this paper, we present a comprehensive survey on neural topic models concerning methods, applications, and challenges.
2 code implementations • 25 Jan 2024 • Xiaobao Wu, Fengjun Pan, Thong Nguyen, Yichao Feng, Chaoqun Liu, Cong-Duy Nguyen, Anh Tuan Luu
Hierarchical topic modeling aims to discover latent topics from a corpus and organize them into a hierarchy to understand documents with desirable semantic granularity.
1 code implementation • 12 Dec 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Khoi Le, Zhiyuan Hu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video-language summarization.
no code implementations • 5 Dec 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Temporal Language Grounding seeks to localize video moments that semantically correspond to a natural language query.
no code implementations • 4 Dec 2023 • Cong-Duy Nguyen, Thong Nguyen, Duc Anh Vu, Luu Anh Tuan
In multimodal sentiment analysis, the effectiveness of a model relies heavily on the quality of the fusion representation of multiple modalities.
no code implementations • 4 Dec 2023 • Cong-Duy Nguyen, The-Anh Vu-Le, Thong Nguyen, Tho Quan, Luu Anh Tuan
In existing studies of visually grounded language learning, language models have been supervised with both a language-only objective and visual grounding.
no code implementations • 20 Jun 2023 • Thong Nguyen, Andrew Yates
Generative retrieval is a promising new neural retrieval paradigm that aims to optimize the retrieval pipeline by performing both indexing and retrieval with a single transformer model.
2 code implementations • 7 Jun 2023 • Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Anh Tuan Luu
Topic models have been prevalent for decades with various applications.
1 code implementation • 29 May 2023 • Thong Nguyen, Sean MacAvaney, Andrew Yates
We investigate existing aggregation approaches for adapting LSR to longer documents and find that proximal scoring is crucial for LSR to handle long documents.
1 code implementation • 22 May 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Anh Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
Multimodal Review Helpfulness Prediction (MRHP) aims to rank product reviews based on predicted helpfulness scores and has been widely applied in e-commerce via presenting customers with useful reviews.
2 code implementations • 7 Apr 2023 • Xiaobao Wu, Xinshuai Dong, Thong Nguyen, Chaoqun Liu, Liangming Pan, Anh Tuan Luu
Instead of the direct alignment in previous work, we propose a topic alignment with mutual information method.
1 code implementation • 23 Mar 2023 • Thong Nguyen, Sean MacAvaney, Andrew Yates
We then reproduce all prominent methods using a common codebase and re-train them in the same environment, which allows us to quantify how components of the framework affect effectiveness and efficiency.
1 code implementation • 7 Nov 2022 • Thong Nguyen, Xiaobao Wu, Anh-Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
To overcome the aforementioned issues, we propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations.
1 code implementation • 5 Jul 2022 • Thong Nguyen, Cong-Duy Nguyen, Xiaobao Wu, See-Kiong Ng, Anh Tuan Luu
Moreover, a list of training datasets and downstream tasks is supplied to further polish the perspective into V&L pretraining.
1 code implementation • ACL 2022 • Thong Nguyen, Andrew Yates, Ayah Zirikly, Bart Desmet, Arman Cohan
In dataset-transfer experiments on three social media datasets, we find that grounding the model in PHQ9's symptoms substantially improves its ability to generalize to out-of-distribution data compared to a standard BERT-based approach.
1 code implementation • 7 Dec 2021 • Thong Nguyen, Luu Anh Tuan
Current state-of-the-art cross-lingual summarization models employ a multi-task learning paradigm, which works on a shared vocabulary module and relies on the self-attention mechanism to attend among tokens in two languages.
2 code implementations • NeurIPS 2021 • Thong Nguyen, Anh Tuan Luu
Recent empirical studies show that adversarial topic models (ATM) can successfully capture semantic patterns of a document by differentiating it from a dissimilar sample.
no code implementations • EMNLP 2021 • Thong Nguyen, Anh Tuan Luu, Truc Lu, Tho Quan
Recently, Transformer-based models have been proven effective in the abstractive summarization task by creating fluent and informative summaries.
no code implementations • 10 Mar 2020 • Thong Nguyen, Duy Nguyen, Pramod Rao
Named Entity Recognition (NER) plays an important role in several Natural Language Processing (NLP) applications, such as information extraction, sentiment analysis, and chatbots, because it identifies entities in text and categorizes them into predefined groups such as the names of persons, locations, quantities, organizations, or percentages.
no code implementations • 25 Jan 2019 • Thong Nguyen, Tianjian Lu, Ken Wu, Jose Schutt-Aine
In this paper, we leverage machine learning methods, specifically the recurrent neural network (RNN), to generate black-box macromodels and achieve a significant reduction in computation time.