1 code implementation • 22 May 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Anh Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
Multimodal Review Helpfulness Prediction (MRHP) aims to rank product reviews based on predicted helpfulness scores and has been widely applied in e-commerce by presenting customers with useful reviews.
1 code implementation • 12 Dec 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Khoi Le, Zhiyuan Hu, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Fully fine-tuning pretrained large-scale transformer models has become a popular paradigm for video-language modeling tasks, such as temporal language grounding and video-language summarization.
1 code implementation • 5 Jul 2022 • Thong Nguyen, Cong-Duy Nguyen, Xiaobao Wu, See-Kiong Ng, Anh Tuan Luu
Moreover, a list of training datasets and downstream tasks is supplied to further polish the perspective into V&L pretraining.
1 code implementation • 26 Mar 2024 • Cong-Duy Nguyen, Thong Nguyen, Xiaobao Wu, Anh Tuan Luu
Previous work on multimodal sentence embedding has proposed multimodal contrastive learning and achieved promising results.
1 code implementation • 7 Nov 2022 • Thong Nguyen, Xiaobao Wu, Anh-Tuan Luu, Cong-Duy Nguyen, Zhen Hai, Lidong Bing
To overcome the aforementioned issues, we propose Multimodal Contrastive Learning for the Multimodal Review Helpfulness Prediction (MRHP) problem, concentrating on mutual information between input modalities to explicitly elaborate cross-modal relations.
1 code implementation • 25 Jan 2024 • Xiaobao Wu, Fengjun Pan, Thong Nguyen, Yichao Feng, Chaoqun Liu, Cong-Duy Nguyen, Anh Tuan Luu
Hierarchical topic modeling aims to discover latent topics from a corpus and organize them into a hierarchy to understand documents with desirable semantic granularity.
no code implementations • 4 Dec 2023 • Cong-Duy Nguyen, The-Anh Vu-Le, Thong Nguyen, Tho Quan, Luu Anh Tuan
Language models have been supervised with both a language-only objective and visual grounding in existing studies of visual-grounded language learning.
no code implementations • 5 Dec 2023 • Thong Nguyen, Xiaobao Wu, Xinshuai Dong, Cong-Duy Nguyen, See-Kiong Ng, Luu Anh Tuan
Temporal Language Grounding seeks to localize video moments that semantically correspond to a natural language query.
no code implementations • 4 Dec 2023 • Cong-Duy Nguyen, Thong Nguyen, Duc Anh Vu, Luu Anh Tuan
The effectiveness of a model is heavily reliant on the quality of the fusion representation of multiple modalities in multimodal sentiment analysis.