1 code implementation • ACL 2022 • Yongqi Li, Wenjie Li, Liqiang Nie
In this paper, we hence define a novel research task, i.e., multimodal conversational question answering (MMCoQA), aiming to answer users' questions with multimodal knowledge sources via multi-turn conversations.
no code implementations • 14 Apr 2025 • Gang Wu, Junjun Jiang, Kui Jiang, Xianming Liu, Liqiang Nie
All-in-one image restoration, addressing diverse degradation types with a unified model, presents significant challenges in designing task-specific prompts that effectively guide restoration across multiple degradation scenarios.
1 code implementation • 27 Mar 2025 • Zixu Li, Zhiheng Fu, Yupeng Hu, Zhiwei Chen, Haokun Wen, Liqiang Nie
Using this pipeline, we refine the FashionIQ and CIRR datasets to create two fine-grained CIR datasets: Fine-FashionIQ and Fine-CIRR.
1 code implementation • 25 Mar 2025 • Haoqiang Lin, Haokun Wen, Xuemeng Song, Meng Liu, Yupeng Hu, Liqiang Nie
Pioneering ZS-CIR studies focus on converting the CIR task into a standard text-to-image retrieval task by pre-training a textual inversion network that maps a given image into a single pseudo-word token.
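As a loose illustration of the textual-inversion idea, the sketch below maps a frozen image embedding to a single pseudo-word token embedding with a small MLP and trains it contrastively; the encoder outputs, dimensions, and loss are placeholders, not the paper's released implementation.

```python
# Minimal sketch of textual inversion for zero-shot composed image retrieval.
# Hypothetical encoders/dimensions; not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextualInversion(nn.Module):
    """Maps an image embedding to a single pseudo-word token embedding."""
    def __init__(self, img_dim=512, tok_dim=512, hidden=1024):
        super().__init__()
        self.phi = nn.Sequential(
            nn.Linear(img_dim, hidden), nn.GELU(), nn.Linear(hidden, tok_dim)
        )

    def forward(self, img_emb):                       # (B, img_dim)
        return self.phi(img_emb)                      # (B, tok_dim) pseudo-token

def inversion_loss(pseudo_tok_text_emb, img_emb, tau=0.07):
    """Contrastive loss aligning the encoded caption 'a photo of [pseudo-token]'
    with its source image embedding."""
    t = F.normalize(pseudo_tok_text_emb, dim=-1)
    v = F.normalize(img_emb, dim=-1)
    logits = t @ v.t() / tau                          # (B, B) similarity matrix
    labels = torch.arange(t.size(0))
    return F.cross_entropy(logits, labels)

# Toy usage with random features standing in for frozen CLIP-style outputs;
# the text-encoder step is skipped and the pseudo-tokens are reused directly.
img_emb = torch.randn(8, 512)
mapper = TextualInversion()
pseudo_tokens = mapper(img_emb)
print(inversion_loss(pseudo_tokens, img_emb))
```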
1 code implementation • 24 Mar 2025 • Yanda Chen, Gongwei Chen, Miao Zhang, Weili Guan, Liqiang Nie
Recent works on dataset distillation demonstrate that combining distilled and real data can mitigate the effectiveness decay.
1 code implementation • 23 Mar 2025 • Jianjian Yin, Tao Chen, Gensheng Pei, Yazhou Yao, Liqiang Nie, Xiansheng Hua
Specifically, we first design a feature knowledge alignment (FKA) strategy to promote feature consistency learning in the encoder under image augmentation.
no code implementations • 18 Mar 2025 • Yongqi Li, Lu Yang, Jian Wang, Runyang You, Wenjie Li, Liqiang Nie
Additionally, applying BPO to the MMSafe-PO dataset greatly reduces the base MLLM's unsafe rate on other safety benchmarks (14.5% on MM-SafetyBench and 82.9% on HarmEval), demonstrating the effectiveness and robustness of both the dataset and the approach.
1 code implementation • 16 Mar 2025 • Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie
Multimodal Large Language Models (MLLMs) have revolutionized video understanding, yet are still limited by context length when processing long videos.
no code implementations • 13 Mar 2025 • Yunxiao Wang, Meng Liu, Rui Shao, Haoyu Zhang, Bin Wen, Fan Yang, Tingting Gao, Di Zhang, Liqiang Nie
Video large language models have achieved remarkable performance in tasks such as video question answering; however, their temporal understanding remains suboptimal.
no code implementations • 13 Mar 2025 • Qi Lv, Hao Li, Xiang Deng, Rui Shao, Yinchuan Li, Jianye Hao, Longxiang Gao, Michael Yu Wang, Liqiang Nie
In this paper, we propose Kinematics enhanced Spatial-TemporAl gRaph Diffuser (KStar Diffuser).
no code implementations • 13 Mar 2025 • Qiyuan Deng, Xuefeng Bai, Kehai Chen, YaoWei Wang, Liqiang Nie, Min Zhang
Reinforcement Learning (RL) algorithms for safety alignment of Large Language Models (LLMs), such as Direct Preference Optimization (DPO), encounter the challenge of distribution shift.
no code implementations • 12 Mar 2025 • Haoyu Zhang, Qiaohui Chu, Meng Liu, Yunxiao Wang, Bin Wen, Fan Yang, Tingting Gao, Di Zhang, YaoWei Wang, Liqiang Nie
To address these challenges, we propose learning the mapping between exocentric and egocentric domains, leveraging the extensive exocentric knowledge within existing MLLMs to enhance egocentric video understanding.
no code implementations • 11 Mar 2025 • Runling Long, Yunlong Wang, Jia Wan, Xiang Deng, Xinting Zhu, Weili Guan, Antoni B. Chan, Liqiang Nie
However, most existing methods are designed for indoor navigation, and their performance in analyzing complex object distributions in large-scale scenes, such as crowds, remains unknown.
1 code implementation • 11 Mar 2025 • Xinrui Li, Jianlong Wu, Xinchuan Huang, Chong Chen, Weili Guan, Xian-Sheng Hua, Liqiang Nie
Pioneering text-to-image (T2I) diffusion models have ushered in a new era of real-world image super-resolution (Real-ISR), significantly enhancing the visual perception of reconstructed images.
no code implementations • 5 Mar 2025 • Wei Li, Bing Hu, Rui Shao, Leyang Shen, Liqiang Nie
However, existing online video assistants often sacrifice assistant efficacy for real-time efficiency by processing low-frame-rate videos with coarse-grained visual features. To overcome the trade-off between efficacy and efficiency, we propose "Fast & Slow Video-Language Thinker" as an onLIne videO assistaNt, LION-FS, achieving real-time, proactive, temporally accurate, and contextually precise responses.
no code implementations • 28 Feb 2025 • Xiao Wang, Jingyun Hua, WeiHong Lin, Yuanxing Zhang, Fuzheng Zhang, Jianlong Wu, Di Zhang, Liqiang Nie
Recent Multi-modal Large Language Models (MLLMs) have made great progress in video understanding.
no code implementations • 27 Feb 2025 • Hengshuo Chu, Xiang Deng, Qi Lv, Xiaoyang Chen, Yinchuan Li, Jianye Hao, Liqiang Nie
In addition, given the scarcity of 3D affordance datasets for training large models, we seek to extract knowledge from general segmentation data and transfer it to affordance detection.
no code implementations • 27 Feb 2025 • Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie
GOAP contains (1) an Action-guided Behavior Encoder that models causal relationships between observations and actions at each timestep, then dynamically interacts with the historical observation-action sequence, consolidating it into fixed-length behavior tokens, and (2) an MLLM that aligns behavior tokens with open-ended language instructions to predict actions auto-regressively.
1 code implementation • 19 Feb 2025 • Xuemeng Song, Haoqiang Lin, Haokun Wen, Bohan Hou, Mingzhu Xu, Liqiang Nie
To the best of our knowledge, there is currently no comprehensive review of CIR to provide a timely overview of this field.
no code implementations • 18 Feb 2025 • Jiaqi Zhao, Ming Wang, Miao Zhang, Yuzhang Shang, Xuebo Liu, YaoWei Wang, Min Zhang, Liqiang Nie
Then, we conduct extensive experiments with the baseline within each class, covering models with various sizes (7B-70B), bit-widths, training levels (LLaMA1/2/3/3.1), architectures (Mixtral, DeepSeekMoE and Mamba) and modalities (LLaVA1.5 and VILA1.5) on a wide range of evaluation metrics. Through comparative analysis of the results, we summarize the superiority of each PTQ strategy and the model-size/bit-width trade-off in terms of performance.
no code implementations • 27 Jan 2025 • Renshan Zhang, Rui Shao, Gongwei Chen, Kaiwen Zhou, Weili Guan, Liqiang Nie
To directly address the visual redundancy present in the output of vision encoder, we propose a Register-based Representation Compacting (ReCompact) mechanism.
1 code implementation • 13 Jan 2025 • Han Liu, Yinwei Wei, Fan Liu, Wenjie Wang, Liqiang Nie, Tat-Seng Chua
In this paper, we develop a novel meta-learning-based multimodal fusion framework called Meta Multimodal Fusion (MetaMMF), which dynamically assigns parameters to the multimodal fusion function for each micro-video during its representation learning.
1 code implementation • 29 Dec 2024 • Xiao Wang, Qingyi Si, Jianlong Wu, Shiyu Zhu, Li Cao, Liqiang Nie
Video Large Language Models (VideoLLMs) have made significant strides in video understanding but struggle with long videos due to the limitations of their backbone LLMs.
no code implementations • 20 Dec 2024 • Yangyang Guo, Ziwei Xu, Xilie Xu, Yongkang Wong, Liqiang Nie, Mohan Kankanhalli
This technical report introduces our top-ranked solution that employs two approaches, i.e., suffix injection and projected gradient descent (PGD), to address the TiFA workshop MLLM attack challenge.
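PGD itself is a standard attack; the sketch below shows a generic L-infinity PGD loop on image pixels against a toy classifier, purely for illustration. It does not reproduce the MLLM-specific suffix-injection component or the challenge setup.

```python
# Minimal sketch of projected gradient descent (PGD) on image pixels.
# The victim model here is a toy classifier, not an MLLM.
import torch
import torch.nn as nn
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """L-infinity PGD: iteratively ascend the loss and project back to the eps-ball."""
    x_adv = (x.clone().detach() + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv.detach() + alpha * grad.sign()           # gradient ascent step
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)  # project to eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()

# Toy usage
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x, y = torch.rand(4, 3, 32, 32), torch.randint(0, 10, (4,))
x_adv = pgd_attack(model, x, y)
print((x_adv - x).abs().max() <= 8/255 + 1e-6)  # perturbation stays in the eps-ball
```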
no code implementations • 8 Dec 2024 • Leigang Qu, Haochuan Li, Wenjie Wang, Xiang Liu, Juncheng Li, Liqiang Nie, Tat-Seng Chua
To adapt SILMM to LMMs with continuous features, we propose a diversity mechanism to obtain diverse representations and a kernel-based continuous DPO for alignment.
no code implementations • 18 Nov 2024 • Boyao Zhou, Shunyuan Zheng, Hanzhang Tu, Ruizhi Shao, Boning Liu, Shengping Zhang, Liqiang Nie, Yebin Liu
To this end, we introduce Gaussian parameter maps defined on the source views and directly regress Gaussian properties for instant novel view synthesis without any fine-tuning or optimization.
no code implementations • 13 Nov 2024 • Yangyang Guo, Fangkai Jiao, Liqiang Nie, Mohan Kankanhalli
The problem causes these defense methods to exhibit unintended abstention, even in the presence of benign inputs, thereby undermining their reliability in faithfully defending against attacks.
1 code implementation • 19 Oct 2024 • Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Kaiwen Zhou, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye Hao, Jun Wang, Kun Shao
SPA-Bench offers three key contributions: (1) A diverse set of tasks covering system and third-party apps in both English and Chinese, focusing on features commonly used in daily routines; (2) A plug-and-play framework enabling real-time agent interaction with Android devices, integrating over ten agents with the flexibility to add more; (3) A novel evaluation pipeline that automatically assesses agent performance across multiple dimensions, encompassing seven metrics related to task completion and resource consumption.
no code implementations • 18 Oct 2024 • Muhe Ding, Jianlong Wu, Xue Dong, Xiaojie Li, Pengda Qin, Tian Gan, Liqiang Nie
It first distills the structural knowledge of both instance-level feature correspondence and the relation between instance features and category centers in a contrastive learning fashion. This explicitly optimizes the category representation and explores the distinct correlation between instance and category representations, contributing to discriminative category centers and better classification results.
no code implementations • 18 Oct 2024 • Muhe Ding, Yang Ma, Pengda Qin, Jianlong Wu, Yuhong Li, Liqiang Nie
Multimodal Large Language Models (MLLMs) have recently received substantial interest, which shows their emerging potential as general-purpose models for various vision-language tasks.
1 code implementation • 3 Oct 2024 • Zheng Zhang, Xu Yuan, Lei Zhu, Jingkuan Song, Liqiang Nie
In this paper, we introduce a novel bilateral backdoor to fill in the missing pieces of the puzzle in the cross-modal backdoor and propose a generalized invisible backdoor framework against cross-modal learning (BadCM).
no code implementations • 29 Sep 2024 • Xiao Wang, Jianlong Wu, Zijia Lin, Fuzheng Zhang, Di Zhang, Liqiang Nie
For iterative refinement, we first leverage a video-language model to generate synthetic annotations, resulting in a refined dataset.
no code implementations • 28 Sep 2024 • Jiarui Jiang, Wei Huang, Miao Zhang, Taiji Suzuki, Liqiang Nie
To address this gap, this work delves deeply into the benign overfitting perspective of transformers in vision.
1 code implementation • International Journal of Computer Vision (IJCV) 2024 • Xianzhu Liu, Haozhe Xie, Shengping Zhang, Hongxun Yao, Rongrong Ji, Liqiang Nie, DaCheng Tao
Semantic scene completion (SSC) aims to simultaneously perform scene completion (SC) and predict semantic categories of a 3D scene from a single depth and/or RGB image.
Ranked #1 on 3D Semantic Scene Completion on NYUv2
1 code implementation • 5 Sep 2024 • Qianlong Xiang, Miao Zhang, Yuzhang Shang, Jianlong Wu, Yan Yan, Liqiang Nie
Furthermore, considering that the source data is either inaccessible or too enormous to store for current generative models, we introduce a new paradigm for their distillation without source data, termed Data-Free Knowledge Distillation for Diffusion Models (DKDM).
no code implementations • 3 Sep 2024 • Xinyu Zhang, Linmei Hu, Luhao Zhang, Dandan Song, Heyan Huang, Liqiang Nie
In our Laser, the prefix is utilized to incorporate user-item collaborative information and adapt the LLM to the recommendation task, while the suffix converts the output embeddings of the LLM from the language space to the recommendation space for the follow-up item recommendation.
no code implementations • 19 Aug 2024 • Xiao Han, Zijian Zhang, Xiangyu Zhao, Yuanshao Zhu, Guojiang Shen, Xiangjie Kong, Xuetao Wei, Liqiang Nie, Jieping Ye
As urban residents demand higher travel quality, vehicle dispatch has become a critical component of online ride-hailing services.
no code implementations • 13 Aug 2024 • Harry Cheng, Yangyang Guo, Qingpei Guo, Ming Yang, Tian Gan, Liqiang Nie
Multi-modal Large Language Models (MLLMs) have advanced significantly, offering powerful vision-language understanding capabilities.
1 code implementation • 7 Aug 2024 • Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie
On top of the Hybrid Multimodal Memory module, a multimodal agent, Optimus-1, is constructed with a dedicated Knowledge-guided Planner and Experience-Driven Reflector, contributing to better planning and reflection in the face of long-horizon tasks in Minecraft.
1 code implementation • 28 Jul 2024 • Letian Shi, Qi Lv, Xiang Deng, Liqiang Nie
To address the real-world egocentric task planning problem, we introduce a novel planning framework which comprises three stages: long-term memory Extraction, context-aware Planning, and multi-iteration Decision, named EPD.
no code implementations • 24 Jul 2024 • Yongqi Li, Hongru Cai, Wenjie Wang, Leigang Qu, Yinwei Wei, Wenjie Li, Liqiang Nie, Tat-Seng Chua
Despite its great potential, existing generative approaches are limited due to the following issues: insufficient visual information in identifiers, misalignment with high-level semantics, and learning gap towards the retrieval target.
1 code implementation • 19 Jul 2024 • Renshan Zhang, Yibo Lyu, Rui Shao, Gongwei Chen, Weili Guan, Liqiang Nie
Secondly, we present a token-level sampling method that efficiently captures the most informative tokens by delving into the correlation between the [CLS] token and patch tokens.
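A rough sketch of the general idea, assuming ViT-style outputs with a [CLS] token at index 0: keep the patch tokens most similar to [CLS]. The similarity measure and keep ratio are illustrative choices, not the paper's exact procedure.

```python
# Sketch of token-level sampling that keeps the patch tokens most correlated
# with the [CLS] token; shapes and thresholds are illustrative.
import torch

def select_informative_tokens(tokens, keep_ratio=0.25):
    """tokens: (B, 1 + N, D) with the [CLS] token at index 0.
    Returns the CLS token plus the top-k patch tokens by cosine similarity to CLS."""
    cls_tok, patch_toks = tokens[:, :1], tokens[:, 1:]            # (B,1,D), (B,N,D)
    sim = torch.nn.functional.cosine_similarity(patch_toks, cls_tok, dim=-1)  # (B,N)
    k = max(1, int(patch_toks.size(1) * keep_ratio))
    top_idx = sim.topk(k, dim=1).indices                          # (B,k)
    gathered = torch.gather(
        patch_toks, 1, top_idx.unsqueeze(-1).expand(-1, -1, patch_toks.size(-1))
    )
    return torch.cat([cls_tok, gathered], dim=1)                  # (B, 1+k, D)

vit_out = torch.randn(2, 1 + 196, 768)            # e.g. a 14x14 patch grid
print(select_informative_tokens(vit_out).shape)   # torch.Size([2, 50, 768])
```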
1 code implementation • 17 Jul 2024 • Leyang Shen, Gongwei Chen, Rui Shao, Weili Guan, Liqiang Nie
In this paper, we propose a mixture of multimodal experts (MoME) to mitigate task interference and obtain a generalist MLLM.
1 code implementation • 8 Jul 2024 • Xiaojie Li, Yibo Yang, Jianlong Wu, Bernard Ghanem, Liqiang Nie, Min Zhang
The dual design enables the model to maintain the robust features of base classes, while adaptively learning distinctive feature shifts for novel classes.
class-incremental learning, Few-Shot Class-Incremental Learning, +3
no code implementations • 28 Jun 2024 • Wenliang Zhong, Haoyu Tang, Qinghai Zheng, Mingzhu Xu, Yupeng Hu, Liqiang Nie
To address these issues, we offer a new perspective on understanding the essence of Dataset Distillation and MTT through a simple transformation of the objective function, and introduce a novel method called Matching Convexified Trajectory (MCT), which aims to provide better guidance for the student trajectory.
1 code implementation • 22 Jun 2024 • Haoyu Zhang, Yuquan Xie, Yisen Feng, Zaijing Li, Meng Liu, Liqiang Nie
Then, in Inference-guided Answering, HCQA utilizes this hierarchical information to reason about and answer the given question.
1 code implementation • 22 Jun 2024 • Yisen Feng, Haoyu Zhang, Yuquan Xie, Zaijing Li, Meng Liu, Liqiang Nie
In this report, we present our approach for the Natural Language Query track and Goal Step track of the Ego4D Episodic Memory Benchmark at CVPR 2024.
no code implementations • 17 Jun 2024 • Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang
In this survey, we review the progress in exploring human preference learning for LLMs from a preference-centered perspective, covering the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.
no code implementations • 9 Jun 2024 • Leigang Qu, Haochuan Li, Tan Wang, Wenjie Wang, Yongqi Li, Liqiang Nie, Tat-Seng Chua
How humans can effectively and efficiently acquire images has always been a perennial question.
1 code implementation • 8 Jun 2024 • Qi Lv, Xiang Deng, Gongwei Chen, Michael Yu Wang, Liqiang Nie
To capture the relationship among RTG-state-action triplets, a fine-grained SSM module is designed and integrated into the original coarse-grained SSM in Mamba, resulting in a novel Mamba architecture tailored for offline RL.
1 code implementation • 7 Jun 2024 • Yibo Yang, Xiaojie Li, Zhongzhu Zhou, Shuaiwen Leon Song, Jianlong Wu, Liqiang Nie, Bernard Ghanem
For the latter, we use the instruction data from the fine-tuning task, such as math or coding, to orientate the decomposition and train the largest $r$ components that most correspond to the task to learn.
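The following sketch illustrates one plausible reading of training only the largest $r$ components: split a frozen linear weight via SVD into a trainable rank-r part and a frozen residual. The task-data "orientation" of the decomposition described above is omitted; a plain SVD stands in for it.

```python
# Sketch of splitting a frozen weight into a trainable top-r part plus a frozen
# residual via SVD; a plain SVD is used here for illustration only.
import torch
import torch.nn as nn

class SVDSplitLinear(nn.Module):
    def __init__(self, linear: nn.Linear, r: int = 8):
        super().__init__()
        W = linear.weight.data                      # (out, in)
        U, S, Vh = torch.linalg.svd(W, full_matrices=False)
        # Trainable low-rank factors from the r largest singular values.
        self.A = nn.Parameter(U[:, :r] * S[:r])     # (out, r)
        self.B = nn.Parameter(Vh[:r])               # (r, in)
        # Frozen residual carrying the remaining components.
        self.register_buffer("W_res", W - self.A.data @ self.B.data)
        self.register_buffer("bias", linear.bias.data if linear.bias is not None
                             else torch.zeros(W.size(0)))

    def forward(self, x):
        return x @ (self.W_res + self.A @ self.B).t() + self.bias

layer = SVDSplitLinear(nn.Linear(64, 32), r=4)
print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 4*(32+64) = 384
```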
no code implementations • 27 May 2024 • Zhenyang Li, Yangyang Guo, Kejie Wang, Xiaolin Chen, Liqiang Nie, Mohan Kankanhalli
Visual Commonsense Reasoning (VCR) calls for explanatory reasoning behind question answering over visual scenes.
no code implementations • ACM Transactions on Information Systems 2024 • Haitao Shi, Meng Liu, Xiaoxuan Mu, Xuemeng Song, Yupeng Hu, Liqiang Nie
To reduce the negative impact of noisy correspondence, we propose a novel model that first transforms the noisy correspondence filtering problem into a similarity distribution modeling problem by exploiting the powerful capabilities of pre-trained models.
Cross-modal retrieval with noisy correspondence, Image-text matching, +1
no code implementations • 25 Apr 2024 • Han Liu, Yinwei Wei, Xuemeng Song, Weili Guan, Yuan-Fang Li, Liqiang Nie
Multimodal recommendation aims to recommend user-preferred candidates based on her/his historically interacted items and associated multimodal information.
no code implementations • 25 Apr 2024 • Yongqi Li, Xinyu Lin, Wenjie Wang, Fuli Feng, Liang Pang, Wenjie Li, Liqiang Nie, Xiangnan He, Tat-Seng Chua
With the information explosion on the Web, search and recommendation are foundational infrastructures to satisfying users' information needs.
1 code implementation • 21 Apr 2024 • Gensheng Pei, Yazhou Yao, Jianbo Jiao, Wenguan Wang, Liqiang Nie, Jinhui Tang
To achieve this objective, we present a unified self-supervised approach to learn visual representations of static-dynamic feature similarity.
no code implementations • 18 Apr 2024 • Zunran Wang, Zhonghua Li, Wei Shen, Qi Ye, Liqiang Nie
To effectively enrich the feature context representations of term weight, the Feature Context Module (FCM) is introduced, which leverages the power of BERT's representation to determine dynamic weights for each element in the embedding.
1 code implementation • 16 Apr 2024 • Fan Liu, Shuai Zhao, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli
This model performs high-order graph convolution on cluster-specific graphs, which are constructed by capturing the multiple interests of users and identifying the common interests among them.
no code implementations • 12 Mar 2024 • Linmei Hu, Hongyu He, Duokang Wang, Ziwang Zhao, Yingxia Shao, Liqiang Nie
Furthermore, we utilize the LLM to enrich the information of personality labels for enhancing the detection performance.
no code implementations • CVPR 2024 • Leigang Qu, Wenjie Wang, Yongqi Li, Hanwang Zhang, Liqiang Nie, Tat-Seng Chua
We present a discriminative adapter built on T2I models to probe their discriminative abilities on two representative tasks and leverage discriminative fine-tuning to improve their text-image alignment.
no code implementations • 19 Feb 2024 • Yuxuan Yue, Zhihang Yuan, Haojie Duanmu, Sifan Zhou, Jianlong Wu, Liqiang Nie
Large Language Models (LLMs) face significant deployment challenges due to their substantial memory requirements and the computational demands of auto-regressive text generation process.
no code implementations • 18 Feb 2024 • Federico Becattini, Xiaolin Chen, Andrea Puccia, Haokun Wen, Xuemeng Song, Liqiang Nie, Alberto del Bimbo
Recommending fashion items often leverages rich user profiles and makes targeted suggestions based on past history and previous purchases.
1 code implementation • 16 Feb 2024 • Yongqi Li, Zhen Zhang, Wenjie Wang, Liqiang Nie, Wenjie Li, Tat-Seng Chua
Generative retrieval is a promising new paradigm in text retrieval that generates identifier strings of relevant passages as the retrieval target.
no code implementations • 16 Feb 2024 • Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua
Building upon this capability, we propose to enable multimodal large language models (MLLMs) to memorize and recall images within their parameters.
1 code implementation • 6 Feb 2024 • Kun Ouyang, Liqiang Jing, Xuemeng Song, Meng Liu, Yupeng Hu, Liqiang Nie
We then develop a module named Joint Cross Attention-based Sentiment Inference (JCA-SI) by extending the multimodal sentiment analysis model JCA to derive the joint sentiment label for each video-audio clip.
no code implementations • 3 Feb 2024 • Cunxiao Du, Jing Jiang, Xu Yuanchen, Jiawei Wu, Sicheng Yu, Yongqi Li, Shenggui Li, Kai Xu, Liqiang Nie, Zhaopeng Tu, Yang You
Speculative decoding is a relatively new decoding framework that leverages small and efficient draft models to reduce the latency of LLMs.
1 code implementation • 29 Jan 2024 • Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli
In particular, this dataset leverages 30,000 carefully collected textual and visual prompts, ensuring the synthesis of images with both high fidelity and semantic consistency.
1 code implementation • 20 Jan 2024 • Tao Chen, Yazhou Yao, Xingguo Huang, Zechao Li, Liqiang Nie, Jinhui Tang
In this paper, we propose spatial structure constraints (SSC) for weakly supervised semantic segmentation to alleviate the unwanted object over-activation of attention expansion.
no code implementations • 12 Jan 2024 • Zaijing Li, Gongwei Chen, Rui Shao, Yuquan Xie, Dongmei Jiang, Liqiang Nie
In this paper, we propose the Emotional Chain-of-Thought (ECoT), a plug-and-play prompting method that enhances the performance of LLMs on various emotional generation tasks by aligning with human emotional intelligence guidelines.
1 code implementation • CVPR 2024 • Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie
1) Progressive incorporation of fine-grained spatial-aware visual knowledge.
1 code implementation • CVPR 2024 • Xiaoqian Lv, Shengping Zhang, Chenyang Wang, Yichen Zheng, Bineng Zhong, Chongyi Li, Liqiang Nie
Existing joint low-light enhancement and deblurring methods learn pixel-wise mappings from paired synthetic data, which results in limited generalization in real-world scenes.
no code implementations • CVPR 2024 • Chenyang Wang, Zerong Zheng, Tao Yu, Xiaoqian Lv, Bineng Zhong, Shengping Zhang, Liqiang Nie
In this paper, we propose a novel framework, DiffPerformer, to synthesize high-fidelity and temporally consistent human video.
no code implementations • 26 Dec 2023 • Fan Liu, Yaqi Liu, Huilin Chen, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli
Recommendation systems harness user-item interactions like clicks and reviews to learn their representations.
no code implementations • 22 Dec 2023 • Zhenyang Li, Fan Liu, Yinwei Wei, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli
To obtain robust and independent representations for each factor associated with a specific attribute, we first disentangle the representations of features both within and across different modalities.
no code implementations • 15 Dec 2023 • Liqiang Jing, Xuemeng Song, Xinxing Zu, Na Zheng, Zhongzhou Zhao, Liqiang Nie
Existing sign language translation methods follow a two-stage pipeline: first converting the sign language video to a gloss sequence (i.e., Sign2Gloss) and then translating the generated gloss sequence into a spoken language sentence (i.e., Gloss2Text).
1 code implementation • 12 Dec 2023 • Yupeng Hu, Han Jiang, Hao Liu, Kun Wang, Haoyu Tang, Liqiang Nie
Recently, temporal action localization (TAL) has garnered significant interest in the information retrieval community.
1 code implementation • CVPR 2024 • Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang, Boyao Zhou, Boning Liu, Shengping Zhang, Liqiang Nie
We present GaussianAvatar, an efficient approach to creating realistic human avatars with dynamic 3D appearances from a single video.
1 code implementation • CVPR 2024 • Shunyuan Zheng, Boyao Zhou, Ruizhi Shao, Boning Liu, Shengping Zhang, Liqiang Nie, Yebin Liu
We present a new approach, termed GPS-Gaussian, for synthesizing novel views of a character in a real-time manner.
2 code implementations • 1 Dec 2023 • Xiao Wang, Yaoyu Li, Tian Gan, Zheng Zhang, Jingjing Lv, Liqiang Nie
Recent advancements in video-language understanding have been established on the foundation of image-text models, resulting in promising outcomes due to the shared knowledge between images and videos.
Ranked #9 on Video Captioning on MSR-VTT (using extra training data)
no code implementations • 26 Nov 2023 • Yu-Wei Zhan, Fan Liu, Xin Luo, Liqiang Nie, Xin-Shun Xu, Mohan Kankanhalli
To capitalize on these rich Human-Centric Visual Cues, we propose a novel approach named HCVC for HOI detection.
1 code implementation • 20 Nov 2023 • Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie
1) Progressive incorporation of fine-grained spatial-aware visual knowledge.
no code implementations • 1 Nov 2023 • Mengxia Wu, Min Cao, Yang Bai, Ziyin Zeng, Chen Chen, Liqiang Nie, Min Zhang
In this paper, we make the first empirical study of frame selection for TVR.
1 code implementation • 17 Oct 2023 • Yangyang Guo, Fangkai Jiao, Zhiqi Shen, Liqiang Nie, Mohan Kankanhalli
Teaching Visual Question Answering (VQA) models to refrain from answering unanswerable questions is necessary for building a trustworthy AI system.
2 code implementations • 11 Oct 2023 • Haoyu Zhang, Meng Liu, YaoWei Wang, Da Cao, Weili Guan, Liqiang Nie
In response to these challenges, we present an iterative search and reasoning framework, which consists of a textual encoder, a visual encoder, and a generator.
1 code implementation • 28 Sep 2023 • Yangyang Guo, Haoyu Zhang, Yongkang Wong, Liqiang Nie, Mohan Kankanhalli
Learning a versatile language-image model is computationally prohibitive under a limited computing budget.
1 code implementation • 25 Sep 2023 • Rui Shao, Tianxing Wu, Jianlong Wu, Liqiang Nie, Ziwei Liu
HAMMER performs 1) manipulation-aware contrastive learning between two uni-modal encoders as shallow manipulation reasoning, and 2) modality-aware cross-attention by multi-modal aggregator as deep manipulation reasoning.
no code implementations • 17 Aug 2023 • Zhonghua Zheng, Lizi Liao, Yang Deng, Liqiang Nie
The integration of emotional support into various conversational scenarios presents profound societal benefits, such as social interactions, mental health counseling, and customer service.
1 code implementation • 14 Aug 2023 • Tian Gan, Xiao Wang, Yan Sun, Jianlong Wu, Qingpei Guo, Liqiang Nie
The goal of TSGSV is to evaluate the relevance between a video stream and a given sentence query.
1 code implementation • 9 Aug 2023 • Leigang Qu, Shengqiong Wu, Hao Fei, Liqiang Nie, Tat-Seng Chua
Afterward, we propose a fine-grained object-interaction diffusion method to synthesize high-faithfulness images conditioned on the prompt and the automatically generated layout.
1 code implementation • 6 Aug 2023 • Peiguang Jing, Xianyi Liu, Ji Wang, Yinwei Wei, Liqiang Nie, Yuting Su
Emotion distribution learning has gained increasing attention with the tendency to express emotions through images.
1 code implementation • 6 Aug 2023 • Fan Liu, Huilin Chen, Zhiyong Cheng, Liqiang Nie, Mohan Kankanhalli
The teacher model first extracts rich modality features from the generic modality feature by considering both the semantic information of items and the complementary information of multiple modalities.
1 code implementation • 27 Jul 2023 • Harry Cheng, Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Mohan Kankanhalli
Training an effective video action recognition model poses significant computational challenges, particularly under limited resource budgets.
no code implementations • 24 Jul 2023 • Harry Cheng, Yangyang Guo, Tianyi Wang, Liqiang Nie, Mohan Kankanhalli
The existing deepfake detection methods have reached a bottleneck in generalizing to unseen forgeries and manipulation approaches.
1 code implementation • 20 Jul 2023 • Teng Sun, Juntong Ni, Wenjie Wang, Liqiang Jing, Yinwei Wei, Liqiang Nie
To this end, we propose a general debiasing framework based on Inverse Probability Weighting (IPW), which adaptively assigns small weights to the samples with larger bias (i.e., the more severe spurious correlations).
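A minimal sketch of inverse-probability weighting under simplified assumptions: samples whose label is already well predicted by a bias-only (e.g., text-only) model receive smaller weights. The propensity estimate and normalization below are placeholders rather than the paper's exact scheme.

```python
# Sketch of an IPW-style debiasing loss: weight each sample's cross-entropy
# by the inverse of the bias-only model's propensity for its label.
import torch
import torch.nn.functional as F

def ipw_loss(logits, biased_logits, labels, eps=1e-6):
    with torch.no_grad():
        p_biased = F.softmax(biased_logits, dim=-1).gather(
            1, labels.unsqueeze(1)).squeeze(1)   # propensity of the true label
        w = 1.0 / (p_biased + eps)               # confident bias -> small weight
        w = w / w.mean()                         # keep the loss scale stable
    ce = F.cross_entropy(logits, labels, reduction="none")
    return (w * ce).mean()

logits = torch.randn(16, 3)           # multimodal model
biased_logits = torch.randn(16, 3)    # bias-only (e.g., text-only) model
labels = torch.randint(0, 3, (16,))
print(ipw_loss(logits, biased_logits, labels))
```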
1 code implementation • SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval 2023 • Yinwei Wei, Wenqi Liu, Fan Liu, Xiang Wang, Liqiang Nie, Tat-Seng Chua
Considering its challenges in effectiveness and efficiency, we propose a novel Transformer-based recommendation model, termed as Light Graph Transformer model (LightGT).
Ranked #1 on Multi-Media Recommendation on Kwai (Recall@10 metric)
1 code implementation • 29 Jun 2023 • Liqiang Jing, Xuemeng Song, Kun Ouyang, Mengzhao Jia, Liqiang Nie
Multimodal Sarcasm Explanation (MuSE) is a new yet challenging task, which aims to generate a natural language sentence for a multimodal social post (an image as well as its caption) to explain why it contains sarcasm.
no code implementations • 13 Jun 2023 • Meng Liu, Liqiang Nie, Yunxiao Wang, Meng Wang, Yong Rui
Video moment localization, also known as video moment retrieval, aims to search for a target segment within a video described by a given natural language query.
2 code implementations • International Journal of Computer Vision 2023 • Shengping Zhang, Xianzhu Liu, Haozhe Xie, Liqiang Nie, Huiyu Zhou, DaCheng Tao, Xuelong Li
It exploits the repetitive geometric structures in common 3D objects to recover the complete shapes, which contains three sub-networks: geometric patch network, structure transformation network, and detail refinement network.
Ranked #4 on Point Cloud Completion on ShapeNet
1 code implementation • 1 Jun 2023 • Rui Shao, Tianxing Wu, Liqiang Nie, Ziwei Liu
Unlike existing deepfake detection methods merely focusing on low-level forgery patterns, the forgery detection process of our model can be regularized by generalizable high-level semantics from a pre-trained ViT and adapted by global and local low-level forgeries of deepfake data.
1 code implementation • 23 May 2023 • Yang Bai, Min Cao, Daming Gao, Ziqiang Cao, Chen Chen, Zhenfeng Fan, Liqiang Nie, Min Zhang
RA offsets the overfitting risk by introducing a novel positive relation detection task (i.e., learning to distinguish strong and weak positive pairs).
Ranked #3 on Text based Person Retrieval on RSTPReid
no code implementations • 22 May 2023 • Yang Bai, Jingyao Wang, Min Cao, Chen Chen, Ziqiang Cao, Liqiang Nie, Min Zhang
Text-based person search (TBPS) aims to retrieve the images of the target person from a large image gallery based on a given natural language description.
no code implementations • 17 May 2023 • Xiaolin Chen, Xuemeng Song, Yinwei Wei, Liqiang Nie, Tat-Seng Chua
Thereafter, considering that the attribute knowledge and relation knowledge can benefit the responding to different levels of questions, we design a multi-level knowledge composition module in MDS-S2 to obtain the latent composed response representation.
no code implementations • 5 May 2023 • Liqiang Jing, Xuemeng Song, Xuming Lin, Zhongzhou Zhao, Wei Zhou, Liqiang Nie
This task is non-trivial, due to three challenges: the logic of the generated text, unstructured style reference, and biased training samples.
1 code implementation • 25 Apr 2023 • Leigang Qu, Meng Liu, Wenjie Wang, Zhedong Zheng, Liqiang Nie, Tat-Seng Chua
Image-text retrieval aims to bridge the modality gap and retrieve cross-modal content based on semantic similarities.
no code implementations • 24 Apr 2023 • Rui Hao, Linmei Hu, Weijian Qi, Qingliu Wu, Yirui Zhang, Liqiang Nie
Dialogue-based language models mark a huge milestone in the field of artificial intelligence with their impressive ability to interact with users and to handle a series of challenging tasks prompted by customized instructions.
1 code implementation • 3 Apr 2023 • Qinglin Liu, Xiaoqian Lv, Quanling Meng, Zonglin Li, Xiangyuan Lan, Shuo Yang, Shengping Zhang, Liqiang Nie
Furthermore, AEMatter leverages a large image training strategy to assist the network in learning context aggregation from data.
Ranked #1 on Image Matting on Composition-1K
no code implementations • 30 Mar 2023 • Chengliang Liu, Jie Wen, Yong Xu, Bob Zhang, Liqiang Nie, Min Zhang
The application of multi-view contrastive learning has further facilitated this process; however, existing multi-view contrastive learning methods crudely separate so-called negative pairs, which largely results in separating samples belonging to the same or similar categories.
1 code implementation • 15 Mar 2023 • Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie
Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.
no code implementations • 14 Mar 2023 • Min Cao, Yang Bai, Jingyao Wang, Ziqiang Cao, Liqiang Nie, Min Zhang
The proposed framework equipped with only two embedding layers achieves $O(1)$ querying time complexity, while improving the retrieval efficiency and keeping its performance, when applied prior to the common image-text retrieval methods.
no code implementations • 13 Feb 2023 • Song Wu, Yazhou Ren, Aodi Yang, Xinyue Chen, Xiaorong Pu, Jing He, Liqiang Nie, Philip S. Yu
In this survey, we investigate the main contributions of deep learning applications using medical images in fighting against COVID-19 from the aspects of image classification, lesion localization, and severity quantification, and review different deep learning architectures and some image preprocessing techniques for achieving a preciser diagnosis.
no code implementations • 4 Feb 2023 • Zhenyang Li, Yangyang Guo, Kejie Wang, Fan Liu, Liqiang Nie, Mohan Kankanhalli
Visual Commonsense Reasoning (VCR) remains a significant yet challenging research problem in the realm of visual reasoning.
1 code implementation • 13 Jan 2023 • Han Liu, Yinwei Wei, Jianhua Yin, Liqiang Nie
Towards this end, existing methods tend to code users by modeling their Hamming similarities with the items they historically interact with, which are termed as the first-order similarities in this work.
1 code implementation • CVPR 2023 • Tian Gan, Qing Wang, Xingning Dong, Xiangyuan Ren, Liqiang Nie, Qingpei Guo
Though there are certain methods studying the Chinese video-text pre-training, they pre-train their models on private datasets whose videos and text are unavailable.
1 code implementation • CVPR 2023 • Jianlong Wu, Haozhe Yang, Tian Gan, Ning Ding, Feijun Jiang, Liqiang Nie
In the meantime, we make full use of the structured information in the hierarchical labels to learn an accurate affinity graph for contrastive learning.
1 code implementation • 22 Dec 2022 • Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, Liqiang Nie
The booming development and huge market of micro-videos bring new e-commerce channels for merchants.
1 code implementation • 20 Dec 2022 • Yinwei Wei, Xiang Wang, Liqiang Nie, Shaoyu Li, Dingxian Wang, Tat-Seng Chua
Knowledge Graph (KG), as a side-information, tends to be utilized to supplement the collaborative filtering (CF) based recommendation model.
no code implementations • 12 Dec 2022 • Linmei Hu, Ziwang Zhao, Weijian Qi, Xuemeng Song, Liqiang Nie
Additionally, based on the designed image-text matching-aware co-attention mechanism, we propose to build two co-attention networks respectively centered on text and image for mutual knowledge distillation to improve fake news detection.
no code implementations • 11 Nov 2022 • Linmei Hu, Zeyi Liu, Ziwang Zhao, Lei Hou, Liqiang Nie, Juanzi Li
We introduce appropriate taxonomies respectively for Natural Language Understanding (NLU) and Natural Language Generation (NLG) to highlight these two main tasks of NLP.
1 code implementation • 27 Sep 2022 • Fan Liu, Zhiyong Cheng, Huilin Chen, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli
At the item level, a synthetic data generation module is proposed to generate a synthetic item corresponding to the selected item based on the user's preferences.
1 code implementation • 12 Sep 2022 • Tianyi Wang, Harry Cheng, Kam Pui Chow, Liqiang Nie
Most existing deep learning methods mainly focus on local features and relations within the face image using convolutional neural networks as a backbone.
no code implementations • 24 Jul 2022 • Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan
Several studies have recently pointed out that existing Visual Question Answering (VQA) models heavily suffer from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents.
1 code implementation • 24 Jul 2022 • Teng Sun, Wenjie Wang, Liqiang Jing, Yiran Cui, Xuemeng Song, Liqiang Nie
Inspired by this, we devise a model-agnostic counterfactual framework for multimodal sentiment analysis, which captures the direct effect of textual modality via an extra text model and estimates the indirect one by a multimodal model.
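The sketch below illustrates the generic counterfactual-inference pattern this describes, under simplified assumptions: subtract a scaled text-only direct effect from the total effect at prediction time. The fusion head, bias branch, and scaling factor alpha are illustrative, not the paper's architecture.

```python
# Sketch of counterfactual debiasing at inference: the total effect from the
# multimodal prediction minus (a scaled copy of) the text-only direct effect.
import torch
import torch.nn as nn

class CounterfactualHead(nn.Module):
    def __init__(self, dim=128, num_classes=3, alpha=1.0):
        super().__init__()
        self.multimodal = nn.Linear(dim * 2, num_classes)   # fused text + visual
        self.text_only = nn.Linear(dim, num_classes)        # captures text bias
        self.alpha = alpha

    def forward(self, text_feat, visual_feat):
        fused = torch.cat([text_feat, visual_feat], dim=-1)
        total_effect = self.multimodal(fused) + self.text_only(text_feat)
        direct_text_effect = self.text_only(text_feat)
        # Debiased prediction = total effect minus the text-only direct effect.
        return total_effect - self.alpha * direct_text_effect

head = CounterfactualHead()
out = head(torch.randn(4, 128), torch.randn(4, 128))
print(out.shape)   # torch.Size([4, 3])
```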
no code implementations • 21 Jul 2022 • Yudong Han, Jianhua Yin, Jianlong Wu, Yinwei Wei, Liqiang Nie
Visual Question Answering (VQA) is fundamentally compositional in nature, and many questions are simply answered by decomposing them into modular sub-problems.
no code implementations • 16 Jul 2022 • Xiaolin Chen, Xuemeng Song, Liqiang Jing, Shuo Li, Linmei Hu, Liqiang Nie
To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation.
1 code implementation • 13 Jul 2022 • Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Relying on the premise that the performance of a binary neural network can be largely restored with eliminated quantization error between full-precision weight vectors and their corresponding binary vectors, existing works of network binarization frequently adopt the idea of model robustness to reach the aforementioned objective.
1 code implementation • 6 Jul 2022 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.
1 code implementation • 30 Jun 2022 • Yangyang Guo, Liqiang Nie, Yongkang Wong, Yibing Liu, Zhiyong Cheng, Mohan Kankanhalli
On the other hand, pertaining to implicit knowledge, multi-modal implicit knowledge for knowledge-based VQA remains largely unexplored.
1 code implementation • 29 Apr 2022 • Wenjie Wang, Fuli Feng, Liqiang Nie, Tat-Seng Chua
both accuracy and diversity.
no code implementations • 28 Mar 2022 • Min Cao, Shiping Li, Juntao Li, Liqiang Nie, Min Zhang
On top of this, the efficiency-focused study on the ITR system is introduced as the third perspective.
1 code implementation • CVPR 2022 • Xingning Dong, Tian Gan, Xuemeng Song, Jianlong Wu, Yuan Cheng, Liqiang Nie
Scene Graph Generation, which generally follows a regular encoder-decoder pipeline, aims to first encode the visual contents within the given image and then parse them into a compact summary graph.
Ranked #1 on Unbiased Scene Graph Generation on Visual Genome (mR@20 metric)
1 code implementation • 10 Mar 2022 • Fan Liu, Huilin Chen, Zhiyong Cheng, AnAn Liu, Liqiang Nie, Mohan Kankanhalli
However, existing methods ignore the fact that different modalities contribute differently towards a user's preference on various factors of an item.
no code implementations • 4 Mar 2022 • Harry Cheng, Yangyang Guo, Tianyi Wang, Qi Li, Xiaojun Chang, Liqiang Nie
To this end, a voice-face matching method is devised to measure the matching degree between the two.
1 code implementation • Findings (ACL) 2022 • Fangkai Jiao, Yangyang Guo, Xuemeng Song, Liqiang Nie
Logical reasoning is of vital importance to natural language understanding.
Ranked #3 on Reading Comprehension on ReClor
1 code implementation • 25 Feb 2022 • Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan Kankanhalli, Alberto del Bimbo
From the results on four datasets regarding the above three tasks, our method yields remarkable performance improvements compared with the baselines, demonstrating its superiority on reducing the modality bias problem.
1 code implementation • 25 Feb 2022 • Zhenyang Li, Yangyang Guo, Kejie Wang, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli
Given that our framework is model-agnostic, we apply it to the existing popular baselines and validate its effectiveness on the benchmark dataset.
no code implementations • 30 Jan 2022 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier analysis based MBP compared to other traditional MBP algorithms.
1 code implementation • IEEE Transactions on Multimedia (TMM) 2021 • Qifan Wang, Yinwei Wei, Jianhua Yin, Jianlong Wu, Xuemeng Song, Liqiang Nie
Specifically, we first introduce a single-modal representation learning module, which performs graph operations on the user-microvideo graph in each modality to capture single-modal user preferences on different modalities.
Ranked #4 on Multi-modal Recommendation on Amazon Clothing
1 code implementation • 2 Dec 2021 • Wenjie Wang, Fuli Feng, Xiangnan He, Liqiang Nie, Tat-Seng Chua
Inspired by this observation, we propose a new training strategy named Adaptive Denoising Training (ADT), which adaptively prunes the noisy interactions by two paradigms (i.e., Truncated Loss and Reweighted Loss).
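A hedged sketch of the two paradigms named above: a truncated loss that drops the highest-loss interactions in a batch, and a reweighted loss that down-weights them smoothly. The drop rate, weighting form, and schedule are illustrative, not the paper's exact settings.

```python
# Illustrative truncated / reweighted BCE losses for denoising implicit feedback.
import torch
import torch.nn.functional as F

def truncated_bce(scores, labels, drop_rate=0.2):
    """Discard the top `drop_rate` fraction of highest-loss samples (likely noisy)."""
    loss = F.binary_cross_entropy_with_logits(scores, labels, reduction="none")
    keep = max(1, int(loss.numel() * (1 - drop_rate)))
    kept_loss, _ = loss.topk(keep, largest=False)   # keep the smallest losses
    return kept_loss.mean()

def reweighted_bce(scores, labels, beta=0.5):
    """Down-weight hard (low-score) positives instead of dropping them."""
    loss = F.binary_cross_entropy_with_logits(scores, labels, reduction="none")
    with torch.no_grad():
        p = torch.sigmoid(scores)
        w = torch.where(labels > 0.5, p.clamp(min=1e-4) ** beta,
                        torch.ones_like(p))         # smaller weight for low-score positives
    return (w * loss).mean()

scores, labels = torch.randn(32), torch.randint(0, 2, (32,)).float()
print(truncated_bce(scores, labels), reweighted_bce(scores, labels))
```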
1 code implementation • 31 Oct 2021 • Ziyang Ma, Xianjing Han, Xuemeng Song, Yiran Cui, Liqiang Nie
Temporal Moment Localization (TML) in untrimmed videos is a challenging task in the field of multimedia, which aims at localizing the start and end points of the activity in the video, described by a sentence query.
1 code implementation • 12 Oct 2021 • Zongmeng Zhang, Xianjing Han, Xuemeng Song, Yan Yan, Liqiang Nie
Towards this end, in this work, we propose a Multi-modal Interaction Graph Convolutional Network (MIGCN), which jointly explores the complex intra-modal relations and inter-modal interactions residing in the video and sentence query to facilitate the understanding and semantic correspondence capture of the video and sentence query.
no code implementations • 29 Sep 2021 • Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan
Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.
no code implementations • ICCV 2021 • Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan
Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.
no code implementations • 17 Aug 2021 • Xiangkun Yin, Yangyang Guo, Liqiang Nie, Zhiyong Cheng
In addition, we empirically prove that collaborative filtering and semantic matching are complementary to each other in product search performance enhancement.
1 code implementation • 12 Jul 2021 • Yinwei Wei, Xiang Wang, Qi Li, Liqiang Nie, Yan Li, Xuanping Li, Tat-Seng Chua
It aims to maximize the mutual dependencies between item content and collaborative signals.
1 code implementation • ACM Special Interest Group on Information Retrieval 2021 • Leigang Qu, Meng Liu, Jianlong Wu, Zan Gao, Liqiang Nie
To address these issues, we develop a novel modality interaction modeling network based upon the routing mechanism, which is the first unified and dynamic multimodal interaction framework towards image-text retrieval.
1 code implementation • 8 Jun 2021 • Han Liu, Yangyang Guo, Jianhua Yin, Zan Gao, Liqiang Nie
To be specific, in this model, positive and negative reviews are separately gathered and utilized to model the user-preferred and user-rejected aspects, respectively.
1 code implementation • Findings (ACL) 2021 • Fangkai Jiao, Yangyang Guo, Yilin Niu, Feng Ji, Feng-Lin Li, Liqiang Nie
Pre-trained Language Models (PLMs) have achieved great success on Machine Reading Comprehension (MRC) over the past few years.
1 code implementation • 5 May 2021 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto del Bimbo
Experimental results demonstrate that our adapted margin cosine loss can greatly enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of the answer feature space learning.
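For reference, the sketch below shows the general large-margin cosine (CosFace-style) classification loss over answer embeddings that such an adaptation builds on; the scale and margin values and the answer-embedding setup are illustrative assumptions, not the paper's configuration.

```python
# Sketch of a margin cosine loss over answer embeddings for VQA classification.
import torch
import torch.nn.functional as F

def margin_cosine_loss(feat, answer_emb, labels, s=16.0, m=0.2):
    """feat: (B, D) fused question-image features; answer_emb: (A, D)."""
    cos = F.normalize(feat, dim=-1) @ F.normalize(answer_emb, dim=-1).t()  # (B, A)
    margin = torch.zeros_like(cos).scatter_(1, labels.unsqueeze(1), m)
    return F.cross_entropy(s * (cos - margin), labels)  # subtract m only at the target

feat = torch.randn(8, 256)
answer_emb = torch.randn(1000, 256)     # one embedding per candidate answer
labels = torch.randint(0, 1000, (8,))
print(margin_cosine_loss(feat, answer_emb, labels))
```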
no code implementations • 17 Apr 2021 • Yongqi Li, Wenjie Li, Liqiang Nie
Moreover, in order to collect more complementary information in the historical context, we also propose to incorporate the multi-round relevance feedback technique to explore the impact of the retrieval context on current question understanding.
Conversational Question Answering, Open-Domain Question Answering, +1
1 code implementation • ICCV 2021 • Huasong Zhong, Jianlong Wu, Chong Chen, Jianqiang Huang, Minghua Deng, Liqiang Nie, Zhouchen Lin, Xian-Sheng Hua
On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.
1 code implementation • 22 Feb 2021 • Zhiyong Cheng, Fan Liu, Shenghan Mei, Yangyang Guo, Lei Zhu, Liqiang Nie
To demonstrate the effectiveness of our method, we design a light attention neural network to integrate both item-level and feature-level attention for neural ICF models.
1 code implementation • 19 Feb 2021 • Fan Liu, Zhiyong Cheng, Lei Zhu, Zan Gao, Liqiang Nie
To form the subgraphs, we design an unsupervised subgraph generation module, which can effectively identify users with common interests by exploiting both user feature and graph structure.
1 code implementation • 3 Feb 2021 • Yibing Liu, Yangyang Guo, Jianhua Yin, Xuemeng Song, Weifeng Liu, Liqiang Nie
However, recent studies have pointed out that the highlighted image regions from the visual attention are often irrelevant to the given question and answer, leading to model confusion for correct visual reasoning.
no code implementations • 18 Jan 2021 • Yongqi Li, Wenjie Li, Liqiang Nie
In the past years, Knowledge-Based Question Answering (KBQA), which aims to answer natural language questions using facts in a knowledge base, has been well developed.
1 code implementation • 11 Dec 2020 • Wenjie Wang, Ling-Yu Duan, Hao Jiang, Peiguang Jing, Xuemeng Song, Liqiang Nie
With the rising incidence of some diseases, such as obesity and diabetes, a healthy diet is arousing increasing attention.
1 code implementation • 30 Oct 2020 • Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang
Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.
1 code implementation • 20 Jun 2020 • Yangyang Guo, Zhiyong Cheng, Jiazheng Jing, Yanpeng Lin, Liqiang Nie, Meng Wang
Traditional FMs adopt the inner product to model the second-order interactions between different attributes, which are represented via feature vectors.
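For context, the standard FM identity lets the second-order interaction term be computed in linear time as 0.5 * ((sum_i v_i x_i)^2 - sum_i (v_i x_i)^2), summed over the latent dimensions. A small PyTorch rendering of that identity:

```python
# The classic factorization-machine second-order term, computed in O(k*d).
import torch

def fm_second_order(x, V):
    """x: (B, d) feature vector; V: (d, k) latent factors."""
    xv = x @ V                      # (B, k): sum_i v_i * x_i
    x2v2 = (x ** 2) @ (V ** 2)      # (B, k): sum_i (v_i * x_i)^2
    return 0.5 * (xv ** 2 - x2v2).sum(dim=1)   # (B,)

x = torch.randn(4, 10)
V = torch.randn(10, 8)
print(fm_second_order(x, V).shape)   # torch.Size([4])
```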
2 code implementations • 7 Jun 2020 • Wenjie Wang, Fuli Feng, Xiangnan He, Liqiang Nie, Tat-Seng Chua
In this work, we explore the central theme of denoising implicit feedback for recommender training.
no code implementations • 20 Mar 2020 • Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie
Considering the fact that for different users, the attributes of an item have different influence on their preference for this item, we design a novel attention mechanism to filter the message passed from an item to a target user by considering the attribute information.
no code implementations • IJCNLP 2019 • Linmei Hu, Luhao Zhang, Chuan Shi, Liqiang Nie, Weili Guan, Cheng Yang
Distantly-supervised relation extraction has proven to be effective to find relational facts from texts.
1 code implementation • ACM International Conference on Multimedia 2019 • Yinwei Wei, Xiang Wang, Liqiang Nie, Xiangnan He, Richang Hong, Tat-Seng Chua
Existing works on multimedia recommendation largely exploit multi-modal contents to enrich item representations, while less effort is made to leverage information interchange between users and items to enhance user representations and further capture user's fine-grained preferences on different modalities.
Ranked #1 on Multi-Media Recommendation on MovieLens 10M
1 code implementation • 27 Aug 2019 • Yinwei Wei, Zhiyong Cheng, Xuzheng Yu, Zhou Zhao, Lei Zhu, Liqiang Nie
The hashtags that a user provides to a post (e.g., a micro-video) are the ones which, in her mind, can well describe the post content she is interested in.
1 code implementation • 21 Aug 2019 • Fan Liu, Zhiyong Cheng, Changchang Sun, Yinglong Wang, Liqiang Nie, Mohan Kankanhalli
To tackle this problem, in this paper, we propose a novel Multimodal Attentive Metric Learning (MAML) method to model user diverse preferences for various items.
1 code implementation • 13 May 2019 • Yangyang Guo, Zhiyong Cheng, Liqiang Nie, Yibing Liu, Yinglong Wang, Mohan Kankanhalli
Benefiting from the advancement of computer vision, natural language processing and information retrieval techniques, visual question answering (VQA), which aims to answer questions about an image or a video, has received a lot of attention over the past few years.
1 code implementation • 23 Nov 2018 • Cunxiao Du, Zhaozheng Chin, Fuli Feng, Lei Zhu, Tian Gan, Liqiang Nie
To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task.
Ranked #4 on Text Classification on Yahoo! Answers
1 code implementation • ACL 2018 • Yansen Wang, Chen-Yi Liu, Minlie Huang, Liqiang Nie
Asking good questions in large-scale, open-domain conversational systems is quite significant yet remains largely untouched.
1 code implementation • 6 May 2018 • Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang
In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation.
no code implementations • 17 Apr 2018 • Xuemeng Song, Fuli Feng, Xianjing Han, Xin Yang, Wei Liu, Liqiang Nie
Nevertheless, existing studies overlook the rich valuable knowledge (rules) accumulated in fashion domain, especially the rules regarding clothing matching.
43 code implementations • WWW 2017 • Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, Tat-Seng Chua
When it comes to modeling the key factor in collaborative filtering -- the interaction between user and item features, they still resorted to matrix factorization and applied an inner product on the latent features of users and items.
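To make the contrast concrete, here is a minimal sketch in the spirit of neural collaborative filtering: an inner-product (matrix-factorization-like) branch combined with an MLP interaction branch. Layer sizes and the fusion are illustrative, not the paper's exact NeuMF configuration.

```python
# Minimal sketch contrasting the inner product with a learned MLP interaction.
import torch
import torch.nn as nn

class NeuralCF(nn.Module):
    def __init__(self, n_users, n_items, dim=32):
        super().__init__()
        self.user_emb = nn.Embedding(n_users, dim)
        self.item_emb = nn.Embedding(n_items, dim)
        self.mlp = nn.Sequential(
            nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, 1)
        )

    def forward(self, u, i):
        pu, qi = self.user_emb(u), self.item_emb(i)
        gmf = (pu * qi).sum(-1, keepdim=True)        # classic inner product
        mlp = self.mlp(torch.cat([pu, qi], dim=-1))  # learned interaction function
        return torch.sigmoid(gmf + mlp).squeeze(-1)

model = NeuralCF(n_users=100, n_items=500)
print(model(torch.tensor([1, 2]), torch.tensor([10, 20])))
```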
3 code implementations • 5 Jul 2017 • Shaohua Li, Xinxing Xu, Liqiang Nie, Tat-Seng Chua
However, in the traditional optimization objective, low-level features of the content image are absent, and the low-level features of the style image dominate the low-level detail structures of the new image.
no code implementations • 10 Jun 2017 • Xiang Wang, Xiangnan He, Liqiang Nie, Tat-Seng Chua
In this work, we address the problem of cross-domain social recommendation, i.e., recommending relevant items of information domains to potential users of social networks.
Ranked #2 on Recommendation Systems on WeChat
no code implementations • 7 Apr 2017 • Dan Wang, He-Yan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao
Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each layer, and design a deep convolutional neural network to obtain a hash code for each data point.
no code implementations • 4 Feb 2017 • Minnan Luo, Xiaojun Chang, Zhihui Li, Liqiang Nie, Alexander G. Hauptmann, Qinghua Zheng
The heterogeneity-gap between different modalities brings a significant challenge to multimedia information retrieval.
2 code implementations • CVPR 2017 • Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, Tat-Seng Chua
Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image.
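As a generic illustration of spatial attention re-weighting a conv feature map (not the authors' implementation), the sketch below scores each spatial location against a query vector and forms a softmax-weighted context; the shapes and scoring function are assumptions.

```python
# Sketch of spatial attention: a per-location softmax, conditioned on a query
# vector, re-weights a C x H x W conv feature map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatialAttention(nn.Module):
    def __init__(self, channels=512, query_dim=256):
        super().__init__()
        self.score = nn.Linear(channels + query_dim, 1)

    def forward(self, feat_map, query):
        B, C, H, W = feat_map.shape
        feats = feat_map.flatten(2).transpose(1, 2)               # (B, H*W, C)
        q = query.unsqueeze(1).expand(-1, H * W, -1)              # (B, H*W, Q)
        alpha = F.softmax(self.score(torch.cat([feats, q], -1)).squeeze(-1), dim=1)
        weighted = feats * alpha.unsqueeze(-1)                    # re-weight locations
        return weighted.sum(dim=1), alpha                         # (B, C), (B, H*W)

attn = SpatialAttention()
ctx, alpha = attn(torch.randn(2, 512, 7, 7), torch.randn(2, 256))
print(ctx.shape, alpha.shape)   # torch.Size([2, 512]) torch.Size([2, 49])
```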
no code implementations • 7 Nov 2016 • Ye Liu, Liqiang Nie, Lei Han, Luming Zhang, David S. Rosenblum
As compared to simple actions, activities are much more complex, but semantically consistent with a human's real life.