MMCoQA: Conversational Question Answering over Text, Tables, and Images

1 code implementation ACL 2022 Yongqi Li, Wenjie Li, Liqiang Nie

In this paper, we hence define a novel research task, i.e., multimodal conversational question answering (MMCoQA), aiming to answer users’ questions with multimodal knowledge sources via multi-turn conversations.

Benchmarking Conversational Question Answering +1

Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation

1 code implementation 15 Mar 2023 Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie

Existing methods mostly focus on analyzing video content, neglecting users' social influence and tag relation.

Link Prediction Semantic Similarity +2

Efficient Image-Text Retrieval via Keyword-Guided Pre-Screening

no code implementations 14 Mar 2023 Min Cao, Yang Bai, Jingyao Wang, Ziqiang Cao, Liqiang Nie, Min Zhang

The proposed framework equipped with only two embedding layers achieves $O(1)$ querying time complexity, while improving the retrieval efficiency and keeping its performance, when applied prior to the common image-text retrieval methods.

Multi-Label Classification Multi-Task Learning +2

Deep Learning and Medical Imaging for COVID-19 Diagnosis: A Comprehensive Survey

no code implementations 13 Feb 2023 Song Wu, Yazhou Ren, Aodi Yang, Xinyue Chen, Xiaorong Pu, Jing He, Liqiang Nie, Philip S. Yu

In this survey, we investigate the main contributions of deep learning applications using medical images in fighting against COVID-19 from the aspects of image classification, lesion localization, and severity quantification, and review different deep learning architectures and some image preprocessing techniques for achieving a more precise diagnosis.

COVID-19 Diagnosis Image Classification

Learning to Agree on Vision Attention for Visual Commonsense Reasoning

no code implementations 4 Feb 2023 Zhenyang Li, Yangyang Guo, Kejie Wang, Fan Liu, Liqiang Nie, Mohan Kankanhalli

Visual Commonsense Reasoning (VCR) remains a significant yet challenging research problem in the realm of visual reasoning.

Visual Commonsense Reasoning

HS-GCN: Hamming Spatial Graph Convolutional Networks for Recommendation

1 code implementation 13 Jan 2023 Han Liu, Yinwei Wei, Jianhua Yin, Liqiang Nie

Towards this end, existing methods tend to code users by modeling their Hamming similarities with the items they historically interact with, which are termed as the first-order similarities in this work.

Recommendation Systems

Multi-queue Momentum Contrast for Microvideo-Product Retrieval

1 code implementation 22 Dec 2022 Yali Du, Yinwei Wei, Wei Ji, Fan Liu, Xin Luo, Liqiang Nie

The booming development and huge market of micro-videos bring new e-commerce channels for merchants.

Representation Learning Retrieval

Causal Inference for Knowledge Graph based Recommendation

1 code implementation 20 Dec 2022 Yinwei Wei, Xiang Wang, Liqiang Nie, Shaoyu Li, Dingxian Wang, Tat-Seng Chua

Knowledge Graph (KG), as side information, tends to be utilized to supplement the collaborative filtering (CF) based recommendation model.

Collaborative Filtering Counterfactual Inference

MMNet: Multi-modal Fusion with Mutual Learning Network for Fake News Detection

no code implementations 12 Dec 2022 Linmei Hu, Ziwang Zhao, Xinkai Ge, Xuemeng Song, Liqiang Nie

The rapid development of social media provides a hotbed for the dissemination of fake news, which misleads readers and causes negative effects on society.

Fake News Detection Text Matching

A Survey of Knowledge-Enhanced Pre-trained Language Models

no code implementations 11 Nov 2022 Linmei Hu, Zeyi Liu, Ziwang Zhao, Lei Hou, Liqiang Nie, Juanzi Li

We introduce appropriate taxonomies respectively for Natural Language Understanding (NLU) and Natural Language Generation (NLG) to highlight these two main tasks of NLP.

Natural Language Understanding Retrieval +2

Privacy-Preserving Synthetic Data Generation for Recommendation Systems

1 code implementation 27 Sep 2022 Fan Liu, Zhiyong Cheng, Huilin Chen, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli

At the item level, a synthetic data generation module is proposed to generate a synthetic item corresponding to the selected item based on the user's preferences.

Privacy Preserving Recommendation Systems +1

Deep Convolutional Pooling Transformer for Deepfake Detection

no code implementations 12 Sep 2022 Tianyi Wang, Harry Cheng, Kam Pui Chow, Liqiang Nie

Most existing deep learning methods mainly focus on local features and relations within the face image using convolutional neural networks as a backbone.

DeepFake Detection Face Swapping +1

Visual Perturbation-aware Collaborative Learning for Overcoming the Language Prior Problem

no code implementations 24 Jul 2022 Yudong Han, Liqiang Nie, Jianhua Yin, Jianlong Wu, Yan Yan

Several studies have recently pointed out that existing Visual Question Answering (VQA) models suffer heavily from the language prior problem, which refers to capturing superficial statistical correlations between the question type and the answer while ignoring the image contents.

Question Answering Visual Question Answering +1

Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis

1 code implementation 24 Jul 2022 Teng Sun, Wenjie Wang, Liqiang Jing, Yiran Cui, Xuemeng Song, Liqiang Nie

Inspired by this, we devise a model-agnostic counterfactual framework for multimodal sentiment analysis, which captures the direct effect of textual modality via an extra text model and estimates the indirect one by a multimodal model.

Counterfactual Inference Multimodal Sentiment Analysis

Semantic-aware Modular Capsule Routing for Visual Question Answering

no code implementations 21 Jul 2022 Yudong Han, Jianhua Yin, Jianlong Wu, Yinwei Wei, Liqiang Nie

Visual Question Answering (VQA) is fundamentally compositional in nature, and many questions are simply answered by decomposing them into modular sub-problems.

Question Answering Visual Question Answering +1

Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model

no code implementations 16 Jul 2022 Xiaolin Chen, Xuemeng Song, Liqiang Jing, Shuo Li, Linmei Hu, Liqiang Nie

To address these limitations, we propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD), consisting of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation.

Language Modelling Response Generation

Lipschitz Continuity Retained Binary Neural Network

1 code implementation 13 Jul 2022 Yuzhang Shang, Dan Xu, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Relying on the premise that the performance of a binary neural network can be largely restored with eliminated quantization error between full-precision weight vectors and their corresponding binary vectors, existing works of network binarization frequently adopt the idea of model robustness to reach the aforementioned objective.

Binarization Quantization

Network Binarization via Contrastive Learning

1 code implementation 6 Jul 2022 Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2
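
As a rough illustration of the 1-bit quantization described in the abstract excerpt above, here is a minimal sketch of sign-based binarization with a single scaling factor (the mean absolute weight). This is a common baseline and an assumption made for illustration, not the contrastive method of the listed paper; the names `binarize` and `dequantize` are hypothetical.

```python
from typing import List, Tuple


def binarize(weights: List[float]) -> Tuple[float, List[int]]:
    """Quantize real-valued weights to {-1, +1} plus one scaling factor.

    Assumption: sign binarization with the mean absolute value as scale,
    a common baseline; this is not the method proposed in the listed paper.
    """
    if not weights:
        raise ValueError("weights must be non-empty")
    # The scale keeps the average magnitude of the original weights.
    scale = sum(abs(w) for w in weights) / len(weights)
    # Non-negative weights map to +1, negative weights map to -1.
    signs = [1 if w >= 0 else -1 for w in weights]
    return scale, signs


def dequantize(scale: float, signs: List[int]) -> List[float]:
    """Reconstruct the approximate full-precision weights."""
    return [scale * s for s in signs]


scale, signs = binarize([0.5, -1.5, 1.0, -1.0])
print(scale, signs)
```

Each weight is then stored as one bit (its sign), with the shared `scale` kept in full precision.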

A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA

1 code implementation 30 Jun 2022 Yangyang Guo, Liqiang Nie, Yongkang Wong, Yibing Liu, Zhiyong Cheng, Mohan Kankanhalli

On the other hand, pertaining to the implicit knowledge, the multi-modal implicit knowledge for knowledge-based VQA still remains largely unexplored.

Question Answering Retrieval +2

Image-text Retrieval: A Survey on Recent Research and Development

no code implementations 28 Mar 2022 Min Cao, Shiping Li, Juntao Li, Liqiang Nie, Min Zhang

On top of this, the efficiency-focused study on the ITR system is introduced as the third perspective.

Retrieval Text Retrieval

Stacked Hybrid-Attention and Group Collaborative Learning for Unbiased Scene Graph Generation

1 code implementation CVPR 2022 Xingning Dong, Tian Gan, Xuemeng Song, Jianlong Wu, Yuan Cheng, Liqiang Nie

Scene Graph Generation, which generally follows a regular encoder-decoder pipeline, aims to first encode the visual contents within the given image and then parse them into a compact summary graph.

Graph Generation Unbiased Scene Graph Generation

Disentangled Multimodal Representation Learning for Recommendation

1 code implementation 10 Mar 2022 Fan Liu, Huilin Chen, Zhiyong Cheng, AnAn Liu, Liqiang Nie, Mohan Kankanhalli

However, existing methods ignore the fact that different modalities contribute differently towards a user's preference on various factors of an item.

Recommendation Systems Representation Learning

Voice-Face Homogeneity Tells Deepfake

no code implementations 4 Mar 2022 Harry Cheng, Yangyang Guo, Tianyi Wang, Qi Li, Xiaojun Chang, Liqiang Nie

To this end, a voice-face matching method is devised to measure the matching degree of these two.

Joint Answering and Explanation for Visual Commonsense Reasoning

1 code implementation 25 Feb 2022 Zhenyang Li, Yangyang Guo, Kejie Wang, Yinwei Wei, Liqiang Nie, Mohan Kankanhalli

Given that our framework is model-agnostic, we apply it to the existing popular baselines and validate its effectiveness on the benchmark dataset.

Knowledge Distillation Question Answering +3

On Modality Bias Recognition and Reduction

no code implementations 25 Feb 2022 Yangyang Guo, Liqiang Nie, Harry Cheng, Zhiyong Cheng, Mohan Kankanhalli, Alberto del Bimbo

From the results on four datasets regarding the above three tasks, our method yields remarkable performance improvements compared with the baselines, demonstrating its superiority on reducing the modality bias problem.

Action Recognition Multi-modal Classification +4

Win the Lottery Ticket via Fourier Analysis: Frequencies Guided Network Pruning

no code implementations 30 Jan 2022 Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Extensive experiments on CIFAR-10 and CIFAR-100 demonstrate the superiority of our novel Fourier analysis based MBP compared to other traditional MBP algorithms.

Knowledge Distillation Network Pruning

Learning Robust Recommender from Noisy Implicit Feedback

1 code implementation 2 Dec 2021 Wenjie Wang, Fuli Feng, Xiangnan He, Liqiang Nie, Tat-Seng Chua

Inspired by this observation, we propose a new training strategy named Adaptive Denoising Training (ADT), which adaptively prunes the noisy interactions by two paradigms (i.e., Truncated Loss and Reweighted Loss).

Denoising Recommendation Systems

Hierarchical Deep Residual Reasoning for Temporal Moment Localization

1 code implementation 31 Oct 2021 Ziyang Ma, Xianjing Han, Xuemeng Song, Yiran Cui, Liqiang Nie

Temporal Moment Localization (TML) in untrimmed videos is a challenging task in the field of multimedia, which aims at localizing the start and end points of the activity in the video, described by a sentence query.

Multi-Modal Interaction Graph Convolutional Network for Temporal Language Localization in Videos

1 code implementation 12 Oct 2021 Zongmeng Zhang, Xianjing Han, Xuemeng Song, Yan Yan, Liqiang Nie

Towards this end, in this work, we propose a Multi-modal Interaction Graph Convolutional Network (MIGCN), which jointly explores the complex intra-modal relations and inter-modal interactions residing in the video and sentence query to facilitate the understanding and semantic correspondence capture of the video and sentence query.

Semantic correspondence Semantic Similarity +1

Contrastive Mutual Information Maximization for Binary Neural Networks

no code implementations 29 Sep 2021 Yuzhang Shang, Dan Xu, Ziliang Zong, Liqiang Nie, Yan Yan

Neural network binarization accelerates deep models by quantizing their weights and activations into 1-bit.

Binarization Contrastive Learning +2

Lipschitz Continuity Guided Knowledge Distillation

no code implementations ICCV 2021 Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan

Knowledge distillation has become one of the most important model compression techniques by distilling knowledge from larger teacher networks to smaller student ones.

Knowledge Distillation Model Compression +2

When Product Search Meets Collaborative Filtering: A Hierarchical Heterogeneous Graph Neural Network Approach

no code implementations 17 Aug 2021 Xiangkun Yin, Yangyang Guo, Liqiang Nie, Zhiyong Cheng

In addition, we empirically prove that collaborative filtering and semantic matching are complementary to each other in product search performance enhancement.

Collaborative Filtering Representation Learning

Dynamic Modality Interaction Modeling for Image-Text Retrieval

1 code implementation ACM Special Interest Group on Information Retrieval 2021 Leigang Qu, Meng Liu, Jianlong Wu, Zan Gao, Liqiang Nie

To address these issues, we develop a novel modality interaction modeling network based upon the routing mechanism, which is the first unified and dynamic multimodal interaction framework towards image-text retrieval.

Cross-Modal Retrieval Information Retrieval +2

Review Polarity-wise Recommender

1 code implementation 8 Jun 2021 Han Liu, Yangyang Guo, Jianhua Yin, Zan Gao, Liqiang Nie

To be specific, in this model, positive and negative reviews are separately gathered and utilized to model the user-preferred and user-rejected aspects, respectively.

Recommendation Systems

AdaVQA: Overcoming Language Priors with Adapted Margin Cosine Loss

1 code implementation 5 May 2021 Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Feng Ji, Ji Zhang, Alberto del Bimbo

Experimental results demonstrate that our adapted margin cosine loss can greatly enhance the baseline models with an absolute performance gain of 15% on average, strongly verifying the potential of tackling the language prior problem in VQA from the angle of the answer feature space learning.

Question Answering Visual Question Answering +1

A Graph-guided Multi-round Retrieval Method for Conversational Open-domain Question Answering

no code implementations 17 Apr 2021 Yongqi Li, Wenjie Li, Liqiang Nie

Moreover, in order to collect more complementary information in the historical context, we also propose to incorporate the multi-round relevance feedback technique to explore the impact of the retrieval context on current question understanding.

Conversational Question Answering Open-Domain Question Answering +1

Graph Contrastive Clustering

1 code implementation ICCV 2021 Huasong Zhong, Jianlong Wu, Chong Chen, Jianqiang Huang, Minghua Deng, Liqiang Nie, Zhouchen Lin, Xian-Sheng Hua

On the other hand, a novel graph-based contrastive learning strategy is proposed to learn more compact clustering assignments.

Contrastive Learning

Feature-level Attentive ICF for Recommendation

1 code implementation 22 Feb 2021 Zhiyong Cheng, Fan Liu, Shenghan Mei, Yangyang Guo, Lei Zhu, Liqiang Nie

To demonstrate the effectiveness of our method, we design a light attention neural network to integrate both item-level and feature-level attention for neural ICF models.

Collaborative Filtering Recommendation Systems

Interest-aware Message-Passing GCN for Recommendation

1 code implementation 19 Feb 2021 Fan Liu, Zhiyong Cheng, Lei Zhu, Zan Gao, Liqiang Nie

To form the subgraphs, we design an unsupervised subgraph generation module, which can effectively identify users with common interests by exploiting both user feature and graph structure.

Answer Questions with Right Image Regions: A Visual Attention Regularization Approach

1 code implementation 3 Feb 2021 Yibing Liu, Yangyang Guo, Jianhua Yin, Xuemeng Song, Weifeng Liu, Liqiang Nie

However, recent studies have pointed out that the highlighted image regions from the visual attention are often irrelevant to the given question and answer, leading to model confusion for correct visual reasoning.

Question Answering Visual Grounding +3

Incremental Knowledge Based Question Answering

no code implementations 18 Jan 2021 Yongqi Li, Wenjie Li, Liqiang Nie

In the past years, Knowledge-Based Question Answering (KBQA), which aims to answer natural language questions using facts in a knowledge base, has been well developed.

Incremental Learning Knowledge Distillation +1

Market2Dish: Health-aware Food Recommendation

1 code implementation 11 Dec 2020 Wenjie Wang, Ling-Yu Duan, Hao Jiang, Peiguang Jing, Xuemeng Song, Liqiang Nie

With the rising incidence of some diseases, such as obesity and diabetes, a healthy diet is arousing increasing attention.

Food recommendation Nutrition +1

Loss re-scaling VQA: Revisiting the Language Prior Problem from a Class-imbalance View

1 code implementation 30 Oct 2020 Yangyang Guo, Liqiang Nie, Zhiyong Cheng, Qi Tian, Min Zhang

Concretely, we design a novel interpretation scheme whereby the loss of mis-predicted frequent and sparse answers of the same question type is distinctly exhibited during the late training phase.

Face Recognition Image Classification +3

Enhancing Factorization Machines with Generalized Metric Learning

1 code implementation 20 Jun 2020 Yangyang Guo, Zhiyong Cheng, Jiazheng Jing, Yanpeng Lin, Liqiang Nie, Meng Wang

Traditional FMs adopt the inner product to model the second-order interactions between different attributes, which are represented via feature vectors.

Metric Learning Recommendation Systems
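
For reference, the inner-product interaction that the abstract excerpt above attributes to traditional FMs is usually written as the standard second-order factorization machine. The symbols are the common textbook ones, not taken from this listing: $w_0$ is a global bias, $w_i$ the weight of feature $x_i$, and $\mathbf{v}_i$ the latent vector of feature $i$.

```latex
\hat{y}(\mathbf{x}) = w_0 + \sum_{i=1}^{n} w_i x_i
  + \sum_{i=1}^{n} \sum_{j=i+1}^{n} \langle \mathbf{v}_i, \mathbf{v}_j \rangle \, x_i x_j
```

The last term is the second-order part: every pair of features interacts through the inner product of their latent vectors.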

Denoising Implicit Feedback for Recommendation

1 code implementation 7 Jun 2020 Wenjie Wang, Fuli Feng, Xiangnan He, Liqiang Nie, Tat-Seng Chua

In this work, we explore the central theme of denoising implicit feedback for recommender training.

Denoising Recommendation Systems

A^2-GCN: An Attribute-aware Attentive GCN Model for Recommendation

no code implementations 20 Mar 2020 Fan Liu, Zhiyong Cheng, Lei Zhu, Chenghao Liu, Liqiang Nie

Considering the fact that for different users, the attributes of an item have different influence on their preference for this item, we design a novel attention mechanism to filter the message passed from an item to a target user by considering the attribute information.

Recommendation Systems

Personalized Hashtag Recommendation for Micro-videos

1 code implementation 27 Aug 2019 Yinwei Wei, Zhiyong Cheng, Xuzheng Yu, Zhou Zhao, Lei Zhu, Liqiang Nie

The hashtags that a user provides for a post (e.g., a micro-video) are the ones that, in her mind, well describe the post content she is interested in.

User Diverse Preference Modeling by Multimodal Attentive Metric Learning

1 code implementation 21 Aug 2019 Fan Liu, Zhiyong Cheng, Changchang Sun, Yinglong Wang, Liqiang Nie, Mohan Kankanhalli

To tackle this problem, in this paper, we propose a novel Multimodal Attentive Metric Learning (MAML) method to model user diverse preferences for various items.

Metric Learning Recommendation Systems

Quantifying and Alleviating the Language Prior Problem in Visual Question Answering

1 code implementation 13 May 2019 Yangyang Guo, Zhiyong Cheng, Liqiang Nie, Yibing Liu, Yinglong Wang, Mohan Kankanhalli

Benefiting from the advancement of computer vision, natural language processing and information retrieval techniques, visual question answering (VQA), which aims to answer questions about an image or a video, has received a lot of attention over the past few years.

Information Retrieval Question Answering +3

Explicit Interaction Model towards Text Classification

1 code implementation 23 Nov 2018 Cunxiao Du, Zhaozheng Chin, Fuli Feng, Lei Zhu, Tian Gan, Liqiang Nie

To address this problem, we introduce the interaction mechanism to incorporate word-level matching signals into the text classification task.

General Classification Multi Class Text Classification +3

Discrete Factorization Machines for Fast Feature-based Recommendation

1 code implementation 6 May 2018 Han Liu, Xiangnan He, Fuli Feng, Liqiang Nie, Rui Liu, Hanwang Zhang

In this paper, we develop a generic feature-based recommendation model, called Discrete Factorization Machine (DFM), for fast and accurate recommendation.

Binarization Quantization

Neural Compatibility Modeling with Attentive Knowledge Distillation

no code implementations 17 Apr 2018 Xuemeng Song, Fuli Feng, Xianjing Han, Xin Yang, Wei Liu, Liqiang Nie

Nevertheless, existing studies overlook the rich valuable knowledge (rules) accumulated in fashion domain, especially the rules regarding clothing matching.

Image Classification Knowledge Distillation +2

Neural Collaborative Filtering

41 code implementations WWW 2017 Xiangnan He, Lizi Liao, Hanwang Zhang, Liqiang Nie, Xia Hu, Tat-Seng Chua

When it comes to modeling the key factor in collaborative filtering -- the interaction between user and item features, they still resorted to matrix factorization and applied an inner product on the latent features of users and items.

Collaborative Filtering Recommendation Systems
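
The matrix-factorization baseline mentioned in the abstract excerpt above scores a user-item pair by the inner product of their latent vectors. A minimal sketch of that scoring step follows; the function name `mf_score` and the latent values are hypothetical, and the neural model proposed in the paper is not shown.

```python
from typing import Dict, List


def mf_score(user_vec: List[float], item_vec: List[float]) -> float:
    """Predicted preference as the inner product of two latent vectors."""
    if len(user_vec) != len(item_vec):
        raise ValueError("latent vectors must share the same dimension")
    return sum(u * v for u, v in zip(user_vec, item_vec))


# Hypothetical 3-dimensional latent factors for one user and two items.
user: List[float] = [1.0, 0.5, -2.0]
items: Dict[str, List[float]] = {
    "item_a": [2.0, 0.0, 1.0],
    "item_b": [0.0, 4.0, -1.0],
}

# Rank items for this user by descending predicted score.
ranked = sorted(items, key=lambda name: mf_score(user, items[name]), reverse=True)
print(ranked)
```

Because the score is a fixed linear combination of element-wise products, it cannot capture non-linear user-item interactions, which is the limitation the abstract points to.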

Laplacian-Steered Neural Style Transfer

3 code implementations 5 Jul 2017 Shaohua Li, Xinxing Xu, Liqiang Nie, Tat-Seng Chua

However, in the traditional optimization objective, low-level features of the content image are absent, and the low-level features of the style image dominate the low-level detail structures of the new image.

Image Generation Style Transfer

Item Silk Road: Recommending Items from Information Domains to Social Users

no code implementations 10 Jun 2017 Xiang Wang, Xiangnan He, Liqiang Nie, Tat-Seng Chua

In this work, we address the problem of cross-domain social recommendation, i.e., recommending relevant items of information domains to potential users of social networks.

Collaborative Ranking Recommendation Systems

Supervised Deep Hashing for Hierarchical Labeled Data

no code implementations 7 Apr 2017 Dan Wang, He-Yan Huang, Chi Lu, Bo-Si Feng, Liqiang Nie, Guihua Wen, Xian-Ling Mao

Specifically, we define a novel similarity formula for hierarchical labeled data by weighting each layer, and design a deep convolutional neural network to obtain a hash code for each data point.

Image Retrieval Retrieval

Simple to Complex Cross-modal Learning to Rank

no code implementations 4 Feb 2017 Minnan Luo, Xiaojun Chang, Zhihui Li, Liqiang Nie, Alexander G. Hauptmann, Qinghua Zheng

The heterogeneity-gap between different modalities brings a significant challenge to multimedia information retrieval.

Cross-Modal Retrieval Information Retrieval +3

SCA-CNN: Spatial and Channel-wise Attention in Convolutional Networks for Image Captioning

2 code implementations CVPR 2017 Long Chen, Hanwang Zhang, Jun Xiao, Liqiang Nie, Jian Shao, Wei Liu, Tat-Seng Chua

Existing visual attention models are generally spatial, i.e., the attention is modeled as spatial probabilities that re-weight the last conv-layer feature map of a CNN encoding an input image.

Image Captioning
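
The spatial attention described in the abstract excerpt above can be sketched as a softmax over per-location scores followed by a weighted sum of the feature vectors. This is a generic illustration under stated assumptions: the relevance scores are taken as given, the feature map is flattened to a list of locations, and `spatial_attention` is a hypothetical name. It does not reproduce the channel-wise component of SCA-CNN.

```python
import math
from typing import List


def spatial_attention(features: List[List[float]], scores: List[float]) -> List[float]:
    """Pool a feature map with softmax-normalized spatial weights.

    features: one feature vector per spatial location (flattened H*W grid).
    scores: one unnormalized relevance score per location (assumed given).
    """
    if not features or len(features) != len(scores):
        raise ValueError("need one score per non-empty list of locations")
    # Softmax over locations, shifted by the max score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Re-weight each location's features by its probability and sum them.
    dim = len(features[0])
    return [sum(p * f[c] for p, f in zip(probs, features)) for c in range(dim)]


# Two locations with equal scores: the result is the plain average.
context = spatial_attention([[1.0, 0.0], [3.0, 2.0]], [0.0, 0.0])
print(context)
```

With unequal scores, locations with higher scores contribute more to the pooled vector.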

Action2Activity: Recognizing Complex Activities from Sensor Data

no code implementations 7 Nov 2016 Ye Liu, Liqiang Nie, Lei Han, Luming Zhang, David S. Rosenblum

As compared to simple actions, activities are much more complex, but semantically consistent with a human's real life.

Action Recognition Multi-Task Learning +1

