Search Results for author: Yongdong Zhang

Found 65 papers, 28 papers with code

Addressing Confounding Feature Issue for Causal Recommendation

no code implementations 13 May 2022 Xiangnan He, Yang Zhang, Fuli Feng, Chonggang Song, Lingling Yi, Guohui Ling, Yongdong Zhang

We demonstrate DCR on the backbone model of neural factorization machine (NFM), showing that DCR leads to more accurate prediction of user preference with small inference time cost.

Recommendation Systems

Rumor Detection with Self-supervised Learning on Texts and Social Graph

no code implementations 19 Apr 2022 Yuan Gao, Xiang Wang, Xiangnan He, Huamin Feng, Yongdong Zhang

At the core is to model the rumor characteristics inherent in rich information, such as propagation patterns in social network and semantic patterns in post content, and differentiate them from the truth.

Self-Supervised Learning

Part-level Action Parsing via a Pose-guided Coarse-to-Fine Framework

no code implementations 9 Mar 2022 Xiaodong Chen, Xinchen Liu, Wu Liu, Kun Liu, Dong Wu, Yongdong Zhang, Tao Mei

Therefore, researchers start to focus on a new task, Part-level Action Parsing (PAP), which aims to not only predict the video-level action but also recognize the frame-level fine-grained actions or interactions of body parts for each person in the video.

Action Parsing Action Recognition

Motion-Modulated Temporal Fragment Alignment Network for Few-Shot Action Recognition

no code implementations CVPR 2022 Jiamin Wu, Tianzhu Zhang, Zhe Zhang, Feng Wu, Yongdong Zhang

To address this issue, we propose an end-to-end Motion-modulated Temporal Fragment Alignment Network (MTFAN) by jointly exploring the task-specific motion modulation and the multi-level temporal fragment alignment for Few-Shot Action Recognition (FSAR).

Few Shot Action Recognition Image Classification

Partial Class Activation Attention for Semantic Segmentation

1 code implementation CVPR 2022 Sun-Ao Liu, Hongtao Xie, Hai Xu, Yongdong Zhang, Qi Tian

Current attention-based methods for semantic segmentation mainly model pixel relation through pairwise affinity and coarse segmentation.

Semantic Segmentation

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network

1 code implementation ICCV 2021 Yuxin Wang, Hongtao Xie, Shancheng Fang, Jing Wang, Shenggao Zhu, Yongdong Zhang

Such operation guides the vision model to use not only the visual texture of characters, but also the linguistic information in visual context for recognition when the visual cues are confused (e.g., occlusion, noise, etc.).

Language Modelling Scene Text Recognition

Causal Incremental Graph Convolution for Recommender System Retraining

1 code implementation 16 Aug 2021 Sihao Ding, Fuli Feng, Xiangnan He, Yong Liao, Jun Shi, Yongdong Zhang

Towards the goal, we propose a Causal Incremental Graph Convolution approach, which consists of two new operators named Incremental Graph Convolution (IGC) and Colliding Effect Distillation (CED) to estimate the output of full graph convolution.

Causal Inference Recommendation Systems

PERT: A Progressively Region-based Network for Scene Text Removal

1 code implementation 24 Jun 2021 Yuxin Wang, Hongtao Xie, Shancheng Fang, Yadong Qu, Yongdong Zhang

However, there exist two problems: 1) the implicit erasure guidance causes excessive erasure of non-text areas; 2) the one-stage erasure lacks exhaustive removal of the text region.

Lesion-Aware Transformers for Diabetic Retinopathy Grading

no code implementations CVPR 2021 Rui Sun, Yihao Li, Tianzhu Zhang, Zhendong Mao, Feng Wu, Yongdong Zhang

First, to the best of our knowledge, this is the first work to formulate lesion discovery as a weakly supervised lesion localization problem via a transformer decoder.

Diabetic Retinopathy Grading

Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection

no code implementations CVPR 2021 Wenfei Yang, Tianzhu Zhang, Xiaoyuan Yu, Tian Qi, Yongdong Zhang, Feng Wu

To alleviate this problem, we propose a novel Uncertainty Guided Collaborative Training (UGCT) strategy, which mainly includes two key designs: (1) The first design is an online pseudo label generation module, in which the RGB and FLOW streams work collaboratively to learn from each other.

Action Detection

Diverse Part Discovery: Occluded Person Re-identification with Part-Aware Transformer

no code implementations CVPR 2021 Yulin Li, Jianfeng He, Tianzhu Zhang, Xiang Liu, Yongdong Zhang, Feng Wu

To address these issues, we propose a novel end-to-end Part-Aware Transformer (PAT) for occluded person Re-ID through diverse part discovery via a transformer encoder-decoder architecture, including a pixel context based transformer encoder and a part prototype based transformer decoder.

Person Re-Identification

Causal Intervention for Leveraging Popularity Bias in Recommendation

1 code implementation 13 May 2021 Yang Zhang, Fuli Feng, Xiangnan He, Tianxin Wei, Chonggang Song, Guohui Ling, Yongdong Zhang

This work studies an unexplored problem in recommendation -- how to leverage popularity bias to improve the recommendation accuracy.

Collaborative Filtering Recommendation Systems

Action Unit Memory Network for Weakly Supervised Temporal Action Localization

no code implementations CVPR 2021 Wang Luo, Tianzhu Zhang, Wenfei Yang, Jingen Liu, Tao Mei, Feng Wu, Yongdong Zhang

In this paper, we present an Action Unit Memory Network (AUMN) for weakly supervised temporal action localization, which can mitigate the above two challenges by learning an action unit memory bank.

Weakly Supervised Action Localization Weakly-supervised Temporal Action Localization +1

Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

no code implementations CVPR 2021 Jiaming Li, Hongtao Xie, Jiahong Li, Zhongyuan Wang, Yongdong Zhang

Face forgery detection is raising ever-increasing interest in computer vision since facial manipulation technologies cause serious worries.

Foreground Activation Maps for Weakly Supervised Object Localization

no code implementations ICCV 2021 Meng Meng, Tianzhu Zhang, Qi Tian, Yongdong Zhang, Feng Wu

To the best of our knowledge, this is the first work that can achieve remarkable performance for both tasks by optimizing them jointly via FAM for WSOL.

Classification Weakly-Supervised Object Localization

Meta-Attack: Class-Agnostic and Model-Agnostic Physical Adversarial Attack

no code implementations ICCV 2021 Weiwei Feng, Baoyuan Wu, Tianzhu Zhang, Yong Zhang, Yongdong Zhang

To tackle these issues, we propose a class-agnostic and model-agnostic physical adversarial attack model (Meta-Attack), which is able to not only generate robust physical adversarial examples by simulating color and shape distortions, but also generalize to attacking novel images and novel DNN models by accessing a few digital and physical images.

Adversarial Attack Few-Shot Learning

Task-Aware Part Mining Network for Few-Shot Learning

no code implementations ICCV 2021 Jiamin Wu, Tianzhu Zhang, Yongdong Zhang, Feng Wu

The task-aware part filters can adapt to any individual task and automatically mine task-related local parts even for an unseen task.

Few-Shot Learning

Hierarchical Granularity Transfer Learning

no code implementations NeurIPS 2020 Shaobo Min, Hongtao Xie, Hantao Yao, Xuran Deng, Zheng-Jun Zha, Yongdong Zhang

In this paper, we introduce a new task, named Hierarchical Granularity Transfer Learning (HGTL), to recognize sub-level categories with basic-level annotations and semantic descriptions for hierarchical categories.

Transfer Learning

CatGCN: Graph Convolutional Networks with Categorical Node Features

1 code implementation 11 Sep 2020 Weijian Chen, Fuli Feng, Qifan Wang, Xiangnan He, Chonggang Song, Guohui Ling, Yongdong Zhang

In this paper, we propose a new GCN model named CatGCN, which is tailored for graph learning when the node features are categorical.

Graph Learning Node Classification +1

Depth image denoising using nuclear norm and learning graph model

no code implementations 9 Aug 2020 Chenggang Yan, Zhisheng Li, Yongbing Zhang, Yutao Liu, Xiangyang Ji, Yongdong Zhang

Depth image denoising is increasingly becoming a hot research topic because depth images reflect the three-dimensional (3D) scene and can be applied in various fields of computer vision.

Image Denoising Image Restoration

Curriculum Learning for Natural Language Understanding

no code implementations ACL 2020 Benfeng Xu, Licheng Zhang, Zhendong Mao, Quan Wang, Hongtao Xie, Yongdong Zhang

With the great success of pre-trained language models, the pretrain-finetune paradigm now becomes the undoubtedly dominant solution for natural language understanding (NLU) tasks.

Natural Language Understanding

Attribute-Induced Bias Eliminating for Transductive Zero-Shot Learning

no code implementations 31 May 2020 Hantao Yao, Shaobo Min, Yongdong Zhang, Changsheng Xu

Then, an attentional graph attribute embedding is proposed to reduce the semantic bias between seen and unseen categories, which utilizes the graph operation to capture the semantic relationship between categories.

Transfer Learning Zero-Shot Learning

How to Retrain Recommender System? A Sequential Meta-Learning Method

1 code implementation 27 May 2020 Yang Zhang, Fuli Feng, Chenxu Wang, Xiangnan He, Meng Wang, Yan Li, Yongdong Zhang

Nevertheless, normal training on new data only may easily cause overfitting and forgetting issues, since the new data is of a smaller scale and contains less information on long-term user preference.

Meta-Learning Recommendation Systems

ContourNet: Taking a Further Step toward Accurate Arbitrary-shaped Scene Text Detection

1 code implementation CVPR 2020 Yuxin Wang, Hongtao Xie, Zheng-Jun Zha, Mengting Xing, Zilong Fu, Yongdong Zhang

Then a novel Local Orthogonal Texture-aware Module (LOTM) models the local texture information of proposal features in two orthogonal directions and represents text region with a set of contour points.

Region Proposal Scene Text Detection

Graph Structured Network for Image-Text Matching

1 code implementation CVPR 2020 Chunxiao Liu, Zhendong Mao, Tianzhu Zhang, Hongtao Xie, Bin Wang, Yongdong Zhang

The GSMN explicitly models object, relation and attribute as a structured phrase, which not only allows the correspondence of object, relation and attribute to be learned separately, but also benefits the learning of fine-grained correspondence of the structured phrase.

Cross-Modal Retrieval Text Matching

Domain-aware Visual Bias Eliminating for Generalized Zero-Shot Learning

1 code implementation CVPR 2020 Shaobo Min, Hantao Yao, Hongtao Xie, Chaoqun Wang, Zheng-Jun Zha, Yongdong Zhang

Recent methods focus on learning a unified semantic-aligned visual representation to transfer knowledge between two domains, while ignoring the effect of semantic-free visual representation in alleviating the biased recognition problem.

Generalized Zero-Shot Learning

Multi-Objective Matrix Normalization for Fine-grained Visual Recognition

1 code implementation 30 Mar 2020 Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang

In this paper, we propose an efficient Multi-Objective Matrix Normalization (MOMN) method that can simultaneously normalize a bilinear representation in terms of square-root, low-rank, and sparsity.

Fine-Grained Visual Recognition

Bilinear Graph Neural Network with Neighbor Interactions

1 code implementation 10 Feb 2020 Hongmin Zhu, Fuli Feng, Xiangnan He, Xiang Wang, Yan Li, Kai Zheng, Yongdong Zhang

We term this framework as Bilinear Graph Neural Network (BGNN), which improves GNN representation ability with bilinear interactions between neighbor nodes.

General Classification Node Classification

LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation

11 code implementations 6 Feb 2020 Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, Meng Wang

We propose a new model named LightGCN, including only the most essential component in GCN -- neighborhood aggregation -- for collaborative filtering.

Collaborative Filtering Graph Classification +1

Asymmetric GAN for Unpaired Image-to-image Translation

no code implementations 25 Dec 2019 Yu Li, Sheng Tang, Rui Zhang, Yongdong Zhang, Jintao Li, Shuicheng Yan

While in situations where two domains are asymmetric in complexity, i.e., the amount of information between two domains is different, these approaches pose problems of poor generation quality, mapping ambiguity, and model sensitivity.

Image-to-Image Translation Translation

Learning Hierarchy-Aware Knowledge Graph Embeddings for Link Prediction

6 code implementations 21 Nov 2019 Zhanqiu Zhang, Jianyu Cai, Yongdong Zhang, Jie Wang

HAKE is inspired by the fact that concentric circles in the polar coordinate system can naturally reflect the hierarchy.

Knowledge Graph Completion Knowledge Graph Embedding +2

Scheduled Differentiable Architecture Search for Visual Recognition

no code implementations 23 Sep 2019 Zhaofan Qiu, Ting Yao, Yiheng Zhang, Yongdong Zhang, Tao Mei

Moreover, we enlarge the search space of SDAS particularly for video recognition by devising several unique operations to encode spatio-temporal dynamics and demonstrate the impact in affecting the architecture search of SDAS.

Video Recognition

ACE-Net: Biomedical Image Segmentation with Augmented Contracting and Expansive Paths

no code implementations 23 Aug 2019 Yanhao Zhu, Zhineng Chen, Shuai Zhao, Hongtao Xie, Wenming Guo, Yongdong Zhang

Nowadays U-net-like FCNs predominate various biomedical image segmentation applications and attain promising performance, largely due to their elegant architectures, e.g., symmetric contracting and expansive paths as well as lateral skip-connections.

Semantic Segmentation

Domain-Specific Embedding Network for Zero-Shot Recognition

1 code implementation 12 Aug 2019 Shaobo Min, Hantao Yao, Hongtao Xie, Zheng-Jun Zha, Yongdong Zhang

In contrast to previous methods, the DSEN decomposes the domain-shared projection function into one domain-invariant and two domain-specific sub-functions to explore the similarities and differences between two domains.

Zero-Shot Learning

Consensus Feature Network for Scene Parsing

no code implementations 29 Jul 2019 Tianyi Wu, Sheng Tang, Rui Zhang, Guodong Guo, Yongdong Zhang

However, classification networks are dominated by the discriminative portion, so directly applying classification networks to scene parsing will result in inconsistent parsing predictions within one instance and among instances of the same category.

General Classification Scene Parsing

Dense Scale Network for Crowd Counting

1 code implementation 24 Jun 2019 Feng Dai, Hao Liu, Yike Ma, Juan Cao, Qiang Zhao, Yongdong Zhang

The key component of our network is the dense dilated convolution block, in which each dilation layer is densely connected with the others to preserve information from continuously varied scales.

Crowd Counting

Context-Aware Visual Policy Network for Fine-Grained Image Captioning

1 code implementation 6 Jun 2019 Zheng-Jun Zha, Daqing Liu, Hanwang Zhang, Yongdong Zhang, Feng Wu

With the maturity of visual detection techniques, we are more ambitious in describing visual content with open-vocabulary, fine-grained and free-form language, i.e., the task of image captioning.

Image Captioning Image Paragraph Captioning +1

Relational Collaborative Filtering: Modeling Multiple Item Relations for Recommendation

2 code implementations 29 Apr 2019 Xin Xin, Xiangnan He, Yongfeng Zhang, Yongdong Zhang, Joemon Jose

In this work, we propose Relational Collaborative Filtering (RCF), a general framework to exploit multiple relations between items in recommender system.

Collaborative Filtering Recommendation Systems

Not All Words are Equal: Video-specific Information Loss for Video Captioning

no code implementations 1 Jan 2019 Jiarong Dong, Ke Gao, Xiaokai Chen, Junbo Guo, Juan Cao, Yongdong Zhang

To address this issue, we propose a novel learning strategy called Information Loss, which focuses on the relationship between the video-specific visual content and corresponding representative words.

Video Captioning

CGNet: A Light-weight Context Guided Network for Semantic Segmentation

3 code implementations 20 Nov 2018 Tianyi Wu, Sheng Tang, Rui Zhang, Yongdong Zhang

To tackle this problem, we propose a novel Context Guided Network (CGNet), which is a light-weight and efficient network for semantic segmentation.

Semantic Segmentation

CA3Net: Contextual-Attentional Attribute-Appearance Network for Person Re-Identification

no code implementations 19 Nov 2018 Jiawei Liu, Zheng-Jun Zha, Hongtao Xie, Zhiwei Xiong, Yongdong Zhang

An appearance network is developed to learn appearance features from the full body, horizontal and vertical body parts of pedestrians with spatial dependencies among body parts.

Multi-Task Learning Person Re-Identification

Style Separation and Synthesis via Generative Adversarial Networks

2 code implementations 7 Nov 2018 Rui Zhang, Sheng Tang, Yu Li, Junbo Guo, Yongdong Zhang, Jintao Li, Shuicheng Yan

The S3-GAN consists of an encoder network, a generator network, and an adversarial network.

Context-Aware Visual Policy Network for Sequence-Level Image Captioning

1 code implementation 16 Aug 2018 Daqing Liu, Zheng-Jun Zha, Hanwang Zhang, Yongdong Zhang, Feng Wu

To fill the gap, we propose a Context-Aware Visual Policy network (CAVP) for sequence-level image captioning.

Image Captioning

A Two-Stream Mutual Attention Network for Semi-supervised Biomedical Segmentation with Noisy Labels

no code implementations 31 Jul 2018 Shaobo Min, Xuejin Chen, Zheng-Jun Zha, Feng Wu, Yongdong Zhang

Learning-based methods suffer from a deficiency of clean annotations, especially in biomedical segmentation.

Time Matters: Multi-scale Temporalization of Social Media Popularity

no code implementations 12 Dec 2017 Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Tao Mei

We evaluate our approach on two large-scale Flickr image datasets with over 1.8 million photos in total, for the task of popularity prediction.

Social Media Popularity Prediction

Sequential Prediction of Social Media Popularity with Deep Temporal Context Networks

1 code implementation 12 Dec 2017 Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Qiushi Huang, Jintao Li, Tao Mei

With a joint embedding network, we obtain a unified deep representation of multi-modal user-post data in a common embedding space.

Social Media Popularity Prediction

Multimodal Fusion with Recurrent Neural Networks for Rumor Detection on Microblogs

no code implementations Mountain View, CA, USA 2017 Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang

In this paper, we propose a novel Recurrent Neural Network with an attention mechanism (att-RNN) to fuse multimodal features for effective rumor detection.

Scale-Adaptive Convolutions for Scene Parsing

no code implementations ICCV 2017 Rui Zhang, Sheng Tang, Yongdong Zhang, Jintao Li, Shuicheng Yan

Through adding a new scale regression layer, we can dynamically infer the position-adaptive scale coefficients which are adopted to resize the convolutional patches.

Scene Parsing

APE-GAN: Adversarial Perturbation Elimination with GAN

3 code implementations 18 Jul 2017 Shiwei Shen, Guoqing Jin, Ke Gao, Yongdong Zhang

Although neural networks could achieve state-of-the-art performance while recognizing images, they often suffer a tremendous defeat from adversarial examples--inputs generated by utilizing imperceptible but intentional perturbation to clean samples from the datasets.

Deep Representation Learning with Part Loss for Person Re-Identification

no code implementations 4 Jul 2017 Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian

The representation learning risk is evaluated by the proposed part loss, which automatically generates several parts for an image, and computes the person classification loss on each part separately.

Classification General Classification +2

One-Shot Fine-Grained Instance Retrieval

no code implementations 4 Jul 2017 Hantao Yao, Shiliang Zhang, Yongdong Zhang, Jintao Li, Qi Tian

Aiming to conquer this issue, we propose a retrieval task named One-Shot Fine-Grained Instance Retrieval (OSFGIR).

Fine-Grained Visual Categorization Image Retrieval

Task-Driven Dynamic Fusion: Reducing Ambiguity in Video Description

no code implementations CVPR 2017 Xishan Zhang, Ke Gao, Yongdong Zhang, Dongming Zhang, Jintao Li, Qi Tian

This paper contributes to: 1) The first in-depth study of the weakness inherent in data-driven static fusion methods for video captioning.

Video Captioning Video Description

DR2-Net: Deep Residual Reconstruction Network for Image Compressive Sensing

1 code implementation 19 Feb 2017 Hantao Yao, Feng Dai, Dongming Zhang, Yike Ma, Shiliang Zhang, Yongdong Zhang, Qi Tian

Accordingly, DR2-Net consists of two components, i.e., a linear mapping network and a residual network.

Compressive Sensing Image Reconstruction

Image Credibility Analysis with Effective Domain Transferred Deep Networks

no code implementations 16 Nov 2016 Zhiwei Jin, Juan Cao, Jiebo Luo, Yongdong Zhang

In order to overcome the scarcity of training samples of fake images, we first construct a large-scale auxiliary dataset indirectly related to this task.

Image Classification Transfer Learning

AC-BLSTM: Asymmetric Convolutional Bidirectional LSTM Networks for Text Classification

1 code implementation 7 Nov 2016 Depeng Liang, Yongdong Zhang

Recently, deep learning models have been shown to be capable of achieving remarkable performance in sentence and document classification tasks.

Classification General Classification +3

Scene-adaptive Coded Apertures Imaging

no code implementations 19 Jun 2015 Xuehui Wang, Jinli Suo, Jingyi Yu, Yongdong Zhang, Qionghai Dai

Firstly, we capture the scene with a pinhole and analyze the scene content to determine primary edge orientations.

Multi-Task Deep Visual-Semantic Embedding for Video Thumbnail Selection

no code implementations CVPR 2015 Wu Liu, Tao Mei, Yongdong Zhang, Cherry Che, Jiebo Luo

Given the tremendous growth of online videos, video thumbnail, as the common visualization form of video content, is becoming increasingly important to influence user's browsing and searching experience.

Multi-Task Learning

Binary Code Ranking with Weighted Hamming Distance

no code implementations CVPR 2013 Lei Zhang, Yongdong Zhang, Jinhu Tang, Ke Lu, Qi Tian

In this paper, we propose a weighted Hamming distance ranking algorithm (WhRank) to rank the binary codes of hashing methods.
