1 code implementation • 2 Nov 2024 • Lei Tan, Yukang Zhang, Keke Han, Pingyang Dai, Yan Zhang, Yongjian Wu, Rongrong Ji
From this view, we unify all data augmentation strategies for cross-spectral re-identification by mimicking such local linear transformations and categorizing them into moderate transformation and radical transformation.
1 code implementation • 22 Oct 2024 • Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yan Xu
Notably, this work highlights the astonishing potential of VLPMs pre-trained on natural image-text pairs for downstream tasks in the medical field as well.
no code implementations • 29 Aug 2024 • Lei Tan, Pingyang Dai, Jie Chen, Liujuan Cao, Yongjian Wu, Rongrong Ji
Extracting robust feature representation is critical for object re-identification to accurately identify objects across non-overlapping cameras.
1 code implementation • 16 Jul 2024 • Yang Zhou, Yongjian Wu, Jiya Saiyin, Bingzheng Wei, Maode Lai, Eric Chang, Yan Xu
Existing prompt tuning methods have not effectively addressed the modal mapping and aligning problem for tokens in different modalities, leading to poor transfer generalization.
1 code implementation • 1 Jun 2024 • Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, Yiyi Zhou, Xiaopeng Hong, Yongjian Wu, Rongrong Ji
This paper explores a novel dynamic network for vision and language tasks, where the inferring structure is customized on the fly for different inputs.
1 code implementation • 30 Jun 2023 • Yongjian Wu, Yang Zhou, Jiya Saiyin, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yubo Fan, Yan Xu
Foremost, our work demonstrates that the VLPM pre-trained on natural image-text pairs exhibits astonishing potential for downstream tasks in the medical field as well.
1 code implementation • 5 Jun 2023 • Yang Zhou, Yongjian Wu, Zihua Wang, Bingzheng Wei, Maode Lai, Jianzhong Shou, Yubo Fan, Yan Xu
Experiments on three datasets demonstrate the good generality of our method, which outperforms other image-level weakly supervised methods for nuclei instance segmentation, and achieves comparable performance to fully-supervised methods.
1 code implementation • 29 Mar 2023 • Xingbin Liu, Huafeng Kuang, Hong Liu, Xianming Lin, Yongjian Wu, Rongrong Ji
Deep neural networks have been applied in many computer vision tasks and achieved state-of-the-art performance.
1 code implementation • 27 Mar 2023 • Xingbin Liu, Huafeng Kuang, Xianming Lin, Yongjian Wu, Rongrong Ji
By revisiting the previous methods, we find different adversarial training methods have distinct robustness for sample instances.
no code implementations • 20 Mar 2023 • Jiaer Xia, Lei Tan, Pingyang Dai, Mingbo Zhao, Yongjian Wu, Liujuan Cao
To address this issue, we propose a novel transformer-based Attention Disturbance and Dual-Path Constraint Network (ADP) to enhance the generalization of attention networks.
no code implementations • 3 Feb 2023 • Lei Tan, Pingyang Dai, Qixiang Ye, Mingliang Xu, Yongjian Wu, Rongrong Ji
Based on the observation and analysis of SA-Softmax, we modify the SA-Softmax with the Feature Mask and Absolute-Similarity Term to alleviate the ambiguous optimization during model training.
no code implementations • 2 Feb 2023 • Lei Tan, Yukang Zhang, ShengMei Shen, Yan Wang, Pingyang Dai, Xianming Lin, Yongjian Wu, Rongrong Ji
Cross-spectral person re-identification, which aims to associate identities to pedestrians across different spectra, faces a main challenge of the modality discrepancy.
no code implementations • 29 Jan 2023 • Qiong Wu, Jiahan Li, Pingyang Dai, Qixiang Ye, Liujuan Cao, Yongjian Wu, Rongrong Ji
The knowledge transfer between two networks is based on an asymmetric mutual learning manner.
1 code implementation • 9 Jan 2023 • Haowei Wang, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Xiaoshuai Sun
Extensive experiments on the PNG benchmark dataset demonstrate the effectiveness and efficiency of our method.
no code implementations • 21 Aug 2022 • Qiong Wu, Jiaer Xia, Pingyang Dai, Yiyi Zhou, Yongjian Wu, Rongrong Ji
Visible-infrared person re-identification (VI-ReID) is a task of matching the same individuals across the visible and infrared modalities.
2 code implementations • 19 Jul 2022 • Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu
Although person re-identification has achieved an impressive improvement in recent years, the common occlusion case caused by different obstacles is still an unsettled issue in real application scenarios.
1 code implementation • 14 Jun 2022 • Yuxin Zhang, Mingbao Lin, Zhihang Lin, Yiting Luo, Ke Li, Fei Chao, Yongjian Wu, Rongrong Ji
In this paper, we show that the N:M learning can be naturally characterized as a combinatorial problem which searches for the best combination candidate within a finite collection.
no code implementations • 4 Jun 2022 • Endai Huang, Axiu Mao, Junhui Hou, Yongjian Wu, Weitao Xu, Maria Camila Ceballos, Thomas D. Parsons, Kai Liu
Specifically, CClusnet-Inseg uses each pixel to predict object centers and trace these centers to form masks based on clustering results, which consists of a network for segmentation and center offset vector map, Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, Centers-to-Mask (C2M), and Remain-Centers-to-Mask (RC2M) algorithms.
1 code implementation • 16 Apr 2022 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yan Wang, Liujuan Cao, Yongjian Wu, Feiyue Huang, Rongrong Ji
Despite the exciting performance, Transformer is criticized for its excessive parameters and computation cost.
no code implementations • 13 Mar 2022 • Chengpeng Dai, Fuhai Chen, Xiaoshuai Sun, Rongrong Ji, Qixiang Ye, Yongjian Wu
Recently, automatic video captioning has attracted increasing attention, where the core challenge lies in capturing the key semantic items, like objects and actions as well as their spatial-temporal correlations from the redundant frames and semantic content.
1 code implementation • 8 Mar 2022 • Mengzhao Chen, Mingbao Lin, Ke Li, Yunhang Shen, Yongjian Wu, Fei Chao, Rongrong Ji
Our proposed CF-ViT is motivated by two important observations in modern ViT models: (1) The coarse-grained patch splitting can locate informative regions of an input image.
1 code implementation • 8 Mar 2022 • Yunshan Zhong, Mingbao Lin, Xunchao Li, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
However, these methods suffer from severe performance degradation when quantizing the SR models to ultra-low precision (e. g., 2-bit and 3-bit) with the low-cost layer-wise quantizer.
1 code implementation • 17 Oct 2021 • Gen Luo, Yiyi Zhou, Xiaoshuai Sun, Yongjian Wu, Yue Gao, Rongrong Ji
Based on the LaConv module, we further build the first fully language-driven convolution network, termed as LaConvNet, which can unify the visual recognition and multi-modal reasoning in one forward structure.
1 code implementation • 16 Sep 2021 • Yuexiao Ma, Taisong Jin, Xiawu Zheng, Yan Wang, Huixia Li, Yongjian Wu, Guannan Jiang, Wei zhang, Rongrong Ji
Instead of solving a problem of the original integer programming, we propose to optimize a proxy metric, the concept of network orthogonality, which is highly correlated with the loss of the integer programming but also easy to optimize with linear programming.
1 code implementation • 9 Sep 2021 • Yunshan Zhong, Mingbao Lin, Mengzhao Chen, Ke Li, Yunhang Shen, Fei Chao, Yongjian Wu, Rongrong Ji
While post-training quantization receives popularity mostly due to its evasion in accessing the original complete training dataset, its poor performance also stems from scarce images.
no code implementations • CVPR 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Feihong Lian, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji
To date, learning weakly supervised panoptic segmentation (WSPS) with only image-level labels remains unexplored.
1 code implementation • CVPR 2021 • Xuying Zhang, Xiaoshuai Sun, Yunpeng Luo, Jiayi Ji, Yiyi Zhou, Yongjian Wu, Feiyue Huang, Rongrong Ji
Then, we build a BERTbased language model to extract language context and propose Adaptive-Attention (AA) module on top of a transformer decoder to adaptively measure the contribution of visual and language cues before making decisions for word prediction.
1 code implementation • CVPR 2021 • Qiong Wu, Pingyang Dai, Jie Chen, Chia-Wen Lin, Yongjian Wu, Feiyue Huang, Bineng Zhong, Rongrong Ji
In this paper, we propose a joint Modality and Pattern Alignment Network (MPANet) to discover cross-modality nuances in different patterns for visible-infrared person Re-ID, which introduces a modality alleviation module and a pattern alignment module to jointly extract discriminative features.
1 code implementation • 18 Jun 2021 • YuHan Wang, Xu Chen, Junwei Zhu, Wenqing Chu, Ying Tai, Chengjie Wang, Jilin Li, Yongjian Wu, Feiyue Huang, Rongrong Ji
In this work, we propose a high fidelity face swapping method, called HifiFace, which can well preserve the face shape of the source face and generate photo-realistic results.
Ranked #7 on
Face Swapping
on FaceForensics++
no code implementations • NeurIPS 2021 • Gengchen Duan, Taisong Jin, Rongrong Ji, Ling Shao, Baochang Zhang, Feiyue Huang, Yongjian Wu
In this article, we propose a novel auxiliary learning induced graph convolutional network in a multi-task fashion.
no code implementations • NeurIPS 2021 • Yixu Wang, Jie Li, Hong Liu, Yan Wang, Yongjian Wu, Feiyue Huang, Rongrong Ji
We argue this is due to the lack of rich information in the probability prediction and the overfitting caused by hard labels.
1 code implementation • 24 Apr 2021 • Yuxin Zhang, Mingbao Lin, Chia-Wen Lin, Jie Chen, Feiyue Huang, Yongjian Wu, Yonghong Tian, Rongrong Ji
Specifically, to model the contribution of each channel to differentiating categories, we develop a class-wise mask for each channel, implemented in a dynamic training manner w. r. t.
1 code implementation • 26 Mar 2021 • Shaojie Li, Mingbao Lin, Yan Wang, Yongjian Wu, Yonghong Tian, Ling Shao, Rongrong Ji
Besides, a self-distillation module is adopted to convert the feature map of deeper layers into a shallower one.
1 code implementation • CVPR 2021 • Xinyang Li, Shengchuan Zhang, Jie Hu, Liujuan Cao, Xiaopeng Hong, Xudong Mao, Feiyue Huang, Yongjian Wu, Rongrong Ji
Recently, image-to-image translation has made significant progress in achieving both multi-label (\ie, translation conditioned on different labels) and multi-style (\ie, generation with diverse styles) tasks.
Disentanglement
Multimodal Unsupervised Image-To-Image Translation
+1
1 code implementation • 20 Jan 2021 • Mingbao Lin, Rongrong Ji, Shaojie Li, Yan Wang, Yongjian Wu, Feiyue Huang, Qixiang Ye
Inspired by the face recognition community, we use a message passing algorithm Affinity Propagation on the weight matrices to obtain an adaptive number of exemplars, which then act as the preserved filters.
1 code implementation • 16 Jan 2021 • Yunpeng Luo, Jiayi Ji, Xiaoshuai Sun, Liujuan Cao, Yongjian Wu, Feiyue Huang, Chia-Wen Lin, Rongrong Ji
Descriptive region features extracted by object detection networks have played an important role in the recent advancements of image captioning.
1 code implementation • ICCV 2021 • Jie Li, Rongrong Ji, Peixian Chen, Baochang Zhang, Xiaopeng Hong, Ruixin Zhang, Shaoxin Li, Jilin Li, Feiyue Huang, Yongjian Wu
A common practice is to start from a large perturbation and then iteratively reduce it with a deterministic direction and a random one while keeping it adversarial.
no code implementations • ICCV 2021 • Yunhang Shen, Liujuan Cao, Zhiwei Chen, Baochang Zhang, Chi Su, Yongjian Wu, Feiyue Huang, Rongrong Ji
Weakly supervised instance segmentation (WSIS) with only image-level labels has recently drawn much attention.
1 code implementation • 13 Dec 2020 • Jiayi Ji, Yunpeng Luo, Xiaoshuai Sun, Fuhai Chen, Gen Luo, Yongjian Wu, Yue Gao, Rongrong Ji
The latter contains a Global Adaptive Controller that can adaptively fuse the global information into the decoder to guide the caption generation.
1 code implementation • NeurIPS 2020 • Yunhang Shen, Rongrong Ji, Zhiwei Chen, Yongjian Wu, Feiyue Huang
In this paper, we propose a unified WSOD framework, termed UWSOD, to develop a high-capacity general detection model with only image-level labels, which is self-contained and does not require external modules or additional supervision.
no code implementations • 20 Oct 2020 • Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, Rui Lan, Xianbin Ouyang, Yan Zhang, Jieqian Wei, Jing Gong, Weiliang Lin, Ping Gao, Peng Meng, Xiaomin Xu, Chenyang Guo, Bo Yang, Zhibo Chen, Yongjian Wu, Xiaowen Chu
Distributed training techniques have been widely deployed in large-scale deep neural networks (DNNs) training on dense-GPU clusters.
2 code implementations • NeurIPS 2020 • Mingbao Lin, Rongrong Ji, Zihan Xu, Baochang Zhang, Yan Wang, Yongjian Wu, Feiyue Huang, Chia-Wen Lin
In this paper, for the first time, we explore the influence of angular bias on the quantization error and then introduce a Rotated Binary Neural Network (RBNN), which considers the angle alignment between the full-precision weight vector and its binarized version.
1 code implementation • 23 Jan 2020 • Mingbao Lin, Rongrong Ji, Yuxin Zhang, Baochang Zhang, Yongjian Wu, Yonghong Tian
In this paper, we propose a new channel pruning method based on artificial bee colony algorithm (ABC), dubbed as ABCPruner, which aims to efficiently find optimal pruned structure, i. e., channel number in each layer, rather than selecting "important" channels as previous works did.
no code implementations • NeurIPS 2019 • Fuhai Chen, Rongrong Ji, Jiayi Ji, Xiaoshuai Sun, Baochang Zhang, Xuri Ge, Yongjian Wu, Feiyue Huang, Yan Wang
To model these two inherent diversities in image captioning, we propose a Variational Structured Semantic Inferring model (termed VSSI-cap) executed in a novel structured encoder-inferer-decoder schema.
no code implementations • 6 Aug 2019 • Rongrong Ji, Ke Li, Yan Wang, Xiaoshuai Sun, Feng Guo, Xiaowei Guo, Yongjian Wu, Feiyue Huang, Jiebo Luo
In this paper, we address the problem of monocular depth estimation when only a limited number of training image-depth pairs are available.
no code implementations • ECCV 2020 • Yuchao Li, Rongrong Ji, Shaohui Lin, Baochang Zhang, Chenqian Yan, Yongjian Wu, Feiyue Huang, Ling Shao
More specifically, we introduce a novel architecture controlling module in each layer to encode the network architecture by a vector.
1 code implementation • 28 May 2019 • Xiawu Zheng, Chenyi Yang, Shaokun Zhang, Yan Wang, Baochang Zhang, Yongjian Wu, Yunsheng Wu, Ling Shao, Rongrong Ji
With the proposed efficient network generation method, we directly obtain the optimal neural architectures on given constraints, which is practical for on-device models across diverse search spaces and constraints.
1 code implementation • 29 Jan 2019 • Mingbao Lin, Rongrong Ji, Hong Liu, Xiaoshuai Sun, Yongjian Wu, Yunsheng Wu
In this paper, we propose a novel supervised online hashing method, termed Balanced Similarity for Online Discrete Hashing (BSODH), to solve the above problems in a unified framework.
1 code implementation • CVPR 2019 • Yuchao Li, Shaohui Lin, Baochang Zhang, Jianzhuang Liu, David Doermann, Yongjian Wu, Feiyue Huang, Rongrong Ji
The relationship between the input feature maps and 2D kernels is revealed in a theoretical framework, based on which a kernel sparsity and entropy (KSE) indicator is proposed to quantitate the feature map importance in a feature-agnostic manner to guide model compression.
no code implementations • 19 Oct 2018 • Wen Wang, Yongjian Wu, Haijun Liu, Shiguang Wang, Jian Cheng
Temporal action detection aims at not only recognizing action category but also detecting start time and end time for each action instance in an untrimmed video.
no code implementations • CVPR 2018 • Fuhai Chen, Rongrong Ji, Xiaoshuai Sun, Yongjian Wu, Jinsong Su
In offline optimization, we adopt an end-to-end formulation, which jointly trains the visual tree parser, the structured relevance and diversity constraints, as well as the LSTM based captioning model.
no code implementations • CVPR 2017 • Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang, Baochang Zhang
In this paper, we propose a hashing scheme, termed Fusion Similarity Hashing (FSH), which explicitly embeds the graph-based fusion similarity across modalities into a common Hamming space.
1 code implementation • ACL 2017 • Deng Cai, Hai Zhao, Zhisong Zhang, Yuan Xin, Yongjian Wu, Feiyue Huang
Neural models with minimal feature engineering have achieved competitive performance against traditional methods for the task of Chinese word segmentation.
no code implementations • 19 Nov 2016 • Hong Liu, Rongrong Ji, Yongjian Wu, Feiyue Huang
By given a large-scale training data set, it is very expensive to embed such ranking tuples in binary code learning.