1 code implementation • ECCV 2020 • Ziqiang Zheng, Yang Wu, Xinran Han, Jianbo Shi
We present ForkGAN, a task-agnostic image translation method that can boost multiple vision tasks in adverse weather conditions.
no code implementations • Findings (ACL) 2022 • Pengwei Zhan, Yang Wu, Shaolei Zhou, Yunjian Zhang, Liming Wang
We show that the pathological inconsistency is caused by representation collapse: the representations of sentences from which tokens of different saliency have been removed collapse together, so important words cannot be distinguished from unimportant ones by the change in model confidence.
no code implementations • COLING 2022 • Pengwei Zhan, Chao Zheng, Jing Yang, Yuxiang Wang, Liming Wang, Yang Wu, Yunjian Zhang
Previous works on word-level attacks widely use word importance ranking (WIR) methods and complex search methods, including greedy search and heuristic algorithms, to find optimal substitutions.
no code implementations • 29 Jul 2024 • Yang Wu, Kaihua Zhang, Jianjun Qian, Jin Xie, Jian Yang
The complex traffic environment and various weather conditions make the collection of LiDAR data expensive and challenging.
no code implementations • 12 Jun 2024 • Hao Yang, Yanyan Zhao, Yang Wu, Shilong Wang, Tian Zheng, Hongbo Zhang, Zongyang Ma, Wanxiang Che, Bing Qin
Compared to traditional sentiment analysis, which only considers text, multimodal sentiment analysis needs to consider emotional signals from multimodal sources simultaneously and is therefore more consistent with the way humans process sentiment in real-world scenarios.
no code implementations • 5 Jun 2024 • Yang Wu, Chenghao Wang, Ece Gumusel, Xiaozhong Liu
The integration of generative Large Language Models (LLMs) into various applications, including the legal domain, has been accelerated by their expansive and versatile nature.
1 code implementation • 5 Jun 2024 • Tao Yang, Yingmin Luo, Zhongang Qi, Yang Wu, Ying Shan, Chang Wen Chen
Layout generation is the keystone in achieving automated graphic design, requiring arranging the position and size of various multi-modal design elements in a visually pleasing and constraint-following manner.
no code implementations • 29 May 2024 • Xuehao Gao, Yang Yang, Yang Wu, Shaoyi Du, Guo-Jun Qi
Inferring 3D human motion is fundamental in many applications, including understanding human activity and analyzing one's intention.
1 code implementation • 21 May 2024 • Libo Qin, Qiguang Chen, Xiachong Feng, Yang Wu, Yongheng Zhang, Yinghui Li, Min Li, Wanxiang Che, Philip S. Yu
While large language models (LLMs) like ChatGPT have shown impressive capabilities in Natural Language Processing (NLP) tasks, a systematic investigation of their potential in this field remains largely unexplored.
no code implementations • 30 Apr 2024 • Jiabao Wang, Yang Wu, Jun Wang, Ni Chen
The multi-plane phase retrieval method provides a budget-friendly and effective way to perform phase imaging, yet it often encounters alignment challenges due to shifts along the optical axis in experiments.
1 code implementation • 26 Apr 2024 • Yang Wu, Yao Wan, Hongyu Zhang, Yulei Sui, Wucai Wei, Wei Zhao, Guandong Xu, Hai Jin
In particular, we first explore ways of transforming structured tabular data into sequential text prompts so that they can be fed into LLMs, and analyze which table content contributes most to NL2Vis.
1 code implementation • 24 Apr 2024 • Zhaoyang Chu, Yao Wan, Qian Li, Yang Wu, Hongyu Zhang, Yulei Sui, Guandong Xu, Hai Jin
We argue that these factual reasoning-based explanations cannot answer critical what-if questions: What would happen to the GNN's decision if we were to alter the code graph into alternative structures?
no code implementations • 9 Mar 2024 • Xiuzhe Wu, Xiaoyang Lyu, Qihao Huang, Yong Liu, Yang Wu, Ying Shan, Xiaojuan Qi
Our system contains a depth estimation module to predict depth, and a new decomposed object-wise 3D motion (DO3D) estimation module to predict ego-motion and 3D object motion.
no code implementations • 19 Jan 2024 • Rabah Ouali, Jean-Yves Dieulot, Pascal Yim, Xavier Guillaud, Frédéric Colas, Yang Wu, Heng Wu
To ensure the proper functioning of the current and future electrical grid, it is necessary for Transmission System Operators (TSOs) to verify that energy providers comply with the grid code and specifications provided by TSOs.
1 code implementation • 4 Jan 2024 • Xuehao Gao, Yang Yang, Zhenyu Xie, Shaoyi Du, Zhongqian Sun, Yang Wu
The whole text-driven human motion synthesis problem is then divided into multiple abstraction levels and solved with a multi-stage generation framework with a cascaded latent diffusion model: an initial generator first generates the coarsest human motion guess from a given text description; then, a series of successive generators gradually enrich the motion details based on the textual description and the previous synthesized results.
Ranked #7 on Motion Synthesis on KIT Motion-Language
no code implementations • 18 Dec 2023 • Zhenyu Xie, Yang Wu, Xuehao Gao, Zhongqian Sun, Wei Yang, Xiaodan Liang
In addition, we introduce a multi-denoiser framework for the advanced diffusion model to ease the learning of the high-dimensional model and fully explore the generative potential of the diffusion model.
no code implementations • 28 Nov 2023 • Runruo Yang, Yang Wu, Jie Huang, Cheng-Xiang Wang
Integrated sensing and communication (ISAC) has attracted wide attention as an emerging application scenario for the sixth generation (6G) wireless communication system.
1 code implementation • 25 Oct 2023 • Yang Wu, Shilong Wang, Hao Yang, Tian Zheng, Hongbo Zhang, Yanyan Zhao, Bing Qin
In this paper, we evaluate different abilities of GPT-4V including visual understanding, language understanding, visual puzzle solving, and understanding of other modalities such as depth, thermal, video, and audio.
1 code implementation • 16 Oct 2023 • Yang Wu, Shenglong Hu, Huihui Song, Kaihua Zhang, Bo Liu, Dong Liu
To simultaneously consider the uncertainty introduced by irrelevant images and the consensus features of the remaining relevant images in the group, we designed a latent variable generator branch and CoSOD transformer branch.
1 code implementation • ICCV 2023 • Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, Yan-Pei Cao, Ying Shan, Wenming Yang, Zhongqian Sun, Xiaojuan Qi
Therefore, directly learning a mapping function from speech to the entire head image is prone to ambiguity, particularly when using a short video for training.
1 code implementation • 6 Sep 2023 • Yang Wu, Xurui Li, Xuhong Zhang, Yangyang Kang, Changlong Sun, Xiaozhong Liu
Positive-Unlabeled (PU) learning addresses binary classification problems in which abundant unlabeled data is available alongside a small number of positive instances; it can be applied to problems such as chronic disease screening.
no code implementations • 26 Jul 2023 • Xumei Xi, Yuke Zhao, Quan Liu, Liwen Ouyang, Yang Wu
To this end, we train a farsighted recommender by using an offline RL algorithm with the policy network in our model architecture that has been initialized from a pre-trained transformer model.
no code implementations • 7 Jul 2023 • Wangbo Yu, Yanbo Fan, Yong Zhang, Xuan Wang, Fei Yin, Yunpeng Bai, Yan-Pei Cao, Ying Shan, Yang Wu, Zhongqian Sun, Baoyuan Wu
In this work, we propose a one-shot 3D facial avatar reconstruction framework that only requires a single source image to reconstruct a high-fidelity 3D facial avatar.
no code implementations • 8 May 2023 • Yang Wu, Yanyan Zhao, Zhongyang Li, Bing Qin, Kai Xiong
Instruction tuning has been shown to be able to improve cross-task generalization of language models.
no code implementations • 6 May 2023 • Yang Wu, Zhibin Liu, Hefeng Wu, Liang Lin
In this paper, we study video synthesis with emphasis on simplifying the generation conditions.
no code implementations • CVPR 2023 • Xuehao Gao, Shaoyi Du, Yang Wu, Yang Yang
Encouraged by the effectiveness of encoding temporal dynamics within the frequency domain, recent human motion prediction systems prefer to first convert the motion representation from the original pose space into the frequency space.
no code implementations • ICCV 2023 • Yang Wu, Zhiwei Ge, Yuhao Luo, Lin Liu, Sulong Xu
Experiments show that our method outperforms existing methods on the face and person datasets, achieving state-of-the-art results.
no code implementations • CVPR 2023 • Yang Wu, Huihui Song, Bo Liu, Kaihua Zhang, Dong Liu
To address this issue, this paper presents a group exchange-masking (GEM) strategy for robust CoSOD model learning.
1 code implementation • IEEE Transactions on Multimedia 2020 • Fan Yang, Yang Wu, Zheng Wang, Xiang Li, Sakriani Sakti, Satoshi Nakamura
Therefore, previous works pre-train their models on rich-labeled photo retrieval data (i.e., source domain) and then fine-tune them on the limited-labeled sketch-to-photo retrieval data (i.e., target domain).
Ranked #1 on Image Retrieval on PKU-Reid
1 code implementation • 8 Nov 2022 • Yang Wu, Jing Yang, Xiaojun Zhou, Liming Wang, Zhen Xu
Automatically detecting rumors on social media has become a challenging task.
no code implementations • 7 Oct 2022 • Renjie Zhang, Yu Fang, Huaxin Song, Fangbin Wan, Yanwei Fu, Hirokazu Kato, Yang Wu
Cloth-changing person re-identification (Re-ID) can work under more complicated scenarios with higher security than normal Re-ID and biometric techniques and is therefore extremely valuable in applications.
no code implementations • 20 Sep 2022 • Yang Wu, Pai Peng, Zhenyu Zhang, Yanyan Zhao, Bing Qin
At the low level, we propose progressive tri-modal attention, which models the tri-modal feature interactions with a two-pass strategy and further leverages those interactions to significantly reduce computation and memory complexity by shortening the input token length.
1 code implementation • 23 Aug 2022 • Shu Tang, Yang Wu, Hongxing Qin, Xianzhong Xie, Shuli Yang, Jing Wang
Most existing deep-learning-based single image dynamic scene blind deblurring (SIDSBD) methods design deep networks to directly remove spatially-variant motion blur from a single motion-blurred input image, without blur kernel estimation.
no code implementations • 22 Aug 2022 • Yang Wu, Yinghua Wang, Jie Huang, Cheng-Xiang Wang, Chen Huang
Due to indoor non-line-of-sight (NLoS) propagation and multi-access interference (MAI), it is a great challenge to achieve centimeter-level positioning accuracy in indoor scenarios.
no code implementations • 28 Jun 2022 • Hao Yang, Yanyan Zhao, Jianwei Liu, Yang Wu, Bing Qin
In this paper, we propose a new dataset, the Multimodal Aspect-Category Sentiment Analysis (MACSA) dataset, which contains more than 21K text-image pairs.
2 code implementations • CVPR 2022 • Ye Liu, Siyuan Li, Yang Wu, Chang Wen Chen, Ying Shan, XiaoHu Qie
Finding relevant moments and highlights in videos according to natural language queries is a natural and highly valuable common need in the current video content explosion era.
Ranked #4 on Highlight Detection on YouTube Highlights
1 code implementation • Findings (ACL) 2022 • Yang Wu, Yanyan Zhao, Hao Yang, Song Chen, Bing Qin, Xiaohuan Cao, Wenting Zhao
Through further analysis of the ASR outputs, we find that in some cases the sentiment words, the key sentiment elements in the textual modality, are recognized as other words, which makes the sentiment of the text change and hurts the performance of multimodal sentiment models directly.
no code implementations • 20 Dec 2021 • Bin Zhang, Yang Wu, Xiaojing Zhang, Ming Ma
In current salient object detection networks, the most popular approach is to use a U-shaped structure.
no code implementations • 24 Nov 2021 • Hao Ren, Ziqiang Zheng, Yang Wu, Hong Lu, Yang Yang, Ying Shan, Sai-Kit Yeung
The huge domain gap between sketches and photos and the highly abstract sketch representations pose challenges for sketch-based image retrieval (SBIR).
no code implementations • 16 Oct 2021 • Yang Wu, Shirui Feng, Guanbin Li, Liang Lin
PEMR includes a "looking ahead" process, i.e., a visual feature extractor module that estimates feasible paths for gathering 3D navigational information, mimicking the human sense of direction.
no code implementations • 30 Aug 2021 • Yang Wu, Dingheng Wang, Xiaotong Lu, Fan Yang, Guoqi Li, Weisheng Dong, Jianbo Shi
Visual recognition is currently one of the most important and active research areas in computer vision, pattern recognition, and even the general field of artificial intelligence.
3 code implementations • 27 Jul 2021 • Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu
Therefore, we propose a novel count interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting error contributions equal for all intervals to minimize the prediction risk.
3 code implementations • 27 Jul 2021 • Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yang Wu
In this paper, we propose a purely point-based framework for joint crowd counting and individual localization.
Ranked #6 on Crowd Counting on ShanghaiTech A
no code implementations • 24 May 2021 • Jinlong Peng, Zhengkai Jiang, Yueyang Gu, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Weiyao Lin
In addition, we add a localization branch to predict the localization accuracy, so that it can work as the replacement of the regression assistance link during inference.
1 code implementation • ICCV 2021 • Changan Wang, Qingyu Song, Boshen Zhang, Yabiao Wang, Ying Tai, Xuyi Hu, Chengjie Wang, Jilin Li, Jiayi Ma, Yang Wu
Therefore, we propose a novel count interval partition criterion called Uniform Error Partition (UEP), which always keeps the expected counting error contributions equal for all intervals to minimize the prediction risk.
1 code implementation • ICCV 2021 • Qingyu Song, Changan Wang, Zhengkai Jiang, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yang Wu
In this paper, we propose a purely point-based framework for joint crowd counting and individual localization.
no code implementations • 31 Dec 2020 • Haoran Ji, Yanzhao Liu, He Wang, Jiawei Luo, Jiaheng Li, Hao Li, Yang Wu, Yong Xu, Jian Wang
An essential ingredient to realize these quantum states is the magnetic gap in the topological surface states induced by the out-of-plane ferromagnetism on the surface of MnBi2Te4.
Materials Science
no code implementations • COLING 2020 • Xin Lu, Yanyan Zhao, Yang Wu, Yijian Tian, Huipeng Chen, Bing Qin
We noticed that the gold emotion labels of the context utterances can provide explicit and accurate emotion interaction, but it is impossible to input gold labels at inference time.
Ranked #44 on Emotion Recognition in Conversation on IEMOCAP
no code implementations • 17 Aug 2020 • Yuzheng Xu, Yang Wu, Nur Sabrina binti Zuraimi, Shohei Nobuhara, Ko Nishino
Video analysis has been moving towards more detailed interpretation (e.g., segmentation) with encouraging progress.
1 code implementation • ECCV 2020 • Jinlong Peng, Changan Wang, Fangbin Wan, Yang Wu, Yabiao Wang, Ying Tai, Chengjie Wang, Jilin Li, Feiyue Huang, Yanwei Fu
Existing Multiple-Object Tracking (MOT) methods either follow the tracking-by-detection paradigm to conduct object detection, feature extraction and data association separately, or have two of the three subtasks integrated to form a partially end-to-end solution.
no code implementations • 7 Jul 2020 • Fan Yang, Xin Chang, Chenyu Dang, Ziqiang Zheng, Sakriani Sakti, Satoshi Nakamura, Yang Wu
We aim to improve the performance of Multiple Object Tracking and Segmentation (MOTS) by refinement.
Ranked #1 on Multi-Object Tracking on MOTS20
no code implementations • 9 Mar 2020 • Fangbin Wan, Yang Wu, Xuelin Qian, Yixiong Chen, Yanwei Fu
We find that changing clothes makes ReID a much harder problem: it complicates learning effective representations and challenges the ability of previous ReID models to generalize to persons wearing unseen (new) clothes.
no code implementations • 30 Jan 2020 • Jimuyang Zhang, Sanping Zhou, Xin Chang, Fangbin Wan, Jinjun Wang, Yang Wu, Dong Huang
Most Multiple Object Tracking (MOT) approaches compute individual target features for two subtasks: estimating target-wise motions and conducting pair-wise Re-Identification (Re-ID).
no code implementations • 8 Dec 2019 • Dingheng Wang, Guangshe Zhao, Guoqi Li, Lei Deng, Yang Wu
However, due to the higher dimension of convolutional kernels, the space complexity of 3DCNNs is generally larger than that of traditional two dimensional convolutional neural networks (2DCNNs).
Ranked #1 on Quantization on Knowledge-based:
1 code implementation • 24 Nov 2019 • Fan Yang, Feiran Li, Yang Wu, Sakriani Sakti, Satoshi Nakamura
3D panoramic multi-person localization and tracking are prominent in many applications; however, conventional methods using LiDAR equipment can be economically expensive and computationally inefficient due to the processing of point cloud data.
Ranked #1 on Multi-Object Tracking on MOT15_3D (using extra training data)
no code implementations • 31 Oct 2019 • Yang Wu, Pengxu Wei, Liang Lin
To solve this problem, we derive a second-order Wasserstein gradient flow of the global relative entropy from Fokker-Planck equation.
3 code implementations • arXiv 2019 • Fan Yang, Sakriani Sakti, Yang Wu, Satoshi Nakamura
Although skeleton-based action recognition has achieved great success in recent years, most of the existing methods may suffer from a large model size and slow execution speed.
no code implementations • CVPR 2020 • Yujiang Wang, Mingzhi Dong, Jie Shen, Yang Wu, Shiyang Cheng, Maja Pantic
To the best of our knowledge, this is the first work to use reinforcement learning for online key-frame decision in dynamic video segmentation, and also the first to apply it to face videos.
no code implementations • 24 May 2019 • Zheng Wang, Zhixiang Wang, Yinqiang Zheng, Yang Wu, Wen-Jun Zeng, Shin'ichi Satoh
An efficient and effective person re-identification (ReID) system relieves the users from painful and boring video watching and accelerates the process of video analysis.
1 code implementation • 16 May 2019 • Ziqiang Zheng, Yang Wu, Zhibin Yu, Yang Yang, Haiyong Zheng, Takeo Kanade
We present tailored models of the proposed ReshapeGAN for all the problem settings and test them on 8 kinds of reshaping tasks with 13 different datasets, demonstrating the ability of ReshapeGAN to generate convincing and superior results for object reshaping.
no code implementations • 24 Jan 2019 • Ziqiang Zheng, Zhibin Yu, Haiyong Zheng, Yang Wu, Bing Zheng, Ping Lin
Current approaches have made great progress on image-to-image translation tasks, benefiting from the success of image synthesis methods, especially generative adversarial networks (GANs).
no code implementations • 4 Dec 2018 • Xu Cai, Yang Wu, Guanbin Li, Ziliang Chen, Liang Lin
FRAME (Filters, Random fields, And Maximum Entropy) is an energy-based descriptive model that synthesizes visual realism by capturing mutual patterns from structural input signals.
no code implementations • 2 Jul 2018 • Tianshui Chen, Liang Lin, Riquan Chen, Yang Wu, Xiaonan Luo
Humans can naturally understand an image in depth with the aid of rich knowledge accumulated from daily lives or professions.
no code implementations • ICLR 2018 • Kunkun Pang, Mingzhi Dong, Yang Wu, Timothy Hospedales
In contrast to this body of research, we propose to treat active learning algorithm design as a meta-learning problem and learn the best criterion from data.
no code implementations • 9 Feb 2018 • Mingzhi Dong, Xiaochen Yang, Yang Wu, Jing-Hao Xue
In this paper, we propose the Lipschitz margin ratio and a new metric learning framework for classification through maximizing the ratio.
1 code implementation • CVPR 2018 • Shanxin Yuan, Guillermo Garcia-Hernando, Bjorn Stenger, Gyeongsik Moon, Ju Yong Chang, Kyoung Mu Lee, Pavlo Molchanov, Jan Kautz, Sina Honari, Liuhao Ge, Junsong Yuan, Xinghao Chen, Guijin Wang, Fan Yang, Kai Akiyama, Yang Wu, Qingfu Wan, Meysam Madadi, Sergio Escalera, Shile Li, Dongheui Lee, Iason Oikonomidis, Antonis Argyros, Tae-Kyun Kim
Official Torch7 implementation of "V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map", CVPR 2018
Ranked #5 on Hand Pose Estimation on HANDS 2017
2 code implementations • ECCV 2018 • Xuelin Qian, Yanwei Fu, Tao Xiang, Wenxuan Wang, Jie Qiu, Yang Wu, Yu-Gang Jiang, Xiangyang Xue
Person Re-identification (re-id) faces two major challenges: the lack of cross-view paired training data and learning discriminative identity-sensitive and view-invariant features in the presence of large pose variations.
no code implementations • 15 Sep 2017 • Yanyun Qu, Li Lin, Fumin Shen, Chang Lu, Yang Wu, Yuan Xie, DaCheng Tao
We propose a novel image classification method based on learning hierarchical inter-class structures.
no code implementations • CVPR 2015 • Yuanliu Liu, Zejian Yuan, Nanning Zheng, Yang Wu
Specular reflection generally decreases the saturation of surface colors, which will be possibly confused with other colors that have the same hue but lower saturation.
no code implementations • 6 Mar 2014 • Yang Wu, Vansteenberge Jarich, Masayuki Mukunoki, Michihiko Minoh
Sparse representation based classification (SRC) has been proved to be a simple, effective and robust solution to face recognition.