Search Results for author: Chun Yuan

Found 94 papers, 40 papers with code

Structural Supervision for Word Alignment and Machine Translation

no code implementations Findings (ACL) 2022 Lei LI, Kai Fan, Hongjia Li, Chun Yuan

Syntactic structure has long been argued to be potentially useful for enforcing accurate word alignment and improving generalization performance of machine translation.

Decoder Machine Translation +3

MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

no code implementations 20 May 2024 Jiayue Liu, Xiao Tang, Freeman Cheng, Roy Yang, Zhihao LI, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan

To tackle this problem, we present MirrorGaussian, the first method for mirror scene reconstruction with real-time rendering based on 3D Gaussian Splatting.

Novel View Synthesis

FREE: Faster and Better Data-Free Meta-Learning

no code implementations 2 May 2024 Yongxian Wei, Zixuan Hu, Zhenyi Wang, Li Shen, Chun Yuan, DaCheng Tao

Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data, presenting practical benefits in contexts constrained by data privacy concerns.


Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution

1 code implementation 16 Apr 2024 Yutao Yuan, Chun Yuan

However, existing diffusion-based super-resolution methods have high time consumption with the use of iterative sampling, while the quality and consistency of generated images are less than ideal due to problems like color shifting.

Image Super-Resolution

Distilling Semantic Priors from SAM to Efficient Image Restoration Models

no code implementations 25 Mar 2024 Quan Zhang, Xiaoyu Liu, Wei Li, Hanting Chen, Junchao Liu, Jie Hu, Zhiwei Xiong, Chun Yuan, Yunhe Wang

SPD leverages a self-distillation manner to distill the fused semantic priors to boost the performance of original IR models.

Deblurring Denoising +2

GVGEN: Text-to-3D Generation with Volumetric Representation

no code implementations 19 Mar 2024 Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He

To simplify the generation of GaussianVolume and empower the model to generate instances with detailed 3D geometry, we propose a coarse-to-fine pipeline.

3D Generation 3D Reconstruction +1

CricaVPR: Cross-image Correlation-aware Representation Learning for Visual Place Recognition

1 code implementation 29 Feb 2024 Feng Lu, Xiangyuan Lan, Lijun Zhang, Dongmei Jiang, YaoWei Wang, Chun Yuan

Over the past decade, most methods in visual place recognition (VPR) have used neural networks to produce feature representations.

Representation Learning Visual Place Recognition

Deep Homography Estimation for Visual Place Recognition

1 code implementation 25 Feb 2024 Feng Lu, Shuting Dong, Lijun Zhang, Bingxi Liu, Xiangyuan Lan, Dongmei Jiang, Chun Yuan

Moreover, we design a re-projection error of inliers loss to train the DHE network without additional homography labels, which can also be jointly trained with the backbone network to help it extract the features that are more suitable for local matching.

Homography Estimation Re-Ranking +1
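
To make the term "re-projection error" in the entry above concrete, here is a minimal sketch of how such an error can be measured for a set of matched points under a given homography. It is only an illustration of the general quantity, not the loss used in the paper; the function names `apply_homography` and `mean_reprojection_error` and the plain-list matrix layout are assumptions made for this example.

```python
import math


def apply_homography(H, point):
    """Map a 2D point through a 3x3 homography H (list of three rows)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the homogeneous coordinate to return to the image plane.
    return (u / w, v / w)


def mean_reprojection_error(H, matches):
    """Average Euclidean distance between H applied to src and the matched dst.

    `matches` is a list of (src, dst) point pairs, assumed here to be inliers.
    """
    errors = [math.dist(apply_homography(H, src), dst) for src, dst in matches]
    return sum(errors) / len(errors)


# Hypothetical example: a pure translation by (2, 3).
H_shift = [[1.0, 0.0, 2.0],
           [0.0, 1.0, 3.0],
           [0.0, 0.0, 1.0]]
pairs = [((0.0, 0.0), (2.0, 3.0)),   # mapped exactly, error 0
         ((1.0, 1.0), (3.0, 5.0))]   # mapped to (3, 4), error 1
print(mean_reprojection_error(H_shift, pairs))
```

A lower value means the homography explains the matched points better, which is why a quantity of this kind can act as a training signal without separate homography labels.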

Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition

1 code implementation 22 Feb 2024 Feng Lu, Lijun Zhang, Xiangyuan Lan, Shuting Dong, YaoWei Wang, Chun Yuan

Experimental results show that our method outperforms the state-of-the-art methods with less training data and training time, and uses only about 3% of the retrieval runtime of two-stage VPR methods with RANSAC-based spatial verification.

Re-Ranking Visual Place Recognition

Supervised Fine-tuning in turn Improves Visual Foundation Models

1 code implementation 18 Jan 2024 Xiaohu Jiang, Yixiao Ge, Yuying Ge, Dachuan Shi, Chun Yuan, Ying Shan

Image-text training like CLIP has dominated the pretraining of vision foundation models in recent years.

Solving Continual Offline Reinforcement Learning with Decision Transformer

no code implementations 16 Jan 2024 Kaixin Huang, Li Shen, Chen Zhao, Chun Yuan, DaCheng Tao

We aim to investigate whether Decision Transformer (DT), another offline RL paradigm, can serve as a more suitable offline continual learner to address these issues.

Offline RL reinforcement-learning +1

ChartBench: A Benchmark for Complex Visual Reasoning in Charts

no code implementations 26 Dec 2023 Zhengzhuo Xu, Sinan Du, Yiyan Qi, Chengjin Xu, Chun Yuan, Jian Guo

Multimodal Large Language Models (MLLMs) demonstrate impressive image understanding and generation capabilities.

Visual Reasoning

A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting

1 code implementation 6 Dec 2023 Junhao Zhuang, Yanhong Zeng, Wenran Liu, Chun Yuan, Kai Chen

This enables PowerPaint to accomplish various inpainting tasks by utilizing different task prompts, resulting in state-of-the-art performance.

Image Inpainting Object

Task-Distributionally Robust Data-Free Meta-Learning

no code implementations 23 Nov 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Yongxian Wei, Baoyuan Wu, Chun Yuan, DaCheng Tao

TDS leads to a biased meta-learner because of the skewed task distribution towards newly generated tasks.

Meta-Learning Model Selection

Mean Teacher DETR with Masked Feature Alignment: A Robust Domain Adaptive Detection Transformer Framework

no code implementations 24 Oct 2023 Weixi Weng, Chun Yuan

Most importantly, we propose masked feature alignment methods including Masked Domain Query-based Feature Alignment (MDQFA) and Masked Token-wise Feature Alignment (MTWFA) to alleviate domain shift in a more robust way, which not only prevent training stagnation and lead to a robust pretrained model in the pretraining stage, but also enhance the model's target performance in the self-training stage.

object-detection Object Detection +3

Effective Whole-body Pose Estimation with Two-stages Distillation

1 code implementation 29 Jul 2023 Zhendong Yang, Ailing Zeng, Chun Yuan, Yu Li

Unlike previous self-knowledge distillation, this stage finetunes the student's head using only 20% of the training time, as a plug-and-play training strategy.

Ranked #1 on 2D Human Pose Estimation on COCO-WholeBody (using extra training data)

2D Human Pose Estimation Pose Estimation +1

DreamDiffusion: Generating High-Quality Images from Brain EEG Signals

1 code implementation 29 Jun 2023 Yunpeng Bai, Xintao Wang, Yan-Pei Cao, Yixiao Ge, Chun Yuan, Ying Shan

This paper introduces DreamDiffusion, a novel method for generating high-quality images directly from brain electroencephalogram (EEG) signals, without the need to translate thoughts into text.

EEG Image Generation

MA-NeRF: Motion-Assisted Neural Radiance Fields for Face Synthesis from Sparse Images

no code implementations 17 Jun 2023 Weichen Zhang, Xiang Zhou, Yukang Cao, Wensen Feng, Chun Yuan

We improve on NeRF and propose a novel framework that, by leveraging parametric 3DMM models, can reconstruct a high-fidelity drivable face avatar and successfully handle unseen expressions.

Face Generation Novel View Synthesis

Neural Machine Translation with Dynamic Graph Convolutional Decoder

no code implementations 28 May 2023 Lei LI, Kai Fan, Lingyu Yang, Hongjia Li, Chun Yuan

Existing wisdom demonstrates the significance of syntactic knowledge for the improvement of neural machine translation models.

Decoder Machine Translation +1

Learning to Learn from APIs: Black-Box Data-Free Meta-Learning

1 code implementation 28 May 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Baoyuan Wu, Chun Yuan, DaCheng Tao

Data-free meta-learning (DFML) aims to enable efficient learning of new tasks by meta-learning from a collection of pre-trained models without access to the training data.

Few-Shot Learning Knowledge Distillation

CrossGET: Cross-Guided Ensemble of Tokens for Accelerating Vision-Language Transformers

1 code implementation 27 May 2023 Dachuan Shi, Chaofan Tao, Anyi Rao, Zhendong Yang, Chun Yuan, Jiaqi Wang

Although extensively studied for unimodal models, the acceleration for multimodal models, especially the vision-language Transformers, is relatively under-explored.

Image Captioning Image Retrieval +5

Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation

1 code implementation 16 May 2023 Yuxin Ren, Zihan Zhong, Xingjian Shi, Yi Zhu, Chun Yuan, Mu Li

It has been commonly observed that a teacher model with superior performance does not necessarily result in a stronger student, highlighting a discrepancy between current teacher training practices and effective knowledge transfer.

Knowledge Distillation text-classification +2

Towards Effective Collaborative Learning in Long-Tailed Recognition

no code implementations 5 May 2023 Zhengzhuo Xu, Zenghao Chai, Chengyin Xu, Chun Yuan, Haiqin Yang

In this paper, we observe that the knowledge transfer between experts is imbalanced in terms of class distribution, which results in limited performance improvement of the minority classes.

Transfer Learning

Why is the winner the best?

no code implementations CVPR 2023 Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Sharib Ali, Vincent Andrearczyk, Marc Aubreville, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Veronika Cheplygina, Marie Daum, Marleen de Bruijne, Adrien Depeursinge, Reuben Dorent, Jan Egger, David G. Ellis, Sandy Engelhardt, Melanie Ganz, Noha Ghatwary, Gabriel Girard, Patrick Godau, Anubha Gupta, Lasse Hansen, Kanako Harada, Mattias Heinrich, Nicholas Heller, Alessa Hering, Arnaud Huaulmé, Pierre Jannin, Ali Emre Kavur, Oldřich Kodym, Michal Kozubek, Jianning Li, Hongwei Li, Jun Ma, Carlos Martín-Isla, Bjoern Menze, Alison Noble, Valentin Oreiller, Nicolas Padoy, Sarthak Pati, Kelly Payette, Tim Rädsch, Jonathan Rafael-Patiño, Vivek Singh Bawa, Stefanie Speidel, Carole H. Sudre, Kimberlin Van Wijnen, Martin Wagner, Donglai Wei, Amine Yamlahi, Moi Hoon Yap, Chun Yuan, Maximilian Zenk, Aneeq Zia, David Zimmerer, Dogu Baran Aydogan, Binod Bhattarai, Louise Bloch, Raphael Brüngel, Jihoon Cho, Chanyeol Choi, Qi Dou, Ivan Ezhov, Christoph M. Friedrich, Clifton Fuller, Rebati Raman Gaire, Adrian Galdran, Álvaro García Faura, Maria Grammatikopoulou, SeulGi Hong, Mostafa Jahanifar, Ikbeom Jang, Abdolrahim Kadkhodamohammadi, Inha Kang, Florian Kofler, Satoshi Kondo, Hugo Kuijf, Mingxing Li, Minh Huan Luu, Tomaž Martinčič, Pedro Morais, Mohamed A. Naser, Bruno Oliveira, David Owen, Subeen Pang, Jinah Park, Sung-Hong Park, Szymon Płotka, Elodie Puybareau, Nasir Rajpoot, Kanghyun Ryu, Numan Saeed, Adam Shephard, Pengcheng Shi, Dejan Štepec, Ronast Subedi, Guillaume Tochon, Helena R. Torres, Helene Urien, João L. Vilaça, Kareem Abdul Wahid, Haojie Wang, Jiacheng Wang, Liansheng Wang, Xiyue Wang, Benedikt Wiestler, Marek Wodzinski, Fangfang Xia, Juanying Xie, Zhiwei Xiong, Sen yang, Yanwu Yang, Zixuan Zhao, Klaus Maier-Hein, Paul F. Jäger, Annette Kopp-Schneider, Lena Maier-Hein

The "typical" lead of a winning team is a computer scientist with a doctoral degree, five years of experience in biomedical image analysis, and four years of experience in deep learning.

Benchmarking Multi-Task Learning

From Knowledge Distillation to Self-Knowledge Distillation: A Unified Approach with Normalized Loss and Customized Soft Labels

1 code implementation ICCV 2023 Zhendong Yang, Ailing Zeng, Zhe Li, Tianke Zhang, Chun Yuan, Yu Li

We decompose the KD loss and find that its non-target loss forces the student's non-target logits to match the teacher's, but the sums of the two sets of non-target logits differ, preventing them from being identical.

Self-Knowledge Distillation

Make Encoder Great Again in 3D GAN Inversion through Geometry and Occlusion-Aware Encoding

no code implementations ICCV 2023 Ziyang Yuan, Yiming Zhu, Yu Li, Hongyu Liu, Chun Yuan

We leverage the inherent properties of EG3D's latent space to design a discriminator and a background depth regularization.

Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning

1 code implementation CVPR 2023 Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, DaCheng Tao

The goal of data-free meta-learning is to learn useful prior knowledge from a collection of pre-trained models without accessing their training data.


Rethink Long-tailed Recognition with Vision Transformers

no code implementations 28 Feb 2023 Zhengzhuo Xu, Shuo Yang, Xingjun Wang, Chun Yuan

Hence, we propose to adopt unsupervised learning to utilize long-tailed data.

TextIR: A Simple Framework for Text-based Editable Image Restoration

no code implementations 28 Feb 2023 Yunpeng Bai, Cairong Wang, Shuzhao Xie, Chao Dong, Chun Yuan, Zhi Wang

We use the text-image feature compatibility of CLIP to alleviate the difficulty of fusing text and image features.

Colorization Image Colorization +3

UPop: Unified and Progressive Pruning for Compressing Vision-Language Transformers

1 code implementation 31 Jan 2023 Dachuan Shi, Chaofan Tao, Ying Jin, Zhendong Yang, Chun Yuan, Jiaqi Wang

Real-world data contains a vast amount of multimodal information, among which vision and language are the two most representative modalities.

Image Captioning Image Classification +7

ITstyler: Image-optimized Text-based Style Transfer

no code implementations 26 Jan 2023 Yunpeng Bai, Jiayue Liu, Chao Dong, Chun Yuan

Text-based style transfer is a newly emerging research topic that uses text information instead of a style image to guide the transfer process, significantly extending the application scenarios of style transfer.

Style Transfer

Towards Arbitrary Text-driven Image Manipulation via Space Alignment

no code implementations 25 Jan 2023 Yunpeng Bai, Zihan Zhong, Chao Dong, Weichen Zhang, Guowei Xu, Chun Yuan

Then, the text input can be mapped directly into the StyleGAN space and used to find the semantic shift according to the text description.

Attribute Image Manipulation

Accurate 3D Face Reconstruction with Facial Component Tokens

no code implementations ICCV 2023 Tianke Zhang, Xuangeng Chu, Yunfei Liu, Lijian Lin, Zhendong Yang, Zhengzhuo Xu, Chengkun Cao, Fei Yu, Changyin Zhou, Chun Yuan, Yu Li

However, the current deep learning-based methods face significant challenges in achieving accurate reconstruction with disentangled facial parameters and ensuring temporal stability in single-frame methods for 3D face tracking on video data.

3D Face Reconstruction

Truncate-Split-Contrast: A Framework for Learning from Mislabeled Videos

no code implementations 27 Dec 2022 Zixiao Wang, Junwu Weng, Chun Yuan, Jue Wang

Thanks to Noise Contrastive Learning, the average classification accuracy improvement on Mini-Kinetics and Sth-Sth-V1 is over 1.6%.

Contrastive Learning Video Classification

Biomedical image analysis competitions: The state of current participation practice

no code implementations 16 Dec 2022 Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Vivek Singh Bawa, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Jinwook Choi, Olivier Commowick, Marie Daum, Adrien Depeursinge, Reuben Dorent, Jan Egger, Hannah Eichhorn, Sandy Engelhardt, Melanie Ganz, Gabriel Girard, Lasse Hansen, Mattias Heinrich, Nicholas Heller, Alessa Hering, Arnaud Huaulmé, Hyunjeong Kim, Bennett Landman, Hongwei Bran Li, Jianning Li, Jun Ma, Anne Martel, Carlos Martín-Isla, Bjoern Menze, Chinedu Innocent Nwoye, Valentin Oreiller, Nicolas Padoy, Sarthak Pati, Kelly Payette, Carole Sudre, Kimberlin Van Wijnen, Armine Vardazaryan, Tom Vercauteren, Martin Wagner, Chuanbo Wang, Moi Hoon Yap, Zeyun Yu, Chun Yuan, Maximilian Zenk, Aneeq Zia, David Zimmerer, Rina Bao, Chanyeol Choi, Andrew Cohen, Oleh Dzyubachyk, Adrian Galdran, Tianyuan Gan, Tianqi Guo, Pradyumna Gupta, Mahmood Haithami, Edward Ho, Ikbeom Jang, Zhili Li, Zhengbo Luo, Filip Lux, Sokratis Makrogiannis, Dominik Müller, Young-tack Oh, Subeen Pang, Constantin Pape, Gorkem Polat, Charlotte Rosalie Reed, Kanghyun Ryu, Tim Scherr, Vajira Thambawita, Haoyu Wang, Xinliang Wang, Kele Xu, Hung Yeh, Doyeob Yeo, Yixuan Yuan, Yan Zeng, Xin Zhao, Julian Abbing, Jannes Adam, Nagesh Adluru, Niklas Agethen, Salman Ahmed, Yasmina Al Khalil, Mireia Alenyà, Esa Alhoniemi, Chengyang An, Talha Anwar, Tewodros Weldebirhan Arega, Netanell Avisdris, Dogu Baran Aydogan, Yingbin Bai, Maria Baldeon Calisto, Berke Doga Basaran, Marcel Beetz, Cheng Bian, Hao Bian, Kevin Blansit, Louise Bloch, Robert Bohnsack, Sara Bosticardo, Jack Breen, Mikael Brudfors, Raphael Brüngel, Mariano Cabezas, Alberto Cacciola, Zhiwei Chen, Yucong Chen, Daniel Tianming Chen, Minjeong Cho, Min-Kook Choi, Chuantao Xie, Dana Cobzas, Julien Cohen-Adad, Jorge Corral Acero, Sujit Kumar Das, Marcela de Oliveira, Hanqiu Deng, Guiming Dong, Lars Doorenbos, Cory Efird, Sergio Escalera, Di Fan, Mehdi Fatan Serj, Alexandre Fenneteau, Lucas Fidon, Patryk Filipiak, René Finzel, Nuno R. Freitas, Christoph M. Friedrich, Mitchell Fulton, Finn Gaida, Francesco Galati, Christoforos Galazis, Chang Hee Gan, Zheyao Gao, Shengbo Gao, Matej Gazda, Beerend Gerats, Neil Getty, Adam Gibicar, Ryan Gifford, Sajan Gohil, Maria Grammatikopoulou, Daniel Grzech, Orhun Güley, Timo Günnemann, Chunxu Guo, Sylvain Guy, Heonjin Ha, Luyi Han, Il Song Han, Ali Hatamizadeh, Tian He, Jimin Heo, Sebastian Hitziger, SeulGi Hong, Seungbum Hong, Rian Huang, Ziyan Huang, Markus Huellebrand, Stephan Huschauer, Mustaffa Hussain, Tomoo Inubushi, Ece Isik Polat, Mojtaba Jafaritadi, SeongHun Jeong, Bailiang Jian, Yuanhong Jiang, Zhifan Jiang, Yueming Jin, Smriti Joshi, Abdolrahim Kadkhodamohammadi, Reda Abdellah Kamraoui, Inha Kang, Junghwa Kang, Davood Karimi, April Khademi, Muhammad Irfan Khan, Suleiman A. Khan, Rishab Khantwal, Kwang-Ju Kim, Timothy Kline, Satoshi Kondo, Elina Kontio, Adrian Krenzer, Artem Kroviakov, Hugo Kuijf, Satyadwyoom Kumar, Francesco La Rosa, Abhi Lad, Doohee Lee, Minho Lee, Chiara Lena, Hao Li, Ling Li, Xingyu Li, Fuyuan Liao, Kuanlun Liao, Arlindo Limede Oliveira, Chaonan Lin, Shan Lin, Akis Linardos, Marius George Linguraru, Han Liu, Tao Liu, Di Liu, Yanling Liu, João Lourenço-Silva, Jingpei Lu, Jiangshan Lu, Imanol Luengo, Christina B. Lund, Huan Minh Luu, Yi Lv, Uzay Macar, Leon Maechler, Sina Mansour L., Kenji Marshall, Moona Mazher, Richard McKinley, Alfonso Medela, Felix Meissen, Mingyuan Meng, Dylan Miller, Seyed Hossein Mirjahanmardi, Arnab Mishra, Samir Mitha, Hassan Mohy-ud-Din, Tony Chi Wing Mok, Gowtham Krishnan Murugesan, Enamundram Naga Karthik, Sahil Nalawade, Jakub Nalepa, Mohamed Naser, Ramin Nateghi, Hammad Naveed, Quang-Minh Nguyen, Cuong Nguyen Quoc, Brennan Nichyporuk, Bruno Oliveira, David Owen, Jimut Bahan Pal, Junwen Pan, Wentao Pan, Winnie Pang, Bogyu Park, Vivek Pawar, Kamlesh Pawar, Michael Peven, Lena Philipp, Tomasz Pieciak, Szymon Plotka, Marcel Plutat, Fattaneh Pourakpour, Domen Preložnik, Kumaradevan Punithakumar, Abdul Qayyum, Sandro Queirós, Arman Rahmim, Salar Razavi, Jintao Ren, Mina Rezaei, Jonathan Adam Rico, ZunHyan Rieu, Markus Rink, Johannes Roth, Yusely Ruiz-Gonzalez, Numan Saeed, Anindo Saha, Mostafa Salem, Ricardo Sanchez-Matilla, Kurt Schilling, Wei Shao, Zhiqiang Shen, Ruize Shi, Pengcheng Shi, Daniel Sobotka, Théodore Soulier, Bella Specktor Fadida, Danail Stoyanov, Timothy Sum Hon Mun, Xiaowu Sun, Rong Tao, Franz Thaler, Antoine Théberge, Felix Thielke, Helena Torres, Kareem A. Wahid, Jiacheng Wang, Yifei Wang, Wei Wang, Xiong Wang, Jianhui Wen, Ning Wen, Marek Wodzinski, Ye Wu, Fangfang Xia, Tianqi Xiang, Chen Xiaofei, Lizhan Xu, Tingting Xue, Yuxuan Yang, Lin Yang, Kai Yao, Huifeng Yao, Amirsaeed Yazdani, Michael Yip, Hwanseung Yoo, Fereshteh Yousefirizi, Shunkai Yu, Lei Yu, Jonathan Zamora, Ramy Ashraf Zeineldin, Dewen Zeng, Jianpeng Zhang, Bokai Zhang, Jiapeng Zhang, Fan Zhang, Huahong Zhang, Zhongchen Zhao, Zixuan Zhao, Jiachen Zhao, Can Zhao, Qingshuo Zheng, Yuheng Zhi, Ziqi Zhou, Baosheng Zou, Klaus Maier-Hein, Paul F. Jäger, Annette Kopp-Schneider, Lena Maier-Hein

Of these, 84% were based on standard architectures.


Learning Imbalanced Data with Vision Transformers

1 code implementation CVPR 2023 Zhengzhuo Xu, Ruikang Liu, Shuo Yang, Zenghao Chai, Chun Yuan

In this paper, we systematically investigate the ViTs' performance in LTR and propose LiVT to train ViTs from scratch only with LT data.

Long-tail Learning

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks

2 code implementations CVPR 2023 Hao Li, Jinguo Zhu, Xiaohu Jiang, Xizhou Zhu, Hongsheng Li, Chun Yuan, Xiaohua Wang, Yu Qiao, Xiaogang Wang, Wenhai Wang, Jifeng Dai

In this paper, we propose Uni-Perceiver v2, which is the first generalist model capable of handling major large-scale vision and vision-language tasks with competitive performance.

Decoder Language Modelling +1

One Model to Edit Them All: Free-Form Text-Driven Image Manipulation with Semantic Modulations

1 code implementation 14 Oct 2022 Yiming Zhu, Hongyu Liu, Yibing Song, Ziyang Yuan, Xintong Han, Chun Yuan, Qifeng Chen, Jue Wang

Based on the visual latent space of StyleGAN [21] and the text embedding space of CLIP [34], studies focus on how to map these two latent spaces for text-driven attribute manipulations.

Attribute Image Manipulation

Darwinian Model Upgrades: Model Evolving with Selective Compatibility

no code implementations 13 Oct 2022 Binjie Zhang, Shupeng Su, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Mike Zheng Shou, Ying Shan

The traditional model upgrading paradigm for retrieval requires recomputing all gallery embeddings before deploying the new model (dubbed "backfilling"), which is expensive and time-consuming given the billions of instances in industrial applications.

Face Recognition Retrieval

Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration

no code implementations 11 Oct 2022 Manyi Zhang, Yuxin Ren, ZiHao Wang, Chun Yuan

In this paper, to address the distribution shift in learning with instance-dependent label noise, a dynamic distribution-calibration strategy is adopted.

Dimensionality Reduction

Super-Resolution by Predicting Offsets: An Ultra-Efficient Super-Resolution Network for Rasterized Images

no code implementations 9 Oct 2022 Jinjin Gu, Haoming Cai, Chenyu Dong, Ruofan Zhang, Yulun Zhang, Wenming Yang, Chun Yuan

We finally use a guided fusion operation to integrate the sharp edges generated by the network and flat areas by the interpolation method to get the final SR image.

Quantization Super-Resolution

DCE: Offline Reinforcement Learning With Double Conservative Estimates

no code implementations 27 Sep 2022 Chen Zhao, Kai Xing Huang, Chun Yuan

Previous conservative estimation methods usually struggle to avoid the impact of OOD actions on Q-value estimates.

Computational Efficiency D4RL +2

Rethinking Knowledge Distillation via Cross-Entropy

1 code implementation 22 Aug 2022 Zhendong Yang, Zhe Li, Yuan Gong, Tianke Zhang, Shanshan Lao, Chun Yuan, Yu Li

Furthermore, we smooth students' target output to treat it as the soft target for training without teachers and propose a teacher-free new KD loss (tf-NKD).

Knowledge Distillation

HyP$^2$ Loss: Beyond Hypersphere Metric Space for Multi-label Image Retrieval

1 code implementation 14 Aug 2022 Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Chun Yuan, Yanbo Fan, Jue Wang

Image retrieval has become an increasingly appealing technique with broad multimedia application prospects, where deep hashing serves as the dominant branch towards low storage and efficient retrieval.

Deep Hashing Metric Learning +1

Improving the Latent Space of Image Style Transfer

no code implementations 24 May 2022 Yunpeng Bai, Cairong Wang, Chun Yuan, Yanbo Fan, Jue Wang

The content contrastive loss enables the encoder to retain more available details.

Style Transfer

Masked Generative Distillation

3 code implementations 3 May 2022 Zhendong Yang, Zhe Li, Mingqi Shao, Dachuan Shi, Zehuan Yuan, Chun Yuan

The current distillation algorithm usually improves students' performance by imitating the output of the teacher.

Image Classification Instance Segmentation +5
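
As background for the "imitating the output of the teacher" remark in the entry above, the following is a minimal sketch of the classic temperature-scaled distillation loss (KL divergence between softened teacher and student distributions). It illustrates the conventional baseline only and is not the method proposed in that paper; the function names and the default temperature of 4.0 are assumptions chosen for this example.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s)
             for t, s in zip(p_teacher, p_student) if t > 0.0)
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl


# Hypothetical logits for a three-class problem.
teacher = [2.0, 1.0, 0.1]
print(distillation_loss(teacher, teacher))          # identical outputs
print(distillation_loss([0.1, 1.0, 2.0], teacher))  # mismatched outputs
```

The loss vanishes when the student reproduces the teacher's logits and grows as the two softened distributions diverge.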

Privacy-Preserving Model Upgrades with Bidirectional Compatible Training in Image Retrieval

1 code implementation 29 Apr 2022 Shupeng Su, Binjie Zhang, Yixiao Ge, Xuyuan Xu, Yexin Wang, Chun Yuan, Ying Shan

The task of privacy-preserving model upgrades in image retrieval aims to reap the benefits of rapidly evolving new models without accessing the raw gallery images.

Image Retrieval Privacy Preserving +1

REALY: Rethinking the Evaluation of 3D Face Reconstruction

1 code implementation 18 Mar 2022 Zenghao Chai, Haoxian Zhang, Jing Ren, Di Kang, Zhengzhuo Xu, Xuefei Zhe, Chun Yuan, Linchao Bao

The evaluation of 3D face reconstruction results typically relies on a rigid shape alignment between the estimated 3D model and the ground-truth scan.

3D Face Reconstruction

Towards Universal Backward-Compatible Representation Learning

2 code implementations 3 Mar 2022 Binjie Zhang, Yixiao Ge, Yantao Shen, Shupeng Su, Fanzi Wu, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan

The task of backward-compatible representation learning is therefore introduced to support backfill-free model upgrades, where the new query features are interoperable with the old gallery features.

Face Recognition Representation Learning

Hot-Refresh Model Upgrades with Regression-Alleviating Compatible Training in Image Retrieval

1 code implementation 24 Jan 2022 Binjie Zhang, Yixiao Ge, Yantao Shen, Yu Li, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan

In contrast, hot-refresh model upgrades deploy the new model immediately and then gradually improve the retrieval accuracy by backfilling the gallery on-the-fly.

Image Retrieval regression +1

Semantic-Sparse Colorization Network for Deep Exemplar-based Colorization

1 code implementation 2 Dec 2021 Yunpeng Bai, Chao Dong, Zenghao Chai, Andong Wang, Zhengzhuo Xu, Chun Yuan

To address these two problems, we propose Semantic-Sparse Colorization Network (SSCN) to transfer both the global image style and detailed semantic-related colors to the gray-scale image in a coarse-to-fine manner.


StrokeNet: Stroke Assisted and Hierarchical Graph Reasoning Networks

no code implementations 23 Nov 2021 Lei LI, Kai Fan, Chun Yuan

Scene text detection is still a challenging task, as there may be extremely small or low-resolution strokes, and close or arbitrary-shaped texts.

Node Classification Relational Reasoning +2

Focal and Global Knowledge Distillation for Detectors

1 code implementation CVPR 2022 Zhendong Yang, Zhe Li, Xiaohu Jiang, Yuan Gong, Zehuan Yuan, Danpei Zhao, Chun Yuan

Global distillation rebuilds the relation between different pixels and transfers it from teachers to students, compensating for missing global information in focal distillation.

Image Classification Knowledge Distillation +2

Towards Calibrated Model for Long-Tailed Visual Recognition from Prior Perspective

1 code implementation NeurIPS 2021 Zhengzhuo Xu, Zenghao Chai, Chun Yuan

Real-world data universally confronts a severe class-imbalance problem and exhibits a long-tailed distribution, i.e., most labels are associated with limited instances.

Data Augmentation Long-tail Learning

MoDeRNN: Towards Fine-grained Motion Details for Spatiotemporal Predictive Learning

1 code implementation 25 Oct 2021 Zenghao Chai, Zhengzhuo Xu, Chun Yuan

We carefully design Detail Context Block (DCB) to extract fine-grained details and improve the isolated correlation between upper context state and current input state.

Bootstrapped Hindsight Experience replay with Counterintuitive Prioritization

no code implementations 29 Sep 2021 Jiawei Xu, Shuxing Li, Chun Yuan, Zhengyou Zhang, Lei Han

In this paper, inspired by Bootstrapped DQN, we use multiple heads in DDPG and take advantage of the diversity and uncertainty among multiple heads to improve the data efficiency with relabeled goals.


Superior Performance with Diversified Strategic Control in FPS Games Using General Reinforcement Learning

no code implementations 29 Sep 2021 Shuxing Li, Jiawei Xu, Chun Yuan, Peng Sun, Zhuobin Zheng, Zhengyou Zhang, Lei Han

We provide comprehensive analysis and experiments to elaborate on the effect of each component on agent performance, and demonstrate that the proposed and adopted techniques are important for achieving superior performance in general end-to-end FPS games.

FPS Games General Reinforcement Learning +2

Hot-Refresh Model Upgrades with Regression-Free Compatible Training in Image Retrieval

no code implementations ICLR 2022 Binjie Zhang, Yixiao Ge, Yantao Shen, Yu Li, Chun Yuan, Xuyuan Xu, Yexin Wang, Ying Shan

In contrast, hot-refresh model upgrades deploy the new model immediately and then gradually improve the retrieval accuracy by backfilling the gallery on-the-fly.

Image Retrieval regression +1

BCDR: Betweenness Centrality-based Distance Resampling for Graph Shortest Distance Embedding

no code implementations 29 Sep 2021 Haoyu Wang, Chun Yuan

Second, we perform Distance Resampling (DR) from the original walk paths before maximum likelihood optimization, instead of PMI-based optimization, and prove that this strategy preserves the distance relation with respect to any calibrated node by steering the optimization objective to reconstruct a global distance matrix.

Graph Representation Learning

Deep Open Snake Tracker for Vessel Tracing

no code implementations 19 Jul 2021 Li Chen, Wenjin Liu, Niranjan Balu, Mahmud Mossa-Basha, Thomas S. Hatsukami, Jenq-Neng Hwang, Chun Yuan

Vessel tracing by modeling vascular structures in 3D medical images with centerlines and radii can provide useful information for vascular health.

CMS-LSTM: Context Embedding and Multi-Scale Spatiotemporal Expression LSTM for Predictive Learning

1 code implementation 6 Feb 2021 Zenghao Chai, Zhengzhuo Xu, Yunpeng Bai, Zhihui Lin, Chun Yuan

To tackle the increasing ambiguity during forecasting, we design CMS-LSTM to focus on context correlations and multi-scale spatiotemporal flow with fine-grained local details, containing two elaborately designed blocks: Context Embedding (CE) and Spatiotemporal Expression (SE) blocks.

Video Prediction

Bridge the Gap: High-level Semantic Planning for Image Captioning

no code implementations COLING 2020 Chenxi Yuan, Yang Bai, Chun Yuan

To bridge the gaps, we propose a high-level semantic planning (HSP) mechanism that incorporates both semantic reconstruction and explicit order planning.

Image Captioning Vocal Bursts Intensity Prediction

Reducing the Annotation Effort for Video Object Segmentation Datasets

no code implementations 2 Nov 2020 Paul Voigtlaender, Lishu Luo, Chun Yuan, Yong Jiang, Bastian Leibe

We use a deep convolutional network to automatically create pseudo-labels on a pixel level from much cheaper bounding box annotations and investigate how far such pseudo-labels can carry us for training state-of-the-art VOS approaches.

Object Semantic Segmentation +2

A Simple Yet Effective Method for Video Temporal Grounding with Cross-Modality Attention

no code implementations 23 Sep 2020 Binjie Zhang, Yu Li, Chun Yuan, Dejing Xu, Pin Jiang, Ying Shan

The task of language-guided video temporal grounding is to localize the particular video clip corresponding to a query sentence in an untrimmed video.


HOSE-Net: Higher Order Structure Embedded Network for Scene Graph Generation

no code implementations 12 Aug 2020 Meng Wei, Chun Yuan, Xiaoyu Yue, Kuo Zhong

Second, since learning too many context-specific classification subspaces can suffer from data sparsity issues, we propose a hierarchical semantic aggregation (HSA) module to reduce the number of subspaces by introducing higher order structural information.

General Classification Graph Generation +5

Automated Intracranial Artery Labeling using a Graph Neural Network and Hierarchical Refinement

1 code implementation 11 Jul 2020 Li Chen, Thomas Hatsukami, Jenq-Neng Hwang, Chun Yuan

Automatically labeling intracranial arteries (ICA) with their anatomical names is beneficial for feature extraction and detailed analysis of intracranial vascular structures.

Self-Attention ConvLSTM for Spatiotemporal Prediction

2 code implementations AAAI 2020 Zhihui Lin, Maomao Li, Zhuobin Zheng, Yangyang Cheng, Chun Yuan

To extract spatial features with both global and local dependencies, we introduce the self-attention mechanism into ConvLSTM.

Video Prediction

Using an ensemble color space model to tackle adversarial examples

no code implementations 10 Mar 2020 Shreyank N Gowda, Chun Yuan

Minute pixel changes in an image drastically change the prediction that the deep learning model makes.

Adversarial Attack Autonomous Driving

Multi-Frame Content Integration with a Spatio-Temporal Attention Mechanism for Person Video Motion Transfer

no code implementations 12 Aug 2019 Kun Cheng, Hao-Zhi Huang, Chun Yuan, Lingyiqing Zhou, Wei Liu

Specifically, we transfer the motion of one person in a target video to another person in a source video, while preserving the appearance of the source person.

Video Generation

Fast Registration for cross-source point clouds by using weak regional affinity and pixel-wise refinement

no code implementations 11 Mar 2019 Xiaoshui Huang, Lixin Fan, Qiang Wu, Jian Zhang, Chun Yuan

Accurate and fast registration of cross-source 3D point clouds from different sensors is an emerging research problem in computer vision.

Point Cloud Registration

ColorNet: Investigating the importance of color spaces for image classification

1 code implementation 1 Feb 2019 Shreyank N Gowda, Chun Yuan

These color images are taken as input in the form of RGB images and classification is done without modifying them.

Classification General Classification +1

Efficient Multi-level Correlating for Visual Tracking

no code implementations 13 Oct 2018 Yipeng Ma, Chun Yuan, Peng Gao, Fei Wang

Correlation filter (CF) based tracking algorithms have demonstrated favorable performance recently.

Visual Tracking

FPGA-based Acceleration System for Visual Tracking

no code implementations 12 Oct 2018 Ke Song, Chun Yuan, Peng Gao, Yunxu Sun

To improve the tracking speed and reduce the overall power consumption of visual tracking, this paper proposes a real-time visual tracking algorithm based on the DSST (Discriminative Scale Space Tracking) approach.

Real-Time Visual Tracking

Y-net: 3D intracranial artery segmentation using a convolutional autoencoder

no code implementations 19 Dec 2017 Li Chen, Yanjun Xie, Jie Sun, Niranjan Balu, Mahmud Mossa-Basha, Kristi Pimentel, Thomas S. Hatsukami, Jenq-Neng Hwang, Chun Yuan

Automated segmentation of intracranial arteries on magnetic resonance angiography (MRA) allows for quantification of cerebrovascular features, which provides tools for understanding aging and pathophysiological adaptations of the cerebrovascular system.

Binary Classification General Classification +1

A coarse-to-fine algorithm for registration in 3D street-view cross-source point clouds

no code implementations 24 Oct 2016 Xiaoshui Huang, Jian Zhang, Qiang Wu, Lixin Fan, Chun Yuan

In this paper, unlike previous ICP-based methods, we take a statistical view and propose an effective coarse-to-fine algorithm to detect and register a small-scale SFM point cloud in a large-scale Lidar point cloud.

Learning Boltzmann Machine with EM-like Method

no code implementations 7 Sep 2016 Jinmeng Song, Chun Yuan

We also propose a new measure to assess the performance of Boltzmann machines as generative models of data; its computational complexity is O(Rmn).
