Search Results for author: Hao Li

Found 289 papers, 116 papers with code

Sketch Input Method Editor: A Comprehensive Dataset and Methodology for Systematic Input Recognition

no code implementations 30 Nov 2023 Guangming Zhu, Siyuan Wang, Qing Cheng, Kelong Wu, Hao Li, Liang Zhang

With the recent surge in the use of touchscreen devices, free-hand sketching has emerged as a promising modality for human-computer interaction.

Class Incremental Learning Domain Adaptation +2

Novel OCT mosaicking pipeline with Feature- and Pixel-based registration

no code implementations 21 Nov 2023 Jiacheng Wang, Hao Li, Dewei Hu, Yuankai K. Tao, Ipek Oguz

High-resolution Optical Coherence Tomography (OCT) images are crucial for ophthalmology studies but are limited by their relatively narrow field of view (FoV).

SpectralGPT: Spectral Foundation Model

no code implementations 13 Nov 2023 Danfeng Hong, Bing Zhang, Xuyang Li, YuXuan Li, Chenyu Li, Jing Yao, Naoto Yokoya, Hao Li, Pedram Ghamisi, Xiuping Jia, Antonio Plaza, Gamba Paolo, Jon Atli Benediktsson, Jocelyn Chanussot

The foundation model has recently garnered significant attention due to its potential to revolutionize the field of visual representation learning in a self-supervised manner.

Change Detection Representation Learning +3

InfMLLM: A Unified Framework for Visual-Language Tasks

1 code implementation 12 Nov 2023 Qiang Zhou, Zhibin Wang, Wei Chu, Yinghui Xu, Hao Li, Yuan Qi

Our experiments demonstrate that preserving the positional information of visual embeddings through the pool-adapter is particularly beneficial for tasks like visual grounding.

Image Captioning Instruction Following +3

Machine Learning Parameterization of the Multi-scale Kain-Fritsch (MSKF) Convection Scheme

no code implementations 7 Nov 2023 Xiaohui Zhong, Xing Yu, Hao Li

The Weather Research and Forecast (WRF) model is used to generate training and testing data over South China at a horizontal resolution of 5 km.

Promise: Prompt-driven 3D Medical Image Segmentation Using Pretrained Image Foundation Models

1 code implementation 30 Oct 2023 Hao Li, Han Liu, Dewei Hu, Jiacheng Wang, Ipek Oguz

To address prevalent issues in medical imaging, such as data acquisition challenges and label availability, transfer learning from natural to medical image domains serves as a viable strategy to produce reliable segmentation results.

Image Segmentation Medical Image Segmentation +4

FuXi-Extreme: Improving extreme rainfall and wind forecasts with diffusion model

no code implementations 25 Oct 2023 Xiaohui Zhong, Lei Chen, Jun Liu, Chensen Lin, Yuan Qi, Hao Li

State-of-the-art ML-based weather forecast models, such as FuXi, have demonstrated superior statistical forecast performance in comparison to the high-resolution forecasts (HRES) of the European Centre for Medium-Range Weather Forecasts (ECMWF).

Denoising Weather Forecasting

Unpaired MRI Super Resolution with Self-Supervised Contrastive Learning

no code implementations 24 Oct 2023 Hao Li, Quanwei Liu, Jianan Liu, Xiling Liu, Yanni Dong, Tao Huang, Zhihan Lv

High-resolution (HR) magnetic resonance imaging (MRI) is crucial for enhancing diagnostic accuracy in clinical settings.

Contrastive Learning Image Super-Resolution

On Generative Agents in Recommendation

1 code implementation 16 Oct 2023 An Zhang, Leheng Sheng, Yuxin Chen, Hao Li, Yang Deng, Xiang Wang, Tat-Seng Chua

Recommender systems are the cornerstone of today's information dissemination, yet a disconnect between offline metrics and online performance greatly hinders their development.

Collaborative Filtering Movie Recommendation +1

High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models

no code implementations 27 Sep 2023 Chunyu Qiang, Hao Li, Yixin Tian, Yi Zhao, Ying Zhang, Longbiao Wang, Jianwu Dang

To address these issues, we propose a minimally-supervised high-fidelity speech synthesis method, where all modules are constructed based on the diffusion models.

Speech Synthesis Voice Cloning

Cross-City Matters: A Multimodal Remote Sensing Benchmark Dataset for Cross-City Semantic Segmentation using High-Resolution Domain Adaptation Networks

no code implementations 26 Sep 2023 Danfeng Hong, Bing Zhang, Hao Li, YuXuan Li, Jing Yao, Chenyu Li, Martin Werner, Jocelyn Chanussot, Alexander Zipf, Xiao Xiang Zhu

Artificial intelligence (AI) approaches nowadays have gained remarkable success in single-modality-dominated remote sensing (RS) applications, especially with an emphasis on individual urban environments (e.g., single cities or regions).

Domain Adaptation Segmentation +1

DOMAIN: MilDly COnservative Model-BAsed OfflINe Reinforcement Learning

no code implementations 16 Sep 2023 Xiao-Yin Liu, Xiao-Hu Zhou, Xiao-Liang Xie, Shi-Qi Liu, Zhen-Qiu Feng, Hao Li, Mei-Jiang Gui, Tian-Yu Xiang, De-Xing Huang, Zeng-Guang Hou

However, uncertainty estimation is unreliable and leads to poor performance in certain scenarios, and the previous methods ignore differences between the model data, which brings great conservatism.

D4RL Model-based Reinforcement Learning +3

Implicit Neural Representation for MRI Parallel Imaging Reconstruction

no code implementations 12 Sep 2023 Hao Li, Yusheng Zhou, Jianan Liu, Xiling Liu, Tao Huang, Zhihan Lv

In this paper, we propose a novel MRI PI reconstruction method based on INR, which represents the reconstructed fully-sampled images as the function of voxel coordinates and prior feature vectors of undersampled images to overcome the generalization problem of INR.

MRI Reconstruction

Research on Damage Analysis of Key Parts of UAV Flight Control System

no code implementations 7 Sep 2023 Tianshun Li, Huaimin Chen, Ben Xiao, Hao Li, Shiyu Hao, Di Hai, Xuetong Wang

A set of hardware in the loop simulation methods based on the UAV model is proposed to create fault data, which is used to judge the parts where faults happen.

Learning Speech Representation From Contrastive Token-Acoustic Pretraining

no code implementations 1 Sep 2023 Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

However, existing contrastive learning methods in the audio field focus on extracting global descriptive information for downstream audio classification tasks, making them unsuitable for TTS, VC, and ASR tasks.

Audio Classification Automatic Speech Recognition +5

Towards Privacy-Supporting Fall Detection via Deep Unsupervised RGB2Depth Adaptation

1 code implementation 23 Aug 2023 Hejun Xiao, Kunyu Peng, Xiangsheng Huang, Alina Roitberg, Hao Li, Zhaohui Wang, Rainer Stiefelhagen

In this paper, we introduce a privacy-supporting solution that makes the RGB-trained model applicable in depth domain and utilizes depth data at test time for fall detection.

Domain Adaptation

False Negative/Positive Control for SAM on Noisy Medical Images

1 code implementation 20 Aug 2023 Xing Yao, Han Liu, Dewei Hu, Daiwei Lu, Ange Lou, Hao Li, Ruining Deng, Gabriel Arenas, Baris Oguz, Nadav Schwartz, Brett C Byram, Ipek Oguz

The method couples multi-box prompt augmentation and an aleatoric uncertainty-based false-negative (FN) and false-positive (FP) correction (FNPC) strategy.

Image Segmentation Medical Image Segmentation +2

MonoNeRD: NeRF-like Representations for Monocular 3D Object Detection

1 code implementation ICCV 2023 Junkai Xu, Liang Peng, Haoran Cheng, Hao Li, Wei Qian, Ke Li, Wenxiao Wang, Deng Cai

To the best of our knowledge, this work is the first to introduce volume rendering for M3D, and demonstrates the potential of implicit reconstruction for image-based 3D perception.

Monocular 3D Object Detection object-detection

IOB: Integrating Optimization Transfer and Behavior Transfer for Multi-Policy Reuse

no code implementations 14 Aug 2023 Siyuan Li, Hao Li, Jin Zhang, Zhen Wang, Peng Liu, Chongjie Zhang

Humans have the ability to reuse previously learned policies to solve new tasks quickly, and reinforcement learning (RL) agents can do the same by transferring knowledge from source policies to a related target task.

Continual Learning Reinforcement Learning (RL)

CATS v2: Hybrid encoders for robust medical segmentation

1 code implementation 11 Aug 2023 Hao Li, Han Liu, Dewei Hu, Xing Yao, Jiacheng Wang, Ipek Oguz

In our previous work, we proposed CATS, which is a U-shaped segmentation network augmented with a transformer encoder.

Domain Adaptation Image Segmentation +3

XMem++: Production-level Video Segmentation From Few Annotated Frames

1 code implementation ICCV 2023 Maksym Bekuzarov, Ariana Bermudez, Joon-Young Lee, Hao Li

Despite advancements in user-guided video segmentation, extracting complex objects consistently for highly complex scenes is still a labor-intensive task, especially for production.

Segmentation Semantic Segmentation +3

Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding

no code implementations 28 Jul 2023 Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang

However, existing methods suffer from three problems: the high dimensionality and waveform distortion of discrete speech representations, the prosodic averaging problem caused by the duration prediction model in non-autoregressive frameworks, and the information redundancy and dimension explosion problems of existing semantic encoding methods.

Language Modelling Speech Synthesis

COLosSAL: A Benchmark for Cold-start Active Learning for 3D Medical Image Segmentation

1 code implementation 22 Jul 2023 Han Liu, Hao Li, Xing Yao, Yubo Fan, Dewei Hu, Benoit Dawant, Vishwesh Nath, Zhoubing Xu, Ipek Oguz

Cold-start AL is highly relevant in many practical scenarios but has been under-explored, especially for 3D medical segmentation tasks requiring substantial annotation effort.

Active Learning Image Segmentation +3

Semi-supervised Learning from Street-View Images and OpenStreetMap for Automatic Building Height Estimation

1 code implementation 5 Jul 2023 Hao Li, Zhendong Yuan, Gabriel Dax, Gefei Kong, Hongchao Fan, Alexander Zipf, Martin Werner

In this work, we propose a semi-supervised learning (SSL) method of automatically estimating building height from Mapillary SVI and OSM data to generate low-cost and open-source 3D city modeling in LoD1.

object-detection Object Detection +1

FuXi: A cascade machine learning forecasting system for 15-day global weather forecast

no code implementations 22 Jun 2023 Lei Chen, Xiaohui Zhong, Feng Zhang, Yuan Cheng, Yinghui Xu, Yuan Qi, Hao Li

Over the past few years, due to the rapid development of machine learning (ML) models for weather forecasting, state-of-the-art ML models have shown superior performance compared to the European Centre for Medium-Range Weather Forecasts (ECMWF)'s high-resolution forecast (HRES) in 10-day forecasts at a spatial resolution of 0.25 degrees.

Weather Forecasting

The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects

no code implementations CVPR 2023 Ruohan Gao, Yiming Dou, Hao Li, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu

We introduce the ObjectFolder Benchmark, a benchmark suite of 10 tasks for multisensory object-centric learning, centered around object recognition, reconstruction, and manipulation with sight, sound, and touch.

Benchmarking Object Recognition

Sonicverse: A Multisensory Simulation Platform for Embodied Household Agents that See and Hear

1 code implementation 1 Jun 2023 Ruohan Gao, Hao Li, Gokul Dharan, Zhuzhu Wang, Chengshu Li, Fei Xia, Silvio Savarese, Li Fei-Fei, Jiajun Wu

We introduce Sonicverse, a multisensory simulation platform with integrated audio-visual simulation for training household agents that can both see and hear.

Multi-Task Learning Visual Navigation

OVO: Open-Vocabulary Occupancy

1 code implementation 25 May 2023 Zhiyu Tan, ZiChao Dong, Cheng Zhang, Weikun Zhang, Hang Ji, Hao Li

Semantic occupancy prediction aims to infer dense geometry and semantics of surroundings for an autonomous agent to operate safely in the 3D environment.

Knowledge Distillation

Do You Hear The People Sing? Key Point Analysis via Iterative Clustering and Abstractive Summarisation

no code implementations 25 May 2023 Hao Li, Viktor Schlegel, Riza Batista-Navarro, Goran Nenadic

Furthermore, evaluating key points is crucial in ensuring that the automatically generated summaries are useful.

Text-Video Retrieval with Disentangled Conceptualization and Set-to-Set Alignment

4 code implementations 20 May 2023 Peng Jin, Hao Li, Zesen Cheng, Jinfa Huang, Zhennan Wang, Li Yuan, Chang Liu, Jie Chen

In this paper, we propose the Disentangled Conceptualization and Set-to-set Alignment (DiCoSA) to simulate the conceptualizing and reasoning process of human beings.

Retrieval Video Retrieval

TG-VQA: Ternary Game of Video Question Answering

no code implementations 17 May 2023 Hao Li, Peng Jin, Zesen Cheng, Songyang Zhang, Kai Chen, Zhennan Wang, Chang Liu, Jie Chen

Video question answering aims at answering a question about the video content by reasoning the alignment semantics within them.

Contrastive Learning Question Answering +2

Correcting for Interference in Experiments: A Case Study at Douyin

no code implementations 4 May 2023 Vivek F. Farias, Hao Li, Tianyi Peng, Xinyuyang Ren, Huawei Zhang, Andrew Zheng

We formalize the problem of inference in such experiments as one of policy evaluation.

COSST: Multi-organ Segmentation with Partially Labeled Datasets Using Comprehensive Supervisions and Self-training

no code implementations 27 Apr 2023 Han Liu, Zhoubing Xu, Riqiang Gao, Hao Li, Jianing Wang, Guillaume Chabin, Ipek Oguz, Sasa Grbic

We revisit the problem from a perspective of partial label supervision signals and identify two signals derived from ground truth and one from pseudo labels.

Organ Segmentation Outlier Detection +2

CryoFormer: Continuous Heterogeneous Cryo-EM Reconstruction using Transformer-based Neural Representations

no code implementations 28 Mar 2023 Xinhang Liu, Yan Zeng, Yifan Qin, Hao Li, Jiakai Zhang, Lan Xu, Jingyi Yu

Cryo-electron microscopy (cryo-EM) allows for the high-resolution reconstruction of 3D structures of proteins and other biomolecules.

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

2 code implementations 22 Mar 2023 Hansheng Chen, Wei Tian, Pichao Wang, Fan Wang, Lu Xiong, Hao Li

In this paper, we propose the EPro-PnP, a probabilistic PnP layer for general end-to-end pose estimation, which outputs a distribution of pose with differentiable probability density on the SE(3) manifold.

3D Object Detection 6D Pose Estimation using RGB +1

Learning A Sparse Transformer Network for Effective Image Deraining

1 code implementation CVPR 2023 Xiang Chen, Hao Li, Mingqiang Li, Jinshan Pan

To overcome this problem, we propose an effective DeRaining network, Sparse Transformer (DRSformer) that can adaptively keep the most useful self-attention values for feature aggregation so that the aggregated features better facilitate high-quality image reconstruction.

Image Reconstruction Image Restoration +1

DiffusionRet: Generative Text-Video Retrieval with Diffusion Model

4 code implementations ICCV 2023 Peng Jin, Hao Li, Zesen Cheng, Kehan Li, Xiangyang Ji, Chang Liu, Li Yuan, Jie Chen

Existing text-video retrieval solutions are, in essence, discriminant models focused on maximizing the conditional likelihood, i.e., p(candidates|query).

Retrieval Video Retrieval

Video Action Recognition with Attentive Semantic Units

no code implementations ICCV 2023 Yifei Chen, Dapeng Chen, Ruijin Liu, Hao Li, Wei Peng

Supervised by the semantics of action labels, recent works adapt the visual branch of VLMs to learn video representations.

Action Recognition Temporal Action Localization +1

TwERC: High Performance Ensembled Candidate Generation for Ads Recommendation at Twitter

no code implementations 27 Feb 2023 Vanessa Cai, Pradeep Prabakar, Manuel Serrano Rebuelta, Lucas Rosen, Federico Monti, Katarzyna Janocha, Tomo Lazovich, Jeetu Raj, Yedendra Shrinivasan, Hao Li, Thomas Markovich

We focus on the candidate generation phase of a large-scale ads recommendation problem in this paper, and present a machine learning first heterogeneous re-architecture of this stage which we term TwERC.

Recommendation Systems Vocal Bursts Intensity Prediction

An Adaptive Plug-and-Play Network for Few-Shot Learning

no code implementations 18 Feb 2023 Hao Li, Li Li, Yunmeng Huang, Ning li, Yongtao Zhang

Few-shot learning (FSL) requires a model to classify new samples after learning from only a few samples.

Few-Shot Learning

Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt

no code implementations CVPR 2023 Hao Li, Dingwen Zhang, Nian Liu, Lechao Cheng, Yalun Dai, Chao Zhang, Xinggang Wang, Junwei Han

Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models by giving Saliency Prompt for queries/kernels.

Instance Segmentation Semantic Segmentation +1

UNAEN: Unsupervised Abnormality Extraction Network for MRI Motion Artifact Reduction

no code implementations 4 Jan 2023 Yusheng Zhou, Hao Li, Jianan Liu, Zhengmin Kong, Tao Huang, Euijoon Ahn, Zhihan Lv, Jinman Kim, David Dagan Feng

Our results substantiate the potential of UNAEN as a promising solution applicable in real-world clinical environments, with the capability to enhance diagnostic accuracy and facilitate image-guided therapies.

OccluMix: Towards De-Occlusion Virtual Try-on by Semantically-Guided Mixup

2 code implementations 3 Jan 2023 Zhijing Yang, Junyang Chen, Yukai Shi, Hao Li, Tianshui Chen, Liang Lin

Image Virtual try-on aims at replacing the cloth on a personal image with a garment image (in-shop clothes), which has attracted increasing attention from the multimedia and computer vision communities.

Semantic Parsing Virtual Try-on

StyleGene: Crossover and Mutation of Region-Level Facial Genes for Kinship Face Synthesis

1 code implementation CVPR 2023 Hao Li, Xianxu Hou, Zepeng Huang, Linlin Shen

As cycle-like losses are designed to measure the L_2 distances between the output of Gene Decoder and image encoder, and that between the output of LGE and IGE, only face images are required to train our framework, i.e., no paired kinship face data is required.

Kinship face generation

Guided Recommendation for Model Fine-Tuning

no code implementations CVPR 2023 Hao Li, Charless Fowlkes, Hao Yang, Onkar Dabeer, Zhuowen Tu, Stefano Soatto

With thousands of historical training jobs, a recommendation system can be learned to predict the model selection score given the features of the dataset and the model as input.

Model Selection Transfer Learning

Clusterformer: Cluster-based Transformer for 3D Object Detection in Point Clouds

no code implementations ICCV 2023 Yu Pei, Xian Zhao, Hao Li, Jingyuan Ma, Jingwei Zhang, ShiLiang Pu

Attributed to the unstructured and sparse nature of point clouds, the transformer shows greater potential in point clouds data processing.

3D Object Detection object-detection

Biomedical image analysis competitions: The state of current participation practice

no code implementations 16 Dec 2022 Matthias Eisenmann, Annika Reinke, Vivienn Weru, Minu Dietlinde Tizabi, Fabian Isensee, Tim J. Adler, Patrick Godau, Veronika Cheplygina, Michal Kozubek, Sharib Ali, Anubha Gupta, Jan Kybic, Alison Noble, Carlos Ortiz de Solórzano, Samiksha Pachade, Caroline Petitjean, Daniel Sage, Donglai Wei, Elizabeth Wilden, Deepak Alapatt, Vincent Andrearczyk, Ujjwal Baid, Spyridon Bakas, Niranjan Balu, Sophia Bano, Vivek Singh Bawa, Jorge Bernal, Sebastian Bodenstedt, Alessandro Casella, Jinwook Choi, Olivier Commowick, Marie Daum, Adrien Depeursinge, Reuben Dorent, Jan Egger, Hannah Eichhorn, Sandy Engelhardt, Melanie Ganz, Gabriel Girard, Lasse Hansen, Mattias Heinrich, Nicholas Heller, Alessa Hering, Arnaud Huaulmé, Hyunjeong Kim, Bennett Landman, Hongwei Bran Li, Jianning Li, Jun Ma, Anne Martel, Carlos Martín-Isla, Bjoern Menze, Chinedu Innocent Nwoye, Valentin Oreiller, Nicolas Padoy, Sarthak Pati, Kelly Payette, Carole Sudre, Kimberlin Van Wijnen, Armine Vardazaryan, Tom Vercauteren, Martin Wagner, Chuanbo Wang, Moi Hoon Yap, Zeyun Yu, Chun Yuan, Maximilian Zenk, Aneeq Zia, David Zimmerer, Rina Bao, Chanyeol Choi, Andrew Cohen, Oleh Dzyubachyk, Adrian Galdran, Tianyuan Gan, Tianqi Guo, Pradyumna Gupta, Mahmood Haithami, Edward Ho, Ikbeom Jang, Zhili Li, Zhengbo Luo, Filip Lux, Sokratis Makrogiannis, Dominik Müller, Young-tack Oh, Subeen Pang, Constantin Pape, Gorkem Polat, Charlotte Rosalie Reed, Kanghyun Ryu, Tim Scherr, Vajira Thambawita, Haoyu Wang, Xinliang Wang, Kele Xu, Hung Yeh, Doyeob Yeo, Yixuan Yuan, Yan Zeng, Xin Zhao, Julian Abbing, Jannes Adam, Nagesh Adluru, Niklas Agethen, Salman Ahmed, Yasmina Al Khalil, Mireia Alenyà, Esa Alhoniemi, Chengyang An, Talha Anwar, Tewodros Weldebirhan Arega, Netanell Avisdris, Dogu Baran Aydogan, Yingbin Bai, Maria Baldeon Calisto, Berke Doga Basaran, Marcel Beetz, Cheng Bian, Hao Bian, Kevin Blansit, Louise Bloch, Robert Bohnsack, Sara Bosticardo, Jack Breen, Mikael Brudfors, Raphael Brüngel, Mariano Cabezas, Alberto Cacciola, Zhiwei Chen, Yucong Chen, Daniel Tianming Chen, Minjeong Cho, Min-Kook Choi, Chuantao Xie, Dana Cobzas, Julien Cohen-Adad, Jorge Corral Acero, Sujit Kumar Das, Marcela de Oliveira, Hanqiu Deng, Guiming Dong, Lars Doorenbos, Cory Efird, Sergio Escalera, Di Fan, Mehdi Fatan Serj, Alexandre Fenneteau, Lucas Fidon, Patryk Filipiak, René Finzel, Nuno R. Freitas, Christoph M. Friedrich, Mitchell Fulton, Finn Gaida, Francesco Galati, Christoforos Galazis, Chang Hee Gan, Zheyao Gao, Shengbo Gao, Matej Gazda, Beerend Gerats, Neil Getty, Adam Gibicar, Ryan Gifford, Sajan Gohil, Maria Grammatikopoulou, Daniel Grzech, Orhun Güley, Timo Günnemann, Chunxu Guo, Sylvain Guy, Heonjin Ha, Luyi Han, Il Song Han, Ali Hatamizadeh, Tian He, Jimin Heo, Sebastian Hitziger, SeulGi Hong, Seungbum Hong, Rian Huang, Ziyan Huang, Markus Huellebrand, Stephan Huschauer, Mustaffa Hussain, Tomoo Inubushi, Ece Isik Polat, Mojtaba Jafaritadi, SeongHun Jeong, Bailiang Jian, Yuanhong Jiang, Zhifan Jiang, Yueming Jin, Smriti Joshi, Abdolrahim Kadkhodamohammadi, Reda Abdellah Kamraoui, Inha Kang, Junghwa Kang, Davood Karimi, April Khademi, Muhammad Irfan Khan, Suleiman A. Khan, Rishab Khantwal, Kwang-Ju Kim, Timothy Kline, Satoshi Kondo, Elina Kontio, Adrian Krenzer, Artem Kroviakov, Hugo Kuijf, Satyadwyoom Kumar, Francesco La Rosa, Abhi Lad, Doohee Lee, Minho Lee, Chiara Lena, Hao Li, Ling Li, Xingyu Li, Fuyuan Liao, Kuanlun Liao, Arlindo Limede Oliveira, Chaonan Lin, Shan Lin, Akis Linardos, Marius George Linguraru, Han Liu, Tao Liu, Di Liu, Yanling Liu, João Lourenço-Silva, Jingpei Lu, Jiangshan Lu, Imanol Luengo, Christina B. Lund, Huan Minh Luu, Yi Lv, Uzay Macar, Leon Maechler, Sina Mansour L., Kenji Marshall, Moona Mazher, Richard McKinley, Alfonso Medela, Felix Meissen, Mingyuan Meng, Dylan Miller, Seyed Hossein Mirjahanmardi, Arnab Mishra, Samir Mitha, Hassan Mohy-ud-Din, Tony Chi Wing Mok, Gowtham Krishnan Murugesan, Enamundram Naga Karthik, Sahil Nalawade, Jakub Nalepa, Mohamed Naser, Ramin Nateghi, Hammad Naveed, Quang-Minh Nguyen, Cuong Nguyen Quoc, Brennan Nichyporuk, Bruno Oliveira, David Owen, Jimut Bahan Pal, Junwen Pan, Wentao Pan, Winnie Pang, Bogyu Park, Vivek Pawar, Kamlesh Pawar, Michael Peven, Lena Philipp, Tomasz Pieciak, Szymon Plotka, Marcel Plutat, Fattaneh Pourakpour, Domen Preložnik, Kumaradevan Punithakumar, Abdul Qayyum, Sandro Queirós, Arman Rahmim, Salar Razavi, Jintao Ren, Mina Rezaei, Jonathan Adam Rico, ZunHyan Rieu, Markus Rink, Johannes Roth, Yusely Ruiz-Gonzalez, Numan Saeed, Anindo Saha, Mostafa Salem, Ricardo Sanchez-Matilla, Kurt Schilling, Wei Shao, Zhiqiang Shen, Ruize Shi, Pengcheng Shi, Daniel Sobotka, Théodore Soulier, Bella Specktor Fadida, Danail Stoyanov, Timothy Sum Hon Mun, Xiaowu Sun, Rong Tao, Franz Thaler, Antoine Théberge, Felix Thielke, Helena Torres, Kareem A. Wahid, Jiacheng Wang, Yifei Wang, Wei Wang, Xiong Wang, Jianhui Wen, Ning Wen, Marek Wodzinski, Ye Wu, Fangfang Xia, Tianqi Xiang, Chen Xiaofei, Lizhan Xu, Tingting Xue, Yuxuan Yang, Lin Yang, Kai Yao, Huifeng Yao, Amirsaeed Yazdani, Michael Yip, Hwanseung Yoo, Fereshteh Yousefirizi, Shunkai Yu, Lei Yu, Jonathan Zamora, Ramy Ashraf Zeineldin, Dewen Zeng, Jianpeng Zhang, Bokai Zhang, Jiapeng Zhang, Fan Zhang, Huahong Zhang, Zhongchen Zhao, Zixuan Zhao, Jiachen Zhao, Can Zhao, Qingshuo Zheng, Yuheng Zhi, Ziqi Zhou, Baosheng Zou, Klaus Maier-Hein, Paul F. Jäger, Annette Kopp-Schneider, Lena Maier-Hein

Of these, 84% were based on standard architectures.


SteerNeRF: Accelerating NeRF Rendering via Smooth Viewpoint Trajectory

no code implementations CVPR 2023 Sicheng Li, Hao Li, Yue Wang, Yiyi Liao, Lu Yu

Neural Radiance Fields (NeRF) have demonstrated superior novel view synthesis performance but are slow at rendering.

Novel View Synthesis

Entropy-Driven Mixed-Precision Quantization for Deep Network Design

1 code implementation Conference on Neural Information Processing Systems 2022 Zhenhong Sun, Ce Ge, Junyan Wang, Ming Lin, Hesen Chen, Hao Li, Xiuyu Sun

Deploying deep convolutional neural networks on Internet-of-Things (IoT) devices is challenging due to the limited computational resources, such as limited SRAM memory and Flash storage.

Face Detection Hardware Aware Neural Architecture Search +3

Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks

2 code implementations CVPR 2023 Hao Li, Jinguo Zhu, Xiaohu Jiang, Xizhou Zhu, Hongsheng Li, Chun Yuan, Xiaohua Wang, Yu Qiao, Xiaogang Wang, Wenhai Wang, Jifeng Dai

In this paper, we propose Uni-Perceiver v2, which is the first generalist model capable of handling major large-scale vision and vision-language tasks with competitive performance.

Language Modelling Multi-Task Learning

Bayesian Layer Graph Convolutioanl Network for Hyperspetral Image Classification

no code implementations 14 Nov 2022 Mingyang Zhang, Ziqi Di, Maoguo Gong, Yue Wu, Hao Li, Xiangming Jiang

In recent years, research on hyperspectral image (HSI) classification has made continuous progress in introducing deep network models, and recently graph convolutional network (GCN) based models have shown impressive performance.

Classification Image Classification

Detecting Line Segments in Motion-blurred Images with Events

1 code implementation 14 Nov 2022 Huai Yu, Hao Li, Wen Yang, Lei Yu, Gui-Song Xia

To robustly detect line segments over motion blurs, we propose to leverage the complementary information of images and events.

3D Reconstruction Line Segment Detection +1

VTC-LFC: Vision Transformer Compression with Low-Frequency Components

1 code implementation NIPS 2022 Zhenyu Wang, Hao Luo, Pichao Wang, Feng Ding, Fan Wang, Hao Li

Although Vision transformers (ViTs) have recently dominated many vision tasks, deploying ViT models on resource-limited devices remains a challenging problem.

Towards Consistency and Complementarity: A Multiview Graph Information Bottleneck Approach

1 code implementation 11 Oct 2022 Xiaolong Fan, Maoguo Gong, Yue Wu, Mingyang Zhang, Hao Li, Xiangming Jiang

In this paper, we propose a novel Multiview Variational Graph Information Bottleneck (MVGIB) principle to maximize the agreement for common representations and the disagreement for view-specific representations.

Toward 3D Spatial Reasoning for Human-like Text-based Visual Question Answering

no code implementations 21 Sep 2022 Hao Li, Jinfa Huang, Peng Jin, Guoli Song, Qi Wu, Jie Chen

Under this setting, these 2D spatial reasoning approaches cannot distinguish the fine-grain spatial relations between visual objects and scene texts on the same image plane, thereby impairing the interpretability and performance of TextVQA models.

Image Captioning Optical Character Recognition (OCR) +2

MimCo: Masked Image Modeling Pre-training with Contrastive Teacher

no code implementations 7 Sep 2022 Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li

Specifically, MimCo takes a pre-trained contrastive learning model as the teacher model and is pre-trained with two types of learning targets: patch-level and image-level reconstruction losses.

Contrastive Learning Self-Supervised Learning

Cats: Complementary CNN and Transformer Encoders for Segmentation

no code implementations 24 Aug 2022 Hao Li, Dewei Hu, Han Liu, Jiacheng Wang, Ipek Oguz

We fuse the information from the convolutional encoder and the transformer, and pass it to the decoder to obtain the results.

3D Medical Imaging Segmentation Image Segmentation +1

SBPF: Sensitiveness Based Pruning Framework For Convolutional Neural Network On Image Classification

no code implementations 9 Aug 2022 Yiheng Lu, Maoguo Gong, Wei Zhao, Kaiyuan Feng, Hao Li

Therefore, we propose a sensitiveness based method to evaluate the importance of each layer from the perspective of inference accuracy by adding extra damage for the original model.

Image Classification

Semantic Data Augmentation based Distance Metric Learning for Domain Generalization

no code implementations 2 Aug 2022 Mengzhu Wang, Jianlong Yuan, Qi Qian, Zhibin Wang, Hao Li

Further, we provide an in-depth analysis of the mechanism and rationale behind our approach, which gives us a better understanding of why leveraging logits in lieu of features can help domain generalization.

Data Augmentation Domain Generalization +1

DnSwin: Toward Real-World Denoising via Continuous Wavelet Sliding-Transformer

no code implementations 28 Jul 2022 Hao Li, Zhijing Yang, Xiaobin Hong, Ziying Zhao, Junyang Chen, Yukai Shi, Jinshan Pan

Real-world image denoising is a practical image restoration problem that aims to obtain clean images from in-the-wild noisy inputs.

Image Denoising Image Restoration

Criteria Comparative Learning for Real-scene Image Super-Resolution

2 code implementations 26 Jul 2022 Yukai Shi, Hao Li, Sen Zhang, Zhijing Yang, Xiao Wang

Inspired by the observation that the contrastive relationship could also exist between the criteria, in this work, we propose a novel training paradigm for RealSR, named Criteria Comparative Learning (Cria-CL), by developing contrastive losses defined on criteria instead of image patches.

Contrastive Learning Image Super-Resolution +1

Large-Kernel Attention for 3D Medical Image Segmentation

no code implementations 19 Jul 2022 Hao Li, Yang Nan, Javier Del Ser, Guang Yang

The performance improvement due to the proposed LK attention module was also statistically validated.

Computed Tomography (CT) Image Segmentation +4

Cross Vision-RF Gait Re-identification with Low-cost RGB-D Cameras and mmWave Radars

no code implementations 16 Jul 2022 Dongjiang Cao, Ruofeng Liu, Hao Li, Shuai Wang, Wenchao Jiang, Chris Xiaoxuan Lu

Human identification is a key requirement for many applications in everyday life, such as personalized services, automatic surveillance, continuous authentication, and contact tracing during pandemics, etc.

Metric Learning Person Re-Identification

Dynamic Gradient Reactivation for Backward Compatible Person Re-identification

no code implementations 12 Jul 2022 Xiao Pan, Hao Luo, Weihua Chen, Fan Wang, Hao Li, Wei Jiang, Jianming Zhang, Jianyang Gu, Peike Li

To address this issue, we propose the Ranking-based Backward Compatible Learning (RBCL), which directly optimizes the ranking metric between new features and old features.

Person Re-Identification Retrieval

Human Treelike Tubular Structure Segmentation: A Comprehensive Review and Future Perspectives

no code implementations 12 Jul 2022 Hao Li, Zeyu Tang, Yang Nan, Guang Yang

Various structures in human physiology follow a treelike morphology, which often expresses complexity at very fine scales.

Computed Tomography (CT)

DLME: Deep Local-flatness Manifold Embedding

2 code implementations7 Jul 2022 Zelin Zang, Siyuan Li, Di Wu, Ge Wang, Lei Shang, Baigui Sun, Hao Li, Stan Z. Li

To overcome the underconstrained embedding problem, we design a loss and theoretically demonstrate that it leads to a more suitable embedding based on the local flatness.

Contrastive Learning Data Augmentation +1

Location reference recognition from texts: A survey and comparison

no code implementations4 Jul 2022 Xuke Hu, Zhiyong Zhou, Hao Li, Yingjie Hu, Fuqiang Gu, Jens Kersten, Hongchao Fan, Friederike Klan

Further, a comprehensive review and comparison of existing approaches for location reference recognition, which is the first and a core step of geoparsing, is still lacking.

Information Retrieval Management +1

CGAR: Critic Guided Action Redistribution in Reinforcement Leaning

1 code implementation23 Jun 2022 Tairan Huang, Xu Li, Hao Li, Mingming Sun, Ping Li

As discussed in this paper, under the settings of off-policy actor-critic algorithms, we demonstrate that the critic can bring expected discounted rewards greater than or at least equal to those of the actor.

Reinforcement Learning (RL)

Real-World Image Super-Resolution by Exclusionary Dual-Learning

1 code implementation6 Jun 2022 Hao Li, Jinghui Qin, Zhijing Yang, Pengxu Wei, Jinshan Pan, Liang Lin, Yukai Shi

Real-world image super-resolution, a practical image restoration problem that aims to obtain high-quality images from in-the-wild input, has recently received considerable attention owing to its tremendous application potential.

Image Restoration Image Super-Resolution

Point RCNN: An Angle-Free Framework for Rotated Object Detection

no code implementations28 May 2022 Qiang Zhou, Chaohui Yu, Zhibin Wang, Hao Li

To tackle this problem, we propose a purely angle-free framework for rotated object detection, called Point RCNN, which mainly consists of PointRPN and PointReg.

object-detection Object Detection In Aerial Images

SwinVRNN: A Data-Driven Ensemble Forecasting Model via Learned Distribution Perturbation

no code implementations26 May 2022 Yuan Hu, Lei Chen, Zhibin Wang, Hao Li

We also compare four categories of perturbation methods for ensemble forecasting, i.e., fixed distribution perturbation, learned distribution perturbation, MC dropout, and multi-model ensemble.

Weather Forecasting

An Empirical Study on Distribution Shift Robustness From the Perspective of Pre-Training and Data Augmentation

no code implementations25 May 2022 Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan

With our empirical results obtained from 1,330 models, we provide the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a proper pre-trained model respecting the data property; 2) specialized algorithms further improve the robustness on top of ERM when handling a specific type of distribution shift, e.g., GroupDRO for spurious correlation and CORAL for large-scale out-of-distribution data; 3) comparing different pre-training modes, architectures and data sizes, we provide novel observations about pre-training on distribution shift, which shed light on designing or selecting a pre-training strategy for different kinds of distribution shifts.

Data Augmentation

Unsupervised Representation Learning for 3D MRI Super Resolution with Degradation Adaptation

no code implementations13 May 2022 Jianan Liu, Hao Li, Tao Huang, Euijoon Ahn, Kang Han, Adeel Razi, Wei Xiang, Jinman Kim, David Dagan Feng

However, the difference in degradation representations between synthetic and authentic LR images suppresses the quality of SR images reconstructed from authentic LR images.

Image Registration Representation Learning +1

NTIRE 2022 Challenge on Efficient Super-Resolution: Methods and Results

2 code implementations11 May 2022 Yawei Li, Kai Zhang, Radu Timofte, Luc van Gool, Fangyuan Kong, Mingxi Li, Songwei Liu, Zongcai Du, Ding Liu, Chenhui Zhou, Jingyi Chen, Qingrui Han, Zheyuan Li, Yingqi Liu, Xiangyu Chen, Haoming Cai, Yu Qiao, Chao Dong, Long Sun, Jinshan Pan, Yi Zhu, Zhikai Zong, Xiaoxiao Liu, Zheng Hui, Tao Yang, Peiran Ren, Xuansong Xie, Xian-Sheng Hua, Yanbo Wang, Xiaozhong Ji, Chuming Lin, Donghao Luo, Ying Tai, Chengjie Wang, Zhizhong Zhang, Yuan Xie, Shen Cheng, Ziwei Luo, Lei Yu, Zhihong Wen, Qi Wu, Youwei Li, Haoqiang Fan, Jian Sun, Shuaicheng Liu, Yuanfei Huang, Meiguang Jin, Hua Huang, Jing Liu, Xinjian Zhang, Yan Wang, Lingshun Long, Gen Li, Yuanfan Zhang, Zuowei Cao, Lei Sun, Panaetov Alexander, Yucong Wang, Minjie Cai, Li Wang, Lu Tian, Zheyuan Wang, Hongbing Ma, Jie Liu, Chao Chen, Yidong Cai, Jie Tang, Gangshan Wu, Weiran Wang, Shirui Huang, Honglei Lu, Huan Liu, Keyan Wang, Jun Chen, Shi Chen, Yuchun Miao, Zimo Huang, Lefei Zhang, Mustafa Ayazoğlu, Wei Xiong, Chengyi Xiong, Fei Wang, Hao Li, Ruimian Wen, Zhijing Yang, Wenbin Zou, Weixin Zheng, Tian Ye, Yuncheng Zhang, Xiangzhen Kong, Aditya Arora, Syed Waqas Zamir, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Dandan Gao, Dengwen Zhou, Qian Ning, Jingzhu Tang, Han Huang, YuFei Wang, Zhangheng Peng, Haobo Li, Wenxue Guan, Shenghua Gong, Xin Li, Jun Liu, Wanjun Wang, Dengwen Zhou, Kun Zeng, Hanjiang Lin, Xinyu Chen, Jinsheng Fang

The aim was to design a network for single image super-resolution that achieved improved efficiency, measured according to several metrics including runtime, parameters, FLOPs, activations, and memory consumption, while at least maintaining a PSNR of 29.00 dB on the DIV2K validation set.

Image Super-Resolution

Joint learning of object graph and relation graph for visual question answering

no code implementations9 May 2022 Hao Li, Xu Li, Belhal Karimi, Jie Chen, Mingming Sun

Modeling visual question answering (VQA) through scene graphs can significantly improve the reasoning accuracy and interpretability.

Question Answering Visual Question Answering

Learning to Listen: Modeling Non-Deterministic Dyadic Facial Motion

no code implementations CVPR 2022 Evonne Ng, Hanbyul Joo, Liwen Hu, Hao Li, Trevor Darrell, Angjoo Kanazawa, Shiry Ginosar

We present a framework for modeling interactional communication in dyadic conversations: given multimodal inputs of a speaker, we autoregressively output multiple possibilities of corresponding listener motion.

Task Adaptive Parameter Sharing for Multi-Task Learning

1 code implementation CVPR 2022 Matthew Wallingford, Hao Li, Alessandro Achille, Avinash Ravichandran, Charless Fowlkes, Rahul Bhotika, Stefano Soatto

TAPS solves a joint optimization problem which determines which layers to share with the base model and the value of the task-specific weights.

Multi-Task Learning

EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation

1 code implementation CVPR 2022 Hansheng Chen, Pichao Wang, Fan Wang, Wei Tian, Lu Xiong, Hao Li

The 2D-3D coordinates and corresponding weights are treated as intermediate variables learned by minimizing the KL divergence between the predicted and target pose distribution.

3D Object Detection 6D Pose Estimation using RGB +1

PMAL: Open Set Recognition via Robust Prototype Mining

no code implementations16 Mar 2022 Jing Lu, Yunxu Xu, Hao Li, Zhanzhan Cheng, Yi Niu

Accordingly, the embedding space can be better optimized to discriminate among the predefined classes and between knowns and unknowns.

Open Set Learning

ModDrop++: A Dynamic Filter Network with Intra-subject Co-training for Multiple Sclerosis Lesion Segmentation with Missing Modalities

1 code implementation7 Mar 2022 Han Liu, Yubo Fan, Hao Li, Jiacheng Wang, Dewei Hu, Can Cui, Ho Hin Lee, Huahong Zhang, Ipek Oguz

Previously, a training strategy termed Modality Dropout (ModDrop) has been applied to MS lesion segmentation to achieve state-of-the-art performance with missing modalities.

Lesion Segmentation

On Representation Learning with Feedback

1 code implementation15 Feb 2022 Hao Li

This note complements the author's recent paper "Robust representation learning with feedback for single image deraining" by providing heuristically theoretical explanations on the mechanism of representation learning with feedback, namely an essential merit of the works presented in this recent article.

Representation Learning Single Image Deraining

GiraffeDet: A Heavy-Neck Paradigm for Object Detection

2 code implementations ICLR 2022 Yiqi Jiang, Zhiyu Tan, Junyan Wang, Xiuyu Sun, Ming Lin, Hao Li

This heavy-backbone design paradigm is mostly due to the historical legacy when transferring image recognition models to object detection rather than an end-to-end optimized design for object detection.

object-detection Object Detection

Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer

no code implementations21 Jan 2022 Pichao Wang, Fan Wang, Hao Li

During the KD process, the TCL loss transfers the local structure, exploits the higher order information, and mitigates the misalignment of the heterogeneous output of teacher and student networks.

Knowledge Distillation Transfer Learning

Studying Popular Open Source Machine Learning Libraries and Their Cross-Ecosystem Bindings

1 code implementation18 Jan 2022 Hao Li, Cor-Paul Bezemer

Our study shows that the vast majority of the studied bindings cover only a small portion of the source library releases, and the delay for receiving support for a source library release is large.

BIG-bench Machine Learning

Graph Neural Networks for Double-Strand DNA Breaks Prediction

no code implementations4 Jan 2022 Xu Wang, Huan Zhao, WeiWei Tu, Hao Li, Yu Sun, Xiaochen Bo

Double-strand DNA breaks (DSBs) are a form of DNA damage that can cause abnormal chromosomal rearrangements.

ELSA: Enhanced Local Self-Attention for Vision Transformer

1 code implementation23 Dec 2021 Jingkai Zhou, Pichao Wang, Fan Wang, Qiong Liu, Hao Li, Rong Jin

Self-attention is powerful in modeling long-range dependencies, but it is weak in local finer-level feature learning.

Image Classification Instance Segmentation +2

Watch Those Words: Video Falsification Detection Using Word-Conditioned Facial Motion

1 code implementation21 Dec 2021 Shruti Agarwal, Liwen Hu, Evonne Ng, Trevor Darrell, Hao Li, Anna Rohrbach

In today's era of digital misinformation, we are increasingly faced with new threats posed by video falsification techniques.


On the Dilution of Precision for Time Difference of Arrival with Station Deployment

no code implementations10 Dec 2021 Fengyun Zhang, Hao Li, Yulong Ding, Shuang-Hua Yang, Li Yang

The paper aims to reveal the relationship between the performance of moving object tracking algorithms and the tracking anchors (station) deployment.

Object Tracking TAG

Design and Implementation of Real-Time Localization System (RTLS) based on UWB and TDoA Algorithm

no code implementations9 Dec 2021 Fengyun Zhang, Li Yang, Yuhuan Liu, Yulong Ding, Shuang-Hua Yang, Hao Li

The challenges of indoor localization include inadequate localization accuracy, unreasonable anchor deployment in complex scenarios, lack of stability, and high cost.

Indoor Localization

TransZero: Attribute-guided Transformer for Zero-Shot Learning

1 code implementation3 Dec 2021 Shiming Chen, Ziming Hong, Yang Liu, Guo-Sen Xie, Baigui Sun, Hao Li, Qinmu Peng, Ke Lu, Xinge You

Although some attention-based models have attempted to learn such region features in a single image, the transferability and discriminative attribute localization of visual features are typically neglected.

Zero-Shot Learning

Uni-Perceiver: Pre-training Unified Architecture for Generic Perception for Zero-shot and Few-shot Tasks

1 code implementation CVPR 2022 Xizhou Zhu, Jinguo Zhu, Hao Li, Xiaoshi Wu, Xiaogang Wang, Hongsheng Li, Xiaohua Wang, Jifeng Dai

The model is pre-trained on several uni-modal and multi-modal tasks, and evaluated on a variety of downstream tasks, including novel tasks that did not appear in the pre-training stage.

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

1 code implementation2 Dec 2021 Zhaoyuan Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin

Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.

Ranked #2 on Unsupervised Semantic Segmentation on COCO-Stuff-171 (using extra training data)

Segmentation Self-Supervised Learning +1

3D High-Quality Magnetic Resonance Image Restoration in Clinics Using Deep Learning

no code implementations28 Nov 2021 Hao Li, Jianan Liu

We also analyzed several down-sampling strategies based on the acceleration factor, including multiple combinations of in-plane and through-plane down-sampling, and developed a controllable and quantifiable motion artifact generation method.

Image Restoration Super-Resolution

MAE-DET: Revisiting Maximum Entropy Principle in Zero-Shot NAS for Efficient Object Detection

1 code implementation26 Nov 2021 Zhenhong Sun, Ming Lin, Xiuyu Sun, Zhiyu Tan, Hao Li, Rong Jin

Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of Neural Architecture Search (NAS).

Neural Architecture Search object-detection +1

Improved Fine-Tuning by Better Leveraging Pre-Training Data

no code implementations24 Nov 2021 Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni Chan, Rong Jin

The generalization result of using pre-training data shows that the excess risk bound on a target task can be improved when the appropriate pre-training data is included in fine-tuning.

Image Classification Learning Theory

Self-Supervised Pre-Training for Transformer-Based Person Re-Identification

2 code implementations23 Nov 2021 Hao Luo, Pichao Wang, Yi Xu, Feng Ding, Yanxin Zhou, Fan Wang, Hao Li, Rong Jin

We first investigate self-supervised learning (SSL) methods with Vision Transformer (ViT) pretrained on unlabelled person images (the LUPerson dataset), and empirically find it significantly surpasses ImageNet supervised pre-training models on ReID tasks.

Ranked #1 on Unsupervised Person Re-Identification on Market-1501 (Rank-1 metric, using extra training data)

Self-Supervised Learning Unsupervised Domain Adaptation +1

Topologically Consistent Multi-View Face Inference Using Volumetric Sampling

no code implementations ICCV 2021 Tianye Li, Shichen Liu, Timo Bolkart, Jiayi Liu, Hao Li, Yajie Zhao

We propose ToFu, Topologically consistent Face from multi-view, a geometry inference framework that can produce topologically consistent meshes across facial identities and expressions using a volumetric representation instead of an explicit underlying 3DMM.

3D Reconstruction

HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning

2 code implementations NeurIPS 2021 Shiming Chen, Guo-Sen Xie, Yang Liu, Qinmu Peng, Baigui Sun, Hao Li, Xinge You, Ling Shao

Specifically, HSVA aligns the semantic and visual domains by adopting a hierarchical two-step adaptation, i.e., structure adaptation and distribution adaptation.

Transfer Learning Zero-Shot Learning

NAS-Bench-Zero: A Large Scale Dataset for Understanding Zero-Shot Neural Architecture Search

no code implementations29 Sep 2021 Hanlin Chen, Ming Lin, Xiuyu Sun, Hao Li

Based on these new discoveries, we propose i) a novel hybrid zero-shot proxy which outperforms existing ones by a large margin and is transferable among popular search spaces; ii) a new index for better measuring the true performance of ZS-NAS proxies in constrained NAS.

Benchmarking Neural Architecture Search

Unsupervised Domain Adaptation By Optimal Transportation Of Clusters Between Domains

no code implementations29 Sep 2021 Yang Liu, Zhipeng Zhou, Lei Shang, Baigui Sun, Hao Li, Rong Jin

Unsupervised domain adaptation (UDA) aims to transfer the knowledge from a labeled source domain to an unlabeled target domain.

Clustering Transfer Learning +1

Text-based Person Search in Full Images via Semantic-Driven Proposal Generation

1 code implementation27 Sep 2021 Shizhou Zhang, De Cheng, Wenlong Luo, Yinghui Xing, Duo Long, Hao Li, Kai Niu, Guoqiang Liang, Yanning Zhang

Finding target persons in full scene images with a query of text description has important practical applications in intelligent video surveillance. However, unlike real-world scenarios where the bounding boxes are not available, existing text-based person retrieval methods mainly focus on the cross-modal matching between the query text descriptions and the gallery of cropped pedestrian images.

Person Search Retrieval +3

Unsupervised Cross-Modality Domain Adaptation for Segmenting Vestibular Schwannoma and Cochlea with Data Augmentation and Model Ensemble

no code implementations24 Sep 2021 Hao Li, Dewei Hu, Qibang Zhu, Kathleen E. Larson, Huahong Zhang, Ipek Oguz

To overcome this problem, domain adaptation is an effective way to leverage information from the source domain to obtain accurate segmentations without requiring manual labels in the target domain.

Data Augmentation Domain Adaptation +2

Interpolation variable rate image compression

1 code implementation20 Sep 2021 Zhenhong Sun, Zhiyu Tan, Xiuyu Sun, Fangyi Zhang, Yichen Qian, Dongyang Li, Hao Li

Compression standards have been used to reduce the cost of image storage and transmission for decades.

Image Compression MS-SSIM +1

DisUnknown: Distilling Unknown Factors for Disentanglement Learning

1 code implementation ICCV 2021 Sitao Xiang, Yuming Gu, Pengda Xiang, Menglei Chai, Hao Li, Yajie Zhao, Mingming He

In this paper, we adopt a general setting where all factors that are hard to label or identify are encapsulated as a single unknown factor.


CDTrans: Cross-domain Transformer for Unsupervised Domain Adaptation

2 code implementations ICLR 2022 Tongkun Xu, Weihua Chen, Pichao Wang, Fan Wang, Hao Li, Rong Jin

Along with the pseudo labels, a weight-sharing triple-branch transformer framework is proposed to apply self-attention and cross-attention for source/target feature learning and source-target domain alignment, respectively.

Unsupervised Domain Adaptation

Scaled ReLU Matters for Training Vision Transformers

no code implementations8 Sep 2021 Pichao Wang, Xue Wang, Hao Luo, Jingkai Zhou, Zhipeng Zhou, Fan Wang, Hao Li, Rong Jin

In this paper, we further investigate this problem and extend the above conclusion: only early convolutions do not help for stable training, but the scaled ReLU operation in the convolutional stem (conv-stem) matters.

Dash: Semi-Supervised Learning with Dynamic Thresholding

no code implementations1 Sep 2021 Yi Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin

In this work we develop a simple yet powerful framework, whose key idea is to select a subset of training examples from the unlabeled data when performing existing SSL methods so that only the unlabeled examples with pseudo labels related to the labeled data will be used to train models.

Semi-Supervised Image Classification

Digging into Uncertainty in Self-supervised Multi-view Stereo

1 code implementation ICCV 2021 Hongbin Xu, Zhipeng Zhou, Yali Wang, Wenxiong Kang, Baigui Sun, Hao Li, Yu Qiao

Specifically, the limitations can be categorized into two types: ambiguous supervision in the foreground and invalid supervision in the background.

Image Reconstruction Self-Supervised Learning

Exploring the Quality of GAN Generated Images for Person Re-Identification

no code implementations23 Aug 2021 Yiqi Jiang, Weihua Chen, Xiuyu Sun, Xiaoyu Shi, Fan Wang, Hao Li

Recently, GAN-based methods have demonstrated strong effectiveness in generating augmentation data for person re-identification (ReID), on account of their ability to bridge the gap between domains and enrich the data variety in feature space.

Person Re-Identification Unsupervised Domain Adaptation

Fine-Grained AutoAugmentation for Multi-Label Classification

no code implementations12 Jul 2021 Ya Wang, Hesen Chen, Fangyi Zhang, Yaohua Wang, Xiuyu Sun, Ming Lin, Hao Li

Data augmentation is a commonly used approach to improving the generalization of deep learning models.

Classification Data Augmentation +3

A Cloud-Edge-Terminal Collaborative System for Temperature Measurement in COVID-19 Prevention

no code implementations11 Jul 2021 Zheyi Ma, Hao Li, Wen Fang, Qingwen Liu, Bin Zhou, Zhiyong Bu

Then, a mobile detection model based on a multi-task cascaded convolutional network (MTCNN) is proposed to realize face alignment and mask detection on the RGB images.

Face Alignment

LIFE: A Generalizable Autodidactic Pipeline for 3D OCT-A Vessel Segmentation

no code implementations9 Jul 2021 Dewei Hu, Can Cui, Hao Li, Kathleen E. Larson, Yuankai K. Tao, Ipek Oguz

We then construct the local intensity fusion encoder (LIFE) to map a given OCT-A volume and its LIF counterpart to a shared latent space.

Retinal Vessel Segmentation Segmentation

Graph Convolution for Re-ranking in Person Re-identification

1 code implementation5 Jul 2021 Yuqi Zhang, Qian Qi, Chong Liu, Weihua Chen, Fan Wang, Hao Li, Rong Jin

In this work, we propose a graph-based re-ranking method to improve learned features while still keeping Euclidean distance as the similarity metric.

Person Re-Identification Re-Ranking +1

Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement

no code implementations CVPR 2021 Huiwen Luo, Koki Nagano, Han-Wei Kung, Mclean Goldwhite, Qingguo Xu, Zejian Wang, Lingyu Wei, Liwen Hu, Hao Li

Cutting-edge 3D face reconstruction methods use non-linear morphable face models combined with GAN-based decoders to capture the likeness and details of a person but fail to produce neutral head models with unshaded albedo textures, which are critical for creating relightable and animation-friendly avatars for integration in virtual environments.

3D Face Reconstruction Face Model

SKFAC: Training Neural Networks With Faster Kronecker-Factored Approximate Curvature

1 code implementation CVPR 2021 Zedong Tang, Fenlong Jiang, Maoguo Gong, Hao Li, Yue Wu, Fan Yu, Zidong Wang, Min Wang

For the fully connected layers, by utilizing the low-rank property of Kronecker factors of Fisher information matrix, our method only requires inverting a small matrix to approximate the curvature with desirable accuracy.

Dimensionality Reduction

Task-Generic Hierarchical Human Motion Prior using VAEs

no code implementations7 Jun 2021 Jiaman Li, Ruben Villegas, Duygu Ceylan, Jimei Yang, Zhengfei Kuang, Hao Li, Yajie Zhao

We demonstrate the effectiveness of our hierarchical motion variational autoencoder in a variety of tasks including video-based human pose estimation, motion completion from partial observations, and motion synthesis from sparse key-frames.

Motion Synthesis Pose Estimation