Search Results for author: Yu Liu

Found 283 papers, 105 papers with code

SOM-NCSCM : An Efficient Neural Chinese Sentence Compression Model Enhanced with Self-Organizing Map

no code implementations EMNLP 2021 Kangli Zi, Shi Wang, Yu Liu, Jicun Li, Yanan Cao, Cungen Cao

Sentence Compression (SC), which aims to shorten sentences while retaining important words that express the essential meanings, has been studied for many years in many languages, especially in English.

Question Answering Sentence +2

More Classifiers, Less Forgetting: A Generic Multi-classifier Paradigm for Incremental Learning

1 code implementation ECCV 2020 Yu Liu, Sarah Parisot, Gregory Slabaugh, Xu Jia, Ales Leonardis, Tinne Tuytelaars

Since those regularization strategies are mostly associated with classifier outputs, we propose a MUlti-Classifier (MUC) incremental learning paradigm that integrates an ensemble of auxiliary classifiers to estimate more effective regularization constraints.

Incremental Learning

The Devil is in the Neurons: Interpreting and Mitigating Social Biases in Pre-trained Language Models

no code implementations 14 Jun 2024 Yan Liu, Yu Liu, Xiaokang Chen, Pin-Yu Chen, Daoguang Zan, Min-Yen Kan, Tsung-Yi Ho

As a result, previous debiasing methods mainly finetune or even pre-train language models on newly constructed anti-stereotypical datasets, which is costly.

Compressed Video Quality Enhancement with Temporal Group Alignment and Fusion

no code implementations 14 Jun 2024 Qiang Zhu, Yajun Qiu, Yu Liu, Shuyuan Zhu, Bing Zeng

In this paper, we propose a temporal group alignment and fusion network to enhance the quality of compressed videos by using the long-short term correlations between frames.

VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models

no code implementations 11 Jun 2024 Yu Liu, Lang Gao, Mingxin Yang, Yu Xie, Ping Chen, Xiaojin Zhang, Wei Chen

However, sound, comprehensive research on detecting program vulnerabilities, a more specific task related to code, and on evaluating the performance of LLMs in this more specialized scenario is still lacking.

Vulnerability Detection

Zero-shot Image Editing with Reference Imitation

no code implementations 11 Jun 2024 Xi Chen, Yutong Feng, Mengting Chen, Yiyang Wang, Shilong Zhang, Yu Liu, Yujun Shen, Hengshuang Zhao

Image editing serves as a practical yet challenging task considering the diverse demands from users, where one of the hardest parts is to precisely describe how the edited image should look.

Semantic correspondence

Instruction-Guided Visual Masking

1 code implementation 30 May 2024 Jinliang Zheng, Jianxiong Li, Sijie Cheng, Yinan Zheng, Jiaming Li, Jihao Liu, Yu Liu, Jingjing Liu, Xianyuan Zhan

To achieve more accurate and nuanced multimodal instruction following, we introduce Instruction-guided Visual Masking (IVM), a new versatile visual grounding model that is compatible with diverse multimodal models, such as LMMs and robot models.

Instruction Following Visual Grounding +1

Enhancing Vision-Language Model with Unmasked Token Alignment

1 code implementation 29 May 2024 Jihao Liu, Jinliang Zheng, Boxiao Liu, Yu Liu, Hongsheng Li

Contrastive pre-training on image-text pairs, exemplified by CLIP, has become a standard technique for learning multi-modal visual-language representations.

Language Modelling Self-Supervised Learning

Novel Class Discovery for Ultra-Fine-Grained Visual Categorization

1 code implementation CVPR 2024 Yu Liu, Yaqi Cai, Qi Jia, Binglin Qiu, Weimin WANG, Nan Pu

To tackle this problem, we devise a Region-Aligned Proxy Learning (RAPL) framework, which comprises a Channel-wise Region Alignment (CRA) module and a Semi-Supervised Proxy Learning (SemiPL) strategy.

Contrastive Learning Fine-Grained Visual Categorization +3

Batched Stochastic Bandit for Nondegenerate Functions

no code implementations 9 May 2024 Yu Liu, Yunlu Shu, Tianyu Wang

More specifically, we introduce an algorithm, called Geometric Narrowing (GN), whose regret bound is of order $\widetilde{\mathcal{O}}(A_{+}^d \sqrt{T})$.

Deep Reward Supervisions for Tuning Text-to-Image Diffusion Models

no code implementations 1 May 2024 Xiaoshi Wu, Yiming Hao, Manyuan Zhang, Keqiang Sun, Zhaoyang Huang, Guanglu Song, Yu Liu, Hongsheng Li

In this study, we propose Deep Reward Tuning (DRTune), an algorithm that directly supervises the final output image of a text-to-image diffusion model and back-propagates through the iterative sampling process to the input noise.

Denoising

ReZero: Boosting MCTS-based Algorithms by Backward-view and Entire-buffer Reanalyze

1 code implementation 25 Apr 2024 Chunyu Xuan, Yazhe Niu, Yuan Pu, Shuai Hu, Yu Liu, Jing Yang

Monte Carlo Tree Search (MCTS)-based algorithms, such as MuZero and its derivatives, have achieved widespread success in various decision-making domains.

Board Games Decision Making

Improving TAS Adaptability with a Variable Temperature Threshold

no code implementations 25 Apr 2024 Anthony Dowling, Ming-Cheng Cheng, Yu Liu

Thermal-Aware Scheduling (TAS) provides methods to manage the thermal dissipation of a computing chip during task execution.

Scheduling

MoVA: Adapting Mixture of Vision Experts to Multimodal Context

1 code implementation 19 Apr 2024 Zhuofan Zong, Bingqi Ma, Dazhong Shen, Guanglu Song, Hao Shao, Dongzhi Jiang, Hongsheng Li, Yu Liu

Although some large-scale pretrained vision encoders, such as those in CLIP and DINOv2, have brought promising performance, we found that there is still no single vision encoder that can dominate various image content understanding tasks; e.g., the CLIP vision encoder leads to outstanding results on general image understanding but poor performance on document or chart content.

Language Modelling Large Language Model

GLID: Pre-training a Generalist Encoder-Decoder Vision Model

no code implementations CVPR 2024 Jihao Liu, Jinliang Zheng, Yu Liu, Hongsheng Li

This paper proposes a GeneraLIst encoder-Decoder (GLID) pre-training method for better handling various downstream computer vision tasks.

Decoder Depth Estimation +6

Rethinking the Spatial Inconsistency in Classifier-Free Diffusion Guidance

1 code implementation CVPR 2024 Dazhong Shen, Guanglu Song, Zeyue Xue, Fu-Yun Wang, Yu Liu

Classifier-Free Guidance (CFG) has been widely used in text-to-image diffusion models, where the CFG scale is introduced to control the strength of text guidance on the whole image space.

Denoising Semantic Segmentation

CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching

2 code implementations 4 Apr 2024 Dongzhi Jiang, Guanglu Song, Xiaoshi Wu, Renrui Zhang, Dazhong Shen, Zhuofan Zong, Yu Liu, Hongsheng Li

We further attribute this phenomenon to the diffusion model's insufficient condition utilization, which is caused by its training paradigm.

Attribute Image Captioning +1

FlashFace: Human Image Personalization with High-fidelity Identity Preservation

no code implementations 25 Mar 2024 Shilong Zhang, Lianghua Huang, Xi Chen, Yifei Zhang, Zhi-Fan Wu, Yutong Feng, Wei Wang, Yujun Shen, Yu Liu, Ping Luo

This work presents FlashFace, a practical tool with which users can easily personalize their own photos on the fly by providing one or a few reference face images and a text prompt.

Face Swapping Instruction Following +1

Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language Models

1 code implementation 25 Mar 2024 Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng Li

This paper presents Visual CoT, a novel pipeline that leverages the reasoning capabilities of multi-modal large language models (MLLMs) by incorporating visual Chain-of-Thought (CoT) reasoning.

Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation

1 code implementation 20 Mar 2024 Fu-Yun Wang, Xiaoshi Wu, Zhaoyang Huang, Xiaoyu Shi, Dazhong Shen, Guanglu Song, Yu Liu, Hongsheng Li

We introduce MOTIA (Mastering Video Outpainting Through Input-Specific Adaptation), a diffusion-based pipeline that leverages both the intrinsic data-specific patterns of the source video and the image/video generative prior for effective outpainting.

FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis

1 code implementation 19 Mar 2024 Linjiang Huang, Rongyao Fang, Aiping Zhang, Guanglu Song, Si Liu, Yu Liu, Hongsheng Li

In this study, we delve into the generation of high-resolution images from pre-trained diffusion models, addressing persistent challenges, such as repetitive patterns and structural distortions, that emerge when models are applied beyond their trained resolutions.

Text-to-Image Generation

SmartRefine: A Scenario-Adaptive Refinement Framework for Efficient Motion Prediction

1 code implementation CVPR 2024 Yang Zhou, Hao Shao, Letian Wang, Steven L. Waslander, Hongsheng Li, Yu Liu

Context information, such as road maps and surrounding agents' states, provides crucial geometric and semantic information for motion behavior prediction.

Autonomous Vehicles motion prediction

Depth-induced Saliency Comparison Network for Diagnosis of Alzheimer's Disease via Jointly Analysis of Visual Stimuli and Eye Movements

no code implementations 15 Mar 2024 Yu Liu, Wenlin Zhang, Shaochu Wang, Fangyu Zuo, Peiguang Jing, Yong Ji

Early diagnosis of Alzheimer's Disease (AD) is very important for subsequent medical treatment, and eye movements under special visual stimuli may serve as a potential non-invasive biomarker for detecting cognitive abnormalities in AD patients.

CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement

no code implementations CVPR 2024 Qiang Zhu, Jinhua Hao, Yukang Ding, Yu Liu, Qiao Mo, Ming Sun, Chao Zhou, Shuyuan Zhu

Specifically, the ITA module aggregates temporal information from consecutive frames and coding priors, while the MNA module globally captures spatial information guided by residual frames.

CSCNET: Class-Specified Cascaded Network for Compositional Zero-Shot Learning

no code implementations 9 Mar 2024 Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He

Inspired by this, we propose a novel A-O disentangled framework for CZSL, namely Class-specified Cascaded Network (CSCNet).

Attribute Compositional Zero-Shot Learning +2

DecisionNCE: Embodied Multimodal Representations via Implicit Preference Learning

1 code implementation 28 Feb 2024 Jianxiong Li, Jinliang Zheng, Yinan Zheng, Liyuan Mao, Xiao Hu, Sijie Cheng, Haoyi Niu, Jihao Liu, Yu Liu, Jingjing Liu, Ya-Qin Zhang, Xianyuan Zhan

Multimodal pretraining is an effective strategy for the trinity of goals of representation learning in autonomous robots: 1) extracting both local and global task progressions; 2) enforcing temporal consistency of visual representation; 3) capturing trajectory-level language grounding.

Contrastive Learning Decision Making +1

Extensible Multi-Granularity Fusion Network for Aspect-based Sentiment Analysis

1 code implementation 12 Feb 2024 Xiaowei Zhao, Yong Zhou, Xiujuan Xu, Yu Liu

This paper presents the Extensible Multi-Granularity Fusion (EMGF) network, which integrates information from dependency and constituent syntactic, attention semantic, and external knowledge graphs.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +1

Estimating On-road Transportation Carbon Emissions from Open Data of Road Network and Origin-destination Flow Data

1 code implementation 7 Feb 2024 Jinwei Zeng, Yu Liu, Jingtao Ding, Jian Yuan, Yong Li

To relieve this issue by utilizing the strong pattern recognition of artificial intelligence, we incorporate two sources of open data representative of the transportation demand and capacity factors, the origin-destination (OD) flow data and the road network data, to build a hierarchical heterogeneous graph learning method for on-road carbon emission estimation (HENCE).

Graph Learning

Space Group Constrained Crystal Generation

no code implementations 6 Feb 2024 Rui Jiao, Wenbing Huang, Yu Liu, Deli Zhao, Yang Liu

Crystals are the foundation of numerous scientific and industrial applications.

StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis

no code implementations 30 Jan 2024 Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, Nan Duan

To leverage LLMs for visual synthesis, traditional methods convert raster image information into discrete grid tokens through specialized visual modules, while disrupting the model's ability to capture the true semantic representation of visual scenes.

Vector Graphics

Deep-Learning-Based Channel Estimation for IRS-Assisted ISAC System

no code implementations 29 Jan 2024 Yu Liu, Ibrahim Al-Nahhal, Octavia A. Dobre, Fanggang Wang

A deep-learning framework is proposed to estimate the sensing and communication (S&C) channels in such a system.

Extreme Learning Machine-based Channel Estimation in IRS-Assisted Multi-User ISAC System

no code implementations 29 Jan 2024 Yu Liu, Ibrahim Al-Nahhal, Octavia A. Dobre, Fanggang Wang, Hyundong Shin

Multi-user integrated sensing and communication (ISAC) assisted by an intelligent reflecting surface (IRS) has recently been investigated to provide transmission with high spectral and energy efficiency.

Efficient Neural Network

Deep-Learning Channel Estimation for IRS-Assisted Integrated Sensing and Communication System

no code implementations 29 Jan 2024 Yu Liu, Ibrahim Al-Nahhal, Octavia A. Dobre, Fanggang Wang

This problem is challenging due to the lack of signal processing capacity in passive IRS, as well as the presence of mutual interference between sensing and communication (SAC) signals in ISAC systems.

UV-SAM: Adapting Segment Anything Model for Urban Village Identification

1 code implementation 16 Jan 2024 Xin Zhang, Yu Liu, Yuming Lin, Qingmin Liao, Yong Li

Urban villages, defined as informal residential areas in or around urban centers, are characterized by inadequate infrastructures and poor living conditions, closely related to the Sustainable Development Goals (SDGs) on poverty, adequate housing, and sustainable cities.

Image Classification Semantic Segmentation

Knowledge-aware Graph Transformer for Pedestrian Trajectory Prediction

no code implementations 10 Jan 2024 Yu Liu, Yuexin Zhang, Kunming Li, Yongliang Qiao, Stewart Worrall, You-Fu Li, He Kong

To overcome this limitation, this paper proposes a graph transformer structure to improve prediction performance, capturing the differences between the various sites and scenarios contained in the datasets.

Autonomous Vehicles Domain Adaptation +2

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

no code implementations CVPR 2024 Xingzhong Hou, Boxiao Liu, Yi Zhang, Jihao Liu, Yu Liu, Haihang You

Generative models are gaining increasing popularity and the demand for precisely generating images is on the rise.

Image Manipulation

Check Locate Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation

no code implementations CVPR 2024 Biao Gong, Siteng Huang, Yutong Feng, Shiwei Zhang, Yuyuan Li, Yu Liu

To align the generated image with layout instructions, we present a training-free layout calibration system SimM that intervenes in the generative process on the fly during inference time.

Text-to-Image Generation

Multi-agent Collaborative Perception via Motion-aware Robust Communication Network

no code implementations CVPR 2024 Shixin Hong, Yu Liu, Zhi Li, Shaohui Li, You He

Collaborative perception allows for information sharing between multiple agents such as vehicles and infrastructure to obtain a comprehensive view of the environment through communication and fusion.

3D Object Detection object-detection

AllSpark: A Multimodal Spatio-Temporal General Intelligence Model with Thirteen Modalities

no code implementations 31 Dec 2023 Run Shao, Cheng Yang, Qiujun Li, Qing Zhu, Yongjun Zhang, Yansheng Li, Yu Liu, Yong Tang, Dapeng Liu, Shizhong Yang, Haifeng Li

We introduce the Language as Reference Framework (LaRF), a fundamental principle for constructing a multimodal unified model, aiming to strike a trade-off between the cohesion and autonomy among different modalities.

Mutual Information as Intrinsic Reward of Reinforcement Learning Agents for On-demand Ride Pooling

no code implementations 23 Dec 2023 Xianjie Zhang, Jiahao Sun, Chen Gong, Kai Wang, Yifei Cao, Hao Chen, Yu Liu

The emergence of on-demand ride pooling services allows each vehicle to serve multiple passengers at a time, thus increasing drivers' income and enabling passengers to travel at lower prices than taxi/car on-demand services (where only one passenger can be assigned to a car at a time, as with UberX and Lyft).

Reinforcement Learning (RL)

Critic-Guided Decision Transformer for Offline Reinforcement Learning

no code implementations 21 Dec 2023 Yuanfu Wang, Chao Yang, Ying Wen, Yu Liu, Yu Qiao

Recent advancements in offline reinforcement learning (RL) have underscored the capabilities of Return-Conditioned Supervised Learning (RCSL), a paradigm that learns the action distribution based on target returns for each state in a supervised manner.

D4RL Offline RL +3

Effect Size Estimation for Duration Recommendation in Online Experiments: Leveraging Hierarchical Models and Objective Utility Approaches

no code implementations 20 Dec 2023 Yu Liu, Runzhe Wan, James McQueen, Doug Hains, Jinxiang Gu, Rui Song

The selection of the assumed effect size (AES) critically determines the duration of an experiment, and hence its accuracy and efficiency.

Decision Making

VideoLCM: Video Latent Consistency Model

2 code implementations 14 Dec 2023 Xiang Wang, Shiwei Zhang, Han Zhang, Yu Liu, Yingya Zhang, Changxin Gao, Nong Sang

Consistency models have demonstrated powerful capability in efficient image generation and allowed synthesis within a few sampling steps, alleviating the high computational cost in diffusion models.

Computational Efficiency Image Generation +1

CCM: Adding Conditional Controls to Text-to-Image Consistency Models

no code implementations 12 Dec 2023 Jie Xiao, Kai Zhu, Han Zhang, Zhiheng Liu, Yujun Shen, Yu Liu, Xueyang Fu, Zheng-Jun Zha

Consistency Models (CMs) have shown promise in creating visual content efficiently and with high quality.

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

1 code implementation CVPR 2024 Hao Shao, Yuxuan Hu, Letian Wang, Steven L. Waslander, Yu Liu, Hongsheng Li

On the other hand, previous autonomous driving methods tend to rely on limited-format inputs (e.g., sensor data and navigation waypoints), restricting the vehicle's ability to understand language information and interact with humans.

Autonomous Driving Instruction Following

A Perspective of Q-value Estimation on Offline-to-Online Reinforcement Learning

1 code implementation 12 Dec 2023 Yinmin Zhang, Jie Liu, Chuming Li, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

In this paper, from a novel perspective, we systematically study the challenges that remain in O2O RL and identify that the reason behind the slow improvement of the performance and the instability of online finetuning lies in the inaccurate Q-value estimation inherited from offline pretraining.

Offline RL

Building Open-Ended Embodied Agent via Language-Policy Bidirectional Adaptation

no code implementations 12 Dec 2023 Shaopeng Zhai, Jie Wang, Tianyi Zhang, Fuxian Huang, Qi Zhang, Ming Zhou, Jing Hou, Yu Qiao, Yu Liu

Building embodied agents by integrating Large Language Models (LLMs) and Reinforcement Learning (RL) has revolutionized human-AI interaction: researchers can now leverage language instructions to plan decision-making for open-ended tasks.

Decision Making Language Modelling +1

LivePhoto: Real Image Animation with Text-guided Motion Control

no code implementations 5 Dec 2023 Xi Chen, Zhiheng Liu, Mengting Chen, Yutong Feng, Yu Liu, Yujun Shen, Hengshuang Zhao

In particular, considering the facts that (1) text can only describe motions roughly (e.g., regardless of the moving speed) and (2) text may include both content and motion descriptions, we introduce a motion intensity estimation module as well as a text re-weighting module to reduce the ambiguity of text-to-motion mapping.

Image Animation Text-to-Video Generation +1

Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following

no code implementations CVPR 2024 Yutong Feng, Biao Gong, Di Chen, Yujun Shen, Yu Liu, Jingren Zhou

Existing text-to-image (T2I) diffusion models usually struggle to interpret complex prompts, especially those with quantity, object-attribute binding, and multi-subject descriptions.

Attribute Denoising +1

Learning Disentangled Identifiers for Action-Customized Text-to-Image Generation

no code implementations CVPR 2024 Siteng Huang, Biao Gong, Yutong Feng, Xi Chen, Yuqian Fu, Yu Liu, Donglin Wang

Experimental results show that existing subject-driven customization methods fail to learn the representative characteristics of actions and struggle to decouple actions from context features, including appearance.

Text-to-Image Generation

Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation

no code implementations 27 Nov 2023 Biao Gong, Siteng Huang, Yutong Feng, Shiwei Zhang, Yuyuan Li, Yu Liu

To align the generated image with layout instructions, we present a training-free layout calibration system SimM that intervenes in the generative process on the fly during inference time.

Text-to-Image Generation

Towards Large-scale Masked Face Recognition

no code implementations 25 Oct 2023 Manyuan Zhang, Bingqi Ma, Guanglu Song, Yunxiao Wang, Hongsheng Li, Yu Liu

During the COVID-19 coronavirus epidemic, almost everyone is wearing masks, which poses a huge challenge for deep learning-based face recognition algorithms.

Face Recognition

Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection

no code implementations ICCV 2023 Manyuan Zhang, Guanglu Song, Yu Liu, Hongsheng Li

We observe that different regions of interest in the visual feature map are suitable for performing query classification and box localization tasks, even for the same object.

Classification Decoder +2

Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement Learning with Sub-optimal Demonstrations

no code implementations 13 Oct 2023 Lu Li, Yuxin Pan, RuoBing Chen, Jie Liu, Zilin Wang, Yu Liu, Zhiheng Li

Considering that obtaining expert demonstrations can be costly, the focus of current IRL techniques is on learning a better-than-demonstrator policy using a reward function derived from sub-optimal demonstrations.

Contrastive Learning

LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios

1 code implementation NeurIPS 2023 Yazhe Niu, Yuan Pu, Zhenjie Yang, Xueyan Li, Tong Zhou, Jiyuan Ren, Shuai Hu, Hongsheng Li, Yu Liu

Building agents based on tree-search planning capabilities with learned models has achieved remarkable success in classic decision-making problems, such as Go and Atari.

Board Games Decision Making

Continuous Invariance Learning

no code implementations 9 Oct 2023 Yong Lin, Fan Zhou, Lu Tan, Lintao Ma, Jiameng Liu, Yansu He, Yuan Yuan, Yu Liu, James Zhang, Yujiu Yang, Hao Wang

To address this challenge, we then propose Continuous Invariance Learning (CIL), which extracts invariant features across continuously indexed domains.

Cloud Computing

Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization

1 code implementation 5 Oct 2023 Zhanhui Zhou, Jie Liu, Chao Yang, Jing Shao, Yu Liu, Xiangyu Yue, Wanli Ouyang, Yu Qiao

A single language model (LM), despite aligning well with an average labeler through reinforcement learning from human feedback (RLHF), may not universally suit diverse human preferences.

Language Modelling Long Form Question Answering

Magicremover: Tuning-free Text-guided Image inpainting with Diffusion Models

no code implementations 4 Oct 2023 Siyuan Yang, Lu Zhang, Liqian Ma, Yu Liu, Jingjing Fu, You He

In this paper, we propose MagicRemover, a tuning-free method that leverages the powerful diffusion models for text-guided image inpainting.

Denoising Image Inpainting

Tracking Anything in Heart All at Once

no code implementations 4 Oct 2023 Chengkang Shen, Hao Zhu, You Zhou, Yu Liu, Si Yi, Lili Dong, Weipeng Zhao, David J. Brady, Xun Cao, Zhan Ma, Yi Lin

Myocardial motion tracking stands as an essential clinical tool in the prevention and detection of Cardiovascular Diseases (CVDs), the foremost cause of death globally.

Motion Estimation

Regulating CPU Temperature With Thermal-Aware Scheduling Using a Reduced Order Learning Thermal Model

no code implementations 2 Oct 2023 Anthony Dowling, Lin Jiang, Ming-Cheng Cheng, Yu Liu

Additionally, we compare the performance of a state-of-the-art TAS algorithm, RT-TAS, to our proposed POD-TAS algorithm.

Scheduling

Liveness Detection Competition -- Noncontact-based Fingerprint Algorithms and Systems (LivDet-2023 Noncontact Fingerprint)

no code implementations 1 Oct 2023 Sandip Purnapatra, Humaira Rezaie, Bhavin Jawade, Yu Liu, Yue Pan, Luke Brosell, Mst Rumana Sumi, Lambert Igene, Alden Dimarco, Srirangaraj Setlur, Soumyabrata Dey, Stephanie Schuckers, Marco Huber, Jan Niklas Kolf, Meiling Fang, Naser Damer, Banafsheh Adami, Raul Chitic, Karsten Seelert, Vishesh Mistry, Rahul Parthe, Umit Kacar

The competition serves as an important benchmark in noncontact-based fingerprint PAD, offering (a) an independent assessment of the state of the art in noncontact-based fingerprint PAD for algorithms and systems, (b) a common evaluation protocol, which includes finger photos of a variety of Presentation Attack Instruments (PAIs) and live fingers, for the biometric research community, and (c) standard algorithm and system evaluation protocols, along with a comparative analysis of state-of-the-art algorithms from academia and industry on both old and new Android smartphones.

Towards Generative Modeling of Urban Flow through Knowledge-enhanced Denoising Diffusion

1 code implementation 19 Sep 2023 Zhilun Zhou, Jingtao Ding, Yu Liu, Depeng Jin, Yong Li

To capture the effect of multiple factors on urban flow, such as region features and urban environment, we employ a diffusion model to generate urban flow for regions under different conditions.

Denoising

CHITNet: A Complementary to Harmonious Information Transfer Network for Infrared and Visible Image Fusion

no code implementations 12 Sep 2023 Yafei Zhang, Keying Du, Huafeng Li, Zhengtao Yu, Yu Liu

Specifically, to skillfully sidestep aggregating complementary information in IVIF, we design a mutual information transfer (MIT) module to mutually represent features from two modalities, roughly transferring complementary information into harmonious information.

Infrared And Visible Image Fusion

BigFUSE: Global Context-Aware Image Fusion in Dual-View Light-Sheet Fluorescence Microscopy with Image Formation Prior

no code implementations 5 Sep 2023 Yu Liu, Gesine Muller, Nassir Navab, Carsten Marr, Jan Huisken, Tingying Peng

Light-sheet fluorescence microscopy (LSFM), a planar illumination technique that enables high-resolution imaging of samples, experiences defocused image quality caused by light scattering when photons propagate through thick tissues.

Evaluation Mappings of Spatial Accelerator Based On Data Placement

no code implementations 4 Sep 2023 Zhipeng Wu, Yu Liu

Based on data placement relations, polyAcc accurately analyzes the data volume for different reuse patterns and estimates metrics, including data reuse, latency, and energy.

Relation Scheduling

Snow Removal for LiDAR Point Clouds with Spatio-temporal Conditional Random Fields

1 code implementation IEEE ROBOTICS AND AUTOMATION LETTERS 2023 Weimin WANG, Ting Yang, Yu Du, Yu Liu

The proposed approach first constructs the CRF based on k-nearest neighbors with the snow confidence derived from the physical priors of snow, such as intensity and distribution.

3D Object Detection Autonomous Driving +2

3D Semantic Subspace Traverser: Empowering 3D Generative Model with Shape Editing Capability

1 code implementation ICCV 2023 Ruowei Wang, Yu Liu, Pei Su, Jianwei Zhang, Qijun Zhao

Our method utilizes implicit functions as the 3D shape representation and combines a novel latent-space GAN with a linear subspace model to discover semantic dimensions in the local latent space of 3D shapes.

3D Shape Generation 3D Shape Representation +1

Theoretically Guaranteed Policy Improvement Distilled from Model-Based Planning

no code implementations 24 Jul 2023 Chuming Li, Ruonan Jia, Jie Liu, Yinmin Zhang, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

Model-based reinforcement learning (RL) has demonstrated remarkable successes on a range of continuous control tasks due to its high sample efficiency.

Continuous Control Model-based Reinforcement Learning +1

A Physics-Informed Data-Driven Fault Location Method for Transmission Lines Using Single-Ended Measurements with Field Data Validation

no code implementations 19 Jul 2023 Yiqi Xing, Yu Liu, Dayou Lu, Xinchen Zou, Xuming He

This procedure bridges the gap between simulation and practical power systems, and at the same time considers the uncertainty of system and fault parameters in practice.

AnyDoor: Zero-shot Object-level Image Customization

2 code implementations CVPR 2024 Xi Chen, Lianghua Huang, Yu Liu, Yujun Shen, Deli Zhao, Hengshuang Zhao

This work presents AnyDoor, a diffusion-based image generator with the power to teleport target objects to new scenes at user-specified locations in a harmonious way.

Object Virtual Try-on

OpenSiteRec: An Open Dataset for Site Recommendation

no code implementations 3 Jul 2023 Xinhang Li, Xiangyu Zhao, Yejing Wang, Yu Liu, Yong Li, Cheng Long, Yong Zhang, Chunxiao Xing

As a representative information retrieval task, site recommendation, which aims at predicting the optimal sites for a brand or an institution to open new branches in an automatic data-driven way, is beneficial and crucial for brand development in modern business.

Benchmarking Information Retrieval +1

Eliminating Lipschitz Singularities in Diffusion Models

no code implementations 20 Jun 2023 Zhantao Yang, Ruili Feng, Han Zhang, Yujun Shen, Kai Zhu, Lianghua Huang, Yifei Zhang, Yu Liu, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which employ stochastic differential equations to sample images through integrals, have emerged as a dominant class of generative models.

Learning Search-Space Specific Heuristics Using Neural Networks

no code implementations 6 Jun 2023 Yu Liu, Ryo Kuroiwa, Alex Fukunaga

We propose and evaluate a system which learns a neural network heuristic function for forward search-based, satisficing classical planning.

regression

Video Diffusion Models with Local-Global Context Guidance

1 code implementation 5 Jun 2023 Siyuan Yang, Lu Zhang, Yu Liu, Zhizhuo Jiang, You He

We construct a local-global context guidance strategy to capture the multi-perceptual embedding of the past fragment to boost the consistency of future prediction.

Future prediction Unconditional Video Generation +1

Cones 2: Customizable Image Synthesis with Multiple Subjects

1 code implementation 30 May 2023 Zhiheng Liu, Yifei Zhang, Yujun Shen, Kecheng Zheng, Kai Zhu, Ruili Feng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

Synthesizing images with user-specified subjects has received growing attention due to its practical applications.

Image Generation

Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising

1 code implementation 29 May 2023 Fu-Yun Wang, Wenshuo Chen, Guanglu Song, Han-Jia Ye, Yu Liu, Hongsheng Li

To address this challenge, we introduce a novel paradigm dubbed Gen-L-Video, capable of extending off-the-shelf short video diffusion models for generating and editing videos comprising hundreds of frames with diverse semantic segments without introducing additional training, all while preserving content consistency.

Denoising Image Generation +2

AUC Optimization from Multiple Unlabeled Datasets

no code implementations25 May 2023 Zheng Xie, Yu Liu, Ming Li

In this paper, we study the problem of building an AUC (area under ROC curve) optimization model from multiple unlabeled datasets, which maximizes the pairwise ranking ability of the classifier.

Weakly-supervised Learning
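
The abstract above describes AUC as a measure of pairwise ranking ability. As a minimal sketch of that interpretation only (not the paper's method, which works from unlabeled datasets), AUC on fully labeled data can be computed as the fraction of positive-negative pairs that the scores order correctly. The function name `pairwise_auc` and the toy scores and labels are illustrative assumptions.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    Assumes binary labels: 1 for positive, 0 for negative (illustrative setup).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    correct = 0.0
    for p, n in product(pos, neg):
        if p > n:
            correct += 1.0   # positive ranked above negative
        elif p == n:
            correct += 0.5   # tie: half credit
    return correct / (len(pos) * len(neg))


# Hypothetical toy data: two positives, three negatives.
scores = [0.9, 0.8, 0.35, 0.3, 0.1]
labels = [1, 0, 1, 0, 0]
print(pairwise_auc(scores, labels))
```

Here five of the six positive-negative pairs are ordered correctly, so the value is 5/6. The quadratic pair enumeration is chosen for clarity; a rank-based computation would be preferable for large datasets.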

Weakly Supervised AUC Optimization: A Unified Partial AUC Approach

no code implementations23 May 2023 Zheng Xie, Yu Liu, Hao-Yuan He, Ming Li, Zhi-Hua Zhou

Since acquiring perfect supervision is usually difficult, real-world machine learning tasks often confront inaccurate, incomplete, or inexact supervision, collectively referred to as weak supervision.

ReasonNet: End-to-End Driving with Temporal and Global Reasoning

no code implementations CVPR 2023 Hao Shao, Letian Wang, RuoBing Chen, Steven L. Waslander, Hongsheng Li, Yu Liu

The large-scale deployment of autonomous vehicles is yet to come, and one of the major remaining challenges lies in urban dense traffic scenarios.

Autonomous Driving

Efficient Reinforcement Learning for Autonomous Driving with Parameterized Skills and Priors

1 code implementation8 May 2023 Letian Wang, Jie Liu, Hao Shao, Wenshuo Wang, RuoBing Chen, Yu Liu, Steven L. Waslander

Inspired by this, we propose ASAP-RL, an efficient reinforcement learning algorithm for autonomous driving that simultaneously leverages motion skills and expert priors.

Autonomous Driving reinforcement-learning

Long-term Visual Localization with Mobile Sensors

no code implementations CVPR 2023 Shen Yan, Yu Liu, Long Wang, Zehong Shen, Zhen Peng, Haomin Liu, Maojun Zhang, Guofeng Zhang, Xiaowei Zhou

Despite the remarkable advances in image matching and pose estimation, image-based localization of a camera in a temporally-varying outdoor environment is still a challenging problem due to huge appearance disparity between query and reference images caused by illumination, seasonal and structural changes.

Image-Based Localization Pose Estimation +1

Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction

1 code implementation ICCV 2023 Zhuofan Zong, Dongzhi Jiang, Guanglu Song, Zeyue Xue, Jingyong Su, Hongsheng Li, Yu Liu

The HoP approach is straightforward: given the current timestamp t, we generate a pseudo Bird's-Eye View (BEV) feature of timestamp t-k from its adjacent frames and utilize this feature to predict the object set at timestamp t-k. Our approach is motivated by the observation that enforcing the detector to capture both the spatial location and temporal motion of objects occurring at historical timestamps can lead to more accurate BEV feature learning.

3D Object Detection Object

Experimentation Platforms Meet Reinforcement Learning: Bayesian Sequential Decision-Making for Continuous Monitoring

no code implementations2 Apr 2023 Runzhe Wan, Yu Liu, James McQueen, Doug Hains, Rui Song

With the growing needs of online A/B testing to support the innovation in industry, the opportunity cost of running an experiment becomes non-negligible.

Decision Making reinforcement-learning

TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

no code implementations29 Mar 2023 Yaobo Liang, Chenfei Wu, Ting Song, Wenshan Wu, Yan Xia, Yu Liu, Yang Ou, Shuai Lu, Lei Ji, Shaoguang Mao, Yun Wang, Linjun Shou, Ming Gong, Nan Duan

On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well.

Code Generation Common Sense Reasoning +1

End-to-End Personalized Next Location Recommendation via Contrastive User Preference Modeling

no code implementations22 Mar 2023 Yan Luo, Ye Liu, Fu-Lai Chung, Yu Liu, Chang Wen Chen

History encoder is designed to model mobility patterns from historical check-in sequences, while query generator explicitly learns user preferences to generate user-specific intention queries.

Decoder

GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding

1 code implementation ICCV 2023 Jihao Liu, Tai Wang, Boxiao Liu, Qihang Zhang, Yu Liu, Hongsheng Li

In this paper, we propose Geometry Enhanced Masked Image Modeling (GeoMIM) to transfer the knowledge of the LiDAR model in a pretrain-finetune paradigm for improving the multi-view camera-based 3D detection.

3D Object Detection Decoder +2

Deep Learning-based Eye-Tracking Analysis for Diagnosis of Alzheimer's Disease Using 3D Comprehensive Visual Stimuli

no code implementations13 Mar 2023 Fangyu Zuo, Peiguang Jing, Jinglin Sun, Jizhong Duan, Yong Ji, Yu Liu

To better analyze the differences in visual attention between AD patients and normals, we first conduct a 3D comprehensive visual task on a non-invasive eye-tracking system to collect visual attention heatmaps.

Cones: Concept Neurons in Diffusion Models for Customized Generation

1 code implementation9 Mar 2023 Zhiheng Liu, Ruili Feng, Kai Zhu, Yifei Zhang, Kecheng Zheng, Yu Liu, Deli Zhao, Jingren Zhou, Yang Cao

Concatenating multiple clusters of concept neurons can vividly generate all related concepts in a single image.

Comparison and Analysis of Cognitive Load under 2D/3D Visual Stimuli

no code implementations25 Feb 2023 Yu Liu, Chen Song, Yunpeng Yin, Herui Shi, Jinglin Sun, Han Wang, Peiguang Jing

According to our experiments and analysis, videos that involve simple observational tasks (P < 0.05) consistently induced a higher cognitive load in subjects when they were viewing 3D videos.

EEG Experimental Design

Knowledge-infused Contrastive Learning for Urban Imagery-based Socioeconomic Prediction

1 code implementation25 Feb 2023 Yu Liu, Xin Zhang, Jingtao Ding, Yanxin Xi, Yong Li

To address such issues, in this paper, we propose a Knowledge-infused Contrastive Learning (KnowCL) model for urban imagery-based socioeconomic prediction.

Contrastive Learning Representation Learning

Deep Learning for Video-Text Retrieval: a Review

no code implementations24 Feb 2023 Cunjuan Zhu, Qi Jia, Wei Chen, Yanming Guo, Yu Liu

Video-Text Retrieval (VTR) aims to search for the most relevant video related to the semantics in a given sentence, and vice versa.

Retrieval Sentence +2

Composer: Creative and Controllable Image Synthesis with Composable Conditions

6 code implementations20 Feb 2023 Lianghua Huang, Di Chen, Yu Liu, Yujun Shen, Deli Zhao, Jingren Zhou

Recent large-scale generative models learned on big data are capable of synthesizing incredible images yet suffer from limited controllability.

Image Colorization Image-to-Image Translation +3

Render-and-Compare: Cross-View 6 DoF Localization from Noisy Prior

no code implementations13 Feb 2023 Shen Yan, Xiaoya Cheng, Yuxiang Liu, Juelin Zhu, Rouwan Wu, Yu Liu, Maojun Zhang

Despite the significant progress in 6-DoF visual localization, researchers are mostly driven by ground-level benchmarks.

Pose Estimation Visual Localization

Dimension Reduction and MARS

1 code implementation11 Feb 2023 Yu Liu, Degui Li, Yingcun Xia

The multivariate adaptive regression spline (MARS) is one of the popular estimation methods for nonparametric multivariate regressions.

Dimensionality Reduction regression

SLOTH: Structured Learning and Task-based Optimization for Time Series Forecasting on Hierarchies

no code implementations11 Feb 2023 Fan Zhou, Chen Pan, Lintao Ma, Yu Liu, Shiyu Wang, James Zhang, Xinxin Zhu, Xuanwei Hu, Yunhua Hu, Yangfei Zheng, Lei Lei, Yun Hu

Moreover, unlike most previous reconciliation methods which either rely on strong assumptions or focus on coherent constraints only, we utilize deep neural optimization networks, which not only achieve coherency without any assumptions, but also allow more flexible and realistic constraints to achieve task-based targets, e.g., lower under-estimation penalty and meaningful decision-making loss to facilitate the subsequent downstream tasks.

Decision Making Multivariate Time Series Forecasting +1

Centralized Cooperative Exploration Policy for Continuous Control Tasks

1 code implementation6 Jan 2023 Chao Li, Chen Gong, Qiang He, Xinwen Hou, Yu Liu

To explicitly encourage exploration in continuous control tasks, we propose CCEP (Centralized Cooperative Exploration Policy), which utilizes underestimation and overestimation of value functions to maintain the capacity of exploration.

Continuous Control

Urban Visual Intelligence: Studying Cities with AI and Street-level Imagery

no code implementations2 Jan 2023 Fan Zhang, Arianna Salazar Miranda, Fábio Duarte, Lawrence Vale, Gary Hack, Min Chen, Yu Liu, Michael Batty, Carlo Ratti

The visual dimension of cities has been a fundamental subject in urban studies, since the pioneering work of scholars such as Sitte, Lynch, Arnheim, and Jacobs.

Masked Autoencoders Are Stronger Knowledge Distillers

no code implementations ICCV 2023 Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu, Yujiu Yang

In MKD, random patches of the input image are masked, and the corresponding missing feature is recovered by forcing it to imitate the output of the teacher.

Decoder Knowledge Distillation +3

Deep Active Contours for Real-time 6-DoF Object Tracking

no code implementations ICCV 2023 Long Wang, Shen Yan, Jianan Zhen, Yu Liu, Maojun Zhang, Guofeng Zhang, Xiaowei Zhou

Specifically, given an initial pose, we project the object model to the image plane to obtain the initial contour and use a lightweight network to predict how the contour should move to match the true object boundary, which provides the gradients to optimize the object pose.

Computational Efficiency Object +1

Generating Dynamic Kernels via Transformers for Lane Detection

1 code implementation ICCV 2023 Ziye Chen, Yu Liu, Mingming Gong, Bo Du, Guoqi Qian, Kate Smith-Miles

While such methods reduce the reliance on specific knowledge, the kernels computed from the key locations fail to capture the lane line's global structure due to its long and thin structure, leading to inaccurate detection of lane lines with complex topologies.

Lane Detection

UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors

no code implementations ICCV 2023 Shanshan Lao, Guanglu Song, Boxiao Liu, Yu Liu, Yujiu Yang

Bridging this semantic gap now requires case-by-case algorithm design which is time-consuming and heavily relies on experienced adjustment.

Knowledge Distillation

Hyperbolic Hierarchical Contrastive Hashing

no code implementations17 Dec 2022 Rukai Wei, Yu Liu, Jingkuan Song, Yanzhao Xie, Ke Zhou

To exploit the hierarchical semantic structures in hyperbolic space, we designed the hierarchical contrastive learning algorithm, including hierarchical instance-wise and hierarchical prototype-wise contrastive learning.

Contrastive Learning Retrieval

ACE: Cooperative Multi-agent Q-learning with Bidirectional Action-Dependency

1 code implementation29 Nov 2022 Chuming Li, Jie Liu, Yinmin Zhang, Yuhong Wei, Yazhe Niu, Yaodong Yang, Yu Liu, Wanli Ouyang

In the learning phase, each agent minimizes the TD error that is dependent on how the subsequent agents have reacted to their chosen action.

Decision Making Q-Learning +2

Dimensionality-Varying Diffusion Process

no code implementations CVPR 2023 Han Zhang, Ruili Feng, Zhantao Yang, Lianghua Huang, Yu Liu, Yifei Zhang, Yujun Shen, Deli Zhao, Jingren Zhou, Fan Cheng

Diffusion models, which learn to reverse a signal destruction process to generate new data, typically require the signal at each step to have the same dimension.

Image Generation

DETRs with Collaborative Hybrid Assignments Training

3 code implementations ICCV 2023 Zhuofan Zong, Guanglu Song, Yu Liu

This new training scheme can easily enhance the encoder's learning ability in end-to-end detectors by training the multiple parallel auxiliary heads supervised by one-to-many label assignments such as ATSS and Faster RCNN.

Ranked #1 on Object Detection on COCO test-dev (using extra training data)

Decoder Instance Segmentation +2

GAN Inversion for Image Editing via Unsupervised Domain Adaptation

no code implementations22 Nov 2022 Siyu Xing, Chen Gong, Hewei Guo, Xiao-Yu Zhang, Xinwen Hou, Yu Liu

Existing GAN inversion methods work brilliantly in reconstructing high-quality (HQ) images while struggling with more common low-quality (LQ) inputs in practical application.

Image Reconstruction Unsupervised Domain Adaptation

Teach-DETR: Better Training DETR with Teachers

1 code implementation22 Nov 2022 Linjiang Huang, Kaixin Lu, Guanglu Song, Liang Wang, Si Liu, Yu Liu, Hongsheng Li

In this paper, we present a novel training scheme, namely Teach-DETR, to learn better DETR-based detectors from versatile teacher detectors.

Real-World Image Super Resolution via Unsupervised Bi-directional Cycle Domain Transfer Learning based Generative Adversarial Network

no code implementations19 Nov 2022 Xiang Wang, Yimin Yang, Zhichang Guo, Zhili Zhou, Yu Liu, Qixiang Pang, Shan Du

First, the UBCDTN is able to produce an approximated real-like LR image through transferring the LR image from an artificially degraded domain to the real-world LR image domain.

Generative Adversarial Network Image Super-Resolution +1

Semantic Encoder Guided Generative Adversarial Face Ultra-Resolution Network

no code implementations18 Nov 2022 Xiang Wang, Yimin Yang, Qixiang Pang, Xiao Lu, Yu Liu, Shan Du

In this paper, we propose a novel face super-resolution method, namely Semantic Encoder guided Generative Adversarial Face Ultra-Resolution Network (SEGA-FURN) to ultra-resolve an unaligned tiny LR face image to its HR counterpart with multiple ultra-upscaling factors (e.g., 4x and 8x).

Image Super-Resolution

Channel Tracking for RIS-aided mmWave Communications Under High Mobility Scenarios

no code implementations7 Nov 2022 Yu Liu, Ming Chen, Cunhua Pan, Yijin Pan, Yinlu Wang, Yaoming Huang, Tianyang Cao, Jiangzhou Wang

The emerging reconfigurable intelligent surface (RIS) technology is promising for applications in the millimeter wave (mmWave) communication systems to effectively compensate for propagation loss or tackle the blockage issue.

Vocal Bursts Intensity Prediction

PolyBuilding: Polygon Transformer for End-to-End Building Extraction

no code implementations3 Nov 2022 Yuan Hu, Zhibin Wang, Zhou Huang, Yu Liu

Given a set of polygon queries, the model learns the relations among them and encodes context information from the image to predict the final set of building polygons with fixed vertex numbers.

Decoder

Large-batch Optimization for Dense Visual Predictions

1 code implementation20 Oct 2022 Zeyue Xue, Jianming Liang, Guanglu Song, Zhuofan Zong, Liang Chen, Yu Liu, Ping Luo

To address this challenge, we propose a simple yet effective algorithm, named Adaptive Gradient Variance Modulator (AGVM), which can train dense visual predictors with very large batch size, enabling several benefits more appealing than prior arts.

Instance Segmentation object-detection +3

Improving Object-centric Learning with Query Optimization

2 code implementations17 Oct 2022 Baoxiong Jia, Yu Liu, Siyuan Huang

The ability to decompose complex natural scenes into meaningful object-centric abstractions lies at the core of human perception and reasoning.

Image Segmentation Object +3

DiffGAR: Model-Agnostic Restoration from Generative Artifacts Using Image-to-Image Diffusion Models

no code implementations16 Oct 2022 Yueqin Yin, Lianghua Huang, Yu Liu, Kaiqi Huang

In this work, we first design a group of mechanisms to simulate generative artifacts of popular generators (i.e., GANs, autoregressive models, and diffusion models), given real images.

Image Generation Image Restoration

SOM-Net: Unrolling the Subspace-based Optimization for Solving Full-wave Inverse Scattering Problems

no code implementations8 Sep 2022 Yu Liu, Hao Zhao, Rencheng Song, Xudong Chen, Chang Li, Xun Chen

The final output of the SOM-Net is the full predicted induced current, from which the scattered field and the permittivity image can also be deduced analytically.

Rolling Shutter Correction

EEG-based Emotion Recognition via Efficient Convolutional Neural Network and Contrastive Learning

no code implementations IEEE Sensors Journal 2022 Chang Li, Xuejuan Lin, Yu Liu, Rencheng Song, Juan Cheng, Xun Chen

To achieve a simple and effective model with supervised learning, we propose an efficient CNN and contrastive learning (ECNN-C) method for EEG-based emotion recognition.

Contrastive Learning EEG +1

Towards Robust Face Recognition with Comprehensive Search

no code implementations29 Aug 2022 Manyuan Zhang, Guanglu Song, Yu Liu, Hongsheng Li

To eliminate the bias of single-aspect research and provide an overall understanding of the face recognition model design, we first carefully design the search space for each aspect, then a comprehensive search method is introduced to jointly search optimal data cleaning, architecture, and loss function design.

Face Recognition Robust Face Recognition

Unifying Visual Perception by Dispersible Points Learning

1 code implementation18 Aug 2022 Jianming Liang, Guanglu Song, Biao Leng, Yu Liu

The method, called UniHead, views different visual perception tasks as the dispersible points learning via the transformer encoder architecture.

Instance Segmentation Object +5

Single-Stage Open-world Instance Segmentation with Cross-task Consistency Regularization

1 code implementation18 Aug 2022 Xizhe Xue, Dongdong Yu, Lingqiao Liu, Yu Liu, Satoshi Tsutsui, Ying Li, Zehuan Yuan, Ping Song, Mike Zheng Shou

Based on the single-stage instance segmentation framework, we propose a regularization model to predict foreground pixels and use its relation to instance segmentation to construct a cross-task consistency loss.

Autonomous Driving Object +3

Rethinking Robust Representation Learning Under Fine-grained Noisy Faces

no code implementations8 Aug 2022 Bingqi Ma, Guanglu Song, Boxiao Liu, Yu Liu

To better understand this, we reformulate the noise type of each class in a more fine-grained manner as N-identities|K^C-clusters.

Face Recognition Representation Learning

TokenMix: Rethinking Image Mixing for Data Augmentation in Vision Transformers

1 code implementation18 Jul 2022 Jihao Liu, Boxiao Liu, Hang Zhou, Hongsheng Li, Yu Liu

In this paper, we propose a novel data augmentation technique TokenMix to improve the performance of vision transformers.

Data Augmentation

UniNet: Unified Architecture Search with Convolution, Transformer, and MLP

2 code implementations12 Jul 2022 Jihao Liu, Xin Huang, Guanglu Song, Hongsheng Li, Yu Liu

Finally, we integrate configurable operators and DSMs into a unified search space and search with a Reinforcement Learning-based search algorithm to fully explore the optimal combination of the operators.

Image Classification Neural Architecture Search

DeStripe: A Self2Self Spatio-Spectral Graph Neural Network with Unfolded Hessian for Stripe Artifact Removal in Light-sheet Microscopy

no code implementations27 Jun 2022 Yu Liu, Kurt Weiss, Nassir Navab, Carsten Marr, Jan Huisken, Tingying Peng

Light-sheet fluorescence microscopy (LSFM) is a cutting-edge volumetric imaging technique that allows for three-dimensional imaging of mesoscopic samples with decoupled illumination and detection paths.

Denoising Graph Neural Network

Extending regionalization algorithms to explore spatial process heterogeneity

2 code implementations19 Jun 2022 Hao Guo, Andre Python, Yu Liu

In spatial regression models, spatial heterogeneity may be considered with either continuous or discrete specifications.

regression

Enhancing Quality of Pose-varied Face Restoration with Local Weak Feature Sensing and GAN Prior

no code implementations28 May 2022 Kai Hu, Yu Liu, Renhe Liu, Wei Lu, Gang Yu, Bin Fu

In the asymmetric codec, we adopt a mixed multi-path residual block (MMRB) to gradually extract weak texture features of input images, which can better preserve the original facial features and avoid excessive fantasy.

Blind Face Restoration Super-Resolution

MixMAE: Mixed and Masked Autoencoder for Efficient Pretraining of Hierarchical Vision Transformers

1 code implementation CVPR 2023 Jihao Liu, Xin Huang, Jinliang Zheng, Yu Liu, Hongsheng Li

In this paper, we propose Mixed and Masked AutoEncoder (MixMAE), a simple but efficient pretraining method that is applicable to various hierarchical Vision Transformers.

Image Classification Object Detection +2

HDGT: Heterogeneous Driving Graph Transformer for Multi-Agent Trajectory Prediction via Scene Encoding

1 code implementation30 Apr 2022 Xiaosong Jia, Penghao Wu, Li Chen, Yu Liu, Hongyang Li, Junchi Yan

Based on these observations, we propose Heterogeneous Driving Graph Transformer (HDGT), a backbone modelling the driving scene as a heterogeneous graph with different types of nodes and edges.

Autonomous Driving graph construction +3

Dual-Domain Reconstruction Networks with V-Net and K-Net for fast MRI

no code implementations11 Mar 2022 Xiaohan Liu, Yanwei Pang, Ruiqi Jin, Yu Liu, ZhenChang Wang

Purpose: To introduce a dual-domain reconstruction network with V-Net and K-Net for accurate MR image reconstruction from undersampled k-space data.

Decoder Image Reconstruction

Mapping evolving population geography in China

1 code implementation4 Mar 2022 Lei Dong, Rui Du, Yu Liu

China's demographic changes have important global economic and geopolitical implications.

Meta Knowledge Distillation

no code implementations16 Feb 2022 Jihao Liu, Boxiao Liu, Hongsheng Li, Yu Liu

Recent studies pointed out that knowledge distillation (KD) suffers from two degradation problems, the teacher-student gap and the incompatibility with strong data augmentations, making it not applicable to training state-of-the-art models, which are trained with advanced augmentations.

Data Augmentation Image Classification +1

Pedestrian Dead Reckoning System using Quasi-static Magnetic Field Detection

no code implementations24 Jan 2022 Liqiang Zhang, Kai Guo, Yu Liu

Kalman filter-based Inertial Navigation System (INS) is a reliable and efficient method to estimate the position of a pedestrian indoors.

Position valid

UniFormer: Unifying Convolution and Self-attention for Visual Recognition

7 code implementations24 Jan 2022 Kunchang Li, Yali Wang, Junhao Zhang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao

Different from the typical transformer blocks, the relation aggregators in our UniFormer block are equipped with local and global token affinity respectively in shallow and deep layers, allowing it to tackle both redundancy and dependency for efficient and effective representation learning.

Image Classification object-detection +5

UniFormer: Unified Transformer for Efficient Spatiotemporal Representation Learning

2 code implementations12 Jan 2022 Kunchang Li, Yali Wang, Peng Gao, Guanglu Song, Yu Liu, Hongsheng Li, Yu Qiao

For Something-Something V1 and V2, our UniFormer achieves new state-of-the-art performances of 60.9% and 71.2% top-1 accuracy respectively.

Representation Learning

Swift and Sure: Hardness-aware Contrastive Learning for Low-dimensional Knowledge Graph Embeddings

no code implementations3 Jan 2022 Kai Wang, Yu Liu, Quan Z. Sheng

Knowledge graph embedding (KGE) has shown great potential in automatic knowledge graph (KG) completion and knowledge-driven tasks.

Contrastive Learning Knowledge Graph Embedding +1

Segment, Magnify and Reiterate: Detecting Camouflaged Objects the Hard Way

1 code implementation CVPR 2022 Qi Jia, Shuilian Yao, Yu Liu, Xin Fan, Risheng Liu, Zhongxuan Luo

To tackle camouflaged object detection (COD), we are inspired by human attention coupled with the coarse-to-fine detection strategy, and thereby propose an iterative refinement framework, coined SegMaR, which integrates Segment, Magnify and Reiterate in a multi-stage detection fashion.

object-detection Object Detection