Search Results for author: Lei Bai

Found 118 papers, 56 papers with code

Dynamic Base model Shift for Delta Compression

no code implementations 16 May 2025 Chenyu Huang, Peng Ye, Shenghe Zheng, Xiaohui Wang, Lei Bai, Tao Chen, Wanli Ouyang

To this end, we propose Dynamic Base Model Shift (DBMS), which dynamically adapts the base model to the target task before performing delta compression.

OmniCaptioner: One Captioner to Rule Them All

1 code implementation 9 Apr 2025 Yiting Lu, Jiakang Yuan, Zhen Li, Shitian Zhao, Qi Qin, Xinyue Li, Le Zhuo, Licheng Wen, Dongyang Liu, Yuewen Cao, Xiangchao Yan, Xin Li, Botian Shi, Tao Chen, Zhibo Chen, Lei Bai, Bo Zhang, Peng Gao

We propose OmniCaptioner, a versatile visual captioning framework for generating fine-grained textual descriptions across a wide variety of visual domains.

All Image Captioning +2

UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines

no code implementations 26 Mar 2025 Chen Tang, Xinzhu Ma, Encheng Su, Xiufeng Song, Xiaohong Liu, Wei-Hong Li, Lei Bai, Wanli Ouyang, Xiangyu Yue

Traditional spatiotemporal models generally rely on task-specific architectures, which limit their generalizability and scalability across diverse tasks due to domain-specific design requirements.

RoboFactory: Exploring Embodied Agent Collaboration with Compositional Constraints

no code implementations 20 Mar 2025 Yiran Qin, Li Kang, Xiufeng Song, Zhenfei Yin, Xiaohong Liu, Xihui Liu, Ruimao Zhang, Lei Bai

Furthermore, we explore the architectures and training strategies for multi-agent imitation learning, aiming to build safe and efficient embodied multi-agent systems.

Imitation Learning

Attention Reallocation: Towards Zero-cost and Controllable Hallucination Mitigation of MLLMs

no code implementations 11 Mar 2025 Chongjun Tu, Peng Ye, Dongzhan Zhou, Lei Bai, Gang Yu, Tao Chen, Wanli Ouyang

Multi-Modal Large Language Models (MLLMs) stand out in various tasks but still struggle with hallucinations.

Hallucination

Transforming Weather Data from Pixel to Latent Space

no code implementations 9 Mar 2025 Sijie Zhao, Feng Liu, Xueliang Zhang, Hao Chen, Tao Han, Junchao Gong, Ran Tao, Pengfeng Xiao, Lei Bai, Wanli Ouyang

The downstream task further demonstrates that task models can apply to multiple PVS with low data costs in latent space and achieve superior performance compared to models in pixel space.

Seeing Delta Parameters as JPEG Images: Data-Free Delta Compression with Discrete Cosine Transform

no code implementations 9 Mar 2025 Chenyu Huang, Peng Ye, Xiaohui Wang, Shenghe Zheng, Biqing Qi, Lei Bai, Wanli Ouyang, Tao Chen

With transformer-based models and the pretrain-finetune paradigm becoming mainstream, the high storage and deployment costs of individual finetuned models on multiple tasks pose critical challenges.

Image Compression Quantization

Nature-Inspired Population-Based Evolution of Large Language Models

1 code implementation 3 Mar 2025 Yiqun Zhang, Peng Ye, Xiaocui Yang, Shi Feng, Shufei Zhang, Lei Bai, Wanli Ouyang, Shuyue Hu

Evolution, the engine behind the survival and growth of life on Earth, operates through the population-based process of reproduction.

Zero-shot Generalization

SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model

1 code implementation 27 Feb 2025 Xinghao Wang, Feng Liu, Rui Su, Zhihui Wang, Lei Bai, Wanli Ouyang

Recent advances in deep learning have revolutionized seismic monitoring, yet developing a foundation model that performs well across multiple complex tasks remains challenging, particularly when dealing with degraded signals or data scarcity.

Language Modeling Language Modelling +1

EvoFlow: Evolving Diverse Agentic Workflows On The Fly

no code implementations 11 Feb 2025 Guibin Zhang, Kaijie Chen, Guancheng Wan, Heng Chang, Hong Cheng, Kun Wang, Shuyue Hu, Lei Bai

The past two years have witnessed the evolution of large language model (LLM)-based multi-agent systems from labor-intensive manual design to partial automation (e.g., prompt engineering, communication topology) and eventually to fully automated design.

Large Language Model Prompt Engineering +1

Satellite Observations Guided Diffusion Model for Accurate Meteorological States at Arbitrary Resolution

no code implementations 9 Feb 2025 Siwei Tu, Ben Fei, Weidong Yang, Fenghua Ling, Hao Chen, Zili Liu, Kun Chen, Hang Fan, Wanli Ouyang, Lei Bai

In the sampling, we employed optimizable convolutional kernels to simulate the upscale process, thereby generating high-resolution ERA5 maps using low-resolution ERA5 maps as well as observations from weather stations as guidance.

Spatial Interpolation Weather Forecasting

Multi-agent Architecture Search via Agentic Supernet

1 code implementation 6 Feb 2025 Guibin Zhang, Luyang Niu, Junfeng Fang, Kun Wang, Lei Bai, Xiang Wang

Large Language Model (LLM)-empowered multi-agent systems extend the cognitive boundaries of individual agents through disciplined collaboration and interaction, while constructing these systems often requires labor-intensive manual designs.

Language Modeling Language Modelling +1

VQLTI: Long-Term Tropical Cyclone Intensity Forecasting with Physical Constraints

1 code implementation 30 Jan 2025 Xinyu Wang, Lei Liu, Kang Chen, Tao Han, Bin Li, Lei Bai

(2) Incorporating physical knowledge and physical constraints can help mitigate the accumulation of forecasting errors.

Tropical Cyclone Intensity Forecasting

Kolmogorov Arnold Neural Interpolator for Downscaling and Correcting Meteorological Fields from In-Situ Observations

no code implementations 24 Jan 2025 Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Zhengxia Zou, Zhenwei Shi

Obtaining accurate weather forecasts at station locations is a critical challenge due to systematic biases arising from the mismatch between multi-scale, continuous atmospheric characteristics and their discrete, gridded representations.

DispFormer: Pretrained Transformer for Flexible Dispersion Curve Inversion from Global Synthesis to Regional Applications

1 code implementation 8 Jan 2025 Feng Liu, Bao Deng, Rui Su, Lei Bai, Wanli Ouyang

Surface wave dispersion curve inversion is essential for estimating subsurface Shear-wave velocity ($v_s$), yet traditional methods often struggle to balance computational efficiency with inversion accuracy.

Computational Efficiency

Dolphin: Closed-loop Open-ended Auto-research through Thinking, Practice, and Feedback

no code implementations 7 Jan 2025 Jiakang Yuan, Xiangchao Yan, Botian Shi, Tao Chen, Wanli Ouyang, Bo Zhang, Lei Bai, Yu Qiao, BoWen Zhou

The scientific research paradigm is undergoing a profound transformation owing to the development of Artificial Intelligence (AI).

Image Classification

LWFNet: Coherent Doppler Wind Lidar-Based Network for Wind Field Retrieval

no code implementations 5 Jan 2025 Ran Tao, Chong Wang, Hao Chen, Mingjiao Jia, Xiang Shang, Luoyuan Qu, Guoliang Shentu, Yanyu Lu, Yanfeng Huo, Lei Bai, Xianghui Xue, Xiankang Dou

LWFNet demonstrates remarkable performance in lidar-based wind field retrieval, setting a benchmark for future research and advancing the development of deep learning models in this domain.

Retrieval Weather Forecasting

Chimera: Improving Generalist Model with Domain-Specific Experts

no code implementations 8 Dec 2024 Tianshuo Peng, Mingsheng Li, Hongbin Zhou, Renqiu Xia, Renrui Zhang, Lei Bai, Song Mao, Bin Wang, Conghui He, Aojun Zhou, Botian Shi, Tao Chen, Bo Zhang, Xiangyu Yue

This results in a versatile model that excels across the chart, table, math, and document domains, achieving state-of-the-art performance on multi-modal reasoning and visual content extraction tasks, both of which are challenging tasks for assessing existing LMMs.

Math model

DuoCast: Duo-Probabilistic Meteorology-Aware Model for Extended Precipitation Nowcasting

1 code implementation 2 Dec 2024 Penghui Wen, Lei Bai, Mengwei He, Patrick Filippi, Feng Zhang, Thomas Francis Bishop, Zhiyong Wang, Kun Hu

Extended short-term precipitation nowcasting struggles with decreasing precision because of insufficient consideration of meteorological knowledge, such as weather fronts, which significantly influence precipitation intensity, duration, and spatial distribution.

FengWu-W2S: A deep learning model for seamless weather-to-subseasonal forecast of global atmosphere

no code implementations 15 Nov 2024 Fenghua Ling, Kang Chen, Jiye Wu, Tao Han, Jing-Jia Luo, Wanli Ouyang, Lei Bai

Seamless forecasting that produces warning information at continuum timescales based on only one system is a long-standing pursuit for weather-climate service.

On Learning Multi-Modal Forgery Representation for Diffusion Generated Video Detection

1 code implementation 31 Oct 2024 Xiufeng Song, Xiao Guo, Jiache Zhang, Qirui Li, Lei Bai, Xiaoming Liu, Guangtao Zhai, Xiaohong Liu

Large numbers of synthesized videos from diffusion models pose threats to information security and authenticity, leading to an increasing demand for generated content detection.

Video Forensics

WorldSimBench: Towards Video Generation Models as World Simulators

no code implementations 23 Oct 2024 Yiran Qin, Zhelun Shi, Jiwen Yu, Xijun Wang, Enshen Zhou, Lijun Li, Zhenfei Yin, Xihui Liu, Lu Sheng, Jing Shao, Lei Bai, Wanli Ouyang, Ruimao Zhang

WorldSimBench includes Explicit Perceptual Evaluation and Implicit Manipulative Evaluation, encompassing human preference assessments from the visual perspective and action-level evaluations in embodied tasks, covering three representative embodied scenarios: Open-Ended Embodied Environment, Autonomous Driving, and Robot Manipulation.

Autonomous Driving Robot Manipulation +1

SIFM: A Foundation Model for Multi-granularity Arctic Sea Ice Forecasting

no code implementations 16 Oct 2024 Jingyi Xu, Yeqi Luo, Weidong Yang, Keyi Liu, Shengnan Wang, Ben Fei, Lei Bai

Arctic sea ice plays a vital role in global climate and has paramount impacts on both polar ecosystems and coastal communities.

Diffusion Models Need Visual Priors for Image Generation

no code implementations 11 Oct 2024 Xiaoyu Yue, Zidong Wang, Zeyu Lu, Shuyang Sun, Meng Wei, Wanli Ouyang, Lei Bai, Luping Zhou

Conventional class-guided diffusion models generally succeed in generating images with correct semantic content, but often struggle with texture details.

Image Generation

IceDiff: High Resolution and High-Quality Sea Ice Forecasting with Generative Diffusion Prior

no code implementations 10 Oct 2024 Jingyi Xu, Siwei Tu, Weidong Yang, Shuhao Li, Keyi Liu, Yeqi Luo, Lipeng Ma, Ben Fei, Lei Bai

Variation of Arctic sea ice has significant impacts on polar ecosystems, transporting routes, coastal communities, and global climate.

PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling

no code implementations 8 Oct 2024 Junchao Gong, Siwei Tu, Weidong Yang, Ben Fei, Kun Chen, Wenlong Zhang, Xiaokang Yang, Wanli Ouyang, Lei Bai

By rethinking the blurriness in precipitation nowcasting as a blur kernel acting on predictions, we propose an unsupervised postprocessing method to eliminate the blurriness without the requirement of training with the pairs of blurry predictions and corresponding ground truth.

Denoising

WeatherFormer: Empowering Global Numerical Weather Forecasting with Space-Time Transformer

no code implementations 21 Sep 2024 Junchao Gong, Tao Han, Kang Chen, Lei Bai

The Numerical Weather Prediction (NWP) system is infrastructure that exerts considerable impact on modern society. Traditional NWP systems, however, rely on solving complex partial differential equations with huge computing clusters, resulting in tons of carbon emissions.

Data Augmentation Weather Forecasting

ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems

1 code implementation 2 Sep 2024 Xiangyuan Xue, Zeyu Lu, Di Huang, Zidong Wang, Wanli Ouyang, Lei Bai

Based on ComfyBench, we further develop ComfyAgent, a novel framework that empowers LLM-based agents to autonomously design collaborative AI systems by generating workflows.

Benchmarking Instruction Following

A Benchmark for AI-based Weather Data Assimilation

1 code implementation 21 Aug 2024 Wuxin Wang, Weicheng Ni, Tao Han, Taikang Yuan, Xiaoyong Li, Lei Bai, Boheng Duan, Kaijun Ren

Recent advancements in Artificial Intelligence (AI) have led to the development of several Large Weather Models (LWMs) that rival State-Of-The-Art (SOTA) Numerical Weather Prediction (NWP) systems.

Weather Forecasting

MambaDS: Near-Surface Meteorological Field Downscaling with Topography Constrained Selective State Space Modeling

no code implementations 20 Aug 2024 Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi

In an era of frequent extreme weather and global warming, obtaining precise, fine-grained near-surface weather forecasts is increasingly essential for human activities.

Mamba Super-Resolution

Fast Information Streaming Handler (FisH): A Unified Seismic Neural Network for Single Station Real-Time Earthquake Early Warning

no code implementations 13 Aug 2024 Tianning Zhang, Feng Liu, Yuming Yuan, Rui Su, Wanli Ouyang, Lei Bai

FisH is designed to process real-time streaming seismic data and generate simultaneous results for phase picking, location estimation, and magnitude estimation in an end-to-end fashion.

Event Detection

VegeDiff: Latent Diffusion Model for Geospatial Vegetation Forecasting

no code implementations 17 Jul 2024 Sijie Zhao, Hao Chen, Xueliang Zhang, Pengfeng Xiao, Lei Bai, Wanli Ouyang

By capturing the uncertainties in vegetation changes and modeling the complex influence of relevant variables, VegeDiff outperforms existing deterministic methods, providing clear and accurate forecasting results of future vegetation states.

model

PredBench: Benchmarking Spatio-Temporal Prediction across Diverse Disciplines

1 code implementation 11 Jul 2024 Zidong Wang, Zeyu Lu, Di Huang, Tong He, Xihui Liu, Wanli Ouyang, Lei Bai

In this paper, we introduce PredBench, a benchmark tailored for the holistic evaluation of spatio-temporal prediction networks.

Benchmarking Prediction

How far are today's time-series models from real-world weather forecasting applications?

1 code implementation 20 Jun 2024 Tao Han, Song Guo, Zhenghao Chen, Wanghan Xu, Lei Bai

As a result, it enables a better training of models and a more accurate assessment of the real-world forecasting capabilities of TSF models, pushing them closer to in-situ applications.

Benchmarking Time Series +2

FNP: Fourier Neural Processes for Arbitrary-Resolution Data Assimilation

no code implementations 3 Jun 2024 Kun Chen, Tao Chen, Peng Ye, Hao Chen, Kang Chen, Tao Han, Wanli Ouyang, Lei Bai

Data assimilation is a vital component in modern global medium-range weather forecasting systems to obtain the best estimation of the atmospheric state by combining the short-term forecast and observations.

Weather Forecasting

Data-driven Global Ocean Modeling for Seasonal to Decadal Prediction

1 code implementation 24 May 2024 Zijie Guo, Pumeng Lyu, Fenghua Ling, Lei Bai, Jing-Jia Luo, Niklas Boers, Toshio Yamagata, Takeshi Izumo, Sophie Cravatte, Antonietta Capotondi, Wanli Ouyang

Accurate ocean dynamics modeling is crucial for enhancing understanding of ocean circulation, predicting climate variability, and tackling challenges posed by climate change.

Generalizing Weather Forecast to Fine-grained Temporal Scales via Physics-AI Hybrid Modeling

1 code implementation 22 May 2024 Wanghan Xu, Fenghua Ling, Wenlong Zhang, Tao Han, Hao Chen, Wanli Ouyang, Lei Bai

This paper proposes a physics-AI hybrid model (i.e., WeatherGFT) that generalizes weather forecasts to finer-grained temporal scales beyond those of the training dataset.

Weather Forecasting

VAE-Var: Variational-Autoencoder-Enhanced Variational Assimilation

no code implementations 22 May 2024 Yi Xiao, Qilong Jia, Wei Xue, Lei Bai

Data assimilation refers to a set of algorithms designed to compute the optimal estimate of a system's state by refining the prior prediction (known as background states) using observed data.

CRA5: Extreme Compression of ERA5 for Portable Global Climate and Weather Research via an Efficient Variational Transformer

1 code implementation 6 May 2024 Tao Han, Zhenghao Chen, Song Guo, Wanghan Xu, Lei Bai

To mitigate this issue, we introduce an efficient neural codec, the Variational Autoencoder Transformer (VAEformer), for extreme compression of climate data to significantly reduce data storage cost, making AI-based meteorological research portable to researchers.

Weather Forecasting

G-Refine: A General Quality Refiner for Text-to-Image Generation

1 code implementation 29 Apr 2024 Chunyi Li, HaoNing Wu, Hongkun Hao, ZiCheng Zhang, Tengchaun Kou, Chaofeng Chen, Lei Bai, Xiaohong Liu, Weisi Lin, Guangtao Zhai

Based on the mechanisms of the Human Visual System (HVS) and syntax trees, the first two indicators can respectively identify the perception and alignment deficiencies, and the last module can apply targeted quality enhancement accordingly.

Text-to-Image Generation

RS-Mamba for Large Remote Sensing Image Dense Prediction

1 code implementation 3 Apr 2024 Sijie Zhao, Hao Chen, Xueliang Zhang, Pengfeng Xiao, Lei Bai, Wanli Ouyang

RSM is specifically designed to capture the global context of remote sensing images with linear complexity, facilitating the effective processing of large VHR images.

Building change detection for remote sensing images Change Detection +3

A Survey on Long Video Generation: Challenges, Methods, and Prospects

no code implementations 25 Mar 2024 Chengxuan Li, Di Huang, Zeyu Lu, Yang Xiao, Qingqi Pei, Lei Bai

Video generation is a rapidly advancing research area, garnering significant attention due to its broad range of applications.

Survey Video Generation

HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation

1 code implementation 18 Mar 2024 Sha Zhang, Jiajun Deng, Lei Bai, Houqiang Li, Wanli Ouyang, Yanyong Zhang

We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network with a pre-trained image network in an unsupervised manner.

Knowledge Distillation NER +1

CasCast: Skillful High-resolution Precipitation Nowcasting via Cascaded Modelling

no code implementations 6 Feb 2024 Junchao Gong, Lei Bai, Peng Ye, Wanghan Xu, Na Liu, Jianhua Dai, Xiaokang Yang, Wanli Ouyang

Precipitation nowcasting based on radar data plays a crucial role in extreme weather prediction and has broad implications for disaster management.

Management

ExtremeCast: Boosting Extreme Value Prediction for Global Weather Forecast

1 code implementation 2 Feb 2024 Wanghan Xu, Kang Chen, Tao Han, Hao Chen, Wanli Ouyang, Lei Bai

Data-driven weather forecast based on machine learning (ML) has experienced rapid development and demonstrated superior performance in the global medium-range forecast compared to traditional physics-based dynamical models.

Prediction Value prediction

Improving Global Weather and Ocean Wave Forecast with Large Artificial Intelligence Models

no code implementations 30 Jan 2024 Fenghua Ling, Lin Ouyang, Boufeniza Redouane Larbi, Jing-Jia Luo, Tao Han, Xiaohui Zhong, Lei Bai

The rapid advancement of artificial intelligence technologies, particularly in recent years, has led to the emergence of several large parameter artificial intelligence weather forecast models.

Computational Efficiency Weather Forecasting

Non-Neighbors Also Matter to Kriging: A New Contrastive-Prototypical Learning

1 code implementation 23 Jan 2024 Zhishuai Li, Yunhao Nie, Ziyue Li, Lei Bai, Yisheng Lv, Rui Zhao

As a pre-trained paradigm, we conduct the Kriging task from a new perspective of representation: we aim to first learn robust and general representations and then recover attributes from representations.

Attribute Self-Supervised Learning

Observation-Guided Meteorological Field Downscaling at Station Scale: A Benchmark and a New Method

1 code implementation 22 Jan 2024 Zili Liu, Hao Chen, Lei Bai, Wenyuan Li, Keyan Chen, Zhengyi Wang, Wanli Ouyang, Zhengxia Zou, Zhenwei Shi

In this paper, we extend meteorological downscaling to arbitrary scattered station scales, establish a brand new benchmark and dataset, and retrieve meteorological states at any given station location from a coarse-resolution meteorological field.

Super-Resolution Weather Forecasting

Online Test-Time Adaptation of Spatial-Temporal Traffic Flow Forecasting

1 code implementation 8 Jan 2024 Pengxin Guo, Pengrong Jin, Ziyue Li, Lei Bai, Yu Zhang

To make the model trained on historical data better adapt to future data in a fully online manner, this paper conducts the first study of the online test-time adaptation techniques for spatial-temporal traffic flow forecasting problems.

Test-time Adaptation Traffic Prediction

Q-Refine: A Perceptual Quality Refiner for AI-Generated Image

1 code implementation 2 Jan 2024 Chunyi Li, HaoNing Wu, ZiCheng Zhang, Hongkun Hao, Kaiwei Zhang, Lei Bai, Xiaohong Liu, Xiongkuo Min, Weisi Lin, Guangtao Zhai

With the rapid evolution of Text-to-Image (T2I) models in recent years, their unsatisfactory generation results have become a challenge.

Image Quality Assessment

Towards an end-to-end artificial intelligence driven global weather forecasting system

1 code implementation 18 Dec 2023 Kun Chen, Lei Bai, Fenghua Ling, Peng Ye, Tao Chen, Jing-Jia Luo, Hao Chen, Yi Xiao, Kang Chen, Tao Han, Wanli Ouyang

Initial states are typically generated by traditional data assimilation components, which are computationally expensive and time-consuming.

Weather Forecasting

ResoNet: Robust and Explainable ENSO Forecasts with Hybrid Convolution and Transformer Networks

no code implementations 16 Dec 2023 Pumeng Lyu, Tao Tang, Fenghua Ling, Jing-Jia Luo, Niklas Boers, Wanli Ouyang, Lei Bai

Recent studies have shown that deep learning (DL) models can skillfully predict the El Niño-Southern Oscillation (ENSO) over 1.5 years ahead.

VisionTraj: A Noise-Robust Trajectory Recovery Framework based on Large-scale Camera Network

1 code implementation 11 Dec 2023 Zhishuai Li, Ziyue Li, Xiaoru Hu, Guoqing Du, Yunhao Nie, Feng Zhu, Lei Bai, Rui Zhao

Trajectory recovery based on the snapshots from the city-wide multi-camera network facilitates urban mobility sensing and driveway optimization.

Clustering Denoising +1

Hulk: A Universal Knowledge Translator for Human-Centric Tasks

2 code implementations 4 Dec 2023 Yizhou Wang, Yixuan Wu, Shixiang Tang, Weizhen He, Xun Guo, Feng Zhu, Lei Bai, Rui Zhao, Jian Wu, Tong He, Wanli Ouyang

Human-centric perception tasks, e.g., pedestrian detection, skeleton-based action recognition, and pose estimation, have wide industrial applications, such as metaverse and sports analysis.

3D Human Pose Estimation Action Recognition +8

A Critical Perceptual Pre-trained Model for Complex Trajectory Recovery

no code implementations 5 Nov 2023 Dedong Li, Ziyue Li, Zhishuai Li, Lei Bai, Qingyuan Gong, Lijun Sun, Wolfgang Ketter, Rui Zhao

Then, we propose a Multi-view Graph and Complexity Aware Transformer (MGCAT) model to encode these semantics in trajectory pre-training from two aspects: 1) adaptively aggregating the multi-view graph features according to the trajectory pattern, and 2) paying higher attention to critical nodes in a complex trajectory.

Trajectory Recovery

Understanding Masked Autoencoders From a Local Contrastive Perspective

no code implementations 3 Oct 2023 Xiaoyu Yue, Lei Bai, Meng Wei, Jiangmiao Pang, Xihui Liu, Luping Zhou, Wanli Ouyang

Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.

Contrastive Learning Data Augmentation +2

STEERER: Resolving Scale Variations for Counting and Localization via Selective Inheritance Learning

1 code implementation ICCV 2023 Tao Han, Lei Bai, Lingbo Liu, Wanli Ouyang

Scale variation is a deep-rooted problem in object counting, which has not been effectively addressed by existing scale-aware algorithms.

feature selection Object Counting

Relation-Aware Distribution Representation Network for Person Clustering with Multiple Modalities

no code implementations 1 Aug 2023 Kaijian Liu, Shixiang Tang, Ziyue Li, Zhishuai Li, Lei Bai, Feng Zhu, Rui Zhao

The distribution representation of a clue is a vector consisting of the relation between this clue and all other clues from all modalities, thus being modality agnostic and good for person clustering.

Clustering Relation

Enhancing Mapless Trajectory Prediction through Knowledge Distillation

no code implementations 25 Jun 2023 Yuning Wang, Pu Zhang, Lei Bai, Jianru Xue

Scene information plays a crucial role in trajectory forecasting systems for autonomous driving by providing semantic clues and constraints on potential future paths of traffic agents.

Autonomous Driving Knowledge Distillation +2

MotionGPT: Finetuned LLMs Are General-Purpose Motion Generators

no code implementations 19 Jun 2023 Yaqi Zhang, Di Huang, Bin Liu, Shixiang Tang, Yan Lu, Lu Chen, Lei Bai, Qi Chu, Nenghai Yu, Wanli Ouyang

Generating realistic human motion from given action descriptions has experienced significant advancements because of the emerging requirement of digital humans.

Motion Generation

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

1 code implementation CVPR 2024 Weizhen He, Yiheng Deng, Shixiang Tang, Qihao Chen, Qingsong Xie, Yizhou Wang, Lei Bai, Feng Zhu, Rui Zhao, Wanli Ouyang, Donglian Qi, Yunfeng Yan

This paper strives to resolve this problem by proposing a new instruct-ReID task that requires the model to retrieve images according to the given image or language instructions.

Person Re-Identification Triplet

Correlated Time Series Self-Supervised Representation Learning via Spatiotemporal Bootstrapping

1 code implementation 12 Jun 2023 Luxuan Wang, Lei Bai, Ziyue Li, Rui Zhao, Fugee Tsung

We evaluated the effectiveness and flexibility of our representation learning framework on correlated time series forecasting and cold-start transferring the forecasting model to new instances with limited data.

Correlated Time Series Forecasting Representation Learning +1

Dynamic Causal Graph Convolutional Network for Traffic Prediction

1 code implementation 12 Jun 2023 Junpeng Lin, Ziyue Li, Zhishuai Li, Lei Bai, Rui Zhao, Chen Zhang

In this work, we propose a novel approach for traffic prediction that embeds a time-varying dynamic Bayesian network to capture the fine spatiotemporal topology of traffic data.

Prediction Traffic Prediction

LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark

2 code implementations NeurIPS 2023 Zhenfei Yin, Jiong Wang, JianJian Cao, Zhelun Shi, Dingning Liu, Mukai Li, Lu Sheng, Lei Bai, Xiaoshui Huang, Zhiyong Wang, Jing Shao, Wanli Ouyang

To the best of our knowledge, we present one of the very first open-source endeavors in the field, LAMM, encompassing a Language-Assisted Multi-Modal instruction tuning dataset, framework, and benchmark.

MM-DAG: Multi-task DAG Learning for Multi-modal Data -- with Application for Traffic Congestion Analysis

1 code implementation 5 Jun 2023 Tian Lan, Ziyue Li, Zhishuai Li, Lei Bai, Man Li, Fugee Tsung, Wolfgang Ketter, Rui Zhao, Chen Zhang

This encourages the multi-task design: with each DAG as a task, the MM-DAG tries to learn the multiple DAGs jointly so that their consensus and consistency are maximized.

Stimulative Training++: Go Beyond The Performance Limits of Residual Networks

1 code implementation 4 May 2023 Peng Ye, Tong He, Shengji Tang, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang

In this work, we aim to re-investigate the training process of residual networks from a novel social psychology perspective of loafing, and further propose a new training scheme as well as three improved strategies for boosting residual networks beyond their performance limits.

Hierarchical Diffusion Autoencoders and Disentangled Image Manipulation

no code implementations 24 Apr 2023 Zeyu Lu, Chengyue Wu, Xinyuan Chen, Yaohui Wang, Lei Bai, Yu Qiao, Xihui Liu

To mitigate those limitations, we propose Hierarchical Diffusion Autoencoders (HDAE) that exploit the fine-grained-to-abstract and low-level-to-high-level feature hierarchy for the latent space of diffusion models.

Image Manipulation Image Reconstruction

FengWu: Pushing the Skillful Global Medium-range Weather Forecast beyond 10 Days Lead

2 code implementations 6 Apr 2023 Kang Chen, Tao Han, Junchao Gong, Lei Bai, Fenghua Ling, Jing-Jia Luo, Xi Chen, Leiming Ma, Tianning Zhang, Rui Su, Yuanzheng Ci, Bin Li, Xiaokang Yang, Wanli Ouyang

We present FengWu, an advanced data-driven global medium-range weather forecast system based on Artificial Intelligence (AI).

HumanBench: Towards General Human-centric Perception with Projector Assisted Pretraining

1 code implementation CVPR 2023 Shixiang Tang, Cheng Chen, Qingsong Xie, Meilin Chen, Yizhou Wang, Yuanzheng Ci, Lei Bai, Feng Zhu, Haiyang Yang, Li Yi, Rui Zhao, Wanli Ouyang

Specifically, we propose HumanBench, built on existing datasets, to comprehensively evaluate on common ground the generalization abilities of different pretraining methods on 19 datasets from 6 diverse downstream tasks, including person ReID, pose estimation, human parsing, pedestrian attribute recognition, pedestrian detection, and crowd counting.

Ranked #1 on Pedestrian Attribute Recognition on PA-100K (using extra training data)

Attribute Autonomous Driving +5

UniHCP: A Unified Model for Human-Centric Perceptions

1 code implementation CVPR 2023 Yuanzheng Ci, Yizhou Wang, Meilin Chen, Shixiang Tang, Lei Bai, Feng Zhu, Rui Zhao, Fengwei Yu, Donglian Qi, Wanli Ouyang

When adapted to a specific task, UniHCP achieves new SOTAs on a wide range of human-centric tasks, e.g., 69.8 mIoU on CIHP for human parsing, 86.18 mA on PA-100K for attribute prediction, 90.3 mAP on Market1501 for ReID, and 85.8 JI on CrowdHuman for pedestrian detection, performing better than specialized models tailored for each task.

2D Pose Estimation Attribute +9

Multi-Scale Control Signal-Aware Transformer for Motion Synthesis without Phase

no code implementations 3 Mar 2023 Lintao Wang, Kun Hu, Lei Bai, Yu Ding, Wanli Ouyang, Zhiyong Wang

As past poses often contain useful auxiliary hints, in this paper we propose a task-agnostic deep learning method, namely Multi-scale Control Signal-aware Transformer (MCS-T), with an attention-based encoder-decoder architecture that discovers the auxiliary information implicitly, synthesizing controllable motion without explicitly requiring auxiliary information such as phase.

Decoder Feature Engineering +1

Saliency Guided Contrastive Learning on Scene Images

no code implementations 22 Feb 2023 Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Haiyang Yang, Lei Bai, Rui Zhao, Donglian Qi, Wanli Ouyang

Although it is feasible, recent works have largely overlooked discovering the most discriminative regions for contrastive learning of object representations in scene images.

Contrastive Learning Linear evaluation +2

Learning from pseudo-labels: deep networks improve consistency in longitudinal brain volume estimation

no code implementations 8 Feb 2023 Geng Zhan, Dongang Wang, Mariano Cabezas, Lei Bai, Kain Kyle, Wanli Ouyang, Michael Barnett, Chenyu Wang

An accurate and robust quantitative measurement of brain volume change is paramount for translational research and clinical applications.

Graph-Free Learning in Graph-Structured Data: A More Efficient and Accurate Spatiotemporal Learning Perspective

no code implementations 27 Jan 2023 Xu Wang, Pengfei Gu, Pengkun Wang, Binwu Wang, Zhengyang Zhou, Lei Bai, Yang Wang

In this paper, with extensive and in-depth experiments, we comprehensively analyze existing spatiotemporal graph learning models and reveal that extracting adjacency matrices with carefully designed strategies, which is viewed as the key to enhancing performance in graph learning, is largely ineffective.

Graph Learning

$β$-DARTS++: Bi-level Regularization for Proxy-robust Differentiable Architecture Search

1 code implementation16 Jan 2023 Peng Ye, Tong He, Baopu Li, Tao Chen, Lei Bai, Wanli Ouyang

To address the robustness problem, we first benchmark different NAS methods under a wide range of proxy data, proxy channels, proxy layers and proxy epochs, since the robustness of NAS under different kinds of proxies has not been explored before.

Neural Architecture Search

Towards Frame Rate Agnostic Multi-Object Tracking

1 code implementation23 Sep 2022 Weitao Feng, Lei Bai, Yongqiang Yao, Fengwei Yu, Wanli Ouyang

In this paper, we propose a Frame Rate Agnostic MOT framework with a Periodic training Scheme (FAPS) to tackle the FraMOT problem for the first time.

Multi-Object Tracking Object

Jointly Contrastive Representation Learning on Road Network and Trajectory

1 code implementation14 Sep 2022 Zhenyu Mao, Ziyue Li, Dedong Li, Lei Bai, Rui Zhao

Unlike the existing cross-scale contrastive learning methods on graphs that only contrast a graph and its belonging nodes, the contrast between road segment and trajectory is elaborately tailored via novel positive sampling and adaptive weighting strategies.

Contrastive Learning Representation Learning +1

Action Recognition With Motion Diversification and Dynamic Selection

no code implementations TIP 2022 Peiqin Zhuang, Yu Guo, Zhipeng Yu, Luping Zhou, Lei Bai, Ding Liang, Zhiyong Wang, Yali Wang, Wanli Ouyang

To address this issue, we introduce a Motion Diversification and Selection (MoDS) module to generate diversified spatio-temporal motion features and then select the suitable motion representation dynamically for categorizing the input video.

Action Recognition Computational Efficiency +1

Unsupervised Knowledge Adaptation for Passenger Demand Forecasting

no code implementations8 Jun 2022 Can Li, Lei Bai, Wei Liu, Lina Yao, S Travis Waller

These multimodal forecasting models can improve accuracy but are less practical when different parts of multimodal datasets are owned by different institutions that cannot directly share data with one another.

Demand Forecasting

Domain Invariant Masked Autoencoders for Self-supervised Learning from Multi-domains

no code implementations10 May 2022 Haiyang Yang, Meilin Chen, Yizhou Wang, Shixiang Tang, Feng Zhu, Lei Bai, Rui Zhao, Wanli Ouyang

While recent self-supervised learning methods have achieved good performance when the evaluation set is from the same domain as the training set, they suffer an undesirable performance drop when tested on a different domain.

Self-Supervised Learning

DR.VIC: Decomposition and Reasoning for Video Individual Counting

2 code implementations CVPR 2022 Tao Han, Lei Bai, Junyu Gao, Qi Wang, Wanli Ouyang

Instead of relying on the Multiple Object Tracking (MOT) techniques, we propose to solve the problem by decomposing all pedestrians into the initial pedestrians who existed in the first frame and the new pedestrians with separate identities in each following frame.

Crowd Counting Density Estimation +2

Backbone is All Your Need: A Simplified Architecture for Visual Object Tracking

1 code implementation10 Mar 2022 BoYu Chen, Peixia Li, Lei Bai, Lei Qiao, Qiuhong Shen, Bo Li, Weihao Gan, Wei Wu, Wanli Ouyang

Exploiting a general-purpose neural architecture to replace hand-wired designs or inductive biases has recently drawn extensive interest.

All Visual Object Tracking

Trajectory Forecasting from Detection with Uncertainty-Aware Motion Encoding

no code implementations3 Feb 2022 Pu Zhang, Lei Bai, Jianru Xue, Jianwu Fang, Nanning Zheng, Wanli Ouyang

Trajectories obtained from object detection and tracking are inevitably noisy, which could cause serious forecasting errors to predictors built on ground truth trajectories.

object-detection Object Detection +1

PSViT: Better Vision Transformer via Token Pooling and Attention Sharing

no code implementations7 Aug 2021 BoYu Chen, Peixia Li, Baopu Li, Chuming Li, Lei Bai, Chen Lin, Ming Sun, Junjie Yan, Wanli Ouyang

Then, a compact set of the possible combinations of different token pooling and attention sharing mechanisms is constructed.

Online Metro Origin-Destination Prediction via Heterogeneous Information Aggregation

1 code implementation2 Jul 2021 Lingbo Liu, Yuying Zhu, Guanbin Li, Ziyi Wu, Lei Bai, Liang Lin

In this work, we proposed a novel neural network module termed Heterogeneous Information Aggregation Machine (HIAM), which fully exploits heterogeneous information of historical data (e.g., incomplete OD matrices, unfinished order vectors, and DO matrices) to jointly learn the evolutionary patterns of OD and DO ridership.

Time Series Analysis

Mutual CRF-GNN for Few-Shot Learning

no code implementations CVPR 2021 Shixiang Tang, Dapeng Chen, Lei Bai, Kaijian Liu, Yixiao Ge, Wanli Ouyang

In this MCGN, the labels and features of support data are used by the CRF for inferring GNN affinities in a principled and probabilistic way.

Few-Shot Learning

Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving

no code implementations27 Nov 2020 Zhenxun Yuan, Xiao Song, Lei Bai, Wengang Zhou, Zhe Wang, Wanli Ouyang

As a special design of this transformer, the information encoded in the encoder is different from that in the decoder, i.e., the encoder encodes temporal-channel information of multiple frames while the decoder decodes the spatial-channel information for the current frame in a voxel-wise manner.

3D Object Detection Autonomous Driving +4

Knowledge Adaption for Demand Prediction based on Multi-task Memory Neural Network

no code implementations12 Sep 2020 Can Li, Lei Bai, Wei Liu, Lina Yao, S Travis Waller

Accurate demand forecasting of different public transport modes (e.g., buses and light rails) is essential for public service operation. However, the development level of various modes often varies significantly, which makes it hard to predict the demand of the modes with insufficient knowledge and sparse station distribution (i.e., station-sparse mode).

Demand Forecasting Multi-Task Learning

Spectrum-Guided Adversarial Disparity Learning

1 code implementation14 Jul 2020 Zhe Liu, Lina Yao, Lei Bai, Xianzhi Wang, Can Wang

It has been a significant challenge to portray intraclass disparity precisely in the area of activity recognition, as it requires a robust representation of the correlation between subject-specific variation for each activity class.

Activity Recognition Denoising

Face to Purchase: Predicting Consumer Choices with Structured Facial and Behavioral Traits Embedding

no code implementations14 Jul 2020 Zhe Liu, Xianzhi Wang, Lina Yao, Jake An, Lei Bai, Ee-Peng Lim

We design a semi-supervised model based on a hierarchical embedding network to extract high-level features of consumers and to predict the top-$N$ purchase destinations of a consumer.

Adaptive Graph Convolutional Recurrent Network for Traffic Forecasting

3 code implementations NeurIPS 2020 Lei Bai, Lina Yao, Can Li, Xianzhi Wang, Can Wang

We further propose an Adaptive Graph Convolutional Recurrent Network (AGCRN) to capture fine-grained spatial and temporal correlations in traffic series automatically based on the two modules and recurrent networks.

Graph Generation Graph Neural Network +6
