Search Results for author: Zhiheng Li

Found 41 papers, 26 papers with code

Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt

no code implementations • 8 Apr 2024 • Zhiqi Huang, Huixin Xiong, Haoyu Wang, Longguang Wang, Zhiheng Li

Then, the object images are employed as additional prompts to facilitate the diffusion model to better understand the relationship between foreground and background regions during image generation.

Text-to-Image Generation

FairRAG: Fair Human Generation via Fair Retrieval Augmentation

no code implementations • 29 Mar 2024 • Robik Shrestha, Yang Zou, Qiuyu Chen, Zhiheng Li, Yusheng Xie, Siqi Deng

In this work, we introduce Fair Retrieval Augmented Generation (FairRAG), a novel framework that conditions pre-trained generative models on reference images retrieved from an external image database to improve fairness in human generation.

Fairness Image Generation +1

Sketch-to-Architecture: Generative AI-aided Architectural Design

no code implementations • 29 Mar 2024 • Pengzhi Li, Baijuan Li, Zhiheng Li

Recently, the development of large-scale models has paved the way for various interdisciplinary research, including architecture.

Discover and Mitigate Multiple Biased Subgroups in Image Classifiers

1 code implementation • 19 Mar 2024 • Zeliang Zhang, Mingqian Feng, Zhiheng Li, Chenliang Xu

Discovering biased subgroups is the key to understanding models' failure modes and further improving models' robustness.

Dimensionality Reduction Subgroup Discovery

Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution

no code implementations • 16 Mar 2024 • Zhiheng Li, Muheng Li, Jixuan Fan, Lei Chen, Yansong Tang, Jie Zhou, Jiwen Lu

Scale arbitrary super-resolution based on implicit image function gains increasing popularity since it can better represent the visual world in a continuous manner.

Super-Resolution

SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking

1 code implementation • 26 Feb 2024 • Yu Lin, Zhiheng Li, Yubo Cui, Zheng Fang

Most existing methods perform tracking between two consecutive frames while ignoring the motion patterns of the target over a series of frames, which would cause performance degradation in the scenes with sparse points.

3D Single Object Tracking Autonomous Driving +1

Pareto-based Multi-Objective Recommender System with Forgetting Curve

no code implementations • 28 Dec 2023 • Jipeng Jin, Zhaoxiang Zhang, Zhiheng Li, Xiaofeng Gao, Xiongwen Yang, Lei Xiao, Jie Jiang

Considering recency effect in memories, we propose a forgetting model based on Ebbinghaus Forgetting Curve to cope with negative feedback.

Recommendation Systems

Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning

2 code implementations • 27 Nov 2023 • Huanjin Yao, Wenhao Wu, Zhiheng Li

In this paper, we present Side4Video, a novel Spatial-Temporal Side Network for memory-efficient fine-tuning of large image models for video understanding.

Ranked #3 on Action Recognition on Something-Something V1

Action Classification Action Recognition +3

Mixed Attention Network for Cross-domain Sequential Recommendation

1 code implementation • 14 Nov 2023 • GuanYu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li, Meng Wang

Recently proposed cross-domain sequential recommendation models such as PiNet and DASL share a common drawback: they rely heavily on overlapped users in different domains, which limits their usage in practical recommender systems.

Sequential Recommendation

Inverse Learning with Extremely Sparse Feedback for Recommendation

1 code implementation • 14 Nov 2023 • GuanYu Lin, Chen Gao, Yu Zheng, Yinfeng Li, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li

In this paper, we propose a meta-learning method to annotate the unlabeled data from loss and gradient perspectives, which considers the noises in both positive and negative instances.

Meta-Learning

Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement Learning with Sub-optimal Demonstrations

no code implementations • 13 Oct 2023 • Lu Li, Yuxin Pan, RuoBing Chen, Jie Liu, Zilin Wang, Yu Liu, Zhiheng Li

Considering that obtaining expert demonstrations can be costly, the focus of current IRL techniques is on learning a better-than-demonstrator policy using a reward function derived from sub-optimal demonstrations.

Contrastive Learning

Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning

1 code implementation • ICCV 2023 • Zhiheng Li, Wenjia Geng, Muheng Li, Lei Chen, Yansong Tang, Jiwen Lu, Jie Zhou

By this means, our model explores all sorts of reliable sub-relations within an action sequence in the condensed action space.

SoccerNet 2023 Challenges Results

2 code implementations • 12 Sep 2023 • Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting Camera Calibration +3

Motion-to-Matching: A Mixed Paradigm for 3D Single Object Tracking

1 code implementation • 23 Aug 2023 • Zhiheng Li, Yu Lin, Yubo Cui, Shuo Li, Zheng Fang

3D single object tracking with LiDAR points is an important task in the computer vision field.

3D Single Object Tracking Object Tracking

STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking

no code implementations • 30 Jun 2023 • Yubo Cui, Zhiheng Li, Zheng Fang

Previous methods usually take the last two frames as input and use the predicted box to obtain the template point cloud in the previous frame and the search-area point cloud in the current frame, then use similarity-based or motion-based methods to predict the current box.

3D Single Object Tracking Object +1

SSCBench: Monocular 3D Semantic Scene Completion Benchmark in Street Views

1 code implementation • 15 Jun 2023 • Yiming Li, Sihang Li, Xinhao Liu, Moonjun Gong, Kenan Li, Nuo Chen, Zijun Wang, Zhiheng Li, Tao Jiang, Fisher Yu, Yue Wang, Hang Zhao, Zhiding Yu, Chen Feng

Monocular scene understanding is a foundational component of autonomous systems.

3D Semantic Scene Completion 3D Semantic Scene Completion from a single 2D image

Normalization Enhances Generalization in Visual Reinforcement Learning

no code implementations • 1 Jun 2023 • Lu Li, Jiafei Lyu, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li, Zhiheng Li

Though normalization techniques have demonstrated huge success in supervised and unsupervised learning, their applications in visual RL are still scarce.

reinforcement-learning Reinforcement Learning (RL)

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

no code implementations • 30 May 2023 • Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

During the diffusion process, an iterative guidance strategy is used to generate a final image that aligns with the textual description.

Attribute text-guided-image-editing

MMF-Track: Multi-modal Multi-level Fusion for 3D Single Object Tracking

1 code implementation • 11 May 2023 • Zhiheng Li, Yubo Cui, Yu Lin, Zheng Fang

To overcome the limitations of geometry matching, we propose a Multi-modal Multi-level Fusion Tracker (MMF-Track), which exploits the image texture and geometric characteristics of point clouds to track 3D targets.

3D Single Object Tracking Object Tracking

Dual-interest Factorization-heads Attention for Sequential Recommendation

1 code implementation • 8 Feb 2023 • GuanYu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Zhiheng Li, Depeng Jin, Yong Li

In this paper, we propose Dual-interest Factorization-heads Attention for Sequential Recommendation (short for DFAR) consisting of feedback-aware encoding layer, dual-interest disentangling layer and prediction layer.

Disentanglement Sequential Recommendation

You Only Need a Good Embeddings Extractor to Fix Spurious Correlations

no code implementations • 12 Dec 2022 • Raghav Mehta, Vítor Albiero, Li Chen, Ivan Evtimov, Tamar Glaser, Zhiheng Li, Tal Hassner

With experiments on a wide range of pre-trained models and pre-training datasets, we show that the capacity of the pre-training model and the size of the pre-training dataset matter.

A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

1 code implementation • CVPR 2023 • Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim

Key to advancing the reliability of vision systems is understanding whether existing methods can overcome multiple shortcuts or struggle in a Whac-A-Mole game, i.e., where mitigating one shortcut amplifies reliance on others.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W

Domain Generalization Image Classification +1

Multiple Object Tracking Challenge Technical Report for Team MT_IoT

1 code implementation • 7 Dec 2022 • Feng Yan, Zhiheng Li, Weixin Luo, Zequn Jie, Fan Liang, Xiaolin Wei, Lin Ma

This is a brief technical report of our proposed method for Multiple-Object Tracking (MOT) Challenge in Complex Environments.

Ranked #8 on Multi-Object Tracking on DanceTrack (using extra training data)

Human Detection Multi-Object Tracking +2

SoccerNet 2022 Challenges Results

7 code implementations • 5 Oct 2022 • Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting Camera Calibration +3

Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking

1 code implementation • 2 Oct 2022 • Yubo Cui, Jiayao Shan, Zuoxu Gu, Zhiheng Li, Zheng Fang

Meanwhile, the encoder applies the attention on multi-scale features to compensate for the lack of information caused by the sparsity of point cloud and the single scale of features.

3D Single Object Tracking Object +1

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

no code implementations • 20 Sep 2022 • Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong Liu, Chengjie Wang, Zhiheng Li

In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.

3D Object Detection Cloud Detection +3

Mutual Harmony: Sequential Recommendation with Dual Contrastive Network

1 code implementation • 18 Sep 2022 • GuanYu Lin, Chen Gao, Yinfeng Li, Yu Zheng, Zhiheng Li, Depeng Jin, Dong Li, Jianye Hao, Yong Li

Such user-centric recommendation will make it impossible for the provider to expose their new items, failing to consider the accordant interactions between user and item dimensions.

Contrastive Learning Representation Learning +1

Discover and Mitigate Unknown Biases with Debiasing Alternate Networks

1 code implementation • 20 Jul 2022 • Zhiheng Li, Anthony Hoogs, Chenliang Xu

By training in an alternate manner, the discoverer tries to find multiple unknown biases of the classifier without any annotations of biases, and the classifier aims at unlearning the biases identified by the discoverer.

Ranked #1 on Out-of-Distribution Generalization on ImageNet-W

Action Recognition Facial Attribute Classification +1

Enhancing Multi-view Stereo with Contrastive Matching and Weighted Focal Loss

1 code implementation • 21 Jun 2022 • Yikang Ding, Zhenyang Li, Dihe Huang, Zhiheng Li, Kai Zhang

Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years.

Contrastive Learning

StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis

1 code implementation • CVPR 2022 • Zhiheng Li, Martin Renqiang Min, Kai Li, Chenliang Xu

Based on the identified latent directions of attributes, we propose Compositional Attribute Adjustment to adjust the latent code, resulting in better compositionality of image synthesis.

Attribute Fairness +2

Simple Recurrent Neural Networks is all we need for clinical events predictions using EHR data

1 code implementation • 3 Oct 2021 • Laila Rasmy, Jie Zhu, Zhiheng Li, Xin Hao, Hong Thoai Tran, Yujia Zhou, Firat Tiryaki, Yang Xiang, Hua Xu, Degui Zhi

As a result, deep learning models developed for sequence modeling, like recurrent neural networks (RNNs), are a common architecture for EHR-based clinical events predictive models.

Bayesian Optimization

Discover the Unknown Biased Attribute of an Image Classifier

1 code implementation • ICCV 2021 • Zhiheng Li, Chenliang Xu

To help human experts better find the AI algorithms' biases, we study a new problem in this work -- for a classifier that predicts a target attribute of the input image, discover its unknown biased attribute.

Attribute Disentanglement

UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles

2 code implementations • CVPR 2021 • Tianjiao Li, Jun Liu, Wei Zhang, Yun Ni, Wenqian Wang, Zhiheng Li

Human behavior understanding with unmanned aerial vehicles (UAVs) is of great significance for a wide range of applications, which simultaneously brings an urgent demand of large, challenging, and comprehensive benchmarks for the development and evaluation of UAV-based models.

Action Recognition Attribute +3

Actor-Action Video Classification CSC 249/449 Spring 2020 Challenge Report

1 code implementation • 1 Aug 2020 • Jing Shi, Zhiheng Li, Haitian Zheng, Yihang Xu, Tianyou Xiao, Weitao Tan, Xiaoning Guo, Sizhe Li, Bin Yang, Zhexin Xu, Ruitao Lin, Zhongkai Shangguan, Yue Zhao, Jingwen Wang, Rohan Sharma, Surya Iyer, Ajinkya Deshmukh, Raunak Mahalik, Srishti Singh, Jayant G Rohra, Yi-Peng Zhang, Tongyu Yang, Xuan Wen, Ethan Fahnestock, Bryce Ikeda, Ian Lawson, Alan Finkelstein, Kehao Guo, Richard Magnotti, Andrew Sexton, Jeet Ketan Thaker, Yiyang Su, Chenliang Xu

This technical report summarizes and compiles submissions from the Actor-Action video classification challenge held as a final project in the CSC 249/449 Machine Vision course (Spring 2020) at the University of Rochester.

General Classification Video Classification

Graph Neural Network Based Coarse-Grained Mapping Prediction

2 code implementations • 24 Jun 2020 • Zhiheng Li, Geemi P. Wellawatte, Maghesree Chakraborty, Heta A. Gandhi, Chenliang Xu, Andrew D. White

The selection of coarse-grained (CG) mapping operators is a critical step for CG molecular dynamics (MD) simulation.

Clustering graph partitioning +2

Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration

1 code implementation • 5 Jun 2020 • Ming Zhang, Yawei Wang, Xiaoteng Ma, Li Xia, Jun Yang, Zhiheng Li, Xiu Li

The generative adversarial imitation learning (GAIL) has provided an adversarial learning framework for imitating expert policy from demonstrations in high-dimensional continuous tasks.

Continuous Control Imitation Learning

Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection

no code implementations • CVPR 2020 • Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu

Instead of blindly trusting quality-inconsistent PAs, WS^2 employs a learning-based selection to select effective PAs and a novel region integrity criterion as a stopping condition for weakly-supervised training.

Action Segmentation Segmentation +3

Deep Grouping Model for Unified Perceptual Parsing

no code implementations • CVPR 2020 • Zhiheng Li, Wenxuan Bao, Jiayang Zheng, Chenliang Xu

The perceptual-based grouping process produces a hierarchical and compositional image representation that helps both human and machine vision systems recognize heterogeneous visual concepts.

Image Segmentation Segmentation +1

Early Prediction of 30-day ICU Re-admissions Using Natural Language Processing and Machine Learning

no code implementations • 6 Oct 2019 • Zhiheng Li, Xinyue Xing, Bingzhang Lu, Zhixiang Li

ICU readmission is associated with longer hospitalization, mortality and adverse outcomes.

BIG-bench Machine Learning

Cooperative Lane Changing via Deep Reinforcement Learning

no code implementations • 20 Jun 2019 • Guan Wang, Jianming Hu, Zhiheng Li, Li Li

In this paper, we study how to learn an appropriate lane changing strategy for autonomous vehicles by using deep reinforcement learning.

Autonomous Vehicles reinforcement-learning +1

Lip Movements Generation at a Glance

1 code implementation • ECCV 2018 • Lele Chen, Zhiheng Li, Ross K. Maddox, Zhiyao Duan, Chenliang Xu

In this paper, we consider the following task: given arbitrary audio speech and one lip image of an arbitrary target identity, generate synthesized lip movements of the target identity saying the speech.
