Search Results for author: Zhiheng Li

Found 41 papers, 26 papers with code

Mask-ControlNet: Higher-Quality Image Generation with An Additional Mask Prompt

no code implementations 8 Apr 2024 Zhiqi Huang, Huixin Xiong, Haoyu Wang, Longguang Wang, Zhiheng Li

Then, the object images are employed as additional prompts to facilitate the diffusion model to better understand the relationship between foreground and background regions during image generation.

Text-to-Image Generation

FairRAG: Fair Human Generation via Fair Retrieval Augmentation

no code implementations 29 Mar 2024 Robik Shrestha, Yang Zou, Qiuyu Chen, Zhiheng Li, Yusheng Xie, Siqi Deng

In this work, we introduce Fair Retrieval Augmented Generation (FairRAG), a novel framework that conditions pre-trained generative models on reference images retrieved from an external image database to improve fairness in human generation.

Fairness, Image Generation +1

Sketch-to-Architecture: Generative AI-aided Architectural Design

no code implementations 29 Mar 2024 Pengzhi Li, Baijuan Li, Zhiheng Li

Recently, the development of large-scale models has paved the way for various interdisciplinary research, including architecture.

Discover and Mitigate Multiple Biased Subgroups in Image Classifiers

1 code implementation 19 Mar 2024 Zeliang Zhang, Mingqian Feng, Zhiheng Li, Chenliang Xu

Discovering biased subgroups is the key to understanding models' failure modes and further improving models' robustness.

Dimensionality Reduction, Subgroup Discovery

Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution

no code implementations 16 Mar 2024 Zhiheng Li, Muheng Li, Jixuan Fan, Lei Chen, Yansong Tang, Jie Zhou, Jiwen Lu

Scale-arbitrary super-resolution based on implicit image functions has gained increasing popularity, since it can better represent the visual world in a continuous manner.

Super-Resolution

SeqTrack3D: Exploring Sequence Information for Robust 3D Point Cloud Tracking

1 code implementation 26 Feb 2024 Yu Lin, Zhiheng Li, Yubo Cui, Zheng Fang

Most existing methods perform tracking between two consecutive frames while ignoring the motion patterns of the target over a series of frames, which would cause performance degradation in the scenes with sparse points.

3D Single Object Tracking, Autonomous Driving +1

Pareto-based Multi-Objective Recommender System with Forgetting Curve

no code implementations 28 Dec 2023 Jipeng Jin, Zhaoxiang Zhang, Zhiheng Li, Xiaofeng Gao, Xiongwen Yang, Lei Xiao, Jie Jiang

Considering the recency effect in memory, we propose a forgetting model based on the Ebbinghaus forgetting curve to cope with negative feedback.

Recommendation Systems

Side4Video: Spatial-Temporal Side Network for Memory-Efficient Image-to-Video Transfer Learning

2 code implementations 27 Nov 2023 Huanjin Yao, Wenhao Wu, Zhiheng Li

In this paper, we present Side4Video, a novel Spatial-Temporal Side Network for memory-efficient fine-tuning of large image models for video understanding.

Action Classification, Action Recognition +3

Mixed Attention Network for Cross-domain Sequential Recommendation

1 code implementation 14 Nov 2023 GuanYu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li, Meng Wang

Recently proposed cross-domain sequential recommendation models such as PiNet and DASL share a common drawback: they rely heavily on overlapped users in different domains, which limits their usage in practical recommender systems.

Sequential Recommendation

Inverse Learning with Extremely Sparse Feedback for Recommendation

1 code implementation 14 Nov 2023 GuanYu Lin, Chen Gao, Yu Zheng, Yinfeng Li, Jianxin Chang, Yanan Niu, Yang Song, Kun Gai, Zhiheng Li, Depeng Jin, Yong Li

In this paper, we propose a meta-learning method to annotate the unlabeled data from loss and gradient perspectives, which considers the noises in both positive and negative instances.

Meta-Learning

Distance-rank Aware Sequential Reward Learning for Inverse Reinforcement Learning with Sub-optimal Demonstrations

no code implementations 13 Oct 2023 Lu Li, Yuxin Pan, RuoBing Chen, Jie Liu, Zilin Wang, Yu Liu, Zhiheng Li

Considering that obtaining expert demonstrations can be costly, the focus of current IRL techniques is on learning a better-than-demonstrator policy using a reward function derived from sub-optimal demonstrations.

Contrastive Learning

Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning

1 code implementation ICCV 2023 Zhiheng Li, Wenjia Geng, Muheng Li, Lei Chen, Yansong Tang, Jiwen Lu, Jie Zhou

By this means, our model explores all sorts of reliable sub-relations within an action sequence in the condensed action space.

SoccerNet 2023 Challenges Results

2 code implementations 12 Sep 2023 Anthony Cioppa, Silvio Giancola, Vladimir Somers, Floriane Magera, Xin Zhou, Hassan Mkhallati, Adrien Deliège, Jan Held, Carlos Hinojosa, Amir M. Mansourian, Pierre Miralles, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdullah Kamal, Adrien Maglo, Albert Clapés, Amr Abdelaziz, Artur Xarles, Astrid Orcesi, Atom Scott, Bin Liu, Byoungkwon Lim, Chen Chen, Fabian Deuser, Feng Yan, Fufu Yu, Gal Shitrit, Guanshuo Wang, Gyusik Choi, Hankyul Kim, Hao Guo, Hasby Fahrudin, Hidenari Koguchi, Håkan Ardö, Ibrahim Salah, Ido Yerushalmy, Iftikar Muhammad, Ikuma Uchida, Ishay Be'ery, Jaonary Rabarisoa, Jeongae Lee, Jiajun Fu, Jianqin Yin, Jinghang Xu, Jongho Nang, Julien Denize, Junjie Li, Junpei Zhang, Juntae Kim, Kamil Synowiec, Kenji Kobayashi, Kexin Zhang, Konrad Habel, Kota Nakajima, Licheng Jiao, Lin Ma, Lizhi Wang, Luping Wang, Menglong Li, Mengying Zhou, Mohamed Nasr, Mohamed Abdelwahed, Mykola Liashuha, Nikolay Falaleev, Norbert Oswald, Qiong Jia, Quoc-Cuong Pham, Ran Song, Romain Hérault, Rui Peng, Ruilong Chen, Ruixuan Liu, Ruslan Baikulov, Ryuto Fukushima, Sergio Escalera, Seungcheon Lee, Shimin Chen, Shouhong Ding, Taiga Someya, Thomas B. Moeslund, Tianjiao Li, Wei Shen, Wei Zhang, Wei Li, Wei Dai, Weixin Luo, Wending Zhao, Wenjie Zhang, Xinquan Yang, Yanbiao Ma, Yeeun Joo, Yingsen Zeng, Yiyang Gan, Yongqiang Zhu, Yujie Zhong, Zheng Ruan, Zhiheng Li, Zhijian Huang, Ziyu Meng

More information on the tasks, challenges, and leaderboards is available at https://www.soccer-net.org.

Action Spotting, Camera Calibration +3

Motion-to-Matching: A Mixed Paradigm for 3D Single Object Tracking

1 code implementation 23 Aug 2023 Zhiheng Li, Yu Lin, Yubo Cui, Shuo Li, Zheng Fang

3D single object tracking with LiDAR points is an important task in the computer vision field.

3D Single Object Tracking, Object Tracking

STTracker: Spatio-Temporal Tracker for 3D Single Object Tracking

no code implementations 30 Jun 2023 Yubo Cui, Zhiheng Li, Zheng Fang

Previous methods usually take the last two frames as input and use the predicted box to obtain the template point cloud in the previous frame and the search-area point cloud in the current frame, then use similarity-based or motion-based methods to predict the current box.

3D Single Object Tracking, Object +1

Normalization Enhances Generalization in Visual Reinforcement Learning

no code implementations 1 Jun 2023 Lu Li, Jiafei Lyu, Guozheng Ma, Zilin Wang, Zhenjie Yang, Xiu Li, Zhiheng Li

Though normalization techniques have demonstrated huge success in supervised and unsupervised learning, their applications in visual RL are still scarce.

reinforcement-learning, Reinforcement Learning (RL)

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

no code implementations 30 May 2023 Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

During the diffusion process, an iterative guidance strategy is used to generate a final image that aligns with the textual description.

Attribute, text-guided-image-editing

MMF-Track: Multi-modal Multi-level Fusion for 3D Single Object Tracking

1 code implementation 11 May 2023 Zhiheng Li, Yubo Cui, Yu Lin, Zheng Fang

To overcome the limitations of geometry matching, we propose a Multi-modal Multi-level Fusion Tracker (MMF-Track), which exploits image texture and the geometric characteristics of point clouds to track 3D targets.

3D Single Object Tracking, Object Tracking

Dual-interest Factorization-heads Attention for Sequential Recommendation

1 code implementation 8 Feb 2023 GuanYu Lin, Chen Gao, Yu Zheng, Jianxin Chang, Yanan Niu, Yang Song, Zhiheng Li, Depeng Jin, Yong Li

In this paper, we propose Dual-interest Factorization-heads Attention for Sequential Recommendation (DFAR for short), consisting of a feedback-aware encoding layer, a dual-interest disentangling layer, and a prediction layer.

Disentanglement, Sequential Recommendation

You Only Need a Good Embeddings Extractor to Fix Spurious Correlations

no code implementations 12 Dec 2022 Raghav Mehta, Vítor Albiero, Li Chen, Ivan Evtimov, Tamar Glaser, Zhiheng Li, Tal Hassner

With experiments on a wide range of pre-trained models and pre-training datasets, we show that the capacity of the pre-training model and the size of the pre-training dataset matter.

A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others

1 code implementation CVPR 2023 Zhiheng Li, Ivan Evtimov, Albert Gordo, Caner Hazirbas, Tal Hassner, Cristian Canton Ferrer, Chenliang Xu, Mark Ibrahim

Key to advancing the reliability of vision systems is understanding whether existing methods can overcome multiple shortcuts or struggle in a Whac-A-Mole game, i.e., where mitigating one shortcut amplifies reliance on others.

Domain Generalization, Image Classification +1

Multiple Object Tracking Challenge Technical Report for Team MT_IoT

1 code implementation 7 Dec 2022 Feng Yan, Zhiheng Li, Weixin Luo, Zequn Jie, Fan Liang, Xiaolin Wei, Lin Ma

This is a brief technical report of our proposed method for Multiple-Object Tracking (MOT) Challenge in Complex Environments.

Ranked #8 on Multi-Object Tracking on DanceTrack (using extra training data)

Human Detection, Multi-Object Tracking +2

SoccerNet 2022 Challenges Results

7 code implementations 5 Oct 2022 Silvio Giancola, Anthony Cioppa, Adrien Deliège, Floriane Magera, Vladimir Somers, Le Kang, Xin Zhou, Olivier Barnich, Christophe De Vleeschouwer, Alexandre Alahi, Bernard Ghanem, Marc Van Droogenbroeck, Abdulrahman Darwish, Adrien Maglo, Albert Clapés, Andreas Luyts, Andrei Boiarov, Artur Xarles, Astrid Orcesi, Avijit Shah, Baoyu Fan, Bharath Comandur, Chen Chen, Chen Zhang, Chen Zhao, Chengzhi Lin, Cheuk-Yiu Chan, Chun Chuen Hui, Dengjie Li, Fan Yang, Fan Liang, Fang Da, Feng Yan, Fufu Yu, Guanshuo Wang, H. Anthony Chan, He Zhu, Hongwei Kan, Jiaming Chu, Jianming Hu, Jianyang Gu, Jin Chen, João V. B. Soares, Jonas Theiner, Jorge De Corte, José Henrique Brito, Jun Zhang, Junjie Li, Junwei Liang, Leqi Shen, Lin Ma, Lingchi Chen, Miguel Santos Marques, Mike Azatov, Nikita Kasatkin, Ning Wang, Qiong Jia, Quoc Cuong Pham, Ralph Ewerth, Ran Song, RenGang Li, Rikke Gade, Ruben Debien, Runze Zhang, Sangrok Lee, Sergio Escalera, Shan Jiang, Shigeyuki Odashima, Shimin Chen, Shoichi Masui, Shouhong Ding, Sin-wai Chan, Siyu Chen, Tallal El-Shabrawy, Tao He, Thomas B. Moeslund, Wan-Chi Siu, Wei Zhang, Wei Li, Xiangwei Wang, Xiao Tan, Xiaochuan Li, Xiaolin Wei, Xiaoqing Ye, Xing Liu, Xinying Wang, Yandong Guo, YaQian Zhao, Yi Yu, YingYing Li, Yue He, Yujie Zhong, Zhenhua Guo, Zhiheng Li

The SoccerNet 2022 challenges were the second annual video understanding challenges organized by the SoccerNet team.

Action Spotting, Camera Calibration +3

Exploiting More Information in Sparse Point Cloud for 3D Single Object Tracking

1 code implementation 2 Oct 2022 Yubo Cui, Jiayao Shan, Zuoxu Gu, Zhiheng Li, Zheng Fang

Meanwhile, the encoder applies the attention on multi-scale features to compensate for the lack of information caused by the sparsity of point cloud and the single scale of features.

3D Single Object Tracking, Object +1

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

no code implementations 20 Sep 2022 Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong Liu, Chengjie Wang, Zhiheng Li

In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.

3D Object Detection, Cloud Detection +3

Mutual Harmony: Sequential Recommendation with Dual Contrastive Network

1 code implementation 18 Sep 2022 GuanYu Lin, Chen Gao, Yinfeng Li, Yu Zheng, Zhiheng Li, Depeng Jin, Dong Li, Jianye Hao, Yong Li

Such user-centric recommendation will make it impossible for the provider to expose their new items, failing to consider the accordant interactions between user and item dimensions.

Contrastive Learning, Representation Learning +1

Discover and Mitigate Unknown Biases with Debiasing Alternate Networks

1 code implementation 20 Jul 2022 Zhiheng Li, Anthony Hoogs, Chenliang Xu

By training in an alternate manner, the discoverer tries to find multiple unknown biases of the classifier without any annotations of biases, and the classifier aims at unlearning the biases identified by the discoverer.

Action Recognition, Facial Attribute Classification +1

Enhancing Multi-view Stereo with Contrastive Matching and Weighted Focal Loss

1 code implementation 21 Jun 2022 Yikang Ding, Zhenyang Li, Dihe Huang, Zhiheng Li, Kai Zhang

Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years.

Contrastive Learning

StyleT2I: Toward Compositional and High-Fidelity Text-to-Image Synthesis

1 code implementation CVPR 2022 Zhiheng Li, Martin Renqiang Min, Kai Li, Chenliang Xu

Based on the identified latent directions of attributes, we propose Compositional Attribute Adjustment to adjust the latent code, resulting in better compositionality of image synthesis.

Attribute, Fairness +2

Simple Recurrent Neural Networks is all we need for clinical events predictions using EHR data

1 code implementation 3 Oct 2021 Laila Rasmy, Jie Zhu, Zhiheng Li, Xin Hao, Hong Thoai Tran, Yujia Zhou, Firat Tiryaki, Yang Xiang, Hua Xu, Degui Zhi

As a result, deep learning models developed for sequence modeling, such as recurrent neural networks (RNNs), are a common architecture for EHR-based clinical event prediction models.

Bayesian Optimization

Discover the Unknown Biased Attribute of an Image Classifier

1 code implementation ICCV 2021 Zhiheng Li, Chenliang Xu

To help human experts better find the AI algorithms' biases, we study a new problem in this work -- for a classifier that predicts a target attribute of the input image, discover its unknown biased attribute.

Attribute, Disentanglement

UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles

2 code implementations CVPR 2021 Tianjiao Li, Jun Liu, Wei Zhang, Yun Ni, Wenqian Wang, Zhiheng Li

Human behavior understanding with unmanned aerial vehicles (UAVs) is of great significance for a wide range of applications, which simultaneously brings an urgent demand of large, challenging, and comprehensive benchmarks for the development and evaluation of UAV-based models.

Action Recognition, Attribute +3

Wasserstein Distance guided Adversarial Imitation Learning with Reward Shape Exploration

1 code implementation 5 Jun 2020 Ming Zhang, Yawei Wang, Xiaoteng Ma, Li Xia, Jun Yang, Zhiheng Li, Xiu Li

The generative adversarial imitation learning (GAIL) has provided an adversarial learning framework for imitating expert policy from demonstrations in high-dimensional continuous tasks.

Continuous Control, Imitation Learning

Learning a Weakly-Supervised Video Actor-Action Segmentation Model with a Wise Selection

no code implementations CVPR 2020 Jie Chen, Zhiheng Li, Jiebo Luo, Chenliang Xu

Instead of blindly trusting quality-inconsistent PAs, WS^2 employs a learning-based selection to select effective PAs and a novel region integrity criterion as a stopping condition for weakly-supervised training.

Action Segmentation, Segmentation +3

Deep Grouping Model for Unified Perceptual Parsing

no code implementations CVPR 2020 Zhiheng Li, Wenxuan Bao, Jiayang Zheng, Chenliang Xu

The perceptual-based grouping process produces a hierarchical and compositional image representation that helps both human and machine vision systems recognize heterogeneous visual concepts.

Image Segmentation, Segmentation +1

Cooperative Lane Changing via Deep Reinforcement Learning

no code implementations 20 Jun 2019 Guan Wang, Jianming Hu, Zhiheng Li, Li Li

In this paper, we study how to learn an appropriate lane changing strategy for autonomous vehicles by using deep reinforcement learning.

Autonomous Vehicles, reinforcement-learning +1

Lip Movements Generation at a Glance

1 code implementation ECCV 2018 Lele Chen, Zhiheng Li, Ross K. Maddox, Zhiyao Duan, Chenliang Xu

In this paper, we consider the following task: given arbitrary audio speech and one lip image of an arbitrary target identity, generate synthesized lip movements of the target identity saying the speech.
