Search Results for author: Yining Li

Found 16 papers, 10 papers with code

Towards Language-Driven Video Inpainting via Multimodal Large Language Models

no code implementations • 18 Jan 2024 • Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy

We introduce a new task -- language-driven video inpainting, which uses natural language instructions to guide the inpainting process.

Video Inpainting

OMG-Seg: Is One Model Good Enough For All Segmentation?

1 code implementation • 18 Jan 2024 • Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy

In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.

Interactive Segmentation, Panoptic Segmentation, +3

An Open and Comprehensive Pipeline for Unified Object Grounding and Detection

1 code implementation • 4 Jan 2024 • Xiangyu Zhao, Yicheng Chen, Shilin Xu, Xiangtai Li, Xinjiang Wang, Yining Li, Haian Huang

Grounding-DINO is a state-of-the-art open-set detection model that tackles multiple vision tasks including Open-Vocabulary Detection (OVD), Phrase Grounding (PG), and Referring Expression Comprehension (REC).

Phrase Grounding, Referring Expression, +1

RTMO: Towards High-Performance One-Stage Real-Time Multi-Person Pose Estimation

1 code implementation • 12 Dec 2023 • Peng Lu, Tao Jiang, Yining Li, Xiangtai Li, Kai Chen, Wenming Yang

Real-time multi-person pose estimation presents significant challenges in balancing speed and precision.

 Ranked #1 on Multi-Person Pose Estimation on CrowdPose (using extra training data)

Multi-Person Pose Estimation

DST-Det: Simple Dynamic Self-Training for Open-Vocabulary Object Detection

1 code implementation • 2 Oct 2023 • Shilin Xu, Xiangtai Li, Size Wu, Wenwei Zhang, Yining Li, Guangliang Cheng, Yunhai Tong, Kai Chen, Chen Change Loy

This work presents a simple yet effective strategy that leverages the zero-shot classification ability of pre-trained vision-language models (VLM), such as CLIP, to directly discover proposals of possible novel classes.

Novel Object Detection, Object, +5

Achieving Sample and Computational Efficient Reinforcement Learning by Action Space Reduction via Grouping

no code implementations • 22 Jun 2023 • Yining Li, Peizhong Ju, Ness Shroff

To address this issue, we formulate a general optimization problem for determining the optimal grouping strategy, which strikes a balance between performance loss and sample/computational complexity.

RTMPose: Real-Time Multi-Person Pose Estimation based on MMPose

1 code implementation • 13 Mar 2023 • Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, Chengqi Lyu, Yining Li, Kai Chen

Recent studies on 2D pose estimation have achieved excellent performance on public benchmarks, yet its application in the industrial community still suffers from heavy model parameters and high latency.

Ranked #3 on Pose Estimation on OCHuman (using extra training data)

2D Human Pose Estimation, 2D Pose Estimation, +1

DIAMOND: Taming Sample and Communication Complexities in Decentralized Bilevel Optimization

no code implementations • 5 Dec 2022 • Peiwen Qiu, Yining Li, Zhuqing Liu, Prashant Khanduri, Jia Liu, Ness B. Shroff, Elizabeth Serena Bentley, Kurt Turck

Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e.g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks.

Bilevel Optimization, Meta-Learning, +1

Dense Intrinsic Appearance Flow for Human Pose Transfer

1 code implementation • CVPR 2019 • Yining Li, Chen Huang, Chen Change Loy

Unlike existing methods, we propose to estimate dense and intrinsic 3D appearance flow to better guide the transfer of pixels between poses.

Pose Transfer

Deep Imbalanced Learning for Face Recognition and Attribute Prediction

1 code implementation • 1 Jun 2018 • Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang

Data for face analysis often exhibit highly-skewed class distribution, i.e., most data belong to a few majority classes, while the minority classes only contain a scarce amount of instances.

Attribute, Face Recognition, +1

Learning to Disambiguate by Asking Discriminative Questions

no code implementations • ICCV 2017 • Yining Li, Chen Huang, Xiaoou Tang, Chen-Change Loy

In particular, each tuple consists of a pair of images and 4.6 discriminative questions (as positive samples) and 5.9 non-discriminative questions (as negative samples) on average.

Benchmarking, Image Captioning, +4

Learning Deep Representation for Imbalanced Classification

no code implementations • CVPR 2016 • Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang

We further demonstrate that more discriminative deep representation can be learned by enforcing a deep network to maintain both inter-cluster and inter-class margins.

Classification, General Classification, +2
