no code implementations • 21 Oct 2024 • Yining Li, Peizhong Ju, Ness B. Shroff
By investigating the geometric structure of the Pareto front in MO-MDP, we uncover a key property: the Pareto front is on the boundary of a convex polytope whose vertices all correspond to deterministic policies, and neighboring vertices of the Pareto front differ by only one state-action pair of the deterministic policy, almost surely.
1 code implementation • 11 Jul 2024 • Tao Jiang, Xinchen Xie, Yining Li
In this work, we present RTMW (Real-Time Multi-person Whole-body pose estimation models), a series of high-performance models for 2D/3D whole-body pose estimation.
1 code implementation • 11 Jul 2024 • Jize Wang, Zerun Ma, Yining Li, Songyang Zhang, Cailian Chen, Kai Chen, Xinyi Le
This poses a challenge to LLMs' tool-use capabilities.
2 code implementations • 9 Jul 2024 • Xiangyu Rui, Xiangyong Cao, Yining Li, Deyu Meng
The most challenging issue for this task is that only the to-be-fused LRMS and PAN are available, and the existing deep learning-based methods are unsuitable since they rely on many training pairs.
1 code implementation • 8 Jul 2024 • Jiajun Liu, Wenjun Ke, Peng Wang, Jiahao Wang, Jinhua Gao, Ziyu Shang, Guozheng Li, Zijie Xu, Ke Ji, Yining Li
To address this issue, we propose a fast CKGE framework (\model), incorporating an incremental low-rank adapter (\mec) mechanism to efficiently acquire new knowledge while preserving old knowledge.
1 code implementation • 3 Jul 2024 • Pan Zhang, Xiaoyi Dong, Yuhang Zang, Yuhang Cao, Rui Qian, Lin Chen, Qipeng Guo, Haodong Duan, Bin Wang, Linke Ouyang, Songyang Zhang, Wenwei Zhang, Yining Li, Yang Gao, Peng Sun, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Hang Yan, Conghui He, Xingcheng Zhang, Kai Chen, Jifeng Dai, Yu Qiao, Dahua Lin, Jiaqi Wang
This long-context capability allows IXC-2. 5 to excel in tasks requiring extensive input and output contexts.
Ranked #4 on Video Question Answering on TVBench
no code implementations • 28 Jun 2024 • Yicheng Chen, Xiangtai Li, Yining Li, Yanhong Zeng, Jianzong Wu, Xiangyu Zhao, Kai Chen
Diffusion models can generate realistic and diverse images, potentially facilitating data availability for data-intensive perception tasks.
1 code implementation • 25 Jun 2024 • Xiangyu Zhao, Xiangtai Li, Haodong Duan, Haian Huang, Yining Li, Kai Chen, Hua Yang
We propose the integration of an additional high-resolution visual encoder to capture fine-grained details, which are then fused with base visual features through a Conv-Gate fusion network.
Ranked #74 on Visual Question Answering on MM-Vet
no code implementations • 25 Jun 2024 • Jianzong Wu, Xiangtai Li, Yanhong Zeng, Jiangning Zhang, Qianyu Zhou, Yining Li, Yunhai Tong, Kai Chen
In this work, we present MotionBooth, an innovative framework designed for animating customized subjects with precise control over both object and camera movements.
1 code implementation • 21 Jun 2024 • Zhiwei Fei, Songyang Zhang, Xiaoyu Shen, Dawei Zhu, Xiao Wang, Maosong Cao, Fengzhe Zhou, Yining Li, Wenwei Zhang, Dahua Lin, Kai Chen, Jidong Ge
While large language models (LLMs) have showcased impressive capabilities, they struggle with addressing legal queries due to the intricate complexities and specialized expertise required in the legal field.
1 code implementation • 20 Jun 2024 • Xinyu Fang, Kangrui Mao, Haodong Duan, Xiangyu Zhao, Yining Li, Dahua Lin, Kai Chen
The advent of large vision-language models (LVLMs) has spurred research into their applications in multi-modal contexts, particularly in video understanding.
no code implementations • 15 May 2024 • Kai Hu, Weichen Yu, Tianjun Yao, Xiang Li, Wenhe Liu, Lijun Yu, Yining Li, Kai Chen, Zhiqiang Shen, Matt Fredrikson
Our approach relaxes the discrete jailbreak optimization into a continuous optimization and progressively increases the sparsity of the optimizing vectors.
2 code implementations • 9 Apr 2024 • Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Songyang Zhang, Haodong Duan, Wenwei Zhang, Yining Li, Hang Yan, Yang Gao, Zhe Chen, Xinyue Zhang, Wei Li, Jingwen Li, Wenhai Wang, Kai Chen, Conghui He, Xingcheng Zhang, Jifeng Dai, Yu Qiao, Dahua Lin, Jiaqi Wang
The Large Vision-Language Model (LVLM) field has seen significant advancements, yet its progression has been hindered by challenges in comprehending fine-grained visual content due to limited resolution.
Ranked #47 on Visual Question Answering on MM-Vet
3 code implementations • 26 Mar 2024 • Zheng Cai, Maosong Cao, Haojiong Chen, Kai Chen, Keyu Chen, Xin Chen, Xun Chen, Zehui Chen, Zhi Chen, Pei Chu, Xiaoyi Dong, Haodong Duan, Qi Fan, Zhaoye Fei, Yang Gao, Jiaye Ge, Chenya Gu, Yuzhe Gu, Tao Gui, Aijia Guo, Qipeng Guo, Conghui He, Yingfan Hu, Ting Huang, Tao Jiang, Penglong Jiao, Zhenjiang Jin, Zhikai Lei, Jiaxing Li, Jingwen Li, Linyang Li, Shuaibin Li, Wei Li, Yining Li, Hongwei Liu, Jiangning Liu, Jiawei Hong, Kaiwen Liu, Kuikun Liu, Xiaoran Liu, Chengqi Lv, Haijun Lv, Kai Lv, Li Ma, Runyuan Ma, Zerun Ma, Wenchang Ning, Linke Ouyang, Jiantao Qiu, Yuan Qu, FuKai Shang, Yunfan Shao, Demin Song, Zifan Song, Zhihao Sui, Peng Sun, Yu Sun, Huanze Tang, Bin Wang, Guoteng Wang, Jiaqi Wang, Jiayu Wang, Rui Wang, Yudong Wang, Ziyi Wang, Xingjian Wei, Qizhen Weng, Fan Wu, Yingtong Xiong, Chao Xu, Ruiliang Xu, Hang Yan, Yirong Yan, Xiaogui Yang, Haochen Ye, Huaiyuan Ying, JIA YU, Jing Yu, Yuhang Zang, Chuyu Zhang, Li Zhang, Pan Zhang, Peng Zhang, Ruijie Zhang, Shuo Zhang, Songyang Zhang, Wenjian Zhang, Wenwei Zhang, Xingcheng Zhang, Xinyue Zhang, Hui Zhao, Qian Zhao, Xiaomeng Zhao, Fengzhe Zhou, Zaida Zhou, Jingming Zhuo, Yicheng Zou, Xipeng Qiu, Yu Qiao, Dahua Lin
The evolution of Large Language Models (LLMs) like ChatGPT and GPT-4 has sparked discussions on the advent of Artificial General Intelligence (AGI).
Ranked #5 on Long-Context Understanding on Ada-LEval (BestAnswer)
1 code implementation • 29 Jan 2024 • Xiaoyi Dong, Pan Zhang, Yuhang Zang, Yuhang Cao, Bin Wang, Linke Ouyang, Xilin Wei, Songyang Zhang, Haodong Duan, Maosong Cao, Wenwei Zhang, Yining Li, Hang Yan, Yang Gao, Xinyue Zhang, Wei Li, Jingwen Li, Kai Chen, Conghui He, Xingcheng Zhang, Yu Qiao, Dahua Lin, Jiaqi Wang
We introduce InternLM-XComposer2, a cutting-edge vision-language model excelling in free-form text-image composition and comprehension.
Ranked #18 on Visual Question Answering on MM-Vet v2
1 code implementation • CVPR 2024 • Jianzong Wu, Xiangtai Li, Chenyang Si, Shangchen Zhou, Jingkang Yang, Jiangning Zhang, Yining Li, Kai Chen, Yunhai Tong, Ziwei Liu, Chen Change Loy
We introduce a new task -- language-driven video inpainting, which uses natural language instructions to guide the inpainting process.
1 code implementation • 18 Jan 2024 • Shilin Xu, Haobo Yuan, Qingyu Shi, Lu Qi, Jingbo Wang, Yibo Yang, Yining Li, Kai Chen, Yunhai Tong, Bernard Ghanem, Xiangtai Li, Ming-Hsuan Yang
Segment Anything Model (SAM) is one remarkable model that can achieve generalized segmentation.
1 code implementation • CVPR 2024 • Xiangtai Li, Haobo Yuan, Wei Li, Henghui Ding, Size Wu, Wenwei Zhang, Yining Li, Kai Chen, Chen Change Loy
In this work, we address various segmentation tasks, each traditionally tackled by distinct or partially unified models.
1 code implementation • 5 Jan 2024 • Haobo Yuan, Xiangtai Li, Chong Zhou, Yining Li, Kai Chen, Chen Change Loy
The CLIP and Segment Anything Model (SAM) are remarkable vision foundation models (VFMs).
2 code implementations • 4 Jan 2024 • Xiangyu Zhao, Yicheng Chen, Shilin Xu, Xiangtai Li, Xinjiang Wang, Yining Li, Haian Huang
Grounding-DINO is a state-of-the-art open-set detection model that tackles multiple vision tasks including Open-Vocabulary Detection (OVD), Phrase Grounding (PG), and Referring Expression Comprehension (REC).
1 code implementation • CVPR 2024 • Peng Lu, Tao Jiang, Yining Li, Xiangtai Li, Kai Chen, Wenming Yang
Real-time multi-person pose estimation presents significant challenges in balancing speed and precision.
Ranked #1 on Multi-Person Pose Estimation on CrowdPose (using extra training data)
no code implementations • 22 Jun 2023 • Yining Li, Peizhong Ju, Ness Shroff
To address this issue, we formulate a general optimization problem for determining the optimal grouping strategy, which strikes a balance between performance loss and sample/computational complexity.
1 code implementation • 13 Mar 2023 • Tao Jiang, Peng Lu, Li Zhang, Ningsheng Ma, Rui Han, Chengqi Lyu, Yining Li, Kai Chen
Recent studies on 2D pose estimation have achieved excellent performance on public benchmarks, yet its application in the industrial community still suffers from heavy model parameters and high latency.
Ranked #3 on Pose Estimation on OCHuman (using extra training data)
no code implementations • 5 Dec 2022 • Peiwen Qiu, Yining Li, Zhuqing Liu, Prashant Khanduri, Jia Liu, Ness B. Shroff, Elizabeth Serena Bentley, Kurt Turck
Decentralized bilevel optimization has received increasing attention recently due to its foundational role in many emerging multi-agent learning paradigms (e. g., multi-agent meta-learning and multi-agent reinforcement learning) over peer-to-peer edge networks.
1 code implementation • CVPR 2019 • Yining Li, Chen Huang, Chen Change Loy
Unlike existing methods, we propose to estimate dense and intrinsic 3D appearance flow to better guide the transfer of pixels between poses.
1 code implementation • 1 Jun 2018 • Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang
Data for face analysis often exhibit highly-skewed class distribution, i. e., most data belong to a few majority classes, while the minority classes only contain a scarce amount of instances.
no code implementations • ICCV 2017 • Yining Li, Chen Huang, Xiaoou Tang, Chen-Change Loy
In particular, each tuple consists of a pair of images and 4. 6 discriminative questions (as positive samples) and 5. 9 non-discriminative questions (as negative samples) on average.
no code implementations • CVPR 2016 • Chen Huang, Yining Li, Chen Change Loy, Xiaoou Tang
We further demonstrate that more discriminative deep representation can be learned by enforcing a deep network to maintain both inter-cluster and inter-class margins.