no code implementations • 22 Dec 2024 • Bin Xia, Yuechen Zhang, Jingyao Li, Chengyao Wang, Yitong Wang, Xinglong Wu, Bei Yu, Jiaya Jia
We begin by analyzing existing frameworks and the requirements of downstream tasks, then propose a unified framework that integrates both T2I models and various editing tasks.
1 code implementation • 12 Dec 2024 • Zhisheng Zhong, Chengyao Wang, Yuqi Liu, Senqiao Yang, Longxiang Tang, Yuechen Zhang, Jingyao Li, Tianyuan Qu, Yanwei Li, Yukang Chen, Shaozuo Yu, Sitong Wu, Eric Lo, Shu Liu, Jiaya Jia
As Multi-modal Large Language Models (MLLMs) evolve, expanding beyond single-domain capabilities is essential to meet the demands for more versatile and efficient AI.
Ranked #1 on Visual Question Answering (VQA) on EgoSchema
1 code implementation • 5 Dec 2024 • Senqiao Yang, Yukang Chen, Zhuotao Tian, Chengyao Wang, Jingyao Li, Bei Yu, Jiaya Jia
To address this, we introduce VisionZip, a simple yet effective method that selects a set of informative tokens for input to the language model, reducing visual token redundancy and improving efficiency while maintaining model performance.
Ranked #172 on Visual Question Answering on MM-Vet
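The abstract above describes selecting a subset of informative visual tokens before they reach the language model. A minimal sketch of that general idea, assuming a simple top-k selection by an importance score (the actual VisionZip criterion may differ; the function name and scoring here are illustrative assumptions):

```python
import numpy as np

# Hypothetical sketch: keep the k highest-scoring visual tokens,
# preserving their original order. The scoring rule (e.g. attention
# received) is an assumption, not the paper's exact method.
def select_informative_tokens(tokens, scores, k):
    """tokens: (N, D) visual token features; scores: (N,) importance scores."""
    keep = np.argsort(scores)[-k:]   # indices of the k highest-scoring tokens
    keep = np.sort(keep)             # restore original token order
    return tokens[keep]

tokens = np.random.randn(576, 64)    # e.g. 24x24 ViT patch tokens
scores = np.random.rand(576)
reduced = select_informative_tokens(tokens, scores, k=64)
assert reduced.shape == (64, 64)
```

Reducing 576 tokens to 64 shrinks the LLM's visual input ninefold, which is the efficiency gain the abstract refers to.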
2 code implementations • 7 Oct 2024 • Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li
The attention mechanism is a fundamental component of the Transformer model, contributing to interactions among distinct tokens, in contrast to earlier feed-forward neural networks.
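For reference, the standard scaled dot-product attention that the abstract builds on can be sketched as follows; this illustrates the token-interaction mechanism only, not the paper's proposed variant:

```python
import numpy as np

# Standard scaled dot-product attention (a reference sketch, not the
# paper's proposed mechanism).
def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays; returns (seq_len, d) outputs."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                   # pairwise token similarities
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                              # weighted mix of value vectors

x = np.random.randn(4, 8)
out = scaled_dot_product_attention(x, x, x)
assert out.shape == (4, 8)
```

Each output token is a convex combination of all value vectors, which is the "interaction among distinct tokens" absent from plain feed-forward networks.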
no code implementations • 20 Jun 2024 • Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia
Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on the step-by-step chain-of-thought reasoning processes.
1 code implementation • 11 Jun 2024 • Jingyao Li, Han Shi, Xin Jiang, Zhenguo Li, Hong Xu, Jiaya Jia
On widely recognized benchmarks, Q-LLM improves over the current state-of-the-art by 7.17% on LLaMA3 and by 3.26% on Mistral on the $\infty$-bench.
no code implementations • 6 Jun 2024 • Jingyao Li, Pengguang Chen, Sitong Wu, Chuanyang Zheng, Hong Xu, Jiaya Jia
To address these limitations, the RoboCoder framework integrates Large Language Models (LLMs) with a dynamic learning system that uses real-time environmental feedback to continuously update and refine action codes.
2 code implementations • 23 May 2024 • Chuanyang Zheng, Yihang Gao, Han Shi, Minbin Huang, Jingyao Li, Jing Xiong, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li
Positional encoding plays a crucial role in transformers, significantly impacting model performance and length generalization.
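As background for the abstract above, the classic sinusoidal positional encoding from the original Transformer can be sketched as follows; this is the standard baseline, not the encoding this paper proposes:

```python
import numpy as np

# Reference sketch: sinusoidal positional encoding (Transformer baseline).
# Shown only as background; the paper studies positional encoding design,
# not necessarily this scheme.
def sinusoidal_pe(seq_len, d_model):
    """Returns a (seq_len, d_model) positional-encoding matrix (d_model even)."""
    pos = np.arange(seq_len)[:, None]               # token positions
    i = np.arange(d_model // 2)[None, :]            # frequency indices
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)                    # even dims: sine
    pe[:, 1::2] = np.cos(angles)                    # odd dims: cosine
    return pe

pe = sinusoidal_pe(16, 8)
assert pe.shape == (16, 8)
```

Because each position maps to a unique pattern of phases, the encoding lets attention distinguish token order; how well such schemes extrapolate to longer sequences is the length-generalization question the abstract raises.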
no code implementations • 22 Feb 2024 • Jingyao Li, Pengguang Chen, Xuan Ju, Hong Xu, Jiaya Jia
Our research aims to bridge the domain gap between natural and artificial scenarios with efficient tuning strategies.
1 code implementation • 5 Jan 2024 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Shu Liu, Jiaya Jia
The crux of effective out-of-distribution (OOD) detection lies in acquiring a robust in-distribution (ID) representation, distinct from OOD samples.
Out-of-Distribution Detection
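One simple way to turn an ID representation into an OOD score, as a hedged illustration of the idea in the abstract (cosine similarity to the nearest in-distribution class mean; the paper's actual scoring rule may differ):

```python
import numpy as np

# Illustrative sketch only: score a test feature by its cosine similarity
# to the nearest ID class mean. Higher score = more ID-like.
def ood_score(feature, class_means):
    """feature: (D,); class_means: (C, D). Returns max cosine similarity."""
    f = feature / np.linalg.norm(feature)
    m = class_means / np.linalg.norm(class_means, axis=1, keepdims=True)
    return float((m @ f).max())

means = np.array([[1.0, 0.0], [0.0, 1.0]])   # two toy ID class means
# A feature aligned with an ID class scores higher than one pointing away.
assert ood_score(np.array([2.0, 0.0]), means) > ood_score(np.array([1.0, -1.0]), means)
```

Thresholding this score separates ID from OOD inputs; the quality of the separation depends entirely on how distinguishable the learned ID representation is, which is the point the abstract makes.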
1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Shu Liu, Jiaya Jia
Experimental results demonstrate that, when labeling 80% of the samples, the performance of the current SOTA method declines by 0.74%, whereas our proposed BAL achieves performance comparable to that on the full dataset.
1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Bin Xia, Hong Xu, Jiaya Jia
Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks.
Ranked #3 on Code Generation on APPS
1 code implementation • 15 Apr 2023 • Jingyao Li, Pengguang Chen, Shengju Qian, Shu Liu, Jiaya Jia
Contrastive Language-Image Pre-training (CLIP) has recently shown great promise in pixel-level zero-shot learning tasks.
2 code implementations • CVPR 2023 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Zexin He, Shu Liu, Jiaya Jia
The core of out-of-distribution (OOD) detection is to learn the in-distribution (ID) representation, which is distinguishable from OOD samples.
Ranked #12 on Out-of-Distribution Detection on ImageNet-1k vs Places (AUROC metric)
1 code implementation • 25 Jul 2021 • Junjie Li, Jingyao Li, Wenbo Zhou, Shuai Lü
The training of generative adversarial networks (GANs) is usually vulnerable to mode collapse and vanishing gradients.