Search Results for author: Jingyao Li

Found 15 papers, 11 papers with code

DreamOmni: Unified Image Generation and Editing

no code implementations • 22 Dec 2024 • Bin Xia, Yuechen Zhang, Jingyao Li, Chengyao Wang, Yitong Wang, Xinglong Wu, Bei Yu, Jiaya Jia

We begin by analyzing existing frameworks and the requirements of downstream tasks, proposing a unified framework that integrates both T2I models and various editing tasks.

Image Generation

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

1 code implementation • 12 Dec 2024 • Zhisheng Zhong, Chengyao Wang, Yuqi Liu, Senqiao Yang, Longxiang Tang, Yuechen Zhang, Jingyao Li, Tianyuan Qu, Yanwei Li, Yukang Chen, Shaozuo Yu, Sitong Wu, Eric Lo, Shu Liu, Jiaya Jia

As Multi-modal Large Language Models (MLLMs) evolve, expanding beyond single-domain capabilities is essential to meet the demands for more versatile and efficient AI.

EgoSchema +6

VisionZip: Longer is Better but Not Necessary in Vision Language Models

1 code implementation • 5 Dec 2024 • Senqiao Yang, Yukang Chen, Zhuotao Tian, Chengyao Wang, Jingyao Li, Bei Yu, Jiaya Jia

To address this, we introduce VisionZip, a simple yet effective method that selects a set of informative tokens for input to the language model, reducing visual token redundancy and improving efficiency while maintaining model performance.

Video Understanding Visual Question Answering
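
The snippet above only names the idea of selecting a small set of informative visual tokens. Below is a minimal sketch of one plausible reading, ranking patch tokens by the vision encoder's attention and keeping the top-k; the function name, tensor shapes, and `keep` budget are illustrative assumptions, not the released VisionZip code.

```python
# Minimal sketch (not the official VisionZip implementation): keep the
# visual tokens that receive the most attention and drop the rest before
# they are passed to the language model.
import torch

def select_informative_tokens(vision_tokens: torch.Tensor,
                              cls_attention: torch.Tensor,
                              keep: int = 64) -> torch.Tensor:
    """vision_tokens: (N, D) patch embeddings from the vision encoder.
    cls_attention: (N,) attention weights over the patches.
    Returns the `keep` highest-scoring tokens, preserving their order."""
    keep = min(keep, vision_tokens.shape[0])
    top = torch.topk(cls_attention, keep).indices.sort().values
    return vision_tokens[top]

# Example: 576 patch tokens reduced to 64 before the LLM sees them.
tokens = torch.randn(576, 1024)
attn = torch.rand(576)
reduced = select_informative_tokens(tokens, attn, keep=64)
print(reduced.shape)  # torch.Size([64, 1024])
```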

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

2 code implementations • 7 Oct 2024 • Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li

The attention mechanism is a fundamental component of the Transformer model; in contrast to earlier feed-forward neural networks, it enables interactions among distinct tokens.

MR-Ben: A Meta-Reasoning Benchmark for Evaluating System-2 Thinking in LLMs

no code implementations • 20 Jun 2024 • Zhongshen Zeng, Yinhong Liu, Yingjia Wan, Jingyao Li, Pengguang Chen, Jianbo Dai, Yuxuan Yao, Rongwu Xu, Zehan Qi, Wanru Zhao, Linling Shen, Jianqiao Lu, Haochen Tan, Yukang Chen, Hao Zhang, Zhan Shi, Bailin Wang, Zhijiang Guo, Jiaya Jia

Large language models (LLMs) have shown increasing capability in problem-solving and decision-making, largely based on the step-by-step chain-of-thought reasoning processes.

Decision Making

QuickLLaMA: Query-aware Inference Acceleration for Large Language Models

1 code implementation • 11 Jun 2024 • Jingyao Li, Han Shi, Xin Jiang, Zhenguo Li, Hong Xu, Jiaya Jia

On widely recognized benchmarks, Q-LLM improved by 7.17% compared to the current state-of-the-art on LLaMA3, and by 3.26% on Mistral on the $\infty$-bench.

RoboCoder: Robotic Learning from Basic Skills to General Tasks with Large Language Models

no code implementations • 6 Jun 2024 • Jingyao Li, Pengguang Chen, Sitong Wu, Chuanyang Zheng, Hong Xu, Jiaya Jia

To address these limitations, the RoboCoder framework integrates Large Language Models (LLMs) with a dynamic learning system that uses real-time environmental feedback to continuously update and refine action codes.
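
As a rough illustration of the feedback-driven refinement loop described above (not the RoboCoder implementation itself), the sketch below asks an LLM for action code, executes it, and folds the environment's feedback into the next prompt; `generate_code` and `run_in_environment` are hypothetical stand-ins.

```python
# Minimal sketch, not the RoboCoder system: an LLM proposes action code,
# the environment executes it, and the resulting feedback is fed back into
# the prompt until the task succeeds or the retry budget is exhausted.
from typing import Callable, Tuple

def refine_action_code(task: str,
                       generate_code: Callable[[str], str],
                       run_in_environment: Callable[[str], Tuple[bool, str]],
                       max_rounds: int = 5) -> str:
    prompt = f"Write robot action code for the task: {task}"
    code = generate_code(prompt)
    for _ in range(max_rounds):
        success, feedback = run_in_environment(code)
        if success:
            break
        # Append environment feedback so the next draft can correct itself.
        prompt = (f"Task: {task}\nPrevious code:\n{code}\n"
                  f"Environment feedback: {feedback}\nRevise the code.")
        code = generate_code(prompt)
    return code
```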

DAPE: Data-Adaptive Positional Encoding for Length Extrapolation

2 code implementations • 23 May 2024 • Chuanyang Zheng, Yihang Gao, Han Shi, Minbin Huang, Jingyao Li, Jing Xiong, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li

Positional encoding plays a crucial role in transformers, significantly impacting model performance and length generalization.

VLPose: Bridging the Domain Gap in Pose Estimation with Language-Vision Tuning

no code implementations • 22 Feb 2024 • Jingyao Li, Pengguang Chen, Xuan Ju, Hong Xu, Jiaya Jia

Our research aims to bridge the domain gap between natural and artificial scenarios with efficient tuning strategies.

Pose Estimation

MOODv2: Masked Image Modeling for Out-of-Distribution Detection

1 code implementation • 5 Jan 2024 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Shu Liu, Jiaya Jia

The crux of effective out-of-distribution (OOD) detection lies in acquiring a robust in-distribution (ID) representation, distinct from OOD samples.

Out-of-Distribution Detection
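
As a rough illustration of scoring a sample against a learned in-distribution representation (the detector MOODv2 actually uses may differ), the sketch below rates a test feature by its cosine distance to the nearest in-distribution feature from a frozen encoder.

```python
# Minimal sketch of distance-based OOD scoring on top of a frozen encoder
# (illustrative only, not MOODv2's actual detector): a test sample is scored
# by its cosine distance to the nearest in-distribution feature.
import numpy as np

def ood_score(test_feat: np.ndarray, id_feats: np.ndarray) -> float:
    """test_feat: (D,) feature of the test image.
    id_feats: (N, D) features of in-distribution training images.
    Higher score => more likely out-of-distribution."""
    t = test_feat / np.linalg.norm(test_feat)
    ids = id_feats / np.linalg.norm(id_feats, axis=1, keepdims=True)
    cos_sim = ids @ t
    return float(1.0 - cos_sim.max())
```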

BAL: Balancing Diversity and Novelty for Active Learning

1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Shu Liu, Jiaya Jia

Experimental results demonstrate that, when labeling 80% of the samples, the performance of the current SOTA method declines by 0.74%, whereas our proposed BAL achieves performance comparable to the full dataset.

Active Learning Diversity +1
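
As a rough illustration of balancing two acquisition criteria in active learning (not the BAL algorithm itself), the sketch below combines per-sample diversity and novelty scores with a weight `alpha` and picks the top-scoring batch; the scores and the weighting scheme are assumptions for illustration only.

```python
# Minimal sketch of trading off two acquisition criteria (illustrative;
# not BAL's actual selection rule): each unlabeled sample carries a
# diversity score and a novelty score, and the batch is chosen by their
# weighted combination.
import numpy as np

def select_batch(diversity: np.ndarray,
                 novelty: np.ndarray,
                 batch_size: int,
                 alpha: float = 0.5) -> np.ndarray:
    """diversity, novelty: per-sample scores for the unlabeled pool.
    Returns indices of the `batch_size` samples with the best trade-off."""
    combined = alpha * diversity + (1.0 - alpha) * novelty
    return np.argsort(-combined)[:batch_size]

# Example with a pool of 1000 unlabeled samples.
rng = np.random.default_rng(0)
picked = select_batch(rng.random(1000), rng.random(1000), batch_size=32)
```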

MoTCoder: Elevating Large Language Models with Modular of Thought for Challenging Programming Tasks

1 code implementation • 26 Dec 2023 • Jingyao Li, Pengguang Chen, Bin Xia, Hong Xu, Jiaya Jia

Large Language Models (LLMs) have showcased impressive capabilities in handling straightforward programming tasks.

Code Generation

TagCLIP: Improving Discrimination Ability of Open-Vocabulary Semantic Segmentation

1 code implementation • 15 Apr 2023 • Jingyao Li, Pengguang Chen, Shengju Qian, Shu Liu, Jiaya Jia

Contrastive Language-Image Pre-training (CLIP) has recently shown great promise in pixel-level zero-shot learning tasks.

Language Modelling +5

Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need

2 code implementations • CVPR 2023 • Jingyao Li, Pengguang Chen, Shaozuo Yu, Zexin He, Shu Liu, Jiaya Jia

The core of out-of-distribution (OOD) detection is to learn the in-distribution (ID) representation, which is distinguishable from OOD samples.

Out-of-Distribution Detection
