1 code implementation • 30 May 2024 • Zixian Guo, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo
Learning a skill generally relies on both the practical experience of a doer and the insightful high-level guidance of an instructor.
no code implementations • 13 May 2024 • Shuo Yin, Weihao You, Zhilong Ji, Guoqiang Zhong, Jinfeng Bai
To fully leverage the advantages of our augmented data, we propose a two-stage training strategy: In Stage-1, we finetune Llama-2 on pure CoT data to get an intermediate model, which then is trained on the code-nested data in Stage-2 to get the resulting MuMath-Code.
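The two-stage strategy above can be sketched as sequential fine-tuning passes. This is an illustrative outline only: the `finetune` helper and the dataset names are hypothetical stand-ins, not the authors' actual training code.

```python
# Hypothetical stand-in for a fine-tuning pass; a real pipeline would
# invoke a trainer (e.g., on Llama-2 weights) rather than tag strings.
def finetune(model: str, dataset: str) -> str:
    """Return a label for the model after training on `dataset`."""
    return f"{model}+{dataset}"

base = "llama-2"
# Stage 1: fine-tune on pure chain-of-thought (CoT) data.
intermediate = finetune(base, "cot_data")
# Stage 2: continue training on code-nested data to obtain the final model.
mumath_code = finetune(intermediate, "code_nested_data")
print(mumath_code)  # → llama-2+cot_data+code_nested_data
```

The key design point is that Stage 2 starts from the Stage-1 intermediate model rather than from the base model, so the code-nested data builds on CoT ability already acquired.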
1 code implementation • 9 May 2024 • Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang, WangMeng Zuo
In this work, we present MasterWeaver, a test-time tuning-free method designed to generate personalized images with both faithful identity fidelity and flexible editability.
1 code implementation • 9 Apr 2024 • Xiaolong Tang, Meina Kan, Shiguang Shan, Zhilong Ji, Jinfeng Bai, Xilin Chen
The proposed Historical Prediction Attention, together with the Agent Attention and Mode Attention, is further formulated as the Triple Factorized Attention module, serving as the core design of HPNet. Experiments on the Argoverse and INTERACTION datasets show that HPNet achieves state-of-the-art performance and generates accurate and stable future trajectories.
1 code implementation • 26 Dec 2023 • Zixian Guo, Yuxiang Wei, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo
Parameter-efficient fine-tuning (PEFT) methods have provided an effective way for adapting large vision-language models to specific tasks or scenarios.
1 code implementation • 19 Dec 2023 • Yufei Cai, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hu Han, WangMeng Zuo
To decouple irrelevant attributes (i.e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings.
no code implementations • 21 Aug 2023 • Changzhen Li, Jie Zhang, Yang Wei, Zhilong Ji, Jinfeng Bai, Shiguang Shan
Vision Transformers have achieved great success in computer vision, delivering exceptional performance across various tasks.
no code implementations • 21 Aug 2023 • Zhuang Liu, Ye Yuan, Zhilong Ji, Jinfeng Bai, Xiang Bai
Then we design a semantic aware module (SAM), which projects the visual and classification feature into semantic space.
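The idea of projecting two kinds of features into a common semantic space can be sketched with plain linear maps. This is a minimal numpy illustration under assumed dimensions; the projection matrices and the cosine-similarity alignment score are assumptions for the sketch, not the paper's exact module.

```python
import numpy as np

rng = np.random.default_rng(0)
visual_feat = rng.standard_normal(16)  # e.g., backbone feature
class_feat = rng.standard_normal(8)    # e.g., classification feature

# Learned-in-practice projections; random here for illustration.
W_v = rng.standard_normal((4, 16)) * 0.1  # visual -> semantic space
W_c = rng.standard_normal((4, 8)) * 0.1   # classification -> semantic space

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sem_v = W_v @ visual_feat
sem_c = W_c @ class_feat
score = cosine(sem_v, sem_c)  # alignment measured in the shared space
print(-1.0 <= score <= 1.0)   # cosine similarity is bounded
```

Mapping both modalities into one space is what makes a single alignment score (here, cosine similarity) meaningful.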
1 code implementation • 27 Jun 2023 • Yuchen Su, Zhineng Chen, Zhiwen Shao, Yuning Du, Zhilong Ji, Jinfeng Bai, Yong Zhou, Yu-Gang Jiang
Next, we propose a dual assignment scheme for speed acceleration.
1 code implementation • CVPR 2023 • Yuxiang Wei, Zhilong Ji, Xiaohe Wu, Jinfeng Bai, Lei Zhang, WangMeng Zuo
Despite the progress in semantic image synthesis, it remains a challenging problem to generate photo-realistic parts from an input semantic map.
1 code implementation • 9 Apr 2023 • Zhongqi Wang, Jie Zhang, Zhilong Ji, Jinfeng Bai, Shiguang Shan
The style aggregator module, meanwhile, generates paintings in a style corresponding to a reference image.
1 code implementation • ICCV 2023 • Yuxiang Wei, Yabo Zhang, Zhilong Ji, Jinfeng Bai, Lei Zhang, WangMeng Zuo
In addition to their unprecedented ability in imaginary creation, large text-to-image models are expected to incorporate customized concepts in image generation.
1 code implementation • 27 Dec 2022 • Zhiwei Hu, Bo Chen, Yuan Gao, Zhilong Ji, Jinfeng Bai
The task of referring video object segmentation aims to segment, in the frames of a given video, the object to which the referring expressions refer.
no code implementations • 27 Dec 2022 • Bo Chen, Zhiwei Hu, Zhilong Ji, Jinfeng Bai, WangMeng Zuo
The main challenge of this task is to understand the visual and linguistic content simultaneously and to find the referred object accurately among all instances in the image.
1 code implementation • CVPR 2023 • Zixian Guo, Bowen Dong, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo
Nonetheless, visual data (e.g., images) is by default a prerequisite for learning prompts in existing methods.
no code implementations • 30 Oct 2022 • Zhuang Liu, Zhichao Zhao, Ye Yuan, Zhi Qiao, Jinfeng Bai, Zhilong Ji
In this technical report, we briefly introduce the solution of our team "summer" for Atmospheric Turbulence Mitigation in the UG$^2$+ Challenge in CVPR 2022.
no code implementations • 18 Oct 2022 • Jiajun Zhang, BoYu Chen, Zhilong Ji, Jinfeng Bai, Zonghai Hu
This paper describes the approach we have taken in the challenge.
3 code implementations • 23 Jul 2022 • Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai
Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.
1 code implementation • 18 Jul 2022 • Yabo Zhang, Mingshuai Yao, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, WangMeng Zuo
In this paper, we present a novel one-shot generative domain adaption method, i.e., DiFa, for diverse generation and faithful adaptation.
2 code implementations • CVPR 2022 • Ye Yuan, Xiao Liu, Wondimu Dikubab, Hui Liu, Zhilong Ji, Zhongqin Wu, Xiang Bai
In this paper, we propose a simple and efficient method for HMER, which is the first to incorporate syntax information into an encoder-decoder network.
1 code implementation • ICCV 2021 • Yuxiang Wei, Yupeng Shi, Xiao Liu, Zhilong Ji, Yuan Gao, Zhongqin Wu, WangMeng Zuo
It simply encourages the output variations caused by perturbations along different latent dimensions to be orthogonal, and the Jacobian with respect to the input is calculated to represent these variations.
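The orthogonality idea above can be sketched numerically: perturb each latent dimension, collect the induced output variations as Jacobian columns, and penalize non-orthogonality between those columns. This is a minimal numpy sketch with a toy linear generator, not the paper's network or loss; the finite-difference Jacobian stands in for backprop-computed derivatives.

```python
import numpy as np

def jacobian(G, z, eps=1e-4):
    """Finite-difference Jacobian of G at z; column i is the output
    variation induced by perturbing latent dimension i."""
    cols = []
    for i in range(len(z)):
        dz = np.zeros_like(z)
        dz[i] = eps
        cols.append((G(z + dz) - G(z - dz)) / (2 * eps))
    return np.stack(cols, axis=1)

def orthogonality_penalty(J):
    """Sum of squared off-diagonal entries of J^T J; zero exactly when
    the Jacobian columns (per-dimension variations) are orthogonal."""
    gram = J.T @ J
    off = gram - np.diag(np.diag(gram))
    return float(np.sum(off ** 2))

# Toy generator whose Jacobian columns are orthogonal by construction.
W = np.array([[1.0, 0.0], [0.0, 2.0], [0.0, 0.0]])
G = lambda z: W @ z
J = jacobian(G, np.zeros(2))
print(orthogonality_penalty(J) < 1e-8)  # → True
```

In training, such a penalty would be added to the generator loss so that each latent dimension controls an independent direction of output variation.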
no code implementations • 5 Aug 2021 • Xuri Ge, Fuhai Chen, Joemon M. Jose, Zhilong Ji, Zhongqin Wu, Xiao Liu
In this work, we propose to address the above issue from two aspects: (i) constructing intrinsic structure (along with relations) among the fragments of respective modalities, e.g., "dog $\to$ play $\to$ ball" in the semantic structure for an image, and (ii) seeking explicit inter-modal structural and semantic correspondence between the visual and textual modalities.
1 code implementation • 5 Jul 2021 • Xin Cai, BoYu Chen, Jiabei Zeng, Jiajun Zhang, Yunjia Sun, Xiao Wang, Zhilong Ji, Xiao Liu, Xilin Chen, Shiguang Shan
This paper presents a method for gaze estimation according to face images.
no code implementations • 2 Jul 2021 • Pengcheng Wang, Lingqiao Ji, Zhilong Ji, Yuan Gao, Xiao Liu
In this technical report, we briefly introduce the solution of our team "TAL-ai" for (semi-)supervised face detection in low-light conditions in the UG$^2$+ Challenge in CVPR 2021.
no code implementations • 21 Apr 2020 • Pengcheng Wang, ZiHao Wang, Zhilong Ji, Xiao Liu, Songfan Yang, Zhongqin Wu
This paper introduces our approach to the EmotioNet Challenge 2020.