Search Results for author: Jinfeng Bai

Found 23 papers, 17 papers with code

HPNet: Dynamic Trajectory Forecasting with Historical Prediction Attention

1 code implementation 9 Apr 2024 Xiaolong Tang, Meina Kan, Shiguang Shan, Zhilong Ji, Jinfeng Bai, Xilin Chen

The proposed Historical Prediction Attention together with the Agent Attention and Mode Attention is further formulated as the Triple Factorized Attention module, serving as the core design of HPNet. Experiments on the Argoverse and INTERACTION datasets show that HPNet achieves state-of-the-art performance, and generates accurate and stable future trajectories.

Autonomous Driving Trajectory Forecasting

Black-Box Tuning of Vision-Language Models with Effective Gradient Approximation

1 code implementation 26 Dec 2023 Zixian Guo, Yuxiang Wei, Ming Liu, Zhilong Ji, Jinfeng Bai, Yiwen Guo, WangMeng Zuo

Parameter-efficient fine-tuning (PEFT) methods have provided an effective way for adapting large vision-language models to specific tasks or scenarios.

Decoupled Textual Embeddings for Customized Image Generation

1 code implementation 19 Dec 2023 Yufei Cai, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hu Han, WangMeng Zuo

To decouple irrelevant attributes (i.e., background and pose) from the subject embedding, we further present several attribute mappers that encode each image as several image-specific subject-unrelated embeddings.

Attribute Disentanglement +2

Unveiling the Implicit Toxicity in Large Language Models

1 code implementation 29 Nov 2023 Jiaxin Wen, Pei Ke, Hao Sun, Zhexin Zhang, Chengfei Li, Jinfeng Bai, Minlie Huang

While recent studies primarily focus on probing toxic outputs that can be easily detected with existing toxicity classifiers, we show that LLMs can generate diverse implicit toxic outputs that are exceptionally difficult to detect via simply zero-shot prompting.

Language Modelling Reinforcement Learning (RL)

GPT Can Solve Mathematical Problems Without a Calculator

1 code implementation 6 Sep 2023 Zhen Yang, Ming Ding, Qingsong Lv, Zhihuan Jiang, Zehai He, Yuyi Guo, Jinfeng Bai, Jie Tang

Previous studies have typically assumed that large language models are unable to accurately perform arithmetic operations, particularly multiplication of >8 digits, and operations involving decimals and fractions, without the use of calculator tools.

Language Modelling Math

Patch Is Not All You Need

no code implementations 21 Aug 2023 Changzhen Li, Jie Zhang, Yang Wei, Zhilong Ji, Jinfeng Bai, Shiguang Shan

Vision Transformers have achieved great success in computer vision, delivering exceptional performance across various tasks.

DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model

1 code implementation 2 Jun 2023 Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Jinfeng Bai

Multilingual self-supervised speech representation models have greatly enhanced the speech recognition performance for low-resource languages, and the compression of these huge models has also become a crucial prerequisite for their industrial application.

Speech Recognition

Inferring and Leveraging Parts from Object Shape for Improving Semantic Image Synthesis

1 code implementation CVPR 2023 Yuxiang Wei, Zhilong Ji, Xiaohe Wu, Jinfeng Bai, Lei Zhang, WangMeng Zuo

Despite the progress in semantic image synthesis, it remains a challenging problem to generate photo-realistic parts from the input semantic map.

Image Generation Object

TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition

1 code implementation 9 May 2023 Tianlun Zheng, Zhineng Chen, Jinfeng Bai, Hongtao Xie, Yu-Gang Jiang

In this work, we introduce TPS++, an attention-enhanced TPS transformation that incorporates the attention mechanism to text rectification for the first time.

Optical Character Recognition (OCR) Scene Text Recognition

Dual Contrastive Prediction for Incomplete Multi-view Representation Learning

1 code implementation IEEE Transactions on Pattern Analysis and Machine Intelligence 2023 Yijie Lin, Yuanbiao Gou, Xiaotian Liu, Jinfeng Bai, Jiancheng Lv, Xi Peng

In this article, we propose a unified framework to solve the following two challenging problems in incomplete multi-view representation learning: i) how to learn a consistent representation unifying different views, and ii) how to recover the missing views.

Action Recognition Contrastive Learning +3

ELITE: Encoding Visual Concepts into Textual Embeddings for Customized Text-to-Image Generation

1 code implementation ICCV 2023 Yuxiang Wei, Yabo Zhang, Zhilong Ji, Jinfeng Bai, Lei Zhang, WangMeng Zuo

In addition to the unprecedented ability in imaginary creation, large text-to-image models are expected to take customized concepts in image generation.

Text-to-Image Generation

1st Place Solution for YouTubeVOS Challenge 2022: Referring Video Object Segmentation

1 code implementation 27 Dec 2022 Zhiwei Hu, Bo Chen, Yuan Gao, Zhilong Ji, Jinfeng Bai

The task of referring video object segmentation aims to segment the object in the frames of a given video to which the referring expressions refer.

Object Referring Video Object Segmentation +2

Position-Aware Contrastive Alignment for Referring Image Segmentation

no code implementations 27 Dec 2022 Bo Chen, Zhiwei Hu, Zhilong Ji, Jinfeng Bai, WangMeng Zuo

The main challenge of this task is to understand the visual and linguistic content simultaneously and to find the referred object accurately among all instances in the image.

Image Segmentation Position +1

1st Place Solutions for UG2+ Challenge 2022 ATMOSPHERIC TURBULENCE MITIGATION

no code implementations 30 Oct 2022 Zhuang Liu, Zhichao Zhao, Ye Yuan, Zhi Qiao, Jinfeng Bai, Zhilong Ji

In this technical report, we briefly introduce the solution of our team "summer" for Atmospheric Turbulence Mitigation in the UG$^2$+ Challenge at CVPR 2022.

Image Quality Assessment Image Reconstruction

Summary on the ISCSLP 2022 Chinese-English Code-Switching ASR Challenge

no code implementations 12 Oct 2022 Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

Code-switching automatic speech recognition has become one of the most challenging and most valuable scenarios of automatic speech recognition, owing to code-switching between languages and its frequent occurrence in daily life.

Automatic Speech Recognition (ASR) +1

When Counting Meets HMER: Counting-Aware Network for Handwritten Mathematical Expression Recognition

2 code implementations 23 Jul 2022 Bohan Li, Ye Yuan, Dingkang Liang, Xiao Liu, Zhilong Ji, Jinfeng Bai, Wenyu Liu, Xiang Bai

Recently, most handwritten mathematical expression recognition (HMER) methods adopt the encoder-decoder networks, which directly predict the markup sequences from formula images with the attention mechanism.

Optical Character Recognition (OCR)

Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks

1 code implementation 18 Jul 2022 Yabo Zhang, Mingshuai Yao, Yuxiang Wei, Zhilong Ji, Jinfeng Bai, WangMeng Zuo

In this paper, we present a novel one-shot generative domain adaptation method, i.e., DiFa, for diverse generation and faithful adaptation.

Domain Adaptation
