Search Results for author: Yalong Bai

Found 22 papers, 5 papers with code

Dynamic Prompt Optimizing for Text-to-Image Generation

2 code implementations • 5 Apr 2024 • Wenyi Mo, Tianyu Zhang, Yalong Bai, Bing Su, Ji-Rong Wen, Qing Yang

Users assign weights or alter the injection time steps of certain words in the text prompts to improve the quality of generated images.

Text-to-Image Generation

StyleInject: Parameter Efficient Tuning of Text-to-Image Diffusion Models

no code implementations • 25 Jan 2024 • Yalong Bai, Mohan Zhou, Qing Yang

The ability to fine-tune generative models for text-to-image generation tasks is crucial, particularly given the complexity involved in accurately interpreting and visualizing textual inputs.

Language Modelling, Text-to-Image Generation, +1

Learning and Evaluating Human Preferences for Conversational Head Generation

no code implementations • 20 Jul 2023 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei

In this paper, we propose a novel learning-based evaluation metric named Preference Score (PS) for fitting human preference according to the quantitative evaluations across different dimensions.

Interactive Conversational Head Generation

no code implementations • 5 Jul 2023 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao

Based on ViCo and ViCo-X, we define three novel tasks targeting the interaction modeling during the face-to-face conversation: 1) responsive listening head generation making listeners respond actively to the speaker with non-verbal signals, 2) expressive talking head generation guiding speakers to be aware of listeners' behaviors, and 3) conversational head generation to integrate the talking/listening ability in one interlocutor.

Sentence, Talking Head Generation

Deep Equilibrium Multimodal Fusion

no code implementations • 29 Jun 2023 • Jinhong Ni, Yalong Bai, Wei Zhang, Ting Yao, Tao Mei

Multimodal fusion integrates the complementary information present in multiple modalities and has gained much attention recently.

Visual Question Answering (VQA)

Visual-Aware Text-to-Speech

no code implementations • 21 Jun 2023 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei

Dynamically synthesizing talking speech that actively responds to a listening head is critical during face-to-face interaction.

Speech Synthesis

Learning cross space mapping via DNN using large scale click-through logs

no code implementations • 26 Feb 2023 • Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

The image and query are mapped to a common vector space via these two parts respectively, and image-query similarity is naturally defined as an inner product of their mappings in the space.

Image Classification, Image Retrieval, +1

Visualizing and Understanding Patch Interactions in Vision Transformer

no code implementations • 11 Mar 2022 • Jie Ma, Yalong Bai, Bineng Zhong, Wei Zhang, Ting Yao, Tao Mei

Vision Transformer (ViT) has become a leading tool in various computer vision tasks, owing to its unique self-attention mechanism that learns visual representations explicitly through cross-patch information interactions.

Freeform Body Motion Generation from Speech

1 code implementation • 4 Mar 2022 • Jing Xu, Wei Zhang, Yalong Bai, Qibin Sun, Tao Mei

Motivated by studies in linguistics, we decompose the co-speech motion into two complementary parts: pose modes and rhythmic dynamics.

Responsive Listening Head Generation: A Benchmark Dataset and Baseline

no code implementations • 27 Dec 2021 • Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei

Automatically synthesizing listening behavior that actively responds to a talking head is critical to applications such as digital humans, virtual agents and social robots.

Talking Head Generation, Translation

Directional Self-supervised Learning for Heavy Image Augmentations

no code implementations • CVPR 2022 • Yalong Bai, Yifan Yang, Wei Zhang, Tao Mei

Specifically, we adapt heavy augmentation policies after the views lightly augmented by standard augmentations, to generate a harder view (HV).

Representation Learning, Self-Supervised Learning

Augmentation Pathways Network for Visual Recognition

1 code implementation • 26 Jul 2021 • Yalong Bai, Mohan Zhou, Wei Zhang, Bowen Zhou, Tao Mei

Experimental results on ImageNet demonstrate the compatibility and effectiveness on a much wider range of augmentations, while consuming fewer parameters and lower computational costs at inference time.

Data Augmentation

Exploiting Relationship for Complex-scene Image Generation

no code implementations • 1 Apr 2021 • Tianyu Hua, Hongdong Zheng, Yalong Bai, Wei Zhang, Xiao-Ping Zhang, Tao Mei

Our method tends to synthesize plausible layouts and objects, respecting the interplay of multiple objects in an image.

Image Generation, Scene Generation

Products-10K: A Large-scale Product Recognition Dataset

1 code implementation • 24 Aug 2020 • Yalong Bai, Yuxiang Chen, Wei Yu, Linfang Wang, Wei Zhang

With the rapid development of electronic commerce, the way of shopping has experienced a revolutionary evolution.

Look-into-Object: Self-supervised Structure Modeling for Object Recognition

2 code implementations • CVPR 2020 • Mohan Zhou, Yalong Bai, Wei Zhang, Tiejun Zhao, Tao Mei

Specifically, we first propose an object-extent learning module for localizing the object according to the visual patterns shared among the instances in the same category.

Ranked #17 on Fine-Grained Image Classification on CUB-200-2011

Fine-Grained Image Classification, Image Recognition, +7

Relationship-Aware Spatial Perception Fusion for Realistic Scene Layout Generation

no code implementations • 2 Sep 2019 • Hongdong Zheng, Yalong Bai, Wei Zhang, Tao Mei

In our framework, a spatial constraint module is designed to fit a reasonable scaling and spatial layout of object pairs while considering the relationship between them.

Image Generation, Object

VrR-VG: Refocusing Visually-Relevant Relationships

no code implementations • ICCV 2019 • Yuanzhi Liang, Yalong Bai, Wei Zhang, Xueming Qian, Li Zhu, Tao Mei

Relationships encode the interactions among individual instances, and play a critical role in deep visual scene understanding.

Image Captioning, Question Answering, +3

Deep Attention Neural Tensor Network for Visual Question Answering

no code implementations • ECCV 2018 • Yalong Bai, Jianlong Fu, Tiejun Zhao, Tao Mei

First, we model one of the pairwise interactions (e.g., image and question) by bilinear features, which is further encoded with the third dimension (e.g., answer) to be a triplet by bilinear tensor product.

Deep Attention, Question Answering, +1

Automatic Dataset Augmentation

no code implementations • 28 Aug 2017 • Yalong Bai, Kuiyuan Yang, Tao Mei, Wei-Ying Ma, Tiejun Zhao

Large-scale image datasets and deep convolutional neural networks (DCNNs) are two primary driving forces for the rapid progress made in generic object recognition tasks in recent years.

Object Recognition

Visualizing and Comparing Convolutional Neural Networks

no code implementations • 20 Dec 2014 • Wei Yu, Kuiyuan Yang, Yalong Bai, Hongxun Yao, Yong Rui

Convolutional Neural Networks (CNNs) have achieved error rates comparable to those of well-trained humans on the ILSVRC2014 image classification task.

Classification, General Classification, +1

Learning High-level Image Representation for Image Retrieval via Multi-Task DNN using Clickthrough Data

no code implementations • 17 Dec 2013 • Yalong Bai, Kuiyuan Yang, Wei Yu, Wei-Ying Ma, Tiejun Zhao

Image retrieval refers to finding relevant images from an image database for a query, which is considered difficult because of the gap between the low-level representation of images and the high-level representation of queries.

Image Retrieval, Retrieval

Cross-lingual Projections between Languages from Different Families

no code implementations • ACL 2013 • Mo Yu, Tiejun Zhao, Yalong Bai, Hao Tian, Dianhai Yu

Word Alignment
