Search Results for author: Yihao Chen

Found 18 papers, 7 papers with code

Generic and Robust Root Cause Localization for Multi-Dimensional Data in Online Service Systems

1 code implementation • 5 May 2023 • Zeyan Li, Junjie Chen, Yihao Chen, Chengyang Luo, Yiwei Zhao, Yongqian Sun, Kaixin Sui, Xiping Wang, Dapeng Liu, Xing Jin, Qi Wang, Dan Pei

Such attribute combinations are substantial clues to the underlying root causes and thus are called root causes of multidimensional data.

LipsFormer: Introducing Lipschitz Continuity to Vision Transformers

1 code implementation • 19 Apr 2023 • Xianbiao Qi, Jianan Wang, Yihao Chen, Yukai Shi, Lei Zhang

In contrast to previous practical tricks that address training instability by learning rate warmup, layer normalization, attention formulation, and weight initialization, we show that Lipschitz continuity is a more essential property to ensure training stability.

DisCo-CLIP: A Distributed Contrastive Loss for Memory Efficient CLIP Training

1 code implementation • CVPR 2023 • Yihao Chen, Xianbiao Qi, Jianan Wang, Lei Zhang

In this way, we can reduce the GPU memory consumption of contrastive loss computation from $\mathcal{O}(B^2)$ to $\mathcal{O}(\frac{B^2}{N})$, where $B$ and $N$ are the batch size and the number of GPUs used for training, respectively.

Contrastive Learning

Exploring Vision Transformers as Diffusion Learners

no code implementations • 28 Dec 2022 • He Cao, Jianan Wang, Tianhe Ren, Xianbiao Qi, Yihao Chen, Yuan YAO, Lei Zhang

We further provide a hypothesis on the implication of disentangling the generative backbone as an encoder-decoder structure and show proof-of-concept experiments verifying the effectiveness of a stronger encoder for generative tasks with ASymmetriC ENcoder Decoder (ASCEND).

The SpeakIn System Description for CNSRC2022

no code implementations • 22 Sep 2022 • Yu Zheng, Yihao Chen, Jinghan Peng, Yajun Zhang, Min Liu, Minqiang Xu

In the SV task fixed track, our system was a fusion of five models, and two models were fused in the SV task open track.

Retrieval, Speaker Recognition, +1

3D Shuffle-Mixer: An Efficient Context-Aware Vision Learner of Transformer-MLP Paradigm for Dense Prediction in Medical Volume

no code implementations • 14 Apr 2022 • Jianye Pang, Cheng Jiang, Yihao Chen, Jianbo Chang, Ming Feng, Renzhi Wang, Jianhua Yao

Therefore, designing an elegant and efficient vision transformer learner for dense prediction in medical volume is promising and challenging.

Inductive Bias

1st Place Solution for ICDAR 2021 Competition on Mathematical Formula Detection

1 code implementation • 12 Jul 2021 • Yuxiang Zhong, Xianbiao Qi, Shanjun Li, Dengyi Gu, Yihao Chen, Peiyang Ning, Rong Xiao

In this technical report, we present our 1st place solution for the ICDAR 2021 competition on mathematical formula detection (MFD).

PingAn-VCGroup's Solution for ICDAR 2021 Competition on Scientific Literature Parsing Task B: Table Recognition to HTML

2 code implementations • 5 May 2021 • Jiaquan Ye, Xianbiao Qi, Yelin He, Yihao Chen, Dengyi Gu, Peng Gao, Rong Xiao

In our method, we divide the table content recognition task into four sub-tasks: table structure recognition, text line detection, text line recognition, and box assignment. Our table structure recognition algorithm is customized based on MASTER [1], a robust image text recognition algorithm.

Line Detection, Table Recognition

Melody-Conditioned Lyrics Generation with SeqGANs

no code implementations • 28 Oct 2020 • Yihao Chen, Alexander Lerch

Automatic lyrics generation has received attention from both music and AI communities for years.

Learning Graph Normalization for Graph Neural Networks

1 code implementation • 24 Sep 2020 • Yihao Chen, Xin Tang, Xianbiao Qi, Chun-Guang Li, Rong Xiao

We conduct extensive experiments on benchmark datasets for different tasks, including node classification, link prediction, graph classification and graph regression, and confirm that the learned graph normalization leads to competitive results and that the learned weights suggest the appropriate normalization techniques for the specific task.

Graph Classification, Graph Regression, +2

Neural Mesh Refiner for 6-DoF Pose Estimation

no code implementations • 17 Mar 2020 • Di Wu, Yihao Chen, Xianbiao Qi, Yongjian Yu, Weixuan Chen, Rong Xiao

We utilise the overlay between the accurate mask prediction and the less accurate mesh prediction to iteratively optimise the directly regressed 6D pose information, with a focus on translation estimation.

Autonomous Driving, Instance Segmentation, +4

MASTER: Multi-Aspect Non-local Network for Scene Text Recognition

7 code implementations • 7 Oct 2019 • Ning Lu, Wenwen Yu, Xianbiao Qi, Yihao Chen, Ping Gong, Rong Xiao, Xiang Bai

Attention-based scene text recognizers have gained huge success; they leverage a more compact intermediate representation to learn 1D or 2D attention with an RNN-based encoder-decoder architecture.

Scene Text Recognition
