Search Results for author: Jingtuo Liu

Found 36 papers, 17 papers with code

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

2 code implementations · 12 Apr 2021 · Pengfei Wang, Chengquan Zhang, Fei Qi, Shanshan Liu, Xiaoqiang Zhang, Pengyuan Lyu, Junyu Han, Jingtuo Liu, Errui Ding, Guangming Shi

With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations involved, which guarantees high efficiency.
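
As a rough sketch of the point-gathering idea (tensor shapes, names, and the greedy decode below are assumptions for illustration, not the authors' released code), per-point character logits can be gathered along a predicted text center line and CTC-decoded without any NMS or RoI step:

```python
import torch

def pg_ctc_decode(char_logits, center_points, blank_id=0):
    """Gather per-point character logits along a text center line and
    greedily CTC-decode them (collapse repeats, drop blanks).

    char_logits:   (C, H, W) character classification map (assumed layout)
    center_points: (N, 2) long tensor of (y, x) points ordered along the line
    """
    ys, xs = center_points[:, 0], center_points[:, 1]
    gathered = char_logits[:, ys, xs].t()      # (N, C) logits per center point
    ids = gathered.argmax(dim=1).tolist()      # greedy per-point labels

    decoded, prev = [], blank_id
    for i in ids:
        if i != blank_id and i != prev:        # CTC collapse rule
            decoded.append(i)
        prev = i
    return decoded
```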

Ranked #1 on Scene Text Detection on ICDAR 2015 (Accuracy metric)

Optical Character Recognition (OCR) Scene Text Detection +1

PyramidBox: A Context-assisted Single Shot Face Detector

5 code implementations · ECCV 2018 · Xu Tang, Daniel K. Du, Zeqiang He, Jingtuo Liu

This paper proposes a novel context-assisted single shot face detector, named PyramidBox, to handle the hard face detection problem.

Face Detection

Real-time Neural Radiance Talking Portrait Synthesis via Audio-spatial Decomposition

1 code implementation · 22 Nov 2022 · Jiaxiang Tang, Kaisiyuan Wang, Hang Zhou, Xiaokang Chen, Dongliang He, Tianshu Hu, Jingtuo Liu, Gang Zeng, Jingdong Wang

While dynamic Neural Radiance Fields (NeRF) have shown success in high-fidelity 3D modeling of talking portraits, the slow training and inference speed severely obstruct their potential usage.

Talking Face Generation

ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT)

1 code implementation · 16 Sep 2019 · Chee-Kheng Chng, Yuliang Liu, Yipeng Sun, Chun Chet Ng, Canjie Luo, Zihan Ni, ChuanMing Fang, Shuaitao Zhang, Junyu Han, Errui Ding, Jingtuo Liu, Dimosthenis Karatzas, Chee Seng Chan, Lianwen Jin

This paper reports the ICDAR2019 Robust Reading Challenge on Arbitrary-Shaped Text (RRC-ArT) that consists of three major challenges: i) scene text detection, ii) scene text recognition, and iii) scene text spotting.

Scene Text Detection Scene Text Recognition +2

MobileFaceSwap: A Lightweight Framework for Video Face Swapping

1 code implementation · 11 Jan 2022 · Zhiliang Xu, Zhibin Hong, Changxing Ding, Zhen Zhu, Junyu Han, Jingtuo Liu, Errui Ding

In this work, we propose a lightweight Identity-aware Dynamic Network (IDN) for subject-agnostic face swapping by dynamically adjusting the model parameters according to the identity information.
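
A hedged illustration of the "identity-aware dynamic network" idea (the module, shapes, and sigmoid modulation are hypothetical, not the released model): an identity embedding predicts a per-channel scaling of a convolution's weights, so one lightweight network adapts to each source identity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IdentityModulatedConv(nn.Module):
    """Conv whose weights are scaled per output channel by a vector
    predicted from an identity embedding (illustrative only, batch size 1)."""
    def __init__(self, in_ch, out_ch, id_dim=512, k=3):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, k, k) * 0.02)
        self.to_scale = nn.Linear(id_dim, out_ch)   # predicts the modulation

    def forward(self, x, id_embed):
        # id_embed: (1, id_dim) identity feature; x: (1, in_ch, H, W)
        scale = torch.sigmoid(self.to_scale(id_embed))   # (1, out_ch)
        w = self.weight * scale.view(-1, 1, 1, 1)        # modulate conv weights
        return F.conv2d(x, w, padding=1)
```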

Face Swapping Knowledge Distillation

Learning Generalized Spoof Cues for Face Anti-spoofing

6 code implementations · 8 May 2020 · Haocheng Feng, Zhibin Hong, Haixiao Yue, Yang Chen, Keyao Wang, Junyu Han, Jingtuo Liu, Errui Ding

In this paper, we reformulate FAS from an anomaly detection perspective and propose a residual-learning framework to learn the discriminative live-spoof differences, which are defined as the spoof cues.
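
A minimal sketch of the residual spoof-cue idea (the loss below is a simplification with assumed names; the paper's full objective includes additional terms): a generator predicts a residual cue map, live faces are regressed toward an all-zero cue, and spoof cues are left unconstrained, casting FAS as anomaly detection.

```python
import torch

def spoof_cue_loss(cue_map, labels):
    """cue_map: (B, C, H, W) predicted residual spoof cues.
    labels:  (B,) with 1 for live, 0 for spoof.
    Only live samples are pushed toward a zero cue map; spoof cues stay free."""
    live = labels.float()
    cue_energy = (cue_map ** 2).mean(dim=(1, 2, 3))   # per-sample cue magnitude
    return (live * cue_energy).sum() / live.sum().clamp(min=1)
```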

Anomaly Detection Face Anti-Spoofing

Editing Text in the Wild

2 code implementations · 8 Aug 2019 · Liang Wu, Chengquan Zhang, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai

Specifically, we propose an end-to-end trainable style retention network (SRNet) that consists of three modules: text conversion module, background inpainting module and fusion module.
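
The three-module layout reads roughly as the composition below (a skeleton with placeholder sub-modules, not the released SRNet): text conversion renders the target text in the source style, background inpainting erases the original text, and a fusion module blends the two.

```python
import torch.nn as nn

class SRNetSketch(nn.Module):
    """Skeleton of the three-module pipeline described above (illustrative)."""
    def __init__(self, text_conversion, background_inpainting, fusion):
        super().__init__()
        self.text_conversion = text_conversion            # styled target text
        self.background_inpainting = background_inpainting  # text-free background
        self.fusion = fusion                              # blend text onto background

    def forward(self, source_image, target_text_render):
        styled_text = self.text_conversion(source_image, target_text_render)
        background = self.background_inpainting(source_image)
        return self.fusion(styled_text, background)
```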

Image Inpainting Image-to-Image Translation +1

EATEN: Entity-aware Attention for Single Shot Visual Text Extraction

1 code implementation · 20 Sep 2019 · He Guo, Xiameng Qin, Jiaming Liu, Junyu Han, Jingtuo Liu, Errui Ding

Extracting entities from images is a crucial part of many OCR applications, such as entity recognition of cards, invoices, and receipts.

Entity Extraction using GAN Optical Character Recognition (OCR)

ACFNet: Attentional Class Feature Network for Semantic Segmentation

1 code implementation · ICCV 2019 · Fan Zhang, Yanqin Chen, Zhihang Li, Zhibin Hong, Jingtuo Liu, Feifei Ma, Junyu Han, Errui Ding

Recent works have made great progress in semantic segmentation by exploiting richer context, most of which are designed from a spatial perspective.

Segmentation Semantic Segmentation

Few-Shot Font Generation by Learning Fine-Grained Local Styles

2 code implementations · CVPR 2022 · Licheng Tang, Yiyang Cai, Jiaming Liu, Zhibin Hong, Mingming Gong, Minhu Fan, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Instead of explicitly disentangling global or component-wise modeling, the cross-attention mechanism can attend to the right local styles in the reference glyphs and aggregate the reference styles into a fine-grained style representation for the given content glyphs.
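
A minimal sketch of the cross-attention aggregation described above (dimensions and the module name are assumptions): content-glyph tokens act as queries over local reference-glyph tokens, so each content location attends to the relevant local styles.

```python
import torch.nn as nn

class LocalStyleAggregator(nn.Module):
    """Cross-attention: content glyph tokens query reference glyph tokens."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, content_tokens, reference_tokens):
        # content_tokens:   (B, N_content, dim) features of the content glyph
        # reference_tokens: (B, N_ref, dim)     local features of reference glyphs
        style, _ = self.attn(query=content_tokens,
                             key=reference_tokens,
                             value=reference_tokens)
        return style   # fine-grained style aligned to the content layout
```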

Font Generation

PyramidBox++: High Performance Detector for Finding Tiny Face

4 code implementations · 31 Mar 2019 · Zhihang Li, Xu Tang, Junyu Han, Jingtuo Liu, Ran He

With the rapid development of deep convolutional neural networks, face detection has made great progress in recent years.

Data Augmentation Face Detection +1

Biphasic Learning of GANs for High-Resolution Image-to-Image Translation

no code implementations · 14 Apr 2019 · Jie Cao, Huaibo Huang, Yi Li, Jingtuo Liu, Ran He, Zhenan Sun

In this work, we present a novel training framework for GANs, namely biphasic learning, to achieve image-to-image translation in multiple visual domains at $1024^2$ resolution.

Image-to-Image Translation Mutual Information Estimation +2

Chinese Street View Text: Large-scale Chinese Text Reading with Partially Supervised Learning

no code implementations · ICCV 2019 · Yipeng Sun, Jiaming Liu, Wei Liu, Junyu Han, Errui Ding, Jingtuo Liu

Most existing text reading benchmarks make it difficult to evaluate the performance of more advanced deep learning models in large vocabularies due to the limited amount of training data.

HAMBox: Delving into Online High-quality Anchors Mining for Detecting Outer Faces

no code implementations · 19 Dec 2019 · Yang Liu, Xu Tang, Xiang Wu, Junyu Han, Jingtuo Liu, Errui Ding

In this paper, we propose an Online High-quality Anchor Mining Strategy (HAMBox), which explicitly compensates outer faces with high-quality anchors.
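
A rough sketch of the anchor-compensation step (thresholds, top-k, and the bookkeeping tensors are assumptions; box_iou comes from torchvision): faces that matched too few anchors at assignment time are compensated with unmatched anchors whose regressed boxes overlap them strongly.

```python
import torch
from torchvision.ops import box_iou

def compensate_outer_faces(regressed_boxes, anchor_matched, face_boxes,
                           face_num_matched, iou_thresh=0.5, top_k=3):
    """Assign extra positives to faces with too few matched anchors,
    using the IoU of the *regressed* anchor boxes (illustrative only)."""
    extra = []
    free = ~anchor_matched                               # anchors not yet matched
    ious = box_iou(regressed_boxes[free], face_boxes)    # (A_free, num_faces)
    free_idx = torch.nonzero(free).squeeze(1)
    for f in range(face_boxes.size(0)):
        if face_num_matched[f] >= top_k:
            continue                                     # already well covered
        need = top_k - int(face_num_matched[f])
        cand = ious[:, f]
        scores, order = cand.topk(min(need, cand.numel()))
        keep = order[scores > iou_thresh]
        extra.append((f, free_idx[keep]))
    return extra    # list of (face index, compensated anchor indices)
```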

Face Detection Multi-Task Learning +2

Learning Global Structure Consistency for Robust Object Tracking

no code implementations · 26 Aug 2020 · Bi Li, Chengquan Zhang, Zhibin Hong, Xu Tang, Jingtuo Liu, Junyu Han, Errui Ding, Wenyu Liu

Unlike many existing trackers that focus on modeling only the target, in this work, we consider the transient variations of the whole scene.

Object Visual Object Tracking

Real Image Super Resolution Via Heterogeneous Model Ensemble using GP-NAS

no code implementations · 2 Sep 2020 · Zhihong Pan, Baopu Li, Teng Xi, Yanwen Fan, Gang Zhang, Jingtuo Liu, Junyu Han, Errui Ding

With advances in deep neural networks (DNNs), recent state-of-the-art (SOTA) image super-resolution (SR) methods have achieved impressive performance using deep residual networks with dense skip connections.

Image Super-Resolution Neural Architecture Search

FaceController: Controllable Attribute Editing for Face in the Wild

no code implementations · 23 Feb 2021 · Zhiliang Xu, Xiyu Yu, Zhibin Hong, Zhen Zhu, Junyu Han, Jingtuo Liu, Errui Ding, Xiang Bai

By simply employing some existing and easily obtainable prior information, our method can control, transfer, and edit diverse attributes of faces in the wild.

Ranked #1 on Face Swapping on FaceForensics++ (FID metric)

Attribute Disentanglement +1

ViSTA: Vision and Scene Text Aggregation for Cross-Modal Retrieval

no code implementations · CVPR 2022 · Mengjun Cheng, Yipeng Sun, Longchao Wang, Xiongwei Zhu, Kun Yao, Jie Chen, Guoli Song, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Visual appearance is considered to be the most important cue to understand images for cross-modal retrieval, while sometimes the scene text appearing in images can provide valuable information to understand the visual semantics.

Ranked #10 on Cross-Modal Retrieval on Flickr30k (using extra training data)

Contrastive Learning Cross-Modal Retrieval +1

Few-Shot Head Swapping in the Wild

no code implementations · CVPR 2022 · Changyong Shu, Hemao Wu, Hang Zhou, Jiaming Liu, Zhibin Hong, Changxing Ding, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Particularly, seamless blending is achieved with the help of a Semantic-Guided Color Reference Creation procedure and a Blending UNet.

Face Swapping

TRUST: An Accurate and End-to-End Table structure Recognizer Using Splitting-based Transformers

no code implementations · 31 Aug 2022 · Zengyuan Guo, Yuechen Yu, Pengyuan Lv, Chengquan Zhang, Haojie Li, Zhihui Wang, Kun Yao, Jingtuo Liu, Jingdong Wang

The Vertex-based Merging Module aggregates local contextual information between adjacent basic grids, enabling it to accurately merge basic grids that belong to the same spanning cell.
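
One way to read the merging step (a hedged sketch; the merge-probability inputs and union-find grouping are illustrative, not the paper's exact formulation): given predicted probabilities that adjacent basic grids belong together, grids are grouped into spanning cells.

```python
def merge_basic_grids(merge_right, merge_down, rows, cols, thresh=0.5):
    """merge_right[r][c]: prob that grid (r, c) merges with (r, c+1)
    merge_down[r][c]:  prob that grid (r, c) merges with (r+1, c)
    Returns one cell id per basic grid (union-find grouping, illustrative)."""
    parent = list(range(rows * cols))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]   # path compression
            i = parent[i]
        return i

    def union(a, b):
        parent[find(a)] = find(b)

    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            if c + 1 < cols and merge_right[r][c] > thresh:
                union(i, i + 1)
            if r + 1 < rows and merge_down[r][c] > thresh:
                union(i, i + cols)
    return [find(i) for i in range(rows * cols)]
```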

Table Recognition

StyleSwap: Style-Based Generator Empowers Robust Face Swapping

no code implementations · 27 Sep 2022 · Zhiliang Xu, Hang Zhou, Zhibin Hong, Ziwei Liu, Jiaming Liu, Zhizhi Guo, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

Our core idea is to leverage a style-based generator to empower high-fidelity and robust face swapping, so that the generator's advantages can be used to optimize identity similarity.

Face Swapping

Masked Lip-Sync Prediction by Audio-Visual Contextual Exploitation in Transformers

no code implementations · 9 Dec 2022 · Yasheng Sun, Hang Zhou, Kaisiyuan Wang, Qianyi Wu, Zhibin Hong, Jingtuo Liu, Errui Ding, Jingdong Wang, Ziwei Liu, Hideki Koike

This requires masking a large percentage of the original image and seamlessly inpainting it with the aid of audio and reference frames.
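
A hedged sketch of that masking step (the mask ratio, crop layout, and channel stacking are assumptions): the lower part of an aligned target face crop is blanked out, and the model inpaints it conditioned on a reference frame and audio features.

```python
import torch

def build_masked_input(target_frame, reference_frame, mask_ratio=0.5):
    """target_frame, reference_frame: (B, 3, H, W) aligned face crops.
    Zeros out the lower `mask_ratio` of the target (the mouth region in an
    aligned crop) and stacks it with the reference as conditioning input."""
    masked = target_frame.clone()
    h = target_frame.size(2)
    masked[:, :, int(h * (1 - mask_ratio)):, :] = 0.0   # hide the mouth area
    return torch.cat([masked, reference_frame], dim=1)  # (B, 6, H, W)
```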

StyleSync: High-Fidelity Generalized and Personalized Lip Sync in Style-based Generator

no code implementations · CVPR 2023 · Jiazhi Guan, Zhanwang Zhang, Hang Zhou, Tianshu Hu, Kaisiyuan Wang, Dongliang He, Haocheng Feng, Jingtuo Liu, Errui Ding, Ziwei Liu, Jingdong Wang

Despite recent advances in syncing lip movements with any audio waves, current methods still struggle to balance generation quality and the model's generalization ability.

HD-Fusion: Detailed Text-to-3D Generation Leveraging Multiple Noise Estimation

no code implementations · 30 Jul 2023 · Jinbo Wu, Xiaobo Gao, Xing Liu, Zhengyang Shen, Chen Zhao, Haocheng Feng, Jingtuo Liu, Errui Ding

In this paper, we study Text-to-3D content generation leveraging 2D diffusion priors to enhance the quality and detail of the generated 3D models.

3D Generation Noise Estimation +1

Accelerating Vision Transformers Based on Heterogeneous Attention Patterns

no code implementations · 11 Oct 2023 · Deli Yu, Teng Xi, Jianwei Li, Baopu Li, Gang Zhang, Haocheng Feng, Junyu Han, Jingtuo Liu, Errui Ding, Jingdong Wang

On one hand, different images share more similar attention patterns in early layers than later layers, indicating that the dynamic query-by-key self-attention matrix may be replaced with a static self-attention matrix in early layers.
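
A minimal sketch of that static-attention replacement (a hypothetical module for a fixed token count, not the paper's implementation): an input-independent, learned attention matrix mixes tokens in early layers, so no query-key product is computed at inference.

```python
import torch
import torch.nn as nn

class StaticAttention(nn.Module):
    """Replaces query-by-key attention with a learned, input-independent
    token-mixing matrix for a fixed sequence length (illustrative)."""
    def __init__(self, num_tokens, dim):
        super().__init__()
        self.attn_logits = nn.Parameter(torch.zeros(num_tokens, num_tokens))
        self.proj_v = nn.Linear(dim, dim)
        self.proj_out = nn.Linear(dim, dim)

    def forward(self, x):                              # x: (B, num_tokens, dim)
        attn = self.attn_logits.softmax(dim=-1)        # static attention map
        v = self.proj_v(x)
        return self.proj_out(attn @ v)                 # (N, N) @ (B, N, D) broadcasts
```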

Dimensionality Reduction
