Search Results for author: Guan Pang

Found 16 papers, 8 papers with code

LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning

no code implementations • 6 Dec 2023 • Bolin Lai, Xiaoliang Dai, Lawrence Chen, Guan Pang, James M. Rehg, Miao Liu

Additionally, existing diffusion-based image manipulation models are sub-optimal in controlling the state transition of an action in egocentric image pixel space because of the domain gap.

Image Manipulation, Language Modelling +1

DISGO: Automatic End-to-End Evaluation for Scene Text OCR

no code implementations • 25 Aug 2023 • Mei-Yuh Hwang, Yangyang Shi, Ankit Ramchandani, Guan Pang, Praveen Krishnan, Lucas Kabela, Frank Seide, Samyak Datta, Jun Liu

This paper discusses the challenges of optical character recognition (OCR) on natural scenes, which is harder than OCR on documents due to the wild content and various image backgrounds.

Machine Translation, Optical Character Recognition +2

Text-Conditional Contextualized Avatars For Zero-Shot Personalization

no code implementations • 14 Apr 2023 • Samaneh Azadi, Thomas Hayes, Akbar Shah, Guan Pang, Devi Parikh, Sonal Gupta

Recent large-scale text-to-image generation models have made significant improvements in the quality, realism, and diversity of the synthesized images and enable users to control the created content through language.

Text to 3D, Text-to-Image Generation

TextStyleBrush: Transfer of Text Aesthetics from a Single Example

1 code implementation • 15 Jun 2021 • Praveen Krishnan, Rama Kovvuri, Guan Pang, Boris Vassilev, Tal Hassner

We present a novel approach for disentangling the content of a text image from all aspects of its appearance.

Disentanglement

TextOCR: Towards large-scale end-to-end reasoning for arbitrary-shaped scene text

no code implementations • CVPR 2021 • Amanpreet Singh, Guan Pang, Mandy Toh, Jing Huang, Wojciech Galuba, Tal Hassner

A crucial component for the scene text based reasoning required for TextVQA and TextCaps datasets involves detecting and recognizing text present in the images using an optical character recognition (OCR) system.

Optical Character Recognition, Optical Character Recognition (OCR) +2

A Multiplexed Network for End-to-End, Multilingual OCR

1 code implementation • CVPR 2021 • Jing Huang, Guan Pang, Rama Kovvuri, Mandy Toh, Kevin J Liang, Praveen Krishnan, Xi Yin, Tal Hassner

Recent advances in OCR have shown that an end-to-end (E2E) training pipeline that includes both detection and recognition leads to the best results.

Optical Character Recognition (OCR), Text Detection

img2pose: Face Alignment and Detection via 6DoF, Face Pose Estimation

1 code implementation • CVPR 2021 • Vítor Albiero, Xingyu Chen, Xi Yin, Guan Pang, Tal Hassner

Tests on AFLW2000-3D and BIWI show that our method runs at real-time and outperforms state of the art (SotA) face pose estimators.

3D Face Alignment, Face Alignment +3

From Satellite Imagery to Disaster Insights

1 code implementation • 17 Dec 2018 • Jigar Doshi, Saikat Basu, Guan Pang

The use of satellite imagery has become increasingly popular for disaster monitoring and response.

Change Detection, Disaster Response

Improving Rotated Text Detection with Rotation Region Proposal Networks

no code implementations • 16 Nov 2018 • Jing Huang, Viswanath Sivakumar, Mher Mnatsakanyan, Guan Pang

In this work, we extend the scene-text extraction system at Facebook, Rosetta, to efficiently handle text in various orientations.

Misinformation, Region Proposal +1

DeepGlobe 2018: A Challenge to Parse the Earth through Satellite Images

1 code implementation • 17 May 2018 • Ilke Demir, Krzysztof Koperski, David Lindenbaum, Guan Pang, Jing Huang, Saikat Basu, Forest Hughes, Devis Tuia, Ramesh Raskar

We present the DeepGlobe 2018 Satellite Image Understanding Challenge, which includes three public competitions for segmentation, detection, and classification tasks on satellite images.
