Search Results for author: Xichen Pan

Found 5 papers, 4 papers with code

Image Sculpting: Precise Object Editing with 3D Geometry Control

no code implementations • 2 Jan 2024 • Jiraphon Yenphraphai, Xichen Pan, Sainan Liu, Daniele Panozzo, Saining Xie

We present Image Sculpting, a new framework for editing 2D images by incorporating tools from 3D geometry and graphics.

Object

Kosmos-G: Generating Images in Context with Multimodal Large Language Models

1 code implementation • 4 Oct 2023 • Xichen Pan, Li Dong, Shaohan Huang, Zhiliang Peng, Wenhu Chen, Furu Wei

These limitations keep them far from the ultimate goal of "image as a foreign language in image generation."

Image Generation

Learning Temporal Distribution and Spatial Correlation Towards Universal Moving Object Segmentation

1 code implementation • 19 Apr 2023 • Guanfang Dong, Chenqiu Zhao, Xichen Pan, Anup Basu

In this paper, we propose a method called Learning Temporal Distribution and Spatial Correlation (LTS) that has the potential to be a general solution for universal moving object segmentation.

Object Segmentation +1

Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

1 code implementation • ACL 2022 • Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin

In particular, audio and visual front-ends are trained on large-scale unimodal datasets; we then integrate components of both front-ends into a larger multimodal framework that learns to transcribe parallel audio-visual data into characters through a combination of CTC and seq2seq decoding.

Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR) +7
