Search Results for author: Ce Liu

Found 49 papers, 22 papers with code

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

no code implementations • 10 Nov 2023 • Bin Xiao, Haiping Wu, Weijian Xu, Xiyang Dai, Houdong Hu, Yumao Lu, Michael Zeng, Ce Liu, Lu Yuan

We introduce Florence-2, a novel vision foundation model with a unified, prompt-based representation for a variety of computer vision and vision-language tasks.

Multi-Task Learning object-detection +1

Paper
Add Code

MM-VID: Advancing Video Understanding with GPT-4V(ision)

no code implementations • 30 Oct 2023 • Kevin Lin, Faisal Ahmed, Linjie Li, Chung-Ching Lin, Ehsan Azarnasab, Zhengyuan Yang, JianFeng Wang, Lin Liang, Zicheng Liu, Yumao Lu, Ce Liu, Lijuan Wang

We present MM-VID, an integrated system that harnesses the capabilities of GPT-4V, combined with specialized tools in vision, audio, and speech, to facilitate advanced video understanding.

Video Understanding

Paper
Add Code

Self-Enforced Job Matching

no code implementations • 26 Aug 2023 • Ce Liu, Ziwei Wang, HanZhe Zhang

The classic two-sided many-to-one job matching model assumes that firms treat workers as substitutes and workers ignore colleagues when choosing where to work.

Paper
Add Code

Indiscernible Object Counting in Underwater Scenes

1 code implementation • CVPR 2023 • Guolei Sun, Zhaochong An, Yun Liu, Ce Liu, Christos Sakaridis, Deng-Ping Fan, Luc van Gool

We further advance the frontier of this field by systematically studying a new challenge named indiscernible object counting (IOC), the goal of which is to count objects that are blended with respect to their surroundings.

Benchmarking Object +2

Paper
Code

Single Image Depth Prediction Made Better: A Multivariate Gaussian Take

no code implementations • CVPR 2023 • Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

Accordingly, we introduce an approach that performs continuous modeling of per-pixel depth, where we can predict and reason about the per-pixel depth and its distribution.

Depth Estimation Depth Prediction

Paper
Add Code

MM-REACT: Prompting ChatGPT for Multimodal Reasoning and Action

1 code implementation • 20 Mar 2023 • Zhengyuan Yang, Linjie Li, JianFeng Wang, Kevin Lin, Ehsan Azarnasab, Faisal Ahmed, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang

We propose MM-REACT, a system paradigm that integrates ChatGPT with a pool of vision experts to achieve multimodal reasoning and action.

Ranked #22 on Visual Question Answering on MM-Vet

Multimodal Reasoning Visual Question Answering

906

Paper
Code

VA-DepthNet: A Variational Approach to Single Image Depth Prediction

2 code implementations • 13 Feb 2023 • Ce Liu, Suryansh Kumar, Shuhang Gu, Radu Timofte, Luc van Gool

While state-of-the-art deep neural network methods for SIDP learn the scene depth from images in a supervised setting, they often overlook the invaluable invariances and priors in the rigid scene space, such as the regularity of the scene.

Ranked #18 on Monocular Depth Estimation on NYU-Depth V2

Depth Prediction Monocular Depth Estimation

108

Paper
Code

Learning Customized Visual Models with Retrieval-Augmented Knowledge

1 code implementation • CVPR 2023 • Haotian Liu, Kilho Son, Jianwei Yang, Ce Liu, Jianfeng Gao, Yong Jae Lee, Chunyuan Li

Image-text contrastive learning models such as CLIP have demonstrated strong task transfer ability.

Ranked #1 on Semi-Supervised Image Classification on ImageNet - 1% labeled data (using extra training data)

Contrastive Learning Retrieval +3

117

Paper
Code

X-Paste: Revisiting Scalable Copy-Paste for Instance Segmentation using CLIP and StableDiffusion

1 code implementation • 7 Dec 2022 • Hanqing Zhao, Dianmo Sheng, Jianmin Bao, Dongdong Chen, Dong Chen, Fang Wen, Lu Yuan, Ce Liu, Wenbo Zhou, Qi Chu, Weiming Zhang, Nenghai Yu

We demonstrate for the first time that using a text2image model to generate images or zero-shot recognition model to filter noisily crawled images for different object categories is a feasible way to make Copy-Paste truly scalable.

Ranked #7 on Instance Segmentation on LVIS v1.0 val

Data Augmentation Instance Segmentation +5

Paper
Code

ReCo: Region-Controlled Text-to-Image Generation

no code implementations • CVPR 2023 • Zhengyuan Yang, JianFeng Wang, Zhe Gan, Linjie Li, Kevin Lin, Chenfei Wu, Nan Duan, Zicheng Liu, Ce Liu, Michael Zeng, Lijuan Wang

Human evaluation on PaintSkill shows that ReCo is +19. 28% and +17. 21% more accurate in generating images with correct object count and spatial relationship than the T2I model.

Ranked #2 on Conditional Text-to-Image Synthesis on COCO-MIG

Conditional Text-to-Image Synthesis Position

Paper
Add Code

OmniVL:One Foundation Model for Image-Language and Video-Language Tasks

no code implementations • 15 Sep 2022 • Junke Wang, Dongdong Chen, Zuxuan Wu, Chong Luo, Luowei Zhou, Yucheng Zhao, Yujia Xie, Ce Liu, Yu-Gang Jiang, Lu Yuan

This paper presents OmniVL, a new foundation model to support both image-language and video-language tasks using one universal architecture.

Ranked #4 on Cross-Modal Retrieval on Flickr30k (using extra training data)

Action Classification Action Recognition +13

Paper
Add Code

Coarse-to-Fine Vision-Language Pre-training with Fusion in the Backbone

1 code implementation • NeurIPS 2022 • Zi-Yi Dou, Aishwarya Kamath, Zhe Gan, Pengchuan Zhang, JianFeng Wang, Linjie Li, Zicheng Liu, Ce Liu, Yann Lecun, Nanyun Peng, Jianfeng Gao, Lijuan Wang

Vision-language (VL) pre-training has recently received considerable attention.

Ranked #1 on Phrase Grounding on Flickr30k Entities Dev

Described Object Detection Image Captioning +5

123

Paper
Code

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling

1 code implementation • CVPR 2023 • Linjie Li, Zhe Gan, Kevin Lin, Chung-Ching Lin, Zicheng Liu, Ce Liu, Lijuan Wang

In this work, we explore a unified VidL framework LAVENDER, where Masked Language Modeling (MLM) is used as the common interface for all pre-training and downstream tasks.

Language Modelling Masked Language Modeling +6

Paper
Code

Visual Clues: Bridging Vision and Language Foundations for Image Paragraph Captioning

no code implementations • 3 Jun 2022 • Yujia Xie, Luowei Zhou, Xiyang Dai, Lu Yuan, Nguyen Bach, Ce Liu, Michael Zeng

Thanks to the strong zero-shot capability of foundation models, we start by constructing a rich semantic representation of the image (e. g., image tags, object attributes / locations, captions) as a structured textual prompt, called visual clues, using a vision foundation model.

Image Paragraph Captioning Language Modelling +1

Paper
Add Code

GIT: A Generative Image-to-text Transformer for Vision and Language

2 code implementations • 27 May 2022 • JianFeng Wang, Zhengyuan Yang, Xiaowei Hu, Linjie Li, Kevin Lin, Zhe Gan, Zicheng Liu, Ce Liu, Lijuan Wang

In this paper, we design and train a Generative Image-to-text Transformer, GIT, to unify vision-language tasks such as image/video captioning and question answering.

Ranked #1 on Image Captioning on nocaps-XD near-domain

Image Captioning Image Classification +7

124,889

Paper
Code

Credible Persuasion

no code implementations • 6 May 2022 • Xiao Lin, Ce Liu

We propose a new notion of credibility for Bayesian persuasion problems.

Paper
Add Code

K-LITE: Learning Transferable Visual Models with External Knowledge

2 code implementations • 20 Apr 2022 • Sheng Shen, Chunyuan Li, Xiaowei Hu, Jianwei Yang, Yujia Xie, Pengchuan Zhang, Zhe Gan, Lijuan Wang, Lu Yuan, Ce Liu, Kurt Keutzer, Trevor Darrell, Anna Rohrbach, Jianfeng Gao

We propose K-LITE, a simple strategy to leverage external knowledge for building transferable visual systems: In training, it enriches entities in text with WordNet and Wiktionary knowledge, leading to an efficient and scalable approach to learning image representations that uses knowledge about the visual concepts.

Benchmarking Descriptive +4

367

Paper
Code

Unified Contrastive Learning in Image-Text-Label Space

1 code implementation • CVPR 2022 • Jianwei Yang, Chunyuan Li, Pengchuan Zhang, Bin Xiao, Ce Liu, Lu Yuan, Jianfeng Gao

Particularly, it attains gains up to 9. 2% and 14. 5% in average on zero-shot recognition benchmarks over the language-image contrastive learning and supervised learning methods, respectively.

Contrastive Learning Image Classification +2

367

Paper
Code

MaskGIT: Masked Generative Image Transformer

6 code implementations • CVPR 2022 • Huiwen Chang, Han Zhang, Lu Jiang, Ce Liu, William T. Freeman

At inference time, the model begins with generating all tokens of an image simultaneously, and then refines the image iteratively conditioned on the previous generation.

Ranked #2 on Text-to-Image Generation on LHQC

Image Manipulation Image Outpainting +1

1,114

Paper
Code

ViSER: Video-Specific Surface Embeddings for Articulated 3D Shape Reconstruction

1 code implementation • NeurIPS 2021 • Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Ce Liu, Deva Ramanan

The surface embeddings are implemented as coordinate-based MLPs that are fit to each video via consistency and contrastive reconstruction losses. Experimental results show that ViSER compares favorably against prior work on challenging videos of humans with loose clothing and unusual poses as well as animals videos from DAVIS and YTVOS.

3D Shape Reconstruction from Videos

Paper
Code

Pyramid Adversarial Training Improves ViT Performance

1 code implementation • CVPR 2022 • Charles Herrmann, Kyle Sargent, Lu Jiang, Ramin Zabih, Huiwen Chang, Ce Liu, Dilip Krishnan, Deqing Sun

In this work, we present pyramid adversarial training (PyramidAT), a simple and effective technique to improve ViT's overall performance.

Ranked #9 on Domain Generalization on ImageNet-C (using extra training data)

Adversarial Attack Data Augmentation +2

2,990

Paper
Code

Florence: A New Foundation Model for Computer Vision

1 code implementation • 22 Nov 2021 • Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, JianFeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

Computer vision foundation models, which are trained on diverse, large-scale dataset and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applications.

Ranked #1 on Action Recognition In Videos on Kinetics-600

Action Classification Action Recognition In Videos +12

367

Paper
Code

Neural-PIL: Neural Pre-Integrated Lighting for Reflectance Decomposition

1 code implementation • NeurIPS 2021 • Mark Boss, Varun Jampani, Raphael Braun, Ce Liu, Jonathan T. Barron, Hendrik P. A. Lensch

Decomposing a scene into its shape, reflectance and illumination is a fundamental problem in computer vision and graphics.

Novel View Synthesis

Paper
Code

SLIDE: Single Image 3D Photography with Soft Layering and Depth-aware Inpainting

no code implementations • ICCV 2021 • Varun Jampani, Huiwen Chang, Kyle Sargent, Abhishek Kar, Richard Tucker, Michael Krainin, Dominik Kaeser, William T. Freeman, David Salesin, Brian Curless, Ce Liu

We present SLIDE, a modular and unified system for single image 3D photography that uses a simple yet effective soft layering strategy to better preserve appearance details in novel views.

Image Matting

Paper
Add Code

ViTGAN: Training GANs with Vision Transformers

3 code implementations • ICLR 2022 • Kwonjoon Lee, Huiwen Chang, Lu Jiang, Han Zhang, Zhuowen Tu, Ce Liu

Recently, Vision Transformers (ViTs) have shown competitive performance on image recognition while requiring less vision-specific inductive biases.

Ranked #68 on Image Generation on CIFAR-10

Image Generation

506

Paper
Code

OCONet: Image Extrapolation by Object Completion

no code implementations • CVPR 2021 • Richard Strong Bowen, Huiwen Chang, Charles Herrmann, Piotr Teterwak, Ce Liu, Ramin Zabih

Existing methods struggle to extrapolate images with salient objects in the foreground or are limited to very specific objects such as humans, but tend to work well on indoor/outdoor scenes.

Object

Paper
Add Code

LASR: Learning Articulated Shape Reconstruction from a Monocular Video

1 code implementation • CVPR 2021 • Gengshan Yang, Deqing Sun, Varun Jampani, Daniel Vlasic, Forrester Cole, Huiwen Chang, Deva Ramanan, William T. Freeman, Ce Liu

Remarkable progress has been made in 3D reconstruction of rigid structures from a video or a collection of images.

3D Shape Reconstruction from Videos Object

166

Paper
Code

COMISR: Compression-Informed Video Super-Resolution

2 code implementations • ICCV 2021 • Yinxiao Li, Pengchong Jin, Feng Yang, Ce Liu, Ming-Hsuan Yang, Peyman Milanfar

Most video super-resolution methods focus on restoring high-resolution video frames from low-resolution videos without taking into account compression.

Ranked #6 on Video Super-Resolution on MSU Super-Resolution for Video Compression

Video Super-Resolution

32,798

Paper
Code

AutoFlow: Learning a Better Training Set for Optical Flow

1 code implementation • CVPR 2021 • Deqing Sun, Daniel Vlasic, Charles Herrmann, Varun Jampani, Michael Krainin, Huiwen Chang, Ramin Zabih, William T. Freeman, Ce Liu

Synthetic datasets play a critical role in pre-training CNN models for optical flow, but they are painstaking to generate and hard to adapt to new applications.

Optical Flow Estimation

115

Paper
Code

Deep 3D-to-2D Watermarking: Embedding Messages in 3D Meshes and Extracting Them from 2D Renderings

no code implementations • CVPR 2022 • Innfarn Yoo, Huiwen Chang, Xiyang Luo, Ondrej Stava, Ce Liu, Peyman Milanfar, Feng Yang

Digital watermarking is widely used for copyright protection.

Paper
Add Code

Regularizing Generative Adversarial Networks under Limited Data

1 code implementation • CVPR 2021 • Hung-Yu Tseng, Lu Jiang, Ce Liu, Ming-Hsuan Yang, Weilong Yang

Recent years have witnessed the rapid progress of generative adversarial networks (GANs).

Ranked #1 on Image Generation on CIFAR-100

Data Augmentation Image Generation

163

Paper
Code

NeRD: Neural Reflectance Decomposition from Image Collections

1 code implementation • ICCV 2021 • Mark Boss, Raphael Braun, Varun Jampani, Jonathan T. Barron, Ce Liu, Hendrik P. A. Lensch

This problem is inherently more challenging when the illumination is not a single light source under laboratory conditions but is instead an unconstrained environmental illumination.

Ranked #5 on Image Relighting on Stanford-ORB

Depth Prediction Image Relighting +3

243

Paper
Code

Robust image stitching with multiple registrations

no code implementations • ECCV 2018 • Charles Herrmann, Chen Wang, Richard Strong Bowen, Emil Keyder, Michael Krainin, Ce Liu, Ramin Zabih

Here, we observe that the use of a single registration often leads to errors, especially in scenes with significant depth variation or object motion.

Image Stitching

Paper
Add Code

Stability in Repeated Matching Markets

no code implementations • 7 Jul 2020 • Ce Liu

This paper develops a framework for repeated matching markets.

Paper
Add Code

Supervised Contrastive Learning

23 code implementations • NeurIPS 2020 • Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, Dilip Krishnan

Contrastive learning applied to self-supervised representation learning has seen a resurgence in recent years, leading to state of the art performance in the unsupervised training of deep image models.

Ranked #2 on Class Incremental Learning on cifar100

Class Incremental Learning Contrastive Learning +4

32,793

Paper
Code

Microwave Photonic Imaging Radar with a Millimeter-level Resolution

no code implementations • 9 Apr 2020 • Cong Ma, Yue Yang, Ce Liu, Beichen Fan, Xingwei Ye, Yamei Zhang, Xiangchuan Wang, Shilong Pan

Microwave photonic radars enable fast or even real-time high-resolution imaging thanks to its broad bandwidth.

Paper
Add Code

Depth Extraction from Video Using Non-parametric Sampling

no code implementations • 24 Dec 2019 • Kevin Karsch, Ce Liu, Sing Bing Kang

We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling.

Depth Estimation Optical Flow Estimation

Paper
Add Code

DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling

no code implementations • 24 Dec 2019 • Kevin Karsch, Ce Liu, Sing Bing Kang

We describe a technique that automatically generates plausible depth maps from videos using non-parametric depth sampling.

Depth Estimation Optical Flow Estimation

Paper
Add Code

Boundless: Generative Adversarial Networks for Image Extension

no code implementations • ICCV 2019 • Piotr Teterwak, Aaron Sarna, Dilip Krishnan, Aaron Maschinot, David Belanger, Ce Liu, William T. Freeman

Image extension models have broad applications in image editing, computational photography and computer graphics.

Ranked #2 on Uncropping on Places2 val

Generative Adversarial Network Image Inpainting +1

Paper
Add Code

Learning the Depths of Moving People by Watching Frozen People

no code implementations • CVPR 2019 • Zhengqi Li, Tali Dekel, Forrester Cole, Richard Tucker, Noah Snavely, Ce Liu, William T. Freeman

We present a method for predicting dense depth in scenarios where both a monocular camera and people in the scene are freely moving.

Depth Estimation Depth Prediction

Paper
Add Code

Sparse, Smart Contours to Represent and Edit Images

no code implementations • CVPR 2018 • Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, William T. Freeman

We study the problem of reconstructing an image from information stored at contour locations.

Face Recognition Image Manipulation

Paper
Add Code

Smart, Sparse Contours to Represent and Edit Images

no code implementations • 21 Dec 2017 • Tali Dekel, Chuang Gan, Dilip Krishnan, Ce Liu, William T. Freeman

We study the problem of reconstructing an image from information stored at contour locations.

Face Recognition Image Manipulation

Paper
Add Code

On the Effectiveness of Visible Watermarks

no code implementations • CVPR 2017 • Tali Dekel, Michael Rubinstein, Ce Liu, William T. Freeman

Since such an attack relies on the consistency of watermarks across image collection, we explore and evaluate how it is affected by various types of inconsistencies in the watermark embedding that could potentially be used to make watermarking more secured.

Image Matting

Paper
Add Code

Deep Convolutional Neural Network for Image Deconvolution

no code implementations • NeurIPS 2014 • Li Xu, Jimmy SJ. Ren, Ce Liu, Jiaya Jia

Many fundamental image-related problems involve deconvolution operators.

Ranked #1 on Image Compression on FER2013

Image Compression Image Deconvolution

Paper
Add Code

Local Layering for Joint Motion Estimation and Occlusion Detection

no code implementations • CVPR 2014 • Deqing Sun, Ce Liu, Hanspeter Pfister

To handle such situations, we propose a local layering model where motion and occlusion relationships are inferred jointly.

Motion Estimation Optical Flow Estimation

Paper
Add Code

A Compositional Model for Low-Dimensional Image Set Representation

no code implementations • CVPR 2014 • Hossein Mobahi, Ce Liu, William T. Freeman

Learning a low-dimensional representation of images is useful for various applications in graphics and computer vision.

Paper
Add Code

Unsupervised Joint Object Discovery and Segmentation in Internet Images

no code implementations • CVPR 2013 • Michael Rubinstein, Armand Joulin, Johannes Kopf, Ce Liu

In contrast to previous co-segmentation methods, our algorithm performs well even in the presence of significant amounts of noise images (images not containing a common object), as typical for datasets collected from Internet search.

Object Object Discovery +1

Paper
Add Code

Probabilistic Label Trees for Efficient Large Scale Image Classification

no code implementations • CVPR 2013 • Baoyuan Liu, Fereshteh Sadeghi, Marshall Tappen, Ohad Shamir, Ce Liu

Large-scale recognition problems with thousands of classes pose a particular challenge because applying the classifier requires more computation as the number of classes grows.

Classification General Classification +1

Paper
Add Code

Deformable Spatial Pyramid Matching for Fast Dense Correspondences

no code implementations • CVPR 2013 • Jaechul Kim, Ce Liu, Fei Sha, Kristen Grauman

We introduce a fast deformable spatial pyramid (DSP) matching algorithm for computing dense pixel correspondences.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.