Search Results for author: Jason Kuen

Found 32 papers, 11 papers with code

ControlVAR: Exploring Controllable Visual Autoregressive Modeling

no code implementations 14 Jun 2024 Xiang Li, Kai Qiu, Hao Chen, Jason Kuen, Zhe Lin, Rita Singh, Bhiksha Raj

Conditional visual generation has witnessed remarkable progress with the advent of diffusion models (DMs), especially in tasks like control-to-image generation.

Image Generation

SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

no code implementations 18 Apr 2024 Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang

Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation, marking a significant milestone towards high-quality open-world entity segmentation in the absence of human-annotated masks.

Segmentation

SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis

no code implementations 6 Nov 2023 Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian Price, Dan Xu

On the highly competitive ADE20K and COCO benchmarks, our data generation method markedly improves the performance of state-of-the-art segmentation models in semantic segmentation, panoptic segmentation, and instance segmentation.

Diversity Image Generation +4

AIMS: All-Inclusive Multi-Level Segmentation

1 code implementation 28 May 2023 Lu Qi, Jason Kuen, Weidong Guo, Jiuxiang Gu, Zhe Lin, Bo Du, Yu Xu, Ming-Hsuan Yang

Despite the progress of image segmentation for accurate visual entity segmentation, completing the diverse requirements of image editing applications for different-level region-of-interest selections remains unsolved.

Image Segmentation Segmentation +1

TopNet: Transformer-based Object Placement Network for Image Compositing

1 code implementation CVPR 2023 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

Given a background image and a segmented object, the goal is to train a model to predict plausible placements (location and scale) of the object for compositing.

Object

High Quality Entity Segmentation

no code implementations ICCV 2023 Lu Qi, Jason Kuen, Tiancheng Shen, Jiuxiang Gu, Wenbo Li, Weidong Guo, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

Given the high-quality and high-resolution nature of the dataset, we propose CropFormer, which is designed to tackle the intractability of instance-level segmentation on high-resolution images.

Image Segmentation Segmentation +1

SceneComposer: Any-Level Semantic Image Synthesis

no code implementations CVPR 2023 Yu Zeng, Zhe Lin, Jianming Zhang, Qing Liu, John Collomosse, Jason Kuen, Vishal M. Patel

We propose a new framework for conditional image synthesis from semantic layouts of any precision levels, ranging from pure text to a 2D semantic canvas with precise shapes.

Image Generation

High-Quality Entity Segmentation

1 code implementation 10 Nov 2022 Lu Qi, Jason Kuen, Weidong Guo, Tiancheng Shen, Jiuxiang Gu, Jiaya Jia, Zhe Lin, Ming-Hsuan Yang

It improves mask prediction by fusing high-res image crops that provide more fine-grained image details and the full image.

Image Segmentation Segmentation +2

Improving the Reliability for Confidence Estimation

no code implementations 13 Oct 2022 Haoxuan Qu, Yanchao Li, Lin Geng Foo, Jason Kuen, Jiuxiang Gu, Jun Liu

Confidence estimation, a task that aims to evaluate the trustworthiness of the model's prediction output during deployment, has received lots of research attention recently, due to its importance for the safe deployment of deep models.

Image Classification Meta-Learning +1

Text-to-Image Generation via Implicit Visual Guidance and Hypernetwork

no code implementations 17 Aug 2022 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, John Collomosse

We develop an approach for text-to-image generation that embraces additional retrieval images, driven by a combination of implicit visual guidance loss and generative objectives.

Diversity Retrieval +1

Meta Spatio-Temporal Debiasing for Video Scene Graph Generation

no code implementations 23 Jul 2022 Li Xu, Haoxuan Qu, Jason Kuen, Jiuxiang Gu, Jun Liu

Video scene graph generation (VidSGG) aims to parse the video content into scene graphs, which involves modeling the spatio-temporal contextual information in the video.

Graph Generation Meta-Learning +2

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing

no code implementations 31 Mar 2022 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

To move a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling on geometry and lighting compatibility for open-world image compositing.

Object

CA-SSL: Class-Agnostic Semi-Supervised Learning for Detection and Segmentation

1 code implementation 9 Dec 2021 Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Ming-Hsuan Yang, Jiaya Jia

To improve instance-level detection/segmentation performance, existing self-supervised and semi-supervised methods extract either task-unrelated or task-specific training signals from unlabeled data.

Object Detection +2

Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

1 code implementation CVPR 2022 Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar

To address this, we propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.

Instance Segmentation Semantic Segmentation

Open-World Entity Segmentation

2 code implementations 29 Jul 2021 Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia

By removing the need for class label prediction, the models trained for such a task can focus more on improving segmentation quality.

Image Manipulation Image Segmentation +2

SelfDoc: Self-Supervised Document Representation Learning

no code implementations CVPR 2021 Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu

For downstream usage, we propose a novel modality-adaptive attention mechanism for multimodal feature fusion by adaptively emphasizing language and vision signals.

Representation Learning

Multimodal Contrastive Training for Visual Representation Learning

no code implementations CVPR 2021 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.

Cross-Modal Retrieval Image Classification +6

Self-Supervised Relationship Probing

no code implementations NeurIPS 2020 Jiuxiang Gu, Jason Kuen, Shafiq Joty, Jianfei Cai, Vlad Morariu, Handong Zhao, Tong Sun

Structured representations of images that model visual relationships are beneficial for many vision and vision-language applications.

Contrastive Learning Language Modelling +1

Scaling Object Detection by Transferring Classification Weights

1 code implementation ICCV 2019 Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

Large scale object detection datasets are constantly increasing their size in terms of the number of classes and annotations count.

Classification General Classification +3

Motion-Guided Cascaded Refinement Network for Video Object Segmentation

no code implementations CVPR 2018 Ping Hu, Gang Wang, Xiangfei Kong, Jason Kuen, Yap-Peng Tan

Then, the proposed Cascaded Refinement Network (CRN) takes the coarse segmentation as guidance to generate an accurate segmentation of full resolution.

Object Optical Flow Estimation +4

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

no code implementations CVPR 2018 Jianlou Si, Honggang Zhang, Chun-Guang Li, Jason Kuen, Xiangfei Kong, Alex C. Kot, Gang Wang

Typical person re-identification (ReID) methods usually describe each pedestrian with a single feature vector and match them in a task-specific metric space.

Person Re-Identification

DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows

1 code implementation 17 Nov 2016 Jason Kuen, Xiangfei Kong, Gang Wang, Yap-Peng Tan

Deluge Networks (DelugeNets) are deep neural networks which efficiently facilitate massive cross-layer information inflows from preceding layers to succeeding layers.

General Classification

Recurrent Attentional Networks for Saliency Detection

no code implementations CVPR 2016 Jason Kuen, Zhenhua Wang, Gang Wang

Convolutional-deconvolution networks can be adopted to perform end-to-end saliency detection.

Saliency Detection

Recent Advances in Convolutional Neural Networks

no code implementations 22 Dec 2015 Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Li Wang, Gang Wang, Jianfei Cai, Tsuhan Chen

In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing.

Speech Recognition
