Search Results for author: Jason Kuen

Found 20 papers, 7 papers with code

Unified Pretraining Framework for Document Understanding

no code implementations 22 Apr 2022 Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Nikolaos Barmpalios, Rajiv Jain, Ani Nenkova, Tong Sun

Document intelligence automates the extraction of information from documents and supports many business applications.

Self-Supervised Learning

GALA: Toward Geometry-and-Lighting-Aware Object Search for Compositing

no code implementations 31 Mar 2022 Sijie Zhu, Zhe Lin, Scott Cohen, Jason Kuen, Zhifei Zhang, Chen Chen

To move a step further, this paper proposes GALA (Geometry-and-Lighting-Aware), a generic foreground object search method with discriminative modeling on geometry and lighting compatibility for open-world image compositing.

CaSP: Class-agnostic Semi-Supervised Pretraining for Detection and Segmentation

1 code implementation 9 Dec 2021 Lu Qi, Jason Kuen, Zhe Lin, Jiuxiang Gu, Fengyun Rao, Dian Li, Weidong Guo, Zhen Wen, Jiaya Jia

To this end, we propose a novel Class-agnostic Semi-supervised Pretraining (CaSP) framework to achieve a more favorable task-specificity balance in extracting training signals from unlabeled data.

Object Detection

High Quality Segmentation for Ultra High-resolution Images

1 code implementation 29 Nov 2021 Tiancheng Shen, Yuechen Zhang, Lu Qi, Jason Kuen, Xingyu Xie, Jianlong Wu, Zhe Lin, Jiaya Jia

Segmenting 4K or 6K ultra high-resolution images requires extra computational consideration.

Semantic Segmentation

Open-Vocabulary Instance Segmentation via Robust Cross-Modal Pseudo-Labeling

no code implementations 24 Nov 2021 Dat Huynh, Jason Kuen, Zhe Lin, Jiuxiang Gu, Ehsan Elhamifar

To address this, we propose a cross-modal pseudo-labeling framework, which generates training pseudo masks by aligning word semantics in captions with visual features of object masks in images.

Instance Segmentation, Semantic Segmentation

Open-World Entity Segmentation

2 code implementations 29 Jul 2021 Lu Qi, Jason Kuen, Yi Wang, Jiuxiang Gu, Hengshuang Zhao, Zhe Lin, Philip Torr, Jiaya Jia

By removing the need for class label prediction, models trained for this task can focus more on improving segmentation quality.

Image Manipulation, Panoptic Segmentation

SelfDoc: Self-Supervised Document Representation Learning

no code implementations CVPR 2021 Peizhao Li, Jiuxiang Gu, Jason Kuen, Vlad I. Morariu, Handong Zhao, Rajiv Jain, Varun Manjunatha, Hongfu Liu

For downstream usage, we propose a novel modality-adaptive attention mechanism for multimodal feature fusion by adaptively emphasizing language and vision signals.

Representation Learning

Multimodal Contrastive Training for Visual Representation Learning

no code implementations CVPR 2021 Xin Yuan, Zhe Lin, Jason Kuen, Jianming Zhang, Yilin Wang, Michael Maire, Ajinkya Kale, Baldo Faieta

We first train our model on COCO and evaluate the learned visual representations on various downstream tasks including image classification, object detection, and instance segmentation.

Cross-Modal Retrieval, Image Classification, +4

Self-Supervised Relationship Probing

no code implementations NeurIPS 2020 Jiuxiang Gu, Jason Kuen, Shafiq Joty, Jianfei Cai, Vlad Morariu, Handong Zhao, Tong Sun

Structured representations of images that model visual relationships are beneficial for many vision and vision-language applications.

Contrastive Learning, Language Modelling, +1

Scaling Object Detection by Transferring Classification Weights

1 code implementation ICCV 2019 Jason Kuen, Federico Perazzi, Zhe Lin, Jianming Zhang, Yap-Peng Tan

Large-scale object detection datasets are constantly growing in both the number of classes and the number of annotations.

General Classification, Object Detection

Motion-Guided Cascaded Refinement Network for Video Object Segmentation

no code implementations CVPR 2018 Ping Hu, Gang Wang, Xiangfei Kong, Jason Kuen, Yap-Peng Tan

Then, the proposed Cascaded Refinement Network (CRN) takes the coarse segmentation as guidance to generate an accurate segmentation at full resolution.

Frame, Optical Flow Estimation, +3

Dual Attention Matching Network for Context-Aware Feature Sequence based Person Re-Identification

no code implementations CVPR 2018 Jianlou Si, Honggang Zhang, Chun-Guang Li, Jason Kuen, Xiangfei Kong, Alex C. Kot, Gang Wang

Typical person re-identification (ReID) methods usually describe each pedestrian with a single feature vector and match them in a task-specific metric space.

Person Re-Identification

DelugeNets: Deep Networks with Efficient and Flexible Cross-layer Information Inflows

1 code implementation 17 Nov 2016 Jason Kuen, Xiangfei Kong, Gang Wang, Yap-Peng Tan

Deluge Networks (DelugeNets) are deep neural networks which efficiently facilitate massive cross-layer information inflows from preceding layers to succeeding layers.

General Classification

Recurrent Attentional Networks for Saliency Detection

no code implementations CVPR 2016 Jason Kuen, Zhenhua Wang, Gang Wang

Convolutional-deconvolution networks can be adopted to perform end-to-end saliency detection.

Saliency Detection

Recent Advances in Convolutional Neural Networks

no code implementations 22 Dec 2015 Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Li Wang, Gang Wang, Jianfei Cai, Tsuhan Chen

In the last few years, deep learning has led to very good performance on a variety of problems, such as visual recognition, speech recognition and natural language processing.

Speech Recognition
