no code implementations • EMNLP 2021 • Arjun Akula, Soravit Changpinyo, Boqing Gong, Piyush Sharma, Song-Chun Zhu, Radu Soricut
One challenge in evaluating visual question answering (VQA) models in the cross-dataset adaptation setting is that the distribution shifts are multi-modal, making it difficult to identify if it is the shifts in visual or language features that play a key role.
no code implementations • NeurIPS 2018 • Hexiang Hu, Liyu Chen, Boqing Gong, Fei Sha
The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence.
1 code implementation • 27 Jun 2024 • Ruochen Wang, Ting Liu, Cho-Jui Hsieh, Boqing Gong
Two major challenges arise in efficiently finding a solution to this problem: (1) Enormous Domain Space: Setting the domain to the entire language space poses significant difficulty to the optimization process.
no code implementations • 5 Jun 2024 • Yuanhao Ban, Ruochen Wang, Tianyi Zhou, Minhao Cheng, Boqing Gong, Cho-Jui Hsieh
The concept of negative prompts, emerging from conditional generation models like Stable Diffusion, allows users to specify what to exclude from the generated images.
no code implementations • 4 Jun 2024 • Yuanhao Ban, Ruochen Wang, Tianyi Zhou, Boqing Gong, Cho-Jui Hsieh, Minhao Cheng
We identify these patches by analyzing the dispersion of object bounding boxes across generated images, leading to the development of a posterior analysis technique.
1 code implementation • 26 May 2024 • Minseon Kim, Hyomin Lee, Boqing Gong, Huishuai Zhang, Sung Ju Hwang
Recent AI systems have shown extremely powerful performance, even surpassing human performance, on various tasks such as information retrieval, language generation, and image generation based on large language models (LLMs).
1 code implementation • 20 May 2024 • Zheyuan Zhang, Elif Keles, Gorkem Durak, Yavuz Taktak, Onkar Susladkar, Vandan Gorade, Debesh Jha, Asli C. Ormeci, Alpay Medetalibeyoglu, Lanhong Yao, Bin Wang, Ilkin Sevgi Isler, Linkai Peng, Hongyi Pan, Camila Lopes Vendrami, Amir Bourhani, Yury Velichko, Boqing Gong, Concetto Spampinato, Ayis Pyrros, Pallavi Tiwari, Derk C. F. Klatte, Megan Engels, Sanne Hoogenboom, Candice W. Bolan, Emil Agarunov, Nassier Harfouch, Chenchan Huang, Marco J. Bruno, Ivo Schoots, Rajesh N. Keswani, Frank H. Miller, Tamas Gonda, Cemal Yazici, Temel Tirkes, Baris Turkbey, Michael B. Wallace, Ulas Bagci
We also collected CT scans of 1,350 patients from publicly available sources for benchmarking purposes.
no code implementations • 20 Feb 2024 • Long Zhao, Nitesh B. Gundavarapu, Liangzhe Yuan, Hao Zhou, Shen Yan, Jennifer J. Sun, Luke Friedman, Rui Qian, Tobias Weyand, Yue Zhao, Rachel Hornung, Florian Schroff, Ming-Hsuan Yang, David A. Ross, Huisheng Wang, Hartwig Adam, Mikhail Sirotenko, Ting Liu, Boqing Gong
We introduce VideoPrism, a general-purpose video encoder that tackles diverse video understanding tasks with a single frozen model.
no code implementations • CVPR 2024 • Yue Zhao, Long Zhao, Xingyi Zhou, Jialin Wu, Chun-Te Chu, Hui Miao, Florian Schroff, Hartwig Adam, Ting Liu, Boqing Gong, Philipp Krähenbühl, Liangzhe Yuan
Our best model outperforms state-of-the-art methods on MSR-VTT zero-shot text-to-video retrieval by 6%.
no code implementations • CVPR 2024 • Hexiang Hu, Kelvin C. K. Chan, Yu-Chuan Su, Wenhu Chen, Yandong Li, Kihyuk Sohn, Yang Zhao, Xue Ben, Boqing Gong, William Cohen, Ming-Wei Chang, Xuhui Jia
We introduce *multi-modal instruction* for image generation, a task representation articulating a range of generation intents with precision.
no code implementations • 10 Nov 2023 • Calvin Luo, Boqing Gong, Ting Chen, Chen Sun
Motivated by the recent success of multi-task transformers for visual recognition and language understanding, we propose a unified neural architecture for visual recognition and reasoning with a generic interface (e.g., tokens) for both.
1 code implementation • 9 Oct 2023 • Lijun Yu, José Lezama, Nitesh B. Gundavarapu, Luca Versari, Kihyuk Sohn, David Minnen, Yong Cheng, Vighnesh Birodkar, Agrim Gupta, Xiuye Gu, Alexander G. Hauptmann, Boqing Gong, Ming-Hsuan Yang, Irfan Essa, David A. Ross, Lu Jiang
While Large Language Models (LLMs) are the dominant models for generative tasks in language, they do not perform as well as diffusion models on image and video generation.
Ranked #2 on Video Prediction on Kinetics-600 12 frames, 64x64
1 code implementation • NeurIPS 2023 • Meng Liu, Mingda Zhang, Jialu Liu, Hanjun Dai, Ming-Hsuan Yang, Shuiwang Ji, Zheyun Feng, Boqing Gong
In this paper, we present a novel problem, namely video timeline modeling.
no code implementations • 23 Sep 2023 • Yifan Ding, Liqiang Wang, Boqing Gong
Domain adaptation, which aims to transfer knowledge between domains, has been well studied in many areas such as image classification and object detection.
1 code implementation • 6 Jul 2023 • Liangzhe Yuan, Nitesh Bharadwaj Gundavarapu, Long Zhao, Hao Zhou, Yin Cui, Lu Jiang, Xuan Yang, Menglin Jia, Tobias Weyand, Luke Friedman, Mikhail Sirotenko, Huisheng Wang, Florian Schroff, Hartwig Adam, Ming-Hsuan Yang, Ting Liu, Boqing Gong
We evaluate the video understanding capabilities of existing foundation models using a carefully designed experiment protocol consisting of three hallmark tasks (action recognition, temporal localization, and spatiotemporal localization), eight datasets well received by the community, and four adaptation methods tailoring a foundation model (FM) for a downstream task.
no code implementations • 16 Apr 2023 • Hong-You Chen, Jike Zhong, Mingda Zhang, Xuhui Jia, Hang Qi, Boqing Gong, Wei-Lun Chao, Li Zhang
FedBasis learns a small set of shareable "basis" models, which can be linearly combined to form personalized models for clients.
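The linear combination mentioned in this excerpt can be illustrated with a minimal sketch. It is not the FedBasis implementation: the flat parameter lists, the function name `combine_basis`, and the example coefficients are all hypothetical, and how the coefficients are learned is not covered by the excerpt.

```python
def combine_basis(basis_models, coeffs):
    """Linearly combine basis parameter vectors using per-client coefficients.

    basis_models: list of equally sized parameter lists (hypothetical layout).
    coeffs: one weight per basis model for a given client (hypothetical values).
    """
    if len(basis_models) != len(coeffs):
        raise ValueError("need exactly one coefficient per basis model")
    dim = len(basis_models[0])
    return [
        sum(c * model[i] for c, model in zip(coeffs, basis_models))
        for i in range(dim)
    ]


# Two toy basis "models", each with three parameters.
basis = [[1.0, 0.0, 2.0], [0.0, 4.0, 2.0]]
client_coeffs = [0.25, 0.75]
personalized = combine_basis(basis, client_coeffs)
```

Under this reading, only the short coefficient vector differs between clients, while the basis models themselves are shared.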
no code implementations • 14 Apr 2023 • Yu-Chuan Su, Kelvin C. K. Chan, Yandong Li, Yang Zhao, Han Zhang, Boqing Gong, Huisheng Wang, Xuhui Jia
Our approach greatly reduces the overhead for personalized image generation and is more applicable in many potential applications.
1 code implementation • 5 Apr 2023 • Zheyuan Zhang, Bin Wang, Lanhong Yao, Ugur Demir, Debesh Jha, Ismail Baris Turkbey, Boqing Gong, Ulas Bagci
In real-world scenarios, however, it is common for models to encounter data from new and different domains to which they were not exposed during training.
no code implementations • 5 Apr 2023 • Xuhui Jia, Yang Zhao, Kelvin C. K. Chan, Yandong Li, Han Zhang, Boqing Gong, Tingbo Hou, Huisheng Wang, Yu-Chuan Su
This paper proposes a method for generating images of customized objects specified by users.
no code implementations • 28 Mar 2023 • Yuanhao Xiong, Long Zhao, Boqing Gong, Ming-Hsuan Yang, Florian Schroff, Ting Liu, Cho-Jui Hsieh, Liangzhe Yuan
Existing video-language pre-training methods primarily focus on instance-level alignment between video clips and captions via global contrastive learning but neglect rich fine-grained local information in both videos and text, which is of importance to downstream tasks requiring temporal localization and semantic reasoning.
1 code implementation • ICCV 2023 • Long Zhao, Liangzhe Yuan, Boqing Gong, Yin Cui, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu
To address this challenge, we propose UniVRD, a novel bottom-up method for Unified Visual Relationship Detection by leveraging vision and language models (VLMs).
Human-Object Interaction Detection, Relationship Detection
1 code implementation • CVPR 2023 • Dongdong Wang, Boqing Gong, Liqiang Wang
Then, we study popular existing calibration methods and compare them with selective scaling on semantic segmentation calibration.
no code implementations • 14 Oct 2022 • Minghua Liu, Yin Zhou, Charles R. Qi, Boqing Gong, Hao Su, Dragomir Anguelov
Our method co-designs an efficient labeling process with semi/weakly supervised learning and is applicable to nearly any 3D semantic segmentation backbone.
no code implementations • 17 Aug 2022 • Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
A practical recognition system must balance between majority (head) and minority (tail) classes, generalize across the distribution, and acknowledge novelty upon the instances of unseen classes (open classes).
1 code implementation • 11 Apr 2022 • Amil Dravid, Florian Schiffers, Boqing Gong, Aggelos K. Katsaggelos
Despite the surge of deep learning in the past decade, some users are skeptical about deploying these models in practice due to their black-box nature.
2 code implementations • ICLR 2022 • Juntang Zhuang, Boqing Gong, Liangzhe Yuan, Yin Cui, Hartwig Adam, Nicha Dvornek, Sekhar Tatikonda, James Duncan, Ting Liu
Instead, we define a surrogate gap, a measure equivalent to the dominant eigenvalue of the Hessian at a local minimum when the radius of the neighborhood (to derive the perturbed loss) is small.
no code implementations • 14 Dec 2021 • Qing Li, Boqing Gong, Yin Cui, Dan Kondratyuk, Xianzhi Du, Ming-Hsuan Yang, Matthew Brown
The experiments show that the resultant unified foundation transformer works surprisingly well on both the vision-only and text-only tasks, and the proposed knowledge distillation and gradient masking strategy can effectively lift the performance to approach the level of separately-trained models.
1 code implementation • CVPR 2022 • Liangzhe Yuan, Rui Qian, Yin Cui, Boqing Gong, Florian Schroff, Ming-Hsuan Yang, Hartwig Adam, Ting Liu
Modern self-supervised learning algorithms typically enforce persistency of instance representations across views.
no code implementations • 8 Dec 2021 • Rui Qian, Yeqing Li, Liangzhe Yuan, Boqing Gong, Ting Liu, Matthew Brown, Serge Belongie, Ming-Hsuan Yang, Hartwig Adam, Yin Cui
The training objective consists of two parts: a fine-grained temporal learning objective to maximize the similarity between corresponding temporal embeddings in the short clip and the long clip, and a persistent temporal learning objective to pull together global embeddings of the two clips.
1 code implementation • 18 Sep 2021 • Zihang Zou, Boqing Gong, Liqiang Wang
We study protecting a user's data (images in this work) against a learner's unauthorized use in training neural networks.
no code implementations • 17 Aug 2021 • Chun-Han Yao, Boqing Gong, Yin Cui, Hang Qi, Yukun Zhu, Ming-Hsuan Yang
We further take the server-client and inter-client domain shifts into account and pose a domain adaptation problem with one source (centralized server data) and multiple targets (distributed client data).
1 code implementation • NeurIPS 2021 • Tai-Yu Pan, Cheng Zhang, Yandong Li, Hexiang Hu, Dong Xuan, Soravit Changpinyo, Boqing Gong, Wei-Lun Chao
We propose NorCal, Normalized Calibration for long-tailed object detection and instance segmentation, a simple and straightforward recipe that reweighs the predicted scores of each class by its training sample size.
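One way to read "reweighs the predicted scores of each class by its training sample size" is sketched below. This is an illustration under stated assumptions, not the NorCal code: the exponent `gamma`, the choice to leave the background score unscaled, and the function name `calibrate_scores` are assumptions that the excerpt does not specify.

```python
import math


def calibrate_scores(logits, class_counts, gamma=0.5):
    """Down-weight each foreground class score by (sample count ** gamma).

    logits: foreground logits followed by one background logit (assumed layout).
    class_counts: training sample count for each foreground class.
    gamma: assumed strength of the reweighting; 0 disables it.
    """
    foreground = [
        math.exp(z) / (n ** gamma) for z, n in zip(logits[:-1], class_counts)
    ]
    background = math.exp(logits[-1])  # assumption: background is left as-is
    total = sum(foreground) + background
    return [s / total for s in foreground] + [background / total]


# A frequent class (10,000 samples) and a rare class (100 samples), equal logits.
scores = calibrate_scores([0.0, 0.0, 0.0], class_counts=[10000, 100])
```

With equal raw logits, the rare class ends up with the higher calibrated score, which is the intended effect of counteracting the bias toward frequent classes.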
no code implementations • 18 Jun 2021 • Marco Fornoni, Chaochao Yan, Liangchen Luo, Kimberly Wilber, Alex Stark, Yin Cui, Boqing Gong, Andrew Howard
When interacting with objects through cameras, or pictures, users often have a specific intent.
2 code implementations • ICLR 2022 • Xiangning Chen, Cho-Jui Hsieh, Boqing Gong
Vision Transformers (ViTs) and MLPs signal further efforts on replacing hand-wired features or inductive biases with general-purpose neural architectures.
no code implementations • CVPR 2021 • Xinjie Fan, Qifei Wang, Junjie Ke, Feng Yang, Boqing Gong, Mingyuan Zhou
As a generic tool, the improvement introduced by ASR-Norm is agnostic to the choice of ADA methods.
1 code implementation • 26 Apr 2021 • Yu-Chuan Su, Soravit Changpinyo, Xiangning Chen, Sathish Thoppay, Cho-Jui Hsieh, Lior Shapira, Radu Soricut, Hartwig Adam, Matthew Brown, Ming-Hsuan Yang, Boqing Gong
To enable progress on this task, we create a new dataset consisting of 220k human-annotated 2.5D relationships among 512K objects from 11K images.
2 code implementations • NeurIPS 2021 • Hassan Akbari, Liangzhe Yuan, Rui Qian, Wei-Hong Chuang, Shih-Fu Chang, Yin Cui, Boqing Gong
We train VATT end-to-end from scratch using multimodal contrastive losses and evaluate its performance by the downstream tasks of video action recognition, audio event classification, image classification, and text-to-video retrieval.
Ranked #3 on Zero-Shot Video Retrieval on YouCook2 (text-to-video Mean Rank metric)
3 code implementations • 12 Apr 2021 • Ahmet Iscen, André Araujo, Boqing Gong, Cordelia Schmid
An effective and simple approach to long-tailed visual recognition is to learn feature representations and a classifier separately, with instance and class-balanced sampling, respectively.
Ranked #13 on Long-tail Learning on iNaturalist 2018
1 code implementation • CVPR 2021 • Xiangning Chen, Cihang Xie, Mingxing Tan, Li Zhang, Cho-Jui Hsieh, Boqing Gong
Data augmentation has become a de facto component for training high-performance deep image classifiers, but its potential is under-explored for object detection.
Ranked #17 on Object Detection on COCO-O
3 code implementations • CVPR 2021 • Dan Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew Brown, Boqing Gong
We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference.
Ranked #3 on Action Classification on Charades
1 code implementation • ICCV 2021 • Cheng Zhang, Tai-Yu Pan, Yandong Li, Hexiang Hu, Dong Xuan, Soravit Changpinyo, Boqing Gong, Wei-Lun Chao
Many objects do not appear frequently enough in complex scenes (e.g., certain handbags in living rooms) for training an accurate object detector, but are often found frequently by themselves (e.g., in product images).
no code implementations • 14 Feb 2021 • Jaewoong Shin, Hae Beom Lee, Boqing Gong, Sung Ju Hwang
Meta-learning of shared initialization parameters has been shown to be highly effective in solving few-shot learning tasks.
no code implementations • ICCV 2021 • Xiangyun Zhao, Raviteja Vemulapalli, Philip Andrew Mansfield, Boqing Gong, Bradley Green, Lior Shapira, Ying Wu
While recent Convolutional Neural Network (CNN) based semantic segmentation approaches have achieved impressive results by using large amounts of labeled training data, their performance drops significantly as the amount of labeled data decreases.
no code implementations • ICCV 2021 • Muhammad Abdullah Jamal, Liqiang Wang, Boqing Gong
Gradient-based meta-learning relates task-specific models to a meta-model by gradients.
no code implementations • 13 Dec 2020 • Xiangyun Zhao, Raviteja Vemulapalli, Philip Mansfield, Boqing Gong, Bradley Green, Lior Shapira, Ying Wu
While recent Convolutional Neural Network (CNN) based semantic segmentation approaches have achieved impressive results by using large amounts of labeled training data, their performance drops significantly as the amount of labeled data decreases.
1 code implementation • CVPR 2021 • Yandong Li, Xuhui Jia, Ruoxin Sang, Yukun Zhu, Bradley Green, Liqiang Wang, Boqing Gong
This paper is concerned with ranking many pre-trained deep neural networks (DNNs), called checkpoints, for the transfer learning to a downstream task.
Ranked #6 on Transferability on classification benchmark
4 code implementations • CVPR 2021 • Rui Qian, Tianjian Meng, Boqing Gong, Ming-Hsuan Yang, Huisheng Wang, Serge Belongie, Yin Cui
Our representations are learned using a contrastive loss, where two augmented clips from the same short video are pulled together in the embedding space, while clips from different videos are pushed away.
Ranked #1 on Self-Supervised Action Recognition on Kinetics-600
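The contrastive objective described in this entry can be sketched with a generic InfoNCE-style loss. This is a common formulation used here for illustration and not necessarily the exact loss in the paper; the cosine similarity, the `temperature` value, and the toy two-dimensional embeddings are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchor, positive, negatives, temperature=0.1):
    """Loss is low when the anchor is close to the positive and far from negatives."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))


# Anchor and positive stand for two clips from the same video (toy embeddings);
# the negatives stand for clips from different videos.
anchor = [1.0, 0.0]
loss_aligned = info_nce(anchor, [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.0]])
loss_misaligned = info_nce(anchor, [0.0, 1.0], [[0.9, 0.1], [-1.0, 0.0]])
```

Minimizing such a loss pulls embeddings of clips from the same video together and pushes clips from different videos apart, matching the description above.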
no code implementations • ECCV 2020 • Yandong Li, Di Huang, Danfeng Qin, Liqiang Wang, Boqing Gong
They fail to improve object detectors in their vanilla forms due to the domain gap between the Web images and curated datasets.
no code implementations • CVPR 2021 • Li Yi, Boqing Gong, Thomas Funkhouser
We study an unsupervised domain adaptation problem for the semantic labeling of 3D point clouds, with a particular focus on domain discrepancies induced by different LiDAR sensors.
1 code implementation • 25 Jun 2020 • Cihang Xie, Mingxing Tan, Boqing Gong, Alan Yuille, Quoc V. Le
SAT also works well with larger networks: it helps EfficientNet-L1 to achieve 82.2% accuracy and 58.6% robustness on ImageNet, outperforming the previous state-of-the-art defense by 9.5% for accuracy and 11.6% for robustness.
no code implementations • 1 May 2020 • Dan Kondratyuk, Mingxing Tan, Matthew Brown, Boqing Gong
Ensembling is a simple and popular technique for boosting evaluation performance by training multiple models (e.g., with different initializations) and aggregating their predictions.
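A minimal sketch of the aggregation step, assuming each model outputs a probability vector over the same classes and that predictions are combined by simple averaging. The function name `ensemble_predict` and the toy probabilities are hypothetical, and other aggregation rules (such as voting) are equally plausible.

```python
def ensemble_predict(prob_lists):
    """Average per-class probabilities across models and pick the top class.

    prob_lists: one probability vector per model, all over the same classes.
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(probs[c] for probs in prob_lists) / n_models for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averaged[c])
    return averaged, label


# Three hypothetical models that disagree on a two-class problem.
model_probs = [[0.6, 0.4], [0.3, 0.7], [0.2, 0.8]]
averaged, label = ensemble_predict(model_probs)
```

Here the first model alone would predict class 0, but the averaged prediction favors class 1.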
4 code implementations • CVPR 2020 • Yang Zhang, Zixiang Zhou, Philip David, Xiangyu Yue, Zerong Xi, Boqing Gong, Hassan Foroosh
The need for fine-grained perception in autonomous driving systems has resulted in recently increased research on online semantic segmentation of single-scan LiDAR.
Ranked #11 on Robust 3D Semantic Segmentation on nuScenes-C
1 code implementation • CVPR 2020 • Dongdong Wang, Yandong Li, Liqiang Wang, Boqing Gong
The other is that the number of images used for the knowledge distillation should be small; otherwise, it violates our expectation of reducing the dependence on large-scale datasets.
1 code implementation • CVPR 2020 • Muhammad Abdullah Jamal, Matthew Brown, Ming-Hsuan Yang, Liqiang Wang, Boqing Gong
Object frequency in the real world often follows a power law, leading to a mismatch between datasets with long-tailed class distributions seen by a machine learning model and our expectation of the model to perform well on all classes.
Ranked #29 on Long-tail Learning on Places-LT
2 code implementations • ICLR 2020 • Runtian Zhai, Chen Dan, Di He, Huan Zhang, Boqing Gong, Pradeep Ravikumar, Cho-Jui Hsieh, Li-Wei Wang
Adversarial training is one of the most popular ways to learn robust models but is usually attack-dependent and time costly.
1 code implementation • 25 Dec 2019 • Chuang Gan, Yiwei Zhang, Jiajun Wu, Boqing Gong, Joshua B. Tenenbaum
In this paper, we attempt to approach the problem of Audio-Visual Embodied Navigation, the task of planning the shortest path from a random starting location in a scene to the sound source in an indoor environment, given only raw egocentric visual and audio sensory data.
6 code implementations • CVPR 2020 • Cihang Xie, Mingxing Tan, Boqing Gong, Jiang Wang, Alan Yuille, Quoc V. Le
We show that AdvProp improves a wide range of models on various image recognition tasks and performs better when the models are bigger.
Ranked #4 on Domain Generalization on VizWiz-Classification
no code implementations • CVPR 2020 • Ziwei Liu, Zhongqi Miao, Xingang Pan, Xiaohang Zhan, Dahua Lin, Stella X. Yu, Boqing Gong
A typical domain adaptation approach is to adapt models trained on the annotated data in a source domain (e.g., sunny weather) for achieving high performance on the test data in a target domain (e.g., rainy weather).
no code implementations • ICCV 2019 • Xiangyu Yue, Yang Zhang, Sicheng Zhao, Alberto Sangiovanni-Vincentelli, Kurt Keutzer, Boqing Gong
To this end, we propose a new approach of domain randomization and pyramid consistency to learn a model with high generalizability.
1 code implementation • ICCV 2019 • Qing Lian, Fengmao Lv, Lixin Duan, Boqing Gong
We propose a new approach, called self-motivated pyramid curriculum domain adaptation (PyCDA), to facilitate the adaptation of semantic segmentation neural networks from synthetic source domains to real target domains.
Ranked #14 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
2 code implementations • ICCV 2019 • Zhengyuan Yang, Boqing Gong, Li-Wei Wang, Wenbing Huang, Dong Yu, Jiebo Luo
We propose a simple, fast, and accurate one-stage approach to visual grounding, inspired by the following insight.
no code implementations • 16 Jun 2019 • Yifan Ding, Liqiang Wang, Huan Zhang, Jin-Feng Yi, Deliang Fan, Boqing Gong
As deep neural networks (DNNs) have become increasingly important and popular, the robustness of DNNs is the key to the safety of both the Internet and the physical world.
no code implementations • ICLR 2019 • Yandong Li, Lijun Li, Liqiang Wang, Tong Zhang, Boqing Gong
In other words, there is a population of adversarial examples, instead of only one, for any input to a DNN.
1 code implementation • ICLR 2019 • Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang
Dealing with sparse rewards is one of the most important challenges in reinforcement learning (RL), especially when a goal is dynamic (e.g., to grasp a moving object).
1 code implementation • 1 May 2019 • Yandong Li, Lijun Li, Liqiang Wang, Tong Zhang, Boqing Gong
Powerful adversarial attack methods are vital for understanding how to construct robust deep neural networks (DNNs) and for thoroughly testing defense techniques.
1 code implementation • ICLR 2019 • Yang Zhang, Hassan Foroosh, Philip David, Boqing Gong
In particular, we learn a camouflage pattern to hide vehicles from being detected by state-of-the-art convolutional neural network based detectors.
2 code implementations • CVPR 2019 • Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu
We define Open Long-Tailed Recognition (OLTR) as learning from such naturally distributed data and optimizing the classification accuracy over a balanced test set that includes head, tail, and open classes.
2 code implementations • NeurIPS 2018 • Hexiang Hu, Liyu Chen, Boqing Gong, Fei Sha
The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence.
no code implementations • 25 Feb 2019 • Xianfeng Tang, Boqing Gong, Yanwei Yu, Huaxiu Yao, Yandong Li, Haiyong Xie, Xiaoyu Wang
In this paper, we propose a novel framework for the citywide traffic volume inference using both dense GPS trajectories and incomplete trajectories captured by camera surveillance systems.
2 code implementations • 24 Dec 2018 • Yang Zhang, Philip David, Hassan Foroosh, Boqing Gong
Hence, we propose a curriculum-style learning approach to minimizing the domain gap in urban scene semantic segmentation.
Ranked #26 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
1 code implementation • 16 Dec 2018 • Soravit Changpinyo, Wei-Lun Chao, Boqing Gong, Fei Sha
Zero-shot learning (ZSL) enables solving a task without the need to see its examples.
no code implementations • 9 Aug 2018 • Lijie Fan, Wenbing Huang, Chuang Gan, Junzhou Huang, Boqing Gong
The recent advances in deep learning have made it possible to generate photo-realistic images by using neural networks and even to extrapolate video frames from an input video clip.
no code implementations • ECCV 2018 • Aidean Sharghi, Ali Borji, Chengtao Li, Tianbao Yang, Boqing Gong
In terms of modeling, we design a new probabilistic distribution such that, when it is integrated into SeqDPP, the resulting model accepts user input about the expected length of the summary.
no code implementations • 20 Jul 2018 • Zhezhi He, Boqing Gong, Deliang Fan
Deep convolutional neural networks have achieved great success in many artificial intelligence applications.
no code implementations • 18 Jul 2018 • Adnan Siraj Rakin, Jin-Feng Yi, Boqing Gong, Deliang Fan
Recent studies have shown that deep neural networks (DNNs) are vulnerable to adversarial attacks.
no code implementations • ECCV 2018 • Yandong Li, Liqiang Wang, Tianbao Yang, Boqing Gong
The large volume of video content and high viewing frequency demand automatic video summarization algorithms, of which a key property is the capability of modeling diversity.
no code implementations • CVPR 2018 • Chuang Gan, Boqing Gong, Kun Liu, Hao Su, Leonidas J. Guibas
In addition, we also find that a progressive training strategy can foster a better neural network for the video recognition task than blindly pooling the distinct sources of geometry cues together.
no code implementations • CVPR 2018 • Muhammad Abdullah Jamal, Haoxiang Li, Boqing Gong
Arguably, no single face detector fits all real-life scenarios.
1 code implementation • CVPR 2018 • Lijie Fan, Wenbing Huang, Chuang Gan, Stefano Ermon, Boqing Gong, Junzhou Huang
Despite the recent success of end-to-end learned representations, hand-crafted optical flow features are still widely used in video analysis tasks.
Ranked #42 on Action Recognition on UCF101
1 code implementation • 21 Mar 2018 • Lijun Li, Boqing Gong
Although end-to-end (E2E) learning has led to impressive progress on a variety of visual understanding tasks, it is often impeded by hardware constraints (e.g., GPU memory) and is prone to overfitting.
1 code implementation • ICLR 2018 • Xiang Wei, Boqing Gong, Zixia Liu, Wei Lu, Liqiang Wang
Despite being impactful on a variety of problems and applications, the generative adversarial nets (GANs) are remarkably difficult to train.
no code implementations • 8 Feb 2018 • Yifan Ding, Liqiang Wang, Deliang Fan, Boqing Gong
In the first stage, we identify a small portion of images from the noisy training set of which the labels are correct with a high probability.
no code implementations • 5 Feb 2018 • Adnan Siraj Rakin, Zhezhi He, Boqing Gong, Deliang Fan
Blind pre-processing improves the white box attack accuracy of MNIST from 94.3% to 98.7%.
1 code implementation • ICCV 2017 • Chuang Gan, Yandong Li, Haoxiang Li, Chen Sun, Boqing Gong
Many seemingly distant annotations (e.g., semantic segmentation and visual question answering (VQA)) are inherently connected in that they reveal different levels and perspectives of human understandings about the same visual scenes, and even the same set of images (e.g., of COCO).
1 code implementation • ICCV 2017 • Yang Zhang, Philip David, Boqing Gong
Hence, we propose a curriculum-style learning approach to minimize the domain gap in urban scenery semantic segmentation.
Ranked #27 on Image-to-Image Translation on SYNTHIA-to-Cityscapes
no code implementations • CVPR 2017 • Aidean Sharghi, Jacob S. Laurel, Boqing Gong
However, one of the main obstacles to the research on video summarization is the user subjectivity: users have various preferences over the summaries.
no code implementations • CVPR 2017 • Mahdi M. Kalayeh, Boqing Gong, Mubarak Shah
We build our facial attribute prediction model jointly with a deep semantic segmentation network.
Ranked #2 on Facial Attribute Classification on LFWA
no code implementations • 23 Aug 2016 • Yang Zhang, Rupam Acharyya, Ji Liu, Boqing Gong
We develop a new statistical machine learning paradigm, named infinite-label learning, to annotate a data point with more than one relevant label from a candidate set, which pools both the finite labels observed at training and a potentially infinite number of previously unseen labels.
no code implementations • 18 Jul 2016 • Aidean Sharghi, Boqing Gong, Mubarak Shah
The decision to include a shot in the summary depends on the shot's relevance to the user query and importance in the context of the video, jointly.
no code implementations • CVPR 2016 • Yang Zhang, Boqing Gong, Mubarak Shah
The well-known word analogy experiments show that the recent word vectors capture fine-grained linguistic regularities in words by linear vector offsets, but it is unclear how well the simple vector offsets can encode visual regularities over words.
Ranked #5 on Multi-label zero-shot learning on Open Images V4
1 code implementation • 13 May 2016 • Wei-Lun Chao, Soravit Changpinyo, Boqing Gong, Fei Sha
Zero-shot learning (ZSL) methods have been studied in the unrealistic setting where test data are assumed to come from unseen classes only.
no code implementations • CVPR 2016 • Chuang Gan, Tianbao Yang, Boqing Gong
Attributes possess appealing properties and benefit many computer vision problems, such as object recognition, learning with humans in the loop, and image retrieval.
2 code implementations • CVPR 2016 • Soravit Changpinyo, Wei-Lun Chao, Boqing Gong, Fei Sha
Given semantic descriptions of object classes, zero-shot learning aims to accurately recognize objects of the unseen classes, from which no examples are available at the training stage, by associating them to the seen classes, from which labeled examples are provided.
Ranked #1 on Few-Shot Image Classification on AWA - 0-Shot
no code implementations • NeurIPS 2016 • Zhe Li, Boqing Gong, Tianbao Yang
To exhibit the optimal dropout probabilities, we analyze the shallow learning with multinomial dropout and establish the risk bound for stochastic optimization.
no code implementations • NeurIPS 2014 • Boqing Gong, Wei-Lun Chao, Kristen Grauman, Fei Sha
Video summarization is a challenging problem with great application potential.
no code implementations • 6 Nov 2014 • Boqing Gong, Wei-Lun Chao, Kristen Grauman, Fei Sha
Extensive empirical studies validate our contributions, including applications on challenging document and video summarization, where flexibility in modeling the kernel matrix and balancing different errors is indispensable.
no code implementations • NeurIPS 2013 • Boqing Gong, Kristen Grauman, Fei Sha
By maximum distinctiveness, we require the underlying distributions of the identified domains to be different from each other; by maximum learnability, we ensure that a strong discriminative model can be learned from the domain.