Search Results for author: Hexiang Hu

Found 28 papers, 12 papers with code

Synthesize Policies for Transfer and Adaptation across Tasks and Environments

no code implementations NeurIPS 2018 Hexiang Hu, Liyu Chen, Boqing Gong, Fei Sha

The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence.

On Model Calibration for Long-Tailed Object Detection and Instance Segmentation

no code implementations 5 Jul 2021 Tai-Yu Pan, Cheng Zhang, Yandong Li, Hexiang Hu, Dong Xuan, Soravit Changpinyo, Boqing Gong, Wei-Lun Chao

We propose NorCal, Normalized Calibration for long-tailed object detection and instance segmentation, a simple and straightforward recipe that reweighs the predicted scores of each class by its training sample size.

Instance Segmentation Object Detection +1

MosaicOS: A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection

1 code implementation 17 Feb 2021 Cheng Zhang, Tai-Yu Pan, Yandong Li, Hexiang Hu, Dong Xuan, Soravit Changpinyo, Boqing Gong, Wei-Lun Chao

Many objects do not appear frequently enough in complex scenes (e.g., certain handbags in living rooms) for training an accurate object detector, but are often found frequently by themselves (e.g., in product images).

Imputation Instance Segmentation +2

A Hierarchical Multi-Modal Encoder for Moment Localization in Video Corpus

no code implementations 18 Nov 2020 Bowen Zhang, Hexiang Hu, Joonseok Lee, Ming Zhao, Sheide Chammas, Vihan Jain, Eugene Ie, Fei Sha

Identifying a short segment in a long video that semantically matches a text query is a challenging task that has important application potentials in language-based video search, browsing, and navigation.

Language Modelling Temporal Localization +1

Learning the Best Pooling Strategy for Visual Semantic Embedding

no code implementations CVPR 2021 Jiacheng Chen, Hexiang Hu, Hao Wu, Yuning Jiang, Changhu Wang

Visual Semantic Embedding (VSE) is a dominant approach for vision-language retrieval, which aims at learning a deep embedding space such that visual data are embedded close to their semantic text labels or descriptions.

Video-Text Retrieval

Learning to Represent Image and Text with Denotation Graph

no code implementations EMNLP 2020 Bowen Zhang, Hexiang Hu, Vihan Jain, Eugene Ie, Fei Sha

Recent progress has leveraged the ideas of pre-training (from language modeling) and attention layers in Transformers to learn representations from datasets containing images aligned with linguistic expressions that describe the images.

Image Retrieval Language Modelling +1

Drinking from a Firehose: Continual Learning with Web-scale Natural Language

1 code implementation 18 Jul 2020 Hexiang Hu, Ozan Sener, Fei Sha, Vladlen Koltun

Collectively, the POLL problem setting, the Firehose datasets, and the ConGraD algorithm enable a complete benchmark for reproducible research on web-scale continual learning.

Continual Learning

BabyWalk: Going Farther in Vision-and-Language Navigation by Taking Baby Steps

1 code implementation ACL 2020 Wang Zhu, Hexiang Hu, Jiacheng Chen, Zhiwei Deng, Vihan Jain, Eugene Ie, Fei Sha

To this end, we propose BabyWalk, a new VLN agent that is learned to navigate by decomposing long instructions into shorter ones (BabySteps) and completing them sequentially.

Imitation Learning Vision and Language Navigation

Visual Storytelling via Predicting Anchor Word Embeddings in the Stories

no code implementations 13 Jan 2020 Bowen Zhang, Hexiang Hu, Fei Sha

To narrate a sequence of images, we use the predicted anchor word embeddings and the image features as the joint input to a seq2seq model.

Visual Storytelling Word Embeddings

Multimodal Model-Agnostic Meta-Learning via Task-Aware Modulation

2 code implementations NeurIPS 2019 Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

Model-agnostic meta-learners aim to acquire meta-learned parameters from similar tasks to adapt to novel tasks from the same distribution with few gradient updates.

Few-Shot Image Classification General Classification

Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning

1 code implementation 7 Jun 2019 Han-Jia Ye, Hexiang Hu, De-Chuan Zhan

In this paper, we investigate the problem of generalized few-shot learning (GFSL) -- a model during the deployment is required to learn about tail categories with few shots and simultaneously classify the head classes.

Few-Shot Learning Generalized Few-Shot Learning +2

Synthesized Policies for Transfer and Adaptation across Tasks and Environments

2 code implementations NeurIPS 2018 Hexiang Hu, Liyu Chen, Boqing Gong, Fei Sha

The ability to transfer in reinforcement learning is key towards building an agent of general artificial intelligence.

Evaluating Text-to-Image Matching using Binary Image Selection (BISON)

no code implementations 19 Jan 2019 Hexiang Hu, Ishan Misra, Laurens van der Maaten

Providing systems the ability to relate linguistic and visual content is one of the hallmarks of computer vision.

Image Captioning Image Retrieval

Toward Multimodal Model-Agnostic Meta-Learning

no code implementations 18 Dec 2018 Risto Vuorio, Shao-Hua Sun, Hexiang Hu, Joseph J. Lim

One important limitation of such frameworks is that they seek a common initialization shared across the entire task distribution, substantially limiting the diversity of the task distributions that they are able to learn from.

Few-Shot Image Classification Meta-Learning

Few-Shot Learning via Embedding Adaptation with Set-to-Set Functions

3 code implementations CVPR 2020 Han-Jia Ye, Hexiang Hu, De-Chuan Zhan, Fei Sha

Many few-shot learning methods address this challenge by learning an instance embedding function from seen classes and apply the function to instances from unseen classes with limited labels.

Few-Shot Image Classification General Classification +2

Engaging Image Captioning Via Personality

no code implementations CVPR 2019 Kurt Shuster, Samuel Humeau, Hexiang Hu, Antoine Bordes, Jason Weston

While such tasks are useful to verify that a machine understands the content of an image, they are not engaging to humans as captions.

Image Captioning

Cross-Modal and Hierarchical Modeling of Video and Text

1 code implementation ECCV 2018 Bowen Zhang, Hexiang Hu, Fei Sha

Similarly, a paragraph may contain sentences with different topics, which collectively conveys a coherent message or story.

Action Recognition Video Captioning +1

Cross-Dataset Adaptation for Visual Question Answering

no code implementations CVPR 2018 Wei-Lun Chao, Hexiang Hu, Fei Sha

Analogous to domain adaptation for visual recognition, this setting is appealing when the target dataset does not have a sufficient amount of labeled data to learn an "in-domain" model.

Domain Adaptation Question Answering +1

Learning Answer Embeddings for Visual Question Answering

no code implementations CVPR 2018 Hexiang Hu, Wei-Lun Chao, Fei Sha

These properties make the approach particularly appealing for transfer learning for open-ended Visual QA, where the source dataset on which the model is learned has limited overlapping with the target dataset in the space of answers.

Question Answering Transfer Learning +1

Structured Label Inference for Visual Understanding

1 code implementation 18 Feb 2018 Nelson Nauata, Hexiang Hu, Guang-Tong Zhou, Zhiwei Deng, Zicheng Liao, Greg Mori

In this paper, we exploit this rich structure for performing graph-based inference in label space for a number of tasks: multi-label image and video classification and action detection in untrimmed videos.

Action Detection Classification +4

Being Negative but Constructively: Lessons Learnt from Creating Better Visual Question Answering Datasets

no code implementations NAACL 2018 Wei-Lun Chao, Hexiang Hu, Fei Sha

We apply the procedures to re-construct decoy answers for two popular Visual QA datasets as well as to create a new Visual QA dataset from the Visual Genome project, resulting in the largest dataset for this task.

Question Answering Visual Question Answering

LabelBank: Revisiting Global Perspectives for Semantic Segmentation

1 code implementation 29 Mar 2017 Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori

We advocate that holistic inference of image concepts provides valuable information for detailed pixel labeling.

Semantic Segmentation

Recalling Holistic Information for Semantic Segmentation

no code implementations 24 Nov 2016 Hexiang Hu, Zhiwei Deng, Guang-Tong Zhou, Fei Sha, Greg Mori

We advocate that high-recall holistic inference of image concepts provides valuable information for detailed pixel labeling.

Semantic Segmentation
