Search Results for author: Yao-Hung Hubert Tsai

Found 45 papers, 19 papers with code

KPConvX: Modernizing Kernel Point Convolution with Kernel Attention

no code implementations CVPR 2024 Hugues Thomas, Yao-Hung Hubert Tsai, Timothy D. Barfoot, Jian Zhang

In the field of deep point cloud understanding, KPConv is a unique architecture that uses kernel points to locate convolutional weights in space, instead of relying on Multi-Layer Perceptron (MLP) encodings.

3D Point Cloud Classification Semantic Segmentation

An Empirical Study of Self-supervised Learning with Wasserstein Distance

no code implementations 16 Oct 2023 Makoto Yamada, Yuki Takezawa, Guillaume Houry, Kira Michaela Dusterwald, Deborah Sulem, Han Zhao, Yao-Hung Hubert Tsai

We find that the model performance depends on the combination of TWD and probability model, and that the Jeffrey divergence regularization helps in model training.

Representation Learning Self-Supervised Learning

Multimodal Large Language Model for Visual Navigation

no code implementations 12 Oct 2023 Yao-Hung Hubert Tsai, Vansh Dhar, Jialu Li, BoWen Zhang, Jian Zhang

Recent efforts to enable visual navigation using large language models have mainly focused on developing complex prompt systems.

Language Modelling Large Language Model +3

Self-Supervised Object Goal Navigation with In-Situ Finetuning

no code implementations 9 Dec 2022 So Yeon Min, Yao-Hung Hubert Tsai, Wei Ding, Ali Farhadi, Ruslan Salakhutdinov, Yonatan Bisk, Jian Zhang

In contrast, our LocCon shows the most robust transfer in the real world among the set of models we compare to, and that the real-world performance of all models can be further improved with self-supervised LocCon in-situ training.

Contrastive Learning Navigate +2

Greedy Modality Selection via Approximate Submodular Maximization

no code implementations 22 Oct 2022 Runxiang Cheng, Gargi Balasubramaniam, Yifei He, Yao-Hung Hubert Tsai, Han Zhao

We formulate a theoretical framework for optimizing modality selection in multimodal learning and introduce a utility measure to quantify the benefit of selecting a modality.

Feature Importance

Towards Multimodal Multitask Scene Understanding Models for Indoor Mobile Agents

no code implementations 27 Sep 2022 Yao-Hung Hubert Tsai, Hanlin Goh, Ali Farhadi, Jian Zhang

The perception system in personalized mobile agents requires developing indoor scene understanding models, which can understand 3D geometries, capture objectiveness, analyze human behaviors, etc.

3D Object Detection Autonomous Driving +9

Paraphrasing Is All You Need for Novel Object Captioning

no code implementations 25 Sep 2022 Cheng-Fu Yang, Yao-Hung Hubert Tsai, Wan-Cyuan Fan, Ruslan Salakhutdinov, Louis-Philippe Morency, Yu-Chiang Frank Wang

Since no ground truth captions are available for novel object images during training, our P2C leverages cross-modality (image-text) association modules to ensure the above caption characteristics can be properly preserved.

Language Modelling Object

Scale dependant layer for self-supervised nuclei encoding

1 code implementation 22 Jul 2022 Peter Naylor, Yao-Hung Hubert Tsai, Marick Laé, Makoto Yamada

Recent developments in self-supervised learning give us the possibility to further reduce human intervention in multi-step pipelines where the focus evolves around particular objects of interest.

Self-Supervised Learning

Conditional Contrastive Learning with Kernel

1 code implementation ICLR 2022 Yao-Hung Hubert Tsai, Tianqin Li, Martin Q. Ma, Han Zhao, Kun Zhang, Louis-Philippe Morency, Ruslan Salakhutdinov

Conditional contrastive learning frameworks consider the conditional sampling procedure that constructs positive or negative data pairs conditioned on specific variables.

Contrastive Learning

Learning Visual-Linguistic Adequacy, Fidelity, and Fluency for Novel Object Captioning

no code implementations 29 Sep 2021 Cheng-Fu Yang, Yao-Hung Hubert Tsai, Wan-Cyuan Fan, Yu-Chiang Frank Wang, Louis-Philippe Morency, Ruslan Salakhutdinov

Novel object captioning (NOC) learns image captioning models for describing objects or visual concepts which are unseen (i.e., novel) in the training captions.

Image Captioning

HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units

10 code implementations 14 Jun 2021 Wei-Ning Hsu, Benjamin Bolte, Yao-Hung Hubert Tsai, Kushal Lakhotia, Ruslan Salakhutdinov, Abdelrahman Mohamed

Self-supervised approaches for speech representation learning are challenged by three unique problems: (1) there are multiple sound units in each input utterance, (2) there is no lexicon of input sound units during the pre-training phase, and (3) sound units have variable lengths with no explicit segmentation.

Clustering Language Modelling +3

Integrating Auxiliary Information in Self-supervised Learning

no code implementations 5 Jun 2021 Yao-Hung Hubert Tsai, Tianqin Li, Weixin Liu, Peiyuan Liao, Ruslan Salakhutdinov, Louis-Philippe Morency

Our approach contributes as follows: 1) Compared to conventional self-supervised representations, the auxiliary-information-infused self-supervised representations bring the performance closer to the supervised representations; 2) The presented Cl-InfoNCE can also work with unsupervised constructed clusters (e.g., k-means clusters) and outperform strong clustering-based self-supervised learning approaches, such as the Prototypical Contrastive Learning (PCL) method; 3) We show that Cl-InfoNCE may be a better approach to leverage the data clustering information, by comparing it to the baseline approach of learning to predict the clustering assignments with cross-entropy loss.

Clustering Contrastive Learning +1

A Note on Connecting Barlow Twins with Negative-Sample-Free Contrastive Learning

2 code implementations 28 Apr 2021 Yao-Hung Hubert Tsai, Shaojie Bai, Louis-Philippe Morency, Ruslan Salakhutdinov

In this report, we relate the algorithmic design of Barlow Twins' method to the Hilbert-Schmidt Independence Criterion (HSIC), thus establishing it as a contrastive learning approach that is free of negative samples.

Contrastive Learning Self-Supervised Learning

Self-supervised Representation Learning with Relative Predictive Coding

1 code implementation ICLR 2021 Yao-Hung Hubert Tsai, Martin Q. Ma, Muqiao Yang, Han Zhao, Louis-Philippe Morency, Ruslan Salakhutdinov

This paper introduces Relative Predictive Coding (RPC), a new contrastive representation learning objective that maintains a good balance among training stability, minibatch size sensitivity, and downstream task performance.

Representation Learning Self-Supervised Learning

Feature-Robust Optimal Transport for High-Dimensional Data

no code implementations 1 Jan 2021 Mathis Petrovich, Chao Liang, Ryoma Sato, Yanbin Liu, Yao-Hung Hubert Tsai, Linchao Zhu, Yi Yang, Ruslan Salakhutdinov, Makoto Yamada

To show the effectiveness of FROT, we propose using the FROT algorithm for the layer selection problem in deep neural networks for semantic correspondence.

feature selection Semantic correspondence +1

Self-supervised Learning from a Multi-view Perspective

1 code implementation ICLR 2021 Yao-Hung Hubert Tsai, Yue Wu, Ruslan Salakhutdinov, Louis-Philippe Morency

In particular, we propose a composite objective that bridges the gap between prior contrastive and predictive learning objectives, and introduce an additional objective term to discard task-irrelevant information.

Image Captioning Language Modelling +4

Neural Methods for Point-wise Dependency Estimation

1 code implementation NeurIPS 2020 Yao-Hung Hubert Tsai, Han Zhao, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

Since its inception, the neural estimation of mutual information (MI) has demonstrated the empirical success of modeling expected dependency between high-dimensional random variables.

Cross-Modal Retrieval Representation Learning +1

Feature Robust Optimal Transport for High-dimensional Data

1 code implementation 25 May 2020 Mathis Petrovich, Chao Liang, Ryoma Sato, Yanbin Liu, Yao-Hung Hubert Tsai, Linchao Zhu, Yi Yang, Ruslan Salakhutdinov, Makoto Yamada

To show the effectiveness of FROT, we propose using the FROT algorithm for the layer selection problem in deep neural networks for semantic correspondence.

feature selection Semantic correspondence +1

Capsules with Inverted Dot-Product Attention Routing

2 code implementations ICLR 2020 Yao-Hung Hubert Tsai, Nitish Srivastava, Hanlin Goh, Ruslan Salakhutdinov

We introduce a new routing algorithm for capsule networks, in which a child capsule is routed to a parent based only on agreement between the parent's state and the child's vote.

Image Classification

Transformer Dissection: An Unified Understanding for Transformer's Attention via the Lens of Kernel

no code implementations IJCNLP 2019 Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

This new formulation gives us a better way to understand individual components of the Transformer's attention, such as the better way to integrate the positional embedding.

Machine Translation Translation

Complex Transformer: A Framework for Modeling Complex-Valued Sequence

1 code implementation 22 Oct 2019 Muqiao Yang, Martin Q. Ma, Dongyu Li, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov

While deep learning has received a surge of interest in a variety of fields in recent years, major deep learning models barely use complex numbers.

Decoder Music Transcription

LSMI-Sinkhorn: Semi-supervised Mutual Information Estimation with Optimal Transport

1 code implementation 5 Sep 2019 Yanbin Liu, Makoto Yamada, Yao-Hung Hubert Tsai, Tam Le, Ruslan Salakhutdinov, Yi Yang

To estimate the mutual information from data, a common practice is preparing a set of paired samples $\{(\mathbf{x}_i,\mathbf{y}_i)\}_{i=1}^n \stackrel{\mathrm{i. i. d.

BIG-bench Machine Learning Mutual Information Estimation

Transformer Dissection: A Unified Understanding of Transformer's Attention via the Lens of Kernel

1 code implementation EMNLP 2019 Yao-Hung Hubert Tsai, Shaojie Bai, Makoto Yamada, Louis-Philippe Morency, Ruslan Salakhutdinov

This new formulation gives us a better way to understand individual components of the Transformer's attention, such as the better way to integrate the positional embedding.

Machine Translation Translation

Learning Neural Networks with Adaptive Regularization

1 code implementation NeurIPS 2019 Han Zhao, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Geoffrey J. Gordon

Feed-forward neural networks can be understood as a combination of an intermediate representation and a linear hypothesis.

Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization

no code implementations ACL 2019 Paul Pu Liang, Zhun Liu, Yao-Hung Hubert Tsai, Qibin Zhao, Ruslan Salakhutdinov, Louis-Philippe Morency

Our method is based on the observation that high-dimensional multimodal time series data often exhibit correlations across time and modalities, which lead to low-rank tensor representations.

Question Answering Sentiment Analysis +4

Strong and Simple Baselines for Multimodal Utterance Embeddings

1 code implementation NAACL 2019 Paul Pu Liang, Yao Chong Lim, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Louis-Philippe Morency

Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations.

Benchmarking

Learning Factorized Multimodal Representations

2 code implementations ICLR 2019 Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov

Multimodal discriminative factors are shared across all modalities and contain joint multimodal features required for discriminative tasks such as sentiment prediction.

Representation Learning

Post Selection Inference with Incomplete Maximum Mean Discrepancy Estimator

no code implementations ICLR 2019 Makoto Yamada, Denny Wu, Yao-Hung Hubert Tsai, Ichiro Takeuchi, Ruslan Salakhutdinov, Kenji Fukumizu

In the paper, we propose a post selection inference (PSI) framework for divergence measure, which can select a set of statistically significant features that discriminate two distributions.

Binary Classification Change Point Detection +1

Discovering Order in Unordered Datasets: Generative Markov Networks

no code implementations ICLR 2018 Yao-Hung Hubert Tsai, Han Zhao, Nebojsa Jojic, Ruslan Salakhutdinov

The assumption that data samples are independently identically distributed is the backbone of many learning algorithms.

Learning Markov Chain in Unordered Dataset

no code implementations ICLR 2018 Yao-Hung Hubert Tsai, Han Zhao, Ruslan Salakhutdinov, Nebojsa Jojic

In this technical report, we introduce OrderNet that can be used to extract the order of data instances in an unsupervised way.

Improving One-Shot Learning through Fusing Side Information

no code implementations 23 Oct 2017 Yao-Hung Hubert Tsai, Ruslan Salakhutdinov

We introduce two statistical approaches for fusing side information into data representation learning to improve one-shot learning.

One-Shot Learning regression +1

Generative-Discriminative Variational Model for Visual Recognition

no code implementations 7 Jun 2017 Chih-Kuan Yeh, Yao-Hung Hubert Tsai, Yu-Chiang Frank Wang

In other words, our GDVM casts the supervised learning task as a generative learning process, with data discrimination to be jointly exploited for improved classification.

Classification General Classification +3

Learning Robust Visual-Semantic Embeddings

no code implementations ICCV 2017 Yao-Hung Hubert Tsai, Liang-Kang Huang, Ruslan Salakhutdinov

Many of the existing methods for learning joint embedding of images and text use only supervised information from paired images and its textual attributes.

Generalized Few-Shot Learning Representation Learning +1

Learning Cross-Domain Landmarks for Heterogeneous Domain Adaptation

no code implementations CVPR 2016 Yao-Hung Hubert Tsai, Yi-Ren Yeh, Yu-Chiang Frank Wang

With the goal of deriving a domain-invariant feature subspace for HDA, our CDLS is able to identify representative cross-domain data, including the unlabeled ones in the target domain, for performing adaptation.

Domain Adaptation
