Shared-unique Features and Task-aware Prioritized Sampling on Multi-task Reinforcement Learning

no code implementations 2 Jun 2024 Po-Shao Lin, Jia-Fong Yeh, Yi-Ting Chen, Winston H. Hsu

We observe that current state-of-the-art (SOTA) methods suffer from the performance imbalance issue when performing multi-task reinforcement learning (MTRL) tasks.

VividDream: Generating 3D Scene with Ambient Dynamics

no code implementations 30 May 2024 Yao-Chih Lee, Yi-Ting Chen, Andrew Wang, Ting-Hsuan Liao, Brandon Y. Feng, Jia-Bin Huang

An ensemble of animated videos is then generated using video diffusion models with quality refinement techniques and conditioned on renderings of the static 3D scene from the sampled camera trajectories.

RiskBench: A Scenario-based Benchmark for Risk Identification

1 code implementation 4 Dec 2023 Chi-Hsi Kung, Chieh-Chi Yang, Pang-Yuan Pao, Shu-Wei Lu, Pin-Lun Chen, Hsin-Cheng Lu, Yi-Ting Chen

Intelligent driving systems aim to achieve a zero-collision mobility experience, requiring interdisciplinary efforts to enhance safety performance.

Decision Making

Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes

no code implementations CVPR 2024 Chi-Hsi Kung, Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen

In this paper, we introduce Action-slot, a slot attention-based approach that learns visual action-centric representations, capturing both motion and contextual information.

Action Recognition

Learning Road Scene-level Representations via Semantic Region Prediction

no code implementations 2 Jan 2023 Zihao Xiao, Alan Yuille, Yi-Ting Chen

In this work, we tackle two vital tasks in automated driving systems, i.e., driver intent prediction and risk object identification from egocentric images.

Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings

no code implementations 31 Oct 2022 I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou

In summary, our results confirm the effectiveness of our proposed model for the AVSS task with proper fine-tuning strategies, demonstrating that multi-modal self-supervised embeddings obtained from AV-HuBERT can be generalized to audio-visual regression tasks.

Automatic Speech Recognition (ASR) +6

MetaDIP: Accelerating Deep Image Prior with Meta Learning

no code implementations 18 Sep 2022 Kevin Zhang, Mingyang Xie, Maharshi Gor, Yi-Ting Chen, Yvonne Zhou, Christopher A. Metzler

Deep image prior (DIP) is a recently proposed technique for solving imaging inverse problems by fitting the reconstructed images to the output of an untrained convolutional neural network.

Denoising Meta-Learning +1

ADAM Challenge: Detecting Age-related Macular Degeneration from Fundus Images

no code implementations 16 Feb 2022 Huihui Fang, Fei Li, Huazhu Fu, Xu Sun, Xingxing Cao, Fengbin Lin, Jaemin Son, Sunho Kim, Gwenole Quellec, Sarah Matta, Sharath M Shankaranarayana, Yi-Ting Chen, Chuen-heng Wang, Nisarg A. Shah, Chia-Yen Lee, Chih-Chung Hsu, Hai Xie, Baiying Lei, Ujjwal Baid, Shubham Innani, Kang Dang, Wenxiu Shi, Ravi Kamble, Nitin Singhal, Ching-Wei Wang, Shih-Chang Lo, José Ignacio Orlando, Hrvoje Bogunović, Xiulan Zhang, Yanwu Xu, iChallenge-AMD study group

The ADAM challenge consisted of four tasks which cover the main aspects of detecting and characterizing AMD from fundus images, including detection of AMD, detection and segmentation of optic disc, localization of fovea, and detection and segmentation of lesions.

Semi-supervised 3D Object Detection via Temporal Graph Neural Networks

no code implementations 1 Feb 2022 Jianren Wang, Haiming Gang, Siddharth Ancha, Yi-Ting Chen, David Held

However, these detectors usually require training on large amounts of annotated data that is expensive and time-consuming to collect.

3D Object Detection Autonomous Driving +2

Combined Scaling for Zero-shot Transfer Learning

no code implementations 19 Nov 2021 Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well understood.

Classification Contrastive Learning +3

DROID: Driver-centric Risk Object Identification

no code implementations 24 Jun 2021 Chengxi Li, Stanley H. Chan, Yi-Ting Chen

Identification of high-risk driving situations is generally approached through collision risk estimation or accident pattern recognition.

Causal Inference Object

CARLS: Cross-platform Asynchronous Representation Learning System

1 code implementation 26 May 2021 Chun-Ta Lu, Yun Zeng, Da-Cheng Juan, Yicheng Fan, Zhe Li, Jan Dlabal, Yi-Ting Chen, Arjun Gopalan, Allan Heydon, Chun-Sung Ferng, Reah Miyara, Ariel Fuxman, Futang Peng, Zhen Li, Tom Duerig, Andrew Tomkins

In this work, we propose CARLS, a novel framework for augmenting the capacity of existing deep learning frameworks by enabling multiple components -- model trainers, knowledge makers and knowledge banks -- to concertedly work together in an asynchronous fashion across hardware platforms.

Representation Learning

Multimodal Object Detection via Probabilistic Ensembling

3 code implementations 7 Apr 2021 Yi-Ting Chen, Jinghao Shi, Zelin Ye, Christoph Mertz, Deva Ramanan, Shu Kong

Object detection with multimodal inputs can improve many safety-critical systems such as autonomous vehicles (AVs).

Autonomous Vehicles Object +2

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

4 code implementations 11 Feb 2021 Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, YunHsuan Sung, Zhen Li, Tom Duerig

In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without expensive filtering or post-processing steps in the Conceptual Captions dataset.

Ranked #1 on Image Classification on VTAB-1k (using extra training data)

Cross-Modal Retrieval Fine-Grained Image Classification +6

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization

no code implementations 19 Oct 2020 Nakul Agarwal, Yi-Ting Chen, Behzad Dariush, Ming-Hsuan Yang

Spatio-temporal action localization is an important problem in computer vision that involves detecting where and when activities occur, and therefore requires modeling of both spatial and temporal features.

Object Detection +3

Uncertainty-aware Self-supervised 3D Data Association

1 code implementation 18 Aug 2020 Jianren Wang, Siddharth Ancha, Yi-Ting Chen, David Held

Instead, we propose leveraging vast unlabeled datasets by self-supervised metric learning of 3D object trackers, with a focus on data association.

Metric Learning Object +2

MPDD: A Multi-Party Dialogue Dataset for Analysis of Emotions and Interpersonal Relationships

no code implementations LREC 2020 Yi-Ting Chen, Hen-Hsen Huang, Hsin-Hsi Chen

In this paper, we collect the conversations from TV series scripts, and annotate emotion and interpersonal relationship labels on each utterance.

Relation Relation Classification

Deep Quaternion Features for Privacy Protection

no code implementations 18 Mar 2020 Hao Zhang, Yi-Ting Chen, Liyao Xiang, Haotian Ma, Jie Shi, Quanshi Zhang

We propose a method to revise the neural network to construct the quaternion-valued neural network (QNN), in order to prevent intermediate-layer features from leaking input information.

Privacy Preserving

Who Make Drivers Stop? Towards Driver-centric Risk Assessment: Risk Object Identification via Causal Inference

no code implementations 5 Mar 2020 Chengxi Li, Stanley H. Chan, Yi-Ting Chen

We formulate the task as the cause-effect problem and present a novel two-stage risk object identification framework based on causal inference with the proposed object-level manipulable driving model.

Causal Inference Object +1

Grounding Human-to-Vehicle Advice for Self-driving Vehicles

no code implementations CVPR 2019 Jinkyu Kim, Teruhisa Misu, Yi-Ting Chen, Ashish Tawari, John Canny

We show that taking advice improves the performance of the end-to-end network, while the network cues on a variety of visual features that are provided by advice.

Learning 3D-aware Egocentric Spatial-Temporal Interaction via Graph Convolutional Networks

no code implementations 20 Sep 2019 Chengxi Li, Yue Meng, Stanley H. Chan, Yi-Ting Chen

First, we decompose egocentric interactions into ego-thing and ego-stuff interaction, modeled by two GCNs.

Novel Concepts

Graph-RISE: Graph-Regularized Image Semantic Embedding

1 code implementation 14 Feb 2019 Da-Cheng Juan, Chun-Ta Lu, Zhen Li, Futang Peng, Aleksei Timofeev, Yi-Ting Chen, Yaxi Gao, Tom Duerig, Andrew Tomkins, Sujith Ravi

Learning image representations to capture fine-grained semantics has been a challenging and important task enabling many applications such as image search and clustering.

Clustering General Classification +4

Temporal Recurrent Networks for Online Action Detection

2 code implementations ICCV 2019 Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall

Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed.

Online Action Detection

Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning

no code implementations CVPR 2018 Vasili Ramanishka, Yi-Ting Chen, Teruhisa Misu, Kate Saenko

We present the Honda Research Institute Driving Dataset (HDD), a challenging dataset to enable research on learning driver behavior in real-life environments.

Scene Understanding

