Search Results for author: Yi-Ting Chen

Found 37 papers, 10 papers with code

RiskBench: A Scenario-based Benchmark for Risk Identification

1 code implementation • 4 Dec 2023 • Chi-Hsi Kung, Chieh-Chi Yang, Pang-Yuan Pao, Shu-Wei Lu, Pin-Lun Chen, Hsin-Cheng Lu, Yi-Ting Chen

Intelligent driving systems aim to achieve a zero-collision mobility experience, requiring interdisciplinary efforts to enhance safety performance.

Decision Making

Action-slot: Visual Action-centric Representations for Multi-label Atomic Activity Recognition in Traffic Scenes

no code implementations • 29 Nov 2023 • Chi-Hsi Kung, Shu-Wei Lu, Yi-Hsuan Tsai, Yi-Ting Chen

In this paper, we introduce Action-slot, a slot attention-based approach that learns visual action-centric representations, capturing both motion and contextual information.

Action Recognition

AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models

1 code implementation • 19 Sep 2023 • Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng, Hung-Yi Lee

Audio-visual representation learning aims to develop systems with human-like perception by utilizing correlation between auditory and visual information.

audio-visual learning Representation Learning

UrbanIR: Large-Scale Urban Scene Inverse Rendering from a Single Video

no code implementations • 15 Jun 2023 • Zhi-Hao Lin, Bohan Liu, Yi-Ting Chen, David Forsyth, Jia-Bin Huang, Anand Bhattad, Shenlong Wang

UrbanIR uses a novel loss to make very good estimates of shadow volumes in the original scene.

Inverse Rendering

CLR-GAM: Contrastive Point Cloud Learning with Guided Augmentation and Feature Mapping

no code implementations • 28 Feb 2023 • Srikanth Malla, Yi-Ting Chen

Point cloud data plays an essential role in robotics and self-driving applications.

3D Point Cloud Classification Contrastive Learning +3

Shape-aware Text-driven Layered Video Editing

no code implementations • CVPR 2023 • Yao-Chih Lee, Ji-Ze Genevieve Jang, Yi-Ting Chen, Elizabeth Qiu, Jia-Bin Huang

Temporal consistency is essential for video editing applications.

Video Editing

Learning Road Scene-level Representations via Semantic Region Prediction

no code implementations • 2 Jan 2023 • Zihao Xiao, Alan Yuille, Yi-Ting Chen

In this work, we tackle two vital tasks in automated driving systems, i.e., driver intent prediction and risk object identification from egocentric images.

Ordered Atomic Activity for Fine-grained Interactive Traffic Scenario Understanding

no code implementations • ICCV 2023 • Nakul Agarwal, Yi-Ting Chen

We introduce a novel representation called Ordered Atomic Activity for interactive scenario understanding.

Activity Recognition

Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings

no code implementations • 31 Oct 2022 • I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou

In summary, our results confirm the effectiveness of our proposed model for the AVSS task with proper fine-tuning strategies, demonstrating that multi-modal self-supervised embeddings obtained from AV-HuBERT can be generalized to audio-visual regression tasks.

Automatic Speech Recognition (ASR) +6

Orbeez-SLAM: A Real-time Monocular Visual SLAM with ORB Features and NeRF-realized Mapping

1 code implementation • 27 Sep 2022 • Chi-Ming Chung, Yang-Che Tseng, Ya-Ching Hsu, Xiang-Qian Shi, Yun-Hung Hua, Jia-Fong Yeh, Wen-Chin Chen, Yi-Ting Chen, Winston H. Hsu

A spatial AI that can perform complex tasks through visual signals and cooperate with humans is highly anticipated.

Visual Odometry

MetaDIP: Accelerating Deep Image Prior with Meta Learning

no code implementations • 18 Sep 2022 • Kevin Zhang, Mingyang Xie, Maharshi Gor, Yi-Ting Chen, Yvonne Zhou, Christopher A. Metzler

Deep image prior (DIP) is a recently proposed technique for solving imaging inverse problems by fitting the reconstructed images to the output of an untrained convolutional neural network.

Denoising Meta-Learning +1

ADAM Challenge: Detecting Age-related Macular Degeneration from Fundus Images

no code implementations • 16 Feb 2022 • Huihui Fang, Fei Li, Huazhu Fu, Xu Sun, Xingxing Cao, Fengbin Lin, Jaemin Son, Sunho Kim, Gwenole Quellec, Sarah Matta, Sharath M Shankaranarayana, Yi-Ting Chen, Chuen-heng Wang, Nisarg A. Shah, Chia-Yen Lee, Chih-Chung Hsu, Hai Xie, Baiying Lei, Ujjwal Baid, Shubham Innani, Kang Dang, Wenxiu Shi, Ravi Kamble, Nitin Singhal, Ching-Wei Wang, Shih-Chang Lo, José Ignacio Orlando, Hrvoje Bogunović, Xiulan Zhang, Yanwu Xu, iChallenge-AMD study group

The ADAM challenge consisted of four tasks which cover the main aspects of detecting and characterizing AMD from fundus images, including detection of AMD, detection and segmentation of optic disc, localization of fovea, and detection and segmentation of lesions.

Semi-supervised 3D Object Detection via Temporal Graph Neural Networks

no code implementations • 1 Feb 2022 • Jianren Wang, Haiming Gang, Siddharth Ancha, Yi-Ting Chen, David Held

However, these detectors usually require training on large amounts of annotated data that is expensive and time-consuming to collect.

3D Object Detection Autonomous Driving +1

Stage Conscious Attention Network (SCAN): A Demonstration-Conditioned Policy for Few-Shot Imitation

no code implementations • 4 Dec 2021 • Jia-Fong Yeh, Chi-Ming Chung, Hung-Ting Su, Yi-Ting Chen, Winston H. Hsu

(3) Learning from a different expert.

Few-Shot Imitation Learning Imitation Learning

Combined Scaling for Zero-shot Transfer Learning

no code implementations • 19 Nov 2021 • Hieu Pham, Zihang Dai, Golnaz Ghiasi, Kenji Kawaguchi, Hanxiao Liu, Adams Wei Yu, Jiahui Yu, Yi-Ting Chen, Minh-Thang Luong, Yonghui Wu, Mingxing Tan, Quoc V. Le

Second, while increasing the dataset size and the model size has been the de facto method to improve the performance of deep learning models like BASIC, the effect of a large contrastive batch size on such contrastive-trained image-text models is not well-understood.

Ranked #3 on Zero-Shot Transfer Image Classification on ImageNet-Sketch

Classification Contrastive Learning +3

DROID: Driver-centric Risk Object Identification

no code implementations • 24 Jun 2021 • Chengxi Li, Stanley H. Chan, Yi-Ting Chen

Identification of high-risk driving situations is generally approached through collision risk estimation or accident pattern recognition.

Causal Inference Object

CARLS: Cross-platform Asynchronous Representation Learning System

1 code implementation • 26 May 2021 • Chun-Ta Lu, Yun Zeng, Da-Cheng Juan, Yicheng Fan, Zhe Li, Jan Dlabal, Yi-Ting Chen, Arjun Gopalan, Allan Heydon, Chun-Sung Ferng, Reah Miyara, Ariel Fuxman, Futang Peng, Zhen Li, Tom Duerig, Andrew Tomkins

In this work, we propose CARLS, a novel framework for augmenting the capacity of existing deep learning frameworks by enabling multiple components -- model trainers, knowledge makers and knowledge banks -- to concertedly work together in an asynchronous fashion across hardware platforms.

Representation Learning

Multimodal Object Detection via Probabilistic Ensembling

2 code implementations • 7 Apr 2021 • Yi-Ting Chen, Jinghao Shi, Zelin Ye, Christoph Mertz, Deva Ramanan, Shu Kong

Object detection with multimodal inputs can improve many safety-critical systems such as autonomous vehicles (AVs).

Autonomous Vehicles Object +2

Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision

4 code implementations • 11 Feb 2021 • Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, YunHsuan Sung, Zhen Li, Tom Duerig

In this paper, we leverage a noisy dataset of over one billion image alt-text pairs, obtained without expensive filtering or post-processing steps in the Conceptual Captions dataset.

Ranked #1 on Image Classification on VTAB-1k (using extra training data)

Cross-Modal Retrieval Fine-Grained Image Classification +6

Unsupervised Domain Adaptation for Spatio-Temporal Action Localization

no code implementations • 19 Oct 2020 • Nakul Agarwal, Yi-Ting Chen, Behzad Dariush, Ming-Hsuan Yang

Spatio-temporal action localization is an important problem in computer vision that involves detecting where and when activities occur, and therefore requires modeling of both spatial and temporal features.

Object Detection +3

Uncertainty-aware Self-supervised 3D Data Association

1 code implementation • 18 Aug 2020 • Jianren Wang, Siddharth Ancha, Yi-Ting Chen, David Held

Instead, we propose leveraging vast unlabeled datasets by self-supervised metric learning of 3D object trackers, with a focus on data association.

Metric Learning Object +2

MPDD: A Multi-Party Dialogue Dataset for Analysis of Emotions and Interpersonal Relationships

no code implementations • LREC 2020 • Yi-Ting Chen, Hen-Hsen Huang, Hsin-Hsi Chen

In this paper, we collect the conversations from TV series scripts, and annotate emotion and interpersonal relationship labels on each utterance.

Relation Relation Classification

Deep Quaternion Features for Privacy Protection

no code implementations • 18 Mar 2020 • Hao Zhang, Yi-Ting Chen, Liyao Xiang, Haotian Ma, Jie Shi, Quanshi Zhang

We propose a method to revise the neural network to construct the quaternion-valued neural network (QNN), in order to prevent intermediate-layer features from leaking input information.

Privacy Preserving

Who Make Drivers Stop? Towards Driver-centric Risk Assessment: Risk Object Identification via Causal Inference

no code implementations • 5 Mar 2020 • Chengxi Li, Stanley H. Chan, Yi-Ting Chen

We formulate the task as the cause-effect problem and present a novel two-stage risk object identification framework based on causal inference with the proposed object-level manipulable driving model.

Causal Inference Object +1

Grounding Human-to-Vehicle Advice for Self-driving Vehicles

no code implementations • CVPR 2019 • Jinkyu Kim, Teruhisa Misu, Yi-Ting Chen, Ashish Tawari, John Canny

We show that taking advice improves the performance of the end-to-end network, while the network cues on a variety of visual features that are provided by advice.

Learning 3D-aware Egocentric Spatial-Temporal Interaction via Graph Convolutional Networks

no code implementations • 20 Sep 2019 • Chengxi Li, Yue Meng, Stanley H. Chan, Yi-Ting Chen

First, we decompose egocentric interactions into ego-thing and ego-stuff interaction, modeled by two GCNs.

Novel Concepts

The H3D Dataset for Full-Surround 3D Multi-Object Detection and Tracking in Crowded Urban Scenes

no code implementations • 4 Mar 2019 • Abhishek Patil, Srikanth Malla, Haiming Gang, Yi-Ting Chen

Finally, sources of errors are discussed for the development of future algorithms.

3D Object Detection Object +2

Graph-RISE: Graph-Regularized Image Semantic Embedding

1 code implementation • 14 Feb 2019 • Da-Cheng Juan, Chun-Ta Lu, Zhen Li, Futang Peng, Aleksei Timofeev, Yi-Ting Chen, Yaxi Gao, Tom Duerig, Andrew Tomkins, Sujith Ravi

Learning image representations to capture fine-grained semantics has been a challenging and important task enabling many applications such as image search and clustering.

Ranked #11 on Image Classification on iNaturalist

Clustering General Classification +4

Unsupervised Data Uncertainty Learning in Visual Retrieval Systems

no code implementations • 7 Feb 2019 • Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis

We introduce an unsupervised formulation to estimate heteroscedastic uncertainty in retrieval systems.

Retrieval Video Retrieval

Boosting Standard Classification Architectures Through a Ranking Regularizer

1 code implementation • 24 Jan 2019 • Ahmed Taha, Yi-Ting Chen, Teruhisa Misu, Abhinav Shrivastava, Larry Davis

We employ triplet loss as a feature embedding regularizer to boost classification performance.

Classification General Classification

Exploring Uncertainty in Conditional Multi-Modal Retrieval Systems

no code implementations • 23 Jan 2019 • Ahmed Taha, Yi-Ting Chen, Xitong Yang, Teruhisa Misu, Larry Davis

We cast visual retrieval as a regression problem by posing triplet loss as a regression loss.

Action Understanding Person Re-Identification +2

Temporal Recurrent Networks for Online Action Detection

2 code implementations • ICCV 2019 • Mingze Xu, Mingfei Gao, Yi-Ting Chen, Larry S. Davis, David J. Crandall

Most work on temporal action detection is formulated as an offline problem, in which the start and end times of actions are determined after the entire video is fully observed.

Ranked #12 on Online Action Detection on TVSeries

Online Action Detection

Toward Driving Scene Understanding: A Dataset for Learning Driver Behavior and Causal Reasoning

no code implementations • CVPR 2018 • Vasili Ramanishka, Yi-Ting Chen, Teruhisa Misu, Kate Saenko

We present the Honda Research Institute Driving Dataset (HDD), a challenging dataset to enable research on learning driver behavior in real-life environments.

Scene Understanding