Search Results for author: Hung-Ting Su

Found 23 papers, 10 papers with code

Tracking-Assisted Object Detection with Event Cameras

no code implementations • 27 Mar 2024 Ting-Kang Yen, Igor Morawski, Shusil Dangi, Kai He, Chung-Yi Lin, Jia-Fong Yeh, Hung-Ting Su, Winston Hsu

However, feature asynchronism and sparsity cause objects to become invisible when they have no relative motion to the camera, posing a significant challenge for the task.

Attribute · Object +2

TelTrans: Applying Multi-Type Telecom Data to Transportation Evaluation and Prediction via Multifaceted Graph Modeling

no code implementations • 6 Jan 2024 ChungYi Lin, Shen-Lung Tung, Hung-Ting Su, Winston H. Hsu

To address the limitations of traffic prediction from location-bound detectors, we present Geographical Cellular Traffic (GCT) flow, a novel data source that leverages the extensive coverage of cellular traffic to capture mobility patterns.

Traffic Prediction

Unsupervised Adversarial Detection without Extra Model: Training Loss Should Change

1 code implementation • 7 Aug 2023 Chien Cheng Chyou, Hung-Ting Su, Winston H. Hsu

Adversarial robustness poses a critical challenge in the deployment of deep learning models for real-world applications.

Adversarial Robustness

Language Models are Causal Knowledge Extractors for Zero-shot Video Question Answering

no code implementations • 7 Apr 2023 Hung-Ting Su, Yulei Niu, Xudong Lin, Winston H. Hsu, Shih-Fu Chang

Causal Video Question Answering (CVidQA) queries not only association or temporal relations but also causal relations in a video.

Question Answering · Question Generation +3

Learning Fine-Grained Visual Understanding for Video Question Answering via Decoupling Spatial-Temporal Modeling

1 code implementation • 8 Oct 2022 Hsin-Ying Lee, Hung-Ting Su, Bing-Chen Tsai, Tsung-Han Wu, Jia-Fong Yeh, Winston H. Hsu

While recent large-scale video-language pre-training has made great progress in video question answering, the spatial modeling of video-language models is less fine-grained than that of image-language models, and existing practices of temporal modeling also suffer from weak and noisy alignment between modalities.

Language Modelling · Question Answering +1

MonoDTR: Monocular 3D Object Detection with Depth-Aware Transformer

1 code implementation • CVPR 2022 Kuan-Chih Huang, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu

Moreover, different from conventional pixel-wise positional encodings, we introduce a novel depth positional encoding (DPE) to inject depth positional hints into transformers.

Autonomous Driving · Monocular 3D Object Detection +2
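The snippet above only names the depth positional encoding (DPE) without defining it. As a rough illustration of the idea of injecting depth rather than pixel-coordinate hints, here is a minimal sketch assuming standard sinusoidal embeddings looked up by discretized depth bin; the bin count, normalization, and function names are assumptions for illustration, not details from the paper:

```python
import numpy as np

def sinusoidal_encoding(positions, d_model):
    # Standard transformer sinusoidal encoding for a 1-D list of positions.
    pe = np.zeros((len(positions), d_model))
    div = np.exp(np.arange(0, d_model, 2) * (-np.log(10000.0) / d_model))
    pe[:, 0::2] = np.sin(positions[:, None] * div)
    pe[:, 1::2] = np.cos(positions[:, None] * div)
    return pe

def depth_positional_encoding(depth_map, num_bins, d_model):
    # Discretize per-pixel depth into bins, then look up a sinusoidal
    # embedding per depth bin instead of per pixel coordinate.
    bins = np.clip(
        (depth_map / depth_map.max() * (num_bins - 1)).astype(int),
        0, num_bins - 1,
    )
    table = sinusoidal_encoding(np.arange(num_bins), d_model)
    return table[bins]  # shape (H, W, d_model)
```

In a transformer, the resulting (H, W, d_model) tensor would be added to the flattened image features before attention, so tokens at similar depths receive similar positional hints.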

Anomaly-Aware Semantic Segmentation by Leveraging Synthetic-Unknown Data

no code implementations • 29 Nov 2021 Guan-Rong Lu, Yueh-Cheng Liu, Tung-I Chen, Hung-Ting Su, Tsung-Han Wu, Winston H. Hsu

We design a new Masked Gradient Update (MGU) module to generate auxiliary data along the boundary of in-distribution data points.

Anomaly Detection · Autonomous Driving +3

Multivariate and Propagation Graph Attention Network for Spatial-Temporal Prediction with Outdoor Cellular Traffic

1 code implementation • 18 Aug 2021 Chung-Yi Lin, Hung-Ting Su, Shen-Lung Tung, Winston H. Hsu

Furthermore, we propose a new model for multivariate spatial-temporal prediction, mainly consisting of two extended graph attention networks (GATs).

Graph Attention
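As a refresher on the building block named above, here is a minimal single-head graph attention layer in the spirit of the original GAT formulation; the dense loops, the 0.2 LeakyReLU slope, and the variable names are illustrative assumptions, not details of this paper's extended variants:

```python
import numpy as np

def gat_layer(X, A, W, a):
    # X: (N, F) node features; A: (N, N) adjacency with self-loops
    # W: (F, F_out) projection; a: (2 * F_out,) attention vector
    H = X @ W                      # project node features
    N = H.shape[0]
    e = np.zeros((N, N))
    for i in range(N):
        for j in range(N):
            # raw attention score from the concatenated pair of features
            e[i, j] = np.concatenate([H[i], H[j]]) @ a
    e = np.maximum(0.2 * e, e)     # LeakyReLU (slope 0.2)
    e = np.where(A > 0, e, -1e9)   # mask non-edges before softmax
    alpha = np.exp(e - e.max(axis=1, keepdims=True))
    alpha = alpha / alpha.sum(axis=1, keepdims=True)
    return alpha @ H               # attention-weighted neighbor aggregation
```

Each node's output is a softmax-weighted combination of its neighbors' projected features, which is what lets a spatial-temporal model weight nearby road segments or cells unequally.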

TrUMAn: Trope Understanding in Movies and Animations

no code implementations • 10 Aug 2021 Hung-Ting Su, Po-Wei Shen, Bing-Chen Tsai, Wen-Feng Cheng, Ke-Jyun Wang, Winston H. Hsu

By tackling the trope understanding task and equipping machines with deeper cognition skills, data mining applications and algorithms could be taken to the next level.

Recommendation Systems

ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation

1 code implementation • ICCV 2021 Tsung-Han Wu, Yueh-Cheng Liu, Yu-Kai Huang, Hsin-Ying Lee, Hung-Ting Su, Ping-Chia Huang, Winston H. Hsu

Despite the success of deep learning on supervised point cloud semantic segmentation, obtaining large-scale point-by-point manual annotations is still a significant challenge.

Active Learning · Scene Understanding +1

Class-agnostic Few-shot Object Counting

1 code implementation • WACV 2021 Shuo-Diao Yang, Hung-Ting Su, Winston H. Hsu, Wen-Chin Chen

Instead of counting a pre-defined class, our model counts instances based on input reference images, reducing the substantial cost of data collection, training, and parameter tuning for each new object class.

Object · Object Counting

OCID-Ref: A 3D Robotic Dataset with Embodied Language for Clutter Scene Grounding

1 code implementation • NAACL 2021 Ke-Jyun Wang, Yun-Hsuan Liu, Hung-Ting Su, Jen-Wei Wang, Yu-Siang Wang, Winston H. Hsu, Wen-Chin Chen

To effectively apply robots in working environments and assist humans, it is essential to develop and evaluate how visual grounding (VG) can affect machine performance on occluded objects.

Referring Expression · Referring Expression Segmentation +1

Dual-Awareness Attention for Few-Shot Object Detection

1 code implementation • 24 Feb 2021 Tung-I Chen, Yueh-Cheng Liu, Hung-Ting Su, Yu-Cheng Chang, Yu-Hsiang Lin, Jia-Fong Yeh, Wen-Chin Chen, Winston H. Hsu

While recent progress has significantly boosted few-shot classification (FSC) performance, few-shot object detection (FSOD) remains challenging for modern learning systems.

Few-Shot Learning · Few-Shot Object Detection +2

Situation and Behavior Understanding by Trope Detection on Films

1 code implementation • 19 Jan 2021 Chen-Hsi Chang, Hung-Ting Su, Jui-heng Hsu, Yu-Siang Wang, Yu-Cheng Chang, Zhe Yu Liu, Ya-Liang Chang, Wen-Feng Cheng, Ke-Jyun Wang, Winston H. Hsu

Experimental results demonstrate that modern models, including BERT contextual embeddings, movie tag prediction systems, and relational networks, perform at most 37% of human performance (23.97/64.87) in terms of F1 score.

Reading Comprehension · Sentence +1

GDN: A Coarse-To-Fine (C2F) Representation for End-To-End 6-DoF Grasp Detection

no code implementations • 21 Oct 2020 Kuang-Yu Jeng, Yueh-Cheng Liu, Zhe Yu Liu, Jen-Wei Wang, Ya-Liang Chang, Hung-Ting Su, Winston H. Hsu

We propose an end-to-end grasp detection network, the Grasp Detection Network (GDN), coupled with a novel coarse-to-fine (C2F) grasp representation, to detect diverse and accurate 6-DoF grasps based on point clouds.

Expanding Sparse Guidance for Stereo Matching

no code implementations • 24 Apr 2020 Yu-Kai Huang, Yueh-Cheng Liu, Tsung-Han Wu, Hung-Ting Su, Winston H. Hsu

The performance of image-based stereo estimation suffers from lighting variations, repetitive patterns, and homogeneous appearance.

Domain Adaptation · Stereo Matching
