no code implementations • 27 Nov 2023 • Shiyuan Huang, Robinson Piramuthu, Vicente Ordonez, Shih-Fu Chang, Gunnar A. Sigurdsson
From our experiments, we observe only a 5.2%-5.8% loss of performance when using just 10% of video lengths, which corresponds to 2-4 frames selected from each video.
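To make the sampling budget concrete, here is a minimal sketch of uniform frame subsampling under a fixed keep ratio; the function and its policy are illustrative assumptions, not the paper's exact selection method.

```python
import numpy as np

def sample_frames(num_frames: int, keep_ratio: float = 0.1) -> np.ndarray:
    """Uniformly keep ~keep_ratio of a video's frames (hypothetical helper;
    the paper's selection policy may differ)."""
    k = max(1, round(num_frames * keep_ratio))
    return np.linspace(0, num_frames - 1, k).round().astype(int)

# A 30-frame clip at a 10% budget keeps 3 frames, consistent with the
# 2-4 frames per video quoted above for typical clip lengths.
print(sample_frames(30))  # [ 0 14 29]
```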
no code implementations • 17 Oct 2023 • Shiyuan Huang, Siddarth Mamidanna, Shreedhar Jangam, Yilun Zhou, Leilani H. Gilpin
Through an extensive set of experiments, we find that ChatGPT's self-explanations perform on par with traditional ones, but are quite different from them according to various agreement metrics, while being much cheaper to produce (as they are generated along with the prediction).
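One common way to quantify agreement between two feature-attribution explanations is top-k word overlap; the sketch below is a simple instance of such a metric (the exact metrics used in the paper may differ), with hypothetical importance scores.

```python
def topk_agreement(expl_a: dict, expl_b: dict, k: int = 2) -> float:
    """Fraction of overlap between the k most important words of two
    word-importance explanations."""
    top = lambda e: set(sorted(e, key=lambda w: abs(e[w]), reverse=True)[:k])
    return len(top(expl_a) & top(expl_b)) / k

# Hypothetical scores: a ChatGPT self-explanation vs. an occlusion baseline.
self_expl = {"great": 0.9, "plot": 0.5, "movie": 0.2, "boring": -0.1}
occlusion = {"great": 0.7, "plot": 0.6, "movie": 0.1, "boring": -0.3}
print(topk_agreement(self_expl, occlusion))  # 1.0
```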
1 code implementation • CVPR 2023 • Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang
Vision Transformers (ViTs) have emerged to achieve impressive performance on many data-abundant computer vision tasks by capturing long-range dependencies among local features.
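For readers unfamiliar with the mechanism, the long-range dependency modeling comes from self-attention over patch tokens; below is a minimal single-head sketch (real ViTs add learned Q/K/V projections, multiple heads, and MLP blocks).

```python
import torch
import torch.nn.functional as F

def self_attention(x: torch.Tensor) -> torch.Tensor:
    """Every patch token attends to every other token, so distant image
    regions can influence each other within a single layer."""
    d = x.size(-1)
    attn = F.softmax(x @ x.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ x

tokens = torch.randn(1, 196, 64)  # 14x14 image patches, 64-dim embeddings
print(self_attention(tokens).shape)  # torch.Size([1, 196, 64])
```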
1 code implementation • CVPR 2023 • Jiawei Ma, Yulei Niu, Jincheng Xu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang
Generalized few-shot object detection aims to achieve precise detection on both base classes with abundant annotations and novel classes with limited training data.
1 code implementation • 28 Dec 2022 • Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang
For long videos, given a paragraph of description whose sentences describe different segments of the video, matching all sentence-clip pairs implicitly aligns the paragraph with the full video.
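A simple way to realize this implicit alignment is to score the paragraph against the video by matching each sentence to its best clip; the function below is a sketch under that assumption, not the paper's exact training objective.

```python
import torch
import torch.nn.functional as F

def paragraph_video_score(sent_emb: torch.Tensor, clip_emb: torch.Tensor) -> torch.Tensor:
    """Match every sentence to its most similar clip and average the scores,
    implicitly aligning the paragraph with the full video."""
    sim = F.normalize(sent_emb, dim=-1) @ F.normalize(clip_emb, dim=-1).T
    return sim.max(dim=1).values.mean()  # (num_sentences, num_clips) -> scalar

score = paragraph_video_score(torch.randn(4, 256), torch.randn(10, 256))
```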
1 code implementation • 15 Oct 2022 • Shiyuan Huang, Robinson Piramuthu, Shih-Fu Chang, Gunnar A. Sigurdsson
Specifically, we insert a lightweight Feature Compression Module (FeatComp) into a VideoQA model, which learns to extract tiny task-specific features, as compact as 10 bits, that are optimal for answering certain types of questions.
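A plausible reading of this compression step is a linear bottleneck followed by binarization with a straight-through estimator; the module below is a sketch under that assumption (the dimensions and the real FeatComp architecture are not specified here).

```python
import torch
import torch.nn as nn

class FeatComp(nn.Module):
    """Hypothetical compression module: project video features down to
    n_bits dimensions and binarize, storing each video in ~n_bits bits."""
    def __init__(self, in_dim: int = 768, n_bits: int = 10):
        super().__init__()
        self.proj = nn.Linear(in_dim, n_bits)

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        soft = self.proj(feat).sigmoid()
        hard = (soft > 0.5).float()
        # Straight-through estimator: the forward pass is binary, gradients
        # flow through the soft values so the module stays trainable.
        return hard + soft - soft.detach()

bits = FeatComp()(torch.randn(2, 768))  # (2, 10), binary values in the forward pass
```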
1 code implementation • CVPR 2023 • Xudong Lin, Simran Tiwari, Shiyuan Huang, Manling Li, Mike Zheng Shou, Heng Ji, Shih-Fu Chang
We surprisingly find that discrete text tokens coupled with a pretrained contrastive text model yield the best performance, which can even outperform the state-of-the-art on the iVQA and How2QA datasets without additional training on millions of video-text pairs.
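To illustrate the pipeline, the sketch below scores discrete video tokens against text candidates with an off-the-shelf contrastive text encoder; sentence-transformers is used here as a stand-in, and the tagger producing the tokens is hypothetical (the paper's exact models differ).

```python
from sentence_transformers import SentenceTransformer, util

text_model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in contrastive text model

video_tokens = "kitchen person chopping onion pan stove"  # tokens from a hypothetical visual tagger
candidates = ["cooking a meal", "playing football", "fixing a car"]

sims = util.cos_sim(text_model.encode(video_tokens), text_model.encode(candidates))
print(candidates[int(sims.argmax())])  # "cooking a meal"
```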
Ranked #1 on Video Question Answering on iVQA
no code implementations • 16 Apr 2022 • Guangxing Han, Long Chen, Jiawei Ma, Shiyuan Huang, Rama Chellappa, Shih-Fu Chang
Our approach is motivated by the high-level conceptual similarity between (metric-based) meta-learning and prompt-based learning, which learn generalizable few-shot and zero-shot object detection models, respectively, without fine-tuning.
1 code implementation • CVPR 2022 • Guangxing Han, Jiawei Ma, Shiyuan Huang, Long Chen, Shih-Fu Chang
Inspired by the recent work on vision transformers and vision-language transformers, we propose a novel Fully Cross-Transformer based model (FCT) for FSOD by incorporating cross-transformer into both the feature backbone and detection head.
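The core operation is cross-attention between query-image and support-image tokens; below is a minimal single-head sketch of that idea (the full FCT interleaves it throughout the backbone and detection head, which this does not reproduce).

```python
import torch
import torch.nn.functional as F

def cross_attention(query_tokens: torch.Tensor, support_tokens: torch.Tensor) -> torch.Tensor:
    """Query-image tokens attend to support (few-shot class) tokens, letting
    support appearance condition the query features."""
    d = query_tokens.size(-1)
    attn = F.softmax(query_tokens @ support_tokens.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ support_tokens

q = torch.randn(1, 100, 256)  # query-image feature tokens
s = torch.randn(1, 49, 256)   # support-image feature tokens
print(cross_attention(q, s).shape)  # torch.Size([1, 100, 256])
```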
1 code implementation • ICCV 2021 • Guangxing Han, Yicheng He, Shiyuan Huang, Jiawei Ma, Shih-Fu Chang
Few-shot object detection (FSOD) aims to detect never-seen objects using few examples.
2 code implementations • 15 Apr 2021 • Guangxing Han, Shiyuan Huang, Jiawei Ma, Yicheng He, Shih-Fu Chang
To improve fine-grained few-shot proposal classification, we propose a novel attentive feature alignment method that addresses the spatial misalignment between noisy proposals and few-shot classes, thereby improving few-shot object detection.
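One simple form such attentive alignment could take is similarity-weighted pooling of a proposal's spatial features against a class prototype; the sketch below is an assumption about the general idea, not the paper's exact method.

```python
import torch
import torch.nn.functional as F

def attentive_align(proposal_feat: torch.Tensor, class_proto: torch.Tensor) -> torch.Tensor:
    """Weight each spatial location of a noisy proposal by its similarity to
    the class prototype, down-weighting background before pooling."""
    weights = F.softmax(proposal_feat @ class_proto, dim=0)   # (HW,)
    return (weights.unsqueeze(-1) * proposal_feat).sum(0)     # pooled, aligned feature

aligned = attentive_align(torch.randn(49, 256), torch.randn(256))  # 7x7 spatial grid
```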
1 code implementation • CVPR 2022 • Shiyuan Huang, Jiawei Ma, Guangxing Han, Shih-Fu Chang
In this paper, we instead propose task-adaptive negative class envision for FSOR to integrate threshold tuning into the learning process.
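One way to fold threshold tuning into learning, sketched below as an assumption about the general idea rather than the paper's exact design, is to add learnable negative prototypes that compete with the class prototypes, so a proposal is rejected as unknown whenever a negative prototype wins.

```python
import torch
import torch.nn as nn

class NegativePrototypeHead(nn.Module):
    """Classify against class prototypes plus learnable 'negative' prototypes;
    winning negatives mean 'unknown', so no hand-tuned score threshold is needed."""
    def __init__(self, feat_dim: int = 256, num_classes: int = 10, num_neg: int = 3):
        super().__init__()
        self.protos = nn.Parameter(torch.randn(num_classes + num_neg, feat_dim))
        self.num_classes = num_classes

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        pred = (feat @ self.protos.T).argmax(dim=-1)
        return torch.where(pred < self.num_classes, pred,
                           torch.full_like(pred, -1))  # -1 marks "unknown"

labels = NegativePrototypeHead()(torch.randn(8, 256))
```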
no code implementations • 10 Dec 2019 • Shiyuan Huang, Xudong Lin, Svebor Karaman, Shih-Fu Chang
Recent works instead use modern compressed video modalities as an alternative to the RGB spatial stream and improve the inference speed by orders of magnitude.
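To make the idea concrete, here is a toy two-branch model that consumes the I-frames and motion vectors already present in a compressed stream instead of fully decoded RGB; the shapes and branch designs are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Heavy branch for the sparse decoded I-frames, light branch for the codec's
# motion vectors; skipping full RGB decoding is what yields the speedup.
iframe_net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.AdaptiveAvgPool2d(1), nn.Flatten())
motion_net = nn.Sequential(nn.Conv2d(2, 8, 3, padding=1), nn.AdaptiveAvgPool2d(1), nn.Flatten())

iframes = torch.randn(4, 3, 224, 224)  # a few decoded I-frames
motions = torch.randn(4, 2, 224, 224)  # (dx, dy) motion vectors from the codec
clip_feat = torch.cat([iframe_net(iframes), motion_net(motions)], dim=1).mean(0)
print(clip_feat.shape)  # torch.Size([24])
```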