Search Results for author: Guangxing Han

Found 15 papers, 9 papers with code

Mitigating Dialogue Hallucination for Large Multi-modal Models via Adversarial Instruction Tuning

no code implementations • 15 Mar 2024 • Dongmin Park, Zhaofang Qian, Guangxing Han, Ser-Nam Lim

To precisely measure this, we first present an evaluation benchmark by extending popular multi-modal benchmark datasets with prepended hallucinatory dialogues generated by our novel Adversarial Question Generator, which can automatically generate image-related yet adversarial dialogues by adopting adversarial attacks on LMMs.

Hallucination • Instruction Following • +1

Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model

no code implementations • 19 Dec 2023 • Shraman Pramanick, Guangxing Han, Rui Hou, Sayan Nag, Ser-Nam Lim, Nicolas Ballas, Qifan Wang, Rama Chellappa, Amjad Almahairi

In this work, we introduce VistaLLM, a powerful visual system that addresses coarse- and fine-grained VL tasks over single and multiple input images using a unified framework.

Attribute • Language Modelling • +1

Supervised Masked Knowledge Distillation for Few-Shot Transformers

1 code implementation • CVPR 2023 • Han Lin, Guangxing Han, Jiawei Ma, Shiyuan Huang, Xudong Lin, Shih-Fu Chang

Vision Transformers (ViTs) emerge to achieve impressive performance on many data-abundant computer vision tasks by capturing long-range dependencies among local features.

Few-Shot Learning • Inductive Bias • +1

DiGeo: Discriminative Geometry-Aware Learning for Generalized Few-Shot Object Detection

1 code implementation • CVPR 2023 • Jiawei Ma, Yulei Niu, Jincheng Xu, Shiyuan Huang, Guangxing Han, Shih-Fu Chang

Generalized few-shot object detection aims to achieve precise detection on both base classes with abundant annotations and novel classes with limited training data.

Few-Shot Object Detection • object-detection

TempCLR: Temporal Alignment Representation with Contrastive Learning

1 code implementation • 28 Dec 2022 • Yuncong Yang, Jiawei Ma, Shiyuan Huang, Long Chen, Xudong Lin, Guangxing Han, Shih-Fu Chang

For long videos, given a paragraph of description where the sentences describe different segments of the video, by matching all sentence-clip pairs, the paragraph and the full video are aligned implicitly.

Ranked #2 on Long Video Retrieval (Background Removed) on YouCook2

Contrastive Learning • Dynamic Time Warping • +7

Weakly-Supervised Temporal Article Grounding

1 code implementation • 22 Oct 2022 • Long Chen, Yulei Niu, Brian Chen, Xudong Lin, Guangxing Han, Christopher Thomas, Hammad Ayyubi, Heng Ji, Shih-Fu Chang

Specifically, given an article and a relevant video, WSAG aims to localize all "groundable" sentences to the video, and these sentences are possibly at different semantic scales.

Natural Language Queries • Sentence • +1

Explicit Image Caption Editing

1 code implementation • 20 Jul 2022 • Zhen Wang, Long Chen, Wenbo Ma, Guangxing Han, Yulei Niu, Jian Shao, Jun Xiao

Given an image and a reference caption, the image caption editing task aims to correct the misalignment errors and generate a refined caption.

Sentence

Multi-Modal Few-Shot Object Detection with Meta-Learning-Based Cross-Modal Prompting

no code implementations • 16 Apr 2022 • Guangxing Han, Long Chen, Jiawei Ma, Shiyuan Huang, Rama Chellappa, Shih-Fu Chang

Our approach is motivated by the high-level conceptual similarity of (metric-based) meta-learning and prompt-based learning to learn generalizable few-shot and zero-shot object detection models respectively without fine-tuning.

Few-Shot Learning • Few-Shot Object Detection • +3

Few-Shot Object Detection with Fully Cross-Transformer

1 code implementation • CVPR 2022 • Guangxing Han, Jiawei Ma, Shiyuan Huang, Long Chen, Shih-Fu Chang

Inspired by the recent work on vision transformers and vision-language transformers, we propose a novel Fully Cross-Transformer based model (FCT) for FSOD by incorporating cross-transformer into both the feature backbone and detection head.

Few-Shot Object Detection • Metric Learning • +2

The Met Dataset: Instance-level Recognition for Artworks

no code implementations • 3 Feb 2022 • Nikolaos-Antonios Ypsilantis, Noa Garcia, Guangxing Han, Sarah Ibrahimi, Nanne van Noord, Giorgos Tolias

Testing is primarily performed on photos taken by museum guests depicting exhibits, which introduces a distribution shift between training and testing.

Contrastive Learning • Out-of-Distribution Detection

Query Adaptive Few-Shot Object Detection with Heterogeneous Graph Convolutional Networks

1 code implementation • ICCV 2021 • Guangxing Han, Yicheng He, Shiyuan Huang, Jiawei Ma, Shih-Fu Chang

Few-shot object detection (FSOD) aims to detect never-seen objects using few examples.

Few-Shot Object Detection • Meta-Learning • +1

Partner-Assisted Learning for Few-Shot Image Classification

no code implementations • ICCV 2021 • Jiawei Ma, Hanchen Xie, Guangxing Han, Shih-Fu Chang, Aram Galstyan, Wael Abd-Almageed

In this paper, we focus on the design of training strategy to obtain an elemental representation such that the prototype of each novel class can be estimated from a few labeled samples.

Classification • Few-Shot Image Classification • +1

Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment

2 code implementations • 15 Apr 2021 • Guangxing Han, Shiyuan Huang, Jiawei Ma, Yicheng He, Shih-Fu Chang

To improve the fine-grained few-shot proposal classification, we propose a novel attentive feature alignment method to address the spatial misalignment between the noisy proposals and few-shot classes, thus improving the performance of few-shot object detection.

Few-Shot Learning • Few-Shot Object Detection • +3

Task-Adaptive Negative Envision for Few-Shot Open-Set Recognition

1 code implementation • CVPR 2022 • Shiyuan Huang, Jiawei Ma, Guangxing Han, Shih-Fu Chang

In this paper, we instead propose task-adaptive negative class envision for FSOR to integrate threshold tuning into the learning process.

Few-Shot Learning • Open Set Learning

COVID-19 Literature Knowledge Graph Construction and Drug Repurposing Report Generation

no code implementations • NAACL 2021 • Qingyun Wang, Manling Li, Xuan Wang, Nikolaus Parulian, Guangxing Han, Jiawei Ma, Jingxuan Tu, Ying Lin, Haoran Zhang, Weili Liu, Aabhas Chauhan, Yingjun Guan, Bangzheng Li, Ruisong Li, Xiangchen Song, Yi R. Fung, Heng Ji, Jiawei Han, Shih-Fu Chang, James Pustejovsky, Jasmine Rah, David Liem, Ahmed Elsayed, Martha Palmer, Clare Voss, Cynthia Schneider, Boyan Onyshkevych

To combat COVID-19, both clinicians and scientists need to digest vast amounts of relevant biomedical knowledge in scientific literature to understand the disease mechanism and related biological functions.

graph construction • Knowledge Graphs • +1
