Search Results for author: Jingkang Yang

Found 29 papers, 25 papers with code

LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models

1 code implementation 17 Jul 2024 Kaichen Zhang, Bo Li, Peiyuan Zhang, Fanyi Pu, Joshua Adrian Cahyono, Kairui Hu, Shuai Liu, Yuanhan Zhang, Jingkang Yang, Chunyuan Li, Ziwei Liu

To approach this evaluation trilemma, we further introduce LMMS-EVAL LITE, a pruned evaluation toolkit that emphasizes both coverage and efficiency.

Benchmarking Language Modelling +1

Long Context Transfer from Language to Vision

2 code implementations 24 Jun 2024 Peiyuan Zhang, Kaichen Zhang, Bo Li, Guangtao Zeng, Jingkang Yang, Yuanhan Zhang, Ziyue Wang, Haoran Tan, Chunyuan Li, Ziwei Liu

By simply extrapolating the context length of the language backbone, we enable LMMs to comprehend orders of magnitude more visual tokens without any video training.

Language Modelling

4D Panoptic Scene Graph Generation

3 code implementations NeurIPS 2023 Jingkang Yang, Jun Cen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu

To facilitate research in this new area, we build a richly annotated PSG-4D dataset consisting of 3K RGB-D videos with a total of 1M frames, each of which is labeled with 4D panoptic segmentation masks as well as fine-grained, dynamic scene graphs.

4D Panoptic Segmentation Graph Generation +5

Panoptic Video Scene Graph Generation

3 code implementations CVPR 2023 Jingkang Yang, Wenxuan Peng, Xiangtai Li, Zujin Guo, Liangyu Chen, Bo Li, Zheng Ma, Kaiyang Zhou, Wayne Zhang, Chen Change Loy, Ziwei Liu

PVSG relates to the existing video scene graph generation (VidSGG) problem, which focuses on temporal interactions between humans and objects grounded with bounding boxes in videos.

Graph Generation Panoptic Scene Graph Generation +5

OtterHD: A High-Resolution Multi-modality Model

1 code implementation 7 Nov 2023 Bo Li, Peiyuan Zhang, Jingkang Yang, Yuanhan Zhang, Fanyi Pu, Ziwei Liu

In this paper, we present OtterHD-8B, an innovative multimodal model evolved from Fuyu-8B, specifically engineered to interpret high-resolution visual inputs with granular precision.

Visual Question Answering

Pair then Relation: Pair-Net for Panoptic Scene Graph Generation

1 code implementation 17 Jul 2023 Jinghao Wang, Zhengyu Wen, Xiangtai Li, Zujin Guo, Jingkang Yang, Ziwei Liu

Panoptic Scene Graph (PSG) is a challenging task in Scene Graph Generation (SGG) that aims to create a more comprehensive scene graph representation using panoptic segmentation instead of boxes.

Graph Generation Panoptic Scene Graph Generation +2

FunQA: Towards Surprising Video Comprehension

1 code implementation 26 Jun 2023 Binzhu Xie, Sicheng Zhang, Zitang Zhou, Bo Li, Yuanhan Zhang, Jack Hessel, Jingkang Yang, Ziwei Liu

Surprising videos, such as funny clips, creative performances, or visual illusions, attract significant attention.

Question Answering Text Generation +3

SAD: Segment Any RGBD

1 code implementation 23 May 2023 Jun Cen, Yizheng Wu, Kewei Wang, Xingyi Li, Jingkang Yang, Yixuan Pei, Lingdong Kong, Ziwei Liu, Qifeng Chen

The Segment Anything Model (SAM) has demonstrated its effectiveness in segmenting any part of 2D RGB images.

3D Panoptic Segmentation Open Vocabulary Semantic Segmentation +2

Otter: A Multi-Modal Model with In-Context Instruction Tuning

1 code implementation 5 May 2023 Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Jingkang Yang, Ziwei Liu

Large language models (LLMs) have demonstrated significant universal capabilities as few/zero-shot learners in various tasks due to their pre-training on vast amounts of text data, as exemplified by GPT-3, which boosted to InstructGPT and ChatGPT, effectively following natural language instructions to accomplish real-world tasks.

In-Context Learning Instruction Following +2

OpenOOD: Benchmarking Generalized Out-of-Distribution Detection

4 code implementations 13 Oct 2022 Jingkang Yang, Pengyun Wang, Dejian Zou, Zitang Zhou, Kunyuan Ding, Wenxuan Peng, Haoqi Wang, Guangyao Chen, Bo Li, Yiyou Sun, Xuefeng Du, Kaiyang Zhou, Wayne Zhang, Dan Hendrycks, Yixuan Li, Ziwei Liu

Out-of-distribution (OOD) detection is vital to safety-critical machine learning applications and has thus been extensively studied, with a plethora of methods developed in the literature.

Anomaly Detection Benchmarking +3

On-Device Domain Generalization

2 code implementations 15 Sep 2022 Kaiyang Zhou, Yuanhan Zhang, Yuhang Zang, Jingkang Yang, Chen Change Loy, Ziwei Liu

Another interesting observation is that the teacher-student gap on out-of-distribution data is bigger than that on in-distribution data, which highlights the capacity mismatch issue as well as the shortcoming of KD.

Data Augmentation Domain Generalization +2

Panoptic Scene Graph Generation

1 code implementation 22 Jul 2022 Jingkang Yang, Yi Zhe Ang, Zujin Guo, Kaiyang Zhou, Wayne Zhang, Ziwei Liu

Existing research addresses scene graph generation (SGG) -- a critical technology for scene understanding in images -- from a detection perspective, i.e., objects are detected using bounding boxes followed by prediction of their pairwise relationships.

Benchmarking Panoptic Scene Graph Generation +1

Sparse Mixture-of-Experts are Domain Generalizable Learners

2 code implementations 8 Jun 2022 Bo Li, Yifei Shen, Jingkang Yang, Yezhen Wang, Jiawei Ren, Tong Che, Jun Zhang, Ziwei Liu

It is motivated by an empirical finding that transformer-based models trained with empirical risk minimization (ERM) outperform CNN-based models employing state-of-the-art (SOTA) DG algorithms on multiple DG datasets.

Ranked #17 on Domain Generalization on DomainNet (using extra training data)

Domain Generalization Object Recognition

Full-Spectrum Out-of-Distribution Detection

2 code implementations 11 Apr 2022 Jingkang Yang, Kaiyang Zhou, Ziwei Liu

In this paper, we take into account both shift types and introduce full-spectrum OOD (FS-OOD) detection, a more realistic problem setting that considers both detecting semantic shift and being tolerant to covariate shift, and we design three benchmarks.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

Conditional Prompt Learning for Vision-Language Models

9 code implementations CVPR 2022 Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

With the rise of powerful pre-trained vision-language models like CLIP, it becomes essential to investigate ways to adapt these models to downstream datasets.

Domain Generalization Prompt Engineering

Generalized Out-of-Distribution Detection: A Survey

4 code implementations 21 Oct 2021 Jingkang Yang, Kaiyang Zhou, Yixuan Li, Ziwei Liu

In this survey, we first present a unified framework called generalized OOD detection, which encompasses the five aforementioned problems, i.e., AD, ND, OSR, OOD detection, and OD.

Anomaly Detection Autonomous Driving +5

Learning to Prompt for Vision-Language Models

15 code implementations 2 Sep 2021 Kaiyang Zhou, Jingkang Yang, Chen Change Loy, Ziwei Liu

Large pre-trained vision-language models like CLIP have shown great potential in learning representations that are transferable across a wide range of downstream tasks.

Domain Generalization Few-shot Age Estimation +2

Semantically Coherent Out-of-Distribution Detection

2 code implementations ICCV 2021 Jingkang Yang, Haoqi Wang, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang, Ziwei Liu

The proposed UDG can not only enrich the semantic knowledge of the model by exploiting unlabeled data in an unsupervised manner, but also distinguish ID/OOD samples to enhance ID classification and OOD detection tasks simultaneously.

Out-of-Distribution Detection Out of Distribution (OOD) Detection

GIST: Distributed Training for Large-Scale Graph Convolutional Networks

1 code implementation 20 Feb 2021 Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis

The graph convolutional network (GCN) is a go-to solution for machine learning on graphs, but its training is notoriously difficult to scale both in terms of graph size and the number of model parameters.

BIG-bench Machine Learning Graph Sampling

Webly Supervised Image Classification with Metadata: Automatic Noisy Label Correction via Visual-Semantic Graph

1 code implementation 12 Oct 2020 Jingkang Yang, Weirong Chen, Litong Feng, Xiaopeng Yan, Huabin Zheng, Wayne Zhang

VSGraph-LC starts from anchor selection based on the semantic similarity between metadata and correct label concepts, and then propagates correct labels from anchors on a visual graph using a graph neural network (GNN).

General Classification Graph Neural Network +3

SeDMiD for Confusion Detection: Uncovering Mind State from Time Series Brain Wave Data

no code implementations 29 Nov 2016 Jingkang Yang, Haohan Wang, Jun Zhu, Eric P. Xing

In this paper, we propose an extension of the State Space Model that works with different sources of information, together with its learning and inference algorithms.

Time Series Time Series Analysis
