Search Results for author: Bingchen Zhao

Found 24 papers, 15 papers with code

AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability

1 code implementation 14 Feb 2024 Siwei Yang, Bingchen Zhao, Cihang Xie

This paper introduces AQA-Bench, a novel benchmark to assess the sequential reasoning capabilities of large language models (LLMs) in algorithmic contexts, such as depth-first search (DFS).

Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning

no code implementations 18 Dec 2023 Bingchen Zhao, Haoqin Tu, Chen Wei, Jieru Mei, Cihang Xie

This paper introduces an efficient strategy to transform Large Language Models (LLMs) into Multi-Modal Large Language Models (MLLMs).

Domain Adaptation

Compress & Align: Curating Image-Text Data with Human Knowledge

no code implementations 11 Dec 2023 Lei Zhang, Fangxun Shu, Sucheng Ren, Bingchen Zhao, Hao Jiang, Cihang Xie

The massive growth of image-text data through web crawling inherently presents the challenge of variability in data quality.

Image Captioning, Text Retrieval

How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs

1 code implementation 27 Nov 2023 Haoqin Tu, Chenhang Cui, Zijun Wang, Yiyang Zhou, Bingchen Zhao, Junlin Han, Wangchunshu Zhou, Huaxiu Yao, Cihang Xie

Different from prior studies, we shift our focus from evaluating standard performance to introducing a comprehensive safety evaluation suite, covering both out-of-distribution (OOD) generalization and adversarial robustness.

Adversarial Robustness, Visual Question Answering (VQA), +1

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

1 code implementation 10 Oct 2023 Letian Zhang, Xiaotong Zhai, Zhongkai Zhao, Yongshuo Zong, Xin Wen, Bingchen Zhao

In light of the advancements in current multi-modal large language models, we explore their effectiveness in counterfactual reasoning.

Benchmarking, Code Generation, +4

Fool Your (Vision and) Language Model With Embarrassingly Simple Permutations

1 code implementation 2 Oct 2023 Yongshuo Zong, Tingyang Yu, Bingchen Zhao, Ruchika Chavhan, Timothy Hospedales

Large language and vision-language models are rapidly being deployed in practice thanks to their impressive capabilities in instruction following, in-context learning, and so on.

In-Context Learning, Instruction Following, +3

Sight Beyond Text: Multi-Modal Training Enhances LLMs in Truthfulness and Ethics

1 code implementation 13 Sep 2023 Haoqin Tu, Bingchen Zhao, Chen Wei, Cihang Xie

Multi-modal large language models (MLLMs) are trained based on large language models (LLMs), with an enhanced capability to comprehend multi-modal inputs and generate textual responses.

Ethics

Learning Semi-supervised Gaussian Mixture Models for Generalized Category Discovery

1 code implementation ICCV 2023 Bingchen Zhao, Xin Wen, Kai Han

In this paper, we address the problem of generalized category discovery (GCD), i.e., given a set of images where part of them are labelled and the rest are not, the task is to automatically cluster the images in the unlabelled data, leveraging the information from the labelled data, while the unlabelled data contain images from the labelled classes and also new ones.

Contrastive Learning, Image Classification, +2

Vision Learners Meet Web Image-Text Pairs

no code implementations 17 Jan 2023 Bingchen Zhao, Quan Cui, Hao Wu, Osamu Yoshie, Cheng Yang, Oisin Mac Aodha

In this work, given the excellent scalability of web data, we consider self-supervised pre-training on noisy web sourced image-text paired data.

Benchmarking, Self-Supervised Learning, +1

One Venue, Two Conferences: The Separation of Chinese and American Citation Networks

no code implementations 17 Nov 2022 Bingchen Zhao, Yuling Gu, Jessica Zosa Forde, Naomi Saphra

At NeurIPS, American and Chinese institutions cite papers from each other's regions substantially less than they cite endogamously.

XCon: Learning with Experts for Fine-grained Category Discovery

1 code implementation 3 Aug 2022 Yixin Fei, Zhongkai Zhao, Siwei Yang, Bingchen Zhao

We address the problem of generalized category discovery (GCD) in this paper, i.e., clustering the unlabeled images leveraging the information from a set of seen classes, where the unlabeled images could contain both seen classes and unseen classes.

Clustering, Contrastive Learning, +1

Self-Supervised Visual Representation Learning with Semantic Grouping

1 code implementation 30 May 2022 Xin Wen, Bingchen Zhao, Anlin Zheng, Xiangyu Zhang, Xiaojuan Qi

The semantic grouping is performed by assigning pixels to a set of learnable prototypes, which can adapt to each sample by attentive pooling over the feature and form new slots.

Contrastive Learning, Instance Segmentation, +6

Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective

1 code implementation 8 Mar 2022 Quan Cui, Bingchen Zhao, Zhao-Min Chen, Borui Zhao, RenJie Song, Jiajun Liang, Boyan Zhou, Osamu Yoshie

This work simultaneously considers the discriminability and transferability properties of deep representations in the typical supervised learning task, i.e., image classification.

Image Classification, Transfer Learning

OOD-CV: A Benchmark for Robustness to Out-of-Distribution Shifts of Individual Nuisances in Natural Images

no code implementations 29 Nov 2021 Bingchen Zhao, Shaozuo Yu, Wufei Ma, Mingxin Yu, Shenxiao Mei, Angtian Wang, Ju He, Alan Yuille, Adam Kortylewski

One reason is that existing robustness benchmarks are limited, as they either rely on synthetic data or ignore the effects of individual nuisance factors.

3D Pose Estimation, Benchmarking, +5

Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation

no code implementations NeurIPS 2021 Bingchen Zhao, Kai Han

In this paper, we tackle the problem of novel visual category discovery, i.e., grouping unlabelled images from new classes into different semantic partitions by leveraging a labelled dataset that contains images from other different but relevant categories.

Fine-Grained Visual Recognition, Knowledge Distillation

Rail-5k: a Real-World Dataset for Rail Surface Defects Detection

no code implementations 28 Jun 2021 Zihao Zhang, Shaozuo Yu, Siwei Yang, Yu Zhou, Bingchen Zhao

This paper presents the Rail-5k dataset for benchmarking the performance of visual algorithms in a real-world application scenario, namely the rail surface defects detection task.

Benchmarking

Reducing the feature divergence of RGB and near-infrared images using Switchable Normalization

1 code implementation 6 Jun 2021 Siwei Yang, Shaozuo Yu, Bingchen Zhao, Yin Wang

Visual pattern recognition over agricultural areas is an important application of aerial image processing.

Temporal Context Aggregation for Video Retrieval with Contrastive Learning

1 code implementation 4 Aug 2020 Jie Shao, Xin Wen, Bingchen Zhao, Xiangyang Xue

The current research focus on Content-Based Video Retrieval requires higher-level video representation describing the long-range semantic dependencies of relevant incidents, events, etc.

Contrastive Learning, Representation Learning, +2

Distilling Visual Priors from Self-Supervised Learning

1 code implementation 1 Aug 2020 Bingchen Zhao, Xin Wen

Convolutional Neural Networks (CNNs) are prone to overfitting on small training datasets.

Classification, Contrastive Learning, +4
