Search Results for author: Rulin Shao

Found 12 papers, 6 papers with code

Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning

no code implementations • 18 Feb 2024 • Zhiyang Xu, Chao Feng, Rulin Shao, Trevor Ashby, Ying Shen, Di Jin, Yu Cheng, Qifan Wang, Lifu Huang

Despite the remarkable capabilities of vision-language models (VLMs) as versatile visual assistants, two substantial challenges persist in existing VLM frameworks: (1) a lack of task diversity in pretraining and visual instruction tuning, and (2) annotation errors and bias in GPT-4-synthesized instruction tuning data.

Hallucination · Visual Question Answering

DISTFLASHATTN: Distributed Memory-efficient Attention for Long-context LLMs Training

1 code implementation • 5 Oct 2023 • Dacheng Li, Rulin Shao, Anze Xie, Eric P. Xing, Xuezhe Ma, Ion Stoica, Joseph E. Gonzalez, Hao Zhang

FlashAttention (Dao, 2023) effectively reduces the quadratic peak memory usage to linear in training transformer-based large language models (LLMs) on a single GPU.
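
For intuition, here is a minimal, self-contained PyTorch sketch of the blockwise (online-softmax) attention idea that FlashAttention builds on: the full n × n score matrix is never materialized, so peak extra memory grows linearly rather than quadratically with sequence length. This illustrates only the single-device idea and is not the distributed DISTFLASHATTN implementation.

```python
import torch

def blockwise_attention(q, k, v, block_size=128):
    """Attention with an online softmax over key/value blocks, so the full
    (n x n) score matrix is never stored; extra memory is O(n * block_size)."""
    n, d = q.shape
    out = torch.zeros_like(q)
    # Running softmax statistics per query row.
    row_max = torch.full((n, 1), float("-inf"), dtype=q.dtype)
    row_sum = torch.zeros(n, 1, dtype=q.dtype)
    for start in range(0, n, block_size):
        k_blk = k[start:start + block_size]                # (b, d)
        v_blk = v[start:start + block_size]                # (b, d)
        scores = (q @ k_blk.T) / d ** 0.5                  # (n, b) block of scores
        blk_max = scores.max(dim=-1, keepdim=True).values
        new_max = torch.maximum(row_max, blk_max)
        # Rescale previously accumulated output and normalizer to the new max.
        scale = torch.exp(row_max - new_max)
        exp_scores = torch.exp(scores - new_max)
        out = out * scale + exp_scores @ v_blk
        row_sum = row_sum * scale + exp_scores.sum(dim=-1, keepdim=True)
        row_max = new_max
    return out / row_sum
```

The result matches the naive `torch.softmax(q @ k.T / d ** 0.5, dim=-1) @ v` up to floating-point error, while only ever holding one (n, block_size) block of scores at a time.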

VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use

1 code implementation • 12 Aug 2023 • Yonatan Bitton, Hritik Bansal, Jack Hessel, Rulin Shao, Wanrong Zhu, Anas Awadalla, Josh Gardner, Rohan Taori, Ludwig Schmidt

These descriptions enable 1) collecting human-verified reference outputs for each instance, and 2) automatically evaluating candidate multimodal generations with a text-only LLM, in a way that aligns with human judgment.

Instruction Following
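
The text-only evaluation works because each image comes with a human-written, instruction-conditioned description, so an ordinary LLM can judge a candidate against the human-verified reference without seeing pixels. The snippet below is a rough illustration of how such a judging prompt might be assembled; the function name and prompt wording are placeholders, not the benchmark's actual template.

```python
def build_judge_prompt(instruction, dense_caption, reference, candidate):
    """Assemble a text-only judging prompt: the image is replaced by its
    human-written, instruction-conditioned description, so a non-multimodal
    LLM can compare the candidate response against the verified reference."""
    return (
        "You are judging an instruction-following response about an image.\n"
        f"Image description: {dense_caption}\n"
        f"Instruction: {instruction}\n"
        f"Reference response: {reference}\n"
        f"Candidate response: {candidate}\n"
        "Which response follows the instruction better, Reference or Candidate? "
        "Answer with a single word."
    )
```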

Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment

1 code implementation • 20 Dec 2022 • Rohan Pandey, Rulin Shao, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

To tackle this problem, we show that relation alignment can be enforced by encouraging the directed language attention from 'mug' to 'grass' (capturing the semantic relation 'in') to match the directed visual attention from the mug to the grass.

Relation · Visual Reasoning
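
As a rough illustration of what such a congruence constraint can look like, the sketch below penalizes the mismatch between the language self-attention over token pairs and the visual attention between the regions that ground those tokens. The tensor shapes, grounding map, and choice of MSE are assumptions made for this sketch; the paper's actual regularizer may use different layers, heads, or distances.

```python
import torch
import torch.nn.functional as F

def attention_congruence_loss(lang_attn, vis_attn, token_to_region):
    """Encourage directed language attention (e.g. 'mug' -> 'grass') to match
    the directed visual attention between the corresponding image regions.

    lang_attn:       (T, T) attention over text tokens
    vis_attn:        (R, R) attention over image regions
    token_to_region: (T,) index of the region grounding each token
    """
    # Gather the visual attention between the regions grounding each token pair,
    # giving a (T, T) matrix directly comparable to the language attention.
    vis_on_tokens = vis_attn[token_to_region][:, token_to_region]
    return F.mse_loss(lang_attn, vis_on_tokens)
```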

Does Structural Attention Improve Compositional Representations in Vision-Language Models?

no code implementations • NeurIPS Workshop: Self-Supervised Learning - Theory and Practice 2022 • Rohan Pandey, Rulin Shao, Paul Pu Liang, Louis-Philippe Morency

Although scaling self-supervised approaches has gained widespread success in Vision-Language pre-training, a number of works providing structural knowledge of visually-grounded semantics have recently shown incremental performance gains.

Visual Reasoning

MPCFormer: fast, performant and private Transformer inference with MPC

1 code implementation • 2 Nov 2022 • Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang

Through extensive evaluations, we show that MPCFormer significantly speeds up Transformer inference in MPC settings while achieving similar ML performance to the input model.

Knowledge Distillation

How and When Adversarial Robustness Transfers in Knowledge Distillation?

no code implementations • 22 Oct 2021 • Rulin Shao, JinFeng Yi, Pin-Yu Chen, Cho-Jui Hsieh

Our comprehensive analysis yields several novel insights: (1) with KDIGA, students can preserve or even exceed the adversarial robustness of the teacher model, even when the student and teacher architectures are fundamentally different; (2) KDIGA enables robustness to transfer to pre-trained students, such as KD from an adversarially trained ResNet to a pre-trained ViT, without loss of clean accuracy; and (3) our derived local linearity bounds for characterizing adversarial robustness in KD are consistent with the empirical results.

Adversarial Robustness · Knowledge Distillation +1
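
The snippet does not spell out KDIGA; assuming it refers to knowledge distillation with input gradient alignment, the sketch below combines a standard distillation objective with a term that pulls the student's input gradient toward the teacher's, which is one plausible way robustness could transfer. The loss weights and the exact alignment penalty are illustrative, not the paper's formulation.

```python
import torch
import torch.nn.functional as F

def kd_with_input_gradient_alignment(student, teacher, x, y,
                                     temperature=4.0, alpha=0.5, beta=1.0):
    """Distillation plus an input-gradient alignment term (illustrative only):
    the student matches the teacher's soft predictions and its input gradients."""
    x = x.clone().requires_grad_(True)

    s_logits = student(x)
    t_logits = teacher(x)

    # Standard KD: hard-label cross-entropy plus KL to the softened teacher.
    ce = F.cross_entropy(s_logits, y)
    kd = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits.detach() / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2

    # Input gradients of the batch losses for student and teacher.
    s_grad = torch.autograd.grad(ce, x, create_graph=True, retain_graph=True)[0]
    t_grad = torch.autograd.grad(F.cross_entropy(t_logits, y), x,
                                 retain_graph=True)[0]

    # Align the student's input gradient with the (fixed) teacher's.
    iga = (s_grad - t_grad.detach()).pow(2).mean()

    return alpha * ce + (1 - alpha) * kd + beta * iga
```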

On the Adversarial Robustness of Vision Transformers

1 code implementation • 29 Mar 2021 • Rulin Shao, Zhouxing Shi, JinFeng Yi, Pin-Yu Chen, Cho-Jui Hsieh

Following the success in advancing natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision.

Adversarial Robustness

Robust Text CAPTCHAs Using Adversarial Examples

no code implementations • 7 Jan 2021 • Rulin Shao, Zhouxing Shi, JinFeng Yi, Pin-Yu Chen, Cho-Jui Hsieh

In the second stage, we design and apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers.

Adversarial Attack · Optical Character Recognition (OCR)

Stochastic Channel-Based Federated Learning for Medical Data Privacy Preserving

no code implementations • 23 Oct 2019 • Rulin Shao, Hongyu He, Hui Liu, Dianbo Liu

Specifically, we design, implement and evaluate a channel-based update algorithm for the central server in a distributed system, which selects the channels corresponding to the most active features in a training loop and uploads them as the information learned from the local datasets.

Federated Learning · Privacy Preserving
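
As a rough sketch of that server-side step, the snippet below scores channels by the average activity reported by clients, keeps the top-k most active ones, and averages only those channels' weights. The function name, shapes, and activity score are assumptions made for illustration; this is not the paper's algorithm verbatim.

```python
import numpy as np

def select_and_aggregate_channels(client_weights, client_activations, top_k):
    """Server-side sketch: keep only the most active channels and average them.

    client_weights:     list of per-client arrays of shape (C, ...), channel-first
    client_activations: list of per-client arrays of shape (C,) with activity scores
    """
    # Average each channel's activity score across clients, then pick the top_k.
    mean_activity = np.stack(client_activations).mean(axis=0)   # (C,)
    active = np.argsort(mean_activity)[-top_k:]                 # most active channels
    # Average only the selected channels' weights across clients.
    aggregated = np.stack([w[active] for w in client_weights]).mean(axis=0)
    return active, aggregated
```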

Privacy Preserving Stochastic Channel-Based Federated Learning with Neural Network Pruning

no code implementations • 4 Oct 2019 • Rulin Shao, Hui Liu, Dianbo Liu

Artificial neural networks have achieved unprecedented success in a wide variety of domains, such as classifying, predicting and recognizing objects.

Federated Learning · Network Pruning +1
