Search Results for author: Bin Zhu

Found 43 papers, 16 papers with code

LLMBind: A Unified Modality-Task Integration Framework

no code implementations 22 Feb 2024 Bin Zhu, Peng Jin, Munan Ning, Bin Lin, Jinfa Huang, Qi Song, Junwu Zhang, Zhenyu Tang, Mingjun Pan, Xing Zhou, Li Yuan

While recent progress in multimodal large language models tackles various modality tasks, these models possess limited integration capabilities for complex multi-modality tasks, consequently constraining the development of the field.

Audio Generation Image Segmentation +1

Video Editing for Video Retrieval

no code implementations 4 Feb 2024 Bin Zhu, Kevin Flanagan, Adriano Fragomeni, Michael Wray, Dima Damen

The teacher model is employed to edit the clips in the training set whereas the student model trains on the edited clips.

Retrieval Text Retrieval +2

FoodLMM: A Versatile Food Assistant using Large Multi-modal Model

no code implementations 22 Dec 2023 Yuehao Yin, Huiyan Qi, Bin Zhu, Jingjing Chen, Yu-Gang Jiang, Chong-Wah Ngo

In the second stage, we construct a multi-round conversation dataset and a reasoning segmentation dataset to fine-tune the model, enabling it to conduct professional dialogues and generate segmentation masks based on complex reasoning in the food domain.

Food Recognition Multi-Task Learning +3

Benchmarking and Defending Against Indirect Prompt Injection Attacks on Large Language Models

1 code implementation 21 Dec 2023 Jingwei Yi, Yueqi Xie, Bin Zhu, Keegan Hines, Emre Kiciman, Guangzhong Sun, Xing Xie, Fangzhao Wu

A key feature of these applications is the combination of LLMs with external content, where user instructions and third-party content are combined to create prompts for LLM processing.

Benchmarking

CAR: Consolidation, Augmentation and Regulation for Recipe Retrieval

no code implementations 8 Dec 2023 Fangzhou Song, Bin Zhu, Yanbin Hao, Shuo Wang, Xiangnan He

Learning recipe and food image representations in a common embedding space is non-trivial but crucial for cross-modal recipe retrieval.

Retrieval

Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models

1 code implementation 27 Nov 2023 Munan Ning, Bin Zhu, Yujia Xie, Bin Lin, Jiaxi Cui, Lu Yuan, Dongdong Chen, Li Yuan

Video-based large language models (Video-LLMs) have been recently introduced, targeting both fundamental improvements in perception and comprehension, and a diverse range of user inquiries.

Decision Making Question Answering

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

4 code implementations 16 Nov 2023 Bin Lin, Yang Ye, Bin Zhu, Jiaxi Cui, Munan Ning, Peng Jin, Li Yuan

In this work, we unify visual representation into the language feature space to advance the foundational LLM towards a unified LVLM.

Language Modelling Large Language Model +2

Controlling Neural Style Transfer with Deep Reinforcement Learning

no code implementations 30 Sep 2023 Chengming Feng, Jing Hu, Xin Wang, Shu Hu, Bin Zhu, Xi Wu, Hongtu Zhu, Siwei Lyu

Controlling the degree of stylization in Neural Style Transfer (NST) is somewhat tricky, since it usually requires hand-engineering of hyper-parameters.

reinforcement-learning Reinforcement Learning (RL) +1

Image-to-Image Translation with Deep Reinforcement Learning

1 code implementation 24 Sep 2023 Xin Wang, Ziwei Luo, Jing Hu, Chengming Feng, Shu Hu, Bin Zhu, Xi Wu, Xin Li, Siwei Lyu

The key feature of the RL-I2IT framework is to decompose a monolithic learning process into small steps with a lightweight model that progressively transforms a source image into a target image.

Auxiliary Learning Decision Making +3

MKL-$L_{0/1}$-SVM

no code implementations 23 Aug 2023 Bin Zhu, Yijie Shi

This paper presents a Multiple Kernel Learning (abbreviated as MKL) framework for the Support Vector Machine (SVM) with the $(0, 1)$ loss function.

CgT-GAN: CLIP-guided Text GAN for Image Captioning

1 code implementation 23 Aug 2023 Jiarui Yu, Haoran Li, Yanbin Hao, Bin Zhu, Tong Xu, Xiangnan He

Particularly, we use adversarial training to teach CgT-GAN to mimic the phrases of an external text corpus and CLIP-based reward to provide semantic guidance.

Image Captioning

Towards Attack-tolerant Federated Learning via Critical Parameter Analysis

1 code implementation ICCV 2023 Sungwon Han, Sungwon Park, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha

Federated learning is used to train a shared model in a decentralized way without clients sharing private data with each other.

Federated Learning

FedDefender: Client-Side Attack-Tolerant Federated Learning

1 code implementation 18 Jul 2023 Sungwon Park, Sungwon Han, Fangzhao Wu, Sundong Kim, Bin Zhu, Xing Xie, Meeyoung Cha

Evaluations of real-world scenarios across multiple datasets show that the proposed method enhances the robustness of federated learning against model poisoning attacks.

Federated Learning Knowledge Distillation +1

Are You Copying My Model? Protecting the Copyright of Large Language Models for EaaS via Backdoor Watermark

1 code implementation 17 May 2023 Wenjun Peng, Jingwei Yi, Fangzhao Wu, Shangxi Wu, Bin Zhu, Lingjuan Lyu, Binxing Jiao, Tong Xu, Guangzhong Sun, Xing Xie

Companies have begun to offer Embedding as a Service (EaaS) based on these LLMs, which can benefit various natural language processing (NLP) tasks for customers.

Model extraction

Harnessing the Power of Text-image Contrastive Models for Automatic Detection of Online Misinformation

no code implementations 19 Apr 2023 Hao Chen, Peng Zheng, Xin Wang, Shu Hu, Bin Zhu, Jinrong Hu, Xi Wu, Siwei Lyu

With the growing usage of social media websites in recent decades, the number of news articles spreading online has increased rapidly, resulting in an unprecedented scale of potentially fraudulent information.

Contrastive Learning Misinformation +1

An ADMM Solver for the MKL-$L_{0/1}$-SVM

no code implementations 8 Mar 2023 Yijie Shi, Bin Zhu

We formulate the Multiple Kernel Learning (abbreviated as MKL) problem for the support vector machine with the infamous $(0, 1)$-loss function.

Attacking Important Pixels for Anchor-free Detectors

no code implementations 26 Jan 2023 Yunxu Xie, Shu Hu, Xin Wang, Quanyu Liao, Bin Zhu, Xi Wu, Siwei Lyu

Existing adversarial attacks on object detection focus on attacking anchor-based detectors, which may not work well for anchor-free detectors.

Adversarial Attack object-detection +2

On the Statistical Consistency of a Generalized Cepstral Estimator

no code implementations 17 Jan 2023 Bin Zhu, Mattia Zorzi

We consider the problem of estimating the generalized cepstral coefficients of a stationary stochastic process or stationary multidimensional random field.

Text-driven Video Prediction

no code implementations 6 Oct 2022 Xue Song, Jingjing Chen, Bin Zhu, Yu-Gang Jiang

Specifically, appearance and motion components are provided by the image and caption, respectively.

Causal Inference Video Generation +1

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations

3 code implementations 26 Sep 2022 Ahmad Darkhalil, Dandan Shan, Bin Zhu, Jian Ma, Amlan Kar, Richard Higgins, Sanja Fidler, David Fouhey, Dima Damen

VISOR annotates videos from EPIC-KITCHENS, which comes with a new set of challenges not encountered in current video segmentation datasets.

Object Segmentation +4

Robust Quantity-Aware Aggregation for Federated Learning

no code implementations 22 May 2022 Jingwei Yi, Fangzhao Wu, Huishuai Zhang, Bin Zhu, Tao Qi, Guangzhong Sun, Xing Xie

Federated learning (FL) enables multiple clients to collaboratively train models without sharing their local data, and has become an important privacy-preserving machine learning framework.

Federated Learning Privacy Preserving

Cross-lingual Adaptation for Recipe Retrieval with Mixup

no code implementations 8 May 2022 Bin Zhu, Chong-Wah Ngo, Jingjing Chen, Wing-Kwong Chan

To bridge the domain gap, a recipe mixup loss is proposed to force the intermediate domain to lie on the shortest geodesic path between the source and target domains in the recipe embedding space.

Retrieval Unsupervised Domain Adaptation

Improving robustness of language models from a geometry-aware perspective

no code implementations Findings (ACL) 2022 Bin Zhu, Zhaoquan Gu, Le Wang, Jinyin Chen, Qi Xuan

On top of FADA, we propose geometry-aware adversarial training (GAT) to perform adversarial training on friendly adversarial data so that we can save a large number of search steps.

Data Augmentation

OneLabeler: A Flexible System for Building Data Labeling Tools

1 code implementation 27 Mar 2022 Yu Zhang, Yun Wang, Haidong Zhang, Bin Zhu, Siming Chen, Dongmei Zhang

In this paper, we propose a conceptual framework for data labeling and, based on this framework, OneLabeler, a system that supports easy building of labeling tools for diverse usage scenarios.

UA-FedRec: Untargeted Attack on Federated News Recommendation

1 code implementation 14 Feb 2022 Jingwei Yi, Fangzhao Wu, Bin Zhu, Jing Yao, Zhulin Tao, Guangzhong Sun, Xing Xie

Our study reveals a critical security issue in existing federated news recommendation systems and calls for research efforts to address the issue.

Federated Learning News Recommendation +2

TREATED: Towards Universal Defense against Textual Adversarial Attacks

no code implementations 13 Sep 2021 Bin Zhu, Zhaoquan Gu, Le Wang, Zhihong Tian

Recent work shows that deep neural networks are vulnerable to adversarial examples.

Adversarial Defense

Transferable Adversarial Examples for Anchor Free Object Detection

no code implementations 3 Jun 2021 Quanyu Liao, Xin Wang, Bin Kong, Siwei Lyu, Bin Zhu, Youbing Yin, Qi Song, Xi Wu

Deep neural networks have been demonstrated to be vulnerable to adversarial attacks: a subtle perturbation can completely change the prediction result.

Adversarial Attack Object +2

Imperceptible Adversarial Examples for Fake Image Detection

no code implementations 3 Jun 2021 Quanyu Liao, Yuezun Li, Xin Wang, Bin Kong, Bin Zhu, Siwei Lyu, Youbing Yin, Qi Song, Xi Wu

Fooling people with highly realistic fake images generated with Deepfake or GANs causes great social disturbance.

Face Swapping Fake Image Detection

Pyramid Fusion Dark Channel Prior for Single Image Dehazing

no code implementations 21 May 2021 Qiyuan Liang, Bin Zhu, Chong-Wah Ngo

In this paper, we propose the pyramid fusion dark channel prior (PF-DCP) for single image dehazing.

Image Dehazing Single Image Dehazing

An Optimized H.266/VVC Software Decoder On Mobile Platform

no code implementations 5 Mar 2021 Yiming Li, Shan Liu, Yu Chen, Yushan Zheng, Sijia Chen, Bin Zhu, Jian Lou

As the successor of H.265/HEVC, the new versatile video coding standard (H.266/VVC) can provide up to 50% bitrate saving with the same subjective quality, at the cost of increased decoding complexity.

New Strong Bounds on sub-GeV Dark Matter from Boosted and Migdal Effects

no code implementations 17 Dec 2020 Victor V. Flambaum, Liangliang Su, Lei Wu, Bin Zhu

Due to the low nuclear recoils, sub-GeV dark matter (DM) is usually beyond the sensitivity of the conventional DM direct detection experiments.

High Energy Physics - Phenomenology Cosmology and Nongalactic Astrophysics

Line Spectrum Representation for Vector Processes With Application to Frequency Estimation

no code implementations 24 Jun 2020 Bin Zhu

A positive semidefinite Toeplitz matrix, which often arises as the finite covariance matrix of a stationary random process, can be decomposed as the sum of a nonnegative multiple of the identity corresponding to a white noise, and a singular term corresponding to a purely deterministic process.

Time Series Analysis

CookGAN: Causality Based Text-to-Image Synthesis

no code implementations CVPR 2020 Bin Zhu, Chong-Wah Ngo

Particularly, a cooking simulator sub-network is proposed to incrementally make changes to food images based on the interaction between ingredients and cooking methods over a series of steps.

Image Generation

CPM R-CNN: Calibrating Point-guided Misalignment in Object Detection

1 code implementation 7 Mar 2020 Bin Zhu, Qing Song, Lu Yang, Zhihui Wang, Chun Liu, Mengjie Hu

In object detection, offset-guided and point-guided regression dominate anchor-based and anchor-free methods, respectively.

object-detection Object Detection +1

An Empirical Bayes Approach to Frequency Estimation

no code implementations 21 Oct 2019 Giorgio Picci, Bin Zhu

In this paper we show that the classical problem of frequency estimation can be formulated and solved efficiently in an empirical Bayesian framework by assigning a uniform a priori probability distribution to the unknown frequency.

R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network

no code implementations CVPR 2019 Bin Zhu, Chong-Wah Ngo, Jingjing Chen, Yanbin Hao

Representing procedural text such as recipes for cross-modal retrieval is inherently a difficult problem, not to mention generating images from recipes for visualization.

Generative Adversarial Network Retrieval
