Search Results for author: Pu Zhao

Found 80 papers, 37 papers with code

Werewolf: A Straightforward Game Framework with TTS for Improved User Engagement

no code implementations • 30 May 2025 • Qihui Fan, Enfu Nan, Wenbo Li, Lei Lu, Pu Zhao, Yanzhi Wang

The growing popularity of social deduction game systems for both business applications and AI research has greatly benefited from the rapid advancements in Large Language Models (LLMs), which now demonstrate stronger reasoning and persuasion capabilities.

Text-to-Speech

Enabling Flexible Multi-LLM Integration for Scalable Knowledge Aggregation

1 code implementation • 28 May 2025 • Zhenglun Kong, Zheng Zhan, Shiyue Hou, Yifan Gong, Xin Meng, Pengwei Sui, Peiyan Dong, Xuan Shen, Zifeng Wang, Pu Zhao, Hao Tang, Stratis Ioannidis, Yanzhi Wang

To address these issues, we propose a framework that adaptively selects and aggregates knowledge from diverse LLMs to build a single, stronger model, avoiding the high memory overhead of ensembling and the inflexibility of weight merging.

RePrompt: Reasoning-Augmented Reprompting for Text-to-Image Generation via Reinforcement Learning

1 code implementation • 23 May 2025 • Mingrui Wu, Lu Wang, Pu Zhao, Fangkai Yang, Jianjin Zhang, Jianfeng Liu, Yuefeng Zhan, Weihao Han, Hao Sun, Jiayi Ji, Xiaoshuai Sun, Qingwei Lin, Weiwei Deng, Dongmei Zhang, Feng Sun, Qi Zhang, Rongrong Ji

Instead of relying on handcrafted rules or stylistic rewrites, our method trains a language model to generate structured, self-reflective prompts by optimizing for image-level outcomes.

Language Modeling +2

Taming Diffusion for Dataset Distillation with High Representativeness

1 code implementation • 23 May 2025 • Lin Zhao, Yushu Wu, Xinru Jiang, Jianyang Gu, Yanzhi Wang, Xiaolin Xu, Pu Zhao, Xue Lin

Specifically, we adopt DDIM inversion to map the latents of the full dataset from a low-normality latent domain to a high-normality Gaussian domain, preserving information and ensuring structural consistency to generate representative latents for the distilled dataset.
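
For readers unfamiliar with DDIM inversion, a minimal sketch of one deterministic inversion step is below; this is the generic update, not the paper's pipeline, and the argument names are illustrative.

```python
def ddim_inversion_step(x_t, eps_pred, alpha_bar_t: float, alpha_bar_next: float):
    """One deterministic DDIM inversion step: move a latent x_t from timestep t
    to the next (noisier) timestep, reusing the model's noise prediction eps_pred."""
    # Clean-latent estimate implied by the current noise prediction.
    x0_hat = (x_t - (1 - alpha_bar_t) ** 0.5 * eps_pred) / alpha_bar_t ** 0.5
    # Re-noise the estimate to the next noise level along the deterministic path.
    return alpha_bar_next ** 0.5 * x0_hat + (1 - alpha_bar_next) ** 0.5 * eps_pred
```

Iterating this step over the full noise schedule is what maps data latents into the Gaussian domain the abstract refers to.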

Dataset Distillation Image Generation

Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality

1 code implementation • 23 May 2025 • Zhenglun Kong, Yize Li, Fanhu Zeng, Lei Xin, Shvat Messica, Xue Lin, Pu Zhao, Manolis Kellis, Hao Tang, Marinka Zitnik

We highlight its potential to drive new model architectures and learning strategies that improve robustness, increase interpretability, and better align with the objectives of generative modeling.

In-Context Learning Token Reduction

Structured Agent Distillation for Large Language Model

no code implementations • 20 May 2025 • Jun Liu, Zhenglun Kong, Peiyan Dong, Changdi Yang, Tianqi Li, Hao Tang, Geng Yuan, Wei Niu, Wenbin Zhang, Pu Zhao, Xue Lin, Dong Huang, Yanzhi Wang

Large language models (LLMs) exhibit strong capabilities as decision-making agents by interleaving reasoning and actions, as seen in ReAct-style frameworks.
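
As a rough sketch of the ReAct-style interleaving mentioned here (the `llm` and `env` callables are hypothetical, not from the paper):

```python
def react_episode(llm, env, task: str, max_steps: int = 8) -> str:
    """Minimal ReAct-style loop: alternate a reasoning span and an action,
    feeding each environment observation back into the context."""
    context = f"Task: {task}\n"
    for _ in range(max_steps):
        thought = llm(context + "Thought:")
        action = llm(context + f"Thought: {thought}\nAction:")
        observation, done = env.step(action)
        context += f"Thought: {thought}\nAction: {action}\nObservation: {observation}\n"
        if done:
            break
    return context
```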

Imitation Learning Language Modeling +3

FastCar: Cache Attentive Replay for Fast Auto-Regressive Video Generation on the Edge

1 code implementation • 17 May 2025 • Xuan Shen, Weize Ma, Yufa Zhou, Enhao Tang, Yanyue Xie, Zhengang Li, Yifan Gong, Quanyi Wang, Henghui Ding, Yiwei Wang, Yanzhi Wang, Pu Zhao, Jun Lin, Jiuxiang Gu

In this paper, we propose the FastCar framework to accelerate the decode phase of auto-regressive (AR) video generation by exploiting temporal redundancy.
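
A generic caching sketch of the temporal-redundancy idea (not FastCar's actual replay criterion; the relative-change test and threshold are assumptions):

```python
def maybe_replay(module, x, cache: dict, threshold: float = 0.05):
    """Reuse a cached module output when the input changed little since the
    previous frame; otherwise recompute and refresh the cache."""
    if cache.get("x") is not None:
        change = (x - cache["x"]).abs().mean() / (cache["x"].abs().mean() + 1e-8)
        if change < threshold:
            return cache["y"]  # replay the cached result, skipping the forward pass
    y = module(x)
    cache["x"], cache["y"] = x.detach(), y.detach()
    return y
```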

Image Generation Scheduling +2

DraftAttention: Fast Video Diffusion via Low-Resolution Attention Guidance

1 code implementation • 17 May 2025 • Xuan Shen, Chenxia Han, Yufa Zhou, Yanyue Xie, Yifan Gong, Quanyi Wang, Yiwei Wang, Yanzhi Wang, Pu Zhao, Jiuxiang Gu

To address this, we propose DraftAttention, a training-free framework for accelerating video diffusion transformers with dynamic sparse attention on GPUs.
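
One way to picture low-resolution attention guidance is the sketch below: attend at coarse (pooled) resolution to decide which blocks deserve full attention. The block size and keep ratio are assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def draft_block_mask(q, k, block: int = 8, keep_ratio: float = 0.25):
    """Score attention at low resolution by average-pooling queries/keys over
    blocks, then keep only the highest-scoring block pairs for full attention."""
    # q, k: (heads, seq, dim); pool along the sequence axis into coarse blocks.
    q_low = F.avg_pool1d(q.transpose(1, 2), block).transpose(1, 2)
    k_low = F.avg_pool1d(k.transpose(1, 2), block).transpose(1, 2)
    scores = q_low @ k_low.transpose(-2, -1) / q.shape[-1] ** 0.5
    n_keep = max(1, int(keep_ratio * scores.shape[-1]))
    mask = torch.zeros_like(scores, dtype=torch.bool)
    mask.scatter_(-1, scores.topk(n_keep, dim=-1).indices, True)
    return mask  # True = compute this block pair with sparse full attention
```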

Video Generation

UFO2: The Desktop AgentOS

1 code implementation • 20 Apr 2025 • Chaoyun Zhang, He Huang, Chiming Ni, Jian Mu, Si Qin, Shilin He, Lu Wang, Fangkai Yang, Pu Zhao, Chao Du, Liqun Li, Yu Kang, Zhao Jiang, Suzhen Zheng, Rujia Wang, Jiaxu Qian, Minghua Ma, Jian-Guang Lou, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

Recent Computer-Using Agents (CUAs), powered by multimodal large language models (LLMs), offer a promising direction for automating complex desktop workflows through natural language.

Lean and Mean: Decoupled Value Policy Optimization with Global Value Guidance

no code implementations • 24 Feb 2025 • Chenghua Huang, Lu Wang, Fangkai Yang, Pu Zhao, Zhixu Li, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

Proximal Policy Optimization (PPO)-based Reinforcement Learning from Human Feedback (RLHF) is essential for aligning large language models (LLMs) with human preferences.

Efficient Reasoning with Hidden Thinking

1 code implementation • 31 Jan 2025 • Xuan Shen, Yizhou Wang, Xiangxi Shi, Yanzhi Wang, Pu Zhao, Jiuxiang Gu

Meanwhile, we design the corresponding Heima Decoder with traditional Large Language Models (LLMs) to adaptively interpret the hidden representations into variable-length textual sequences, reconstructing reasoning processes that closely resemble the original CoTs.

Decoder Multimodal Reasoning

RoRA: Efficient Fine-Tuning of LLM with Reliability Optimization for Rank Adaptation

no code implementations • 8 Jan 2025 • Jun Liu, Zhenglun Kong, Peiyan Dong, Changdi Yang, Xuan Shen, Pu Zhao, Hao Tang, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Dong Huang, Yanzhi Wang

Although Low-Rank Adaptation (LoRA) is widely used and effective for fine-tuning, we have observed that its scaling factor can limit or even reduce performance as the rank size increases.
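
For context, the scaling factor in question is the alpha/r multiplier on the low-rank update; a minimal LoRA layer showing it is below, with a rank-stabilized alpha/sqrt(r) variant for comparison. This illustrates the issue, not RoRA's actual remedy.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a low-rank update; `scale` is the factor the
    abstract says can limit performance as the rank r grows."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32,
                 rank_stabilized: bool = False):
        super().__init__()
        self.base = base.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        # Standard LoRA scales by alpha / r; dividing by sqrt(r) instead keeps
        # the update magnitude more stable as r increases.
        self.scale = alpha / (r ** 0.5 if rank_stabilized else r)

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```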

WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models

no code implementations • 23 Dec 2024 • Huawen Feng, Pu Zhao, Qingfeng Sun, Can Xu, Fangkai Yang, Lu Wang, Qianli Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

Despite recent progress achieved by code large language models (LLMs), their remarkable abilities largely depend on fine-tuning with high-quality data, posing challenges for data collection and annotation.

Data Augmentation Diversity

Numerical Pruning for Efficient Autoregressive Models

no code implementations • 17 Dec 2024 • Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Jing Liu, Ruiyi Zhang, Ryan A. Rossi, Hao Tan, Tong Yu, Xiang Chen, Yufan Zhou, Tong Sun, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

Transformers have emerged as the leading architecture in deep learning, proving to be versatile and highly effective across diverse domains beyond language and image processing.

Decoder Image Generation

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

no code implementations • 17 Dec 2024 • Xuan Shen, Zhao Song, Yufa Zhou, Bo Chen, Yanyu Li, Yifan Gong, Kai Zhang, Hao Tan, Jason Kuen, Henghui Ding, Zhihao Shu, Wei Niu, Pu Zhao, Yanzhi Wang, Jiuxiang Gu

In this paper, we show that performing the full computation of the model at each diffusion step is unnecessary, as some computations can be skipped by lazily reusing the results of previous steps.
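
A minimal sketch of step-to-step reuse (the cosine-similarity test, its threshold, and layer-level granularity are assumptions, not the paper's learned skipping policy):

```python
import torch

def lazy_forward(layers, x, cache: dict, sim_threshold: float = 0.999):
    """Run transformer layers, lazily reusing a layer's output from the previous
    diffusion step whenever its input has barely changed."""
    for i, layer in enumerate(layers):
        prev_in = cache.get((i, "in"))
        if prev_in is not None:
            sim = torch.cosine_similarity(x.flatten(), prev_in.flatten(), dim=0)
            if sim > sim_threshold:
                x = cache[(i, "out")]  # skip this layer at this step
                continue
        cache[(i, "in")] = x.detach()
        x = layer(x)
        cache[(i, "out")] = x.detach()
    return x
```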

Denoising

Large Action Models: From Inception to Implementation

1 code implementation • 13 Dec 2024 • Lu Wang, Fangkai Yang, Chaoyun Zhang, Junting Lu, Jiaxu Qian, Shilin He, Pu Zhao, Bo Qiao, Ray Huang, Si Qin, Qisheng Su, Jiayi Ye, Yudi Zhang, Jian-Guang Lou, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

As AI continues to advance, there is a growing demand for systems that go beyond language-based assistance and move toward intelligent agents capable of performing real-world actions.

Action Generation

Open-Source Acceleration of Stable-Diffusion.cpp Deployable on All Devices

no code implementations • 8 Dec 2024 • Jingxu Ng, Cheng Lv, Pu Zhao, Wei Niu, Juyi Lin, Minzhou Pan, Yun Liang, Yanzhi Wang

To address this, stable-diffusion.cpp (Sdcpp) has emerged as an efficient inference framework to accelerate diffusion models.

Image Generation

Fully Open Source Moxin-7B Technical Report

1 code implementation • 8 Dec 2024 • Pu Zhao, Xuan Shen, Zhenglun Kong, Yixin Shen, Sung-En Chang, Timothy Rupprecht, Lei Lu, Enfu Nan, Changdi Yang, Yumei He, Xingchen Xu, Yu Huang, Wei Wang, Yue Chen, Yong He, Yanzhi Wang

Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities.

Fast and Memory-Efficient Video Diffusion Using Streamlined Inference

1 code implementation • 2 Nov 2024 • Zheng Zhan, Yushu Wu, Yifan Gong, Zichong Meng, Zhenglun Kong, Changdi Yang, Geng Yuan, Pu Zhao, Wei Niu, Yanzhi Wang

The rapid progress in artificial intelligence-generated content (AIGC), especially with diffusion models, has significantly advanced the development of high-quality video generation.

Video Generation

Self-Evolved Reward Learning for LLMs

1 code implementation • 1 Nov 2024 • Chenghua Huang, Zhizhen Fan, Lu Wang, Fangkai Yang, Pu Zhao, Zeqi Lin, Qingwei Lin, Dongmei Zhang, Saravan Rajmohan, Qi Zhang

Reinforcement Learning from Human Feedback (RLHF) is a crucial technique for aligning language models with human preferences, playing a pivotal role in the success of conversational models like GPT-4, ChatGPT, and Llama 2.

Pruning Foundation Models for High Accuracy without Retraining

1 code implementation • 21 Oct 2024 • Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin

To deal with this problem, post-training pruning methods have been proposed to prune LLMs in one shot without retraining.
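
For contrast, the simplest one-shot post-training pruner is plain magnitude pruning, sketched below; the paper's method is more sophisticated, so treat this only as the baseline the setting implies.

```python
import torch

def one_shot_magnitude_prune(weight: torch.Tensor, sparsity: float = 0.5):
    """Zero out the smallest-magnitude weights in a single shot, no retraining."""
    k = max(1, int(weight.numel() * sparsity))
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)
```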

Mamba

Rethinking Token Reduction for State Space Models

1 code implementation • 16 Oct 2024 • Zheng Zhan, Yushu Wu, Zhenglun Kong, Changdi Yang, Yifan Gong, Xuan Shen, Xue Lin, Pu Zhao, Yanzhi Wang

While token reduction techniques offer a straightforward post-training strategy, we find that applying existing methods directly to SSMs leads to substantial performance drops.
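
A bare-bones token-pruning step is sketched below; note the sort that preserves token order, which matters for order-sensitive SSMs (the importance scores are assumed given, and this is not the paper's method):

```python
import torch

def prune_tokens(x: torch.Tensor, scores: torch.Tensor, keep_ratio: float = 0.5):
    """Drop the least important tokens post-training.
    x: (batch, seq, dim); scores: (batch, seq) importance estimates."""
    n_keep = max(1, int(x.shape[1] * keep_ratio))
    idx = scores.topk(n_keep, dim=1).indices.sort(dim=1).values  # keep order
    return x.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.shape[-1]))
```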

Mamba State Space Models +1

Lotus: learning-based online thermal and latency variation management for two-stage detectors on edge devices

1 code implementation • 1 Oct 2024 • Yifan Gong, Yushu Wu, Zheng Zhan, Pu Zhao, Liangkai Liu, Chao Wu, Xulong Tang, Yanzhi Wang

Two-stage object detectors exhibit high accuracy and precise localization, especially for identifying small objects that are favorable for various edge applications.

Deep Reinforcement Learning Management

Exploring Token Pruning in Vision State Space Models

no code implementations • 27 Sep 2024 • Zheng Zhan, Zhenglun Kong, Yifan Gong, Yushu Wu, Zichong Meng, Hangyu Zheng, Xuan Shen, Stratis Ioannidis, Wei Niu, Pu Zhao, Yanzhi Wang

Inspired by the observations that the final prediction in vision transformers (ViTs) is only based on a subset of most informative tokens, we take the novel step of enhancing the efficiency of SSM-based vision models through token-based pruning.

State Space Models

Search for Efficient Large Language Models

1 code implementation • 25 Sep 2024 • Xuan Shen, Pu Zhao, Yifan Gong, Zhenglun Kong, Zheng Zhan, Yushu Wu, Ming Lin, Chao Wu, Xue Lin, Yanzhi Wang

Large Language Models (LLMs) have long held sway in the realms of artificial intelligence research.

Model Compression Quantization

AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation

1 code implementation • 1 Aug 2024 • Mengkang Hu, Pu Zhao, Can Xu, Qingfeng Sun, Jian-Guang Lou, Qingwei Lin, Ping Luo, Saravan Rajmohan

Moreover, to increase the difficulty diversity of the generated planning tasks, we propose a bidirectional evolution method, Bi-Evol, that evolves planning tasks in both easier and harder directions to synthesize a task set with a smoother difficulty curve.

Diversity Language Modeling +2

Arena Learning: Build Data Flywheel for LLMs Post-training via Simulated Chatbot Arena

no code implementations • 15 Jul 2024 • Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Qingwei Lin, Jian-Guang Lou, Shifeng Chen, Yansong Tang, Weizhu Chen

In this paper, we introduce Arena Learning, an innovative offline strategy designed to simulate these arena battles using AI-driven annotations to evaluate battle outcomes, thus facilitating the continuous improvement of the target model through both supervised fine-tuning and reinforcement learning.

Chatbot

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment

no code implementations • 16 Mar 2024 • Jun Liu, Zhenglun Kong, Pu Zhao, Changdi Yang, Hao Tang, Xuan Shen, Geng Yuan, Wei Niu, Wenbin Zhang, Xue Lin, Dong Huang, Yanzhi Wang

For example, HyWIA surpasses the cutting-edge LLM-Pruner by an average margin of 2.82% in accuracy across seven downstream tasks when pruning LLaMA-7B by 50%.

Decoder Language Modelling +1

InstructGIE: Towards Generalizable Image Editing

no code implementations • 8 Mar 2024 • Zichong Meng, Changdi Yang, Jun Liu, Hao Tang, Pu Zhao, Yanzhi Wang

In response to this challenge, our study introduces a novel image editing framework with enhanced generalization robustness by boosting in-context learning capability and unifying language instruction.

Denoising In-Context Learning

DiffClass: Diffusion-Based Class Incremental Learning

no code implementations • 8 Mar 2024 • Zichong Meng, Jie Zhang, Changdi Yang, Zheng Zhan, Pu Zhao, Yanzhi Wang

On top of that, Exemplar-free Class Incremental Learning is even more challenging because access to previous task data is forbidden.

Class Incremental Learning +4

EdgeQAT: Entropy and Distribution Guided Quantization-Aware Training for the Acceleration of Lightweight LLMs on the Edge

1 code implementation • 16 Feb 2024 • Xuan Shen, Zhenglun Kong, Changdi Yang, Zhaoyang Han, Lei Lu, Peiyan Dong, Cheng Lyu, Chih-hsiang Li, Xuehang Guo, Zhihao Shu, Wei Niu, Miriam Leeser, Pu Zhao, Yanzhi Wang

In this paper, we propose EdgeQAT, the Entropy and Distribution Guided QAT for the optimization of lightweight LLMs to achieve inference acceleration on Edge devices.
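
The basic QAT building block is fake quantization with a straight-through estimator, sketched below; EdgeQAT's entropy- and distribution-guided scheme builds on this idea but is not reproduced here.

```python
import torch

def fake_quantize(x: torch.Tensor, bits: int = 8):
    """Simulate low-bit quantization in the forward pass while letting
    gradients flow through unchanged (straight-through estimator)."""
    qmax = 2 ** (bits - 1) - 1
    scale = (x.detach().abs().max() / qmax).clamp(min=1e-8)
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()  # forward: quantized; backward: identity
```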

Quantization

Detection and Recovery Against Deep Neural Network Fault Injection Attacks Based on Contrastive Learning

no code implementations • 30 Jan 2024 • Chenan Wang, Pu Zhao, Siyue Wang, Xue Lin

Deep Neural Network (DNN) models, when deployed on devices as inference engines, are susceptible to Fault Injection Attacks (FIAs) that manipulate model parameters to disrupt inference execution with disastrous performance.

Contrastive Learning Self-Supervised Learning

Contrastive Learning with Negative Sampling Correction

no code implementations • 13 Jan 2024 • Lu Wang, Chao Du, Pu Zhao, Chuan Luo, Zhangchi Zhu, Bo Qiao, Wei Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang

To correct the negative sampling bias, we propose a novel contrastive learning method named Positive-Unlabeled Contrastive Learning (PUCL).
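
To make the bias concrete: in standard InfoNCE some sampled "negatives" are actually positives. A debiased-contrastive-style correction is sketched below (PUCL's own estimator differs; the class prior pos_prior is an assumption):

```python
import math
import torch
import torch.nn.functional as F

def debiased_infonce(anchor, positive, negatives, tau=0.1, pos_prior=0.1):
    """InfoNCE with a false-negative correction: sampled 'negatives' are treated
    as unlabeled, and their expected positive mass is subtracted back out."""
    anchor = F.normalize(anchor, dim=-1)        # (batch, dim)
    positive = F.normalize(positive, dim=-1)    # (batch, dim)
    negatives = F.normalize(negatives, dim=-1)  # (n_neg, dim)
    pos = torch.exp((anchor * positive).sum(-1) / tau)   # (batch,)
    neg = torch.exp(anchor @ negatives.T / tau).sum(-1)  # (batch,)
    n = negatives.shape[0]
    # Debiased negative term, clamped at its theoretical minimum n * e^(-1/tau).
    neg_corrected = torch.clamp((neg - n * pos_prior * pos) / (1 - pos_prior),
                                min=n * math.exp(-1.0 / tau))
    return -torch.log(pos / (pos + neg_corrected)).mean()
```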

Contrastive Learning Data Augmentation +2

Why does Prediction Accuracy Decrease over Time? Uncertain Positive Learning for Cloud Failure Prediction

no code implementations • 8 Jan 2024 • Haozhe Li, Minghua Ma, Yudong Liu, Pu Zhao, Lingling Zheng, Ze Li, Yingnong Dang, Murali Chintalapati, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

Using two real-world disk failure prediction datasets and conducting node prediction experiments in Microsoft Azure, a top-tier cloud provider serving millions of users, we demonstrate that Uptake can significantly improve failure prediction accuracy by 5% on average.

Cloud Computing Prediction

TaskWeaver: A Code-First Agent Framework

1 code implementation • 29 Nov 2023 • Bo Qiao, Liqun Li, Xu Zhang, Shilin He, Yu Kang, Chaoyun Zhang, Fangkai Yang, Hang Dong, Jue Zhang, Lu Wang, Minghua Ma, Pu Zhao, Si Qin, Xiaoting Qin, Chao Du, Yong Xu, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

TaskWeaver provides support for rich data structures, flexible plugin usage, and dynamic plugin selection, and leverages LLM coding capabilities for complex logic.

Natural Language Understanding

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

1 code implementation • 18 Aug 2023 • Haipeng Luo, Qingfeng Sun, Can Xu, Pu Zhao, Jian-Guang Lou, Chongyang Tao, Xiubo Geng, Qingwei Lin, Shifeng Chen, Yansong Tang, Dongmei Zhang

Large language models (LLMs), such as GPT-4, have shown remarkable performance in natural language processing (NLP) tasks, including challenging mathematical reasoning.

Ranked #54 on Arithmetic Reasoning on GSM8K (using extra training data)

Arithmetic Reasoning GSM8K +2

Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction

1 code implementation • 1 Aug 2023 • Zhangchi Zhu, Lu Wang, Pu Zhao, Chao Du, Wei Zhang, Hang Dong, Bo Qiao, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

To mitigate the impact of label uncertainty and improve the robustness of learning with positive and unlabeled data, we propose a new robust PU learning method with a training strategy motivated by the nature of human learning: easy cases should be learned first.
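
A minimal self-paced training step in that spirit, where examples above a loss threshold are deferred; the paper's noisy-negative self-correction is more involved, and `criterion` is assumed to use reduction='none':

```python
import torch

def self_paced_step(model, x, y, criterion, optimizer, lam: float):
    """One 'easy cases first' step: examples whose loss exceeds lam get weight 0."""
    losses = criterion(model(x), y)            # per-sample losses
    weights = (losses.detach() < lam).float()  # easy = below threshold
    loss = (weights * losses).sum() / weights.sum().clamp(min=1.0)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss
```

Gradually raising lam over training admits harder examples as the model improves.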

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

4 code implementations • 14 Jun 2023 • Ziyang Luo, Can Xu, Pu Zhao, Qingfeng Sun, Xiubo Geng, Wenxiang Hu, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

Moreover, our model even outperforms the largest closed LLMs, Anthropic's Claude and Google's Bard, on HumanEval and HumanEval+.

Code Generation HumanEval +1

Empower Large Language Model to Perform Better on Industrial Domain-Specific Question Answering

1 code implementation • 19 May 2023 • Fangkai Yang, Pu Zhao, Zezhong Wang, Lu Wang, Jue Zhang, Mohit Garg, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang

Large Language Models (LLMs) have gained popularity and achieved remarkable results in open-domain tasks, but their performance in real industrial, domain-specific scenarios is only average due to a lack of specific domain knowledge.

Language Modeling +3

Introspective Tips: Large Language Model for In-Context Decision Making

no code implementations • 19 May 2023 • Liting Chen, Lu Wang, Hang Dong, Yali Du, Jie Yan, Fangkai Yang, Shuang Li, Pu Zhao, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang

The emergence of large language models (LLMs) has substantially influenced natural language processing, demonstrating exceptional results across various tasks.

Decision Making Language Modeling +3

Augmented Large Language Models with Parametric Knowledge Guiding

1 code implementation • 8 May 2023 • Ziyang Luo, Can Xu, Pu Zhao, Xiubo Geng, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

We demonstrate that our PKG framework can enhance the performance of "black-box" LLMs on a range of domain knowledge-intensive tasks that require factual (+7.9%), tabular (+11.9%), medical (+3.0%), and multimodal (+8.1%) knowledge.

WizardLM: Empowering Large Language Models to Follow Complex Instructions

4 code implementations • 24 Apr 2023 • Can Xu, Qingfeng Sun, Kai Zheng, Xiubo Geng, Pu Zhao, Jiazhan Feng, Chongyang Tao, Daxin Jiang

In this paper, we show an avenue for creating large amounts of instruction data with varying levels of complexity using an LLM instead of humans.
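
A toy sketch of the evolution idea, deepening an instruction through repeated LLM calls; the prompt wording and the `llm` callable are hypothetical, not the paper's Evol-Instruct prompts:

```python
DEEPEN_PROMPT = (
    "Rewrite the following instruction to be slightly more complex, adding one "
    "extra constraint or reasoning step while keeping it answerable:\n{instruction}"
)

def evolve(instruction: str, llm, rounds: int = 3) -> list[str]:
    """Generate progressively harder variants of a seed instruction."""
    variants = [instruction]
    for _ in range(rounds):
        variants.append(llm(DEEPEN_PROMPT.format(instruction=variants[-1])))
    return variants
```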

Instruction Following

Less is More: Data Pruning for Faster Adversarial Training

no code implementations • 23 Feb 2023 • Yize Li, Pu Zhao, Xue Lin, Bhavya Kailkhura, Ryan Goldhahn

Deep neural networks (DNNs) are sensitive to adversarial examples, resulting in fragile and unreliable performance in the real world.

LexLIP: Lexicon-Bottlenecked Language-Image Pre-Training for Large-Scale Image-Text Retrieval

1 code implementation • 6 Feb 2023 • Ziyang Luo, Pu Zhao, Can Xu, Xiubo Geng, Tao Shen, Chongyang Tao, Jing Ma, Qingwei Lin, Daxin Jiang

The conventional dense retrieval paradigm relies on encoding images and texts into dense representations using dual-stream encoders; however, it faces challenges with low retrieval speed in large-scale retrieval scenarios.

Image-text Retrieval Text Retrieval

Pruning Parameterization With Bi-Level Optimization for Efficient Semantic Segmentation on the Edge

no code implementations • CVPR 2023 • Changdi Yang, Pu Zhao, Yanyu Li, Wei Niu, Jiexiong Guan, Hao Tang, Minghai Qin, Bin Ren, Xue Lin, Yanzhi Wang

With the ever-increasing popularity of edge devices, it is necessary to implement real-time segmentation on the edge for autonomous driving and many other applications.

Autonomous Driving Segmentation +1

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

no code implementations • 9 Dec 2022 • Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang

By re-configuring the model to the corresponding pruning ratio for a specific execution frequency (and voltage), we are able to achieve stable inference speed, i.e., keeping the difference in speed performance under various execution frequencies as small as possible.

Management

Advancing Model Pruning via Bi-level Optimization

1 code implementation • 8 Oct 2022 • Yihua Zhang, Yuguang Yao, Parikshit Ram, Pu Zhao, Tianlong Chen, Mingyi Hong, Yanzhi Wang, Sijia Liu

To reduce the computation overhead, various efficient 'one-shot' pruning methods have been developed, but these schemes are usually unable to find winning tickets as good as IMP.

Efficient Multi-Prize Lottery Tickets: Enhanced Accuracy, Training, and Inference Speed

no code implementations • 26 Sep 2022 • Hao Cheng, Pu Zhao, Yize Li, Xue Lin, James Diffenderfer, Ryan Goldhahn, Bhavya Kailkhura

Recently, Diffenderfer and Kailkhura proposed a new paradigm for learning compact yet highly accurate binary neural networks simply by pruning and quantizing randomly weighted full precision neural networks.

Compiler-Aware Neural Architecture Search for On-Mobile Real-time Super-Resolution

1 code implementation • 25 Jul 2022 • Yushu Wu, Yifan Gong, Pu Zhao, Yanyu Li, Zheng Zhan, Wei Niu, Hao Tang, Minghai Qin, Bin Ren, Yanzhi Wang

Instead of measuring the speed on mobile devices at each iteration during the search process, a speed model incorporated with compiler optimizations is leveraged to predict the inference latency of the SR block with various width configurations for faster convergence.
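
The speed-model idea reduces, at its simplest, to a lookup over offline-profiled per-block latencies; the table structure below is an assumption for illustration, not the paper's learned predictor.

```python
def predict_latency(config, latency_table: dict) -> float:
    """Sum per-block latencies from a table profiled offline with compiler
    optimizations, instead of timing each candidate on-device."""
    return sum(latency_table[(name, width)] for name, width in config)
```

Here `config` might look like [("conv3x3", 32), ("conv3x3", 48)], with the table built once per target device before the search begins.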

Neural Architecture Search SSIM +1

Pruning-as-Search: Efficient Neural Architecture Search via Channel Pruning and Structural Reparameterization

1 code implementation • 2 Jun 2022 • Yanyu Li, Pu Zhao, Geng Yuan, Xue Lin, Yanzhi Wang, Xin Chen

By combining the structural reparameterization and PaS, we successfully searched out a new family of VGG-like and lightweight networks, which enable the flexibility of arbitrary width with respect to each layer instead of each stage.

Instance Segmentation Network Pruning +2

Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration

no code implementations • 22 Nov 2021 • Yifan Gong, Geng Yuan, Zheng Zhan, Wei Niu, Zhengang Li, Pu Zhao, Yuxuan Cai, Sijia Liu, Bin Ren, Xue Lin, Xulong Tang, Yanzhi Wang

Weight pruning is an effective model compression technique to tackle the challenges of achieving real-time deep neural network (DNN) inference on mobile devices.

Model Compression

Achieving on-Mobile Real-Time Super-Resolution with Neural Architecture and Pruning Search

no code implementations • ICCV 2021 • Zheng Zhan, Yifan Gong, Pu Zhao, Geng Yuan, Wei Niu, Yushu Wu, Tianyun Zhang, Malith Jayaweera, David Kaeli, Bin Ren, Xue Lin, Yanzhi Wang

Though recent years have witnessed remarkable progress in single image super-resolution (SISR) tasks with the prosperous development of deep neural networks (DNNs), deep learning methods are confronted with computation and memory consumption issues in practice, especially on resource-limited platforms such as mobile devices.

Image Super-Resolution Neural Architecture Search +1

A Compression-Compilation Framework for On-mobile Real-time BERT Applications

no code implementations • 30 May 2021 • Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

In this paper, we propose a compression-compilation co-design framework that can guarantee the identified model to meet both resource and real-time specifications of mobile devices.

Question Answering Text Generation

High-Robustness, Low-Transferability Fingerprinting of Neural Networks

no code implementations • 14 May 2021 • Siyue Wang, Xiao Wang, Pin-Yu Chen, Pu Zhao, Xue Lin

This paper proposes Characteristic Examples for effectively fingerprinting deep neural networks, featuring high robustness to the base model against model pruning as well as low transferability to unassociated models.

Learning to Generate Image Source-Agnostic Universal Adversarial Perturbations

no code implementations • 29 Sep 2020 • Pu Zhao, Parikshit Ram, Songtao Lu, Yuguang Yao, Djallel Bouneffouf, Xue Lin, Sijia Liu

The resulting scheme for meta-learning a UAP generator (i) has better performance (50% higher ASR) than baselines such as Projected Gradient Descent, (ii) has better performance (37% faster) than the vanilla L2O and MAML frameworks (when applicable), and (iii) is able to simultaneously handle UAP generation for different victim models and image data sources.

Adversarial Attack Bilevel Optimization +1

Real-Time Execution of Large-scale Language Models on Mobile

no code implementations • 15 Sep 2020 • Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

Our framework can guarantee the identified model to meet both resource and real-time specifications of mobile devices, thus achieving real-time execution of large transformer-based models like BERT variants.

Edge-computing

Bridging Mode Connectivity in Loss Landscapes and Adversarial Robustness

3 code implementations • ICLR 2020 • Pu Zhao, Pin-Yu Chen, Payel Das, Karthikeyan Natesan Ramamurthy, Xue Lin

In this work, we propose to employ mode connectivity in loss landscapes to study the adversarial robustness of deep neural networks, and provide novel methods for improving this robustness.
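
Mode connectivity is typically instantiated as a low-loss quadratic Bezier path between two trained solutions; a generic sketch over state-dict-style weights is below (the control point theta_c is what gets trained so that every point on the path has low loss; this is the standard construction, not code from the paper).

```python
def bezier_weights(theta_a: dict, theta_b: dict, theta_c: dict, t: float) -> dict:
    """Point on a quadratic Bezier curve in weight space connecting two trained
    models through a learned control point; t in [0, 1]."""
    return {k: (1 - t) ** 2 * theta_a[k]
               + 2 * t * (1 - t) * theta_c[k]
               + t ** 2 * theta_b[k]
            for k in theta_a}
```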

Adversarial Robustness

Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization

no code implementations • 22 Apr 2020 • Wei Niu, Pu Zhao, Zheng Zhan, Xue Lin, Yanzhi Wang, Bin Ren

High-end mobile platforms rapidly serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications.

Compiler Optimization Style Transfer +1

Defending against Backdoor Attack on Deep Neural Networks

no code implementations • 26 Feb 2020 • Hao Cheng, Kaidi Xu, Sijia Liu, Pin-Yu Chen, Pu Zhao, Xue Lin

Although deep neural networks (DNNs) have achieved great success in various computer vision tasks, it has recently been found that they are vulnerable to adversarial attacks.

Backdoor Attack Data Poisoning

Towards Query-Efficient Black-Box Adversary with Zeroth-Order Natural Gradient Descent

1 code implementation • 18 Feb 2020 • Pu Zhao, Pin-Yu Chen, Siyue Wang, Xue Lin

Despite the great achievements of the modern deep neural networks (DNNs), the vulnerability/robustness of state-of-the-art DNNs raises security concerns in many application domains requiring high reliability.

Adversarial Attack Image Classification +1

On the Design of Black-box Adversarial Examples by Leveraging Gradient-free Optimization and Operator Splitting Method

1 code implementation • ICCV 2019 • Pu Zhao, Sijia Liu, Pin-Yu Chen, Nghia Hoang, Kaidi Xu, Bhavya Kailkhura, Xue Lin

Robust machine learning is currently one of the most prominent topics, with the potential to help shape a future of advanced AI platforms that perform well not only in average cases but also in worst cases or adverse situations.

Adversarial Attack Bayesian Optimization +2

Fault Sneaking Attack: a Stealthy Framework for Misleading Deep Neural Networks

no code implementations • 28 May 2019 • Pu Zhao, Siyue Wang, Cheng Gongye, Yanzhi Wang, Yunsi Fei, Xue Lin

Despite the great achievements of deep neural networks (DNNs), the vulnerability of state-of-the-art DNNs raises security concerns about their use in application domains requiring high reliability. We propose the fault sneaking attack on DNNs, where the adversary aims to misclassify certain input images into any target labels by modifying the DNN parameters.

Interpreting Adversarial Examples by Activation Promotion and Suppression

no code implementations • 3 Apr 2019 • Kaidi Xu, Sijia Liu, Gaoyuan Zhang, Mengshu Sun, Pu Zhao, Quanfu Fan, Chuang Gan, Xue Lin

It is widely known that convolutional neural networks (CNNs) are vulnerable to adversarial examples: images with imperceptible perturbations crafted to fool classifiers.

Adversarial Robustness

Defensive Dropout for Hardening Deep Neural Networks under Adversarial Attacks

no code implementations • 13 Sep 2018 • Siyue Wang, Xiao Wang, Pu Zhao, Wujie Wen, David Kaeli, Peter Chin, Xue Lin

Based on observations of the effect of the test dropout rate on test accuracy and attack success rate, we propose a defensive dropout algorithm that determines an optimal test dropout rate given the neural network model and the attacker's strategy for generating adversarial examples. We also investigate the mechanism behind the outstanding defense effects achieved by the proposed defensive dropout.
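
The mechanism amounts to keeping dropout active at inference time with a tuned rate; a minimal sketch is below (the rate 0.3 is a placeholder, not the tuned value from the paper).

```python
import torch
import torch.nn as nn

def predict_with_defensive_dropout(model: nn.Module, x: torch.Tensor, p: float = 0.3):
    """Inference with dropout deliberately left on, so the network an attacker
    probes is stochastic; p trades clean accuracy against attack success rate."""
    model.eval()                      # freeze batch norm and other layers
    for m in model.modules():
        if isinstance(m, nn.Dropout):
            m.p = p
            m.train()                 # re-enable only the dropout layers
    with torch.no_grad():
        return model(x)
```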

Structured Adversarial Attack: Towards General Implementation and Better Interpretability

1 code implementation • ICLR 2019 • Kaidi Xu, Sijia Liu, Pu Zhao, Pin-Yu Chen, Huan Zhang, Quanfu Fan, Deniz Erdogmus, Yanzhi Wang, Xue Lin

When generating adversarial examples to attack deep neural networks (DNNs), the Lp norm of the added perturbation is usually used to measure the similarity between the original image and the adversarial example.

Adversarial Attack

An ADMM-Based Universal Framework for Adversarial Attacks on Deep Neural Networks

no code implementations • 9 Apr 2018 • Pu Zhao, Sijia Liu, Yanzhi Wang, Xue Lin

In the literature, the added distortions are usually measured by the L0, L1, L2, and L-infinity norms, giving rise to L0, L1, L2, and L-infinity attacks, respectively.
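
For reference, the four distortion measures can be computed as follows (a plain PyTorch illustration, where delta is the added perturbation tensor):

```python
import torch

def perturbation_norms(delta: torch.Tensor) -> dict:
    """The four distortion measures commonly used for adversarial attacks."""
    flat = delta.flatten()
    return {
        "L0":   (flat != 0).sum().item(),  # number of changed elements
        "L1":   flat.abs().sum().item(),   # total absolute change
        "L2":   flat.norm(p=2).item(),     # Euclidean magnitude
        "Linf": flat.abs().max().item(),   # largest single change
    }
```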

Adversarial Attack
