Search Results for author: Lichao Sun

Found 159 papers, 79 papers with code

EfficientLLM: Efficiency in Large Language Models

no code implementations 20 May 2025 Zhengqing Yuan, Weixiang Sun, Yixin Liu, Huichi Zhou, Rong Zhou, Yiyang Li, Zheyuan Zhang, Wei Song, Yue Huang, Haolong Jia, Keerthiram Murugesan, Yu Wang, Lifang He, Jianfeng Gao, Lichao Sun, Yanfang Ye

Large Language Models (LLMs) have driven significant progress, yet their growing parameter counts and context windows incur prohibitive compute, energy, and monetary costs.

Analytic Energy-Guided Policy Optimization for Offline Reinforcement Learning

no code implementations 3 May 2025 Jifeng Hu, Sili Huang, Zhejian Yang, Shengchao Hu, Li Shen, Hechang Chen, Lichao Sun, Yi Chang, DaCheng Tao

Finally, we train an intermediate energy neural network to approach the target estimation of log-expectation formulation.

D4RL Offline RL +3

Missing Premise exacerbates Overthinking: Are Reasoning Models losing Critical Thinking Skill?

1 code implementation 9 Apr 2025 Chenrui Fan, Ming Li, Lichao Sun, Tianyi Zhou

This implies a critical flaw of the current training recipe for reasoning LLMs, which does not encourage efficient thinking adequately, leading to the abuse of thinking patterns.

Could AI Trace and Explain the Origins of AI-Generated Images and Text?

1 code implementation 5 Apr 2025 Hongchao Fang, Yixin Liu, Jiangshu Du, Can Qin, Ran Xu, Feng Liu, Lichao Sun, Dongwon Lee, Lifu Huang, Wenpeng Yin

AI-generated content is becoming increasingly prevalent in the real world, leading to serious ethical and societal concerns.

TruthPrInt: Mitigating LVLM Object Hallucination Via Latent Truthful-Guided Pre-Intervention

1 code implementation 13 Mar 2025 Jinhao Duan, Fei Kong, Hao Cheng, James Diffenderfer, Bhavya Kailkhura, Lichao Sun, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu

In this paper, we first conduct an in-depth exploration of LVLM internal states in relation to OH issues and discover that (1) LVLM internal states are high-specificity per-token indicators of hallucination behaviors.

Hallucination Object Hallucination +1

A Survey on Post-training of Large Language Models

no code implementations 8 Mar 2025 Guiyao Tie, Zeli Zhao, Dingjie Song, Fuyang Wei, Rong Zhou, Yurou Dai, Wen Yin, Zhejian Yang, Jiangyue Yan, Yao Su, Zhenhan Dai, Yifeng Xie, Yihan Cao, Lichao Sun, Pan Zhou, Lifang He, Hechang Chen, Yu Zhang, Qingsong Wen, Tianming Liu, Neil Zhenqiang Gong, Jiliang Tang, Caiming Xiong, Heng Ji, Philip S. Yu, Jianfeng Gao

The emergence of Large Language Models (LLMs) has fundamentally transformed natural language processing, making them indispensable across domains ranging from conversational systems to scientific exploration.

Survey

Poisoned-MRAG: Knowledge Poisoning Attacks to Multimodal Retrieval Augmented Generation

no code implementations 8 Mar 2025 Yinuo Liu, Zenghui Yuan, Guiyao Tie, Jiawen Shi, Pan Zhou, Lichao Sun, Neil Zhenqiang Gong

Multimodal retrieval-augmented generation (RAG) enhances the visual reasoning capability of vision-language models (VLMs) by dynamically accessing information from external knowledge bases.

RAG Retrieval +1

End-to-End Deep Learning for Structural Brain Imaging: A Unified Framework

1 code implementation 23 Feb 2025 Yao Su, Keqi Han, Mingjie Zeng, Lichao Sun, Liang Zhan, Carl Yang, Lifang He, Xiangnan Kong

Brain imaging analysis is fundamental in neuroscience, providing valuable insights into brain structure and function.

Computational Efficiency

From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education

no code implementations 19 Feb 2025 Yi-Fan Zhang, Hang Li, Dingjie Song, Lichao Sun, Tianlong Xu, Qingsong Wen

Finally, we propose a multi-agent collaborative framework that combines a Time Series Agent for historical analysis and an MLLM Agent for real-time refinement, enhancing error classification and feedback generation.

Diagnostic GSM8K +1

XAttnMark: Learning Robust Audio Watermarking with Cross-Attention

no code implementations 6 Feb 2025 Yixin Liu, Lie Lu, Jihui Jin, Lichao Sun, Andrea Fanelli

The rapid proliferation of generative audio synthesis and editing technologies has raised significant concerns about copyright infringement, data provenance, and the spread of misinformation through deepfake audio.

Audio Synthesis Face Swapping +1

CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries

no code implementations 2 Jan 2025 Shudong Liu, Yiqiao Jin, Cheng Li, Derek F. Wong, Qingsong Wen, Lichao Sun, Haipeng Chen, Xing Xie, Jindong Wang

Our evaluation of 16 models reveals significant disparities, with a stronger performance in Western concepts and weaker results in African and Asian contexts.

Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination

1 code implementation 15 Nov 2024 Haojie Zheng, Tianyang Xu, Hanchi Sun, Shu Pu, Ruoxi Chen, Lichao Sun

Current approaches like chain of thought (CoT) reasoning have augmented the cognitive capabilities of large language models (LLMs), yet their adaptation to MLLMs is hindered by heightened risks of hallucination in cross-modality comprehension.

Hallucination Multimodal Reasoning

LLaVA-CoT: Let Vision Language Models Reason Step-by-Step

2 code implementations 15 Nov 2024 Guowei Xu, Peng Jin, Hao Li, Yibing Song, Lichao Sun, Li Yuan

Large language models have demonstrated substantial advancements in reasoning capabilities, particularly through inference-time scaling, as illustrated by models such as OpenAI's o1.

Logical Reasoning Multimodal Reasoning +2

SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding

1 code implementation 8 Nov 2024 Ryan Sun, Tianyi Zhou, Xun Chen, Lichao Sun

We present SpecHub, a novel, efficient sampling-verification method for MDSD that improves acceptance rates with only linear computational overhead.

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

1 code implementation 6 Nov 2024 Dingjie Song, Sicheng Lai, Shunian Chen, Lichao Sun, Benyou Wang

The rapid progression of multimodal large language models (MLLMs) has demonstrated superior performance on various multimodal benchmarks.

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

1 code implementation 5 Nov 2024 Yingzi Ma, Jiongxiao Wang, Fei Wang, Siyuan Ma, Jiazhao Li, Xiujun Li, Furong Huang, Lichao Sun, Bo Li, Yejin Choi, Muhao Chen, Chaowei Xiao

Specifically, we formulate the VLM unlearning task via constructing the Fictitious Facial Identity VQA dataset and apply a two-stage evaluation pipeline that is designed to precisely control the sources of information and their exposure levels.

Benchmarking Language Modeling +3

Social Science Meets LLMs: How Reliable Are Large Language Models in Social Simulations?

no code implementations 30 Oct 2024 Yue Huang, Zhengqing Yuan, Yujun Zhou, Kehan Guo, Xiangqi Wang, Haomin Zhuang, Weixiang Sun, Lichao Sun, Jindong Wang, Yanfang Ye, Xiangliang Zhang

To address this, we introduce TrustSim, an evaluation dataset covering 10 CSS-related topics, to systematically investigate the reliability of the LLM simulation.

Solving Continual Offline RL through Selective Weights Activation on Aligned Spaces

no code implementations 21 Oct 2024 Jifeng Hu, Sili Huang, Li Shen, Zhejian Yang, Shengchao Hu, Shisong Tang, Hechang Chen, Yi Chang, DaCheng Tao, Lichao Sun

In the quantized spaces alignment, we leverage vector quantization to align the different state and action spaces of various tasks, facilitating continual training in the same space.

Continual Learning Lifelong learning +2

BenTo: Benchmark Task Reduction with In-Context Transferability

1 code implementation 17 Oct 2024 Hongyu Zhao, Ming Li, Lichao Sun, Tianyi Zhou

Evaluating large language models (LLMs) is costly: it requires the generation and examination of LLM outputs on a large-scale benchmark of various tasks.

In-Context Learning MMLU

FedCAP: Robust Federated Learning via Customized Aggregation and Personalization

1 code implementation 16 Oct 2024 Youpeng Li, Xinda Wang, Fuxun Yu, Lichao Sun, Wenbin Zhang, Xuyu Wang

The core of FedCAP is a model update calibration mechanism to help a server capture the differences in the direction and magnitude of model updates among clients.

Anomaly Detection Federated Learning +1

MLP-KAN: Unifying Deep Representation and Function Learning

1 code implementation 3 Oct 2024 Yunhong He, Yifeng Xie, Zhengqing Yuan, Lichao Sun

Recent advancements in both representation learning and function learning have demonstrated substantial promise across diverse domains of artificial intelligence.

Kolmogorov-Arnold Networks Mixture-of-Experts +2

Empirical Perturbation Analysis of Linear System Solvers from a Data Poisoning Perspective

no code implementations 1 Oct 2024 Yixin Liu, Arielle Carr, Lichao Sun

The perturbation analysis of linear solvers applied to systems arising broadly in machine learning settings -- for instance, when using linear regression models -- establishes an important perspective when reframing these analyses through the lens of a data poisoning attack.

Data Poisoning

LLM4Brain: Training a Large Language Model for Brain Video Understanding

no code implementations 26 Sep 2024 Ruizhe Zheng, Lichao Sun

Specifically, we employ fine-tuning techniques on an fMRI encoder equipped with adaptors to transform brain responses into latent representations aligned with the video stimuli.

Domain Adaptation Language Modeling +3

Continual Diffuser (CoD): Mastering Continual Offline Reinforcement Learning with Experience Rehearsal

1 code implementation 4 Sep 2024 Jifeng Hu, Li Shen, Sili Huang, Zhejian Yang, Hechang Chen, Lichao Sun, Yi Chang, DaCheng Tao

Artificial neural networks, especially recent diffusion-based models, have shown remarkable superiority in gaming, control, and QA systems, where the training tasks' datasets are usually static.

Reinforcement Learning (RL)

Biomedical SAM 2: Segment Anything in Biomedical Images and Videos

1 code implementation 6 Aug 2024 Zhiling Yan, Weixiang Sun, Rong Zhou, Zhengqing Yuan, Kai Zhang, Yiwei Li, Tianming Liu, Quanzheng Li, Xiang Li, Lifang He, Lichao Sun

Medical image segmentation and video object segmentation are essential for diagnosing and analyzing diseases by identifying and measuring biological structures.

Image Segmentation Medical Image Segmentation +5

Can Large Language Models Automatically Jailbreak GPT-4V?

no code implementations 23 Jul 2024 Yuanwei Wu, Yue Huang, Yixin Liu, Xiang Li, Pan Zhou, Lichao Sun

In our study, we introduce AutoJailbreak, an innovative automatic jailbreak technique inspired by prompt optimization.

Face Recognition In-Context Learning +2

Unified-EGformer: Exposure Guided Lightweight Transformer for Mixed-Exposure Image Enhancement

no code implementations 18 Jul 2024 Eashan Adhikarla, Kai Zhang, Rosaura G. VidalMata, Manjushree Aithal, Nikhil Ambha Madhusudhana, John Nicholson, Lichao Sun, Brian D. Davison

Despite recent strides made by AI in image processing, the issue of mixed exposure, pivotal in many real-world scenarios like surveillance and photography, remains inadequately addressed.

Autonomous Navigation Image Enhancement

Bora: Biomedical Generalist Video Generation Model

no code implementations 12 Jul 2024 Weixiang Sun, Xiaocao You, Ruizhe Zheng, Zhengqing Yuan, Xiang Li, Lifang He, Quanzheng Li, Lichao Sun

This paper introduces Bora, the first spatio-temporal diffusion probabilistic model designed for text-guided biomedical video generation.

Cell Tracking Data Augmentation +2

Self-Cognition in Large Language Models: An Exploratory Study

no code implementations 1 Jul 2024 Dongping Chen, Jiawen Shi, Yao Wan, Pan Zhou, Neil Zhenqiang Gong, Lichao Sun

Additionally, we explore the utility and trustworthiness of LLMs in the self-cognition state, revealing that the self-cognition state enhances some specific tasks such as creative writing and exaggeration.

Chatbot

Rethinking and Defending Protective Perturbation in Personalized Diffusion Models

1 code implementation 27 Jun 2024 Yixin Liu, Ruoxi Chen, Xun Chen, Lichao Sun

Existing purification methods attempt to mitigate this issue but often over-purify images, resulting in information loss.

Contrastive Learning Image Generation +1

UniGen: A Unified Framework for Textual Dataset Generation Using Large Language Models

1 code implementation 27 Jun 2024 Siyuan Wu, Yue Huang, Chujie Gao, Dongping Chen, Qihui Zhang, Yao Wan, Tianyi Zhou, Xiangliang Zhang, Jianfeng Gao, Chaowei Xiao, Lichao Sun

Large Language Models (LLMs) such as GPT-4 and Llama3 have significantly impacted various fields by enabling high-quality synthetic data generation and reducing dependence on expensive human-generated datasets.

Attribute Benchmarking +4

ViT-1.58b: Mobile Vision Transformers in the 1-bit Era

1 code implementation 26 Jun 2024 Zhengqing Yuan, Rong Zhou, Hongyi Wang, Lifang He, Yanfang Ye, Lichao Sun

Vision Transformers (ViTs) have achieved remarkable performance in various image classification tasks by leveraging the attention mechanism to process image patches as tokens.

Image Classification Quantization

Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models

no code implementations 25 Jun 2024 Yuan Li, Yue Huang, Hongyi Wang, Xiangliang Zhang, James Zou, Lichao Sun

Inspired by psychometrics, this paper presents a framework for investigating psychology in LLMs, including psychological dimension identification, assessment dataset curation, and assessment with results validation.

1+1>2: Can Large Language Models Serve as Cross-Lingual Knowledge Aggregators?

no code implementations 20 Jun 2024 Yue Huang, Chenrui Fan, Yuan Li, Siyuan Wu, Tianyi Zhou, Xiangliang Zhang, Lichao Sun

This paper introduces a method to enhance the multilingual performance of LLMs by aggregating knowledge from diverse languages.

Jailbreaking Large Language Models Through Alignment Vulnerabilities in Out-of-Distribution Settings

1 code implementation 19 Jun 2024 Yue Huang, Jingyu Tang, Dongping Chen, Bingda Tang, Yao Wan, Lichao Sun, Philip S. Yu, Xiangliang Zhang

Recently, Large Language Models (LLMs) have garnered significant attention for their exceptional natural language processing capabilities.

GUI-World: A Video Benchmark and Dataset for Multimodal GUI-oriented Understanding

1 code implementation 16 Jun 2024 Dongping Chen, Yue Huang, Siyuan Wu, Jingyu Tang, Liuyi Chen, Yilin Bai, Zhigang He, Chenlong Wang, Huichi Zhou, Yiqiang Li, Tianshuo Zhou, Yue Yu, Chujie Gao, Qihui Zhang, Yi Gui, Zhen Li, Yao Wan, Pan Zhou, Jianfeng Gao, Lichao Sun

We evaluate the capabilities of current state-of-the-art MLLMs, including Image LLMs and Video LLMs, in understanding various types of GUI content, especially dynamic and sequential content.

HonestLLM: Toward an Honest and Helpful Large Language Model

1 code implementation 1 Jun 2024 Chujie Gao, Siyuan Wu, Yue Huang, Dongping Chen, Qihui Zhang, Zhengyan Fu, Yao Wan, Lichao Sun, Xiangliang Zhang

Subsequently, we present two approaches to augmenting honesty and helpfulness in LLMs: a training-free enhancement and a fine-tuning-based improvement.

Language Modeling Language Modelling +1

In-Context Decision Transformer: Reinforcement Learning via Hierarchical Chain-of-Thought

1 code implementation 31 May 2024 Sili Huang, Jifeng Hu, Hechang Chen, Lichao Sun, Bo Yang

Recent works demonstrated that in-context RL could emerge with self-improvement in a trial-and-error manner when treating RL tasks as an across-episodic sequential prediction problem.

D4RL Decision Making +3

Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling

no code implementations 31 May 2024 Sili Huang, Jifeng Hu, Zhejian Yang, Liwei Yang, Tao Luo, Hechang Chen, Lichao Sun, Bo Yang

Then, we propose a Decision Mamba-Hybrid (DM-H) with the merits of transformers and Mamba in high-quality prediction and long-term memory.

D4RL Mamba +3

Variational Bayes for Federated Continual Learning

1 code implementation 23 May 2024 Dezhong Yao, Sanmu Li, Yutong Dai, Zhiqiang Xu, Shengshan Hu, Peilin Zhao, Lichao Sun

Federated continual learning (FCL) has received increasing attention due to its potential in handling real-world streaming data, characterized by evolving data distributions and varying client classes over time.

Continual Learning Federated Learning

Physical Backdoor: Towards Temperature-based Backdoor Attacks in the Physical World

no code implementations CVPR 2024 Wen Yin, Jian Lou, Pan Zhou, Yulai Xie, Dan Feng, Yuhua Sun, Tailai Zhang, Lichao Sun

In the digital realm, we evaluate our approach using benchmark datasets for TIOD, achieving an Attack Success Rate (ASR) of up to 98.21%.

Object object-detection +1

CodeIP: A Grammar-Guided Multi-Bit Watermark for Large Language Models of Code

1 code implementation 24 Apr 2024 Batu Guan, Yao Wan, Zhangqian Bi, Zheng Wang, Hongyu Zhang, Pan Zhou, Lichao Sun

Experiments conducted on a real-world dataset across five programming languages demonstrate the effectiveness of CodeIP in watermarking LLMs for code generation while maintaining the syntactical correctness of code.

Code Generation Diversity

Does Your Neural Code Completion Model Use My Code? A Membership Inference Approach

1 code implementation 22 Apr 2024 Yao Wan, Guanghua Wan, Shijie Zhang, Hongyu Zhang, Pan Zhou, Hai Jin, Lichao Sun

Subsequently, the membership classifier can be effectively employed to deduce the membership status of a given code sample based on the output of a target code completion model.

Code Completion Memorization

Mora: Enabling Generalist Video Generation via A Multi-Agent Framework

1 code implementation 20 Mar 2024 Zhengqing Yuan, Yixin Liu, Yihan Cao, Weixiang Sun, Haolong Jia, Ruoxi Chen, Zhaoxu Li, Bin Lin, Li Yuan, Lifang He, Chi Wang, Yanfang Ye, Lichao Sun

Existing open-source methods struggle to achieve comparable performance, often hindered by ineffective agent collaboration and inadequate training data quality.

Image to Video Generation Text-to-Video Generation +1

Medical Unlearnable Examples: Securing Medical Data from Unauthorized Training via Sparsity-Aware Local Masking

no code implementations 15 Mar 2024 Weixiang Sun, Yixin Liu, Zhiling Yan, Kaidi Xu, Lichao Sun

The rapid expansion of AI in healthcare has led to a surge in medical data generation and storage, boosting medical AI development.

Conditional Score-Based Diffusion Model for Cortical Thickness Trajectory Prediction

no code implementations 11 Mar 2024 Qing Xiao, Siyeop Yoon, Hui Ren, Matthew Tivnan, Lichao Sun, Quanzheng Li, Tianming Liu, Yu Zhang, Xiang Li

Alzheimer's Disease (AD) is a neurodegenerative condition characterized by diverse progression rates among individuals, with changes in cortical thickness (CTh) closely linked to its progression.

Prediction Trajectory Prediction

Medical Image Synthesis via Fine-Grained Image-Text Alignment and Anatomy-Pathology Prompting

no code implementations 11 Mar 2024 WenTing Chen, Pengyu Wang, Hui Ren, Lichao Sun, Quanzheng Li, Yixuan Yuan, Xiang Li

To address these challenges, we propose a novel medical image synthesis model that leverages fine-grained image-text alignment and anatomy-pathology prompts to generate highly detailed and accurate synthetic medical images.

Anatomy Descriptive +1

MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark

1 code implementation 7 Feb 2024 Dongping Chen, Ruoxi Chen, Shilin Zhang, Yinuo Liu, Yaochen Wang, Huichi Zhou, Qihui Zhang, Yao Wan, Pan Zhou, Lichao Sun

Drawing inspiration from the concept of LLM-as-a-Judge within LLMs, this paper introduces a novel benchmark, termed MLLM-as-a-Judge, to assess the ability of MLLMs in assisting judges across diverse modalities, encompassing three distinct tasks: Scoring Evaluation, Pair Comparison, and Batch Ranking.

Revisiting Gradient Pruning: A Dual Realization for Defending against Gradient Attacks

no code implementations 30 Jan 2024 Lulu Xue, Shengshan Hu, Ruizhi Zhao, Leo Yu Zhang, Shengqing Hu, Lichao Sun, Dezhong Yao

To mitigate the weaknesses of existing solutions, we propose a novel defense method, Dual Gradient Pruning (DGP), based on gradient pruning, which can improve communication efficiency while preserving the utility and privacy of CL.

The Radiation Oncology NLP Database

1 code implementation 19 Jan 2024 Zhengliang Liu, Jason Holmes, Wenxiong Liao, Chenbin Liu, Lian Zhang, Hongying Feng, Peilong Wang, Muhammad Ali Elahi, Hongmin Cai, Lichao Sun, Quanzheng Li, Xiang Li, Tianming Liu, Jiajian Shen, Wei Liu

ROND is specifically designed to address this gap in the domain of radiation oncology, a field that offers many opportunities for NLP exploration.

Language Modelling Large Language Model +7

LLM-as-a-Coauthor: Can Mixed Human-Written and Machine-Generated Text Be Detected?

2 code implementations 11 Jan 2024 Qihui Zhang, Chujie Gao, Dongping Chen, Yue Huang, Yixin Huang, Zhenyang Sun, Shilin Zhang, Weiye Li, Zhengyan Fu, Yao Wan, Lichao Sun

With the rapid development and widespread application of Large Language Models (LLMs), the use of Machine-Generated Text (MGT) has become increasingly common, bringing with it potential risks, especially in terms of quality and integrity in fields like news, education, and science.

Binary text classification

Deep Efficient Private Neighbor Generation for Subgraph Federated Learning

no code implementations 9 Jan 2024 Ke Zhang, Lichao Sun, Bolin Ding, Siu Ming Yiu, Carl Yang

Behemoth graphs are often fragmented and separately stored by multiple data owners as distributed subgraphs in many realistic applications.

Federated Learning Graph Mining

TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones

2 code implementations 28 Dec 2023 Zhengqing Yuan, Zhaoxu Li, Weiran Huang, Yanfang Ye, Lichao Sun

In recent years, multimodal large language models (MLLMs) such as GPT-4V have demonstrated remarkable advancements, excelling in a variety of vision-language tasks.

Computational Efficiency Image Captioning +7

ClassLIE: Structure- and Illumination-Adaptive Classification for Low-Light Image Enhancement

no code implementations 20 Dec 2023 Zixiang Wei, Yiting Wang, Lichao Sun, Athanasios V. Vasilakos, Lin Wang

A class prediction block is then designed to classify the degradation information by calculating the structure similarity scores on the reflectance map and mean square error on the illumination map.

Low-Light Image Enhancement SSIM

Robust Computer Vision in an Ever-Changing World: A Survey of Techniques for Tackling Distribution Shifts

no code implementations 3 Dec 2023 Eashan Adhikarla, Kai Zhang, Jun Yu, Lichao Sun, John Nicholson, Brian D. Davison

As a result, it raises concerns about the overall robustness of the machine learning techniques for computer vision applications that are deployed publicly for consumers.

Data Augmentation Transfer Learning

Improving Interpretation Faithfulness for Vision Transformers

no code implementations 29 Nov 2023 Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Di Wang

However, ViTs suffer from issues with explanation faithfulness, as their focal points are fragile to adversarial attacks and can be easily changed with even slight perturbations on the input image.

Denoising

Stable Unlearnable Example: Enhancing the Robustness of Unlearnable Examples via Stable Error-Minimizing Noise

1 code implementation 22 Nov 2023 Yixin Liu, Kaidi Xu, Xun Chen, Lichao Sun

Observing that simply removing the adversarial noise on the training process of the defensive noise can improve the performance of robust unlearnable examples, we identify that solely the surrogate model's robustness contributes to the performance.

MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

1 code implementation CVPR 2024 Yixin Liu, Chenrui Fan, Yutong Dai, Xun Chen, Pan Zhou, Lichao Sun

To solve these challenges, we propose MetaCloak, which solves the bi-level poisoning problem with a meta-learning framework with an additional transformation sampling process to craft transferable and robust perturbation.

Bilevel Optimization Denoising +1

Jailbreaking GPT-4V via Self-Adversarial Attacks with System Prompts

no code implementations 15 Nov 2023 Yuanwei Wu, Xiang Li, Yixin Liu, Pan Zhou, Lichao Sun

This finding indicates potential exploitable security risks in MLLMs; 2) Based on the acquired system prompts, we propose a novel MLLM jailbreaking attack method termed SASP (Self-Adversarial Attack via System Prompt).

Adversarial Attack Red Teaming

Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V

1 code implementation 29 Oct 2023 Zhiling Yan, Kai Zhang, Rong Zhou, Lifang He, Xiang Li, Lichao Sun

In this paper, we critically evaluate the capabilities of the state-of-the-art multimodal large language model, i.e., GPT-4 with Vision (GPT-4V), on the Visual Question Answering (VQA) task.

Diagnostic Language Modeling +5

Towards Graph Foundation Models: A Survey and Beyond

no code implementations 18 Oct 2023 Jiawei Liu, Cheng Yang, Zhiyuan Lu, Junze Chen, Yibo Li, Mengmei Zhang, Ting Bai, Yuan Fang, Lichao Sun, Philip S. Yu, Chuan Shi

Foundation models have emerged as critical components in a variety of artificial intelligence applications, and showcase significant success in natural language processing and several other domains.

Graph Learning Survey

MetaAgents: Simulating Interactions of Human Behaviors for LLM-based Task-oriented Coordination via Collaborative Generative Agents

1 code implementation 10 Oct 2023 Yuan Li, Yixuan Zhang, Lichao Sun

We propose a novel framework that equips collaborative generative agents with human-like reasoning abilities and specialized skills.

Learning Generalizable Agents via Saliency-Guided Features Decorrelation

no code implementations NeurIPS 2023 Sili Huang, Yanchao Sun, Jifeng Hu, Siyuan Guo, Hechang Chen, Yi Chang, Lichao Sun, Bo Yang

Our experimental results demonstrate that SGFD can generalize well on a wide range of test environments and significantly outperforms state-of-the-art methods in handling both task-irrelevant variations and task-relevant variations.

Reinforcement Learning (RL)

FakeGPT: Fake News Generation, Explanation and Detection of Large Language Models

no code implementations 8 Oct 2023 Yue Huang, Lichao Sun

The rampant spread of fake news has adversely affected society, resulting in extensive research on curbing its spread.

News Generation

MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use

1 code implementation 4 Oct 2023 Yue Huang, Jiawen Shi, Yuan Li, Chenrui Fan, Siyuan Wu, Qihui Zhang, Yixin Liu, Pan Zhou, Yao Wan, Neil Zhenqiang Gong, Lichao Sun

However, in scenarios where LLMs serve as intelligent agents, as seen in applications like AutoGPT and MetaGPT, LLMs are expected to engage in intricate decision-making processes that involve deciding whether to employ a tool and selecting the most suitable tool(s) from a collection of available tools to fulfill user requests.

Decision Making

Evaluation of GPT-3 for Anti-Cancer Drug Sensitivity Prediction

no code implementations 18 Sep 2023 Shaika Chowdhury, Sivaraman Rajaganapathy, Lichao Sun, James Cerhan, Nansu Zong

In this study, we investigated the potential of GPT-3 for the anti-cancer drug sensitivity prediction task using structured pharmacogenomics data across five tissue types and evaluated its performance with zero-shot prompting and fine-tuning paradigms.

MA-SAM: Modality-agnostic SAM Adaptation for 3D Medical Image Segmentation

1 code implementation 16 Sep 2023 Cheng Chen, Juzheng Miao, Dufan Wu, Zhiling Yan, Sekeun Kim, Jiang Hu, Aoxiao Zhong, Zhengliang Liu, Lichao Sun, Xiang Li, Tianming Liu, Pheng-Ann Heng, Quanzheng Li

The Segment Anything Model (SAM), a foundation model for general image segmentation, has demonstrated impressive zero-shot performance across numerous natural image segmentation tasks.

Image Segmentation Medical Image Segmentation +5

InstructionGPT-4: A 200-Instruction Paradigm for Fine-Tuning MiniGPT-4

3 code implementations 23 Aug 2023 Lai Wei, Zihao Jiang, Weiran Huang, Lichao Sun

To achieve this, we first propose several metrics to assess the quality of multimodal instruction data.

Instruction Following Question Answering +1

Benchmarking and Analyzing Robust Point Cloud Recognition: Bag of Tricks for Defending Adversarial Examples

1 code implementation ICCV 2023 Qiufan Ji, Lin Wang, Cong Shi, Shengshan Hu, Yingying Chen, Lichao Sun

In this paper, we first establish a comprehensive and rigorous point cloud adversarial robustness benchmark to evaluate adversarial robustness, which can provide a detailed understanding of the effects of the defense and attack methods.

Adversarial Robustness Benchmarking

Instruction Mining: Instruction Data Selection for Tuning Large Language Models

no code implementations 12 Jul 2023 Yihan Cao, Yanbin Kang, Chi Wang, Lichao Sun

Large language models (LLMs) are initially pretrained for broad capabilities and then finetuned with instruction-following datasets to improve their performance in interacting with humans.

Instruction Following Language Modeling +2

TrustGPT: A Benchmark for Trustworthy and Responsible Large Language Models

no code implementations 20 Jun 2023 Yue Huang, Qihui Zhang, Philip S. Y, Lichao Sun

Through the implementation of TrustGPT, this research aims to enhance our understanding of the performance of conversation generation models and promote the development of language models that are more ethical and socially responsible.

FedSecurity: Benchmarking Attacks and Defenses in Federated Learning and Federated LLMs

1 code implementation 8 Jun 2023 Shanshan Han, Baturalp Buyukates, Zijian Hu, Han Jin, Weizhao Jin, Lichao Sun, Xiaoyang Wang, Wenxuan Wu, Chulin Xie, Yuhang Yao, Kai Zhang, Qifan Zhang, Yuhui Zhang, Carlee Joe-Wong, Salman Avestimehr, Chaoyang He

This paper introduces FedSecurity, an end-to-end benchmark that serves as a supplementary component of the FedML library for simulating adversarial attacks and corresponding defense mechanisms in Federated Learning (FL).

Benchmarking Federated Learning

Decentralized Federated Learning: A Survey and Perspective

no code implementations 2 Jun 2023 Liangqi Yuan, Ziran Wang, Lichao Sun, Philip S. Yu, Christopher G. Brinton

Federated learning (FL) has been gaining attention for its ability to share knowledge while maintaining user data, protecting privacy, increasing learning efficiency, and reducing communication overhead.

Federated Learning Survey

DiffusionShield: A Watermark for Copyright Protection against Generative Diffusion Models

1 code implementation 25 May 2023 Yingqian Cui, Jie Ren, Han Xu, Pengfei He, Hui Liu, Lichao Sun, Yue Xing, Jiliang Tang

By detecting the watermark from generated images, copyright infringement can be exposed with evidence.

ArtGPT-4: Towards Artistic-understanding Large Vision-Language Models with Enhanced Adapter

1 code implementation 12 May 2023 Zhengqing Yuan, Yunhong He, Kun Wang, Yanfang Ye, Lichao Sun

However, a grand challenge of exploiting LLMs for multimodal learning is the size of pre-trained LLMs, which always have billions of parameters.

Image Comprehension Language Modelling

Prompt Engineering for Healthcare: Methodologies and Applications

no code implementations 28 Apr 2023 Jiaqi Wang, Enze Shi, Sigang Yu, Zihao Wu, Chong Ma, Haixing Dai, Qiushi Yang, Yanqing Kang, Jinru Wu, Huawen Hu, Chenxi Yue, Haiyang Zhang, Yiheng Liu, Yi Pan, Zhengliang Liu, Lichao Sun, Xiang Li, Bao Ge, Xi Jiang, Dajiang Zhu, Yixuan Yuan, Dinggang Shen, Tianming Liu, Shu Zhang

Prompt engineering is a critical technique in the field of natural language processing that involves designing and optimizing the prompts used to input information into models, aiming to enhance their performance on specific tasks.

Machine Translation Prompt Engineering +3

DeID-GPT: Zero-shot Medical Text De-Identification by GPT-4

1 code implementation20 Mar 2023 Zhengliang Liu, Yue Huang, Xiaowei Yu, Lu Zhang, Zihao Wu, Chao Cao, Haixing Dai, Lin Zhao, Yiwei Li, Peng Shu, Fang Zeng, Lichao Sun, Wei Liu, Dinggang Shen, Quanzheng Li, Tianming Liu, Dajiang Zhu, Xiang Li

The digitization of healthcare has facilitated the sharing and re-using of medical data but has also raised concerns about confidentiality and privacy.

Benchmarking De-identification +4

Memory-adaptive Depth-wise Heterogenous Federated Learning

1 code implementation8 Mar 2023 Kai Zhang, Yutong Dai, Hongyi Wang, Eric Xing, Xun Chen, Lichao Sun

Federated learning is a promising paradigm that allows multiple clients to collaboratively train a model without sharing the local data.

Federated Learning

A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT

1 code implementation7 Mar 2023 Yihan Cao, Siyu Li, Yixin Liu, Zhiling Yan, Yutong Dai, Philip S. Yu, Lichao Sun

The goal of AIGC is to make the content creation process more efficient and accessible, allowing for the production of high-quality content at a faster pace.

multimodal interaction

Securing Biomedical Images from Unauthorized Training with Anti-Learning Perturbation

no code implementations5 Mar 2023 Yixin Liu, Haohui Ye, Kai Zhang, Lichao Sun

The volume of open-source biomedical data has been essential to the development of various spheres of the healthcare community, since more 'free' data can provide individual researchers more chances to contribute.

Unlearnable Graph: Protecting Graphs from Unauthorized Exploitation

no code implementations5 Mar 2023 Yixin Liu, Chenrui Fan, Pan Zhou, Lichao Sun

While the use of graph-structured data in various fields is becoming increasingly popular, it also raises concerns about the potential unauthorized exploitation of personal data for training commercial graph neural network (GNN) models, which can compromise privacy.

Graph Neural Network

BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT

no code implementations21 Feb 2023 Jiawen Shi, Yixin Liu, Pan Zhou, Lichao Sun

Recently, ChatGPT has gained significant attention in research due to its ability to interact with humans effectively.

Backdoor Attack Language Modeling +3

A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT

no code implementations18 Feb 2023 Ce Zhou, Qian Li, Chen Li, Jun Yu, Yixin Liu, Guangjing Wang, Kai Zhang, Cheng Ji, Qiben Yan, Lifang He, Hao Peng, JianXin Li, Jia Wu, Ziwei Liu, Pengtao Xie, Caiming Xiong, Jian Pei, Philip S. Yu, Lichao Sun

This study provides a comprehensive review of recent research advancements, challenges, and opportunities for PFMs in text, image, graph, as well as other data modalities.

Graph Learning Language Modelling +1

Rethinking the Video Sampling and Reasoning Strategies for Temporal Sentence Grounding

no code implementations2 Jan 2023 Jiahao Zhu, Daizong Liu, Pan Zhou, Xing Di, Yu Cheng, Song Yang, Wenzheng Xu, Zichuan Xu, Yao Wan, Lichao Sun, Zeyu Xiong

All existing works first utilize a sparse sampling strategy to extract a fixed number of video frames and then conduct multi-modal interactions with the query sentence for reasoning.

Sentence Temporal Sentence Grounding

Tackling Data Heterogeneity in Federated Learning with Class Prototypes

1 code implementation6 Dec 2022 Yutong Dai, Zeyuan Chen, Junnan Li, Shelby Heinecke, Lichao Sun, Ran Xu

We propose FedNH, a novel method that improves the local models' performance for both personalization and generalization by combining the uniformity and semantics of class prototypes.

Personalized Federated Learning

SEAT: Stable and Explainable Attention

no code implementations23 Nov 2022 Lijie Hu, Yixin Liu, Ninghao Liu, Mengdi Huai, Lichao Sun, Di Wang

Results show that SEAT is more stable against different perturbations and randomness while also keeping the explainability of attention, which indicates it is a more faithful explanation.

PointCA: Evaluating the Robustness of 3D Point Cloud Completion Models Against Adversarial Examples

no code implementations22 Nov 2022 Shengshan Hu, Junwei Zhang, Wei Liu, Junhui Hou, Minghui Li, Leo Yu Zhang, Hai Jin, Lichao Sun

In addition, existing attack approaches towards point cloud classifiers cannot be applied to the completion models due to different output forms and attack purposes.

Adversarial Attack Point Cloud Classification +2

Transferable Unlearnable Examples

1 code implementation18 Oct 2022 Jie Ren, Han Xu, Yuxuan Wan, Xingjun Ma, Lichao Sun, Jiliang Tang

The unlearnable strategies have been introduced to prevent third parties from training on the data without permission.

RAIN: RegulArization on Input and Network for Black-Box Domain Adaptation

1 code implementation22 Aug 2022 Qucheng Peng, Zhengming Ding, Lingjuan Lyu, Lichao Sun, Chen Chen

At the input level, we design a new data augmentation technique, Phase MixUp, which highlights task-relevant objects in the interpolations, thus enhancing input-level regularization and class consistency for target models.

Data Augmentation Self-Knowledge Distillation +1

BOND: Benchmarking Unsupervised Outlier Node Detection on Static Attributed Graphs

2 code implementations21 Jun 2022 Kay Liu, Yingtong Dou, Yue Zhao, Xueying Ding, Xiyang Hu, Ruitong Zhang, Kaize Ding, Canyu Chen, Hao Peng, Kai Shu, Lichao Sun, Jundong Li, George H. Chen, Zhihao Jia, Philip S. Yu

To bridge this gap, we present--to the best of our knowledge--the first comprehensive benchmark for unsupervised outlier node detection on static attributed graphs called BOND, with the following highlights.

Anomaly Detection Benchmarking +2

Secure Embedding Aggregation for Federated Representation Learning

no code implementations18 Jun 2022 Jiaxiang Tang, Jinbao Zhu, Songze Li, Lichao Sun

We consider a federated representation learning framework, where with the assistance of a central server, a group of $N$ distributed clients train collaboratively over their private data, for the representations (or embeddings) of a set of entities (e.g., users in a social network).

Federated Learning Privacy Preserving +1

End-to-End Multimodal Fact-Checking and Explanation Generation: A Challenging Dataset and Models

1 code implementation25 May 2022 Barry Menglong Yao, Aditya Shah, Lichao Sun, Jin-Hee Cho, Lifu Huang

We propose end-to-end multimodal fact-checking and explanation generation, where the input is a claim and a large collection of web sources, including articles, images, videos, and tweets, and the goal is to assess the truthfulness of the claim by retrieving relevant evidence and predicting a truthfulness label (e.g., support, refute or not enough information), and to generate a statement to summarize and explain the reasoning and ruling process.

Claim Verification Explanation Generation +2

Data-Free Adversarial Knowledge Distillation for Graph Neural Networks

no code implementations8 May 2022 Yuanxin Zhuang, Lingjuan Lyu, Chuan Shi, Carl Yang, Lichao Sun

Graph neural networks (GNNs) have been widely used in modeling graph structured data, owing to their impressive performance in a wide range of practical applications.

Generative Adversarial Network Graph Classification +3

Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation

1 code implementation17 Mar 2022 Kai Zhang, Yu Wang, Hongyi Wang, Lifu Huang, Carl Yang, Xun Chen, Lichao Sun

Furthermore, we propose a Federated learning paradigm with privacy-preserving Relation embedding aggregation (FedR) to tackle the privacy issue in FedE.

Entity Embeddings Federated Learning +4

FedHM: Efficient Federated Learning for Heterogeneous Models via Low-rank Factorization

no code implementations29 Nov 2021 Dezhong Yao, Wanning Pan, Michael J O'Neill, Yutong Dai, Yao Wan, Hai Jin, Lichao Sun

To this end, this paper proposes FedHM, a novel heterogeneous federated model compression framework, distributing the heterogeneous low-rank models to clients and then aggregating them into a full-rank model.

Distributed Computing Federated Learning +3

Query and Extract: Refining Event Extraction as Type-oriented Binary Decoding

no code implementations Findings (ACL) 2022 Sijia Wang, Mo Yu, Shiyu Chang, Lichao Sun, Lifu Huang

Event extraction is typically modeled as a multi-class classification problem where event types and argument roles are treated as atomic symbols.

Multi-class Classification Natural Language Queries +2

DoubleStar: Long-Range Attack Towards Depth Estimation based Obstacle Avoidance in Autonomous Systems

no code implementations7 Oct 2021 Ce Zhou, Qiben Yan, Yan Shi, Lichao Sun

By exploiting the weaknesses of the stereo matching in depth estimation algorithms and the lens flare effect in optical imaging, we propose DoubleStar, a long-range attack that injects fake obstacle depth by projecting pure light from two complementary light sources.

Depth Estimation Sensor Fusion +1

FedDiscrete: A Secure Federated Learning Algorithm Against Weight Poisoning

no code implementations29 Sep 2021 Yutong Dai, Xingjun Ma, Lichao Sun

Federated learning (FL) is a privacy-aware collaborative learning paradigm that allows multiple parties to jointly train a machine learning model without sharing their private data.

Federated Learning

Source Inference Attacks in Federated Learning

1 code implementation13 Sep 2021 Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Xuyun Zhang

However, existing MIAs ignore the source of a training member, i.e., the information of which client owns the training member, while it is essential to explore source privacy in FL beyond membership privacy of examples from all clients.

Federated Learning Inference Attack

How to Inject Backdoors with Better Consistency: Logit Anchoring on Clean Data

no code implementations ICLR 2022 Zhiyuan Zhang, Lingjuan Lyu, Weiqiang Wang, Lichao Sun, Xu Sun

In this work, we observe an interesting phenomenon that the variations of parameters are always AWPs when tuning the trained clean model to inject backdoors.

DSKReG: Differentiable Sampling on Knowledge Graph for Recommendation with Relational GNN

1 code implementation26 Aug 2021 Yu Wang, Zhiwei Liu, Ziwei Fan, Lichao Sun, Philip S. Yu

In the information explosion era, recommender systems (RSs) are widely studied and applied to discover user-preferred information.

Knowledge Graphs Recommendation Systems

Multiplex Graph Networks for Multimodal Brain Network Analysis

1 code implementation31 Jul 2021 Zhaoming Kong, Lichao Sun, Hao Peng, Liang Zhan, Yong Chen, Lifang He

In this paper, we propose MGNet, a simple and effective multiplex graph convolutional network (GCN) model for multimodal brain network analysis.

Joint Embedding of Structural and Functional Brain Networks with Graph Neural Networks for Mental Illness Diagnosis

no code implementations7 Jul 2021 Yanqiao Zhu, Hejie Cui, Lifang He, Lichao Sun, Carl Yang

Multimodal brain networks characterize complex connectivities among different brain regions from both structural and functional aspects and provide a new means for mental disease analysis.

Contrastive Learning

Local-Global Knowledge Distillation in Heterogeneous Federated Learning with Non-IID Data

no code implementations30 Jun 2021 Dezhong Yao, Wanning Pan, Yutong Dai, Yao Wan, Xiaofeng Ding, Hai Jin, Zheng Xu, Lichao Sun

Federated learning enables multiple clients to collaboratively learn a global model by periodically aggregating the clients' models without transferring the local data.

Federated Learning Knowledge Distillation

Subgraph Federated Learning with Missing Neighbor Generation

1 code implementation NeurIPS 2021 Ke Zhang, Carl Yang, Xiaoxiao Li, Lichao Sun, Siu Ming Yiu

Graphs have been widely used in data mining and machine learning due to their unique representation of real-world objects and their interactions.

Federated Learning Graph Mining

Federated Multi-View Learning for Private Medical Data Integration and Analysis

no code implementations4 May 2021 Sicong Che, Hao Peng, Lichao Sun, Yong Chen, Lifang He

This paper aims to provide a generic Federated Multi-View Learning (FedMV) framework for multi-view data leakage prevention, which is based on different types of local data availability and can accommodate two types of problems: Vertical Federated Multi-View Learning (V-FedMV) and Horizontal Federated Multi-View Learning (H-FedMV).

Data Integration Federated Learning +2

User Preference-aware Fake News Detection

2 code implementations25 Apr 2021 Yingtong Dou, Kai Shu, Congying Xia, Philip S. Yu, Lichao Sun

The majority of existing fake news detection algorithms focus on mining news content and/or the surrounding exogenous context for discovering deceptive signals, while the endogenous preference of a user when deciding whether to spread a piece of fake news is ignored.

Fact Checking Fake News Detection +2

Membership Inference Attacks on Knowledge Graphs

no code implementations16 Apr 2021 Yu Wang, Lifu Huang, Philip S. Yu, Lichao Sun

Membership inference attacks (MIAs) infer whether a specific data record is used for target model training.

Inference Attack Knowledge Graph Embedding +3

FedGraphNN: A Federated Learning System and Benchmark for Graph Neural Networks

1 code implementation14 Apr 2021 Chaoyang He, Keshav Balasubramanian, Emir Ceyani, Carl Yang, Han Xie, Lichao Sun, Lifang He, Liangwei Yang, Philip S. Yu, Yu Rong, Peilin Zhao, Junzhou Huang, Murali Annavaram, Salman Avestimehr

FedGraphNN is built on a unified formulation of graph FL and contains a wide range of datasets from different domains, popular GNN models, and FL algorithms, with secure and efficient system support.

Federated Learning Graph Neural Network +1

Model Extraction and Adversarial Transferability, Your BERT is Vulnerable!

1 code implementation NAACL 2021 Xuanli He, Lingjuan Lyu, Qiongkai Xu, Lichao Sun

Finally, we investigate two defence strategies to protect the victim model and find that unless the performance of the victim model is sacrificed, both model extraction and adversarial transferability can effectively compromise the target models.

Model extraction text-classification +2

Membership Inference Attacks on Machine Learning: A Survey

2 code implementations14 Mar 2021 Hongsheng Hu, Zoran Salcic, Lichao Sun, Gillian Dobbie, Philip S. Yu, Xuyun Zhang

In recent years, MIAs have been shown to be effective on various ML models, e.g., classification models and generative models.

BIG-bench Machine Learning Fairness +5

FedMood: Federated Learning on Mobile Health Data for Mood Detection

1 code implementation6 Feb 2021 Xiaohang Xu, Hao Peng, Lichao Sun, Md Zakirul Alam Bhuiyan, Lianzhong Liu, Lifang He

Depression is one of the most common mental illness problems, and the symptoms shown by patients are not consistent, making it difficult to diagnose in the process of clinical practice and pathological research.

BIG-bench Machine Learning Depression Detection +3

EXPLORING VULNERABILITIES OF BERT-BASED APIS

no code implementations1 Jan 2021 Xuanli He, Lingjuan Lyu, Lichao Sun, Xiaojun Chang, Jun Zhao

We then demonstrate how the extracted model can be exploited to develop an effective attribute inference attack to expose sensitive information of the training data.

Attribute Inference Attack +4

Privacy and Robustness in Federated Learning: Attacks and Defenses

no code implementations7 Dec 2020 Lingjuan Lyu, Han Yu, Xingjun Ma, Chen Chen, Lichao Sun, Jun Zhao, Qiang Yang, Philip S. Yu

Besides training powerful global models, it is of paramount importance to design FL systems that have privacy guarantees and are resistant to different types of adversaries.

Federated Learning Privacy Preserving

Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks

no code implementations COLING 2020 Lichao Sun, Congying Xia, Wenpeng Yin, TingTing Liang, Philip S. Yu, Lifang He

Our studies show that mixup is a domain-independent data augmentation technique for pre-trained language models, resulting in significant performance improvement for transformer-based models.

Data Augmentation Image Classification

Secure Network Release with Link Privacy

no code implementations28 Sep 2020 Carl Yang, Haonan Wang, Ke Zhang, Lichao Sun

Many data mining and analytical tasks rely on the abstraction of networks (graphs) to summarize relational structures among individuals (nodes).

Graph Generation

Federated Model Distillation with Noise-Free Differential Privacy

no code implementations11 Sep 2020 Lichao Sun, Lingjuan Lyu

Conventional federated learning directly averages model weights, which is only possible for collaboration between models with homogeneous architectures.

Federated Learning

LDP-FL: Practical Private Aggregation in Federated Learning with Local Differential Privacy

no code implementations31 Jul 2020 Lichao Sun, Jianwei Qian, Xun Chen

In this paper, we propose a novel design of a local differential privacy mechanism for federated learning to address the abovementioned issues.

Federated Learning

Natural Backdoor Attack on Text Data

no code implementations29 Jun 2020 Lichao Sun

Recently, advanced NLP models have seen a surge in usage across various applications.

Backdoor Attack text-classification +1

Secure Deep Graph Generation with Link Differential Privacy

1 code implementation1 May 2020 Carl Yang, Haonan Wang, Ke Zhang, Liang Chen, Lichao Sun

Many data mining and analytical tasks rely on the abstraction of networks (graphs) to summarize relational structures among individuals (nodes).

Graph Generation Link Prediction

SplitFed: When Federated Learning Meets Split Learning

2 code implementations25 Apr 2020 Chandra Thapa, M. A. P. Chamikara, Seyit Camtepe, Lichao Sun

SL provides better model privacy than FL due to the machine learning model architecture split between clients and the server.

BIG-bench Machine Learning Federated Learning

Differentially Private Deep Learning with Smooth Sensitivity

no code implementations1 Mar 2020 Lichao Sun, Yingbo Zhou, Philip S. Yu, Caiming Xiong

Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice.

Deep Learning

Adv-BERT: BERT is not robust on misspellings! Generating nature adversarial samples on BERT

no code implementations27 Feb 2020 Lichao Sun, Kazuma Hashimoto, Wenpeng Yin, Akari Asai, Jia Li, Philip Yu, Caiming Xiong

There is an increasing amount of literature that claims the brittleness of deep neural networks in dealing with adversarial examples that are created maliciously.

Question Answering Sentence +1

Near-Zero-Cost Differentially Private Deep Learning with Teacher Ensembles

no code implementations25 Sep 2019 Lichao Sun, Yingbo Zhou, Jia Li, Richard Socher, Philip S. Yu, Caiming Xiong

Ensuring the privacy of sensitive data used to train modern machine learning models is of paramount importance in many areas of practice.

Deep Learning

Private Deep Learning with Teacher Ensembles

no code implementations5 Jun 2019 Lichao Sun, Yingbo Zhou, Ji Wang, Jia Li, Richard Socher, Philip S. Yu, Caiming Xiong

Privacy-preserving deep learning is crucial for deploying deep neural network based solutions, especially when the model works on data that contains sensitive information.

Deep Learning Ensemble Learning +3

Self-Activation Influence Maximization

no code implementations5 Jun 2019 Lichao Sun, Albert Chen, Philip S. Yu, Wei Chen

We incorporate self activation into influence propagation and propose the self-activation independent cascade (SAIC) model: nodes may be self activated besides being selected as seeds, and influence propagates from both selected seeds and self activated nodes.

Social and Information Networks

Adversarial Attack and Defense on Graph Data: A Survey

1 code implementation26 Dec 2018 Lichao Sun, Yingtong Dou, Carl Yang, Ji Wang, Yixin Liu, Philip S. Yu, Lifang He, Bo Li

Therefore, this review is intended to provide an overall landscape of more than 100 papers on adversarial attack and defense strategies for graph data, and establish a unified formulation encompassing most graph adversarial learning models.

Adversarial Attack Image Classification +2

Private Model Compression via Knowledge Distillation

no code implementations13 Nov 2018 Ji Wang, Weidong Bao, Lichao Sun, Xiaomin Zhu, Bokai Cao, Philip S. Yu

To benefit from on-device deep learning without capacity and privacy concerns, we design a private model compression framework, RONA.

Knowledge Distillation model +2

Joint Embedding of Meta-Path and Meta-Graph for Heterogeneous Information Networks

no code implementations11 Sep 2018 Lichao Sun, Lifang He, Zhipeng Huang, Bokai Cao, Congying Xia, Xiaokai Wei, Philip S. Yu

Meta-graph is currently the most powerful tool for similarity search on heterogeneous information networks, where a meta-graph is a composition of meta-paths that captures the complex structural information.

Network Embedding Tensor Decomposition

Deep Learning Towards Mobile Applications

no code implementations10 Sep 2018 Ji Wang, Bokai Cao, Philip S. Yu, Lichao Sun, Weidong Bao, Xiaomin Zhu

In this paper, we provide an overview of the current challenges and representative achievements about pushing deep learning on mobile devices from three aspects: training with mobile data, efficient inference on mobile devices, and applications of mobile deep learning.

BIG-bench Machine Learning Deep Learning +1

Multi-Round Influence Maximization (Extended Version)

1 code implementation12 Feb 2018 Lichao Sun, Weiran Huang, Philip S. Yu, Wei Chen

In this paper, we study the Multi-Round Influence Maximization (MRIM) problem, where influence propagates in multiple rounds independently from possibly different seed sets, and the goal is to select seeds for each round to maximize the expected number of nodes that are activated in at least one round.

Social and Information Networks

Contaminant Removal for Android Malware Detection Systems

no code implementations7 Nov 2017 Lichao Sun, Xiaokai Wei, Jiawei Zhang, Lifang He, Philip S. Yu, Witawas Srisa-an

The results indicate that once we remove contaminants from the datasets, we can significantly improve both malware detection rate and detection accuracy.

Cryptography and Security
