Search Results for author: Chen Zhu

Found 61 papers, 29 papers with code

A Survey on Large Language Models for Recommendation

1 code implementation31 May 2023 Likang Wu, Zhi Zheng, Zhaopeng Qiu, Hao Wang, Hongchao Gu, Tingjia Shen, Chuan Qin, Chen Zhu, HengShu Zhu, Qi Liu, Hui Xiong, Enhong Chen

Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS).

Recommendation Systems Self-Supervised Learning

Long-Short Transformer: Efficient Transformers for Language and Vision

3 code implementations NeurIPS 2021 Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro

For instance, Transformer-LS achieves 0. 97 test BPC on enwik8 using half the number of parameters than previous method, while being faster and is able to handle 3x as long sequences compared to its full-attention version on the same hardware.

Language Modelling

Robust Optimization as Data Augmentation for Large-scale Graphs

3 code implementations CVPR 2022 Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein

Data augmentation helps neural networks generalize better by enlarging the training set, but it remains an open question how to effectively augment graph data to enhance the performance of GNNs (Graph Neural Networks).

Data Augmentation Graph Classification +4

FreeLB: Enhanced Adversarial Training for Natural Language Understanding

2 code implementations ICLR 2020 Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu

Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models.

Natural Language Understanding Overall - Test +1

Large-Scale Adversarial Training for Vision-and-Language Representation Learning

2 code implementations NeurIPS 2020 Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, Jingjing Liu

We present VILLA, the first known effort on large-scale adversarial training for vision-and-language (V+L) representation learning.

Ranked #7 on Visual Entailment on SNLI-VE val (using extra training data)

Question Answering Referring Expression +7

Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors

1 code implementation20 May 2022 Ravid Shwartz-Ziv, Micah Goldblum, Hossein Souri, Sanyam Kapoor, Chen Zhu, Yann Lecun, Andrew Gordon Wilson

Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task.

Transfer Learning

Compressing Neural Networks using the Variational Information Bottleneck

1 code implementation ICML 2018 Bin Dai, Chen Zhu, David Wipf

Neural networks can be compressed to reduce memory and computational requirements, or to increase accuracy by facilitating the use of a larger base architecture.

Compressing Neural Networks using the Variational Information Bottelneck

1 code implementation ICML 2018 Bin Dai, Chen Zhu, Baining Guo, David Wipf

Neural networks can be compressed to reduce memory and computational requirements, or to increase accuracy by facilitating the use of a larger base architecture.

The Intrinsic Dimension of Images and Its Impact on Learning

1 code implementation ICLR 2021 Phillip Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein

We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images.

Image Generation

On the Exploitability of Instruction Tuning

1 code implementation NeurIPS 2023 Manli Shu, Jiongxiao Wang, Chen Zhu, Jonas Geiping, Chaowei Xiao, Tom Goldstein

In this work, we investigate how an adversary can exploit instruction tuning by injecting specific instruction-following examples into the training data that intentionally changes the model's behavior.

Data Poisoning Instruction Following

Structured Attentions for Visual Question Answering

1 code implementation ICCV 2017 Chen Zhu, Yanpeng Zhao, Shuaiyi Huang, Kewei Tu, Yi Ma

In this paper, we demonstrate the importance of encoding such relations by showing the limited effective receptive field of ResNet on two datasets, and propose to model the visual attention as a multivariate distribution over a grid-structured Conditional Random Field on image regions.

Visual Question Answering

Certified Defenses for Adversarial Patches

1 code implementation ICLR 2020 Ping-Yeh Chiang, Renkun Ni, Ahmed Abdelkader, Chen Zhu, Christoph Studer, Tom Goldstein

Adversarial patch attacks are among one of the most practical threat models against real-world computer vision systems.

Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

1 code implementation15 May 2019 Chen Zhu, W. Ronny Huang, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein

Clean-label poisoning attacks inject innocuous looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data.

Transfer Learning

Reducing the Teacher-Student Gap via Spherical Knowledge Disitllation

1 code implementation15 Oct 2020 Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai

We investigate this problem by study the gap of confidence between teacher and student.

Knowledge Distillation

Physics-Guided Human Motion Capture with Pose Probability Modeling

1 code implementation19 Aug 2023 Jingyi Ju, Buzhen Huang, Chen Zhu, Zhihao LI, Yangang Wang

To address the obstacles, our key-idea is to employ physics as denoising guidance in the reverse diffusion process to reconstruct physically plausible human motion from a modeled pose probability distribution.

Denoising

Learning from Noisy Anchors for One-stage Object Detection

1 code implementation CVPR 2020 Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis

State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding groundtruth objects.

Classification General Classification +3

Adversarially robust transfer learning

1 code implementation ICLR 2020 Ali Shafahi, Parsa Saadatpanah, Chen Zhu, Amin Ghiasi, Christoph Studer, David Jacobs, Tom Goldstein

By training classifiers on top of these feature extractors, we produce new models that inherit the robustness of their parent networks.

Transfer Learning

Adaptive Densely Connected Super-Resolution Reconstruction

1 code implementation17 Dec 2019 Tangxin Xie, Xin Yang, Yu Jia, Chen Zhu, Xiaochuan Li

For a better performance in single image super-resolution(SISR), we present an image super-resolution algorithm based on adaptive dense connection (ADCSR).

Image Super-Resolution SSIM

Plug-In Inversion: Model-Agnostic Inversion for Vision with Data Augmentations

1 code implementation31 Jan 2022 Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein

Existing techniques for model inversion typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be individually calibrated for each network in order to produce adequate images.

Image Classification

Multi-Grained Multimodal Interaction Network for Entity Linking

1 code implementation19 Jul 2023 Pengfei Luo, Tong Xu, Shiwei Wu, Chen Zhu, Linli Xu, Enhong Chen

Then, to derive the similarity matching score for each mention-entity pair, we device three interaction units to comprehensively explore the intra-modal interaction and inter-modal fusion among features of entities and mentions.

Contrastive Learning Descriptive +1

Deep k-NN Defense against Clean-label Data Poisoning Attacks

1 code implementation29 Sep 2019 Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson

Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.

Adversarial Attack Data Poisoning

SetRank: A Setwise Bayesian Approach for Collaborative Ranking from Implicit Feedback

1 code implementation23 Feb 2020 Chao Wang, HengShu Zhu, Chen Zhu, Chuan Qin, Hui Xiong

The recent development of online recommender systems has a focus on collaborative ranking from implicit feedback, such as user clicks and purchases.

Collaborative Ranking Recommendation Systems

Heterogeneous Trajectory Forecasting via Risk and Scene Graph Learning

1 code implementation2 Nov 2022 Jianwu Fang, Chen Zhu, Pu Zhang, Hongkai Yu, Jianru Xue

Heterogeneous trajectory forecasting is critical for intelligent transportation systems, but it is challenging because of the difficulty of modeling the complex interaction relations among the heterogeneous road agents as well as their agent-environment constraints.

Graph Learning Trajectory Forecasting

MaxVA: Fast Adaptation of Step Sizes by Maximizing Observed Variance of Gradients

1 code implementation21 Jun 2020 Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein

Adaptive gradient methods such as RMSProp and Adam use exponential moving estimate of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in face of noisy objectives.

Image Classification Machine Translation +3

Faithful Abstractive Summarization via Fact-aware Consistency-constrained Transformer

1 code implementation CIKM 2022 Yuanjie Lyu, Chen Zhu, Tong Xu, Zikai Yin, Enhong Chen

To deal with this challenge, in this paper, we propose a novel fact-aware abstractive summarization model, named Entity-Relation Pointer Generator Network (ERPGN).

Abstractive Text Summarization Text Generation

Recruitment Market Trend Analysis with Sequential Latent Variable Models

no code implementations8 Dec 2017 Chen Zhu, HengShu Zhu, Hui Xiong, Pengliang Ding, Fang Xie

To this end, in this paper, we propose a new research paradigm for recruitment market analysis by leveraging unsupervised learning techniques for automatically discovering recruitment market trends based on large-scale recruitment data.

Person-Job Fit: Adapting the Right Talent for the Right Job with Joint Representation Learning

no code implementations8 Oct 2018 Chen Zhu, HengShu Zhu, Hui Xiong, Chao Ma, Fang Xie, Pengliang Ding, Pan Li

To this end, in this paper, we propose a novel end-to-end data-driven model based on Convolutional Neural Network (CNN), namely Person-Job Fit Neural Network (PJFNN), for matching a talent qualification to the requirements of a job.

Data Visualization Representation Learning

Fine-grained Video Categorization with Redundancy Reduction Attention

no code implementations ECCV 2018 Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma

Specifically, it firstly summarizes the video by weight-summing all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or to enhance according to this summary with a learned non-linear transform.

Action Recognition Video Classification

Enhancing Person-Job Fit for Talent Recruitment: An Ability-aware Neural Network Approach

no code implementations21 Dec 2018 Chuan Qin, HengShu Zhu, Tong Xu, Chen Zhu, Liang Jiang, Enhong Chen, Hui Xiong

The wide spread use of online recruitment services has led to information explosion in the job market.

Improving the Tightness of Convex Relaxation Bounds for Training Certifiably Robust Classifiers

no code implementations22 Feb 2020 Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein

Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.

Towards Accurate Quantization and Pruning via Data-free Knowledge Transfer

no code implementations14 Oct 2020 Chen Zhu, Zheng Xu, Ali Shafahi, Manli Shu, Amin Ghiasi, Tom Goldstein

Further, we demonstrate that the compact structure and corresponding initialization from the Lottery Ticket Hypothesis can also help in data-free training.

Data Free Quantization Transfer Learning

Are Adversarial Examples Created Equal? A Learnable Weighted Minimax Risk for Robustness under Non-uniform Attacks

no code implementations24 Oct 2020 Huimin Zeng, Chen Zhu, Tom Goldstein, Furong Huang

Adversarial Training is proved to be an efficient method to defend against adversarial examples, being one of the few defenses that withstand strong attacks.

Modifying Memories in Transformer Models

no code implementations1 Dec 2020 Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar

In this paper, we propose a new task of \emph{explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts}.

Memorization

Inductive Predictions of Extreme Hydrologic Events in The Wabash River Watershed

no code implementations25 Apr 2021 Nicholas Majeske, Bidisha Abesh, Chen Zhu, Ariful Azad

We present a machine learning method to predict extreme hydrologic events from spatially and temporally varying hydrological and meteorological data.

Time Series Time Series Analysis

Improved Training of Certifiably Robust Models

no code implementations25 Sep 2019 Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein

Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical (PGD) robustness.

Complexity-Oriented Per-shot Video Coding Optimization

no code implementations23 Dec 2021 Hongcheng Zhong, Jun Xu, Chen Zhu, Donghui Feng, Li Song

Current per-shot encoding schemes aim to improve the compression efficiency by shot-level optimization.

CPMLHO:Hyperparameter Tuning via Cutting Plane and Mixed-Level Optimization

no code implementations11 Dec 2022 Shuo Yang, Yang Jiao, Shaoyu Dou, Mana Zheng, Chen Zhu

The bilevel optimization is used to automatically update the hyperparameter, and the gradient of the hyperparameter is the approximate gradient based on the best response function.

Bilevel Optimization Hyperparameter Optimization

Multi-Level Association Rule Mining for Wireless Network Time Series Data

no code implementations15 Dec 2022 Chen Zhu, Chengbo Qiu, Shaoyu Dou, Minghao Liao

Key performance indicators(KPIs) are of great significance in the monitoring of wireless network service quality.

Management Time Series +1

How Many Demonstrations Do You Need for In-context Learning?

no code implementations14 Mar 2023 Jiuhai Chen, Lichang Chen, Chen Zhu, Tianyi Zhou

Moreover, ICL (with and w/o CoT) using only one correct demo significantly outperforms all-demo ICL adopted by most previous works, indicating the weakness of LLMs in finding correct demo(s) for input queries, which is difficult to evaluate on the biased datasets.

In-Context Learning

DELTA: Dynamic Embedding Learning with Truncated Conscious Attention for CTR Prediction

no code implementations3 May 2023 Chen Zhu, Liang Du, Hong Chen, Shuang Zhao, Zixun Sun, Xin Wang, Wenwu Zhu

To tackle this problem, inspired by the Global Workspace Theory in conscious processing, which posits that only a specific subset of the product features are pertinent while the rest can be noisy and even detrimental to human-click behaviors, we propose a CTR model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction, termed DELTA.

Click-Through Rate Prediction

A Comprehensive Survey of Artificial Intelligence Techniques for Talent Analytics

no code implementations3 Jul 2023 Chuan Qin, Le Zhang, Rui Zha, Dazhong Shen, Qi Zhang, Ying Sun, Chen Zhu, HengShu Zhu, Hui Xiong

To this end, we present an up-to-date and comprehensive survey on AI technologies used for talent analytics in the field of human resource management.

Decision Making Management

Tensor Programs VI: Feature Learning in Infinite-Depth Neural Networks

no code implementations3 Oct 2023 Greg Yang, Dingli Yu, Chen Zhu, Soufiane Hayou

By classifying infinite-width neural networks and identifying the *optimal* limit, Tensor Programs IV and V demonstrated a universal way, called $\mu$P, for *widthwise hyperparameter transfer*, i. e., predicting optimal hyperparameters of wide neural networks from narrow ones.

Retrieval meets Long Context Large Language Models

no code implementations4 Oct 2023 Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro

Perhaps surprisingly, we find that LLM with 4K context window using simple retrieval-augmentation at generation can achieve comparable performance to finetuned LLM with 16K context window via positional interpolation on long context tasks, while taking much less computation.

Few-Shot Learning Natural Questions +2

Bridging the Information Gap Between Domain-Specific Model and General LLM for Personalized Recommendation

no code implementations7 Nov 2023 Wenxuan Zhang, Hongzhi Liu, Yingpeng Du, Chen Zhu, Yang song, HengShu Zhu, Zhonghai Wu

Nevertheless, these methods encounter the certain issue that information such as community behavior pattern in RS domain is challenging to express in natural language, which limits the capability of LLMs to surpass state-of-the-art domain-specific models.

Vertiport Navigation Requirements and Multisensor Architecture Considerations for Urban Air Mobility

no code implementations13 Nov 2023 Omar Garcia Crespillo, Chen Zhu, Maximilian Simonetti, Daniel Gerbeth, Young-Hee Lee, Wenhan Hao

Communication, Navigation and Surveillance (CNS) technologies are key enablers for future safe operation of drones in urban environments.

Implicit-explicit Integrated Representations for Multi-view Video Compression

no code implementations29 Nov 2023 Chen Zhu, Guo Lu, Bing He, Rong Xie, Li Song

To further enhance the reconstruction quality from the INR codec, we leverage the high-quality reconstructed frames from the explicit codec to achieve inter-view compensation.

Video Compression

Robust Target Detection of Intelligent Integrated Optical Camera and mmWave Radar System

no code implementations12 Dec 2023 Chen Zhu, Zhouxiang Zhao, Zejing Shan, Lijie Yang, Sijie Ji, Zhaohui Yang, Zhaoyang Zhang

To improve the target detection performance under complex real-world scenarios, this paper proposes an intelligent integrated optical camera and millimeter-wave (mmWave) radar system.

ODIN: Disentangled Reward Mitigates Hacking in RLHF

no code implementations11 Feb 2024 Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro

In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs.

KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph

no code implementations17 Feb 2024 Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yang song, Chen Zhu, HengShu Zhu, Ji-Rong Wen

To guarantee the effectiveness, we leverage program language to formulate the multi-hop reasoning process over the KG, and synthesize a code-based instruction dataset to fine-tune the base LLM.

Knowledge Graphs

Cannot find the paper you are looking for? You can Submit a new open access paper.