1 code implementation • ICCV 2017 • Chen Zhu, Yanpeng Zhao, Shuaiyi Huang, Kewei Tu, Yi Ma
In this paper, we demonstrate the importance of encoding such relations by showing the limited effective receptive field of ResNet on two datasets, and propose to model the visual attention as a multivariate distribution over a grid-structured Conditional Random Field on image regions.
no code implementations • 8 Dec 2017 • Chen Zhu, HengShu Zhu, Hui Xiong, Pengliang Ding, Fang Xie
To this end, in this paper, we propose a new research paradigm for recruitment market analysis by leveraging unsupervised learning techniques for automatically discovering recruitment market trends based on large-scale recruitment data.
1 code implementation • ICML 2018 • Bin Dai, Chen Zhu, David Wipf
Neural networks can be compressed to reduce memory and computational requirements, or to increase accuracy by facilitating the use of a larger base architecture.
no code implementations • CVPR 2018 • Zhou Su, Chen Zhu, Yinpeng Dong, Dongqi Cai, Yurong Chen, Jianguo Li
The second is a mechanism for handling multiple knowledge facts expanded from question-answer pairs.
1 code implementation • ICML 2018 • Bin Dai, Chen Zhu, Baining Guo, David Wipf
Neural networks can be compressed to reduce memory and computational requirements, or to increase accuracy by facilitating the use of a larger base architecture.
no code implementations • 8 Oct 2018 • Chen Zhu, HengShu Zhu, Hui Xiong, Chao Ma, Fang Xie, Pengliang Ding, Pan Li
To this end, in this paper, we propose a novel end-to-end data-driven model based on Convolutional Neural Network (CNN), namely Person-Job Fit Neural Network (PJFNN), for matching a talent qualification to the requirements of a job.
no code implementations • ECCV 2018 • Chen Zhu, Xiao Tan, Feng Zhou, Xiao Liu, Kaiyu Yue, Errui Ding, Yi Ma
Specifically, it first summarizes the video by computing a weighted sum of all feature vectors in the feature maps of selected frames with a spatio-temporal soft attention, and then predicts which channels to suppress or enhance based on this summary with a learned non-linear transform.
Ranked #12 on Action Recognition on ActivityNet
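The summarize-then-gate mechanism above can be sketched in a few lines of NumPy; the shapes, the two-layer gating network, and all parameter names here are illustrative assumptions rather than the paper's exact architecture:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def channel_gate(feats, w_attn, W1, W2):
    """feats: (positions, C) feature vectors pooled from selected frames.
    A soft attention over all spatio-temporal positions gives a weighted
    summary, and a small two-layer network maps it to per-channel gates."""
    scores = softmax(feats @ w_attn)               # attention over positions
    summary = scores @ feats                       # weighted sum -> (C,)
    hidden = np.tanh(W1 @ summary)                 # learned non-linear transform
    gates = 1.0 / (1.0 + np.exp(-(W2 @ hidden)))   # in (0, 1): suppress/enhance
    return feats * gates                           # rescale each channel

rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 4))    # 8 positions, 4 channels (toy sizes)
out = channel_gate(feats, rng.normal(size=4),
                   rng.normal(size=(6, 4)), rng.normal(size=(4, 6)))
```

Each output channel is the input channel scaled by a gate in (0, 1), which is what "suppress or enhance" amounts to.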
no code implementations • 21 Dec 2018 • Chuan Qin, HengShu Zhu, Tong Xu, Chen Zhu, Liang Jiang, Enhong Chen, Hui Xiong
The widespread use of online recruitment services has led to an information explosion in the job market.
1 code implementation • 15 May 2019 • Chen Zhu, W. Ronny Huang, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein
Clean-label poisoning attacks inject innocuous-looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data.
1 code implementation • ICLR 2020 • Ali Shafahi, Parsa Saadatpanah, Chen Zhu, Amin Ghiasi, Christoph Studer, David Jacobs, Tom Goldstein
By training classifiers on top of these feature extractors, we produce new models that inherit the robustness of their parent networks.
no code implementations • 25 Sep 2019 • Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein
Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical (PGD) robustness.
2 code implementations • ICLR 2020 • Chen Zhu, Yu Cheng, Zhe Gan, Siqi Sun, Tom Goldstein, Jingjing Liu
Adversarial training, which minimizes the maximal risk for label-preserving input perturbations, has proved to be effective for improving the generalization of language models.
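The min-max objective (minimize the maximal risk over label-preserving perturbations) can be illustrated on toy logistic regression; this is a generic single-step sketch with hand-derived gradients, not the paper's algorithm, and the step sizes and data are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adv_train_step(w, x, y, eps=0.1, lr=0.5):
    """One min-max step: ascend the logistic loss in the input (inner max,
    a single signed-gradient step inside an L-inf ball of radius eps),
    then descend in the weights on the perturbed input (outer min)."""
    g_x = (sigmoid(w @ x) - y) * w           # d(loss)/d(input)
    x_adv = x + eps * np.sign(g_x)           # inner maximization
    g_w = (sigmoid(w @ x_adv) - y) * x_adv   # d(loss)/d(weights) at x_adv
    return w - lr * g_w                      # outer minimization

w = np.zeros(2)
x, y = np.array([1.0, 1.0]), 1.0
for _ in range(100):
    w = adv_train_step(w, x, y)
```

After the loop the classifier fits the example confidently despite each weight update being taken at an adversarially perturbed input.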
1 code implementation • 29 Sep 2019 • Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson
Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.
1 code implementation • CVPR 2020 • Hengduo Li, Zuxuan Wu, Chen Zhu, Caiming Xiong, Richard Socher, Larry S. Davis
State-of-the-art object detectors rely on regressing and classifying an extensive list of possible anchors, which are divided into positive and negative samples based on their intersection-over-union (IoU) with corresponding groundtruth objects.
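The IoU-based split of anchors into positives and negatives can be sketched as follows; the 0.5/0.3 thresholds and box coordinates are illustrative assumptions, not any particular detector's settings:

```python
import numpy as np

def iou(boxes, gt):
    """IoU between N anchors and one ground-truth box, all [x1, y1, x2, y2]."""
    x1 = np.maximum(boxes[:, 0], gt[0])
    y1 = np.maximum(boxes[:, 1], gt[1])
    x2 = np.minimum(boxes[:, 2], gt[2])
    y2 = np.minimum(boxes[:, 3], gt[3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    return inter / (area_a + area_g - inter)

anchors = np.array([[0, 0, 10, 10], [5, 5, 15, 15], [20, 20, 30, 30]], float)
gt_box = np.array([0, 0, 10, 10], float)
scores = iou(anchors, gt_box)
# Above 0.5 -> positive (1), below 0.3 -> negative (0), otherwise ignored (-1).
labels = np.where(scores >= 0.5, 1, np.where(scores < 0.3, 0, -1))
```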
1 code implementation • 17 Dec 2019 • Tangxin Xie, Xin Yang, Yu Jia, Chen Zhu, Xiaochuan Li
For better performance in single image super-resolution (SISR), we present an image super-resolution algorithm based on adaptive dense connection (ADCSR).
no code implementations • 22 Feb 2020 • Chen Zhu, Renkun Ni, Ping-Yeh Chiang, Hengduo Li, Furong Huang, Tom Goldstein
Convex relaxations are effective for training and certifying neural networks against norm-bounded adversarial attacks, but they leave a large gap between certifiable and empirical robustness.
1 code implementation • 23 Feb 2020 • Chao Wang, HengShu Zhu, Chen Zhu, Chuan Qin, Hui Xiong
Recent development of online recommender systems has focused on collaborative ranking from implicit feedback, such as user clicks and purchases.
1 code implementation • ICLR 2020 • Ping-Yeh Chiang, Renkun Ni, Ahmed Abdelkader, Chen Zhu, Christoph Studer, Tom Goldstein
Adversarial patch attacks are among the most practical threat models against real-world computer vision systems.
no code implementations • 20 Apr 2020 • Ahmed Abdelkader, Michael J. Curry, Liam Fowl, Tom Goldstein, Avi Schwarzschild, Manli Shu, Christoph Studer, Chen Zhu
We first demonstrate successful transfer attacks against a victim network using \textit{only} its feature extractor.
2 code implementations • NeurIPS 2020 • Zhe Gan, Yen-Chun Chen, Linjie Li, Chen Zhu, Yu Cheng, Jingjing Liu
We present VILLA, the first known effort on large-scale adversarial training for vision-and-language (V+L) representation learning.
Ranked #7 on Visual Entailment on SNLI-VE val (using extra training data)
1 code implementation • 21 Jun 2020 • Chen Zhu, Yu Cheng, Zhe Gan, Furong Huang, Jingjing Liu, Tom Goldstein
Adaptive gradient methods such as RMSProp and Adam use an exponential moving average of the squared gradient to compute adaptive step sizes, achieving better convergence than SGD in the face of noisy objectives.
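The adaptive step size referred to here can be sketched with a standard Adam update, in which `v` is the exponential moving average of the squared gradient; the hyperparameter defaults follow common practice and are not taken from the paper:

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: exponential moving averages of the gradient (m)
    and of the squared gradient (v) give a per-coordinate adaptive step."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)                   # bias correction
    v_hat = v / (1 - b2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v
```

On the first step (t = 1) with a unit gradient, the bias-corrected update is approximately `-lr`; as gradients vary, `v` rescales each coordinate's step individually.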
no code implementations • 14 Oct 2020 • Chen Zhu, Zheng Xu, Ali Shafahi, Manli Shu, Amin Ghiasi, Tom Goldstein
Further, we demonstrate that the compact structure and corresponding initialization from the Lottery Ticket Hypothesis can also help in data-free training.
1 code implementation • 15 Oct 2020 • Jia Guo, Minghao Chen, Yao Hu, Chen Zhu, Xiaofei He, Deng Cai
We investigate this problem by studying the confidence gap between teacher and student.
3 code implementations • CVPR 2022 • Kezhi Kong, Guohao Li, Mucong Ding, Zuxuan Wu, Chen Zhu, Bernard Ghanem, Gavin Taylor, Tom Goldstein
Data augmentation helps neural networks generalize better by enlarging the training set, but it remains an open question how to effectively augment graph data to enhance the performance of Graph Neural Networks (GNNs).
Ranked #1 on Graph Property Prediction on ogbg-ppa
no code implementations • 24 Oct 2020 • Huimin Zeng, Chen Zhu, Tom Goldstein, Furong Huang
Adversarial training has proven to be an effective method for defending against adversarial examples, and is one of the few defenses that withstand strong attacks.
no code implementations • 1 Dec 2020 • Chen Zhu, Ankit Singh Rawat, Manzil Zaheer, Srinadh Bhojanapalli, Daliang Li, Felix Yu, Sanjiv Kumar
In this paper, we propose a new task of \emph{explicitly modifying specific factual knowledge in Transformer models while ensuring the model performance does not degrade on the unmodified facts}.
2 code implementations • NeurIPS 2021 • Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein
Innovations in neural architectures have fostered significant breakthroughs in language modeling and computer vision.
Ranked #137 on Image Classification on CIFAR-10
1 code implementation • ICLR 2021 • Phillip Pope, Chen Zhu, Ahmed Abdelkader, Micah Goldblum, Tom Goldstein
We find that common natural image datasets indeed have very low intrinsic dimension relative to the high number of pixels in the images.
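Intrinsic dimension is commonly measured with nearest-neighbor estimators; below is a minimal sketch of the Levina-Bickel MLE estimator (the paper studies this family of estimators) applied to a synthetic 1-D manifold embedded in 3-D. The choice of k, the normalization, and the data are all illustrative:

```python
import numpy as np

def mle_intrinsic_dim(X, k=5):
    """Levina-Bickel MLE of intrinsic dimension from k-nearest-neighbor
    distances, averaging the per-point inverse estimates (small data only:
    this builds the full pairwise distance matrix)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    dists = np.sqrt(d2)
    dists.sort(axis=1)                      # column 0 is the self-distance
    Tk = dists[:, k][:, None]               # distance to the k-th neighbor
    ratios = np.log(Tk / dists[:, 1:k])     # log(T_k / T_j), j = 1..k-1
    inv_dims = ratios.mean(axis=1)          # per-point estimate of 1/dim
    return 1.0 / inv_dims.mean()

rng = np.random.default_rng(0)
t = rng.uniform(0, 1, (200, 1))
X = t * np.array([1.0, 1.0, 1.0])   # points on a line: a 1-D manifold in 3-D
dim = mle_intrinsic_dim(X)          # near 1 despite 3 ambient coordinates
```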
no code implementations • 25 Apr 2021 • Nicholas Majeske, Bidisha Abesh, Chen Zhu, Ariful Azad
We present a machine learning method to predict extreme hydrologic events from spatially and temporally varying hydrological and meteorological data.
3 code implementations • NeurIPS 2021 • Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro
For instance, Transformer-LS achieves 0.97 test BPC on enwik8 using half as many parameters as the previous method, while being faster and able to handle sequences 3x as long as its full-attention version on the same hardware.
Ranked #1 on Language Modelling on enwik8 dev
2 code implementations • 14 Jul 2021 • Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, Satyen Kale, Sai Praneeth Karimireddy, Jakub Konecny, Sanmi Koyejo, Tian Li, Luyang Liu, Mehryar Mohri, Hang Qi, Sashank J. Reddi, Peter Richtarik, Karan Singhal, Virginia Smith, Mahdi Soltanolkotabi, Weikang Song, Ananda Theertha Suresh, Sebastian U. Stich, Ameet Talwalkar, Hongyi Wang, Blake Woodworth, Shanshan Wu, Felix X. Yu, Honglin Yuan, Manzil Zaheer, Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu
Federated learning and analytics are a distributed approach for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection.
no code implementations • ICLR 2022 • Chen Zhu, Zheng Xu, Mingqing Chen, Jakub Konečný, Andrew Hard, Tom Goldstein
Federated learning has been deployed in practice to train machine learning models from decentralized client data on mobile devices.
no code implementations • 26 Oct 2021 • Jiuhai Chen, Chen Zhu, Bin Dai
In this paper, we study how SSL can enhance the performance of the out-of-distribution (OOD) detection task.
1 code implementation • NeurIPS 2021 • Mucong Ding, Kezhi Kong, Jingling Li, Chen Zhu, John P Dickerson, Furong Huang, Tom Goldstein
Our framework avoids the "neighbor explosion" problem of GNNs using quantized representations combined with a low-rank version of the graph convolution matrix.
Ranked #11 on Node Classification on Reddit
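The low-rank idea can be illustrated with a vector-quantized surrogate for the dense graph convolution: snapping each node's features to one of K codewords lets the N x N propagation factor through an N x K matrix. This is a minimal sketch under simplifying assumptions (a fixed codebook and hard nearest-codeword assignment), not the paper's full training scheme:

```python
import numpy as np

def vq_graph_conv(A, X, codebook):
    """Approximate the graph convolution A @ X with quantized features:
    each node is snapped to its nearest codeword, so propagation factors
    through a rank-K matrix with K << N codewords."""
    d2 = ((X[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    assign = d2.argmin(axis=1)                 # nearest codeword per node
    R = np.eye(codebook.shape[0])[assign]      # one-hot assignments (N, K)
    return (A @ R) @ codebook                  # rank <= K approximation

A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], float)   # toy adjacency
codebook = np.array([[1.0, 0.0], [0.0, 1.0]])
X = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])       # already on codewords
out = vq_graph_conv(A, X, codebook)
```

When the features lie exactly on codewords, as here, the surrogate reproduces `A @ X`; in general the quantization error is what training must control.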
no code implementations • 23 Dec 2021 • Hongcheng Zhong, Jun Xu, Chen Zhu, Donghui Feng, Li Song
Current per-shot encoding schemes aim to improve the compression efficiency by shot-level optimization.
1 code implementation • 31 Jan 2022 • Amin Ghiasi, Hamid Kazemi, Steven Reich, Chen Zhu, Micah Goldblum, Tom Goldstein
Existing techniques for model inversion typically rely on hard-to-tune regularizers, such as total variation or feature regularization, which must be individually calibrated for each network in order to produce adequate images.
1 code implementation • 20 May 2022 • Ravid Shwartz-Ziv, Micah Goldblum, Hossein Souri, Sanyam Kapoor, Chen Zhu, Yann Lecun, Andrew Gordon Wilson
Deep learning is increasingly moving towards a transfer learning paradigm whereby large foundation models are fine-tuned on downstream tasks, starting from an initialization learned on the source task.
1 code implementation • CIKM 2022 • Yuanjie Lyu, Chen Zhu, Tong Xu, Zikai Yin, Enhong Chen
To deal with this challenge, in this paper, we propose a novel fact-aware abstractive summarization model, named Entity-Relation Pointer Generator Network (ERPGN).
1 code implementation • 2 Nov 2022 • Jianwu Fang, Chen Zhu, Pu Zhang, Hongkai Yu, Jianru Xue
Heterogeneous trajectory forecasting is critical for intelligent transportation systems, but it is challenging because of the difficulty of modeling the complex interaction relations among the heterogeneous road agents as well as their agent-environment constraints.
no code implementations • 11 Dec 2022 • Shuo Yang, Yang Jiao, Shaoyu Dou, Mana Zheng, Chen Zhu
Bilevel optimization is used to automatically update the hyperparameter, and the hyperparameter's gradient is approximated via the best-response function.
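The best-response hypergradient can be illustrated with ridge regression, whose inner problem has a closed form; the finite-difference approximation and every name here are illustrative assumptions, not the paper's construction:

```python
import numpy as np

def best_response(X, y, lam):
    """Inner problem: ridge regression has a closed-form best response."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def hypergradient(X_tr, y_tr, X_val, y_val, lam, h=1e-4):
    """Gradient of the validation loss w.r.t. the hyperparameter lam,
    approximated through the best-response function (finite difference)."""
    def val_loss(l):
        w = best_response(X_tr, y_tr, l)
        r = X_val @ w - y_val
        return 0.5 * (r ** 2).mean()
    return (val_loss(lam + h) - val_loss(lam - h)) / (2 * h)

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
X_tr = rng.normal(size=(20, 3)); y_tr = X_tr @ w_true
X_val = rng.normal(size=(10, 3)); y_val = X_val @ w_true
g = hypergradient(X_tr, y_tr, X_val, y_val, lam=1.0)
# A gradient step on the hyperparameter would then be: lam -= step * g
```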
no code implementations • 15 Dec 2022 • Chen Zhu, Chengbo Qiu, Shaoyu Dou, Minghao Liao
Key performance indicators (KPIs) are of great significance in monitoring wireless network service quality.
no code implementations • 14 Mar 2023 • Jiuhai Chen, Lichang Chen, Chen Zhu, Tianyi Zhou
Moreover, ICL (with and without CoT) using only one correct demo significantly outperforms the all-demo ICL adopted by most previous works, indicating the weakness of LLMs in finding correct demo(s) for input queries, which is difficult to evaluate on biased datasets.
1 code implementation • F1000Research 2023 • Xiaopeng Xu, Juexiao Zhou, Chen Zhu, Qing Zhan, Zhongxiao Li, Ruochi Zhang, Yu Wang, Xingyu Liao, Xin Gao
The SGPT-RL method achieved better results than Reinvent on the ACE2 task, where molecular docking was used as the optimization goal.
no code implementations • 3 May 2023 • Chen Zhu, Liang Du, Hong Chen, Shuang Zhao, Zixun Sun, Xin Wang, Wenwu Zhu
To tackle this problem, we draw inspiration from the Global Workspace Theory of conscious processing, which posits that only a specific subset of the product features is pertinent while the rest can be noisy and even detrimental to human click behavior. We propose a CTR model that enables Dynamic Embedding Learning with Truncated Conscious Attention for CTR prediction, termed DELTA.
1 code implementation • 31 May 2023 • Likang Wu, Zhi Zheng, Zhaopeng Qiu, Hao Wang, Hongchao Gu, Tingjia Shen, Chuan Qin, Chen Zhu, HengShu Zhu, Qi Liu, Hui Xiong, Enhong Chen
Large Language Models (LLMs) have emerged as powerful tools in the field of Natural Language Processing (NLP) and have recently gained significant attention in the domain of Recommendation Systems (RS).
1 code implementation • NeurIPS 2023 • Manli Shu, Jiongxiao Wang, Chen Zhu, Jonas Geiping, Chaowei Xiao, Tom Goldstein
In this work, we investigate how an adversary can exploit instruction tuning by injecting into the training data specific instruction-following examples that intentionally change the model's behavior.
no code implementations • 3 Jul 2023 • Chuan Qin, Le Zhang, Rui Zha, Dazhong Shen, Qi Zhang, Ying Sun, Chen Zhu, HengShu Zhu, Hui Xiong
To this end, we present an up-to-date and comprehensive survey on AI technologies used for talent analytics in the field of human resource management.
1 code implementation • 19 Jul 2023 • Pengfei Luo, Tong Xu, Shiwei Wu, Chen Zhu, Linli Xu, Enhong Chen
Then, to derive the similarity matching score for each mention-entity pair, we devise three interaction units to comprehensively explore the intra-modal interaction and inter-modal fusion among features of entities and mentions.
1 code implementation • 19 Aug 2023 • Jingyi Ju, Buzhen Huang, Chen Zhu, Zhihao LI, Yangang Wang
To address these obstacles, our key idea is to employ physics as denoising guidance in the reverse diffusion process to reconstruct physically plausible human motion from a modeled pose probability distribution.
no code implementations • 22 Sep 2023 • Henry Pak, Lukas Asmer, Petra Kokus, Bianca I. Schuchardt, Albert End, Frank Meller, Karolin Schweiger, Christoph Torens, Carolina Barzantny, Dennis Becker, Johannes Maria Ernst, Florian Jäger, Tim Laudien, Nabih Naeem, Anne Papenfuß, Jan Pertz, Prajwal Shiva Prakasha, Patrick Ratei, Fabian Reimer, Patrick Sieb, Chen Zhu
Urban Air Mobility (UAM) is a new air transportation system for passengers and cargo in urban environments, enabled by new technologies and integrated into multimodal transportation systems.
no code implementations • 3 Oct 2023 • Greg Yang, Dingli Yu, Chen Zhu, Soufiane Hayou
By classifying infinite-width neural networks and identifying the *optimal* limit, Tensor Programs IV and V demonstrated a universal way, called $\mu$P, for *widthwise hyperparameter transfer*, i.e., predicting optimal hyperparameters of wide neural networks from narrow ones.
no code implementations • 4 Oct 2023 • Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro
Perhaps surprisingly, we find that LLM with 4K context window using simple retrieval-augmentation at generation can achieve comparable performance to finetuned LLM with 16K context window via positional interpolation on long context tasks, while taking much less computation.
no code implementations • 7 Nov 2023 • Wenxuan Zhang, Hongzhi Liu, Yingpeng Du, Chen Zhu, Yang song, HengShu Zhu, Zhonghai Wu
Nevertheless, these methods encounter the issue that information such as community behavior patterns in the RS domain is challenging to express in natural language, which limits the ability of LLMs to surpass state-of-the-art domain-specific models.
no code implementations • 13 Nov 2023 • Omar Garcia Crespillo, Chen Zhu, Maximilian Simonetti, Daniel Gerbeth, Young-Hee Lee, Wenhan Hao
Communication, Navigation and Surveillance (CNS) technologies are key enablers for future safe operation of drones in urban environments.
no code implementations • 29 Nov 2023 • Chen Zhu, Guo Lu, Bing He, Rong Xie, Li Song
To further enhance the reconstruction quality from the INR codec, we leverage the high-quality reconstructed frames from the explicit codec to achieve inter-view compensation.
no code implementations • 12 Dec 2023 • Chen Zhu, Zhouxiang Zhao, Zejing Shan, Lijie Yang, Sijie Ji, Zhaohui Yang, Zhaoyang Zhang
To improve the target detection performance under complex real-world scenarios, this paper proposes an intelligent integrated optical camera and millimeter-wave (mmWave) radar system.
no code implementations • 10 Feb 2024 • Fedor Borisyuk, Mingzhou Zhou, Qingquan Song, Siyu Zhu, Birjodh Tiwana, Ganesh Parameswaran, Siddharth Dangi, Lars Hertel, Qiang Xiao, Xiaochen Hou, Yunbo Ouyang, Aman Gupta, Sheallika Singh, Dan Liu, Hailing Cheng, Lei Le, Jonathan Hung, Sathiya Keerthi, Ruoyan Wang, Fengyu Zhang, Mohit Kothari, Chen Zhu, Daqi Sun, Yun Dai, Xun Luan, Sirou Zhu, Zhiwei Wang, Neil Daftary, Qianqi Shen, Chengming Jiang, Haichao Wei, Maneesh Varshney, Amol Ghoting, Souvik Ghosh
We present LiRank, a large-scale ranking framework at LinkedIn that brings to production state-of-the-art modeling architectures and optimization methods.
no code implementations • 11 Feb 2024 • Lichang Chen, Chen Zhu, Davit Soselia, Jiuhai Chen, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro
In this work, we study the issue of reward hacking on the response length, a challenge emerging in Reinforcement Learning from Human Feedback (RLHF) on LLMs.
no code implementations • 17 Feb 2024 • Jinhao Jiang, Kun Zhou, Wayne Xin Zhao, Yang song, Chen Zhu, HengShu Zhu, Ji-Rong Wen
To guarantee the effectiveness, we leverage program language to formulate the multi-hop reasoning process over the KG, and synthesize a code-based instruction dataset to fine-tune the base LLM.
no code implementations • 26 Feb 2024 • Jupinder Parmar, Shrimai Prabhumoye, Joseph Jennings, Mostofa Patwary, Sandeep Subramanian, Dan Su, Chen Zhu, Deepak Narayanan, Aastha Jhunjhunwala, Ayush Dattagupta, Vibhu Jawa, Jiwei Liu, Ameya Mahabaleshwarkar, Osvald Nitski, Annika Brundyn, James Maki, Miguel Martinez, Jiaxuan You, John Kamalu, Patrick Legresley, Denys Fridman, Jared Casper, Ashwath Aithal, Oleksii Kuchaiev, Mohammad Shoeybi, Jonathan Cohen, Bryan Catanzaro
We introduce Nemotron-4 15B, a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens.
no code implementations • 26 Mar 2024 • Dian Chao, Xin Song, Shupeng Zhong, Boyuan Wang, Xiangyu Wu, Chen Zhu, Yang Yang
In this paper, we propose a solution for improving the quality of captions generated for figures in papers.