no code implementations • 25 Mar 2022 • Hanlin Tang, Xipeng Zhang, Kai Liu, Jianchen Zhu, Zhanhui Kang
In this work, we propose MKQ-BERT, which further improves the compression level and uses 4-bits for quantization.
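For context, a minimal sketch of symmetric 4-bit quantization, the general mechanism such schemes build on (function names are hypothetical; this is not the MKQ-BERT kernel itself):

```python
import numpy as np

def quantize_4bit(x):
    """Symmetric 4-bit quantization: map floats to integers in [-7, 7] plus one scale."""
    max_abs = np.abs(x).max()
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -7, 7).astype(np.int8)
    return q, scale

def dequantize_4bit(q, scale):
    """Recover a low-precision approximation of the original tensor."""
    return q.astype(np.float32) * scale

w = np.random.randn(4, 4).astype(np.float32)
q, s = quantize_4bit(w)
print("max abs error:", np.abs(w - dequantize_4bit(q, s)).max())
```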
no code implementations • 20 Aug 2021 • Weicong Ding, Hanlin Tang, Jingshuo Feng, Lei Yuan, Sen Yang, Guangxu Yang, Jie Zheng, Jing Wang, Qiang Su, Dong Zheng, Xuezhong Qiu, Yongqi Liu, Yuxuan Chen, Yang Liu, Chao Song, Dongying Kong, Kai Ren, Peng Jiang, Qiao Lian, Ji Liu
In this setting with multiple and constrained goals, this paper discovers that a probabilistic strategic parameter regime can achieve better value than the standard regime of finding a single deterministic parameter.
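As a toy illustration of the probabilistic regime (hypothetical names; not the paper's method): instead of committing to one deterministic parameter, the strategy samples it from a learned distribution.

```python
import numpy as np

def sample_strategic_parameter(candidates, probs, rng=None):
    """Draw the strategic parameter from a distribution over candidate values,
    rather than fixing a single deterministic choice."""
    rng = rng or np.random.default_rng()
    return rng.choice(candidates, p=probs)

# e.g., three candidate parameter values with learned probabilities
print(sample_strategic_parameter([0.8, 1.0, 1.2], [0.2, 0.5, 0.3]))
```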
no code implementations • ICLR 2021 • Cory Stephenson, Suchismita Padhy, Abhinav Ganesh, Yue Hui, Hanlin Tang, SueYeon Chung
Understanding how large neural networks avoid memorizing training data is key to explaining their high generalization performance.
no code implementations • ACL (RepL4NLP) 2021 • Matteo Alleman, Jonathan Mamou, Miguel A Del Rio, Hanlin Tang, Yoon Kim, SueYeon Chung
While vector-based language representations from pretrained language models have set a new standard for many NLP tasks, there is not yet a complete accounting of their inner workings.
1 code implementation • 13 Apr 2021 • Conglong Li, Ammar Ahmad Awan, Hanlin Tang, Samyam Rajbhandari, Yuxiong He
To this end, we design a new communication-efficient algorithm, 1-bit LAMB, which introduces a novel way to support adaptive layerwise learning rates under compression.
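A rough sketch of the LAMB-style layerwise trust ratio underlying adaptive layerwise learning rates (illustrative only; 1-bit LAMB's contribution is making this work under 1-bit compression, which this sketch does not reproduce):

```python
import numpy as np

def lamb_layer_update(param, update, base_lr=1e-3, eps=1e-6):
    """Scale the update by the layerwise trust ratio ||w|| / ||u||, giving each
    layer its own effective learning rate."""
    trust = np.linalg.norm(param) / (np.linalg.norm(update) + eps)
    return param - base_lr * trust * update
```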
2 code implementations • 4 Feb 2021 • Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He
One of the most effective methods is error-compensated compression, which offers robust convergence speed even under 1-bit compression.
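A minimal sketch of error-compensated 1-bit (sign) compression, assuming a simple shared-scale scheme (hypothetical helper; 1-bit Adam additionally freezes Adam's variance term, not shown here):

```python
import numpy as np

def compress_with_error_feedback(grad, error):
    """Sign-compress the gradient and carry the quantization residual forward,
    so compression errors cancel out over iterations."""
    corrected = grad + error                 # fold in last step's residual
    scale = np.abs(corrected).mean()         # one shared magnitude per tensor
    compressed = scale * np.sign(corrected)  # ~1 bit per coordinate + the scale
    new_error = corrected - compressed       # residual for the next step
    return compressed, new_error
```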
no code implementations • 1 Jan 2021 • Matteo Alleman, Jonathan Mamou, Miguel A Del Rio, Hanlin Tang, Yoon Kim, SueYeon Chung
Importing from computational and cognitive neuroscience the notion of representational invariance, we perform a series of probes designed to test the sensitivity of Transformer representations to several kinds of structure in sentences.
no code implementations • 26 Aug 2020 • Hanlin Tang, Shaoduo Gan, Samyam Rajbhandari, Xiangru Lian, Ji Liu, Yuxiong He, Ce Zhang
Adam is an important optimization algorithm that guarantees efficiency and accuracy for training many large-scale tasks such as BERT and ImageNet.
no code implementations • ICLR 2021 • Shauharda Khadka, Estelle Aflalo, Mattias Marder, Avrech Ben-David, Santiago Miret, Shie Mannor, Tamir Hazan, Hanlin Tang, Somdeb Majumdar
For deep neural network accelerators, memory movement is energetically expensive and can bound computation.
1 code implementation • ICML 2020 • Jonathan Mamou, Hang Le, Miguel Del Rio, Cory Stephenson, Hanlin Tang, Yoon Kim, SueYeon Chung
In addition, we find that the emergence of linear separability in these manifolds is driven by a combined reduction of manifolds' radius, dimensionality and inter-manifold correlations.
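As a loose illustration of the quantities involved (crude proxies, not the mean-field manifold capacity analysis the paper uses), a manifold's radius and effective dimensionality can be estimated from a point cloud of activations:

```python
import numpy as np

def manifold_radius_and_dim(points):
    """points: (n_samples, n_features) activations for one class/manifold.
    Radius: RMS distance to the centroid. Dimensionality: participation ratio
    of the covariance eigenvalue spectrum."""
    centered = points - points.mean(axis=0)
    radius = np.sqrt((centered ** 2).sum(axis=1).mean())
    eig = np.linalg.eigvalsh(np.cov(centered, rowvar=False))
    dim = eig.sum() ** 2 / (eig ** 2).sum()
    return radius, dim
```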
1 code implementation • NeurIPS 2019 • Cory Stephenson, Jenelle Feather, Suchismita Padhy, Oguz Elibol, Hanlin Tang, Josh McDermott, SueYeon Chung
Higher level concepts such as parts-of-speech and context dependence also emerge in the later layers of the network.
no code implementations • ICLR 2020 • Léopold Cambier, Anahita Bhiwandiwalla, Ting Gong, Mehran Nekuii, Oguz H. Elibol, Hanlin Tang
This necessitates an increased memory footprint and greater computational requirements for training.
no code implementations • 19 Nov 2019 • Barak Battash, Haim Barad, Hanlin Tang, Amit Bleiweiss
In this paper we approach the task in a completely different way: we treat the data from the compressed stream as a single unit clip and propose that residual frames can replace the original RGB frames from the raw domain.
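A hedged sketch of the idea (real codecs expose residuals directly in the compressed stream; here they are merely approximated by differencing decoded frames):

```python
import numpy as np

def residual_frames(clip):
    """clip: (T, H, W, C) uint8 video treated as one unit.
    Returns (T-1, H, W, C) residuals, a stand-in for the codec's residual frames."""
    frames = clip.astype(np.int16)   # avoid uint8 wraparound on subtraction
    return frames[1:] - frames[:-1]
```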
4 code implementations • 6 Nov 2019 • Vijay Janapa Reddi, Christine Cheng, David Kanter, Peter Mattson, Guenther Schmuelling, Carole-Jean Wu, Brian Anderson, Maximilien Breughe, Mark Charlebois, William Chou, Ramesh Chukka, Cody Coleman, Sam Davis, Pan Deng, Greg Diamos, Jared Duke, Dave Fick, J. Scott Gardner, Itay Hubara, Sachin Idgunji, Thomas B. Jablin, Jeff Jiao, Tom St. John, Pankaj Kanwar, David Lee, Jeffery Liao, Anton Lokhmotov, Francisco Massa, Peng Meng, Paulius Micikevicius, Colin Osborne, Gennady Pekhimenko, Arun Tejusve Raghunath Rajan, Dilip Sequeira, Ashish Sirasao, Fei Sun, Hanlin Tang, Michael Thomson, Frank Wei, Ephrem Wu, Lingjie Xu, Koichi Yamada, Bing Yu, George Yuan, Aaron Zhong, Peizhao Zhang, Yuchen Zhou
Machine-learning (ML) hardware and software system demand is burgeoning.
1 code implementation • 11 Oct 2019 • Chaoyang He, Conghui Tan, Hanlin Tang, Shuang Qiu, Ji Liu
However, in many social network scenarios, centralized federated learning is not applicable (e.g., a central agent or server connecting all users may not exist, or the communication cost to the central server is not affordable).
no code implementations • 2 Oct 2019 • Brigit Schroeder, Hanlin Tang, Alexandre Alahi
We propose a simple yet effective method for leveraging these image priors to improve semantic segmentation of images from sequential driving datasets.
2 code implementations • 2 Oct 2019 • Peter Mattson, Christine Cheng, Cody Coleman, Greg Diamos, Paulius Micikevicius, David Patterson, Hanlin Tang, Gu-Yeon Wei, Peter Bailis, Victor Bittorf, David Brooks, Dehao Chen, Debojyoti Dutta, Udit Gupta, Kim Hazelwood, Andrew Hock, Xinyuan Huang, Atsushi Ike, Bill Jia, Daniel Kang, David Kanter, Naveen Kumar, Jeffery Liao, Guokai Ma, Deepak Narayanan, Tayo Oguntebi, Gennady Pekhimenko, Lillian Pentecost, Vijay Janapa Reddi, Taylor Robie, Tom St. John, Tsuguchika Tabaru, Carole-Jean Wu, Lingjie Xu, Masafumi Yamazaki, Cliff Young, Matei Zaharia
Machine learning (ML) needs industry-standard performance benchmarks to support design and competitive evaluation of the many emerging software and hardware solutions for ML.
no code implementations • 19 Sep 2019 • Brigit Schroeder, Subarna Tripathi, Hanlin Tang
We see a significant performance increase in both metrics that measure the quality of layout prediction, mean intersection-over-union (mIoU) (52.3% vs. 49.2%) and relation score (61.7% vs. 54.1%), after adding triplet supervision and data augmentation.
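For reference, the mIoU metric quoted above can be computed as follows (a standard sketch, not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target."""
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:
            ious.append(inter / union)
    return float(np.mean(ious))
```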
no code implementations • 17 Jul 2019 • Hanlin Tang, Xiangru Lian, Shuang Qiu, Lei Yuan, Ce Zhang, Tong Zhang, Ji Liu
Since decentralized training has been shown to be superior to traditional centralized training in communication-restricted scenarios, a natural question to ask is: how can the error-compensated technology be applied to decentralized learning to further reduce the communication cost?
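One way to picture the combination (a hedged sketch under simplifying assumptions, namely a fixed doubly stochastic mixing matrix W and shared-scale sign compression; not the paper's algorithm verbatim):

```python
import numpy as np

def decentralized_compressed_step(params, grads, errors, W, lr=0.1):
    """params, grads, errors: (n_nodes, dim) arrays; W: (n_nodes, n_nodes)
    doubly stochastic mixing matrix. Gossip-average neighbor models, then apply
    an error-compensated compressed gradient on each node."""
    corrected = grads + errors
    scale = np.abs(corrected).mean(axis=1, keepdims=True)
    compressed = scale * np.sign(corrected)   # 1-bit messages between nodes
    new_errors = corrected - compressed       # local error memory per node
    return W @ params - lr * compressed, new_errors
```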
no code implementations • 26 Jun 2019 • Varun Kumar Vijay, Abhinav Ganesh, Hanlin Tang, Arjun Bansal
To solve tasks in new environments involving objects unseen during training, agents must reason over prior information about those objects and their relations.
1 code implementation • 13 Jun 2019 • Shuyuan Li, Jianguo Li, Hanlin Tang, Rui Qian, Weiyao Lin
This paper tries to fill the gap by introducing a novel large-scale dataset, the Amur Tiger Re-identification in the Wild (ATRW) dataset.
no code implementations • 28 May 2019 • Suchismita Padhy, Jenelle Feather, Cory Stephenson, Oguz Elibol, Hanlin Tang, Josh McDermott, SueYeon Chung
The success of deep neural networks in visual tasks has motivated recent theoretical and empirical work to understand how these networks operate.
no code implementations • 15 May 2019 • Hanlin Tang, Xiangru Lian, Chen Yu, Tong Zhang, Ji Liu
For example, under the popular parameter server model for distributed learning, the worker nodes need to send the compressed local gradients to the parameter server, which performs the aggregation.
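A toy sketch of that pipeline (hypothetical helpers; real systems fuse this with error feedback and efficient collectives):

```python
import numpy as np

def sign_compress(grad):
    """Worker side: keep only the sign plus one shared scale per tensor."""
    return np.abs(grad).mean() * np.sign(grad)

def server_aggregate(param, worker_grads, lr=0.1):
    """Server side: average the compressed gradients and update the model."""
    agg = np.mean([sign_compress(g) for g in worker_grads], axis=0)
    return param - lr * agg
```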
no code implementations • 19 Apr 2019 • Subarna Tripathi, Sharath Nittur Sridhar, Sairam Sundaresan, Hanlin Tang
Structured representations such as scene graphs serve as an efficient and compact representation that can be used for downstream rendering or retrieval tasks.
no code implementations • ICCV 2019 • Nicholas Weir, David Lindenbaum, Alexei Bastidas, Adam Van Etten, Sean McPherson, Jacob Shermeyer, Varun Kumar, Hanlin Tang
To address this problem, we present an open source Multi-View Overhead Imagery dataset, termed SpaceNet MVOI, with 27 unique looks from a broad range of viewing angles (-32.5 degrees to 54.0 degrees).
no code implementations • ICLR Workshop LLD 2019 • Subarna Tripathi, Anahita Bhiwandiwalla, Alexei Bastidas, Hanlin Tang
Existing scene graph to image models have two stages: (1) a scene composition stage and (2) an image generation stage.
no code implementations • 29 Jan 2019 • Yawei Zhao, Chen Yu, Peilin Zhao, Hanlin Tang, Shuang Qiu, Ji Liu
Decentralized Online Learning (online learning in decentralized networks) has attracted increasing attention, since it can help data providers cooperatively solve their online problems without sharing their private data with a third party or other providers.
no code implementations • 11 Jan 2019 • Subarna Tripathi, Anahita Bhiwandiwalla, Alexei Bastidas, Hanlin Tang
Generating realistic images from scene graphs asks neural networks to be able to reason about object relationships and compositionality.
no code implementations • 17 Oct 2018 • Chen Yu, Hanlin Tang, Cedric Renggli, Simon Kassing, Ankit Singla, Dan Alistarh, Ce Zhang, Ji Liu
Most of today's distributed machine learning systems assume reliable networks: whenever two machines exchange information (e.g., gradients or models), the network should guarantee the delivery of the message.
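A toy simulation of the setting (illustrative only; the paper studies convergence guarantees rather than this naive averaging):

```python
import numpy as np

def lossy_aggregate(worker_grads, drop_prob=0.1, rng=None):
    """Average gradients while each worker's message is independently dropped
    with probability drop_prob, mimicking an unreliable network."""
    rng = rng or np.random.default_rng(0)
    survived = [g for g in worker_grads if rng.random() > drop_prob]
    if not survived:                       # every message lost this round
        return np.zeros_like(worker_grads[0])
    return np.mean(survived, axis=0)
```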
no code implementations • ICML 2018 • Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu
While training a machine learning model using multiple workers, each of which collects data from its own data source, it would be useful when the data collected from different workers are unique and different.
Ranked #3 on Multi-view Subspace Clustering on ORL
no code implementations • 19 Mar 2018 • Hanlin Tang, Xiangru Lian, Ming Yan, Ce Zhang, Ji Liu
While training a machine learning model using multiple workers, each of which collects data from its own data sources, it would be most useful when the data collected from different workers can be unique and different.
no code implementations • NeurIPS 2018 • Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu
In this paper, we explore a natural question: can the combination of both techniques lead to a system that is robust to both bandwidth and latency?
1 code implementation • 7 Jun 2017 • Hanlin Tang, Martin Schrimpf, Bill Lotter, Charlotte Moerman, Ana Paredes, Josue Ortega Caro, Walter Hardesty, David Cox, Gabriel Kreiman
First, subjects robustly recognized objects even when rendered <15% visible, but recognition was largely impaired when processing was interrupted by backward masking.