Search Results for author: Minsik Cho

Found 18 papers, 0 papers with code

PDP: Parameter-free Differentiable Pruning is All You Need

no code implementations 18 May 2023 Minsik Cho, Saurabh Adya, Devang Naik

Also, PDP yields over 83.1% accuracy on Multi-Genre Natural Language Inference with 90% sparsity for BERT, while the next best existing technique shows 81.5% accuracy.
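The paper's exact formulation is parameter-free and more involved; as a rough illustration of the general idea behind differentiable magnitude pruning, a soft (sigmoid-based) mask that stays differentiable with respect to the weights might look like the following sketch (all names here are hypothetical, not from the paper):

```python
import numpy as np

def soft_prune_mask(weights, sparsity, temperature=0.01):
    """Soft pruning-mask sketch: weights whose magnitude falls below the
    sparsity-quantile threshold get mask values near 0, others near 1.
    The sigmoid keeps the mask differentiable w.r.t. the weights."""
    # threshold = weight magnitude at the target sparsity quantile
    t = np.quantile(np.abs(weights), sparsity)
    # smooth 0/1 mask; small temperature -> sharper transition
    return 1.0 / (1.0 + np.exp(-(np.abs(weights) - t) / temperature))

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64))
mask = soft_prune_mask(w, sparsity=0.9)
# roughly 90% of mask entries end up near zero
print((mask < 0.5).mean())
```

In training, the masked weights `w * mask` would be used in the forward pass, so pruning decisions receive gradient signal instead of being a hard post-hoc cutoff.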

Natural Language Inference

R^2: Range Regularization for Model Compression and Quantization

no code implementations 14 Mar 2023 Arnav Kundu, Chungkuk Yoo, Srijan Mishra, Minsik Cho, Saurabh Adya

Model parameter regularization is a widely used technique to improve generalization, but it can also be used to shape the weight distributions for various purposes.
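A range-based penalty is one way to shape a weight distribution toward a form that quantizes well. The paper's regularizers differ in detail; a minimal sketch of the idea, penalizing the spread (max minus min) of a layer's weights (function name and constants are hypothetical), could be:

```python
import numpy as np

def range_penalty(weights, lam=1e-3):
    """Range-regularization sketch: penalize the spread (max - min) of a
    layer's weights, nudging the distribution toward a narrow, outlier-free
    shape that is friendlier to low-bit quantization."""
    return lam * (weights.max() - weights.min())

wide = np.array([-4.0, 0.0, 4.0])     # outliers -> large penalty
narrow = np.array([-0.5, 0.0, 0.5])   # compact range -> small penalty
print(range_penalty(wide), range_penalty(narrow))
```

Added to the task loss, such a term discourages weight outliers, which would otherwise force a quantizer to spend its dynamic range on a few extreme values.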

Classification Model Compression +2

DKM: Differentiable K-Means Clustering Layer for Neural Network Compression

no code implementations ICLR 2022 Minsik Cho, Keivan A. Vahid, Saurabh Adya, Mohammad Rastegari

For MobileNet-v1, which is a challenging DNN to compress, DKM delivers 63.9% top-1 ImageNet1k accuracy with 0.72 MB model size (22.4x model compression factor).
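The core trick in DKM is to make the clustering step differentiable. The paper uses an attention-based formulation; a simplified sketch of one such iteration, with a softmax over negative weight-centroid distances standing in for hard assignment (function and variable names are illustrative, not from the paper), might be:

```python
import numpy as np

def soft_kmeans_step(weights, centroids, temperature=0.05):
    """Differentiable k-means sketch: a softmax over negative distances
    gives a soft assignment matrix; weights are replaced by their
    assignment-weighted centroid mix, and centroids are softly updated."""
    d = -np.abs(weights[:, None] - centroids[None, :]) / temperature
    a = np.exp(d - d.max(axis=1, keepdims=True))
    a /= a.sum(axis=1, keepdims=True)                 # rows sum to 1
    compressed = a @ centroids                        # softly clustered weights
    new_centroids = (a.T @ weights) / a.sum(axis=0)   # soft centroid update
    return compressed, new_centroids

w = np.array([-1.02, -0.98, 0.01, 0.99, 1.03])
c = np.array([-1.0, 0.0, 1.0])
cw, nc = soft_kmeans_step(w, c)
```

Because every step is differentiable, the clustering can sit inside the training graph and the task loss can pull both weights and centroids, rather than clustering being a lossy post-training step.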

Neural Network Compression

Exploring Avenues Beyond Revised DSD Functionals: I. range separation, with xDSD as a special case

no code implementations 9 Feb 2021 Golokesh Santra, Minsik Cho, Jan M. L. Martin

We have explored the use of range separation as a possible avenue for further improvement on our revDSD minimally empirical double hybrid functionals.

Chemical Physics

NASTransfer: Analyzing Architecture Transferability in Large Scale Neural Architecture Search

no code implementations 23 Jun 2020 Rameswar Panda, Michele Merler, Mayoore Jaiswal, Hui Wu, Kandan Ramakrishnan, Ulrich Finkler, Chun-Fu Chen, Minsik Cho, David Kung, Rogerio Feris, Bishwaranjan Bhattacharjee

The typical way of conducting large-scale NAS is to search for an architectural building block on a small dataset (either a proxy set from the large dataset or a completely different small-scale dataset) and then transfer the block to the larger dataset.

Neural Architecture Search

SNOW: Subscribing to Knowledge via Channel Pooling for Transfer & Lifelong Learning of Convolutional Neural Networks

no code implementations ICLR 2020 Chungkuk Yoo, Bumsoo Kang, Minsik Cho

SNOW is an efficient learning method that improves training/serving throughput as well as accuracy for transfer and lifelong learning of convolutional neural networks, based on knowledge subscription.

Image Classification

SimEx: Express Prediction of Inter-dataset Similarity by a Fleet of Autoencoders

no code implementations 14 Jan 2020 Inseok Hwang, Jinho Lee, Frank Liu, Minsik Cho

Our intuition is that the more similar the unknown data samples are to the portion of known data an autoencoder was trained on, the better the chances that this autoencoder can apply its trained knowledge and reconstruct output samples closer to the originals.
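That intuition translates naturally into a reconstruction-error-based similarity score. The paper's fleet-of-autoencoders setup is richer; a minimal sketch with a stand-in "autoencoder" (any callable mapping samples to reconstructions; all names here are hypothetical) might be:

```python
import numpy as np

def similarity_score(autoencoder, samples):
    """SimEx-style score sketch: lower reconstruction error on unseen
    samples suggests they resemble the data the autoencoder was trained on."""
    recon = autoencoder(samples)
    mse = np.mean((samples - recon) ** 2)
    return 1.0 / (1.0 + mse)   # higher = more similar

# toy stand-in: an "autoencoder" that perfectly reconstructs data in [-1, 1]
ae = lambda x: np.clip(x, -1.0, 1.0)
near = np.random.default_rng(1).uniform(-1, 1, (100, 8))  # in-distribution
far = near * 10                                           # out-of-distribution
print(similarity_score(ae, near) > similarity_score(ae, far))  # True
```

With a fleet of autoencoders, one per known dataset, ranking these scores gives an express estimate of which known dataset an unknown sample set most resembles, without training a model on the unknown data.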

Data Augmentation

MUTE: Data-Similarity Driven Multi-hot Target Encoding for Neural Network Design

no code implementations 15 Oct 2019 Mayoore S. Jaiswal, Bumsoo Kang, Jinho Lee, Minsik Cho

Target encoding is an effective technique to deliver better performance for conventional machine learning methods, and recently, for deep neural networks as well.
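The "multi-hot" part replaces the usual one-hot target with a vector in which semantically similar classes share activation. The paper derives its encodings from data similarity; a toy sketch of the general shape (encoding scheme, values, and names here are illustrative, not the paper's) could be:

```python
import numpy as np

def multi_hot_encode(labels, n_classes, similar):
    """Multi-hot target sketch: each label activates its own class bit
    fully, plus the bits of classes deemed similar to it at a lower value,
    so the target itself carries inter-class similarity information."""
    t = np.zeros((len(labels), n_classes))
    for i, y in enumerate(labels):
        t[i, y] = 1.0
        for s in similar.get(y, []):
            t[i, s] = 0.5          # partial credit for similar classes
    return t

# classes: 0 = cat, 1 = dog, 2 = car; cat and dog are treated as similar
targets = multi_hot_encode([0, 2], 3, {0: [1], 2: []})
# row 0 -> [1.0, 0.5, 0.0], row 1 -> [0.0, 0.0, 1.0]
print(targets)
```

Training against such soft, similarity-aware targets penalizes confusions between unrelated classes more than confusions between related ones.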

General Classification Image Classification

Deep Learning for Multi-Messenger Astrophysics: A Gateway for Discovery in the Big Data Era

no code implementations 1 Feb 2019 Gabrielle Allen, Igor Andreoni, Etienne Bachelet, G. Bruce Berriman, Federica B. Bianco, Rahul Biswas, Matias Carrasco Kind, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Anushri Gupta, Roland Haas, E. A. Huerta, Elise Jennings, Daniel S. Katz, Asad Khan, Volodymyr Kindratenko, William T. C. Kramer, Xin Liu, Ashish Mahabal, Kenton McHenry, J. M. Miller, M. S. Neubauer, Steve Oberlin, Alexander R. Olivas Jr, Shawn Rosofsky, Milton Ruiz, Aaron Saxton, Bernard Schutz, Alex Schwing, Ed Seidel, Stuart L. Shapiro, Hongyu Shen, Yue Shen, Brigitta M. Sipőcz, Lunan Sun, John Towns, Antonios Tsokaros, Wei Wei, Jack Wells, Timothy J. Williams, JinJun Xiong, Zhizhen Zhao

We discuss key aspects to realize this endeavor, namely (i) the design and exploitation of scalable and computationally efficient AI algorithms for Multi-Messenger Astrophysics; (ii) cyberinfrastructure requirements to numerically simulate astrophysical sources, and to process and interpret Multi-Messenger Astrophysics data; (iii) management of gravitational wave detections and triggers to enable electromagnetic and astro-particle follow-ups; (iv) a vision to harness future developments of machine and deep learning and cyberinfrastructure resources to cope with the scale of discovery in the Big Data Era; (v) and the need to build a community that brings domain experts together with data scientists on equal footing to maximize and accelerate discovery in the nascent field of Multi-Messenger Astrophysics.

Astronomy Management

Data-parallel distributed training of very large models beyond GPU capacity

no code implementations 29 Nov 2018 Samuel Matzek, Max Grossman, Minsik Cho, Anar Yusifov, Bryant Nelson, Amit Juneja

GPUs have limited memory, which makes it difficult to train wide and/or deep models whose memory footprint exceeds GPU capacity.

A Unified Approximation Framework for Compressing and Accelerating Deep Neural Networks

no code implementations 26 Jul 2018 Yuzhe Ma, Ran Chen, Wei Li, Fanhua Shang, Wenjian Yu, Minsik Cho, Bei Yu

To address this issue, various approximation techniques have been investigated, which seek a lightweight network with little performance degradation in exchange for a smaller model size or faster inference.

General Classification Image Classification +1


no code implementations 7 Aug 2017 Minsik Cho, Ulrich Finkler, Sameer Kumar, David Kung, Vaibhav Saxena, Dheeraj Sreedhar

We train Resnet-101 on Imagenet 22K with 64 IBM Power8 S822LC servers (256 GPUs) in about 7 hours to 33.8% validation accuracy.

MEC: Memory-efficient Convolution for Deep Neural Network

no code implementations ICML 2017 Minsik Cho, Daniel Brand

However, all these indirect methods have high memory overhead, which degrades performance and offers a poor trade-off between performance and memory consumption.
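The memory overhead of indirect (im2col-style) convolution is easy to quantify: the lowered matrix replicates each input pixel up to K×K times. A small back-of-the-envelope calculation (function name and the specific shapes are illustrative) makes the blow-up concrete:

```python
def im2col_overhead(H, W, C, K, stride=1):
    """Memory blow-up factor of im2col-based convolution: the lowered
    matrix holds one K*K*C patch per output pixel, replicating input
    elements relative to the original H*W*C tensor."""
    Ho = (H - K) // stride + 1
    Wo = (W - K) // stride + 1
    input_elems = H * W * C
    lowered_elems = Ho * Wo * K * K * C
    return lowered_elems / input_elems

# 3x3 convolution, stride 1, on a 224x224x64 input: ~8.8x extra memory
print(round(im2col_overhead(224, 224, 64, 3), 1))
```

This near-K² replication is the overhead MEC targets: a lowering that reuses overlapping columns avoids materializing most of those duplicated elements.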
