no code implementations • ACL (RepL4NLP) 2021 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
Many recalibration methods have been proposed in the literature for quantifying predictive uncertainty and calibrating model outputs, with varying degrees of complexity.
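One of the simplest recalibration methods in this family is temperature scaling; a minimal sketch (the choice of method here is illustrative, not necessarily the paper's):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def nll(T, logits, labels):
    # Negative log-likelihood of softmax(logits / T) at the true labels.
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)            # numerical stability
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels):
    # Fit a single scalar T > 0 on held-out data; T > 1 softens
    # overconfident predictions, T < 1 sharpens underconfident ones.
    return minimize_scalar(nll, bounds=(0.05, 10.0), args=(logits, labels),
                           method="bounded").x
```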
1 code implementation • 26 Nov 2024 • Vladimir Malinovskii, Andrei Panferov, Ivan Ilin, Han Guo, Peter Richtárik, Dan Alistarh
Quantizing large language models has become a standard way to reduce their memory and computational costs.
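For reference, the simplest baseline that "quantizing" refers to is symmetric round-to-nearest quantization; a minimal sketch (not this paper's method):

```python
import numpy as np

def quantize_rtn(w, bits=4):
    # Symmetric round-to-nearest: store int codes plus one fp scale.
    qmax = 2 ** (bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = np.abs(w).max() / qmax + 1e-12
    q = np.clip(np.round(w / scale), -qmax, qmax).astype(np.int8)
    return q, scale

w = np.random.randn(16, 16).astype(np.float32)
q, scale = quantize_rtn(w)
print("max abs error:", np.abs(w - q.astype(np.float32) * scale).max())
```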
1 code implementation • 11 Nov 2024 • Ekin Akyürek, Mehul Damani, Linlu Qiu, Han Guo, Yoon Kim, Jacob Andreas
TTT significantly improves performance on ARC tasks, achieving up to a 6x improvement in accuracy over base fine-tuned models; applying TTT to an 8B-parameter language model, we achieve 53% accuracy on ARC's public validation set, improving the state of the art among public and purely neural approaches by nearly 25%.
1 code implementation • 26 Aug 2024 • James Liu, Pragaash Ponnusamy, Tianle Cai, Han Guo, Yoon Kim, Ben Athiwaratkun
Activation sparsity can enable practical inference speedups in large language models (LLMs) by reducing the compute and memory-movement required for matrix multiplications during the forward pass.
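A toy sketch of where those savings come from (my own illustration, not the paper's kernels): zero out near-zero activations and read only the matching rows of the next weight matrix.

```python
import numpy as np

def sparse_matmul(x, W, threshold=0.1):
    # Keep only activations with magnitude above the threshold and touch
    # only the matching rows of W -- the skipped memory movement and
    # compute are the source of the speedup.
    idx = np.nonzero(np.abs(x) > threshold)[0]
    return x[idx] @ W[idx, :]                       # dense equivalent: x @ W

x = np.random.randn(4096) * 0.05                    # mostly near-zero
W = np.random.randn(4096, 4096)
y = sparse_matmul(x, W)
print("kept", np.count_nonzero(np.abs(x) > 0.1), "of", x.size, "activations")
```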
1 code implementation • 15 Jul 2024 • Han Guo, William Brandon, Radostin Cholakov, Jonathan Ragan-Kelley, Eric P. Xing, Yoon Kim
The deployment of large language models (LLMs) is often constrained by memory bandwidth, where the primary bottleneck is the cost of transferring model parameters from the GPU's global memory to its registers.
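The bandwidth bound is easy to make concrete. With illustrative numbers (a hypothetical 8B-parameter model and ~1 TB/s of global-memory bandwidth), the per-token decoding ceiling is:

```python
# Every generated token must stream all weights from global memory once, so
#   tokens/sec <= bandwidth / (params * bytes_per_param).
params = 8e9                  # hypothetical 8B-parameter model
bandwidth = 1e12              # ~1 TB/s global-memory bandwidth (illustrative)
for name, bytes_per_param in [("fp16", 2), ("int4", 0.5)]:
    print(f"{name}: <= {bandwidth / (params * bytes_per_param):,.0f} tokens/s")
# Shrinking the bytes moved per parameter raises the ceiling proportionally,
# which is why weight quantization helps memory-bound decoding.
```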
1 code implementation • 15 Jun 2024 • Chenyao Zhou, Haotian Zhang, Han Guo, Zhengxia Zou, Zhenwei Shi
Recently, several multi-task-learning-based semantic change detection methods have been proposed that decompose the task into semantic segmentation and binary change detection subtasks.
1 code implementation • 28 Feb 2024 • Han Guo, Ramtin Hosseini, Ruiyi Zhang, Sai Ashish Somayajula, Ranak Roy Chowdhury, Rajesh K. Gupta, Pengtao Xie
Masked Autoencoder (MAE) is a notable method for self-supervised pretraining in visual representation learning.
no code implementations • 4 Jan 2024 • Ke Li, Han Guo
The learned preference information is used to progressively guide policy optimization towards policies of interest.
1 code implementation • 20 Nov 2023 • Han Guo, Philip Greengard, Eric P. Xing, Yoon Kim
Our approach uses an iterative algorithm to decompose each pretrained matrix into a high-precision low-rank component and a memory-efficient quantized component.
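A minimal sketch of such an alternating decomposition W ≈ Q + L (the rank and the round-to-nearest quantizer here are placeholders; the paper's algorithm and quantizer are more involved):

```python
import numpy as np

def quantize_rtn(w, bits=4):
    # Placeholder quantizer: symmetric round-to-nearest.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax + 1e-12
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def lowrank_plus_quant(W, rank=8, iters=10, bits=4):
    # Alternate between (1) quantizing what the low-rank part does not
    # explain and (2) refitting a rank-r component to the residual via SVD.
    L = np.zeros_like(W)
    for _ in range(iters):
        Q = quantize_rtn(W - L, bits)               # quantized component
        U, s, Vt = np.linalg.svd(W - Q, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # best rank-r fit
    return Q, L

W = np.random.randn(64, 64)
Q, L = lowrank_plus_quant(W)
print("relative error:", np.linalg.norm(W - Q - L) / np.linalg.norm(W))
```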
no code implementations • 19 May 2023 • Xingyu Bai, Taiqiang Wu, Han Guo, Zhe Zhao, Xuefeng Yang, Jiayi Li, Weijie Liu, Qi Ju, Weigang Guo, Yujiu Yang
Event Extraction (EE), aiming to identify and classify event triggers and arguments from event mentions, has benefited from pre-trained language models (PLMs).
1 code implementation • 8 Feb 2023 • Han Guo, Philip Greengard, Hongyi Wang, Andrew Gelman, Yoon Kim, Eric P. Xing
A recent alternative formulation instead treats federated learning as a distributed inference problem, where the goal is to infer a global posterior from partitioned client data (Al-Shedivat et al., 2021).
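Under a diagonal-Gaussian approximation this inference view has a closed form: the global posterior is proportional to the product of the client posteriors. A toy sketch (my simplification; it ignores the prior correction needed when every client includes the prior):

```python
import numpy as np

def aggregate_gaussian_posteriors(means, variances):
    # Product of Gaussians: global precision = sum of client precisions;
    # global mean = precision-weighted average of client means.
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, means)) / total
    return mean, 1.0 / total

means = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.9, 2.2])]
variances = [np.array([0.5, 0.5]), np.array([0.1, 1.0]), np.array([1.0, 0.1])]
mu, var = aggregate_gaussian_posteriors(means, variances)
print(mu, var)   # confident clients (small variance) dominate per coordinate
```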
3 code implementations • 13 Dec 2022 • Zhe Zhao, Yudong Li, Cheng Hou, Jing Zhao, Rong Tian, Weijie Liu, Yiren Chen, Ningyuan Sun, Haoyan Liu, Weiquan Mao, Han Guo, Weigang Guo, Taiqiang Wu, Tao Zhu, Wenhang Shi, Chen Chen, Shan Huang, Sihong Chen, Liqun Liu, Feifei Li, Xiaoshuai Chen, Xingwu Sun, Zhanhui Kang, Xiaoyong Du, Linlin Shen, Kimmo Yan
Pre-training models proposed for different modalities show a rising trend of homogeneity in their model structures, which opens the opportunity to implement different pre-training models within a uniform framework.
1 code implementation • 2 Nov 2022 • Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang
Through extensive evaluations, we show that MPCFormer significantly speeds up Transformer inference in MPC settings while achieving ML performance similar to that of the input model.
1 code implementation • 25 May 2022 • Mingkai Deng, Jianyu Wang, Cheng-Ping Hsieh, Yihan Wang, Han Guo, Tianmin Shu, Meng Song, Eric P. Xing, Zhiting Hu
RLPrompt formulates a parameter-efficient policy network that generates the desired discrete prompt after training with reward.
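A minimal sketch of the reward-driven loop this implies, using plain REINFORCE and a toy reward (the actual RLPrompt objective and reward stabilization are more elaborate):

```python
import torch

vocab, prompt_len = 50, 5
logits = torch.zeros(prompt_len, vocab, requires_grad=True)   # toy "policy"
opt = torch.optim.Adam([logits], lr=0.1)

def reward(tokens):
    # Hypothetical black-box reward, e.g. downstream task accuracy.
    return (tokens == 7).float().mean()

for step in range(200):
    dist = torch.distributions.Categorical(logits=logits)
    tokens = dist.sample()                          # sample a discrete prompt
    loss = -dist.log_prob(tokens).sum() * reward(tokens)   # REINFORCE
    opt.zero_grad(); loss.backward(); opt.step()
```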
no code implementations • 21 Feb 2022 • Jinjiang Guo, Qi Liu, Han Guo, Xi Lu
Robust and efficient interpretation of QSAR methods is useful for validating AI prediction rationales against expert opinion (chemist or biologist expertise), understanding sophisticated chemical or biological mechanisms, and providing heuristic ideas for structure optimization in the pharmaceutical industry.
no code implementations • 29 Sep 2021 • Han Guo, Bowen Tan, Zhengzhong Liu, Eric Xing, Zhiting Hu
We apply the approach to a wide range of text generation tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
1 code implementation • 21 Sep 2021 • Youngseog Chung, Ian Char, Han Guo, Jeff Schneider, Willie Neiswanger
With increasing deployment of machine learning systems in various real-world tasks, there is a greater need for accurate quantification of predictive uncertainty.
1 code implementation • 14 Jun 2021 • Han Guo, Bowen Tan, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
We apply the approach to a wide range of novel text generation tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
1 code implementation • CVPR 2022 • Jun Chen, Han Guo, Kai Yi, Boyang Li, Mohamed Elhoseiny
To the best of our knowledge, this is the first work to improve the data efficiency of image captioning by utilizing an LM pretrained on unimodal data.
1 code implementation • ICLR 2021 • Yonggan Fu, Han Guo, Meng Li, Xin Yang, Yining Ding, Vikas Chandra, Yingyan Celine Lin
In this paper, we attempt to explore low-precision training from a new perspective as inspired by recent findings in understanding DNN training: we conjecture that DNNs' precision might have a similar effect as the learning rate during DNN training, and advocate dynamic precision along the training trajectory for further boosting the time/energy efficiency of DNN training.
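A toy sketch of the analogy: fake-quantize to a bit-width that increases over training, the way a learning rate starts large and decays (the schedule values are illustrative):

```python
import numpy as np

def fake_quantize(w, bits):
    # Simulate low-precision arithmetic by a quantize-dequantize round trip.
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax + 1e-12
    return np.clip(np.round(w / scale), -qmax, qmax) * scale

def precision_at(step, total_steps, schedule=(4, 6, 8)):
    # Start coarse (cheap, noisy -- like a large learning rate) and end fine.
    phase = int(len(schedule) * step / total_steps)
    return schedule[min(phase, len(schedule) - 1)]

for step in [0, 400, 900]:
    print(step, "->", precision_at(step, 1000), "bits")
```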
1 code implementation • EMNLP 2021 • Han Guo, Nazneen Fatema Rajani, Peter Hase, Mohit Bansal, Caiming Xiong
With the availability of the fast influence functions, we demonstrate their usefulness in four applications.
no code implementations • EMNLP 2020 • Ramakanth Pasunuru, Han Guo, Mohit Bansal
Further, it is important to consider using a dynamic combination and curriculum of metric rewards that flexibly changes over time.
no code implementations • 8 Oct 2020 • Claire E. Dickerson, Han Guo, Ashley J. Shin, Benjamin L. Augenbraun, Justin R. Caram, Wesley C. Campbell, Anastassia N. Alexandrova
Laser induced electronic excitations that spontaneously emit photons and decay directly to the initial ground state ("optical cycling transitions") are used in quantum information and precision measurement for state initialization and readout.
no code implementations • 13 Jan 2020 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
Next, we develop a DistanceNet model which uses these distance measures, or a mixture of these distance measures, as an additional loss function to be minimized jointly with the task's loss function, so as to achieve better unsupervised domain adaptation.
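A minimal sketch of that joint objective, with a mean-feature (linear-kernel MMD) distance standing in for DistanceNet's menu of distance measures:

```python
import torch

def distance_loss(src_feats, tgt_feats):
    # Simplest domain distance: gap between mean features across domains.
    return (src_feats.mean(0) - tgt_feats.mean(0)).pow(2).sum()

def joint_loss(task_loss, src_feats, tgt_feats, lam=0.1):
    # Minimize the task loss and the domain distance together, pulling
    # source and target representations toward each other.
    return task_loss + lam * distance_loss(src_feats, tgt_feats)
```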
no code implementations • NAACL 2019 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
To address these issues, we present AutoSeM, a two-stage MTL pipeline, where the first stage automatically selects the most useful auxiliary tasks via a Beta-Bernoulli multi-armed bandit with Thompson Sampling, and the second stage learns the training mixing ratio of these selected auxiliary tasks via a Gaussian Process based Bayesian optimization framework.
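The first stage is easy to make concrete; a minimal Beta-Bernoulli Thompson-sampling loop (the helped() feedback is a hypothetical stand-in for AutoSeM's actual utility signal):

```python
import numpy as np

n_tasks = 5
alpha, beta = np.ones(n_tasks), np.ones(n_tasks)    # Beta(1, 1) priors

def helped(task):
    # Hypothetical feedback: did a step of this auxiliary task improve
    # primary-task validation performance?
    return np.random.rand() < [0.2, 0.7, 0.4, 0.1, 0.5][task]

for _ in range(1000):
    samples = np.random.beta(alpha, beta)           # Thompson sampling
    task = int(np.argmax(samples))                  # most promising task
    if helped(task):
        alpha[task] += 1                            # Bernoulli success
    else:
        beta[task] += 1
print("posterior means:", alpha / (alpha + beta))   # task 1 should dominate
```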
no code implementations • COLING 2018 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
In this work, we first present a strong pointer-copy mechanism based sequence-to-sequence sentence simplification model, and then improve its entailment and paraphrasing capabilities via multi-task learning with related auxiliary tasks of entailment and paraphrase generation.
Ranked #2 on Text Simplification on Newsela
no code implementations • ACL 2018 • Han Guo, Ramakanth Pasunuru, Mohit Bansal
An accurate abstractive summary of a document should contain all its salient information and should be logically entailed by the input document.
Ranked #33 on Text Summarization on GigaWord
no code implementations • Mountain View, CA, USA 2017 • Zhiwei Jin, Juan Cao, Han Guo, Yongdong Zhang
In this paper, we propose a novel Recurrent Neural Network with an attention mechanism (att-RNN) to fuse multimodal features for effective rumor detection.
no code implementations • 5 Oct 2017 • Han Guo, Namrata Vaswani
Video denoising refers to the problem of removing "noise" from a video sequence.
no code implementations • WS 2017 • Ramakanth Pasunuru, Han Guo, Mohit Bansal
Abstractive summarization, the task of rewriting and compressing a document into a short summary, has achieved considerable success with neural sequence-to-sequence models.
no code implementations • 8 Mar 2017 • Tianran Hu, Han Guo, Hao Sun, Thuy-vy Thi Nguyen, Jiebo Luo
Second, from the perspective of message recipients, we further study the sentiment effects of emojis, as well as their duplications, on verbal messages.
no code implementations • NeurIPS 2016 • Namrata Vaswani, Han Guo
We obtain a correctness result for the standard eigenvalue decomposition (EVD) based solution to PCA under simple assumptions on the data-noise correlation.
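For reference, the EVD-based PCA solution being analyzed is the textbook one; a minimal version:

```python
import numpy as np

def pca_evd(X, k):
    # PCA via eigenvalue decomposition of the empirical covariance.
    Xc = X - X.mean(axis=0)                         # center the data
    C = Xc.T @ Xc / len(X)                          # empirical covariance
    eigvals, eigvecs = np.linalg.eigh(C)            # ascending eigenvalues
    return eigvecs[:, ::-1][:, :k]                  # top-k principal axes

X = np.random.randn(500, 10) @ np.random.randn(10, 10)
print(pca_evd(X, 3).shape)                          # (10, 3)
```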