Search Results for author: Insu Han

Found 15 papers, 12 papers with code

SubGen: Token Generation in Sublinear Time and Memory

no code implementations · 8 Feb 2024 · Amir Zandieh, Insu Han, Vahab Mirrokni, Amin Karbasi

In this work, our focus is on developing an efficient compression technique for the KV cache.

Clustering · Online Clustering +1

HyperAttention: Long-context Attention in Near-Linear Time

1 code implementation · 9 Oct 2023 · Insu Han, Rajesh Jayaram, Amin Karbasi, Vahab Mirrokni, David P. Woodruff, Amir Zandieh

Recent work suggests that in the worst-case scenario, quadratic time is necessary unless the entries of the attention matrix are bounded or the matrix has low stable rank.
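
For reference, the "stable rank" in this excerpt is a standard quantity rather than something defined in the paper: for a nonzero matrix A,

```latex
\operatorname{srank}(A) \;=\; \frac{\|A\|_F^2}{\|A\|_2^2}
\;=\; \frac{\sum_i \sigma_i^2(A)}{\sigma_{\max}^2(A)}
\;\le\; \operatorname{rank}(A),
```

so a low stable rank means the spectrum is dominated by a few large singular values, which is the regime in which subquadratic approximation becomes possible.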

KDEformer: Accelerating Transformers via Kernel Density Estimation

1 code implementation · 5 Feb 2023 · Amir Zandieh, Insu Han, Majid Daliri, Amin Karbasi

The dot-product attention mechanism plays a crucial role in modern deep architectures (e.g., the Transformer) for sequence modeling; however, naïve exact computation of this model incurs quadratic time and memory complexity in the sequence length, hindering the training of long-sequence models.

Density Estimation · Image Generation
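
As context for the quadratic cost referenced above, exact dot-product attention (the generic formulation, not this paper's approximation) materializes an n × n score matrix:

```latex
\operatorname{Attention}(Q, K, V)
\;=\; \operatorname{softmax}\!\Big(\tfrac{1}{\sqrt{d}}\, Q K^{\top}\Big)\, V,
\qquad Q, K, V \in \mathbb{R}^{n \times d},
```

so forming and storing QK^T already takes Θ(n²d) time and Θ(n²) memory in the sequence length n, which is the bottleneck that the KDE-based approximation targets.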

Fast Neural Kernel Embeddings for General Activations

2 code implementations · 9 Sep 2022 · Insu Han, Amir Zandieh, Jaehoon Lee, Roman Novak, Lechao Xiao, Amin Karbasi

Moreover, most prior works on neural kernels have focused on the ReLU activation, mainly due to its popularity but also due to the difficulty of computing such kernels for general activations.

Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes

1 code implementation · 1 Jul 2022 · Insu Han, Mike Gartrell, Elvis Dohmatob, Amin Karbasi

In this work, we develop a scalable MCMC sampling algorithm for $k$-NDPPs with low-rank kernels, thus enabling runtime that is sublinear in $n$.

Point Processes
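
For orientation, the classical exchange (swap) Metropolis chain for a cardinality-k DPP proposes replacing one selected item and accepts with a determinant ratio. The sketch below is this naive baseline with a dense kernel, assuming a NumPy matrix L whose principal minors are nonnegative; it is not the paper's low-rank, sublinear-in-n algorithm.

```python
import numpy as np

def kdpp_swap_mcmc(L, k, num_steps=1000, rng=None):
    """Classical swap/exchange Metropolis chain targeting Pr(S) proportional to
    det(L_S) over subsets of size k. Each step costs O(k^3) here; the paper above
    develops a low-rank variant whose runtime is sublinear in n."""
    rng = np.random.default_rng(rng)
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))      # initial size-k subset
    det_S = np.linalg.det(L[np.ix_(S, S)])
    for _ in range(num_steps):
        out_pos = rng.integers(k)                        # position to swap out
        new_item = rng.integers(n)                       # candidate item to swap in
        if new_item in S:
            continue                                     # counts as a rejected proposal
        T = S.copy()
        T[out_pos] = new_item
        det_T = np.linalg.det(L[np.ix_(T, T)])
        # Metropolis acceptance with ratio det(L_T) / det(L_S)
        accept = min(1.0, det_T / det_S) if det_S > 0 else 1.0
        if rng.random() < accept:
            S, det_S = T, det_T
    return S
```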

Random Gegenbauer Features for Scalable Kernel Methods

no code implementations · 7 Feb 2022 · Insu Han, Amir Zandieh, Haim Avron

Our proposed GZK family generalizes zonal kernels (i.e., dot-product kernels on the unit sphere) by introducing radial factors in their Gegenbauer series expansion, and it includes a wide range of ubiquitous kernel functions, such as the entire class of dot-product kernels as well as the Gaussian and the recently introduced Neural Tangent kernels.
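
For background, a zonal (dot-product) kernel on the unit sphere admits a Gegenbauer expansion with nonnegative coefficients (Schoenberg); with λ = (d − 2)/2,

```latex
k(x, y) \;=\; \sum_{n \ge 0} b_n \, C_n^{(\lambda)}\!\big(\langle x, y \rangle\big),
\qquad \|x\| = \|y\| = 1, \; b_n \ge 0 .
```

Per the excerpt, the GZK family replaces the scalar coefficients b_n with radial factors depending on the norms of x and y, which is what extends the construction beyond the sphere to kernels such as the Gaussian and the NTK (the precise form is the paper's and is not reproduced here).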

Scaling Neural Tangent Kernels via Sketching and Random Features

1 code implementation · NeurIPS 2021 · Amir Zandieh, Insu Han, Haim Avron, Neta Shoham, Chaewon Kim, Jinwoo Shin

To accelerate learning with the NTK, we design a near input-sparsity time approximation algorithm for the NTK by sketching the polynomial expansions of arc-cosine kernels; our sketch for the convolutional counterpart of the NTK (CNTK) can transform any image in time linear in the number of pixels.

Regression
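
For intuition about the arc-cosine kernels mentioned above, the standard Cho–Saul random-feature construction (not the paper's sketching algorithm) approximates the order-1 arc-cosine kernel with ReLU features of Gaussian projections; a minimal NumPy sketch:

```python
import numpy as np

def arccos1_features(X, num_features=4096, rng=None):
    """Random features whose inner products approximate the order-1 arc-cosine
    kernel k_1(x, y) = (1/pi) * ||x|| * ||y|| * (sin t + (pi - t) cos t), where t
    is the angle between x and y (Cho & Saul). Not the paper's sketch."""
    rng = np.random.default_rng(rng)
    d = X.shape[1]
    W = rng.standard_normal((d, num_features))                   # Gaussian projections
    return np.sqrt(2.0 / num_features) * np.maximum(X @ W, 0.0)  # scaled ReLU features

def arccos1_exact(x, y):
    """Closed form of the order-1 arc-cosine kernel, for comparison."""
    t = np.arccos(np.clip(x @ y / (np.linalg.norm(x) * np.linalg.norm(y)), -1.0, 1.0))
    return np.linalg.norm(x) * np.linalg.norm(y) * (np.sin(t) + (np.pi - t) * np.cos(t)) / np.pi

# The two printed values should agree closely for a large number of features.
x, y = np.random.default_rng(0).standard_normal((2, 16))
Phi = arccos1_features(np.stack([x, y]), num_features=200_000, rng=0)
print(Phi[0] @ Phi[1], arccos1_exact(x, y))
```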

Random Features for the Neural Tangent Kernel

no code implementations · 3 Apr 2021 · Insu Han, Haim Avron, Neta Shoham, Chaewon Kim, Jinwoo Shin

We combine random features of the arc-cosine kernels with a sketching-based algorithm that runs in time linear in both the number of data points and the input dimension.
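
For context, the fully connected NTK that these random features target is defined by a standard layer-wise recursion (the notation below is the common one from the NTK literature, not necessarily the paper's); for ReLU activations the two conditional expectations reduce to arc-cosine kernels of order 1 and 0:

```latex
\Sigma^{(1)}(x, x') = \Theta^{(1)}(x, x') = x^{\top} x', \qquad
\Lambda^{(\ell)}(x, x') = \begin{pmatrix}
  \Sigma^{(\ell)}(x, x)  & \Sigma^{(\ell)}(x, x') \\
  \Sigma^{(\ell)}(x', x) & \Sigma^{(\ell)}(x', x')
\end{pmatrix},
```
```latex
\Sigma^{(\ell+1)}(x, x') = \mathbb{E}_{(u, v) \sim \mathcal{N}(0, \Lambda^{(\ell)})}\!\big[\sigma(u)\,\sigma(v)\big], \quad
\dot{\Sigma}^{(\ell+1)}(x, x') = \mathbb{E}\big[\sigma'(u)\,\sigma'(v)\big], \quad
\Theta^{(\ell+1)} = \Sigma^{(\ell+1)} + \Theta^{(\ell)} \cdot \dot{\Sigma}^{(\ell+1)} .
```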

Scalable Learning and MAP Inference for Nonsymmetric Determinantal Point Processes

2 code implementations · ICLR 2021 · Mike Gartrell, Insu Han, Elvis Dohmatob, Jennifer Gillenwater, Victor-Emmanuel Brunel

Determinantal point processes (DPPs) have attracted significant attention in machine learning for their ability to model subsets drawn from a large item collection.

Point Processes
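
For reference, an L-ensemble DPP assigns every subset S of the n items a probability proportional to a principal minor of the kernel; nonsymmetric DPPs keep the same form but allow a nonsymmetric L whose principal minors are nonnegative:

```latex
\Pr(Y = S) \;=\; \frac{\det(L_S)}{\det(L + I)},
```

where L_S is the |S| × |S| submatrix of L indexed by S, and det(L + I) = Σ_S det(L_S) is the normalizer.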

Stochastic Chebyshev Gradient Descent for Spectral Optimization

1 code implementation · NeurIPS 2018 · Insu Han, Haim Avron, Jinwoo Shin

A large class of machine learning techniques requires the solution of optimization problems involving spectral functions of parametric matrices, e.g., the log-determinant and the nuclear norm.
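
To make "spectral function" concrete (standard identities, not specific to this paper): for a symmetric parametric matrix A(θ) with eigenvalues λ_i(θ),

```latex
\operatorname{tr} f\big(A(\theta)\big) = \sum_i f\big(\lambda_i(\theta)\big),
\qquad
\log\det A(\theta) = \operatorname{tr}\log A(\theta) \;\; (A(\theta) \succ 0),
\qquad
\|A(\theta)\|_* = \operatorname{tr}\big(A(\theta)^{\top} A(\theta)\big)^{1/2},
```

so both running examples from the excerpt are traces of matrix functions, which is what makes Chebyshev-based stochastic estimators applicable.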

Faster Greedy MAP Inference for Determinantal Point Processes

1 code implementation · ICML 2017 · Insu Han, Prabhanjan Kambadur, KyoungSoo Park, Jinwoo Shin

Determinantal point processes (DPPs) are popular probabilistic models that arise in many machine learning tasks, where distributions of diverse sets are characterized by matrix determinants.

Point Processes
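
As a baseline for what MAP inference means here, greedy selection builds the set S one item at a time by maximizing det(L_S); the naive sketch below recomputes determinants from scratch, which is exactly the cost the paper is about reducing (its faster updates are not reproduced here).

```python
import numpy as np

def greedy_map_dpp(L, k):
    """Naive greedy MAP inference for a DPP with PSD kernel L: at each step, add
    the item that maximizes det(L_S) of the selected set S. Roughly O(n k^4)
    overall as written; the paper above is about making this greedy loop faster."""
    n = L.shape[0]
    selected = []
    for _ in range(k):
        best_item, best_logdet = None, -np.inf
        for i in range(n):
            if i in selected:
                continue
            idx = selected + [i]
            sign, logdet = np.linalg.slogdet(L[np.ix_(idx, idx)])
            if sign > 0 and logdet > best_logdet:
                best_logdet, best_item = logdet, i
        if best_item is None:        # no candidate keeps the determinant positive
            break
        selected.append(best_item)
    return selected
```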

Approximating the Spectral Sums of Large-scale Matrices using Chebyshev Approximations

1 code implementation · 3 Jun 2016 · Insu Han, Dmitry Malioutov, Haim Avron, Jinwoo Shin

Computation of the trace of a matrix function plays an important role in many scientific computing applications, including machine learning, computational physics (e.g., lattice quantum chromodynamics), network analysis, and computational biology (e.g., protein folding).

Data Structures and Algorithms
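
A minimal sketch of the underlying recipe discussed above: Hutchinson's stochastic trace estimator combined with a Chebyshev interpolant of f, assuming a symmetric A with known eigenvalue bounds. This is a simplified illustration without the paper's error analysis.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebfit

def trace_of_matrix_function(A, f, lam_min, lam_max, degree=30, num_probes=20, rng=None):
    """Estimate tr(f(A)) for a symmetric A with eigenvalues in [lam_min, lam_max],
    via Hutchinson's estimator with Rademacher probes and a degree-`degree`
    Chebyshev interpolant of f. Simplified sketch; no adaptive error control."""
    rng = np.random.default_rng(rng)
    n = A.shape[0]

    # Chebyshev interpolation of f on [lam_min, lam_max], expressed on [-1, 1].
    nodes = np.cos(np.pi * (np.arange(degree + 1) + 0.5) / (degree + 1))
    scaled = 0.5 * (lam_max - lam_min) * nodes + 0.5 * (lam_max + lam_min)
    coeffs = chebfit(nodes, f(scaled), degree)

    # Affine map B = alpha*A - beta*I sends the spectrum of A into [-1, 1].
    alpha = 2.0 / (lam_max - lam_min)
    beta = (lam_max + lam_min) / (lam_max - lam_min)

    total = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)                     # Rademacher probe vector
        t_prev, t_curr = v, alpha * (A @ v) - beta * v          # T_0(B) v and T_1(B) v
        acc = coeffs[0] * t_prev + coeffs[1] * t_curr
        for j in range(2, degree + 1):
            # Three-term recurrence: T_{j}(B) v = 2 B T_{j-1}(B) v - T_{j-2}(B) v
            t_next = 2.0 * (alpha * (A @ t_curr) - beta * t_curr) - t_prev
            acc += coeffs[j] * t_next
            t_prev, t_curr = t_curr, t_next
        total += v @ acc                                        # approximately v^T f(A) v
    return total / num_probes
```

For example, f = np.log with a positive definite A and lam_min > 0 estimates log det A = tr(log A), while the identity function recovers the plain Hutchinson estimator of tr(A).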

Large-scale Log-determinant Computation through Stochastic Chebyshev Expansions

1 code implementation · 22 Mar 2015 · Insu Han, Dmitry Malioutov, Jinwoo Shin

Logarithms of determinants of large positive definite matrices appear ubiquitously in machine learning applications including Gaussian graphical and Gaussian process models, partition functions of discrete graphical models, minimum-volume ellipsoids, metric learning and kernel learning.

Metric Learning
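
One concrete instance of why these log-determinants appear (a standard Gaussian identity, not specific to this paper): for a Gaussian model with an n × n precision matrix J, the log-partition function is

```latex
\log Z \;=\; \frac{n}{2}\log(2\pi) \;-\; \frac{1}{2}\log\det J,
```

so estimating log det J = tr(log J) with a stochastic Chebyshev expansion (e.g., f = log in the sketch above) makes the normalizer tractable when J is large and sparse.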
