no code implementations • ICML 2020 • Yangsibo Huang, Zhao Song, Sanjeev Arora, Kai Li
The new ideas in the current paper are: (a) new variants of mixup with negative as well as positive coefficients, and an extension of sample-wise mixup to pixel-wise mixup.
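A minimal NumPy sketch of what pixel-wise mixup with both negative and positive coefficients could look like; the function name and the specific way of drawing weights and signs are illustrative assumptions, not the paper's implementation.

import numpy as np

def pixelwise_signed_mixup(images, k=4, rng=np.random.default_rng(0)):
    # Mix each image with k-1 randomly chosen partners, using per-pixel
    # convex weights whose signs are flipped at random (illustrative sketch).
    n, h, w, c = images.shape
    mixed = np.empty(images.shape, dtype=float)
    for i in range(n):
        partners = rng.choice(n, size=k - 1, replace=False)
        members = np.concatenate(([i], partners))
        weights = rng.dirichlet(np.ones(k), size=(h, w))   # shape (h, w, k)
        signs = rng.choice([-1.0, 1.0], size=(h, w, k))
        coeffs = weights * signs
        stack = images[members].transpose(1, 2, 3, 0)      # shape (h, w, c, k)
        mixed[i] = np.einsum('hwck,hwk->hwc', stack, coeffs)
    return mixed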
1 code implementation • 11 Feb 2025 • Yong Lin, Shange Tang, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin
On the miniF2F benchmark, it achieves a 57.6% success rate (Pass@32), exceeding the previous best open-source model by 7.6%.
no code implementations • 5 Feb 2025 • Yikai Wu, Haoyu Zhao, Sanjeev Arora
Some GPU-based methods even perform similarly to the simplest heuristic, degree-based greedy.
1 code implementation • 5 Jan 2025 • Simon Park, Abhishek Panigrahi, Yun Cheng, Dingli Yu, Anirudh Goyal, Sanjeev Arora
We seek strategies for training on the SIMPLE version of the tasks that improve performance on the corresponding HARD task, i.e., S2H generalization.
no code implementations • 19 Nov 2024 • Stanley Wei, Sadhika Malladi, Sanjeev Arora, Amartya Sanyal
Machine unlearning algorithms are increasingly important as legal concerns arise around the provenance of training data, but verifying the success of unlearning is often difficult.
1 code implementation • 11 Oct 2024 • Noam Razin, Sadhika Malladi, Adithya Bhaskar, Danqi Chen, Sanjeev Arora, Boris Hanin
Direct Preference Optimization (DPO) and its variants are increasingly used for aligning language models with human preferences.
no code implementations • 29 Sep 2024 • Haoyu Zhao, Simran Kaur, Dingli Yu, Anirudh Goyal, Sanjeev Arora
(2) When skill categories are split into training and held-out groups, models significantly improve at composing texts with held-out skills during testing despite having only seen training skills during fine-tuning, illustrating the efficacy of the training approach even with previously unseen skills.
no code implementations • 27 Aug 2024 • Simran Kaur, Simon Park, Anirudh Goyal, Sanjeev Arora
Vanilla SFT (i.e., no PPO, DPO, or RL methods) on data generated from Instruct-SkillMix leads to strong gains on instruction-following benchmarks such as AlpacaEval 2.0, MT-Bench, and WildBench.
no code implementations • 26 Aug 2024 • Xindi Wu, Dingli Yu, Yangsibo Huang, Olga Russakovsky, Sanjeev Arora
Compositionality is a critical capability in Text-to-Image (T2I) models, as it reflects their ability to understand and combine multiple concepts from text descriptions.
1 code implementation • 30 Jul 2024 • Vedant Shah, Dingli Yu, Kaifeng Lyu, Simon Park, Jiatong Yu, Yinghui He, Nan Rosemary Ke, Michael Mozer, Yoshua Bengio, Sanjeev Arora, Anirudh Goyal
We present a design framework that combines the strengths of LLMs with a human-in-the-loop approach to generate a diverse array of challenging math questions.
1 code implementation • 26 Jun 2024 • ZiRui Wang, Mengzhou Xia, Luxi He, Howard Chen, Yitao Liu, Richard Zhu, Kaiqu Liang, Xindi Wu, Haotian Liu, Sadhika Malladi, Alexis Chevalier, Sanjeev Arora, Danqi Chen
All models lag far behind human performance of 80.5%, underscoring weaknesses in the chart understanding capabilities of existing MLLMs.
no code implementations • 20 May 2024 • Aniket Didolkar, Anirudh Goyal, Nan Rosemary Ke, Siyuan Guo, Michal Valko, Timothy Lillicrap, Danilo Rezende, Yoshua Bengio, Michael Mozer, Sanjeev Arora
(b) When using an LLM to solve the test questions, we present it with the full list of skill labels and ask it to identify the skill needed.
1 code implementation • 28 Feb 2024 • Kaifeng Lyu, Haoyu Zhao, Xinran Gu, Dingli Yu, Anirudh Goyal, Sanjeev Arora
Public LLMs such as Llama 2-Chat underwent alignment training and were considered safe.
1 code implementation • 16 Feb 2024 • Alexis Chevalier, Jiayi Geng, Alexander Wettig, Howard Chen, Sebastian Mizera, Toni Annala, Max Jameson Aragon, Arturo Rodríguez Fanlo, Simon Frieder, Simon Machado, Akshara Prabhakar, Ellie Thieu, Jiachen T. Wang, ZiRui Wang, Xindi Wu, Mengzhou Xia, Wenhan Xia, Jiatong Yu, Jun-Jie Zhu, Zhiyong Jason Ren, Sanjeev Arora, Danqi Chen
We use TutorChat to fine-tune Llemma models with 7B and 34B parameters.
2 code implementations • 6 Feb 2024 • Mengzhou Xia, Sadhika Malladi, Suchin Gururangan, Sanjeev Arora, Danqi Chen
Instruction tuning has unlocked powerful capabilities in large language models (LLMs), effectively using combined datasets to develop general-purpose chatbots.
no code implementations • 26 Nov 2023 • Vedant Shah, Frederik Träuble, Ashish Malik, Hugo Larochelle, Michael Mozer, Sanjeev Arora, Yoshua Bengio, Anirudh Goyal
Machine unlearning, which involves erasing knowledge about a "forget set" from a trained model, can prove to be costly and infeasible using existing techniques.
no code implementations • 26 Oct 2023 • Dingli Yu, Simran Kaur, Arushi Gupta, Jonah Brown-Cohen, Anirudh Goyal, Sanjeev Arora
The paper develops a methodology for (a) designing and administering such an evaluation, and (b) automatic grading (plus spot-checking by humans) of the results using GPT-4 as well as the open LLaMA-2 70B model.
1 code implementation • 22 Oct 2023 • Xinran Gu, Kaifeng Lyu, Sanjeev Arora, Jingzhao Zhang, Longbo Huang
In distributed deep learning with data parallelism, synchronizing gradients at each training step can cause a huge communication overhead, especially when many nodes work together to train large models.
no code implementations • 29 Jul 2023 • Sanjeev Arora, Anirudh Goyal
Contributions include: (a) A statistical framework that relates cross-entropy loss of LLMs to competence on the basic skills that underlie language tasks.
1 code implementation • 3 Jul 2023 • Abhishek Panigrahi, Sadhika Malladi, Mengzhou Xia, Sanjeev Arora
In this work, we propose an efficient construction, Transformer in Transformer (in short, TinT), that allows a transformer to simulate and fine-tune complex models internally during inference (e.g., pre-trained language models).
2 code implementations • NeurIPS 2023 • Sadhika Malladi, Tianyu Gao, Eshaan Nichani, Alex Damian, Jason D. Lee, Danqi Chen, Sanjeev Arora
Fine-tuning language models (LMs) has yielded success on diverse downstream tasks, but as LMs grow in size, backpropagation requires a prohibitively large amount of memory.
no code implementations • 14 Mar 2023 • Haoyu Zhao, Abhishek Panigrahi, Rong Ge, Sanjeev Arora
We also show that the Inside-Outside algorithm is optimal for masked language modeling loss on the PCFG-generated data.
1 code implementation • 2 Mar 2023 • Xinran Gu, Kaifeng Lyu, Longbo Huang, Sanjeev Arora
Local SGD is a communication-efficient variant of SGD for large-scale training, where multiple GPUs perform SGD independently and average the model parameters periodically.
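A single-process toy sketch of the periodic-averaging idea behind Local SGD; the worker count, step counts, and the quadratic example below are arbitrary assumptions for illustration.

import numpy as np

def local_sgd(grad, w0, num_workers=4, local_steps=8, rounds=50, lr=0.1, seed=0):
    # Each worker runs `local_steps` of noisy SGD from the shared parameters,
    # then the parameters are averaged (one communication per round).
    rng = np.random.default_rng(seed)
    w = w0.astype(float).copy()
    for _ in range(rounds):
        local = []
        for _ in range(num_workers):
            wk = w.copy()
            for _ in range(local_steps):
                noise = rng.normal(scale=0.1, size=w.shape)  # stochastic gradient noise
                wk -= lr * (grad(wk) + noise)
            local.append(wk)
        w = np.mean(local, axis=0)  # periodic parameter averaging
    return w

# Toy usage: minimize the quadratic 0.5 * ||w||^2, whose gradient is w.
w_final = local_sgd(grad=lambda w: w, w0=np.ones(5))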
1 code implementation • 13 Feb 2023 • Abhishek Panigrahi, Nikunj Saunshi, Haoyu Zhao, Sanjeev Arora
Given the downstream task and a model fine-tuned on that task, a simple optimization is used to identify a very small subset of parameters ($\sim 0.01\%$ of model parameters) responsible for more than 95% of the model's performance, in the sense that grafting the fine-tuned values for just this tiny subset onto the pre-trained model gives performance almost as good as that of the fine-tuned model.
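A minimal sketch of the grafting step itself; the selection of the mask is the interesting part of the method and is not shown, and the names and toy sizes below are assumptions for illustration.

import numpy as np

def graft(pretrained, finetuned, mask):
    # For parameters selected by the binary mask, take the fine-tuned values;
    # everywhere else, keep the pre-trained values.
    return {name: np.where(mask[name], finetuned[name], pretrained[name])
            for name in pretrained}

# Toy usage: graft roughly 0.01% of the entries of a single weight matrix.
rng = np.random.default_rng(0)
pre = {"w": rng.normal(size=(1000, 1000))}
fin = {"w": pre["w"] + rng.normal(scale=0.01, size=(1000, 1000))}
m = {"w": rng.random((1000, 1000)) < 1e-4}
grafted = graft(pre, fin, m)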
1 code implementation • 5 Nov 2022 • Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
Saliency methods compute heat maps that highlight portions of an input that were most important for the label assigned to it by a deep net.
1 code implementation • 11 Oct 2022 • Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora
It has become standard to solve NLP tasks by fine-tuning pre-trained language models (LMs), especially in low-data settings.
no code implementations • 3 Oct 2022 • Nikunj Saunshi, Arushi Gupta, Mark Braverman, Sanjeev Arora
Influence functions estimate the effect of individual data points on a model's predictions on test data and were adapted to deep learning by Koh and Liang [2017].
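For reference, in the formulation of Koh and Liang [2017] the influence of up-weighting a training point $z$ on the loss at a test point $z_{\text{test}}$ is commonly written as $\mathcal{I}(z, z_{\text{test}}) = -\nabla_\theta L(z_{\text{test}}, \hat\theta)^{\top} H_{\hat\theta}^{-1} \nabla_\theta L(z, \hat\theta)$, where $\hat\theta$ are the trained parameters and $H_{\hat\theta}$ is the Hessian of the empirical risk.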
no code implementations • 8 Jul 2022 • Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora
Conversely, continuous mirror descent with any Legendre function can be viewed as gradient flow with a related commuting parametrization.
no code implementations • 14 Jun 2022 • Kaifeng Lyu, Zhiyuan Li, Sanjeev Arora
Normalization layers (e.g., Batch Normalization, Layer Normalization) were introduced to help with optimization difficulties in very deep nets, but they clearly also help generalization, even in not-so-deep nets.
1 code implementation • 20 May 2022 • Sadhika Malladi, Kaifeng Lyu, Abhishek Panigrahi, Sanjeev Arora
Approximating Stochastic Gradient Descent (SGD) as a Stochastic Differential Equation (SDE) has allowed researchers to enjoy the benefits of studying a continuous optimization trajectory while carefully preserving the stochasticity of SGD.
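The SDE in question is commonly written as $dX_t = -\nabla L(X_t)\,dt + (\eta\,\Sigma(X_t))^{1/2}\,dW_t$, where $\eta$ is the learning rate, $\Sigma$ the covariance of the minibatch gradient noise, and $W_t$ a Wiener process; this is the standard form used in this line of work, stated here for reference.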
no code implementations • 19 May 2022 • Sanjeev Arora, Zhiyuan Li, Abhishek Panigrahi
The current paper mathematically analyzes a new mechanism of implicit regularization in the EoS phase, whereby GD updates arising from the non-smooth loss landscape turn out to evolve along a deterministic flow on the manifold of minimum loss.
no code implementations • 2 Mar 2022 • Zhou Lu, Wenhan Xia, Sanjeev Arora, Elad Hazan
Adaptive gradient methods are the method of choice for optimization in machine learning and are used to train the largest deep models.
no code implementations • 28 Feb 2022 • Nikunj Saunshi, Jordan Ash, Surbhi Goel, Dipendra Misra, Cyril Zhang, Sanjeev Arora, Sham Kakade, Akshay Krishnamurthy
Contrastive learning is a popular form of self-supervised learning that encourages augmentations (views) of the same input to have more similar representations compared to augmentations of different inputs.
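A generic NumPy sketch of such a contrastive (InfoNCE-style) objective, where z1[i] and z2[i] are representations of two augmentations of input i; this is a standard form of the loss, not a particular paper's implementation.

import numpy as np

def contrastive_loss(z1, z2, temperature=0.5):
    # Normalize representations, then treat matching rows of z1 and z2 as the
    # positive pairs and all other rows as negatives (InfoNCE-style sketch).
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    logits -= logits.max(axis=1, keepdims=True)               # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

# Toy usage with random 128-dimensional representations for a batch of 32 inputs.
rng = np.random.default_rng(0)
loss = contrastive_loss(rng.normal(size=(32, 128)), rng.normal(size=(32, 128)))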
1 code implementation • NeurIPS 2021 • Yangsibo Huang, Samyak Gupta, Zhao Song, Kai Li, Sanjeev Arora
A gradient inversion attack (or input recovery from gradients) is an emerging threat to the security and privacy of federated learning, whereby malicious eavesdroppers or participants in the protocol can partially recover clients' private data.
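A toy sketch of the gradient-matching idea behind such attacks, for a one-example linear model with squared loss; this illustrates the threat model only and is not the specific attack (or defense) studied in the paper.

import numpy as np

def invert_gradient(g_obs, w, y, steps=2000, lr=0.05, seed=0):
    # Search for an input x whose gradient (w.r.t. the model weights w)
    # matches the observed gradient, i.e. minimize ||(w.x - y) * x - g_obs||^2.
    rng = np.random.default_rng(seed)
    x = rng.normal(size=w.shape)
    for _ in range(steps):
        err = w @ x - y
        r = err * x - g_obs
        grad_x = 2.0 * (w * (x @ r) + err * r)   # chain rule for the matching loss
        x -= lr * grad_x
    return x

# Toy usage: the gradient a "client" would share leaks information about x_true.
w_model = np.array([1.0, -2.0, 0.5])
x_true, y_true = np.array([0.3, 0.7, -0.2]), 1.0
g_leaked = (w_model @ x_true - y_true) * x_true
x_guess = invert_gradient(g_leaked, w_model, y_true)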
no code implementations • ICLR 2022 • Yi Zhang, Arushi Gupta, Nikunj Saunshi, Sanjeev Arora
Research on generalization bounds for deep networks seeks to give ways to predict test error using just the training dataset and the network parameters.
no code implementations • NeurIPS 2021 • Kaifeng Lyu, Zhiyuan Li, Runzhe Wang, Sanjeev Arora
The current paper is able to establish this global optimality for two-layer Leaky ReLU nets trained with gradient flow on linearly separable and symmetric data, regardless of the width.
no code implementations • ICLR 2022 • Zhiyuan Li, Tianhao Wang, Sanjeev Arora
Understanding the implicit bias of Stochastic Gradient Descent (SGD) is one of the key challenges in deep learning, especially for overparametrized models, where the local minimizers of the loss function $L$ can form a manifold.
no code implementations • 29 Sep 2021 • Arushi Gupta, Nikunj Saunshi, Dingli Yu, Kaifeng Lyu, Sanjeev Arora
Saliency methods seek to provide human-interpretable explanations for the output of a machine learning model on a given input.
no code implementations • 25 Feb 2021 • Sanjeev Arora, Yi Zhang
Traditional statistics forbids use of test data (a.k.a. holdout data) during training.
1 code implementation • NeurIPS 2021 • Zhiyuan Li, Sadhika Malladi, Sanjeev Arora
It is generally recognized that finite learning rate (LR), in contrast to infinitesimal LR, is important for good generalization in real-life deep nets.
no code implementations • ICLR 2021 • Zhiyuan Li, Yi Zhang, Sanjeev Arora
However, this has not been made mathematically rigorous, and the hurdle is that the fully connected net can always simulate the convolutional net (for a fixed task).
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Yangsibo Huang, Zhao Song, Danqi Chen, Kai Li, Sanjeev Arora
In addition, TextHide fits well with the popular framework of fine-tuning pre-trained language models (e.g., BERT) for any sentence or sentence-pair task.
no code implementations • ICLR 2021 • Nikunj Saunshi, Sadhika Malladi, Sanjeev Arora
This paper initiates a mathematical study of this phenomenon for the downstream task of text classification by considering the following questions: (1) What is the intuitive connection between the pretraining task of next word prediction and text classification?
3 code implementations • 6 Oct 2020 • Yangsibo Huang, Zhao Song, Kai Li, Sanjeev Arora
This paper introduces InstaHide, a simple encryption of training images, which can be plugged into existing distributed deep learning pipelines.
no code implementations • NeurIPS 2020 • Zhiyuan Li, Kaifeng Lyu, Sanjeev Arora
Recent works (e.g., Li and Arora, 2020) suggest that the use of popular normalization schemes (including Batch Normalization) in today's deep learning can move it far from a traditional optimization viewpoint, e.g., use of exponentially increasing learning rates.
no code implementations • 4 Mar 2020 • Yangsibo Huang, Yushan Su, Sachin Ravi, Zhao Song, Sanjeev Arora, Kai Li
This paper attempts to answer the question of whether neural network pruning can be used as a tool to achieve differential privacy without losing much data utility.
no code implementations • ICML 2020 • Nikunj Saunshi, Yi Zhang, Mikhail Khodak, Sanjeev Arora
In contrast, for the non-convex formulation of a two layer linear network on the same instance, we show that both Reptile and multi-task representation learning can have new task sample complexity of $\mathcal{O}(1)$, demonstrating a separation from convex meta-learning.
no code implementations • ICML 2020 • Sanjeev Arora, Simon S. Du, Sham Kakade, Yuping Luo, Nikunj Saunshi
We formulate representation learning as a bi-level optimization problem where the "outer" optimization tries to learn the joint representation and the "inner" optimization encodes the imitation learning setup and tries to learn task-specific parameters.
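Schematically, such a bi-level problem can be written as $\min_{\phi} \sum_{t} \min_{w_t} \mathcal{L}_t(w_t; \phi)$, with $\phi$ the shared representation and $w_t$ the task-specific (imitation learning) parameters; this is a generic way to write the formulation rather than the paper's exact notation.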
no code implementations • NeurIPS 2020 • Yi Zhang, Orestis Plevrakis, Simon S. Du, Xingguo Li, Zhao Song, Sanjeev Arora
Our work proves convergence to low robust training loss for polynomial width instead of exponential, under natural assumptions and with the ReLU activation.
no code implementations • 3 Nov 2019 • Zhiyuan Li, Ruosong Wang, Dingli Yu, Simon S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora
An exact algorithm to compute CNTK (Arora et al., 2019) yielded the finding that the classification accuracy of CNTK on CIFAR-10 is within 6-7% of that of the corresponding CNN architecture (the best figure being around 78%), which is interesting performance for a fixed kernel.
no code implementations • ICLR 2020 • Zhiyuan Li, Sanjeev Arora
This paper suggests that the phenomenon may be due to Batch Normalization or BN, which is ubiquitous and provides benefits in optimization and generalization across all standard architectures.
4 code implementations • ICLR 2020 • Sanjeev Arora, Simon S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu
On the VOC07 testbed for few-shot image classification tasks on ImageNet with transfer learning (Goyal et al., 2019), replacing the linear SVM currently used with a Convolutional NTK SVM consistently improves performance.
no code implementations • 25 Sep 2019 • Arushi Gupta, Sanjeev Arora
This involves computing saliency maps for all possible labels in the classification task, and using a simple competition among them to identify and remove less relevant pixels from the map.
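One plausible reading of this competition, sketched in NumPy; the winner-take-all rule below is an illustrative assumption, not necessarily the exact rule in the paper.

import numpy as np

def competitive_saliency(saliency_maps, target_label):
    # Keep a pixel in the target label's map only if that label has the
    # largest saliency magnitude at that pixel across all labels.
    winners = np.argmax(np.abs(saliency_maps), axis=0)
    target_map = saliency_maps[target_label]
    return np.where(winners == target_label, target_map, 0.0)

# Toy usage: maps for 10 labels over a 32x32 input.
rng = np.random.default_rng(0)
pruned = competitive_saliency(rng.normal(size=(10, 32, 32)), target_label=3)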
1 code implementation • NeurIPS 2019 • Rohith Kuditipudi, Xiang Wang, Holden Lee, Yi Zhang, Zhiyuan Li, Wei Hu, Sanjeev Arora, Rong Ge
Mode connectivity is a surprising phenomenon in the loss landscape of deep nets.
1 code implementation • NeurIPS 2019 • Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo
Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low "complexity."
no code implementations • 27 May 2019 • Arushi Gupta, Sanjeev Arora
There is great interest in "saliency methods" (also called "attribution methods"), which give "explanations" for a deep net's decision, by assigning a "score" to each feature/pixel in the input.
2 code implementations • NeurIPS 2019 • Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang
An attraction of such ideas is that a pure kernel-based method is used to capture the power of a fully-trained deep net of infinite width.
no code implementations • 25 Feb 2019 • Sanjeev Arora, Hrishikesh Khandeparkar, Mikhail Khodak, Orestis Plevrakis, Nikunj Saunshi
This framework allows us to show provable guarantees on the performance of the learned representations on the average classification task that is composed of a subset of the same set of latent classes.
no code implementations • 24 Jan 2019 • Sanjeev Arora, Simon S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang
This paper analyzes training and generalization for a simple 2-layer ReLU net with random initialization, and provides the following improvements over recent works: (i) Using a tighter characterization of training speed than recent papers, an explanation for why training a neural net with random labels leads to slower training, as originally observed in [Zhang et al. ICLR'17].
no code implementations • ICLR 2019 • Sanjeev Arora, Zhiyuan Li, Kaifeng Lyu
Batch Normalization (BN) has become a cornerstone of deep learning across diverse architectures, appearing to help optimization as well as generalization.
no code implementations • ICLR 2019 • Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu
We analyze speed of convergence to global optimum for gradient descent training a deep linear neural network (parameterized as $x \mapsto W_N W_{N-1} \cdots W_1 x$) by minimizing the $\ell_2$ loss over whitened data.
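A toy NumPy sketch of gradient descent on such a deep linear network; the depth, width, and step sizes are arbitrary choices, and whitening of the data is assumed to have been done already.

import numpy as np

def train_deep_linear(X, Y, depth=3, width=20, lr=0.2, steps=2000, seed=0):
    # Minimize the mean squared l2 loss of the map x -> W_N ... W_1 x by
    # gradient descent on all factors W_i jointly.
    rng = np.random.default_rng(seed)
    dims = [X.shape[0]] + [width] * (depth - 1) + [Y.shape[0]]
    Ws = [rng.normal(scale=0.1, size=(dims[i + 1], dims[i])) for i in range(depth)]
    for _ in range(steps):
        acts = [X]
        for W in Ws:
            acts.append(W @ acts[-1])                 # forward pass through the factors
        grad_out = (acts[-1] - Y) / X.shape[1]        # gradient of the averaged l2 loss
        for i in reversed(range(depth)):
            grad_W = grad_out @ acts[i].T
            grad_out = Ws[i].T @ grad_out             # backpropagate before updating W_i
            Ws[i] -= lr * grad_W
    return Ws

# Toy usage: fit a random linear map from 10 to 5 dimensions on 100 samples.
rng = np.random.default_rng(1)
X, T = rng.normal(size=(10, 100)), rng.normal(size=(5, 10))
Ws = train_deep_linear(X, T @ X)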
1 code implementation • ACL 2018 • Mikhail Khodak, Nikunj Saunshi, Yingyu Liang, Tengyu Ma, Brandon Stewart, Sanjeev Arora
Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features.
Ranked #3 on Sentiment Analysis on MPQA
no code implementations • 5 Mar 2018 • Sanjeev Arora, Wei Hu, Pravesh K. Kothari
A first line of attack in exploratory data analysis is data visualization, i.e., generating a 2-dimensional representation of data that makes clusters of similar points visually identifiable.
no code implementations • ICML 2018 • Sanjeev Arora, Nadav Cohen, Elad Hazan
The effect of depth on optimization is decoupled from expressiveness by focusing on settings where additional layers amount to overparameterization - linear neural networks, a well-studied model.
no code implementations • ICML 2018 • Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang
Analysis of correctness of our compression relies upon some newly identified "noise stability" properties of trained deep nets, which are also experimentally verified.
no code implementations • ICLR 2018 • Sanjeev Arora, Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang
We study the control of symmetric linear dynamical systems with unknown dynamics and a hidden state.
2 code implementations • ICLR 2018 • Sanjeev Arora, Mikhail Khodak, Nikunj Saunshi, Kiran Vodrahalli
We also show a surprising new property of embeddings such as GloVe and word2vec: they form a good sensing matrix for text that is more efficient than random matrices, the standard sparse recovery tool, which may explain why they lead to better representations in practice.
no code implementations • ICLR 2018 • Sanjeev Arora, Andrej Risteski, Yi Zhang
Using this, evidence is presented that well-known GAN approaches do learn distributions of fairly low support.
no code implementations • 7 Nov 2017 • Sanjeev Arora, Andrej Risteski, Yi Zhang
Encoder-decoder GAN architectures (e.g., BiGAN and ALI) seek to add an inference mechanism to the GAN setup, consisting of a small encoder deep net that maps data points to their succinct encodings.
no code implementations • 26 Jun 2017 • Sanjeev Arora, Yi Zhang
Do GANs (Generative Adversarial Nets) actually learn the target distribution?
no code implementations • 14 Jun 2017 • Sanjeev Arora, Andrej Risteski
There is general consensus that learning representations is useful for a variety of reasons, e.g., efficient use of labeled data (semi-supervised learning), transfer learning, and understanding the hidden structure of data.
no code implementations • 29 Apr 2017 • Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora
Our methods require very few linguistic resources, thus being applicable to Wordnet construction in low-resource languages, and may further be applied to sense clustering and other Wordnet improvements.
1 code implementation • WS 2017 • Mikhail Khodak, Andrej Risteski, Christiane Fellbaum, Sanjeev Arora
To evaluate our method we construct two 600-word testsets for word-to-synset matching in French and Russian using native speakers and evaluate the performance of our method along with several other recent approaches.
1 code implementation • ICML 2017 • Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, Yi Zhang
We show that training of a generative adversarial network (GAN) may not have good generalization properties; e.g., training may appear successful but the trained distribution may be far from the target distribution in standard metrics.
no code implementations • 22 Feb 2017 • Holden Lee, Rong Ge, Tengyu Ma, Andrej Risteski, Sanjeev Arora
We take a first cut at explaining the expressivity of multilayer nets by giving a sufficient criterion for a function to be approximable by a neural network with $n$ hidden layers.
no code implementations • 28 Dec 2016 • Sanjeev Arora, Rong Ge, Tengyu Ma, Andrej Risteski
Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (= coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables.
1 code implementation • 13 Oct 2016 • Kiran Vodrahalli, Po-Hsuan Chen, Yingyu Liang, Christopher Baldassano, Janice Chen, Esther Yong, Christopher Honey, Uri Hasson, Peter Ramadge, Ken Norman, Sanjeev Arora
Several research groups have shown how to correlate fMRI responses to the meanings of presented stimuli.
no code implementations • 27 May 2016 • Sanjeev Arora, Rong Ge, Frederic Koehler, Tengyu Ma, Ankur Moitra
But designing provable algorithms for inference has proven to be more challenging.
1 code implementation • TACL 2018 • Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski
A novel aspect of our technique is that each extracted word sense is accompanied by one of about 2000 "discourse atoms" that gives a succinct description of which other words co-occur with that word sense.
no code implementations • 18 Nov 2015 • Sanjeev Arora, Yingyu Liang, Tengyu Ma
Under this assumption, which is experimentally tested on real-life nets like AlexNet, it is formally proved that the feedforward net is a correct inference method for recovering the hidden layer.
no code implementations • 2 Mar 2015 • Sanjeev Arora, Rong Ge, Tengyu Ma, Ankur Moitra
Its standard formulation is as a non-convex optimization problem which is solved in practice by heuristics based on alternating minimization.
4 code implementations • TACL 2016 • Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski
Semantic word embeddings represent the meaning of a word via a vector, and are created by diverse methods.
no code implementations • 3 Jan 2014 • Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma
In dictionary learning, also known as sparse coding, the algorithm is given samples of the form $y = Ax$ where $x\in \mathbb{R}^m$ is an unknown random sparse vector and $A$ is an unknown dictionary matrix in $\mathbb{R}^{n\times m}$ (usually $m > n$, which is the overcomplete case).
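A small generator for exactly this kind of data, useful for testing dictionary-learning algorithms; the sizes and distributions below are arbitrary choices, not the paper's experimental setup.

import numpy as np

def sparse_coding_samples(num_samples, n=64, m=128, sparsity=5, seed=0):
    # Draw a hidden overcomplete dictionary A (m > n) and k-sparse vectors x,
    # and return the observed samples y = A x together with the ground truth.
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(n, m)) / np.sqrt(n)
    X = np.zeros((m, num_samples))
    for j in range(num_samples):
        support = rng.choice(m, size=sparsity, replace=False)
        X[support, j] = rng.normal(size=sparsity)
    return A @ X, A, X

Y, A_true, X_true = sparse_coding_samples(1000)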
no code implementations • 23 Oct 2013 • Sanjeev Arora, Aditya Bhaskara, Rong Ge, Tengyu Ma
The analysis of the algorithm reveals interesting structure of neural networks with random edge weights.
no code implementations • 28 Aug 2013 • Sanjeev Arora, Rong Ge, Ankur Moitra
In sparse recovery we are given a matrix $A$ (the dictionary) and a vector of the form $A X$ where $X$ is sparse, and the goal is to recover $X$.
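For concreteness, a standard greedy baseline for this problem is Orthogonal Matching Pursuit; the sketch below is that textbook algorithm, not the method analyzed in the paper.

import numpy as np

def omp(A, y, sparsity):
    # Greedily pick the column most correlated with the residual, refit by
    # least squares on the chosen support, and repeat (standard OMP).
    residual, support = y.copy(), []
    for _ in range(sparsity):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

# Toy usage: recover a 3-sparse vector from y = A x.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 60))
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [1.0, -2.0, 0.5]
x_rec = omp(A, A @ x_true, sparsity=3)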
2 code implementations • 19 Dec 2012 • Sanjeev Arora, Rong Ge, Yoni Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu
Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora.
no code implementations • NeurIPS 2012 • Sanjeev Arora, Rong Ge, Ankur Moitra, Sushant Sachdeva
We present a new algorithm for Independent Component Analysis (ICA) which has provable performance guarantees.
2 code implementations • 9 Apr 2012 • Sanjeev Arora, Rong Ge, Ankur Moitra
Topic Modeling is an approach used for automatic comprehension and classification of data in a variety of settings, and perhaps the canonical application is in uncovering thematic structure in a corpus of documents.