Search Results for author: Bingbin Liu

Found 12 papers, 2 papers with code

Progressive distillation induces an implicit curriculum

no code implementations 7 Oct 2024 Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi, Andrej Risteski, Surbhi Goel

Our theoretical and empirical findings on sparse parity, complemented by empirical observations on more complex tasks, highlight the benefit of progressive distillation via implicit curriculum across setups.

Knowledge Distillation
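
The theory in this paper centers on the k-sparse parity task. As a point of reference, a minimal sketch of sparse-parity data generation is below; the dimensions, the ±1 encoding, and the function name are illustrative choices, not the paper's exact setup.

```python
import numpy as np

def sparse_parity_batch(n, d=100, k=6, seed=0):
    """Sample a batch of the k-sparse parity task: the label is the XOR
    (product of +/-1 signs) of k hidden coordinates out of d input bits."""
    rng = np.random.default_rng(seed)
    support = rng.choice(d, size=k, replace=False)   # hidden relevant coordinates
    x = rng.choice([-1.0, 1.0], size=(n, d))         # uniform random +/-1 inputs
    y = np.prod(x[:, support], axis=1)               # +1 for an even number of -1s, else -1
    return x, y, support

x, y, support = sparse_parity_batch(8, d=20, k=3)
print(support, y)
```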

TinyGSM: achieving >80% on GSM8k with small language models

no code implementations 14 Dec 2023 Bingbin Liu, Sebastien Bubeck, Ronen Eldan, Janardhan Kulkarni, Yuanzhi Li, Anh Nguyen, Rachel Ward, Yi Zhang

Specifically for solving grade-school math, the smallest model size so far required to break the 80% barrier on the GSM8K benchmark remains 34B.

Arithmetic Reasoning GSM8K +2

Transformers are uninterpretable with myopic methods: a case study with bounded Dyck grammars

no code implementations NeurIPS 2023 Kaiyue Wen, Yuchen Li, Bingbin Liu, Andrej Risteski

Interpretability methods aim to understand the algorithm implemented by a trained model (e.g., a Transformer) by examining various aspects of the model, such as the weight matrices or the attention patterns.

LEMMA
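
For readers unfamiliar with the case study, a bounded Dyck language consists of balanced brackets whose nesting depth never exceeds a fixed bound. The membership checker below is an illustrative sketch, not the grammar sampling used in the paper.

```python
def is_bounded_dyck(s, pairs={"(": ")", "[": "]"}, max_depth=3):
    """Check membership in a bounded Dyck language: balanced brackets over
    several bracket types with nesting depth at most max_depth."""
    stack = []
    closers = set(pairs.values())
    for ch in s:
        if ch in pairs:                        # opening bracket
            stack.append(pairs[ch])
            if len(stack) > max_depth:         # depth bound violated
                return False
        elif ch in closers:                    # closing bracket must match the top
            if not stack or stack.pop() != ch:
                return False
        else:
            return False                       # unknown symbol
    return not stack                           # all brackets closed

print(is_bounded_dyck("([])()"), is_bounded_dyck("([([])])", max_depth=3))
```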

Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression

no code implementations 1 Jun 2023 Runtian Zhai, Bingbin Liu, Andrej Risteski, Zico Kolter, Pradeep Ravikumar

Recent work has established a connection between self-supervised learning and the approximation of the top eigenspace of a graph Laplacian operator, suggesting that learning a linear probe atop such representations can be connected to RKHS regression.

Contrastive Learning Data Augmentation +7
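
The linear-probe-as-RKHS-regression connection can be made concrete with closed-form ridge regression on frozen features. The sketch below uses random features as a stand-in for an augmentation-trained encoder; it is an illustration, not the paper's estimator.

```python
import numpy as np

def linear_probe_ridge(features, labels, lam=1e-2):
    """Ridge-regression linear probe on frozen features: the closed-form
    solution w = (Phi^T Phi + lam*I)^{-1} Phi^T y, the parametric counterpart
    of kernel ridge regression in the RKHS induced by the feature map."""
    phi = np.asarray(features)                  # (n, p) frozen representation
    y = np.asarray(labels, dtype=float)         # (n,) regression targets
    p = phi.shape[1]
    w = np.linalg.solve(phi.T @ phi + lam * np.eye(p), phi.T @ y)
    return w

# toy usage with random "representations" standing in for a pretrained encoder
rng = np.random.default_rng(0)
phi, y = rng.normal(size=(64, 16)), rng.normal(size=64)
w = linear_probe_ridge(phi, y)
print(phi @ w)  # probe predictions
```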

Transformers Learn Shortcuts to Automata

no code implementations 19 Oct 2022 Bingbin Liu, Jordan T. Ash, Surbhi Goel, Akshay Krishnamurthy, Cyril Zhang

Algorithmic reasoning requires capabilities which are most naturally understood through recurrent models of computation, like the Turing machine.
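
One way to read the "shortcut" in the title: the sequential state updates of a finite automaton are products of transition maps, and because composition of maps is associative, prefix states can be computed by a logarithmic-depth scan rather than a step-by-step recurrence. The toy automaton and the composition below are illustrative only, not the paper's transformer construction.

```python
from functools import reduce

# transition maps of a tiny 2-state automaton: new_state = delta[symbol][state]
delta = {"a": (1, 0), "b": (0, 1)}

def compose(f, g):
    """Compose two transition maps (apply f, then g); composition is associative,
    so prefix states could be computed by a parallel scan instead of a loop."""
    return tuple(g[f[s]] for s in range(len(f)))

def run(word, start=0):
    # product of transition maps, equivalent to stepping the automaton symbol by symbol
    combined = reduce(compose, (delta[ch] for ch in word), tuple(range(2)))
    return combined[start]

print(run("abba"))
```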

Masked prediction tasks: a parameter identifiability view

no code implementations 18 Feb 2022 Bingbin Liu, Daniel Hsu, Pradeep Ravikumar, Andrej Risteski

This lens is undoubtedly very interesting, but suffers from the problem that there isn't a "canonical" set of downstream tasks to focus on -- in practice, this problem is usually resolved by competing on the benchmark dataset du jour.

Self-Supervised Learning

Analyzing and Improving the Optimization Landscape of Noise-Contrastive Estimation

no code implementations ICLR 2022 Bingbin Liu, Elan Rosenfeld, Pradeep Ravikumar, Andrej Risteski

Noise-contrastive estimation (NCE) is a statistically consistent method for learning unnormalized probabilistic models.
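
Concretely, NCE turns density estimation into logistic discrimination between data samples and samples from a known noise distribution, scored by the log-ratio of the unnormalized model to the noise density. The sketch below writes this objective for an equal number of data and noise samples; the function signature is illustrative, not the paper's parameterization.

```python
import numpy as np

def nce_loss(log_model_data, log_noise_data, log_model_noise, log_noise_noise):
    """Binary-classification form of the NCE objective (equal data/noise counts):
    discriminate data from noise using the log-density ratio of an unnormalized
    model against a known noise distribution."""
    def log_sigmoid(z):
        return -np.logaddexp(0.0, -z)                 # numerically stable log(sigmoid(z))
    ratio_data = log_model_data - log_noise_data      # log p_model - log p_noise at data points
    ratio_noise = log_model_noise - log_noise_noise   # same ratio at noise points
    return -(np.mean(log_sigmoid(ratio_data)) + np.mean(log_sigmoid(-ratio_noise)))
```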

Contrastive learning of strong-mixing continuous-time stochastic processes

no code implementations 3 Mar 2021 Bingbin Liu, Pradeep Ravikumar, Andrej Risteski

Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.

Contrastive Learning Time Series +1
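
A discrete-time analogue of such a classification task is the familiar InfoNCE-style objective, where the positive is (for example) a temporally adjacent sample and the negatives are drawn at random from unlabeled data. The sketch below is this generic form, not the paper's continuous-time estimator.

```python
import numpy as np

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss for a single anchor: classify the positive
    (e.g., a temporally adjacent sample) against negatives from unlabeled data."""
    def normalize(v):
        return v / np.linalg.norm(v, axis=-1, keepdims=True)
    a, p, n = normalize(anchor), normalize(positive), normalize(negatives)
    logits = np.concatenate(([a @ p], n @ a)) / temperature   # similarities to positive and negatives
    return -logits[0] + np.log(np.sum(np.exp(logits)))        # cross-entropy with positive as class 0

rng = np.random.default_rng(0)
print(info_nce(rng.normal(size=8), rng.normal(size=8), rng.normal(size=(16, 8))))
```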

Generalized Boosting

no code implementations NeurIPS 2020 Arun Suggala, Bingbin Liu, Pradeep Ravikumar

Using thorough empirical evaluation, we show that our learning algorithms have superior performance over traditional additive boosting algorithms, as well as existing greedy learning techniques for DNNs.

Additive models Classification +2
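
For contrast with the traditional additive boosting baseline mentioned above, a minimal additive (gradient) boosting loop with shallow trees looks like the following; this sketches the baseline family, not the paper's generalized boosting procedure, and the hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def additive_boosting(X, y, n_rounds=50, lr=0.1):
    """Traditional additive (gradient) boosting with squared loss: each round fits
    a shallow learner to the current residuals and adds it to the ensemble."""
    pred = np.zeros(len(y))
    ensemble = []
    for _ in range(n_rounds):
        stump = DecisionTreeRegressor(max_depth=1).fit(X, y - pred)  # fit residuals
        pred += lr * stump.predict(X)
        ensemble.append(stump)
    return ensemble, pred

X = np.linspace(0, 1, 200).reshape(-1, 1)
y = np.sin(6 * X[:, 0])
_, pred = additive_boosting(X, y)
print(np.mean((y - pred) ** 2))  # training error shrinks as rounds are added
```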

Spatiotemporal Relationship Reasoning for Pedestrian Intent Prediction

1 code implementation 20 Feb 2020 Bingbin Liu, Ehsan Adeli, Zhangjie Cao, Kuan-Hui Lee, Abhijeet Shenoi, Adrien Gaidon, Juan Carlos Niebles

In addition, we introduce a new dataset designed specifically for autonomous-driving scenarios in areas with dense pedestrian populations: the Stanford-TRI Intent Prediction (STIP) dataset.

Autonomous Driving Navigate

Temporal Modular Networks for Retrieving Complex Compositional Activities in Videos

no code implementations ECCV 2018 Bingbin Liu, Serena Yeung, Edward Chou, De-An Huang, Li Fei-Fei, Juan Carlos Niebles

A major challenge in computer vision is scaling activity understanding to the long tail of complex activities without requiring collecting large quantities of data for new actions.

Retrieval Video Retrieval
