Search Results for author: Bin Yu

Found 83 papers, 30 papers with code

Minimum-Norm Interpolation Under Covariate Shift

no code implementations31 Mar 2024 Neil Mallinar, Austin Zane, Spencer Frei, Bin Yu

We follow our analysis with empirical studies that show these beneficial and malignant covariate shifts for linear interpolators on real image data, and for fully-connected neural networks in settings where the input data dimension is larger than the training sample size.

regression Transfer Learning
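
As a toy illustration of the linear setting studied above (not the paper's experiments), the sketch below fits the minimum-norm interpolator in an overparameterized linear regression and evaluates it on test covariates drawn from a shifted distribution. The dimensions, noise level, and the particular scalings are arbitrary placeholders; whether a given shift is beneficial or malignant depends on how it interacts with the signal, which is what the paper analyzes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 200                       # overparameterized: d > n
w_star = rng.normal(size=d) / np.sqrt(d)

# Training covariates from an isotropic Gaussian.
X = rng.normal(size=(n, d))
y = X @ w_star + 0.1 * rng.normal(size=n)

# Minimum-norm interpolator: the pseudoinverse solution w = X^+ y.
w_hat = np.linalg.pinv(X) @ y

def shifted_test_error(scales, n_test=10_000):
    """Squared error against the noiseless signal on test covariates with
    per-coordinate scaling (a simple form of covariate shift)."""
    X_test = rng.normal(size=(n_test, d)) * scales
    return np.mean((X_test @ w_hat - X_test @ w_star) ** 2)

print("no shift       :", shifted_test_error(np.ones(d)))
print("example shift A:", shifted_test_error(np.linspace(1.0, 0.2, d)))
print("example shift B:", shifted_test_error(np.linspace(1.0, 3.0, d)))
```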

Large Stepsize Gradient Descent for Logistic Loss: Non-Monotonicity of the Loss Improves Optimization Efficiency

no code implementations24 Feb 2024 Jingfeng Wu, Peter L. Bartlett, Matus Telgarsky, Bin Yu

We consider gradient descent (GD) with a constant stepsize applied to logistic regression with linearly separable data, where the constant stepsize $\eta$ is so large that the loss initially oscillates.

General Classification
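
A minimal numerical sketch of the setup described above: gradient descent with a fixed, deliberately large stepsize on the logistic loss over linearly separable data. The data, stepsize, and iteration count are arbitrary choices for illustration; the only point is that the training loss need not decrease monotonically in the early iterations.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 100, 5

# Linearly separable data: labels determined by a fixed ground-truth direction.
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = np.sign(X @ w_true)                       # labels in {-1, +1}

def logistic_loss(w):
    # log(1 + exp(-y * <x, w>)) computed stably via logaddexp.
    return np.logaddexp(0.0, -y * (X @ w)).mean()

def gradient(w):
    margins = y * (X @ w)
    sig = 0.5 * (1.0 - np.tanh(margins / 2.0))  # numerically stable sigmoid(-margins)
    return -(X.T @ (y * sig)) / n

w = np.zeros(d)
eta = 50.0                                     # constant stepsize, chosen large on purpose
for t in range(40):
    if t % 5 == 0:
        print(f"iter {t:3d}  loss {logistic_loss(w):.4f}")
    w -= eta * gradient(w)
```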

ED-Copilot: Reduce Emergency Department Wait Time with Language Model Diagnostic Assistance

no code implementations21 Feb 2024 Liwen Sun, Abhineet Agarwal, Aaron Kornblith, Bin Yu, Chenyan Xiong

Using publicly available patient data, we collaborate with ED clinicians to curate MIMIC-ED-Assist, a benchmark that measures the ability of AI systems in suggesting laboratory tests that minimize ED wait times, while correctly predicting critical outcomes such as death.

Language Modelling

LoRA+: Efficient Low Rank Adaptation of Large Models

1 code implementation19 Feb 2024 Soufiane Hayou, Nikhil Ghosh, Bin Yu

In this paper, we show that Low Rank Adaptation (LoRA) as originally introduced in Hu et al. (2021) leads to suboptimal finetuning of models with large width (embedding dimension).
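
For readers unfamiliar with the parameterization being analyzed, here is a minimal NumPy sketch of a LoRA-style linear layer: the frozen weight W is augmented with a low-rank update BA, with B initialized to zero so the adapted model matches the pretrained one at the start of finetuning. The width, rank, and scaling are placeholders; the paper's point, and the LoRA+ remedy of treating the training of A and B (in particular, their learning rates) asymmetrically, concerns how these factors are optimized, which this sketch does not attempt to reproduce.

```python
import numpy as np

rng = np.random.default_rng(2)
d_in, d_out, r = 1024, 1024, 8       # width and LoRA rank: illustrative values only
alpha = 16.0                         # LoRA scaling hyperparameter

W = rng.normal(size=(d_out, d_in)) / np.sqrt(d_in)   # frozen pretrained weight
A = rng.normal(size=(r, d_in)) / np.sqrt(d_in)       # trainable, small random init
B = np.zeros((d_out, r))                             # trainable, zero init => no change at start

def lora_forward(x):
    """Forward pass of a LoRA-adapted linear layer: W x + (alpha / r) * B A x."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.normal(size=d_in)
print(np.allclose(lora_forward(x), W @ x))   # True at initialization, since B = 0
```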

Towards Consistent Natural-Language Explanations via Explanation-Consistency Finetuning

1 code implementation25 Jan 2024 Yanda Chen, Chandan Singh, Xiaodong Liu, Simiao Zuo, Bin Yu, He He, Jianfeng Gao

We propose explanation-consistency finetuning (EC-finetuning), a method that adapts LLMs to generate more consistent natural-language explanations on related examples.

Question Answering

Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs

1 code implementation3 Nov 2023 Qingru Zhang, Chandan Singh, Liyuan Liu, Xiaodong Liu, Bin Yu, Jianfeng Gao, Tuo Zhao

In human-written articles, we often leverage the subtleties of text style, such as bold and italics, to guide the attention of readers.

Using Experience Classification for Training Non-Markovian Tasks

no code implementations18 Oct 2023 Ruixuan Miao, Xu Lu, Cong Tian, Bin Yu, Zhenhua Duan

Unlike the standard Reinforcement Learning (RL) model, many real-world tasks are non-Markovian, whose rewards are predicated on state history rather than solely on the current state.

Autonomous Driving Classification +2

Prominent Roles of Conditionally Invariant Components in Domain Adaptation: Theory and Algorithms

no code implementations19 Sep 2023 Keru Wu, Yuansi Chen, Wooseok Ha, Bin Yu

Domain adaptation (DA) is a statistical learning problem that arises when the distribution of the source data used to train a model differs from that of the target data used to evaluate the model.

Domain Adaptation

TrafficGPT: Viewing, Processing and Interacting with Traffic Foundation Models

1 code implementation13 Sep 2023 Siyao Zhang, Daocheng Fu, Zhao Zhang, Bin Yu, Pinlong Cai

This integration yields the following key enhancements: 1) empowering ChatGPT with the capacity to view, analyze, process traffic data, and provide insightful decision support for urban transportation system management; 2) facilitating the intelligent deconstruction of broad and complex tasks and sequential utilization of traffic foundation models for their gradual completion; 3) aiding human decision-making in traffic control through natural language dialogues; and 4) enabling interactive feedback and solicitation of revised outcomes.

Common Sense Reasoning Decision Making +3

The Effect of SGD Batch Size on Autoencoder Learning: Sparsity, Sharpness, and Feature Learning

no code implementations6 Aug 2023 Nikhil Ghosh, Spencer Frei, Wooseok Ha, Bin Yu

On the other hand, for any batch size strictly smaller than the number of samples, SGD finds a global minimum which is sparse and nearly orthogonal to its initialization, showing that the randomness of stochastic gradients induces a qualitatively different type of "feature selection" in this setting.

feature selection

Improving Prototypical Part Networks with Reward Reweighing, Reselection, and Retraining

no code implementations8 Jul 2023 Robin Netzorg, Jiaxun Li, Bin Yu

Hoping to remedy this, we take inspiration from the recent developments in Reinforcement Learning with Human Feedback (RLHF) to fine-tune these prototypes.

Image Classification

SkipDecode: Autoregressive Skip Decoding with Batching and Caching for Efficient LLM Inference

no code implementations5 Jul 2023 Luciano del Corro, Allie Del Giorno, Sahaj Agarwal, Bin Yu, Ahmed Awadallah, Subhabrata Mukherjee

While existing token-level early exit methods show promising results for online inference, they cannot be readily applied for batch inferencing and Key-Value caching.

Text Generation

MDI+: A Flexible Random Forest-Based Feature Importance Framework

2 code implementations4 Jul 2023 Abhineet Agarwal, Ana M. Kenney, Yan Shuo Tan, Tiffany M. Tang, Bin Yu

We show that the MDI for a feature $X_k$ in each tree in an RF is equivalent to the unnormalized $R^2$ value in a linear regression of the response on the collection of decision stumps that split on $X_k$.

Drug Response Prediction Feature Importance +1

Diagnosing Transformers: Illuminating Feature Spaces for Clinical Decision-Making

1 code implementation27 May 2023 Aliyah R. Hsu, Yeshwanth Cherapanamjeri, Briton Park, Tristan Naumann, Anobel Y. Odisho, Bin Yu

These findings showcase the utility of SUFO in enhancing trust and safety when using transformers in medicine, and we believe SUFO can aid practitioners in evaluating fine-tuned language models for other applications in medicine and in more critical domains.

Decision Making

Enhancing Peak Network Traffic Prediction via Time-Series Decomposition

no code implementations9 Mar 2023 Tucker Stewart, Bin Yu, Anderson Nascimento, Juhua Hu

For network administration and maintenance, it is critical to anticipate when networks will receive peak volumes of traffic so that adequate resources can be allocated to service requests made to servers.

Time Series Traffic Prediction

A Mixing Time Lower Bound for a Simplified Version of BART

no code implementations17 Oct 2022 Omer Ronen, Theo Saarinen, Yan Shuo Tan, James Duncan, Bin Yu

In this paper, we provide the first lower bound on the mixing time for a simplified version of BART in which we reduce the sum to a single tree and use a subset of the possible moves for the MCMC proposal distribution.

Causal Inference regression

Group Probability-Weighted Tree Sums for Interpretable Modeling of Heterogeneous Data

1 code implementation30 May 2022 Keyan Nasseri, Chandan Singh, James Duncan, Aaron Kornblith, Bin Yu

Machine learning in high-stakes domains, such as healthcare, faces two critical challenges: (1) generalizing to diverse data distributions given limited training data while (2) maintaining interpretability.

Specificity

Hierarchical Shrinkage: improving the accuracy and interpretability of tree-based methods

2 code implementations2 Feb 2022 Abhineet Agarwal, Yan Shuo Tan, Omer Ronen, Chandan Singh, Bin Yu

Tree-based models such as decision trees and random forests (RF) are a cornerstone of modern machine-learning practice.

Fast Interpretable Greedy-Tree Sums

2 code implementations28 Jan 2022 Yan Shuo Tan, Chandan Singh, Keyan Nasseri, Abhineet Agarwal, James Duncan, Omer Ronen, Matthew Epland, Aaron Kornblith, Bin Yu

In such settings, practitioners often use highly interpretable decision tree models, but these suffer from inductive bias against additive structure.

Additive models Decision Making +4

A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds

1 code implementation18 Oct 2021 Yan Shuo Tan, Abhineet Agarwal, Bin Yu

We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models with $C^1$ component functions.

Additive models Decision Making +2

Towards Robust Waveform-Based Acoustic Models

no code implementations16 Oct 2021 Dino Oglic, Zoran Cvetkovic, Peter Sollich, Steve Renals, Bin Yu

We study the problem of learning robust acoustic models in adverse environments, characterized by a significant mismatch between training and test conditions.

Data Augmentation Inductive Bias +3

Interpreting and improving deep-learning models with reality checks

4 code implementations16 Aug 2021 Chandan Singh, Wooseok Ha, Bin Yu

Recent deep-learning models have achieved impressive predictive performance by learning complex functions of many variables, often at the cost of interpretability.

Provable Boolean Interaction Recovery from Tree Ensemble obtained via Random Forests

no code implementations23 Feb 2021 Merle Behr, Yu Wang, Xiao Li, Bin Yu

Iterative Random Forests (iRF) use a tree ensemble from iteratively modified RF to obtain predictive and stable non-linear or Boolean interactions of features.

Statistics Theory

High-Performance Discriminative Tracking With Transformers

no code implementations ICCV 2021 Bin Yu, Ming Tang, Linyu Zheng, Guibo Zhu, Jinqiao Wang, Hao Feng, Xuetao Feng, Hanqing Lu

End-to-end discriminative trackers improve the state of the art significantly, yet the improvement in robustness and efficiency is restricted by the conventional discriminative model, i.e., least-squares-based regression.

Object Visual Tracking +1

Enriched Annotations for Tumor Attribute Classification from Pathology Reports with Limited Labeled Data

no code implementations15 Dec 2020 Nick Altieri, Briton Park, Mara Olson, John DeNero, Anobel Odisho, Bin Yu

Precision medicine has the potential to revolutionize healthcare, but much of the data for patients is locked away in unstructured free-text, limiting research and delivery of effective personalized treatments.

Attribute General Classification +1

Stable discovery of interpretable subgroups via calibration in causal studies

1 code implementation23 Aug 2020 Raaz Dwivedi, Yan Shuo Tan, Briton Park, Mian Wei, Kevin Horgan, David Madigan, Bin Yu

Building on Yu and Kumbier's PCS framework, and for randomized experiments, we introduce Stable Discovery of Interpretable Subgroups via Calibration (StaDISC), a novel methodology for identifying subgroups with large heterogeneous treatment effects.

Two-stage growth mode for lift-off mechanism in oblique shock-wave/jet interaction

no code implementations11 Jul 2020 Bin Yu, Miaosheng He, Bin Zhang, Hong Liu

Based on an objective coordinate system in the frame of the oblique shock structure, it is found that the three-dimensional lift-off structure of a shock-induced streamwise vortex is inherently and precisely controlled by a two-stage growth mode of the structural kinetics of a shock-bubble interaction (SBI).

Fluid Dynamics

Revisiting minimum description length complexity in overparameterized models

1 code implementation17 Jun 2020 Raaz Dwivedi, Chandan Singh, Bin Yu, Martin J. Wainwright

We provide an extensive theoretical characterization of MDL-COMP for linear models and kernel methods and show that it is not just a function of parameter count, but rather a function of the singular values of the design or the kernel matrix and the signal-to-noise ratio.

Learning Theory

A Survey on Dynamic Network Embedding

no code implementations15 Jun 2020 Yu Xie, Chunyi Li, Bin Yu, Chen Zhang, Zhouhua Tang

Real-world networks are composed of diverse interacting and evolving entities, yet most existing research simply characterizes them as particular static networks, without considering the evolution trends of dynamic networks.

Social and Information Networks Physics and Society

Instability, Computational Efficiency and Statistical Accuracy

no code implementations22 May 2020 Nhat Ho, Koulik Khamaru, Raaz Dwivedi, Martin J. Wainwright, Michael. I. Jordan, Bin Yu

Many statistical estimators are defined as the fixed point of a data-dependent operator, with estimators based on minimizing a cost function being an important special case.

Computational Efficiency

Curating a COVID-19 data repository and forecasting county-level death counts in the United States

1 code implementation16 May 2020 Nick Altieri, Rebecca L. Barter, James Duncan, Raaz Dwivedi, Karl Kumbier, Xiao Li, Robert Netzorg, Briton Park, Chandan Singh, Yan Shuo Tan, Tiffany Tang, Yu Wang, Chao Zhang, Bin Yu

We use this data to develop predictions and corresponding prediction intervals for the short-term trajectory of COVID-19 cumulative death counts at the county level in the United States, up to two weeks ahead.

COVID-19 Tracking Decision Making +2

Transformation Importance with Applications to Cosmology

2 code implementations4 Mar 2020 Chandan Singh, Wooseok Ha, Francois Lanusse, Vanessa Boehm, Jia Liu, Bin Yu

Machine learning lies at the heart of new possibilities for scientific discovery, knowledge generation, and artificial intelligence.

Interpretations are useful: penalizing explanations to align neural networks with prior knowledge

4 code implementations ICML 2020 Laura Rieger, Chandan Singh, W. James Murdoch, Bin Yu

For an explanation of a deep learning model to be effective, it must provide both insight into a model and suggest a corresponding action in order to achieve some objective.

A Debiased MDI Feature Importance Measure for Random Forests

3 code implementations NeurIPS 2019 Xiao Li, Yu Wang, Sumanta Basu, Karl Kumbier, Bin Yu

Based on the original definition of MDI by Breiman et al. for a single tree, we derive a tight non-asymptotic bound on the expected bias of MDI importance of noisy features, showing that deep trees have higher (expected) feature selection bias than shallow ones.

Feature Importance feature selection +1
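
A small simulation in the spirit of this claim (not the paper's experiments): pure-noise features receive nonzero MDI importance, and the total importance assigned to noise tends to grow with tree depth. The sample size, depths, and signal model below are arbitrary.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n, d_signal, d_noise = 500, 2, 20

X_signal = rng.normal(size=(n, d_signal))
X_noise = rng.normal(size=(n, d_noise))           # features with no relation to y
X = np.hstack([X_signal, X_noise])
y = X_signal[:, 0] + X_signal[:, 1] + 0.5 * rng.normal(size=n)

for depth in (2, 5, None):                        # None = fully grown trees
    rf = RandomForestRegressor(n_estimators=200, max_depth=depth, random_state=0).fit(X, y)
    noise_share = rf.feature_importances_[d_signal:].sum()   # MDI mass on pure-noise features
    print(f"max_depth={depth}:  MDI share on noise features = {noise_share:.3f}")
```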

Fast mixing of Metropolized Hamiltonian Monte Carlo: Benefits of multi-step gradients

1 code implementation29 May 2019 Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu

This bound gives a precise quantification of the faster convergence of Metropolized HMC relative to simpler MCMC algorithms such as the Metropolized random walk, or Metropolized Langevin algorithm.

Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees

4 code implementations18 May 2019 Summer Devlin, Chandan Singh, W. James Murdoch, Bin Yu

Tree ensembles, such as random forests and AdaBoost, are ubiquitous machine learning models known for achieving strong predictive performance across a wide variety of domains.

Feature Engineering Feature Importance +1

CharBot: A Simple and Effective Method for Evading DGA Classifiers

no code implementations3 May 2019 Jonathan Peck, Claire Nie, Raaghavi Sivaguru, Charles Grumer, Femi Olumofin, Bin Yu, Anderson Nascimento, Martine De Cock

In this work, we present a novel DGA called CharBot which is capable of producing large numbers of unregistered domain names that are not detected by state-of-the-art classifiers for real-time detection of DGAs, including the recently published methods FANCI (a random forest based on human-engineered features) and LSTM.MI (a deep learning approach).

Adversarial Attack

Unique Sharp Local Minimum in $\ell_1$-minimization Complete Dictionary Learning

no code implementations22 Feb 2019 Yu Wang, Siqi Wu, Bin Yu

First, we obtain a necessary and sufficient norm condition for the reference dictionary $D^*$ to be a sharp local minimum of the expected $\ell_1$ objective function.

Dictionary Learning

Sharp Analysis of Expectation-Maximization for Weakly Identifiable Models

no code implementations1 Feb 2019 Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Martin J. Wainwright, Michael. I. Jordan, Bin Yu

We study a class of weakly identifiable location-scale mixture models for which the maximum likelihood estimates are based on $n$ i.i.d. samples.

Veridical Data Science

no code implementations23 Jan 2019 Bin Yu, Karl Kumbier

It augments predictability and computability with an overarching stability principle for the data science life cycle.

Two-sample testing

Interpretable machine learning: definitions, methods, and applications

6 code implementations14 Jan 2019 W. James Murdoch, Chandan Singh, Karl Kumbier, Reza Abbasi-Asl, Bin Yu

Official code for using / reproducing ACD (ICLR 2019) from the paper "Hierarchical interpretations for neural network predictions" https://arxiv.org/abs/1806.05337

BIG-bench Machine Learning Feature Importance +1

An objective-adaptive refinement criterion based on modified ridge extraction method for finite-time Lyapunov exponent (FTLE) calculation

no code implementations13 Nov 2018 Haotian Hang, Bin Yu, Yang Xiang, Bin Zhang, Hong Liu

High-accuracy, high-efficiency calculation of the finite-time Lyapunov exponent (FTLE) has long been an active research topic, and adaptive refinement is one class of methods in this field.

Fluid Dynamics

Signed iterative random forests to identify enhancer-associated transcription factor binding

1 code implementation16 Oct 2018 Karl Kumbier, Sumanta Basu, Erwin Frise, Susan E. Celniker, James B. Brown, Susan Celniker, Bin Yu

Standard ChIP-seq peak calling pipelines seek to differentiate biochemically reproducible signals of individual genomic elements from background noise.

Interpretable Machine Learning

Singularity, Misspecification, and the Convergence Rate of EM

no code implementations1 Oct 2018 Raaz Dwivedi, Nhat Ho, Koulik Khamaru, Michael. I. Jordan, Martin J. Wainwright, Bin Yu

A line of recent work has analyzed the behavior of the Expectation-Maximization (EM) algorithm in the well-specified setting, in which the population likelihood is locally strongly concave around its maximizing argument.

Fast Kernelized Correlation Filters without Boundary Effect

no code implementations17 Jun 2018 Ming Tang, Linyu Zheng, Bin Yu, Jinqiao Wang

To achieve fast training and detection, a set of cyclic bases is introduced to construct the filter.

Visual Tracking

Hierarchical interpretations for neural network predictions

1 code implementation ICLR 2019 Chandan Singh, W. James Murdoch, Bin Yu

Deep neural networks (DNNs) have achieved impressive predictive performance due to their ability to learn complex, non-linear relationships between variables.

Clustering Feature Importance +1

Stability and Convergence Trade-off of Iterative Optimization Algorithms

no code implementations4 Apr 2018 Yuansi Chen, Chi Jin, Bin Yu

Applying existing stability upper bounds for the gradient methods in our trade-off framework, we obtain lower bounds matching the well-established convergence upper bounds up to constants for these algorithms and conjecture similar lower bounds for NAG and HB.

Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs

4 code implementations ICLR 2018 W. James Murdoch, Peter J. Liu, Bin Yu

On the task of sentiment analysis with the Yelp and SST data sets, we show that CD is able to reliably identify words and phrases of contrasting sentiment, and how they are combined to yield the LSTM's final prediction.

Sentiment Analysis

Log-concave sampling: Metropolis-Hastings algorithms are fast

1 code implementation8 Jan 2018 Raaz Dwivedi, Yuansi Chen, Martin J. Wainwright, Bin Yu

Relative to known guarantees for the unadjusted Langevin algorithm (ULA), our bounds show that the use of an accept-reject step in MALA leads to an exponentially improved dependence on the error-tolerance.
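
For concreteness, here is a minimal NumPy implementation of MALA for a generic log-concave target (a standard Gaussian in the demo), i.e., the Langevin proposal followed by the accept-reject correction discussed above. The stepsize and target are placeholders; the paper's results concern mixing-time guarantees, not this particular toy run.

```python
import numpy as np

rng = np.random.default_rng(4)
d, h = 10, 0.1                       # dimension and stepsize (illustrative)

def log_pi(x):                       # log-density of the target (standard Gaussian here)
    return -0.5 * x @ x

def grad_log_pi(x):
    return -x

def mala_step(x):
    """One Metropolis-adjusted Langevin step: Langevin proposal + accept/reject correction."""
    prop = x + h * grad_log_pi(x) + np.sqrt(2 * h) * rng.normal(size=d)

    def log_q(to, frm):
        # Gaussian proposal density N(frm + h * grad_log_pi(frm), 2h I), up to constants.
        diff = to - frm - h * grad_log_pi(frm)
        return -(diff @ diff) / (4 * h)

    log_alpha = log_pi(prop) + log_q(x, prop) - log_pi(x) - log_q(prop, x)
    return prop if np.log(rng.uniform()) < log_alpha else x

x = np.zeros(d)
samples = []
for _ in range(5000):
    x = mala_step(x)
    samples.append(x.copy())
print("sample variances (should be near 1):", np.var(np.array(samples)[1000:], axis=0)[:3])
```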

Character Level Based Detection of DGA Domain Names

no code implementations ICLR 2018 Bin Yu, Jie Pan, Jiaming Hu, Anderson Nascimento, Martine De Cock

Recently several different deep learning architectures have been proposed that take a string of characters as the raw input signal and automatically derive features for text classification.

General Classification text-classification +1

Interpreting Convolutional Neural Networks Through Compression

no code implementations7 Nov 2017 Reza Abbasi-Asl, Bin Yu

In our compression, the filter importance index is defined as the classification accuracy reduction (CAR) of the network after pruning that filter.

Object Recognition
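
The CAR index described above is straightforward to sketch: zero out one convolutional filter and measure how much validation accuracy drops. The PyTorch code below is a hedged sketch; `model`, `val_loader`, and the layer name are placeholders the reader must supply, and the original work's full pruning and retraining protocol is not reproduced.

```python
import copy
import torch

@torch.no_grad()
def accuracy(model, loader, device="cpu"):
    model.eval()
    correct = total = 0
    for xb, yb in loader:
        pred = model(xb.to(device)).argmax(dim=1)
        correct += (pred == yb.to(device)).sum().item()
        total += yb.numel()
    return correct / total

@torch.no_grad()
def car_importance(model, conv_layer_name, filter_idx, val_loader):
    """Classification Accuracy Reduction for one filter: baseline accuracy minus
    accuracy after zeroing that filter's weights (and bias, if present)."""
    baseline = accuracy(model, val_loader)
    pruned = copy.deepcopy(model)
    conv = dict(pruned.named_modules())[conv_layer_name]   # e.g. "features.0" (hypothetical name)
    conv.weight[filter_idx].zero_()
    if conv.bias is not None:
        conv.bias[filter_idx].zero_()
    return baseline - accuracy(pruned, val_loader)

# Hypothetical usage, assuming `model` and `val_loader` already exist:
# print(car_importance(model, "features.0", filter_idx=3, val_loader=val_loader))
```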

Fast MCMC sampling algorithms on polytopes

2 code implementations23 Oct 2017 Yuansi Chen, Raaz Dwivedi, Martin J. Wainwright, Bin Yu

We propose and analyze two new MCMC sampling algorithms, the Vaidya walk and the John walk, for generating samples from the uniform distribution over a polytope.

Iterative Random Forests to detect predictive and stable high-order interactions

4 code implementations26 Jun 2017 Sumanta Basu, Karl Kumbier, James B. Brown, Bin Yu

Genomics has revolutionized biology, enabling the interrogation of whole transcriptomes, genome-wide binding sites for proteins, and many other molecular processes.

Vocal Bursts Intensity Prediction

Meta-learners for Estimating Heterogeneous Treatment Effects using Machine Learning

6 code implementations12 Jun 2017 Sören R. Künzel, Jasjeet S. Sekhon, Peter J. Bickel, Bin Yu

There is growing interest in estimating and analyzing heterogeneous treatment effects in experimental and observational studies.

Statistics Theory Methodology
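
As a concrete example of the meta-learner idea (using the simplest variant, the T-learner, rather than the paper's full X-learner), the sketch below fits separate outcome models on the treated and control groups and estimates the conditional average treatment effect by their difference. The data-generating process and base learners are arbitrary illustrative choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(5)
n = 2000
X = rng.normal(size=(n, 3))
T = rng.integers(0, 2, size=n)                        # randomized treatment assignment
tau = 1.0 + X[:, 0]                                   # true heterogeneous treatment effect
y = X @ np.array([0.5, -0.3, 0.2]) + T * tau + 0.5 * rng.normal(size=n)

# T-learner: one outcome model per treatment arm, CATE estimate = mu1(x) - mu0(x).
mu0 = RandomForestRegressor(random_state=0).fit(X[T == 0], y[T == 0])
mu1 = RandomForestRegressor(random_state=0).fit(X[T == 1], y[T == 1])
cate_hat = mu1.predict(X) - mu0.predict(X)

print("corr(estimated CATE, true CATE):", np.corrcoef(cate_hat, tau)[0, 1].round(3))
```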

Structural Compression of Convolutional Neural Networks

no code implementations20 May 2017 Reza Abbasi-Asl, Bin Yu

Deep convolutional neural networks (CNNs) have been successful in many machine vision tasks; however, the millions of weights, in the form of thousands of convolutional filters, make CNNs difficult for humans to interpret or understand in scientific applications.

General Classification

Formulas for Counting the Sizes of Markov Equivalence Classes of Directed Acyclic Graphs

no code implementations23 Oct 2016 Yangbo He, Bin Yu

A Markov equivalence class can be represented by an essential graph and its undirected subgraphs determine the size of the class.

Optimal Subsampling Approaches for Large Sample Linear Regression

no code implementations17 Sep 2015 Rong Zhu, Ping Ma, Michael W. Mahoney, Bin Yu

For the unweighted estimation algorithm, we show that the resulting subsample estimator is not consistent for the full-sample OLS estimator.

regression

Local identifiability of $l_1$-minimization dictionary learning: a sufficient and almost necessary condition

no code implementations17 May 2015 Siqi Wu, Bin Yu

Moreover, our local identifiability results also translate to the finite sample case with high probability provided that the number of signals $N$ scales as $O(K\log K)$.

Dictionary Learning

Error Rate Bounds and Iterative Weighted Majority Voting for Crowdsourcing

no code implementations15 Nov 2014 Hongwei Li, Bin Yu

We propose an iterative weighted majority voting (IWMV) method that optimizes the error rate bound and approximates the oracle MAP rule.
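
To make the flavor of iterative weighted majority voting concrete for binary labels, here is a hedged sketch: start from plain majority voting, estimate each worker's accuracy against the current label estimates, reweight the vote, and repeat. The specific weighting below is an illustrative choice, not necessarily the exact weights derived in the paper, and missing labels are ignored for simplicity.

```python
import numpy as np

def iwmv(votes, n_iters=10):
    """votes: array of shape (n_workers, n_items) with entries in {-1, +1} (no missing
    labels in this toy version). Returns estimated item labels and worker weights."""
    n_workers, n_items = votes.shape
    weights = np.ones(n_workers)                       # iteration 0 = plain majority vote
    for _ in range(n_iters):
        labels = np.sign(weights @ votes)
        labels[labels == 0] = 1                        # break ties arbitrarily
        acc = (votes == labels).mean(axis=1)           # estimated accuracy of each worker
        weights = np.clip(2.0 * acc - 1.0, 0.0, None)  # illustrative weight; random guessers get ~0
    return labels, weights

# Toy data: 5 reliable workers and 5 random guessers labeling 200 items.
rng = np.random.default_rng(6)
truth = rng.choice([-1, 1], size=200)
good = np.where(rng.random((5, 200)) < 0.85, truth, -truth)
bad = rng.choice([-1, 1], size=(5, 200))
labels, weights = iwmv(np.vstack([good, bad]))
print("accuracy:", (labels == truth).mean(), " worker weights:", weights.round(2))
```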

Statistical guarantees for the EM algorithm: From population to sample-based analysis

no code implementations9 Aug 2014 Sivaraman Balakrishnan, Martin J. Wainwright, Bin Yu

Leveraging this characterization, we then provide non-asymptotic guarantees on the EM and gradient EM algorithms when applied to a finite set of samples.
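
For readers who want the algorithm itself in front of them, below is a minimal EM loop for a classic setting often used in this line of work: a symmetric two-component Gaussian mixture in one dimension with known unit variance, where EM reduces to a fixed-point iteration on the mean parameter. The data and initialization are arbitrary; this is an illustration, not the paper's analysis.

```python
import numpy as np

rng = np.random.default_rng(7)
theta_star, n = 2.0, 5000
# Symmetric two-component mixture: 0.5 N(+theta, 1) + 0.5 N(-theta, 1).
signs = rng.choice([-1.0, 1.0], size=n)
x = signs * theta_star + rng.normal(size=n)

def em_update(theta):
    """One EM step: E-step computes responsibilities, M-step re-estimates the mean."""
    # Posterior probability that each point came from the +theta component.
    w = 1.0 / (1.0 + np.exp(-2.0 * theta * x))
    return np.mean((2.0 * w - 1.0) * x)

theta = 0.1                                    # deliberately poor initialization
for _ in range(20):
    theta = em_update(theta)
print("EM estimate:", round(theta, 3), " truth:", theta_star)
```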

Concise comparative summaries (CCS) of large text corpora with a human experiment

no code implementations29 Apr 2014 Jinzhu Jia, Luke Miratrix, Bin Yu, Brian Gawalt, Laurent El Ghaoui, Luke Barnesmoore, Sophie Clavier

In this paper we propose a general framework for topic-specific summarization of large text corpora and illustrate how it can be used for the analysis of news databases.

General Classification

The geometry of kernelized spectral clustering

no code implementations29 Apr 2014 Geoffrey Schiebinger, Martin J. Wainwright, Bin Yu

As a corollary we control the fraction of samples mislabeled by spectral clustering under finite mixtures with nonparametric components.

Clustering

Impact of regularization on Spectral Clustering

no code implementations5 Dec 2013 Antony Joseph, Bin Yu

Under the stochastic block model (SBM) and its extensions, previous results on spectral clustering relied on the minimum degree of the graph being sufficiently large for good performance.

Clustering Stochastic Block Model
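
A small self-contained sketch of regularized spectral clustering under a stochastic block model, in the spirit of the setting above: add a regularizer τ to the degrees before normalizing, then run k-means on the leading eigenvectors. The block-model parameters and the choice τ = average degree are illustrative; the paper studies how such regularization affects performance when degrees are small.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(8)
n, k = 300, 2
z = rng.integers(0, k, size=n)                        # true community labels
P = np.array([[0.08, 0.02],
              [0.02, 0.08]])                          # within/between block edge probabilities
A = (rng.random((n, n)) < P[z][:, z]).astype(float)
A = np.triu(A, 1); A = A + A.T                        # symmetric adjacency, no self-loops

tau = A.sum(axis=1).mean()                            # regularizer: average degree (a common choice)
d_tau = A.sum(axis=1) + tau
L = A / np.sqrt(np.outer(d_tau, d_tau))               # regularized D_tau^{-1/2} A D_tau^{-1/2}

eigvals, eigvecs = np.linalg.eigh(L)
U = eigvecs[:, -k:]                                   # leading k eigenvectors
labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(U)

agreement = max((labels == z).mean(), (labels != z).mean())   # handle label switching (k = 2)
print("clustering agreement with truth:", round(agreement, 3))
```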

Error Rate Bounds in Crowdsourcing Models

no code implementations10 Jul 2013 Hongwei Li, Bin Yu, Dengyong Zhou

We show that the oracle maximum a posteriori (MAP) rule approximately optimizes our upper bound on the mean error rate for any hyperplane binary labeling rule, and propose a simple data-driven weighted majority voting (WMV) rule (called one-step WMV) that attempts to approximate the oracle MAP and has a provable theoretical guarantee on the error rate.

A Statistical Perspective on Algorithmic Leveraging

no code implementations23 Jun 2013 Ping Ma, Michael W. Mahoney, Bin Yu

A detailed empirical evaluation of existing leverage-based methods as well as these two new methods is carried out on both synthetic and real data sets.

Computational Efficiency
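
To ground the discussion, here is a hedged sketch of basic leverage-score subsampling for least squares: compute the statistical leverage of each row, sample rows with probability proportional to leverage, and solve a weighted least-squares problem on the subsample. This is the generic recipe, not the specific estimators compared in the paper, and the sizes below are placeholders.

```python
import numpy as np

rng = np.random.default_rng(9)
n, d, m = 10_000, 10, 500                      # full sample size, dimension, subsample size

X = rng.standard_t(df=3, size=(n, d))          # heavy-tailed design => uneven leverage scores
beta = rng.normal(size=d)
y = X @ beta + rng.normal(size=n)

# Leverage scores: squared row norms of Q from a thin QR of X (diagonal of the hat matrix).
Q, _ = np.linalg.qr(X)
lev = (Q ** 2).sum(axis=1)
probs = lev / lev.sum()

idx = rng.choice(n, size=m, replace=True, p=probs)
w = 1.0 / (m * probs[idx])                     # importance weights for the weighted estimator
Xw = X[idx] * np.sqrt(w)[:, None]
yw = y[idx] * np.sqrt(w)
beta_lev, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("||leveraging estimate - full OLS||:", np.linalg.norm(beta_lev - beta_ols).round(4))
```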

Early stopping and non-parametric regression: An optimal data-dependent stopping rule

no code implementations15 Jun 2013 Garvesh Raskutti, Martin J. Wainwright, Bin Yu

The strategy of early stopping is a regularization technique based on choosing a stopping time for an iterative algorithm.

regression
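
The paper derives a specific data-dependent stopping rule; the sketch below only illustrates the generic mechanism being regularized, namely stopping kernel gradient descent before it fits the noise, using a held-out set as a stand-in stopping criterion. The kernel, bandwidth, stepsize, and split are placeholders, and this is not the paper's optimal rule.

```python
import numpy as np

rng = np.random.default_rng(10)
n = 200
x = np.sort(rng.uniform(-1, 1, size=n))
y = np.sin(3 * x) + 0.3 * rng.normal(size=n)

# Split into a fitting set and a held-out set used only to decide when to stop.
fit_idx, val_idx = np.arange(0, n, 2), np.arange(1, n, 2)

def rbf_kernel(a, b, bw=0.2):
    return np.exp(-(a[:, None] - b[None, :]) ** 2 / (2 * bw ** 2))

K = rbf_kernel(x[fit_idx], x[fit_idx])
K_val = rbf_kernel(x[val_idx], x[fit_idx])
alpha = np.zeros(len(fit_idx))
eta, best_val, best_t = 0.5, np.inf, 0

for t in range(2000):
    resid = K @ alpha - y[fit_idx]
    alpha -= eta * resid / len(fit_idx)        # functional gradient step: f <- f - (eta/m) K (f - y)
    val_err = np.mean((K_val @ alpha - y[val_idx]) ** 2)
    if val_err < best_val:
        best_val, best_t = val_err, t

print(f"stopping around iteration {best_t}, held-out MSE {best_val:.4f}")
```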

Estimation Stability with Cross Validation (ESCV)

no code implementations13 Mar 2013 Chinghway Lim, Bin Yu

For the two real data sets, from neuroscience and cell biology, the models selected by ESCV are less than half the size of those selected by CV.

Supplement to "Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs"

no code implementations4 Mar 2013 Yangbo He, Jinzhu Jia, Bin Yu

This supplementary material includes the following: some preliminary results, four examples, an experiment, three new algorithms, and all proofs of the results in the paper "Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs".

Reversible MCMC on Markov equivalence classes of sparse directed acyclic graphs

no code implementations26 Sep 2012 Yangbo He, Jinzhu Jia, Bin Yu

In this paper, we design reversible irreducible Markov chains on the space of Markov equivalence classes by proposing a perfect set of operators that determine the transitions of the Markov chain.

Supervised Feature Selection in Graphs with Path Coding Penalties and Network Flows

no code implementations20 Apr 2012 Julien Mairal, Bin Yu

We consider supervised learning problems where the features are embedded in a graph, such as gene expressions in a gene network.

feature selection

Co-clustering for directed graphs: the Stochastic co-Blockmodel and spectral algorithm Di-Sim

no code implementations10 Apr 2012 Karl Rohe, Tai Qin, Bin Yu

In each example, a small subset of nodes have persistent asymmetries; these nodes send edges to one cluster but receive edges from another.

Clustering Graph Clustering

Predicting Execution Time of Computer Programs Using Sparse Polynomial Regression

no code implementations NeurIPS 2010 Ling Huang, Jinzhu Jia, Bin Yu, Byung-Gon Chun, Petros Maniatis, Mayur Naik

Our two SPORE algorithms are able to build relationships between responses (e.g., the execution time of a computer program) and features, and to select a few of the hundreds of retrieved features to construct an explicitly sparse, non-linear model that predicts the response variable.

regression
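
In the same spirit (though not the authors' exact SPORE algorithms), a sparse polynomial regression can be sketched as a polynomial feature expansion followed by an L1-penalized fit, so that only a handful of the expanded features survive. The feature counts, toy "execution time" response, and regularization strength below are placeholders.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

rng = np.random.default_rng(11)
n, d = 400, 20                                   # many candidate features, few truly relevant
X = rng.uniform(0, 1, size=(n, d))
# Hypothetical response (e.g., a running time): nonlinear in a couple of features, plus noise.
y = 5.0 * X[:, 0] * X[:, 1] + 2.0 * X[:, 2] ** 2 + 0.1 * rng.normal(size=n)

model = make_pipeline(
    PolynomialFeatures(degree=2, include_bias=False),    # expand to all degree-<=2 terms
    StandardScaler(),
    Lasso(alpha=0.01, max_iter=10_000),                   # L1 penalty zeroes out most coefficients
).fit(X, y)

coefs = model.named_steps["lasso"].coef_
print("expanded features:", coefs.size, " nonzero after the sparse fit:", int((coefs != 0).sum()))
```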

Lower bounds on minimax rates for nonparametric regression with additive sparsity and smoothness

no code implementations NeurIPS 2009 Garvesh Raskutti, Bin Yu, Martin J. Wainwright

components from some distribution $\mathbb{P}$, we determine tight lower bounds on the minimax rate for estimating the regression function with respect to squared $L^2(\mathbb{P})$ error.

regression

A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers

no code implementations NeurIPS 2009 Sahand Negahban, Bin Yu, Martin J. Wainwright, Pradeep K. Ravikumar

The estimation of high-dimensional parametric models requires imposing some structure on the models, for instance that they be sparse, or that matrix structured parameters have low rank.
