Search Results for author: Si Si

Found 24 papers, 8 papers with code

Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks

6 code implementations • KDD 2019 • Wei-Lin Chiang, Xuanqing Liu, Si Si, Yang Li, Samy Bengio, Cho-Jui Hsieh

Furthermore, Cluster-GCN allows us to train much deeper GCNs without much time and memory overhead, which leads to improved prediction accuracy: using a 5-layer Cluster-GCN, we achieve a state-of-the-art test F1 score of 99.36 on the PPI dataset, while the previous best result was 98.71 by [16].

Ranked #1 on Node Classification on Amazon2M

Clustering, Computational Efficiency, +4

DC-BENCH: Dataset Condensation Benchmark

2 code implementations • 20 Jul 2022 • Justin Cui, Ruochen Wang, Si Si, Cho-Jui Hsieh

Dataset Condensation is a newly emerging technique aiming at learning a tiny dataset that captures the rich information encoded in the original dataset.

Data Augmentation, Data Compression, +2

Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory

2 code implementations • 19 Nov 2022 • Justin Cui, Ruochen Wang, Si Si, Cho-Jui Hsieh

The resulting algorithm sets a new SOTA on ImageNet-1K: we can scale up to 50 IPCs (Images Per Class) on ImageNet-1K on a single GPU (all previous methods can only scale to 2 IPCs on ImageNet-1K), leading to the best accuracy (only a 5.9% accuracy drop against full dataset training) while utilizing only 4.2% of the number of data points, an 18.2% absolute gain over prior SOTA.

Robustness Verification of Tree-based Models

2 code implementations • NeurIPS 2019 • Hongge Chen, Huan Zhang, Si Si, Yang Li, Duane Boning, Cho-Jui Hsieh

We show that there is a simple linear time algorithm for verifying a single tree, and for tree ensembles, the verification problem can be cast as a max-clique problem on a multi-partite graph with bounded boxicity.

Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding

1 code implementation • NeurIPS 2021 • Yang Li, Si Si, Gang Li, Cho-Jui Hsieh, Samy Bengio

Attentional mechanisms are order-invariant.

Position

Area Attention

1 code implementation • ICLR 2019 • Yang Li, Lukasz Kaiser, Samy Bengio, Si Si

We propose area attention: a way to attend to areas in the memory, where each area contains a group of items that are structurally adjacent, e.g., spatially for a 2D memory such as images, or temporally for a 1D memory such as natural language sentences.

Image Captioning, Machine Translation, +1

Neural SDE: Stabilizing Neural ODE Networks with Stochastic Noise

1 code implementation • 5 Jun 2019 • Xuanqing Liu, Tesi Xiao, Si Si, Qin Cao, Sanjiv Kumar, Cho-Jui Hsieh

In this paper, we propose a new continuous neural network framework called Neural Stochastic Differential Equation (Neural SDE) network, which naturally incorporates various commonly used regularization mechanisms based on random noise injection.

GPU-acceleration for Large-scale Tree Boosting

3 code implementations • 26 Jun 2017 • Huan Zhang, Si Si, Cho-Jui Hsieh

In this paper, we present a novel massively parallel algorithm for accelerating the decision tree building procedure on GPUs (Graphics Processing Units), which is a crucial step in Gradient Boosted Decision Tree (GBDT) and random forests training.

Nonlinear Online Learning with Adaptive Nyström Approximation

no code implementations • 21 Feb 2018 • Si Si, Sanjiv Kumar, Yang Li

Use of nonlinear feature maps via kernel approximation has led to success in many online learning tasks.

Communication-Efficient Parallel Block Minimization for Kernel Machines

no code implementations • 5 Aug 2016 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

Kernel machines often yield superior predictive performance on various tasks; however, they suffer from severe computational challenges.

Kernel Ridge Regression via Partitioning

no code implementations • 5 Aug 2016 • Rashish Tandon, Si Si, Pradeep Ravikumar, Inderjit Dhillon

In this paper, we investigate a divide and conquer approach to Kernel Ridge Regression (KRR).

Clustering, Generalization Bounds, +1

A Divide-and-Conquer Solver for Kernel Support Vector Machines

no code implementations • 4 Nov 2013 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

We show theoretically that the support vectors identified by the subproblem solution are likely to be support vectors of the entire kernel SVM problem, provided that the problem is partitioned appropriately by kernel clustering.

Clustering

GroupReduce: Block-Wise Low-Rank Approximation for Neural Language Model Shrinking

no code implementations • NeurIPS 2018 • Patrick H. Chen, Si Si, Yang Li, Ciprian Chelba, Cho-Jui Hsieh

Model compression is essential for serving large deep neural nets on devices with limited resources or applications that require real-time responses.

Language Modelling, Model Compression, +1

Learning to Screen for Fast Softmax Inference on Large Vocabulary Neural Networks

no code implementations • ICLR 2019 • Patrick H. Chen, Si Si, Sanjiv Kumar, Yang Li, Cho-Jui Hsieh

The algorithm achieves an order of magnitude faster inference than the original softmax layer for predicting top-$k$ words in various tasks such as beam search in machine translation or next-word prediction.

Clustering, Machine Translation, +1

You Look Twice: GaterNet for Dynamic Filter Selection in CNNs

no code implementations • CVPR 2019 • Zhourong Chen, Yang Li, Samy Bengio, Si Si

The concept of conditional computation for deep nets has been proposed previously to improve model performance by selectively using only parts of the model conditioned on the sample it is processing.

Multi-Scale Spectral Decomposition of Massive Graphs

no code implementations • NeurIPS 2014 • Si Si, Donghyuk Shin, Inderjit S. Dhillon, Beresford N. Parlett

Thus, eigenvectors of the clusters serve as good initializations to a block Lanczos algorithm that is used to compute spectral decomposition of the original graph.

Clustering

Fast Prediction for Large-Scale Kernel Machines

no code implementations • NeurIPS 2014 • Cho-Jui Hsieh, Si Si, Inderjit S. Dhillon

Second, we provide a new theoretical analysis on bounding the error of the solution computed by using the Nyström kernel approximation method, and show that the error is related to the weighted k-means objective function where the weights are given by the model computed from the original kernel.

General Classification, regression

Gradient Boosted Decision Trees for High Dimensional Sparse Output

no code implementations • ICML 2017 • Si Si, Huan Zhang, S. Sathiya Keerthi, Dhruv Mahajan, Inderjit S. Dhillon, Cho-Jui Hsieh

In this paper, we study gradient boosted decision trees (GBDT) when the output space is high dimensional and sparse.

General Classification, Vocal Bursts Intensity Prediction

A Unified Framework for Data Poisoning Attack to Graph-based Semi-supervised Learning

no code implementations • NeurIPS 2019 • Xuanqing Liu, Si Si, Xiaojin Zhu, Yang Li, Cho-Jui Hsieh

In this paper, we propose a general framework for data poisoning attacks on graph-based semi-supervised learning (G-SSL).

Binary Classification, Data Poisoning, +1

Auto Completion of User Interface Layout Design Using Transformer-Based Tree Decoders

no code implementations • 14 Jan 2020 • Yang Li, Julien Amelot, Xin Zhou, Samy Bengio, Si Si

While we focus on interface layout prediction, our model is generally applicable to other layout prediction problems that involve tree structures and 2-dimensional placements.

Layout Design

Multi-Stage Influence Function

no code implementations • NeurIPS 2020 • Hongge Chen, Si Si, Yang Li, Ciprian Chelba, Sanjiv Kumar, Duane Boning, Cho-Jui Hsieh

With this score, we can identify the pretraining examples in the pretraining task that contribute most to a prediction in the finetuning task.

Transfer Learning

How much progress have we made in neural network training? A New Evaluation Protocol for Benchmarking Optimizers

no code implementations • 19 Oct 2020 • Yuanhao Xiong, Xuanqing Liu, Li-Cheng Lan, Yang You, Si Si, Cho-Jui Hsieh

For end-to-end efficiency, unlike previous work that assumes random hyperparameter tuning, which over-emphasizes the tuning time, we propose to evaluate with a bandit hyperparameter tuning strategy.

Benchmarking, Graph Mining

Two-stage LLM Fine-tuning with Less Specialization and More Generalization

no code implementations • 1 Nov 2022 • Yihan Wang, Si Si, Daliang Li, Michal Lukasik, Felix Yu, Cho-Jui Hsieh, Inderjit S. Dhillon, Sanjiv Kumar

Pretrained large language models (LLMs) are general purpose problem solvers applicable to a diverse set of tasks with prompts.

Binary Classification, Domain Generalization, +5

Automatic Engineering of Long Prompts

no code implementations • 16 Nov 2023 • Cho-Jui Hsieh, Si Si, Felix X. Yu, Inderjit S. Dhillon

Large language models (LLMs) have demonstrated remarkable capabilities in solving complex open-domain tasks, guided by comprehensive instructions and demonstrations provided in the form of prompts.

Prompt Engineering
