1 code implementation • 25 May 2023 • Xizhou Zhu, Yuntao Chen, Hao Tian, Chenxin Tao, Weijie Su, Chenyu Yang, Gao Huang, Bin Li, Lewei Lu, Xiaogang Wang, Yu Qiao, Zhaoxiang Zhang, Jifeng Dai
These agents, equipped with the logic and common sense capabilities of LLMs, can skillfully navigate complex, sparse-reward environments with text-based interactions.
1 code implementation • CVPR 2023 • Weijie Su, Xizhou Zhu, Chenxin Tao, Lewei Lu, Bin Li, Gao Huang, Yu Qiao, Xiaogang Wang, Jie Zhou, Jifeng Dai
It has been shown that combining multiple pre-training strategies and data from various modalities/sources can greatly boost the training of large-scale models.
Ranked #2 on Object Detection on LVIS v1.0 minival (using extra training data)
no code implementations • 6 Jul 2022 • Lei Wu, Mingze Wang, Weijie Su
In this paper, we provide an explanation of this striking phenomenon by relating the particular noise structure of SGD to its linear stability (Wu et al., 2018).
2 code implementations • CVPR 2023 • Chenxin Tao, Xizhou Zhu, Weijie Su, Gao Huang, Bin Li, Jie Zhou, Yu Qiao, Xiaogang Wang, Jifeng Dai
Driven by these analyses, we propose Siamese Image Modeling (SiameseIM), which predicts the dense representations of an augmented view based on another masked view from the same image with different augmentations.
no code implementations • NeurIPS 2021 • Weijie Su
To address this withholding of information, in this paper, I introduce the Isotonic Mechanism, a simple and efficient approach to improving on the imprecise raw scores by leveraging certain information that the owner is incentivized to provide.
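At its core, the mechanism adjusts raw review scores by isotonic regression against the owner's self-reported ranking of their own submissions. Below is a minimal sketch of that projection step using scikit-learn; the function name and toy inputs are illustrative, and this is only the core adjustment, not the paper's full protocol.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

def isotonic_mechanism(raw_scores, author_ranking):
    # Project raw review scores onto the set of score vectors consistent
    # with the owner's self-reported ranking, i.e. solve the isotonic
    # regression min ||x - raw||^2 subject to x respecting the ranking.
    # author_ranking[i] is the index of the i-th best submission.
    ordered = raw_scores[author_ranking]        # claimed best-to-worst order
    iso = IsotonicRegression(increasing=False)  # enforce non-increasing scores
    adjusted = iso.fit_transform(np.arange(len(ordered)), ordered)
    out = np.empty_like(raw_scores, dtype=float)
    out[author_ranking] = adjusted              # undo the reordering
    return out

raw = np.array([6.2, 7.5, 5.1, 7.9])
ranking = np.array([1, 3, 0, 2])  # owner claims paper 1 is best, then 3, 0, 2
# Scores that contradict the claimed ranking get pooled (papers 1 and 3 here).
print(isotonic_mechanism(raw, ranking))  # -> [6.2, 7.7, 5.1, 7.7]
```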
17 code implementations • ICLR 2021 • Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, Jifeng Dai
DETR has been recently proposed to eliminate the need for many hand-designed components in object detection while demonstrating good performance.
Ranked #34 on Object Detection on COCO-O
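The key component is deformable attention, where each query attends to a small set of sampled points around a reference location instead of the full feature map. The following single-scale, single-head PyTorch sketch illustrates that sampling idea; the class name and projections are simplifications of my own, not the paper's exact multi-scale module.

```python
import torch
import torch.nn.functional as F

class DeformableAttnSketch(torch.nn.Module):
    # Minimal single-scale, single-head sketch of deformable attention:
    # each query predicts a few sampling offsets and attention weights,
    # then aggregates bilinearly sampled features at those points.
    def __init__(self, dim, n_points=4):
        super().__init__()
        self.n_points = n_points
        self.offset_proj = torch.nn.Linear(dim, n_points * 2)
        self.weight_proj = torch.nn.Linear(dim, n_points)

    def forward(self, query, feat, ref_points):
        # query: (N, Lq, C), feat: (N, C, H, W), ref_points in [0,1]^2: (N, Lq, 2)
        N, Lq, C = query.shape
        offsets = self.offset_proj(query).view(N, Lq, self.n_points, 2)
        weights = self.weight_proj(query).softmax(-1)             # (N, Lq, P)
        grid = 2 * (ref_points[:, :, None] + offsets) - 1         # to [-1, 1]
        sampled = F.grid_sample(feat, grid, align_corners=False)  # (N, C, Lq, P)
        return (sampled * weights[:, None]).sum(-1).transpose(1, 2)  # (N, Lq, C)

attn = DeformableAttnSketch(dim=32)
out = attn(torch.randn(2, 5, 32), torch.randn(2, 32, 16, 16), torch.rand(2, 5, 2))
print(out.shape)  # torch.Size([2, 5, 32])
```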
no code implementations • 6 Aug 2020 • Zhu Li, Weijie Su, Dino Sejdinovic
Modern machine learning often operates in the regime where the number of parameters is much higher than the number of data points, with zero training loss and yet good generalization, thereby contradicting the classical bias-variance trade-off.
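The phenomenon is easy to reproduce numerically: fit minimum-norm least squares with a growing number of features and watch the test risk spike near the interpolation threshold $p = n$ before falling again. The snippet below is an illustrative simulation of this generic effect (all dimensions and noise levels are arbitrary choices of mine), not an experiment from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 400                       # n samples, d total features (illustrative)
beta = rng.standard_normal(d) / np.sqrt(d)

def test_risk(p):
    # Min-norm least squares ("ridgeless") using only the first p features:
    # training loss is exactly zero once p >= n, yet test risk can be small.
    Xtr = rng.standard_normal((n, d))
    ytr = Xtr @ beta + 0.1 * rng.standard_normal(n)
    bhat = np.linalg.pinv(Xtr[:, :p]) @ ytr
    Xte = rng.standard_normal((2000, d))
    return np.mean((Xte[:, :p] @ bhat - Xte @ beta) ** 2)

for p in [10, 25, 45, 50, 55, 100, 200, 400]:
    print(p, round(test_risk(p), 3))  # risk typically peaks near p = n, then falls
```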
no code implementations • 18 Oct 2019 • Matteo Sordello, Hangfeng He, Weijie Su
This paper proposes SplitSGD, a new dynamic learning rate schedule for stochastic optimization.
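The idea is to detect stationarity by splitting a single SGD thread into two threads driven by independent minibatch noise and checking whether their progress directions agree. Here is a heavily simplified sketch of that split-then-compare logic; the paper's diagnostic differs in detail (it aggregates several splits into sign statistics), and all constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
grad_fn = lambda z: z + 0.5 * rng.standard_normal(z.shape)  # noisy quadratic

def split_diagnostic(x, lr, w=20):
    # Run two SGD threads from the same point with independent noise and
    # compare their net displacements: a negative inner product suggests
    # the iterates are bouncing around a stationary point. (A sketch of
    # the idea only; the paper's diagnostic is more elaborate.)
    d = []
    for _ in range(2):
        z = x.copy()
        for _ in range(w):
            z = z - lr * grad_fn(z)
        d.append(z - x)
    return float(d[0] @ d[1]) < 0.0

x, lr = np.full(5, 3.0), 0.5
for stage in range(10):
    for _ in range(50):        # one constant-rate stage of SGD
        x = x - lr * grad_fn(x)
    if split_diagnostic(x, lr):
        lr *= 0.5              # decay only when stationarity is detected
print("final lr:", lr)
```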
3 code implementations • ICLR 2020 • Weijie Su, Xizhou Zhu, Yue Cao, Bin Li, Lewei Lu, Furu Wei, Jifeng Dai
We introduce a new pre-trainable generic representation for visual-linguistic tasks, called Visual-Linguistic BERT (VL-BERT for short).
Ranked #1 on Visual Question Answering (VQA) on VCR (Q-A) dev
1 code implementation • NeurIPS 2019 • Zhiqi Bu, Jason Klusowski, Cynthia Rush, Weijie Su
SLOPE is a relatively new convex optimization procedure for high-dimensional linear regression via the sorted $\ell_1$ penalty: the larger the rank of the fitted coefficient, the larger the penalty.
no code implementations • 20 Dec 2017 • Tengyuan Liang, Weijie Su
Modern statistical inference tasks often require iterative optimization methods to compute the solution.
1 code implementation • 17 Oct 2016 • Damian Brzyski, Alexej Gossmann, Weijie Su, Malgorzata Bogdan
Sorted L-One Penalized Estimation (SLOPE) is a relatively new convex optimization procedure which allows for adaptive selection of regressors under sparse high dimensional designs.
no code implementations • 12 Nov 2015 • Cynthia Dwork, Weijie Su, Li Zhang
This destroys the classical proof of FDR control.
3 code implementations • 5 Nov 2015 • Weijie Su, Malgorzata Bogdan, Emmanuel Candes
In regression settings where explanatory variables have very low correlations and there are relatively few effects, each of large magnitude, we expect the Lasso to find the important variables with few errors, if any.
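One can observe this empirically by tracking the false discovery proportion along the Lasso regularization path. The snippet below is an illustrative simulation using scikit-learn's lasso_path; the dimensions and signal strength are arbitrary choices, not the paper's setup.

```python
import numpy as np
from sklearn.linear_model import lasso_path

# Illustrative simulation: near-uncorrelated design, k strong effects.
rng = np.random.default_rng(0)
n, p, k = 250, 500, 10
X = rng.standard_normal((n, p)) / np.sqrt(n)
beta = np.zeros(p)
beta[:k] = 8.0
y = X @ beta + rng.standard_normal(n)

alphas, coefs, _ = lasso_path(X, y)
for a, c in zip(alphas[::10], coefs.T[::10]):
    sel = np.flatnonzero(c)
    if sel.size:
        fdp = np.mean(sel >= k)  # fraction of selected variables that are null
        print(f"lambda={a:.4f}  selected={sel.size}  FDP={fdp:.2f}")
```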
no code implementations • 17 Jun 2015 • Weijie Su, Junyang Qian, Linxi Liu
The false discovery rate (FDR), the expected fraction of spurious discoveries among all the discoveries, provides a popular statistical assessment of the reproducibility of scientific studies in various disciplines.
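The canonical procedure controlling this quantity is the Benjamini-Hochberg step-up rule, sketched below for reference; this is a standard textbook implementation, not code from the paper.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.1):
    # Step-up BH procedure: reject the k smallest p-values, where
    # k = max{ i : p_(i) <= q * i / m }; controls FDR at level q
    # under independence.
    pvals = np.asarray(pvals)
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= q * np.arange(1, m + 1) / m
    k = np.nonzero(below)[0].max() + 1 if below.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.6, 0.9]))
# -> [ True  True  True  True False False]
```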
no code implementations • 29 Mar 2015 • Weijie Su, Emmanuel Candes
We consider high-dimensional sparse regression problems in which we observe $y = X \beta + z$, where $X$ is an $n \times p$ design matrix and $z$ is an $n$-dimensional vector of independent Gaussian errors, each with variance $\sigma^2$.
no code implementations • 4 Mar 2015 • Weijie Su, Stephen Boyd, Emmanuel J. Candes
We derive a second-order ordinary differential equation (ODE) which is the limit of Nesterov's accelerated gradient method.
no code implementations • NeurIPS 2014 • Weijie Su, Stephen Boyd, Emmanuel Candes
We derive a second-order ordinary differential equation (ODE), which is the limit of Nesterov’s accelerated gradient method.
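Concretely, with step size $s$ and the time identification $t \approx k\sqrt{s}$, Nesterov's iterates track the solution of $\ddot{X} + \frac{3}{t}\dot{X} + \nabla f(X) = 0$. The sketch below compares the two numerically on a toy quadratic; the discretization scheme and constants are illustrative choices of mine, not the paper's analysis.

```python
import numpy as np

# Toy quadratic f(x) = 0.5 * x^T A x, so grad f(x) = A @ x.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
s, x0 = 1e-3, np.array([1.0, 1.0])

# Nesterov's accelerated gradient method.
x_prev, x = x0.copy(), x0.copy()
for k in range(1, 3001):
    y = x + (k - 1) / (k + 2) * (x - x_prev)
    x_prev, x = x, y - s * grad(y)

# Semi-implicit Euler discretization of the limiting ODE
# X'' + (3/t) X' + grad f(X) = 0, with X(0) = x0, X'(0) = 0,
# on the matched time grid t = k * sqrt(s).
h = np.sqrt(s)
X, V, t = x0.copy(), np.zeros_like(x0), h  # start at t = h to avoid 3/0
for _ in range(3000):
    V += h * (-(3.0 / t) * V - grad(X))
    X += h * V
    t += h

print("Nesterov:", x, "ODE:", X)  # both decay toward 0 at comparable rates
```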
no code implementations • 14 Jul 2014 • Małgorzata Bogdan, Ewout van den Berg, Chiara Sabatti, Weijie Su, Emmanuel J. Candès
SLOPE, short for Sorted L-One Penalized Estimation, is the solution to \[\min_{b\in\mathbb{R}^p}\frac{1}{2}\Vert y-Xb\Vert_{\ell_2}^2+\lambda_1\vert b\vert_{(1)}+\lambda_2\vert b\vert_{(2)}+\cdots+\lambda_p\vert b\vert_{(p)},\] where $\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_p\ge0$ and $\vert b\vert_{(1)}\ge\vert b\vert_{(2)}\ge\cdots\ge\vert b\vert_{(p)}$ are the decreasing absolute values of the entries of $b$.
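When solving this problem with proximal methods, the workhorse is the prox of the sorted-$\ell_1$ norm, computable in essentially sorting time by a pool-adjacent-violators pass, as in the algorithm of the SLOPE paper. Below is a compact sketch of that prox; the stack-based implementation detail is mine, though the algorithmic idea follows Bogdan et al. (2015).

```python
import numpy as np

def prox_sorted_l1(y, lam):
    # Prox of the sorted-L1 penalty: solve
    #   min_x 0.5 * ||x - y||^2 + sum_i lam_i * |x|_(i),
    # with lam non-increasing, via a pool-adjacent-violators pass.
    sign, y_abs = np.sign(y), np.abs(y)
    order = np.argsort(-y_abs)        # sort |y| in decreasing order
    v = y_abs[order] - lam            # may violate the decreasing constraint
    blocks = []                       # stack of [start, end, mean]
    for i, val in enumerate(v):
        blocks.append([i, i, val])
        # Merge while the previous block's mean is not larger (a violation).
        while len(blocks) > 1 and blocks[-2][2] <= blocks[-1][2]:
            s2, e2, m2 = blocks.pop()
            s1, e1, m1 = blocks.pop()
            n1, n2 = e1 - s1 + 1, e2 - s2 + 1
            blocks.append([s1, e2, (n1 * m1 + n2 * m2) / (n1 + n2)])
    x_sorted = np.empty_like(v)
    for s, e, m in blocks:
        x_sorted[s:e + 1] = max(m, 0.0)  # clip at zero
    x = np.empty_like(y_abs)
    x[order] = x_sorted                  # undo the sort
    return sign * x

lam = np.linspace(2.0, 0.5, 6)           # non-increasing penalty sequence
print(prox_sorted_l1(np.array([3.0, -1.0, 0.2, -4.0, 2.5, 0.1]), lam))
```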