1 code implementation • 15 Mar 2023 • Shuning Chang, Pichao Wang, Ming Lin, Fan Wang, David Junhao Zhang, Rong Jin, Mike Zheng Shou
In this work, we propose a novel Semantic Token ViT (STViT) for efficient global and local vision transformers, which can also be revised to serve as a backbone for downstream tasks.
no code implementations • 23 Feb 2023 • Tian Zhou, Peisong Niu, Xue Wang, Liang Sun, Rong Jin
The diversity and domain dependence of time series data pose significant challenges to transfer learning for time series forecasting.
1 code implementation • The Eleventh International Conference on Learning Representations (ICLR 2023) 2023 • Yifan Zhang, Xue Wang, Jian Liang, Zhang Zhang, Liang Wang, Rong Jin, Tieniu Tan
A fundamental challenge for machine learning models is how to generalize learned models for out-of-distribution (OOD) data.
Ranked #2 on Domain Adaptation on Office-Home
1 code implementation • 9 Jan 2023 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Zhengrong Zuo, Changxin Gao, Rong Jin, Nong Sang
To be specific, HyRSM++ consists of two key components, a hybrid relation module and a temporal set matching metric.
no code implementations • 1 Nov 2022 • Marios Papachristou, Rishab Goel, Frank Portman, Matthew Miller, Rong Jin
On the other hand, shallow (or node-level) models using ego features and adjacency embeddings work well in heterophilous graphs.
no code implementations • 26 Oct 2022 • Zhishuai Guo, Rong Jin, Jiebo Luo, Tianbao Yang
This problem has important applications in machine learning, e.g., AUROC maximization with a pairwise loss, and partial AUROC maximization with a compositional loss.
no code implementations • 9 Oct 2022 • Xinwei Zhang, Jianwen Jiang, Yutong Feng, Zhi-Fan Wu, Xibin Zhao, Hai Wan, Mingqian Tang, Rong Jin, Yue Gao
Although a number of studies are devoted to novel category discovery, most of them assume a static setting where both labeled and unlabeled data are given at once for finding new categories.
1 code implementation • 8 Oct 2022 • Yaohua Wang, Fangyi Zhang, Ming Lin, Senzhang Wang, Xiuyu Sun, Rong Jin
A natural way to construct a graph among images is to treat each image as a node and assign pairwise image similarities as weights to corresponding edges.
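To make the construction above concrete, here is a minimal sketch (not the authors' exact procedure) that builds a k-nearest-neighbor graph over image embeddings with cosine similarities as edge weights; the feature matrix and k are placeholder inputs.

```python
# Illustrative sketch: k-NN graph over image features with cosine-similarity weights.
import numpy as np

def build_knn_graph(features: np.ndarray, k: int = 10) -> np.ndarray:
    """features: (n, d) image embeddings; returns an (n, n) weighted adjacency matrix."""
    normed = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = normed @ normed.T                      # pairwise cosine similarities
    np.fill_diagonal(sim, -np.inf)               # exclude self-loops
    adj = np.zeros_like(sim)
    for i in range(sim.shape[0]):
        nbrs = np.argsort(sim[i])[-k:]           # indices of the k most similar images
        adj[i, nbrs] = sim[i, nbrs]
    return np.maximum(adj, adj.T)                # symmetrize the graph

# Example: 100 random 128-d embeddings, 5 neighbors per node
graph = build_knn_graph(np.random.randn(100, 128), k=5)
```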
no code implementations • 19 Sep 2022 • Yunwen Lei, Rong Jin, Yiming Ying
While significant theoretical progress has been achieved, unveiling the generalization mystery of overparameterized neural networks still remains largely elusive.
no code implementations • 24 Jun 2022 • Tian Zhou, Jianqing Zhu, Xue Wang, Ziqing Ma, Qingsong Wen, Liang Sun, Rong Jin
Various deep learning models, especially recent Transformer-based approaches, have greatly improved the state-of-the-art performance for long-term time series forecasting. However, these Transformer-based models suffer severe performance deterioration with prolonged input length, which prohibits them from using extended historical information. Moreover, these methods tend to handle complex examples in long-term forecasting with increased model complexity, which often leads to a significant increase in computation and less robust performance (e.g., overfitting).
no code implementations • 25 May 2022 • Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Xiangyang Ji, Antoni B. Chan
With our empirical results obtained from 1,330 models, we provide the following main observations: 1) ERM combined with data augmentation can achieve state-of-the-art performance if we choose a proper pre-trained model respecting the data property; 2) specialized algorithms further improve the robustness on top of ERM when handling a specific type of distribution shift, e.g., GroupDRO for spurious correlation and CORAL for large-scale out-of-distribution data; 3) comparing different pre-training modes, architectures and data sizes, we provide novel observations about pre-training on distribution shift, which sheds light on designing or selecting pre-training strategies for different kinds of distribution shifts.
1 code implementation • 18 May 2022 • Tian Zhou, Ziqing Ma, Xue Wang, Qingsong Wen, Liang Sun, Tao Yao, Wotao Yin, Rong Jin
Recent studies have shown that deep learning models such as RNNs and Transformers have brought significant performance gains for long-term forecasting of time series because they effectively utilize historical information.
Ranked #2 on Time Series Forecasting on ETTh1 (720)
1 code implementation • CVPR 2022 • Xiang Wang, Shiwei Zhang, Zhiwu Qing, Mingqian Tang, Zhengrong Zuo, Changxin Gao, Rong Jin, Nong Sang
To overcome the two limitations, we propose a novel Hybrid Relation guided Set Matching (HyRSM) approach that incorporates two key components: a hybrid relation module and a set matching metric.
2 code implementations • ACL 2022 • Xiangpeng Wei, Heng Yu, Yue Hu, Rongxiang Weng, Weihua Luo, Jun Xie, Rong Jin
Although data augmentation is widely used to enrich the training data, conventional methods with discrete manipulations fail to generate diverse and faithful training samples.
no code implementations • CVPR 2022 • Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, Yi Xu, Xiang Wang, Mingqian Tang, Changxin Gao, Rong Jin, Nong Sang
In this work, we aim to learn representations by leveraging more abundant information in untrimmed videos.
1 code implementation • CVPR 2022 • Zejiang Hou, Minghai Qin, Fei Sun, Xiaolong Ma, Kun Yuan, Yi Xu, Yen-Kuang Chen, Rong Jin, Yuan Xie, Sun-Yuan Kung
However, conventional pruning methods have limitations: they are restricted to the pruning process only, and they require a fully pre-trained large model.
no code implementations • 13 Feb 2022 • Bingxu Mu, Zhenxing Niu, Le Wang, Xue Wang, Rong Jin, Gang Hua
Deep neural networks (DNNs) are known to be vulnerable to both backdoor attacks and adversarial attacks.
2 code implementations • ICLR 2022 • Yichen Qian, Ming Lin, Xiuyu Sun, Zhiyu Tan, Rong Jin
One critical component in lossy deep image compression is the entropy model, which predicts the probability distribution of the quantized latent representation in the encoding and decoding modules.
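As a rough illustration of what an entropy model provides, the sketch below assigns a discretized-Gaussian probability to each quantized latent value and converts it into an estimated bit cost. This is a generic formulation under assumed per-element mean/scale parameters, not the model proposed in the paper.

```python
# Minimal sketch of the rate estimate used in learned image compression
# (illustrative only; real entropy models predict mean/scale with a network).
import numpy as np
from scipy.stats import norm

def estimated_bits(latent: np.ndarray, mean: np.ndarray, scale: np.ndarray) -> float:
    """Bits needed to encode round(latent) under a discretized Gaussian."""
    q = np.round(latent)                                   # quantized latent
    upper = norm.cdf(q + 0.5, loc=mean, scale=scale)
    lower = norm.cdf(q - 0.5, loc=mean, scale=scale)
    p = np.clip(upper - lower, 1e-12, 1.0)                 # per-symbol probability
    return float(-np.log2(p).sum())                        # total estimated bits

y = np.random.randn(16, 16) * 3.0
print(estimated_bits(y, mean=np.zeros_like(y), scale=np.ones_like(y) * 3.0))
```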
1 code implementation • 30 Jan 2022 • Tian Zhou, Ziqing Ma, Qingsong Wen, Xue Wang, Liang Sun, Rong Jin
Although Transformer-based methods have significantly improved state-of-the-art results for long-term series forecasting, they are not only computationally expensive but, more importantly, are unable to capture the global view of time series (e.g., the overall trend).
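For intuition about the "overall trend" that vanilla Transformer forecasters tend to miss, the toy sketch below decomposes a series into a smooth trend and a remainder with a simple moving average; this is only an illustration, not the decomposition used by the paper.

```python
# Toy illustration of extracting a series' overall trend with a moving average.
import numpy as np

def trend_and_residual(x: np.ndarray, window: int = 25):
    """Split a 1-D series into a smooth trend and the remaining seasonal/noise part."""
    pad = window // 2
    padded = np.pad(x, (pad, pad), mode="edge")
    kernel = np.ones(window) / window
    trend = np.convolve(padded, kernel, mode="valid")[: len(x)]
    return trend, x - trend

t = np.arange(500)
series = 0.01 * t + np.sin(2 * np.pi * t / 50) + 0.1 * np.random.randn(500)
trend, seasonal = trend_and_residual(series)
```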
1 code implementation • 23 Dec 2021 • Jingkai Zhou, Pichao Wang, Fan Wang, Qiong Liu, Hao Li, Rong Jin
Self-attention is powerful in modeling long-range dependencies, but it is weak in local finer-level feature learning.
Ranked #37 on Instance Segmentation on COCO minival
1 code implementation • CVPR 2022 • Benjia Zhou, Pichao Wang, Jun Wan, Yanyan Liang, Fan Wang, Du Zhang, Zhen Lei, Hao Li, Rong Jin
Decoupling spatiotemporal representation refers to decomposing the spatial and temporal features into dimension-independent factors.
no code implementations • 7 Dec 2021 • Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang
Although rigorous convergence analyses exist for Adam, they impose specific requirements on the update of the adaptive step size, which are not generic enough to cover many other variants of Adam.
1 code implementation • 2 Dec 2021 • Zhaoyuan Yin, Pichao Wang, Fan Wang, Xianzhe Xu, Hanling Zhang, Hao Li, Rong Jin
Unsupervised semantic segmentation aims to obtain high-level semantic representation on low-level visual features without manual annotations.
Ranked #1 on Unsupervised Semantic Segmentation on COCO-Stuff-171 (using extra training data)
1 code implementation • 26 Nov 2021 • Zhenhong Sun, Ming Lin, Xiuyu Sun, Zhiyu Tan, Hao Li, Rong Jin
Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of Neural Architecture Search (NAS).
Ranked #74 on Object Detection on COCO minival
no code implementations • 24 Nov 2021 • Ziquan Liu, Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Xiangyang Ji, Antoni Chan, Rong Jin
The generalization result of using pre-training data shows that the excess risk bound on a target task can be improved when the appropriate pre-training data is included in fine-tuning.
1 code implementation • 23 Nov 2021 • Hao Luo, Pichao Wang, Yi Xu, Feng Ding, Yanxin Zhou, Fan Wang, Hao Li, Rong Jin
We first investigate self-supervised learning (SSL) methods with Vision Transformer (ViT) pretrained on unlabelled person images (the LUPerson dataset), and empirically find it significantly surpasses ImageNet supervised pre-training models on ReID tasks.
Ranked #1 on Unsupervised Person Re-Identification on Market-1501 (using extra training data)
no code implementations • 17 Nov 2021 • Ming Yan, Haiyang Xu, Chenliang Li, Junfeng Tian, Bin Bi, Wei Wang, Weihua Chen, Xianzhe Xu, Fan Wang, Zheng Cao, Zhicheng Zhang, Qiyu Zhang, Ji Zhang, Songfang Huang, Fei Huang, Luo Si, Rong Jin
The Visual Question Answering (VQA) task utilizes both visual image and language analysis to answer a textual question with respect to an image.
Ranked #11 on Visual Question Answering (VQA) on VQA v2 test-dev
1 code implementation • 24 Oct 2021 • Niv Nayman, Yonathan Aflalo, Asaf Noy, Rong Jin, Lihi Zelnik-Manor
Practical use of neural networks often involves requirements on latency, energy and memory among others.
no code implementations • ICLR 2022 • Yutong Feng, Jianwen Jiang, Mingqian Tang, Rong Jin, Yue Gao
Though in most cases the pre-training stage is conducted with supervised methods, recent works on self-supervised pre-training have shown powerful transferability and even outperform supervised pre-training on multiple downstream tasks.
no code implementations • 29 Sep 2021 • Yang Liu, Zhipeng Zhou, Lei Shang, Baigui Sun, Hao Li, Rong Jin
Unsupervised domain adaptation (UDA) aims to transfer the knowledge from a labeled source domain to an unlabeled target domain.
no code implementations • 29 Sep 2021 • Hesen Chen, Ming Lin, Xiuyu Sun, Rong Jin
In this work, we propose a novel approach termed Hierarchical Cross Contrastive Learning (HCCL) to further distill the information mismatched by the conventional contrastive loss.
no code implementations • 29 Sep 2021 • Zhenhong Sun, Ming Lin, Zhiyu Tan, Xiuyu Sun, Rong Jin
Recent research attempts to reduce this cost by optimizing the backbone architecture with the help of Neural Architecture Search (NAS).
1 code implementation • ICLR 2022 • Tongkun Xu, Weihua Chen, Pichao Wang, Fan Wang, Hao Li, Rong Jin
Along with the pseudo labels, a weight-sharing triple-branch transformer framework is proposed to apply self-attention and cross-attention for source/target feature learning and source-target domain alignment, respectively.
Ranked #1 on Domain Adaptation on Office-31
no code implementations • 8 Sep 2021 • Pichao Wang, Xue Wang, Hao Luo, Jingkai Zhou, Zhipeng Zhou, Fan Wang, Hao Li, Rong Jin
In this paper, we further investigate this problem and extend the above conclusion: early convolutions alone do not account for the stable training; rather, the scaled ReLU operation in the convolutional stem (conv-stem) is what matters.
no code implementations • 1 Sep 2021 • Yi Xu, Lei Shang, Jinxing Ye, Qi Qian, Yu-Feng Li, Baigui Sun, Hao Li, Rong Jin
In this work we develop a simple yet powerful framework, whose key idea is to select a subset of training examples from the unlabeled data when performing existing SSL methods so that only the unlabeled examples with pseudo labels related to the labeled data will be used to train models.
1 code implementation • 24 Aug 2021 • Zhiwu Qing, Ziyuan Huang, Shiwei Zhang, Mingqian Tang, Changxin Gao, Marcelo H. Ang Jr, Rong Jin, Nong Sang
The visualizations show that ParamCrop adaptively controls the center distance and the IoU between two augmented views, and the learned change in the disparity along the training process is beneficial to learning a strong representation.
1 code implementation • 5 Jul 2021 • Yuqi Zhang, Qian Qi, Chong Liu, Weihua Chen, Fan Wang, Hao Li, Rong Jin
In this work, we propose a graph-based re-ranking method to improve learned features while still keeping Euclidean distance as the similarity metric.
no code implementations • CVPR 2021 • Liuyihan Song, Kang Zhao, Pan Pan, Yu Liu, Yingya Zhang, Yinghui Xu, Rong Jin
Different from all of them, we regard large and small gradients selection as the exploitation and exploration of gradient information, respectively.
1 code implementation • ICLR 2022 • Xiaolong Ma, Minghai Qin, Fei Sun, Zejiang Hou, Kun Yuan, Yi Xu, Yanzhi Wang, Yen-Kuang Chen, Rong Jin, Yuan Xie
It addresses the shortcomings of the previous works by repeatedly growing a subset of layers to dense and then pruning them back to sparse after some training.
1 code implementation • 28 May 2021 • Pichao Wang, Xue Wang, Fan Wang, Ming Lin, Shuning Chang, Hao Li, Rong Jin
A key component in vision transformers is the fully-connected self-attention which is more powerful than CNNs in modelling long range dependencies.
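For reference, a minimal single-head scaled dot-product self-attention over token embeddings looks like the following; the token count and projection sizes are arbitrary placeholders.

```python
# Minimal scaled dot-product self-attention; vision transformers apply this
# (with multiple heads) over patch embeddings.
import numpy as np

def self_attention(x: np.ndarray, w_q: np.ndarray, w_k: np.ndarray, w_v: np.ndarray) -> np.ndarray:
    """x: (n_tokens, d_model); w_q/w_k/w_v: (d_model, d_head)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (n_tokens, n_tokens) affinities
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True) # row-wise softmax
    return weights @ v                             # every token attends to all tokens

tokens = np.random.randn(196, 64)                  # e.g., 14x14 patch embeddings
w = [np.random.randn(64, 32) * 0.1 for _ in range(3)]
out = self_attention(tokens, *w)
```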
1 code implementation • CVPR 2022 • Qi Qian, Yuanhong Xu, Juhua Hu, Hao Li, Rong Jin
Clustering is to assign each instance a pseudo label that will be used to learn representations in discrimination.
Ranked #36 on Self-Supervised Image Classification on ImageNet
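The clustering step described in the entry above, reduced to its generic form (cluster the embeddings, then use cluster ids as pseudo labels for a discrimination loss), can be sketched as follows; this is a simplified assumption-based sketch, not the paper's exact procedure.

```python
# Generic "cluster, then treat cluster ids as pseudo labels" sketch.
import numpy as np
from sklearn.cluster import KMeans

def pseudo_labels(embeddings: np.ndarray, n_clusters: int = 100) -> np.ndarray:
    """Assign each instance a pseudo label via k-means on its current embedding."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    return km.fit_predict(embeddings)              # labels usable by a classification loss

labels = pseudo_labels(np.random.randn(1000, 128), n_clusters=10)
```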
no code implementations • 13 May 2021 • Yi Xu, Qi Qian, Hao Li, Rong Jin
Stochastic gradient descent (SGD) has become the most attractive optimization method in training large-scale deep neural networks due to its simplicity, low computational cost in each updating step, and good performance.
no code implementations • 30 Apr 2021 • Zhishuai Guo, Yi Xu, Wotao Yin, Rong Jin, Tianbao Yang
Our analysis exhibits that an increasing or large enough "momentum" parameter for the first-order moment used in practice is sufficient to ensure Adam and its many variants converge under a mild boundedness condition on the adaptive scaling factor of the step size.
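For reference, the standard Adam update referred to above is the following (bias correction omitted), with $\beta_1$ the first-moment "momentum" parameter and $g_t$ the stochastic gradient:

\[
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1-\beta_1)\, g_t, \\
v_t &= \beta_2 v_{t-1} + (1-\beta_2)\, g_t^2, \\
x_{t+1} &= x_t - \eta_t\, \frac{m_t}{\sqrt{v_t} + \epsilon}.
\end{aligned}
\]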
no code implementations • CVPR 2021 • Li Hu, Peng Zhang, Bang Zhang, Pan Pan, Yinghui Xu, Rong Jin
To address this limitation, we propose to Learn position and target Consistency framework for Memory-based video object segmentation, termed as LCM.
Tasks: One-shot Visual Object Segmentation, Semantic Segmentation
no code implementations • 8 Apr 2021 • Yi Xu, Qi Qian, Hao Li, Rong Jin
Noisy labels are very common in deep supervised learning.
1 code implementation • CVPR 2021 • Lianghua Huang, Yu Liu, Bin Wang, Pan Pan, Yinghui Xu, Rong Jin
A key challenge in self-supervised video representation learning is how to effectively capture motion information besides context bias.
1 code implementation • CVPR 2021 • Ziyuan Huang, Shiwei Zhang, Jianwen Jiang, Mingqian Tang, Rong Jin, Marcelo Ang
We furthermore introduce a static mask in pseudo motions to create local motion patterns, which forces the model to additionally locate notable motion areas for the correct classification. We demonstrate that MoSI can discover regions with large motion even without fine-tuning on the downstream datasets.
no code implementations • 9 Feb 2021 • Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Yingya Zhang, Xiaofeng Ren, Rong Jin
We hope visual search at Alibaba becomes more widely incorporated into today's commercial applications.
no code implementations • 9 Feb 2021 • Liuyihan Song, Pan Pan, Kang Zhao, Hao Yang, Yiming Chen, Yingya Zhang, Yinghui Xu, Rong Jin
In the last decade, extreme classification has become an essential topic in deep learning.
no code implementations • 9 Feb 2021 • Kang Zhao, Pan Pan, Yun Zheng, Yanhao Zhang, Changxu Wang, Yingya Zhang, Yinghui Xu, Rong Jin
For a deployed visual search system with several billions of online images in total, building a billion-scale offline graph in hours is essential, which is almost unachievable by most existing methods.
no code implementations • 9 Feb 2021 • Yu Liu, Lianghua Huang, Pan Pan, Bin Wang, Yinghui Xu, Rong Jin
However, scaling up the classification task from thousands of semantic labels to millions of instance labels brings specific challenges including 1) the large-scale softmax computation; 2) the slow convergence due to the infrequent visiting of instance samples; and 3) the massive number of negative classes that can be noisy.
no code implementations • 9 Feb 2021 • Xiangzeng Zhou, Pan Pan, Yun Zheng, Yinghui Xu, Rong Jin
In this paper, we present a novel side-information-based large-scale visual recognition co-training (SICoT) system to deal with the long-tail problem by leveraging image-related side information.
no code implementations • 9 Feb 2021 • Yanhao Zhang, Pan Pan, Yun Zheng, Kang Zhao, Jianmin Wu, Yinghui Xu, Rong Jin
Benefiting from exploration of user click data, our networks are more effective to encode richer supervision and better distinguish real-shot images in terms of category and feature.
2 code implementations • 1 Feb 2021 • Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin
Compared with previous NAS methods, the proposed Zen-NAS is orders of magnitude faster on multiple server-side and mobile-side GPU platforms, with state-of-the-art accuracy on ImageNet.
Ranked #2 on Neural Architecture Search on ImageNet
no code implementations • 12 Jan 2021 • Asaf Noy, Yi Xu, Yonathan Aflalo, Lihi Zelnik-Manor, Rong Jin
We show that convergence to a global minimum is guaranteed for networks with widths quadratic in the sample size and linear in their depth, in time logarithmic in both.
no code implementations • 4 Jan 2021 • Rong Jin, Weili Wu
Recent years have seen various rumor diffusion models assumed in research on rumor source detection in online social networks.
2 code implementations • ICCV 2021 • Ming Lin, Pichao Wang, Zhenhong Sun, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin
To address this issue, instead of using an accuracy predictor, we propose a novel zero-shot index dubbed Zen-Score to rank the architectures.
no code implementations • 13 Dec 2020 • Qi Qi, Yi Xu, Rong Jin, Wotao Yin, Tianbao Yang
In this paper, we present a simple yet effective method (ABSGD) for addressing the data imbalance issue in deep learning.
no code implementations • 10 Dec 2020 • Liang Han, Zhaozheng Yin, Zhurong Xia, Mingqian Tang, Rong Jin
The goal of price prediction is to help sellers set effective and reasonable prices for their second-hand items with the images and text descriptions uploaded to the online platforms.
no code implementations • 10 Dec 2020 • Liang Han, Zhaozheng Yin, Zhurong Xia, Li Guo, Mingqian Tang, Rong Jin
Then, we design a vision-based price suggestion module that takes the extracted visual features, along with statistical item features from the shopping platform, as inputs. A binary classification model determines whether an uploaded item image is qualified for price suggestion, and a regression model provides price suggestions for items with qualified images.
2 code implementations • ICLR 2021 • Yichen Qian, Zhiyu Tan, Xiuyu Sun, Ming Lin, Dongyang Li, Zhenhong Sun, Hao Li, Rong Jin
In this work, we propose a novel Global Reference Model for image compression to effectively leverage both the local and the global context information, leading to an enhanced compression rate.
no code implementations • 3 Oct 2020 • Yi Xu, Asaf Noy, Ming Lin, Qi Qian, Hao Li, Rong Jin
To this end, we develop two novel algorithms, termed "AugDrop" and "MixLoss", to correct the data bias in the data augmentation.
2 code implementations • 24 Jun 2020 • Ming Lin, Hesen Chen, Xiuyu Sun, Qi Qian, Hao Li, Rong Jin
To address this issue, we propose a general principle for designing GPU-efficient networks based on extensive empirical studies.
no code implementations • 20 Jun 2020 • Yi Xu, Yuanhong Xu, Qi Qian, Hao Li, Rong Jin
Label smoothing regularization (LSR) has a great success in training deep neural networks by stochastic algorithms such as stochastic gradient descent and its variants.
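In its standard form, label smoothing mixes the one-hot target with a uniform distribution over the $K$ classes; a small sketch of the resulting smoothed cross-entropy is given below (the generic formulation, not tied to the paper's analysis).

```python
# Label smoothing regularization: mix the one-hot target with a uniform
# distribution over the K classes, then take the cross-entropy.
import numpy as np

def smooth_cross_entropy(logits: np.ndarray, target: int, eps: float = 0.1) -> float:
    """logits: (K,); target: true class index; eps: smoothing strength."""
    k = logits.shape[0]
    log_probs = logits - logits.max() - np.log(np.exp(logits - logits.max()).sum())
    soft_target = np.full(k, eps / k)
    soft_target[target] += 1.0 - eps              # (1 - eps) extra mass on the true class
    return float(-(soft_target * log_probs).sum())

print(smooth_cross_entropy(np.array([2.0, 0.5, -1.0]), target=0))
```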
1 code implementation • NeurIPS 2021 • Qi Qi, Zhishuai Guo, Yi Xu, Rong Jin, Tianbao Yang
In this paper, we propose a practical online method for solving a class of distributionally robust optimization (DRO) with non-convex objectives, which has important applications in machine learning for improving the robustness of neural networks.
1 code implementation • ICCV 2021 • Yuanhong Xu, Qi Qian, Hao Li, Rong Jin, Juhua Hu
To mitigate this challenge, we propose an algorithm to learn the fine-grained patterns for the target task, when only its coarse-class labels are available.
5 code implementations • ICCV 2019 • Qi Qian, Lei Shang, Baigui Sun, Juhua Hu, Hao Li, Rong Jin
The set of triplet constraints has to be sampled within the mini-batch.
Ranked #18 on
Metric Learning
on CUB-200-2011
(using extra training data)
1 code implementation • CVPR 2020 • Qi Qian, Lei Chen, Hao Li, Rong Jin
This architecture is efficient but can suffer from the imbalance issue with respect to two aspects: the inter-class imbalance between the number of candidates from foreground and background classes and the intra-class imbalance in the hardness of background candidates, where only a few candidates are hard to be identified.
2 code implementations • NeurIPS 2019 • Niv Nayman, Asaf Noy, Tal Ridnik, Itamar Friedman, Rong Jin, Lihi Zelnik-Manor
This paper introduces a novel optimization method for differential neural architecture search, based on the theory of prediction with expert advice.
no code implementations • 3 Jun 2019 • Ming Lin, Xiaomin Song, Qi Qian, Hao Li, Liang Sun, Shenghuo Zhu, Rong Jin
We validate the superiority of the proposed method in our real-time high precision positioning system against several popular state-of-the-art robust regression methods.
no code implementations • 10 May 2019 • Hao Yu, Rong Jin
We show that for stochastic non-convex optimization under the P-L condition, the classical data-parallel SGD with exponentially increasing batch sizes can achieve the fastest known $O(1/(NT))$ convergence with linear speedup using only $\log(T)$ communication rounds.
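A toy simulation of the schedule described above, under assumed settings (a simple quadratic objective, a fixed step size, and a hypothetical grad_fn), shows how doubling the batch size each round consumes T samples in roughly log2(T) communication (averaging) rounds.

```python
# Toy sketch: data-parallel SGD with exponentially increasing batch sizes.
import numpy as np

def exp_batch_sgd(grad_fn, x0, total_samples=1 << 14, workers=4, lr=0.1):
    x, used, rounds, batch = np.array(x0, dtype=float), 0, 0, 1
    while used < total_samples:
        # each worker simulates a gradient averaged over `batch` samples
        local = [grad_fn(x, batch) for _ in range(workers)]
        x -= lr * np.mean(local, axis=0)          # one communication per round
        used += batch * workers
        batch *= 2                                # exponentially increasing batch size
        rounds += 1
    return x, rounds

# noisy gradient of ||x||^2, noise shrinking with the batch size
grad = lambda x, b: 2 * x + np.random.randn(*x.shape) / np.sqrt(b)
x_final, n_rounds = exp_batch_sgd(grad, x0=np.ones(5))
print(n_rounds)  # grows like log2(total_samples)
```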
no code implementations • 9 May 2019 • Hao Yu, Rong Jin, Sen Yang
Recent developments on large-scale distributed machine learning applications, e.g., deep neural networks, benefit enormously from the advances in distributed non-convex optimization techniques, e.g., distributed Stochastic Gradient Descent (SGD).
no code implementations • 19 Mar 2019 • Rong Jin, David Simchi-Levi, Li Wang, Xinshang Wang, Sen Yang
In this paper, we study algorithms for dynamically identifying a large number of products (i.e., SKUs) with top customer purchase probabilities on the fly, from an ocean of potential products to offer on retailers' ultra-fast delivery platforms.
no code implementations • 30 Jan 2019 • Ming Lin, Shuang Qiu, Jieping Ye, Xiaomin Song, Qi Qian, Liang Sun, Shenghuo Zhu, Rong Jin
This bound is sub-optimal compared to the information-theoretic lower bound $\mathcal{O}(kd)$.
no code implementations • NeurIPS 2019 • Zhuoning Yuan, Yan Yan, Rong Jin, Tianbao Yang
For convex loss functions and two classes of "nice-behaviored" non-convex objectives that are close to a convex function, we establish faster convergence of stagewise training than the vanilla SGD under the PL condition on both training error and testing error.
no code implementations • 28 Nov 2018 • Yi Xu, Qi Qi, Qihang Lin, Rong Jin, Tianbao Yang
In this paper, we propose new stochastic optimization algorithms and study their first-order convergence theories for solving a broad family of DC functions.
no code implementations • CVPR 2018 • Qi Qian, Jiasheng Tang, Hao Li, Shenghuo Zhu, Rong Jin
Furthermore, we can show that the metric is learned from latent examples only, but it can preserve the large margin property even for the original data.
no code implementations • 21 May 2018 • Yi Xu, Shenghuo Zhu, Sen Yang, Chi Zhang, Rong Jin, Tianbao Yang
Learning with a convex loss function has been a dominant paradigm for many years.
no code implementations • 19 May 2018 • Qi Qian, Shenghuo Zhu, Jiasheng Tang, Rong Jin, Baigui Sun, Hao Li
Hence, we propose to learn the model and the adversarial distribution simultaneously with the stochastic algorithm for efficiency.
Tasks: BIG-bench Machine Learning, Fine-Grained Visual Categorization
no code implementations • NeurIPS 2018 • Mingrui Liu, Xiaoxuan Zhang, Lijun Zhang, Rong Jin, Tianbao Yang
Error bound conditions (EBC) are properties that characterize the growth of an objective function when a point is moved away from the optimal set.
no code implementations • 8 May 2018 • Mingdong Ou, Nan Li, Shenghuo Zhu, Rong Jin
In each round, the player selects a $K$-cardinality subset from $N$ candidate items, and receives a reward which is governed by a multinomial logit (MNL) choice model considering both item utility and substitution property among items.
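For readers unfamiliar with the MNL choice model, the sketch below computes the standard choice probabilities for an offered assortment, including the no-purchase option; the utility values are placeholders.

```python
# Standard multinomial logit (MNL) choice probabilities: item i is chosen with
# probability exp(u_i) / (1 + sum_j exp(u_j)); the "1" is the no-purchase option.
import numpy as np

def mnl_choice_probs(utilities: np.ndarray) -> np.ndarray:
    """Returns choice probabilities for each offered item plus no-purchase (last entry)."""
    expu = np.exp(utilities)
    denom = 1.0 + expu.sum()
    return np.append(expu / denom, 1.0 / denom)

probs = mnl_choice_probs(np.array([0.4, 0.1, -0.3]))   # a 3-item assortment
print(probs, probs.sum())                               # sums to 1
```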
no code implementations • 4 Dec 2017 • Yi Xu, Rong Jin, Tianbao Yang
Accelerated gradient (AG) methods are breakthroughs in convex optimization, improving the convergence rate of the gradient descent method for optimization with smooth functions.
no code implementations • NeurIPS 2018 • Yi Xu, Rong Jin, Tianbao Yang
Two classes of methods have been proposed for escaping from saddle points with one using the second-order information carried by the Hessian and the other adding the noise into the first-order information.
no code implementations • 24 Jul 2017 • Cong Leng, Hao Li, Shenghuo Zhu, Rong Jin
Although deep learning models are highly effective for various learning tasks, their high computational costs prohibit the deployment to scenarios where either memory or computational resources are limited.
no code implementations • CVPR 2017 • Luan Tran, Xiaoming Liu, Jiayu Zhou, Rong Jin
To leverage the valuable information in the corrupted data, we propose to impute the missing data by leveraging the relatedness among different modalities.
no code implementations • 7 Feb 2017 • Lijun Zhang, Tianbao Yang, Rong Jin
First, we establish an $\widetilde{O}(d/n + \sqrt{F_*/n})$ risk bound when the random function is nonnegative, convex and smooth, and the expected function is Lipschitz continuous, where $d$ is the dimensionality of the problem, $n$ is the number of samples, and $F_*$ is the minimal risk.
no code implementations • ICML 2018 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
To cope with changing environments, recent developments in online learning have introduced the concepts of adaptive regret and dynamic regret independently.
no code implementations • NeurIPS 2017 • Lijun Zhang, Tianbao Yang, Jin-Feng Yi, Rong Jin, Zhi-Hua Zhou
When multiple gradients are accessible to the learner, we first demonstrate that the dynamic regret of strongly convex functions can be upper bounded by the minimum of the path-length and the squared path-length.
no code implementations • 16 May 2016 • Tianbao Yang, Lijun Zhang, Rong Jin, Jin-Feng Yi
Secondly, we present a lower bound with noisy gradient feedback and then show that we can achieve optimal dynamic regrets under a stochastic gradient feedback and two-point bandit feedback.
no code implementations • 6 Dec 2015 • Qi Qian, Inci M. Baytas, Rong Jin, Anil Jain, Shenghuo Zhu
The similarity between pairs of images can be measured by the distances between their high dimensional representations, and the problem of learning the appropriate similarity is often addressed by distance metric learning.
no code implementations • 12 Nov 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
In this paper, we develop a randomized algorithm and theory for learning a sparse model from large-scale and high-dimensional data, which is usually formulated as an empirical risk minimization problem with a sparsity-inducing regularizer.
no code implementations • 5 Nov 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
In this paper, we utilize stochastic optimization to reduce the space complexity of convex composite optimization with a nuclear norm regularizer, where the variable is a matrix of size $m \times n$.
no code implementations • 25 Sep 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
In this paper, we study a special bandit setting of online stochastic linear optimization, where only one-bit of information is revealed to the learner at each round.
no code implementations • 15 Sep 2015 • Qi Qian, Rong Jin, Lijun Zhang, Shenghuo Zhu
In this work, we present a dual random projection frame for DML with high dimensional data that explicitly addresses the limitation of dimensionality reduction for DML.
no code implementations • 18 Jul 2015 • Tianbao Yang, Lijun Zhang, Qihang Lin, Rong Jin
In this paper, we study a fast approximation method for the large-scale, high-dimensional sparse least-squares regression problem by exploiting Johnson-Lindenstrauss (JL) transforms, which embed a set of high-dimensional vectors into a low-dimensional space.
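A generic sketch-and-solve illustration of a JL-style projection is shown below (a Gaussian sketch applied to plain least squares, without the sparsity-inducing regularizer studied in the paper).

```python
# Sketch-and-solve least squares with a Gaussian (JL-style) random projection.
import numpy as np

def sketched_least_squares(A: np.ndarray, b: np.ndarray, m: int) -> np.ndarray:
    """Solve min_x ||S A x - S b|| with a Gaussian sketch S of m rows."""
    n = A.shape[0]
    S = np.random.randn(m, n) / np.sqrt(m)       # JL-style Gaussian projection
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x

A = np.random.randn(5000, 50)
b = A @ np.random.randn(50) + 0.01 * np.random.randn(5000)
x_hat = sketched_least_squares(A, b, m=400)      # much smaller system than 5000 rows
```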
no code implementations • 4 May 2015 • Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu
In this paper, we consider the problem of column subset selection.
no code implementations • 26 Apr 2015 • Lijun Zhang, Tianbao Yang, Rong Jin, Zhi-Hua Zhou
To the best of our knowledge, this is the first time such a relative bound has been proved for the regularized formulation of matrix completion.
no code implementations • 15 Apr 2015 • Tianbao Yang, Lijun Zhang, Rong Jin, Shenghuo Zhu
In this paper, we study randomized reduction methods, which reduce high-dimensional features into low-dimensional space by randomized methods (e.g., random projection, random hashing), for large-scale high-dimensional classification.
no code implementations • NeurIPS 2014 • Tianbao Yang, Rong Jin
In this work, we study the problem of transductive pairwise classification from pairwise similarities (which are usually derived from some side information instead of the underlying class labels).
no code implementations • 4 Nov 2014 • Miao Xu, Rong Jin, Zhi-Hua Zhou
In particular, the proposed algorithm computes the low rank approximation of the target matrix based on (i) the randomly sampled rows and columns, and (ii) a subset of observed entries that are randomly sampled from the matrix.
no code implementations • NeurIPS 2014 • Nan Li, Rong Jin, Zhi-Hua Zhou
Recent efforts of bipartite ranking are focused on optimizing ranking accuracy at the top of the ranked list.
no code implementations • 13 Aug 2014 • Tianbao Yang, Rong Jin, Shenghuo Zhu, Qihang Lin
In this work, we study data preconditioning, a well-known and long-established technique, for boosting the convergence of first-order methods for regularized loss minimization.
no code implementations • 22 Mar 2014 • Rong Jin, Shenghuo Zhu
Our goal is to develop a low rank approximation algorithm, similar to CUR, based on (i) randomly sampled rows and columns from A, and (ii) randomly sampled entries from A.
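The classical CUR-style construction from randomly sampled rows and columns, which the entry above builds on (the paper additionally uses randomly sampled entries), can be sketched as:

```python
# CUR-style low-rank approximation from randomly sampled rows and columns.
import numpy as np

def cur_approx(A: np.ndarray, c: int, r: int, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    cols = rng.choice(A.shape[1], size=c, replace=False)
    rows = rng.choice(A.shape[0], size=r, replace=False)
    C, R = A[:, cols], A[rows, :]
    U = np.linalg.pinv(A[np.ix_(rows, cols)])    # pseudo-inverse of the intersection block
    return C @ U @ R                             # rank <= min(c, r) approximation

A = np.outer(np.arange(1, 101), np.arange(1, 81)) + 0.01 * np.random.randn(100, 80)
err = np.linalg.norm(A - cur_approx(A, c=10, r=10)) / np.linalg.norm(A)
print(err)  # small relative error for this near-rank-1 matrix
```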
no code implementations • 16 Feb 2014 • Radha Chitta, Rong Jin, Timothy C. Havens, Anil K. Jain
Kernel-based clustering algorithms have the ability to capture the non-linear structure in real world data.
no code implementations • 7 Feb 2014 • Mehrdad Mahdavi, Lijun Zhang, Rong Jin
In statistical learning theory, convex surrogates of the 0-1 loss are highly preferred because of the computational and theoretical virtues that convexity brings in.
no code implementations • CVPR 2015 • Qi Qian, Rong Jin, Shenghuo Zhu, Yuanqing Lin
To this end, we propose a multi-stage metric learning framework that divides the large-scale, high-dimensional learning problem into a series of simple subproblems, achieving $\mathcal{O}(d)$ computational complexity.
no code implementations • 18 Jan 2014 • Mehrdad Mahdavi, Rong Jin
The overarching goal of this paper is to derive excess risk bounds for learning from exp-concave loss functions in passive and sequential learning settings.
no code implementations • 4 Dec 2013 • Tianbao Yang, Shenghuo Zhu, Rong Jin, Yuanqing Lin
Extraordinary performance has been observed and reported for the well-motivated updates, referred to as the practical updates, compared to the naive updates.
no code implementations • NeurIPS 2013 • Mehrdad Mahdavi, Lijun Zhang, Rong Jin
It is well known that the optimal convergence rate for stochastic optimization of smooth functions is $O(1/\sqrt{T})$, which is the same as for stochastic optimization of Lipschitz continuous convex functions.
no code implementations • NeurIPS 2013 • Miao Xu, Rong Jin, Zhi-Hua Zhou
In standard matrix completion theory, it is required to have at least $O(n\ln^2 n)$ observed entries to perfectly recover a low-rank matrix $M$ of size $n\times n$, leading to a large number of observations when $n$ is large.
no code implementations • NeurIPS 2013 • Mehrdad Mahdavi, Tianbao Yang, Rong Jin
It leverages the theory of the Lagrangian method in constrained optimization and attains the optimal convergence rate of $O(1/\sqrt{T})$ in high probability for general Lipschitz continuous objectives.
no code implementations • NeurIPS 2013 • Lijun Zhang, Mehrdad Mahdavi, Rong Jin
For smooth and strongly convex optimization, the optimal iteration complexity of the gradient-based algorithm is $O(\sqrt{\kappa}\log 1/\epsilon)$, where $\kappa$ is the condition number.
no code implementations • 30 Nov 2013 • Rong Jin
In this paper, we first prove a high probability bound rather than an expectation bound for stochastic optimization with smooth loss.
no code implementations • 19 Nov 2013 • Lijun Zhang, Mehrdad Mahdavi, Rong Jin
Under the assumption that the norm of the optimal classifier that minimizes the convex risk is available, our analysis shows that the introduction of the convex surrogate loss yields an exponential reduction in the label complexity even when the parameter $\kappa$ of the Tsybakov noise is larger than $1$.
no code implementations • 26 Jul 2013 • Mehrdad Mahdavi, Rong Jin
It is well known that the optimal convergence rate for stochastic optimization of smooth functions is $O(1/\sqrt{T})$, which is the same as for stochastic optimization of Lipschitz continuous convex functions.
no code implementations • CVPR 2013 • Yue Lin, Rong Jin, Deng Cai, Shuicheng Yan, Xuelong Li
Recent studies have shown that hashing methods are effective for high dimensional nearest neighbor search.
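One classical hashing scheme for approximate nearest neighbor search, random-hyperplane (sign) hashing, is sketched below for intuition; it is not claimed to be the method studied in the paper.

```python
# Random-hyperplane hashing: map vectors to binary codes via signs of random projections;
# similar vectors tend to share bits, so Hamming distance ranks candidates cheaply.
import numpy as np

def hash_codes(X: np.ndarray, n_bits: int = 32, seed: int = 0) -> np.ndarray:
    """Map each row of X to an n_bits binary code."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((X.shape[1], n_bits))
    return (X @ planes > 0).astype(np.uint8)

db = np.random.randn(10000, 128)
codes = hash_codes(db)
query_code = hash_codes(np.random.randn(1, 128))   # same seed -> same hyperplanes
hamming = (codes != query_code).sum(axis=1)        # rank database items by Hamming distance
candidates = np.argsort(hamming)[:50]
```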
no code implementations • 7 May 2013 • Wei Gao, Rong Jin, Shenghuo Zhu, Zhi-Hua Zhou
AUC is an important performance measure and many algorithms have been devoted to AUC optimization, mostly by minimizing a surrogate convex loss on a training data set.
no code implementations • 3 Apr 2013 • Qi Qian, Rong Jin, Jin-Feng Yi, Lijun Zhang, Shenghuo Zhu
Although stochastic gradient descent (SGD) has been successfully applied to improve the efficiency of DML, it can still be computationally expensive because in order to ensure that the solution is a PSD matrix, it has to, at every iteration, project the updated distance metric onto the PSD cone, an expensive operation.
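The per-iteration PSD-cone projection referred to above is an eigendecomposition with negative eigenvalues clipped to zero, which is what makes each update expensive for high-dimensional metrics; a minimal version:

```python
# Projection of a symmetric matrix onto the PSD cone by clipping negative eigenvalues.
import numpy as np

def project_psd(M: np.ndarray) -> np.ndarray:
    """Project a (possibly asymmetric) matrix onto the cone of PSD matrices."""
    sym = (M + M.T) / 2.0
    vals, vecs = np.linalg.eigh(sym)             # O(d^3) eigendecomposition per iteration
    return (vecs * np.maximum(vals, 0.0)) @ vecs.T

M = np.random.randn(50, 50)
M_psd = project_psd(M)
print(np.linalg.eigvalsh(M_psd).min() >= -1e-9)  # all eigenvalues are nonnegative
```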
no code implementations • 2 Apr 2013 • Lijun Zhang, Tianbao Yang, Rong Jin, Xiaofei He
Traditional algorithms for stochastic optimization require projecting the solution at each iteration into a given domain to ensure its feasibility.
no code implementations • 8 Feb 2013 • Mehrdad Mahdavi, Rong Jin
In this paper, we consider learning in the passive setting, but with a slight modification.
no code implementations • NeurIPS 2012 • Jinfeng Yi, Rong Jin, Shaili Jain, Tianbao Yang, Anil K. Jain
One difficulty in learning the pairwise similarity measure is that there is a significant amount of noise and inter-worker variations in the manual annotations obtained via crowdsourcing.
no code implementations • NeurIPS 2012 • Tianbao Yang, Yu-Feng Li, Mehrdad Mahdavi, Rong Jin, Zhi-Hua Zhou
Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning.
no code implementations • NeurIPS 2012 • Mehrdad Mahdavi, Tianbao Yang, Rong Jin, Shenghuo Zhu, Jin-Feng Yi
Although many variants of stochastic gradient descent have been proposed for large-scale convex optimization, most of them require projecting the solution at each iteration to ensure that the obtained solution stays within the feasible domain.
no code implementations • 26 Nov 2012 • Mehrdad Mahdavi, Tianbao Yang, Rong Jin
We first propose a projection based algorithm which attains an $O(T^{-1/3})$ convergence rate.
no code implementations • 13 Nov 2012 • Lijun Zhang, Mehrdad Mahdavi, Rong Jin, Tianbao Yang, Shenghuo Zhu
Random projection has been widely used in data classification.
no code implementations • 27 Jun 2012 • Ming Ji, Tianbao Yang, Binbin Lin, Rong Jin, Jiawei Han
In this work, we develop a simple algorithm for semi-supervised regression.
no code implementations • 15 Mar 2012 • Kaizhu Huang, Rong Jin, Zenglin Xu, Cheng-Lin Liu
Most existing distance metric learning methods assume perfect side information that is usually given in pairwise or triplet constraints.
no code implementations • 24 Jan 2012 • Tianbao Yang, Mehrdad Mahdavi, Rong Jin, Shenghuo Zhu
We study the non-smooth optimization problems in machine learning, where both the loss function and the regularizer are non-smooth functions.
no code implementations • NeurIPS 2010 • Serhat Bucak, Rong Jin, Anil K. Jain
Recent studies have shown that multiple kernel learning is very effective for object recognition, leading to the popularity of kernel learning in computer vision problems.
no code implementations • NeurIPS 2010 • Sheng-Jun Huang, Rong Jin, Zhi-Hua Zhou
Most active learning approaches select either informative or representative unlabeled instances to query their labels.
no code implementations • NeurIPS 2009 • Lei Wu, Rong Jin, Steven C. Hoi, Jianke Zhu, Nenghai Yu
Learning distance functions with side information plays a key role in many machine learning and data mining applications.
no code implementations • NeurIPS 2009 • Zenglin Xu, Rong Jin, Jianke Zhu, Irwin King, Michael Lyu, Zhirong Yang
In this framework, SVM and TSVM can be regarded as a learning machine without regularization and one with full regularization from the unlabeled data, respectively.
no code implementations • NeurIPS 2009 • Peilin Zhao, Steven C. Hoi, Rong Jin
This is clearly insufficient since when a new misclassified example is added to the pool of support vectors, we generally expect it to affect the weights for the existing support vectors.
no code implementations • NeurIPS 2009 • Rong Jin, Shijun Wang, Yang Zhou
In this paper, we examine the generalization error of regularized distance metric learning.
no code implementations • NeurIPS 2009 • Hamed Valizadegan, Rong Jin, Ruofei Zhang, Jianchang Mao
Learning to rank is a relatively new field of study, aiming to learn a ranking function from a set of training data with relevancy labels.
no code implementations • NeurIPS 2008 • Liu Yang, Rong Jin, Rahul Sukthankar
For empirical evaluation, we present a direct comparison with a number of state-of-the-art methods for inductive semi-supervised learning and text categorization; and we show that SSLW results in a significant improvement in categorization accuracy, equipped with a small training set and an unlabeled resource that is weakly related to the test beds.
no code implementations • NeurIPS 2008 • Zenglin Xu, Rong Jin, Irwin King, Michael Lyu
We consider the problem of multiple kernel learning (MKL), which can be formulated as a convex-concave problem.
no code implementations • NeurIPS 2008 • Shuiwang Ji, Liang Sun, Rong Jin, Jieping Ye
We present a multi-label multiple kernel learning (MKL) formulation, in which the data are embedded into a low-dimensional space directed by the instance-label correlations encoded into a hypergraph.
no code implementations • NeurIPS 2007 • Zenglin Xu, Rong Jin, Jianke Zhu, Irwin King, Michael Lyu
We consider the problem of Support Vector Machine transduction, which involves a combinatorial problem with exponential computational complexity in the number of unlabeled examples.