no code implementations • 6 Dec 2017 • Danyang Sun, Tongzheng Ren, Chongxun Li, Hang Su, Jun Zhu
Automatically writing stylized Chinese characters is an attractive yet challenging task due to its wide applicability.
no code implementations • 10 Oct 2018 • Yichi Zhou, Tongzheng Ren, Jialian Li, Dong Yan, Jun Zhu
In this paper, we present a novel technique, lazy update, which can avoid traversing the whole game tree in CFR, as well as a novel analysis of the regret of CFR with lazy update.
no code implementations • 27 Jan 2019 • Haosheng Zou, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
Reward shaping is one of the most effective methods to tackle the crucial yet challenging problem of credit assignment in Reinforcement Learning (RL).
1 code implementation • ICLR 2019 • Ziyu Wang, Tongzheng Ren, Jun Zhu, Bo Zhang
While Bayesian neural networks (BNNs) have drawn increasing attention, their posterior inference remains challenging due to their high-dimensional and over-parameterized nature.
no code implementations • NeurIPS 2020 • Xiaoxia Wu, Edgar Dobriban, Tongzheng Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel Ward, Qiang Liu
For certain stepsizes of $g$ and $w$, we show that they can converge close to the minimum-norm solution.
1 code implementation • 20 Feb 2020 • Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
The idea is to generate a set of augmented data with random perturbations or transforms and minimize the maximum, or worst-case, loss over the augmented data (a minimal sketch follows below).
Ranked #187 on Image Classification on ImageNet
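A minimal PyTorch-style sketch of that worst-case objective, assuming a classifier `model` and a stochastic `augment` transform (both placeholders, not the paper's exact pipeline):

```python
import torch
import torch.nn.functional as F

def maxup_loss(model, x, y, augment, m=4):
    """MaxUp objective: draw m random augmentations of each input
    and backpropagate only the worst-case (maximum) loss."""
    per_copy = []
    for _ in range(m):
        logits = model(augment(x))  # random perturbation / transform
        per_copy.append(F.cross_entropy(logits, y, reduction="none"))
    worst = torch.stack(per_copy, dim=0).max(dim=0).values  # max over the m copies
    return worst.mean()  # average the worst-case loss over the batch
```

Taking the maximum over the augmented copies, rather than their average, is what distinguishes this objective from standard data augmentation.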
1 code implementation • NeurIPS 2020 • Mao Ye, Tongzheng Ren, Qiang Liu
Our idea is to introduce Stein variational gradient as a repulsive force to push the samples of Langevin dynamics away from the past trajectories.
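A heavily simplified NumPy sketch of that idea, assuming an RBF kernel with fixed bandwidth `h` and a mixing weight `alpha` (placeholder choices, not the paper's tuned settings; in practice the past trajectory would also be thinned):

```python
import numpy as np

def rbf_repulsion(x, past, h=1.0):
    """Force pushing x away from past trajectory samples,
    built from the gradient of an RBF kernel (as in SVGD)."""
    diff = x - past                            # (M, d)
    k = np.exp(-np.sum(diff**2, axis=1) / h)   # kernel values, (M,)
    return (2.0 / h) * (k[:, None] * diff).mean(axis=0)

def stein_repulsive_langevin(grad_log_p, x0, steps=1000, eps=1e-2, alpha=1.0):
    """Langevin dynamics plus a Stein-style repulsive drift
    away from the sampler's own past trajectory."""
    x, past = x0.copy(), [x0.copy()]
    for _ in range(steps):
        drift = grad_log_p(x) + alpha * rbf_repulsion(x, np.array(past))
        x = x + eps * drift + np.sqrt(2 * eps) * np.random.randn(*x.shape)
        past.append(x.copy())
    return x, np.array(past)
```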
no code implementations • ICLR 2020 • Yichi Zhou, Tongzheng Ren, Jialian Li, Dong Yan, Jun Zhu
In this paper, we present Lazy-CFR, a CFR algorithm that adopts a lazy update strategy to avoid traversing the whole game tree in each round.
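For context, the per-infoset step that vanilla CFR runs every round, and that Lazy-CFR defers for untouched parts of the tree, is regret matching; a minimal sketch (the lazy-update bookkeeping itself is the paper's contribution and is not shown):

```python
import numpy as np

def regret_matching(cum_regret):
    """Current strategy at one information set from cumulative
    counterfactual regrets: play actions in proportion to their
    positive regret, or uniformly if no regret is positive."""
    pos = np.maximum(cum_regret, 0.0)
    total = pos.sum()
    if total > 0:
        return pos / total
    return np.full_like(cum_regret, 1.0 / len(cum_regret))
```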
no code implementations • 15 Aug 2020 • Yihao Feng, Tongzheng Ren, Ziyang Tang, Qiang Liu
We consider off-policy evaluation (OPE), which evaluates the performance of a new policy from observed data collected from previous experiments, without requiring the execution of the new policy.
no code implementations • 3 Mar 2021 • Shuo Yang, Tongzheng Ren, Inderjit S. Dhillon, Sujay Sanghavi
Specifically, we focus on a challenging setting where 1) the reward distribution of an arm depends on the set $s$ it is part of, and crucially 2) there is \textit{no total order} for the arms in $\mathcal{A}$.
no code implementations • 3 Mar 2021 • Shuo Yang, Tongzheng Ren, Sanjay Shakkottai, Eric Price, Inderjit S. Dhillon, Sujay Sanghavi
For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde O(\sqrt{T})$ regret.
no code implementations • NeurIPS 2021 • Tongzheng Ren, Jialian Li, Bo Dai, Simon S. Du, Sujay Sanghavi
To the best of our knowledge, these are the \emph{first} set of nearly horizon-free bounds for episodic time-homogeneous offline tabular MDP and linear MDP with anchor points.
1 code implementation • ACL 2021 • Keyang Xu, Tongzheng Ren, Shikun Zhang, Yihao Feng, Caiming Xiong
Deployed real-world machine learning applications are often subject to uncontrolled and even potentially malicious inputs.
1 code implementation • NeurIPS 2021 • Ziyu Wang, Yuhao Zhou, Tongzheng Ren, Jun Zhu
Recent years have witnessed an upsurge of interest in employing flexible machine learning models for instrumental variable (IV) regression, but the development of uncertainty quantification methodology is still lacking.
no code implementations • CVPR 2021 • Chengyue Gong, Tongzheng Ren, Mao Ye, Qiang Liu
The idea is to generate a set of augmented data with some random perturbations or transforms, and minimize the maximum, or worst-case, loss over the augmented data.
no code implementations • 15 Oct 2021 • Tongzheng Ren, Fuheng Cui, Alexia Atsidakou, Sujay Sanghavi, Nhat Ho
We study the statistical and computational complexities of the Polyak step size gradient descent algorithm under (i) generalized smoothness and Lojasiewicz conditions on the population loss function, i.e., the limit of the empirical loss function as the sample size goes to infinity, and (ii) a stability condition between the gradients of the empirical and population loss functions, namely, polynomial growth of the concentration bound between the two gradients.
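The Polyak step size rule itself is simple; a NumPy sketch, assuming the optimal value `f_star` is known (as the rule requires):

```python
import numpy as np

def polyak_gd(f, grad_f, x0, f_star=0.0, steps=100, tol=1e-12):
    """Gradient descent with the Polyak step size
    eta_t = (f(x_t) - f_star) / ||grad f(x_t)||^2."""
    x = x0.astype(float)
    for _ in range(steps):
        g = grad_f(x)
        g_norm2 = float(g @ g)
        if g_norm2 < tol:        # (near-)stationary point reached
            break
        eta = (f(x) - f_star) / g_norm2
        x = x - eta * g
    return x
```

For example, on f(x) = 0.5·‖x‖² (with f_star = 0) the rule gives η_t = 0.5 at every step, so the iterate halves each iteration.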
no code implementations • 22 Nov 2021 • Tongzheng Ren, Tianjun Zhang, Csaba Szepesvári, Bo Dai
Representation learning lies at the heart of the empirical success of deep learning for dealing with the curse of dimensionality.
no code implementations • 9 Feb 2022 • Tongzheng Ren, Jiacheng Zhuo, Sujay Sanghavi, Nhat Ho
This computational complexity is cheaper than that of the fixed step-size gradient descent algorithm, which is of the order $\mathcal{O}(n^{\tau})$ for some $\tau > 1$, to reach the same statistical radius.
no code implementations • 13 Mar 2022 • Jialian Li, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
Our goal is to identify a near-optimal robust policy for the perturbed testing environment, which introduces additional technical difficulties as we need to simultaneously estimate the training environment uncertainty from samples and find the worst-case perturbation for testing.
no code implementations • 16 May 2022 • Nhat Ho, Tongzheng Ren, Sujay Sanghavi, Purnamrita Sarkar, Rachel Ward
Therefore, the total computational complexity of the EGD algorithm is \emph{optimal} and exponentially cheaper than that of GD for solving parameter estimation in non-regular statistical models, while being comparable to that of GD in regular statistical settings.
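A minimal sketch, under the assumption that EGD here denotes gradient descent with an exponentially increasing step size η_t = η₀·β^t, β > 1 (the schedule and constants are placeholder assumptions, not taken from the paper's statement):

```python
import numpy as np

def egd(grad_f, x0, eta0=1e-3, beta=1.1, steps=200):
    """Gradient descent with an exponentially growing step size,
    intended to compensate for the fast-vanishing gradients that
    arise in non-regular / over-specified statistical models."""
    x = x0.astype(float)
    for t in range(steps):
        x = x - (eta0 * beta**t) * grad_f(x)
    return x
```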
no code implementations • 23 May 2022 • Tongzheng Ren, Fuheng Cui, Sujay Sanghavi, Nhat Ho
However, when the models are over-specified, namely, the chosen number of components to fit the data is larger than the unknown true number of components, EM needs a polynomial number of iterations in terms of the sample size to reach the final statistical radius; this is computationally expensive in practice.
no code implementations • 27 May 2022 • Xing Han, Tongzheng Ren, Jing Hu, Joydeep Ghosh, Nhat Ho
To attain this goal, each time series is first assigned the forecast of its cluster representative, which can be considered a "shrinkage prior" for the set of time series it represents.
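An illustrative sketch of such a shrinkage step (not the paper's exact estimator), with a placeholder weight `lam` that would be chosen by validation in practice:

```python
import numpy as np

def shrink_to_cluster(local_forecast, cluster_forecast, lam=0.5):
    """Pull each series' own forecast toward the forecast of its
    cluster representative, which acts as a shrinkage prior."""
    return lam * np.asarray(cluster_forecast) + (1.0 - lam) * np.asarray(local_forecast)
```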
no code implementations • 14 Jul 2022 • Tianjun Zhang, Tongzheng Ren, Mengjiao Yang, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai
It is common to address the curse of dimensionality in Markov decision processes (MDPs) by exploiting low-rank representations.
no code implementations • 19 Aug 2022 • Tongzheng Ren, Tianjun Zhang, Lisa Lee, Joseph E. Gonzalez, Dale Schuurmans, Bo Dai
Representation learning often plays a critical role in reinforcement learning by managing the curse of dimensionality.
1 code implementation • 27 Sep 2022 • Khai Nguyen, Tongzheng Ren, Huy Nguyen, Litu Rout, Tan Nguyen, Nhat Ho
We explain the usage of these projections by introducing the Hierarchical Radon Transform (HRT), which is constructed by applying Radon Transform variants recursively.
no code implementations • 17 Dec 2022 • Tongzheng Ren, Chenjun Xiao, Tianjun Zhang, Na Li, Zhaoran Wang, Sujay Sanghavi, Dale Schuurmans, Bo Dai
Theoretically, we establish the sample complexity of the proposed approach in the online and offline settings.
Tasks: Model-based Reinforcement Learning • Reinforcement Learning • +1
1 code implementation • NeurIPS 2023 • Khai Nguyen, Tongzheng Ren, Nhat Ho
Sliced Wasserstein (SW) distance suffers from redundant projections due to independent uniform random projecting directions.
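For reference, a NumPy sketch of the vanilla Monte Carlo estimator with independent uniform directions, whose redundancy this entry refers to; it assumes equal-size point clouds:

```python
import numpy as np

def sliced_wasserstein(X, Y, n_proj=100, rng=None):
    """Monte Carlo SW_2 between point clouds X, Y of shape (n, d):
    project onto random unit directions, then average the 1D
    Wasserstein-2 distances between sorted projections."""
    rng = np.random.default_rng(rng)
    theta = rng.standard_normal((n_proj, X.shape[1]))
    theta /= np.linalg.norm(theta, axis=1, keepdims=True)  # unit directions
    xp = np.sort(X @ theta.T, axis=0)  # (n, n_proj) sorted projections
    yp = np.sort(Y @ theta.T, axis=0)
    return np.sqrt(np.mean((xp - yp) ** 2))
```

Because the directions are drawn independently and uniformly, many projections carry little new information, which is the redundancy the paper's non-uniform projecting directions address.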
no code implementations • 8 Apr 2023 • Tongzheng Ren, Zhaolin Ren, Haitong Ma, Na Li, Bo Dai
This paper presents an approach, Spectral Dynamics Embedding Control (SDEC), to optimal control for nonlinear stochastic systems.
no code implementations • 20 Nov 2023 • Hongming Zhang, Tongzheng Ren, Chenjun Xiao, Dale Schuurmans, Bo Dai
In most real-world reinforcement learning applications, state information is only partially observable, which breaks the Markov decision process assumption and leads to inferior performance for algorithms that conflate observations with state.
Tasks: Partially Observable Reinforcement Learning • Reinforcement Learning
1 code implementation • 5 Jan 2024 • DeepSeek-AI, Xiao Bi, Deli Chen, Guanting Chen, Shanhuang Chen, Damai Dai, Chengqi Deng, Honghui Ding, Kai Dong, Qiushi Du, Zhe Fu, Huazuo Gao, Kaige Gao, Wenjun Gao, Ruiqi Ge, Kang Guan, Daya Guo, JianZhong Guo, Guangbo Hao, Zhewen Hao, Ying He, Wenjie Hu, Panpan Huang, Erhang Li, Guowei Li, Jiashi Li, Yao Li, Y. K. Li, Wenfeng Liang, Fangyun Lin, A. X. Liu, Bo Liu, Wen Liu, Xiaodong Liu, Xin Liu, Yiyuan Liu, Haoyu Lu, Shanghao Lu, Fuli Luo, Shirong Ma, Xiaotao Nie, Tian Pei, Yishi Piao, Junjie Qiu, Hui Qu, Tongzheng Ren, Zehui Ren, Chong Ruan, Zhangli Sha, Zhihong Shao, Junxiao Song, Xuecheng Su, Jingxiang Sun, Yaofeng Sun, Minghui Tang, Bingxuan Wang, Peiyi Wang, Shiyu Wang, Yaohui Wang, Yongji Wang, Tong Wu, Y. Wu, Xin Xie, Zhenda Xie, Ziwei Xie, Yiliang Xiong, Hanwei Xu, R. X. Xu, Yanhong Xu, Dejian Yang, Yuxiang You, Shuiping Yu, Xingkai Yu, B. Zhang, Haowei Zhang, Lecong Zhang, Liyue Zhang, Mingchuan Zhang, Minghua Zhang, Wentao Zhang, Yichao Zhang, Chenggang Zhao, Yao Zhao, Shangyan Zhou, Shunfeng Zhou, Qihao Zhu, Yuheng Zou
The rapid development of open-source large language models (LLMs) has been truly remarkable.
2 code implementations • 8 Mar 2024 • Haoyu Lu, Wen Liu, Bo Zhang, Bingxuan Wang, Kai Dong, Bo Liu, Jingxiang Sun, Tongzheng Ren, Zhuoshu Li, Hao Yang, Yaofeng Sun, Chengqi Deng, Hanwei Xu, Zhenda Xie, Chong Ruan
The DeepSeek-VL family (both 1.3B and 7B models) showcases superior user experiences as a vision-language chatbot in real-world applications, achieving state-of-the-art or competitive performance across a wide range of visual-language benchmarks at the same model size while maintaining robust performance on language-centric benchmarks.
Ranked #27 on Visual Question Answering on MM-Vet
no code implementations • ICML 2020 • Yihao Feng, Tongzheng Ren, Ziyang Tang, Qiang Liu
In this work, we investigate the statistical properties of the kernel loss, which allows us to find a feasible set that contains the true value function with high probability.