no code implementations • CCL 2021 • Chenlin Zhang, Mingwen Wang, Yiming Tan, Ming Yin, Xinyi Zhang
"This paper takes Chinese euphemisms as its research object. Based on large-scale manual annotation and a supervised machine-learning classification method, it achieves high-precision automatic euphemism recognition, and uses this to carry out a quantitative statistical analysis of the diachronic development of euphemisms in the People's Daily from 1946 to 2017. From the perspective of large-scale data, it examines the diachronic evolution of euphemisms and the covariation between euphemisms and society, verifying Gresham's law and the renewal law in language."
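As a rough illustration of the supervised recognition step described above (a minimal sketch only, not the authors' annotated corpus, features, or model; the example sentences and labels are invented):

```python
# Minimal sketch of a supervised euphemism classifier: character n-gram
# TF-IDF features (avoids Chinese word segmentation) + logistic regression.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical annotated data: (sentence, 1 = contains a euphemism, 0 = literal).
texts = ["他昨晚走了", "他昨晚走路去车站"]
labels = [1, 0]

clf = make_pipeline(
    TfidfVectorizer(analyzer="char", ngram_range=(1, 3)),
    LogisticRegression(max_iter=1000),
)
clf.fit(texts, labels)
print(clf.predict(["老人家上个月去世了"]))
```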
no code implementations • 29 Oct 2023 • Nikki Lijing Kuang, Ming Yin, Mengdi Wang, Yu-Xiang Wang, Yi-An Ma
We provide the first analysis for posterior sampling algorithms with delayed feedback in RL and show our algorithm achieves $\widetilde{O}(\sqrt{d^3H^3 T} + d^2H^2 E[\tau])$ worst-case regret in the presence of unknown stochastic delays.
no code implementations • 11 Oct 2023 • Zhuoyan Li, Hangxiao Zhu, Zhuoran Lu, Ming Yin
The collection and curation of high-quality training data is crucial for developing text classification models with superior performance, but it is often associated with significant costs and time investment.
no code implementations • 17 Aug 2023 • Songtao Feng, Ming Yin, Yu-Xiang Wang, Jing Yang, Yingbin Liang
In this work, we propose a model-free stage-based Q-learning algorithm and show that it achieves the same sample complexity as the best model-based algorithm, thereby demonstrating for the first time that model-free algorithms can match the optimal $H$ dependence of model-based algorithms.
no code implementations • 24 Jun 2023 • Sunil Madhow, Dan Xiao, Ming Yin, Yu-Xiang Wang
Developing theoretical guarantees on the sample complexity of offline RL methods is an important step towards making data-hungry RL algorithms practically viable.
no code implementations • 1 Jun 2023 • Songtao Feng, Ming Yin, Ruiquan Huang, Yu-Xiang Wang, Jing Yang, Yingbin Liang
To the best of our knowledge, this is the first dynamic regret analysis in non-stationary MDPs with general function approximation.
2 code implementations • 21 May 2023 • Wenhu Chen, Ming Yin, Max Ku, Pan Lu, Yixin Wan, Xueguang Ma, Jianyu Xu, Xinyi Wang, Tony Xia
We evaluate a wide spectrum of 16 large language and code models with different prompting strategies like Chain-of-Thoughts and Program-of-Thoughts.
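To make the two prompting strategies above concrete, here is an illustrative sketch; `query_model` would be whichever LLM API is under evaluation (hypothetical here), and the Program-of-Thoughts completion is hand-written for the example:

```python
# Illustrative sketch of Chain-of-Thoughts vs. Program-of-Thoughts prompting.
question = "What is the area of a circle with radius 3?"

cot_prompt = (
    "Answer the question step by step, then give the final answer.\n"
    f"Question: {question}\nLet's think step by step."
)

pot_prompt = (
    "Write a Python program that computes the answer and stores it in `ans`.\n"
    f"Question: {question}\n# Python program:"
)

def run_program_of_thought(program: str) -> object:
    """Execute model-generated code and return its `ans` variable."""
    scope: dict = {}
    exec(program, scope)  # in practice this should be sandboxed
    return scope.get("ans")

# Executing a (hand-written) Program-of-Thoughts completion:
print(run_program_of_thought("import math\nans = math.pi * 3 ** 2"))
```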
no code implementations • 8 May 2023 • Maria Leonor Pacheco, Tunazzina Islam, Lyle Ungar, Ming Yin, Dan Goldwasser
Experts across diverse disciplines are often interested in making sense of large text collections.
no code implementations • 26 Feb 2023 • Chong Liu, Ming Yin, Yu-Xiang Wang
It achieves a near-optimal $\sqrt{T}$ regret for problems whose best-known regret is almost linear in the time horizon $T$.
no code implementations • 24 Feb 2023 • Dan Qiao, Ming Yin, Yu-Xiang Wang
In many real-life reinforcement learning (RL) problems, deploying new policies is costly.
no code implementations • 14 Jan 2023 • Shuai Ma, Ying Lei, Xinru Wang, Chengbo Zheng, Chuhan Shi, Ming Yin, Xiaojuan Ma
To mitigate this gap, we propose to promote humans' appropriate trust based on the CL of both sides at a task-instance level.
no code implementations • 29 Nov 2022 • Jiachen Li, Edwin Zhang, Ming Yin, Qinxun Bai, Yu-Xiang Wang, William Yang Wang
Behavior constrained policy optimization has been demonstrated to be a successful paradigm for tackling Offline Reinforcement Learning.
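One widely used instance of this paradigm adds a behavior-cloning penalty to the actor objective; the sketch below follows the TD3+BC formulation and is not necessarily the objective used in this paper (`policy` and `q_net` are hypothetical torch modules):

```python
# Behavior-constrained actor loss: maximize Q(s, pi(s)) while keeping the
# policy's actions close to the actions observed in the offline dataset.
import torch

def bc_regularized_actor_loss(policy, q_net, states, dataset_actions, alpha=2.5):
    actions = policy(states)                      # actions proposed by the policy
    q_values = q_net(states, actions)             # critic's evaluation of those actions
    lam = alpha / q_values.abs().mean().detach()  # adaptive weight on the RL term (TD3+BC)
    bc_penalty = torch.mean((actions - dataset_actions) ** 2)
    return -lam * q_values.mean() + bc_penalty
```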
no code implementations • 23 Nov 2022 • Thanh Nguyen-Tang, Ming Yin, Sunil Gupta, Svetha Venkatesh, Raman Arora
To the best of our knowledge, these are the first $\tilde{\mathcal{O}}(\frac{1}{K})$ bound and absolute zero sub-optimality bound respectively for offline RL with linear function approximation from adaptive data with partial coverage.
no code implementations • 3 Oct 2022 • Ming Yin, Mengdi Wang, Yu-Xiang Wang
Offline reinforcement learning, which aims at optimizing sequential decision-making strategies with historical data, has been extensively applied in real-life applications.
no code implementations • 13 Jun 2022 • Kaiqi Zhang, Ming Yin, Yu-Xiang Wang
We propose a quasi neural network to approximate the distribution propagation: a neural network with continuous parameters and a smooth activation function.
no code implementations • 10 Jun 2022 • Ming Yin, Wenjing Chen, Mengdi Wang, Yu-Xiang Wang
Goal-oriented Reinforcement Learning, where the agent needs to reach the goal state while simultaneously minimizing the cost, has received significant attention in real-world applications.
1 code implementation • NAACL 2022 • Maria Leonor Pacheco, Tunazzina Islam, Monal Mahajan, Andrey Shor, Ming Yin, Lyle Ungar, Dan Goldwasser
The Covid-19 pandemic has led to an infodemic of low-quality information, resulting in poor health decisions.
no code implementations • 11 Mar 2022 • Ming Yin, Yaqi Duan, Mengdi Wang, Yu-Xiang Wang
However, a precise understanding of the statistical limits with function representations remains elusive, even when such a representation is linear.
no code implementations • 13 Feb 2022 • Dan Qiao, Ming Yin, Ming Min, Yu-Xiang Wang
In this paper, we propose a new algorithm based on stage-wise exploration and adaptive policy elimination that achieves a regret of $\widetilde{O}(\sqrt{H^4S^2AT})$ while requiring a switching cost of $O(HSA \log\log T)$.
no code implementations • NeurIPS 2021 • Ming Yin, Yu-Xiang Wang
We study the offline reinforcement learning (offline RL) problem, where the goal is to learn a reward-maximizing policy in an unknown Markov Decision Process (MDP) using the data coming from a policy $\mu$.
no code implementations • NeurIPS 2021 • Ming Yin, Yu-Xiang Wang
This work studies the statistical limits of uniform convergence for offline policy evaluation (OPE) problems with model-based methods (for episodic MDP) and provides a unified framework towards optimal learning for several well-motivated offline tasks.
no code implementations • NeurIPS 2021 • Ming Yin, Yu Bai, Yu-Xiang Wang
Our main result shows that OPDVR provably identifies an $\epsilon$-optimal policy with $\widetilde{O}(H^2/d_m\epsilon^2)$ episodes of offline data in the finite-horizon stationary transition setting, where $H$ is the horizon length and $d_m$ is the minimal marginal state-action distribution induced by the behavior policy.
no code implementations • 7 Jul 2020 • Ming Yin, Yu Bai, Yu-Xiang Wang
The problem of Offline Policy Evaluation (OPE) in Reinforcement Learning (RL) is a critical step towards applying RL in real-life applications.
no code implementations • 29 Jan 2020 • Ming Yin, Yu-Xiang Wang
We consider the problem of off-policy evaluation for reinforcement learning, where the goal is to estimate the expected reward of a target policy $\pi$ using offline data collected by running a logging policy $\mu$.
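A minimal sketch of the classical per-trajectory importance-sampling estimator for this setting (a textbook baseline, not this paper's estimator; `pi_prob` and `mu_prob` are hypothetical functions returning each policy's action probability):

```python
# Importance-sampling OPE: reweight each logged return by the likelihood
# ratio between the target policy pi and the logging policy mu.
import numpy as np

def is_estimate(trajectories, pi_prob, mu_prob):
    """Each trajectory is a list of (state, action, reward) tuples."""
    values = []
    for traj in trajectories:
        ratio = np.prod([pi_prob(s, a) / mu_prob(s, a) for s, a, _ in traj])
        ret = sum(r for _, _, r in traj)
        values.append(ratio * ret)
    return float(np.mean(values))
```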
1 code implementation • 23 Jul 2019 • Ming Yin, Weitian Huang, Junbin Gao
Clustering multi-view data has been a fundamental research topic in the computer vision community.
no code implementations • 30 Aug 2016 • Ming Yin, Junbin Gao, Shengli Xie, Yi Guo
Multi-view subspace clustering is based on the fact that the multi-view data are generated from a latent subspace.
no code implementations • 27 Jan 2016 • Ming Yin, Shengli Xie, Yi Guo, Junbin Gao, Yun Zhang
Due to its promising classification performance, the sparse representation based classification (SRC) algorithm has attracted great attention in the past few years.
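For readers unfamiliar with SRC, the following is a minimal sketch of the basic idea (not the variant studied in this paper): code a test sample sparsely over the training dictionary, then assign the class whose atoms give the smallest reconstruction residual.

```python
import numpy as np
from sklearn.linear_model import Lasso

def src_predict(X_train, y_train, x_test, alpha=0.01):
    # X_train: (n_samples, n_features); training samples form the dictionary.
    coder = Lasso(alpha=alpha, fit_intercept=False)
    coder.fit(X_train.T, x_test)               # sparse code of x_test over training samples
    code = coder.coef_
    residuals = {}
    for c in np.unique(y_train):
        mask = (y_train == c)
        recon = X_train[mask].T @ code[mask]    # reconstruction using class-c atoms only
        residuals[c] = np.linalg.norm(x_test - recon)
    return min(residuals, key=residuals.get)
```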
no code implementations • CVPR 2016 • Ming Yin, Yi Guo, Junbin Gao, Zhaoshui He, Shengli Xie
Sparse subspace clustering (SSC), as one of the most successful subspace clustering methods, has achieved notable clustering accuracy in computer vision tasks.
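A minimal sketch of vanilla SSC (the standard self-expressive formulation, not this paper's variant): express each point as a sparse combination of the other points, then spectrally cluster the resulting affinity matrix.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.cluster import SpectralClustering

def ssc(X, n_clusters, alpha=0.01):
    n = X.shape[0]                       # X: (n_points, n_features)
    C = np.zeros((n, n))
    for i in range(n):
        idx = [j for j in range(n) if j != i]
        coder = Lasso(alpha=alpha, fit_intercept=False)
        coder.fit(X[idx].T, X[i])        # sparse self-expression, excluding point i
        C[i, idx] = coder.coef_
    affinity = np.abs(C) + np.abs(C).T   # symmetric affinity from the coefficients
    model = SpectralClustering(n_clusters=n_clusters, affinity="precomputed")
    return model.fit_predict(affinity)
```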
no code implementations • 18 Aug 2015 • Xuejie Liu, Jingbin Wang, Ming Yin, Benjamin Edwards, Peijuan Xu
The context of a data point, usually defined as the other data points in a data set, has been found to play an important role in data representation and classification.
no code implementations • 30 Jun 2015 • Jing-Yan Wang, Yihua Zhou, Ming Yin, Shaochang Chen, Benjamin Edwards
In this objective, the reconstruction error is minimized and the coefficient sparsity is encouraged.