Search Results for author: Bo An

Found 85 papers, 22 papers with code

Vehicle Traffic Driven Camera Placement for Better Metropolis Security Surveillance

1 code implementation1 Apr 2017 Yihui He, Xiaobo Ma, Xiapu Luo, Jianfeng Li, Mengchen Zhao, Bo An, Xiaohong Guan

Security surveillance is one of the most important issues in smart cities, especially in an era of terrorism.

Decision Making

Accurate Text-Enhanced Knowledge Graph Representation Learning

no code implementations NAACL 2018 Bo An, Bo Chen, Xianpei Han, Le Sun

Previous representation learning techniques for knowledge graph representation usually represent the same entity or relation in different triples with the same representation, without considering the ambiguity of relations and entities.

General Classification Graph Representation Learning +4

Model-Free Context-Aware Word Composition

no code implementations COLING 2018 Bo An, Xianpei Han, Le Sun

Word composition is a promising technique for representation learning of large linguistic units (e.g., phrases, sentences and documents).

Dimensionality Reduction Learning Word Embeddings +4

Sentence Rewriting for Semantic Parsing

no code implementations ACL 2016 Bo Chen, Le Sun, Xianpei Han, Bo An

A major challenge of semantic parsing is the vocabulary mismatch problem between natural language and target ontology.

Semantic Parsing Sentence +1

Collaboration based Multi-Label Learning

no code implementations8 Feb 2019 Lei Feng, Bo An, Shuo He

It is well-known that exploiting label correlations is crucially important to multi-label learning.

Multi-Label Learning

Partial Label Learning with Self-Guided Retraining

no code implementations8 Feb 2019 Lei Feng, Bo An

We show that optimizing this convex-concave problem is equivalent to solving a set of quadratic programming (QP) problems.

Partial Label Learning
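
The snippet notes that the convex-concave objective reduces to a set of quadratic programming (QP) subproblems. As a point of reference only, here is a minimal sketch of solving one generic QP of the form min 0.5 xᵀPx + qᵀx subject to Gx ≤ h with SciPy; the matrices P, q, G, h are illustrative placeholders, not the paper's actual subproblems.

```python
# Generic QP subproblem sketch: min_x 0.5 * x^T P x + q^T x  s.t.  G x <= h.
# All matrices below are illustrative placeholders, not taken from the paper.
import numpy as np
from scipy.optimize import minimize

P = np.array([[2.0, 0.5], [0.5, 1.0]])   # positive semi-definite
q = np.array([-1.0, 0.3])
G = np.array([[1.0, 1.0]])
h = np.array([1.0])

objective = lambda x: 0.5 * x @ P @ x + q @ x
constraints = [{"type": "ineq", "fun": lambda x: h - G @ x}]  # encodes G x <= h

result = minimize(objective, x0=np.zeros(2), method="SLSQP", constraints=constraints)
print(result.x)
```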

Competitive Bridge Bidding with Deep Neural Networks

no code implementations3 Mar 2019 Jiang Rong, Tao Qin, Bo An

Second, based on the analysis of the impact of other players' unknown cards on one's final rewards, we design two neural networks to deal with imperfect information, the first one inferring the cards of the partner and the second one taking the outputs of the first one as part of its input to select a bid.
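
The snippet describes two chained networks: one inferring the partner's cards and a second consuming that inference, together with the raw observation, to select a bid. A minimal sketch of such a pipeline is shown below; all dimensions, layer choices, and names are assumptions for illustration, not the paper's architecture.

```python
# Hypothetical sketch of the two-network pipeline described in the snippet:
# an inference network estimates the partner's hidden cards, and a policy
# network takes that estimate (plus the raw observation) to choose a bid.
import torch
import torch.nn as nn

N_CARDS, OBS_DIM, N_BIDS = 52, 128, 38  # assumed sizes, not from the paper

card_inference_net = nn.Sequential(
    nn.Linear(OBS_DIM, 256), nn.ReLU(),
    nn.Linear(256, N_CARDS), nn.Sigmoid(),   # per-card probability for the partner's hand
)
bid_policy_net = nn.Sequential(
    nn.Linear(OBS_DIM + N_CARDS, 256), nn.ReLU(),
    nn.Linear(256, N_BIDS),                  # logits over the bidding actions
)

obs = torch.randn(1, OBS_DIM)                # the acting player's observation
partner_cards = card_inference_net(obs)      # output of the first network
bid_logits = bid_policy_net(torch.cat([obs, partner_cards], dim=-1))
bid = bid_logits.argmax(dim=-1)
```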

Manipulating a Learning Defender and Ways to Counteract

no code implementations NeurIPS 2019 Jiarui Gan, Qingyu Guo, Long Tran-Thanh, Bo An, Michael Wooldridge

We then apply a game-theoretic framework at a higher level to counteract such manipulation, in which the defender commits to a policy that specifies her strategy commitment according to the learned information.

EUSP: An Easy-to-Use Semantic Parsing PlatForm

no code implementations IJCNLP 2019 Bo An, Chen Bo, Xianpei Han, Le Sun

Semantic parsing aims to map natural language utterances into structured meaning representations.

Semantic Parsing

Inducing Cooperation via Team Regret Minimization based Multi-Agent Deep Reinforcement Learning

no code implementations18 Nov 2019 Runsheng Yu, Zhenyu Shi, Xinrun Wang, Rundong Wang, Buhong Liu, Xinwen Hou, Hanjiang Lai, Bo An

Existing value-factorization-based Multi-Agent deep Reinforcement Learning (MARL) approaches perform well in various multi-agent cooperative environments under the centralized training and decentralized execution (CTDE) scheme, where all agents are trained together by the centralized value network and each agent executes its policy independently.

reinforcement-learning Reinforcement Learning (RL)

Learning with Multiple Complementary Labels

no code implementations ICML 2020 Lei Feng, Takuo Kaneko, Bo Han, Gang Niu, Bo An, Masashi Sugiyama

In this paper, we propose a novel problem setting to allow MCLs for each example and two ways for learning with MCLs.

Combating noisy labels by agreement: A joint training method with co-regularization

2 code implementations CVPR 2020 Hongxin Wei, Lei Feng, Xiangyu Chen, Bo An

The state-of-the-art approaches "Decoupling" and "Co-teaching+" claim that the "disagreement" strategy is crucial for alleviating the problem of learning with noisy labels.

Learning with noisy labels Weakly-supervised Learning
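
The entry describes a joint training method with co-regularization for noisy labels. Below is a hedged sketch of one possible joint-training objective: two networks are optimized together with their cross-entropy losses plus a co-regularization term (a symmetric KL between their predictions). The weighting and the omission of small-loss example selection are simplifying assumptions, not the paper's exact formulation.

```python
# Hedged sketch of a joint-training objective with co-regularization:
# two networks are trained together, and a symmetric KL term encourages
# their predictions to agree on each example.
import torch
import torch.nn.functional as F

def joint_co_regularized_loss(logits1, logits2, targets, lam=0.5):
    ce = F.cross_entropy(logits1, targets) + F.cross_entropy(logits2, targets)
    p1 = F.log_softmax(logits1, dim=-1)
    p2 = F.log_softmax(logits2, dim=-1)
    # symmetric KL divergence between the two networks' predictive distributions
    co_reg = F.kl_div(p1, p2.exp(), reduction="batchmean") + \
             F.kl_div(p2, p1.exp(), reduction="batchmean")
    return (1 - lam) * ce + lam * co_reg
```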

Learning Expensive Coordination: An Event-Based Deep RL Approach

no code implementations ICLR 2020 Zhenyu Shi*, Runsheng Yu*, Xinrun Wang*, Rundong Wang, Youzhi Zhang, Hanjiang Lai, Bo An

The main difficulties of expensive coordination are that i) the leader has to consider the long-term effect and predict the followers' behaviors when assigning bonuses and ii) the complex interactions between followers make the training process hard to converge, especially when the leader's policy changes with time.

Decision Making Multi-agent Reinforcement Learning

Learning Behaviors with Uncertain Human Feedback

1 code implementation7 Jun 2020 Xu He, Haipeng Chen, Bo An

However, previous works rarely consider the uncertainty when humans provide feedback, especially in cases where the optimal actions are not obvious to the trainers.

Provably Consistent Partial-Label Learning

no code implementations NeurIPS 2020 Lei Feng, Jiaqi Lv, Bo Han, Miao Xu, Gang Niu, Xin Geng, Bo An, Masashi Sugiyama

Partial-label learning (PLL) is a multi-class classification problem, where each training example is associated with a set of candidate labels.

Multi-class Classification Partial Label Learning

Contextual User Browsing Bandits for Large-Scale Online Mobile Recommendation

no code implementations21 Aug 2020 Xu He, Bo An, Yanghua Li, Haikai Chen, Qingyu Guo, Xin Li, Zhirong Wang

First, since we are concerned with the reward of a set of recommended items, we model the online recommendation as a contextual combinatorial bandit problem and define the reward of a recommended set.
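
For orientation, here is a minimal, generic sketch of one step of a contextual combinatorial bandit for set recommendation: candidate items are scored with a LinUCB-style upper confidence bound, the top-K set is recommended, and the set reward is taken as the sum of observed per-item rewards. This is a textbook baseline, not the user-browsing model defined in the paper.

```python
# Generic contextual combinatorial bandit step (illustrative only).
import numpy as np

def recommend_set(contexts, A_inv, theta, k=4, alpha=1.0):
    """contexts: (n_items, d) feature matrix; theta: (d,) estimated weights."""
    ucb = contexts @ theta + alpha * np.sqrt(
        np.einsum("ij,jk,ik->i", contexts, A_inv, contexts)  # per-item x^T A^-1 x
    )
    return np.argsort(-ucb)[:k]                 # indices of the recommended set

def set_reward(item_rewards, chosen):
    # reward of the recommended set, here simply the sum of per-item rewards
    return float(np.sum(item_rewards[chosen]))
```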

Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games

no code implementations30 Sep 2020 David Milec, Jakub Černý, Viliam Lisý, Bo An

This paper aims to analyze and propose scalable algorithms for computing effective and robust strategies against a quantal opponent in normal-form and extensive-form games.

counterfactual

Pointwise Binary Classification with Pairwise Confidence Comparisons

no code implementations5 Oct 2020 Lei Feng, Senlin Shu, Nan Lu, Bo Han, Miao Xu, Gang Niu, Bo An, Masashi Sugiyama

To alleviate the data requirement for training effective binary classifiers in binary classification, many weakly supervised learning settings have been proposed.

Binary Classification Classification +2

SemiNLL: A Framework of Noisy-Label Learning by Semi-Supervised Learning

no code implementations2 Dec 2020 Zhuowei Wang, Jing Jiang, Bo Han, Lei Feng, Bo An, Gang Niu, Guodong Long

We also instantiate our framework with different combinations, which set the new state of the art on benchmark-simulated and real-world datasets with noisy labels.

Learning with noisy labels

MetaInfoNet: Learning Task-Guided Information for Sample Reweighting

no code implementations9 Dec 2020 Hongxin Wei, Lei Feng, Rundong Wang, Bo An

Deep neural networks have been shown to easily overfit to biased training data with label noise or class imbalance.

Meta-Learning

Personalized Adaptive Meta Learning for Cold-start User Preference Prediction

no code implementations22 Dec 2020 Runsheng Yu, Yu Gong, Xu He, Bo An, Yu Zhu, Qingwen Liu, Wenwu Ou

Recently, many existing studies regard the cold-start personalized preference prediction as a few-shot learning problem, where each user is the task and recommended items are the classes, and the gradient-based meta learning method (MAML) is leveraged to address this challenge.

Few-Shot Learning

Deep Stock Trading: A Hierarchical Reinforcement Learning Framework for Portfolio Optimization and Order Execution

no code implementations23 Dec 2020 Rundong Wang, Hongxin Wei, Bo An, Zhouyan Feng, Jun Yao

Portfolio management via reinforcement learning is at the forefront of fintech research, which explores how to optimally reallocate a fund into different financial assets over the long term by trial-and-error.

Hierarchical Reinforcement Learning Management +2

RMIX: Risk-Sensitive Multi-Agent Reinforcement Learning

no code implementations1 Jan 2021 Wei Qiu, Xinrun Wang, Runsheng Yu, Xu He, Rundong Wang, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Centralized training with decentralized execution (CTDE) has become an important paradigm in multi-agent reinforcement learning (MARL).

Multi-agent Reinforcement Learning reinforcement-learning +3

Safe Coupled Deep Q-Learning for Recommendation Systems

no code implementations8 Jan 2021 Runsheng Yu, Yu Gong, Rundong Wang, Bo An, Qingwen Liu, Wenwu Ou

Firstly, we introduce a novel training scheme with two value functions to maximize the accumulated long-term reward under the safety constraint.

Q-Learning Recommendation Systems +1
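
The snippet mentions a training scheme with two value functions: one for accumulated reward and one for the safety constraint. Below is a hedged sketch of how action selection with such a pair of value functions could look, masking out actions whose estimated safety cost exceeds a budget; the masking rule, budget, and fallback are assumptions for illustration, not the paper's scheme.

```python
# Hedged sketch of action selection with two value functions: a reward
# Q-network is maximized only over actions whose estimated safety cost,
# given by a second Q-network, stays under a budget.
import torch

def safe_greedy_action(q_reward, q_cost, state, cost_budget=1.0):
    with torch.no_grad():
        r_values = q_reward(state)                # (n_actions,) reward estimates
        c_values = q_cost(state)                  # (n_actions,) safety-cost estimates
    safe_mask = c_values <= cost_budget
    if safe_mask.any():
        masked = r_values.masked_fill(~safe_mask, float("-inf"))
        return int(masked.argmax())
    return int(c_values.argmin())                 # fallback: least-unsafe action
```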

Learning from Similarity-Confidence Data

no code implementations13 Feb 2021 Yuzhou Cao, Lei Feng, Yitian Xu, Bo An, Gang Niu, Masashi Sugiyama

Weakly supervised learning has drawn considerable attention recently to reduce the expensive time and labor consumption of labeling massive data.

Weakly-supervised Learning

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

no code implementations16 Feb 2021 Wei Qiu, Xinrun Wang, Runsheng Yu, Xu He, Rundong Wang, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE).

Multi-agent Reinforcement Learning reinforcement-learning +3

DO-GAN: A Double Oracle Framework for Generative Adversarial Networks

no code implementations CVPR 2022 Aye Phyu Phyu Aung, Xinrun Wang, Runsheng Yu, Bo An, Senthilnath Jayavelu, XiaoLi Li

In this paper, we propose a new approach to train Generative Adversarial Networks (GANs) where we deploy a double-oracle framework using the generator and discriminator oracles.

Continual Learning

L2E: Learning to Exploit Your Opponent

no code implementations18 Feb 2021 Zhe Wu, Kai Li, Enmin Zhao, Hang Xu, Meng Zhang, Haobo Fu, Bo An, Junliang Xing

In this work, we propose a novel Learning to Exploit (L2E) framework for implicit opponent modeling.

CFR-MIX: Solving Imperfect Information Extensive-Form Games with Combinatorial Action Space

no code implementations18 May 2021 Shuxin Li, Youzhi Zhang, Xinrun Wang, Wanqi Xue, Bo An

The challenge of solving this type of game is that the team's joint action space grows exponentially with the number of agents, which results in the inefficiency of the existing algorithms, e.g., Counterfactual Regret Minimization (CFR).

counterfactual

On the Robustness of Average Losses for Partial-Label Learning

no code implementations11 Jun 2021 Jiaqi Lv, Biao Liu, Lei Feng, Ning Xu, Miao Xu, Bo An, Gang Niu, Xin Geng, Masashi Sugiyama

Partial-label learning (PLL) utilizes instances with PLs, where a PL includes several candidate labels but only one is the true label (TL).

Partial Label Learning Weakly Supervised Classification
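
Since the entry concerns average losses over candidate label sets, a minimal sketch of the average partial-label loss is given below: for each example, a base loss (here cross-entropy) is averaged over its candidate labels, encoded as a binary mask. Details beyond this standard definition are implementation assumptions.

```python
# Minimal sketch of the average partial-label loss.
import torch
import torch.nn.functional as F

def average_pll_loss(logits, candidate_mask):
    """logits: (batch, n_classes); candidate_mask: (batch, n_classes) in {0, 1}."""
    log_probs = F.log_softmax(logits, dim=-1)
    per_label_loss = -log_probs                               # CE against each possible label
    avg = (per_label_loss * candidate_mask).sum(-1) / candidate_mask.sum(-1)
    return avg.mean()
```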

Contingency-Aware Influence Maximization: A Reinforcement Learning Approach

1 code implementation13 Jun 2021 Haipeng Chen, Wei Qiu, Han-Ching Ou, Bo An, Milind Tambe

Empirical results show that our method achieves influence as high as the state-of-the-art methods for contingency-aware IM, while having negligible runtime at test phase.

Combinatorial Optimization reinforcement-learning +1

Multi-Class Classification from Single-Class Data with Confidences

no code implementations16 Jun 2021 Yuzhou Cao, Lei Feng, Senlin Shu, Yitian Xu, Bo An, Gang Niu, Masashi Sugiyama

We show that without any assumptions on the loss functions, models, and optimizers, we can successfully learn a multi-class classifier from only data of a single class with a rigorous consistency guarantee when confidences (i.e., the class-posterior probabilities for all the classes) are available.

Classification Multi-class Classification

Mis-spoke or mis-lead: Achieving Robustness in Multi-Agent Communicative Reinforcement Learning

no code implementations9 Aug 2021 Wanqi Xue, Wei Qiu, Bo An, Zinovi Rabinovich, Svetlana Obraztsova, Chai Kiat Yeo

Empirical results demonstrate that many state-of-the-art MACRL methods are vulnerable to message attacks, and our method can significantly improve their robustness.

Multi-agent Reinforcement Learning reinforcement-learning +1

Reinforcement Learning for Quantitative Trading

no code implementations28 Sep 2021 Shuo Sun, Rundong Wang, Bo An

RL's impact is pervasive, and it has recently demonstrated the ability to conquer many challenging QT tasks.

Decision Making reinforcement-learning +1

Learning Pseudometric-based Action Representations for Offline Reinforcement Learning

no code implementations29 Sep 2021 Pengjie Gu, Mengchen Zhao, Chen Chen, Dong Li, Jianye Hao, Bo An

Offline reinforcement learning is a promising approach for practical applications since it does not require interactions with real-world environments.

Offline RL Recommendation Systems +4

Online Ad Hoc Teamwork under Partial Observability

no code implementations ICLR 2022 Pengjie Gu, Mengchen Zhao, Jianye Hao, Bo An

Autonomous agents often need to work together as a team to accomplish complex cooperative tasks.

Open-sampling: Re-balancing Long-tailed Datasets with Out-of-Distribution Data

no code implementations29 Sep 2021 Hongxin Wei, Lue Tao, Renchunzi Xie, Lei Feng, Bo An

Deep neural networks usually perform poorly when the training dataset suffers from extreme class imbalance.

RMIX: Learning Risk-Sensitive Policies for Cooperative Reinforcement Learning Agents

no code implementations NeurIPS 2021 Wei Qiu, Xinrun Wang, Runsheng Yu, Rundong Wang, Xu He, Bo An, Svetlana Obraztsova, Zinovi Rabinovich

Current value-based multi-agent reinforcement learning methods optimize individual Q values to guide individuals' behaviours via centralized training with decentralized execution (CTDE).

Multi-agent Reinforcement Learning reinforcement-learning +3

Pretrained Cost Model for Distributed Constraint Optimization Problems

1 code implementation8 Dec 2021 Yanchen Deng, Shufeng Kong, Bo An

Our model, GAT-PCM, is then pretrained with optimally labelled data in an offline manner, so as to construct effective heuristics to boost a broad range of DCOP algorithms where evaluating the quality of a partial assignment is critical, such as local search or backtracking search.

Combinatorial Optimization Graph Attention

DeepScalper: A Risk-Aware Reinforcement Learning Framework to Capture Fleeting Intraday Trading Opportunities

no code implementations15 Dec 2021 Shuo Sun, Wanqi Xue, Rundong Wang, Xu He, Junlei Zhu, Jian Li, Bo An

Reinforcement learning (RL) techniques have shown great success in many challenging quantitative trading tasks, such as portfolio management and algorithmic trading.

Algorithmic Trading Decision Making +3

GearNet: Stepwise Dual Learning for Weakly Supervised Domain Adaptation

3 code implementations16 Jan 2022 Renchunzi Xie, Hongxin Wei, Lei Feng, Bo An

Although there have been a few studies on this problem, most of them only exploit unidirectional relationships from the source domain to the target domain.

Domain Adaptation

NSGZero: Efficiently Learning Non-Exploitable Policy in Large-Scale Network Security Games with Neural Monte Carlo Tree Search

no code implementations17 Jan 2022 Wanqi Xue, Bo An, Chai Kiat Yeo

Second, we enable neural MCTS with decentralized control, making NSGZero applicable to NSGs with many resources.

Mitigating Neural Network Overconfidence with Logit Normalization

2 code implementations19 May 2022 Hongxin Wei, Renchunzi Xie, Hao Cheng, Lei Feng, Bo An, Yixuan Li

Our method is motivated by the analysis that the norm of the logit keeps increasing during training, leading to overconfident output.
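
The snippet's observation is that the logit norm keeps growing during training, producing overconfident outputs. A hedged sketch of a logit-normalization loss follows: the logits are divided by their L2 norm and a temperature before the usual cross-entropy. The temperature value here is an arbitrary assumption.

```python
# Hedged sketch of a logit-normalization loss: normalizing the logits keeps
# their magnitude from growing without bound during training.
import torch
import torch.nn.functional as F

def logit_norm_cross_entropy(logits, targets, temperature=0.04):
    norms = logits.norm(p=2, dim=-1, keepdim=True) + 1e-7
    normalized_logits = logits / (norms * temperature)
    return F.cross_entropy(normalized_logits, targets)
```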

ResAct: Reinforcing Long-term Engagement in Sequential Recommendation with Residual Actor

1 code implementation1 Jun 2022 Wanqi Xue, Qingpeng Cai, Ruohan Zhan, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Meanwhile, reinforcement learning (RL) is widely regarded as a promising framework for optimizing long-term engagement in sequential recommendation.

Reinforcement Learning (RL) Sequential Recommendation

Open-Sampling: Exploring Out-of-Distribution data for Re-balancing Long-tailed datasets

3 code implementations17 Jun 2022 Hongxin Wei, Lue Tao, Renchunzi Xie, Lei Feng, Bo An

Deep neural networks usually perform poorly when the training dataset suffers from extreme class imbalance.

Offline Equilibrium Finding

1 code implementation12 Jul 2022 Shuxin Li, Xinrun Wang, Youzhi Zhang, Jakub Cerny, Pengdeng Li, Hau Chan, Bo An

Extensive experimental results demonstrate the superiority of our approach over offline RL algorithms and the importance of using model-based methods for OEF problems.

Offline RL

Deep Attentive Belief Propagation: Integrating Reasoning and Learning for Solving Constraint Optimization Problems

no code implementations24 Sep 2022 Yanchen Deng, Shufeng Kong, Caihua Liu, Bo An

Belief Propagation (BP) is an important message-passing algorithm for various reasoning tasks over graphical models, including solving the Constraint Optimization Problems (COPs).

Graph Attention Self-Supervised Learning

RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning

no code implementations18 Oct 2022 Wei Qiu, Xiao Ma, Bo An, Svetlana Obraztsova, Shuicheng Yan, Zhongwen Xu

Despite the recent advancement in multi-agent reinforcement learning (MARL), the MARL agents easily overfit the training environment and perform poorly in the evaluation scenarios where other agents behave differently.

Multi-agent Reinforcement Learning reinforcement-learning +1

Classifying Ambiguous Identities in Hidden-Role Stochastic Games with Multi-Agent Reinforcement Learning

1 code implementation24 Oct 2022 Shijie Han, Siyuan Li, Bo An, Wei Zhao, Peng Liu

In this work, we develop a novel identity detection reinforcement learning (IDRL) framework that allows an agent to dynamically infer the identities of nearby agents and select an appropriate policy to accomplish the task.

Multi-agent Reinforcement Learning reinforcement-learning +2

Generalized Consistent Multi-Class Classification with Rejection to be Compatible with Arbitrary Losses

2 code implementations Conference 2022 Yuzhou Cao, Tianchi Cai, Lei Feng, Lihong Gu, Jinjie Gu, Bo An, Gang Niu, Masashi Sugiyama

Classification with rejection (CwR) refrains from making a prediction to avoid critical misclassification when encountering test samples that are difficult to classify.

Classification Multi-class Classification

PrefRec: Recommender Systems with Human Preferences for Reinforcing Long-term User Engagement

1 code implementation6 Dec 2022 Wanqi Xue, Qingpeng Cai, Zhenghai Xue, Shuo Sun, Shuchang Liu, Dong Zheng, Peng Jiang, Kun Gai, Bo An

Though promising, the application of RL heavily relies on well-designed rewards, but designing rewards related to long-term user engagement is quite difficult.

Recommendation Systems Reinforcement Learning (RL)

Mitigating Memorization of Noisy Labels by Clipping the Model Prediction

no code implementations8 Dec 2022 Hongxin Wei, Huiping Zhuang, Renchunzi Xie, Lei Feng, Gang Niu, Bo An, Yixuan Li

In the presence of noisy labels, designing robust loss functions is critical for securing the generalization performance of deep neural networks.

Memorization

PRUDEX-Compass: Towards Systematic Evaluation of Reinforcement Learning in Financial Markets

no code implementations14 Jan 2023 Shuo Sun, Molei Qin, Xinrun Wang, Bo An

Specifically, i) we propose AlphaMix+ as a strong FinRL baseline, which leverages mixture-of-experts (MoE) and risk-sensitive approaches to make diversified risk-aware investment decisions, ii) we evaluate 8 FinRL methods in 4 long-term real-world datasets of influential financial markets to demonstrate the usage of our PRUDEX-Compass, iii) PRUDEX-Compass together with 4 real-world datasets, standard implementation of 8 FinRL methods and a portfolio management environment is released as public resources to facilitate the design and comparison of new FinRL methods.

Management reinforcement-learning +1

Reinforcement Learning from Diverse Human Preferences

no code implementations27 Jan 2023 Wanqi Xue, Bo An, Shuicheng Yan, Zhongwen Xu

The complexity of designing reward functions has been a major obstacle to the wide application of deep reinforcement learning (RL) techniques.

reinforcement-learning Reinforcement Learning (RL)

Towards Skilled Population Curriculum for Multi-Agent Reinforcement Learning

no code implementations7 Feb 2023 Rundong Wang, Longtao Zheng, Wei Qiu, Bowei He, Bo An, Zinovi Rabinovich, Yujing Hu, Yingfeng Chen, Tangjie Lv, Changjie Fan

Despite its success, ACL's applicability is limited by (1) the lack of a general student framework for dealing with the varying number of agents across tasks and the sparse reward problem, and (2) the non-stationarity of the teacher's task due to ever-changing student strategies.

Multi-agent Reinforcement Learning reinforcement-learning +1

Population-size-Aware Policy Optimization for Mean-Field Games

no code implementations7 Feb 2023 Pengdeng Li, Xinrun Wang, Shuxin Li, Hau Chan, Bo An

In this work, we attempt to bridge the two fields of finite-agent and infinite-agent games, by studying how the optimal policies of agents evolve with the number of agents (population size) in mean-field games, an agent-centric perspective in contrast to the existing works focusing typically on the convergence of the empirical distribution of the population.

Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control

1 code implementation13 Jun 2023 Longtao Zheng, Rundong Wang, Xinrun Wang, Bo An

To address these challenges, we introduce Synapse, a computer agent featuring three key components: i) state abstraction, which filters out task-irrelevant information from raw states, allowing more exemplars within the limited context, ii) trajectory-as-exemplar prompting, which prompts the LLM with complete trajectories of the abstracted states and actions to improve multi-step decision-making, and iii) exemplar memory, which stores the embeddings of exemplars and retrieves them via similarity search for generalization to novel tasks.

Decision Making In-Context Learning +1
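
The third component described in the snippet, exemplar memory, stores embeddings of exemplars and retrieves them via similarity search. A hedged sketch of such a memory with cosine-similarity retrieval is shown below; the storage layout and interface are hypothetical, not the released implementation.

```python
# Hedged sketch of an exemplar memory: embeddings of stored exemplars are kept
# in a list, and the most similar exemplars for a new query are retrieved by
# cosine similarity.
import numpy as np

class ExemplarMemory:
    def __init__(self):
        self.embeddings, self.exemplars = [], []

    def add(self, embedding, exemplar):
        self.embeddings.append(embedding / np.linalg.norm(embedding))
        self.exemplars.append(exemplar)

    def retrieve(self, query_embedding, top_k=3):
        query = query_embedding / np.linalg.norm(query_embedding)
        sims = np.stack(self.embeddings) @ query          # cosine similarities
        best = np.argsort(-sims)[:top_k]
        return [self.exemplars[i] for i in best]
```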

Partial-Label Regression

1 code implementation AAAI 2023 Xin Cheng, Deng-Bao Wang, Lei Feng, Min-Ling Zhang, Bo An

Our proposed methods are theoretically grounded and can be compatible with any models, optimizers, and losses.

Partial Label Learning regression +1

Weakly Supervised Regression with Interval Targets

no code implementations18 Jun 2023 Xin Cheng, Yuzhou Cao, Ximing Li, Bo An, Lei Feng

Third, we propose a statistically consistent limiting method for RIT to train the model by limiting the predictions to the interval.

regression
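
The snippet's limiting method trains the model by limiting predictions to the interval target. One illustrative reading of that idea is sketched below: predictions inside the interval incur no loss, while predictions outside it are pulled back to the nearest endpoint. This is a hedged interpretation, not necessarily the paper's exact loss.

```python
# Hedged sketch of "limiting" regression predictions to interval targets [low, high].
import torch

def interval_limiting_loss(predictions, low, high):
    """predictions, low, high: tensors of the same shape."""
    # clamp each prediction into its own interval [low, high]
    clamped = torch.maximum(torch.minimum(predictions, high), low)
    # zero loss inside the interval, squared distance to the nearest endpoint outside
    return ((predictions - clamped) ** 2).mean()
```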

Efficient Last-iterate Convergence Algorithms in Solving Games

no code implementations22 Aug 2023 Linjian Meng, Zhenxing Ge, Wenbin Li, Bo An, Yang Gao

Recent works propose a Reward Transformation (RT) framework for MWU, which removes the uniqueness condition and achieves competitive performance with OMWU.

counterfactual

Market-GAN: Adding Control to Financial Market Data Generation with Semantic Context

no code implementations14 Sep 2023 Haochong Xia, Shuo Sun, Xinrun Wang, Bo An

Financial simulators play an important role in enhancing forecasting accuracy, managing risks, and fostering strategic financial decision-making.

Stock Market Prediction text-guided-generation +1

EarnHFT: Efficient Hierarchical Reinforcement Learning for High Frequency Trading

1 code implementation22 Sep 2023 Molei Qin, Shuo Sun, Wentao Zhang, Haochong Xia, Xinrun Wang, Bo An

In stage II, we construct a pool of diverse RL agents for different market trends, distinguished by return rates, where hundreds of RL agents are trained with different preferences of return rates and only a tiny fraction of them will be selected into the pool based on their profitability.

Algorithmic Trading Hierarchical Reinforcement Learning

AdaRec: Adaptive Sequential Recommendation for Reinforcing Long-term User Engagement

no code implementations6 Oct 2023 Zhenghai Xue, Qingpeng Cai, Tianyou Zuo, Bin Yang, Lantao Hu, Peng Jiang, Kun Gai, Bo An

One challenge in large-scale online recommendation systems is the constant and complicated changes in users' behavior patterns, such as interaction rates and retention tendencies.

Reinforcement Learning (RL) Sequential Recommendation

Reinforcement Learning with Maskable Stock Representation for Portfolio Management in Customizable Stock Pools

1 code implementation17 Nov 2023 Wentao Zhang, Yilei Zhao, Shuo Sun, Jie Ying, Yonggang Xie, Zitao Song, Xinrun Wang, Bo An

Specifically, the target stock pool of different investors varies dramatically due to their discrepancy on market states, and individual investors may temporally adjust the stocks they desire to trade (e.g., adding one popular stock), which leads to customizable stock pools (CSPs).

Management reinforcement-learning +1

keqing: knowledge-based question answering is a nature chain-of-thought mentor of LLM

no code implementations31 Dec 2023 Chaojie Wang, Yishi Xu, Zhong Peng, Chenxi Zhang, Bo Chen, Xinrun Wang, Lei Feng, Bo An

Large language models (LLMs) have exhibited remarkable performance on various natural language processing (NLP) tasks, especially for question answering.

Information Retrieval Question Answering +1

Leveraging Gradients for Unsupervised Accuracy Estimation under Distribution Shift

no code implementations17 Jan 2024 Renchunzi Xie, Ambroise Odonnat, Vasilii Feofanov, Ievgen Redko, Jianfeng Zhang, Bo An

Our key idea is that the model should be adjusted with a higher magnitude of gradients when it does not generalize to the test dataset with a distribution shift.
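
The key idea in the snippet is that larger gradients on a shifted test set indicate poorer generalization. A hedged sketch of turning that idea into a score is given below: a label-free proxy loss (here, the entropy of the model's predictions) is backpropagated on an unlabeled test batch and the resulting gradient norm is used as the signal. The entropy proxy and the norm over all parameters are assumptions, not necessarily the score defined in the paper.

```python
# Hedged sketch: gradient-norm signal on an unlabeled, possibly shifted test batch.
import torch
import torch.nn.functional as F

def gradient_norm_score(model, test_batch):
    model.zero_grad()
    logits = model(test_batch)
    probs = F.softmax(logits, dim=-1)
    # label-free proxy loss: mean prediction entropy over the batch
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum(-1).mean()
    entropy.backward()
    grads = [p.grad.detach().flatten() for p in model.parameters() if p.grad is not None]
    return torch.cat(grads).norm().item()   # larger values suggest weaker generalization
```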

Debiased Sample Selection for Combating Noisy Labels

1 code implementation24 Jan 2024 Qi Wei, Lei Feng, Haobo Wang, Bo An

To address this limitation, we propose a noIse-Tolerant Expert Model (ITEM) for debiased learning in sample selection.

Learning with noisy labels

True Knowledge Comes from Practice: Aligning LLMs with Embodied Environments via Reinforcement Learning

1 code implementation25 Jan 2024 Weihao Tan, Wentao Zhang, Shanqi Liu, Longtao Zheng, Xinrun Wang, Bo An

Despite the impressive performance across numerous tasks, large language models (LLMs) often fail in solving simple decision-making tasks due to the misalignment of the knowledge in LLMs with environments.

Decision Making Reinforcement Learning (RL)

Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study

2 code implementations5 Mar 2024 Weihao Tan, Ziluo Ding, Wentao Zhang, Boyu Li, Bohan Zhou, Junpeng Yue, Haochong Xia, Jiechuan Jiang, Longtao Zheng, Xinrun Xu, Yifei Bi, Pengjie Gu, Xinrun Wang, Börje F. Karlsson, Bo An, Zongqing Lu

Despite the success in specific tasks and scenarios, existing foundation agents, empowered by large models (LMs) and advanced tools, still cannot generalize to different scenarios, mainly due to dramatic differences in the observations and actions across scenarios.

Efficient Exploration

AgentStudio: A Toolkit for Building General Virtual Agents

no code implementations26 Mar 2024 Longtao Zheng, Zhiyuan Huang, Zhenghai Xue, Xinrun Wang, Bo An, Shuicheng Yan

We have open-sourced the environments, datasets, benchmarks, and interfaces to promote research towards developing general virtual agents for the future.

Visual Grounding

Converging to Team-Maxmin Equilibria in Zero-Sum Multiplayer Games

no code implementations ICML 2020 Youzhi Zhang, Bo An

Second, we design an ISG variant for TMEs (ISGT) by exploiting the fact that a TME is an NE maximizing the team’s utility, and show that ISGT converges to a TME and that the conditions in ISGT cannot be relaxed.
