Search Results for author: Zhen Xiao

Found 20 papers, 5 papers with code

Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like

no code implementations • 12 Feb 2024 • Naoyuki Kanda, Xiaofei Wang, Sefik Emre Eskimez, Manthan Thakker, Hemin Yang, Zirun Zhu, Min Tang, Canrun Li, Chung-Hsien Tsai, Zhen Xiao, Yufei Xia, Jinzhu Li, Yanqing Liu, Sheng Zhao, Michael Zeng

In this work, we propose ELaTE, a zero-shot TTS that can generate natural laughing speech of any speaker based on a short audio prompt with precise control of laughter timing and expression.

Understanding the Weakness of Large Language Model Agents within a Complex Android Environment

1 code implementation • 9 Feb 2024 • Mingzhe Xing, Rongkai Zhang, Hui Xue, Qi Chen, Fan Yang, Zhen Xiao

These challenges motivate AndroidArena, an environment and benchmark designed to evaluate LLM agents on a modern operating system.

Date Understanding Language Modelling +1

DTMM: Deploying TinyML Models on Extremely Weak IoT Devices with Pruning

no code implementations • 17 Jan 2024 • Lixiang Han, Zhen Xiao, Zhenjiang Li

DTMM is a library designed for efficient deployment and execution of machine learning models on weak IoT devices such as microcontroller units (MCUs).

PDiT: Interleaving Perception and Decision-making Transformers for Deep Reinforcement Learning

2 code implementations • 26 Dec 2023 • Hangyu Mao, Rui Zhao, Ziyue Li, Zhiwei Xu, Hao Chen, Yiqun Chen, Bin Zhang, Zhen Xiao, Junge Zhang, Jiangjin Yin

Designing better deep networks and better reinforcement learning (RL) algorithms are both important for deep RL.

Decision Making Offline RL +2

AttMOT: Improving Multiple-Object Tracking by Introducing Auxiliary Pedestrian Attributes

no code implementations • 15 Aug 2023 • Yunhao Li, Zhen Xiao, Lin Yang, Dan Meng, Xin Zhou, Heng Fan, Libo Zhang

To the best of our knowledge, AttMOT is the first MOT dataset with semantic attributes.

Attribute Multi-Object Tracking +1

AGAIN: Adversarial Training With Attribution Span Enlargement and Hybrid Feature Fusion

1 code implementation • CVPR 2023 • Shenglin Yin, Kelu Yao, Sheng Shi, Yangzhou Du, Zhen Xiao

To this end, compared with standard DNNs, we discover that the generalization gap of adversarially trained DNNs is caused by the smaller attribution span on the input image.

Transformer in Transformer as Backbone for Deep Reinforcement Learning

1 code implementation • 30 Dec 2022 • Hangyu Mao, Rui Zhao, Hao Chen, Jianye Hao, Yiqun Chen, Dong Li, Junge Zhang, Zhen Xiao

Recent methods combine the Transformer with these modules for better performance.

Decision Making reinforcement-learning +1

AnimalTrack: A Benchmark for Multi-Animal Tracking in the Wild

no code implementations • 30 Apr 2022 • Libo Zhang, Junyuan Gao, Zhen Xiao, Heng Fan

Multi-animal tracking (MAT), a multi-object tracking (MOT) problem, is crucial for animal motion and behavior analysis and has many important applications in areas such as biology, ecology, and animal conservation.

Multi-Object Tracking

Florence: A New Foundation Model for Computer Vision

1 code implementation • 22 Nov 2021 • Lu Yuan, Dongdong Chen, Yi-Ling Chen, Noel Codella, Xiyang Dai, Jianfeng Gao, Houdong Hu, Xuedong Huang, Boxin Li, Chunyuan Li, Ce Liu, Mengchen Liu, Zicheng Liu, Yumao Lu, Yu Shi, Lijuan Wang, JianFeng Wang, Bin Xiao, Zhen Xiao, Jianwei Yang, Michael Zeng, Luowei Zhou, Pengchuan Zhang

Computer vision foundation models, which are trained on diverse, large-scale datasets and can be adapted to a wide range of downstream tasks, are critical for this mission to solve real-world computer vision applications.

Ranked #1 on Action Recognition In Videos on Kinetics-600

Action Classification Action Recognition In Videos +12

Learning Explicit Credit Assignment for Multi-agent Joint Q-learning

no code implementations • 29 Sep 2021 • Hangyu Mao, Jianye Hao, Dong Li, Jun Wang, Weixun Wang, Xiaotian Hao, Bin Wang, Kun Shao, Zhen Xiao, Wulong Liu

In contrast, we formulate an explicit credit assignment problem where each agent gives its suggestion about how to weight individual Q-values to explicitly maximize the joint Q-value, besides guaranteeing the Bellman optimality of the joint Q-value.

Q-Learning

Speech-language Pre-training for End-to-end Spoken Language Understanding

no code implementations • 11 Feb 2021 • Yao Qian, Ximo Bian, Yu Shi, Naoyuki Kanda, Leo Shen, Zhen Xiao, Michael Zeng

End-to-end (E2E) spoken language understanding (SLU) can infer semantics directly from the speech signal without cascading an automatic speech recognizer (ASR) with a natural language understanding (NLU) module.

Ranked #3 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)

Language Modelling Natural Language Understanding +1

Reward Design in Cooperative Multi-agent Reinforcement Learning for Packet Routing

no code implementations • ICLR 2018 • Hangyu Mao, Zhibo Gong, Zhen Xiao

In this paper, we study the reward design problem in cooperative MARL based on packet routing environments.

Multi-agent Reinforcement Learning reinforcement-learning +1

Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning

no code implementations • 3 Dec 2019 • Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao

Social psychology and real experiences show that cognitive consistency plays an important role in keeping human society in order: if people have a more consistent cognition about their environments, they are more likely to achieve better cooperation.

Multi-agent Reinforcement Learning Q-Learning +2

Learning Agent Communication under Limited Bandwidth by Message Pruning

no code implementations • 3 Dec 2019 • Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong, Yan Ni

We evaluate the gating mechanism on several tasks.

Learning Multi-agent Communication under Limited-bandwidth Restriction for Internet Packet Routing

no code implementations • 26 Feb 2019 • Hangyu Mao, Zhibo Gong, Zhengchao Zhang, Zhen Xiao, Yan Ni

Communication is an important factor for the big multi-agent world to stay organized and productive.

Modelling the Dynamic Joint Policy of Teammates with Attention Multi-agent DDPG

no code implementations • 13 Nov 2018 • Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong

Second, to model the teammates' policies using the collected information in an effective way, ATT-MADDPG enhances the centralized critic with an attention mechanism.

Reinforcement Learning (RL)