3 code implementations • 29 Aug 2024 • Wenyi Hong, Weihan Wang, Ming Ding, Wenmeng Yu, Qingsong Lv, Yan Wang, Yean Cheng, Shiyu Huang, Junhui Ji, Zhao Xue, Lei Zhao, Zhuoyi Yang, Xiaotao Gu, Xiaohan Zhang, Guanyu Feng, Da Yin, Zihan Wang, Ji Qi, Xixuan Song, Peng Zhang, Debing Liu, Bin Xu, Juanzi Li, Yuxiao Dong, Jie Tang
Beginning with VisualGLM and CogVLM, we have been continuously exploring VLMs in pursuit of enhanced vision-language fusion, efficient higher-resolution architectures, and broader modalities and applications.
Ranked #5 on Visual Question Answering on MM-Vet
1 code implementation • 12 Aug 2024 • Zhuoyi Yang, Jiayan Teng, Wendi Zheng, Ming Ding, Shiyu Huang, Jiazheng Xu, Yuanming Yang, Wenyi Hong, Xiaohan Zhang, Guanyu Feng, Da Yin, Xiaotao Gu, Yuxuan Zhang, Weihan Wang, Yean Cheng, Ting Liu, Bin Xu, Yuxiao Dong, Jie Tang
We present CogVideoX, a large-scale text-to-video generation model based on a diffusion transformer, which can generate 10-second continuous videos aligned with a text prompt, at a frame rate of 16 fps and a resolution of 768×1360 pixels.
no code implementations • 2 Aug 2024 • Ruize Zhang, Zelai Xu, Chengdong Ma, Chao Yu, Wei-Wei Tu, Shiyu Huang, Deheng Ye, Wenbo Ding, Yaodong Yang, Yu Wang
Self-play, in which agents interact with copies or past versions of themselves, has recently gained prominence in reinforcement learning.
Multi-agent Reinforcement Learning • reinforcement-learning +2
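As a concrete illustration of the pattern this survey covers, here is a minimal self-play training loop that keeps a pool of frozen past copies of the learning agent; the `play_match` and `update` interfaces are hypothetical, not code from the paper.

```python
import copy
import random

class SelfPlayTrainer:
    """Minimal sketch of self-play: train against past versions of yourself."""

    def __init__(self, agent, pool_size=10):
        self.agent = agent
        self.opponent_pool = [copy.deepcopy(agent)]  # frozen past selves
        self.pool_size = pool_size

    def train_step(self, play_match, update):
        # Sample an opponent from the pool of past copies.
        opponent = random.choice(self.opponent_pool)
        trajectory = play_match(self.agent, opponent)
        update(self.agent, trajectory)  # e.g., a PPO update on the trajectory

    def snapshot(self):
        # Periodically freeze the current agent into the opponent pool.
        self.opponent_pool.append(copy.deepcopy(self.agent))
        if len(self.opponent_pool) > self.pool_size:
            self.opponent_pool.pop(0)
```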
no code implementations • 24 Jun 2024 • Yajing Pei, Shiyu Huang, Yiting Lu, Xin Li, Zhibo Chen
User-generated content (UGC) videos are subject to complicated and varied degradations and contents, which prevents existing blind video quality assessment (BVQA) models from performing well, since they lack adaptability to diverse distortions and contents.
1 code implementation • 20 Jun 2024 • Wentse Chen, Shiyu Huang, Jeff Schneider
In this paper, we propose an enhancement to QMIX by incorporating an additional local Q-value learning method within the maximum entropy RL framework.
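One plausible way to realize such a local maximum-entropy term is a per-agent soft value regression alongside the usual QMIX TD loss; this is a hedged sketch under that assumption (the temperature `alpha` and interfaces are illustrative, not the paper's exact formulation).

```python
import torch
import torch.nn.functional as F

def soft_local_loss(q_agent, obs, next_obs, reward, gamma=0.99, alpha=0.05):
    """Sketch: a maximum-entropy (soft) local Q-learning term for one agent."""
    q = q_agent(obs)  # [batch, n_actions]
    with torch.no_grad():
        q_next = q_agent(next_obs)
        # Soft state value: V(s') = alpha * logsumexp(Q(s', .) / alpha)
        v_next = alpha * torch.logsumexp(q_next / alpha, dim=-1)
        target = reward + gamma * v_next
    # Regress the current soft value toward the bootstrapped target.
    v = alpha * torch.logsumexp(q / alpha, dim=-1)
    return F.mse_loss(v, target)
```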
1 code implementation • 12 Jun 2024 • Weihan Wang, Zehai He, Wenyi Hong, Yean Cheng, Xiaohan Zhang, Ji Qi, Shiyu Huang, Bin Xu, Yuxiao Dong, Ming Ding, Jie Tang
To address this gap, we introduce LVBench, a benchmark specifically designed for long video understanding.
1 code implementation • 26 Feb 2024 • Junzhe Chen, Xuming Hu, Shuodi Liu, Shiyu Huang, Wei-Wei Tu, Zhaofeng He, Lijie Wen
Recent advancements in large language models (LLMs) have revealed their potential for building autonomous agents with human-level intelligence.
1 code implementation • 16 Feb 2024 • Yiwen Sun, Xianyin Zhang, Shiyu Huang, Shaowei Cai, BingZhen Zhang, Ke Wei
Heuristics are crucial in SAT solvers, but no single set of heuristic rules suits all SAT problems.
1 code implementation • 20 Dec 2023 • Shiyu Huang, Wentse Chen, Yiwen Sun, Fuqing Bie, Wei-Wei Tu
We present OpenRL, an advanced reinforcement learning (RL) framework designed to accommodate a diverse array of tasks, from single-agent challenges to complex multi-agent systems.
1 code implementation • 5 Sep 2023 • Haixu Song, Shiyu Huang, Yinpeng Dong, Wei-Wei Tu
The rise of deepfake images, especially of well-known personalities, poses a serious threat to the dissemination of authentic information.
1 code implementation • 23 Aug 2023 • Fanqi Lin, Shiyu Huang, Wei-Wei Tu
Within this framework, we also propose a provably efficient diversity reinforcement learning algorithm.
2 code implementations • NeurIPS 2023 • Bill Yuchen Lin, Yicheng Fu, Karina Yang, Faeze Brahman, Shiyu Huang, Chandra Bhagavatula, Prithviraj Ammanabrolu, Yejin Choi, Xiang Ren
The Swift module is a small encoder-decoder LM fine-tuned on the oracle agent's action trajectories, while the Sage module employs LLMs such as GPT-4 for subgoal planning and grounding.
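The fast/slow division of labor can be pictured with a short dispatch loop; the method names and the confidence-based trigger below are hypothetical stand-ins for the paper's actual switching mechanism.

```python
def act(state, swift_model, sage_llm, confidence_threshold=0.8):
    """Sketch of a SwiftSage-style fast/slow agent step (hypothetical APIs)."""
    # Fast path: the small Swift LM proposes the next action directly.
    action, confidence = swift_model.propose(state)
    if confidence >= confidence_threshold and not state.anomaly_detected:
        return action
    # Slow path: fall back to the LLM-based Sage module, which plans
    # subgoals and grounds them into executable actions.
    subgoals = sage_llm.plan(state)
    return sage_llm.ground(state, subgoals)[0]
```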
1 code implementation • 15 Feb 2023 • Fanqi Lin, Shiyu Huang, Tim Pearce, Wenze Chen, Wei-Wei Tu
Multi-agent football poses an unsolved challenge in AI research.
1 code implementation • 8 Feb 2023 • Xinyi Yang, Shiyu Huang, Yiwen Sun, Yuxiang Yang, Chao Yu, Wei-Wei Tu, Huazhong Yang, Yu Wang
Goal-conditioned hierarchical reinforcement learning (HRL) provides a promising direction to tackle this challenge by introducing a hierarchical structure to decompose the search space, where the low-level policy predicts primitive actions under the guidance of goals derived from the high-level policy.
Hierarchical Reinforcement Learning • Multi-agent Reinforcement Learning +2
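The two-level control scheme can be sketched as follows: the high-level policy emits a goal every few steps and the low-level policy conditions its primitive actions on it (policy and environment interfaces are assumed, not taken from the paper's code).

```python
def rollout(env, high_policy, low_policy, horizon=500, goal_interval=10):
    """Sketch of goal-conditioned hierarchical control."""
    obs = env.reset()
    goal = None
    for t in range(horizon):
        if t % goal_interval == 0:
            # High level: emit a goal (e.g., a target state) every k steps.
            goal = high_policy(obs)
        # Low level: primitive actions conditioned on observation and goal.
        action = low_policy(obs, goal)
        obs, reward, done, _ = env.step(action)
        if done:
            break
```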
2 code implementations • 12 Jul 2022 • Wentse Chen, Shiyu Huang, Yuan Chiang, Tim Pearce, Wei-Wei Tu, Ting Chen, Jun Zhu
We propose Diversity-Guided Policy Optimization (DGPO), an on-policy algorithm that discovers multiple strategies for solving a given task.
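One common ingredient for discovering multiple strategies is to condition the policy on a latent skill variable z and reward states that a discriminator can attribute to their skill; the sketch below shows that generic intrinsic reward (a uniform prior over z is assumed; see the DGPO paper for its exact constrained objective).

```python
import torch

def diversity_reward(discriminator, state, z, n_skills):
    """Sketch: r = log q(z|s) - log p(z), higher when s is distinctive for z."""
    logits = discriminator(state)  # [batch, n_skills]
    log_q = torch.log_softmax(logits, dim=-1)
    log_q_z = log_q.gather(-1, z.unsqueeze(-1)).squeeze(-1)
    # With a uniform prior, -log p(z) = log(n_skills).
    return log_q_z + torch.log(torch.tensor(float(n_skills)))
```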
1 code implementation • 9 Oct 2021 • Shiyu Huang, Wenze Chen, Longfei Zhang, Shizhen Xu, Ziyang Li, Fengming Zhu, Deheng Ye, Ting Chen, Jun Zhu
To the best of our knowledge, TiKick is the first learning-based AI system that can take on the full multi-agent Google Research Football game, whereas previous work could only control a single agent or experiment in toy academic scenarios.
1 code implementation • 8 Oct 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Ting Chen, Jun Zhu
In this work, we propose a new algorithm for circuit routing, named Ranking Cost, which innovatively combines search-based methods (i.e., the A* algorithm) and learning-based methods (i.e., Evolution Strategies) to form an efficient and trainable router.
no code implementations • 1 Jan 2021 • Shiyu Huang, Bin Wang, Dong Li, Jianye Hao, Jun Zhu, Ting Chen
In our method, we introduce a new set of variables called cost maps, which help the A* router find proper paths that achieve the global objective.
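The interplay of the two components can be sketched with a vanilla evolution-strategies update over the cost map: each perturbed map is scored by running A* on the base grid plus the map (the `route_quality` helper, which wraps A* and returns a routing score, is an assumed stand-in for the paper's evaluator).

```python
import numpy as np

def es_step(cost_map, route_quality, sigma=0.1, lr=0.01, population=16):
    """Sketch: tune a learned A* cost map with a simple evolution strategy."""
    grads = np.zeros_like(cost_map)
    for _ in range(population):
        noise = np.random.randn(*cost_map.shape)
        # Evaluate a perturbed cost map by routing with A* on (base + map).
        score = route_quality(cost_map + sigma * noise)
        grads += score * noise
    # ES gradient estimate: move the cost map toward higher-scoring noise.
    return cost_map + lr * grads / (population * sigma)
```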
no code implementations • ICLR 2020 • Shiyu Huang, Hang Su, Jun Zhu, Ting Chen
Partially Observable Markov Decision Processes (POMDPs) are popular and flexible models for real-world decision-making applications that demand the information from past observations to make optimal decisions.
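The need for past observations comes from the standard Bayesian belief update, sketched here for the discrete case with known transition tensor T = P(s'|s,a) and observation matrix O = P(o|s') (a textbook formulation, not the paper's method).

```python
import numpy as np

def belief_update(b, a, o, T, O):
    """Sketch: b'(s') ∝ O[o, s'] * sum_s T[a, s, s'] * b(s)."""
    b_pred = b @ T[a]           # predict: marginalize over the previous state
    b_new = O[o] * b_pred       # correct: weight by the observation likelihood
    return b_new / b_new.sum()  # normalize to a probability distribution
```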
1 code implementation • CVPR 2017 • Shiyu Huang, Deva Ramanan
Such "in-the-tail" data is notoriously hard to observe, making both training and testing difficult.