1 code implementation • 14 Oct 2024 • Mingchen Zhuge, Changsheng Zhao, Dylan Ashley, Wenyi Wang, Dmitrii Khizbullin, Yunyang Xiong, Zechun Liu, Ernie Chang, Raghuraman Krishnamoorthi, Yuandong Tian, Yangyang Shi, Vikas Chandra, Jürgen Schmidhuber
To address this, we introduce the Agent-as-a-Judge framework, wherein agentic systems are used to evaluate agentic systems.
1 code implementation • 14 Oct 2024 • Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu, Fengwei Teng, Xionghui Chen, Jiaqi Chen, Mingchen Zhuge, Xin Cheng, Sirui Hong, Jinlin Wang, Bingnan Zheng, Bang Liu, Yuyu Luo, Chenglin Wu
Large language models (LLMs) have demonstrated remarkable potential in solving complex tasks across diverse domains, typically by employing agentic workflows that follow detailed instructions and operational sequences.
Ranked #3 on Code Generation on HumanEval
no code implementations • 24 Jul 2024 • Xiuying Chen, Tairan Wang, Taicheng Guo, Kehan Guo, Juexiao Zhou, Haoyang Li, Mingchen Zhuge, Jürgen Schmidhuber, Xin Gao, Xiangliang Zhang
We hope our benchmark and model can facilitate and promote more research on chemical QA.
2 code implementations • 23 Jul 2024 • Xingyao Wang, Boxuan Li, Yufan Song, Frank F. Xu, Xiangru Tang, Mingchen Zhuge, Jiayi Pan, Yueqi Song, Bowen Li, Jaskirat Singh, Hoang H. Tran, Fuqiang Li, Ren Ma, Mingzhang Zheng, Bill Qian, Yanjun Shao, Niklas Muennighoff, Yizhe Zhang, Binyuan Hui, Junyang Lin, Robert Brennan, Hao Peng, Heng Ji, Graham Neubig
OpenDevin), a platform for the development of powerful and flexible AI agents that interact with the world in similar ways to those of a human developer: by writing code, interacting with a command line, and browsing the web.
1 code implementation • 17 Jul 2024 • Kirolos Ataallah, Xiaoqian Shen, Eslam Abdelrahman, Essam Sleiman, Mingchen Zhuge, Jian Ding, Deyao Zhu, Jürgen Schmidhuber, Mohamed Elhoseiny
This design of the retrieval mechanism enables the Goldfish to efficiently process arbitrarily long video sequences, facilitating its application in contexts such as movies or television series.
1 code implementation • 28 Feb 2024 • Sirui Hong, Yizhang Lin, Bang Liu, Bangbang Liu, Binhao Wu, Ceyao Zhang, Chenxing Wei, Danyang Li, Jiaqi Chen, Jiayi Zhang, Jinlin Wang, Li Zhang, Lingyao Zhang, Min Yang, Mingchen Zhuge, Taicheng Guo, Tuo Zhou, Wei Tao, Xiangru Tang, Xiangtao Lu, Xiawu Zheng, Xinbing Liang, Yaying Fei, Yuheng Cheng, Zhibin Gou, Zongze Xu, Chenglin Wu
On InfiAgent-DABench, it achieves a 25% performance boost, raising accuracy from 75. 9% to 94. 9%.
2 code implementations • 26 Feb 2024 • Mingchen Zhuge, Wenyi Wang, Louis Kirsch, Francesco Faccio, Dmitrii Khizbullin, Jürgen Schmidhuber
Various human-designed prompt engineering techniques have been proposed to improve problem solvers based on Large Language Models (LLMs), yielding many disparate code bases.
1 code implementation • ICCV 2023 • Haozhe Liu, Mingchen Zhuge, Bing Li, Yuhui Wang, Francesco Faccio, Bernard Ghanem, Jürgen Schmidhuber
Recent work on deep reinforcement learning (DRL) has pointed out that algorithmic information about good policies can be extracted from offline data which lack explicit information about executed actions.
1 code implementation • 1 Aug 2023 • Sirui Hong, Mingchen Zhuge, Jiaqi Chen, Xiawu Zheng, Yuheng Cheng, Ceyao Zhang, Jinlin Wang, Zili Wang, Steven Ka Shing Yau, Zijuan Lin, Liyang Zhou, Chenyu Ran, Lingfeng Xiao, Chenglin Wu, Jürgen Schmidhuber
Remarkable progress has been made on automated problem solving through societies of agents based on large language models (LLMs).
Ranked #12 on Code Generation on HumanEval
no code implementations • 26 May 2023 • Mingchen Zhuge, Haozhe Liu, Francesco Faccio, Dylan R. Ashley, Róbert Csordás, Anand Gopalakrishnan, Abdullah Hamdi, Hasan Abed Al Kader Hammoud, Vincent Herrmann, Kazuki Irie, Louis Kirsch, Bing Li, Guohao Li, Shuming Liu, Jinjie Mai, Piotr Piękos, Aditya Ramesh, Imanol Schlag, Weimin Shi, Aleksandar Stanić, Wenyi Wang, Yuhui Wang, Mengmeng Xu, Deng-Ping Fan, Bernard Ghanem, Jürgen Schmidhuber
What should be the social structure of an NLSOM?
no code implementations • 2 Feb 2023 • Weimin Shi, Mingchen Zhuge, Dehong Gao, Zhong Zhou, Ming-Ming Cheng, Deng-Ping Fan
Daily images may convey abstract meanings that require us to memorize and infer profound information from them.
no code implementations • CVPR 2023 • Haoqian Wu, Keyu Chen, Haozhe Liu, Mingchen Zhuge, Bing Li, Ruizhi Qiao, Xiujun Shu, Bei Gan, Liangsheng Xu, Bo Ren, Mengmeng Xu, Wentian Zhang, Raghavendra Ramachandra, Chia-Wen Lin, Bernard Ghanem
Temporal video segmentation is the get-to-go automatic video analysis, which decomposes a long-form video into smaller components for the following-up understanding tasks.
1 code implementation • 8 Mar 2022 • Jingfei Xia, Mingchen Zhuge, Tiantian Geng, Shun Fan, Yuantai Wei, Zhenyu He, Feng Zheng
Figure skating scoring is challenging because it requires judging the technical moves of the players as well as their coordination with the background music.
1 code implementation • 5 Nov 2021 • Ge-Peng Ji, Lei Zhu, Mingchen Zhuge, Keren Fu
Camouflaged Object Detection (COD) aims to detect objects with similar patterns (e. g., texture, intensity, colour, etc) to their surroundings, and recently has attracted growing research interest.
Ranked #15 on Camouflaged Object Segmentation on PCOD_1200
1 code implementation • CVPR 2021 • Mingchen Zhuge, Dehong Gao, Deng-Ping Fan, Linbo Jin, Ben Chen, Haoming Zhou, Minghui Qiu, Ling Shao
We present a new vision-language (VL) pre-training model dubbed Kaleido-BERT, which introduces a novel kaleido strategy for fashion cross-modality representations from transformers.
3 code implementations • 19 Jan 2021 • Mingchen Zhuge, Deng-Ping Fan, Nian Liu, Dingwen Zhang, Dong Xu, Ling Shao
We define the concept of integrity at both a micro and macro level.
no code implementations • 14 Jan 2021 • Geng Chen, Xinrui Chen, Bo Dong, Mingchen Zhuge, Yongxiong Wang, Hongbo Bi, Jian Chen, Peng Wang, Yanning Zhang
Our method detects camouflaged objects with an effective fusion strategy, which aggregates the rich context information from a large receptive field.