no code implementations • 2 Feb 2025 • Haoran Qiu, Anish Biswas, Zihan Zhao, Jayashree Mohan, Alind Khare, Esha Choukse, Íñigo Goiri, Zeyu Zhang, Haiying Shen, Chetan Bansal, Ramachandran Ramjee, Rodrigo Fonseca
Large multimodal models (LMMs) demonstrate impressive capabilities in understanding images, videos, and audio beyond text.
no code implementations • 5 Dec 2024 • Subash Katel, Haoyang Li, Zihan Zhao, Raghav Kansal, Farouk Mokhtar, Javier Duarte
In high energy physics, self-supervised learning (SSL) methods have the potential to aid in the creation of machine learning models without the need for labeled datasets for a variety of tasks, including those related to jets -- narrow sprays of particles produced by quarks and gluons in high energy particle collisions.
1 code implementation • 9 Nov 2024 • Jinghan He, Haiyun Guo, Kuan Zhu, Zihan Zhao, Ming Tang, Jinqiao Wang
In this work, we first explore and emphasize the importance of attention weights in knowledge retention, and then propose a SElective attEntion-guided Knowledge Retention method (SEEKR) for data-efficient replay-based continual learning of large language models (LLMs).
no code implementations • 27 Sep 2024 • Liangtai Sun, Danyu Luo, Da Ma, Zihan Zhao, Baocai Chen, Zhennan Shen, Su Zhu, Lu Chen, Xin Chen, Kai Yu
We further analyze the expert layers and show that the results of expert selection vary with data from different disciplines.
no code implementations • 20 Sep 2024 • Zihan Zhao, Bo Chen, Jingpiao Li, Lu Chen, Liyang Wen, Pengyu Wang, Zichen Zhu, Danyang Zhang, Ziping Wan, Yansi Li, Zhongyang Dai, Xin Chen, Kai Yu
Rapid developments in AI tools are expected to offer unprecedented assistance to research in the natural sciences, including chemistry.
no code implementations • 16 Jun 2024 • Pengfei Gu, Zihan Zhao, Hongxiao Wang, Yaopeng Peng, Yizhe Zhang, Nishchal Sapkota, Chaoli Wang, Danny Z. Chen
The Segment Anything Model (SAM) exhibits impressive capabilities in zero-shot segmentation for natural images.
1 code implementation • 28 Feb 2024 • Hongshen Xu, Lu Chen, Zihan Zhao, Da Ma, Ruisheng Cao, Zichen Zhu, Kai Yu
Additionally, we propose several pre-training tasks to model the interaction among text, structure, and image modalities effectively.
no code implementations • 5 Feb 2024 • Zichen Zhu, Yang Xu, Lu Chen, Jingkai Yang, Yichuan Ma, Yiming Sun, Hailin Wen, Jiaqi Liu, Jinyu Cai, Yingzi Ma, Situo Zhang, Zihan Zhao, Liangtai Sun, Kai Yu
The rapid development of multimodal large language models (MLLMs) raises the question of how they compare to human performance.
no code implementations • 1 Feb 2024 • Qun Ma, Xiao Xue, Deyu Zhou, Xiangning Yu, Donghua Liu, Xuwen Zhang, Zihan Zhao, Yifan Shen, Peilin Ji, Juanjuan Li, Gang Wang, Wanpeng Ma
These agents, known as LLM-based agents, offer the potential to enhance the anthropomorphism lacking in agent-based modeling (ABM).
1 code implementation • 26 Jan 2024 • Zihan Zhao, Da Ma, Lu Chen, Liangtai Sun, Zihao Li, Yi Xia, Bo Chen, Hongshen Xu, Zichen Zhu, Su Zhu, Shuai Fan, Guodong Shen, Kai Yu, Xin Chen
In its utmost form, such a generalist AI chemist could be referred to as Chemical General Intelligence.
2 code implementations • 25 Aug 2023 • Liangtai Sun, Yang Han, Zihan Zhao, Da Ma, Zhennan Shen, Baocai Chen, Lu Chen, Kai Yu
This design suffers from a data leakage problem and lacks an evaluation of subjective Q/A ability.
1 code implementation • 20 Aug 2023 • Zihan Zhao, Yiyang Jiang, Heyang Liu, Yanfeng Wang, Yu Wang
While Large Language Models (LLMs) have demonstrated commendable performance across a myriad of domains and tasks, existing LLMs still exhibit a palpable deficit in handling multimodal functionalities, especially for the Spoken Question Answering (SQA) task which necessitates precise alignment and deep interaction between speech and text features.
1 code implementation • NeurIPS 2023 • Danyang Zhang, Lu Chen, Situo Zhang, Hongshen Xu, Zihan Zhao, Kai Yu
By equipping the LLM with a long-term experience memory, REMEMBERER is capable of exploiting experiences from past episodes even for different task goals, which gives it an advantage over an LLM-based agent with fixed exemplars or one equipped with a transient working memory.
2 code implementations • 14 May 2023 • Danyang Zhang, Zhennan Shen, Rui Xie, Situo Zhang, Tianbao Xie, Zihan Zhao, Siyuan Chen, Lu Chen, Hongshen Xu, Ruisheng Cao, Kai Yu
The Graphical User Interface (GUI) is pivotal for human interaction with the digital world, enabling efficient device control and the completion of complex tasks.
no code implementations • 20 Feb 2023 • Zihan Zhao, Yu Wang, Yanfeng Wang
Multimodal emotion recognition is a challenging research area that aims to fuse different modalities to predict human emotion.
no code implementations • 26 Sep 2022 • Chuang Liu, Lei Kou, Guowei Cai, Zihan Zhao, Zhe Zhang
Power electronics converters have been widely used in aerospace systems, DC transmission, distributed energy, smart grids, and so forth, and their reliability has been a hot topic in academia and industry.
1 code implementation • 11 Jul 2022 • Zihan Zhao, Yanfeng Wang, Yu Wang
The research and applications of multimodal emotion recognition have become increasingly popular recently.
1 code implementation • NAACL 2022 • Zihan Zhao, Lu Chen, Ruisheng Cao, Hongshen Xu, Xingyu Chen, Kai Yu
Recently, the structural reading comprehension (SRC) task on web pages has attracted increasing research interest.
1 code implementation • EMNLP 2021 • Xingyu Chen, Zihan Zhao, Lu Chen, Danyang Zhang, Jiabao Ji, Ao Luo, Yuxuan Xiong, Kai Yu
In this paper, we introduce the task of structural reading comprehension (SRC) on the web.
no code implementations • 14 Oct 2020 • Zihan Zhao, Yuncong Liu, Lu Chen, Qi Liu, Rao Ma, Kai Yu
Recently, pre-trained language models like BERT have shown promising performance on multiple natural language processing tasks.
1 code implementation • 25 Sep 2020 • Zhangxuan Gu, Siyuan Zhou, Li Niu, Zihan Zhao, Liqing Zhang
Thus, we focus on zero-shot semantic segmentation, which aims to segment unseen objects with only category-level semantic representations provided for unseen categories.
2 code implementations • 16 Aug 2020 • Zhangxuan Gu, Siyuan Zhou, Li Niu, Zihan Zhao, Liqing Zhang
In this paper, we propose a novel context-aware feature generation method for zero-shot segmentation named CaGNet.
Ranked #9 on Zero-Shot Semantic Segmentation on PASCAL VOC