Search Results for author: Yuxuan Sun

Found 36 papers, 16 papers with code

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology

no code implementations • 29 Jan 2024 • Yuxuan Sun, Hao Wu, Chenglu Zhu, Sunyi Zheng, Qizi Chen, Kai Zhang, Yunlong Zhang, Dan Wan, Xiaoxiao Lan, Mengyue Zheng, Jingxiong Li, Xinheng Lyu, Tao Lin, Lin Yang

To address this, we introduce PathMMU, the largest and highest-quality expert-validated pathology benchmark for Large Multimodal Models (LMMs).

Mobility Accelerates Learning: Convergence Analysis on Hierarchical Federated Learning in Vehicular Networks

no code implementations • 18 Jan 2024 • Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Deniz Gündüz, Zhisheng Niu

Hierarchical federated learning (HFL) enables distributed training of models across multiple devices with the help of several edge servers and a cloud server in a privacy-preserving manner.

Federated Learning Privacy Preserving

Benchmarking PathCLIP for Pathology Image Analysis

no code implementations • 5 Jan 2024 • Sunyi Zheng, Xiaonan Cui, Yuxuan Sun, Jingxiong Li, Honglin Li, Yunlong Zhang, Pingyi Chen, Xueping Jing, Zhaoxiang Ye, Lin Yang

Additionally, we assess the robustness of PathCLIP in the task of image-image retrieval, revealing that PathCLIP performs less effectively than PLIP on Osteosarcoma but performs better on WSSS4LUAD under diverse corruptions.

Benchmarking Decision Making +4

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following

no code implementations • 5 Dec 2023 • Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin

In the realm of large language models (LLMs), enhancing instruction-following capability often involves curating expansive training data.

Instruction Following

Unleashing the Power of Prompt-driven Nucleus Instance Segmentation

1 code implementation • 27 Nov 2023 • Zhongyi Shui, Yunlong Zhang, Kai Yao, Chenglu Zhu, Sunyi Zheng, Jingxiong Li, Honglin Li, Yuxuan Sun, Ruizhe Guo, Lin Yang

In this paper, we present a novel prompt-driven framework that consists of a nucleus prompter and SAM for automatic nucleus instance segmentation.

Image Segmentation Instance Segmentation +3

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

2 code implementations • 27 Nov 2023 • Xiang Yue, Yuansheng Ni, Kai Zhang, Tianyu Zheng, Ruoqi Liu, Ge Zhang, Samuel Stevens, Dongfu Jiang, Weiming Ren, Yuxuan Sun, Cong Wei, Botao Yu, Ruibin Yuan, Renliang Sun, Ming Yin, Boyuan Zheng, Zhenzhu Yang, Yibo Liu, Wenhao Huang, Huan Sun, Yu Su, Wenhu Chen

We introduce MMMU: a new benchmark designed to evaluate multimodal models on massive multi-discipline tasks demanding college-level subject knowledge and deliberate reasoning.

Complex Query Answering Logical Reasoning +1

Test-Time Training for Semantic Segmentation with Output Contrastive Loss

1 code implementation • 14 Nov 2023 • Yunlong Zhang, Yuxuan Sun, Sunyi Zheng, Zhongyi Shui, Chenglu Zhu, Lin Yang

Although deep learning-based segmentation models have achieved impressive performance on public benchmarks, generalizing well to unseen environments remains a major challenge.

Domain Adaptation Image Classification +1

Attention-Challenging Multiple Instance Learning for Whole Slide Image Classification

1 code implementation • 13 Nov 2023 • Yunlong Zhang, Honglin Li, Yuxuan Sun, Sunyi Zheng, Chenglu Zhu, Lin Yang

In the application of Multiple Instance Learning (MIL) methods for Whole Slide Image (WSI) classification, attention mechanisms often focus on a subset of discriminative instances, which are closely linked to overfitting.

Image Classification Multiple Instance Learning

Multimodal Question Answering for Unified Information Extraction

1 code implementation • 4 Oct 2023 • Yuxuan Sun, Kai Zhang, Yu Su

In addition, the effectiveness of our framework can successfully transfer to the few-shot setting, enhancing LMMs on a scale of 10B parameters to be competitive or outperform much larger language models such as ChatGPT and GPT-4.

Question Answering

A Data Source for Reasoning Embodied Agents

1 code implementation • 14 Sep 2023 • Jack Lanchantin, Sainbayar Sukhbaatar, Gabriel Synnaeve, Yuxuan Sun, Kavya Srinet, Arthur Szlam

In this work, to further pursue these advances, we introduce a new data generator for machine reasoning that integrates with an embodied agent.

Masked conditional variational autoencoders for chromosome straightening

no code implementations • 25 Jun 2023 • Jingxiong Li, Sunyi Zheng, Zhongyi Shui, Shichuan Zhang, Linyi Yang, Yuxuan Sun, Yunlong Zhang, Honglin Li, Yuanxin Ye, Peter M. A. van Ooijen, Kang Li, Lin Yang

This yields a non-trivial reconstruction task, allowing the model to effectively preserve chromosome banding patterns and structure details in the reconstructed results.

Data-Heterogeneous Hierarchical Federated Learning with Mobility

no code implementations • 19 Jun 2023 • Tan Chen, Jintao Yan, Yuxuan Sun, Sheng Zhou, Deniz Gündüz, Zhisheng Niu

Federated learning enables distributed training of machine learning (ML) models across multiple devices in a privacy-preserving manner.

Federated Learning Privacy Preserving

PathAsst: A Generative Foundation AI Assistant Towards Artificial General Intelligence of Pathology

1 code implementation • 24 May 2023 • Yuxuan Sun, Chenglu Zhu, Sunyi Zheng, Kai Zhang, Lin Sun, Zhongyi Shui, Yunlong Zhang, Honglin Li, Lin Yang

Secondly, by leveraging the collected data, we construct PathCLIP, a pathology-dedicated CLIP, to enhance PathAsst's capabilities in interpreting pathology images.

Instruction Following Language Modelling +1

Digital-SC: Digital Semantic Communication with Adaptive Network Split and Learned Non-Linear Quantization

no code implementations • 22 May 2023 • Lei Guo, Wei Chen, Yuxuan Sun, Bo Ai

Additionally, structured pruning is incorporated to reduce the dimension of the transmitted features.

Image Classification Intelligent Communication +1

Transforming Human-Centered AI Collaboration: Redefining Embodied Agents Capabilities through Interactive Grounded Language Instructions

2 code implementations • 18 May 2023 • Shrestha Mohanty, Negar Arabzadeh, Julia Kiseleva, Artem Zholus, Milagro Teruel, Ahmed Awadallah, Yuxuan Sun, Kavya Srinet, Arthur Szlam

Human intelligence's adaptability is remarkable, allowing us to adjust to new tasks and multi-modal environments swiftly.

Task-specific Fine-tuning via Variational Information Bottleneck for Weakly-supervised Pathology Whole Slide Image Classification

1 code implementation • CVPR 2023 • Honglin Li, Chenglu Zhu, Yunlong Zhang, Yuxuan Sun, Zhongyi Shui, Wenwei Kuang, Sunyi Zheng, Lin Yang

Our framework is evaluated on five pathology WSI datasets on various WSI heads.

Image Classification Multiple Instance Learning +2

MASS: Mobility-Aware Sensor Scheduling of Cooperative Perception for Connected Automated Driving

no code implementations • 25 Feb 2023 • Yukuan Jia, Ruiqing Mao, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

Specifically, we design a mobility-aware sensor scheduling (MASS) algorithm based on the restless multi-armed bandit (RMAB) theory to maximize the expected average perception gain.

Scheduling

MOB-FL: Mobility-Aware Federated Learning for Intelligent Connected Vehicles

no code implementations • 7 Dec 2022 • Bowen Xie, Yuxuan Sun, Sheng Zhou, Zhisheng Niu, Yang Xu, Jingran Chen, Deniz Gündüz

Federated learning (FL) is a promising approach to enable the future Internet of vehicles consisting of intelligent connected vehicles (ICVs) with powerful sensing, computing and communication capabilities.

Federated Learning Trajectory Prediction

Collecting Interactive Multi-modal Datasets for Grounded Language Understanding

2 code implementations • 12 Nov 2022 • Shrestha Mohanty, Negar Arabzadeh, Milagro Teruel, Yuxuan Sun, Artem Zholus, Alexey Skrynnik, Mikhail Burtsev, Kavya Srinet, Aleksandr Panov, Arthur Szlam, Marc-Alexandre Côté, Julia Kiseleva

Human intelligence can remarkably adapt quickly to new tasks and environments.

Task 2

MEET: Mobility-Enhanced Edge inTelligence for Smart and Green 6G Networks

no code implementations • 27 Oct 2022 • Yuxuan Sun, Bowen Xie, Sheng Zhou, Zhisheng Niu

Accordingly, base stations (BSs) and edge servers (ESs) need to be densely deployed, leading to huge deployment and operation costs, in particular the energy costs.

Mind the Gap: Polishing Pseudo labels for Accurate Semi-supervised Object Detection

1 code implementation • 17 Jul 2022 • Lei Zhang, Yuxuan Sun, Wei Wei

Instead of directly exploiting the pseudo labels produced by the teacher detector, we take the first attempt at reducing their deviation from ground truth using dual polishing learning, where two differently structured polishing networks are elaborately developed and trained using synthesized paired pseudo labels and the corresponding ground truth for categories and bounding boxes on the given annotated objects, respectively.

Ranked #10 on Semi-Supervised Object Detection on COCO 5% labeled data

Object Detection +2

DOLPHINS: Dataset for Collaborative Perception enabled Harmonious and Interconnected Self-driving

1 code implementation • 15 Jul 2022 • Ruiqing Mao, Jingyu Guo, Yukuan Jia, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

In this work, we release DOLPHINS: Dataset for cOllaborative Perception enabled Harmonious and INterconnected Self-driving, as a new simulated large-scale various-scenario multi-view multi-modality autonomous driving dataset, which provides a ground-breaking benchmark platform for interconnected autonomous driving.

Autonomous Driving Object Detection

Benchmarking the Robustness of Deep Neural Networks to Common Corruptions in Digital Pathology

1 code implementation • 30 Jun 2022 • Yunlong Zhang, Yuxuan Sun, Honglin Li, Sunyi Zheng, Chenglu Zhu, Lin Yang

Evaluated on two resulting benchmark datasets, we find that (1) a variety of deep neural network models suffer from a significant accuracy decrease (double the error on clean images) and the unreliable confidence estimation on corrupted images; (2) A low correlation between the validation and test errors while replacing the validation set with our benchmark can increase the correlation.

Benchmarking

IGLU 2022: Interactive Grounded Language Understanding in a Collaborative Environment at NeurIPS 2022

1 code implementation • 27 May 2022 • Julia Kiseleva, Alexey Skrynnik, Artem Zholus, Shrestha Mohanty, Negar Arabzadeh, Marc-Alexandre Côté, Mohammad Aliannejadi, Milagro Teruel, Ziming Li, Mikhail Burtsev, Maartje ter Hoeve, Zoya Volovikova, Aleksandr Panov, Yuxuan Sun, Kavya Srinet, Arthur Szlam, Ahmed Awadallah

Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions.

Natural Language Understanding Reinforcement Learning (RL)

Interactive Grounded Language Understanding in a Collaborative Environment: IGLU 2021

no code implementations • 5 May 2022 • Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Marc-Alexandre Côté, Katja Hofmann, Ahmed Awadallah, Linar Abdrazakov, Igor Churin, Putra Manggala, Kata Naszadi, Michiel van der Meer, Taewoon Kim

The primary goal of the competition is to approach the problem of how to build interactive agents that learn to solve a task while provided with grounded natural language instructions in a collaborative environment.

Many Episode Learning in a Modular Embodied Agent via End-to-End Interaction

no code implementations • 19 Apr 2022 • Yuxuan Sun, Ethan Carlson, Rebecca Qian, Kavya Srinet, Arthur Szlam

In this work we give a case study of an embodied machine-learning (ML) powered agent that improves itself via interactions with crowd-workers.

Time-Correlated Sparsification for Efficient Over-the-Air Model Aggregation in Wireless Federated Learning

no code implementations • 17 Feb 2022 • Yuxuan Sun, Sheng Zhou, Zhisheng Niu, Deniz Gündüz

In this work, we propose time-correlated sparsification with hybrid aggregation (TCS-H) for communication-efficient FEEL, which exploits jointly the power of model compression and over-the-air computation.

Federated Learning Model Compression +1

Online V2X Scheduling for Raw-Level Cooperative Perception

no code implementations • 12 Feb 2022 • Yukuan Jia, Ruiqing Mao, Yuxuan Sun, Sheng Zhou, Zhisheng Niu

Cooperative perception of connected vehicles comes to the rescue when the field of view restricts stand-alone intelligence.

Scheduling

NeurIPS 2021 Competition IGLU: Interactive Grounded Language Understanding in a Collaborative Environment

no code implementations • 13 Oct 2021 • Julia Kiseleva, Ziming Li, Mohammad Aliannejadi, Shrestha Mohanty, Maartje ter Hoeve, Mikhail Burtsev, Alexey Skrynnik, Artem Zholus, Aleksandr Panov, Kavya Srinet, Arthur Szlam, Yuxuan Sun, Katja Hofmann, Michel Galley, Ahmed Awadallah

Starting from a very young age, humans acquire new skills and learn how to solve new tasks either by imitating the behavior of others or by following provided natural language instructions.

Natural Language Understanding Reinforcement Learning (RL)

Coded Computation across Shared Heterogeneous Workers with Communication Delay

no code implementations • 23 Sep 2021 • Yuxuan Sun, Fan Zhang, Junlin Zhao, Sheng Zhou, Zhisheng Niu, Deniz Gündüz

In this work, we consider a multi-master heterogeneous-worker distributed computing scenario, where multiple matrix multiplication tasks are encoded and allocated to workers for parallel computation.

Distributed Computing

NTIRE 2021 Multi-modal Aerial View Object Classification Challenge

no code implementations • 2 Jul 2021 • Jerrick Liu, Nathan Inkawhich, Oliver Nina, Radu Timofte, Sahil Jain, Bob Lee, Yuru Duan, Wei Wei, Lei Zhang, Songzheng Xu, Yuxuan Sun, Jiaqi Tang, Mengru Ma, Gongzhe Li, Xueli Geng, Huanqia Cai, Chengxue Cai, Sol Cummings, Casian Miron, Alexandru Pasarica, Cheng-Yen Yang, Hung-Min Hsu, Jiarui Cai, Jie Mei, Chia-Ying Yeh, Jenq-Neng Hwang, Michael Xin, Zhongkai Shangguan, Zihe Zheng, Xu Yifei, Lehan Yang, Kele Xu, Min Feng

In this paper, we introduce the first Challenge on Multi-modal Aerial View Object Classification (MAVOC) in conjunction with the NTIRE 2021 workshop at CVPR.

Classification Object

Dynamic Scheduling for Over-the-Air Federated Edge Learning with Energy Constraints

no code implementations • 31 May 2021 • Yuxuan Sun, Sheng Zhou, Zhisheng Niu, Deniz Gündüz

In this work, we consider an over-the-air FEEL system with analog gradient aggregation, and propose an energy-aware dynamic device scheduling algorithm to optimize the training performance under energy constraints of devices, where both communication energy for gradient aggregation and computation energy for local training are included.

Scheduling

droidlet: modular, heterogenous, multi-modal agents

1 code implementation • 25 Jan 2021 • Anurag Pratik, Soumith Chintala, Kavya Srinet, Dhiraj Gandhi, Rebecca Qian, Yuxuan Sun, Ryan Drew, Sara Elkafrawy, Anoushka Tiwari, Tucker Hart, Mary Williamson, Abhinav Gupta, Arthur Szlam

In recent years, there have been significant advances in building end-to-end Machine Learning (ML) systems that learn at scale.

RIVA: A Pre-trained Tweet Multimodal Model Based on Text-image Relation for Multimodal NER

no code implementations • COLING 2020 • Lin Sun, Jiquan Wang, Yindu Su, Fangsheng Weng, Yuxuan Sun, Zengwei Zheng, Yuanyi Chen

In the multimodal NER task, the experimental results show the significance of text-related visual features for the visual-linguistic model and our approach achieves SOTA performance on the MNER datasets.

Named Entity Recognition +3

ROI Pooled Correlation Filters for Visual Tracking

1 code implementation • CVPR 2019 • Yuxuan Sun, Chong Sun, Dong Wang, You He, Huchuan Lu

The ROI (region-of-interest) based pooling method performs pooling operations on the cropped ROI regions for various samples and has shown great success in the object detection methods.

Object Detection +1

Energy-Aware Analog Aggregation for Federated Learning with Redundant Data

no code implementations • 1 Nov 2019 • Yuxuan Sun, Sheng Zhou, Deniz Gündüz

In this work, we consider analog aggregation to scale down the communication cost with respect to the number of workers, and introduce data redundancy to the system to deal with non-i.i.d. data.

Federated Learning Scheduling
