Search Results for author: Mengwei Xu

Found 32 papers, 16 papers with code

Small Language Models: Survey, Measurements, and Insights

1 code implementation · 24 Sep 2024 · Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu

Small language models (SLMs), despite their widespread adoption in modern smart devices, have received significantly less academic attention compared to their large language model (LLM) counterparts, which are predominantly deployed in data centers and cloud environments.

Benchmarking, Decoder (+4 more)

ELMS: Elasticized Large Language Models On Mobile Devices

no code implementations · 8 Sep 2024 · Wangsong Yin, Rongjie Yi, Daliang Xu, Gang Huang, Mengwei Xu, Xuanzhe Liu

To address this issue, we introduce ELMS, an on-device LLM service designed to provide elasticity in both the model and prompt dimensions of an LLMaaS.

Language Modelling

FedMoE: Personalized Federated Learning via Heterogeneous Mixture of Experts

no code implementations · 21 Aug 2024 · Hanzi Mei, Dongqi Cai, Ao Zhou, Shangguang Wang, Mengwei Xu

Meanwhile, FedMoE progressively adjusts the submodels toward the optimum through global expert recommendation.

Personalized Federated Learning

Empowering 1000 tokens/second on-device LLM prefilling with mllm-NPU

1 code implementation · 8 Jul 2024 · Daliang Xu, Hao Zhang, Liming Yang, Ruiqi Liu, Gang Huang, Mengwei Xu, Xuanzhe Liu

On-device large language models (LLMs) are catalyzing novel mobile applications such as UI task automation and personalized email auto-reply, without giving away users' private data.

ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents

1 code implementation · 28 Jun 2024 · Haiyang Shen, Yue Li, Desong Meng, Dongqi Cai, Sheng Qi, Li Zhang, Mengwei Xu, Yun Ma

ShortcutsBench includes a wealth of real APIs from Apple Inc.'s operating systems, refined user queries from shortcuts, human-annotated high-quality action sequences from shortcut developers, and accurate parameter filling values about primitive parameter types, enum parameter types, outputs from previous actions, and parameters that need to request necessary information from the system or user.

The CAP Principle for LLM Serving: A Survey of Long-Context Large Language Model Serving

no code implementations · 18 May 2024 · Pai Zeng, Zhenyu Ning, Jieru Zhao, Weihao Cui, Mengwei Xu, Liwei Guo, Xusheng Chen, Yizhou Shan

We survey the large language model (LLM) serving area to understand the intricate dynamics between cost-efficiency and accuracy, which is magnified by the growing need for longer contextual understanding when deploying models at a massive scale.

Language Modelling, Large Language Model (+1 more)

LlamaTouch: A Faithful and Scalable Testbed for Mobile UI Task Automation

1 code implementation · 12 Apr 2024 · Li Zhang, Shihe Wang, Xianqing Jia, Zhihan Zheng, Yunhe Yan, Longxi Gao, Yuanchun Li, Mengwei Xu

LlamaTouch comprises three key techniques: (1) On-device task execution that enables mobile agents to interact with realistic mobile environments for task execution.

FedRDMA: Communication-Efficient Cross-Silo Federated LLM via Chunked RDMA Transmission

no code implementations · 1 Mar 2024 · Zeling Zhang, Dongqi Cai, Yiran Zhang, Mengwei Xu, Shangguang Wang, Ao Zhou

Communication overhead is a significant bottleneck in federated learning (FL), which has been exacerbated by the increasing size of AI models.

Federated Learning

A First Look at GPT Apps: Landscape and Vulnerability

no code implementations · 23 Feb 2024 · Zejun Zhang, Li Zhang, Xin Yuan, Anlan Zhang, Mengwei Xu, Feng Qian

Following OpenAI's introduction of GPTs, a surge in GPT apps has led to the launch of dedicated LLM app stores.

A Survey of Resource-efficient LLM and Multimodal Foundation Models

1 code implementation · 16 Jan 2024 · Mengwei Xu, Wangsong Yin, Dongqi Cai, Rongjie Yi, Daliang Xu, QiPeng Wang, Bingyang Wu, Yihao Zhao, Chen Yang, Shihe Wang, Qiyang Zhang, Zhenyan Lu, Li Zhang, Shangguang Wang, Yuanchun Li, Yunxin Liu, Xin Jin, Xuanzhe Liu

Large foundation models, including large language models (LLMs), vision transformers (ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine learning lifecycle, from training to deployment.

Survey

Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security

2 code implementations · 10 Jan 2024 · Yuanchun Li, Hao Wen, Weijun Wang, Xiangyu Li, Yizhen Yuan, Guohong Liu, Jiacheng Liu, Wenxing Xu, Xiang Wang, Yi Sun, Rui Kong, Yile Wang, Hanfei Geng, Jian Luan, Xuefeng Jin, Zilong Ye, Guanjing Xiong, Fan Zhang, Xiang Li, Mengwei Xu, Zhijun Li, Peng Li, Yang Liu, Ya-Qin Zhang, Yunxin Liu

Next, we discuss several key challenges to achieve intelligent, efficient and secure Personal LLM Agents, followed by a comprehensive survey of representative solutions to address these challenges.

Mobile Foundation Model as Firmware

1 code implementation · 28 Aug 2023 · Jinliang Yuan, Chen Yang, Dongqi Cai, Shihe Wang, Xin Yuan, Zeling Zhang, Xiang Li, Dingge Zhang, Hanzi Mei, Xianqing Jia, Shangguang Wang, Mengwei Xu

Concurrently, each app contributes a concise, offline fine-tuned "adapter" tailored to distinct downstream tasks.

EdgeMoE: Fast On-Device Inference of MoE-based Large Language Models

no code implementations · 28 Aug 2023 · Rongjie Yi, Liwei Guo, Shiyun Wei, Ao Zhou, Shangguang Wang, Mengwei Xu

Large Language Models (LLMs) such as GPTs and LLaMa have ushered in a revolution in machine intelligence, owing to their exceptional capabilities in a wide range of machine learning tasks.

Computational Efficiency

FwdLLM: Efficient FedLLM using Forward Gradient

1 code implementation · 26 Aug 2023 · Mengwei Xu, Dongqi Cai, Yaozong Wu, Xiang Li, Shangguang Wang

Federated Learning (FL), a method to preserve user data privacy, is often employed in fine-tuning LLMs to downstream mobile tasks, an approach known as FedLLM.

Federated Learning

Uncertain Machine Ethical Decisions Using Hypothetical Retrospection

1 code implementation · 2 May 2023 · Simon Kolker, Louise Dennis, Ramon Fraga Pereira, Mengwei Xu

We propose the use of the hypothetical retrospection argumentation procedure, developed by Sven Ove Hansson, to improve existing approaches to machine ethical reasoning by accounting for probability and uncertainty from a philosophical position that resonates with humans.

Ethics, Philosophy

Federated Few-Shot Learning for Mobile NLP

1 code implementation · 12 Dec 2022 · Dongqi Cai, Shangguang Wang, Yaozong Wu, Felix Xiaozhu Lin, Mengwei Xu

Such an inadequacy of data labels is known as a few-shot scenario; it becomes the key blocker for mobile NLP applications.

Few-Shot Learning, Privacy Preserving

Towards Practical Few-shot Federated NLP

no code implementations · 1 Dec 2022 · Dongqi Cai, Yaozong Wu, Haitao Yuan, Shangguang Wang, Felix Xiaozhu Lin, Mengwei Xu

To address these challenges, we first introduce a data generator for federated few-shot learning tasks, which encompasses the quantity and skewness of scarce labeled data in a realistic setting.

Data Augmentation, Federated Learning (+1 more)

FedAdapter: Efficient Federated Learning for Modern NLP

1 code implementation · 20 May 2022 · Dongqi Cai, Yaozong Wu, Shangguang Wang, Felix Xiaozhu Lin, Mengwei Xu

A key challenge is to properly configure the depth and width of adapters, to which training speed and efficiency are highly sensitive.

Federated Learning

Boosting Mobile CNN Inference through Semantic Memory

no code implementations · 5 Dec 2021 · Yun Li, Chen Zhang, Shihao Han, Li Lyna Zhang, Baoqun Yin, Yunxin Liu, Mengwei Xu

Human brains are known to be capable of speeding up visual recognition of repeatedly presented objects through faster memory encoding and accessing procedures on activated neurons.

Hierarchical Federated Learning through LAN-WAN Orchestration

no code implementations · 22 Oct 2020 · Jinliang Yuan, Mengwei Xu, Xiao Ma, Ao Zhou, Xuanzhe Liu, Shangguang Wang

Our proposed FL can accelerate the learning process and reduce the monetary cost with frequent local aggregation in the same LAN and infrequent global aggregation on a cloud across WAN.

Federated Learning

Characterizing Impacts of Heterogeneity in Federated Learning upon Large-Scale Smartphone Data

no code implementations · 12 Jun 2020 · Chengxu Yang, Qipeng Wang, Mengwei Xu, Zhenpeng Chen, Kaigui Bian, Yunxin Liu, Xuanzhe Liu

Based on the data and the platform, we conduct extensive experiments to compare the performance of state-of-the-art FL algorithms under heterogeneity-aware and heterogeneity-unaware settings.

Fairness, Federated Learning (+1 more)

Federated Neural Architecture Search

no code implementations · 15 Feb 2020 · Jinliang Yuan, Mengwei Xu, Yuxin Zhao, Kaigui Bian, Gang Huang, Xuanzhe Liu, Shangguang Wang

To preserve user privacy while enabling mobile intelligence, techniques have been proposed to train deep neural networks on decentralized data.

Neural Architecture Search

Approximate Query Service on Autonomous IoT Cameras

no code implementations · 2 Sep 2019 · Mengwei Xu, Xiwen Zhang, Yunxin Liu, Gang Huang, Xuanzhe Liu, Felix Xiaozhu Lin

Elf is a runtime for an energy-constrained camera to continuously summarize video scenes as approximate object counts.

Databases

Video Analytics with Zero-streaming Cameras

no code implementations · 28 Apr 2019 · Mengwei Xu, Tiantu Xu, Yunxin Liu, Felix Xiaozhu Lin

For efficiency, we advocate for these cameras to be zero streaming: capturing videos to local storage and communicating with the cloud only when analytics is requested.

DeepCache: Principled Cache for Mobile Deep Vision

1 code implementation · 1 Dec 2017 · Mengwei Xu, Mengze Zhu, Yunxin Liu, Felix Xiaozhu Lin, Xuanzhe Liu

We present DeepCache, a principled cache design for deep learning inference in continuous mobile vision.

Video Compression

DeepWear: Adaptive Local Offloading for On-Wearable Deep Learning

no code implementations · 1 Dec 2017 · Mengwei Xu, Feng Qian, Mengze Zhu, Feifan Huang, Saumay Pushp, Xuanzhe Liu

Due to their on-body and ubiquitous nature, wearables can generate a wide range of unique sensor data, creating countless opportunities for deep learning tasks.

Deep Learning
