Search Results for author: Hongyi Wang

Found 40 papers, 31 papers with code

BAMBOO: a predictive and transferable machine learning force field framework for liquid electrolyte development

1 code implementation 10 Apr 2024 Sheng Gong, Yumin Zhang, Zhenliang Mu, Zhichen Pu, Hongyi Wang, Zhiao Yu, Mengyi Chen, Tianze Zheng, Zhi Wang, Lifei Chen, Xiaojie Wu, Shaochen Shi, Weihao Gao, Wen Yan, Liang Xiang

Despite the widespread applications of machine learning force field (MLFF) on solids and small molecules, there is a notable gap in applying MLFF to complex liquid electrolytes.

Knowledge Distillation

M2ORT: Many-To-One Regression Transformer for Spatial Transcriptomics Prediction from Histopathology Images

no code implementations 19 Jan 2024 Hongyi Wang, Xiuju Du, Jing Liu, Shuyi Ouyang, Yen-Wei Chen, Lanfen Lin

To address this limit, we propose M2ORT, a many-to-one regression Transformer that can accommodate the hierarchical structure of the pathology images through a decoupled multi-scale feature extractor.


FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs

no code implementations 8 Jan 2024 Shulin Zeng, Jun Liu, Guohao Dai, Xinhao Yang, Tianyu Fu, Hongyi Wang, Wenheng Ma, Hanbo Sun, Shiyao Li, Zixiao Huang, Yadong Dai, Jintao Li, Zehao Wang, Ruoyu Zhang, Kairui Wen, Xuefei Ning, Yu Wang

However, existing GPU and transformer-based accelerators cannot efficiently process compressed LLMs, due to the following unresolved challenges: low computational efficiency, underutilized memory bandwidth, and large compilation overheads.

Computational Efficiency Language Modelling +2

RedCoast: A Lightweight Tool to Automate Distributed Training of LLMs on Any GPU/TPUs

1 code implementation 25 Oct 2023 Bowen Tan, Yun Zhu, Lijuan Liu, Hongyi Wang, Yonghao Zhuang, Jindong Chen, Eric Xing, Zhiting Hu

In this work, we present RedCoast (Redco), a lightweight and user-friendly tool crafted to automate distributed training and inference for LLMs, as well as to simplify ML pipeline development.

Language Modelling Meta-Learning

Fusing Models with Complementary Expertise

1 code implementation 2 Oct 2023 Hongyi Wang, Felipe Maia Polo, Yuekai Sun, Souvik Kundu, Eric Xing, Mikhail Yurochkin

Training AI models that generalize across tasks and domains has long been among the open problems driving AI research.

Multiple-choice text-classification +2

GridFormer: Towards Accurate Table Structure Recognition via Grid Prediction

no code implementations 26 Sep 2023 Pengyuan Lyu, Weihong Ma, Hongyi Wang, Yuechen Yu, Chengquan Zhang, Kun Yao, Yang Xue, Jingdong Wang

In this representation, the vertexes and edges of the grid store the localization and adjacency information of the table.
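A table-as-grid representation of this kind can be captured by a tiny data structure: vertices carry localization (coordinates), edges carry adjacency. The class and method names below are hypothetical illustrations, not GridFormer's actual interface:

```python
from dataclasses import dataclass, field


@dataclass
class TableGrid:
    """Vertices store cell-corner localization; edges store adjacency."""
    vertices: dict = field(default_factory=dict)  # vertex id -> (x, y)
    edges: set = field(default_factory=set)       # pairs of vertex ids

    def add_vertex(self, vid, x, y):
        self.vertices[vid] = (x, y)

    def connect(self, a, b):
        # store adjacency in canonical order so (a, b) == (b, a)
        self.edges.add((min(a, b), max(a, b)))


grid = TableGrid()
grid.add_vertex(0, 0.0, 0.0)
grid.add_vertex(1, 42.5, 0.0)
grid.connect(1, 0)
```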

SlimPajama-DC: Understanding Data Combinations for LLM Training

1 code implementation 19 Sep 2023 Zhiqiang Shen, Tianhua Tao, Liqun Ma, Willie Neiswanger, Zhengzhong Liu, Hongyi Wang, Bowen Tan, Joel Hestness, Natalia Vassilieva, Daria Soboleva, Eric Xing

This paper aims to understand the impacts of various data combinations (e.g., web text, Wikipedia, GitHub, books) on the pretraining of large language models using SlimPajama.

Cuttlefish: Low-Rank Model Training without All the Tuning

1 code implementation 4 May 2023 Hongyi Wang, Saurabh Agarwal, Pongsakorn U-chupala, Yoshiki Tanaka, Eric P. Xing, Dimitris Papailiopoulos

Cuttlefish leverages the observation that after a few epochs of full-rank training, the stable rank (i.e., an approximation of the true rank) of each layer stabilizes at a constant value.
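The stable rank is cheap to monitor during training. A minimal sketch of the quantity itself (the ratio of squared Frobenius norm to squared spectral norm), not Cuttlefish's full rank-selection logic:

```python
import numpy as np


def stable_rank(w: np.ndarray) -> float:
    """Stable rank = ||W||_F^2 / ||W||_2^2, a smooth surrogate for rank."""
    fro_sq = np.linalg.norm(w, "fro") ** 2   # sum of squared singular values
    spec_sq = np.linalg.norm(w, 2) ** 2      # largest squared singular value
    return fro_sq / spec_sq


# An identity matrix attains its full dimension: stable_rank(I_n) == n.
print(stable_rank(np.eye(64)))  # -> 64.0
```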

Memory-adaptive Depth-wise Heterogenous Federated Learning

1 code implementation 8 Mar 2023 Kai Zhang, Yutong Dai, Hongyi Wang, Eric Xing, Xun Chen, Lichao Sun

Federated learning is a promising paradigm that allows multiple clients to collaboratively train a model without sharing the local data.

Federated Learning

Federated Learning as Variational Inference: A Scalable Expectation Propagation Approach

1 code implementation 8 Feb 2023 Han Guo, Philip Greengard, Hongyi Wang, Andrew Gelman, Yoon Kim, Eric P. Xing

A recent alternative formulation instead treats federated learning as a distributed inference problem, where the goal is to infer a global posterior from partitioned client data (Al-Shedivat et al., 2021).

Distributed Optimization Federated Learning +1

Does compressing activations help model parallel training?

no code implementations 6 Jan 2023 Song Bian, Dacheng Li, Hongyi Wang, Eric P. Xing, Shivaram Venkataraman

Finally, we provide insights for future development of model parallelism compression algorithms.


MPCFormer: fast, performant and private Transformer inference with MPC

1 code implementation 2 Nov 2022 Dacheng Li, Rulin Shao, Hongyi Wang, Han Guo, Eric P. Xing, Hao Zhang

Through extensive evaluations, we show that MPCFORMER significantly speeds up Transformer inference in MPC settings while achieving similar ML performance to the input model.

Knowledge Distillation

Super-Resolution Based Patch-Free 3D Image Segmentation with High-Frequency Guidance

1 code implementation 26 Oct 2022 Hongyi Wang, Lanfen Lin, Hongjie Hu, Qingqing Chen, Yinhao Li, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

The framework contains two sub-tasks, of which semantic segmentation is the main task and super-resolution is an auxiliary task aiding in rebuilding the high-frequency information from the LR input.

Computed Tomography (CT) Image Segmentation +4

AMP: Automatically Finding Model Parallel Strategies with Heterogeneity Awareness

1 code implementation 13 Oct 2022 Dacheng Li, Hongyi Wang, Eric Xing, Hao Zhang

Scaling up model sizes can lead to fundamentally new capabilities in many machine learning (ML) tasks.


CubeMLP: An MLP-based Model for Multimodal Sentiment Analysis and Depression Estimation

1 code implementation 28 Jul 2022 Hao Sun, Hongyi Wang, Jiaqing Liu, Yen-Wei Chen, Lanfen Lin

Multimodal sentiment analysis and depression estimation are two important research topics that aim to predict human mental states using multimodal data.

Multimodal Sentiment Analysis

Efficient Federated Learning on Knowledge Graphs via Privacy-preserving Relation Embedding Aggregation

1 code implementation 17 Mar 2022 Kai Zhang, Yu Wang, Hongyi Wang, Lifu Huang, Carl Yang, Xun Chen, Lichao Sun

Furthermore, we propose a Federated learning paradigm with privacy-preserving Relation embedding aggregation (FedR) to tackle the privacy issue in FedE.

Entity Embeddings Federated Learning +4

Rare Gems: Finding Lottery Tickets at Initialization

1 code implementation 24 Feb 2022 Kartik Sreenivasan, Jy-yong Sohn, Liu Yang, Matthew Grinde, Alliot Nagle, Hongyi Wang, Eric Xing, Kangwook Lee, Dimitris Papailiopoulos

Frankle & Carbin conjecture that we can avoid this by training "lottery tickets", i.e., special sparse subnetworks found at initialization, that can be trained to high accuracy.
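For intuition, a generic magnitude-pruning mask illustrates what a sparse subnetwork at initialization looks like. This is a simplified stand-in, not the paper's actual ticket-finding procedure:

```python
import numpy as np


def magnitude_mask(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Binary mask keeping the (1 - sparsity) fraction of
    largest-magnitude weights."""
    k = max(1, int(round(weights.size * (1.0 - sparsity))))
    flat = np.abs(weights).ravel()
    thresh = np.partition(flat, -k)[-k]          # k-th largest magnitude
    return (np.abs(weights) >= thresh).astype(weights.dtype)


rng = np.random.default_rng(0)
w_init = rng.standard_normal((100, 100))
mask = magnitude_mask(w_init, sparsity=0.9)
# The sparse subnetwork is w_init * mask, trained from its original init.
```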

Hformer: Hybrid CNN-Transformer for Fringe Order Prediction in Phase Unwrapping of Fringe Projection

no code implementations 13 Dec 2021 Xinjun Zhu, Zhiqiang Han, Mengkai Yuan, Qinghua Guo, Hongyi Wang

Our work offers an alternative to deep-learning-based phase unwrapping methods for fringe projection 3D measurement, which are currently dominated by CNNs.


Mixed Transformer U-Net For Medical Image Segmentation

1 code implementation 8 Nov 2021 Hongyi Wang, Shiao Xie, Lanfen Lin, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong

Therefore, Vision Transformers have recently emerged as alternative segmentation architectures, owing to their innate ability to capture long-range correlations through Self-Attention (SA).

Image Segmentation Medical Image Segmentation +2
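The long-range mixing that Self-Attention provides can be sketched as plain scaled dot-product attention over a token sequence (single head; the projection matrices below are random placeholders, not learned parameters):

```python
import numpy as np


def self_attention(x, wq, wk, wv):
    """Scaled dot-product self-attention over a sequence of token features."""
    q, k, v = x @ wq, x @ wk, x @ wv
    scores = q @ k.T / np.sqrt(k.shape[-1])        # pairwise affinities
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ v                              # every token sees all others


rng = np.random.default_rng(0)
tokens = rng.standard_normal((16, 32))             # 16 positions, 32 dims
wq = wk = wv = rng.standard_normal((32, 32)) * 0.1
mixed = self_attention(tokens, wq, wk, wv)
```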

Pufferfish: Communication-efficient Models At No Extra Cost

1 code implementation 5 Mar 2021 Hongyi Wang, Saurabh Agarwal, Dimitris Papailiopoulos

In this work, we present Pufferfish, a communication and computation efficient distributed training framework that incorporates the gradient compression into the model training process via training low-rank, pre-factorized deep networks.
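A pre-factorized low-rank layer can be obtained by truncated SVD of a dense weight matrix. This is a sketch of the idea, not Pufferfish's exact training procedure:

```python
import numpy as np


def low_rank_factors(w: np.ndarray, rank: int):
    """Split a dense weight into two pre-factorized low-rank matrices
    whose product is the best rank-`rank` approximation of `w`."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    return u[:, :rank] * s[:rank], vt[:rank, :]


w = np.random.default_rng(0).standard_normal((512, 256))
u_r, v_r = low_rank_factors(w, rank=32)
# Training (and communicating) the two factors touches far fewer
# parameters than the original dense matrix.
print(w.size, u_r.size + v_r.size)  # 131072 24576
```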


On the Utility of Gradient Compression in Distributed Training Systems

1 code implementation 28 Feb 2021 Saurabh Agarwal, Hongyi Wang, Shivaram Venkataraman, Dimitris Papailiopoulos

A rich body of prior work has highlighted the existence of communication bottlenecks in synchronous data-parallel training.

Model Compression

BPF for storage: an exokernel-inspired approach

1 code implementation 25 Feb 2021 Yu Jian Wu, Hongyi Wang, Yuhong Zhong, Asaf Cidon, Ryan Stutsman, Amy Tai, Junfeng Yang

The overhead of the kernel storage path accounts for half of the access latency for new NVMe storage devices.

Operating Systems Databases

Accordion: Adaptive Gradient Communication via Critical Learning Regime Identification

3 code implementations 29 Oct 2020 Saurabh Agarwal, Hongyi Wang, Kangwook Lee, Shivaram Venkataraman, Dimitris Papailiopoulos

The techniques usually require choosing a static compression ratio, often requiring users to balance the trade-off between model accuracy and per-iteration speedup.
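A static top-k sparsifier and a toy adaptive-ratio rule illustrate the trade-off. The thresholds in `pick_ratio` are made up for illustration and are not Accordion's actual critical-regime detector:

```python
import numpy as np


def topk_sparsify(grad: np.ndarray, ratio: float) -> np.ndarray:
    """Keep only the largest-magnitude fraction of gradient entries."""
    k = max(1, int(grad.size * ratio))
    keep = np.argpartition(np.abs(grad), -k)[-k:]
    out = np.zeros_like(grad)
    out[keep] = grad[keep]
    return out


def pick_ratio(prev_norm: float, curr_norm: float, low=0.01, high=0.25):
    """Toy adaptive rule: compress gently while gradients change fast
    (a 'critical regime'), aggressively once they settle."""
    changing_fast = abs(curr_norm - prev_norm) / max(prev_norm, 1e-12) > 0.2
    return high if changing_fast else low
```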


Few shot domain adaptation for in situ macromolecule structural classification in cryo-electron tomograms

no code implementations 30 Jul 2020 Liangyong Yu, Ran Li, Xiangrui Zeng, Hongyi Wang, Jie Jin, Ge Yang, Rui Jiang, Min Xu

Motivation: Cryo-Electron Tomography (cryo-ET) visualizes structure and spatial organization of macromolecules and their interactions with other subcellular components inside single cells in the close-to-native state at sub-molecular resolution.

Classification Domain Adaptation +2

Federated Learning with Matched Averaging

1 code implementation ICLR 2020 Hongyi Wang, Mikhail Yurochkin, Yuekai Sun, Dimitris Papailiopoulos, Yasaman Khazaeni

Federated learning allows edge devices to collaboratively learn a shared model while keeping the training data on device, decoupling the ability to do model training from the need to store the data in the cloud.

Federated Learning
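FedMA itself performs layer-wise matched averaging (aligning permuted neurons across clients before averaging). As a point of reference, the vanilla FedAvg baseline it improves on is just a data-size-weighted average of client models:

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Vanilla FedAvg: data-size-weighted, coordinate-wise average
    of client model parameters."""
    total = float(sum(client_sizes))
    stacked = np.stack(client_weights)              # (clients, ...)
    coeffs = np.array(client_sizes, dtype=float) / total
    return np.tensordot(coeffs, stacked, axes=1)


# Two clients with unequal data shares (3:1).
global_w = fedavg([np.ones(3), np.zeros(3)], client_sizes=[3, 1])
print(global_w)  # [0.75 0.75 0.75]
```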

DETOX: A Redundancy-based Framework for Faster and More Robust Gradient Aggregation

1 code implementation NeurIPS 2019 Shashank Rajput, Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

In this work, we present DETOX, a Byzantine-resilient distributed training framework that combines algorithmic redundancy with robust aggregation.
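A simplified sketch of hierarchical robust aggregation. DETOX's first stage is actually a majority vote over redundantly computed gradients; a coordinate-wise median stands in for it here:

```python
import numpy as np


def hierarchical_robust_aggregate(grads, group_size=3):
    """Two-stage aggregation: filter within each redundant group,
    then take a robust coordinate-wise median across group outputs."""
    groups = [np.stack(grads[i:i + group_size])
              for i in range(0, len(grads), group_size)]
    filtered = [np.median(g, axis=0) for g in groups]  # stand-in for vote
    return np.median(np.stack(filtered), axis=0)


honest = [np.ones(4) for _ in range(8)]
byzantine = [np.full(4, 100.0)]           # one malicious update
result = hierarchical_robust_aggregate(honest + byzantine)
print(result)  # [1. 1. 1. 1.] -- the outlier is filtered out
```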

ErasureHead: Distributed Gradient Descent without Delays Using Approximate Gradient Coding

1 code implementation 28 Jan 2019 Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

We present ErasureHead, a new approach for distributed gradient descent (GD) that mitigates system delays by employing approximate gradient coding.
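Approximate gradient coding can be sketched as a cyclic repetition assignment plus a rescaled sum over the workers that finish. The decoder and analysis are simplified relative to the paper:

```python
def cyclic_assignment(n_workers: int, replication: int):
    """Cyclic repetition code: data partition i is processed by
    `replication` consecutive workers, so any single straggler's
    partitions remain covered by its neighbors."""
    return [sorted((i + j) % n_workers for j in range(replication))
            for i in range(n_workers)]


def approx_recover(worker_sums, finished, replication):
    """Sum the gradients returned by non-straggling workers and rescale.
    With all workers finished this is exact; with a few stragglers it
    approximates the full gradient without waiting on them."""
    return sum(worker_sums[w] for w in finished) / replication
```

For example, with 4 workers and replication 2 over partition gradients [1, 2, 3, 4], the per-worker sums are [5, 3, 5, 7], and recovering from all workers yields (5+3+5+7)/2 = 10, the exact total.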

The Effect of Network Width on the Performance of Large-batch Training

no code implementations NeurIPS 2018 Lingjiao Chen, Hongyi Wang, Jinman Zhao, Dimitris Papailiopoulos, Paraschos Koutris

Distributed implementations of mini-batch stochastic gradient descent (SGD) suffer from communication overheads, attributed to the high frequency of gradient updates inherent in small-batch training.
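The communication-cost argument is simple arithmetic: synchronous data-parallel SGD exchanges gradients once per mini-batch, so rounds per epoch scale inversely with the global batch size (the dataset size below is a hypothetical example):

```python
def sync_rounds_per_epoch(dataset_size: int, batch_size: int) -> int:
    """Synchronous data-parallel SGD synchronizes once per mini-batch,
    so communication rounds per epoch shrink as the batch grows."""
    return -(-dataset_size // batch_size)  # ceiling division


# A hypothetical 1.28M-example dataset:
print(sync_rounds_per_epoch(1_280_000, 256))   # 5000
print(sync_rounds_per_epoch(1_280_000, 8192))  # 157
```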

DRACO: Byzantine-resilient Distributed Training via Redundant Gradients

1 code implementation ICML 2018 Lingjiao Chen, Hongyi Wang, Zachary Charles, Dimitris Papailiopoulos

Distributed model training is vulnerable to byzantine system failures and adversarial compute nodes, i.e., nodes that use malicious updates to corrupt the global model stored at a parameter server (PS).
