Search Results for author: Wei Lin

Found 53 papers, 18 papers with code

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

1 code implementation19 Sep 2023 Haojun Xia, Zhen Zheng, Yuchao Li, Donglin Zhuang, Zhongzhu Zhou, Xiafei Qiu, Yong Li, Wei Lin, Shuaiwen Leon Song

Therefore, we propose Flash-LLM to enable low-cost and highly efficient large generative model inference with sophisticated support for unstructured sparsity on high-performance but highly restrictive Tensor Cores.
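As a toy NumPy illustration of the general "store sparse, compute dense" idea behind running unstructured sparsity on dense hardware (variable names are ours; this is not the Flash-LLM kernel):

```python
import numpy as np

rng = np.random.default_rng(0)

# Unstructured sparsity: ~80% of weights are zero at arbitrary positions.
W = rng.standard_normal((64, 64))
mask = rng.random(W.shape) < 0.2          # keep ~20% of entries
W_sparse = W * mask

x = rng.standard_normal((64, 8))

# "Store sparse, compute dense": keep only the nonzeros in memory, but
# expand them back into a dense tile right before the matmul, so dense
# hardware (e.g. Tensor Cores) can execute it.
rows, cols = np.nonzero(W_sparse)
vals = W_sparse[rows, cols]

W_tile = np.zeros_like(W)
W_tile[rows, cols] = vals
y = W_tile @ x

assert np.allclose(y, W_sparse @ x)       # same result as the sparse matmul
```

Memory traffic shrinks with the nonzero count, while the arithmetic stays in the dense format the hardware is built for.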

CARE: Large Precision Matrix Estimation for Compositional Data

no code implementations13 Sep 2023 Shucong Zhang, Huiyuan Wang, Wei Lin

High-dimensional compositional data are prevalent in many applications.

TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification

no code implementations13 Sep 2023 M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Horst Possegger, Rogerio Feris, Horst Bischof

Vision and Language Models (VLMs), such as CLIP, have enabled visual recognition of a potentially unlimited set of categories described by text prompts.

Zero-Shot Learning

Accurate Prediction of Antibody Function and Structure Using Bio-Inspired Antibody Language Model

1 code implementation31 Aug 2023 Hongtai Jing, Zhengtao Gao, Sheng Xu, Tao Shen, Zhangzhi Peng, Shwai He, Tao You, Shuang Ye, Wei Lin, Siqi Sun

Remarkably, BALMFold outperforms well-established methods such as AlphaFold2, IgFold, ESMFold, and OmegaFold on the antibody benchmark, demonstrating significant potential to advance innovative engineering and streamline therapeutic antibody development by reducing the need for unnecessary trials.

Language Modelling

MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models

1 code implementation23 Jun 2023 Chaoyou Fu, Peixian Chen, Yunhang Shen, Yulei Qin, Mengdan Zhang, Xu Lin, Zhenyu Qiu, Wei Lin, Jinrui Yang, Xiawu Zheng, Ke Li, Xing Sun, Rongrong Ji

A Multimodal Large Language Model (MLLM) relies on a powerful LLM to perform multimodal tasks, showing amazing emergent abilities in recent studies, such as writing poems based on an image.

Benchmarking Language Modelling +3

Modeling Dual Period-Varying Preferences for Takeaway Recommendation

no code implementations7 Jun 2023 Yuting Zhang, Yiqing Wu, Ran Le, Yongchun Zhu, Fuzhen Zhuang, Ruidong Han, Xiang Li, Wei Lin, Zhulin An, Yongjun Xu

Unlike traditional recommendation, takeaway recommendation faces two main challenges: (1) Dual Interaction-Aware Preference Modeling.

Recommendation Systems

Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions

1 code implementation30 May 2023 Stefan Leitner, M. Jehanzeb Mirza, Wei Lin, Jakub Micorek, Marc Masana, Mateusz Kozinski, Horst Possegger, Horst Bischof

We propose to store these affine parameters as a memory bank for each weather condition and to plug in the weather-specific parameters during driving (i.e., at test time) when the respective weather conditions are encountered.

Autonomous Driving Incremental Learning +2
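The memory-bank idea above can be sketched minimally: a lookup from condition to affine (scale, shift) parameters that stand in for per-condition normalization-layer affines (the names, values, and `adapt` helper here are ours, not the paper's API):

```python
import numpy as np

# Memory bank: weather condition -> affine (scale, shift) parameters,
# a toy stand-in for per-condition normalization-layer affines.
bank = {
    "clear": (np.ones(4), np.zeros(4)),
    "fog":   (np.full(4, 1.5), np.full(4, 0.1)),
    "snow":  (np.full(4, 0.8), np.full(4, -0.2)),
}

def adapt(features, condition):
    """At test time, plug in the parameters stored for the current condition."""
    gamma, beta = bank[condition]
    return gamma * features + beta

f = np.array([1.0, 2.0, 3.0, 4.0])
assert np.allclose(adapt(f, "clear"), f)            # identity affine
assert np.allclose(adapt(f, "fog"), 1.5 * f + 0.1)  # fog-specific affine
```

Because only the small affine parameters are swapped, the backbone weights stay fixed across conditions.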

LaFTer: Label-Free Tuning of Zero-shot Classifier using Language and Unlabeled Image Collections

no code implementations29 May 2023 M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Mateusz Kozinski, Horst Possegger, Rogerio Feris, Horst Bischof

Recently, large-scale pre-trained Vision and Language (VL) models have set a new state of the art (SOTA) in zero-shot visual classification, enabling open-vocabulary recognition of a potentially unlimited set of categories defined as simple language prompts.

Language Modelling Large Language Model

HAHE: Hierarchical Attention for Hyper-Relational Knowledge Graphs in Global and Local Level

1 code implementation11 May 2023 Haoran Luo, Haihong E, Yuhao Yang, Yikai Guo, Mingzhi Sun, Tianyu Yao, Zichen Tang, Kaiyang Wan, Meina Song, Wei Lin

The global-level attention can model the graphical structure of HKG using hypergraph dual-attention layers, while the local-level attention can learn the sequential structure inside H-Facts via heterogeneous self-attention layers.

Knowledge Graphs Link Prediction
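For orientation, the shared building block of both attention levels is ordinary scaled dot-product attention; a generic NumPy sketch (not HAHE's heterogeneous or hypergraph dual-attention variants):

```python
import numpy as np

def attention(Q, K, V):
    """Plain scaled dot-product self-attention: the generic building block,
    not HAHE's heterogeneous or hypergraph dual-attention layers."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))   # stable softmax
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 4))   # 5 sequence elements, dimension 4
out = attention(X, X, X)
assert out.shape == X.shape
```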

Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

1 code implementation10 May 2023 Di Jin, Luzhi Wang, Yizhen Zheng, Guojie Song, Fei Jiang, Xiang Li, Wei Lin, Shirui Pan

We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item.

Decision Making Session-Based Recommendations +1

Neural Delay Differential Equations: System Reconstruction and Image Classification

no code implementations11 Apr 2023 Qunxi Zhu, Yao Guo, Wei Lin

Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied, showing exceptional efficacy in coping with representative datasets.

Classification Image Classification

Embedding Theory of Reservoir Computing and Reducing Reservoir Network Using Time Delays

no code implementations16 Mar 2023 Xing-Yue Duan, Xiong Ying, Si-Yang Leng, Jürgen Kurths, Wei Lin, Huan-Fei Ma

Reservoir computing (RC), a particular form of recurrent neural network, is under explosive development due to its exceptional efficacy and high performance in the reconstruction and/or prediction of complex physical systems.
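As a generic illustration of RC (a minimal echo state network, not the paper's time-delay reduced-reservoir construction): a fixed random recurrent network is driven by the input, and only a linear readout is trained.

```python
import numpy as np

rng = np.random.default_rng(1)
n_res = 50

# Random fixed reservoir; only the linear readout is trained.
W = rng.standard_normal((n_res, n_res))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()   # spectral radius < 1
W_in = rng.standard_normal(n_res)

def run(inputs):
    r = np.zeros(n_res)
    states = []
    for u in inputs:
        r = np.tanh(W @ r + W_in * u)           # reservoir state update
        states.append(r.copy())
    return np.array(states)

# Train the readout for one-step-ahead prediction of a sine wave.
t = np.linspace(0, 8 * np.pi, 400)
u = np.sin(t)
S = run(u[:-1])
w_out, *_ = np.linalg.lstsq(S, u[1:], rcond=None)
pred = S @ w_out
print(f"train MSE: {np.mean((pred - u[1:]) ** 2):.2e}")
```

Keeping the spectral radius below 1 is a common (sufficient, not necessary) way to encourage the echo state property.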

Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform

no code implementations16 Feb 2023 Shiwei Zhang, Lansong Diao, Siyu Wang, Zongyan Cao, Yiliang Gu, Chang Si, Ziji Shi, Zhen Zheng, Chuan Wu, Wei Lin

We present Rhino, a system for accelerating tensor programs with automatic parallelization on an AI platform for real production environments.

Expediting Distributed DNN Training with Device Topology-Aware Graph Deployment

no code implementations13 Feb 2023 Shiwei Zhang, Xiaodong Yi, Lansong Diao, Chuan Wu, Siyu Wang, Wei Lin

This paper presents TAG, an automatic system that derives an optimized DNN training graph and its deployment onto any device topology, for expedited training in device- and topology-heterogeneous ML clusters.

Combinatorial Optimization TAG

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation

no code implementations1 Feb 2023 Ziji Shi, Le Jiang, Ang Wang, Jie Zhang, Xianyan Jia, Yong Li, Chencan Wu, Jialin Li, Wei Lin

However, finding a suitable model parallel schedule for an arbitrary neural network is a non-trivial task due to the exploding search space.

Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting

1 code implementation CVPR 2023 Wei Lin, Antoni B. Chan

In this paper, we propose the optimal transport minimization (OT-M) algorithm for crowd localization with density maps.

Crowd Counting

SKDBERT: Compressing BERT via Stochastic Knowledge Distillation

no code implementations26 Nov 2022 Zixiang Ding, Guoqing Jiang, Shuai Zhang, Lin Guo, Wei Lin

In this paper, we propose Stochastic Knowledge Distillation (SKD) to obtain compact BERT-style language model dubbed SKDBERT.

Knowledge Distillation Language Modelling
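The core stochastic idea can be sketched in a few lines: sample one teacher from a pool at each distillation step rather than distilling from a single fixed teacher (the teacher pool and `distill_step` here are toy stand-ins, not SKDBERT itself):

```python
import random

random.seed(0)

# Toy teacher pool: scaling functions that return "logits" the student
# would mimic; in SKD the pool holds teacher models of varying capacity.
teachers = [lambda x, s=s: [v * s for v in x] for s in (0.5, 1.0, 2.0)]

def distill_step(x):
    teacher = random.choice(teachers)   # stochastic teacher sampling
    return teacher(x)                   # target logits for this step

out = distill_step([1.0, 2.0])
assert out in ([0.5, 1.0], [1.0, 2.0], [2.0, 4.0])
```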

Video Test-Time Adaptation for Action Recognition

1 code implementation CVPR 2023 Wei Lin, Muhammad Jehanzeb Mirza, Mateusz Kozinski, Horst Possegger, Hilde Kuehne, Horst Bischof

Our proposed method demonstrates a substantial performance gain over existing test-time adaptation approaches in both evaluations of a single distribution shift and the challenging case of random distribution shifts.

Action Recognition Temporal Action Localization

ActMAD: Activation Matching to Align Distributions for Test-Time-Training

1 code implementation CVPR 2023 Muhammad Jehanzeb Mirza, Pol Jané Soneira, Wei Lin, Mateusz Kozinski, Horst Possegger, Horst Bischof

Test-Time-Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test-time.

Image Classification
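One generic way to operationalize this kind of adaptation (our toy sketch of statistics matching, not the ActMAD objective or API): penalize the distance between test-time activation statistics and statistics saved from training, and update the model to reduce it.

```python
import numpy as np

# Saved training-time activation statistics (names and values are ours).
train_mu, train_var = 0.0, 1.0

def matching_loss(acts):
    """Distance between test-time activation statistics and saved ones."""
    return abs(acts.mean() - train_mu) + abs(acts.var() - train_var)

rng = np.random.default_rng(0)
shifted = 2.0 + rng.standard_normal(1000)   # OOD activations, mean ~2
aligned = shifted - shifted.mean()          # one crude re-centering "update"
assert matching_loss(aligned) < matching_loss(shifted)
```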

MATE: Masked Autoencoders are Online 3D Test-Time Learners

1 code implementation21 Nov 2022 M. Jehanzeb Mirza, Inkyu Shin, Wei Lin, Andreas Schriebl, Kunyang Sun, Jaesung Choe, Horst Possegger, Mateusz Kozinski, In So Kweon, Kun-Jin Yoon, Horst Bischof

Our MATE is the first Test-Time-Training (TTT) method designed for 3D data, which makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data.

3D Object Classification Point Cloud Classification

Multi-Frequency-Aware Patch Adversarial Learning for Neural Point Cloud Rendering

no code implementations7 Oct 2022 Jay Karhade, Haiyue Zhu, Ka-Shing Chung, Rajesh Tripathy, Wei Lin, Marcelo H. Ang Jr

The proposed approach aims to improve rendering realism by minimizing the spectral discrepancy between real and synthesized images, especially in the high-frequency localized sharpness information that causes visible image blur.

Heterogeneous Federated Learning on a Graph

no code implementations19 Sep 2022 Huiyuan Wang, Xuyang Zhao, Wei Lin

In this work, we consider parameter estimation in federated learning with data distribution and communication heterogeneity, as well as limited computational capacity of local devices.

Federated Learning

Neural Stochastic Control

1 code implementation15 Sep 2022 Jingdong Zhang, Qunxi Zhu, Wei Lin

The two stochastic controllers are thus complementary in applications.

RAW-GNN: RAndom Walk Aggregation based Graph Neural Network

no code implementations28 Jun 2022 Di Jin, Rui Wang, Meng Ge, Dongxiao He, Xiang Li, Wei Lin, Weixiong Zhang

Due to the homophily assumption of the Graph Convolutional Networks (GCNs) that these methods build on, they are not suitable for heterophily graphs, where nodes with different labels or dissimilar attributes tend to be adjacent.

Representation Learning

CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph Similarity Learning

1 code implementation30 May 2022 Di Jin, Luzhi Wang, Yizhen Zheng, Xiang Li, Fei Jiang, Wei Lin, Shirui Pan

As most of the existing graph neural networks yield effective graph representations of a single graph, little effort has been made for jointly learning two graph representations and calculating their similarity score.

Collaborative Filtering Graph Classification +4

Cross-View Cross-Scene Multi-View Crowd Counting

no code implementations CVPR 2021 Qi Zhang, Wei Lin, Antoni B. Chan

Multi-view crowd counting has been previously proposed to utilize multiple cameras to extend the field of view of a single camera, capturing more people in the scene and improving counting performance for occluded people or those in low resolution.

Camera Calibration Crowd Counting

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

1 code implementation11 Apr 2022 Yuanxing Zhang, Langshi Chen, Siran Yang, Man Yuan, Huimin Yi, Jie Zhang, Jiamang Wang, Jianbo Dong, Yunlong Xu, Yue Song, Yong Li, Di Zhang, Wei Lin, Lin Qu, Bo Zheng

However, we observe that GPU devices in training recommender systems are underutilized and cannot attain the expected throughput improvement that has been achieved in the CV and NLP areas.

Marketing Recommendation Systems

CycDA: Unsupervised Cycle Domain Adaptation from Image to Video

1 code implementation30 Mar 2022 Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof

To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap.

Action Recognition Domain Adaptation +1

AC-Feasible Power Transfer Regions of Virtual Power Plants: Characterization and Application

no code implementations9 Feb 2022 Wei Lin, Changhong Zhao

Distributed energy resources (DERs) in distribution networks can be aggregated as a virtual power plant (VPP) for transmission-level operations.

Neural Piecewise-Constant Delay Differential Equations

no code implementations4 Jan 2022 Qunxi Zhu, Yifei Shen, Dongsheng Li, Wei Lin

Continuous-depth neural networks, such as Neural Ordinary Differential Equations (ODEs), have aroused a great deal of interest from the machine learning and data science communities in recent years, bridging deep neural networks and dynamical systems.

Tie-line Security Regions in High Dimension for Renewable Accommodations

no code implementations4 Jan 2022 Wei Lin, Hua Jiang, Zhifang Yang

However, a tie-line security region is a high-dimensional polytope due to the multiple time periods and border buses inherent in power system operations, leading to a considerable computational burden.

Vocal Bursts Intensity Prediction

Cost Functions over Feasible Power Transfer Regions of Virtual Power Plants

no code implementations2 Dec 2021 Wei Lin, Changhong Zhao

To address this challenge, a characterization method is presented in this paper for the intraday operation of a VPP based on the concepts of nonanticipativity and robustness to DERs' uncertainties.

M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining

no code implementations8 Oct 2021 Junyang Lin, An Yang, Jinze Bai, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Yong Li, Wei Lin, Jingren Zhou, Hongxia Yang

Recent expeditious developments in deep learning algorithms, distributed training, and even hardware design for large models have enabled training extreme-scale models, such as GPT-3 and Switch Transformer, possessing hundreds of billions or even trillions of parameters.

Binary Code based Hash Embedding for Web-scale Applications

no code implementations24 Aug 2021 Bencheng Yan, Pengjie Wang, Jinquan Liu, Wei Lin, Kuang-Chih Lee, Jian Xu, Bo Zheng

In these applications, embedding learning of categorical features is crucial to the success of deep learning models.

Recommendation Systems
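A generic sketch of composing embeddings from binary codes (our construction for illustration; table shapes and the `embed` helper are assumptions, not the paper's design): one tiny table per bit position replaces one row per raw ID.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, n_bits = 8, 16

# One tiny table per bit position (2 rows each) instead of one row per raw
# ID: an ID's embedding is composed from the embeddings of its code bits,
# so 2 * n_bits * dim parameters cover 2**n_bits distinct IDs.
tables = rng.standard_normal((n_bits, 2, dim))

def embed(item_id):
    bits = [(item_id >> b) & 1 for b in range(n_bits)]
    return sum(tables[b, bit] for b, bit in enumerate(bits))

e1, e2 = embed(12345), embed(12346)
assert e1.shape == (dim,)
assert not np.allclose(e1, e2)   # distinct IDs get distinct embeddings
```

The parameter count grows with the code length rather than the vocabulary size, which is what makes such schemes attractive at web scale.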

Boosting the Convergence of Reinforcement Learning-based Auto-pruning Using Historical Data

no code implementations16 Jul 2021 Jiandong Mu, Mengdi Wang, Feiwen Zhu, Jun Yang, Wei Lin, Wei zhang

Reinforcement learning (RL)-based auto-pruning has been further proposed to automate the DNN pruning process to avoid expensive hand-crafted work.

Neural Network Compression reinforcement-learning +2

Nonasymptotic theory for two-layer neural networks: Beyond the bias-variance trade-off

no code implementations9 Jun 2021 Huiyuan Wang, Wei Lin

Large neural networks have proved remarkably effective in modern deep learning practice, even in the overparametrized regime where the number of active parameters is large relative to the sample size.

Vocal Bursts Valence Prediction

M6-T: Exploring Sparse Expert Models and Beyond

no code implementations31 May 2021 An Yang, Junyang Lin, Rui Men, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Jiamang Wang, Yong Li, Di Zhang, Wei Lin, Lin Qu, Jingren Zhou, Hongxia Yang

Mixture-of-Experts (MoE) models can achieve promising results with an outrageously large number of parameters but constant computation cost, and have thus become a trend in model scaling.

Playing the Game of 2048

Joule-Thomson expansion of the torus-like black hole

no code implementations4 Mar 2021 Jing Liang, Wei Lin, Benrong Mu

Furthermore, we investigate similarities and differences between the Van der Waals fluid, the torus-like black hole and the charged AdS black holes for the expansion.

General Relativity and Quantum Cosmology

M6: A Chinese Multimodal Pretrainer

no code implementations1 Mar 2021 Junyang Lin, Rui Men, An Yang, Chang Zhou, Ming Ding, Yichang Zhang, Peng Wang, Ang Wang, Le Jiang, Xianyan Jia, Jie Zhang, Jianwei Zhang, Xu Zou, Zhikang Li, Xiaodong Deng, Jie Liu, Jinbao Xue, Huiling Zhou, Jianxin Ma, Jin Yu, Yong Li, Wei Lin, Jingren Zhou, Jie Tang, Hongxia Yang

In this work, we construct the largest dataset for multimodal pretraining in Chinese, which consists of over 1.9TB of images and 292GB of texts covering a wide range of domains.

Image Generation

Neural Delay Differential Equations

no code implementations ICLR 2021 Qunxi Zhu, Yao Guo, Wei Lin

Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied, showing exceptional efficacy in coping with some representative datasets.
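For comparison, a sketch of the two formulations in standard notation (single constant delay $\tau$, learned parameters $\theta$, history function $\phi$; this is the textbook DDE form, stated in our notation rather than quoted from the paper):

```latex
\text{NODE:}\quad \dot{z}(t) = f_\theta\bigl(z(t),\, t\bigr), \qquad
\text{NDDE:}\quad \dot{z}(t) = f_\theta\bigl(z(t),\, z(t-\tau),\, t\bigr),
\quad z(t) = \phi(t)\ \ \text{for } t \in [-\tau, 0].
```

The delayed state $z(t-\tau)$ gives the flow access to its own history, which a plain NODE lacks.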

EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform for NLP Applications

2 code implementations18 Nov 2020 Minghui Qiu, Peng Li, Chengyu Wang, Hanjie Pan, Ang Wang, Cen Chen, Xianyan Jia, Yaliang Li, Jun Huang, Deng Cai, Wei Lin

The literature has witnessed the success of leveraging Pre-trained Language Models (PLMs) and Transfer Learning (TL) algorithms to a wide range of Natural Language Processing (NLP) applications, yet it is not easy to build an easy-to-use and scalable TL toolkit for this purpose.

Compiler Optimization Conversational Question Answering +1

A bi-diffusion based layer-wise sampling method for deep learning in large graphs

no code implementations25 Sep 2019 Yu He, Shiyang Wen, Wenjin Wu, Yan Zhang, Siran Yang, Yuan Wei, Di Zhang, Guojie Song, Wei Lin, Liang Wang, Bo Zheng

The Graph Convolutional Network (GCN) and its variants are powerful models for graph representation learning and have recently achieved great success on many graph-based applications.

Graph Representation Learning
