Search Results for author: Wei Lin

Found 86 papers, 31 papers with code

Exact first passage time distribution for second-order reactions in chemical networks

no code implementations • 4 Sep 2024 • Changqian Rao, David Waxman, Wei Lin, Zhuoyi Song

The first passage time (FPT) is a generic measure that quantifies when a random quantity reaches a specific state.

Computational Efficiency
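The paper derives exact FPT distributions analytically; as a generic illustration of the quantity itself (not the authors' method), a first passage time can also be estimated by brute-force simulation. The sketch below, with hypothetical parameters, estimates the mean FPT of a biased one-dimensional random walk:

```python
import random

def first_passage_time(start, target, p_up=0.5, max_steps=10_000, rng=None):
    """Simulate a 1-D random walk and return the number of steps until it
    first reaches `target` (None if max_steps is exhausted)."""
    rng = rng or random.Random(0)
    x = start
    for step in range(1, max_steps + 1):
        x += 1 if rng.random() < p_up else -1
        if x == target:
            return step
    return None

# Estimate the mean FPT from 0 to 3 for an upward-biased walk.
rng = random.Random(42)
samples = [first_passage_time(0, 3, p_up=0.7, rng=rng) for _ in range(1000)]
samples = [s for s in samples if s is not None]
mean_fpt = sum(samples) / len(samples)
```

For this drifted walk the theoretical mean FPT is 3 / (2·0.7 − 1) = 7.5 steps, which the Monte Carlo estimate should approach; exact analytic results such as those in the paper avoid this sampling error entirely.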

LARR: Large Language Model Aided Real-time Scene Recommendation with Semantic Understanding

no code implementations • 21 Aug 2024 • Zhizhong Wan, Bin Yin, Junjie Xie, Fei Jiang, Xiang Li, Wei Lin

Finally, the LLM's separate outputs for different scene features are aggregated by an encoder and aligned with collaborative signals in the RS, enhancing the performance of the recommendation model.

Click-Through Rate Prediction Contrastive Learning +2

First Activations Matter: Training-Free Methods for Dynamic Activation in Large Language Models

no code implementations • 21 Aug 2024 • Chi Ma, Mincong Huang, Ying Zhang, Chao Wang, Yujie Wang, Lei Yu, Chuan Liu, Wei Lin

Dynamic activation (DA) techniques, such as DejaVu and MoEfication, have demonstrated their potential to significantly enhance the inference efficiency of large language models (LLMs).
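Schematically, dynamic activation exploits the observation that only a small fraction of FFN neurons fire for a given input, so the rest can be skipped at inference time. A toy sketch of keeping only the top-k activations (hypothetical names and shapes; DejaVu itself uses a learned sparsity predictor rather than an oracle top-k):

```python
import numpy as np

def sparse_ffn(x, W1, W2, keep_ratio=0.2):
    """Toy dynamic-activation FFN: compute ReLU pre-activations, keep only
    the top-`keep_ratio` fraction of neurons by magnitude, zero the rest.
    In a real system the inactive rows of W2 would simply not be read."""
    h = np.maximum(x @ W1, 0.0)              # ReLU pre-activations
    k = max(1, int(keep_ratio * h.shape[-1]))
    idx = np.argsort(np.abs(h))[-k:]         # indices of top-k neurons
    mask = np.zeros_like(h)
    mask[idx] = 1.0
    return (h * mask) @ W2, mask

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(8, 64)), rng.normal(size=(64, 8))
x = rng.normal(size=8)
y, mask = sparse_ffn(x, W1, W2, keep_ratio=0.25)
```

With `keep_ratio=0.25`, only 16 of the 64 hidden neurons contribute to the output, so roughly 75% of the second matrix multiply could be skipped.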

Vision-Language Guidance for LiDAR-based Unsupervised 3D Object Detection

1 code implementation • 7 Aug 2024 • Christian Fruhwirth-Reisinger, Wei Lin, Dušan Malić, Horst Bischof, Horst Possegger

To overcome these limitations, we propose a vision-language-guided unsupervised 3D detection approach that operates exclusively on LiDAR point clouds.

3D Object Detection Autonomous Driving +3

Adaptive Self-supervised Robust Clustering for Unstructured Data with Unknown Cluster Number

no code implementations • 29 Jul 2024 • Chen-Lu Ding, Jiancan Wu, Wei Lin, Shiyang Shen, Xiang Wang, Yancheng Yuan

ASRC obtains the final clustering results by applying RCC to the learned feature representations with their consistent graph structure and edge weights.

Clustering Contrastive Learning +1

Enhancing CTR Prediction through Sequential Recommendation Pre-training: Introducing the SRP4CTR Framework

no code implementations • 29 Jul 2024 • Ruidong Han, Qianzhong Li, He Jiang, Rui Li, Yurou Zhao, Xiang Li, Wei Lin

However, these approaches tend to ignore the additional inference costs to the downstream tasks, and they do not consider how to transfer the effective information from the pre-trained models for specific estimated items in CTR prediction.

Click-Through Rate Prediction Self-Supervised Learning +2

Aligning Explanations for Recommendation with Rating and Feature via Maximizing Mutual Information

1 code implementation • 18 Jul 2024 • Yurou Zhao, Yiding Sun, Ruidong Han, Fei Jiang, Lu Guan, Xiang Li, Wei Lin, Weizhi Ma, Jiaxin Mao

However, as current explanation generation methods are commonly trained with an objective to mimic existing user reviews, the generated explanations are often not aligned with the predicted ratings or some important features of the recommended items, and thus are suboptimal in helping users make informed decisions on the recommendation platform.

Explanation Generation

Decision Focused Causal Learning for Direct Counterfactual Marketing Optimization

no code implementations • 18 Jul 2024 • Hao Zhou, Rongxiao Huang, Shaoming Li, Guibin Jiang, Jiaqi Zheng, Bing Cheng, Wei Lin

Decision Focused Learning (DFL) integrates ML and OR into an end-to-end framework, which takes the objective of the downstream task as the decision loss function and guarantees the consistency of the optimization direction between ML and OR.

counterfactual Marketing

Enhancing Vehicle Re-identification and Matching for Weaving Analysis

1 code implementation • 5 Jul 2024 • Mei Qiu, Wei Lin, Stanley Chien, Lauren Christopher, Yaobin Chen, Shu Hu

Vehicle weaving on highways contributes to traffic congestion, raises safety issues, and underscores the need for sophisticated traffic management systems.

Management Vehicle Re-Identification

Unified Dual-Intent Translation for Joint Modeling of Search and Recommendation

1 code implementation • 1 Jul 2024 • Yuting Zhang, Yiqing Wu, Ruidong Han, Ying Sun, Yongchun Zhu, Xiang Li, Wei Lin, Fuzhen Zhuang, Zhulin An, Yongjun Xu

It is therefore feasible to utilize the interaction data from both scenarios to reinforce the dual intents for joint intent-aware modeling.

Recommendation Systems Triplet

Deciphering interventional dynamical causality from non-intervention systems

no code implementations • 29 Jun 2024 • Jifan Shi, Yang Li, Juan Zhao, Siyang Leng, Kazuyuki Aihara, Luonan Chen, Wei Lin

Detecting and quantifying causality is a focal topic in the fields of science, engineering, and interdisciplinary studies.

Time Series

Comparison Visual Instruction Tuning

no code implementations • 13 Jun 2024 • Wei Lin, Muhammad Jehanzeb Mirza, Sivan Doveh, Rogerio Feris, Raja Giryes, Sepp Hochreiter, Leonid Karlinsky

Comparing two images in terms of Commonalities and Differences (CaD) is a fundamental human capability that forms the basis of advanced visual reasoning and interpretation.

Instruction Following Novelty Detection +1

A Statistical Theory of Regularization-Based Continual Learning

no code implementations • 10 Jun 2024 • Xuyang Zhao, Huiyuan Wang, Weiran Huang, Wei Lin

Moreover, the estimation error of the optimal algorithm is derived explicitly, which is of the same order as that of the oracle estimator.

Continual Learning regression +1

Learning Hamiltonian neural Koopman operator and simultaneously sustaining and discovering conservation law

no code implementations • 4 Jun 2024 • Jingdong Zhang, Qunxi Zhu, Wei Lin

Our results suggest that feeding the prior knowledge of the underlying system and the mathematical theory appropriately to the learning framework can reinforce the capability of machine learning in solving physical problems.

PertEval: Unveiling Real Knowledge Capacity of LLMs with Knowledge-Invariant Perturbations

no code implementations • 30 May 2024 • Jiatong Li, Renjun Hu, Kunzhe Huang, Yan Zhuang, Qi Liu, Mengxiao Zhu, Xing Shi, Wei Lin

To rectify this, we present PertEval, a toolkit devised for in-depth probing of LLMs' knowledge capacity through knowledge-invariant perturbations.

Memorization

Switched Flow Matching: Eliminating Singularities via Switching ODEs

no code implementations • 19 May 2024 • Qunxi Zhu, Wei Lin

Continuous-time generative models, such as Flow Matching (FM), construct probability paths to transport between one distribution and another through the simulation-free learning of the neural ordinary differential equations (ODEs).

Attribute
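In (conditional) flow matching, the neural ODE's vector field is regressed onto a simple analytic target along straight-line probability paths between the two distributions. A minimal sketch of constructing one training pair, following the generic FM recipe rather than this paper's switching construction (all names and values here are illustrative):

```python
import numpy as np

def cfm_pair(x0, x1, t):
    """Conditional flow-matching target: interpolate linearly between a
    source sample x0 and a data sample x1, returning the point x_t and the
    constant target velocity u_t = x1 - x0 that the vector field v(x, t)
    is regressed onto."""
    x_t = (1.0 - t) * x0 + t * x1
    u_t = x1 - x0
    return x_t, u_t

rng = np.random.default_rng(0)
x0 = rng.normal(size=2)            # source sample (e.g. Gaussian noise)
x1 = np.array([3.0, -1.0])         # data sample
x_t, u_t = cfm_pair(x0, x1, t=0.5)
# A model v(x, t) would be trained with loss ||v(x_t, t) - u_t||^2;
# here we evaluate that loss for the trivial field v ≡ 0.
loss = float(np.sum((np.zeros(2) - u_t) ** 2))
```

Training is simulation-free: no ODE solve is needed to form the regression target, which is one of FM's main attractions over likelihood-based continuous normalizing flows.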

From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems

no code implementations • 19 May 2024 • Xin Li, Jingdong Zhang, Qunxi Zhu, Chengli Zhao, Xue Zhang, Xiaojun Duan, Wei Lin

We then incorporate the estimated spatial gradients as additional inputs to a neural network.

Network Structure Governs Drosophila Brain Functionality

no code implementations • 26 Apr 2024 • XiaoYu Zhang, Pengcheng Yang, Jiawei Feng, Qiang Luo, Wei Lin, Xin Lu

The results revealed that even with rudimentary neuronal activation mechanisms, models grounded in real neural network structures can generate activation patterns strikingly similar to those observed in the actual brain.

AlphaFin: Benchmarking Financial Analysis with Retrieval-Augmented Stock-Chain Framework

1 code implementation • 19 Mar 2024 • Xiang Li, Zhenyu Li, Chen Shi, Yong Xu, Qing Du, Mingkui Tan, Jun Huang, Wei Lin

The task of financial analysis primarily encompasses two key areas: stock trend prediction and the corresponding financial question answering.

Benchmarking Financial Analysis +4

Towards Multimodal In-Context Learning for Vision & Language Models

no code implementations • 19 Mar 2024 • Sivan Doveh, Shaked Perek, M. Jehanzeb Mirza, Wei Lin, Amit Alfassy, Assaf Arbelle, Shimon Ullman, Leonid Karlinsky

State-of-the-art Vision-Language Models (VLMs) ground the vision and the language modality primarily via projecting the vision tokens from the encoder to language-like tokens, which are directly fed to the Large Language Model (LLM) decoder.

Image Captioning In-Context Learning +2

Context-based Fast Recommendation Strategy for Long User Behavior Sequence in Meituan Waimai

no code implementations • 19 Mar 2024 • Zhichao Feng, Junjie Xie, Kaiyuan Li, Yu Qin, Pengfei Wang, Qianzhong Li, Bin Yin, Xiang Li, Wei Lin, Shangguang Wang

We first identify contexts that share similar user preferences with the target context and then locate the corresponding PoIs based on these identified contexts.

Sequential Recommendation

Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs

1 code implementation • 18 Mar 2024 • M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Sivan Doveh, Jakub Micorek, Mateusz Kozinski, Hilde Kuehne, Horst Possegger

Prompt ensembling of Large Language Model (LLM) generated category-specific prompts has emerged as an effective method to enhance zero-shot recognition ability of Vision-Language Models (VLMs).

Language Modelling Large Language Model +1
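Prompt ensembling typically averages the text embeddings of several LLM-generated prompts per class into a single prototype before cosine-similarity scoring against the image embedding. A toy sketch with random stand-in embeddings (dimensions, names, and data here are hypothetical, not the paper's or CLIP's):

```python
import numpy as np

def ensemble_classify(image_emb, class_prompt_embs):
    """Toy prompt ensembling: average each class's prompt embeddings into
    one L2-normalized prototype and score the image by cosine similarity.
    `class_prompt_embs` maps class name -> (num_prompts, dim) array."""
    img = image_emb / np.linalg.norm(image_emb)
    scores = {}
    for name, embs in class_prompt_embs.items():
        proto = embs.mean(axis=0)
        proto = proto / np.linalg.norm(proto)
        scores[name] = float(img @ proto)
    return max(scores, key=scores.get), scores

rng = np.random.default_rng(1)
# Stand-in embeddings playing the role of a VLM text encoder's output.
cat = rng.normal(loc=1.0, size=(3, 4))    # 3 prompts for "cat"
dog = rng.normal(loc=-1.0, size=(3, 4))   # 3 prompts for "dog"
image = np.ones(4)                         # image embedding near "cat"
label, scores = ensemble_classify(image, {"cat": cat, "dog": dog})
```

Averaging before normalizing smooths out prompt-specific noise, which is why ensembles of category-specific prompts tend to beat any single hand-written template.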

A Fixed-Point Approach to Unified Prompt-Based Counting

no code implementations • 15 Mar 2024 • Wei Lin, Antoni B. Chan

Additionally, a contrastive training scheme is implemented to mitigate dataset bias inherent in current class-agnostic counting datasets, a strategy whose effectiveness is confirmed by our ablation study.

Don't Half-listen: Capturing Key-part Information in Continual Instruction Tuning

no code implementations • 15 Mar 2024 • Yongquan He, Xuancheng Huang, Minghao Tang, Lingxun Meng, Xiang Li, Wei Lin, Wenyuan Zhang, Yifu Gao

Recent methods try to alleviate the CF problem by modifying models or replaying data, which may only remember the surface-level pattern of instructions and get confused on held-out tasks.

Instruction Following

Robust Zero-Shot Crowd Counting and Localization With Adaptive Resolution SAM

no code implementations • 27 Feb 2024 • Jia Wan, Qiangqiang Wu, Wei Lin, Antoni B. Chan

The existing crowd counting models require extensive training data, which is time-consuming to annotate.

Crowd Counting

Target Recognition Algorithm for Monitoring Images in Electric Power Construction Process

no code implementations • 9 Feb 2024 • Hao Song, Wei Lin, Wei Song, Man Wang

To enhance precision and comprehensiveness in identifying targets in electric power construction monitoring video, a novel target recognition algorithm utilizing infrared imaging is explored.

Arithmetic Feature Interaction Is Necessary for Deep Tabular Learning

1 code implementation • 4 Feb 2024 • Yi Cheng, Renjun Hu, Haochao Ying, Xing Shi, Jian Wu, Wei Lin

Our extensive experiments on real-world data also validate the consistent effectiveness, efficiency, and rationale of AMFormer, suggesting it has established a strong inductive bias for deep learning on tabular data.

Inductive Bias

Quantifying energy landscape of oscillatory systems: Explosion, pre-solution, and diffusion decomposition

no code implementations • 13 Jan 2024 • Shirui Bian, Ruisong Zhou, Wei Lin, Chunhe Li

Although the weighted summation of the Gaussian approximation (WSGA) approach has been proposed for quantifying the energy landscape in multistable systems by solving the diffusion equation approximately from moment equations, we are still lacking an accurate approach for quantifying the energy landscape of the periodic oscillatory systems.

Towards Robust Learning to Optimize with Theoretical Guarantees

1 code implementation • CVPR 2024 • Qingyu Song, Wei Lin, Juncheng Wang, Hong Xu

Based on our proposed methodology of aligning OOD problems to InD problems, we also demonstrate that the L2O model's convergence rate in OOD scenarios will deteriorate by an equation of the L2O model's input features.

ChatKBQA: A Generate-then-Retrieve Framework for Knowledge Base Question Answering with Fine-tuned Large Language Models

1 code implementation • 13 Oct 2023 • Haoran Luo, Haihong E, Zichen Tang, Shiyao Peng, Yikai Guo, Wentai Zhang, Chenghao Ma, Guanting Dong, Meina Song, Wei Lin, Yifan Zhu, Luu Anh Tuan

Knowledge Base Question Answering (KBQA) aims to answer natural language questions over large-scale knowledge bases (KBs), which can be summarized into two crucial steps: knowledge retrieval and semantic parsing.

Knowledge Base Question Answering Knowledge Graphs +2

Accelerating Large Batch Training via Gradient Signal to Noise Ratio (GSNR)

no code implementations • 24 Sep 2023 • Guo-qing Jiang, Jinlong Liu, Zixiang Ding, Lin Guo, Wei Lin

As models for natural language processing (NLP), computer vision (CV), and recommendation systems (RS) require surging computation, a large number of GPUs/TPUs are paralleled as a large batch (LB) to improve training throughput.

Recommendation Systems

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity

1 code implementation • 19 Sep 2023 • Haojun Xia, Zhen Zheng, Yuchao Li, Donglin Zhuang, Zhongzhu Zhou, Xiafei Qiu, Yong Li, Wei Lin, Shuaiwen Leon Song

Therefore, we propose Flash-LLM for enabling low-cost and highly-efficient large generative model inference with the sophisticated support of unstructured sparsity on high-performance but highly restrictive Tensor Cores.

TAP: Targeted Prompting for Task Adaptive Generation of Textual Training Instances for Visual Classification

1 code implementation • 13 Sep 2023 • M. Jehanzeb Mirza, Leonid Karlinsky, Wei Lin, Horst Possegger, Rogerio Feris, Horst Bischof

Vision and Language Models (VLMs), such as CLIP, have enabled visual recognition of a potentially unlimited set of categories described by text prompts.

Zero-Shot Learning

CARE: Large Precision Matrix Estimation for Compositional Data

no code implementations • 13 Sep 2023 • Shucong Zhang, Huiyuan Wang, Wei Lin

High-dimensional compositional data are prevalent in many applications.

Accurate Prediction of Antibody Function and Structure Using Bio-Inspired Antibody Language Model

1 code implementation • 31 Aug 2023 • Hongtai Jing, Zhengtao Gao, Sheng Xu, Tao Shen, Zhangzhi Peng, Shwai He, Tao You, Shuang Ye, Wei Lin, Siqi Sun

Remarkably, BALMFold outperforms those well-established methods like AlphaFold2, IgFold, ESMFold, and OmegaFold in the antibody benchmark, demonstrating significant potential to advance innovative engineering and streamline therapeutic antibody development by reducing the need for unnecessary trials.

Language Modelling

Modeling Dual Period-Varying Preferences for Takeaway Recommendation

1 code implementation • 7 Jun 2023 • Yuting Zhang, Yiqing Wu, Ran Le, Yongchun Zhu, Fuzhen Zhuang, Ruidong Han, Xiang Li, Wei Lin, Zhulin An, Yongjun Xu

Different from traditional recommendation, takeaway recommendation faces two main challenges: (1) Dual Interaction-Aware Preference Modeling.

Recommendation Systems

Sit Back and Relax: Learning to Drive Incrementally in All Weather Conditions

1 code implementation • 30 May 2023 • Stefan Leitner, M. Jehanzeb Mirza, Wei Lin, Jakub Micorek, Marc Masana, Mateusz Kozinski, Horst Possegger, Horst Bischof

We propose to store these affine parameters as a memory bank for each weather condition and plug-in their weather-specific parameters during driving (i. e. test time) when the respective weather conditions are encountered.

Autonomous Driving Incremental Learning +2
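The per-condition memory bank of affine normalization parameters can be sketched as follows. This is a simplified stand-in (a real system would store the affine parameters of every normalization layer in the network, and the condition key would come from a weather detector):

```python
import numpy as np

class AffineBank:
    """Toy memory bank of normalization affine parameters: one
    (gamma, beta) pair per condition, swapped in at test time."""

    def __init__(self):
        self.bank = {}

    def store(self, condition, gamma, beta):
        self.bank[condition] = (np.asarray(gamma), np.asarray(beta))

    def apply(self, condition, features):
        """Normalize features, then apply the condition's affine params."""
        gamma, beta = self.bank[condition]
        mu = features.mean(axis=0)
        sigma = features.std(axis=0) + 1e-5
        return gamma * (features - mu) / sigma + beta

bank = AffineBank()
bank.store("clear", gamma=[1.0, 1.0, 1.0], beta=[0.0, 0.0, 0.0])
bank.store("fog",   gamma=[2.0, 2.0, 2.0], beta=[0.5, 0.5, 0.5])

x = np.random.default_rng(0).normal(size=(16, 3))
out_fog = bank.apply("fog", x)
```

Because only the small affine parameter sets are condition-specific, the backbone weights stay shared across all weather conditions, which is what keeps the incremental-learning footprint small.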

HAHE: Hierarchical Attention for Hyper-Relational Knowledge Graphs in Global and Local Level

1 code implementation • ACL 2023 • Haoran Luo, Haihong E, Yuhao Yang, Yikai Guo, Mingzhi Sun, Tianyu Yao, Zichen Tang, Kaiyang Wan, Meina Song, Wei Lin

The global-level attention can model the graphical structure of HKG using hypergraph dual-attention layers, while the local-level attention can learn the sequential structure inside H-Facts via heterogeneous self-attention layers.

Attribute Knowledge Graphs +1

Dual Intent Enhanced Graph Neural Network for Session-based New Item Recommendation

1 code implementation • 10 May 2023 • Di Jin, Luzhi Wang, Yizhen Zheng, Guojie Song, Fei Jiang, Xiang Li, Wei Lin, Shirui Pan

We design a dual-intent network to learn user intent from an attention mechanism and the distribution of historical data respectively, which can simulate users' decision-making process in interacting with a new item.

Decision Making Graph Neural Network +2

Neural Delay Differential Equations: System Reconstruction and Image Classification

no code implementations • 11 Apr 2023 • Qunxi Zhu, Yao Guo, Wei Lin

Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied, showing exceptional efficacy in coping with representative datasets.

Classification Image Classification

Embedding Theory of Reservoir Computing and Reducing Reservoir Network Using Time Delays

no code implementations • 16 Mar 2023 • Xing-Yue Duan, Xiong Ying, Si-Yang Leng, Jürgen Kurths, Wei Lin, Huan-Fei Ma

Reservoir computing (RC), a particular form of recurrent neural network, is under explosive development due to its exceptional efficacy and high performance in reconstruction or/and prediction of complex physical systems.

Auto-Parallelizing Large Models with Rhino: A Systematic Approach on Production AI Platform

no code implementations • 16 Feb 2023 • Shiwei Zhang, Lansong Diao, Siyu Wang, Zongyan Cao, Yiliang Gu, Chang Si, Ziji Shi, Zhen Zheng, Chuan Wu, Wei Lin

We present Rhino, a system for accelerating tensor programs with automatic parallelization on AI platform for real production environment.

Expediting Distributed DNN Training with Device Topology-Aware Graph Deployment

no code implementations • 13 Feb 2023 • Shiwei Zhang, Xiaodong Yi, Lansong Diao, Chuan Wu, Siyu Wang, Wei Lin

This paper presents TAG, an automatic system to derive optimized DNN training graph and its deployment onto any device topology, for expedited training in device- and topology- heterogeneous ML clusters.

Combinatorial Optimization Graph Neural Network +1

TAP: Accelerating Large-Scale DNN Training Through Tensor Automatic Parallelisation

no code implementations • 1 Feb 2023 • Ziji Shi, Le Jiang, Ang Wang, Jie Zhang, Xianyan Jia, Yong Li, Chencan Wu, Jialin Li, Wei Lin

However, finding a suitable model parallel schedule for an arbitrary neural network is a non-trivial task due to the exploding search space.

Optimal Transport Minimization: Crowd Localization on Density Maps for Semi-Supervised Counting

1 code implementation • CVPR 2023 • Wei Lin, Antoni B. Chan

In this paper, we propose the optimal transport minimization (OT-M) algorithm for crowd localization with density maps.

Crowd Counting

SKDBERT: Compressing BERT via Stochastic Knowledge Distillation

no code implementations • 26 Nov 2022 • Zixiang Ding, Guoqing Jiang, Shuai Zhang, Lin Guo, Wei Lin

In this paper, we propose Stochastic Knowledge Distillation (SKD) to obtain compact BERT-style language model dubbed SKDBERT.

Knowledge Distillation Language Modelling

Video Test-Time Adaptation for Action Recognition

1 code implementation • CVPR 2023 • Wei Lin, Muhammad Jehanzeb Mirza, Mateusz Kozinski, Horst Possegger, Hilde Kuehne, Horst Bischof

Our proposed method demonstrates a substantial performance gain over existing test-time adaptation approaches in both evaluations of a single distribution shift and the challenging case of random distribution shifts.

Action Recognition Temporal Action Localization +1

ActMAD: Activation Matching to Align Distributions for Test-Time-Training

1 code implementation • CVPR 2023 • Muhammad Jehanzeb Mirza, Pol Jané Soneira, Wei Lin, Mateusz Kozinski, Horst Possegger, Horst Bischof

Test-Time-Training (TTT) is an approach to cope with out-of-distribution (OOD) data by adapting a trained model to distribution shifts occurring at test-time.

Image Classification

MATE: Masked Autoencoders are Online 3D Test-Time Learners

1 code implementation • ICCV 2023 • M. Jehanzeb Mirza, Inkyu Shin, Wei Lin, Andreas Schriebl, Kunyang Sun, Jaesung Choe, Horst Possegger, Mateusz Kozinski, In So Kweon, Kuk-Jin Yoon, Horst Bischof

Our MATE is the first Test-Time-Training (TTT) method designed for 3D data, which makes deep networks trained for point cloud classification robust to distribution shifts occurring in test data.

3D Object Classification Point Cloud Classification

Multi-Frequency-Aware Patch Adversarial Learning for Neural Point Cloud Rendering

no code implementations • 7 Oct 2022 • Jay Karhade, Haiyue Zhu, Ka-Shing Chung, Rajesh Tripathy, Wei Lin, Marcelo H. Ang Jr

The proposed approach aims to improve the rendering realness by minimizing the spectrum discrepancy between real and synthesized images, especially on the high-frequency localized sharpness information which causes image blur visually.

Heterogeneous Federated Learning on a Graph

no code implementations • 19 Sep 2022 • Huiyuan Wang, Xuyang Zhao, Wei Lin

In this work, we consider parameter estimation in federated learning with data distribution and communication heterogeneity, as well as limited computational capacity of local devices.

Federated Learning

Neural Stochastic Control

1 code implementation • 15 Sep 2022 • Jingdong Zhang, Qunxi Zhu, Wei Lin

These two stochastic controllers thus are complementary in applications.

RAW-GNN: RAndom Walk Aggregation based Graph Neural Network

no code implementations • 28 Jun 2022 • Di Jin, Rui Wang, Meng Ge, Dongxiao He, Xiang Li, Wei Lin, Weixiong Zhang

Due to the homophily assumption of Graph Convolutional Networks (GCNs) that these methods use, they are not suitable for heterophily graphs where nodes with different labels or dissimilar attributes tend to be adjacent.

Graph Neural Network Representation Learning

CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph Similarity Learning

1 code implementation • 30 May 2022 • Di Jin, Luzhi Wang, Yizhen Zheng, Xiang Li, Fei Jiang, Wei Lin, Shirui Pan

As most of the existing graph neural networks yield effective graph representations of a single graph, little effort has been made for jointly learning two graph representations and calculating their similarity score.

Collaborative Filtering Graph Classification +4

Cross-View Cross-Scene Multi-View Crowd Counting

no code implementations • CVPR 2021 • Qi Zhang, Wei Lin, Antoni B. Chan

Multi-view crowd counting has been previously proposed to utilize multi-cameras to extend the field-of-view of a single camera, capturing more people in the scene, and improve counting performance for occluded people or those in low resolution.

Camera Calibration Crowd Counting

PICASSO: Unleashing the Potential of GPU-centric Training for Wide-and-deep Recommender Systems

1 code implementation • 11 Apr 2022 • Yuanxing Zhang, Langshi Chen, Siran Yang, Man Yuan, Huimin Yi, Jie Zhang, Jiamang Wang, Jianbo Dong, Yunlong Xu, Yue Song, Yong Li, Di Zhang, Wei Lin, Lin Qu, Bo Zheng

However, we observe that GPU devices are underutilized when training recommender systems and cannot attain the throughput improvements that have been achieved in the CV and NLP areas.

Marketing Recommendation Systems

CycDA: Unsupervised Cycle Domain Adaptation from Image to Video

1 code implementation • 30 Mar 2022 • Wei Lin, Anna Kukleva, Kunyang Sun, Horst Possegger, Hilde Kuehne, Horst Bischof

To address these challenges, we propose Cycle Domain Adaptation (CycDA), a cycle-based approach for unsupervised image-to-video domain adaptation by leveraging the joint spatial information in images and videos on the one hand and, on the other hand, training an independent spatio-temporal model to bridge the modality gap.

Action Recognition Domain Adaptation +1

AC-Feasible Power Transfer Regions of Virtual Power Plants: Characterization and Application

no code implementations • 9 Feb 2022 • Wei Lin, Changhong Zhao

Distributed energy resources (DERs) in distribution networks can be aggregated as a virtual power plant (VPP) for transmission-level operations.

Tie-line Security Regions in High Dimension for Renewable Accommodations

no code implementations • 4 Jan 2022 • Wei Lin, Hua Jiang, Zhifang Yang

However, a tie-line security region is a high-dimensional polytope due to the multiple time periods and border buses inherent in power system operations, leading to a considerable computational burden.

Vocal Bursts Intensity Prediction

Neural Piecewise-Constant Delay Differential Equations

no code implementations • 4 Jan 2022 • Qunxi Zhu, Yifei Shen, Dongsheng Li, Wei Lin

Continuous-depth neural networks, such as the Neural Ordinary Differential Equations (ODEs), have aroused a great deal of interest from the communities of machine learning and data science in recent years, which bridge the connection between deep neural networks and dynamical systems.

Cost Functions over Feasible Power Transfer Regions of Virtual Power Plants

no code implementations • 2 Dec 2021 • Wei Lin, Changhong Zhao

To address this challenge, a characterization method is presented in this paper for the intraday operation of a VPP based on the concepts of nonanticipativity and robustness to DERs' uncertainties.

M6-10T: A Sharing-Delinking Paradigm for Efficient Multi-Trillion Parameter Pretraining

no code implementations • 8 Oct 2021 • Junyang Lin, An Yang, Jinze Bai, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Yong Li, Wei Lin, Jingren Zhou, Hongxia Yang

Recent expeditious developments in deep learning algorithms, distributed training, and even hardware design for large models have enabled training extreme-scale models, say GPT-3 and Switch Transformer possessing hundreds of billions or even trillions of parameters.

Binary Code based Hash Embedding for Web-scale Applications

no code implementations • 24 Aug 2021 • Bencheng Yan, Pengjie Wang, Jinquan Liu, Wei Lin, Kuang-Chih Lee, Jian Xu, Bo Zheng

In these applications, embedding learning of categorical features is crucial to the success of deep learning models.

Recommendation Systems

Boosting the Convergence of Reinforcement Learning-based Auto-pruning Using Historical Data

no code implementations • 16 Jul 2021 • Jiandong Mu, Mengdi Wang, Feiwen Zhu, Jun Yang, Wei Lin, Wei zhang

Reinforcement learning (RL)-based auto-pruning has been further proposed to automate the DNN pruning process to avoid expensive hand-crafted work.

Neural Network Compression reinforcement-learning +2

Nonasymptotic theory for two-layer neural networks: Beyond the bias-variance trade-off

no code implementations • 9 Jun 2021 • Huiyuan Wang, Wei Lin

Large neural networks have proved remarkably effective in modern deep learning practice, even in the overparametrized regime where the number of active parameters is large relative to the sample size.

Vocal Bursts Valence Prediction

M6-T: Exploring Sparse Expert Models and Beyond

no code implementations • 31 May 2021 • An Yang, Junyang Lin, Rui Men, Chang Zhou, Le Jiang, Xianyan Jia, Ang Wang, Jie Zhang, Jiamang Wang, Yong Li, Di Zhang, Wei Lin, Lin Qu, Jingren Zhou, Hongxia Yang

Mixture-of-Experts (MoE) models can achieve promising results with outrageous large amount of parameters but constant computation cost, and thus it has become a trend in model scaling.

Playing the Game of 2048
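The MoE scaling trick described above, many parameters but near-constant compute, comes from routing each input through only a few experts. A toy sketch of top-k gating (shapes and names are illustrative; production MoE layers add load balancing, batching, and expert parallelism):

```python
import numpy as np

def moe_forward(x, experts, gate_W, k=1):
    """Toy Mixture-of-Experts layer: a gate scores every expert, but only
    the top-k experts are actually evaluated, so per-token compute stays
    roughly constant as the expert count (parameter count) grows."""
    logits = x @ gate_W                       # one score per expert
    top = np.argsort(logits)[-k:]             # indices of top-k experts
    w = np.exp(logits[top] - logits[top].max())
    w = w / w.sum()                           # renormalized gate weights
    return sum(wi * experts[i](x) for wi, i in zip(w, top))

rng = np.random.default_rng(0)
# Eight linear "experts", each with its own weight matrix.
experts = [lambda x, W=rng.normal(size=(4, 4)): x @ W for _ in range(8)]
gate_W = rng.normal(size=(4, 8))
x = rng.normal(size=4)
y = moe_forward(x, experts, gate_W, k=2)
```

With `k=2`, only 2 of the 8 expert matrices are touched for this token, even though all 8 contribute to the model's capacity.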

Joule-Thomson expansion of the torus-like black hole

no code implementations • 4 Mar 2021 • Jing Liang, Wei Lin, Benrong Mu

Furthermore, we investigate similarities and differences between the Van der Waals fluid, the torus-like black hole and the charged AdS black holes for the expansion.

General Relativity and Quantum Cosmology

M6: A Chinese Multimodal Pretrainer

no code implementations • 1 Mar 2021 • Junyang Lin, Rui Men, An Yang, Chang Zhou, Ming Ding, Yichang Zhang, Peng Wang, Ang Wang, Le Jiang, Xianyan Jia, Jie Zhang, Jianwei Zhang, Xu Zou, Zhikang Li, Xiaodong Deng, Jie Liu, Jinbao Xue, Huiling Zhou, Jianxin Ma, Jin Yu, Yong Li, Wei Lin, Jingren Zhou, Jie Tang, Hongxia Yang

In this work, we construct the largest dataset for multimodal pretraining in Chinese, which consists of over 1.9 TB of images and 292 GB of texts that cover a wide range of domains.

Image Generation

Neural Delay Differential Equations

no code implementations • ICLR 2021 • Qunxi Zhu, Yao Guo, Wei Lin

Neural Ordinary Differential Equations (NODEs), a framework of continuous-depth neural networks, have been widely applied, showing exceptional efficacy in coping with some representative datasets.

EasyTransfer -- A Simple and Scalable Deep Transfer Learning Platform for NLP Applications

2 code implementations • 18 Nov 2020 • Minghui Qiu, Peng Li, Chengyu Wang, Hanjie Pan, Ang Wang, Cen Chen, Xianyan Jia, Yaliang Li, Jun Huang, Deng Cai, Wei Lin

The literature has witnessed the success of leveraging Pre-trained Language Models (PLMs) and Transfer Learning (TL) algorithms to a wide range of Natural Language Processing (NLP) applications, yet it is not easy to build an easy-to-use and scalable TL toolkit for this purpose.

Compiler Optimization Conversational Question Answering +1

A bi-diffusion based layer-wise sampling method for deep learning in large graphs

no code implementations • 25 Sep 2019 • Yu He, Shiyang Wen, Wenjin Wu, Yan Zhang, Siran Yang, Yuan Wei, Di Zhang, Guojie Song, Wei Lin, Liang Wang, Bo Zheng

The Graph Convolutional Network (GCN) and its variants are powerful models for graph representation learning and have recently achieved great success on many graph-based applications.

Graph Representation Learning
