Search Results for author: Wei Zhu

Found 70 papers, 18 papers with code

PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting

1 code implementation Findings (NAACL) 2022 Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung

In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.

Model Compression

GAML-BERT: Improving BERT Early Exiting by Gradient Aligned Mutual Learning

no code implementations EMNLP 2021 Wei Zhu, Xiaoling Wang, Yuan Ni, Guotong Xie

From this observation, we use mutual learning to improve BERT’s early exiting performances, that is, we ask each exit of a multi-exit BERT to distill knowledge from each other.

Knowledge Distillation

Continually Detection, Rapidly React: Unseen Rumors Detection Based on Continual Prompt-Tuning

no code implementations COLING 2022 Yuhui Zuo, Wei Zhu, Guoyong Cai

Since open social platforms allow for a large and continuous flow of unverified information, rumors can emerge unexpectedly and spread quickly.

Transfer Learning

ALoRA: Allocating Low-Rank Adaptation for Fine-tuning Large Language Models

no code implementations 24 Mar 2024 Zequan Liu, Jiawen Lyn, Wei Zhu, Xing Tian, Yvette Graham

Parameter-efficient fine-tuning (PEFT) is widely studied for its effectiveness and efficiency in the era of large language models.

Text2MDT: Extracting Medical Decision Trees from Medical Texts

1 code implementation 4 Jan 2024 Wei Zhu, Wenfeng Li, Xing Tian, Pengfei Wang, Xiaoling Wang, Jin Chen, Yuanbin Wu, Yuan Ni, Guotong Xie

In this work, we propose a novel task, Text2MDT, to explore the automatic extraction of MDTs from medical texts such as medical guidelines and textbooks.

Efficient Multi-domain Text Recognition Deep Neural Network Parameterization with Residual Adapters

1 code implementation 1 Jan 2024 Jiayou Chao, Wei Zhu

Recent advancements in deep neural networks have markedly enhanced the performance of computer vision tasks, yet the specialized nature of these networks often necessitates extensive data and high computational power.

Multi-Task Learning Optical Character Recognition +1

Overview of the PromptCBLUE Shared Task in CHIP2023

1 code implementation 29 Dec 2023 Wei Zhu, Xiaoling Wang, Mosha Chen, Buzhou Tang

Many teams from both the industry and academia participated in the shared tasks, and the top teams achieved amazing test results.

In-Context Learning

Efficient LLM inference solution on Intel GPU

no code implementations 19 Dec 2023 Hui Wu, Yi Gan, Feng Yuan, Jing Ma, Wei Zhu, Yutao Xu, Hong Zhu, Yuhua Zhu, Xiaoli Liu, Jinghui Gu

A customized Scaled-Dot-Product-Attention kernel is designed to match our fusion policy based on the segment KV cache solution.

Management

Improving Prompt Tuning with Learned Prompting Layers

no code implementations 31 Oct 2023 Wei Zhu, Ming Tan

Prompt tuning prepends a soft prompt to the input embeddings or hidden states and only optimizes the prompt to adapt pretrained models (PTMs) to downstream tasks.

PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain

1 code implementation 22 Oct 2023 Wei Zhu, Xiaoling Wang, Huanran Zheng, Mosha Chen, Buzhou Tang

Biomedical language understanding benchmarks are the driving forces for artificial intelligence applications with large language model (LLM) back-ends.

Dialogue Generation Dialogue Understanding +6

UltraFeedback: Boosting Language Models with High-quality Feedback

2 code implementations 2 Oct 2023 Ganqu Cui, Lifan Yuan, Ning Ding, Guanming Yao, Wei Zhu, Yuan Ni, Guotong Xie, Zhiyuan Liu, Maosong Sun

However, the scarcity of diverse, naturalistic datasets of human preferences on LLM outputs at scale poses a great challenge to RLHF as well as feedback learning research within the open-source community.

Language Modelling

Integrating Different Informations for Portfolio Selection

no code implementations 29 May 2023 Yi Huang, Wei Zhu, Duan Li, Shushang Zhu, Shikun Wang

Following the idea of Bayesian learning via Gaussian mixture model, we organically combine the backward-looking information contained in the historical data and the forward-looking information implied by the market portfolio, which is affected by heterogeneous expectations and noisy trading behavior.

Statistical Guarantees of Group-Invariant GANs

no code implementations 22 May 2023 Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu

Group-invariant generative adversarial networks (GANs) are a type of GANs in which the generators and discriminators are hardwired with group symmetries.

F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks

no code implementations 21 May 2023 Xiangxiang Gao, Wei Zhu, Jiasheng Gao, Congrui Yin

Computational complexity and overthinking problems have become the bottlenecks for pre-trained language models (PLMs) with millions or even trillions of parameters.

Multi-Label Classification Multi Label Text Classification +2

Unified Demonstration Retriever for In-Context Learning

1 code implementation 7 May 2023 Xiaonan Li, Kai Lv, Hang Yan, Tianyang Lin, Wei Zhu, Yuan Ni, Guotong Xie, Xiaoling Wang, Xipeng Qiu

To train UDR, we cast various tasks' training signals into a unified list-wise ranking formulation based on a language model's feedback.

In-Context Learning Language Modelling +1

SegPrompt: Using Segmentation Map as a Better Prompt to Finetune Deep Models for Kidney Stone Classification

no code implementations 15 Mar 2023 Wei Zhu, Runtao Zhou, Yao Yuan, Campbell Timothy, Rajat Jain, Jiebo Luo

However, the shortage of annotated training data poses a severe problem in improving the performance and generalization ability of the trained model.

Classification Segmentation

Sample Complexity of Probability Divergences under Group Symmetry

no code implementations 3 Feb 2023 Ziyu Chen, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu

We rigorously quantify the improvement in the sample complexity of variational divergence estimations for group-invariant distributions.

Candidate Soups: Fusing Candidate Results Improves Translation Quality for Non-Autoregressive Translation

1 code implementation 27 Jan 2023 Huanran Zheng, Wei Zhu, Pengfei Wang, Xiaoling Wang

In this paper, we propose a simple but effective method called "Candidate Soups," which can obtain high-quality translations while maintaining the inference speed of NAT models.

Translation

Improving Micro-video Recommendation by Controlling Position Bias

no code implementations 9 Aug 2022 Yisong Yu, Beihong Jin, Jiageng Song, Beibei Li, Yiyuan Zheng, Wei Zhu

Although the micro-video recommendation can be naturally treated as the sequential recommendation, the previous sequential recommendation models do not fully consider the characteristics of micro-video apps, and in their inductive biases, the role of positions is not in accord with the reality in the micro-video scenario.

Contrastive Learning Position +1

Explored An Effective Methodology for Fine-Grained Snake Recognition

1 code implementation 24 Jul 2022 Yong Huang, Aderon Huang, Wei Zhu, Yanming Fang, Jinghua Feng

Then, in order to take full advantage of unlabeled datasets, we jointly train with self-supervised and supervised learning to provide a pre-trained model.

Fine-Grained Image Classification Self-Supervised Learning

MAGIC: Microlensing Analysis Guided by Intelligent Computation

1 code implementation 16 Jun 2022 Haimeng Zhao, Wei Zhu

The key feature of MAGIC is the introduction of a neural controlled differential equation, which provides the capability to handle light curves with irregular sampling and large data gaps.

Time Series Time Series Analysis

Deep Federated Anomaly Detection for Multivariate Time Series Data

no code implementations 9 May 2022 Wei Zhu, Dongjin Song, Yuncong Chen, Wei Cheng, Bo Zong, Takehiko Mizoguchi, Cristian Lumezanu, Haifeng Chen, Jiebo Luo

Specifically, we first design an Exemplar-based Deep Neural network (ExDNN) to learn local time series representations based on their compatibility with an exemplar module which consists of hidden parameters learned to capture varieties of normal patterns on each edge device.

Constrained Clustering Federated Learning +3

Localized Adversarial Domain Generalization

1 code implementation CVPR 2022 Wei Zhu, Le Lu, Jing Xiao, Mei Han, Jiebo Luo, Adam P. Harrison

Adversarial domain generalization is a popular approach to DG, but conventional approaches (1) struggle to sufficiently align features so that local neighborhoods are mixed across domains; and (2) can suffer from feature space over collapse which can threaten generalization performance.

Domain Generalization

Continuous Detection, Rapidly React: Unseen Rumors Detection based on Continual Prompt-Tuning

1 code implementation 16 Mar 2022 Yuhui Zuo, Wei Zhu, Guoyong Cai

Since open social platforms allow for a large and continuous flow of unverified information, rumors can emerge unexpectedly and spread quickly.

Transfer Learning

A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation

1 code implementation Findings (ACL) 2022 Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffers from generalization and threshold-tuning.

Structure-preserving GANs

no code implementations 2 Feb 2022 Jeremiah Birrell, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu

Generative adversarial networks (GANs), a class of distribution-learning methods based on a two-player game between a generator and a discriminator, can generally be formulated as a minmax problem based on the variational representation of a divergence between the unknown and the generated distributions.

Deformation Robust Roto-Scale-Translation Equivariant CNNs

no code implementations 22 Nov 2021 Liyao Gao, Guang Lin, Wei Zhu

Incorporating group symmetry directly into the learning process has proved to be an effective guideline for model design.

Out-of-Distribution Generalization Translation

Route Optimization via Environment-Aware Deep Network and Reinforcement Learning

no code implementations 16 Nov 2021 Pengzhan Guo, Keli Xiao, Zeyang Ye, Wei Zhu

Vehicle mobility optimization in urban areas is a long-standing problem in smart city and spatial data analysis.

Decision Making reinforcement-learning +2

Learning to Aggregate and Refine Noisy Labels for Visual Sentiment Analysis

no code implementations 15 Sep 2021 Wei Zhu, Zihe Zheng, Haitian Zheng, Hanjia Lyu, Jiebo Luo

The learned prototypes and their labels can be regarded as denoising features and labels for the local regions and can guide the training process to prevent the model from overfitting the noisy cases.

Denoising Learning with noisy labels +1

Federated Learning of Molecular Properties with Graph Neural Networks in a Heterogeneous Setting

no code implementations 15 Sep 2021 Wei Zhu, Jiebo Luo, Andrew White

FLIT(+) can align the local training across heterogeneous clients by improving the performance for uncertain samples.

Federated Learning

MVP-BERT: Multi-Vocab Pre-training for Chinese BERT

no code implementations ACL 2021 Wei Zhu

Although the development of pre-trained language models (PLMs) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary (vocab) for these Chinese PLMs remains the one provided by Google Chinese BERT, which is based on Chinese characters (chars).

Chinese Word Segmentation Language Modelling

LeeBERT: Learned Early Exit for BERT with cross-level optimization

no code implementations ACL 2021 Wei Zhu

In this work, to improve efficiency without performance drop, we propose a novel training scheme called Learned Early Exit for BERT (LeeBERT).

Discovering Better Model Architectures for Medical Query Understanding

no code implementations NAACL 2021 Wei Zhu, Yuan Ni, Xiaoling Wang, Guotong Xie

In developing an online question-answering system for the medical domains, natural language inference (NLI) models play a central role in question matching and intention detection.

Natural Language Inference Neural Architecture Search +2

Temperature dependent coherence properties of NV ensemble in diamond up to 600K

no code implementations 25 Feb 2021 Shengran Lin, Changfeng Weng, Yuanjie Yang, Jiaxin Zhao, Yuhang Guo, Jian Zhang, Liren Lou, Wei Zhu, Guanzhong Wang

The nitrogen-vacancy (NV) center in diamond is an ideal candidate for quantum sensors because of its excellent optical and coherence properties.

Quantum Physics Mesoscale and Nanoscale Physics

The 'COVID' Crash of the 2020 U.S. Stock Market

no code implementations 10 Jan 2021 Min Shu, Ruiqiang Song, Wei Zhu

We employed the log-periodic power law singularity (LPPLS) methodology to systematically investigate the 2020 stock market crash in the U.S. equities sectors with different levels of total market capitalizations through four major U.S. stock market indexes, including the Wilshire 5000 Total Market index, the S&P 500 index, the S&P MidCap 400 index, and the Russell 2000 index, representing the stocks overall, the large capitalization stocks, the middle capitalization stocks and the small capitalization stocks, respectively.

Lex-BERT: Enhancing BERT based NER with lexicons

no code implementations 2 Jan 2021 Wei Zhu, Daniel Cheung

In this work, we present Lex-BERT, which incorporates the lexicon information into Chinese BERT for named entity recognition (NER) tasks in a natural manner.

named-entity-recognition Named Entity Recognition +3

The 2020 Global Stock Market Crash: Endogenous or Exogenous?

no code implementations 1 Jan 2021 Ruiqiang Song, Min Shu, Wei Zhu

Starting on February 20, 2020, the global stock markets began to suffer the worst decline since the Great Recession in 2008, and COVID-19 has been widely blamed for the stock market crashes.

Time Series Analysis

CMV-BERT: Contrastive multi-vocab pretraining of BERT

no code implementations 29 Dec 2020 Wei Zhu, Daniel Cheung

In this work, we present CMV-BERT, which improves the pretraining of a language model via two ingredients: (a) contrastive learning, which is well studied in the area of computer vision; (b) multiple vocabularies, one of which is fine-grained and the other is coarse-grained.

Contrastive Learning Language Modelling +1

MVP-BERT: Redesigning Vocabularies for Chinese BERT and Multi-Vocab Pretraining

no code implementations 17 Nov 2020 Wei Zhu

Although the development of pre-trained language models (PLMs) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary for these Chinese PLMs remains the one provided by Google Chinese BERT (Devlin et al., 2018), which is based on Chinese characters.

Chinese Word Segmentation Language Modelling +1

Precision-Recall Curve (PRC) Classification Trees

no code implementations 15 Nov 2020 Jiaju Miao, Wei Zhu

Our algorithm, named the "Precision-Recall Curve classification tree", or simply the "PRC classification tree", modifies two crucial stages in tree building.

Classification Fraud Detection +2

AutoTrans: Automating Transformer Design via Reinforced Architecture Search

3 code implementations 4 Sep 2020 Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie

Though the transformer architectures have shown dominance in many natural language understanding tasks, there are still unsolved issues for the training of transformer models, especially the need for a principled way of warm-up, which has shown importance for stable training of a transformer, as well as whether the task at hand prefers to scale the attention product or not.

Natural Language Understanding Navigate

Personalized Fashion Recommendation from Personal Social Media Data: An Item-to-Set Metric Learning Approach

no code implementations 25 May 2020 Haitian Zheng, Kefei Wu, Jong-Hwi Park, Wei Zhu, Jiebo Luo

In this work, we study the problem of personalized fashion recommendation from social media data, i.e., recommending new outfits to social media users that fit their fashion preferences.

Metric Learning

Alleviating the Incompatibility between Cross Entropy Loss and Episode Training for Few-shot Skin Disease Classification

no code implementations 21 Apr 2020 Wei Zhu, Haofu Liao, Wenbin Li, Weijian Li, Jiebo Luo

Inspired by the recent success of Few-Shot Learning (FSL) in natural image classification, we propose to apply FSL to skin disease identification to address the problem of extremely scarce training samples.

Few-Shot Learning General Classification +2

Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning

no code implementations 7 Apr 2020 Pengzhan Guo, Zeyang Ye, Keli Xiao, Wei Zhu

Following a theoretical analysis on the characteristics of the new objective function, WASGD introduces a decentralized weighted aggregating scheme based on the performance of local workers.

Stochastic Optimization

Pingan Smart Health and SJTU at COIN - Shared Task: utilizing Pre-trained Language Models and Common-sense Knowledge in Machine Reading Tasks

no code implementations WS 2019 Xiepeng Li, Zhexi Zhang, Wei Zhu, Zheng Li, Yuan Ni, Peng Gao, Junchi Yan, Guotong Xie

We have experimented with both (a) improving the fine-tuning of pre-trained language models on a task with a small dataset size, by leveraging datasets of similar tasks; and (b) incorporating the distributional representations of a KG into the representations of pre-trained language models, via simple concatenation or multi-head attention.

Common Sense Reasoning Machine Reading Comprehension +2

Scale-Equivariant Neural Networks with Decomposed Convolutional Filters

no code implementations 25 Sep 2019 Wei Zhu, Qiang Qiu, Robert Calderbank, Guillermo Sapiro, Xiuyuan Cheng

Encoding the input scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many vision tasks especially when dealing with multiscale input signals.

Image Classification

Scaling-Translation-Equivariant Networks with Decomposed Convolutional Filters

no code implementations 24 Sep 2019 Wei Zhu, Qiang Qiu, Robert Calderbank, Guillermo Sapiro, Xiuyuan Cheng

Encoding the scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many computer vision tasks especially when dealing with multiscale inputs.

Image Classification Translation

PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation

no code implementations WS 2019 Wei Zhu, Xiaofeng Zhou, Keqiang Wang, Xun Luo, Xiepeng Li, Yuan Ni, Guotong Xie

Transfer learning from the NLI task to the RQE task is also explored and proves useful in improving the results of fine-tuning MT-DNN large.

Knowledge Distillation Re-Ranking +1

Extension of Rough Set Based on Positive Transitive Relation

no code implementations 7 Jun 2019 Min Shu, Wei Zhu

The new model retains the merits of the existing rough set extension models while avoiding their limitation of discarding transitivity or symmetry.

Relation

Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

1 code implementation 23 Sep 2018 Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher

We improve the robustness of Deep Neural Net (DNN) to adversarial attacks by using an interpolating function as the output activation.

Adversarial Attack Adversarial Defense +1

Circular dichroism in high-order harmonic generation: Heralding topological phases and transitions in Chern insulators

1 code implementation 4 Jul 2018 Alexis Chacón, Dasol Kim, Wei Zhu, Shane P. Kelly, Alexandre Dauphin, Emilio Pisanty, Andrew S. Maxwell, Antonio Picón, Marcelo F. Ciappina, Dong Eon Kim, Christopher Ticknor, Avadh Saxena, Maciej Lewenstein

Topological materials are of interest to both fundamental science and advanced technologies, because topological states are robust with respect to perturbations and dissipation.

Mesoscale and Nanoscale Physics Quantum Physics

Stop memorizing: A data-dependent regularization framework for intrinsic pattern learning

no code implementations ICLR 2019 Wei Zhu, Qiang Qiu, Bao Wang, Jianfeng Lu, Guillermo Sapiro, Ingrid Daubechies

Deep neural networks (DNNs) typically have enough capacity to fit random data by brute force even when conventional data-dependent regularizations focusing on the geometry of the features are imposed.

Deep Neural Nets with Interpolating Function as Output Activation

1 code implementation NeurIPS 2018 Bao Wang, Xiyang Luo, Zhen Li, Wei Zhu, Zuoqiang Shi, Stanley J. Osher

We replace the output layer of deep neural nets, typically the softmax function, by a novel interpolating function.

Multi-appearance Segmentation and Extended 0-1 Program for Dense Small Object Tracking

no code implementations 14 Dec 2017 Longtao Chen, Jing Lou, Wei Zhu, Qingyuan Xia, Mingwu Ren

Aiming to address fast multi-object tracking for dense small objects in the cluster background, we review track-oriented multi-hypothesis tracking (TOMHT) with consideration of batch optimization.

Management Multi-Object Tracking +1

LDMNet: Low Dimensional Manifold Regularized Neural Networks

no code implementations CVPR 2018 Wei Zhu, Qiang Qiu, Jiaji Huang, Robert Calderbank, Guillermo Sapiro, Ingrid Daubechies

To resolve this, we propose a new framework, the Low-Dimensional-Manifold-regularized neural Network (LDMNet), which incorporates a feature regularization method that focuses on the geometry of both the input data and the output features.

Face Recognition Small Data Image Classification

A convergence framework for inexact nonconvex and nonsmooth algorithms and its applications to several iterations

no code implementations 12 Sep 2017 Tao Sun, Hao Jiang, Li-Zhi Cheng, Wei Zhu

In fact, a lot of classical inexact nonconvex and nonsmooth algorithms allow these three conditions.

Iteratively Linearized Reweighted Alternating Direction Method of Multipliers for a Class of Nonconvex Problems

no code implementations 1 Sep 2017 Tao Sun, Hao Jiang, Lizhi Cheng, Wei Zhu

The traditional alternating direction method of multipliers encounters troubles in both mathematics and computations in solving the nonconvex and nonsmooth subproblem.

Exploiting Color Name Space for Salient Object Detection

no code implementations 27 Mar 2017 Jing Lou, Huan Wang, Longtao Chen, Fenglei Xu, Qingyuan Xia, Wei Zhu, Mingwu Ren

In this paper, we will investigate the contribution of color names for the task of salient object detection.

Object object-detection +2

Scalable low dimensional manifold model in the reconstruction of noisy and incomplete hyperspectral images

no code implementations 18 May 2016 Wei Zhu, Zuoqiang Shi, Stanley Osher

We present a scalable low dimensional manifold model for the reconstruction of noisy and incomplete hyperspectral images.

RCR: Robust Compound Regression for Robust Estimation of Errors-in-Variables Model

no code implementations 12 Aug 2015 Hao Han, Wei Zhu

The errors-in-variables (EIV) regression model, being more realistic by accounting for measurement errors in both the dependent and the independent variables, is widely adopted in applied sciences.

regression
