Search Results for author: Wei Zhu

Found 54 papers, 11 papers with code

PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting

1 code implementation Findings (NAACL) 2022 Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung

In this work, we propose Patient and Confident Early Exiting BERT (PCEE-BERT), an off-the-shelf sample-dependent early exiting method that can work with different PLMs and can also work along with popular model compression methods.
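
The snippet above does not spell out the exit rule, but the name suggests combining a patience criterion with a confidence criterion. A minimal sketch of such a rule, assuming one internal classifier per layer (names and thresholds here are hypothetical, not the authors' implementation):

    import torch

    def pcee_exit_layer(layer_logits, conf_threshold=0.9, patience=2):
        # Return the first layer index at which `patience` consecutive internal
        # classifiers are all confident (max softmax probability >= threshold);
        # fall back to the last layer otherwise. Hypothetical sketch only.
        consecutive = 0
        for i, logits in enumerate(layer_logits):
            confidence = torch.softmax(logits, dim=-1).max().item()
            consecutive = consecutive + 1 if confidence >= conf_threshold else 0
            if consecutive >= patience:
                return i                        # exit early at layer i
        return len(layer_logits) - 1            # no early exit: run the full model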

Model Compression Pretrained Language Models

GAML-BERT: Improving BERT Early Exiting by Gradient Aligned Mutual Learning

no code implementations EMNLP 2021 Wei Zhu, Xiaoling Wang, Yuan Ni, Guotong Xie

From this observation, we use mutual learning to improve BERT’s early exiting performances, that is, we ask each exit of a multi-exit BERT to distill knowledge from each other.
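
A minimal sketch of the mutual-learning idea described here, with every exit distilling from every other exit; the gradient-alignment weighting named in the paper's title is deliberately omitted, and all names are illustrative:

    import torch.nn.functional as F

    def mutual_learning_loss(exit_logits, labels, alpha=1.0):
        # Cross-entropy for every exit plus symmetric KL terms so each exit
        # distills from every other exit; the paper's gradient-alignment
        # weighting is not reproduced in this sketch.
        ce = sum(F.cross_entropy(logits, labels) for logits in exit_logits)
        kl = 0.0
        for i, student in enumerate(exit_logits):
            for j, teacher in enumerate(exit_logits):
                if i != j:
                    kl = kl + F.kl_div(F.log_softmax(student, dim=-1),
                                       F.softmax(teacher.detach(), dim=-1),
                                       reduction="batchmean")
        return ce + alpha * kl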

Knowledge Distillation

Improving Micro-video Recommendation by Controlling Position Bias

no code implementations9 Aug 2022 Yisong Yu, Beihong Jin, Jiageng Song, Beibei Li, Yiyuan Zheng, Wei Zhu

Although micro-video recommendation can naturally be treated as sequential recommendation, previous sequential recommendation models do not fully consider the characteristics of micro-video apps, and the role of positions in their inductive biases does not accord with the reality of the micro-video scenario.

Contrastive Learning Sequential Recommendation

Remote Medication Status Prediction for Individuals with Parkinson's Disease using Time-series Data from Smartphones

no code implementations26 Jul 2022 Weijian Li, Wei Zhu, Ray Dorsey, Jiebo Luo

Medication for neurological diseases such as Parkinson's disease usually takes place remotely at home, away from hospitals.

Time Series

Explored An Effective Methodology for Fine-Grained Snake Recognition

1 code implementation24 Jul 2022 Yong Huang, Aderon Huang, Wei Zhu, Yanming Fang, Jinghua Feng

Then, in order to take full advantage of unlabeled datasets, we jointly train with self-supervised and supervised learning to provide a pre-trained model.

Fine-Grained Image Classification Self-Supervised Learning

MAGIC: Microlensing Analysis Guided by Intelligent Computation

1 code implementation16 Jun 2022 Haimeng Zhao, Wei Zhu

The key feature of MAGIC is the introduction of neural controlled differential equations, which provide the capability to handle light curves with irregular sampling and large data gaps.
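
To see why a controlled differential equation copes with irregular sampling, note that the hidden state is driven by the increments of the observation path itself. A toy Euler discretization (an illustrative sketch, not the MAGIC code; time is usually appended as an extra channel of the path):

    import torch
    import torch.nn as nn

    class CDEFunc(nn.Module):
        # Maps the hidden state z to a (hidden_dim x input_dim) matrix, as in
        # neural controlled differential equations dz = f(z) dX.
        def __init__(self, input_dim, hidden_dim):
            super().__init__()
            self.input_dim, self.hidden_dim = input_dim, hidden_dim
            self.net = nn.Sequential(nn.Linear(hidden_dim, 64), nn.Tanh(),
                                     nn.Linear(64, hidden_dim * input_dim))

        def forward(self, z):
            return self.net(z).view(-1, self.hidden_dim, self.input_dim)

    def euler_cde(func, z0, path):
        # Euler steps z <- z + f(z) dX over an observation path of shape
        # (batch, length, input_dim); irregular spacing and data gaps enter
        # naturally through the increments dX.
        z = z0
        for k in range(1, path.shape[1]):
            dX = path[:, k] - path[:, k - 1]
            z = z + torch.bmm(func(z), dX.unsqueeze(-1)).squeeze(-1)
        return z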

Time Series

Localized Adversarial Domain Generalization

1 code implementation CVPR 2022 Wei Zhu, Le Lu, Jing Xiao, Mei Han, Jiebo Luo, Adam P. Harrison

Adversarial domain generalization is a popular approach to DG, but conventional approaches (1) struggle to sufficiently align features so that local neighborhoods are mixed across domains; and (2) can suffer from feature-space over-collapse, which can threaten generalization performance.

Domain Generalization

Deep Federated Anomaly Detection for Multivariate Time Series Data

no code implementations9 May 2022 Wei Zhu, Dongjin Song, Yuncong Chen, Wei Cheng, Bo Zong, Takehiko Mizoguchi, Cristian Lumezanu, Haifeng Chen, Jiebo Luo

Specifically, we first design an Exemplar-based Deep Neural network (ExDNN) to learn local time series representations based on their compatibility with an exemplar module which consists of hidden parameters learned to capture varieties of normal patterns on each edge device.
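
One simple way to picture an exemplar module of this kind (purely illustrative; the actual ExDNN architecture is not specified in this snippet) is a learned bank of vectors, with a window scored by its distance to the closest exemplar:

    import torch
    import torch.nn as nn

    class ExemplarScorer(nn.Module):
        # Illustrative only: a learned bank of exemplar vectors meant to capture
        # normal patterns; a window embedding is scored by its distance to the
        # closest exemplar (larger distance = more anomalous).
        def __init__(self, num_exemplars, dim):
            super().__init__()
            self.exemplars = nn.Parameter(torch.randn(num_exemplars, dim))

        def forward(self, window_embedding):                  # (batch, dim)
            dists = torch.cdist(window_embedding, self.exemplars)
            return dists.min(dim=1).values                    # anomaly score per window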

Federated Learning Time Series +1

Continuous Detection, Rapidly React: Unseen Rumors Detection based on Continual Prompt-Tuning

1 code implementation16 Mar 2022 Yuhui Zuo, Wei Zhu, Guoyong Cai

Since open social platforms allow for a large and continuous flow of unverified information, rumors can emerge unexpectedly and spread quickly.

Transfer Learning

A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation

1 code implementation Findings (ACL) 2022 Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu

Previous works usually adopt heuristic metrics such as the entropy of internal outputs to measure instance difficulty, which suffer from poor generalization and the need for threshold tuning.
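
For context, the entropy-based heuristic criticized here looks roughly like the following (a sketch of that baseline, not of the paper's hash-based method, which assigns each instance to an exit in advance):

    import torch

    def entropy_exit(layer_logits, entropy_threshold=0.2):
        # Exit at the first internal classifier whose predictive entropy falls
        # below a tuned threshold; otherwise run the full model.
        for i, logits in enumerate(layer_logits):
            probs = torch.softmax(logits, dim=-1)
            entropy = -(probs * probs.clamp_min(1e-12).log()).sum()
            if entropy.item() < entropy_threshold:
                return i
        return len(layer_logits) - 1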

Structure-preserving GANs

no code implementations2 Feb 2022 Jeremiah Birrell, Markos A. Katsoulakis, Luc Rey-Bellet, Wei Zhu

Generative adversarial networks (GANs), a class of distribution-learning methods based on a two-player game between a generator and a discriminator, can generally be formulated as a minmax problem based on the variational representation of a divergence between the unknown and the generated distributions.
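
Written out in one common instance (the f-divergence case obtained from the variational, convex-conjugate representation; stated here for context, not taken from the abstract), the minmax problem reads

    \min_{\theta} \; \sup_{T} \; \Big\{ \mathbb{E}_{x \sim P}\big[T(x)\big] \;-\; \mathbb{E}_{x \sim Q_{\theta}}\big[f^{*}\big(T(x)\big)\big] \Big\},

where P is the unknown data distribution, Q_theta the generated distribution, T the discriminator, and f^{*} the convex conjugate defining the divergence.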

Deformation Robust Roto-Scale-Translation Equivariant CNNs

no code implementations22 Nov 2021 Liyao Gao, Guang Lin, Wei Zhu

Incorporating group symmetry directly into the learning process has proved to be an effective guideline for model design.

Out-of-Distribution Generalization Translation

Route Optimization via Environment-Aware Deep Network and Reinforcement Learning

no code implementations16 Nov 2021 Pengzhan Guo, Keli Xiao, Zeyang Ye, Wei Zhu

Vehicle mobility optimization in urban areas is a long-standing problem in smart city and spatial data analysis.

Decision Making reinforcement-learning +1

Learning to Aggregate and Refine Noisy Labels for Visual Sentiment Analysis

no code implementations15 Sep 2021 Wei Zhu, Zihe Zheng, Haitian Zheng, Hanjia Lyu, Jiebo Luo

The learned prototypes and their labels can be regarded as denoising features and labels for the local regions and can guide the training process to prevent the model from overfitting the noisy cases.

Denoising Learning with noisy labels +1

Federated Learning of Molecular Properties with Graph Neural Networks in a Heterogeneous Setting

no code implementations15 Sep 2021 Wei Zhu, Jiebo Luo, Andrew White

FLIT(+) can align the local training across heterogeneous clients by improving the performance for uncertain samples.

Federated Learning

LeeBERT: Learned Early Exit for BERT with cross-level optimization

no code implementations ACL 2021 Wei Zhu

In this work, to improve efficiency without performance drop, we propose a novel training scheme called Learned Early Exit for BERT (LeeBERT).

MVP-BERT: Multi-Vocab Pre-training for Chinese BERT

no code implementations ACL 2021 Wei Zhu

Although the development of pre-trained language models (PLMs) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary (vocab) for these Chinese PLMs remains the one provided by Google's Chinese BERT (CITATION), which is based on Chinese characters (chars).

Chinese Word Segmentation Language Modelling

Discovering Better Model Architectures for Medical Query Understanding

no code implementations NAACL 2021 Wei Zhu, Yuan Ni, Xiaoling Wang, Guotong Xie

In developing an online question-answering system for the medical domains, natural language inference (NLI) models play a central role in question matching and intention detection.

Natural Language Inference Neural Architecture Search +1

Temperature dependent coherence properties of NV ensemble in diamond up to 600K

no code implementations25 Feb 2021 Shengran Lin, Changfeng Weng, Yuanjie Yang, Jiaxin Zhao, Yuhang Guo, Jian Zhang, Liren Lou, Wei Zhu, Guanzhong Wang

The nitrogen-vacancy (NV) center in diamond is an ideal candidate for quantum sensors because of its excellent optical and coherence properties.

Quantum Physics Mesoscale and Nanoscale Physics

The 'COVID' Crash of the 2020 U.S. Stock Market

no code implementations10 Jan 2021 Min Shu, Ruiqiang Song, Wei Zhu

We employed the log-periodic power law singularity (LPPLS) methodology to systematically investigate the 2020 stock market crash in the U.S. equities sectors with different levels of total market capitalizations through four major U.S. stock market indexes, including the Wilshire 5000 Total Market index, the S&P 500 index, the S&P MidCap 400 index, and the Russell 2000 index, representing the stocks overall, the large capitalization stocks, the middle capitalization stocks and the small capitalization stocks, respectively.
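
For reference, the standard LPPLS specification fitted in this type of analysis models the expected log-price as

    \ln \mathbb{E}[p(t)] = A + B\,(t_c - t)^{m} + C\,(t_c - t)^{m}\cos\!\big(\omega \ln(t_c - t) - \phi\big),

where t_c is the critical (crash) time, m the power-law exponent, omega the log-periodic angular frequency, and phi a phase; this is the well-known Sornette form, stated here for context rather than taken from the abstract.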

Lex-BERT: Enhancing BERT based NER with lexicons

no code implementations2 Jan 2021 Wei Zhu, Daniel Cheung

In this work, we present Lex-BERT, which incorporates lexicon information into Chinese BERT for named entity recognition (NER) tasks in a natural manner.

named-entity-recognition NER +1

The 2020 Global Stock Market Crash: Endogenous or Exogenous?

no code implementations1 Jan 2021 Ruiqiang Song, Min Shu, Wei Zhu

Starting on February 20, 2020, global stock markets began to suffer their worst decline since the Great Recession in 2008, and COVID-19 has been widely blamed for the stock market crashes.

Time Series

CMV-BERT: Contrastive multi-vocab pretraining of BERT

no code implementations29 Dec 2020 Wei Zhu, Daniel Cheung

In this work, we present CMV-BERT, which improves the pretraining of a language model via two ingredients: (a) contrastive learning, which is well studied in the area of computer vision; (b) multiple vocabularies, one of which is fine-grained and the other coarse-grained.

Contrastive Learning Language Modelling +1

MVP-BERT: Redesigning Vocabularies for Chinese BERT and Multi-Vocab Pretraining

no code implementations17 Nov 2020 Wei Zhu

Although the development of pre-trained language models (PLMs) has significantly raised the performance of various Chinese natural language processing (NLP) tasks, the vocabulary for these Chinese PLMs remains the one provided by Google's Chinese BERT (Devlin et al., 2018), which is based on Chinese characters.

Chinese Word Segmentation Language Modelling

Precision-Recall Curve (PRC) Classification Trees

no code implementations15 Nov 2020 Jiaju Miao, Wei Zhu

Our algorithm, named the "Precision-Recall Curve classification tree", or simply the "PRC classification tree", modifies two crucial stages in tree building.

Classification Fraud Detection +2

AutoTrans: Automating Transformer Design via Reinforced Architecture Search

3 code implementations4 Sep 2020 Wei Zhu, Xiaoling Wang, Xipeng Qiu, Yuan Ni, Guotong Xie

Though transformer architectures have shown dominance in many natural language understanding tasks, there are still unsolved issues in training transformer models, especially the need for a principled warm-up scheme, which has proven important for stable training, and whether the task at hand prefers to scale the attention product or not.

Natural Language Understanding Navigate

Personalized Fashion Recommendation from Personal Social Media Data: An Item-to-Set Metric Learning Approach

no code implementations25 May 2020 Haitian Zheng, Kefei Wu, Jong-Hwi Park, Wei Zhu, Jiebo Luo

In this work, we study the problem of personalized fashion recommendation from social media data, i.e., recommending new outfits to social media users that fit their fashion preferences.

Metric Learning

Alleviating the Incompatibility between Cross Entropy Loss and Episode Training for Few-shot Skin Disease Classification

no code implementations21 Apr 2020 Wei Zhu, Haofu Liao, Wenbin Li, Weijian Li, Jiebo Luo

Inspired by the recent success of Few-Shot Learning (FSL) in natural image classification, we propose to apply FSL to skin disease identification to address the extreme scarcity of training sample problem.

Few-Shot Learning General Classification +2

Weighted Aggregating Stochastic Gradient Descent for Parallel Deep Learning

no code implementations7 Apr 2020 Pengzhan Guo, Zeyang Ye, Keli Xiao, Wei Zhu

Following a theoretical analysis on the characteristics of the new objective function, WASGD introduces a decentralized weighted aggregating scheme based on the performance of local workers.
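
The snippet does not give the exact weighting rule, but a performance-based aggregation of local workers can be pictured as follows (a hypothetical sketch only; the weights, temperature, and names are illustrative, not WASGD's actual scheme):

    import torch

    def weighted_aggregate(worker_params, worker_losses, temperature=1.0):
        # Workers with lower local loss receive larger weights (softmax over
        # negative losses), and the parameter vectors are averaged with those
        # weights.
        losses = torch.tensor(worker_losses, dtype=torch.float32)
        weights = torch.softmax(-losses / temperature, dim=0)
        stacked = torch.stack(list(worker_params))        # (num_workers, num_params)
        return (weights.unsqueeze(1) * stacked).sum(dim=0)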

Stochastic Optimization

Pingan Smart Health and SJTU at COIN - Shared Task: utilizing Pre-trained Language Models and Common-sense Knowledge in Machine Reading Tasks

no code implementations WS 2019 Xiepeng Li, Zhexi Zhang, Wei Zhu, Zheng Li, Yuan Ni, Peng Gao, Junchi Yan, Guotong Xie

We have experimented with both (a) improving the fine-tuning of pre-trained language models on a task with a small dataset size by leveraging datasets of similar tasks; and (b) incorporating the distributional representations of a KG into the representations of pre-trained language models, via simple concatenation or multi-head attention.
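
The simpler of the two fusion options mentioned, concatenation, can be sketched as follows (dimensions and names are assumptions for illustration):

    import torch
    import torch.nn as nn

    class ConcatFusion(nn.Module):
        # Concatenate a KG entity embedding to the PLM representation and
        # project back to the PLM hidden size. Dimensions are illustrative.
        def __init__(self, plm_dim=768, kg_dim=100):
            super().__init__()
            self.proj = nn.Linear(plm_dim + kg_dim, plm_dim)

        def forward(self, plm_repr, kg_repr):
            return self.proj(torch.cat([plm_repr, kg_repr], dim=-1))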

Common Sense Reasoning Machine Reading Comprehension +1

Scale-Equivariant Neural Networks with Decomposed Convolutional Filters

no code implementations25 Sep 2019 Wei Zhu, Qiang Qiu, Robert Calderbank, Guillermo Sapiro, Xiuyuan Cheng

Encoding the input scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many vision tasks especially when dealing with multiscale input signals.

Image Classification

Scaling-Translation-Equivariant Networks with Decomposed Convolutional Filters

no code implementations24 Sep 2019 Wei Zhu, Qiang Qiu, Robert Calderbank, Guillermo Sapiro, Xiuyuan Cheng

Encoding the scale information explicitly into the representation learned by a convolutional neural network (CNN) is beneficial for many computer vision tasks especially when dealing with multiscale inputs.

Image Classification Translation

PANLP at MEDIQA 2019: Pre-trained Language Models, Transfer Learning and Knowledge Distillation

no code implementations WS 2019 Wei Zhu, Xiaofeng Zhou, Keqiang Wang, Xun Luo, Xiepeng Li, Yuan Ni, Guotong Xie

Transfer learning from the NLI task to the RQE task is also experimented with, and proves useful in improving the results of fine-tuning MT-DNN large.

Knowledge Distillation Re-Ranking +1

Extension of Rough Set Based on Positive Transitive Relation

no code implementations7 Jun 2019 Min Shu, Wei Zhu

The new model retains the merits of the existing rough set extension models while avoiding their limitation of discarding transitivity or symmetry.

Adversarial Defense via Data Dependent Activation Function and Total Variation Minimization

1 code implementation23 Sep 2018 Bao Wang, Alex T. Lin, Wei Zhu, Penghang Yin, Andrea L. Bertozzi, Stanley J. Osher

We improve the robustness of Deep Neural Net (DNN) to adversarial attacks by using an interpolating function as the output activation.

Adversarial Attack Adversarial Defense +1

Circular dichroism in high-order harmonic generation: Heralding topological phases and transitions in Chern insulators

1 code implementation4 Jul 2018 Alexis Chacón, Dasol Kim, Wei Zhu, Shane P. Kelly, Alexandre Dauphin, Emilio Pisanty, Andrew S. Maxwell, Antonio Picón, Marcelo F. Ciappina, Dong Eon Kim, Christopher Ticknor, Avadh Saxena, Maciej Lewenstein

Topological materials are of interest to both fundamental science and advanced technologies, because topological states are robust with respect to perturbations and dissipation.

Mesoscale and Nanoscale Physics Quantum Physics

Stop memorizing: A data-dependent regularization framework for intrinsic pattern learning

no code implementations ICLR 2019 Wei Zhu, Qiang Qiu, Bao Wang, Jianfeng Lu, Guillermo Sapiro, Ingrid Daubechies

Deep neural networks (DNNs) typically have enough capacity to fit random data by brute force even when conventional data-dependent regularizations focusing on the geometry of the features are imposed.

Deep Neural Nets with Interpolating Function as Output Activation

1 code implementation NeurIPS 2018 Bao Wang, Xiyang Luo, Zhen Li, Wei Zhu, Zuoqiang Shi, Stanley J. Osher

We replace the output layer of deep neural nets, typically the softmax function, by a novel interpolating function.
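
As a toy illustration of replacing softmax with an interpolating output (the paper's novel interpolating function is not reproduced here; the kernel-weighted nearest-template version below is only the simplest illustrative instance of interpolating from labeled template features):

    import torch
    import torch.nn.functional as F

    def interpolating_output(query_feats, template_feats, template_labels,
                             num_classes, bandwidth=1.0):
        # Class scores for each query feature are a kernel-weighted average of
        # the one-hot labels of labeled "template" features, instead of a
        # parametric softmax head.
        one_hot = F.one_hot(template_labels, num_classes).float()
        dists = torch.cdist(query_feats, template_feats)       # (Q, T)
        weights = torch.softmax(-dists / bandwidth, dim=1)     # kernel weights
        return weights @ one_hot                               # (Q, num_classes)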

Multi-appearance Segmentation and Extended 0-1 Program for Dense Small Object Tracking

no code implementations14 Dec 2017 Longtao Chen, Jing Lou, Wei Zhu, Qingyuan Xia, Mingwu Ren

Aiming to address fast multi-object tracking for dense small objects against cluttered backgrounds, we review track-oriented multi-hypothesis tracking (TOMHT) with consideration of batch optimization.

Management Multi-Object Tracking

LDMNet: Low Dimensional Manifold Regularized Neural Networks

no code implementations CVPR 2018 Wei Zhu, Qiang Qiu, Jiaji Huang, Robert Calderbank, Guillermo Sapiro, Ingrid Daubechies

To resolve this, we propose a new framework, the Low-Dimensional-Manifold-regularized neural Network (LDMNet), which incorporates a feature regularization method that focuses on the geometry of both the input data and the output features.

Face Recognition Small Data Image Classification

A convergence framework for inexact nonconvex and nonsmooth algorithms and its applications to several iterations

no code implementations12 Sep 2017 Tao Sun, Hao Jiang, Li-Zhi Cheng, Wei Zhu

In fact, many classical inexact nonconvex and nonsmooth algorithms satisfy these three conditions.

Iteratively Linearized Reweighted Alternating Direction Method of Multipliers for a Class of Nonconvex Problems

no code implementations1 Sep 2017 Tao Sun, Hao Jiang, Lizhi Cheng, Wei Zhu

The traditional alternating direction method of multipliers encounters difficulties, both mathematical and computational, in solving the nonconvex and nonsmooth subproblem.

Scalable low dimensional manifold model in the reconstruction of noisy and incomplete hyperspectral images

no code implementations18 May 2016 Wei Zhu, Zuoqiang Shi, Stanley Osher

We present a scalable low dimensional manifold model for the reconstruction of noisy and incomplete hyperspectral images.

RCR: Robust Compound Regression for Robust Estimation of Errors-in-Variables Model

no code implementations12 Aug 2015 Hao Han, Wei Zhu

The errors-in-variables (EIV) regression model, being more realistic by accounting for measurement errors in both the dependent and the independent variables, is widely adopted in applied sciences.
