Search Results for author: Zhouhan Lin

Found 36 papers, 26 papers with code

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator

1 code implementation • 24 May 2023 • Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, Jingwen Leng, Zhouhan Lin

Many researchers have focused on designing new forms of self-attention or introducing new parameters to overcome this limitation; however, a large portion of these approaches prevent the model from inheriting weights from large pretrained models.

Abstractive Text Summarization, Document Summarization, +2

Asymmetric Polynomial Loss For Multi-Label Classification

1 code implementation • 10 Apr 2023 • Yusheng Huang, Jiexing Qi, Xinbing Wang, Zhouhan Lin

We further employ the asymmetric focusing mechanism to decouple the gradient contribution from the negative and positive samples.

Image Classification, Multi-Label Classification, +3

Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset

1 code implementation • 19 Feb 2023 • Jiexing Qi, Shuhao Li, Zhixin Guo, Yusheng Huang, Chenghu Zhou, Weinan Zhang, Xinbing Wang, Zhouhan Lin

In this work, we first collect a large-scale institution name normalization dataset, LoT-insts, which contains over 25k classes that exhibit a naturally long-tailed distribution.

Long-tail Learning, Out-of-Distribution Generalization, +3

Ordered GNN: Ordering Message Passing to Deal with Heterophily and Over-smoothing

1 code implementation • 3 Feb 2023 • Yunchong Song, Chenghu Zhou, Xinbing Wang, Zhouhan Lin

This is achieved by aligning the hierarchy of the rooted tree of a central node with the ordered neurons in its node representation.

Node Classification

Text Editing as Imitation Game

1 code implementation • 21 Oct 2022 • Ning Shi, Bin Tang, Bo Yuan, Longtao Huang, Yewen Pu, Jie Fu, Zhouhan Lin

Text editing, such as grammatical error correction, arises naturally from imperfect textual data.

Action Generation, Grammatical Error Correction, +1

Syntax-guided Localized Self-attention by Constituency Syntactic Distance

1 code implementation • 21 Oct 2022 • Shengyuan Hou, Jushi Kai, Haotian Xue, Bingyu Zhu, Bo Yuan, Longtao Huang, Xinbing Wang, Zhouhan Lin

Recent work has revealed that Transformers implicitly learn syntactic information in their lower layers from data, although this is highly dependent on the quality and scale of the training data.

Machine Translation, Translation

INFINITY: A Simple Yet Effective Unsupervised Framework for Graph-Text Mutual Conversion

no code implementations • 22 Sep 2022 • Yi Xu, Luoyi Fu, Zhouhan Lin, Jiexing Qi, Xinbing Wang

As a fully unsupervised framework, INFINITY is empirically verified to outperform state-of-the-art baselines for G2T and T2G tasks.

Knowledge Graphs

Transkimmer: Transformer Learns to Layer-wise Skim

1 code implementation • ACL 2022 • Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo

To address the above limitations, we propose the Transkimmer architecture, which learns to identify hidden state tokens that are not required by each layer.

RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL

1 code implementation • 14 May 2022 • Jiexing Qi, Jingyao Tang, Ziwei He, Xiangpeng Wan, Yu Cheng, Chenghu Zhou, Xinbing Wang, Quanshi Zhang, Zhouhan Lin

Our model can incorporate almost all types of existing relations in the literature, and in addition, we propose introducing co-reference relations for the multi-turn scenario.

Dialogue State Tracking, Semantic Parsing, +1

Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

1 code implementation • ACL 2022 • Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin

In particular, audio and visual front-ends are trained on large-scale unimodal datasets; we then integrate components of both front-ends into a larger multimodal framework that learns to transcribe parallel audio-visual data into characters through a combination of CTC and seq2seq decoding.

Audio-Visual Speech Recognition, Automatic Speech Recognition (ASR), +6

Block-Skim: Efficient Question Answering for Transformer

1 code implementation • 16 Dec 2021 • Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu

We further prune the hidden states corresponding to the unnecessary positions early in lower layers, achieving significant inference-time speedup.

Extractive Question-Answering, Question Answering

Annotation Inconsistency and Entity Bias in MultiWOZ

no code implementations • SIGDIAL (ACL) 2021 • Kun Qian, Ahmad Beirami, Zhouhan Lin, Ankita De, Alborz Geramifard, Zhou Yu, Chinnadhurai Sankar

In this work, we identify an overlooked issue with dialog state annotation inconsistencies in the dataset, where a slot type is tagged inconsistently across similar dialogs, leading to confusion for DST modeling.

Dialog State Tracking, Memorization, +1

Reciprocal Supervised Learning Improves Neural Machine Translation

1 code implementation • 5 Dec 2020 • Minkai Xu, Mingxuan Wang, Zhouhan Lin, Hao Zhou, Weinan Zhang, Lei Li

Despite its recent success in image classification, self-training has achieved only limited gains on structured prediction tasks such as neural machine translation (NMT).

Image Classification, Knowledge Distillation, +4

Ordered Memory

1 code implementation • NeurIPS 2019 • Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron Courville

Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operations of the memory.


Neural Language Modeling by Jointly Learning Syntax and Lexicon

1 code implementation • ICLR 2018 • Yikang Shen, Zhouhan Lin, Chin-wei Huang, Aaron Courville

In this paper, we propose a novel neural language model, called Parsing-Reading-Predict Networks (PRPN), that can simultaneously induce the syntactic structure from unannotated sentences and leverage the inferred structure to learn a better language model.

Constituency Grammar Induction, Language Modelling

Recurrent Neural Networks With Limited Numerical Precision

1 code implementation • 21 Nov 2016 • Joachim Ott, Zhouhan Lin, Ying Zhang, Shih-Chii Liu, Yoshua Bengio

Recurrent Neural Networks (RNNs) produce state-of-the-art performance on many machine learning tasks, but their demands on memory and computational power are often high.


Recurrent Neural Networks With Limited Numerical Precision

1 code implementation • 24 Aug 2016 • Joachim Ott, Zhouhan Lin, Ying Zhang, Shih-Chii Liu, Yoshua Bengio

We present results from applying different stochastic and deterministic reduced-precision training methods to three major RNN types, which are then tested on several datasets.


Theano: A Python framework for fast computation of mathematical expressions

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, Theano has been one of the most used CPU and GPU mathematical compilers, especially in the machine learning community, and has shown steady performance improvements.

BIG-bench Machine Learning, Clustering, +2

Towards Biologically Plausible Deep Learning

no code implementations • 14 Feb 2015 • Yoshua Bengio, Dong-Hyun Lee, Jorg Bornschein, Thomas Mesnard, Zhouhan Lin

Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology.

Denoising, Representation Learning
