Search Results for author: Zhouhan Lin

Found 44 papers, 32 papers with code

Towards Biologically Plausible Deep Learning

no code implementations • 14 Feb 2015 • Yoshua Bengio, Dong-Hyun Lee, Jorg Bornschein, Thomas Mesnard, Zhouhan Lin

Neuroscientists have long criticised deep learning algorithms as incompatible with current knowledge of neurobiology.

Denoising Representation Learning

Paper
Add Code

Neural Networks with Few Multiplications

2 code implementations • 11 Oct 2015 • Zhouhan Lin, Matthieu Courbariaux, Roland Memisevic, Yoshua Bengio

For most deep learning algorithms training is notoriously time consuming.

General Classification

373

Paper
Code

How far can we go without convolution: Improving fully-connected networks

5 code implementations • 9 Nov 2015 • Zhouhan Lin, Roland Memisevic, Kishore Konda

We propose ways to improve the performance of fully connected networks.

General Classification Unsupervised Pre-training

841

Paper
Code

Spectral-Spatial Classification of Hyperspectral Image Using Autoencoders

1 code implementation • 9 Nov 2015 • Zhouhan Lin, Yushi Chen, Xing Zhao, Gang Wang

Hyperspectral image (HSI) classification is a hot topic in the remote sensing community.

Classification General Classification

Paper
Code

Architectural Complexity Measures of Recurrent Neural Networks

no code implementations • NeurIPS 2016 • Saizheng Zhang, Yuhuai Wu, Tong Che, Zhouhan Lin, Roland Memisevic, Ruslan Salakhutdinov, Yoshua Bengio

In this paper, we systematically analyze the connecting architectures of recurrent neural networks (RNNs).

Ranked #23 on Language Modelling on Text8

Language Modelling

Paper
Add Code

Theano: A Python framework for fast computation of mathematical expressions

1 code implementation • 9 May 2016 • The Theano Development Team, Rami Al-Rfou, Guillaume Alain, Amjad Almahairi, Christof Angermueller, Dzmitry Bahdanau, Nicolas Ballas, Frédéric Bastien, Justin Bayer, Anatoly Belikov, Alexander Belopolsky, Yoshua Bengio, Arnaud Bergeron, James Bergstra, Valentin Bisson, Josh Bleecher Snyder, Nicolas Bouchard, Nicolas Boulanger-Lewandowski, Xavier Bouthillier, Alexandre de Brébisson, Olivier Breuleux, Pierre-Luc Carrier, Kyunghyun Cho, Jan Chorowski, Paul Christiano, Tim Cooijmans, Marc-Alexandre Côté, Myriam Côté, Aaron Courville, Yann N. Dauphin, Olivier Delalleau, Julien Demouth, Guillaume Desjardins, Sander Dieleman, Laurent Dinh, Mélanie Ducoffe, Vincent Dumoulin, Samira Ebrahimi Kahou, Dumitru Erhan, Ziye Fan, Orhan Firat, Mathieu Germain, Xavier Glorot, Ian Goodfellow, Matt Graham, Caglar Gulcehre, Philippe Hamel, Iban Harlouchet, Jean-Philippe Heng, Balázs Hidasi, Sina Honari, Arjun Jain, Sébastien Jean, Kai Jia, Mikhail Korobov, Vivek Kulkarni, Alex Lamb, Pascal Lamblin, Eric Larsen, César Laurent, Sean Lee, Simon Lefrancois, Simon Lemieux, Nicholas Léonard, Zhouhan Lin, Jesse A. Livezey, Cory Lorenz, Jeremiah Lowin, Qianli Ma, Pierre-Antoine Manzagol, Olivier Mastropietro, Robert T. McGibbon, Roland Memisevic, Bart van Merriënboer, Vincent Michalski, Mehdi Mirza, Alberto Orlandi, Christopher Pal, Razvan Pascanu, Mohammad Pezeshki, Colin Raffel, Daniel Renshaw, Matthew Rocklin, Adriana Romero, Markus Roth, Peter Sadowski, John Salvatier, François Savard, Jan Schlüter, John Schulman, Gabriel Schwartz, Iulian Vlad Serban, Dmitriy Serdyuk, Samira Shabanian, Étienne Simon, Sigurd Spieckermann, S. Ramana Subramanyam, Jakub Sygnowski, Jérémie Tanguay, Gijs van Tulder, Joseph Turian, Sebastian Urban, Pascal Vincent, Francesco Visin, Harm de Vries, David Warde-Farley, Dustin J. Webb, Matthew Willson, Kelvin Xu, Lijun Xue, Li Yao, Saizheng Zhang, Ying Zhang

Since its introduction, it has been one of the most used CPU and GPU mathematical compilers - especially in the machine learning community - and has shown steady performance improvements.

BIG-bench Machine Learning Clustering +2

9,852

Paper
Code

Recurrent Neural Networks With Limited Numerical Precision

1 code implementation • 24 Aug 2016 • Joachim Ott, Zhouhan Lin, Ying Zhang, Shih-Chii Liu, Yoshua Bengio

We present results from the use of different stochastic and deterministic reduced precision training methods applied to three major RNN types which are then tested on several datasets.

Binarization

Paper
Code

Recurrent Neural Networks With Limited Numerical Precision

1 code implementation • 21 Nov 2016 • Joachim Ott, Zhouhan Lin, Ying Zhang, Shih-Chii Liu, Yoshua Bengio

Recurrent Neural Networks (RNNs) produce state-of-art performance on many machine learning tasks but their demand on resources in terms of memory and computational power are often high.

Quantization

Paper
Code

A Structured Self-attentive Sentence Embedding

52 code implementations • 9 Mar 2017 • Zhouhan Lin, Minwei Feng, Cicero Nogueira dos santos, Mo Yu, Bing Xiang, Bo-Wen Zhou, Yoshua Bengio

This paper proposes a new model for extracting an interpretable sentence embedding by introducing self-attention.

General Classification Natural Language Inference +5

8,437

Paper
Code

A Deep Reinforcement Learning Chatbot

no code implementations • 7 Sep 2017 • Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeshwar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio

By applying reinforcement learning to crowdsourced data and real-world user interactions, the system has been trained to select an appropriate response from the models in its ensemble.

Chatbot reinforcement-learning +3

Paper
Add Code

Neural Language Modeling by Jointly Learning Syntax and Lexicon

1 code implementation • ICLR 2018 • Yikang Shen, Zhouhan Lin, Chin-wei Huang, Aaron Courville

In this paper, We propose a novel neural language model, called the Parsing-Reading-Predict Networks (PRPN), that can simultaneously induce the syntactic structure from unannotated sentences and leverage the inferred structure to learn a better language model.

Ranked #13 on Constituency Grammar Induction on PTB Diagnostic ECG Database (Max F1 (WSJ) metric)

Constituency Grammar Induction Language Modelling

Paper
Code

A Deep Reinforcement Learning Chatbot (Short Version)

no code implementations • 20 Jan 2018 • Iulian V. Serban, Chinnadhurai Sankar, Mathieu Germain, Saizheng Zhang, Zhouhan Lin, Sandeep Subramanian, Taesup Kim, Michael Pieper, Sarath Chandar, Nan Rosemary Ke, Sai Rajeswar, Alexandre de Brebisson, Jose M. R. Sotelo, Dendi Suhubdy, Vincent Michalski, Alexandre Nguyen, Joelle Pineau, Yoshua Bengio

We present MILABOT: a deep reinforcement learning chatbot developed by the Montreal Institute for Learning Algorithms (MILA) for the Amazon Alexa Prize competition.

Chatbot reinforcement-learning +3

Paper
Add Code

Straight to the Tree: Constituency Parsing with Neural Syntactic Distance

2 code implementations • ACL 2018 • Yikang Shen, Zhouhan Lin, Athul Paul Jacob, Alessandro Sordoni, Aaron Courville, Yoshua Bengio

In this work, we propose a novel constituency parsing scheme.

Constituency Parsing Position +1

Paper
Code

Focused Hierarchical RNNs for Conditional Sequence Processing

no code implementations • ICML 2018 • Nan Rosemary Ke, Konrad Zolna, Alessandro Sordoni, Zhouhan Lin, Adam Trischler, Yoshua Bengio, Joelle Pineau, Laurent Charlin, Chris Pal

We evaluate this method on several types of tasks with different attributes.

Ranked #3 on Open-Domain Question Answering on SearchQA (Unigram Acc metric)

Open-Domain Question Answering Policy Gradient Methods

Paper
Add Code

Learning Hierarchical Structures On-The-Fly with a Recurrent-Recursive Model for Sequences

no code implementations • WS 2018 • Athul Paul Jacob, Zhouhan Lin, Aless Sordoni, ro, Yoshua Bengio

We propose a hierarchical model for sequential data that learns a tree on-the-fly, i. e. while reading the sequence.

Language Modelling Math +3

Paper
Add Code

Interactive Language Learning by Question Answering

1 code implementation • IJCNLP 2019 • Xingdi Yuan, Marc-Alexandre Cote, Jie Fu, Zhouhan Lin, Christopher Pal, Yoshua Bengio, Adam Trischler

In QAit, an agent must interact with a partially observable text-based environment to gather information required to answer questions.

Machine Reading Comprehension Question Answering

Paper
Code

Ordered Memory

1 code implementation • NeurIPS 2019 • Yikang Shen, Shawn Tan, Arian Hosseini, Zhouhan Lin, Alessandro Sordoni, Aaron Courville

Inspired by Ordered Neurons (Shen et al., 2018), we introduce a new attention-based mechanism and use its cumulative probability to control the writing and erasing operation of the memory.

ListOps

Paper
Code

Revisit Systematic Generalization via Meaningful Learning

2 code implementations • 14 Mar 2020 • Ning Shi, Boxin Wang, Wei Wang, Xiangyu Liu, Zhouhan Lin

Humans can systematically generalize to novel compositions of existing concepts.

Data Augmentation Machine Translation +3

Paper
Code

Exploiting Syntactic Structure for Better Language Modeling: A Syntactic Distance Approach

1 code implementation • ACL 2020 • Wenyu Du, Zhouhan Lin, Yikang Shen, Timothy J. O'Donnell, Yoshua Bengio, Yue Zhang

It is commonly believed that knowledge of syntactic structure should improve language modeling.

Language Modelling

Paper
Code

Reciprocal Supervised Learning Improves Neural Machine Translation

1 code implementation • 5 Dec 2020 • Minkai Xu, Mingxuan Wang, Zhouhan Lin, Hao Zhou, Weinan Zhang, Lei LI

Despite the recent success on image classification, self-training has only achieved limited gains on structured prediction tasks such as neural machine translation (NMT).

Image Classification Knowledge Distillation +4

Paper
Code

Annotation Inconsistency and Entity Bias in MultiWOZ

no code implementations • SIGDIAL (ACL) 2021 • Kun Qian, Ahmad Beirami, Zhouhan Lin, Ankita De, Alborz Geramifard, Zhou Yu, Chinnadhurai Sankar

In this work, we identify an overlooked issue with dialog state annotation inconsistencies in the dataset, where a slot type is tagged inconsistently across similar dialogs leading to confusion for DST modeling.

dialog state tracking Memorization +1

Paper
Add Code

Incorporating External POS Tagger for Punctuation Restoration

1 code implementation • 12 Jun 2021 • Ning Shi, Wei Wang, Boxin Wang, Jinfeng Li, Xiangyu Liu, Zhouhan Lin

Punctuation restoration is an important post-processing step in automatic speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

Paper
Code

Block-Skim: Efficient Question Answering for Transformer

1 code implementation • 16 Dec 2021 • Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu

We further prune the hidden states corresponding to the unnecessary positions early in lower layers, achieving significant inference-time speedup.

Extractive Question-Answering Question Answering

Paper
Code

Leveraging Unimodal Self-Supervised Learning for Multimodal Audio-Visual Speech Recognition

1 code implementation • ACL 2022 • Xichen Pan, Peiyu Chen, Yichen Gong, Helong Zhou, Xinbing Wang, Zhouhan Lin

In particular, audio and visual front-ends are trained on large-scale unimodal datasets, then we integrate components of both front-ends into a larger multimodal framework which learns to recognize parallel audio-visual data into characters through a combination of CTC and seq2seq decoding.

Ranked #2 on Automatic Speech Recognition (ASR) on LRS2

Audio-Visual Speech Recognition Automatic Speech Recognition (ASR) +7

Paper
Code

RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL

1 code implementation • 14 May 2022 • Jiexing Qi, Jingyao Tang, Ziwei He, Xiangpeng Wan, Yu Cheng, Chenghu Zhou, Xinbing Wang, Quanshi Zhang, Zhouhan Lin

Our model can incorporate almost all types of existing relations in the literature, and in addition, we propose introducing co-reference relations for the multi-turn scenario.

Ranked #1 on Dialogue State Tracking on CoSQL

Dialogue State Tracking Semantic Parsing +1

Paper
Code

Transkimmer: Transformer Learns to Layer-wise Skim

1 code implementation • ACL 2022 • Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo

To address the above limitations, we propose the Transkimmer architecture, which learns to identify hidden state tokens that are not required by each layer.

Computational Efficiency

Paper
Code

INFINITY: A Simple Yet Effective Unsupervised Framework for Graph-Text Mutual Conversion

no code implementations • 22 Sep 2022 • Yi Xu, Luoyi Fu, Zhouhan Lin, Jiexing Qi, Xinbing Wang

As a fully unsupervised framework, INFINITY is empirically verified to outperform state-of-the-art baselines for G2T and T2G tasks.

Knowledge Graphs

Paper
Add Code

Text Editing as Imitation Game

1 code implementation • 21 Oct 2022 • Ning Shi, Bin Tang, Bo Yuan, Longtao Huang, Yewen Pu, Jie Fu, Zhouhan Lin

Text editing, such as grammatical error correction, arises naturally from imperfect textual data.

Action Generation Grammatical Error Correction +1

Paper
Code

Syntax-guided Localized Self-attention by Constituency Syntactic Distance

1 code implementation • 21 Oct 2022 • Shengyuan Hou, Jushi Kai, Haotian Xue, Bingyu Zhu, Bo Yuan, Longtao Huang, Xinbing Wang, Zhouhan Lin

Recent works have revealed that Transformers are implicitly learning the syntactic information in its lower layers from data, albeit is highly dependent on the quality and scale of the training data.

Machine Translation Translation

Paper
Code

Ordered GNN: Ordering Message Passing to Deal with Heterophily and Over-smoothing

1 code implementation • 3 Feb 2023 • Yunchong Song, Chenghu Zhou, Xinbing Wang, Zhouhan Lin

This is achieved by aligning the hierarchy of the rooted-tree of a central node with the ordered neurons in its node representation.

Ranked #4 on Node Classification on Cornell

Node Classification

Paper
Code

Few-Shot Table-to-Text Generation with Prompt Planning and Knowledge Memorization

no code implementations • 9 Feb 2023 • Zhixin Guo, Minyxuan Yan, Jiexing Qi, Jianping Zhou, Ziwei He, Zhouhan Lin, Guanjie Zheng, Xinbing Wang

The design of our framework consists of two aspects: a prompt planner and a knowledge adapter.

Memorization Table-to-Text Generation

Paper
Add Code

Text Classification in the Wild: a Large-scale Long-tailed Name Normalization Dataset

1 code implementation • 19 Feb 2023 • Jiexing Qi, Shuhao Li, Zhixin Guo, Yusheng Huang, Chenghu Zhou, Weinan Zhang, Xinbing Wang, Zhouhan Lin

In this work, we first collect a large-scale institution name normalization dataset LoT-insts1, which contains over 25k classes that exhibit a naturally long-tailed distribution.

Ranked #1 on Long-tail Learning on Lot-insts

Long-tail Learning open-set classification +4

Paper
Code

Asymmetric Polynomial Loss For Multi-Label Classification

1 code implementation • 10 Apr 2023 • Yusheng Huang, Jiexing Qi, Xinbing Wang, Zhouhan Lin

We further employ the asymmetric focusing mechanism to decouple the gradient contribution from the negative and positive samples.

Image Classification Multi-Label Classification +3

Paper
Code

PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning

1 code implementation • 23 May 2023 • Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, BoWen Zhou

While large language models (LLMs) excel in various natural language processing tasks, their huge size and the inaccessibility of parameters present challenges for practical deployment.

Arithmetic Reasoning GSM8K +1

Paper
Code

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator

1 code implementation • 24 May 2023 • Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, Jingwen Leng, Zhouhan Lin

Many researchers have focused on designing new forms of self-attention or introducing new parameters to overcome this limitation, however a large portion of them prohibits the model to inherit weights from large pretrained models.

Ranked #1 on Open-Domain Question Answering on ELI5

Abstractive Text Summarization Document Summarization +2

Paper
Code

K2: A Foundation Language Model for Geoscience Knowledge Understanding and Utilization

1 code implementation • 8 Jun 2023 • Cheng Deng, Tianhang Zhang, Zhongmou He, Yi Xu, Qiyuan Chen, Yuanyuan Shi, Luoyi Fu, Weinan Zhang, Xinbing Wang, Chenghu Zhou, Zhouhan Lin, Junxian He

Large language models (LLMs) have achieved great success in general domains of natural language processing.

Language Modelling

141

Paper
Code

Document-Level Relation Extraction with Relation Correlation Enhancement

1 code implementation • 6 Oct 2023 • Yusheng Huang, Zhouhan Lin

Document-level relation extraction (DocRE) is a task that focuses on identifying relations between entities within a document.

Document-level Relation Extraction Graph Attention +1

Paper
Code

I2SRM: Intra- and Inter-Sample Relationship Modeling for Multimodal Information Extraction

1 code implementation • 10 Oct 2023 • Yusheng Huang, Zhouhan Lin

Multimodal information extraction is attracting research attention nowadays, which requires aggregating representations from different modalities.

named-entity-recognition Named Entity Recognition +1

Paper
Code

Human-Readable Fingerprint for Large Language Models

no code implementations • 8 Dec 2023 • Boyi Zeng, Chenghu Zhou, Xinbing Wang, Zhouhan Lin

However, identifying the original base model of an LLM is challenging due to potential parameter alterations.

Paper
Add Code

Towards Controlled Table-to-Text Generation with Scientific Reasoning

no code implementations • 8 Dec 2023 • Zhixin Guo, Jianping Zhou, Jiexing Qi, Mingxuan Yan, Ziwei He, Guanjie Zheng, Zhouhan Lin, Xinbing Wang, Chenghu Zhou

The sheer volume of scientific experimental results and complex technical statements, often presented in tabular formats, presents a formidable barrier to individuals acquiring preferred information.

Table-to-Text Generation

Paper
Add Code

GeoGalactica: A Scientific Large Language Model in Geoscience

1 code implementation • 31 Dec 2023 • Zhouhan Lin, Cheng Deng, Le Zhou, Tianhang Zhang, Yi Xu, Yutong Xu, Zhongmou He, Yuanyuan Shi, Beiya Dai, Yunchong Song, Boyi Zeng, Qiyuan Chen, Yuxun Miao, Bo Xue, Shu Wang, Luoyi Fu, Weinan Zhang, Junxian He, Yunqiang Zhu, Xinbing Wang, Chenghu Zhou

To our best knowledge, it is the largest language model for the geoscience domain.

Document Classification General Knowledge +4

Paper
Code

SH2: Self-Highlighted Hesitation Helps You Decode More Truthfully

1 code implementation • 11 Jan 2024 • Jushi Kai, Hai Hu, Zhouhan Lin

Therefore, we propose to ''highlight'' the factual information by selecting the tokens with the lowest probabilities and concatenating them to the original context, thus forcing the model to repeatedly read and hesitate on these tokens before generation.

Hallucination Text Generation

Paper
Code

Critical Data Size of Language Models from a Grokking Perspective

no code implementations • 19 Jan 2024 • Xuekai Zhu, Yao Fu, BoWen Zhou, Zhouhan Lin

We formalize the phase transition under the grokking configuration into the Data Efficiency Hypothesis and identify data insufficiency, sufficiency, and surplus regimes in language models training dynamics.

Language Modelling Memorization

Paper
Add Code

Graph Parsing Networks

1 code implementation • 22 Feb 2024 • Yunchong Song, Siyuan Huang, Xinbing Wang, Chenghu Zhou, Zhouhan Lin

GPN benefits from the discrete assignments generated by the graph parsing algorithm, allowing good memory efficiency while preserving node information intact.

Graph Classification Graph Reconstruction +2

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.