Search Results for author: Yin-Wen Chang

Found 9 papers, 1 paper with code

Leveraging redundancy in attention with Reuse Transformers

1 code implementation • 13 Oct 2021 • Srinadh Bhojanapalli, Ayan Chakrabarti, Andreas Veit, Michal Lukasik, Himanshu Jain, Frederick Liu, Yin-Wen Chang, Sanjiv Kumar

Pairwise dot product-based attention allows Transformers to exchange information between tokens in an input-dependent way, and is key to their success across diverse applications in language and vision.
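
For context, a minimal NumPy sketch of the pairwise dot-product attention the snippet refers to (the standard scaled dot-product formulation; names are illustrative, not taken from the paper):

```python
import numpy as np

def dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention over one sequence.

    Q, K, V: (n, d) arrays of query, key, and value vectors.
    Returns an (n, d) array in which every output token is an
    input-dependent weighted mix of all value vectors.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # pairwise dot products, (n, n)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V
```

Per its title, the paper exploits redundancy by reusing such attention computations; the sketch above shows only the baseline score computation, not the reuse mechanism.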

A Simple and Effective Positional Encoding for Transformers

no code implementations • EMNLP 2021 • Pu-Chin Chen, Henry Tsai, Srinadh Bhojanapalli, Hyung Won Chung, Yin-Wen Chang, Chun-Sung Ferng

Our analysis shows that the gain actually comes from moving positional information from the input to the attention layer.

Position
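
To make the distinction concrete, a hedged sketch contrasting the two placements of positional information (hypothetical parameterization; the paper's exact formulation may differ):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attn_positions_at_input(X, pos_emb, Wq, Wk, Wv):
    # Absolute positions are added to the token embeddings before
    # the attention projections -- the conventional placement.
    H = X + pos_emb
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def attn_positions_in_scores(X, pos_bias, Wq, Wk, Wv):
    # Positional information enters only as an (n, n) bias on the
    # pairwise attention scores, i.e., inside the attention layer.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1]) + pos_bias
    return softmax(scores) @ V
```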

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

no code implementations • NeurIPS 2020 • Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

We propose sufficient conditions under which we prove that a sparse attention model can universally approximate any sequence-to-sequence function.
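
As a rough illustration of what O(n) connections can look like, a sketch of a sparse attention mask with a local window plus one global token, so the number of attended pairs grows linearly in sequence length (an illustrative pattern, not necessarily the sparsity analyzed in the paper):

```python
import numpy as np

def sparse_attention_mask(n, window=2):
    """Boolean (n, n) mask: True where attention is allowed.

    Each token attends to itself, a local window of +/- `window`
    neighbors, and token 0 as a global connector, so the number of
    allowed pairs is O(n) rather than the dense O(n^2).
    """
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        mask[i, max(0, i - window):min(n, i + window + 1)] = True  # local window
        mask[i, 0] = True                                          # global token
    return mask

# Disallowed scores would be set to -inf before the softmax so that
# masked pairs receive zero attention weight.
```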

Pre-training Tasks for Embedding-based Large-scale Retrieval

no code implementations • ICLR 2020 • Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar

We consider the large-scale query-document retrieval problem: given a query (e.g., a question), return the set of relevant documents (e.g., paragraphs containing the answer) from a large document corpus.

Information Retrieval • Link Prediction • +1
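
A minimal sketch of the embedding-based (dual-encoder) retrieval setup the snippet describes; the stub encoder below is a stand-in for the pre-trained Transformer towers studied in the paper:

```python
import numpy as np

def encode(texts, dim=64, seed=0):
    """Stand-in encoder mapping each text to a unit-norm embedding.

    A real system would use a pre-trained Transformer tower here;
    random projections of hashed tokens merely keep the sketch
    self-contained and runnable.
    """
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((2**16, dim))
    vecs = np.stack([
        basis[[hash(w) % 2**16 for w in t.lower().split()]].mean(axis=0)
        for t in texts
    ])
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Embed the corpus once offline, then score queries by inner product.
docs = ["a paragraph containing the answer", "an unrelated paragraph"]
doc_emb = encode(docs)
query_emb = encode(["a question"])
scores = query_emb @ doc_emb.T            # (num_queries, num_docs)
best = docs[int(scores.argmax())]         # most relevant document
```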

Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms

no code implementations • EMNLP 2017 • Yin-Wen Chang, Michael Collins

The algorithm produces a translation by processing the source-language sentence in strictly left-to-right order, differing from commonly used approaches that build the target-language sentence in left-to-right order.

Machine Translation • Sentence • +1

A Polynomial-Time Dynamic Programming Algorithm for Phrase-Based Decoding with a Fixed Distortion Limit

no code implementations • TACL 2017 • Yin-Wen Chang, Michael Collins

Decoding of phrase-based translation models in the general case is known to be NP-complete, by a reduction from the traveling salesman problem (Knight, 1999).

Machine Translation • Sentence • +2
