Search Results for author: Yin-Wen Chang

Found 9 papers, 1 paper with code

Leveraging redundancy in attention with Reuse Transformers

1 code implementation • 13 Oct 2021 • Srinadh Bhojanapalli, Ayan Chakrabarti, Andreas Veit, Michal Lukasik, Himanshu Jain, Frederick Liu, Yin-Wen Chang, Sanjiv Kumar

Pairwise dot product-based attention allows Transformers to exchange information between tokens in an input-dependent way, and is key to their success across diverse applications in language and vision.
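
For context, a minimal NumPy sketch of the pairwise dot-product attention the snippet refers to (the standard scaled dot-product formulation; names are illustrative, not taken from the paper):

```python
import numpy as np

def dot_product_attention(Q, K, V):
    """Standard scaled dot-product attention over one sequence.

    Q, K, V: (n, d) arrays of query, key, and value vectors.
    Returns an (n, d) array in which every output token is an
    input-dependent weighted mix of all value vectors.
    """
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                    # pairwise dot products, (n, n)
    scores -= scores.max(axis=-1, keepdims=True)     # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over keys
    return weights @ V
```

Per its title, the paper exploits redundancy by reusing such attention computations; the sketch above shows only the baseline score computation, not the reuse mechanism.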

A Simple and Effective Positional Encoding for Transformers

no code implementations • EMNLP 2021 • Pu-Chin Chen, Henry Tsai, Srinadh Bhojanapalli, Hyung Won Chung, Yin-Wen Chang, Chun-Sung Ferng

Our analysis shows that the gain actually comes from moving positional information from the input to the attention layer.

Position
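
To make the distinction concrete, a hedged sketch contrasting the two placements of positional information (hypothetical parameterization; the paper's exact formulation may differ):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def attn_positions_at_input(X, pos_emb, Wq, Wk, Wv):
    # Absolute positions are added to the token embeddings before
    # the attention projections -- the conventional placement.
    H = X + pos_emb
    Q, K, V = H @ Wq, H @ Wk, H @ Wv
    return softmax(Q @ K.T / np.sqrt(Q.shape[-1])) @ V

def attn_positions_in_scores(X, pos_bias, Wq, Wk, Wv):
    # Positional information enters only as an (n, n) bias on the
    # pairwise attention scores, i.e., inside the attention layer.
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1]) + pos_bias
    return softmax(scores) @ V
```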

O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

no code implementations • NeurIPS 2020 • Chulhee Yun, Yin-Wen Chang, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

We propose sufficient conditions under which we prove that a sparse attention model can universally approximate any sequence-to-sequence function.
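
As a rough illustration of what O(n) connections can look like, a sketch of a sparse attention mask with a local window plus one global token, so the number of attended pairs grows linearly in sequence length (an illustrative pattern, not necessarily the sparsity analyzed in the paper):

```python
import numpy as np

def sparse_attention_mask(n, window=2):
    """Boolean (n, n) mask: True where attention is allowed.

    Each token attends to itself, a local window of +/- `window`
    neighbors, and token 0 as a global connector, so the number of
    allowed pairs is O(n) rather than the dense O(n^2).
    """
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        mask[i, max(0, i - window):min(n, i + window + 1)] = True  # local window
        mask[i, 0] = True                                          # global token
    return mask

# Disallowed scores would be set to -inf before the softmax so that
# masked pairs receive zero attention weight.
```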

Pre-training Tasks for Embedding-based Large-scale Retrieval

no code implementations • ICLR 2020 • Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar

We consider the large-scale query-document retrieval problem: given a query (e.g., a question), return the set of relevant documents (e.g., paragraphs containing the answer) from a large document corpus.

Information Retrieval • Link Prediction • +1
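
A minimal sketch of the embedding-based (dual-encoder) retrieval setup the snippet describes; the stub encoder below is a stand-in for the pre-trained Transformer towers studied in the paper:

```python
import numpy as np

def encode(texts, dim=64, seed=0):
    """Stand-in encoder mapping each text to a unit-norm embedding.

    A real system would use a pre-trained Transformer tower here;
    random projections of hashed tokens merely keep the sketch
    self-contained and runnable.
    """
    rng = np.random.default_rng(seed)
    basis = rng.standard_normal((2**16, dim))
    vecs = np.stack([
        basis[[hash(w) % 2**16 for w in t.lower().split()]].mean(axis=0)
        for t in texts
    ])
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

# Embed the corpus once offline, then score queries by inner product.
docs = ["a paragraph containing the answer", "an unrelated paragraph"]
doc_emb = encode(docs)
query_emb = encode(["a question"])
scores = query_emb @ doc_emb.T            # (num_queries, num_docs)
best = docs[int(scores.argmax())]         # most relevant document
```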

Source-Side Left-to-Right or Target-Side Left-to-Right? An Empirical Comparison of Two Phrase-Based Decoding Algorithms

no code implementations • EMNLP 2017 • Yin-Wen Chang, Michael Collins

The algorithm produces a translation by processing the source-language sentence in strictly left-to-right order, differing from commonly used approaches that build the target-language sentence in left-to-right order.

Machine Translation • Sentence • +1

A Polynomial-Time Dynamic Programming Algorithm for Phrase-Based Decoding with a Fixed Distortion Limit

no code implementations • TACL 2017 • Yin-Wen Chang, Michael Collins

Decoding of phrase-based translation models in the general case is known to be NP-complete, by a reduction from the traveling salesman problem (Knight, 1999).

Machine Translation • Sentence • +2
