Search Results for author: Haoyue Shi

Found 15 papers, 4 papers with code

Grammar-Based Grounded Lexicon Learning

no code implementations NeurIPS 2021 Jiayuan Mao, Haoyue Shi, Jiajun Wu, Roger P. Levy, Joshua B. Tenenbaum

We present Grammar-Based Grounded Lexicon Learning (G2L2), a lexicalist approach toward learning a compositional and grounded meaning representation of language from grounded data, such as paired images and texts.

Network Embedding Sentence +1

NL-Augmenter: A Framework for Task-Sensitive Natural Language Augmentation

2 code implementations 6 Dec 2021 Kaustubh D. Dhole, Varun Gangal, Sebastian Gehrmann, Aadesh Gupta, Zhenhao Li, Saad Mahamood, Abinaya Mahendiran, Simon Mille, Ashish Shrivastava, Samson Tan, Tongshuang Wu, Jascha Sohl-Dickstein, Jinho D. Choi, Eduard Hovy, Ondrej Dusek, Sebastian Ruder, Sajant Anand, Nagender Aneja, Rabin Banjade, Lisa Barthe, Hanna Behnke, Ian Berlot-Attwell, Connor Boyle, Caroline Brun, Marco Antonio Sobrevilla Cabezudo, Samuel Cahyawijaya, Emile Chapuis, Wanxiang Che, Mukund Choudhary, Christian Clauss, Pierre Colombo, Filip Cornell, Gautier Dagan, Mayukh Das, Tanay Dixit, Thomas Dopierre, Paul-Alexis Dray, Suchitra Dubey, Tatiana Ekeinhor, Marco Di Giovanni, Tanya Goyal, Rishabh Gupta, Louanes Hamla, Sang Han, Fabrice Harel-Canada, Antoine Honore, Ishan Jindal, Przemyslaw K. Joniak, Denis Kleyko, Venelin Kovatchev, Kalpesh Krishna, Ashutosh Kumar, Stefan Langer, Seungjae Ryan Lee, Corey James Levinson, Hualou Liang, Kaizhao Liang, Zhexiong Liu, Andrey Lukyanenko, Vukosi Marivate, Gerard de Melo, Simon Meoni, Maxime Meyer, Afnan Mir, Nafise Sadat Moosavi, Niklas Muennighoff, Timothy Sum Hon Mun, Kenton Murray, Marcin Namysl, Maria Obedkova, Priti Oli, Nivranshu Pasricha, Jan Pfister, Richard Plant, Vinay Prabhu, Vasile Pais, Libo Qin, Shahab Raji, Pawan Kumar Rajpoot, Vikas Raunak, Roy Rinberg, Nicolas Roberts, Juan Diego Rodriguez, Claude Roux, Vasconcellos P. H. S., Ananya B. Sai, Robin M. Schmidt, Thomas Scialom, Tshephisho Sefara, Saqib N. Shamsi, Xudong Shen, Haoyue Shi, Yiwen Shi, Anna Shvets, Nick Siegel, Damien Sileo, Jamie Simon, Chandan Singh, Roman Sitelew, Priyank Soni, Taylor Sorensen, William Soto, Aman Srivastava, KV Aditya Srivatsa, Tony Sun, Mukund Varma T, A Tabassum, Fiona Anting Tan, Ryan Teehan, Mo Tiwari, Marie Tolkiehn, Athena Wang, Zijian Wang, Gloria Wang, Zijie J. Wang, Fuxuan Wei, Bryan Wilie, Genta Indra Winata, Xinyi Wu, Witold Wydmański, Tianbao Xie, Usama Yaseen, Michael A. Yee, Jing Zhang, Yue Zhang

Data augmentation is an important component in the robustness evaluation of models in natural language processing (NLP) and in enhancing the diversity of the data they are trained on.

Data Augmentation

Substructure Distribution Projection for Zero-Shot Cross-Lingual Dependency Parsing

no code implementations ACL 2022 Haoyue Shi, Kevin Gimpel, Karen Livescu

We present substructure distribution projection (SubDP), a technique that projects a distribution over structures in one domain to another, by projecting substructure distributions separately.

Dependency Parsing
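
The projection step that the abstract names can be sketched as a matrix operation: push a source-side head distribution through a (soft) word-alignment matrix and renormalize. The function name and toy matrices below are illustrative, not the paper's exact formulation:

```python
import numpy as np

def project_arc_distribution(p_src, align):
    """Project a source-language arc distribution into the target language.

    p_src: (n_src, n_src) matrix; p_src[d, h] is the probability that
           source word d has head h.
    align: (n_tgt, n_src) soft word-alignment matrix, rows sum to 1.
    Returns an (n_tgt, n_tgt) arc distribution for the target sentence.
    """
    p_tgt = align @ p_src @ align.T             # push mass through the alignment
    p_tgt /= p_tgt.sum(axis=1, keepdims=True)   # renormalize each dependent's row
    return p_tgt

# Toy example: 2-word source sentence, 2-word target, identity alignment,
# so the projected distribution should equal the source distribution.
p_src = np.array([[0.1, 0.9],
                  [0.8, 0.2]])
align = np.eye(2)
p_tgt = project_arc_distribution(p_src, align)
```

With a non-trivial alignment the same code redistributes arc probability mass across differently segmented target words, which is the intuition behind projecting substructure distributions separately.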

Substructure Substitution: Structured Data Augmentation for NLP

no code implementations Findings (ACL) 2021 Haoyue Shi, Karen Livescu, Kevin Gimpel

We study a family of data augmentation methods, substructure substitution (SUB2), for natural language processing (NLP) tasks.

Data Augmentation Part-Of-Speech Tagging +2
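
In its simplest form, substructure substitution can be sketched with single tagged tokens as the substructures: swap a token from one sentence into another at a position carrying the same label. This is a toy sketch of the simplest variant, not the paper's full recipe (which also covers larger substructures such as constituents):

```python
import random

def sub2_augment(sent_a, sent_b, rng=random):
    """Create a new example by swapping same-label substructures.

    Each sentence is a list of (token, tag) pairs; here a 'substructure'
    is a single tagged token.
    """
    tags_b = {}
    for tok, tag in sent_b:
        tags_b.setdefault(tag, []).append(tok)
    # positions in sent_a whose tag also occurs somewhere in sent_b
    candidates = [i for i, (_, tag) in enumerate(sent_a) if tag in tags_b]
    if not candidates:
        return list(sent_a)
    i = rng.choice(candidates)
    tag = sent_a[i][1]
    augmented = list(sent_a)
    augmented[i] = (rng.choice(tags_b[tag]), tag)  # same tag, new token
    return augmented

sent_a = [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")]
sent_b = [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
aug = sub2_augment(sent_a, sent_b, rng=random.Random(0))
```

The augmented sentence keeps the original tag sequence, so the new example remains a valid supervised instance for tagging-style tasks.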

Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment

no code implementations ACL 2021 Haoyue Shi, Luke Zettlemoyer, Sida I. Wang

Bilingual lexicons map words in one language to their translations in another, and are typically induced by learning linear projections to align monolingual word embedding spaces.

Bilingual Lexicon Induction Word Alignment
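
The "linear projections to align monolingual word embedding spaces" mentioned in the abstract are commonly solved in closed form with orthogonal Procrustes. A minimal sketch, where the seed-dictionary setup and names are illustrative:

```python
import numpy as np

def procrustes_projection(src_emb, tgt_emb):
    """Solve for the orthogonal map W minimizing ||src_emb @ W - tgt_emb||_F.

    src_emb, tgt_emb: (n, d) embeddings for a seed dictionary of n
    translation pairs. Returns the (d, d) orthogonal matrix W.
    """
    u, _, vt = np.linalg.svd(src_emb.T @ tgt_emb)
    return u @ vt

# Toy check: if the target space is a rotation of the source space,
# Procrustes recovers that rotation exactly.
rng = np.random.default_rng(0)
x = rng.normal(size=(50, 4))
theta = np.pi / 6
rot = np.eye(4)
rot[:2, :2] = [[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]]
y = x @ rot
w = procrustes_projection(x, y)
```

In practice the projection is fit on a small bilingual seed lexicon and then used to retrieve nearest-neighbor translations for the remaining vocabulary.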

Deep Clustering of Text Representations for Supervision-free Probing of Syntax

no code implementations 24 Oct 2020 Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan


We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax.

Clustering Deep Clustering +1
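
A toy version of the pipeline the abstract describes: cluster contextual token vectors and read the cluster ids as induced categories. Plain k-means stands in here for the paper's clustering model; the blob data is an assumption for illustration:

```python
import numpy as np

def kmeans(vectors, k, iters=50):
    """Plain k-means; cluster ids play the role of induced syntactic
    categories (e.g., POS-like classes) for the token vectors."""
    # deterministic init: centers spread evenly over the input order
    idx = np.linspace(0, len(vectors) - 1, k).astype(int)
    centers = vectors[idx].copy()
    for _ in range(iters):
        dists = ((vectors[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for j in range(k):
            members = vectors[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return labels

# Two well-separated blobs standing in for two syntactic classes.
rng = np.random.default_rng(1)
reps = np.vstack([rng.normal(0.0, 0.1, size=(20, 8)),
                  rng.normal(5.0, 0.1, size=(20, 8))])
labels = kmeans(reps, k=2)
```

Because no labels are used anywhere, agreement between the induced clusters and gold syntactic categories serves as a supervision-free probe of what the representations encode.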

On the Role of Supervision in Unsupervised Constituency Parsing

no code implementations EMNLP 2020 Haoyue Shi, Karen Livescu, Kevin Gimpel

We analyze several recent unsupervised constituency parsing models, which are tuned with respect to the parsing $F_1$ score on the Wall Street Journal (WSJ) development set (1,700 sentences).

Constituency Parsing Data Augmentation +1
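
The parsing $F_1$ used for tuning is unlabeled bracketing F1 over constituent spans; a minimal sketch (the span conventions below are an assumption):

```python
def span_f1(pred_spans, gold_spans):
    """Unlabeled bracketing F1 between predicted and gold constituent
    spans, each given as a (start, end) token-index pair."""
    pred, gold = set(pred_spans), set(gold_spans)
    tp = len(pred & gold)  # spans appearing in both trees
    if tp == 0:
        return 0.0
    precision = tp / len(pred)
    recall = tp / len(gold)
    return 2 * precision * recall / (precision + recall)

# One correct span out of two predicted, against two gold spans:
# precision = recall = 0.5, so F1 = 0.5.
f1 = span_f1([(0, 2), (1, 3)], [(0, 2), (0, 3)])
```

The paper's point is that even this much tuning signal is a form of supervision, since it requires gold trees for the development sentences.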

A Cross-Task Analysis of Text Span Representations

1 code implementation WS 2020 Shubham Toshniwal, Haoyue Shi, Bowen Shi, Lingyu Gao, Karen Livescu, Kevin Gimpel

Many natural language processing (NLP) tasks involve reasoning with textual spans, including question answering, entity recognition, and coreference resolution.

coreference-resolution Question Answering
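
One of the simplest span representations compared in studies like this is endpoint concatenation; the sketch below assumes precomputed contextual token vectors:

```python
import numpy as np

def endpoint_span_rep(hidden, start, end):
    """A common span representation: concatenate the contextual vectors
    of the span's first and last tokens (its endpoints)."""
    return np.concatenate([hidden[start], hidden[end]])

# 4 tokens with 3-dimensional contextual vectors; span covers tokens 1..2.
hidden = np.arange(12, dtype=float).reshape(4, 3)
rep = endpoint_span_rep(hidden, 1, 2)
```

Other variants (mean or max pooling over the span, attention-weighted sums) plug into the same interface, which is what makes a cross-task comparison of span representations possible.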

Visually Grounded Neural Syntax Acquisition

no code implementations ACL 2019 Haoyue Shi, Jiayuan Mao, Kevin Gimpel, Karen Livescu

We define concreteness of constituents by their matching scores with images, and use it to guide the parsing of text.

Visual Grounding
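
The concreteness score described above can be illustrated with a cosine matching score between a constituent embedding and an image embedding; the two-dimensional vectors below are a toy assumption:

```python
import numpy as np

def matching_score(span_vec, image_vec):
    """Cosine similarity as a stand-in for a constituent-image matching
    score; higher scores mark more 'concrete' constituents."""
    return float(span_vec @ image_vec /
                 (np.linalg.norm(span_vec) * np.linalg.norm(image_vec)))

img = np.array([1.0, 0.0])
concrete = np.array([0.9, 0.1])   # e.g., an embedding for "a cat"
abstract = np.array([0.1, 0.9])   # e.g., an embedding for "very"
```

Constituents that score highly against the paired image are treated as more concrete, and that signal guides which brackets the parser prefers.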

Implicit Subjective and Sentimental Usages in Multi-sense Word Embeddings

no code implementations WS 2018 Yuqi Sun, Haoyue Shi, Junfeng Hu

In multi-sense word embeddings, contextual variations in a corpus may cause a univocal word to be embedded into different sense vectors.

TAG Word Embeddings

On Tree-Based Neural Sentence Modeling

1 code implementation EMNLP 2018 Haoyue Shi, Hao Zhou, Jiaze Chen, Lei Li

To study the effectiveness of different tree structures, we replace the parsing trees with trivial trees (i.e., binary balanced tree, left-branching tree, and right-branching tree) in the encoders.

Sentence Sentiment Analysis +1
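
The three trivial trees named in the abstract are easy to construct; a sketch using nested tuples as binary trees:

```python
def left_branching(tokens):
    """((w1 w2) w3) ... : combine left-to-right."""
    tree = tokens[0]
    for tok in tokens[1:]:
        tree = (tree, tok)
    return tree

def right_branching(tokens):
    """w1 (w2 (w3 ...)) : combine right-to-left."""
    tree = tokens[-1]
    for tok in reversed(tokens[:-1]):
        tree = (tok, tree)
    return tree

def balanced(tokens):
    """Split the span in half recursively to get a balanced binary tree."""
    if len(tokens) == 1:
        return tokens[0]
    mid = (len(tokens) + 1) // 2
    return (balanced(tokens[:mid]), balanced(tokens[mid:]))

words = ["a", "b", "c", "d"]
```

Feeding such trees to a tree encoder in place of parser output tests how much of the encoder's performance actually depends on linguistically informed structure.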

Learning Visually-Grounded Semantics from Contrastive Adversarial Samples

1 code implementation COLING 2018 Haoyue Shi, Jiayuan Mao, Tete Xiao, Yuning Jiang, Jian Sun

Beginning with an insightful adversarial attack on VSE embeddings, we show the limitations of current frameworks and image-text datasets (e.g., MS-COCO) both quantitatively and qualitatively.

Adversarial Attack Image Captioning

Understanding and Improving Multi-Sense Word Embeddings via Extended Robust Principal Component Analysis

no code implementations 3 Mar 2018 Haoyue Shi, Yuqi Sun, Junfeng Hu

Representations of polysemous words learned without supervision generate a large number of pseudo multi-senses, since unsupervised methods are overly sensitive to contextual variations.

Dimensionality Reduction Word Embeddings +1

Real Multi-Sense or Pseudo Multi-Sense: An Approach to Improve Word Representation

no code implementations WS 2016 Haoyue Shi, Caihua Li, Junfeng Hu

Previous researches have shown that learning multiple representations for polysemous words can improve the performance of word embeddings on many tasks.

Word Embeddings Word Similarity
