no code implementations • 25 May 2022 • Jingnong Qu, Liunian Harold Li, Jieyu Zhao, Sunipa Dev, Kai-Wei Chang
Disinformation has become a serious problem on social media.
4 code implementations • Google Research 2022 • Aakanksha Chowdhery, Sharan Narang, Jacob Devlin, Maarten Bosma, Gaurav Mishra, Adam Roberts, Paul Barham, Hyung Won Chung, Charles Sutton, Sebastian Gehrmann, Parker Schuh, Kensen Shi, Sasha Tsvyashchenko, Joshua Maynez, Abhishek Rao, Parker Barnes, Yi Tay, Noam Shazeer, Vinodkumar Prabhakaran, Emily Reif, Nan Du, Ben Hutchinson, Reiner Pope, James Bradbury, Jacob Austin, Michael Isard, Guy Gur-Ari, Pengcheng Yin, Toju Duke, Anselm Levskaya, Sanjay Ghemawat, Sunipa Dev, Henryk Michalewski, Xavier Garcia, Vedant Misra, Kevin Robinson, Liam Fedus, Denny Zhou, Daphne Ippolito, David Luan, Hyeontaek Lim, Barret Zoph, Alexander Spiridonov, Ryan Sepassi, David Dohan, Shivani Agrawal, Mark Omernick, Andrew M. Dai, Thanumalayan Sankaranarayana Pillai, Marie Pellat, Aitor Lewkowycz, Erica Moreira, Rewon Child, Oleksandr Polozov, Katherine Lee, Zongwei Zhou, Xuezhi Wang, Brennan Saeta, Mark Diaz, Orhan Firat, Michele Catasta, Jason Wei, Kathy Meier-Hellstern, Douglas Eck, Jeff Dean, Slav Petrov, Noah Fiedel
To further our understanding of the impact of scale on few-shot learning, we trained a 540-billion parameter, densely activated, Transformer language model, which we call Pathways Language Model PaLM.
Ranked #1 on
Natural Language Inference
on RTE
1 code implementation • 15 Mar 2022 • Di wu, Wasi Uddin Ahmad, Sunipa Dev, Kai-Wei Chang
State-of-the-art keyphrase generation methods generally depend on large annotated datasets, limiting their performance in domains with limited annotated data.
1 code implementation • 15 Oct 2021 • Vijit Malik, Sunipa Dev, Akihiro Nishi, Nanyun Peng, Kai-Wei Chang
Language representations are efficient tools used across NLP applications, but they are strife with encoded societal biases.
no code implementations • EMNLP 2021 • Sunipa Dev, Masoud Monajatipoor, Anaelia Ovalle, Arjun Subramonian, Jeff M Phillips, Kai-Wei Chang
Gender is widely discussed in the context of language tasks and when examining the stereotypes propagated by language models.
no code implementations • 7 Aug 2021 • Sunipa Dev, Emily Sheng, Jieyu Zhao, Jiao Sun, Yu Hou, Mattie Sanseverino, Jiin Kim, Nanyun Peng, Kai-Wei Chang
To address this gap, this work presents a comprehensive survey of existing bias measures in NLP as a function of the associated NLP tasks, metrics, datasets, and social biases and corresponding harms.
1 code implementation • 6 Apr 2021 • Archit Rathore, Sunipa Dev, Jeff M. Phillips, Vivek Srikumar, Yan Zheng, Chin-Chia Michael Yeh, Junpeng Wang, Wei zhang, Bei Wang
To aid this, we present Visualization of Embedding Representations for deBiasing system ("VERB"), an open-source web-based visualization tool that helps the users gain a technical understanding and visual intuition of the inner workings of debiasing techniques, with a focus on their geometric properties.
1 code implementation • 25 Nov 2020 • Sunipa Dev
High-dimensional representations for words, text, images, knowledge graphs and other structured data are commonly used in different paradigms of machine learning and data mining.
1 code implementation • EMNLP 2021 • Sunipa Dev, Tao Li, Jeff M. Phillips, Vivek Srikumar
Language representations are known to carry stereotypical biases and, as a result, lead to biased predictions in downstream tasks.
1 code implementation • 25 Aug 2019 • Sunipa Dev, Tao Li, Jeff Phillips, Vivek Srikumar
Word embeddings carry stereotypical connotations from the text they are trained on, which can lead to invalid inferences in downstream models that rely on them.
1 code implementation • 23 Jan 2019 • Sunipa Dev, Jeff Phillips
Word vector representations are well developed tools for various NLP and Machine Learning tasks and are known to retain significant semantic and syntactic structure of languages.
no code implementations • 4 Jun 2018 • Sunipa Dev, Safia Hassan, Jeff M. Phillips
We develop a family of techniques to align word embeddings which are derived from different source datasets or created using different mechanisms (e. g., GloVe or word2vec).