no code implementations • NeurIPS 2013 • Stefan Wager, Sida Wang, Percy Liang
Dropout and other feature noising schemes control overfitting by artificially corrupting the training data.
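The corruption the abstract refers to can be sketched in a few lines. This is a minimal, illustrative version of dropout-style feature noising (the function name and rescaling convention are mine, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_noise(X, p=0.5):
    """Artificially corrupt training features: zero each entry
    independently with probability p, and rescale the survivors by
    1/(1-p) so the noised input is unbiased, i.e. E[noise(X)] = X."""
    mask = rng.random(X.shape) >= p
    return X * mask / (1.0 - p)

X = np.ones((4, 3))
Xn = dropout_noise(X, p=0.5)
print(np.all((Xn == 0) | (Xn == 2.0)))  # True: dropped or rescaled to 2
```

Training on many such corrupted copies of the data acts as a regularizer, which is the overfitting control the paper analyzes.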
no code implementations • NeurIPS 2014 • Stefan Wager, William Fithian, Sida Wang, Percy Liang
Dropout training, originally designed for deep neural networks, has been successful on high-dimensional single-layer natural language tasks.
1 code implementation • NeurIPS 2014 • Roy Frostig, Sida Wang, Percy S. Liang, Christopher D. Manning
We focus on the problem of maximum a posteriori (MAP) inference in Markov random fields with binary variables and pairwise interactions.
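For intuition, here is what the MAP problem looks like at toy scale: a brute-force solver for a binary pairwise MRF (exhaustive search is only feasible for tiny n; the paper is about scalable inference, and this sketch only defines the objective):

```python
import itertools
import numpy as np

def map_brute_force(unary, pairwise):
    """Exhaustive MAP for a tiny binary pairwise MRF.
    unary: length-n list of theta_i; pairwise: dict {(i, j): theta_ij}.
    Score(x) = sum_i theta_i * x_i + sum_{(i,j)} theta_ij * x_i * x_j,
    maximized over x in {0, 1}^n."""
    n = len(unary)
    best, best_score = None, -np.inf
    for x in itertools.product([0, 1], repeat=n):
        s = sum(unary[i] * x[i] for i in range(n))
        s += sum(t * x[i] * x[j] for (i, j), t in pairwise.items())
        if s > best_score:
            best, best_score = x, s
    return best, best_score

# Two variables that each prefer to be on and prefer to agree:
x, s = map_brute_force([1.0, 0.5], {(0, 1): 2.0})
print(x, s)  # (1, 1) 3.5
```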
1 code implementation • ACL 2020 • Lili Yu, Howard Chen, Sida Wang, Tao Lei, Yoav Artzi
We study the potential for interaction in natural language classification.
no code implementations • 26 Jun 2020 • Xiangyu Zhao, Haochen Liu, Hui Liu, Jiliang Tang, Weiwei Guo, Jun Shi, Sida Wang, Huiji Gao, Bo Long
Specifically, we first propose an end-to-end differentiable framework that, via an AutoML-based optimization algorithm, computes soft, continuous weights over candidate embedding dimensions for each feature field; we then derive a hard, discrete embedding architecture from the maximal weights and retrain the whole recommender framework.
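The soft-then-hard idea can be sketched as follows; the function name, the softmax relaxation, and the candidate sizes are illustrative assumptions, not the paper's actual algorithm:

```python
import numpy as np

def select_embedding_dims(logits, candidate_dims):
    """Sketch: learn a softmax weight over candidate embedding sizes for
    each feature field (soft, continuous, differentiable), then pick the
    argmax size per field for the final discrete architecture."""
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)       # soft weights
    hard = [candidate_dims[i] for i in weights.argmax(axis=1)]  # discrete
    return weights, hard

logits = np.array([[0.1, 2.0, -1.0],   # field 0 favors the middle size
                   [3.0, 0.0, 0.0]])   # field 1 favors the smallest size
_, dims = select_embedding_dims(logits, [8, 16, 32])
print(dims)  # [16, 8]
```

In the paper's setup, the recommender is then retrained with the selected discrete dimensions.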
2 code implementations • NeurIPS 2020 • Mike Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida Wang, Luke Zettlemoyer
The objective noisily captures aspects of paraphrase, translation, multi-document summarization, and information retrieval, allowing for strong zero-shot performance on several tasks.
no code implementations • 6 Aug 2020 • Sida Wang, Weiwei Guo, Huiji Gao, Bo Long
On the candidate generation side, this system uses as much information as possible in unseen prefixes to generate relevant candidates, increasing the recall by a large margin.
1 code implementation • 6 Aug 2020 • Weiwei Guo, Xiao-Wei Liu, Sida Wang, Huiji Gao, Ananth Sankar, Zimeng Yang, Qi Guo, Liang Zhang, Bo Long, Bee-Chung Chen, Deepak Agarwal
Ranking is the most important component in a search system.
1 code implementation • 29 Dec 2020 • Yilun Zhou, Adithya Renduchintala, Xian Li, Sida Wang, Yashar Mehdad, Asish Ghoshal
Active learning (AL) algorithms may achieve better performance with fewer data because the model guides the data selection process.
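One common way a model "guides the data selection process" is uncertainty sampling; this is a generic sketch of that heuristic, not the specific AL strategy studied in the paper:

```python
import numpy as np

def uncertainty_sample(probs, k):
    """Pick the k unlabeled examples the model is least confident about
    (smallest maximum class probability) to label next."""
    confidence = probs.max(axis=1)
    return np.argsort(confidence)[:k]

probs = np.array([[0.90, 0.10],
                  [0.55, 0.45],   # most uncertain example
                  [0.70, 0.30]])
print(uncertainty_sample(probs, 1))  # [1]
```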
no code implementations • 30 Jul 2021 • Weiwei Guo, Xiaowei Liu, Sida Wang, Michaeel Kazi, Zhoutong Fu, Huiji Gao, Jun Jia, Liang Zhang, Bo Long
Many search systems work with large amounts of natural language data, e.g., search queries, user profiles and documents, where deep learning based natural language processing techniques (deep NLP) can be of great help.
no code implementations • 16 Aug 2021 • Weiwei Guo, Xiaowei Liu, Sida Wang, Michaeel Kazi, Zhiwei Wang, Zhoutong Fu, Jun Jia, Liang Zhang, Huiji Gao, Bo Long
Building a successful search system requires a thorough understanding of textual data semantics, where deep learning based natural language processing techniques (deep NLP) can be of great help.
no code implementations • NeurIPS 2021 • Victor Zhong, Austin Hanjie, Sida Wang, Karthik Narasimhan, Luke Zettlemoyer
We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.
3 code implementations • 12 Apr 2022 • Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis
Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.
Ranked #39 on Code Generation on HumanEval (Pass@100 metric)
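Code infilling asks a model to fill a masked span given both the code before and after it. A minimal sketch of how such a prompt is typically assembled (the sentinel tokens below are placeholders; each infilling model defines its own format, which this does not reproduce):

```python
def make_fim_prompt(prefix, suffix,
                    pre="<PRE>", suf="<SUF>", mid="<MID>"):
    """Build a prefix-suffix-middle style infilling prompt: the model
    generates the missing middle after the final sentinel."""
    return f"{pre}{prefix}{suf}{suffix}{mid}"

code = "def add(a, b):\n    return a + b\n"
prefix, suffix = code.split("a + b")   # pretend the body is missing
prompt = make_fim_prompt(prefix, suffix)
print(prompt.endswith("<MID>"))  # True
```

Tasks like type inference, comment generation, and variable renaming can all be cast as filling in such a masked span.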
no code implementations • ACL 2022 • Bill Yuchen Lin, Sida Wang, Xi Victoria Lin, Robin Jia, Lin Xiao, Xiang Ren, Wen-tau Yih
Real-world natural language processing (NLP) models need to be continually updated to fix the prediction errors in out-of-distribution (OOD) data streams while overcoming catastrophic forgetting.
1 code implementation • 18 Nov 2022 • Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu
We introduce DS-1000, a code generation benchmark with a thousand data science problems spanning seven Python libraries, such as NumPy and Pandas.
1 code implementation • 7 Mar 2024 • Linyuan Gong, Sida Wang, Mostafa Elhoushi, Alvin Cheung
We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.
Ranked #1 on Code Completion on SAFIM
no code implementations • 12 Mar 2024 • Naman Jain, King Han, Alex Gu, Wen-Ding Li, Fanjia Yan, Tianjun Zhang, Sida Wang, Armando Solar-Lezama, Koushik Sen, Ion Stoica
Large Language Models (LLMs) applied to code-related applications have emerged as a prominent field, attracting significant interest from both academia and industry.