Search Results for author: Ruiqi Zhong

Found 22 papers, 15 papers with code

Subspace Embedding and Linear Regression with Orlicz Norm

no code implementations ICML 2018 Alexandr Andoni, Chengyu Lin, Ying Sheng, Peilin Zhong, Ruiqi Zhong

An Orlicz norm is parameterized by a non-negative convex function $G:\mathbb{R}_+\rightarrow\mathbb{R}_+$ with $G(0)=0$: the Orlicz norm of a vector $x\in\mathbb{R}^n$ is defined as $\|x\|_G=\inf\left\{\alpha>0 \,\middle|\, \sum_{i=1}^n G(|x_i|/\alpha)\leq 1\right\}$.

regression
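
The infimum above can be computed numerically: for convex $G\geq 0$ with $G(0)=0$, the sum $\sum_i G(|x_i|/\alpha)$ is non-increasing in $\alpha$, so the feasible set $\{\alpha : \sum_i G(|x_i|/\alpha)\leq 1\}$ is an interval and the norm can be found by bisection. A minimal sketch (not from the paper; the bracketing strategy is an illustrative choice):

```python
def orlicz_norm(x, G, iters=200):
    """Compute ||x||_G = inf{alpha > 0 : sum_i G(|x_i|/alpha) <= 1} by bisection.

    Assumes G is convex, non-negative, and G(0) = 0, so the sum is
    non-increasing in alpha and the feasible alphas form an interval."""
    if all(v == 0 for v in x):
        return 0.0

    def feasible(alpha):
        return sum(G(abs(v) / alpha) for v in x) <= 1.0

    # Bracket the norm: grow hi until feasible, shrink lo until infeasible.
    hi = 1.0
    while not feasible(hi):
        hi *= 2.0
    lo = hi / 2.0
    while feasible(lo) and lo > 1e-12:
        lo /= 2.0
    # Bisect down to the boundary, which is exactly the Orlicz norm.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            hi = mid
        else:
            lo = mid
    return hi

# G(t) = t^2 recovers the Euclidean norm; G(t) = t recovers the l1 norm.
print(orlicz_norm([3.0, 4.0], lambda t: t * t))  # ~5.0 (= ||x||_2)
print(orlicz_norm([3.0, 4.0], lambda t: t))      # ~7.0 (= ||x||_1)
```

Choosing $G(t)=t^p$ recovers the usual $\ell_p$ norms, which is why Orlicz norms are a natural generalization for regression losses.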

Detecting Gang-Involved Escalation on Social Media Using Context

1 code implementation EMNLP 2018 Serina Chang, Ruiqi Zhong, Ethan Adams, Fei-Tzin Lee, Siddharth Varia, Desmond Patton, William Frey, Chris Kedzie, Kathleen McKeown

Gang-involved youth in cities such as Chicago have increasingly turned to social media to post about their experiences and intents online.

Fine-grained Sentiment Analysis with Faithful Attention

no code implementations 19 Aug 2019 Ruiqi Zhong, Steven Shao, Kathleen McKeown

While the general task of textual sentiment classification has been widely studied, much less research looks specifically at sentiment between a specified source and target.

Relation Extraction Sentiment Analysis +1

Detecting and Reducing Bias in a High Stakes Domain

1 code implementation IJCNLP 2019 Ruiqi Zhong, Yanda Chen, Desmond Patton, Charlotte Selous, Kathy Mckeown

Gang-involved youth in cities such as Chicago sometimes post on social media to express their aggression towards rival gangs and previous research has demonstrated that a deep learning approach can predict aggression and loss in posts.

Vocal Bursts Intensity Prediction

Semantic Scaffolds for Pseudocode-to-Code Generation

1 code implementation ACL 2020 Ruiqi Zhong, Mitchell Stern, Dan Klein

We propose a method for program generation based on semantic scaffolds, lightweight structures representing the high-level semantic and syntactic composition of a program.

Code Generation

Semantic Evaluation for Text-to-SQL with Distilled Test Suite

no code implementations 2 Jul 2020 Ruiqi Zhong, Tao Yu, Dan Klein

We propose test suite accuracy to approximate semantic accuracy for Text-to-SQL models, where a predicted query is semantically correct if its denotation is the same as the gold for every possible database.

Semantic Parsing Text-To-SQL
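
The core idea can be illustrated with a toy version (this is not the paper's test-suite distillation procedure, just the underlying check): two queries are judged semantically equivalent if their denotations agree on every database in a suite; table schema, value ranges, and suite size below are illustrative assumptions.

```python
import random
import sqlite3

def denotation(query, rows):
    """Execute `query` on a fresh in-memory database with one table t(a, b)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (a INTEGER, b INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?)", rows)
    result = sorted(con.execute(query).fetchall())
    con.close()
    return result

def suite_equivalent(pred, gold, n_dbs=20, seed=0):
    """Declare pred and gold equivalent iff their denotations match on every
    database in a randomized suite; covering a = 0..5 in each database makes
    common off-by-one boundary errors distinguishable."""
    rng = random.Random(seed)
    for _ in range(n_dbs):
        rows = [(a, rng.randint(0, 5)) for a in range(6)]
        rows += [(rng.randint(0, 5), rng.randint(0, 5)) for _ in range(4)]
        if denotation(pred, rows) != denotation(gold, rows):
            return False  # found a distinguishing database
    return True

# An equivalent rewrite passes; a subtly wrong comparison operator is caught.
print(suite_equivalent("SELECT a FROM t WHERE a > 2",
                       "SELECT a FROM t WHERE NOT a <= 2"))  # True
print(suite_equivalent("SELECT a FROM t WHERE a > 2",
                       "SELECT a FROM t WHERE a >= 2"))      # False
```

Exact-string matching would reject the first (correct) rewrite, while single-database execution can miss the second (incorrect) one; a suite of databases approximates the "every possible database" criterion.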

Understanding Attention Training via Output Relevance

no code implementations 16 Aug 2020 Charlie Snell, Ruiqi Zhong, Jacob Steinhardt, Dan Klein

If we ablate attention by fixing it to uniform, the output relevance still correlates with the attention of a normally trained model; but if we instead ablate output relevance, attention cannot be learned.

Translation

Semantic Evaluation for Text-to-SQL with Distilled Test Suites

3 code implementations EMNLP 2020 Ruiqi Zhong, Tao Yu, Dan Klein

We propose test suite accuracy to approximate semantic accuracy for Text-to-SQL models.

Text-To-SQL

Approximating How Single Head Attention Learns

1 code implementation 13 Mar 2021 Charlie Snell, Ruiqi Zhong, Dan Klein, Jacob Steinhardt

Our approximation explains why models sometimes attend to salient words, and inspires a toy example where a multi-head attention model can overcome the above hard training distribution by improving learning dynamics rather than expressiveness.

Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level

1 code implementation Findings (ACL) 2021 Ruiqi Zhong, Dhruba Ghosh, Dan Klein, Jacob Steinhardt

We develop statistically rigorous methods to address this, and after accounting for pretraining and finetuning noise, we find that our BERT-Large is worse than BERT-Mini on at least 1-4% of instances across MNLI, SST-2, and QQP, compared to the overall accuracy improvement of 2-10%.

QQP SST-2
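
A toy illustration of the instance-level view (this is not the paper's statistical estimator; the run counts, margin, and data below are made up): averaging per-instance correctness over several finetuning runs smooths out finetuning noise, after which one can count instances the larger model still gets wrong despite winning overall.

```python
def per_instance_accuracy(runs):
    """runs: one list of 0/1 correctness per finetuning run (same instance
    order in each). Returns the mean accuracy of each instance across runs."""
    n = len(runs[0])
    return [sum(run[i] for run in runs) / len(runs) for i in range(n)]

def instance_level_compare(small_runs, large_runs, margin=0.5):
    """Return (indices where the large model is worse by >= margin,
    the large model's overall accuracy gain)."""
    small = per_instance_accuracy(small_runs)
    large = per_instance_accuracy(large_runs)
    worse = [i for i in range(len(small)) if small[i] - large[i] >= margin]
    overall_gain = sum(large) / len(large) - sum(small) / len(small)
    return worse, overall_gain

# 4 instances, 3 finetuning runs per model: the large model wins overall
# yet is consistently worse on instance 3.
small_runs = [[0, 1, 0, 1], [0, 1, 1, 1], [0, 1, 0, 1]]
large_runs = [[1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 1, 0]]
worse, gain = instance_level_compare(small_runs, large_runs)
print(worse, round(gain, 3))  # [3] 0.167
```

Aggregate accuracy alone would hide such instances, which is the paper's motivating observation.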

The Effect of Model Size on Worst-Group Generalization

no code implementations 8 Dec 2021 Alan Pham, Eunice Chan, Vikranth Srivatsa, Dhruba Ghosh, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez, Jacob Steinhardt

Overparameterization is shown to result in poor test accuracy on rare subgroups under a variety of settings where subgroup information is known.

Describing Differences between Text Distributions with Natural Language

1 code implementation 28 Jan 2022 Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt

We then re-rank the descriptions by checking how often they hold on a larger set of samples with a learned verifier.

Binary Classification Re-Ranking
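
The re-ranking step can be sketched as follows (a toy stand-in, not the paper's system: simple keyword matching plays the role of the learned verifier, and the sample texts are invented): each candidate description is scored by how much more often it holds on samples from one distribution than the other, then candidates are sorted by that gap.

```python
def holds(description_keyword, text):
    # Stand-in verifier: the description "holds" on a text if its keyword
    # appears; the paper instead uses a learned model for this judgment.
    return description_keyword in text.lower()

def rerank(candidates, samples_a, samples_b):
    """Sort candidate descriptions by how well they separate the two
    text distributions: hold often on A, rarely on B."""
    def score(keyword):
        p_a = sum(holds(keyword, s) for s in samples_a) / len(samples_a)
        p_b = sum(holds(keyword, s) for s in samples_b) / len(samples_b)
        return p_a - p_b
    return sorted(candidates, key=score, reverse=True)

samples_a = ["great movie, loved it", "loved the acting", "great fun"]
samples_b = ["terrible plot", "boring and terrible", "great effects, bad film"]
print(rerank(["great", "loved", "terrible"], samples_a, samples_b))
# ['loved', 'great', 'terrible']
```

"great" appears in both distributions, so it is demoted below "loved", which holds only on the first set; checking candidates against a larger sample in this way is what the re-ranking buys.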

InCoder: A Generative Model for Code Infilling and Synthesis

3 code implementations 12 Apr 2022 Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Wen-tau Yih, Luke Zettlemoyer, Mike Lewis

Our model is the first generative model that is able to directly perform zero-shot code infilling, which we evaluate on challenging tasks such as type inference, comment generation, and variable re-naming.

Ranked #39 on Code Generation on HumanEval (Pass@100 metric)

Code Generation Comment Generation +1

Non-Programmers Can Label Programs Indirectly via Active Examples: A Case Study with Text-to-SQL

1 code implementation 25 May 2022 Ruiqi Zhong, Charlie Snell, Dan Klein, Jason Eisner

We introduce APEL, a framework in which non-programmers select among candidate programs generated by a seed semantic parser (e.g., Codex).

Bayesian Inference Text-To-SQL

Learning by Distilling Context

no code implementations 30 Sep 2022 Charlie Snell, Dan Klein, Ruiqi Zhong

We show that context distillation is a general method to train language models, and it can effectively internalize 3 types of training signals.

Language Modelling Text-To-SQL

DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation

1 code implementation 18 Nov 2022 Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Scott Wen-tau Yih, Daniel Fried, Sida Wang, Tao Yu

We introduce DS-1000, a code generation benchmark with a thousand data science problems spanning seven Python libraries, such as NumPy and Pandas.

Code Generation Memorization

Goal-Driven Explainable Clustering via Language Descriptions

1 code implementation 23 May 2023 Zihan Wang, Jingbo Shang, Ruiqi Zhong

We propose a new task formulation, "Goal-Driven Clustering with Explanations" (GoalEx), which represents both the goal and the explanations as free-form language descriptions.

Clustering Language Modelling

Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations

no code implementations 17 Jul 2023 Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown

To answer these questions, we propose to evaluate $\textbf{counterfactual simulatability}$ of natural language explanations: whether an explanation can enable humans to precisely infer the model's outputs on diverse counterfactuals of the explained input.

counterfactual

Describing Differences in Image Sets with Natural Language

1 code implementation 5 Dec 2023 Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy

To aid in this discovery process, we explore the task of automatically describing the differences between two $\textbf{sets}$ of images, which we term Set Difference Captioning.

Language Modelling
