Search Results for author: Aaron Jaech

Found 19 papers, 8 papers with code

OpenAI o1 System Card

no code implementations 21 Dec 2024 OpenAI: Aaron Jaech, Adam Kalai, Adam Lerer, Adam Richardson, Ahmed El-Kishky, Aiden Low, Alec Helyar, Aleksander Madry, Alex Beutel, Alex Carney, Alex Iftimie, Alex Karpenko, Alex Tachard Passos, Alexander Neitz, Alexander Prokofiev, Alexander Wei, Allison Tam, Ally Bennett, Ananya Kumar, Andre Saraiva, Andrea Vallone, Andrew Duberstein, Andrew Kondrich, Andrey Mishchenko, Andy Applebaum, Angela Jiang, Ashvin Nair, Barret Zoph, Behrooz Ghorbani, Ben Rossen, Benjamin Sokolowsky, Boaz Barak, Bob McGrew, Borys Minaiev, Botao Hao, Bowen Baker, Brandon Houghton, Brandon McKinzie, Brydon Eastman, Camillo Lugaresi, Cary Bassin, Cary Hudson, Chak Ming Li, Charles de Bourcy, Chelsea Voss, Chen Shen, Chong Zhang, Chris Koch, Chris Orsinger, Christopher Hesse, Claudia Fischer, Clive Chan, Dan Roberts, Daniel Kappler, Daniel Levy, Daniel Selsam, David Dohan, David Farhi, David Mely, David Robinson, Dimitris Tsipras, Doug Li, Dragos Oprica, Eben Freeman, Eddie Zhang, Edmund Wong, Elizabeth Proehl, Enoch Cheung, Eric Mitchell, Eric Wallace, Erik Ritter, Evan Mays, Fan Wang, Felipe Petroski Such, Filippo Raso, Florencia Leoni, Foivos Tsimpourlas, Francis Song, Fred von Lohmann, Freddie Sulit, Geoff Salmon, Giambattista Parascandolo, Gildas Chabot, Grace Zhao, Greg Brockman, Guillaume Leclerc, Hadi Salman, Haiming Bao, Hao Sheng, Hart Andrin, Hessam Bagherinezhad, Hongyu Ren, Hunter Lightman, Hyung Won Chung, Ian Kivlichan, Ian O'Connell, Ian Osband, Ignasi Clavera Gilaberte, Ilge Akkaya, Ilya Kostrikov, Ilya Sutskever, Irina Kofman, Jakub Pachocki, James Lennon, Jason Wei, Jean Harb, Jerry Twore, Jiacheng Feng, Jiahui Yu, Jiayi Weng, Jie Tang, Jieqi Yu, Joaquin Quiñonero Candela, Joe Palermo, Joel Parish, Johannes Heidecke, John Hallman, John Rizzo, Jonathan Gordon, Jonathan Uesato, Jonathan Ward, Joost Huizinga, Julie Wang, Kai Chen, Kai Xiao, Karan Singhal, Karina Nguyen, Karl Cobbe, Katy Shi, Kayla Wood, Kendra Rimbach, Keren Gu-Lemberg, Keren GuLemberg, Kevin Liu, Kevin Lu, Kevin Stone, Kevin Yu, Lama Ahmad, Lauren Yang, Leo Liu, Leon Maksin, Leyton Ho, Liam Fedus, Lilian Weng, Linden Li, Lindsay McCallum, Lindsey Held, Lorenz Kuhn, Lukas Kondraciuk, Lukasz Kaiser, Luke Metz, Madelaine Boyd, Maja Trebacz, Manas Joglekar, Mark Chen, Marko Tintor, Mason Meyer, Matt Jones, Matt Kaufer, Max Schwarzer, Meghan Shah, Mehmet Yatbaz, Melody Guan, Mengyuan Xu, Mengyuan Yan, Mia Glaese, Mianna Chen, Michael Lampe, Michael Malek, Michele Wang, Michelle Fradin, Mike McClay, Mikhail Pavlov, Miles Wang, Mingxuan Wang, Mira Murati, Mo Bavarian, Mostafa Rohaninejad, Nat McAleese, Neil Chowdhury, Nick Ryder, Nikolas Tezak, Noam Brown, Ofir Nachum, Oleg Boiko, Oleg Murk, Olivia Watkins, Patrick Chao, Paul Ashbourne, Pavel Izmailov, Peter Zhokhov, Rachel Dias, Rahul Arora, Randall Lin, Rapha Gontijo Lopes, Raz Gaon, Reah Miyara, Reimar Leike, Renny Hwang, Rhythm Garg, Robin Brown, Roshan James, Rui Shu, Ryan Cheu, Ryan Greene, Saachi Jain, Sam Altman, Sam Toizer, Sam Toyer, Samuel Miserendino, Sandhini Agarwal, Santiago Hernandez, Sasha Baker, Scott McKinney, Scottie Yan, Shengjia Zhao, Shengli Hu, Shibani Santurkar, Shraman Ray Chaudhuri, Shuyuan Zhang, Siyuan Fu, Spencer Papay, Steph Lin, Suchir Balaji, Suvansh Sanjeev, Szymon Sidor, Tal Broda, Aidan Clark, Tao Wang, Taylor Gordon, Ted Sanders, Tejal Patwardhan, Thibault Sottiaux, Thomas Degry, Thomas Dimson, Tianhao Zheng, Timur Garipov, Tom Stasi, Trapit Bansal, Trevor Creech, Troy Peterson, Tyna Eloundou, Valerie Qi, Vineet Kosaraju, Vinnie Monaco, Vitchyr Pong, Vlad Fomenko, Weiyi Zheng, Wenda Zhou, Wes McCabe, Wojciech Zaremba, Yann Dubois, Yinghai Lu, Yining Chen, Young Cha, Yu Bai, Yuchen He, Yuchen Zhang, Yunyun Wang, Zheng Shao, Zhuohan Li

The o1 model series is trained with large-scale reinforcement learning to reason using chain of thought.

Management Red Teaming

Sparse Distillation: Speeding Up Text Classification by Using Bigger Student Models

1 code implementation NAACL 2022 Qinyuan Ye, Madian Khabsa, Mike Lewis, Sinong Wang, Xiang Ren, Aaron Jaech

Distilling state-of-the-art transformer models into lightweight student models is an effective way to reduce computation cost at inference time.

Domain Generalization Privacy Preserving +4

Re-examining Routing Networks for Multi-task Learning

no code implementations 1 Jan 2021 Limeng Cui, Aaron Jaech

We re-examine Routing Networks, an approach to multi-task learning that uses reinforcement learning to decide parameter sharing with the goal of maximizing knowledge transfer between related tasks while avoiding task interference.

Multi-Task Learning reinforcement-learning +1

Limitations of Autoregressive Models and Their Alternatives

no code implementations NAACL 2021 Chu-Cheng Lin, Aaron Jaech, Xin Li, Matthew R. Gormley, Jason Eisner

Standard autoregressive language models perform only polynomial-time computation to compute the probability of the next symbol.

Language Modeling Language Modelling

Distributed Gradient Methods for Nonconvex Optimization: Local and Global Convergence Guarantees

no code implementations 23 Mar 2020 Brian Swenson, Soummya Kar, H. Vincent Poor, José M. F. Moura, Aaron Jaech

We discuss local minima convergence guarantees and explore the simple but critical role of the stable-manifold theorem in analyzing saddle-point avoidance.

Optimization and Control

Personalized Language Model for Query Auto-Completion

4 code implementations ACL 2018 Aaron Jaech, Mari Ostendorf

Query auto-completion is a search engine feature whereby the system suggests completed queries as the user types.

Language Modeling Language Modelling +1

Community Member Retrieval on Social Media using Textual Information

1 code implementation NAACL 2018 Aaron Jaech, Shobhit Hathi, Mari Ostendorf

This paper addresses the problem of community membership detection using only text features in a scenario where a small number of positive labeled examples defines the community.

Retrieval

Real-Time Prediction of the Duration of Distribution System Outages

no code implementations 3 Apr 2018 Aaron Jaech, Baosen Zhang, Mari Ostendorf, Daniel S. Kirschen

This paper addresses the problem of predicting duration of unplanned power outages, using historical outage records to train a series of neural network predictors.

Low-Rank RNN Adaptation for Context-Aware Language Modeling

1 code implementation TACL 2018 Aaron Jaech, Mari Ostendorf

A context-aware language model uses location, user and/or domain metadata (context) to adapt its predictions.

General Classification Language Modeling +1

Improving Context Aware Language Models

1 code implementation 21 Apr 2017 Aaron Jaech, Mari Ostendorf

Increased adaptability of RNN language models leads to improved predictions that benefit many applications.

General Classification Language Modeling +1

Match-Tensor: a Deep Relevance Model for Search

1 code implementation 26 Jan 2017 Aaron Jaech, Hetunandan Kamisetty, Eric Ringger, Charlie Clarke

The architecture of the Match-Tensor model simultaneously accounts for both local relevance matching and global topicality signals allowing for a rich interplay between them when computing the relevance of a document to a query.

Feature Engineering Learning-To-Rank +1

Domain Adaptation of Recurrent Neural Networks for Natural Language Understanding

no code implementations 1 Apr 2016 Aaron Jaech, Larry Heck, Mari Ostendorf

The goal of this paper is to use multi-task learning to efficiently scale slot filling models for natural language understanding to handle multiple target tasks or domains.

Domain Adaptation Multi-Task Learning +3

Talking to the crowd: What do people react to in online discussions?

no code implementations EMNLP 2015 Aaron Jaech, Victoria Zayats, Hao Fang, Mari Ostendorf, Hannaneh Hajishirzi

This paper addresses the question of how language use affects community reaction to comments in online discussion forums, and the relative importance of the message vs. the messenger.

What Your Username Says About You

1 code implementation EMNLP 2015 Aaron Jaech, Mari Ostendorf

Experimental results on the two tasks demonstrate the effectiveness of the proposed morphological features compared to a character n-gram baseline.

Leveraging Twitter for Low-Resource Conversational Speech Language Modeling

no code implementations 9 Apr 2015 Aaron Jaech, Mari Ostendorf

In applications involving conversational speech, data sparsity is a limiting factor in building a better language model.

Language Modeling Language Modelling
