no code implementations • 25 Mar 2025 • Gemma Team, Aishwarya Kamath, Johan Ferret, Shreya Pathak, Nino Vieillard, Ramona Merhej, Sarah Perrin, Tatiana Matejovicova, Alexandre Ramé, Morgane Rivière, Louis Rouillard, Thomas Mesnard, Geoffrey Cideron, Jean-bastien Grill, Sabela Ramos, Edouard Yvinec, Michelle Casbon, Etienne Pot, Ivo Penchev, Gaël Liu, Francesco Visin, Kathleen Kenealy, Lucas Beyer, Xiaohai Zhai, Anton Tsitsulin, Robert Busa-Fekete, Benjamin Coleman, Yi Gao, Basil Mustafa, Iain Barr, Emilio Parisotto, David Tian, Matan Eyal, Colin Cherry, Jan-Thorsten Peter, Danila Sinopalnikov, Surya Bhupatiraju, Rishabh Agarwal, Mehran Kazemi, Dan Malkin, Ravin Kumar, David Vilar, Idan Brusilovsky, Jiaming Luo, Andreas Steiner, Abe Friesen, Abhanshu Sharma, Abheesht Sharma, Adi Mayrav Gilady, Adrian Goedeckemeyer, Alaa Saade, Alex Feng, Alexander Kolesnikov, Alexei Bendebury, Alvin Abdagic, Amit Vadi, András György, André Susano Pinto, Anil Das, Ankur Bapna, Antoine Miech, Antoine Yang, Antonia Paterson, Ashish Shenoy, Ayan Chakrabarti, Bilal Piot, Bo Wu, Bobak Shahriari, Bryce Petrini, Charlie Chen, Charline Le Lan, Christopher A. 
Choquette-Choo, CJ Carey, Cormac Brick, Daniel Deutsch, Danielle Eisenbud, Dee Cattle, Derek Cheng, Dimitris Paparas, Divyashree Shivakumar Sreepathihalli, Doug Reid, Dustin Tran, Dustin Zelle, Eric Noland, Erwin Huizenga, Eugene Kharitonov, Frederick Liu, Gagik Amirkhanyan, Glenn Cameron, Hadi Hashemi, Hanna Klimczak-Plucińska, Harman Singh, Harsh Mehta, Harshal Tushar Lehri, Hussein Hazimeh, Ian Ballantyne, Idan Szpektor, Ivan Nardini, Jean Pouget-Abadie, Jetha Chan, Joe Stanton, John Wieting, Jonathan Lai, Jordi Orbay, Joseph Fernandez, Josh Newlan, Ju-yeong Ji, Jyotinder Singh, Kathy Yu, Kevin Hui, Kiran Vodrahalli, Klaus Greff, Linhai Qiu, Marcella Valentine, Marina Coelho, Marvin Ritter, Matt Hoffman, Matthew Watson, Mayank Chaturvedi, Michael Moynihan, Min Ma, Natasha Noy, Nathan Byrd, Nick Roy, Nikola Momchev, Nilay Chauhan, Noveen Sachdeva, Oskar Bunyan, Pankil Botarda, Paul Caron, Paul Kishan Rubenstein, Phil Culliton, Philipp Schmid, Pier Giuseppe Sessa, Pingmei Xu, Piotr Stanczyk, Pouya Tafti, Rakesh Shivanna, Renjie Wu, Renke Pan, Reza Rokni, Rob Willoughby, Rohith Vallu, Ryan Mullins, Sammy Jerome, Sara Smoot, Sertan Girgin, Shariq Iqbal, Shashir Reddy, Shruti Sheth, Siim Põder, Sijal Bhatnagar, Sindhu Raghuram Panyam, Sivan Eiger, Susan Zhang, Tianqi Liu, Trevor Yacovone, Tyler Liechty, Uday Kalra, Utku Evci, Vedant Misra, Vincent Roseberry, Vlad Feinberg, Vlad Kolesnikov, Woohyun Han, Woosuk Kwon, Xi Chen, Yinlam Chow, Yuvein Zhu, Zichuan Wei, Zoltan Egyed, Victor Cotruta, Minh Giang, Phoebe Kirk, Anand Rao, Kat Black, Nabila Babar, Jessica Lo, Erica Moreira, Luiz GUStavo Martins, Omar Sanseviero, Lucas Gonzalez, Zach Gleicher, Tris Warkentin, Vahab Mirrokni, Evan Senter, Eli Collins, Joelle Barral, Zoubin Ghahramani, Raia Hadsell, Yossi Matias, D. 
Sculley, Slav Petrov, Noah Fiedel, Noam Shazeer, Oriol Vinyals, Jeff Dean, Demis Hassabis, Koray Kavukcuoglu, Clement Farabet, Elena Buchatskaya, Jean-Baptiste Alayrac, Rohan Anil, Dmitry Lepikhin, Sebastian Borgeaud, Olivier Bachem, Armand Joulin, Alek Andreev, Cassidy Hardin, Robert Dadashi, Léonard Hussenot
We introduce Gemma 3, a multimodal addition to the Gemma family of lightweight open models, ranging in scale from 1 to 27 billion parameters.
no code implementations • 31 Jul 2024 • Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, Léonard Hussenot, Thomas Mesnard, Bobak Shahriari, Alexandre Ramé, Johan Ferret, Peter Liu, Pouya Tafti, Abe Friesen, Michelle Casbon, Sabela Ramos, Ravin Kumar, Charline Le Lan, Sammy Jerome, Anton Tsitsulin, Nino Vieillard, Piotr Stanczyk, Sertan Girgin, Nikola Momchev, Matt Hoffman, Shantanu Thakoor, Jean-bastien Grill, Behnam Neyshabur, Olivier Bachem, Alanna Walton, Aliaksei Severyn, Alicia Parrish, Aliya Ahmad, Allen Hutchison, Alvin Abdagic, Amanda Carl, Amy Shen, Andy Brock, Andy Coenen, Anthony Laforge, Antonia Paterson, Ben Bastian, Bilal Piot, Bo Wu, Brandon Royal, Charlie Chen, Chintu Kumar, Chris Perry, Chris Welty, Christopher A. Choquette-Choo, Danila Sinopalnikov, David Weinberger, Dimple Vijaykumar, Dominika Rogozińska, Dustin Herbison, Elisa Bandy, Emma Wang, Eric Noland, Erica Moreira, Evan Senter, Evgenii Eltyshev, Francesco Visin, Gabriel Rasskin, Gary Wei, Glenn Cameron, Gus Martins, Hadi Hashemi, Hanna Klimczak-Plucińska, Harleen Batra, Harsh Dhand, Ivan Nardini, Jacinda Mein, Jack Zhou, James Svensson, Jeff Stanway, Jetha Chan, Jin Peng Zhou, Joana Carrasqueira, Joana Iljazi, Jocelyn Becker, Joe Fernandez, Joost van Amersfoort, Josh Gordon, Josh Lipschultz, Josh Newlan, Ju-yeong Ji, Kareem Mohamed, Kartikeya Badola, Kat Black, Katie Millican, Keelin McDonell, Kelvin Nguyen, Kiranbir Sodhia, Kish Greene, Lars Lowe Sjoesund, Lauren Usui, Laurent SIfre, Lena Heuermann, Leticia Lago, Lilly McNealus, Livio Baldini Soares, Logan Kilpatrick, Lucas Dixon, Luciano Martins, Machel Reid, Manvinder Singh, Mark Iverson, Martin Görner, Mat Velloso, Mateo Wirth, Matt Davidow, Matt Miller, Matthew Rahtz, Matthew Watson, Meg Risdal, Mehran Kazemi, Michael Moynihan, Ming Zhang, Minsuk Kahng, Minwoo Park, Mofi Rahman, Mohit Khatwani, Natalie Dao, Nenshad Bardoliwalla, Nesh Devanathan, Neta Dumai, Nilay Chauhan, Oscar 
Wahltinez, Pankil Botarda, Parker Barnes, Paul Barham, Paul Michel, Pengchong Jin, Petko Georgiev, Phil Culliton, Pradeep Kuppala, Ramona Comanescu, Ramona Merhej, Reena Jana, Reza Ardeshir Rokni, Rishabh Agarwal, Ryan Mullins, Samaneh Saadat, Sara Mc Carthy, Sarah Cogan, Sarah Perrin, Sébastien M. R. Arnold, Sebastian Krause, Shengyang Dai, Shruti Garg, Shruti Sheth, Sue Ronstrom, Susan Chan, Timothy Jordan, Ting Yu, Tom Eccles, Tom Hennigan, Tomas Kocisky, Tulsee Doshi, Vihan Jain, Vikas Yadav, Vilobh Meshram, Vishal Dharmadhikari, Warren Barkley, Wei Wei, Wenming Ye, Woohyun Han, Woosuk Kwon, Xiang Xu, Zhe Shen, Zhitao Gong, Zichuan Wei, Victor Cotruta, Phoebe Kirk, Anand Rao, Minh Giang, Ludovic Peran, Tris Warkentin, Eli Collins, Joelle Barral, Zoubin Ghahramani, Raia Hadsell, D. Sculley, Jeanine Banks, Anca Dragan, Slav Petrov, Oriol Vinyals, Jeff Dean, Demis Hassabis, Koray Kavukcuoglu, Clement Farabet, Elena Buchatskaya, Sebastian Borgeaud, Noah Fiedel, Armand Joulin, Kathleen Kenealy, Robert Dadashi, Alek Andreev
In this work, we introduce Gemma 2, a new addition to the Gemma family of lightweight, state-of-the-art open models, ranging in scale from 2 billion to 27 billion parameters.
no code implementations • 21 May 2024 • Irina Jurenka, Markus Kunesch, Kevin R. McKee, Daniel Gillick, Shaojian Zhu, Sara Wiltberger, Shubham Milind Phal, Katherine Hermann, Daniel Kasenberg, Avishkar Bhoopchand, Ankit Anand, Miruna Pîslar, Stephanie Chan, Lisa Wang, Jennifer She, Parsa Mahmoudieh, Aliya Rysbek, Wei-Jen Ko, Andrea Huber, Brett Wiltshire, Gal Elidan, Roni Rabin, Jasmin Rubinovitz, Amit Pitaru, Mac McAllister, Julia Wilkowski, David Choi, Roee Engelberg, Lidan Hackmon, Adva Levin, Rachel Griffin, Michael Sears, Filip Bar, Mia Mesar, Mana Jabbour, Arslan Chaudhry, James Cohan, Sridhar Thiagarajan, Nir Levine, Ben Brown, Dilan Gorur, Svetlana Grant, Rachel Hashimshoni, Laura Weidinger, Jieru Hu, Dawn Chen, Kuba Dolecki, Canfer Akbulut, Maxwell Bileschi, Laura Culp, Wen-Xin Dong, Nahema Marchal, Kelsie Van Deman, Hema Bajaj Misra, Michael Duah, Moran Ambar, Avi Caciularu, Sandra Lefdal, Chris Summerfield, James An, Pierre-Alexandre Kamienny, Abhinit Mohdi, Theofilos Strinopoulous, Annie Hale, Wayne Anderson, Luis C. Cobo, Niv Efron, Muktha Ananda, Shakir Mohamed, Maureen Heymans, Zoubin Ghahramani, Yossi Matias, Ben Gomes, Lila Ibrahim
A major challenge facing the world is the provision of equitable and universal access to quality education.
1 code implementation • 11 Apr 2024 • Aleksandar Botev, Soham De, Samuel L Smith, Anushan Fernando, George-Cristian Muraru, Ruba Haroun, Leonard Berrada, Razvan Pascanu, Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot, Johan Ferret, Sertan Girgin, Olivier Bachem, Alek Andreev, Kathleen Kenealy, Thomas Mesnard, Cassidy Hardin, Surya Bhupatiraju, Shreya Pathak, Laurent SIfre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Armand Joulin, Noah Fiedel, Evan Senter, Yutian Chen, Srivatsan Srinivasan, Guillaume Desjardins, David Budden, Arnaud Doucet, Sharad Vikram, Adam Paszke, Trevor Gale, Sebastian Borgeaud, Charlie Chen, Andy Brock, Antonia Paterson, Jenny Brennan, Meg Risdal, Raj Gundluru, Nesh Devanathan, Paul Mooney, Nilay Chauhan, Phil Culliton, Luiz GUStavo Martins, Elisa Bandy, David Huntsperger, Glenn Cameron, Arthur Zucker, Tris Warkentin, Ludovic Peran, Minh Giang, Zoubin Ghahramani, Clément Farabet, Koray Kavukcuoglu, Demis Hassabis, Raia Hadsell, Yee Whye Teh, Nando de Frietas
We introduce RecurrentGemma, a family of open language models which uses Google's novel Griffin architecture.
2 code implementations • 13 Mar 2024 • Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent SIfre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, Charline Le Lan, Christopher A. Choquette-Choo, Clément Crepy, Daniel Cer, Daphne Ippolito, David Reid, Elena Buchatskaya, Eric Ni, Eric Noland, Geng Yan, George Tucker, George-Christian Muraru, Grigory Rozhdestvenskiy, Henryk Michalewski, Ian Tenney, Ivan Grishchenko, Jacob Austin, James Keeling, Jane Labanowski, Jean-Baptiste Lespiau, Jeff Stanway, Jenny Brennan, Jeremy Chen, Johan Ferret, Justin Chiu, Justin Mao-Jones, Katherine Lee, Kathy Yu, Katie Millican, Lars Lowe Sjoesund, Lisa Lee, Lucas Dixon, Machel Reid, Maciej Mikuła, Mateo Wirth, Michael Sharman, Nikolai Chinaev, Nithum Thain, Olivier Bachem, Oscar Chang, Oscar Wahltinez, Paige Bailey, Paul Michel, Petko Yotov, Rahma Chaabouni, Ramona Comanescu, Reena Jana, Rohan Anil, Ross Mcilroy, Ruibo Liu, Ryan Mullins, Samuel L Smith, Sebastian Borgeaud, Sertan Girgin, Sholto Douglas, Shree Pandya, Siamak Shakeri, Soham De, Ted Klimenko, Tom Hennigan, Vlad Feinberg, Wojciech Stokowiec, Yu-Hui Chen, Zafarali Ahmed, Zhitao Gong, Tris Warkentin, Ludovic Peran, Minh Giang, Clément Farabet, Oriol Vinyals, Jeff Dean, Koray Kavukcuoglu, Demis Hassabis, Zoubin Ghahramani, Douglas Eck, Joelle Barral, Fernando Pereira, Eli Collins, Armand Joulin, Noah Fiedel, Evan Senter, Alek Andreev, Kathleen Kenealy
This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models.
1 code implementation • 15 Jul 2022 • Dustin Tran, Jeremiah Liu, Michael W. Dusenberry, Du Phan, Mark Collier, Jie Ren, Kehang Han, Zi Wang, Zelda Mariet, Huiyi Hu, Neil Band, Tim G. J. Rudner, Karan Singhal, Zachary Nado, Joost van Amersfoort, Andreas Kirsch, Rodolphe Jenatton, Nithum Thain, Honglin Yuan, Kelly Buchanan, Kevin Murphy, D. Sculley, Yarin Gal, Zoubin Ghahramani, Jasper Snoek, Balaji Lakshminarayanan
A recent trend in artificial intelligence is the use of pretrained models for language and vision tasks, which have achieved extraordinary performance but also puzzling failures.
1 code implementation • 7 Jul 2022 • Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zelda Mariet, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani
Contrary to a common belief that BO is suited to optimizing black-box functions, it actually requires domain knowledge on characteristics of those functions to deploy BO successfully.
1 code implementation • 8 Jun 2022 • Vincent Dutordoir, Alan Saul, Zoubin Ghahramani, Fergus Simpson
Neural network approaches for meta-learning distributions over functions have desirable properties such as increased flexibility and a reduced complexity of inference.
4 code implementations • 16 Sep 2021 • Zi Wang, George E. Dahl, Kevin Swersky, Chansoo Lee, Zachary Nado, Justin Gilmer, Jasper Snoek, Zoubin Ghahramani
Contrary to a common expectation that BO is suited to optimizing black-box functions, it actually requires domain knowledge about those functions to deploy BO successfully.
no code implementations • NeurIPS 2021 • Vincent Dutordoir, James Hensman, Mark van der Wilk, Carl Henrik Ek, Zoubin Ghahramani, Nicolas Durrande
This results in models that can either be seen as neural networks with improved uncertainty prediction or deep Gaussian processes with increased prediction accuracy.
1 code implementation • 2020 • Christian Hübler, Hans-Peter Kriegel, Karsten Borgwardt, Zoubin Ghahramani
While data mining in chemoinformatics studied graph data with dozens of nodes, systems biology and the Internet are now generating graph data with thousands and millions of nodes.
no code implementations • 21 Apr 2020 • Will Y. Zou, Smitha Shyam, Michael Mui, Mingshi Wang, Jan Pedersen, Zoubin Ghahramani
We propose to formulate the effectiveness of treatment as a parametrizable model, expanding to a multitude of treatment intensities and complexities through the continuous policy treatment function, and the likelihood of matching.
1 code implementation • ICML 2020 • Robert Peharz, Steven Lang, Antonio Vergari, Karl Stelzner, Alejandro Molina, Martin Trapp, Guy Van Den Broeck, Kristian Kersting, Zoubin Ghahramani
Probabilistic circuits (PCs) are a promising avenue for probabilistic modeling, as they permit a wide range of exact and efficient inference routines.
2 code implementations • 7 Feb 2020 • Mohamed Tarek, Kai Xu, Martin Trapp, Hong Ge, Zoubin Ghahramani
Since DynamicPPL is a modular, stand-alone library, any probabilistic programming system written in Julia, such as Turing.jl, can use DynamicPPL to specify models and trace their model parameters.
no code implementations • 7 Jan 2020 • Wolfgang Roth, Günther Schindler, Bernhard Klein, Robert Peharz, Sebastian Tschiatschek, Holger Fröning, Franz Pernkopf, Zoubin Ghahramani
While machine learning is traditionally a resource intensive task, embedded systems, autonomous navigation, and the vision of the Internet of Things fuel the interest in resource-efficient approaches.
1 code implementation • AABI Symposium 2019 • Kai Xu, Hong Ge, Will Tebbutt, Mohamed Tarek, Martin Trapp, Zoubin Ghahramani
Stan's Hamiltonian Monte Carlo (HMC) has demonstrated remarkable sampling robustness and efficiency in a wide range of Bayesian inference problems through carefully crafted adaptation schemes for the celebrated No-U-Turn sampler (NUTS) algorithm.
1 code implementation • NeurIPS 2019 • Martin Trapp, Robert Peharz, Hong Ge, Franz Pernkopf, Zoubin Ghahramani
While parameter learning in SPNs is well developed, structure learning leaves something to be desired: Even though there is a plethora of SPN structure learners, most of them are somewhat ad-hoc and based on intuition rather than a clear learning principle.
no code implementations • ICLR 2019 • Tameem Adel, Cuong V. Nguyen, Richard E. Turner, Zoubin Ghahramani, Adrian Weller
We present a framework for interpretable continual learning (ICL).
no code implementations • 5 Dec 2018 • Franz Pernkopf, Wolfgang Roth, Matthias Zoehrer, Lukas Pfeifenberger, Guenther Schindler, Holger Froening, Sebastian Tschiatschek, Robert Peharz, Matthew Mattina, Zoubin Ghahramani
In that way, we provide an extensive overview of the current state-of-the-art of robust and efficient machine learning for real-world systems.
no code implementations • NeurIPS 2018 • Ruixiang Zhang, Tong Che, Zoubin Ghahramani, Yoshua Bengio, Yangqiu Song
In this paper, we propose a conceptually simple and general framework called MetaGAN for few-shot learning problems.
no code implementations • 1 Oct 2018 • Theofanis Karaletsos, Peter Dayan, Zoubin Ghahramani
Existing Bayesian treatments of neural networks are typically characterized by weak prior and approximate posterior distributions according to which all the weights are drawn independently.
no code implementations • 27 Sep 2018 • Yichuan Zhang, José Miguel Hernández-Lobato, Zoubin Ghahramani
Training probabilistic models with neural network components is intractable in most cases and requires approximations such as Markov chain Monte Carlo (MCMC), which is not scalable and requires significant hyper-parameter tuning, or mean-field variational inference (VI), which is biased.
no code implementations • 24 Jul 2018 • Antonio Vergari, Alejandro Molina, Robert Peharz, Zoubin Ghahramani, Kristian Kersting, Isabel Valera
Classical approaches for exploratory data analysis are usually not flexible enough to deal with the uncertainty inherent to real-world data: they are often restricted to fixed latent interaction models and homogeneous likelihoods; they are sensitive to missing, corrupt and anomalous data; moreover, their expressiveness generally comes at the price of intractable inference.
3 code implementations • 10 Jul 2018 • Alfredo Nazabal, Pablo M. Olmos, Zoubin Ghahramani, Isabel Valera
Variational autoencoders (VAEs), as well as other generative models, have been shown to be efficient and accurate for capturing the latent structure of vast amounts of complex high-dimensional data.
no code implementations • ICML 2018 • Jiri Hron, Alexander G. de G. Matthews, Zoubin Ghahramani
Dropout, a stochastic regularisation technique for training of neural networks, has recently been reinterpreted as a specific type of approximate inference algorithm for Bayesian neural networks.
no code implementations • 1 Jul 2018 • Maria Lomeli, Mark Rowland, Arthur Gretton, Zoubin Ghahramani
We also present a novel variance reduction scheme based on an antithetic variate construction between permutations to obtain an improved estimator for the Mallows kernel.
no code implementations • ICML 2018 • Tameem Adel, Zoubin Ghahramani, Adrian Weller
We use a generative model which takes as input the representation in an existing (generative or discriminative) model, weakly supervised by limited side information.
no code implementations • 5 Jun 2018 • Robert Peharz, Antonio Vergari, Karl Stelzner, Alejandro Molina, Martin Trapp, Kristian Kersting, Zoubin Ghahramani
The need for consistent treatment of uncertainty has recently triggered increased interest in probabilistic deep learning methods.
2 code implementations • ICLR 2018 • Alexander G. de G. Matthews, Mark Rowland, Jiri Hron, Richard E. Turner, Zoubin Ghahramani
Whilst deep neural networks have shown great empirical success, there is still much work to be done to understand their theoretical properties.
1 code implementation • ICML 2018 • George Tucker, Surya Bhupatiraju, Shixiang Gu, Richard E. Turner, Zoubin Ghahramani, Sergey Levine
Policy gradient methods are a widely used class of model-free reinforcement learning algorithms where a state-dependent baseline is used to reduce gradient estimator variance.
no code implementations • 13 Feb 2018 • Yusuke Mukuta, Akisato Kimura, David B Adrian, Zoubin Ghahramani
Through these insights, we can define human curated groups as weak labels from which our proposed framework can learn discriminative features as a representation in the space of semantic concepts the users intended when creating the groups.
no code implementations • 8 Feb 2018 • Akisato Kimura, Zoubin Ghahramani, Koh Takeuchi, Tomoharu Iwata, Naonori Ueda
In this paper, we propose a simple but effective method for training neural networks with a limited amount of training data.
no code implementations • 8 Nov 2017 • Jiri Hron, Alexander G. de G. Matthews, Zoubin Ghahramani
Gaussian multiplicative noise is commonly used as a stochastic regularisation technique in training of deterministic neural networks.
1 code implementation • ICML 2017 • Isabel Valera, Zoubin Ghahramani
A common practice in statistics and machine learning is to assume that the statistical data types (e.g., ordinal, categorical or real-valued) of variables, and usually also the likelihood model, are known.
no code implementations • ICML 2017 • Konstantina Palla, David Knowles, Zoubin Ghahramani
We propose a Bayesian nonparametric prior over feature allocations for sequential data, the birth-death feature allocation process (BDFP).
no code implementations • 26 Jul 2017 • Isabel Valera, Melanie F. Pradier, Zoubin Ghahramani
This paper introduces a general Bayesian nonparametric latent feature model suitable to perform automatic exploratory analysis of heterogeneous datasets, where the attributes describing each object can be either discrete, continuous or mixed variables.
no code implementations • 19 Jul 2017 • Tomoharu Iwata, Zoubin Ghahramani
We propose a simple method that combines neural networks and Gaussian processes.
no code implementations • 18 Jul 2017 • Jordan Burgess, James Robert Lloyd, Zoubin Ghahramani
We consider the task of one-shot learning of visual categories.
no code implementations • 8 Jul 2017 • John Bradshaw, Alexander G. de G. Matthews, Zoubin Ghahramani
However, they often do not capture their own uncertainties well, making them less robust in the real world as they overconfidently extrapolate and do not notice domain shift.
1 code implementation • ICML 2017 • Matej Balog, Nilesh Tripuraneni, Zoubin Ghahramani, Adrian Weller
We show how a subfamily of our new methods adapts to this setting, proving new upper and lower bounds on the log partition function and deriving a family of sequential samplers for the Gibbs distribution.
1 code implementation • 12 Jun 2017 • Isabel Valera, Melanie F. Pradier, Maria Lomeli, Zoubin Ghahramani
Second, its Bayesian nonparametric nature allows us to automatically infer the model complexity from the data, i.e., the number of features necessary to capture the latent structure in the data.
no code implementations • NeurIPS 2017 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Bernhard Schölkopf, Sergey Levine
Off-policy model-free deep reinforcement learning methods using previously collected data can improve sample efficiency over on-policy policy gradient techniques.
6 code implementations • ICML 2017 • Yarin Gal, Riashat Islam, Zoubin Ghahramani
In this paper we combine recent advances in Bayesian deep learning into the active learning framework in a practical way.
no code implementations • ICML 2017 • Juho Lee, Creighton Heaukulani, Zoubin Ghahramani, Lancelot F. James, Seungjin Choi
The BFRY random variables are well approximated by gamma random variables in a variational Bayesian inference routine, which we apply to several network datasets for which power law degree distributions are a natural assumption.
2 code implementations • 7 Nov 2016 • Shixiang Gu, Timothy Lillicrap, Zoubin Ghahramani, Richard E. Turner, Sergey Levine
We analyze the connection between Q-Prop and existing model-free algorithms, and use control variate theory to derive two variants of Q-Prop with conservative and aggressive adaptation.
1 code implementation • 27 Oct 2016 • Alexander G. de G. Matthews, Mark van der Wilk, Tom Nickson, Keisuke Fujii, Alexis Boukouvalas, Pablo León-Villagrá, Zoubin Ghahramani, James Hensman
GPflow is a Gaussian process library that uses TensorFlow for its core computations and Python for its front end.
no code implementations • 2 Aug 2016 • Gintare Karolina Dziugaite, Zoubin Ghahramani, Daniel M. Roy
For Fast-Gradient-Sign perturbations of small magnitude, we found that JPG compression often reverses the drop in classification accuracy to a large extent, but not always.
no code implementations • ICML 2017 • Nilesh Tripuraneni, Mark Rowland, Zoubin Ghahramani, Richard Turner
We establish a theoretical basis for the use of non-canonical Hamiltonian dynamics in MCMC, and construct a symplectic, leapfrog-like integrator allowing for the implementation of magnetic HMC.
no code implementations • 16 Jun 2016 • Matej Balog, Balaji Lakshminarayanan, Zoubin Ghahramani, Daniel M. Roy, Yee Whye Teh
We introduce the Mondrian kernel, a fast random feature approximation to the Laplace kernel.
no code implementations • NeurIPS 2016 • Shandian Zhe, Kai Zhang, Pengyuan Wang, Kuang-Chih Lee, Zenglin Xu, Yuan Qi, Zoubin Ghahramani
Tensor factorization is a powerful tool to analyse multi-way data.
14 code implementations • NeurIPS 2016 • Yarin Gal, Zoubin Ghahramani
Recent results at the intersection of Bayesian modelling and deep learning offer a Bayesian interpretation of common deep learning techniques such as dropout.
Ranked #35 on Language Modelling on Penn Treebank (Word Level)
no code implementations • NeurIPS 2015 • James R. Lloyd, Zoubin Ghahramani
We propose an exploratory approach to statistical model criticism using maximum mean discrepancy (MMD) two sample tests.
no code implementations • NeurIPS 2015 • Nilesh Tripuraneni, Shixiang (Shane) Gu, Hong Ge, Zoubin Ghahramani
Infinite Hidden Markov Models (iHMMs) are an attractive, nonparametric generalization of the classical Hidden Markov Model which can automatically infer the number of hidden states in the system.
1 code implementation • 30 Nov 2015 • José Miguel Hernández-Lobato, Michael A. Gelbart, Ryan P. Adams, Matthew W. Hoffman, Zoubin Ghahramani
Of particular interest to us is to efficiently solve problems with decoupled constraints, in which subsets of the objective and constraint functions may be evaluated independently.
no code implementations • NeurIPS 2015 • Amar Shah, Zoubin Ghahramani
We develop parallel predictive entropy search (PPES), a novel algorithm for Bayesian optimization of expensive black-box objective functions.
no code implementations • 8 Nov 2015 • Roger B. Grosse, Zoubin Ghahramani, Ryan P. Adams
Using the ground truth log-ML estimates obtained from our method, we quantitatively evaluate a wide variety of existing ML estimators on several latent variable models: clustering, a low rank approximation, and a binary attributes model.
no code implementations • 16 Sep 2015 • Hong Ge, Yarin Gal, Zoubin Ghahramani
In this paper, first we review the theory of random fragmentation processes [Bertoin, 2006], and a number of existing methods for modelling trees, including the popular nested Chinese restaurant process (nCRP).
no code implementations • 30 Jun 2015 • Yutian Chen, Zoubin Ghahramani
Drawing a sample from a discrete distribution is one of the building components for Monte Carlo methods.
no code implementations • 26 Jun 2015 • Amar Shah, David A. Knowles, Zoubin Ghahramani
Stochastic variational inference (SVI) is emerging as the most promising candidate for scaling inference in Bayesian probabilistic models to large datasets.
no code implementations • NeurIPS 2015 • James Hensman, Alexander G. de G. Matthews, Maurizio Filippone, Zoubin Ghahramani
This paper simultaneously addresses these, using a variational approximation to the posterior which is sparse in support of the function but otherwise free-form.
no code implementations • NeurIPS 2015 • Shixiang Gu, Zoubin Ghahramani, Richard E. Turner
Experiments indicate that NASMC significantly improves inference in a non-linear state space model, outperforming adaptive proposal methods including the Extended Kalman and Unscented Particle Filters.
3 code implementations • 6 Jun 2015 • Yarin Gal, Zoubin Ghahramani
Convolutional neural networks (CNNs) work well on large datasets.
29 code implementations • 6 Jun 2015 • Yarin Gal, Zoubin Ghahramani
In comparison, Bayesian models offer a mathematically grounded framework to reason about model uncertainty, but usually come with a prohibitive computational cost.
1 code implementation • 6 Jun 2015 • Yarin Gal, Zoubin Ghahramani
We show that a neural network with arbitrary depth and non-linearities, with dropout applied before every weight layer, is mathematically equivalent to an approximation to a well known Bayesian model.
no code implementations • 14 May 2015 • Gintare Karolina Dziugaite, Daniel M. Roy, Zoubin Ghahramani
We frame learning as an optimization minimizing a two-sample test statistic: informally speaking, a good generator network produces samples that cause a two-sample test to fail to reject the null hypothesis.
no code implementations • 3 May 2015 • Nilesh Tripuraneni, Shane Gu, Hong Ge, Zoubin Ghahramani
Infinite Hidden Markov Models (iHMMs) are an attractive, nonparametric generalization of the classical Hidden Markov Model which can automatically infer the number of hidden states in the system.
no code implementations • 27 Apr 2015 • Alexander G. de G. Matthews, James Hensman, Richard E. Turner, Zoubin Ghahramani
We then discuss augmented index sets and show that, contrary to previous works, marginal consistency of augmentation is not enough to guarantee consistency of variational inference with the original model.
1 code implementation • 7 Mar 2015 • Yarin Gal, Yutian Chen, Zoubin Ghahramani
Building on these ideas we propose a Bayesian model for the unsupervised task of distribution estimation of multivariate categorical data.
1 code implementation • 18 Feb 2015 • José Miguel Hernández-Lobato, Michael A. Gelbart, Matthew W. Hoffman, Ryan P. Adams, Zoubin Ghahramani
Unknown constraints arise in many types of expensive black-box optimization problems.
no code implementations • 20 Jan 2015 • Razvan Ranca, Zoubin Ghahramani
We introduce the first, general purpose, slice sampling inference engine for probabilistic programs.
no code implementations • NeurIPS 2014 • Isabel Valera, Zoubin Ghahramani
Even though heterogeneous databases can be found in a broad variety of applications, there exists a lack of tools for estimating missing data in such databases.
1 code implementation • 7 Nov 2014 • James Hensman, Alex Matthews, Zoubin Ghahramani
Gaussian process classification is a popular method with a number of appealing properties.
no code implementations • 6 Nov 2014 • Yutian Chen, Vikash Mansinghka, Zoubin Ghahramani
Probabilistic programming languages can simplify the development of machine learning techniques, but only if inference is sufficiently scalable.
no code implementations • 14 Aug 2014 • Creighton Heaukulani, David A. Knowles, Zoubin Ghahramani
We define the beta diffusion tree, a random tree structure with a set of leaves that defines a collection of overlapping subsets of objects, known as a feature allocation.
1 code implementation • 9 Aug 2014 • Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani
A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.
1 code implementation • NeurIPS 2014 • José Miguel Hernández-Lobato, Matthew W. Hoffman, Zoubin Ghahramani
We propose a novel information-theoretic approach for Bayesian optimization called Predictive Entropy Search (PES).
1 code implementation • 3 Jun 2014 • John P. Cunningham, Zoubin Ghahramani
Modern techniques for optimization over matrix manifolds enable a generic linear dimensionality reduction solver, which accepts as input data and an objective to be optimized, and returns, as output, an optimal low-dimensional projection of the data.
no code implementations • 16 May 2014 • Alexander G. de G. Matthews, Zoubin Ghahramani
McCullagh and Yang (2006) suggest a family of classification algorithms based on Cox processes.
no code implementations • 17 Mar 2014 • Konstantina Palla, David A. Knowles, Zoubin Ghahramani
We present a nonparametric prior over reversible Markov chains.
2 code implementations • 24 Feb 2014 • David Duvenaud, Oren Rippel, Ryan P. Adams, Zoubin Ghahramani
Choosing appropriate architectures and regularization strategies for deep networks is crucial to good predictive performance.
3 code implementations • 18 Feb 2014 • James Robert Lloyd, David Duvenaud, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani
This paper presents the beginnings of an automatic statistician, focusing on regression problems.
no code implementations • 18 Feb 2014 • Amar Shah, Andrew Gordon Wilson, Zoubin Ghahramani
We investigate the Student-t process as an alternative to the Gaussian process as a nonparametric prior over functions.
no code implementations • 18 Feb 2014 • Alex Davies, Zoubin Ghahramani
We present Random Partition Kernels, a new class of kernels derived by demonstrating a natural connection between random partitions of objects and kernels between those objects.
no code implementations • NeurIPS 2014 • Yue Wu, Jose Miguel Hernandez Lobato, Zoubin Ghahramani
A Gaussian Process (GP) defines a distribution over functions, which allows us to capture highly flexible functional relationships for the variances.
no code implementations • 1 Feb 2014 • David Lopez-Paz, Suvrit Sra, Alex Smola, Zoubin Ghahramani, Bernhard Schölkopf
Although nonlinear variants of PCA and CCA have been proposed, these are computationally prohibitive at large scale.
no code implementations • 26 Sep 2013 • Amar Shah, Zoubin Ghahramani
Semi-supervised clustering is the task of clustering data points into clusters where only a fraction of the points are labelled.
no code implementations • 26 Sep 2013 • Novi Quadrianto, Viktoriia Sharmanska, David A. Knowles, Zoubin Ghahramani
We propose a probabilistic model to infer supervised latent variables in the Hamming space from observed data.
1 code implementation • 15 Jul 2013 • Sebastien Bratieres, Novi Quadrianto, Zoubin Ghahramani
We introduce a conceptually novel structured prediction model, GPstruct, which is kernelized, non-parametric and Bayesian, by design.
no code implementations • 18 May 2013 • Yue Wu, José Miguel Hernández-Lobato, Zoubin Ghahramani
The accurate prediction of time-changing covariances is an important problem in the modeling of multivariate financial data.
no code implementations • 12 Apr 2013 • Richard S. Savage, Zoubin Ghahramani, Jim E. Griffin, Paul Kirk, David L. Wild
We apply the method to 277 glioblastoma samples from The Cancer Genome Atlas, for which there are gene expression, copy number variation, methylation and microRNA data.
1 code implementation • 11 Apr 2013 • Colorado Reed, Zoubin Ghahramani
Inference for latent feature models is inherently difficult as the inference space grows exponentially with the size of the input data and number of latent features.
no code implementations • 13 Mar 2013 • Konstantina Palla, David A. Knowles, Zoubin Ghahramani
The fundamental aim of clustering algorithms is to partition data points.
5 code implementations • 20 Feb 2013 • David Duvenaud, James Robert Lloyd, Roger Grosse, Joshua B. Tenenbaum, Zoubin Ghahramani
Despite its importance, choosing the structural form of the kernel in nonparametric regression remains a black art.
no code implementations • NeurIPS 2012 • Michael Osborne, Roman Garnett, Zoubin Ghahramani, David K. Duvenaud, Stephen J. Roberts, Carl E. Rasmussen
Numerical integration is a key component of many problems in scientific computing, statistical modelling, and machine learning.
no code implementations • NeurIPS 2012 • Konstantina Palla, Zoubin Ghahramani, David A. Knowles
Factor analysis models effectively summarise the covariance structure of high dimensional data, but the solutions are typically hard to interpret.
no code implementations • NeurIPS 2012 • Yichuan Zhang, Zoubin Ghahramani, Amos J. Storkey, Charles A. Sutton
Continuous relaxations play an important role in discrete optimization, but have not seen much use in approximate probabilistic inference.
no code implementations • NeurIPS 2012 • Neil Houlsby, Ferenc Huszar, Zoubin Ghahramani, Jose M. Hernández-Lobato
We present a new model based on Gaussian processes (GPs) for learning pairwise preferences expressed by multiple users.
1 code implementation • 19 Jul 2012 • Simon Lacoste-Julien, Konstantina Palla, Alex Davies, Gjergji Kasneci, Thore Graepel, Zoubin Ghahramani
The Internet has enabled the creation of a growing number of large-scale knowledge bases in a variety of domains containing complementary information.
no code implementations • 27 Jun 2012 • Edward Snelson, Zoubin Ghahramani
A projection of the input space to a low dimensional space is learned in a supervised manner, alongside the pseudo-inputs, which now live in this reduced space.
1 code implementation • 8 Jun 2012 • Tomoharu Iwata, David Duvenaud, Zoubin Ghahramani
A mixture of Gaussians fit to a single curved or heavy-tailed cluster will report that the data contains many clusters.
2 code implementations • 24 Dec 2011 • Neil Houlsby, Ferenc Huszár, Zoubin Ghahramani, Máté Lengyel
Information theoretic active learning has been widely studied for probabilistic models.
no code implementations • NeurIPS 2011 • Joshua T. Abbott, Katherine A. Heller, Zoubin Ghahramani, Thomas L. Griffiths
How do people determine which elements of a set are most representative of that set?
1 code implementation • 19 Oct 2011 • Andrew Gordon Wilson, David A. Knowles, Zoubin Ghahramani
We introduce a new regression framework, Gaussian process regression networks (GPRN), which combines the structural properties of Bayesian neural networks with the non-parametric flexibility of Gaussian processes.
1 code implementation • 31 Dec 2010 • Andrew Gordon Wilson, Zoubin Ghahramani
We introduce a stochastic process with Wishart marginals: the generalised Wishart process (GWP).
no code implementations • NeurIPS 2010 • Zoubin Ghahramani, Michael I. Jordan, Ryan P. Adams
Many data are naturally modeled by an unobserved hierarchical structure.
no code implementations • NeurIPS 2010 • Andrew G. Wilson, Zoubin Ghahramani
We define a copula process which describes the dependencies between arbitrarily many random variables independently of their marginal distributions.
no code implementations • 28 Dec 2009 • Ricardo Silva, Katherine Heller, Zoubin Ghahramani, Edoardo M. Airoldi
Our work addresses the following question: is the relation between objects A and B analogous to those relations found in $\mathbf{S}$?
no code implementations • NeurIPS 2009 • Finale Doshi-Velez, Shakir Mohamed, Zoubin Ghahramani, David A. Knowles
Nonparametric Bayesian models provide a framework for flexible probabilistic modelling of complex datasets.
no code implementations • NeurIPS 2008 • Shakir Mohamed, Zoubin Ghahramani, Katherine A. Heller
Principal Components Analysis (PCA) has become established as one of the key tools for dimensionality reduction when dealing with real valued data.
no code implementations • NeurIPS 2008 • Jurgen V. Gael, Yee W. Teh, Zoubin Ghahramani
We introduce a new probability distribution over a potentially infinite number of binary Markov chains, which we call the Markov Indian buffet process.
no code implementations • NeurIPS 2005 • Edward Snelson, Zoubin Ghahramani
We present a new Gaussian process (GP) regression model whose covariance is parameterized by the locations of M pseudo-input points, which we learn by a gradient-based optimization.