We show that the current version of the forward-forward algorithm is suboptimal when considering information flow in the network, resulting in a lack of collaboration between layers of the network.
To achieve this, we propose AutoFocusFormer (AFF), a local-attention transformer image recognition backbone, which performs adaptive downsampling by learning to retain the most important pixels for the task.
Diffusion probabilistic models have quickly become a major approach for generative modeling of images, 3D geometry, video and other domains.
In contrast, we propose a Discriminator gradIent Gap regularized GAN (DigGAN) formulation which can be added to any existing GAN.
We use this PC-layer in two ways: 1) fixed preconditioning (FPC) adds a fixed PC-layer to all layers, and 2) adaptive preconditioning (APC) adaptively controls the strength of preconditioning.
Exploration is critical to the performance of deep reinforcement learning algorithms and has drawn much attention.
To tackle this issue, we propose an energy-based prior defined by the product of a base prior distribution and a reweighting factor, designed to bring the base closer to the aggregate posterior.
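As a rough illustration of the sentence above, a prior built as the product of a base prior and a reweighting factor can be written as follows. The symbols $p_0$, $r$, $Z$ and $q(z)$ are notation assumed here for the sketch, not taken from the source; the normalizing constant $Z$ is included only because a product of this form generally needs one to integrate to one.

```latex
% Assumed notation: p_0 = base prior, r = reweighting factor (non-negative),
% Z = normalizing constant, q(z) = aggregate posterior.
\[
  p(z) \;=\; \frac{1}{Z}\, r(z)\, p_0(z),
  \qquad
  Z \;=\; \int r(z)\, p_0(z)\, \mathrm{d}z .
\]
% Intuition under these assumptions: if r(z) is close to q(z) / p_0(z),
% then p(z) is close to the aggregate posterior q(z).
```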
Further, each head in our multi-head self-attention layer focuses on a different subset of relations.
Generalization bounds which assess the difference between the true risk and the empirical risk have been studied extensively.
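For reference, the quantity that such bounds control can be written down explicitly. The notation below (hypothesis $h$, loss $\ell$, data distribution $\mathcal{D}$, sample size $n$) is a common convention assumed here, not notation taken from the source.

```latex
% Assumed notation: h = hypothesis, \ell = loss, \mathcal{D} = data distribution,
% (x_i, y_i) = i.i.d. training samples, n = sample size.
\[
  R(h) \;=\; \mathbb{E}_{(x,y)\sim\mathcal{D}}\bigl[\ell(h(x), y)\bigr],
  \qquad
  \hat{R}_n(h) \;=\; \frac{1}{n}\sum_{i=1}^{n} \ell\bigl(h(x_i), y_i\bigr),
\]
\[
  \text{generalization gap:}\quad R(h) - \hat{R}_n(h).
\]
```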
no code implementations • 26 Nov 2019 • E. A. Huerta, Gabrielle Allen, Igor Andreoni, Javier M. Antelis, Etienne Bachelet, Bruce Berriman, Federica Bianco, Rahul Biswas, Matias Carrasco, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Maya Fishbach, Francisco Förster, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Robert Gruendl, Anushri Gupta, Roland Haas, Sarah Habib, Elise Jennings, Margaret W. G. Johnson, Erik Katsavounidis, Daniel S. Katz, Asad Khan, Volodymyr Kindratenko, William T. C. Kramer, Xin Liu, Ashish Mahabal, Zsuzsa Marka, Kenton McHenry, Jonah Miller, Claudia Moreno, Mark Neubauer, Steve Oberlin, Alexander R. Olivas, Donald Petravick, Adam Rebei, Shawn Rosofsky, Milton Ruiz, Aaron Saxton, Bernard F. Schutz, Alex Schwing, Ed Seidel, Stuart L. Shapiro, Hongyu Shen, Yue Shen, Leo Singer, Brigitta M. Sipőcz, Lunan Sun, John Towns, Antonios Tsokaros, Wei Wei, Jack Wells, Timothy J. Williams, JinJun Xiong, Zhizhen Zhao
Multi-messenger astrophysics is a fast-growing, interdisciplinary field that combines data from many different instruments that probe the Universe using different cosmic messengers: electromagnetic waves, cosmic rays, gravitational waves and neutrinos. These data vary in volume and in speed of data processing.
In this work, we perform a global analysis of GANs from two perspectives: the global landscape of the outer-optimization problem and the global behavior of the gradient descent dynamics.
Bayesian neural networks, which both use the negative log-likelihood loss function and average their predictions using a learned posterior over the parameters, have been used successfully across many scientific fields, partly due to their ability to "effortlessly" extract desired representations from many large-scale datasets.
no code implementations • 1 Feb 2019 • Gabrielle Allen, Igor Andreoni, Etienne Bachelet, G. Bruce Berriman, Federica B. Bianco, Rahul Biswas, Matias Carrasco Kind, Kyle Chard, Minsik Cho, Philip S. Cowperthwaite, Zachariah B. Etienne, Daniel George, Tom Gibbs, Matthew Graham, William Gropp, Anushri Gupta, Roland Haas, E. A. Huerta, Elise Jennings, Daniel S. Katz, Asad Khan, Volodymyr Kindratenko, William T. C. Kramer, Xin Liu, Ashish Mahabal, Kenton McHenry, J. M. Miller, M. S. Neubauer, Steve Oberlin, Alexander R. Olivas Jr, Shawn Rosofsky, Milton Ruiz, Aaron Saxton, Bernard Schutz, Alex Schwing, Ed Seidel, Stuart L. Shapiro, Hongyu Shen, Yue Shen, Brigitta M. Sipőcz, Lunan Sun, John Towns, Antonios Tsokaros, Wei Wei, Jack Wells, Timothy J. Williams, JinJun Xiong, Zhizhen Zhao
We discuss key aspects to realize this endeavor, namely (i) the design and exploitation of scalable and computationally efficient AI algorithms for Multi-Messenger Astrophysics; (ii) cyberinfrastructure requirements to numerically simulate astrophysical sources, and to process and interpret Multi-Messenger Astrophysics data; (iii) management of gravitational wave detections and triggers to enable electromagnetic and astro-particle follow-ups; (iv) a vision to harness future developments of machine and deep learning and cyberinfrastructure resources to cope with the scale of discovery in the Big Data Era; and (v) the need to build a community that brings domain experts together with data scientists on equal footing to maximize and accelerate discovery in the nascent field of Multi-Messenger Astrophysics.
In this paper, we aim to facilitate generalization for deep networks while supporting interpretability of the learned representations.
In this paper, we prove that every multivariate polynomial with even degree can be decomposed into a sum of convex and concave polynomials.
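A small worked example may make the statement above concrete. The sketch below uses a hand-picked univariate polynomial, p(x) = x^4 - 3x^2 + x, and splits it by inspection into a convex part and a concave part; the polynomial, the split, and the helper names are all assumptions chosen for illustration and do not reproduce the paper's construction, which covers the multivariate case. Convexity is checked numerically with a central second difference on a grid.

```python
# Illustrative only: a hand-picked even-degree polynomial and a split chosen
# by inspection. This is not the construction from the paper.

def p(x):
    """Example polynomial of even degree: x^4 - 3x^2 + x."""
    return x**4 - 3 * x**2 + x

def convex_part(x):
    """x^4 + x has second derivative 12x^2 >= 0, so it is convex."""
    return x**4 + x

def concave_part(x):
    """-3x^2 has second derivative -6 < 0, so it is concave."""
    return -3 * x**2

def second_difference(f, x, h=1e-2):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

# Evaluation grid on [-3, 3].
grid = [i / 10 for i in range(-30, 31)]

# The two parts must sum back to the original polynomial.
decomposition_holds = all(
    abs(p(x) - (convex_part(x) + concave_part(x))) < 1e-9 for x in grid
)

# Numerical curvature checks (small tolerance for floating-point error).
convex_ok = all(second_difference(convex_part, x) >= -1e-6 for x in grid)
concave_ok = all(second_difference(concave_part, x) <= 1e-6 for x in grid)

print(decomposition_holds, convex_ok, concave_ok)
```

The grid check is only a sanity test on a bounded interval; here the curvature signs also follow directly from the closed-form second derivatives noted in the comments.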
To keep up with the Big Data challenge, parallelized algorithms based on dual decomposition have been proposed to perform inference in Markov random fields.
While finding the exact solution for the MAP inference problem is intractable for many real-world tasks, MAP LP relaxations have been shown to be very effective in practice.
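For readers unfamiliar with the term, one common way to write a MAP LP relaxation over local marginals is sketched below. The notation (variables $i$, factors $\alpha$, potentials $\theta$, beliefs $b$) is a standard convention assumed here; the source does not specify which formulation the paper uses.

```latex
% Assumed notation: theta_i, theta_alpha = unary and factor potentials,
% b_i, b_alpha = local beliefs (relaxed marginals), i \in \alpha = variable i
% participates in factor alpha.
\[
  \max_{b}\;
  \sum_{i}\sum_{x_i} b_i(x_i)\,\theta_i(x_i)
  \;+\;
  \sum_{\alpha}\sum_{x_\alpha} b_\alpha(x_\alpha)\,\theta_\alpha(x_\alpha)
\]
\[
  \text{s.t.}\quad
  b_i(x_i) \ge 0,\;\;
  b_\alpha(x_\alpha) \ge 0,\;\;
  \sum_{x_i} b_i(x_i) = 1,\;\;
  \sum_{x_\alpha \setminus x_i} b_\alpha(x_\alpha) = b_i(x_i)
  \quad \forall\, \alpha,\; i \in \alpha,\; x_i .
\]
% Exact MAP inference corresponds to restricting b to integral (0/1) values;
% the relaxation drops that restriction, giving a linear program.
```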