Search Results for author: Shengyang Sun

Found 19 papers, 12 papers with code

Nemotron-4 340B Technical Report

1 code implementation 17 Jun 2024 Nvidia: Bo Adler, Niket Agarwal, Ashwath Aithal, Dong H. Anh, Pallab Bhattacharya, Annika Brundyn, Jared Casper, Bryan Catanzaro, Sharon Clay, Jonathan Cohen, Sirshak Das, Ayush Dattagupta, Olivier Delalleau, Leon Derczynski, Yi Dong, Daniel Egert, Ellie Evans, Aleksander Ficek, Denys Fridman, Shaona Ghosh, Boris Ginsburg, Igor Gitman, Tomasz Grzegorzek, Robert Hero, Jining Huang, Vibhu Jawa, Joseph Jennings, Aastha Jhunjhunwala, John Kamalu, Sadaf Khan, Oleksii Kuchaiev, Patrick Legresley, Hui Li, Jiwei Liu, Zihan Liu, Eileen Long, Ameya Sunil Mahabaleshwarkar, Somshubra Majumdar, James Maki, Miguel Martinez, Maer Rodrigues de Melo, Ivan Moshkov, Deepak Narayanan, Sean Narenthiran, Jesus Navarro, Phong Nguyen, Osvald Nitski, Vahid Noroozi, Guruprasad Nutheti, Christopher Parisien, Jupinder Parmar, Mostofa Patwary, Krzysztof Pawelec, Wei Ping, Shrimai Prabhumoye, Rajarshi Roy, Trisha Saar, Vasanth Rao Naik Sabavat, Sanjeev Satheesh, Jane Polak Scowcroft, Jason Sewall, Pavel Shamis, Gerald Shen, Mohammad Shoeybi, Dave Sizer, Misha Smelyanskiy, Felipe Soares, Makesh Narsimhan Sreedhar, Dan Su, Sandeep Subramanian, Shengyang Sun, Shubham Toshniwal, Hao Wang, Zhilin Wang, Jiaxuan You, Jiaqi Zeng, Jimmy Zhang, Jing Zhang, Vivienne Zhang, Yian Zhang, Chen Zhu

We release the Nemotron-4 340B model family, including Nemotron-4-340B-Base, Nemotron-4-340B-Instruct, and Nemotron-4-340B-Reward.

Synthetic Data Generation

Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection

1 code implementation 8 May 2024 Shengyang Sun, Xiaojin Gong

In the pursuit of effective multimodal violence detection (MVD), information redundancy, modality imbalance, and modality asynchrony are identified as three key challenges.

Anomaly Detection In Surveillance Videos, Optical Flow Estimation

NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment

1 code implementation 2 May 2024 Gerald Shen, Zhilin Wang, Olivier Delalleau, Jiaqi Zeng, Yi Dong, Daniel Egert, Shengyang Sun, Jimmy Zhang, Sahil Jain, Ali Taghibakhshi, Markel Sanz Ausin, Ashwath Aithal, Oleksii Kuchaiev

However, building efficient tools to perform alignment can be challenging, especially for the largest and most competent LLMs which often contain tens or hundreds of billions of parameters.

Long-Short Temporal Co-Teaching for Weakly Supervised Video Anomaly Detection

1 code implementation 31 Mar 2023 Shengyang Sun, Xiaojin Gong

That is, clip-level pseudo labels generated from each network are used to supervise the other one at the next training round, and the two networks are learned alternately and iteratively.

Anomaly Detection, Multiple Instance Learning, +1

Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection

no code implementations CVPR 2023 Shengyang Sun, Xiaojin Gong

In this work, we propose a hierarchical semantic contrast (HSC) method to learn a scene-aware VAD model from normal videos.

Anomaly Detection, Contrastive Learning, +1

Information-theoretic Online Memory Selection for Continual Learning

no code implementations ICLR 2022 Shengyang Sun, Daniele Calandriello, Huiyi Hu, Ang Li, Michalis Titsias

A challenging problem in task-free continual learning is the online selection of a representative replay memory from data streams.

Continual Learning

Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition

2 code implementations 10 Jun 2021 Shengyang Sun, Jiaxin Shi, Andrew Gordon Wilson, Roger Grosse

We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.

Gaussian Processes, regression

Neural Networks as Inter-Domain Inducing Points

no code implementations AABI Symposium 2021 Shengyang Sun, Jiaxin Shi, Roger Baker Grosse

Equivalences between infinite neural networks and Gaussian processes have been established for explaining the functional prior and training dynamics of deep learning models.

Gaussian Processes, regression

Beyond Marginal Uncertainty: How Accurately can Bayesian Regression Models Estimate Posterior Predictive Correlations?

1 code implementation 6 Nov 2020 Chaoqi Wang, Shengyang Sun, Roger Grosse

While uncertainty estimation is a well-studied topic in deep learning, most such work focuses on marginal uncertainty estimates, i.e., the predictive mean and variance at individual input locations.

Active Learning, Benchmarking, +1

Towards Characterizing the High-dimensional Bias of Kernel-based Particle Inference Algorithms

no code implementations AABI Symposium 2019 Jimmy Ba, Murat A. Erdogdu, Marzyeh Ghassemi, Taiji Suzuki, Shengyang Sun, Denny Wu, Tianzong Zhang

A particle-based inference algorithm is a promising method to efficiently generate samples for an intractable target distribution by iteratively updating a set of particles.

LEMMA

Functional Variational Bayesian Neural Networks

3 code implementations ICLR 2019 Shengyang Sun, Guodong Zhang, Jiaxin Shi, Roger Grosse

We introduce functional variational Bayesian neural networks (fBNNs), which maximize an Evidence Lower BOund (ELBO) defined directly on stochastic processes, i.e., distributions over functions.

Bayesian Inference, Gaussian Processes, +1

A Spectral Approach to Gradient Estimation for Implicit Distributions

3 code implementations ICML 2018 Jiaxin Shi, Shengyang Sun, Jun Zhu

Recently there has been increasing interest in learning and inference with implicit distributions (i.e., distributions without tractable densities).

Variational Inference

Aggregated Momentum: Stability Through Passive Damping

1 code implementation ICLR 2019 James Lucas, Shengyang Sun, Richard Zemel, Roger Grosse

Momentum is a simple and widely used trick which allows gradient-based optimizers to pick up speed along low curvature directions.

Noisy Natural Gradient as Variational Inference

2 code implementations ICML 2018 Guodong Zhang, Shengyang Sun, David Duvenaud, Roger Grosse

Variational Bayesian neural nets combine the flexibility of deep learning with Bayesian uncertainty estimation.

Active Learning, Efficient Exploration, +2

ZhuSuan: A Library for Bayesian Deep Learning

1 code implementation 18 Sep 2017 Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, Yuhao Zhou

In this paper we introduce ZhuSuan, a Python probabilistic programming library for Bayesian deep learning, which conjoins the complementary advantages of Bayesian methods and deep learning.

Probabilistic Programming, regression

Kernel Implicit Variational Inference

no code implementations ICLR 2018 Jiaxin Shi, Shengyang Sun, Jun Zhu

Recent progress in variational inference has paid much attention to the flexibility of variational posteriors.

General Classification, regression, +1
